FilesDesk

2012

This started out as a side project between a couple of us at work to sharpen our C#/XML skills and generally do something challenging. It is a C#/.NET application and service that enables browsing files on disk by tag (as opposed to by path). Typical usage:

  • A Tag Cloud for your hard disk
  • Add/remove tag(s) to selected file(s)
  • List & Filter files by tag(s)
  • Monitor specified folders (e.g. My Documents) for new/deleted/renamed files
  • Portable XML Database

Useful for Researchers, Writers and Obsessive-Compulsive File Hoarders.
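
To make the idea concrete, here is a minimal sketch of what a tag database kept in XML could look like. It is written in Python rather than C#, and the element names ("filesdesk", "file", "tag"), the function names and the sample paths are all assumptions made up for illustration, not the actual FilesDesk format.

```python
import xml.etree.ElementTree as ET

def add_tag(db, path, tag):
    """Attach a tag to a file entry, creating the entry if it does not exist yet."""
    for entry in db.findall("file"):
        if entry.get("path") == path:
            break
    else:
        entry = ET.SubElement(db, "file", {"path": path})
    # Skip tags the file already carries so the database stays duplicate-free.
    if tag not in [t.text for t in entry.findall("tag")]:
        ET.SubElement(entry, "tag").text = tag

def files_with_tags(db, *tags):
    """Return the paths of files that carry every one of the given tags."""
    wanted = set(tags)
    return [
        entry.get("path")
        for entry in db.findall("file")
        if wanted <= {t.text for t in entry.findall("tag")}
    ]

def tag_cloud(db):
    """Count how many files carry each tag (the raw data behind a tag cloud)."""
    counts = {}
    for t in db.iter("tag"):
        counts[t.text] = counts.get(t.text, 0) + 1
    return counts

# Hypothetical sample data.
db = ET.Element("filesdesk")
add_tag(db, "C:/docs/thesis.doc", "research")
add_tag(db, "C:/docs/thesis.doc", "draft")
add_tag(db, "C:/docs/notes.txt", "research")

print(files_with_tags(db, "research", "draft"))
print(tag_cloud(db))
print(ET.tostring(db, encoding="unicode"))
```

Because the whole database is one XML document, it can be written to a single file and carried along with the files it describes, which is what makes it portable.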

XML To TreeView

Another nifty one at MSDN (no wait, it’s Microsoft Support!) about populating a TreeView control with XML data.
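
The core of that technique is a recursive walk: one tree node per XML element, with child elements becoming child nodes. The sketch below shows the walk in Python and emits indented text lines instead of real TreeView nodes; the function name and the sample document are assumptions for illustration and are not taken from the Microsoft article.

```python
import xml.etree.ElementTree as ET

def to_tree_lines(element, depth=0):
    """Return one indented line per element, depth-first, like nodes in a tree view."""
    label = element.tag
    text = (element.text or "").strip()
    if text:
        label += ": " + text
    lines = ["  " * depth + label]
    # Recurse into the children, one indentation level deeper.
    for child in element:
        lines.extend(to_tree_lines(child, depth + 1))
    return lines

# Hypothetical sample document.
doc = ET.fromstring(
    "<library>"
    "<book><title>XML Basics</title></book>"
    "<book><title>XPath</title></book>"
    "</library>"
)
print("\n".join(to_tree_lines(doc)))
```

In a GUI version, the line that builds the label would create a node instead, and the recursive call would add to that node's children.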

XML Schemas

An XML schema describes the “type” of an XML document, typically expressed in terms of constraints on the structure and content of XML documents of that type. These constraints or rules are defined above and beyond the basic syntax rules which qualify a document as being an XML document.

As an analogy in Object-Oriented Programming, think of an XML schema as a “class” and XML documents conforming to the schema as “instances” or “objects” of that class.

Several languages have been developed specifically to express XML schemas. “Validating Parsers” are used to validate the conformance of XML documents to XML schemas. The most common type is the “DTD-Validating Parser”, which supports the Document Type Definition (DTD) language. DTD is a schema language of relatively limited capability, native to the XML specification.

Finally, schemas can be programmatically generated from XML documents, with a little patience.
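
As a rough illustration of the first step in such generation, the Python sketch below walks a sample document and records, for each element name, which child elements and attributes were seen under it. This is only a structural summary under simplifying assumptions (no data types, no cardinality, no ordering), not a real DTD or XSD, and the function name and sample document are made up.

```python
import xml.etree.ElementTree as ET

def infer_structure(root):
    """Map each element name to the child names and attribute names seen under it."""
    summary = {}
    for element in root.iter():
        info = summary.setdefault(
            element.tag, {"children": set(), "attributes": set()}
        )
        info["children"].update(child.tag for child in element)
        info["attributes"].update(element.attrib)
    return summary

# Hypothetical sample document.
sample = ET.fromstring(
    '<files><file path="a.txt"><tag>x</tag></file><file path="b.txt"/></files>'
)
for name, info in sorted(infer_structure(sample).items()):
    print(name, sorted(info["children"]), sorted(info["attributes"]))
```

A summary like this only reflects the documents it has seen, so the more sample documents are fed in, the closer it gets to the intended schema.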

Resources:

XQuery

XQuery is a query language semantically similar to SQL, designed to query collections of XML data. From W3C: “The mission of the XML Query project is to provide flexible query facilities to extract data from real and virtual documents on the World Wide Web, therefore finally providing the needed interaction between the Web world and the database world. Ultimately, collections of XML files will be accessed like databases”.

Of all the possibilities that XQuery potentially offers, one of the most interesting would be its use in solving the Web’s “offline problem” and giving users seamless access to their data with or without an Internet connection.

Resources:

XQuery is a superset of XPath, which is an expression language for addressing parts of an XML document and possibly computing values based on its content.
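
For a feel of what addressing parts of a document looks like, here is a small Python sketch. It relies on the limited XPath subset built into Python's xml.etree.ElementTree module, not a full XPath or XQuery engine, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<files>"
    '<file path="a.txt"><tag>research</tag></file>'
    '<file path="b.txt"><tag>draft</tag></file>'
    '<file path="c.txt"><tag>research</tag><tag>draft</tag></file>'
    "</files>"
)

# ".//tag" addresses every <tag> element at any depth below the root.
all_tags = [t.text for t in doc.findall(".//tag")]

# "./file[tag='research']" addresses <file> children of the root that have
# a <tag> child whose text is exactly "research".
research = [f.get("path") for f in doc.findall("./file[tag='research']")]

# "./file[@path='b.txt']" filters on an attribute value instead.
b_file = doc.find("./file[@path='b.txt']")

print(all_tags)
print(research)       # ['a.txt', 'c.txt']
print(b_file.find("tag").text)
```

The subset used here only selects nodes; computing values from them (counts, sums, string functions) is left to the host language in this sketch, whereas full XPath and XQuery can do that inside the expression.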

XPath Resources:

XML Document Design Considerations

Resources on design considerations of XML documents:

Well-Formed XML Documents

There are two levels of correctness of an XML document:

  • Well-formed XML documents conform to the XML syntax rules, and nothing else. “Conforming Parsers” must stop normal processing when they encounter a document that is not well-formed.
  • Valid XML documents, in addition to being well-formed, conform to further structural and content rules, typically user-defined by means of an XML schema or DTD. “Validating Parsers” must report any violation of these rules as an error.

The most significant rules that an XML document must follow to qualify as being “well-formed” are:

  1. There must be one, and only one, root (top-level) element.
  2. Non-empty elements must be delimited with matching start and end tags.
  3. Empty elements may be written either as a single self-closing tag (<tag/>) or as a start tag immediately followed by its end tag (<tag></tag>).
  4. Attribute values must be delimited with matching single *or* double quotes.
  5. Tags may be nested but must not overlap.
  6. Element names must follow naming conventions:
    • Names can start with letters (including non-Latin characters) or the “_” character, but not numbers or other punctuation characters.
    • After the first character, numbers are allowed, as are the characters “-” and “.”.
    • Names can’t contain spaces.
    • Names can’t contain the reserved character “:”, unless namespaces are being used.
    • Names starting with the letters “xml”, in any case (upper/lower/mixed), are reserved and should not be used.
    • There can’t be a space after the opening “<” character; the name of the element must come immediately after it. However, there can be space before the closing “>” character, if desired.
  7. The document must comply with the specified character encoding (if any). If no encoding is declared and there is no byte order mark, the encoding is taken to be UTF-8.
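
Several of these rules can be tried out with any conforming parser. The Python sketch below uses the standard library's xml.etree.ElementTree (which is backed by the non-validating Expat parser) and treats a ParseError as "not well-formed"; the helper name and the sample snippets are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical snippets, each exercising one rule from the list above.
samples = {
    "single root": "<a><b/></a>",
    "two roots": "<a/><b/>",
    "missing end tag": "<a><b></a>",
    "overlapping tags": "<a><b><c></b></c></a>",
    "unquoted attribute": "<a x=1/>",
    "single-quoted attribute": "<a x='1'/>",
    "name starts with a digit": "<1a/>",
    "space after <": "< a/>",
    "space before >": "<a ></a >",
}

for description, text in samples.items():
    verdict = "well-formed" if is_well_formed(text) else "NOT well-formed"
    print(f"{description:26} {text:24} {verdict}")
```

Note that this checks well-formedness only. Checking validity against a DTD or schema needs a validating parser, which this module does not provide.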

Notes:

  • Unlike in HTML, whitespace is retained in XML (Web Browsers use XSLT to transform XML to HTML for display, so whitespace appears to have been stripped).
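
A quick way to see the whitespace being retained is to parse a small document and inspect the text the parser hands back. This is a minimal Python sketch using the standard library's xml.etree.ElementTree; the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with line breaks and padding inside elements.
doc = ET.fromstring("<note>\n  <to>  Alice  </to>\n</note>")
to = doc.find("to")

# repr() makes the spaces and newlines visible.
print(repr(doc.text))   # whitespace between <note> and <to>
print(repr(to.text))    # padding around the name is kept
print(repr(to.tail))    # whitespace between </to> and </note>
```

Any trimming (for example with str.strip()) is a decision made by the application, not by the parser.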

XML

“It would be hard to predict now what kinds of libraries might be needed in a hundred years. Presumably many libraries will be for domains that don’t even exist yet. If SETI@home works, for example, we’ll need libraries for communicating with aliens. Unless of course they are sufficiently advanced that they already communicate in XML.”

-Paul Graham, The Hundred-Year Language

XML has emerged as the de facto industry standard for specifying how to store almost any kind of data, in a form that makes it incredibly easy to interchange between applications running on different platforms.

Resources: