Article from April, 1999.


Alternatives to XML DTDs: Four Proposals

By Bob DuCharme

Bob DuCharme is a senior software engineer at Moody's Investors Service. He is the author of the brand new XML: The Annotated Specification, as well as SGML CD, a tutorial and user's guide to free SGML software. Both books are part of Prentice Hall's Charles F. Goldfarb Series on Open Information Management.


Abstract

In a follow-up to his analysis last month of XML's Document Type Definition (DTD) declaration syntax, Bob DuCharme focuses on the status of four alternative schema proposals submitted to the W3C: XML-Data, XML Document Content Description (DCD), Schema for Object-oriented XML (SOX), and Document Definition Markup Language (DDML). In particular, DuCharme outlines the history and priorities behind each proposal, and considers the functionality each affords to applications that manipulate metadata structured in XML.


In the last issue we looked at the reasons that some XML developers are dissatisfied with XML DTD markup declaration syntax, and we surveyed the categories of new features included in the four alternative schema proposals submitted to the World Wide Web Consortium (W3C). In this issue, we'll look at the origins, history, and priorities of each proposal as well as some features that the W3C Schema Working Group members could add to the schema proposal that they'll draft after studying the four existing proposals.

XML-Data

XML-Data became a W3C Note (an officially accepted submission) on January 5, 1998, making it the only proposal of the four to predate the XML specification's official status as a W3C Recommendation. It is available at http://www.w3.org/TR/1998/NOTE-XML-data/.

XML-Data started off as a position paper from Microsoft (http://www.oasis-open.org/cover/xml-dataAnn.html) and played a role in the Channel Definition Format used to implement the push technology that sends channels such as MSNBC to Microsoft's Internet Explorer 4 Web browser (http://www.oasis-open.org/cover/xml-data9706223.html). Most of the authors listed on the W3C Note are from either Microsoft or DataChannel, a Washington state company working closely with Microsoft on XML technology.

The XML-Data proposal presents itself as something more general than a mechanism for describing XML schemas: "It can be used for classes which as (sic) strictly syntactic (for example, XML) or those which indicate concepts and relations among concepts (as used in relational databases, KR graphs and RDF)." As the first alternative schema proposal, it set the baseline for what XML developers could hope for above and beyond traditional DTDs. It includes provisions for inheritance of element type definitions by identifying "supertypes" as well as the ability to specify data types and minimum and maximum values for numeric values. In fact, the only common feature of the four new proposals not to be found in XML-Data is a provision for global attributes.

Document Content Description for XML (DCD)

The "Document Content Description for XML," a proposal edited by XML specification co-editor Tim Bray along with a representative from Microsoft and one from IBM, was submitted to the W3C as a Note (http://www.w3.org/TR/NOTE-dcd) on July 31, 1998. Its Abstract tells us, "The DCD proposal incorporates a subset of the XML-Data Submission and expresses it in a way which is consistent with the ongoing W3C RDF (Resource Description Framework) effort; in particular, DCD is an RDF vocabulary." The Resource Description Framework, which is split into several parts, each currently working its way through the W3C Recommendation process, describes itself as a "foundation for processing metadata" ("data about data") that makes it easier for applications to search, sort, and manipulate data. DCD's use of RDF means that DCD-based applications should work better with other applications that use this standard. The downside is that there may not be many applications using RDF because, rightly or wrongly, some people accuse RDF of being too complicated to be useful.

DCD's relationship with XML-Data is more complex than simply being a subset of it. Is it a competing proposal? Does it replace XML-Data as Microsoft's current favorite schema plan? Inconsistent answers from Microsoft, taken as a whole, have been read to mean this: XML-Data is not history yet, because it is implemented in Internet Explorer 4 and perhaps even 5. Because the four proposals will all be replaced by the work of the W3C Schema Working Group, DCD may never be implemented. Look at it this way: now that schema proposals do not mean "here is the schema that everyone should use" but instead "Hey! W3C Schema Working Group! Here are some features you should implement and standards to interact with when you come up with a new schema plan!", we might consider DCD to be Microsoft's most recent vote on schema issues.

Interesting DCD features include global attributes, data typing modeled on the choices provided by SQL, the specification of minimum and maximum possible values for numeric element content, inheritance of element type definitions, and a mechanism for specifying the data type of an element in a document that has neither a traditional DTD nor a DCD schema specified for it.
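
To make the idea of data typing plus minimum and maximum values concrete, here is a minimal Python sketch of the kind of check such a schema would let an application perform. It does not use DCD syntax: the CONSTRAINTS table, the element names (quantity, price), and the check_document function are all assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints table: element name -> (type, minimum, maximum).
# This is NOT DCD syntax; it only mimics the information a schema could carry.
CONSTRAINTS = {
    "quantity": (int, 1, 100),
    "price": (float, 0.0, 9999.99),
}

def check_document(xml_text):
    """Return a list of human-readable constraint violations."""
    problems = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        rule = CONSTRAINTS.get(elem.tag)
        if rule is None:
            continue  # no constraint declared for this element type
        cast, low, high = rule
        text = (elem.text or "").strip()
        try:
            value = cast(text)  # data type check
        except ValueError:
            problems.append(f"{elem.tag}: {text!r} is not a valid {cast.__name__}")
            continue
        if not (low <= value <= high):  # range check
            problems.append(f"{elem.tag}: {value} is outside {low}..{high}")
    return problems

SAMPLE = "<order><quantity>250</quantity><price>19.95</price></order>"
print(check_document(SAMPLE))
```

A traditional DTD can say only that quantity contains character data; the out-of-range value in SAMPLE is something that the typing and range features of the newer proposals are meant to catch.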

Schema for Object-oriented XML

SOX became a W3C Note (http://www.w3.org/TR/NOTE-SOX) on September 9, 1998. Two of its three authors are from Veo Systems (which has since merged with Commerce One), a key developer of XML-based e-commerce systems. (The other author, Murray Maloney, is an independent developer and member of the original XML Working Group.) "Electronic Commerce" is mentioned four times in the proposal, and the requirements of efficient e-commerce development clearly influenced it.

The SOX Note's abstract tells us that it was "informed by" XML-Data and DCD, among other things. Despite mentioning XML right in its title, SOX, like XML-Data, sets its sights higher: it "is now being proposed not only as an XML instance replacement syntax for SGML and XML document type definitions, but a (sic) modeling language for information modeling itself."

As its title reveals, SOX is the most consciously object-oriented of the four proposals. While transformation of schema from any notation to class definitions for an object-oriented language such as Java or Smalltalk is not too difficult, SOX makes it an explicit goal in order to make this transformation as simple as possible. (Straightforward mapping to relational database structures is also a stated goal.)

To implement this, it allows for element and attribute inheritance, extended data typing, and user-defined data types. It also allows the definition of an "attribute interface," which "is similar to one of the uses for an XML parameter entity, but far more powerful than that. An attribute interface is a named object that contains one or more attribute definitions." This resembles the Java concept of an interface, offering several advantages of inheritance without requiring the complete definition of a class (or element type) to create a structure that can be easily re-used.
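
The attribute interface idea can be sketched loosely in Python. This is a rough model under stated assumptions, not SOX syntax: the class names (AttributeDef, AttributeInterface, ElementType) and the sample "identified" interface are hypothetical and exist only to show a named bundle of attribute definitions being reused by several element types.

```python
from dataclasses import dataclass, field

# Hypothetical model, not SOX syntax: an "attribute interface" is treated as
# a named bundle of attribute definitions that element types can reuse.

@dataclass(frozen=True)
class AttributeDef:
    name: str
    datatype: str
    required: bool = False

@dataclass
class AttributeInterface:
    name: str
    attributes: tuple

@dataclass
class ElementType:
    name: str
    implements: list = field(default_factory=list)      # attribute interfaces
    own_attributes: list = field(default_factory=list)  # declared locally

    def all_attributes(self):
        """Attribute names from the interfaces first, then the element's own."""
        names = []
        for interface in self.implements:
            names.extend(a.name for a in interface.attributes)
        names.extend(a.name for a in self.own_attributes)
        return names

# One interface, defined once...
identified = AttributeInterface("identified", (
    AttributeDef("id", "ID", required=True),
    AttributeDef("lang", "string"),
))

# ...and reused by two otherwise unrelated element types.
invoice = ElementType("invoice", [identified], [AttributeDef("currency", "string")])
customer = ElementType("customer", [identified])

print(invoice.all_attributes())   # ['id', 'lang', 'currency']
print(customer.all_attributes())  # ['id', 'lang']
```

Neither element type inherits from the other; they share only the attribute bundle, which is the kind of re-use the comparison with a Java interface suggests.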

Other handy features of SOX include the enumeration of allowable values for any attribute or data type, the specification of minimum and maximum values for attributes, and structured documentation of schema (that is, specific element types provided for describing defined schema structures).

Document Definition Markup Language (DDML)

The W3C accepted DDML as a Note (http://www.w3.org/TR/NOTE-ddml) on January 19, 1999. This cooperative effort of xml-dev mailing list members was originally called XSD (XML Structure Definitions), then XSchema, and finally DDML upon submission to the W3C. While three members of the list are listed as DDML's editors, an appendix lists sixty-three contributors, including at least one editor or author from each of the other three schema submissions.

Although the DDML note says that it was "designed with future extensions, such as data typing and schema reuse, in mind," the facilities described in the note focus on the storage of schema information as XML elements and the elimination of physical structure information from the schema. Other features include structured documentation, identification of likely root element types in a schema, and the use of XLink to reuse schema definitions.
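
One practical consequence of storing schema information as XML elements is that an ordinary XML parser can read the schema, with no separate DTD parser required. The Python sketch below illustrates this under a loud assumption: the schema, elementType, and child element names are a vocabulary invented for this example and are not DDML's actual element types.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema vocabulary invented for this sketch; it is NOT DDML syntax.
SCHEMA = """
<schema>
  <elementType name="memo">
    <child name="to"/>
    <child name="body"/>
  </elementType>
  <elementType name="to"/>
  <elementType name="body"/>
</schema>
"""

def allowed_children(schema_text):
    """Map each declared element type name to the child names it permits."""
    root = ET.fromstring(schema_text)
    return {
        etype.get("name"): [c.get("name") for c in etype.findall("child")]
        for etype in root.findall("elementType")
    }

print(allowed_children(SCHEMA))
```

The same tools that process documents (parsers, stylesheets, query languages) can therefore be pointed at the schema itself.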

DDML asserts that it "is an experimental schema language designed to provide a starting point for ... experiments [in schema design]." Despite this admission of its malleability, DDML's origin in an active developer community means that software supporting it may be more readily available than software supporting the other schemas. Xml-dev member Rick Jelliffe has already written a script that converts XML Markup Declarations to DDML.

The W3C XML Schema Working Group

So far, the Schema Working Group has released an "XML Schema Requirements" document (http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215) listing their priorities as they develop their own proposal for an alternative to traditional DTD syntax. This short document outlines a schema language's responsibilities and identifies features currently not available in XML DTDs that the Working Group would like to see in a new schema language.

Along with many key features of the four submitted proposals, the Schema Requirements document identifies several interesting new issues to address:

  • The possibility of including binary data as a data type

  • The impact of "query formulation and optimization" on good schema design

  • Schema maintainability: the Note lists a "mechanism for addressing the evolution of schemata" as a "Structural Requirement."

The Working Group includes authors, editors, and contributors of all four proposals, so a real consensus opinion is possible. Because the W3C is ultimately a vendor consortium, it will be very interesting to see what new schema capabilities the Working Group decides are most worthwhile to offer to future XML applications.
