
Article from June, 1998.


The Document Object Model

By Bob DuCharme

Bob DuCharme is the author of "SGML CD," a tutorial and user's guide to free SGML software, available from Prentice Hall. He also contributed to the "SGML Buyer's Guide" in the same series and to QUE Publishing's "Using SGML."


Abstract

Regular columnist Bob DuCharme discusses the W3C Document Object Model from a programmer's perspective. This is a preview of the final DOM, which is expected in the summer of 1998.


SGML has always had a lot in common with the object-oriented approach to modeling data:

  • Both address the problem of representing data that won't fit neatly into the tables of fixed-width fields that make up a relational database.

  • Both define object classes (or in SGML's case, element types) by specifying what kinds of data and what members of other defined object classes make up the members of each newly defined class.

  • Both let you define attributes as categories of information that you can assign to each object.

The key difference is that the object-oriented principle of encapsulation means that the design for a class of objects must include the specification of the objects' behavior: what a given class of objects can do and what you can do with these objects. A key tenet of SGML, on the other hand, is the avoidance of specifying element "behavior" in order to give maximum flexibility to the application designers using that data.
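
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the language, the class names (Element, Paragraph), and the helper text_of are all hypothetical and come from neither SGML nor the DOM drafts. The first class holds data alone, much as an SGML element does, and leaves behavior to the application; the second bundles behavior into the class definition.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Data only, like an SGML element: a type name, attributes, content."""
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # strings or Elements


def text_of(node) -> str:
    """Application-side behavior: gather character data from a node tree."""
    if isinstance(node, str):
        return node
    return "".join(text_of(child) for child in node.children)


class Paragraph(Element):
    """Object-oriented style: the class itself states what a paragraph can do."""

    def text(self) -> str:
        return text_of(self)

    def word_count(self) -> int:
        return len(self.text().split())


emphasis = Element("emph", children=["object model"])
para = Paragraph("para", {"id": "p1"}, ["The document ", emphasis, "."])
```

Here any application is free to decide what to do with the plain Element, while every user of Paragraph gets text() and word_count() as part of the class design.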

For developers creating an object-oriented application around SGML and XML documents, this common ground gives them a good head start. Much work remains, however. Incorporating SGML/XML documents into the object-oriented analysis and design process means specifying the behavior of each class of objects: you must list the functions that can be performed with the various document components, specify the parameters to pass to these functions, and specify the data types of these parameters and of each function's return value. The gray area around the common approaches of SGML/XML systems and object-oriented systems must also be sorted out; for example, an object-oriented system offers multiple ways to define the relationship between a particular element's data, subelements, and attributes.

A dozen different experts in SGML/XML systems and object-oriented development would probably come up with twelve different approaches to these problems if they were working separately. Working together, under the auspices of the World Wide Web Consortium (or W3C, the Internet standards body that oversees XML, HTML, HTTP and many other standards), they could put together an object model and interface definition that served as a common starting point for application development on many different platforms, in many different languages. This would enable people to create systems with better potential for fitting in with other systems because of their common approach to the data they use.

This group of experts is not hypothetical. They are the W3C's Document Object Model Working Group, and they began work on this in the spring of 1997, releasing their most recent working draft on April 16, 1998. A W3C press release quoted DOM Activity Leader Arnaud Le Hors as saying that the working group "is developing a platform- and language-neutral program interface that will allow programs and scripts to access every element in a document and update the content and structure of documents in a standard way." To object-oriented people, the "interface" is the list of classes, their structure, and the functions you can use with the members of each class.

The "implementation," on the other hand, is the code that actually does the work. When C++ programmers buy a class library to handle a certain job (for example, graphics functions), the implementation is the library that they link in with their own code to make the special graphics features available in their applications. They know about their choice of features by looking at a class library's interface, which describes the available classes and their functions.
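
A minimal sketch of the interface/implementation split, written in Python rather than C++ and using invented names (Shape, Rectangle, total_area) that belong to no real graphics library: the abstract class plays the role of the interface, listing the functions with their parameters and return types, and the concrete class is one possible implementation.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class Shape(ABC):
    """The interface: which functions exist, their parameters and return types."""

    @abstractmethod
    def area(self) -> float:
        ...

    @abstractmethod
    def scale(self, factor: float) -> "Shape":
        ...


class Rectangle(Shape):
    """One implementation: the code that actually does the work."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height

    def scale(self, factor: float) -> "Rectangle":
        return Rectangle(self.width * factor, self.height * factor)


def total_area(shapes: Iterable[Shape]) -> float:
    # Client code relies only on the interface, not on a particular implementation.
    return sum(shape.area() for shape in shapes)


combined = total_area([Rectangle(2, 3), Rectangle(1, 1).scale(2)])
```

A second implementation of Shape could be swapped in without any change to total_area, which is the property the DOM aims for across languages and vendors.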

The Document Object Model is an interface waiting for implementations in C++, Java, Visual Basic, ECMAScript, Perl, Python and any other popular language for document application development, not to mention the browser and editor implementations that will use the DOM classes. Considering the companies represented in the working group (Netscape, ArborText, Microsoft, INSO, SoftQuad, Sun, Novell, and IBM), we won't have to wait long.

The wide variety of backgrounds and viewpoints on the Working Group will help to make the Document Object Model useful to a broad range of document system developers. According to Lauren Wood, chair of the Working Group, "No single company could afford to put so many good people on to a project as this Working Group has. There are members from the Web HTML world as well as those from the SGML world. The advantage of having members from the HTML world and the SGML world is that, although it can slow the process a little while we try to find the best solution, we are merging the two world viewpoints to come up with a design that will be effective in both worlds, and enable users and implementers to move from one to the other."

A common, well-planned document model will make it easier to integrate a wider array of document processing tools. Anyone who has ever moved from one SGML document manipulation tool to another knows how different approaches to the document processing model can steepen the learning curve. Each system has different ideas about what you want to do with document elements and what information is necessary to perform each action, so that syntax issues such as the comment delimiter or the definition and use of an array are the least of your troubles. When multiple systems use the same model, moving from one to the other will be much easier, even if you're going from Visual Basic to Java coding. The DOM Working Group (and the rest of us) hope that vendors of these existing systems will provide DOM interfaces to their own data structures and application programming interfaces, joining in the effort toward easier interoperability.

The Working Group has several W3C working drafts (available at http://www.w3.org/TR/WD-DOM ) that each specify different aspects of the Document Object Model:

  • The DOM Requirements document lists the specific activities that the DOM must describe: structure navigation, document manipulation (such as adding, removing, and editing elements and attributes), and similar manipulation of content and DTDs.

  • The Core DOM document defines the components and navigation abilities necessary to work with a broad range of marked-up documents. It specifies the basic features that provide a foundation for other work that can use these functions to define more complex operations such as querying and filtering.

  • The HTML DOM specification builds on the Core DOM document to allow the manipulation of HTML pages.

  • The XML DOM, to quote its abstract, "defines a set of objects that extends the Document Object Model (Core) such that the combination can represent all parts of a parsed XML document, and to allow XML validity checkers to be written using the interfaces described."
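
To make the navigation and manipulation requirements above concrete, here is a small sketch using xml.dom.minidom from the Python standard library. Two assumptions should be stated plainly: minidom follows the later W3C DOM Recommendations, so its method names are not guaranteed to match the 1998 working drafts discussed here, and the sample document and its element names are invented for illustration.

```python
from xml.dom.minidom import parseString

# A made-up sample document; the element names are illustrative only.
doc = parseString(
    '<article id="a1"><title>The Document Object Model</title>'
    '<para>First paragraph.</para></article>'
)

# Structure navigation: start at the root element and reach a child's text.
root = doc.documentElement
title = root.getElementsByTagName("title")[0]
title_text = title.firstChild.data

# Document manipulation: add an element, edit an attribute, remove an element.
note = doc.createElement("note")
note.appendChild(doc.createTextNode("Added through the DOM."))
root.appendChild(note)
root.setAttribute("id", "a2")
root.removeChild(root.getElementsByTagName("para")[0])

# List the element children that remain under the root.
child_names = [
    node.tagName for node in root.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
```

The same sequence of calls (create a node, append it, set an attribute, remove a child) is what an implementation in another language would be expected to offer under its own syntax.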

Such abilities will be nice with HTML documents, but particularly welcome to the XML world as we await the development of the XML applications we want (such as browsers that implement XML, XLink, XPointer, and XSL) and as we think of XML applications to develop ourselves, if they're not too much trouble. Commercial and free implementations of the Document Object Model will drastically lower that "too much trouble" bar for us.

All original material on this site is copyright © 1994-2010 by Architag International Corporation. All rights reserved. No part of this information may be reproduced in any form without express permission from Architag International Corporation.