
Article from December, 1997.


XML: Leading SGML Beyond Publishing

By Bob DuCharme, XML Correspondent

Bob DuCharme, a software engineer at Moody's Investors Service, is also the author of The Operating Systems Handbook, a crash course in mini and mainframe operating systems, published by McGraw-Hill Professional Books. © 1997 Bob DuCharme.


Abstract

XML, although originally designed to make data exchange over the Web more efficient, can solve many other problems. Companies and industries have created many proprietary ways of passing data between systems, but what happens when a transmission gets cut off and the tab-delimited information isn't complete? Bob explains how XML elements, as self-defining data structures, address this issue. He goes on to discuss how XML could transform entire business practices and what new tools may be created because of it. He then reminds us that it will also make Web sites cooler.


[Editor's Note: We are pleased to welcome Bob DuCharme as a new regular contributor to <TAG>. Bob has been an advocate for SGML for many years, and he has written SGML CD, published by Prentice Hall, and The Operating Systems Handbook, published by McGraw-Hill Professional Books. Bob will join <TAG> as a correspondent watching the progress of XML and reporting important issues to our readers.]

XML seeks to ease SGML delivery over the web, but it's accomplishing something much grander: it simplifies the delivery of structured information from one process to another. On the small scale, this makes it easier to develop applications that exchange data; on the large scale, many see the possibility of entire businesses being transformed.

Many who thought of SGML as a culty electronic publishing fringe group now look to XML to handle some of the biggest headaches of interprocess communication. For years, developers responsible for sending information from one system to another each came up with their own notation to delimit the pieces of information and to identify their relationship, as well as their own code to parse their notation. Some conventions provided guidelines; for example, data that fits into a table may have its "fields" delimited by commas or tabs and its "records" separated by carriage returns, but what if the data was more complex than a two-dimensional table? What if one field actually included a delimiter character as data? What if the transmission was cut off?

For example, a program passing the item number, unit price, and quantity figures for a given order might pass the string "781,63,512." But what if a change in specifications led the receiving process to mix up the purpose of the second and third parameters? For the businesses making and filling the orders, such mistakes are expensive.
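To make this failure mode concrete, here is a minimal Python sketch. The language and the function names send_order and receive_order are illustrative assumptions, not anything from a real order system. The sender and the receiver disagree about field order, and nothing in the delimited string reveals the mistake.

```python
# Hypothetical sender and receiver that disagree on field order.

def send_order(itemno, itemprice, quantity):
    # Sender's convention: item number, unit price, quantity.
    return f"{itemno},{itemprice},{quantity}"


def receive_order(message):
    # Receiver written against a changed spec:
    # item number, quantity, unit price.
    itemno, quantity, itemprice = (int(field) for field in message.split(","))
    return {"itemno": itemno, "itemprice": itemprice, "quantity": quantity}


message = send_order(781, 63, 512)
order = receive_order(message)

print(message)  # 781,63,512
print(order)    # {'itemno': 781, 'itemprice': 512, 'quantity': 63}
```

The receiver raises no error: it simply records 63 units at a price of 512, which is exactly the kind of silent and expensive mistake described above.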

XML Elements as Self-Defining Data Structures

XML provides an easily parseable notation for almost arbitrarily complex data structures, and the notation that identifies the pieces of information and their relationships can serve as a data structure schema packaged right in there with the data itself. Whether the first process sends

<order>
  <itemno>781</itemno>
  <quantity>512</quantity>
  <itemprice>63</itemprice>
</order>

or

<order>
  <itemno>781</itemno>
  <itemprice>63</itemprice>
  <quantity>512</quantity>
</order>

to the second process, it relays a large amount of unambiguous information:

  • The three fields of information and their identity.

  • The sibling relationship of the three pieces of information: they all belong to the same product order. (Obviously, more complex structures are possible; imagine a group of orders being sent inside of a container element that includes date and invoice attribute specifications.)

  • Successful receipt of the order end-tag shows that the transmission of the order was complete.
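The points above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree module. The choice of Python and the helper name parse_order are assumptions made for illustration; any XML parser would behave similarly. The sketch shows that both element orderings yield the same data, and that a transmission cut off before the order end-tag is rejected rather than silently accepted.

```python
import xml.etree.ElementTree as ET


def parse_order(text):
    # Each field is identified by its element name,
    # so the order of the child elements does not matter.
    root = ET.fromstring(text)
    return {child.tag: int(child.text) for child in root}


first = ("<order><itemno>781</itemno><quantity>512</quantity>"
         "<itemprice>63</itemprice></order>")
second = ("<order><itemno>781</itemno><itemprice>63</itemprice>"
          "<quantity>512</quantity></order>")

same = parse_order(first) == parse_order(second)
print(same)  # True

# A transmission cut off before the order end-tag is not well-formed,
# so the parser raises an error instead of returning partial data.
truncated = "<order><itemno>781</itemno><itemprice>63</itemprice>"
try:
    parse_order(truncated)
    complete = True
except ET.ParseError:
    complete = False
print(complete)  # False
```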

There are two reasons that XML makes sense here, while SGML didn't:

  • First, we don't need a DTD to extract the order element's information. While developers working on such an order communication system must agree on a DTD as part of the protocol specification--after all, the receiving program must know whether to look for an element called "itemno" or "itemnum"--a DTD isn't necessary to parse this data.

  • Second, the ability to parse a DTD and all the potential document markup allowed by full SGML means adding so much code to the receiving application's system that it would rarely be worth it. XML software can assume default values for so many SGML markup options that the code required to parse the data can be much smaller, leaner, and easier to write, or easier to find on the Internet.
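As a rough illustration of the first point, the following Python sketch reads an order without consulting any DTD. The Order class and the load_order function are hypothetical names introduced here. The parser checks only well-formedness, while the element names the two sides agreed on ("itemno" rather than "itemnum") are checked by a few lines of application code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Order:
    itemno: int
    itemprice: int
    quantity: int


def load_order(text):
    # No DTD is consulted: the parser checks well-formedness,
    # and this code checks the agreed element names.
    root = ET.fromstring(text)
    if root.tag != "order":
        raise ValueError(f"expected <order>, got <{root.tag}>")
    values = {}
    for name in ("itemno", "itemprice", "quantity"):
        value = root.findtext(name)
        if value is None:
            raise ValueError(f"missing <{name}> element")
        values[name] = int(value)
    return Order(**values)


order = load_order(
    "<order><itemno>781</itemno><quantity>512</quantity>"
    "<itemprice>63</itemprice></order>"
)
print(order)  # Order(itemno=781, itemprice=63, quantity=512)

# A sender using "itemnum" instead of the agreed "itemno" is caught here.
try:
    load_order(
        "<order><itemnum>781</itemnum><quantity>512</quantity>"
        "<itemprice>63</itemprice></order>"
    )
    error = None
except ValueError as err:
    error = str(err)
print(error)  # missing <itemno> element
```

The same pattern also illustrates storing XML elements in a language's native record type, here a Python dataclass.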

Tim Bray, Norbert Mikula, and Microsoft have each already given the world Java classes to parse XML (Bray's in particular focuses on leanness of code), and soon code in many other languages will be available to automate the storage of XML elements in instance variables, fields, records, or whatever data structures each language offers. (The LISP and Scheme hackers, however, will snicker as they see this further evidence of their old adage "those who do not know LISP are condemned to reimplement it." A very simple transformation would convert the order example to "(order (itemno 781) (itemprice 63) (quantity 512))," which LISP and Scheme can parse and use with no new code. Incidentally, the pride with which XML's first advocates, at the SGML '96 conference, described the brevity of XML's specification compared with SGML's reminded me of the similar pride that Scheme's devotees take in comparing its specification to that of its larger, more versatile and difficult to implement parent LISP.)
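The "very simple transformation" to LISP notation can be sketched as a short recursive function. This Python version is only an illustration, and the name to_sexpr is an assumption. It handles elements that contain either text or child elements, and it ignores attributes and mixed content.

```python
import xml.etree.ElementTree as ET


def to_sexpr(element):
    # Leaf elements become (name text);
    # container elements become (name child child ...).
    children = list(element)
    if not children:
        text = (element.text or "").strip()
        return f"({element.tag} {text})"
    inner = " ".join(to_sexpr(child) for child in children)
    return f"({element.tag} {inner})"


order_xml = ("<order><itemno>781</itemno><itemprice>63</itemprice>"
             "<quantity>512</quantity></order>")
sexpr = to_sexpr(ET.fromstring(order_xml))
print(sexpr)  # (order (itemno 781) (itemprice 63) (quantity 512))
```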

It's not mere coincidence that Java XML tools have become available more quickly than XML parsing code in other languages. The kind of handheld and Internet-based applications that Java was designed for will prove particularly fertile environments for the growth of XML-based applications such as personal information managers, reference works, and electronic shopping.

The Possibilities

Many organizations have already begun exploring these possibilities. The HTTP Distribution and Replication Protocol (DRP), submitted to the W3 Consortium by Marimba, Netscape, Sun, Novell, and @Home, aims to make the distribution and update of file collections more efficient by encoding the appropriate information in XML. The Web Distributed Authoring and Versioning protocol (WebDAV), proposed by Microsoft, Netscape, Novell, and others, uses XML to improve on HTTP itself (the protocol used to actually send HTML and other files to your web browser). The Open Software Description Format (OSD), proposed to the W3C by Microsoft and Marimba with the endorsement of Netscape, Lotus, InstallShield Software, and others, is an XML application that describes software packages and their dependencies in order to automate software distribution. The XML/EDI group plans to revolutionize the Electronic Data Interchange systems used to automate business transactions by replacing the various protocols developed over the years with XML applications whose simplicity and power can make huge differences in the future of electronic business transactions on and off the web.

The excitement has even reached the mainstream press. A breathless article in a November issue of Time Magazine hints how entire businesses could be transformed: "If you can get everyone in, say, the real estate business--the brokers, the escrow agents, the mortgage banks--to adopt XML, says [electronic commerce advocate Marty] Tenenbaum, 'you really can start to think about changing the rules: paperless closings, real-time mortgage bidding...'"

And don't forget the handheld devices. Someday when you click an "Update" button to synchronize the address book and appointment calendar on your personal digital assistant with the ones on your PC, the two computers will send XML elements back and forth. When you receive e-mail asking you to attend a meeting, clicking "Yes" will trigger the e-mail program to send XML elements to your appointment calendar utility, adding the meeting's time and place. Documentation for these programs' APIs will provide a DTD so that you can create software that reads from and writes to them.

There's one more application I almost forgot: when you want to share a document with others in your company or on the web, you'll put an XML document on a server where applications known as "browsers" retrieve it via TCP/IP and then display the different elements using formatting instructions stored in a separate file that also uses XML syntax. XML will be great for that too! <end/>

All original material on this site is copyright © 1994-2010 by Architag International Corporation. All rights reserved. No part of this information may be reproduced in any form without express permission from Architag International Corporation.