Article from August, 2002.


XSD and Namespaces

By Brian Travis and Mae Ozkan

Brian Travis is founder and Chief Technical Officer of Architag International Corporation and Managing Editor of <TAG>.

Mae Ozkan is Chief Architect of Architag International Corporation.

Their most recent book, Web Services Implementation Guide, offers a practical and technical explanation of what web services are, and how to make them work. This article is adapted from the book.


Abstract

An XML document, by itself, is not really good enough. Anyone can create an XML document that is perfectly conformant with the W3C XML specification, but completely unreadable by anyone else.

That is because the W3C XML specification only defines the syntax for creating a markup language, but does not specify the tags that make up the markup language. It is the "Schema" that specifies the markup language and enforces it.

This article shows how an XSD document specifies a markup language and the meaning and role of namespaces.


XML Schema Definition (XSD)

In early 2001 the W3C Schema Working Group finished their work on a new schema syntax designed to be an alternative to the DTD. In May 2001, it was approved by the W3C members. This syntax is called XML Schema Definition, or XSD. Microsoft has implemented XSD in version 4 of their MSXML tools suite. Most other XML parser manufacturers have also implemented XSD, so it is now the preferred schema syntax for interoperability.

An XSD schema is much more complex than the DTD or XDR, but it also has a simple personality that makes it easy to get started. Do not let that simplicity fool you, however. XSD can also be used to describe very complex data structures with all of the modern object-oriented features you can think of.

A simple XSD schema that describes our weather forecast is shown in An XSD Schema.

<?xml version="1.0"?>
<schema id="weather" elementFormDefault="qualified" 
	targetNamespace="Weather Markup Language"
	xmlns="http://www.w3.org/2001/XMLSchema">
	
	<element name="Weather">
		<complexType>
			<sequence>
				<element name="city" type="string"/>
				<element name="temperature" type="integer"/>
				<element name="wind" type="integer"/>
				<element name="forecast">
					<complexType>
						<sequence>
							<element name="day" minOccurs="3" maxOccurs="10">
								<complexType>
									<sequence>
										<element name="temperature" type="integer"/>
										<element name="wind" type="integer"/>
									</sequence>
									<attribute name="date" type="date"/>
								</complexType>
							</element>
						</sequence>
					</complexType>
				</element>
			</sequence>
			<attribute name="date" type="date"/>
		</complexType>
	</element>
</schema>
						
An XSD Schema

The XML Schema Definition syntax is accepted by all members of the World Wide Web Consortium.

Namespaces

Before we go any further, we need to talk a little bit about namespaces. This XSD schema is a formal specification for a markup language, as we talked about earlier. That markup language needs a name, so we give it one on the third line of the schema. There, we indicate targetNamespace and assign it the value Weather Markup Language.

On our target document, then, we need to indicate that the document is to use the Weather Markup Language. We will do that by creating an XML namespace declaration: xmlns="Weather Markup Language"

XML Namespaces probably holds the human record for the smallest document ever created by a committee. The document is 11 pages long, but only seven pages contain the meat of the standard. You can get the specification at the W3C Web site, but do not expect the specification to mean much. It is written in "standards speak", and really needs to be interpreted by a human to be useful. We'll be your humans today.

You will see lots of namespace declarations as you look at XML documents. Many, if not most, of them look like URLs. That is, they have the form "http://...something...". This is misleading, as you would expect that you could type the string into your Web browser and go somewhere useful. This is a natural tendency, as we were all genetically programmed to do such a thing. Type in a URL and the faithful server responds with something you can use. You want to get a schema, or a software specification, or a listing of the tags, or even just a phone number of the person who can help you with more information.

However, you will be lucky to get anything at that "address". That's because the XML namespace declaration is not a URL. It is a URI. "URI" stands for "Uniform Resource Identifier". If an XML document has a namespace declaration, then the string that follows is sent to the application as a namespace. The string means nothing to the parser. That's right, it's just a string to the parser. The parser does not go out to any resource to check and see if there is a schema there. Since it is a string, its only requirement is that it be well-formed.
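To make the point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree parser, which is our choice for illustration and not a tool the article relies on. The namespace name below looks like a URL, but the host is a made-up placeholder that is never contacted; the parser simply hands the string back as part of each element name.

```python
import xml.etree.ElementTree as ET

# Assumption: a made-up namespace name that merely looks like a URL.
DOC = '<report xmlns="http://example.invalid/no-such-schema"><item>1</item></report>'


def split_name(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


# Parsing succeeds without any network access: the namespace is only a string.
root = ET.fromstring(DOC)
namespace, local = split_name(root.tag)
print(namespace)  # http://example.invalid/no-such-schema
print(local)      # report
```

Whatever meaning that string has must be supplied by the code that receives it.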

When that string is sent to the application, it is up to the application to do something useful with the string. You can instruct your application to ignore the string, or you can have the application look up the string in some kind of keyed environment or registry to try to resolve it to something that the application can use for validation. Or, if the namespace URI looks like a URL, you could have your application go to that resource and see if there is anything interesting there.
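The "keyed registry" option might look something like the following sketch. The registry contents and the schema file path are hypothetical, chosen only to show the lookup; a real application would decide for itself where its schemas live.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: namespace string -> location of a schema the application trusts.
SCHEMA_REGISTRY = {
    "Weather Markup Language": "schemas/weather.xsd",
}


def namespace_of(element):
    """Return the namespace string of an element, or None if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def resolve_schema(xml_text):
    """Look up the root element's namespace in the registry; None means 'not known'."""
    root = ET.fromstring(xml_text)
    return SCHEMA_REGISTRY.get(namespace_of(root))


print(resolve_schema('<Weather xmlns="Weather Markup Language"/>'))  # schemas/weather.xsd
print(resolve_schema('<Weather xmlns="some other language"/>'))      # None
```

Note that the lookup is an exact string comparison. Nothing about the namespace string itself says where the schema is.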

Let us illustrate by using an example in the human realm. One of us is named "Brian". That is a string that his mom chose as a way of identifying him. However, there is nothing in that string that tells you where he is in the world right now. There is nothing in that string that tells you what kind of beer he likes. In order to get such information, you need to find someone who knows Brian. You might go to a friend of his and say, "Do you know Brian?" At first, they will deny any knowledge of him, but if you keep pressing, they will admit that they do know him.

Then you can ask them for information about him. If you ask what kind of beer he likes, they will laugh, then say that he prefers the "A-N-Y" brand of beer. "Any" beer will do.

XML namespaces are like that. In and of themselves, they do not really have anything to do with a collection of elements and attributes, except that they identify the collection with a name. You need to go to someone who knows how to resolve the namespace to find information about the members.

Having said all of that, you might be comforted to know that there are conventions that people follow to help you get information on a document that is identified with a namespace. Quite often, if a namespace looks like a URL, there will be something at that endpoint that will give you, the human, some guidance in how to interpret the namespace. The important thing to understand, however, is that this is not official according to the XML namespaces specification, and just as often, you might just get a 404 error.

So, an XML namespace is just a string to the parser. The parser does nothing to resolve the namespace into anything meaningful. It passes the string to the application. It is the job of the application to do something useful with that string.

You can have as many namespace declarations on an element as you want, but only one of them can be the default namespace. The rest must declare a unique namespace prefix. A default namespace is created in the form that we have seen here: xmlns="this is a namespace URI".

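Any element that is within the scope of the element in which the default namespace is declared is said to be a member of that namespace, unless it is otherwise overridden. Unprefixed attributes are the exception: according to the XML Namespaces specification, a default namespace declaration does not apply to attribute names.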

If you want to override the default namespace, there are two ways. The first is to define another default namespace at any element that is a descendant of the element. In this case, that default namespace will be active until its element is ended, at which time the higher namespace will take over.

Let us illustrate in "Redefining the Default Namespace". There is a default namespace declaration on line 002. All elements within the scope of the patient element are members of the "patient ns" namespace. So patient and name are members, but notice that there is another default namespace declaration on line 004, in the age element. That namespace overrides the patient ns namespace for the scope of the age element. So age, base, and years are members of the age ns namespace. When age ends, the age ns namespace goes away, and the higher namespace takes over. The health element, then, is a member of the patient ns namespace.

<?xml version="1.0"?>
<patient xmlns="patient ns">
	<name>Brian Travis</name>
	<age xmlns="age ns">
		<base>16</base>
		<years>29</years>
	</age>
	<health>excellent</health>
</patient>
Redefining the Default Namespace

The default namespace can be defined at any level of the XML hierarchy. Redefining the default namespace results in a new namespace for the scope of the element in which it was declared.
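If you want to check the scoping rule for yourself, the following sketch lists the namespace each element ends up in. It uses Python's standard xml.etree.ElementTree parser, which is our assumption for illustration; the helper names split_name and membership are made up for this example.

```python
import xml.etree.ElementTree as ET

# The patient document from the listing above.
PATIENT = """<?xml version="1.0"?>
<patient xmlns="patient ns">
    <name>Brian Travis</name>
    <age xmlns="age ns">
        <base>16</base>
        <years>29</years>
    </age>
    <health>excellent</health>
</patient>"""


def split_name(tag):
    """Split ElementTree's '{namespace}local' tag form into (local, namespace)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return local, namespace
    return tag, None


def membership(xml_text):
    """List (local name, namespace) for every element, in document order."""
    return [split_name(element.tag) for element in ET.fromstring(xml_text).iter()]


for local, namespace in membership(PATIENT):
    print(f"{local}: {namespace}")
# patient: patient ns
# name: patient ns
# age: age ns
# base: age ns
# years: age ns
# health: patient ns
```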

It's really as simple as that.

You can only have one default namespace on an element.

There is another form of the XML namespace declaration that can be used if you have many different types of namespaces on a single element. In this case, you need to create namespace declarations and assign them a namespace prefix.

This is done using the following syntax: xmlns:p="patient ns"

Notice the :p attached to xmlns. We are declaring a namespace prefix called p, which points to this namespace. Now, whenever we want to refer to elements or attributes that are members of the patient ns namespace, we need to prefix them with the p namespace prefix, as in p:patient.

You can have as many prefixed namespace declarations as you like on any element in your XML document.

The document shown in "Using Namespace Prefixes" is exactly equivalent to the document in "Redefining the Default Namespace".

<?xml version="1.0"?>
<p:patient 
	xmlns:p="patient ns"
	xmlns:a="age ns">
	<p:name>Brian Travis</p:name>
	<a:age>
		<a:base>16</a:base>
		<a:years>29</a:years>
	</a:age>
	<p:health>excellent</p:health>
</p:patient>
Using Namespace Prefixes

The default namespace can be overridden by prefixing elements and attributes with a namespace prefix.

You can mix the default namespace with prefixed namespaces. The document in "Mixing Default Namespace and Namespace Prefix" is exactly equivalent to the other two.

<?xml version="1.0"?>
<patient 
	xmlns="patient ns"
	xmlns:a="age ns">
	<name>Brian Travis</name>
	<a:age>
		<a:base>16</a:base>
		<a:years>29</a:years>
	</a:age>
	<health>excellent</health>
</patient>
Mixing Default Namespace and Namespace Prefix

The default namespace declaration can be mixed with prefixed namespace declarations.

Mixing it the other way is also possible. The document in "Mixing Namespaces Another Way" is exactly equivalent, too.

<?xml version="1.0"?>
<patient:patient 
	xmlns:patient="patient ns"
	xmlns="age ns">
	<patient:name>Brian Travis</patient:name>
	<age>
		<base>16</base>
		<years>29</years>
	</age>
	<patient:health>excellent</patient:health>
</patient:patient>
Mixing Namespaces Another Way

The default namespace declaration can be mixed with prefixes in several different ways.

This last one might seem kind of strange. Remember that all elements in a document that has a default namespace declaration are members of that namespace unless they are overridden. We have seen that there are two ways to override the default namespace. One is by redefining the default namespace. The other is by indicating a namespace prefix. In "Mixing Namespaces Another Way", we can see that the default namespace declared on the patient element is age ns. Even though we define the default namespace on the patient element, that element itself is not a member. It is overridden using the patient: namespace prefix.

The same goes for name and health. The age element is not overridden, so it is a member of the default namespace, age ns, as are base and years.
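The claim that all four patient documents are equivalent can be checked mechanically. The sketch below, again assuming Python's standard xml.etree.ElementTree parser, reduces each document to its expanded element names and text, ignoring prefixes and whitespace, and compares the results. The constant and function names are ours.

```python
import xml.etree.ElementTree as ET

REDEFINED = (
    '<patient xmlns="patient ns">'
    "<name>Brian Travis</name>"
    '<age xmlns="age ns"><base>16</base><years>29</years></age>'
    "<health>excellent</health>"
    "</patient>"
)
PREFIXED = (
    '<p:patient xmlns:p="patient ns" xmlns:a="age ns">'
    "<p:name>Brian Travis</p:name>"
    "<a:age><a:base>16</a:base><a:years>29</a:years></a:age>"
    "<p:health>excellent</p:health>"
    "</p:patient>"
)
MIXED = (
    '<patient xmlns="patient ns" xmlns:a="age ns">'
    "<name>Brian Travis</name>"
    "<a:age><a:base>16</a:base><a:years>29</a:years></a:age>"
    "<health>excellent</health>"
    "</patient>"
)
MIXED_OTHER_WAY = (
    '<patient:patient xmlns:patient="patient ns" xmlns="age ns">'
    "<patient:name>Brian Travis</patient:name>"
    "<age><base>16</base><years>29</years></age>"
    "<patient:health>excellent</patient:health>"
    "</patient:patient>"
)


def signature(xml_text):
    """Expanded '{namespace}local' names and text of every element, in document order."""
    return [
        (element.tag, (element.text or "").strip())
        for element in ET.fromstring(xml_text).iter()
    ]


signatures = [signature(doc) for doc in (REDEFINED, PREFIXED, MIXED, MIXED_OTHER_WAY)]
all_equivalent = all(sig == signatures[0] for sig in signatures)
print(all_equivalent)  # True
```

The prefixes themselves never appear in the comparison. Only the namespace strings they stand for matter.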

Now, let's get back to our weather document.

As we mentioned before, it is up to the application to resolve the namespace declaration and find the appropriate markup language. XRay is an application that has namespace support built in. If you have an XSD schema currently open in XRay, you will see the targetNamespace indicated in the status line at the bottom of the window, as shown in "The Target Namespace Indicator".

The Target Namespace Indicator

XRay shows the name of the target namespace in the status bar.

Now that we have identified an XSD schema and the associated namespace, we can indicate that namespace in our XML document by using the xmlns declaration on line 3. This is illustrated in "Specifying the Weather Markup Language".

Specifying the Weather Markup Language

Associating an XML document with the Weather Markup Language namespace is done by setting the default namespace in the root element.

Notice the status line: XRay has found the Weather Markup Language and is indicating that it is being applied to this document.

It is important to note that XRay is making this association using its built-in understanding of the world in which it lives. In other words, it is the application that is making sense out of the namespace and applying the associated schema.

If you use namespaces, it is up to your application to associate a schema with a document. However, you do not necessarily need to associate a schema with a document to make sure it is correct. For example, your application could recognize a particular namespace, and use that as an indicator that the document should be processed using a set of functions that validate the document in ways the XML parser cannot. Your application might use the value of some element or attribute as a variable that is placed into a database query.
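As a rough sketch of that idea, an application could use the namespace to pick a set of checking functions instead of a schema. Everything here is illustrative: the handler table, the function names, and the rule that wind speed may not be negative are our assumptions, not something the Weather Markup Language defines.

```python
import xml.etree.ElementTree as ET

WEATHER_NS = "Weather Markup Language"


def check_weather(root):
    """Hypothetical business rule: the current wind value must be present and not negative."""
    wind = root.find(f"{{{WEATHER_NS}}}wind")
    return wind is not None and int(wind.text) >= 0


# Namespace string -> function that knows how to process documents in that language.
HANDLERS = {WEATHER_NS: check_weather}


def process(xml_text):
    """Dispatch on the root element's namespace; no schema is involved."""
    root = ET.fromstring(xml_text)
    tag = root.tag
    namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else None
    handler = HANDLERS.get(namespace)
    if handler is None:
        return "unrecognized"
    return "accepted" if handler(root) else "rejected"


print(process('<Weather xmlns="Weather Markup Language"><wind>10</wind></Weather>'))  # accepted
print(process('<Weather xmlns="Weather Markup Language"><wind>-5</wind></Weather>'))  # rejected
print(process("<note/>"))                                                             # unrecognized
```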

The point is that a namespace does not always indicate that a schema is to be applied to the document. In later chapters, we will see how namespaces are used as a kind of library inclusion.

Back again to our weather document. Notice that, by associating the document with an XSD schema, we have broken it. The error that XRay reports indicates that the 29th of February is not a valid date.
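The same kind of calendar check can be approximated outside XRay. This sketch uses Python's datetime module as a stand-in for the XSD date datatype; it is not a full implementation of that datatype (for example, it does not handle a time zone suffix). The year 2002 is our assumption, since the offending date itself is not shown here.

```python
from datetime import date


def is_valid_date(text):
    """Return True if text is a real calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


print(is_valid_date("2002-02-29"))  # False: 2002 is not a leap year
print(is_valid_date("2002-02-28"))  # True
print(is_valid_date("2004-02-29"))  # True: 2004 is a leap year
```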

Fixing the date error, we get another error, as shown in "Structural Error".

Structural Error

There is a problem with the document because it does not adhere to the structure specified by the schema.

We can see in "minOccurs and maxOccurs" that the XSD schema states that a forecast has a minimum of three days and a maximum of 10 days. Our document only has two days.

minOccurs and maxOccurs

XSD data boundaries indicated by minOccurs and maxOccurs.

This is indicated with the minOccurs (minimum number of occurrences) and maxOccurs (maximum number of occurrences) attributes on the day element declaration.

There's a good reason for such boundary-setting. If we are doing a weather forecast, fewer than three days is useless, and if we get more than 10 days, it will be too inaccurate for any useful purpose.

Our document has only two days, and so does not meet the minimum number of occurrences. Fixing the error by adding a day is shown in "Valid Document".

Valid Document

An XML document is valid according to the Weather Markup Language when the error area is green.
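To see what the minOccurs and maxOccurs boundaries amount to, here is a hand-rolled check of the day count. It is only a sketch of that single rule, not XSD validation, and the sample document it builds (city, temperatures, dates) is made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"w": "Weather Markup Language"}
MIN_DAYS, MAX_DAYS = 3, 10  # the minOccurs and maxOccurs values on the day element


def forecast_doc(day_count):
    """Build a sample weather document with day_count day elements (values are made up)."""
    days = "".join(
        f'<day date="2002-03-{n:02d}"><temperature>50</temperature><wind>5</wind></day>'
        for n in range(1, day_count + 1)
    )
    return (
        '<Weather xmlns="Weather Markup Language" date="2002-02-28">'
        "<city>Denver</city><temperature>45</temperature><wind>10</wind>"
        f"<forecast>{days}</forecast></Weather>"
    )


def day_count_ok(xml_text):
    """Check only the occurrence bounds on forecast/day, nothing else."""
    root = ET.fromstring(xml_text)
    days = root.findall("w:forecast/w:day", NS)
    return MIN_DAYS <= len(days) <= MAX_DAYS


print(day_count_ok(forecast_doc(2)))   # False: below minOccurs
print(day_count_ok(forecast_doc(3)))   # True
print(day_count_ok(forecast_doc(11)))  # False: above maxOccurs
```

A schema-aware parser performs this check, along with the structure and datatype checks, from the schema alone.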

Three important things are happening in this document. First, the document is well-formed, as all XML documents must be. Second, the document is valid according to the markup language specified by the XSD schema. Lastly, the datatypes are correct according to the XSD schema.

All original material on this site is copyright © 1994-2010 by Architag International Corporation, All rights reserved. No part of this information may be reproduced in any form without express permission from
Architag International Corporation.