Article from August, 2001.
XSL Tutorial, Part 3: Getting to HTML
By Brian E. Travis
Brian Travis is founder and Chief Technical Officer of Architag International Corporation and Managing Editor of <TAG>.
Abstract
This is part 3 of a multiple-part series on XSLT, a very important W3C standard that provides a way of transforming XML documents from one structure to another. XSLT can be used to create HTML, so your XML documents can be viewed in a Web browser, or it can be used to transform your XML documents to any other XML structure, or even to non-XML structures. In this part, you will learn how to write template rules for converting an XML document into a pleasing HTML document. Then, you will see how to transform the document on the server, to provide cross-browser support of your information set.
In February of last year, we published one in a series of articles describing the <TAG> newsletter online application. The article, http://architag.com/tag/Article.html?v=14&i=2&p=1&s=2 , described how to create a page using the Microsoft implementation of XSL, commonly known as MSXSL. This implementation was based on a draft of the XSL specification, almost two years before it was formally adopted. Now that the XSLT specification has been released by the W3C, I'd like to revisit the concept of taking an XML document and getting to HTML.
XSLT is a rules-based, event-driven programming language. It is not a stylesheet language. For more on this topic, see Part 2 of this series at http://architag.com/tag/Article.html?v=15&i=8&p=1&s=1
Peeling the Onion
I like to think of XSLT programming as similar to peeling an onion. First, the outside skin gets peeled, then more and more skins, until the entire onion is finished. For the XSLT program, this means building more and more functionality one step at a time. I will use that technique here.
First, we need to create a baseline program that will process any XML document. An XSLT program is a plain XML document, so you can use any text editor to create it. I will be using the Architag XRay XML Editor for this example. You can get XRay at http://architag.com/xray . Start by creating the following stylesheet in your editor:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="node()">
    <DIV STYLE="margin-left:24pt;">
      <SPAN STYLE="color:navy;">
        [start:<xsl:value-of select="name()"/>]
      </SPAN>
      <xsl:apply-templates/>
      <SPAN STYLE="color:navy;">
        [end:<xsl:value-of select="name()"/>]
      </SPAN>
    </DIV>
  </xsl:template>
</xsl:stylesheet>
The first line is, of course, the XML declaration, which is optional at the top of every XML document. The xsl:stylesheet start tag has the required version attribute, and a namespace declaration that indicates that we are using W3C XSLT, which was formalized in 1999. This namespace is necessary so the XSLT processor knows which version to use. It is important to note that this is NOT used as a URL. It is a URI. In other words, it is just a string that the application uses to recognize XSLT elements. You can't necessarily pop that string into a browser and expect to get anywhere.
Now get the document. I am using a document I found at http://architag.com/resources/article.xml . This document is an article that appeared in <TAG> in 1998 showing how to use XSL in the then-new Microsoft Internet Explorer 5 beta. XSLT has come a long way since then, as have the tools that are used to process XSLT.
Speaking of tools that process XSLT, you should install XRay and follow along with the rest of the instructions. Load article.xml and your new stylesheet, which I call article.xsl, into XRay. Your screen should look like mine, shown in Figure 1.
Figure 1: XML document and XSLT program loaded into XRay.
In order to apply the transformation program to the XML document, you need to create an XSL Transformation window. In XRay, select File...New XSL Transform. You will see a new window with two drop down boxes. Select the XML and XSLT documents as appropriate, and you will see the result of the transformation in the window. This is shown in Figure 2.
Figure 2: XSL transform window in XRay.
Notice that the transformation produced an HTML page. You can see exactly what the program produced, but the raw markup is not very readable. To view the HTML as a page, load the HTML View window in XRay. Select File...New HTML View, then select the transform window from the pulldown box. You should see the HTML rendered as it would be in a Web browser. This is shown in Figure 3.
Figure 3: HTML View window in XRay.
We have just the shell now, the outer layer of the onion. This basic program will just display the structural contents of the document, showing where all of the elements start and end. Let's add some text. Add the following template rule inside your stylesheet. It should be placed just before the </xsl:stylesheet> end tag:
<xsl:template match="text()">
  <xsl:value-of select="."/>
</xsl:template>
This code will trap all text that comes along and send it directly to the output. Your HTML View window should now look like the sample in Figure 4.
Figure 4: HTML View with text content.
One more template before our generic XSLT program is completed. Add the following rule:
<xsl:template match="/">
  <HTML>
    <BODY>
      <xsl:apply-templates/>
    </BODY>
  </HTML>
</xsl:template>
This will process the root node (/). This is a good place to put all of your HTML header information, like CSS links, scripts, document titles, and things like that. Notice the <xsl:apply-templates/> tag in the middle of the <BODY> element. This is where the entire contents of the article document will be processed.
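Putting the three rules together, the generic program might look like the sketch below. It only assembles the templates already shown; the comments and the ordering are additions. One assumption worth stating: the text() rule is kept after the node() rule because both patterns carry the same default priority in XSLT 1.0, and a processor that recovers from that tie uses the rule that appears last.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root node: emit the HTML shell once, then process the document. -->
  <xsl:template match="/">
    <HTML>
      <BODY>
        <xsl:apply-templates/>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- Any other node: show where it starts and ends, indenting each level. -->
  <xsl:template match="node()">
    <DIV STYLE="margin-left:24pt;">
      <SPAN STYLE="color:navy;">
        [start:<xsl:value-of select="name()"/>]
      </SPAN>
      <xsl:apply-templates/>
      <SPAN STYLE="color:navy;">
        [end:<xsl:value-of select="name()"/>]
      </SPAN>
    </DIV>
  </xsl:template>

  <!-- Text nodes: copy the text straight to the output.
       Kept last so it is chosen over node() for text. -->
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The root rule can sit anywhere in the file, since the pattern "/" does not compete with the other two.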
This XSLT program will render any XML document in this generic way. It is the program that I use whenever I need to create a new output.
Customizing the XSLT Program
Now we need to start customizing this program for our particular document type. First, add a template rule that handles titles:
<xsl:template match="title">
  <H1 STYLE="color:blue;"><xsl:apply-templates/></H1>
</xsl:template>
This template will process all titles, and wrap the contents with the H1 start and end tags. If you look at the HTML View window, you will see that's exactly what happened. However, scroll down a bit farther, and you will see an unintended consequence: section titles are showing up as H1 elements, too. I don't want that; only the article title should be H1. So we need to create another template to handle the section titles. We will use the path operator (/) in our match attribute to name the parent:
<xsl:template match="section/title">
  <H2 STYLE="color:green;"><xsl:apply-templates/></H2>
</xsl:template>
Indicating match="section/title" tells the XSLT processor that we only want to find titles that are a child of section. Notice that the title on the section turned green, which means it is now an H2. The top title stayed blue. There are other titles in the document, though, so it is probably best to further qualify the top-level title by changing it to the following:
<xsl:template match="article/title">
  <H1 STYLE="color:blue;"><xsl:apply-templates/></H1>
</xsl:template>
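To make the two match patterns concrete, here is a small hypothetical input fragment. The element names article, section, and title come from the discussion above; the para element, the text content, and the exact nesting are assumptions for illustration and are not copied from article.xml.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sample; not the real article.xml. -->
<article>
  <!-- Child of article: handled by match="article/title" (blue H1). -->
  <title>Sample Article Title</title>
  <section>
    <!-- Child of section: handled by match="section/title" (green H2). -->
    <title>Sample Section Title</title>
    <!-- "para" is an assumed name; the generic node() rule handles it. -->
    <para>Body text.</para>
  </section>
</article>
```

If you run the stylesheet against a sample like this, keep in mind that comments are nodes as well, so the generic node() rule reports each one with an empty name. Strip the comments if you want to see only the element structure.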
Let's process the author element. The author element contains two sub-elements: name and short-bio. I want to access the author's name, put the word "By " before it, and make the whole thing italic. Enter the following template:
<xsl:template match="author/name">
  <DIV STYLE="font-style:italic;">By <xsl:apply-templates/></DIV>
</xsl:template>
That gives us the desired effect. Now let's process the acronym element, acr. An acronym element can have an optional def attribute, which indicates the expanded definition of that acronym. First, enter the following template:
<xsl:template match="acr">
  <SPAN STYLE="font-size:70%;">
    <xsl:value-of select="."/>
  </SPAN>
</xsl:template>
Notice that this rule put all acronyms in an HTML SPAN element with a smaller point size. I like this because the capital letters don't stand out so much on the page.
But we've lost something here. If you look at the XML source document, you can see that some of the acronyms have a def attribute. We need to process this. Change the template to the following:
<xsl:template match="acr">
  <SPAN STYLE="font-size:70%;">
    <xsl:value-of select="."/>
  </SPAN>
  (<xsl:value-of select="@def"/>)
</xsl:template>
The new line will insert the value of the def attribute between parentheses. This is how typesetters have set this kind of thing for hundreds of years. However, while looking at the output, we can see a problem. Some of the acronym tags do not have the def attribute and, therefore, have empty parentheses. This is not right. We must be able to distinguish between those acronyms that have definitions and those that don't.
XSLT has an <xsl:if> element that we could use, but I don't like using that unless I really need to. Let's create another rule instead. Change the current template and add another one as shown:
<xsl:template match="acr[@def]">
  <SPAN STYLE="font-size:70%;">
    <xsl:value-of select="."/>
  </SPAN>
  (<xsl:value-of select="@def"/>)
</xsl:template>
<xsl:template match="acr">
  <SPAN STYLE="font-size:70%;">
    <xsl:value-of select="."/>
  </SPAN>
</xsl:template>
The first template rule has an XPath "filter" pattern. This will return an acronym, but only if it has the def attribute specified on the start tag. It doesn't matter what the value of the def attribute is, just as long as it is specified. The second template matches all other acronyms. Which brings us to the topic of conflict resolution.
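For comparison, the <xsl:if> alternative mentioned earlier might look like the sketch below. This is not the approach the article takes; it is shown only to make the trade-off visible. It assumes the same acr element and def attribute, and wraps the rule in a bare stylesheet so it can be tried on its own.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A single rule for every acr element. -->
  <xsl:template match="acr">
    <SPAN STYLE="font-size:70%;">
      <xsl:value-of select="."/>
    </SPAN>
    <!-- test="@def" is true only when the def attribute is present,
         so acronyms without a definition get no parentheses. -->
    <xsl:if test="@def">
      (<xsl:value-of select="@def"/>)
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The two-rule version moves the decision out of the template body and into the match patterns, which is what makes conflict resolution relevant.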
Conflict Resolution
This is not where you go to your family counselor and try to figure out why your kids are fighting. This has to do with the XSLT process that determines which template rule is to be selected for a given event. A fundamental rule of XSLT is that only one template can fire for a given event. In the case where there is more than one candidate template, the XSLT processor must decide which one to choose.
The XSLT specification talks about "specificity" and "priority", but all you need to know for now is, "the most specific wins". The XSLT committee worked on this problem for quite a while, and I must say that they did a great job. In fact, the less you think about how conflict between templates is resolved, the better off you are. In the case above, the pattern "acr[@def]" is more specific than the pattern "acr".
When the XSLT program is first loaded, the XSLT processor assigns a default priority to each template rule. The more specific a pattern is, the higher the priority. When a conflict occurs, the XSLT processor picks the template with the highest priority. If two or more templates share the highest priority, the specification treats that as an error, but it allows the XSLT processor to recover by picking the one that appears last in the program. This rarely happens, but it is possible.
You can also set the priority attribute on any template to override the automatic calculation of priority, but I recommend against this in all but the most extreme cases. Like I said, the XSLT processor almost always does the right thing.
Continuing On
The process now is to continue adding template rules to your XSLT program until you get the output you desire.
Next Month
In next month's installment, we will be continuing this exercise, adding attributes to the output, and moving the elements around. Plus, we will be taking advantage of the HTML object model, creating information that is optimized for electronic delivery, instead of just duplicating the paper process.
URL for this Article:
http://architag.com/tag/Article.asp?v=15&i=8&p=1&s=1
This article was printed from the <TAG> Newsletter web site, http://www.architag.com/tag. Copyright © 2001 Architag International Corporation. All Rights Reserved. Printing, distribution, and use of this material is governed by U.S. and International copyright laws.