XML Presentation

XML Introduction
Bill Jerome
Motivation

“Extensible Markup Language (XML) is a
simple, very flexible text format derived from
SGML (ISO 8879). Originally designed to
meet the challenges of large-scale electronic
publishing” – World Wide Web Consortium
Motivation



XML is not meant to be a universal data
encoding. (See SGML)
The intent is to encode publishable data in XML
so that it can be presented to others through
a transformation that adheres to guidelines
Thus, a text author need only write in XML
and send it to book publishers, who would
simply have to transform the XML into a book
that meets their publication guidelines
Motivation for the web

Page authors could learn to write XML rather
than HTML and then transform their pages
for display on the web. That same XML could
be transformed into printed text, etc., without
altering the XML at all. Choosing a different
stylesheet would allow a total change in appearance
(similar to CSS) without altering the content
Motivation for the web


From a technical standpoint, the goal is to
separate data for publication from its
presentation
The aim is a more thorough separation than
Cascading Style Sheets (CSS) provide
Realities



In fact, transformations can be much more
powerful. Tables of contents, indices, and
repositories for other sources (e.g., search)
can all be generated from the XML
documents
XML is of limited use by itself and requires
the design of guidelines (DTDs or schemas)
Simple to learn to do, not so simple to do
right
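
As a rough illustration of the first point, here is a minimal XSLT 1.0 sketch that pulls a table of contents out of a document. The input shape (a book element containing chapter elements with a title attribute) is hypothetical and not part of this presentation; XSLT itself is covered later in the deck.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes a hypothetical input of the form
     <book><chapter title="..."> ... </chapter> ... </book> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/book">
    <html>
      <head><title>Table of contents</title></head>
      <body>
        <ol>
          <!-- One list entry per chapter, in document order. -->
          <xsl:for-each select="chapter">
            <li><xsl:value-of select="@title"/></li>
          </xsl:for-each>
        </ol>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The chapter text is left untouched; only the titles are read, so the same source can still be transformed into the full book by another stylesheet.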
XML syntax


If you know HTML, XML is already familiar
More formalized:
All tags have end tags
A certain document structure identifies an XML document
(like <HTML></HTML>)
Exactly one root element per document
<tag>Marked up text</tag>
<tag2 attribute="setting">More text</tag2>
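
Put together, the rules above might look like the following small document. This is only a sketch; the element and attribute names are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical element names, chosen only for illustration. -->
<note priority="high">
  <to>Reader</to>
  <body>Every start tag has a matching end tag.</body>
  <!-- An element with no content may use the empty-element form. -->
  <separator/>
</note>
```

There is one root element (note), every element is closed, and the attribute value is enclosed in straight quotes.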
XML not direct parent of HTML

What in HTML is not really XML?

Images: <img src="foo.jpg"> has no end tag
Line breaks: <br> has no end tag
Untagged content: text placed directly in <body> is
well-formed XML, but is not allowed by the strict XHTML DTD

<html><head></head><body>Here is untagged
text</body></html>
XHTML



A “cleaned up” form of HTML that is XML
compliant
Exists in the form of the XHTML DTD
Is not rendered correctly by many browsers
in many cases:
<img src="foo.jpg"/> may just be treated as an error
<br/> is sometimes not recognized and won't work
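
For comparison, a sketch of the earlier HTML fragments rewritten as XHTML 1.0 Strict. The page title, the alt text, and the wrapping paragraph are assumptions added so that the page satisfies the DTD.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>XHTML sketch</title>
  </head>
  <body>
    <!-- Text sits inside a block element rather than directly in body. -->
    <p>
      Here is tagged text<br/>
      <img src="foo.jpg" alt="foo"/>
    </p>
  </body>
</html>
```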
XML doesn’t look helpful



At first glance we have gained very little by
switching to XML.
The benefit will come from the structure of XML tags,
described by DTDs or schemas
We will focus on DTDs, which are not as rich
as schemas but are older and more widely
used
Document Type Definitions


DTDs provide promises about XML
documents. They outline exactly what tags
are allowed and how they are allowed to relate (do
they nest, are they required like <body>, etc.)
They provide for some more complicated
formations (‘A textbook must have at least
one author, may or may not have an editor,
but must have one and only one copyright’)
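
The textbook rule can be written as a DTD content model. The sketch below places the DTD inside the document (an internal subset) so that it is self-contained; the element names are assumptions taken from the wording of the rule.

```xml
<?xml version="1.0"?>
<!DOCTYPE textbook [
  <!-- One or more authors, an optional editor, exactly one copyright. -->
  <!ELEMENT textbook (author+, editor?, copyright)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT editor (#PCDATA)>
  <!ELEMENT copyright (#PCDATA)>
]>
<textbook>
  <author>First Author</author>
  <author>Second Author</author>
  <copyright>2003</copyright>
</textbook>
```

Here + means one or more, ? means optional, and a bare name means exactly one. Note that this content model also fixes the order: authors first, then the editor, then the copyright.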
More on DTDs



DTDs are not a complete language. A DTD cannot specify
“three authors with at least one who is also
an editor”
XML documents reference the DTD that
they claim to adhere to in the prolog at the top of
the document (the DOCTYPE declaration)
XML parsers should always enforce the DTD
and not assume the XML document is correct
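
A sketch of how a document might reference an external DTD. The file name textbook.dtd and the element names are hypothetical; the DTD is assumed to sit next to the document.

```xml
<?xml version="1.0"?>
<!-- "textbook.dtd" is an assumed file name, resolved relative to this document. -->
<!DOCTYPE textbook SYSTEM "textbook.dtd">
<textbook>
  <author>First Author</author>
  <copyright>2003</copyright>
</textbook>
```

The name after DOCTYPE must match the root element. A validating parser loads the DTD and checks the document against it; a non-validating parser only checks well-formedness.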
Well-formedness vs. validity


An XML document is well-formed if it can be
parsed as XML: every start tag has a matching
end tag, elements nest properly, and there is
a single root element
An XML document is valid if it is well-formed
and adheres to the DTD it references
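
The difference can be seen in a document that is well-formed but not valid. This is a sketch with assumed element names, using an internal DTD subset to stay self-contained.

```xml
<?xml version="1.0"?>
<!DOCTYPE textbook [
  <!ELEMENT textbook (author+, editor?, copyright)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT editor (#PCDATA)>
  <!ELEMENT copyright (#PCDATA)>
]>
<!-- Well-formed: tags match and nest, and there is a single root.
     Not valid: the DTD allows exactly one copyright, and there are two. -->
<textbook>
  <author>First Author</author>
  <copyright>2002</copyright>
  <copyright>2003</copyright>
</textbook>
```

A non-validating parser would accept this document; a validating parser should reject it.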
XML Stylesheet



So now that we have a promise of how
content must be arranged and content to go
with it, how do we publish? Via Extensible
Stylesheet Language Transformations (XSLT)
XSL files explain to the transformer how to
interpret the elements of a particular DTD
The XSLT engine then applies those rules to the
XML file
XSLT



XSL files are ugly. The syntax is itself XML
(DTDs, by contrast, use their own declaration
syntax) and is not intended for easy readability
XSL is extremely powerful. It allows ‘callbacks’
to its engine from within the XSL.
XSL files are the parts you swap out to get
different outputs
Map

We have introduced three new file types.
Here is a rough map of how they fit:
[Diagram: the .xml and .xsl files go into the XSLT Engine, which produces .html; the .dtd is labeled "Informs".]
Map

Different XSLs
[Diagram: the same .xml, informed by the .dtd, goes through the XSLT Engine; book.xsl produces .ps and web.xsl produces .html.]
Map

Different content
[Diagram: four .xml files, informed by one .dtd, go through the XSLT Engine; book.xsl produces four .ps files and web.xsl produces four .html files.]
Example XML
<?xml version="1.0"?>
<recipe name="Cookies">
  <author>Carol Schmidt</author>
  <ingredients>
    <item unit="C" measurement="2/3">butter</item>
    <item unit="C" measurement="2"> brown sugar</item>
  </ingredients>
</recipe>
Example DTD
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT recipe (author, ingredients)>
<!ATTLIST recipe
name CDATA #REQUIRED
>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA|sub_item)*>
<!ATTLIST item
measurement CDATA #REQUIRED
unit CDATA #REQUIRED
>
<!ELEMENT author (#PCDATA)>
…
Example XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="recipe">
    <html>
      <head/>
      <body>
        <p/>
        <b>Recipe Name:</b>
        <xsl:value-of select="@name"/>
        <br/>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="ingredients">
    <xsl:for-each select="item">
      <xsl:value-of select="@measurement"/>
      <xsl:value-of select="@unit"/>
      <xsl:value-of select="."/>
      <br/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template match="author">
    <i>
      <xsl:value-of select="."/>
    </i>
    <br/>
  </xsl:template>
</xsl:stylesheet>
XSL Output
Recipe Name:Cookies
Carol Schmidt
2/3Cbutter
2C brown sugar
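
To illustrate swapping stylesheets, here is a sketch of a second, hypothetical stylesheet that turns the same recipe XML into plain text instead of HTML. It is not part of the original presentation; it uses only standard XSLT 1.0 instructions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Emit plain text rather than markup. -->
  <xsl:output method="text"/>
  <xsl:template match="/recipe">
    <xsl:value-of select="@name"/>
    <xsl:text> by </xsl:text>
    <xsl:value-of select="normalize-space(author)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- One line per ingredient: measurement, unit, name. -->
    <xsl:for-each select="ingredients/item">
      <xsl:text>- </xsl:text>
      <xsl:value-of select="@measurement"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="@unit"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The recipe XML is not changed at all; only the stylesheet handed to the XSLT engine differs. Unlike the HTML stylesheet above, this one inserts explicit spaces between the measurement, the unit, and the ingredient name.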
Resources




http://www.w3.org/XML/
http://www.w3.org/Style/XSL/
http://hotwired.lycos.com/webmonkey/98/41/index1a.html
http://www.xml.com/