PPT - Faculty Personal Homepage


5. Data Description and Transformation
1. XML
2. XPath
3. XSL / XSLT
4. DTD
5. XSD
6. DOM
SWE 444 - Internet & Web Application Development
5.1
5.1 XML

What is XML?

Why XML?

Brief History and Versions

Sample XML Documents

XML Namespaces
What is XML?

XML stands for eXtensible Markup Language

A meta-language for descriptive markup: you invent your own tags

XML uses a Document Type Definition (DTD) or an XML Schema to
describe the data

XML with a DTD or XML Schema is designed to be self-descriptive

Built-in internationalization via Unicode

Built-in error-handling


A forgotten tag, or an attribute without quotes renders an XML
document unusable
Tons of support from the big IT companies
Why XML?

Much shareable data resides in computer systems and databases
in incompatible formats

These systems use conflicting hardware and/or software.

One of the most time-consuming challenges for developers has
been to exchange data between such systems over the Internet

Converting the data to XML can greatly reduce the complexity and
create data that can be read by many different applications


XML data is stored in plain text format – hardware and software
independent
XML can be used to create new languages

Allows us to define our own markup languages
Brief XML History

SGML (Standard Generalized Markup Language)
  ISO Standard, 1986, for data storage & exchange
  Metalanguage for defining languages (through DTDs)
  A famous SGML language: HTML
  Separation of content and display
  Used in U.S. government and contractors, large manufacturing
  companies, technical information publishers, ...
  SGML reference is 600 pages long
XML
  W3C recommendation in 1998
  Simple subset (80/20 rule) of SGML: “ASCII of the Web”,
  “Semantic Web”
  XML specification is 26 pages long
… Brief XML History

1986: SGML becomes a standard
1989: Tim Berners-Lee creates the WWW
1994: W3C established
1998: XML 1.0 W3C Recommendation
Jan 2000: XHTML becomes W3C Recommendation (a reformulation of HTML 4 in XML 1.0)
Feb 2004: W3C XML 1.0 (Third Edition) Recommendation
  http://www.w3.org/TR/2004/REC-xml-20040204/
Feb 2004: XML 1.1 Recommendation (updates XML to use Unicode 3)
  http://www.w3.org/TR/2004/REC-xml11-20040204/
XML and HTML

XML is not a replacement for HTML


In future Web development, XML is likely to be used to describe
data while HTML will be used to format and display the same
data (one interpretation of XML)
XML and HTML were designed with different goals

XML was designed to describe data and to focus on what data is


HTML was designed to display data and to focus on how data
looks.


XML describes only content, or “meaning”
HTML describes both structure (e.g. <p>, <h2>, <em>) and
appearance (e.g. <br>, <font>, <i>)
XML is for computers while HTML is for humans


XML is used to mark up data so it can be processed by
computers
HTML is used to mark up text so it can be displayed to users
XML does not DO anything

XML was not designed to DO anything


A piece of software must be written to do something (send, receive or
display the document)
The following example shows book information stored as XML:
<?xml version='1.0'?>
<bookstore>
<book genre='autobiography' publicationdate='1981'
ISBN='1-861003-11-0'>
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
…
</bookstore>
XML is Free and Extensible

XML tags are not predefined




You must "invent" your own tags
The tags used to mark up HTML documents and the
structure of HTML documents are predefined
The author of HTML documents can only use tags
that are defined in the HTML standard
XML allows the author to define his own tags
and his own document structure
XML Future

XML is going to be everywhere


A large number of software vendors adopted the XML standard very quickly
XML is a cross-platform, software and hardware independent tool for
transmitting information.
[Figure: XML as the common format connecting Application X with documents, a repository, configuration, and a database]
Benefits of XML

Open W3C standard – non-proprietary

Representation of data across heterogeneous environments
  Cross platform
  Allows for high degree of interoperability
    E.g., ability to exchange data between incompatible applications with
    incompatible data formats

Strict rules that make it relatively easy to write XML parsers
  Syntax
  Structure
  Case sensitive

XML can make data more useful
  Software, hardware, and application independence makes XML data available
  to more users, not only HTML browsers
Components of an XML Document

XML declaration

Processing instructions




Encoding specification (Unicode by default)
Namespace declaration
Schema declaration
Elements

Each element has a beginning and ending tag
<TAG_NAME>...</TAG_NAME>
Elements can be empty (<TAG_NAME />)


Attributes


Describes an element; e.g. data type, data range, etc.
Can only appear on beginning tag
Components of an XML Document
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="template.xsl"?>
<ROOT>
<ELEMENT1><SUBELEMENT1 /><SUBELEMENT2 /></ELEMENT1>
<ELEMENT2> </ELEMENT2>
<ELEMENT3 type='string'> </ELEMENT3>
<ELEMENT4 type='integer' value='9.3'> </ELEMENT4>
</ROOT>
Elements with Attributes
Elements
Processing Instructions
XML Declaration

The XML declaration looks like this:
<?xml version="1.0" encoding="UTF-8"
standalone="yes"?>






The XML declaration is optional in XML 1.0 (browsers and most XML
processors accept documents without it), but including it is
recommended (so include it!)
If present, the XML declaration must be first; not even
whitespace should precede it
Note that the brackets are <? and ?>
The version attribute is required
encoding can be "UTF-8" or "UTF-16" (both are Unicode encodings;
plain ASCII text is also valid UTF-8), or something else, or it
can be omitted
An XML document is standalone if it makes use of no
external markup (DTD) declarations
  Default value for this attribute is no
Processing Instructions

A PI is a command to the program processing the XML document to
handle it in a certain way

PIs (Processing Instructions) may occur anywhere in the XML document
(but usually first)

XML documents are typically processed by more than one program

Programs that do not recognize a given PI should just ignore it

General format of a PI:


<?target instructions?>
Example:

<?xml-stylesheet type="text/css" href="mySheet.css"?>
XML Elements

An XML element is everything from the element's start
tag to the element's end tag

XML Elements are extensible and they have
relationships


Related as parents and children
XML Elements have simple naming rules




Names can contain letters, numbers, and other characters
Names must not start with a number or punctuation character
Names must not start with the letters xml (or XML or Xml ..)
Names cannot contain spaces
XML Attributes

XML elements can have attributes

Data can be stored in child elements or in attributes

Should you avoid using attributes?

Here are some of the problems using attributes:






attributes cannot contain multiple values (child elements can)
attributes are not easily expandable (for future changes)
attributes cannot describe structures (child elements can)
attributes are more difficult to manipulate by program code
attribute values are not easy to test against a Document Type Definition
(DTD) - which is used to define the legal elements of an XML document
Experience shows that attributes are handy in HTML but child
elements should be used in their place in XML

Use attributes only to provide information that is not relevant to the
data
An XML Document
<?xml version='1.0'?>
<bookstore>
<book genre='autobiography' publicationdate='1981'
ISBN='1-861003-11-0'>
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
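As a rough illustration of what "a piece of software must be written to do something" means, here is a minimal Python sketch that reads the bookstore document with the standard-library ElementTree module. The function name summarize_books and the dictionary layout are assumptions made for this example, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The bookstore document from the slide, indented for readability.
BOOKSTORE_XML = """<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  <book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
    <title>The Confidence Man</title>
    <author>
      <first-name>Herman</first-name>
      <last-name>Melville</last-name>
    </author>
    <price>11.99</price>
  </book>
</bookstore>"""


def summarize_books(xml_text):
    """Return one dict per <book>, mixing attribute and child-element data."""
    root = ET.fromstring(xml_text)  # root is the <bookstore> element
    books = []
    for book in root.findall("book"):
        first = book.findtext("author/first-name")
        last = book.findtext("author/last-name")
        books.append({
            "genre": book.get("genre"),          # read from an attribute
            "title": book.findtext("title"),     # read from a child element
            "author": first + " " + last,
            "price": float(book.findtext("price")),
        })
    return books


for entry in summarize_books(BOOKSTORE_XML):
    print(entry["title"], "by", entry["author"], "costs", entry["price"])
```

Note how genre is read with get() because it is an attribute, while title and price are read from child elements.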
Another XML Document
<?xml version="1.0"?>
<weatherReport>
<date>7/14/97</date>
<city>North Place</city>, <state>NX</state>
<country>USA</country>
High Temp: <high scale="F">103</high>
Low Temp: <low scale="F">70</low>
Morning: <morning>Partly cloudy, Hazy</morning>
Afternoon: <afternoon>Sunny &amp; hot</afternoon>
Evening: <evening>Clear and Cooler</evening>
</weatherReport>
XML Validation

There is a difference between a well-formed XML
document and a valid XML document

A well-formed XML document is one with correct XML
syntax

See next slide for well-formedness rules

XML syntax is constrained by a grammar (DTD or
Schema) that governs the permitted tag names,
attachment of attributes to tags, and so on.

A well-formed XML document that also conforms to a
given DTD or schema is said to be valid.

Every valid XML document is well-formed but the reverse is not
necessarily the case
Rules For Well-Formed XML

There must be one, and only one, root element

All XML elements must have a closing tag

Sub-elements must be properly nested

Attributes are optional

Defined by an optional schema

Attribute values must be enclosed in double quotes ("...") or single quotes ('...')

Processing instructions are optional

XML is case-sensitive
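The rules above can be tried out with a non-validating parser. The sketch below uses Python's standard-library ElementTree, which checks well-formedness only; the helper name is_well_formed and the sample strings are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a well-formedness error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule.
samples = {
    "well-formed": "<note><to>Ann</to></note>",
    "missing closing tag": "<note><to>Ann</note>",
    "improper nesting": "<a><b></a></b>",
    "unquoted attribute value": "<note id=1></note>",
    "two root elements": "<a/><b/>",
    "case mismatch in tags": "<Note></note>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample is accepted; every other sample breaks exactly one of the rules listed above.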
XML DTD

A DTD defines the legal elements of an XML document
  defines the document structure with a list of legal elements

XML Schema
  XML Schema is an XML based alternative to DTD

Errors in XML documents will stop the XML program
  The W3C XML specification states that a program should not continue
  to process an XML document if it finds a well-formedness (fatal) error

Processing an XML document requires a software program called
an XML Parser (or XML Processor)
  http://www.xml.com/xml/pub/Guide/xml_parsers

There are two flavors of parsers:
  Non-validating: checks for a document’s well-formedness (e.g.,
  Browsers)
  Validating: checks for a document’s validity
Browsers Support for XML

Netscape 6 supports XML

Internet Explorer 5.0 supports the XML 1.0 standard

Internet Explorer 5.0 has the following XML support:







Viewing of XML documents
Displaying XML with CSS
Transforming and displaying XML with XSL
XML embedded in HTML as Data Islands
Binding XML data to HTML elements
Access to the XML DOM
Full support for W3C DTD standards
Viewing XML Documents

Raw XML files can be viewed in IE 5.0 (and higher) and
in Netscape 6

XML documents do not carry information about how to
display the data

To make them display like a web page, you have to add some
display information

Different solutions to the display problem, using CSS,
XSL, XML Data Islands, and JavaScript

Will you be writing your future Homepages in XML?

Most Microsoft pages are XML based and the server converts
them to HTML on-the-fly when requested
Displaying XML with CSS

With CSS (Cascading Style Sheets) you can
add display information to an XML document

Formatting XML with CSS is NOT the future of
the Web

Formatting with XSL will be the new standard
Example: the xml file
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR> </CD>
....
</CATALOG>
Example: the css file
CATALOG
{ background-color: white; width: 100%; }
CD
{ display: block; margin-bottom: 30pt; margin-left: 0; }
TITLE
{ color: red; font-size: 20pt; }
ARTIST
{ color: blue; font-size: 20pt; }
COUNTRY,PRICE,YEAR,COMPANY
{ display: block; color: black; margin-left: 20pt; }
Displaying XML with XSL

With XSL you can add display information to
your XML document

XSL is the preferred style sheet language of
XML


XSL (the eXtensible Stylesheet Language) is far
more sophisticated than CSS
One way to use XSL is to transform XML into HTML
before it is displayed by the browser
Example: the xml file
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl" ?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
…
</breakfast_menu>
Example: the xsl file
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/TR/xhtml1/strict">
<body style="font-family:Arial,helvetica,sans-serif;font-size:12pt; background-color:#EEEEEE">
<xsl:for-each select="breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold;color:white">
<xsl:value-of select="name"/></span>
- <xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<xsl:value-of select="description"/>
<span style="font-style:italic">
(<xsl:value-of select="calories"/> calories per serving)
</span>
</div>
</xsl:for-each>
</body>
</html>
View the result in IE 6
XML Embedded in HTML

XML can be embedded within HTML pages in Data Islands

Manipulated via client side script or data binding

The unofficial <xml> tag is used to embed XML data within HTML

The id attribute of the <xml> tag defines an ID for the data island, and the
src attribute points to the XML file to embed:
<html>
<body>
<xml id="note" src="note.xml"></xml>
</body>
</html>

The next step is to format and display the data in the data island by binding
it to HTML elements.
Bind Data Island to HTML Elements

Data Islands can be bound to HTML elements (like HTML tables)
<html>
<body>
<xml id="cdcat" src="cd_catalog.xml"></xml>
<table border="1" datasrc="#cdcat">
<tr>
<td> <span datafld="ARTIST"> </span> </td>
<td> <span datafld="TITLE"> </span> </td>
</tr>
</table>
</body>
</html>

An XML data island with ID “cdcat” is loaded from an external XML file

An HTML table is bound to the data Island with a datasrc attribute

The td elements are bound to the XML data with a datafld attribute inside a span.
The Microsoft XML Parser

To read and update an XML document, you need an XML parser

The Microsoft XML parser comes with Microsoft Internet Explorer
5.0

Once you have installed IE 5.0, the parser is available to scripts
inside HTML documents.

The parser features a language-neutral programming model that
supports:




JavaScript, VBScript, Perl, VB, Java, C++ and more
W3C XML 1.0 and XML DOM
DTD and validation
You can create an XML document object with the following code:

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM")
Loading an XML file into the parser

XML files can be loaded into the parser using script code.

The following code loads an XML document (note.xml)
into the XML parser:
<script type="text/javascript">
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM"); // create the parser (IE-only ActiveX object)
xmlDoc.async = false;                               // load synchronously
xmlDoc.load("note.xml");                            // load the document
// ....... processing the document goes here
</script>
The second line in the code above creates an instance of the
Microsoft XML parser
The third line turns off asynchronous loading, to make sure that the
parser will not continue execution before the document is fully
loaded
The fourth line tells the parser to load the XML document called
note.xml

We will revisit these issues later
Namespaces

XML allows you to define a new document format by combining and
reusing other formats


This can lead to name conflicts since the document formats being combined
may have the same element names that are used for different purposes
Namespaces allow authors to differentiate between tags of the same name
(using a prefix)


That is, name conflicts are solved using a prefix
Frees author to focus on the data and decide how to best describe it

The W3C namespace specification states that a namespace should be
identified by a URI (Uniform Resource Identifier)

A URI is a string of characters which identifies an Internet resource

A URL is the most common URI used to identify resources and their location on
the Internet


Another less common type of URI is URN (Uniform Resource Name)
When a URL is used in a namespace declaration, the URL does NOT have to
represent a live server

The only purpose is to give the namespace a unique name. However, very often
companies use the namespace as a pointer to a real Web page containing information
about the namespace
Namespaces: Declaration
Namespace declaration examples:
xmlns:bk="http://www.example.com/bookinfo/"
xmlns:bk="urn:mybookstuff.org:bookinfo"
In a namespace declaration such as xmlns:bk="http://www.example.com/bookinfo/",
bk is the prefix and the quoted value is the URI (here a URL)
Namespaces: Examples
<BOOK xmlns:bk="http://www.bookstuff.org/bookinfo">
<bk:TITLE>All About XML</bk:TITLE>
<bk:AUTHOR>Joe Developer</bk:AUTHOR>
<bk:PRICE currency='US Dollar'>19.99</bk:PRICE>
</BOOK>
<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
xmlns:money="urn:finance:money">
<bk:TITLE>All About XML</bk:TITLE>
<bk:AUTHOR>Joe Developer</bk:AUTHOR>
<bk:PRICE money:currency='US Dollar'>
19.99</bk:PRICE>
</bk:BOOK>
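To see what a parser makes of these prefixes, here is a small Python sketch using the standard-library ElementTree module. It parses the second example and shows that names are resolved to the namespace URI rather than to the prefix. The lookup prefix "b" is deliberately different from the document's "bk" and is an assumption of this example.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
         xmlns:money="urn:finance:money">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE money:currency='US Dollar'>19.99</bk:PRICE>
</bk:BOOK>"""

root = ET.fromstring(BOOK_XML)

# ElementTree stores qualified names as "{namespace-uri}local-name".
print(root.tag)

# Any prefix works for lookups as long as it maps to the same URI.
ns = {"b": "http://www.bookstuff.org/bookinfo"}
title = root.findtext("b:TITLE", namespaces=ns)
price = root.find("b:PRICE", ns)

# A prefixed attribute is also stored under its namespace URI.
currency = price.get("{urn:finance:money}currency")
print(title, price.text.strip(), currency)
```

The prefix is only shorthand; two documents that use different prefixes for the same URI describe the same names.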
Namespaces: Default Namespace

An XML namespace declared without a prefix
becomes the default namespace for all
sub-elements

All elements without a prefix will belong to the
default namespace:
<BOOK xmlns="http://www.bookstuff.org/bookinfo">
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
</BOOK>
Namespaces: Scope

Unqualified elements belong to the inner-most
default namespace.

BOOK, TITLE, and AUTHOR belong to the default BOOK namespace

PUBLISHER and NAME belong to the default PUBLISHER namespace
<BOOK xmlns="www.bookstuff.org/bookinfo">
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
<PUBLISHER xmlns="urn:publishers:publinfo">
<NAME>Microsoft Press</NAME>
</PUBLISHER>
</BOOK>
Namespaces: Attributes

Unqualified attributes do NOT belong to any
namespace



Even if there is a default namespace
They don’t need to since scope of attributes is only
within the element for which they are attributes
This differs from elements, which belong to the
default namespace
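The scoping rules for default namespaces and unqualified attributes can be checked with a short Python sketch (standard-library ElementTree). The edition attribute is a hypothetical addition made only to show how an unqualified attribute is treated; the rest follows the BOOK/PUBLISHER example.

```python
import xml.etree.ElementTree as ET

# BOOK/PUBLISHER example with a hypothetical unqualified attribute "edition".
SCOPED_XML = """<BOOK xmlns="www.bookstuff.org/bookinfo" edition="2">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
  <PUBLISHER xmlns="urn:publishers:publinfo">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>"""

root = ET.fromstring(SCOPED_XML)

# Each unqualified element picks up the inner-most default namespace.
qualified_tags = [element.tag for element in root.iter()]
for tag in qualified_tags:
    print(tag)

# The unqualified attribute keeps a bare name: it is in no namespace.
attribute_names = list(root.attrib)
print(attribute_names)
```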
Entities

Entities provide a mechanism for textual substitution for special
characters, e.g.
Entity     Substitution
&lt;       <
&amp;      &

XML parsers normally parse all the text in an XML document

When an XML element is parsed, the text between the XML tags is
also parsed

If you place special characters like “<“ inside an XML element, it will
generate an error because the parser interprets it as the start of a
new element

Entity references are used to avoid such errors
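A minimal Python sketch of the same idea, using the standard library: escape() from xml.sax.saxutils replaces the special characters with entity references, and the parser substitutes them back. The sample text is an assumption chosen for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "if a < b & b < c"   # contains characters that are special in XML

# Placing the raw text directly inside an element is not well-formed.
try:
    ET.fromstring("<expr>" + raw_text + "</expr>")
    raw_parses = True
except ET.ParseError:
    raw_parses = False

# escape() turns & into &amp; and < into &lt; (and > into &gt;).
escaped_text = escape(raw_text)
document = "<expr>" + escaped_text + "</expr>"

# The parser replaces the entity references with the original characters.
round_tripped = ET.fromstring(document).text

print(raw_parses)
print(escaped_text)
print(round_tripped)
```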
CDATA

By default, all text inside an XML document is parsed

You can force text to be treated as unparsed character data by enclosing it in <![CDATA[ ... ]]>

Any characters, even & and <, can occur inside a CDATA

Whitespace inside a CDATA is (usually) preserved

The only real restriction is that the character sequence ]]> cannot occur inside a CDATA

CDATA is useful when your text has a lot of illegal characters (for example, if your XML document
contains some HTML text)

Example:
<?xml version='1.0'?>
<myTag>
<![CDATA[
function matchwo(a, b) {
  if (a < b && a < 0) {
    return 1;
  } else {
    return 0;
  }
}
]]>
</myTag>
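As a quick check of how a parser treats a CDATA section, the following Python sketch (standard-library ElementTree) parses a one-line variant of the example. The shortened content is an assumption made to keep the illustration small.

```python
import xml.etree.ElementTree as ET

# The < and & characters inside the CDATA section need no escaping.
CDATA_XML = "<myTag><![CDATA[if (a < b && a < 0) return 1;]]></myTag>"

cdata_text = ET.fromstring(CDATA_XML).text
print(cdata_text)

# The same text outside a CDATA section is rejected by the parser.
try:
    ET.fromstring("<myTag>if (a < b && a < 0) return 1;</myTag>")
    plain_parses = True
except ET.ParseError:
    plain_parses = False
print(plain_parses)
```

The parser hands back the characters exactly as written; the CDATA markers themselves are not part of the text.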
5.2 XPath

What is XPath?

Sample Syntactic Elements
  Paths
  Slashes
  Brackets
  Stars

Arithmetic Expressions

Some XPath Functions
What is XPath?

XPath is a syntax used for selecting parts of an XML
document

The way XPath describes paths to elements is similar to
the way an operating system describes paths to files

XPath is almost a small programming language; it has
functions, tests, and expressions

XPath is a W3C standard


http://www.w3.org/TR/xpath
XPath is not itself written as XML, but is used heavily in
XSLT
Terminology

<library>
  <book>
    <chapter>
    </chapter>
    <chapter>
      <section>
        <paragraph/>
        <paragraph/>
      </section>
    </chapter>
  </book>
</library>

library is the parent of book; book is the
parent of the two chapters

The two chapters are the children of
book, and the section is the child of the
second chapter

The two chapters of the book are
siblings (they have the same parent)

library, book, and the second chapter
are the ancestors of the section

The two chapters, the section, and the
two paragraphs are the descendants of
the book
Paths
Operating system: / = the root directory
XPath: /library = the root element (if named library)

Operating system: /users/dave/foo = the file named foo in dave in users
XPath: /library/book/chapter/section = every section element in a chapter in every book in the library

Operating system: foo = the file named foo in the current directory
XPath: section = every section element that is a child of the current element

Operating system: . = the current directory
XPath: . = the current element

Operating system: .. = the parent directory
XPath: .. = parent of the current element

Operating system: /users/dave/* = all the files in /users/dave
XPath: /library/book/chapter/* = all the elements in /library/book/chapter
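These path expressions can be tried with Python's standard-library ElementTree, which implements a limited subset of XPath. One assumption to keep in mind: ElementTree evaluates paths relative to an element, so "./book" evaluated from the library root element plays the role of /library/book.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<library>
  <book>
    <chapter/>
    <chapter>
      <section>
        <paragraph/>
        <paragraph/>
      </section>
    </chapter>
  </book>
</library>"""

root = ET.fromstring(LIBRARY_XML)   # root is the <library> element

# Counterpart of /library/book/chapter/section
sections = root.findall("./book/chapter/section")

# Counterpart of /library/book/chapter/* (all child elements of the chapters)
chapter_children = root.findall("./book/chapter/*")

# Counterpart of //paragraph (paragraphs at any depth)
all_paragraphs = root.findall(".//paragraph")

# ".." steps up to the parent of the selected element
parent_of_section = root.find(".//section/..")

print(len(sections), [e.tag for e in chapter_children])
print(len(all_paragraphs), parent_of_section.tag)
```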
Slashes

A path that begins with a / represents an absolute path, starting
from the top of the document
  Example: /email/message/header/from
  Note that even an absolute path can select more than one element
  A slash by itself means “the whole document”

A path that does not begin with a / represents a path starting from
the current element
  Example: header/from

A path that begins with // can start from anywhere in the
document
  Example: //header/from selects every from element that is a child
  of a header element
  This can be expensive, since it involves searching the entire
  document
Brackets and last()

A number in brackets selects a particular matching child
  Example: /library/book[1] selects the first book of the library
  Example: //chapter/section[2] selects the second section of
  every chapter in the XML document
  Example: //book/chapter[1]/section[2]
  Only matching elements are counted; for example, if a book
  has both sections and exercises, the latter are ignored when
  counting sections

The function last() in brackets selects the last matching
child
  Example: /library/book/chapter[last()]

You can even do simple arithmetic
  Example: /library/book/chapter[last()-1]
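Positional predicates can be tried out in Python with the standard-library ElementTree module, which supports [n], [last()], and [last()-n]. The id attributes and the exercise element below are hypothetical additions that make the selected elements easy to tell apart; paths are written relative to the library root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: ids added only to identify the matches.
LIBRARY_XML = """<library>
  <book>
    <chapter id="c1"/>
    <exercise id="e1"/>
    <chapter id="c2"/>
    <chapter id="c3"/>
  </book>
</library>"""

root = ET.fromstring(LIBRARY_XML)

first_chapter = root.find("./book/chapter[1]")
second_chapter = root.find("./book/chapter[2]")          # the exercise is not counted
last_chapter = root.find("./book/chapter[last()]")
next_to_last = root.find("./book/chapter[last()-1]")

print(first_chapter.get("id"), second_chapter.get("id"))
print(last_chapter.get("id"), next_to_last.get("id"))
```

Positions start at 1, and only elements matching the name before the bracket are counted.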
Stars

A star, or asterisk, is a “wild card”--it means “all
the elements at this level”




Example: /library/book/chapter/* selects every
child of every chapter of every book in the library
Example: //book/* selects every child of every book
(chapters, tableOfContents, index, etc.)
Example: /*/*/*/paragraph selects every paragraph
that has exactly three ancestors
Example: //* selects every element in the entire
document
Attributes I

You can select attributes by themselves, or elements
that have certain attributes





Remember: an attribute consists of a name-value pair, for
example in <chapter num="5">, the attribute is named num

To choose the attribute itself, prefix the name with @
  Example: @num will choose the num attribute of the current
  element; //@num will choose every attribute named num
  Example: //@* will choose every attribute, everywhere in the
  document

To choose elements that have a given attribute, put the
attribute name in square brackets
  Example: //chapter[@num] will select every chapter element
  (anywhere in the document) that has an attribute named num
Attributes II

//chapter[@num] selects every chapter
element with an attribute num

//chapter[not(@num)] selects every chapter
element that does not have a num attribute

//chapter[@*] selects every chapter element
that has any attribute

//chapter[not(@*)] selects every chapter
element with no attributes
Values of attributes

//chapter[@num='3'] selects every chapter element with
an attribute num with value 3

The normalize-space() function can be used to remove
leading and trailing spaces from a value (it also collapses
internal runs of whitespace) before comparison
  Example: //chapter[normalize-space(@num)="3"]
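The attribute tests can be sketched in Python with the standard-library ElementTree module. It supports [@num] and [@num='3'] directly; it does not implement not() or normalize-space(), so those two cases are emulated with ordinary Python below. The sample document is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one value padded with spaces, one chapter without num.
BOOK_XML = """<book>
  <chapter num="1"/>
  <chapter num=" 3 "/>
  <chapter num="3"/>
  <chapter/>
</book>"""

root = ET.fromstring(BOOK_XML)

# //chapter[@num]: chapters that have a num attribute
with_num = root.findall(".//chapter[@num]")

# //chapter[@num='3']: exact string comparison, so " 3 " does not match
exactly_three = root.findall(".//chapter[@num='3']")

# //chapter[not(@num)], emulated in Python
without_num = [c for c in root.iter("chapter") if "num" not in c.attrib]

# //chapter[normalize-space(@num)="3"], emulated by trimming and collapsing spaces
normalized_three = [
    c for c in root.iter("chapter")
    if " ".join(c.get("num", "").split()) == "3"
]

print(len(with_num), len(exactly_three), len(without_num), len(normalized_three))
```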
Arithmetic Expressions

+     add
-     subtract
*     multiply
div   divide (not /)
mod   modulo (remainder)
Equality Tests

=    “equals” (Notice it’s not ==)
!=   “not equals”

But it’s not that simple!
  value = node-set will be true if the node-set contains any
  node with a value that matches value
  value != node-set will be true if the node-set contains any
  node with a value that does not match value

Hence, value = node-set and value != node-set may both be
true at the same time!
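The "any node" rule can be modelled in a few lines of Python. This is only an emulation of the comparison semantics described above, not an XPath engine; the function names and the sample values are assumptions of this sketch.

```python
def node_set_equals(value, node_values):
    """Model of 'value = node-set': true if ANY node value matches."""
    return any(node_value == value for node_value in node_values)


def node_set_not_equals(value, node_values):
    """Model of 'value != node-set': true if ANY node value differs."""
    return any(node_value != value for node_value in node_values)


prices = ["8.99", "11.99"]   # string values of a hypothetical //price node-set

both_true = node_set_equals("8.99", prices) and node_set_not_equals("8.99", prices)
print(both_true)

# With an empty node-set, neither comparison can succeed.
print(node_set_equals("8.99", []), node_set_not_equals("8.99", []))
```

This is why, in XPath, not(value = node-set) is usually the safer way to say "no node has this value".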
Other Boolean Operators

and     (infix operator)
or      (infix operator)
  Example: count = 0 or count = 1
not()   (function)

The following are used for numerical comparisons only:
  <     “less than”
  <=    “less than or equal to”
  >     “greater than”
  >=    “greater than or equal to”
Some XPath Functions

XPath contains a number of functions on node sets,
numbers, and strings; here are a few of them:

count(elem) counts the number of selected elements
  Example: //chapter[count(section)=1] selects chapters with
  exactly one section child

name() returns the name of the element
  Example: //*[name()='section'] is the same as //section

starts-with(arg1, arg2) tests if arg1 starts with arg2
  Example: //*[starts-with(name(), 'sec')]

contains(arg1, arg2) tests if arg1 contains arg2
  Example: //*[contains(name(), 'ect')]

Examples
  http://www.zvon.org/xxl/XPathTutorial/General/examples.html
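Python's standard-library ElementTree does not implement these XPath functions, so the sketch below emulates what the four example expressions select using plain Python. The sample document, including the sidebar element, is an assumption made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
LIBRARY_XML = """<library><book>
  <chapter><section/></chapter>
  <chapter><section/><section/><sidebar/></chapter>
</book></library>"""

root = ET.fromstring(LIBRARY_XML)

# //chapter[count(section)=1]: chapters with exactly one section child
one_section = [c for c in root.iter("chapter") if len(c.findall("section")) == 1]

# //*[name()='section'], which is the same as //section
named_section = [e for e in root.iter() if e.tag == "section"]

# //*[starts-with(name(), 'sec')]
starts_with_sec = [e.tag for e in root.iter() if e.tag.startswith("sec")]

# //*[contains(name(), 'ect')]
contains_ect = [e.tag for e in root.iter() if "ect" in e.tag]

print(len(one_section), len(named_section))
print(starts_with_sec, contains_ect)
```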
5.3 XSL / XSLT

What is XSL?

Some XSLT Constructs







xsl:value-of
xsl:for-each
xsl:if
xsl:choose
xsl:sort
xsl:text
xsl:attribute

Templates

XSL on the Client

XSL on the Server
What is XSL?

XSL stands for eXtensible Stylesheet Language


a standard recommended by the W3C
http://www.w3.org/TR/xsl/

CSS was designed for styling HTML pages, and can be used to style XML
pages

XSL was designed specifically to style XML pages, and is much more
sophisticated than CSS

XSL consists of three languages:



XSLT (XSL Transformations) is a language used to transform XML documents
into other kinds of documents (most commonly HTML, so they can be
displayed)
XPath is a language to select parts of an XML document to transform with
XSLT
XSL-FO (XSL Formatting Objects) is a replacement for CSS

The future of XSL-FO as a standard is uncertain, because much of its functionality
overlaps with that provided by cascading style sheets (CSS) and the HTML tag set
How does it work?

The XML source document is parsed into an XML source
tree

You use XPath to define templates that match parts of the
source tree

You use XSLT to transform the matched part and put the
transformed information into the result tree

The result tree is output as a result document

Parts of the source document that are not matched by a
template are handled by XSLT's built-in rules, which typically
copy their text content through unchanged
Simple XPath

Here’s a simple XML document:

<?xml version="1.0"?>
<library>
<book>
<title>XML</title>
<author>Gregory Brill</author>
</book>
<book>
<title>Java and XML</title>
<author>Brett Scott</author>
</book>
</library >
XPath expressions look a lot like paths in a computer file system
  / means the document itself (but no specific elements)
  /library selects the root element
  /library/book selects every book element
  //author selects every author element, wherever it occurs
Simple XSLT

<xsl:for-each select="//book"> loops through every
book element, everywhere in the document

<xsl:value-of select="title"/> chooses the content
of the title element at the current location

<xsl:for-each select="//book">
<xsl:value-of select="title"/>
</xsl:for-each>
chooses the content of the title element for each book in
the XML document
Using XSL to Create HTML

Our goal is to turn this:
<?xml version="1.0"?>
<library>
<book>
<title>XML</title>
<author>Gregory Brill</author>
</book>
<book>
<title>Java and XML</title>
<author>Brett Scott</author>
</book>
</library >

Into HTML that displays something like this:

Book Titles:
• XML
• Java and XML
Book Authors:
• Gregory Brill
• Brett Scott

Note that we’ve grouped titles and authors separately
What we need to do

We need to save our XML into a file (let’s call it books.xml)

We need to create a file (say, books.xsl) that describes how to select elements from books.xml and embed them into an HTML page
  We do this by intermixing the HTML and the XSL in the books.xsl file

We need to add a line to our books.xml file to tell it to refer to books.xsl for formatting information
books.xml, revised
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett Scott</author>
  </book>
</library>

The second line (the xml-stylesheet processing instruction) tells you where to find the XSL file
Desired HTML
<html>
  <head>
    <title>Book Titles and Authors</title>
  </head>
  <body>
    <h2>Book titles:</h2>
    <ul>
      <li>XML</li>
      <li>Java and XML</li>
    </ul>
    <h2>Book authors:</h2>
    <ul>
      <li>Gregory Brill</li>
      <li>Brett Scott</li>
    </ul>
  </body>
</html>

On the slide, red text is data extracted from the XML document (the contents of the <li> elements) and blue text is our HTML template

We don’t necessarily know how much data we will have
XSL Outline
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html> ... </html>
</xsl:template>
</xsl:stylesheet>
Selecting Titles and Authors

<h2>Book titles:</h2>
<ul>
<xsl:for-each select="//book">
<li>
<xsl:value-of select="title"/>
</li>
</xsl:for-each>
</ul>
<h2>Book authors:</h2>
...same thing, replacing title with author

Notice the xsl:for-each loop

Notice that XSL can rearrange the data; the HTML result can present information in a different order than the XML
All of books.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl"
href="books.xsl"?>
<library>
<book>
<title>XML</title>
<author>Gregory Brill</author>
</book>
<book>
<title>Java and XML</title>
<author>Brett Scott</author>
</book>
</library >
Note: if you do View Source, this is
what you will see, not the resultant
HTML
All of books.xsl
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Book Titles and Authors</title>
      </head>
      <body>
        <h2>Book titles:</h2>
        <ul>
          <xsl:for-each select="//book">
            <li>
              <xsl:value-of select="title"/>
            </li>
          </xsl:for-each>
        </ul>
        <h2>Book authors:</h2>
        <ul>
          <xsl:for-each select="//book">
            <li>
              <xsl:value-of select="author"/>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
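
To see what this stylesheet computes, its two xsl:for-each loops can be mimicked procedurally. The sketch below is plain Python (standard library only), not an XSLT processor, and the function name build_html is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOKS_XML = """<library>
  <book><title>XML</title><author>Gregory Brill</author></book>
  <book><title>Java and XML</title><author>Brett Scott</author></book>
</library>"""


def build_html(xml_text):
    """Mimic books.xsl: list every title, then every author."""
    root = ET.fromstring(xml_text)
    books = root.findall(".//book")  # plays the role of select="//book"
    lines = [
        "<html>",
        "<head><title>Book Titles and Authors</title></head>",
        "<body>",
    ]
    for heading, field in (("Book titles:", "title"), ("Book authors:", "author")):
        lines.append(f"<h2>{heading}</h2>")
        lines.append("<ul>")
        for book in books:  # one pass per xsl:for-each
            value = book.findtext(field, "")  # like xsl:value-of select="title"
            lines.append(f"<li>{escape(value)}</li>")
        lines.append("</ul>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


html_page = build_html(BOOKS_XML)
print(html_page)
```

As in the stylesheet, the output groups all titles before all authors, even though the XML stores them book by book.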
How to use it

In a modern browser, such as Netscape 6, Internet Explorer 6, or Mozilla 1.0, you can just open the XML file
  Older browsers will ignore the XSL and just show you the XML contents as continuous text

You can use a program such as Xalan, MSXML, or Saxon to create the HTML as a file
  This can be done on the server side, so that all the client side browser sees is plain HTML
  The server can create the HTML dynamically from the information currently in XML
[Figure: The result (in IE)]
XSLT

XSLT stands for eXtensible Stylesheet
Language Transformations

XSLT is used to transform XML documents into
other kinds of documents--usually, but not
necessarily, XHTML

XSLT uses two input files:

The XML document containing the actual data
 The XSL document containing both the “framework”
in which to insert the data, and XSLT commands to
do so
[Figure: Understanding the XSLT Process]
[Figure: The XSLT Processor]
The .xsl file

An XSLT document has the .xsl extension

The XSLT document begins with:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Contains one or more templates, such as:
<xsl:template match="/"> ... </xsl:template>

And ends with:
</xsl:stylesheet>

The template <xsl:template match="/"> says select the entire file
  You can think of this as selecting the root node of the XML tree
Where XSLT can be used

A server can use XSLT to change XML files into HTML files before sending them to the client

A modern browser can use XSLT to change XML into HTML on the client side
  This is what we will mostly be doing here

Most users seldom update their browsers
  If you want “everyone” to see your pages, do any XSL processing on the server side
  Otherwise, think about what best fits your situation
xsl:value-of

<xsl:value-of select="XPath expression"/>
selects the contents of an element and adds it to
the output stream


The select attribute is required
Notice that xsl:value-of is not a container tag,
hence it needs to end with a slash
xsl:for-each

xsl:for-each is a kind of loop statement

The syntax is
<xsl:for-each select="XPath expression">
Text to insert and rules to apply
</xsl:for-each>

Example: to select every book (//book) and make an
unordered list (<ul>) of their titles (title), use:
<ul>
<xsl:for-each select="//book">
<li> <xsl:value-of select="title"/> </li>
</xsl:for-each>
</ul>
Filtering Output

You can filter (restrict) output by adding a
criterion to the select attribute’s value:
<ul>
<xsl:for-each select="//book">
<li>
<xsl:value-of
select="title[../author='Brett Scott']"/>
</li>
</xsl:for-each>
</ul>

This will select book titles by Brett Scott
Filter Details

Here is the filter we just used:
<xsl:value-of
select="title[../author='Brett Scott']"/>

author is a sibling of title, so from title we have
to go up to its parent, book, then back down to author

This filter requires a quote within a quote, so we need
both single quotes and double quotes

Legal filter operators are:
=
!=
&lt; (less than) and &lt;=
&gt; (greater than) and &gt;=

Inside the select attribute, < must be written as &lt;

Numbers do not need to be quoted; string values do
But it doesn’t work right!

Here’s what we did:
<xsl:for-each select="//book">
<li>
<xsl:value-of
select="title[../author='Brett Scott']"/>
</li>
</xsl:for-each>

This will output <li> and </li> for every book, so we will
get empty bullets for authors other than Brett Scott

There is no obvious way to solve this with just
xsl:value-of
xsl:if

xsl:if allows us to include content if a given
condition (in the test attribute) is true

Example:
<xsl:for-each select="//book">
<xsl:if test="author='Brett Scott'">
<li>
<xsl:value-of select="title"/>
</li>
</xsl:if>
</xsl:for-each>

This does work correctly!
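
The difference between filtering the value and filtering the loop can be reproduced outside XSLT. This is a rough sketch with Python's xml.etree.ElementTree (standard library), whose limited XPath subset does support the [child='text'] predicate; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<library>
  <book><title>XML</title><author>Gregory Brill</author></book>
  <book><title>Java and XML</title><author>Brett Scott</author></book>
</library>"""

root = ET.fromstring(BOOKS_XML)

# Filtering the value only (like title[../author='Brett Scott'] inside the loop):
# every book still produces an item, so non-matching books give empty items.
per_book_items = [
    book.findtext("title") if book.findtext("author") == "Brett Scott" else ""
    for book in root.findall("./book")
]

# Filtering the books themselves (the effect of xsl:if around the <li>):
# only matching books produce an item.
scott_titles = [
    book.findtext("title")
    for book in root.findall("./book[author='Brett Scott']")
]

print(per_book_items, scott_titles)
```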
xsl:choose

The xsl:choose ... xsl:when ... xsl:otherwise construct is XSLT’s equivalent of Java’s switch ... case ... default statement

The syntax is:
<xsl:choose>
  <xsl:when test="some condition">
    ... some code ...
  </xsl:when>
  <xsl:otherwise>
    ... some code ...
  </xsl:otherwise>
</xsl:choose>

xsl:choose is often used within an xsl:for-each loop
xsl:sort

You can place an xsl:sort inside an xsl:for-each

The attribute of the sort tells what field to sort on

Example:
<ul>
<xsl:for-each select="//book">
<xsl:sort select="author"/>
<li> <xsl:value-of select="title"/> by
<xsl:value-of select="author"/>
</li>
</xsl:for-each>
</ul>

This example creates a list of titles and authors, sorted by
author
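
The effect of xsl:sort on the loop can be sketched in Python as sorting the selected books before visiting them. This is only an analogy using the standard library: Python's sorted compares code points, whereas an XSLT processor may apply language-specific ordering.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<library>
  <book><title>XML</title><author>Gregory Brill</author></book>
  <book><title>Java and XML</title><author>Brett Scott</author></book>
</library>"""

root = ET.fromstring(BOOKS_XML)

# Select every book, then order the selection by author before looping over it
books_by_author = sorted(root.findall(".//book"),
                         key=lambda book: book.findtext("author", ""))

items = [f"{book.findtext('title')} by {book.findtext('author')}"
         for book in books_by_author]
print(items)
```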
xsl:text

Used inside templates to indicate that its contents should be output as text
  Its contents are pure text, not elements, and white space is not collapsed

<xsl:text>...</xsl:text> helps deal with two common problems:

XSL isn’t very careful with whitespace in the document
  This doesn’t matter much for HTML, which collapses all whitespace anyway
  <xsl:text> gives you much better control over whitespace; it acts like the <pre> element in HTML

Since XML defines only five entities, you cannot readily put other entities (such as &nbsp;) in your XSL
  These are &amp; (&), &lt; (<), &gt; (>), &quot; ("), &apos; (')
  Others can be inserted using their decimal or hexadecimal number forms (for example, &#160; or &#xA0; for a non-breaking space)
  You may use the following secret formula for entities:
  <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
    A “yes” value means special characters like “<” should be output as is. “no” indicates that “<” should be output as “&lt;”. Default is “no”
Creating Tags from XML Data

Suppose the XML contains
<name>Dr. Scott's Home Page</name>
<url>http://www.kfupm.edu/~scott</url>

And you want to turn this into
<a href="http://www.kfupm.edu/~scott">
Dr. Scott's Home Page</a>

We need additional tools to do this
  It doesn’t even help if the XML directly contains
  <a href="http://www.kfupm.edu/~scott">Dr. Scott's Home Page</a> -- we still can’t move it to the output
  The same problem occurs with images in the XML

A reason for the above is that attribute values may not contain the reserved character <, so an element such as <xsl:value-of> cannot be written inside href="..."
Creating Tags - solution 1

Suppose the XML contains
<name>Dr. Scott's Home Page</name>
<url>http://www.kfupm.edu/~scott</url>

<xsl:attribute name="..."> adds the named attribute to the
enclosing tag

The value of the attribute is the content of this tag

Example:
<a>
  <xsl:attribute name="href">
    <xsl:value-of select="url"/>
  </xsl:attribute>
  <xsl:value-of select="name"/>
</a>

Result:
<a href="http://www.kfupm.edu/~scott">
Dr. Scott's Home Page</a>
Creating Tags - solution 2

Suppose the XML contains
<name>Dr. Scott's Home Page</name>
<url>http://www.kfupm.edu/~scott</url>

An attribute value template (AVT) consists of braces { } inside the
attribute value

The content of the braces is replaced by its value

Example:


<a href="{url}">
<xsl:value-of select="name"/>
</a>
Result:
<a href="http://www.kfupm.edu/~scott">
Dr. Scott's Home Page</a>
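
Both solutions build an element whose attribute value comes from one piece of XML data and whose content comes from another. The same construction can be sketched with Python's xml.etree.ElementTree; the <page> wrapper element is an assumption, since the slides show only the name and url elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the two elements shown on the slide
SOURCE_XML = """<page>
  <name>Dr. Scott's Home Page</name>
  <url>http://www.kfupm.edu/~scott</url>
</page>"""

page = ET.fromstring(SOURCE_XML)

# The attribute value is taken from <url>, like href="{url}"
link = ET.Element("a", {"href": page.findtext("url")})
# The element content is taken from <name>, like <xsl:value-of select="name"/>
link.text = page.findtext("name")

link_html = ET.tostring(link, encoding="unicode")
print(link_html)
```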
Modularization

Modularization, breaking up a complex program into simpler parts, is an important programming tool
  In programming languages modularization is often done with functions or methods
  In XSL we can do something similar with xsl:apply-templates

For example, suppose we have a DTD for book with parts titlePage, tableOfContents, chapter, and index
  We can create separate templates for each of these parts

Template rules are used to control what output is created from what input
…Modularization

A template rule is represented by an <xsl:template>
element

The <xsl:template> element has
  A match attribute that contains an XPath pattern identifying the input it matches
  A template that is instantiated and output when the pattern is matched

Template skeleton:
<xsl:template match="person">
  A Person
</xsl:template>

The above says that every time a <person> element is seen, the stylesheet processor should emit the text “A Person”
Book example

<xsl:template match="/">
<html> <body>
<xsl:apply-templates/>
</body> </html>
</xsl:template>

<xsl:template match="tableOfContents">
<h1>Table of Contents</h1>
<xsl:apply-templates select="chapterNumber"/>
<xsl:apply-templates select="chapterName"/>
<xsl:apply-templates select="pageNumber"/>
</xsl:template>

Etc.
xsl:apply-templates

The <xsl:apply-templates> element
applies a template rule to the current element or
to the current element’s child nodes

If we add a select attribute, it applies the
template rule only to the child that matches

If we have multiple <xsl:apply-templates>
elements with select attributes, the child
nodes are processed in the same order as the
<xsl:apply-templates> elements
When templates are ignored

Templates aren’t used unless they are applied
  Exception: Processing always starts at the root node, so the template with match="/" is applied automatically
  If it didn’t, nothing would ever happen

If your templates are ignored, you probably forgot to apply them

If you apply a template to an element that has child elements, templates are not automatically applied to those child elements
Applying templates to children


<book>
  <title>XML</title>
  <author>Gregory Brill</author>
</book>

<xsl:template match="/">
  <html> <head></head> <body>
    <b><xsl:value-of select="/book/title"/></b>
    <xsl:apply-templates select="/book/author"/>
  </body> </html>
</xsl:template>

<xsl:template match="/book/author">
  by <i><xsl:value-of select="."/></i>
</xsl:template>

With the <xsl:apply-templates select="/book/author"/> line: XML by Gregory Brill
Without this line: XML
Built-in Templates

XSLT has a couple of built in templates, which say:
 when you apply templates to an element, process its child elements
 when you apply templates to a text node, give its value

Together, it means that if you apply templates to an element but don't have
an explicit template for that element, then its content gets processed and
eventually you end up with the text that the element contains.

Here are the built-in template rules for each of the seven XPath node types:
Node type     Built-in rule
Elements      Apply templates to children
Text          Copy text to the result tree
Comments      Do nothing
PIs           Do nothing
Attributes    Copy the value of the attribute to the result tree
Name spaces   Do nothing
Root          Apply templates to children
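
How explicit templates and the built-in rules work together can be mimicked with a small dispatcher. The sketch below is plain Python, not XSLT; TEMPLATES, author_template and apply_templates are hypothetical names, and only element and text nodes are handled.

```python
import xml.etree.ElementTree as ET

BOOK_XML = "<book><title>XML</title><author>Gregory Brill</author></book>"


def author_template(element):
    # Explicit rule, playing the role of <xsl:template match="author">
    return f" by <i>{element.text}</i>"


# Explicit templates, keyed by the element name they match
TEMPLATES = {"author": author_template}


def apply_templates(element):
    """Use the explicit template if one matches, else the built-in rules."""
    if element.tag in TEMPLATES:
        return TEMPLATES[element.tag](element)
    # Built-in rules: copy text nodes, apply templates to child elements
    output = element.text or ""
    for child in element:
        output += apply_templates(child) + (child.tail or "")
    return output


result = apply_templates(ET.fromstring(BOOK_XML))
print(result)
```

There is no template for book or title here, so the built-in rules descend through them and copy the title text, while author is handled by its explicit template.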
XSL - On the Client

If your browser supports XML, XSL can be used to transform the document to XHTML in your browser

A JavaScript Solution
  Even if this works fine, it is not always desirable to include a style sheet reference in an XML file (i.e. it will not work in a non XSL aware browser.)
  A more versatile solution would be to use a JavaScript to do the XML to XHTML transformation

By using JavaScript, we can:
  do browser-specific testing
  use different style sheets according to browser and user needs

XSL transformation on the client side is bound to be a major part of the browsers’ work tasks in the future, as we will see a growth in the specialized browser market (Braille, aural browsers, Web printers, handheld devices, etc.)
Transforming XML to XHTML in Your Browser
<html>
<body>
<script type="text/javascript">
// Note: ActiveXObject is available only in Internet Explorer
// Load XML
var xml = new ActiveXObject("Microsoft.XMLDOM");
xml.async = false;
xml.load("books.xml");
// Load XSL
var xsl = new ActiveXObject("Microsoft.XMLDOM");
xsl.async = false;
xsl.load("books.xsl");
// Transform the XML with the stylesheet and write the result into the page
document.write(xml.transformNode(xsl));
</script>
</body>
</html>
XSL - On the Server

Since not all browsers support XML and XSL, one
solution is to transform the XML to XHTML on the server

To make XML data available to all kinds of browsers, we
have to transform the XML document on the SERVER
and send it as pure XHTML to the BROWSER

That's another beauty of XSL! One of the design goals
for XSL was to make it possible to transform data from
one format to another on a server, returning readable
data to all kinds of future browsers
Thoughts on XSL

XSL is a programming language--and not a particularly
simple one

Expect to spend considerable time debugging your XSL

These slides have been an introduction to XSL and
XSLT--there’s a lot more of it we haven’t covered

As with any programming, it’s a good idea to start simple and build it up incrementally: “Write a little, test a little”
  This is especially a good idea for XSLT, because you don’t get a lot of feedback about what went wrong

Try jEdit with the XML plugin
  write (or change) a line or two, check for syntax errors, then jump to IE and reload the XML file
5.4 Document Type Definitions (DTDs)

What are DTDs?

Why DTDs?

DTD Syntactic Elements
  ELEMENT
  ATTRIBUTE
  ENTITY

Types

Examples

Validation
What are DTDs?

Document Type Definition (DTD) is a grammar that describes the structure of a class of XML documents

The structure of the documents is described via element and attribute-list declarations.
  Element declarations name the allowable set of elements within the document, and specify whether and how declared elements and runs of character data may be contained within each element.
  Attribute-list declarations name the allowable set of attributes for each declared element, including the type of each attribute value, if not an explicit set of valid value(s).

DTDs are written in EBNF-like notation
Why DTDs?

XML documents are designed to be processed by computer programs
  If you can put just any tags in an XML document, it’s very hard to write a program that knows how to process the tags
  A DTD specifies what tags may occur, when they may occur, and what attributes they may (or must) have

A DTD allows the XML document to be verified (shown to be legal)

A DTD that is shared across groups allows the groups to produce consistent XML documents
Parsers

An XML parser is a program or library that reads the content of an XML document and makes it available through an API
  Currently popular APIs are DOM (Document Object Model) and SAX (Simple API for XML)

A validating parser is an XML parser that compares the XML document to a DTD and reports any errors
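
The two API styles can be compared side by side. This is a minimal sketch using Python's standard library (xml.dom.minidom for DOM, xml.sax for SAX); the sample document and the ParagraphCounter class are made up for illustration, and neither of these two parsers validates against a DTD.

```python
import xml.sax
from xml.dom import minidom

NOVEL_XML = """<novel>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>"""

# DOM: the whole document is loaded into a tree that can be queried afterwards
dom = minidom.parseString(NOVEL_XML)
dom_count = len(dom.getElementsByTagName("paragraph"))


# SAX: the parser reports events (start tag, end tag, text) while reading
class ParagraphCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "paragraph":
            self.count += 1


counter = ParagraphCounter()
xml.sax.parseString(NOVEL_XML.encode("utf-8"), counter)
sax_count = counter.count

print(dom_count, sax_count)
```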
An XML example

<novel>
<foreword>
<paragraph>This is a great novel.
</paragraph>
</foreword>
<chapter number="1">
<paragraph>It was a dark and stormy
night.</paragraph>
<paragraph>Suddenly, a shot rang
out!</paragraph>
</chapter>
</novel>

An XML document contains (and the DTD describes):
  Elements, such as novel and paragraph, consisting of tags and content
  Attributes, such as number="1", consisting of a name and a value
  Entities (not used in this example)
A DTD example

<!DOCTYPE novel [
<!ELEMENT novel (foreword, chapter+)>
<!ELEMENT foreword (paragraph+)>
<!ELEMENT chapter (paragraph+)>
<!ELEMENT paragraph (#PCDATA)>
<!ATTLIST chapter number CDATA #REQUIRED>
]>

A novel consists of a foreword and one or more chapters, in that order

Each chapter must have a number attribute

A foreword consists of one or more paragraphs

A chapter also consists of one or more paragraphs

A paragraph consists of parsed character data (text that cannot contain any other
elements)

PCDATA is text that will be parsed by a parser. Tags inside the text will be treated
as markup and entities will be expanded.

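CDATA, as the attribute type used here, means that the attribute value is plain character data. It should not be confused with a CDATA section (<![CDATA[ ... ]]>), which is text that will NOT be parsed by a parser: tags inside a CDATA section will NOT be treated as markup and entities will not be expanded.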
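
A DTD content model is essentially a regular expression over child element names. As a rough illustration only (this is not a validating parser, and it ignores character data), the novel DTD can be translated by hand into Python regular expressions; CONTENT_MODELS, REQUIRED_ATTRIBUTES and is_valid are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, hand-translated into regular expressions over
# the sequence of child element names (each name followed by one space)
CONTENT_MODELS = {
    "novel": r"foreword (chapter )+",  # (foreword, chapter+)
    "foreword": r"(paragraph )+",      # (paragraph+)
    "chapter": r"(paragraph )+",       # (paragraph+)
    "paragraph": r"",                  # (#PCDATA): no child elements
}
# <!ATTLIST chapter number CDATA #REQUIRED>
REQUIRED_ATTRIBUTES = {"chapter": ["number"]}


def is_valid(element):
    """Rough structural check of one element and, recursively, its children."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # undeclared element
    child_names = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, child_names) is None:
        return False  # children do not fit the content model
    for name in REQUIRED_ATTRIBUTES.get(element.tag, []):
        if name not in element.attrib:
            return False  # a #REQUIRED attribute is missing
    return all(is_valid(child) for child in element)


GOOD = ('<novel><foreword><paragraph>Great.</paragraph></foreword>'
        '<chapter number="1"><paragraph>Dark.</paragraph></chapter></novel>')
NO_FOREWORD = ('<novel><chapter number="1">'
               '<paragraph>Dark.</paragraph></chapter></novel>')

good_ok = is_valid(ET.fromstring(GOOD))
no_foreword_ok = is_valid(ET.fromstring(NO_FOREWORD))
print(good_ok, no_foreword_ok)
```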
ELEMENT descriptions

Suffixes:
?     optional          foreword?
+     one or more       chapter+
*     zero or more      appendix*

Separators:
,     both, in order    foreword?, chapter+
|     or                section|chapter

Grouping:
( )   grouping          (section|chapter)+
Elements without children

The syntax is <!ELEMENT name category>
  The name is the element name used in start and end tags
  The category may be EMPTY:
    In the DTD: <!ELEMENT br EMPTY>
    In the XML: <br></br> or just <br />

In the XML, an empty element may not have any content between the start tag and the end tag

An empty element may (and usually does) have attributes
Elements with unstructured children

The syntax is <!ELEMENT name category>

The category may be ANY
  This indicates that any content -- character data and any declared elements, in any order -- may be used
  Since the whole point of using a DTD is to define the structure of a document, ANY should be avoided wherever possible

The category may be (#PCDATA), indicating that only character data may be used
  In the DTD: <!ELEMENT paragraph (#PCDATA)>
  In the XML: <paragraph>A shot rang out!</paragraph>
  The parentheses are required!
  Note: In (#PCDATA), whitespace is kept exactly as entered
  Elements may not be used within parsed character data
  Entities are character data, and may be used
Elements with children

A category may describe one or more children:
<!ELEMENT novel (foreword, chapter+)>
  Parentheses are required, even if there is only one child
  A space must precede the opening parenthesis
  Commas (,) between elements mean that all children must appear, and must be in the order specified
  “|” separators mean any one child may be used
  All child elements must themselves be declared
  Children may have children

Parentheses can be used for grouping:
<!ELEMENT novel (foreword, (chapter+|section+))>
Elements with mixed content

#PCDATA describes elements with only
character data

#PCDATA can be used in an “or” grouping:
<!ELEMENT note (#PCDATA|message)*>
  This is called mixed content
  Certain (rather severe) restrictions apply:
    #PCDATA must be first
    The separators must be “|”
    The group must be starred (meaning zero or more)
Names and namespaces

All names of elements, attributes, and entities, in both the DTD and the XML, are formed as follows:
  The name must begin with a letter or underscore
  The name may contain only letters, digits, dots, hyphens, underscores, and colons

The DTD doesn’t know about namespaces -- as far as it knows, a colon is just part of a name
  The following are different (and both legal):
  <!ELEMENT chapter (paragraph+)>
  <!ELEMENT myBook:chapter (myBook:paragraph+)>
  Avoid colons in names, except to indicate namespaces
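
The naming rule can be written down as a regular expression. The sketch below is an ASCII-only simplification (real XML names may also contain many non-ASCII letters), and NAME_PATTERN and is_legal_name are hypothetical names.

```python
import re

# First character: letter or underscore.
# Remaining characters: letters, digits, dots, hyphens, underscores, colons.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._:-]*")


def is_legal_name(name):
    """Check a candidate name against the simplified (ASCII-only) rule."""
    return NAME_PATTERN.fullmatch(name) is not None


candidates = ["chapter", "myBook:chapter", "_note", "2ndChapter", "my chapter"]
checks = {name: is_legal_name(name) for name in candidates}
print(checks)
```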
An expanded DTD example
<!DOCTYPE novel [
  <!ELEMENT novel
      (foreword, chapter+, biography?, criticalEssay*)>
  <!ELEMENT foreword (paragraph+)>
  <!ELEMENT chapter (section+|paragraph+)>
  <!ELEMENT section (paragraph+)>
  <!ELEMENT biography (paragraph+)>
  <!ELEMENT criticalEssay (section+)>
  <!ELEMENT paragraph (#PCDATA)>
]>
Attributes and entities

In addition to elements, a DTD may declare attributes
and entities

An attribute describes information that can be put within the start tag of an element
  In XML: <car name="Toyota" model="2001"></car>
  In DTD: <!ATTLIST car
              name CDATA #REQUIRED
              model CDATA #IMPLIED >

An entity describes text to be substituted
  In XML: &copyright;
  In the DTD: <!ENTITY copyright "Copyright KFUPM">
Attributes

The format of an attribute declaration is:
<!ATTLIST element-name
    name type requirement
    name type requirement>
where the name-type-requirement may be repeated as many times as desired
  Note that only spaces separate the parts, so careful counting is essential
  The element-name tells which element may have these attributes
  The name is the name of the attribute
  Each attribute has a type, such as CDATA (character data)
  Each attribute may be required, optional, or “fixed”
  In the XML, attributes may occur in any order
Important attribute types

There are ten attribute types

These are the most important ones:
  CDATA: The value is character data
  (man|woman|child): The value is one from this list
  ID: The value is a unique identifier
    ID values must be legal XML names and must be unique within the document
  NMTOKEN: The value is a name token, made up only of name characters (letters, digits, dots, hyphens, underscores, and colons)
    This is sometimes used to disallow whitespace in the value
    Unlike an XML name, a name token may begin with a digit

The other six, less frequently used, are:
  IDREF, IDREFS, NMTOKENS, ENTITY, ENTITIES, NOTATION
Requirements

Recall that an attribute declaration has the form
<!ATTLIST element-name name type requirement>

The requirement is one of:
  A default value, enclosed in quotes
    Example: <!ATTLIST element-name degree CDATA "PhD">
  #REQUIRED
    The attribute must be present
  #IMPLIED
    The attribute is optional
  #FIXED "value"
    The attribute always has the given value
    If specified in the XML, the same value must be used
Entities

There are exactly five predefined entities: &lt;, &gt;, &amp;,
&quot;, and &apos;

Additional entities can be defined in the DTD:
 <!ENTITY copyright "Copyright KFUPM">

Entities can be defined in another document:
 <!ENTITY copyright SYSTEM "MyURI">

Example of use in the XML:
 This document is &copyright; 2002.

Entities are a way to include fixed text (sometimes called
“boilerplate”)

Entities should not be confused with character references, which are numerical values between &# and ;
  Example: &#233; or &#xE9; to indicate the character é
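
Entity substitution and character references can be observed with any XML parser. This is a small sketch using Python's xml.etree.ElementTree, which expands entities declared in an inline DTD; the note element and the sample sentence are made up for illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY copyright "Copyright KFUPM">
]>
<note>This document is &copyright; 2002. Caf&#233; and caf&#xE9; end in the same letter.</note>"""

# The parser replaces &copyright; with its declared text and both character
# references (decimal and hexadecimal) with the character they denote
note_text = ET.fromstring(DOCUMENT).text
print(note_text)
```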
Another example: XML
<?xml version="1.0"?>
<!DOCTYPE weatherReport SYSTEM
    "http://www.mysite.com/mydoc.dtd">
<weatherReport>
  <date>05/29/2002</date>
  <location>
    <city>Philadelphia</city> <state>PA</state>
    <country>USA</country>
  </location>
  <temperature-range>
    <high scale="F">84</high>
    <low scale="F">51</low>
  </temperature-range>
</weatherReport>
The DTD for this example

<!ELEMENT weatherReport (date, location,
temperature-range)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT location (city, state, country)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT temperature-range
((low, high)|(high, low))>
<!ELEMENT low (#PCDATA)>
<!ELEMENT high (#PCDATA)>
<!ATTLIST low scale (C|F) #REQUIRED>
<!ATTLIST high scale (C|F) #REQUIRED>
Inline DTDs

If a DTD is used only by a single XML document, it can be put directly in that document:
<?xml version="1.0"?>
<!DOCTYPE myRootElement [
  <!-- DTD content goes here -->
]>
<myRootElement>
  <!-- XML content goes here -->
</myRootElement>

An inline DTD can be used only by the document in which it occurs
External DTDs

An external DTD (a DTD that is a separate document) is declared with a SYSTEM or a PUBLIC keyword:
<!DOCTYPE myRootElement SYSTEM "http://www.mysite.com/mydoc.dtd">
  The name that appears after DOCTYPE (in this example, myRootElement) must match the name of the XML document’s root element
  Use SYSTEM for external DTDs that you define yourself, and use PUBLIC for official, published DTDs
  External DTDs can only be referenced with a URL

The file extension for an external DTD is .dtd

External DTDs are almost always preferable to inline DTDs, since they can be used by more than one document
Limitations of DTDs

DTDs are a very weak specification language
  You can’t put any restrictions on the text content of elements
  It’s difficult to specify:
    All the children must occur, but may be in any order
    This element must occur a certain number of times
  There are only ten data types for attribute values

But most of all: DTDs aren’t written in XML!
  If you want to do any validation, you need one parser for the XML and another for the DTD
  This makes XML parsing harder than it needs to be

There is a newer and more powerful technology: XML Schemas
  However, DTDs are still very much in use
Validators

Opera 5 and Internet Explorer 5 can validate your XML against an internal DTD
  IE provides (slightly) better error messages
  Opera apparently just ignores external DTDs
  IE considers an external DTD to be an error

jEdit with the XML plugin will check for well-formedness and (if the DTD is inline) will validate your XML each time you do a Save
  http://www.jedit.org/

Validate [Using Inline DTD]
  http://www.stg.brown.edu/service/xmlvalid/
5.5 XML Schema Definition (XSD)

What is XSD?

An XML Document with Its Schema

Referencing A Schema from XML Document

Simple and Complex Elements

Predefined Types
  Numeric types
  Date and Time types
  String types

Defining Schema Components
  Simple Elements
  Attributes
  Restrictions or Facets
  Enumeration
  Complex Elements
What is XML Schema?

XML Schema documents are used to define and validate the content and structure of XML data

The origin of schema
  XML Schema was originally proposed by Microsoft, but became an official W3C recommendation in May 2001
  http://www.w3.org/XML/Schema
Why Schema?
Separating Information from Structure and Format
Traditional Document: Everything (information, structure, format) is clumped together
“Fashionable” Document: A document is broken into discrete parts (information, structure, format), which can be treated separately

Why Schema?
[Figure: Schema Workflow]
DTD vs. Schema
DTD: No constraints on character data
XSD: Can constrain character data, such as requiring a string to have a fixed number of characters

DTD: Not using XML syntax
XSD: Uses XML syntax and thus frees developer of the need to learn another language. XML transformations can be applied, too.

DTD: No support for namespace
XSD: Supports namespaces

DTD: Very limited for reusability and extensibility
XSD: Can reuse in other schemas, create own derived data types and reference multiple schemas from same document

DTD: Easier to write DTD-based validators: may only need to check existence of content like PCDATA
XSD: Schema-based validators are more difficult to write because we may have to validate content detail

DTD: Easier to understand
XSD: More complex: The notion of “type” adds an extra layer of confusing complexity
XML.org Registry

The XML.org Registry offers a central clearinghouse for developers and
standards bodies to publicly submit, publish and exchange XML schemas,
vocabularies and related documents
Example 1: An XML Document Instance
<?xml version="1.0" encoding="utf-8"?>
<book isbn="0836217462">
  <title> … </title>
  <author> … </author>
  <qualification> … </qualification>
</book>
Schema for Example 1
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="qualification" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
book.xsd
Example 2: An XML Document and Its Schema
<letter> Dear Mr.<name>John Smith</name>.
  Your order <orderid>1032</orderid> will
  be shipped on <shipdate>2001-07-13</shipdate>.
</letter>

<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:integer"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The XSD Document

Since the XSD is written in XML, it can get
confusing which we are talking about

The file extension is .xsd

The root element is <schema>

The XSD starts like this:

<?xml version="1.0"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<schema>

The <schema> element may have attributes:

xmlns:xs="http://www.w3.org/2001/XMLSchema"
Indicates that the elements used in the schema (schema,
element, complexType, etc.) come from this namespace

elementFormDefault="qualified"
This means that elements declared in the schema must be namespace-qualified
when they appear in an XML instance document (i.e., they belong to the schema's
target namespace, whether through a prefix or a default namespace declaration)
Referring to a Schema

To refer to a DTD in an XML document, the reference goes before
the root element:


<?xml version="1.0"?>
<!DOCTYPE rootElement SYSTEM "url">
<rootElement> ... </rootElement>
To refer to an XML Schema in an XML document, the reference
goes in the root element:

<?xml version="1.0"?>
<rootElement
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="namespace url.xsd">
  ...
</rootElement>

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
The schema instance namespace

xsi:schemaLocation="namespace url.xsd"
This attribute has two values, separated by whitespace: the first is
the namespace to use, and the second is the location of the XML schema to use for that namespace
If the schema has no target namespace, use xsi:noNamespaceSchemaLocation="url.xsd" instead
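As a rough illustration of the pair format (not part of any XML library), the JavaScript sketch below splits an xsi:schemaLocation value into namespace/location pairs. The function name parseSchemaLocation and the example.com URIs are made up for this example.

```javascript
// Hypothetical helper: splits an xsi:schemaLocation value into namespace/location pairs.
function parseSchemaLocation(value) {
  var trimmed = value.trim();
  var tokens = trimmed === "" ? [] : trimmed.split(/\s+/);
  if (tokens.length % 2 !== 0) {
    throw new Error("schemaLocation must hold namespace/location pairs");
  }
  var pairs = [];
  for (var i = 0; i < tokens.length; i += 2) {
    pairs.push({ namespace: tokens[i], location: tokens[i + 1] });
  }
  return pairs;
}

// Made-up attribute value holding two pairs
var pairs = parseSchemaLocation(
  "http://example.com/books book.xsd  http://example.com/orders order.xsd");
```

Each entry in pairs tells a validator which schema file to fetch for which namespace.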
“Simple” and “Complex” Elements

A “simple” element is one that contains text and
nothing else





A simple element cannot have attributes
A simple element cannot contain other elements
A simple element cannot be empty
However, the text can be of many different types,
and may have various restrictions applied to it
If an element isn’t simple, it’s “complex”


A complex element may have attributes
A complex element may be empty, or it may contain
text, other elements, or both text and other elements
Predefined Numeric Types

Here are some of the predefined numeric types:
xs:decimal
xs:byte
xs:short
xs:int
xs:long

xs:positiveInteger
xs:negativeInteger
xs:nonPositiveInteger
xs:nonNegativeInteger
Allowable restrictions on numeric types:

enumeration, minInclusive, minExclusive,
maxInclusive, maxExclusive, fractionDigits,
totalDigits, pattern, whiteSpace
Predefined Date and Time Types

xs:date - A date in the format CCYY-MM-DD, for
example, 2003-11-05

xs:time - A time in the format hh:mm:ss (hours,
minutes, seconds)

xs:dateTime - Format is CCYY-MM-DDThh:mm:ss

Allowable restrictions on dates and times:

enumeration, minInclusive,
minExclusive, maxInclusive,
maxExclusive, pattern, whiteSpace
Predefined String Types

Recall that a simple element is defined as:
<xs:element name="name" type="type" />

Here are a few of the possible string types:

xs:string - a string
xs:normalizedString - a string that doesn’t contain
tabs, newlines, or carriage returns
xs:token - a string that doesn’t contain any whitespace other
than single spaces
Allowable restrictions on strings:

enumeration, length, maxLength, minLength,
pattern, whiteSpace
Defining a Simple Element

A simple element is defined as
<xs:element
name="name"
type="type" />
where:


name is the name of the element
the most common values for type are
xs:boolean
xs:date
xs:decimal

xs:integer
xs:string
xs:time
Other attributes a definition of a simple element may
have:

default="default value" - used if no other value is specified
fixed="value" - no other value may be specified
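The effect of default and fixed on a simple element can be sketched in JavaScript as follows. This is only an illustration of the rule, not how a real validator is written; effectiveValue and the decl object are names invented here.

```javascript
// Hypothetical helper: the value an application would see for a simple element.
// text is the element's content ("" when the element is empty);
// decl holds the optional "default" or "fixed" value from the declaration.
function effectiveValue(text, decl) {
  if (decl.fixed !== undefined) {
    // fixed: the element may be empty or repeat the fixed value, nothing else
    if (text !== "" && text !== decl.fixed) {
      throw new Error("value must be '" + decl.fixed + "'");
    }
    return decl.fixed;
  }
  if (text === "" && decl["default"] !== undefined) {
    return decl["default"];   // default: used only when no value is given
  }
  return text;
}

var colour = effectiveValue("", { "default": "red" });
```

With a default the document may still supply its own value; with fixed, any different value is rejected.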
Defining an Attribute

Attributes themselves are always declared as simple
types

An attribute is defined as
<xs:attribute
name="name"
type="type" />
where:


name and type are the same as for xs:element
Other attributes an attribute definition may
have:

default="default value" - used if no other value is specified
fixed="value" - no other value may be specified
use="optional" - the attribute is not required (default)
use="required" - the attribute must be present
Restrictions, or “Facets”

The general form for putting a restriction on a text value is:

<xs:element name="name">        (or xs:attribute)
  <xs:simpleType>
    <xs:restriction base="type">
      ... the restrictions ...
    </xs:restriction>
  </xs:simpleType>
</xs:element>

For example:

<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="20"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Restrictions, or “Facets”

The "age" element is a simple type with a
restriction. The acceptable values are: 20 to 100

The example above could also have been
written like this:

<xs:element name="age" type="ageType"/>

<xs:simpleType name="ageType">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="20"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>
Restrictions on numbers

minInclusive - number must be ≥ the given value

minExclusive - number must be > the given value

maxInclusive - number must be ≤ the given value

maxExclusive - number must be < the given value

totalDigits - number must have no more than value digits in total

fractionDigits - number must have no more than value
digits after the decimal point
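To make the numeric facets concrete, here is a small JavaScript sketch that applies them to a decimal literal. It is an approximation written for this slide: checkNumericFacets and the facets object are invented names, and the check against the base type (for example, that an xs:integer has no fraction part) is left out.

```javascript
// Hypothetical helper: tests a decimal literal against some XSD-style numeric facets.
function checkNumericFacets(text, facets) {
  if (!/^[+-]?(\d+(\.\d*)?|\.\d+)$/.test(text)) return false; // not a decimal literal
  var value = Number(text);
  // Digits are counted without the sign, leading zeros or trailing fractional zeros
  var parts = text.replace(/^[+-]/, "").split(".");
  var intDigits = parts[0].replace(/^0+/, "");
  var fracDigits = (parts[1] || "").replace(/0+$/, "");

  if ("minInclusive" in facets && value < facets.minInclusive) return false;
  if ("minExclusive" in facets && value <= facets.minExclusive) return false;
  if ("maxInclusive" in facets && value > facets.maxInclusive) return false;
  if ("maxExclusive" in facets && value >= facets.maxExclusive) return false;
  if ("totalDigits" in facets &&
      intDigits.length + fracDigits.length > facets.totalDigits) return false;
  if ("fractionDigits" in facets &&
      fracDigits.length > facets.fractionDigits) return false;
  return true;
}

// The "age" restriction shown earlier: a value from 20 to 100
var ageFacets = { minInclusive: 20, maxInclusive: 100 };
var ageOk = checkNumericFacets("42", ageFacets);
var ageTooLow = checkNumericFacets("19", ageFacets);
```

Note that totalDigits and fractionDigits are upper limits, so a shorter number still passes.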
Restrictions on strings

length - the string must contain exactly value characters

minLength - the string must contain at least value characters

maxLength - the string must contain no more than value characters

pattern - the value is a regular expression that the string must match

whiteSpace - not really a “restriction” - tells what to do with whitespace

value="preserve" - Keep all whitespace
value="replace" - Change all whitespace characters to spaces
value="collapse" - Remove leading and trailing whitespace, and replace
all sequences of whitespace with a single space
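The three whiteSpace settings can be imitated in a few lines of JavaScript. This is a sketch of the behaviour described above, with applyWhiteSpace as an invented helper name; a schema processor does this normalization internally before checking the other facets.

```javascript
// Hypothetical helper: imitates the whiteSpace facet on a string value.
function applyWhiteSpace(text, mode) {
  if (mode === "preserve") return text;               // keep everything
  var replaced = text.replace(/[\t\n\r]/g, " ");      // tab, newline, CR become spaces
  if (mode === "replace") return replaced;
  if (mode === "collapse") {
    return replaced.replace(/ +/g, " ")               // runs of spaces become one space
                   .replace(/^ | $/g, "");            // drop leading/trailing space
  }
  throw new Error("unknown whiteSpace value: " + mode);
}

var raw = "  John\t Smith\n";
var preserved = applyWhiteSpace(raw, "preserve");
var replacedValue = applyWhiteSpace(raw, "replace");
var collapsed = applyWhiteSpace(raw, "collapse");
```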
Restriction with Regular Expression Patterns

<xs:element name="letter">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="([a-z])*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<xs:element name="password">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9]{8}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Test these and find out whether the semantics of regular
expressions is the same as that in JavaScript
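One difference worth checking first: an XSD pattern must match the whole value, while a JavaScript regular expression succeeds if it matches anywhere in the string. The sketch below imitates the XSD behaviour by anchoring the pattern; matchesXsdPattern is an invented helper and assumes the pattern only uses syntax that both regex dialects share (other differences between the two dialects are not covered here).

```javascript
// Hypothetical helper: applies a pattern the way XSD does, against the whole value.
// Assumption: the pattern uses only syntax common to XSD and JavaScript regexes.
function matchesXsdPattern(pattern, text) {
  return new RegExp("^(?:" + pattern + ")$").test(text);
}

var nineChars = "abc123456";
// Plain JavaScript finds an 8-character run inside the 9 characters
var jsUnanchored = /[a-zA-Z0-9]{8}/.test(nineChars);
// XSD-style matching requires exactly 8 characters in total
var xsdStyle = matchesXsdPattern("[a-zA-Z0-9]{8}", nineChars);
```

So the "password" pattern above accepts only values of exactly eight letters or digits, even though no ^ or $ appears in it.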
Enumeration

An enumeration restricts the value to be one of a fixed
set of values

Example:

<xs:element name="season">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Spring"/>
      <xs:enumeration value="Summer"/>
      <xs:enumeration value="Autumn"/>
      <xs:enumeration value="Fall"/>
      <xs:enumeration value="Winter"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Complex Elements

A complex element is defined as

<xs:element name="name">
  <xs:complexType>
    ... information about the complex type...
  </xs:complexType>
</xs:element>

Example:

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
Complex Elements

Another example – using a type attribute

<xs:element name="employee" type="personinfo"/>

<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
xs:sequence

We’ve already seen an example of a complex
type whose elements must occur in a specific
order:

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
xs:all

xs:all allows elements to appear in any order

<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:all>
  </xs:complexType>
</xs:element>

Despite the name, the members of an xs:all group can occur once
or not at all

You can use minOccurs="n" and maxOccurs="n" to specify how
many times an element may occur (default value is 1)

In this context, n may only be 0 or 1
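The difference between xs:sequence and xs:all can be sketched as two JavaScript checks over the names of an element's children. Both functions are simplified illustrations with invented names; they assume every declared element has maxOccurs="1" and look only at element names, not at content.

```javascript
// declared: [{ name: "firstName", minOccurs: 1 }, ...]
// actual:   child element names in document order

// xs:sequence style: children must appear in the declared order
function matchesSequence(declared, actual) {
  var pos = 0;
  for (var i = 0; i < declared.length; i++) {
    if (actual[pos] === declared[i].name) {
      pos++;                         // element found in the expected position
    } else if (declared[i].minOccurs !== 0) {
      return false;                  // required element missing or out of order
    }
  }
  return pos === actual.length;      // no unexpected extra children
}

// xs:all style: any order, each child at most once
function matchesAll(declared, actual) {
  var seen = Object.create(null);
  for (var i = 0; i < actual.length; i++) {
    var known = declared.some(function (d) { return d.name === actual[i]; });
    if (!known || seen[actual[i]]) return false;   // undeclared, or occurs twice
    seen[actual[i]] = true;
  }
  return declared.every(function (d) {
    return d.minOccurs === 0 || seen[d.name] === true;
  });
}

var person = [{ name: "firstName", minOccurs: 1 }, { name: "lastName", minOccurs: 1 }];
var swapped = ["lastName", "firstName"];
var swappedInSequence = matchesSequence(person, swapped);
var swappedInAll = matchesAll(person, swapped);
```

The swapped order is rejected by the sequence check but accepted by the all check.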
Extensions

You can base a complex type on another
complex type

<xs:complexType name="newType">
  <xs:complexContent>
    <xs:extension base="otherType">
      ...new stuff...
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
Text Element with Attributes

If a text element has attributes, it is no longer a
simple type

<xs:element name="population">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <xs:attribute name="year" type="xs:integer"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Empty Elements

Empty elements are (ridiculously) complex

<xs:complexType name="counter">
  <xs:complexContent>
    <xs:restriction base="xs:anyType">
      <xs:attribute name="count" type="xs:integer"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
Mixed Elements

Mixed elements may contain both text and elements

We add mixed="true" to the xs:complexType
element

The text itself is not mentioned in the type definition, and may
go anywhere between the child elements (it is basically ignored by validation)

<xs:complexType name="paragraph" mixed="true">
  <xs:sequence>
    <xs:element name="someName" type="xs:anyType"/>
  </xs:sequence>
</xs:complexType>

See Example 2 at the start of this section
References

W3School XSD Tutorial

http://www.w3schools.com/schema/default.asp

MSXML 4.0 SDK

Several online presentations
Reading List

W3School XSD Tutorial

http://www.w3schools.com/schema/default.asp
5.6 XML DOM

The XML DOM

XML Parsers



DOM-based
SAX-based
Examples




Cross-browser XML DOM object creation
Creating HTML table using XML data
Some XML DOM properties and methods
Creating XML using DOM methods
XML DOM

The DOM is a collection of interfaces that parser vendors and browser
manufacturers implement
To enable creation and manipulation of XML documents

The DOM interfaces are specified in modules, making it possible for
implementations to support parts of the DOM
XML parsers, for instance, aren’t required to provide support for the HTML-specific part of the DOM

The W3C DOM is separated into different parts (Core, XML, and HTML)
and different levels (DOM Level 1/2/3):
Core DOM - defines a standard set of objects for any structured document
XML DOM - defines a standard set of objects for XML documents
HTML DOM - defines a standard set of objects for HTML documents

HTML DOM extends the Core XML DOM
Core DOM provides interface definitions for manipulating and working with any
XML.
HTML DOM augments this with additional interface definitions for HTML-specific
elements.
… XML DOM

The XML DOM is designed to be used with any programming
language and any operating system.

It is fully described in the W3C DOM specification

http://www.w3.org/DOM/

With the XML DOM, a programmer can create an XML document,
navigate its structure, and add, modify, or delete its elements

DOM provides generic access to DOM-compliant documents: add,
edit, delete, manipulate

DOM is language-independent

The DOM is based on a tree view of your document. Nodes! Nodes!
Nodes!
XML Parsers

As mentioned earlier, a software program called a parser is required to
process an XML document

Parsers can support the DOM and/or SAX for accessing a document’s
content programmatically using Java, C, JavaScript, etc.

A DOM-based parser builds a tree structure containing the XML document’s
data in memory

A SAX (Simple API for XML)-based parser processes the document and
generates events when tags, text, comments, etc. are encountered

SAX and DOM are standards for XML parsers
DOM is a W3C standard
SAX is an ad-hoc (but very popular) standard

Examples: JAXP (Java API for XML Processing), MSXML 3.0 (Microsoft XML
parser), Xerces (Apache’s Xerces parser)
All support both SAX and DOM
SAX Callbacks

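SAX works through callbacks: you call the
parser, it calls methods that you supply
[Figure: your program's main(...) calls the SAX parser's parse(...); the parser calls back into your program]
Your program: main(...)
The SAX parser: parse(...)
Callbacks in your program:
startDocument(...)
startElement(...)
characters(...)
endElement( )
endDocument( )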
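The callback style can be tried out with a toy event-based parser in JavaScript. This is not a real SAX implementation: the parse function below is a made-up sketch that understands only plain start tags, end tags and text (no attributes, comments, declarations or entities), and the handler method names simply follow the slide.

```javascript
// Toy event-based parser (illustration only): reports each piece of markup
// to the handler as soon as it is found, without building a tree.
function parse(xml, handler) {
  // end tag | start tag | text between tags
  var token = /<\/([A-Za-z][\w.-]*)\s*>|<([A-Za-z][\w.-]*)\s*>|([^<]+)/g;
  var m;
  handler.startDocument();
  while ((m = token.exec(xml)) !== null) {
    if (m[1]) handler.endElement(m[1]);
    else if (m[2]) handler.startElement(m[2]);
    else if (m[3].trim() !== "") handler.characters(m[3]);
  }
  handler.endDocument();
}

// "Your program": supplies the callbacks and records the order in which they fire
var events = [];
var handler = {
  startDocument: function () { events.push("startDocument"); },
  startElement: function (name) { events.push("start:" + name); },
  characters: function (text) { events.push("text:" + text); },
  endElement: function (name) { events.push("end:" + name); },
  endDocument: function () { events.push("endDocument"); }
};

parse("<note><to>Tove</to><from>Jani</from></note>", handler);
```

After the call, events holds the callbacks in document order; nothing of the document itself is kept unless the handler stores it.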
Difference between SAX and DOM
DOM: Tree-based model
SAX: Event-based model: invokes methods when markup is encountered

DOM: Data can be accessed quickly (randomly) since all data is in memory
SAX: No tree structure is created: data is passed to the application from the XML document as it is found. SAX provides only sequential access to data

DOM: Provides facilities for adding and removing nodes (i.e., modifying the document)
SAX: SAX implementations do not

DOM: Requires much more memory, which can make it impractical for very large XML documents
SAX: Less memory overhead
DOM components

Document - top-level view of the document, with
access to all nodes (including root element)

createElement method - creates an element node
createAttribute method - creates an attribute node
createComment method - creates a comment node
getDocumentElement method - returns root element
appendChild method - appends a child node
getChildNodes method - returns child nodes
DOM components II

Node - represents a node: "A node is a reference to an
element, its attributes, or text from the document."

cloneNode method - duplicates a node
getNodeName method - returns the node's name
getNodeType method - returns the node's type
getNodeValue method - returns the node's value
getParentNode method - returns the node's parent node
hasChildNodes method - true if the node has child nodes
insertBefore method - inserts a child before the specified child
removeChild method - removes the child node
replaceChild method - replaces one child with another
setNodeValue method - sets node's value
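To see how the tree-editing methods fit together, here is a minimal stand-in for a node written in plain JavaScript. SketchNode is an invented class for illustration only; it copies the names of a few DOM members but is not the W3C Node interface and leaves out node types, attributes and documents.

```javascript
// Minimal stand-in for a DOM node (illustration only, not the real interface)
function SketchNode(nodeName, nodeValue) {
  this.nodeName = nodeName;
  this.nodeValue = nodeValue === undefined ? null : nodeValue;
  this.parentNode = null;
  this.childNodes = [];
}

SketchNode.prototype.hasChildNodes = function () {
  return this.childNodes.length > 0;
};

SketchNode.prototype.removeChild = function (child) {
  var index = this.childNodes.indexOf(child);
  if (index < 0) throw new Error("node is not a child");
  this.childNodes.splice(index, 1);
  child.parentNode = null;
  return child;
};

SketchNode.prototype.appendChild = function (child) {
  if (child.parentNode) child.parentNode.removeChild(child); // a node has one parent
  child.parentNode = this;
  this.childNodes.push(child);
  return child;
};

SketchNode.prototype.insertBefore = function (child, refChild) {
  if (child.parentNode) child.parentNode.removeChild(child);
  var index = this.childNodes.indexOf(refChild);
  if (index < 0) throw new Error("reference node is not a child");
  child.parentNode = this;
  this.childNodes.splice(index, 0, child);
  return child;
};

// Build <book><author/><title/></book>, then remove the title again
var book = new SketchNode("book");
var title = book.appendChild(new SketchNode("title"));
var author = book.insertBefore(new SketchNode("author"), title);
var namesBefore = book.childNodes.map(function (n) { return n.nodeName; });
book.removeChild(title);
```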
DOM components III

Element - represents an element node

getAttribute method - gets the value of an attribute

getTagName method - gets element's name

removeAttribute method - deletes an attribute

setAttribute method - sets an attribute's value
DOM Access with JavaScript

We have seen that the DOM can be mapped against XSLT style
sheets to transform an XML document into a formatted Web page

The DOM presents an XML document as a tree-structure (a node
tree), with the elements, attributes, and text defined as nodes.

We now show how JavaScript can be used to navigate the
document tree and manipulate its data, although this does not
permit saving an XML document locally

An XML file is made accessible to scripting by loading it into a
Document Object Model.

When the MSXML parser loads an XML document, it reads it from
start to finish and creates a logical tree model of it.
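Creating a DOM Object
<script type="text/javascript">
var XMLDoc  // shared with the functions that use the loaded document
function Load_DOM() {
  // MSXML parser (Internet Explorer only)
  XMLDoc = new ActiveXObject("Microsoft.XMLDOM")
  XMLDoc.async = false  // wait until the whole file has been loaded
  XMLDoc.load("books.xml")
  if (XMLDoc.parseError.errorCode != 0) {
    alert("DOM Not Loaded: XML file has error(s)!")
  }
}
…
</script>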

By default (async = true), the load() method returns control to the caller before the download is
finished.

This asynchronous behaviour avoids unusually long waits for the Web page to load while
the DOM is loading a large document.

For small to moderate size XML files, set async = false so that load() returns only after the
whole file has been parsed and the DOM is ready to use.

The document's parseError object has an errorCode property: a decimal number
identifying the error.

If the error code is zero, then loading of the file into the DOM was successful.
Example 1: Cross-Browser Code
<script type="text/javascript">
var xmlDoc
function loadXML() {
  // load xml file
  if (window.ActiveXObject) {  // code for IE
    xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.load("note.xml");
    getmessage()
  } else if (document.implementation &&  // code for Mozilla
             document.implementation.createDocument) {
    xmlDoc = document.implementation.createDocument("", "", null);
    xmlDoc.onload = getmessage  // register the handler before starting the load
    xmlDoc.load("note.xml");
  } else {
    alert('Your browser cannot handle this script');
  }
}
// continued …
… Cross-Browser Code
function getmessage() {
  // copy the text of each XML element into the matching <span>
  document.getElementById("to").innerHTML =
    xmlDoc.getElementsByTagName("to")[0].firstChild.nodeValue
  document.getElementById("from").innerHTML =
    xmlDoc.getElementsByTagName("from")[0].firstChild.nodeValue
  document.getElementById("message").innerHTML =
    xmlDoc.getElementsByTagName("body")[0].firstChild.nodeValue
}
</script>
</head><body onload="loadXML()" bgcolor="silver">
<h1>W3Schools Internal Note</h1>
<p><b>To:</b> <span id="to"></span><br />
<b>From:</b> <span id="from"></span>
<hr />
<b>Message:</b> <span id="message"></span>
Example 2: Creating HTML Table Using XML Data
<script type="text/javascript">
function Show_Node() {
  …
  Load_DOM();
  OutString = "<table border='1'>"
  OutString += "<tr style='background-color:#E6E6E6'>"
  OutString += " <th>Title</th>"
  OutString += " <th>Author</th>"
  OutString += " <th>Price</th>"
  OutString += "</tr>"
  // select the first book element that has an edition attribute
  BookNode = XMLDoc.selectSingleNode("/library/book[@edition]")
  OutString += "<tr>"
  NodeList = BookNode.childNodes
  // one table cell per child node of the selected book
  for (i = 0; i < NodeList.length; i++) {
    OutString += "<td>" + NodeList.item(i).text + "</td>"
  }
  OutString += "</tr>"
  OutString += "</table>"
  document.all.Output.innerHTML = OutString
}
</script>
Example 3: Some XML DOM Properties/Methods
<script type="text/javascript">
…
// get the root element
var element = xmlDocument.documentElement;
document.writeln( "<p>Here is the root node of the document:" );
document.writeln( "<strong>" + element.nodeName + "</strong>" );
document.writeln(
  "<br>The following are its child elements:" );
document.writeln( "</p><ul>" );
// traverse all child nodes of root element
for ( i = 0; i < element.childNodes.length; i++ ) {
  var curNode = element.childNodes.item( i );
  // print node name of each child element
  document.writeln( "<li><strong>" + curNode.nodeName
    + "</strong></li>" );
}
document.writeln( "</ul>" );
// get the first child node of root element
var currentNode = element.firstChild;
…
</script>
Example 4: Creating XML Using DOM Methods
<script type="text/javascript">
var comment, stud1, stud2, studs
function mkRecord() {
  // build <tullab> with a comment and two <talib> children
  comment = document.createComment('My Tullab records');
  studs = document.createElement('tullab');
  stud1 = createStudent("G. Al-Good", 4.0);
  stud2 = createStudent("P. Al-Probation", 1.7);
  studs.appendChild(comment);
  studs.appendChild(stud1);
  studs.appendChild(stud2);
  showObject(studs);
}
function createStudent(name, gpa) {
  // build <talib><name>…</name><gpa>…</gpa></talib>
  var result = document.createElement('talib');
  var nm = document.createElement('name');
  var GPA = document.createElement('gpa');
  nm.appendChild(document.createTextNode(name));
  GPA.appendChild(document.createTextNode(gpa));
  result.appendChild(nm);
  result.appendChild(GPA);
  return result;
}
</script>
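The slide does not show showObject. As one possible way to inspect a tree like the one built above, the sketch below turns a node and its descendants into markup text. serializeNode and escapeText are invented names, attributes are ignored, and the function relies only on the nodeType, nodeName, nodeValue and childNodes properties, so it can also be tried on plain objects as done here.

```javascript
// Escape characters that are special in element content
function escapeText(s) {
  return String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: serializes element (1), text (3) and comment (8) nodes.
// Attributes are not handled in this sketch.
function serializeNode(node) {
  if (node.nodeType === 3) return escapeText(node.nodeValue);
  if (node.nodeType === 8) return "<!--" + node.nodeValue + "-->";
  if (node.nodeType !== 1) return "";
  var inner = "";
  for (var i = 0; i < node.childNodes.length; i++) {
    inner += serializeNode(node.childNodes[i]);
  }
  return inner === ""
    ? "<" + node.nodeName + "/>"
    : "<" + node.nodeName + ">" + inner + "</" + node.nodeName + ">";
}

// Plain-object stand-ins for part of the tree built by mkRecord()
var talib = {
  nodeType: 1, nodeName: "talib", childNodes: [
    { nodeType: 1, nodeName: "name", childNodes: [{ nodeType: 3, nodeValue: "G. Al-Good" }] },
    { nodeType: 1, nodeName: "gpa", childNodes: [{ nodeType: 3, nodeValue: "4" }] }
  ]
};
var tullab = {
  nodeType: 1, nodeName: "tullab", childNodes: [
    { nodeType: 8, nodeValue: "My Tullab records" },
    talib
  ]
};
var markup = serializeNode(tullab);
```

In a browser the same function could be given the studs element directly; there the nodeName of elements created in an HTML page may come back in upper case.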