XML - Florida Gulf Coast University

XML – Extensible Markup Language
Sivakumar Kuttuva & Janusz Zalewski
What is XML?
• Extensible Markup Language (XML) is a universal standard for electronic data exchange.
• It provides a method of creating and using tags that identify the structure and contents of a document, independent of its formatting.
What XML Looks Like
<?xml version="1.0"?>
<Course>                                  <!-- Root tag -->
  <Name>Java Programming</Name>           <!-- Element: course name -->
  <Department>EECS</Department>           <!-- Element: department -->
  <Teacher>
    <Name>Paul</Name>
  </Teacher>
  <Student>
    <Name>Ron</Name>
  </Student>
  <Student>
    <Name>Uma</Name>
  </Student>
  <Student>
    <Name>Lindsay</Name>
  </Student>
</Course>
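To make the structure of the Course document concrete, here is a minimal sketch that reads it with Python's standard `xml.etree.ElementTree` module. The function name `summarize_course` and the dictionary keys are illustrative choices, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The Course document from the slide, with the annotations removed.
COURSE_XML = """<?xml version="1.0"?>
<Course>
  <Name>Java Programming</Name>
  <Department>EECS</Department>
  <Teacher>
    <Name>Paul</Name>
  </Teacher>
  <Student><Name>Ron</Name></Student>
  <Student><Name>Uma</Name></Student>
  <Student><Name>Lindsay</Name></Student>
</Course>"""


def summarize_course(xml_text):
    """Return the course data as a plain dictionary (hypothetical helper)."""
    root = ET.fromstring(xml_text)  # root is the <Course> element
    return {
        "course": root.findtext("Name"),            # direct child <Name>
        "department": root.findtext("Department"),
        "teacher": root.findtext("Teacher/Name"),   # nested <Name>
        "students": [s.findtext("Name") for s in root.findall("Student")],
    }


print(summarize_course(COURSE_XML))
```

The tag names alone are enough to tell the course name apart from the teacher's and students' names, which is the point of structural markup.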
Why Did XML Come into Existence? (1)
• Make it easier to provide metadata, that is, data about information:
<Department>EECS</Department>
<Teacher>
<Name>Paul Thompson</Name>
</Teacher>
Here the tag names Name and Department are metadata.
• Large-scale electronic publishing requires dynamic documents whose content can change without changing the document format.
• Support internationalized, media-independent electronic publishing.
Why Did XML Come into Existence? (2)
• Allow industries to define platform-independent protocols for the exchange of data, especially the data of electronic commerce.
• Make it easy for people to process data using inexpensive software.
Two Types of Syntax Standards
• XML documents must meet one of two syntax standards:
– Well-formed (the basic standard): the document must meet minimum, standard criteria.
– Valid: the document must be well-formed and adhere to a DTD (Document Type Definition).
Well-Formed XML
• Well-formed criteria include:
– All elements have a start and end tag with matching capitalization.
   <B></B>
– Proper element nesting.
   <B><I></I></B>
   not <B><I></B></I>
– Attribute values are in single or double quotes.
   <book call_no="3456-34567890-3456">
– Empty elements need an end tag or a self-closing start tag.
   <IMG></IMG> or <IMG />
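The criteria above can be checked mechanically by any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is an illustrative assumption, not something from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each sample mirrors one of the well-formedness criteria.
samples = [
    "<B></B>",                       # matching start and end tag
    "<B></b>",                       # capitalization does not match
    "<B><I></I></B>",                # proper nesting
    "<B><I></B></I>",                # improper nesting
    '<book call_no="3456"></book>',  # quoted attribute value
    "<book call_no=3456></book>",    # unquoted attribute value
    "<IMG />",                       # self-closing empty element
    "<IMG>",                         # empty element never closed
]

for sample in samples:
    print(sample, "->", is_well_formed(sample))
```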
Why Well-Formed Matters
• Guarantees the document's syntax before sending it to an application.
• A clean syntax guarantee means less ambiguity, which results in faster processing.
• A well-formedness violation is a fatal error: the parser stops processing the document.
Valid XML
• To be valid, a document must be well-formed and adhere to a DTD.
• A DTD example is shown below:
– <!ELEMENT BOOKCATALOG (BOOK)+>
– <!ELEMENT BOOK (TITLE, AUTHOR+, PUBLISHER?, PRICE?)>
– <!ATTLIST BOOK ISBN CDATA #REQUIRED>
– <!ATTLIST BOOK BOOKTYPE (Fiction|SciFi|Fantasy) #IMPLIED>
– <!ELEMENT TITLE (#PCDATA)>
– <!ELEMENT AUTHOR (LASTNAME)>
– <!ELEMENT LASTNAME (#PCDATA)>
– <!ELEMENT PUBLISHER (#PCDATA)>
– <!ELEMENT PRICE (#PCDATA)>
Valid XML
• A DTD (Document Type Definition) specifies:
– The elements in the document.
   For example, AUTHOR and PUBLISHER.
– Their attributes.
   For BOOK, ISBN and BOOKTYPE are attributes; TITLE, AUTHOR, PUBLISHER, and PRICE are child elements.
– Whether they are mandatory or optional.
• A DTD effectively specifies the document's grammatical rules.
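One way to see a DTD as a grammar is to note that a content model such as (TITLE, AUTHOR+, PUBLISHER?, PRICE?) is a regular expression over child element names. The sketch below hand-codes that single rule, plus the required ISBN attribute, in Python. It is only an illustration of the idea and not a validating parser: Python's standard library does not validate against DTDs, and the names `BOOK_MODEL` and `book_matches_model` are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Content model (TITLE, AUTHOR+, PUBLISHER?, PRICE?) rewritten as a regex
# over the comma-terminated names of a BOOK element's children.
BOOK_MODEL = re.compile(r"TITLE,(AUTHOR,)+(PUBLISHER,)?(PRICE,)?")


def book_matches_model(book):
    """Hand-written check of one BOOK element (not real DTD validation)."""
    child_names = "".join(child.tag + "," for child in book)
    has_required_isbn = "ISBN" in book.attrib  # ISBN is #REQUIRED
    return has_required_isbn and BOOK_MODEL.fullmatch(child_names) is not None


good = ET.fromstring(
    '<BOOK ISBN="1"><TITLE>T</TITLE>'
    "<AUTHOR><LASTNAME>A</LASTNAME></AUTHOR><PRICE>9.99</PRICE></BOOK>"
)
no_author = ET.fromstring('<BOOK ISBN="1"><TITLE>T</TITLE></BOOK>')

print(book_matches_model(good))
print(book_matches_model(no_author))
```

A real validating parser applies every declaration in the DTD, including the rules for nested elements such as AUTHOR, which this sketch ignores.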
A sample entry in the XML file adhering to the given DTD
<BOOKCATALOG>
  <BOOK ISBN="3456-34567890-3456">
    <TITLE>C++ Primer</TITLE>
    <AUTHOR>
      <LASTNAME>Tendulkar</LASTNAME>
    </AUTHOR>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>
Why use DTD
• Well-formed means only that the document meets a minimum, standard set of rules.
• A DTD lets you define your own rules and markup languages, such as WML, MAML, etc., provided the XML content still adheres to the syntax standards.
The Components – Line 1
• <!ELEMENT Bookcatalog (Book+)>
• Bookcatalog is the root element.
• Bookcatalog can have one or more
(indicated by the +) Book elements.
The Components – Line 2
• <!ELEMENT Book (Title, Author, Publisher, Price)>
• Each Book element must contain, in this order:
– A title, an author, a publisher, and a price
The Components – Line 4
• <!ATTLIST Book BookType (Fiction | SciFi | Nonfiction) "Fiction">
• Each Book element has an attribute BookType with three options (separated by |): Fiction, SciFi, and Nonfiction, with Fiction as the default.
The Components – Lines 5-9
• The remaining elements, Title through Price, are declared as #PCDATA.
– Parsed character data: text that the processor will check for entities and markup characters.
– Any < or & in data declared as PCDATA must be written as &lt; or &amp;; > is conventionally written as &gt; as well.
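The escaping rule for character data can be tried out with Python's standard `xml.sax.saxutils` helpers. This is a small sketch; the sample string is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "AT&T: 3 < 5 > 2"          # illustrative text with markup characters

# escape() replaces &, < and > with their predefined entities.
escaped = escape(raw)
print(escaped)

# Once escaped, the text can be placed inside an element, and the parser
# turns the entities back into the original characters.
parsed = ET.fromstring("<TITLE>" + escaped + "</TITLE>").text
print(parsed == raw)

# Without escaping, the same text is a well-formedness error.
try:
    ET.fromstring("<TITLE>" + raw + "</TITLE>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
print(unescaped_ok)
```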
Schemas
• The next step beyond DTDs
• Come from the database world
• More powerful and extensible than DTDs, which
come from the SGML world
• Schemas are XML documents, so they:
– Are extensible
– Use XML syntax unlike DTDs
– Support data types like dates, times, currencies,
important in eCommerce
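Because a schema is itself an XML document, ordinary XML tools can read it. The sketch below parses a tiny W3C XML Schema fragment with Python's standard `xml.etree.ElementTree` and lists the data type declared for each element. The schema content (including the PUBDATE element) is hypothetical and written only for this example, and the code reads the schema without validating any document against it.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: typed elements for a book catalog.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="TITLE" type="xs:string"/>
  <xs:element name="PRICE" type="xs:decimal"/>
  <xs:element name="PUBDATE" type="xs:date"/>
</xs:schema>"""


def declared_types(schema_text):
    """Map each top-level element name to its declared type."""
    root = ET.fromstring(schema_text)
    # ElementTree spells namespaced tags as {namespace-uri}localname.
    elements = root.findall(f"{{{XSD_NS}}}element")
    return {el.get("name"): el.get("type") for el in elements}


print(declared_types(SCHEMA))
```

A DTD could only declare these three elements as #PCDATA; the schema distinguishes a string, a decimal number, and a date.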
DTDs vs Schemas
• Why use schemas?
– More powerful than DTDs
– Better suited for eCommerce.
• Why use DTDs?
– Wider tool support.
– More examples available for use and reference.
• HTML, XHTML, CALS, etc.
– Greater depth of experience in the industry
– Wider pool of developers
CSS and XML
• CSS was designed for HTML but works
fine under XML as well.
• Rather than create an XSL style sheet, you can create a simpler CSS file and attach it to an XML document via a processing instruction like:
– <?xml-stylesheet href="mycss.css" type="text/css"?>
CSS and XSL
• XML uses custom tags that a browser does not know how to display.
• So XML documents may display like this:
<BOOKCATALOG>
<BOOK>
<ISBN>3456-34567890-3456</ISBN>
<TITLE>C++ Primer</TITLE>
<AUTHOR_LASTNAME>Tendulkar</AUTHOR_LASTNAME>
<PUBLISHER>McGraw Hill</PUBLISHER>
<PRICE>41.99</PRICE>
</BOOK>
</BOOKCATALOG>
• Legibility requires applying styles:
– CSS
– XSL
XSL (Extensible Stylesheet Language)
• XSL comes from DSSSL (Document Style Semantics and Specification Language), the SGML style language, which is itself derived from LISP.
Benefits of XSL
• An XSL style sheet is well-formed XML.
• Supports a style sheet DTD for
validation.
• Far greater processing ability than CSS.
• XSL Transformations (XSLT) take all or part of an XML document and transform it into another format, such as XML to HTML.
– This is why XML appears to be the route to single-sourcing.
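To give a feel for what an XML-to-HTML transformation produces, here is a rough sketch in Python. It is not XSLT: Python's standard library has no XSLT processor, so a hand-written function stands in for the stylesheet. The function name `catalog_to_html` and the choice of an HTML table are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

CATALOG = """<BOOKCATALOG>
  <BOOK><TITLE>C++ Primer</TITLE><PRICE>41.99</PRICE></BOOK>
</BOOKCATALOG>"""


def catalog_to_html(xml_text):
    """Turn each BOOK into one HTML table row (stand-in for a stylesheet)."""
    catalog = ET.fromstring(xml_text)
    table = ET.Element("table")
    for book in catalog.findall("BOOK"):
        row = ET.SubElement(table, "tr")
        for tag in ("TITLE", "PRICE"):
            cell = ET.SubElement(row, "td")
            cell.text = book.findtext(tag, default="")
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(table, encoding="unicode")


print(catalog_to_html(CATALOG))
```

The same source document could be fed to a second function that emits a different format, which is the single-sourcing idea in miniature.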
Advanced Features of XML
• XLink
• XPointer
• Parsing XML with DOM (Document Object Model)
• XPath
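As a brief illustration of two of these features, the sketch below reads the same small document twice with Python's standard library: once through the DOM interface (`xml.dom.minidom`) and once with the limited XPath subset that `xml.etree.ElementTree` supports. The sample document is made up for this example, and full XPath, XLink, and XPointer are outside what these modules provide.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

CATALOG = """<BOOKCATALOG>
  <BOOK ISBN="3456-34567890-3456">
    <TITLE>C++ Primer</TITLE>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>"""

# DOM: the whole document becomes a tree of node objects.
dom = minidom.parseString(CATALOG)
title_node = dom.getElementsByTagName("TITLE")[0]
dom_title = title_node.firstChild.data   # text node inside <TITLE>

# XPath subset: select the TITLE of the BOOK with a given ISBN attribute.
root = ET.fromstring(CATALOG)
path = "./BOOK[@ISBN='3456-34567890-3456']/TITLE"
xpath_titles = [t.text for t in root.findall(path)]

print(dom_title)
print(xpath_titles)
```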
XML Applications
• Applications that require the Web client to mediate between two or more heterogeneous databases, such as an information tracking system for a home health care agency.
• Applications that attempt to distribute a significant proportion of the processing load from the Web server to the Web client, such as a technical data delivery system for a wide range of products.
• Applications that require the Web client to present different views of the same data to different users.
• Applications in which intelligent Web agents attempt to tailor information discovery to the needs of individual users.
Future Demands of XML
• Intelligent Web agents will create demand for structured data.
• User preferences must be represented in a standard way to mass media providers.