Intro to XML - Server Configuration


Introduction to XML
Current Internet Issues






Need for customized page layout – e.g. filter to display only recent data
Downloadable product comparisons – e.g. import features, prices, etc. into a spreadsheet
Application integration – translation of data from various application sources, e.g. human resources, accounts, projects
Data integration – coherent DB views
Interchangeable files – enhance software development via a number of tools, exchange information effortlessly
Need for software to spontaneously connect across the globe, exchange information, process and record exchanges

Problems evident in


Web pages – undifferentiated text makes searchability problematic, need for human intervention
HTML – page layout language, little scope for data analysis, need to slice and dice manually
Semantic Web (Web 3.0)
… seeks to provide a common framework that allows data to be shared and reused across application, enterprise and community boundaries … it is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Berners-Lee et al., 2001)
Berners-Lee’s Architecture of Semantic Web
The Resource Description Framework (Fensel, 2004)
XML Tools
[Figure: Relationship of Major Tools. Diagram labels: Applications, Content Management Tools, Authoring Tools, Web Infrastructure, Development Tools, Fundamental Components]
Goals of XML
1. It shall be straightforward to use XML over the Internet
2. XML shall support a wide variety of applications
3. XML shall be compatible with SGML
4. It shall be easy to write programs that process XML documents
5. The number of optional features in XML is to be kept to an absolute minimum, ideally zero
6. XML documents should be human-legible and reasonably clear
7. XML design should be prepared quickly
8. The design of XML should be formal and concise
9. XML documents shall be easy to create
10. Terseness in XML markup is of minimal importance
XML Conceptual Model





Human and machine readability
Defining content
Defining structure
Separation of content from relationships
Separation of structure from presentation
XML Specifications


Document portion – specifies how to use tagged markup to indicate meaning of data (content + XSL)
DTD portion – specifies how to indicate allowable structure for a class of XML documents (metadata + ontologies)
XML Approach

Simple approach to metadata and shared context
  Authors add metadata through tags, e.g. <author>William Shakespeare</author>
  Content and presentation are separate, linked files
  Document designers add shared context through Document Type Definitions (DTDs) – a set of declarations that specify the allowable order, structure and attributes of tags for a particular type of document
  The prolog carries the XML declaration, <?xml version="1.0"?>, followed by a DOCTYPE declaration that names the DTD
  The DTD is referenced via URL so any party can spontaneously access it, interpret its rules, and process the document
Extensible Stylesheet Language XSL


Used for formatting document display
Three primary requirements for solution




Define formatting templates that apply to an element and its sub-elements



Apply formatting rules to elements
Usable with different display technologies
Document consumer may control application of stylesheets
Select particular elements by using “xsl” namespace
Instructions for formatting use “fo” namespace
Common practice is to use XSL template syntax but directly insert HTML tags – XSL is used simply as a transformation language (see example at end)
XSL Example
<!-- Default formatting for every address element -->
<xsl:template match="address">
  <fo:block font-size="large" font-weight="bold" font-family="Arial" line-height="2">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>
<!-- More specific rule for addresses whose ADDTYPE attribute is "ship" -->
<xsl:template match="address[@ADDTYPE='ship']">
  <fo:block font-size="small" font-weight="normal" font-family="Times" line-height="1">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>
Metadata

Spontaneous information exchange requires metadata
  One party indicates what each piece of information means, two or more parties agree on meaning, facilitating organisation of data in a database schema
  Formal way to describe what a piece of data means
  Difficulty – parties need to agree on meaning
Shared Context

Formal description of rules metadata must follow
  Applies to a particular type of document, serves as a contract between document sender and receiver
  One party can design the shared context and post it on the internet so it is available to anyone who wishes to use/add
Simple example - content
<business_card>
  <name>
    <given_name>David</given_name>
    <middle_name>John</middle_name>
    <family_name>Brown</family_name>
  </name>
  <title>Software Technology Analyst</title>
  <author/>
  <contact_methods>
    <phone>028 7137 5555</phone>
    <phone>07741265894</phone>
  </contact_methods>
</business_card>
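A minimal Python sketch of how a program might read this card, using only the standard library's xml.etree.ElementTree. It assumes the card uses the underscore element names that the DTD declares (business_card, contact_methods); the read_card helper is a hypothetical name for illustration.

```python
import xml.etree.ElementTree as ET

# The business card from the slide, written with the underscore names
# that the DTD declares (an assumption about the intended spelling).
CARD = """
<business_card>
  <name>
    <given_name>David</given_name>
    <middle_name>John</middle_name>
    <family_name>Brown</family_name>
  </name>
  <title>Software Technology Analyst</title>
  <author/>
  <contact_methods>
    <phone>028 7137 5555</phone>
    <phone>07741265894</phone>
  </contact_methods>
</business_card>
"""

def read_card(xml_text):
    """Return the full name, title and phone list held in a business card."""
    root = ET.fromstring(xml_text.strip())
    name = root.find("name")
    # Join the text of each child of <name> in document order.
    full_name = " ".join(part.text for part in name)
    title = root.findtext("title")
    phones = [phone.text for phone in root.iter("phone")]
    return full_name, title, phones

if __name__ == "__main__":
    print(read_card(CARD))
```

Because every piece of data is tagged, the program can pick out the phone numbers by name rather than by position in the text.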
Simple example – DTD
<!ELEMENT business_card (name, title, author?, contact_methods)>
<!ELEMENT name (given_name, middle_name?, family_name)>
<!ELEMENT given_name (#PCDATA)>
<!ELEMENT middle_name (#PCDATA)>
<!ELEMENT family_name (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author EMPTY>
<!ELEMENT contact_methods (phone*)>
<!ELEMENT phone (#PCDATA)>
CONSTRAINTS
? 0 or 1 elements
* 0 or more elements
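Python's standard library does not include a DTD validator, so the following is only a hand-rolled sketch of what the content models above mean. Each model is rewritten by hand as a regular expression over the sequence of child element names; CONTENT_MODELS and conforms are hypothetical names, and #PCDATA and EMPTY declarations are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models: "x?" means 0 or 1, "x*" means 0 or more.
# Each child name is followed by a space so names cannot run together.
CONTENT_MODELS = {
    "business_card": r"name title (author )?contact_methods ",
    "name": r"given_name (middle_name )?family_name ",
    "contact_methods": r"(phone )*",
}

def conforms(element):
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if re.fullmatch(model, children) is None:
            return False
    # Elements without a listed model (text-only or empty) are not checked.
    return all(conforms(child) for child in element)

if __name__ == "__main__":
    card = ET.fromstring(
        "<business_card><name><given_name>David</given_name>"
        "<family_name>Brown</family_name></name>"
        "<title>Analyst</title><contact_methods/></business_card>"
    )
    print(conforms(card))
```

The card above omits the optional middle_name and author and has no phone elements, yet still conforms, because ? and * both allow zero occurrences.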
Elements




Fundamental unit of XML – author-specified chunk of information
Consists of element name and element content
One element at top – root or document element
Four types of allowable content
  Data content – contains only data
  Element content – contains only other elements
  Empty – contains neither data nor elements
  Mixed content – contains both data and elements (viewed as poor practice)
Attributes – attribute name and value, with the value bounded by quotation marks
  <database DBTYPE="Oracle">
  <column DATATYPE="String">
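The four content types and the attribute syntax can be sketched in Python with the standard library's xml.etree.ElementTree. The content_type helper and the sample document are hypothetical; treating whitespace-only text as layout rather than data is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

def content_type(element):
    """Classify an element as 'empty', 'data', 'element' or 'mixed'."""
    has_children = len(element) > 0
    # Text can sit before the first child (element.text) or after any
    # child (child.tail); whitespace-only text is treated as layout.
    texts = [element.text] + [child.tail for child in element]
    has_data = any(t and t.strip() for t in texts)
    if has_children and has_data:
        return "mixed"
    if has_children:
        return "element"
    if has_data:
        return "data"
    return "empty"

if __name__ == "__main__":
    doc = ET.fromstring(
        '<database DBTYPE="Oracle">'
        '<column DATATYPE="String">name</column>'
        "<note>see <b>this</b> now</note>"
        "<flag/>"
        "</database>"
    )
    print(doc.get("DBTYPE"))  # attribute value, read by attribute name
    for element in doc.iter():
        print(element.tag, content_type(element))
```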
Well-formed XML

There is one root element
All non-empty elements have start tags and end tags that match exactly
All empty elements have the correct empty tag syntax
Elements are strictly nested – no overlapping elements
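One way to test these rules in practice is to hand the text to a parser and see whether it is accepted. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

if __name__ == "__main__":
    samples = [
        "<a><b>text</b><c/></a>",   # one root, matched tags, empty tag
        "<a/><b/>",                 # two root elements
        "<a><b>text</a>",           # end tag does not match start tag
        "<a><b><c></b></c></a>",    # overlapping elements
    ]
    for sample in samples:
        print(sample, is_well_formed(sample))
```

Note that this checks well-formedness only; it says nothing about whether the document follows a particular DTD.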
Special XML characters

Less than symbol (<) - &lt;
Greater than symbol (>) - &gt;
Quote symbol (") - &quot;
Apostrophe symbol (') - &apos;
Ampersand symbol (&) - &amp;
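Programs that generate XML normally apply these substitutions automatically. A small sketch using escape and unescape from Python's standard xml.sax.saxutils module; the to_xml_text and from_xml_text wrappers are hypothetical names.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters are
# passed as extra entities so all five predefined references are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}

def to_xml_text(raw):
    """Replace the five special characters with their entity references."""
    return escape(raw, QUOTES)

def from_xml_text(escaped):
    """Turn the five entity references back into plain characters."""
    return unescape(escaped, {ref: char for char, ref in QUOTES.items()})

if __name__ == "__main__":
    raw = 'a < b & "c"'
    print(to_xml_text(raw))
    print(from_xml_text(to_xml_text(raw)) == raw)
```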
Related Standards


XML documents and DTDs provide foundation for an Internet document paradigm – do not provide all necessary features!
Need for XML standards (some of which are still under development)
Standard Efforts
Standard                       | Abbreviation | Specification Status | Purpose
XML Namespaces                 | Namespaces   | Recommendation       | Prevent overlap of names used by different software
XML Linking Language           | XLink        | Working Draft        | Flexible document linking
Extensible Stylesheet Language | XSL          | Working Draft        | Flexible document presentation
XSL Transformations            | XSLT         | Working Draft        | Easy transformation of XML content from one data format to another
XML Schema                     | N/A          | Working Draft        | More extensive document definition rules than DTDs
XML Namespaces 



XML is based on elements, each distinguished by a unique element name
When applications process a document, they associate element content with the corresponding element name
Naming collisions are problematic – for example, both Accounting and Fulfillment applications use the term ‘status’ to signify different meanings
Using XML Namespaces, developers can qualify each use of the ‘status’ element in a reconciliation document
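The ‘status’ collision can be sketched in Python with the standard library's xml.etree.ElementTree. The namespace URIs, the acct and ful prefixes, and the sample reconciliation document are all hypothetical; any unique URIs the two parties agree on would serve.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per application.
NS = {
    "acct": "http://example.com/accounting",
    "ful": "http://example.com/fulfillment",
}

RECONCILIATION = """
<reconciliation xmlns:acct="http://example.com/accounting"
                xmlns:ful="http://example.com/fulfillment">
  <acct:status>paid</acct:status>
  <ful:status>shipped</ful:status>
</reconciliation>
"""

def statuses(xml_text):
    """Read both status elements, told apart by their namespace."""
    root = ET.fromstring(xml_text.strip())
    return {
        "accounting": root.findtext("acct:status", namespaces=NS),
        "fulfillment": root.findtext("ful:status", namespaces=NS),
    }

if __name__ == "__main__":
    print(statuses(RECONCILIATION))
```

Both elements share the local name status, but the qualifying namespace lets each application find only its own.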
XML Linking Language - XLink



Similar to HTML linking, but XLink enables people to specify links with multiple target documents by using the XLink namespace
Two forms – simple and extended
As XLink is still under development, standard HTML linking is acceptable
XSL Transformations - XSLT



XSLT is used to transform data from one format to another
Used when two companies want to exchange the same information but each applies its own internal DTD
Provides a mechanism to support customised data flow – delivers data to each application in the format it desires
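Python's standard library has no XSLT processor, so the sketch below is a stand-in for the idea rather than XSLT itself: it rewrites a document from one company's element names into another's. The RENAME table, the element names and the transform helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from company A's element names to company B's.
RENAME = {"order": "purchase", "cust": "customer", "amt": "total"}

def transform(xml_text):
    """Rewrite a document in format A into format B by renaming elements."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Names without an entry in RENAME are left unchanged.
        element.tag = RENAME.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(transform("<order><cust>Brown</cust><amt>42</amt></order>"))
```

A real XSLT stylesheet expresses the same kind of mapping declaratively, as templates matched against the source document.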
XML Schema


Using XML Schema instead of DTDs to specify the content of XML documents offers the advantages of using XML syntax and enforcing datatype restrictions on element content and attribute values
Additional features beyond DTDs, such as the ability to define recurring blocks of elements or attributes once and then re-use the definition many times
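To illustrate what a datatype restriction adds over a DTD's #PCDATA, here is a hand-rolled Python sketch. It is not XML Schema validation (the standard library has none); the DATATYPES table, the element names and the datatype_errors helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical datatype rules, keyed by element name. A real XML Schema
# would declare these with built-in types such as xs:integer or xs:date.
DATATYPES = {
    "quantity": int,
    "price": float,
    "shipped": date.fromisoformat,
}

def datatype_errors(xml_text):
    """Return the names of elements whose text fails its datatype rule."""
    errors = []
    for element in ET.fromstring(xml_text).iter():
        parse = DATATYPES.get(element.tag)
        if parse is None:
            continue  # no rule for this element
        try:
            parse((element.text or "").strip())
        except ValueError:
            errors.append(element.tag)
    return errors

if __name__ == "__main__":
    print(datatype_errors(
        "<order><quantity>three</quantity><price>9.99</price>"
        "<shipped>2004-05-01</shipped></order>"
    ))
```

A DTD would accept "three" as the content of quantity, since it can only say that the element holds character data.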
XML software support



Fundamental software components – low-level XML capabilities such as parsing and generating documents
Software development tool support – tools for rapid application development, including utilities for manipulating XML documents and integrating with document development tools
Document development tools – make defining DTDs and authoring XML documents easier than with text editors; graphical modeling tools/browsers for directly manipulating XML documents; support for existing web content development and management
XML software support


Web infrastructure support – to develop adoption momentum, browsers and servers must support XML so that anyone can view XML from a standard browser
Translation components – convert all data formats to XML, similar to word processors supporting a ‘save as rich text format’ feature
Inevitable barriers to deployment (standards, security) need to be overcome (will be investigated later in module)
XML Editor



Windows Notepad or WordPad
Altova XMLSpy
Microsoft XML Notepad
http://msdn.microsoft.com/xml/notepad/intro.asp

XML writer – available from Wattle Software
http://www.xmlwriter.com/

Macintosh – Emile editor
http://www.in-progress.com/emile/
Web Browsers





Internet Explorer
Netscape Navigator
Mozilla Firefox
W3C Amaya Web browser – developed for testing XML documents
Download for FREE

http://www.w3c.org/Amaya
Application Areas












Information accessibility and sharing
Information distribution capabilities
Document life cycle and content management
Accumulating knowledge from employees (tacit and explicit)
Searching knowledge repositories, such as database systems and the Internet
Knowledge categorisation systems and limitations of classifications
Workflow systems and integration barriers
Application integration
System architecture standardisation and compatibility
Customized publishing
File configuration and logging
Electronic commerce
Real world XML







WML – Web pages for mobile devices
XMLNews – news stories
CDF – Web channels
OSD – descriptions of software
OFX – financial information (EFT, etc.)
RDF – descriptions of info in web pages (helps search engines)
MathML – mathematical equations
P3P – Web privacy policies
RELML – real estate listings
HRMML – human resource information (resumes, etc.)
VoxML – voice response scripts (Press 1 for this, Press 2 for that, etc.)
VML – vector graphics
SVG – vector graphics
SMIL – multimedia presentations
3DML – three-dimensional virtual worlds
XML Example
employee.xml
staff.xsl