Powerpoint 97/2000 Format

Standards For Hybrid Libraries:
Web Standards
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
[email protected]
http://www.ukoln.ac.uk/
UKOLN is funded by the Library and Information Commission, the Joint
Information Systems Committee (JISC) of the Higher Education Funding Councils,
as well as by project funding from the JISC and the European Union.
UKOLN also receives support from the University of Bath where it is based.
Contents
• Introduction
• Background To The Web Architecture:
• Addressing
• Data Format
• Transfer
• Metadata
• Conclusions
Standardisation
Relevant Bodies
Proprietary
• De facto standards
• Often initially appealing (cf GIF, PowerPoint, PDF)
• May emerge as standards
Formal
• Formal international/national standards processes
• ISO, CEN, NISO, ECMA, ANSI, BSI…
• Can be slow-moving and bureaucratic
• Produce robust standards
W3C
• Produces W3C Recommendations
• Managed approach
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member & public review
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
Community
• Library groups
• Cultural Heritage
• Government
[Diagram labels: 23950, PNG, HTML, Java?, HTTP, URN, whois++]
Background to the Web
The web was initially very successful due to its simplicity
[Diagram: a Client (Mosaic, Netscape, IE) asks a Server (CERN, Apache, IIS) "Give me foo.html from www.bath.ac.uk"; the Server replies "Here it is" and returns HTML]
The web is based on three key architectural components:
• Data Format: HTML (HyperText Markup Language)
• Addressing: URLs (Uniform Resource Locators)
• Transport: HTTP (Hypertext Transfer Protocol)
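The client/server exchange described on this slide can be sketched in a few lines of Python. This is an illustration only, not a real client: the helper name build_request is hypothetical, no network access takes place, and the code merely shows how the URL supplies the address while HTTP supplies the wording of the "Give me foo.html" message.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Split a URL into (host, request text) for a simple HTTP/1.0 GET.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    # An HTTP/1.0 request: a request line, optional headers, then a blank line.
    request = f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"
    return parts.hostname, request


host, request = build_request("http://www.bath.ac.uk/foo.html")
print(host)     # the server the client would contact
print(request)  # the text the client would send
```

The server's reply ("Here it is") would carry the third component, the HTML document itself.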
Problems With the Web
Although the web has been successful, there
are problems:
• Performance - the web is too slow
• Resource discovery - lack of a metadata
architecture
• HTML’s lack of arbitrary structure
• Accessibility - difficulties in accessing information for
visually impaired users, people using PDAs, etc.
• Functionality - difficult to deploy new applications
on the web
• Addressing
• etc.
Solutions (Today)
HTML 4.0 used in conjunction with CSS 2.0 (Cascading
Style Sheets) and the DOM provides an architecturally
pure, yet functionally rich environment
HTML 4.0 - W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS 2.0 - W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support
DOM - W3C-Rec
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content
Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs
HTML's Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
– Time-consuming standardisation process (<ABBREV>)
– Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
– Covers specialist area (maths, music, ...)
– Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
– Find all memos copied to John Smith
– How many unique tracks on Jackson Browne CDs
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Forms the basis of B2B applications
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in Netscape 5 and IE 5
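Arbitrary elements are what make structural queries such as "find all memos copied to John Smith" possible. The sketch below uses Python's standard xml.etree.ElementTree module; the element names (memos, memo, to, cc, subject), the sample data and the function name memos_copied_to are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo collection; element names are made up for this example.
MEMOS = """
<memos>
  <memo><to>Ann Jones</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Rota</subject></memo>
  <memo><to>Raj Patel</to><cc>John Smith</cc><cc>Ann Jones</cc><subject>Web site</subject></memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of memos that have a <cc> element equal to name."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("subject")
        for memo in root.findall("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


subjects = memos_copied_to(MEMOS, "John Smith")
print(subjects)
```

The memo addressed to John Smith is not matched, because the query looks only at the cc elements; in plain HTML that distinction would be lost in the display markup.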
XML Deployment
Ariadne issue 15 has an article
on "What Is XML?"
Describes how XML support
can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML
Examples of intermediaries
See http://www.ariadne.ac.uk/issue15/what-is/
XHTML
XHTML:
• an XML representation of HTML
Issues:
• Documents must be well-formed
• Tags in lowercase
• Quote attributes: <img src="foo" height="10" />
• End tags required: <li>...</li>
• Empty elements: <img src="foo" /> <br />
• Tidy utility – see
<http://www.w3.org/People/Raggett/tidy/>
• See <URL: http://www.w3.org/TR/WD-html-in-xml/>
Question: Is it time to produce XHTML documents?
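The well-formedness requirement can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name is_well_formed and the two sample fragments are assumptions made for illustration, and the check covers well-formedness only (it does not test for lowercase tags or validate against the XHTML DTD).

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style: omitted </li> end tags and an unclosed <br>.
html_style = "<ul><li>One<li>Two<br></ul>"
# XHTML-style: every element closed, empty element written as <br />.
xhtml_style = "<ul><li>One</li><li>Two<br /></li></ul>"

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```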
Namespaces and Linking
XML Namespaces
What if an XML document contains a <TITLE> for
the document and a <TITLE> for the name of a
book?
XML Namespaces enable such clashes to be
resolved
Each set of names is identified by a URI (typically a URL)
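A small sketch of the <TITLE> clash, again using Python's standard xml.etree.ElementTree module. The two namespace URIs, the prefixes and the element names are made up for this example; the point is only that the two title elements stay distinct because each belongs to a different namespace.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs below are invented for illustration.
DOC = """
<doc:report xmlns:doc="http://example.org/ns/document"
            xmlns:bk="http://example.org/ns/book">
  <doc:title>Reading list</doc:title>
  <bk:book><bk:title>An Example Book</bk:title></bk:book>
</doc:report>
"""

# Prefix-to-URI map used when searching; the prefixes are local shorthand.
NS = {
    "doc": "http://example.org/ns/document",
    "bk": "http://example.org/ns/book",
}

root = ET.fromstring(DOC)
document_title = root.findtext("doc:title", namespaces=NS)
book_title = root.findtext("bk:book/bk:title", namespaces=NS)
print(document_title, "|", book_title)
```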
XSL stylesheet language will provide extensibility and
transformation facilities (e.g. create a table of contents
or create metadata from structured data)
XLink and XPointer should provide richer hyperlinking
mechanisms in the future
Addressing (Problems)
URLs (e.g. http://www.bris-poly.ac.uk/depts/music/)
have limitations:
• Lack of long-term persistency
– Organisation changes name
– Department shut down or merged
– Directory structure reorganised
• Inability to support multiple versions of
resources (mirroring)
ISBN/ISSN also problematic:
• Not tied to the work
• Nor to the item at hand
Addressing (Solutions)
PURLs (Persistent URLs):
• Provide single level of redirection
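The single level of redirection can be pictured as a lookup table held by a resolver. This is a conceptual sketch only, not the behaviour of any particular PURL server: the table, the path /net/music-dept, the replacement address and the function name resolve are all hypothetical.

```python
# Hypothetical resolver table: persistent name -> current location.
PURL_TABLE = {
    "/net/music-dept": "http://www.bris-poly.ac.uk/depts/music/",
}


def resolve(purl_path):
    """Return (status, location) roughly as a redirecting server might."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        return 404, None
    return 302, target


print(resolve("/net/music-dept"))

# When the organisation is renamed, only the table entry changes;
# published links to the persistent name keep working.
PURL_TABLE["/net/music-dept"] = "http://www.example.ac.uk/music/"
print(resolve("/net/music-dept"))
```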
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a
solution
• Aimed at supporting rights ownership
• Business model needed
• Do two copies of a digital object get
separate DOIs?
Transport
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems
HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible
HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)
• Integration of application (distributed searching?)
Metadata
Metadata - the missing architectural component
from the initial implementation of the web
[Diagram: Addressing (URL), Transport (HTTP), Data format (HTML), Metadata (RDF)]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
RDF Data Model
RDF - the metadata framework
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
[Diagram: the Resource page.html has a Property Cost with Value £0.05. A second graph links page.html through Property to a node (PropObj) with InstanceOf PropertyType, PropName Cost, Value £0.05 and ValidUntil 11-May-98]
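The graph on this slide can be approximated as a list of (resource, property, value) statements. The sketch below uses plain Python tuples rather than RDF syntax, and the second list is only one plausible reading of the diagram: the node identifier "_:p1" and the helper name values_of are assumptions introduced for illustration.

```python
# Each statement is a (resource, property, value) triple.
# Plain tuples are used for illustration; this is not RDF syntax.
simple = [
    ("page.html", "Cost", "£0.05"),
]

# To say more about the Cost statement (here, how long it is valid),
# an intermediate node carries the property name, the value and the
# qualifier. "_:p1" is an invented identifier for that node.
qualified = [
    ("page.html", "Property", "_:p1"),
    ("_:p1", "PropName", "Cost"),
    ("_:p1", "Value", "£0.05"),
    ("_:p1", "ValidUntil", "11-May-98"),
]


def values_of(triples, subject, prop):
    """Return every value stated for the given subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


print(values_of(simple, "page.html", "Cost"))
node = values_of(qualified, "page.html", "Property")[0]
print(values_of(qualified, node, "ValidUntil"))
```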
Conclusions
To conclude:
• Standards are important, especially for national
initiatives and other large-scale services
• Proprietary solutions are often tempting because:
– They are available
– They are often well-marketed and well-supported
– They may become standardised
– Solutions based on standards may not be properly
supported by applications
• Metadata and structured data formats are big growth
areas
• Deployment of new standards is an important
question