Standards For Building Web Sites

Brian Kelly
UK Web Focus
Email Address
[email protected]
UKOLN
University of Bath
http://www.ukoln.ac.uk/
UKOLN is funded by Resource: The Council for Museums, Archives and Libraries,
the Joint Information Systems Committee (JISC) of the Higher Education Funding
Councils, as well as by project funding from the JISC and the European Union.
UKOLN also receives support from the University of Bath where it is based.
Contents
• Introduction
• Web Standards Overview
• Web Standards:
– Data Formats
– Transport
– Addressing
– Metadata
• Deployment Issues
• Questions
Aims of Talk
• To describe standards bodies involved with the Web
• To review key Web standards
• To report on developments to Web standards
• To briefly address implementation models
UK Web Focus / W3C
UK Web Focus:
• JISC funded post based at UKOLN (Bath Univ)
• Advises UK HE community on web issues
• Represents JISC on W3C
UKOLN
• UK Office for Library and Information Networking
• Applied research (e.g. JISC and EU-funded
projects) and dissemination
W3C (World Wide Web Consortium):
• International consortium, with headquarters at
MIT, INRIA and Keio University (Japan)
• Coordinates development of web protocols
and file formats
Standards, Architectures,
Applications, Resources
This talk is concerned primarily with the standards used to
develop web services
Standards: concerned with
protocols and file formats
Open standards vs. Proprietary
HTML / XML vs. PDF
CSS / XSL vs. HTML
Applications: software
products used to implement
systems
Apache / IIS
FrontPage / Dreamweaver
Oracle / SQLServer
Architectures: models for
implementing systems
NT / Unix
File system / database application
HTML tools / content management
Resources: financial and staff
costs needed to implement
systems
Development vs. Migration costs
Use of in-house expertise
In-house vs. out-sourced
Licensed vs. open source
Standards
Need for standards to provide:
• Platform independence
• Application independence
• Avoidance of patented technologies
• Flexibility ("evolvability" - Tim Berners-Lee)
• Architectural integrity
• Long-term access to data
Ideally look at standards first, then find applications
which support the standards
Difficult to achieve this ideal!
Standardisation
Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint)
• May emerge as standards
W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review
ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
Other
• Standards bodies such as ECMA
• Community groups which can agree on, say, profiles
Examples shown on the slide: HTML, proprietary HTML extensions, PDF and Java?, PNG, Z39.50, HTTP, URN, whois++
The Web Vision
Tim Berners-Lee's vision for the Web:
• Automation of information management:
If a decision can be made by machine, it should
• All structured data formats should be based on
XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF
A useful overview of Tim Berners-Lee's vision for the
Web is given in his book Weaving The Web.
Web Protocols
Web initially based on three simple protocols:
• Data Formats
HTML (HyperText Markup Language) provides the data format for native documents
• Addressing
URLs (Uniform Resource Locators) provide an addressing mechanism for web resources
• Transport
HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server
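As a rough illustration of how addressing and transport fit together, the sketch below splits a URL into its parts and composes the text of a minimal HTTP/1.0 request for it. The function name build_get_request is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Compose the text of a minimal HTTP/1.0 GET request for an http:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http:// URLs")
    # An empty path means the root resource of the server.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # HTTP/1.0 does not require a Host header; it is included here because
    # servers hosting several sites on one address rely on it.
    return f"GET {path} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    print(build_get_request("http://www.ukoln.ac.uk/"))
```

The server would answer such a request with a status line, headers and (typically) an HTML document, which is the data format component.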
HTML History
HTML 1.0 Unpublished specification. DTD developed by Tim Berners-Lee (CERN).
HTML 2.0 Spec. based on innovations from NCSA (forms and inline images!)
HTML 3.0 Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation. Little implementation experience.
Proprietary Introduction of proprietary HTML elements by Netscape and Microsoft (browser wars)
HTML 3.2 Spec. based on description of mainstream innovations in marketplace
HTML 4.0 Current recommendation
Problems with Extensions
Device Dependency
• Resources are dependent on a particular browser
• Platform dependency
Costs
• Real costs in supporting multiple architectures
• Potential costs in re-engineering
Architecture
• Proprietary innovations have been flawed:
– Merging content and appearance
– Maintenance of resources
• Accessibility problems:
– Poor support for access by disabled
But:
• Experiments are needed
HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0
(Cascading Style Sheets) and DOM 1.0 provides an
architecturally pure, yet functionally rich environment
HTML 4.0
• Improved forms
• Hooks for stylesheets
• Hooks for scripting
languages
• Table enhancements
• Better printing
CSS Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with
known bugs
CSS 2.0
• Support for all HTML
formatting
• Positioning of HTML
elements
• Multiple media support
DOM 1.0
• Document Object Model
• Hooks for scripting
languages
• Permits changes to
HTML & CSS properties
and content
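The idea of the DOM as a set of hooks for scripts can be sketched outside a browser. The example below uses Python's xml.dom.minidom, which exposes a DOM-style interface, to change a style attribute and the text content of a small page. The sample markup and the function name restyle are assumptions for illustration, not browser scripting code.

```python
from xml.dom.minidom import parseString

# Hypothetical sample page, written as well-formed markup so that an XML parser accepts it.
PAGE = '<html><body><p id="intro" style="color: black">Hello</p></body></html>'


def restyle(page_text):
    """Parse the page, then change a style property and the content through DOM calls."""
    doc = parseString(page_text)
    para = doc.getElementsByTagName("p")[0]
    # Change a presentation property held in the style attribute.
    para.setAttribute("style", "color: red")
    # Change the text content of the paragraph.
    para.firstChild.data = "Hello, DOM"
    # Add a new element to the document tree.
    extra = doc.createElement("p")
    extra.appendChild(doc.createTextNode("Added by script"))
    para.parentNode.appendChild(extra)
    return doc


if __name__ == "__main__":
    print(restyle(PAGE).documentElement.toxml())
```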
HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
– Time-consuming standardisation process
(<ABBREV>)
– Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
– Covers specialist area (maths, music, ...)
– Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits
functionality:
– Find all memos copied to John Smith
– How many unique tracks on Jackson Browne CDs
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became
W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft,
etc.)
• Support in Netscape 6 (?) and IE 5
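A query such as "find all memos copied to John Smith" becomes straightforward once the structure is explicit in the markup. The sketch below assumes a hypothetical memo vocabulary (memos, memo, to, cc, subject) and invented sample data, and uses Python's standard ElementTree module.

```python
import xml.etree.ElementTree as ET

# Invented sample data in a hypothetical memo vocabulary.
MEMOS = """
<memos>
  <memo><to>Ann Jones</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Web server</subject></memo>
  <memo><to>Raj Patel</to><cc>John Smith</cc><cc>Ann Jones</cc><subject>Training</subject></memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of all memos that have a <cc> element naming the person."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("subject")
        for memo in root.findall("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


if __name__ == "__main__":
    print(memos_copied_to(MEMOS, "John Smith"))
```

The memo addressed directly to John Smith is not returned, because the query distinguishes the cc element from the to element. Plain HTML has no way to mark that distinction.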
XML Concepts
Well-formed XML resources:
Make end-tags explicit: <li>...</li>
Make empty elements explicit: <img ... />
Quote attributes: <img src="logo.gif" height="20" />
Use consistent upper/lower case
Valid XML resources:
Need DTD
XML Namespaces:
Mechanism for ensuring unique XML elements:
<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>
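The well-formedness rules and the namespace mechanism can both be checked with an ordinary XML parser. The sketch below uses Python's standard ElementTree. The helper name is_well_formed is hypothetical, and the namespace example is written with the xmlns attribute syntax of the Namespaces in XML Recommendation, using the namespace URI from the slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Explicit end-tags, explicit empty elements and quoted attributes are accepted.
GOOD = '<ul><li><img src="logo.gif" height="20" /></li></ul>'
# Each of these breaks one of the rules listed above.
MISSING_END_TAG = "<ul><li>one<li>two</ul>"
UNQUOTED_ATTRIBUTE = "<img src=logo.gif />"
INCONSISTENT_CASE = "<P>text</p>"

# A namespace makes the PART element globally unique.
NAMESPACED = '<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>'


def part_number(text):
    """Look up the PART element by its namespace URI rather than by its prefix."""
    root = ET.fromstring(text)
    return root.findtext("{http://foo.org/1998-001}PART")


if __name__ == "__main__":
    for sample in (GOOD, MISSING_END_TAG, UNQUOTED_ATTRIBUTE, INCONSISTENT_CASE):
        print(is_well_formed(sample), sample)
    print(part_number(NAMESPACED))
```

Note that this checks well-formedness only. Validity against a DTD needs a validating parser, which ElementTree is not.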
XLink, XPointer and XSL
XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviors:
– Expand-in-place / Replace / Create new window
– Link on load / Link on user action
• Link databases
<commentary xml:link="extended" inline="false">
<locator href="smith2.1" role="Essay"/>
<locator href="jones1.4" role="Rebuttal"/>
<locator href="robin3.2" role="Comparison"/>
</commentary>
XPointer will provide access to arbitrary portions of XML resource
XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents)
More XML Developments
Momentum behind XML is driving additional standardisation
developments
XML Path
A language for addressing parts of an XML document, designed to be used by XSLT and XPointer
XML Schemas (I)
Defining the nature of XML schemas and their component parts
XML Schemas (II)
Facilities for defining datatypes to be used in XML Schemas and other XML specifications
XSLT
A language for transforming XML documents into other XML documents
XML Infoset
An abstract data set containing the information available from an XML document
XHTML
XHTML:
• Extensible Hypertext Markup Language
• HTML represented in XML
• Some small changes to HTML:
– Elements in lowercase (<p> not <P>)
– Attributes must be quoted (<img src="logo" height="50" />)
– Elements must be closed (<p>..</p>)
– Empty elements must be closed (<img src="logo" .. />)
• Gain benefits from XML
• Tools available (e.g. HTML-Kit from
http://www.chami.com/html-kit/)
• See <http://www.webreference.com/xml/column6/> and <http://www.builder.com/Authoring/Xhtml/>
Addressing
URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations:
• Lack of long-term persistency
– Organisation changes name
– Department scrapped
– Directory structure reorganised
• Inability to support multiple versions of resources
(mirroring)
URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this
area)
Addressing - Solutions
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed
PURLs (Persistent URLs):
• Provide single level of redirection
Cache support:
• National caches could provide simple URN
support
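The single level of redirection behind PURLs can be sketched as a lookup table held by a resolver: users cite the persistent name, and only the table entry changes when a resource moves. The class below is a hypothetical in-memory model, not the PURL software itself, and the target URLs are invented example.org addresses.

```python
class PurlResolver:
    """In-memory model of a resolver mapping persistent names to current locations."""

    def __init__(self):
        self._targets = {}

    def register(self, name, target_url):
        """Create or update the current location for a persistent name."""
        self._targets[name] = target_url

    def resolve(self, name):
        """Return an HTTP-style (status, location) pair for the persistent name."""
        target = self._targets.get(name)
        if target is None:
            return 404, None
        # A redirect status tells the client to fetch the resource from its current location.
        return 302, target


if __name__ == "__main__":
    resolver = PurlResolver()
    resolver.register("/music/dept", "http://old-name.example.org/depts/music/")
    print(resolver.resolve("/music/dept"))
    # The organisation changes name: only the resolver entry is updated,
    # while every published reference to /music/dept stays the same.
    resolver.register("/music/dept", "http://new-name.example.org/music/")
    print(resolver.resolve("/music/dept"))
```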
For further information see:
<URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
<URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
Transport
HTTP/0.9 and HTTP/1.0:
• Made the Web popular
• Design flaws and implementation problems caused poor performance
HTTP/1.1:
• Addresses some of these problems
• 60% server support, client & proxy support beginning
• Performance benefits! (optimised implementation reduces packet traffic by 2/3)
• Is acting as fire-fighter
• Poor usage counting
• Not sufficiently flexible or extensible
HTTP/NG
HTTP/NG:
• Ideas for next generation of HTTP
• Produced various studies and
reports
• No longer being developed within
W3C
• Work now being coordinated by the
IETF
Metadata
Metadata - the missing architectural component from the initial implementation of the web
(Diagram: Metadata shown alongside the original three components: Addressing (URL), Transport (HTTP) and Data format (HTML))
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
Privacy
P3P (Platform for Privacy Preferences):
• Example of a metadata application
• Privacy concerns are a current barrier to Web
development (esp. in US)
• P3P project developing methods for exchanging
Privacy Practices of Web sites and user
• Documents on architecture and vocabulary
available
• See <URL: http://www.w3.org/TR/P3P/>
Digital Signatures
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 1.0 is based on PICS
• DSig 2.0 will be based on RDF and will
support signed assertion:
– This page is from the University of Bath
– This page is a legally-binding list of courses
provided by the University
• See <http://www.w3.org/DSig/>
RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine
understandable metadata for the web")
• Based on ideas from content rating (PICS),
resource discovery (Dublin Core) and site
mapping (MCF)
• Applications include:
– cataloging resources
– electronic commerce
– digital signatures
– intellectual property rights
– resource discovery
– intelligent agents
– content rating
– privacy
• See <URL: http://www.w3.org/RDF/>
RDF Model
RDF Data Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
(Diagram: the resource page.html is linked to the values Cost £0.05 and ValidUntil 11-May-98. Other labels on the diagram: Resource, Property, PropName, PropObj, Value, InstanceOf, PropertyType)
RDF Example
Example of Dublin Core metadata in RDF
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:DC="http://purl.org/dc/elements/1.0/">
<Description about="http://www.w3.org/folio.html">
<DC:title>The W3C Folio 1999</DC:title>
<DC:creator>W3C Communications Team</DC:creator>
<DC:date>1999-03-10</DC:date>
<DC:subject>Web development, World Wide Web Consortium,
Interoperability of the Web</DC:subject>
</Description>
</RDF>
See <http://www.w3.org/Metadata/Activity/>
RDF has been used to express data about the W3C Folio. The basic concept
is that metadata about this item on the Web is described through a collection
of properties called an RDF Description. Notice that RDF uses the familiar
XML syntax. This example also illustrates XML Namespaces.
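One way to see the data model behind this syntax is to read the description back out as (resource, property, value) statements. The sketch below does this with Python's standard ElementTree rather than a dedicated RDF library. It only handles the simple pattern used in the example, the function name statements is hypothetical, and the DC:subject element is left out to keep the sample short.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example above, without the DC:subject element.
RDF_XML = """<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:DC="http://purl.org/dc/elements/1.0/">
  <Description about="http://www.w3.org/folio.html">
    <DC:title>The W3C Folio 1999</DC:title>
    <DC:creator>W3C Communications Team</DC:creator>
    <DC:date>1999-03-10</DC:date>
  </Description>
</RDF>"""


def statements(rdf_text):
    """Return (resource, property URI, value) tuples for each simple Description."""
    root = ET.fromstring(rdf_text)
    result = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        resource = desc.get("about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local".
            namespace, _, local = prop.tag[1:].partition("}")
            result.append((resource, namespace + local, (prop.text or "").strip()))
    return result


if __name__ == "__main__":
    for statement in statements(RDF_XML):
        print(statement)
```

Each property name expands to a full URI through its namespace, which is what allows vocabularies such as Dublin Core to be mixed without clashes.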
RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust
Deployment Issues
What part of the spectrum are you closest to?
Must support standards
Go with the marketplace
I Support Standards
But:
• You probably use PowerPoint, don't you?
• Software vendors will subtly suck you into use of
proprietary features
• Home-grown solutions can be expensive (where are
all the good Perl / C programmers willing to work on
short-term contracts for a pittance in Universities?)
• Standards may not take off – remember Coloured
Book network protocols?
• Proprietary solutions may become standardised
• Standards may not yet be available (or finalised)
• Do users want standards? Will "We support
standards" conflict with "Our services are based on
user requirements"?
I Follow The Marketplace
Good New Labour philosophy, but:
• Can you trust your software vendor?
• Will your software vendor be around in a few years' time? ("I only buy Rover")
• Will your system be interoperable?
• What happens when you want to interwork
with partners or your organisation merges / is
taken over?
• What happens when you want to extend your
system beyond the limits set by your software
vendor?
Some Difficulties
We should acknowledge some difficulties in a
standards-based approach:
• Keeping up-to-date (look at nos. of documents at http://www.w3c.org/TR/ and size of http://www.diffuse.org/standards.html)
• Spotting the winning standards
• Implementing the standard in a timely way
• Dealing with the problems of the software vendor
• Resources!
Is It Worth It?
Has the Web stabilised?
• Are you thinking about WAP services?
• Will you want to (be forced to) make your
web service accessible?
• Will you want to deploy personalised
interfaces (e.g. My.Oxford.ac.uk)
• Will your web service move from
information provision to e-business?
• Do you want your University web site to
use business-to-business (B2B) protocols
to automate transfer of link and news items
to HERO (née HE Mall)?
What Should I Do?
What approaches should I use?
• Storing information in a structured format makes
subsequent redevelopment easier
• Be driven initially by standards and architectural
considerations, not by applications
• Consider use of more sophisticated web
management tools, rather than HTML authoring
tools
• An organisational standards guidelines document
(part of a Web Strategy document) may be useful
• Don't work in isolation:
– Monitor standards development (e.g. W3C)
– Listen to others in your community
– Talk and discuss issues within your community
Architectural Models
There is a need for more intelligent software which can process structured resources or reformat unstructured ones
(Diagram, simple model: HTML resource, Web server, browser)
Web server simply sends file to client
File contains redundant information (for old browsers) plus client interrogation support
(Diagram, intermediary model: HTML / XML / database resource, Intelligent Web server, Server proxy, Client proxy, browser)
Intermediaries can provide functionality not available at client:
• DOI support
• XML support
• Format conversion
Architectural Models –
e.g. XML Deployment
Ariadne issue 15 has article on "What Is XML?"
Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML
Examples of intermediaries
See http://www.ariadne.ac.uk/issue15/what-is/
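Back end conversion can be sketched as a small function that an intermediary runs before replying to a browser without XML support. The news vocabulary (news, item, title, link), the sample data and the function name news_to_html are all assumptions for illustration. A production service would more likely use an XSLT processor.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample data in a hypothetical news vocabulary.
NEWS = """<news>
  <item><title>Workshop &amp; talks</title><link>http://www.example.org/workshop</link></item>
  <item><title>New prospectus</title><link>http://www.example.org/prospectus</link></item>
</news>"""


def news_to_html(xml_text):
    """Convert the structured news items into an HTML list for display."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("item"):
        # Escape the values so that characters such as & and " stay valid in HTML.
        title = escape(item.findtext("title", default=""))
        link = escape(item.findtext("link", default=""))
        rows.append(f'<li><a href="{link}">{title}</a></li>')
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(news_to_html(NEWS))
```

The structured source stays the master copy, so the same items could later be served as XML to clients that support it natively.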
Conclusions
To conclude:
• Standards are important, especially for large organisations and national initiatives
• Proprietary solutions are often tempting
because:
– They are available
– They are often well-marketed and well-supported
– They may become standardised
– Solutions based on standards may not be
properly supported by applications
• Intermediaries may have a role to play in
deploying standards-based solutions
Further Information
W3C web site: <http://www.w3.org/>
W3C Tech Reports: <http://www.w3.org/TR/>
"The Development Of Web Protocols And
Formats", Exploit Interactive issue 1,
<http://www.exploit-lib.org/issue1/web/>
"Wilde's WWW: Technical Foundations of the World Wide
Web", Erik Wilde, ISBN 3-540-64285-4
Diffuse Project web site: <http://www.diffuse.org/>
"On Julius Caesar, Queen Eanfleda, and the lessons from time
past" Brian Meek, KCL
<http://www.kcl.ac.uk/kis/support/cc/staff/brian/caesar.html>
Community Information
Discuss standards, architectures and
applications on various mailing lists:
• website-info-mgt Mailbase list
• web-support Mailbase list
See <http://www.mailbase.ac.uk/>
Participate in the Institutional Web Management workshop (Bath University, 7-9th Sept). Details will be announced on the website-info-mgt Mailbase list
Question Time
Any questions?