Standards In A Digital World:
Z39.50, HTML, Java:
Do They Really Work?
Brian Kelly
UK Web Focus
UKOLN
University of Bath
[email protected]
http://www.ukoln.ac.uk/
Contents
• Introduction
• HTML
• Initial Roadmap / The Diversion / Back on Course
• W3C Standardisation Process
• Rivals to HTML
• PDF
• Viewers
• Scripting
• Client-side Scripting Languages
• Server side Scripting
• Distributed Searching
• Z39.50
• Other Protocols
• Conclusions
UK Web Focus
UK Web Focus:
• National web coordination post for UK HE community
• Based at UKOLN, University of Bath
• Responsibilities include:
– Technology watch
– Information dissemination in variety of ways:
– Workshops (national, regional)
– Presentations at conferences and seminars
– Online
– Coordination activities
– Representing JISC on W3C
• Brian Kelly appointed on 1st November 1996
– Involved with web since January 1993
– Previously worked at University of Newcastle, Leeds, Liverpool, and Loughborough
The Question
Where do you stand?
• The success of the Web is based on competition in the marketplace. Just look at the benefits provided by competition between Netscape and Microsoft.
• The success of the Web is based on building on open, non-proprietary standards. Use of proprietary systems has increased costs for the user, and resulted in flawed systems.
HTML Roadmap
HTML 1.0
• Gets things started
HTML 2.0
• CERN / NCSA partnership introduces NCSA Mosaic with support for forms and inline images
HTML+
• Proposal for enhancements including improved layout control (e.g. tables), maths, etc.
Style Sheets
• Mechanism for defining appearance
• Structure separate from appearance
• Various proposals (DSSSL, CSS, …)
HTML History
HTML 1.0
• Unpublished specification. DTD developed by Tim Berners-Lee (CERN).
HTML 2.0
• Spec. based on innovations from NCSA (forms and inline images!)
HTML 3.0
• Proposed spec. (renamed from HTML+).
• Very comprehensive
• Failed to complete IETF standardisation process
• Little implementation experience
HTML 3.2
• Spec. based on description of mainstream innovations in marketplace
HTML 4.0
• Current proposal.
HTML Wars
October 1994
• Netscape released (Mosaic Communications Corporation)
• Quality browser, but supported proprietary tags (<BLINK>, <FONT>, etc.)
1995
• New versions of Netscape released, supporting additional proprietary tags (<SPACER>, <LAYER>, etc.)
1996
• Microsoft respond to competition with their own proprietary tags (<MARQUEE>, etc.)
HTML Wars - The Problems
Device Dependency
• Resources are dependent on a particular browser
• Platform dependency
Costs
• Costs in supporting authoring tool
• Potential costs in re-engineering
Architecture
• Proprietary innovations have been flawed:
– Merging content and appearance
– Maintenance of resources
• Accessibility problems:
– Poor support for access by disabled users (e.g. speaking browsers for the visually impaired)
End of the Wars?
Thursday, August 21 1996
Microsoft Pledge on HTML Standards
"HTML is the most basic and fundamental data format
of the Web.
Support for HTML standards ensures that content can
be viewed by any browser as the creator intended.
…. agreement on the most basic data format is critical
to interoperability and the continued growth of the
industry."
See http://www.microsoft.com/internet/html.htm
Microsoft Pledge (Cont.)
"Previous proprietary HTML extensions from Microsoft and other vendors have confused the market, hampered interoperability and been ill-conceived with respect to [HTML] design principles ...
Microsoft will agree to:
• Not ship extensions to HTML without first submitting them to W3C.
• Implement all W3C approved HTML standards.
• Clearly identify any not-yet-approved HTML tags we support as such.
• Publish a Document Type Definition (DTD) for its browser as mandated by SGML.
• Follow the architecture principles of HTML and its parent, SGML, when proposing new extensions.
Microsoft agrees to hold itself to these standards. Will all the other Web browser vendors, including Netscape, also agree to this conduct of behavior?"
HTML 4.0 and CSS
HTML 4.0 and CSS will provide an architecturally pure, yet functionally rich environment
HTML 4.0
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS
• Support for all HTML formatting
• Positioning of HTML elements
• Support for multiple media
Problems
Some problems with CSS are being experienced as a result of:
• Use of CSS features which changed during CSS development
• Browser supported features which changed
W3C Process
W3C:
• A consortium of subscribing member organisations
• Areas of work agreed by members
• Working group set up:
– Charter
– WG membership (restricted)
• Initial recommendations produced by WG
• Recommendation made public
• Feedback on open mailing lists and to editor
• Recommendation updated
• Members vote
User Interface:
• HTML
• Style Sheets
• Document Object Model
• Maths
• Graphics
• Fonts
W3C Process
Pros
• Work can be well-focussed
• Avoids "flaming"
• Battle can take place in private
• Implementation and development of spec closely linked
Cons
• Discussions are closed
• Process undemocratic
• Only rich companies can afford to take part
• Difficult for non-members to contribute their expertise
• Non-members may be developing systems in isolation
HTML - The Competition
What are the alternatives to HTML?
HTML
• An SGML DTD
• Describes document structure
• Used in conjunction with emerging style sheet proposal
• Agreements on standards emerging
PDF
• Adobe's Portable Document Format
• Provides control over appearance
• Proprietary
Native file format
• Store document in native format, and provide user with reader on client machine
SGML / XML
• Richer DTDs
PDF
PDF Pros
• Control over appearance not (yet) easily available in HTML
• Functionality of PDF Reader can be controlled (e.g. prevent copying, printing with watermarks)
PDF Cons
• Does not store document structure
• Proprietary
– How would we feel about it if it were owned by Microsoft?
– Remember GIF patent problems!
• Printing problems
Use of Native File Format
Files can be stored in their native file format (Word, PowerPoint, LaTeX, DVI, etc.)
Files may then be viewed using the application or a viewer which understands the format
Pros:
• No conversion needed
Cons:
• Viewing software needed
• Format version issues
• Indexing issues
• Viruses
• Proprietary
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc)
• Eliminates problems encountered in extending HTML:
– Extension by fiat e.g. <FONT>
– Public experiments e.g. the <BLINK> tag
– The standards process e.g. Maths
• Agreement achieved quickly
• Support from industry (SGML vendors, Microsoft, etc.)
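The point about arbitrary elements can be illustrated with a minimal sketch in Python, using the standard library's xml.etree.ElementTree. Only <STUDENTNUMBER> comes from the slide. The <STUDENT> wrapper, the <NAME> element, the sample values and the student_number helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are chosen by the author,
# not fixed in advance as they are in HTML.
SAMPLE = """<STUDENT>
  <NAME>A. N. Other</NAME>
  <STUDENTNUMBER>12345</STUDENTNUMBER>
</STUDENT>"""


def student_number(xml_text):
    """Return the text of the first <STUDENTNUMBER> child, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("STUDENTNUMBER")
    return node.text if node is not None else None


print(student_number(SAMPLE))
```

A generic XML parser handles the author-defined elements without any change to the parser itself, which is the contrast with extending HTML tag by tag.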
XML Support
Microsoft have expressed support for XML:
"Internet Explorer version 4.0 will support a few
XML applications (such as CDF). Microsoft will
be supporting XML in future versions of Internet
Explorer"
See http://www.microsoft.com/standards/xml-f.htm
Note how they will be supporting an ISO
standard!
Metadata
Metadata - the missing architectural component from the initial implementation of the web
• Addressing: URL
• Transport: HTTP
• Data format: HTML
Metadata Requirements
Imagine a university prospectus on the web
Requirement                                           Protocol
Available in Middle East, where porn filters in use   PICS (rating system)
Resource discovery (find “Bath prospectus”)           Dublin Core
Legally binding assertion                             Digital Signature (DSig)
Delivered in appropriate format (HTML, PDF)           Transparent Content Negotiation
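The last requirement, delivering the prospectus in an appropriate format, can be sketched as below. This is plain matching against an HTTP Accept header, not the full Transparent Content Negotiation protocol. The function names and the list of available formats are assumptions for illustration, and wildcards such as */* are deliberately ignored.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def choose_format(accept_header, available):
    """Pick the available media type the client prefers most, or None."""
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best


# Hypothetical: the prospectus is held in these two formats.
AVAILABLE = ["text/html", "application/pdf"]

print(choose_format("application/pdf;q=0.9, text/html;q=0.5", AVAILABLE))
```

The same resource address can then serve HTML to one client and PDF to another, depending on what each says it accepts.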
Metadata Standards
PICS
• Agreement within industry (US Communications Decency Act perceived as threat)
• Format moving to XML in PICS/NG
Dublin Core
• Pressure from library community results in changes to HTML 4
• Format likely to move to XML
Digital Signatures
• Based on PICS/NG
W3C to set up a Metadata Coordination Group
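As a rough illustration of Dublin Core carried inside HTML, the sketch below collects META elements from a page using Python's standard html.parser module. The "DC." name prefix is a common embedding convention and is assumed here rather than taken from the slides. The sample page and the DublinCoreParser class are hypothetical.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect <META NAME="DC.xxx" CONTENT="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.metadata[name] = attributes.get("content")


# Hypothetical page for the prospectus example.
PAGE = """<HTML><HEAD>
<TITLE>Prospectus</TITLE>
<META NAME="DC.title" CONTENT="University of Bath Prospectus">
<META NAME="DC.creator" CONTENT="University of Bath">
<META NAME="keywords" CONTENT="not Dublin Core">
</HEAD><BODY></BODY></HTML>"""

parser = DublinCoreParser()
parser.feed(PAGE)
print(parser.metadata)
```

A resource discovery service could index these name and content pairs instead of guessing at the meaning of the page text.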
Other XML Developments
XML seems to be gaining momentum:
PICS
• Moving from rating system to key part of metadata architecture
CDF
• Channel Definition Format
• Microsoft proposal for push technology
OPS
• Open Profiling Specification
• Microsoft proposal
XML Web Collections
• Microsoft proposal for defining relationships between resources
MCF using XML
• Netscape proposal for describing metadata for collections of resources using XML
CML
• Chemical Markup Language
MML
• Math Markup Language
Scripting
Background:
• Netscape's JavaScript (renamed from LiveScript) was the first widely deployed scripting language
• Problems with inter-working between different versions
• Problems with inter-working across browsers (Microsoft and JScript)
• Problems with use of multiple scripting languages in a document
Scripting
Developments:
• JavaScript handed to standards body (ECMA)
See http://www.ecma.ch/memento/tc39.htm
• W3C developing standards for integrating scripting languages with HTML
See http://www.w3.org/TR/WD-script
• W3C working on Document Object Model (DOM): ".. a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents."
See http://www.w3.org/MarkUp/DOM/
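What "dynamically access and update" looks like in practice can be sketched with Python's xml.dom.minidom, a minimal DOM implementation in the standard library. This stands in for a browser scripting environment. The sample document and the "highlight" class name are assumptions for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical well-formed document standing in for a web page.
doc = parseString("<html><body><p id='intro'>Old text</p></body></html>")

# Access: find the paragraph through the document tree.
para = doc.getElementsByTagName("p")[0]

# Update content: change the data held by the paragraph's text node.
para.firstChild.data = "New text"

# Update structure: build a new element and append it to the body.
body = doc.getElementsByTagName("body")[0]
extra = doc.createElement("p")
extra.appendChild(doc.createTextNode("Added by script"))
body.appendChild(extra)

# Update style-related information: set an attribute on the element.
para.setAttribute("class", "highlight")

result = doc.documentElement.toxml()
print(result)
```

Because the calls are defined by the interface rather than by a particular browser, the same sequence of operations could be written in any language with a DOM binding.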
Java
Java:
• Development begun by Sun in early 1990s (known as Oak)
• Moved to Web and released in 1995
• Programming language and virtual machine environment (provides portability and security)
• See http://java.sun.com/
Java Applications
Java is gaining momentum:
• Interactive applications
• Enhanced user interfaces
• Replacing conventional desktop applications
• Extending browsers
http://www.mini.co.uk/
Java Standardisation
Java developments:
• Sun submitting Java to standards body (ISO/IEC JTC1)
• Concerns over process ("Microsoft believes that .. that Sun wishes to retain full ownership and control over its Java specifications ..")
• See http://java.sun.com/aboutJava/standardization/index.html
Distributed Searching - The Problem
End users face difficulties due to the wide variety of search interfaces available
Possible Solutions
Agree to use the same software
• Unlikely to happen
• Undesirable
Agree to implement similar interfaces
• Probably not feasible
Have a centralised database
• Scaling problems
Use software which implements a protocol designed to provide a common search interface across diverse services
• e.g. Z39.50
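The last option, a common search interface across diverse services, can be sketched as below. This is not Z39.50 itself, only the general pattern of one agreed interface with many back ends behind it. The class names, the in-memory back end and the sample titles are all hypothetical.

```python
from abc import ABC, abstractmethod


class SearchService(ABC):
    """The common interface every participating service agrees to implement."""

    @abstractmethod
    def search(self, query):
        """Return a list of record titles matching the query."""


class InMemoryCatalogue(SearchService):
    """Hypothetical back end: a plain list of titles held in memory."""

    def __init__(self, name, titles):
        self.name = name
        self.titles = titles

    def search(self, query):
        q = query.lower()
        return [t for t in self.titles if q in t.lower()]


def search_all(services, query):
    """Send one query to every service and group the results by service name."""
    return {s.name: s.search(query) for s in services}


services = [
    InMemoryCatalogue("library", ["Z39.50 in practice", "HTML primer"]),
    InMemoryCatalogue("archive", ["Java applets", "Z39.50 toolkits"]),
]
print(search_all(services, "z39.50"))
```

The client only depends on the shared interface, so each service keeps its own software and database while the end user sees a single way of searching.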
An Applications Solution
Metacrawler can be used to search several large search engines.
Problems:
• Breaks if APIs change
• Centralised system
http://www.metacrawler.com/
Z39.50 - What Is It?
Z39.50:
• A protocol which specifies data structures and interchange rules that allow a client machine to search databases on a server machine and retrieve records that are identified as a result of the search
• Maintained by Library of Congress
• Developed by ZIG
Why is it important?
• Powerful searching
• Local, familiar interface
• Retrieves structured data
Z39.50 History
Z39.50 (1988)
• NISO work with roots in OSI work
• "an unimplementable abomination which should never have been adopted"
• "Inspired" WAIS (which was not interoperable)
Z39.50 (1992)
• Implementation experience
• OSI now regarded as failure
Z39.50 (version 3)
• Accepted as ISO standard in 1996 (ISO 23950)
• Implemented using TCP/IP
• Toolkits, profiles, etc now available
Taken from Clifford Lynch's article at
http://hosted.ukoln.ac.uk/mirrored/lisjournals/dlib/dlib/dlib/april97/04contents.html
Z39.50 Pilot
UKOLN is piloting Z39.50 across a number of services (UKOLN web site, BUBL, eLib project database, ...)
Imagine searching across JISC services (and institutions):
• Find the chemical XML browser, and relevant reviews & papers.
• Search HENSA software archive, Mailbase lists, a Chemistry gateway and Imperial College web site
Related Protocols
LDAP
• Lightweight Directory Access Protocol
• Derived from X.500 directory service
• See "Lightweight Directory Access Protocol" http://ds.internic.net/rfc/rfc1777.txt
• See also http://www.novell.com/products/nds/ldap.html and http://www.critical-angle.com/ldapworld/Welcome.html
whois++
• Derived from whois protocol for finding people (IETF)
• See "Architecture of the Whois++ Index Service" at the URL http://ds.internic.net/rfc/rfc1913.txt
What The Software Companies Say
Netscape (see http://search.netscape.com/newsref/std/standards_qa.html)
• [We will] aggressively support open standards wherever they exist
• Work within the open standards process to innovate valuable new functionality in ways that promote openness and interoperability.
• All current Netscape products implement and support the existing open standards appropriate to their functionality.
Microsoft (see http://premium.microsoft.com/msdn/library/sdkdoc/inetcsdk_2htc.htm)
• Microsoft is fully committed to the HTML standards articulated by the World Wide Web Consortium (W3C) and the international Internet community.
Caveat Emptor!
Beware of free software - it can be expensive!
Remember Your Music Collection?
• 7" single: Your favourite single
• 12" LP: The album containing the hit
• 12" LP: Greatest hits
• CD: When you bought your CD
Record companies are happy to sell you the same information in several formats!
Is The Same True Of Your Information Systems?
• Home-grown
• Gopher: The hit of 1992
• WWW: The HTML 2 version
• WWW (2): Revamped, based on Netscapeisms
• WWW (3): Revamped, based on HTML 4 and CSS
• WWW (4): ??
Microsoft and Netscape will be happy to sell you tools to manipulate the same information!
Conclusions
• Without standards, costs are liable to escalate
• Software companies are happy to take our money
• OSI networking standard gave standardisation process a bad name
• Current IETF / W3C process of developing standards and gaining implementation experience is valuable
• Standards are not frozen
• The difficult choice may be "What standard?"
Further Information
List of Standards Bodies
http://www.yahoo.com/Reference/Standards/
http://www.iso.ch/VL/Standards.html
http://www.cmpcmm.com/cc/standards.html
World Wide Web Consortium
http://www.w3.org/
IETF
http://www.ietf.cnri.reston.va.us/home.html
http://info.isoc.org/home.html
ISO
http://www.iso.ch/welcome.html
ECMA
http://www.ecma.ch/
ISO-HTML
ftp://ftp.cs.tcd.ie/isohtml/
Microsoft and Standards
http://www.microsoft.com/standards/
Netscape and Standards
http://search.netscape.com/newsref/std/standards_qa.html
On Julius Caesar, Queen Eanfleda, and
the lessons from time past
1 Dual standards rather than a single standard cause trouble.
2 If you must have dual standards, specify mandatory conversions or interfaces between them.
3 Never leave anything implementation-dependent.
4 If irregularities are unavoidable in a standard (e.g. because of external constraints), put them where they will do the least damage.
5 Never alter standards to please the rich and powerful, unless the changes can be justified on firm technical grounds.
6 Even the most rich and powerful can be persuaded that they will benefit from changing from their local standard to a general one.
7 The most effective standards are those you take so for granted you don't have to think about them.
8 If provisions of standards are based on external assumptions or constraints unrelated to the purpose of the standard, they are likely to appear irrational.
http://www.kcl.ac.uk/kis/support/cc/staff/brian/caesar.html