HTML, XML, PDF - Indian Academy of Sciences


HTML, XML, PDF
Pros and Cons
HTML: Pros
• Simplicity and Open standard
• HTML is easy to learn because it is very simple. There are only a
couple dozen tags, and fewer than half of them are used in most
situations.
• HTML browsers are cheap or free, and become very powerful when
combined with third-party add-ins and server-side content support.
• HTML document browser interfaces are easy to build into existing
products because of the simplicity of HTML.
• It has become very evident to users that the hypertext link really
does work across systems that are otherwise unrelated. Any page
can link to any other publicly accessible page simply by entering
the address.
• There are some specialized structures in HTML, but they are
mostly used to effect a certain formatting look.
HTML: Cons
• HTML is a very weak formatting tool that lacks even the most
fundamental page-oriented formatting capabilities, like hanging
indents, white-space control, justification, kerning, and hyphenation.
This leads to highly variable coding for even simple designs.
• HTML provides linking capabilities, but the linking is rudimentary;
it is only a one-to-one link, and requires an anchor on the target end
in order to access anything within the document.
• Issue of stability and versioning.
• Browser manufacturers have created non-standard extensions to the
"standard" HTML tags, like the "blink" and "center" tags, which lead to
viewing problems.
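The anchor requirement mentioned above shows up in the address itself: the part after "#" names an anchor that must already exist inside the target document. A minimal Python sketch using only the standard library; the address and the anchor name "section2" are made-up examples, not taken from the slides.

```python
from urllib.parse import urldefrag

# Hypothetical address: the page and the "section2" anchor are illustrative only.
link = "http://example.org/report.html#section2"

# Split the address into the document URL and the anchor (fragment) name.
document_url, anchor = urldefrag(link)

print(document_url)  # the single document the link points to
print(anchor)        # the anchor that must be defined inside that document
```

The link carries exactly one target document and one anchor name, which is the one-to-one limitation described above.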
HTML: more cons
• One tag set for all - not extensible
• Limited, predefined data structures
• No formal validation
• “Trades power for ease of use”
• Good for simple applications only
• Handcrafted - links, navigation, indexing
• Concentrates on form, not substance
PDF: Pros
• PDF provides electronic pages with impressive
page fidelity. Type, graphics, and color are all
reproduced as they are on paper.
• Solves file sharing problem between platforms
• Hot links and other electronic object types, like
movies and sounds, can be added to a PDF file.
• New features are being added constantly by Adobe
• Rights management tools and security features
built in
PDF: more pros
• PDF files are cheap to create, and are used by
many companies to deliver page-formatted
information without the high cost of postage.
• Since the end user gets something that looks very
much like paper, training costs are low
PDF: Cons
• Proprietary and not open to outside development
• Large file sizes that take a long time to download
• PDF files are not nearly as flexible as other
electronic formats because the main goal is to
recreate a paper page, and not to provide a way of
delivering intelligent document structure to a user
• There is limited support for searching, although
Adobe has products that can index many different
PDF files for cross-document searching and
navigating
What about SGML?
• Hard to learn
• Costly to implement
• Not web friendly in full form
• Style support poor
• Hard to get a fast start
• Tooling up very expensive
• Good linking (HyTime), hard to implement
XML
• A structured markup meta-language
• A subset of SGML designed for the Web
• Designed to work with companion standards
– for linking
– for styling
• Web friendly: “the next step in Web evolution”
• Overcomes limitations of HTML and CSS
• Enabler for new Web applications
XML and the Web
• Is extensible
• Is quicker, easier and cheaper to implement
• Preserves the structure of data
• Supports complex nesting of structures
• Has strong linking types - URLs and more
• Provides comprehensive style features
– CSS
– XSL
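The points about preserving structure and supporting nesting can be illustrated with a small sketch. It uses Python's standard xml.etree.ElementTree module; the element names (journal, issue, article, title, author) are invented for illustration and do not come from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for illustration.
source = """
<journal>
  <issue number="1">
    <article>
      <title>First paper</title>
      <author>A. Author</author>
    </article>
    <article>
      <title>Second paper</title>
      <author>B. Author</author>
    </article>
  </issue>
</journal>
"""

root = ET.fromstring(source)

# Walk the nested structure: journal -> issue -> article -> title.
titles = [article.findtext("title") for article in root.iter("article")]
issue_number = root.find("issue").get("number")

print(issue_number, titles)
```

Because the tags describe what the data is rather than how it looks, a program can pull out the titles without guessing from the formatting.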
XML and the Web
• Is a standard approved by the W3C
• XLink, XPointer, XSL also standards
• Incorporates best features of DSSSL, CSS,
• … and HyTime & TEI
• Uses Unicode for universality
• Works with Java and JavaScript
XML Application Models
• Does your project intend to use XML for data storage
or for data exchange or both?
• What types or classes of data are to be stored or
shared using XML?
• Do standard XML DTDs or XML Schemas for the
description of this data exist?
• How will you create your XML documents? Will they
be authored with a suitable editor or generated by
software tools?
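For the "generated by software tools" option in the last question, a minimal sketch of what that might look like, again with Python's standard xml.etree.ElementTree. The record and its field names are hypothetical and are not drawn from any standard DTD or Schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; field names are illustrative, not from any standard DTD.
record = {"title": "Sample article", "publisher": "Example Press"}

# Build the element tree from the data rather than typing tags by hand.
article = ET.Element("article")
for name, value in record.items():
    child = ET.SubElement(article, name)
    child.text = value

# Serialise the tree to a text string.
xml_text = ET.tostring(article, encoding="unicode")
print(xml_text)
```

Generating documents this way keeps the tagging consistent, which matters once the data has to match a DTD or Schema agreed with other parties.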
XML Questions to consider
• Are you exchanging data or metadata through specific
agreements with a defined number of partners?
• Are you supplying metadata to specific services? Do
those services specify requirements for the syntax,
structure and semantics of the data they accept? Do
those services specify conformance to XML DTDs or
XML Schemas which they provide?
• Or are you intending to make data or metadata available
in a more "open" environment, with the expectation that
it may be used by a potentially unlimited number of
services?
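When a service specifies what it will accept, records are usually checked before they are sent. The sketch below is only a stand-in for that idea: it tests well-formedness and the presence of a few required elements using Python's standard library. It is not DTD or XML Schema validation, which needs a validating parser, and the rule set (title and author required) is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set standing in for a partner's DTD: each record
# must contain these child elements. This is NOT real DTD validation.
REQUIRED = ("title", "author")

def check_record(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    return [f"missing <{name}>" for name in REQUIRED if root.find(name) is None]

good = "<record><title>T</title><author>A</author></record>"
bad = "<record><title>T</title></record>"
broken = "<record><title>T</record>"

print(check_record(good))
print(check_record(bad))
print(check_record(broken))
```

With specific partners the rules can be as strict as the agreement requires; in a more open environment, conforming to a published DTD or Schema plays the same role.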
XML
• With whom do you need to share your data or
metadata?
• Is it appropriate to make that metadata available
through OAI?
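Assuming "OAI" here means the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a harvesting service asks a repository for metadata with a plain HTTP request. A minimal sketch of building such a request in Python; the repository address is hypothetical, while the ListRecords verb and the oai_dc metadata prefix are defined by the protocol.

```python
from urllib.parse import urlencode

# Hypothetical repository address; a real OAI repository publishes its own base URL.
BASE_URL = "http://example.org/oai"

def list_records_url(metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{BASE_URL}?{query}"

print(list_records_url())
```

The repository answers with an XML document containing the metadata records, so the same XML tooling applies on both sides of the exchange.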
XML example
• PubMed DTD:
http://www.ncbi.nlm.nih.gov:80/entrez/query/static/publisher.htm