The Ultimate Web

The Ultimate Web:
Preparing for the Next Generation
of the WWW
Jim Carpenter
Bureau of Labor Statistics
November 1999
Draft: 7/18/2015 12:23:08
2
Objectives
• Describe the Ultimate Vision of the Web
• Describe its key technologies and standards
• Suggest how data managers can prepare
3
Personal Objectives
• Promote more data modeling
– Resistant Culture: process oriented
– The other category of physical-independent-thing: object
– Basic duality of knowledge representation:
• an individual physical thing is both object and process
• the difference is scope of time in the context of the thing
(see Knowledge Representation by John Sowa, p. 71 ff.)
• Promote ISO Metadata Standards
– A key part of the semantic revolution
4
Outline
• Some basic ideas about UML and Modeling
• Understanding information resource webs
• Semantic web
• Conclusion: focus on your own metadata
5
UML in a Nutshell
• UML contains a fixed set of elements (approx. 100)
• Each element has
– Concept
– Term
– Syntax role
– Graphic representation
• Elements are grouped into 9 types of diagrams
• Diagrams can describe anything - a metalanguage
6
Some Basic Ideas
• A model is a set of concepts with notation and
syntax
• Types of models
– human language
– scientific language
– architectural design
– computer system
– programming language
– protocol, format
– template
– music, dance, DNA, atom, ...
• There is a map between any two models
– not necessarily 1-1 or onto (remember set theory?)
• Work is the process of mapping models
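The remark that a map between two models need not be 1-1 or onto can be made concrete with a small sketch. The two vocabularies below (an owner's terms and an architect's terms) and the mapping between them are invented purely for illustration; they do not come from the presentation.

```python
# Hypothetical vocabularies for two models; the terms are illustrative only.
owner_terms = {"bedroom", "den", "kitchen", "porch"}
architect_terms = {"sleeping area", "living area", "food prep area", "mechanical room"}

# A map from the owner's model to the architect's model.
mapping = {
    "bedroom": "sleeping area",
    "den": "living area",
    "kitchen": "food prep area",
    "porch": "living area",  # two owner terms land on one architect term
}


def is_one_to_one(m):
    """True if no two source terms map to the same target term."""
    return len(set(m.values())) == len(m)


def is_onto(m, targets):
    """True if every target term is the image of some source term."""
    return set(m.values()) == set(targets)


print(is_one_to_one(mapping))             # False: "den" and "porch" collide
print(is_onto(mapping, architect_terms))  # False: "mechanical room" is never reached
```

In this sketch, translating the owner's model loses a distinction (den vs. porch) and leaves part of the architect's model untouched, which is the kind of gap the "work" of mapping has to manage.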
Work Examples: translating models
• Lecturing
Speaker → Listener
• Writing a book
Writer → Word Processor
• Building a house
Owner → Architect → Contractor → Craftsman → House
• Building a system
Owner → Analyst → Designer → Programmer → System
A house and a system as models!
Remember, Zachman describes the system as the “ultimate model”
7
8
Understanding Webs
Example: The Web of Pages
9
• Type of resource on WWW Server: WWW Page
– the Web as a big (virtual) book
• Uniform Resource Locator (URL)
= machine name + page name
[page is the only addressable item]
• Resource Language: HTML
– for displaying pages
– for linking pages in a web topology
• Resource Language Interpreter: WWW browser
• Protocol: HTTP
– Simple communication language
See Weaving the Web by Tim Berners-Lee
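As a rough illustration of "locator = machine name + page name", the standard library can split a page address into those parts. The address below is a placeholder chosen for the example, not one taken from the presentation.

```python
from urllib.parse import urlsplit

# Hypothetical address; the host and page names are placeholders.
url = "http://www.example.org/reports/index.html"
parts = urlsplit(url)

print(parts.scheme)  # protocol used to fetch the page: http
print(parts.netloc)  # machine name: www.example.org
print(parts.path)    # page name: /reports/index.html
```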
10
Definitions - a starting point
• A web is an information resource network which
– exchanges a single type of resource described in a
single language between computer processes, and
– provides for arbitrarily linking these resources.
• An information resource network is a set of
distributed computer processes which transfer
information resources over packet sessions.
11
Some True Statements
• A computer contains at least one process
• An information resource is exchanged between
processes
• An information resource is carried in packets
• Packets travel in a session
• A session is established between processes
•
•
•
Translates to UML...
12
Sentence: Computer contains process

Word        Part of speech   UML element
Computer    Noun             Class
contains    Verb             Association
Process     Noun             Class
Note: “contains” is one of many types of this “whole-part” association
represented by the symbol at right.
See Haim Kilov’s book for a hierarchical breakdown of subtypes.
13
Sentence:
Computer contains process
Computer has role of container of process.
Process has role of content of computer.
Association end   Role        Multiplicity constraint
Computer          container   0..*
Process           content     1..*
A computer must contain one or more processes.
A process can be the content of zero or more computers.
14
A Process ID uniquely identifies a process in a computer.
A process has control logic which governs the execution of
sentences which operate on local resources (resources within its
address space).
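The association, its two roles, the multiplicity constraints, and the process ID rule can be sketched in code. This is only one possible reading of the diagram; the class layout and the helper names (`add`, `is_valid`, `containers_of`) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int  # process ID, unique within one computer


@dataclass
class Computer:
    name: str
    processes: list = field(default_factory=list)  # role: content

    def add(self, process):
        # A process ID uniquely identifies a process in a computer.
        if any(p.pid == process.pid for p in self.processes):
            raise ValueError(f"duplicate process ID {process.pid} on {self.name}")
        self.processes.append(process)

    def is_valid(self):
        # Multiplicity 1..*: a computer must contain one or more processes.
        return len(self.processes) >= 1


def containers_of(process, computers):
    # Multiplicity 0..*: a process can be the content of zero or more computers.
    return [c for c in computers if any(p is process for p in c.processes)]


host = Computer("host-a")
print(host.is_valid())            # False until at least one process is added
init = Process(pid=1)
host.add(init)
print(host.is_valid())            # True
print(len(containers_of(init, [host])))  # 1
```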
15
Continuing to translate from English to UML in this fashion yields a general data model (a graphic representation of a web of sentences) for Information Resource Networks and Webs ...
[Diagram: Information Resource Network]
16
[Diagram: Generalized Web of Information Resources]
17
[Diagram: Generalized Web - distinguishing features]
18
19
Information Resource Networks
• Resource: Internet Name Information in DNS
– UR Locator: [email protected]
– Resource Language: Schema of DNS protocol
• Resource linkage topology: none
– Interpreter: DNS client interprets DNS protocol
– Protocol: DNS
• Resource: Email Message over SMTP
– UR Locator: none (message ID is not a Locator)
– Resource Language: RFC 822 & MIME for describing email parts
• linkage topology: none (no locator)
– Interpreter: Mail client interprets RFC 822 & MIME
– Protocol: SMTP & POP; RFC 822 & MIME
20
Information Resource Networks
• Resource: File on FTP Server
– UR Locator: ftp://machine/directory path
– Resource Language: Schema of FTP Protocol
• Linkage topology: directory tree structure
– Interpreter: FTP client (part of WWW browser)
– Protocol: FTP
• Resource: Entry in LDAP Directory
– UR Locator: ldap://host/object
– Resource Language: Schema of LDAP Protocol
• Linkage topology: tree structure (can use aliases for web structure)
– Interpreter: LDAP client (new part of WWW browsers)
– Protocol: LDAP
21
Information Resource Networks (summary)

Resource           Usual Context    Storage Type   Resource Link Topology
Page               WWW              file           web
File               FTP              file           tree
Name Information   DNS              database       none
Email message      SMTP             database       none
Directory Entry    LDAP             database       tree (web?)
Component          DCOM or CORBA    either         web
Database Entry     SQL & RPC        database       none
Every resource could be converted to web
to achieve a new dimension of power...
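The summary can also be held as a small data structure and queried, for instance to list the resources that do not yet have a web topology. The record type and the function name `not_yet_web` are assumptions for this sketch; the values are the ones listed in the summary.

```python
from typing import NamedTuple


class Resource(NamedTuple):
    name: str
    context: str
    storage: str
    topology: str


RESOURCES = [
    Resource("Page", "WWW", "file", "web"),
    Resource("File", "FTP", "file", "tree"),
    Resource("Name Information", "DNS", "database", "none"),
    Resource("Email message", "SMTP", "database", "none"),
    Resource("Directory Entry", "LDAP", "database", "tree (web?)"),
    Resource("Component", "DCOM or CORBA", "either", "web"),
    Resource("Database Entry", "SQL & RPC", "database", "none"),
]


def not_yet_web(resources):
    """Names of resources whose link topology is not (clearly) a web."""
    return [r.name for r in resources if r.topology != "web"]


print(not_yet_web(RESOURCES))
```

The database entry is the case the following slides take up: giving it a web topology is what the "web of semantics" proposes.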
22
Web of Semantics
web of database entries
Database Entry:
•Schema
•Data Item
Note: corresponds to the 2 parts of SQL:
DDL = Data Definition Language
DML = Data Manipulation Language
Web of Semantics - a web of database entries
23
• Types of Resource in web: model & data
schema = model = metadata = data about data
• Uniform Resource Locator (URL)
= machine name + page name + tag name
• Resource Language: XML & DTD
– (extends & coexists with HTML)
– link topology: web
• Interpreter: XML application on WWW browser
• Protocol: HTTP
24
Extend WWW Infrastructure
• Models & Data Items on web page
• XML (eXtensible Markup Language)
– describes data items in the terms of the model
– locates each data item
• Absolute address
• Relative address in sequence of data items
– provides parsing instructions
• DTD (Document Type Definition)
– describes the model of the data
– locates each model element
– provides processing instructions for XML statements
Selling Your Car on Your Web
First Attempt: HTML
<HTML>
<BODY>
<P>Car For Sale.</P>
<P>Chevy</P>
<P>1965</P>
<P>Impala</P>
<P>$5000</P>
<P>email me at [email protected]</P>
</BODY>
</HTML>
How will search engines classify this page?
How will people find it?
25
Selling Your Car on Your Web
Use “well-known” XML tags ???
<Car>
<Make> Chevrolet </Make>
<Year> 1965 </Year>
<Model> Impala </Model>
<Price> 5000 </Price>
<Email> [email protected] </Email>
</Car>
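To suggest why the tagged version is easier for a search engine or any other program to classify, here is a minimal sketch that reads the `<Car>` record with Python's standard XML parser. The email value is a hypothetical placeholder (`seller@example.com`), since the address on the slide is not legible in this transcript.

```python
import xml.etree.ElementTree as ET

# Same structure as the slide; the Email value is a placeholder.
CAR_XML = """
<Car>
  <Make> Chevrolet </Make>
  <Year> 1965 </Year>
  <Model> Impala </Model>
  <Price> 5000 </Price>
  <Email> seller@example.com </Email>
</Car>
"""

car = ET.fromstring(CAR_XML.strip())

# Each tag names the meaning of its content, so fields can be picked out by name.
record = {child.tag: child.text.strip() for child in car}
price = int(record["Price"])

print(record["Make"], record["Model"], record["Year"])  # Chevrolet Impala 1965
print(price)                                            # 5000
```

With the HTML version, the same program would see only a sequence of `<P>` paragraphs and would have to guess which one is the make, the year, or the price.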
26
27
Posting a Model
DTD Sentence
<!ELEMENT Car (Make, Year, Model, Price, Email) >
Note: there are other key parameters of <!ELEMENT> which
control how the XML tag data are processed.
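The effect of this DTD sentence can be approximated in a few lines: a `<Car>` element conforms only if it contains exactly the listed children, in the listed order. This is a deliberately simplified sketch, not a DTD validator; it handles only a plain comma-separated content model, and the function names `parse_sequence` and `conforms` are assumptions for the example.

```python
import re
import xml.etree.ElementTree as ET

DTD_SENTENCE = "<!ELEMENT Car (Make, Year, Model, Price, Email) >"


def parse_sequence(decl):
    """Return (element name, ordered child names) for a simple sequence model."""
    m = re.fullmatch(r"<!ELEMENT\s+(\w+)\s*\(([^()]*)\)\s*>", decl.strip())
    if m is None:
        raise ValueError("only simple comma-separated content models are handled")
    children = [c.strip() for c in m.group(2).split(",")]
    return m.group(1), children


def conforms(xml_text, decl):
    """True if the root element matches the declared name and child sequence."""
    name, expected = parse_sequence(decl)
    root = ET.fromstring(xml_text)
    return root.tag == name and [child.tag for child in root] == expected


complete = ("<Car><Make>Chevrolet</Make><Year>1965</Year><Model>Impala</Model>"
            "<Price>5000</Price><Email>placeholder</Email></Car>")
missing_price = ("<Car><Make>Chevrolet</Make><Year>1965</Year>"
                 "<Model>Impala</Model><Email>placeholder</Email></Car>")

print(conforms(complete, DTD_SENTENCE))       # True
print(conforms(missing_price, DTD_SENTENCE))  # False
```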
DTD Language
28
Document Type Definition
• <!ELEMENT ElementName (content-model) >
– defines an element and what it may contain
• <!ATTLIST ElementName AttribName AttribValueInfo >
– Defines one or more attributes for the given element, including
the kind of content that the attribute value may take and default
specifications to use in the event no value is provided by the
author
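• <!ENTITY % EntName parmlist >
– Defines macro-like entities: parameter entities (declared with %) for use in
the DTD, and general entities (declared without %) for use in XML content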
• <!NOTATION NotationName SYSTEM ProgramURL >
– Defines a notation and the location of an external program which
can handle content of this media type
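A small sketch can show a DTD declaration at work. It uses a general entity (declared without `%`) rather than the parameter entity form listed above, because general entities are the ones expanded inside XML content. The entity name `chevy` and the shortened `Car` model are assumptions for the example. Python's standard parser does not validate against the `<!ELEMENT>` declarations here; it only expands the internal entity.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset with element declarations and one general entity.
DOC = """<!DOCTYPE Car [
  <!ELEMENT Car (Make, Year)>
  <!ELEMENT Make (#PCDATA)>
  <!ELEMENT Year (#PCDATA)>
  <!ENTITY chevy "Chevrolet">
]>
<Car><Make>&chevy;</Make><Year>1965</Year></Car>"""

car_doc = ET.fromstring(DOC)

# The parser replaces &chevy; with the text given in the entity declaration.
make = car_doc.find("Make").text
print(make)
```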
29
Essence of DTD
A Modeling Language
A language for writing
your own XML markup language
30
DTD Elements Can Be
Mapped to UML Elements
• Use XMI:
• a language that can model them both
• allows mapping common elements
• The model is called the MOF Model
• The facility is called MOF (Meta Object Facility)
So, you can make your models in UML
… and translate them into DTD
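The idea of translating a model into DTD can be illustrated with a toy generator. This is not XMI or MOF, and it does not follow the XMI production rules; it is a hand-written sketch that turns a class name and its attribute names into `<!ELEMENT>` declarations. The `UmlClass` type and `to_dtd` function are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class UmlClass:
    name: str
    attributes: list  # attribute names, in the order they should appear


def to_dtd(cls):
    """Emit one sequence declaration for the class and one text element per attribute."""
    lines = [f"<!ELEMENT {cls.name} ({', '.join(cls.attributes)}) >"]
    lines += [f"<!ELEMENT {attr} (#PCDATA) >" for attr in cls.attributes]
    return "\n".join(lines)


car_class = UmlClass("Car", ["Make", "Year", "Model", "Price", "Email"])
print(to_dtd(car_class))
```

The first line of the output matches the DTD sentence shown earlier for the car example.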
Managing the DTDs
through Meta-Model Consortiums
• Trust
• Repositories
• Registries
• Software Assets
• Data Warehouses
• Development Tools
• Electronic Commerce
• Data Elements
– ISO Metadata Standards
31
32
Progress & Participation
• XMI Concept to Standard to Implementation
– 7/98 Initial submissions
– 11/98 XMI Interoperability Demo
– 1/99 OMG technology adoption begins
– 3/9 Initial implementations arrive
– 3/23 Collaboration begun between OMG & MDC
• ISO Metadata Standards
– Principles for managing shareable data
– ANSI’s X3.285 has UML model which was
successfully translated by XMI MOF into DTD
33
Suggestions
• Do Conceptual Modeling of your business terms
– The natural language model before the database
model
• Implement X3.285 (based on ISO 11179 principles)
– Managing your metadata
In another presentation, I’ll discuss the critical incompleteness
of XML/DTD and how ISO 11179 can address it.
34
References, organizations
• OMG = Object Management Group
– http://www.omg.org/
• MDC = MetaData Coalition
– http://www.mdcinfo.com/
• ISO/IEC JTC1 SC32 WG2 (Metadata Committee)
with ANSI NCITS/L8 as the US tech. Advisory
– http://pueblo.lbl.gov/~olken/X3L8/
• THE ISO/IEC 11179 METADATA REGISTRY
IMPLEMENTATION COALITION
– http://www.sdct.itl.nist.gov/~ftp/l8/other/coalition/Coalition.htm
35
References, books
• Just XML by John E. Simpson, Prentice Hall, 1999
– Entertaining but verbose.
• SQL : The Complete Reference by James R. Groff
& Paul N. Weinberg, McGraw-Hill, 1999
– Complete coverage of MS SQL Server 7, Oracle8,
IBM DB2, Informix, Sybase with all of their software
on CD
• Internetworking with TCP/IP Vol. I: Principles,
Protocols and Architecture by Douglas Comer, 3rd
ed., Prentice Hall, 1995
– clear, readable, good coverage
36
References, books
• Knowledge Representation: Logical, Philosophical,
and Computational Foundations, by John Sowa,
Brooks/Cole Pub. Co., 1999
– Well structured presentation of the key components of
knowledge representation, from Aristotle through
contemporary thinkers in artificial intelligence.