The Semantic Web: A network of understanding

Jim Hendler
Univ of MD/RPI
http://www.cs.umd.edu/~hendler
Outline
• The Semantic Web
• Past
• Present
• Future
May, 2001
March, 2000
May, 1994
ASWC 2006
Semantic Web hypothesis:
Heterogeneous Web-based Information Resources
can be connected by Web-based knowledge models
[Figure: a PubMed entry (PVT) and a Semantic Web knowledge model, with the linking labels Burkitt’s Lymphoma, 8q24, and PVT1]
PubMed:
Rearrangement of a DNA sequence homologous to a <cell-type>cell-virus junction fragment</cell-type> in several <disease>Moloney murine leukemia</disease> virus-induced <organism>rat</organism> thymomas
Semantic Web:
Oncogene(MYC):
Found_In_Organism(Human).
Gene_Has_Function(Transcriptional_Regulation).
Gene_Has_Function(Gene_Transcription).
In_Chromosomal_Location(8q24).
Gene_Associated_With_Disease(Burkitts_Lymphoma).
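The hypothesis on this slide can be sketched in a few lines of Python. This is only an illustration, not the machinery behind the talk: the MYC statements are copied from the slide, while the triple layout, the `RESOURCE_TERMS` index, the `pubmed:PVT` identifier and the `linked_facts` helper are assumptions made for the example.

```python
# Knowledge model: the MYC facts from the slide, stored as
# (subject, predicate, object) statements.
KNOWLEDGE_MODEL = [
    ("MYC", "Found_In_Organism", "Human"),
    ("MYC", "Gene_Has_Function", "Transcriptional_Regulation"),
    ("MYC", "Gene_Has_Function", "Gene_Transcription"),
    ("MYC", "In_Chromosomal_Location", "8q24"),
    ("MYC", "Gene_Associated_With_Disease", "Burkitts_Lymphoma"),
]

# Hypothetical index of terms attached to Web resources (assumed for
# this sketch; the slide only shows these terms as labels in a figure).
RESOURCE_TERMS = {
    "pubmed:PVT": {"Burkitts_Lymphoma", "8q24"},
}


def linked_facts(resource, model=KNOWLEDGE_MODEL, index=RESOURCE_TERMS):
    """Return the statements whose object is one of the resource's terms."""
    terms = index.get(resource, set())
    return [fact for fact in model if fact[2] in terms]


links = linked_facts("pubmed:PVT")
for subject, predicate, obj in links:
    print(f"pubmed:PVT --{obj}--> {predicate}({subject})")
```

The point of the sketch is that the document and the model never reference each other directly; a shared term is enough to connect them.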
Web ontologies
• Web Ontologies are models allowing the linking of
• multimedia
• databases
• services
• Web services
• Grid computing
• meta-data repos
• Other ontologies
• Or any other Web resource!
• Anything with a URI
The "layercake"
T. Berners-Lee, 2001
2001
Funded Research
WG activity
Recommendation
• Research, experimentation, early demonstrations
• Reminiscent of the early days of the Web
Semantic Web Today
• The Semantic Web of 2002 resembles the early days of the World Wide Web
• Development funded primarily by Govt, but emerging corporate interest
• A lot of excitement, but confusion as to business case
• Open source tools and “geeks in control”
• Standards starting to stabilize to point where they permit deployment
• Developer tools, libraries, languages
2003
Funded Research
WG activity
Recommendation
• Early government adoption
• Emerging corporate interest
“Our” Semantic Web
(Sanken, 03; www.mindswap.org)
• Jan 1, 03: Crawler finds 5.8M+ DAML statements on 20,000+ web pages
• Doesn’t include many instance KBs tied to ontologies
• Doesn’t include many very large RDFS-based KBs that include some OWL
• Ontology library at http://www.daml.org has 195 ontologies (March 2003)
• Open for anyone to create
• Open for anyone to use
• Content providers: Daimler-Chrysler, Nokia, Motorola, EDS, Agfa
• OWL is being supported by large corporation labs
• Web tool developers: IBM, HP, Sun, Intel, Fujitsu
• OWL is starting to be used by thesaurus developers
• C.f. National Cancer Institute metathesaurus released in OWL Lite
• United Nations Standard Product Codes available in DAML
• NASA thesaurus available in DAML
• Use of semantic markup for Web Services beginning to move beyond basic research
• DAML-S cited as required reading for Web Services Composition WG
2005
Funded Research
WG activity
Recommendation
• Commercial tools
• Lots of open source software
• Scalability
[Figure: the 2003 “Our” Semantic Web slide overlaid with callouts (SemTechConf, 3/05); legible callout fragments include "becoming available for use" and "becoming cheap and easy"]
Web Modeling Languages - 2005
• Resource Description Framework (RDF)
• Few, but important, constraints
• A basic, extensible assertional language
• RDF Schema (RDFS)
• Weak structuring of sets of terms (taxonomy-esque)
• Class and property hierarchies
• Domain and Range constraints
• The Web Ontology Language, OWL
• Stronger structuring of sets of terms (ontologies)
• Everything in RDFS plus
• Complex Class constructors (unionOf, intersectionOf)
• Additional property features (inverse, transitive)
• Class local property type and cardinality constraints
• And more
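To make the difference between the three layers concrete, here is a minimal sketch of the kind of conclusions each one licenses. It is a toy forward-chaining loop over plain tuples, not a real reasoner. The vocabulary (Cat, Mammal, hasOwner, owns, partOf, felix, jim, claw, paw) is invented for the example, and only a handful of rules are covered: class hierarchy and domain/range from RDFS, inverse and transitive properties from OWL. Class constructors and cardinality constraints are left out.

```python
# Hypothetical mini-vocabulary and data, invented for this sketch.
TRIPLES = {
    # RDFS: class hierarchy plus domain and range constraints
    ("Cat", "rdfs:subClassOf", "Mammal"),
    ("Mammal", "rdfs:subClassOf", "Animal"),
    ("hasOwner", "rdfs:domain", "Pet"),
    ("hasOwner", "rdfs:range", "Person"),
    # OWL: additional property features
    ("hasOwner", "owl:inverseOf", "owns"),
    ("partOf", "rdf:type", "owl:TransitiveProperty"),
    # Plain RDF assertions
    ("felix", "rdf:type", "Cat"),
    ("felix", "hasOwner", "jim"),
    ("claw", "partOf", "paw"),
    ("paw", "partOf", "felix"),
}


def infer(triples):
    """Apply a few RDFS/OWL rules until no new statement appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "rdfs:subClassOf":
                # subClassOf is transitive
                new |= {(s, p, o2) for s2, p2, o2 in facts
                        if p2 == p and s2 == o}
                # instances of a subclass are instances of the superclass
                new |= {(x, "rdf:type", o) for x, p2, c in facts
                        if p2 == "rdf:type" and c == s}
            elif p == "rdfs:domain":
                # the subject of property s belongs to class o
                new |= {(x, "rdf:type", o) for x, p2, _ in facts if p2 == s}
            elif p == "rdfs:range":
                # the object of property s belongs to class o
                new |= {(y, "rdf:type", o) for _, p2, y in facts if p2 == s}
            elif p == "owl:inverseOf":
                new |= {(y, o, x) for x, p2, y in facts if p2 == s}
                new |= {(y, s, x) for x, p2, y in facts if p2 == o}
            elif p == "rdf:type" and o == "owl:TransitiveProperty":
                new |= {(x, s, z)
                        for x, p2, y in facts if p2 == s
                        for y2, p3, z in facts if p3 == s and y2 == y}
        if new <= facts:
            return facts
        facts |= new


closure = infer(TRIPLES)
for statement in sorted(closure - TRIPLES):
    print(statement)
```

With RDF alone nothing new follows from the assertions; the RDFS rules add class memberships, and the OWL rules add statements about `owns` and the indirect `partOf` relation.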
2006: You Are Here!
Significant Corporate Activity
• Semantic (Web) technology companies starting & growing
• Siderean, SandPiper, SiberLogic, Ontology Works, Intellidimension,
Intellisophic, TopQuadrant, Data Grid, …
• Bigger players buying in
• Adobe, Cisco, HP, IBM, Nokia, Oracle, Sun, Vodafone… announcements/use in 2005-2006
• Gartner identifies Corporate Semantic Web as one of three "High impact"
Web technologies
• tools being announced: AllegroGraph, Altova, TopBraid, …
• Government projects in and across agencies
• US, UK, EU, Japan, Korea, …
• Life sciences/pharma an increasingly important market
• Health Care and Life Sciences Interest Group at W3C
• Many open source tools available
• Kowari, RDFLib, Jena, Sesame, Protégé, SWOOP, Onto(xxx), Wilbur, …
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
    <pdf:Producer>Acrobat Distiller 7.0.5 for Macintosh</pdf:Producer>
  </rdf:Description>
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:format>application/pdf</dc:format>
    <dc:creator>
      <rdf:Seq>
        <rdf:li>James Hendler</rdf:li>
      </rdf:Seq>
    </dc:creator>
    <dc:title>
      <rdf:Alt>
        <rdf:li xml:lang="x-default">XMLideas.ppt</rdf:li>
      </rdf:Alt>
    </dc:title>
  </rdf:Description>
  <rdf:Description rdf:about=""
      xmlns:xapMM="http://ns.adobe.com/xap/1.0/mm/">
    <xapMM:DocumentID>uuid:93277c40-5534-11da-a3f2-000a95d6b344</xapMM:DocumentID>
    <xapMM:InstanceID>uuid:9327882b-5534-11da-a3f2-000a95d6b344</xapMM:InstanceID>
  </rdf:Description>
</rdf:RDF>
Richer metadata
Embedded meta-data
Data harvesting & visualization
Enterprise data integration
"Corporate Semantic Web", Gartner "hot pick" for 2006
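As a rough idea of what harvesting embedded metadata involves, the sketch below reads an abridged copy of the RDF packet shown on this slide with Python's standard `xml.etree.ElementTree` module. The `harvest` function and its output layout are assumptions made for illustration; a production harvester would first have to locate the packet inside the PDF and would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the embedded packet from the slide.
XMP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
    <pdf:Producer>Acrobat Distiller 7.0.5 for Macintosh</pdf:Producer>
  </rdf:Description>
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:format>application/pdf</dc:format>
    <dc:creator><rdf:Seq><rdf:li>James Hendler</rdf:li></rdf:Seq></dc:creator>
    <dc:title><rdf:Alt><rdf:li xml:lang="x-default">XMLideas.ppt</rdf:li></rdf:Alt></dc:title>
  </rdf:Description>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}
PREFIX_OF = {uri: prefix for prefix, uri in NS.items()}
RDF_LI = "{%s}li" % NS["rdf"]


def harvest(xmp_text):
    """Collect every property of every rdf:Description as prefix:name -> values."""
    record = {}
    root = ET.fromstring(xmp_text)
    for description in root.findall("rdf:Description", NS):
        for prop in description:
            # ElementTree reports tags as "{namespace-uri}localname".
            uri, local = prop.tag[1:].split("}")
            key = f"{PREFIX_OF.get(uri, uri)}:{local}"
            # Container values (rdf:Seq, rdf:Alt) keep their items in rdf:li.
            items = [li.text for li in prop.iter(RDF_LI)]
            record[key] = items if items else [prop.text]
    return record


metadata = harvest(XMP)
for key, values in metadata.items():
    print(key, "=", values)
```

Because the properties are identified by namespace URI rather than by position in the file, the same few lines work for any tool that embeds Dublin Core in this way.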
Digital asset management
Ontology editors (and other tools)
Semantic Web portals
Semantic Web and
social networking
Significant Corporate Activity
50+ Semantic Web press releases each month
Significant Government Activity
• Agencies moving beyond the "talk" phase
• primarily prototyping, but first acquisitions starting
• Example:
• NASA is developing an enterprise data strategy around using existing data via Semantic Web integration
Lots of activities across NASA
• Science, Engineering, and Mission all have SWT production or development efforts in place
• The focus now is on re-using the data systems we already have in place
• Agency-wide integration planning is underway for building a federation of models into an integrated information service across all disciplines
(A. Schain, 3/06)
There's a Lot Out There!
Paid ads
2,120,000 hits on "RDF filetype:rdf"
13,600 hits on "ontology filetype:owl"
(March, 2006)
Where we are today
• Survey of 1300 OWL ontologies found by crawl (Wang 06)
• 19 ontologies with 2000+ classes
• 6 ontologies with 10000+ classes
• 2 ontologies with 50000+ classes
• CYC, NCI
Species     Count
RDFS        587
OWL Lite    199
OWL DL      149
OWL Full    337
Error       3
Swoogle
http://swoogle.umbc.edu
Some "Swoogle" observations
The OWL namespace has been declared by 113,000
SWDs (8%) and actually used by 108,000 (7%). The
RDFS namespace enjoys more use, being declared by
677,000 (47%) and used by 538,000 (37%) SWDs.
owl:Class is the most used term from the OWL namespace
with ~1,800,000 instantiations in 68,000 SWDs.
We also noticed significant use of two OWL equality
assertions: owl:sameAs (280,000 assertions in 17,00
SWDs) and owl:equivalentClass (70,000 assertions in
4,300 SWDs). Their common use may be an indication of
increased ontology alignment.
(Ebiquity blog, Sept 1, 2006)
The cake is evolving as well..
(Tim Berners-Lee)
2001
(Tim Berners-Lee)
2006
New languages underway
• SPARQL
• Query language for (distributed) RDF triple stores
• The SQL of the Semantic Web
• GRDDL/RDFa
• Integration of HTML world and Semantic Web
• Means for "embedding" RDF-based annotation on traditional Web pages
• Means for generating RDF triple stores from (annotated) Web pages
• RIF
• Rule Interchange Format
• Representing rules on the Web
• Linking rule-based systems together
• And more
• Multimedia annotation, Web-page Metadata annotation, Health Care and
Life Science (LSID), Privacy
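To give a feel for what "the SQL of the Semantic Web" means, the sketch below evaluates a basic graph pattern, the core of a SPARQL `WHERE` clause, over a small in-memory triple list. It is not SPARQL and not how any particular triple store works: `match` and `query` are hypothetical helpers written for this example, and the data reuses the MYC statements from the earlier slide.

```python
# A tiny triple store, reusing the MYC statements from the earlier slide.
STORE = [
    ("MYC", "In_Chromosomal_Location", "8q24"),
    ("MYC", "Gene_Associated_With_Disease", "Burkitts_Lymphoma"),
    ("MYC", "Found_In_Organism", "Human"),
]


def match(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def query(patterns, store):
    """Return every binding that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in store:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly the question "which gene at 8q24 is associated with which disease?",
# which in SPARQL would be two triple patterns sharing the variable ?gene.
solutions = query(
    [
        ("?gene", "In_Chromosomal_Location", "8q24"),
        ("?gene", "Gene_Associated_With_Disease", "?disease"),
    ],
    STORE,
)
print(solutions)
```

The shared variable plays the role of a join in SQL, except that the "tables" are just sets of statements that may come from different sources.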
Next Steps
The Great Wall
Built in pieces at different times
Linked together for greater effect
The World Wide Web
Built in pieces at different times
Linking of "Web Islands"
Linked together for greater effect
Linking is power!
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY feleuk.owl "http://www.mindswap.org/ontologies/feleuk.owl">
  <!ENTITY owl "http://www.w3.org/2002/07/owl#">
  <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY NCI "http://www.ncibi.nih.gov/NCIT/NCIT.owl#">
  <!ENTITY CYC "http://www.cyc.com/2004/06/04/cyc#">
]>
<rdf:RDF xml:base="&feleuk.owl;"
    xmlns:owl="&owl;"
    xmlns:rdf="&rdf;"
    xmlns:rdfs="&rdfs;"
    xmlns:NCI="&NCI;"
    xmlns:CYC="&CYC;">
  <owl:Ontology rdf:about=""
      rdfs:label="Feline Leukemia"
      owl:versionInfo="Feline Leuk 1.0"/>
  <owl:Class rdf:about="#Feline-Leukemia">
    <!-- Link to 45000 terms at NCI -->
    <rdfs:subClassOf rdf:resource="&NCI;Leukemia"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <!-- Link to 47000 (Open)CYC terms -->
        <owl:allValuesFrom rdf:resource="&CYC;cat"/>
        <owl:onProperty rdf:resource="&NCI;diseased-organism"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
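One way to see the links in a small ontology like this is to list which other ontologies its statements point into. The sketch below does that with Python's standard XML parser on an abridged copy of the Feline Leukemia example, with the NCI and CYC references written out as full URIs. The `external_links` helper is an assumption made for illustration, and splitting a URI at `#` is a simplification that only suits hash-style namespaces like the ones used here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Abridged copy of the Feline Leukemia ontology, with full URIs.
OWL_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://www.mindswap.org/ontologies/feleuk.owl">
  <owl:Class rdf:about="#Feline-Leukemia">
    <rdfs:subClassOf rdf:resource="http://www.ncibi.nih.gov/NCIT/NCIT.owl#Leukemia"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:allValuesFrom rdf:resource="http://www.cyc.com/2004/06/04/cyc#cat"/>
        <owl:onProperty rdf:resource="http://www.ncibi.nih.gov/NCIT/NCIT.owl#diseased-organism"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def external_links(rdf_xml):
    """Group rdf:resource targets by the ontology namespace they point into."""
    links = defaultdict(list)
    for element in ET.fromstring(rdf_xml).iter():
        target = element.get(f"{{{RDF}}}resource")
        # Targets starting with "#" are local to this ontology; skip them.
        if target and not target.startswith("#"):
            namespace, _, term = target.partition("#")
            links[namespace].append(term)
    return dict(links)


links = external_links(OWL_DOC)
for namespace, terms in links.items():
    print(namespace, "->", terms)
```

A single class definition of a dozen lines reaches into two large external vocabularies, which is the network effect the slide is pointing at.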
Linking is power
• Today we can find thousands of ontologies
• Available on the Web
• Linked to Web resources
• Linked to data resources
• Linked to each other
• Linked to Web 2.0-like annotations
• And billions of annotated (semi-Knowledge engineered) objects
• Available on the Web
• Linked to Web resources
• Linked to data resources
• Linked to each other
• Linked to the ontologies
We must link these together for great effect!!
A key opportunity
• Vast amounts of "semi-engineered" knowledge
• Flickr: tens of millions of keyword tagged photos
• Wikipedia: thousands of carefully documented subjects (in a hierarchy, with disambiguation, …)
• Etc. etc. etc.
• With "persistent" URIs
• "tank" http://en.wikipedia.org/wiki/Tank (armament)
• "tank" http://en.wikipedia.org/wiki/Tank%2C_Pakistan (small town in Pakistan)
• And anything with a URI can be linked to the Semantic Web!!!!!
For exciting linking possibilities
• Linking of Web 2.0 and Semantic Web
• Using informal KE to bootstrap "formal" KE
• Extending formal KE from Web 2.0
Evolving vision
Documents, linked to
Images, annotated with
Ontologies, linked to
Other ontologies, describing
Databases, exported as
RDF graphs, as input to
Services, which designate
Documents, linked to …
(ad infinitum)
Stay tuned…
2001
2000
1994
Semantic Web Challenges
• Today's Semantic Web Languages
• Are not-very-expressive-KR-language standards
• Not KIF, or even KL-ONE
• Create non-persistent knowledge bases
• Servers come and go
• Ontologies change over time
• And can't be kept consistent
• Disagreement, error, dishonesty…
Semantic Web opportunities
• Today's Semantic Web Languages
• Are not-very-expressive-KR-language standards
• Like HTML is to SGML
• Create non-persistent KBs
• Like the 404 error (w/o which there is no Web)
• And can't be kept consistent
• Like blog-space and Web 2.0
• We need to accept, and more importantly exploit, these features
Note to Grad students
(and their advisors)
• The Semantic Web today, esp at the ontology layer, is like the Web with no one using <a href=…>
• What makes the Web, the Web
• Please, No more one ontology, one domain, one set of
services, one … Theses
• There's a reason we built this stuff on top of RDF and URIs
The network effect is where the power is!
A few of the many things I've left out
• Semantic Web Services
• Crucial for linking "programs" into the mix
• Semantic Web tools and scaling issues
• Engineering approaches being used to scale Semantic Web stores to database sizes
• Information extraction and Semantics
• Can we "retrofit" semantics on the existing Web?
• Semantic Web Information Creation
• Can we make it so we don't have to retrofit the future Web?
• Other information resources
• Personal data, unstructured resources, off-line collection information, digital libraries, …
• There's more that isn't on the Web than is on it!
• New Web use patterns
• Social networks, blogs, wikis, …
• … are all fertile areas for Semantic Web exploration
Conclusion
• The Semantic Web is real
• Tremendous progress in the past five years
• Lots of it is out there
• Growing support in industry and govt use
• Development continues
• Easy to get involved
• Many open source tools
• New languages and techniques reaching critical mass
• The next steps are exciting
• The "network effect" of linking to other Semantic Web resources
• … and to non-Semantic Web resources
• And research opportunities still abound
• Scaling
• Inconsistency
• Access and acquisition
The SEMANTIC WEB