SW101-SKB - School of Computer Science

The Semantic Web and
Ontologies
OntoGrid Semantic Grid Tutorial,
February, 2007, Manchester, UK
Sean Bechhofer,
School of Computer Science, University of Manchester, UK
The Semantic Web Vision
• The Web was made possible through established standards
– TCP/IP for transporting bits down a wire
– HTTP & HTML for transporting and rendering hyperlinked text
• Applications able to exploit this common infrastructure
– Result is the WWW as we know it
• 1st generation web mostly handwritten HTML pages
• 2nd generation (current) web often machine generated/active
– Both intended for direct human processing/interaction
• In the next generation web, resources should be more accessible to automated processes
– To be achieved via semantic markup
– Metadata annotations that describe content/function
• Coincides with vision of a Semantic Web
OntoGrid: Semantic Grid Tutorial 2
History of the Semantic Web
• Web was “invented” by Tim Berners-Lee (amongst others), while working at CERN
• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:
... a goal of the Web was that, if the interaction between person
and hypertext could be so intuitive that the machine-readable
information space gave an accurate representation of the state
of people's thoughts, interactions, and work patterns, then
machine analysis could become a very powerful management
tool, seeing patterns in our work and facilitating our working
together through the typical problems which beset the
management of large organizations.
• A number of researchers have since been working towards realising this vision, which has become known as the Semantic Web
– E.g., article in May 2001 issue of Scientific American…
Scientific American, May 2001:
Chuck D sez: Don’t Believe the Hype!
• Realising the complete “vision” is too hard for now (probably)
• But we can make a start by adding semantic annotation to web resources
Where we are Today: the Syntactic Web
[Figure: a web of Resource nodes connected by href links]
• A place where computers do the presentation (easy) and people do the linking and interpreting (hard).
• Why not get computers to do more of the hard work?
Hard Work using the Syntactic Web…
Find images of Steve Furber
… Carole Goble
… Alan Rector…
Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois
What’s the Problem?
Typical web page markup consists of:
• Rendering information (e.g., font size and colour)
• Hyper-links to related content
Semantic content is accessible to humans but not (easily) to computers…
Information we can see…
WWW2006 Edinburgh, Scotland
The fifteenth international world wide web conference
23rd--26th May
Edinburgh International Conference Centre
Who should attend and who will you meet?
No other event draws the breadth…
Look Who’s Talking
Richard Granger reviews the revamping of the NHS IT programme
Look Who’s Talking
VeriSign's principal scientist, Dr Phillip Hallam-Baker, goes phishing...
Registration opens with special offer tickets
Professor Wendy Hall has announced the opening of registration for the
15th annual World Wide Web Conference 2006…
Information a machine can see…
WWW2002
The eleventh international world wide webcon
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile denmark, france,
germany, ghana, hong kong, india, ireland,
italy, japan, malta, new zealand, the
netherlands, norway, singapore, switzerland,
the united kingdom, the united states,
vietnam, zaire
Register now
On the 7th May Honolulu will provide the
backdrop of the eleventh international world
wide web conference This prestigious event 
Speakers confirmed
Tim berners-lee
Tim is the well known inventor of the Web,…
Solution: XML markup with “meaningful” tags?
<name>WWW2002
The eleventh international world wide webcon</name>
<date>7-11 may 2002</date>
<location>Sheraton waikiki hotel
Honolulu, hawaii, USA</location>
<introduction>Register now
On the 7th May Honolulu will provide the
backdrop of the eleventh international world
wide web conference This prestigious event 
Speakers confirmed</introduction>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the Web,</bio>
</speaker>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the Web,</bio>
</speaker>
<registration>Registered participants coming from
australia, canada, chile denmark, france,
germany, ghana, hong kong, india, ireland,
italy, japan, malta, new zealand, the
netherlands, norway, singapore, switzerland, the
united kingdom, the united states, vietnam,
zaire</registration>
But What About…?
<conf>WWW2002
The eleventh international world wide webcon</conf>
<date>7-11 may 2002</date>
<place>Sheraton waikiki hotel
Honolulu, hawaii, USA</place>
<introduction>Register now
On the 7th May Honolulu will provide the
backdrop of the eleventh international world
wide web conference This prestigious event 
Speakers confirmed</introduction>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the Web,</bio>
</speaker>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the Web,</bio>
</speaker>
<registration>Registered participants coming from
australia, canada, chile denmark, france,
germany, ghana, hong kong, india, ireland,
italy, japan, malta, new zealand, the
netherlands, norway, singapore, switzerland, the
united kingdom, the united states, vietnam,
zaire</registration>
Still the Machine only sees…
<conf>WWW2002
The eleventh international world wide webcon</conf>
<date>7-11 may 2002</date>
<place>Sheraton waikiki hotel
Honolulu, hawaii, USA</place>
<introduction>Register now
On the 7th May Honolulu will provide the
backdrop of the eleventh international world
wide web conference This prestigious event 
Speakers confirmed</introduction>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the
Web,</bio>
</speaker>
<speaker>Tim berners-lee
<bio>Tim is the well known inventor of the
Web,</bio>
</speaker>
<registration>Registered participants coming from
australia, canada, chile denmark, france,
germany, ghana, hong kong, india, ireland,
italy, japan, malta, new zealand, the
netherlands, norway, singapore, switzerland, the
united kingdom, the united states, vietnam,
zaire</registration>
Need to Add “Semantics”
• External agreement on meaning of annotations
– E.g., Dublin Core for annotation of library/bibliographic information
• Agree on the meaning of a set of annotation tags
– Problems with this approach
• Inflexible
• Limited number of things can be expressed
[Callout: Machine Processable, not Machine Understandable]
• Use Ontologies to specify meaning of annotations
– Ontologies provide a vocabulary of terms
– New terms can be formed by combining existing ones
• “Conceptual Lego”
– Meaning (semantics) of such terms is formally specified
– Can also specify relationships between terms in multiple ontologies
Ontology in Computer Science
• An ontology is an engineering artifact:
– It is constituted by a specific vocabulary used to describe a certain reality, plus
– a set of explicit assumptions regarding the intended meaning of the vocabulary.
• Almost always including how concepts should be classified
• Thus, an ontology describes a formal specification of a certain domain:
– Shared understanding of a domain of interest
– Formal and machine manipulable model of a domain of interest
Building a Semantic Web
• Annotation
– Associating metadata with resources
• Integration
– Integrating information sources
• Inference
– Reasoning over the information we have.
– Could be light-weight (taxonomy)
– Could be heavy-weight (logic-style)
• Interoperation and Sharing are key goals
Languages
• Work on Semantic Web has concentrated on the definition of a collection or “stack” of languages.
– These languages are then used to support the representation and use of metadata.
• The languages provide basic machinery that we can use to represent the extra semantic information needed for the Semantic Web
– XML
– RDF
– RDF(S)
– OWL
– …
[Figure: the language stack (XML, RDF, RDF(S), OWL) supporting Annotation, Integration and Inference]
Ontology Languages
• We need languages that allow us to represent this information
– Ontology Languages!
• There are a wide variety of languages for this “Explicit
Specification”
– Graphical
• Semantic Networks, Topic Maps, UML, RDF
– Logical
• Description Logics, First Order Logic, Rules, Conceptual Graphs
First Order Logic examples:
Every gardener likes the sun
∀x. gardener(x) → likes(x, Sun)
You can fool some of the people all of the time
∃x.∀t.(person(x) ∧ time(t)) → can-fool(x,t)
You can fool all of the people some of the time
∀x.∃t.(person(x) ∧ time(t)) → can-fool(x,t)
All purple mushrooms are poisonous
∀x.(mushroom(x) ∧ purple(x)) → poisonous(x)
No purple mushroom is poisonous
¬∃x.(mushroom(x) ∧ purple(x) ∧ poisonous(x))
∀x.(mushroom(x) ∧ purple(x)) → ¬poisonous(x)
There are exactly two purple mushrooms
∃x.∃y. mushroom(x) ∧ purple(x) ∧ mushroom(y) ∧ purple(y) ∧ ¬(x=y) ∧ (∀z.(mushroom(z) ∧ purple(z)) → ((x=z) ∨ (y=z)))
Clinton is not tall
¬tall(Clinton)
Rules example (Prolog):
mother(X,M) :- parent(X,M), female(M).
father(X,F) :- parent(X,F), male(F).
sister(X,S) :- female(S), parent(S,P), parent(X,P), X \== S.
male(james1).
male(charles1).
male(charles2).
male(james2).
male(george1).
female(catherine).
female(elizabeth).
female(sophia).
parent(charles1, james1).
parent(elizabeth, james1).
parent(charles2, charles1).
parent(catherine, charles1).
parent(james2, charles1).
parent(sophia, elizabeth).
parent(george1, sophia).
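To make the rule-based example concrete, here is a minimal Python sketch of the same family facts and the three rules. It is only an illustration: sets of tuples stand in for Prolog facts and set comprehensions stand in for rule evaluation, and the function names are chosen here, not taken from the slides.

```python
# Facts from the slide; parent(X, P) reads "P is a parent of X".
MALE = {"james1", "charles1", "charles2", "james2", "george1"}
FEMALE = {"catherine", "elizabeth", "sophia"}
PARENT = {
    ("charles1", "james1"), ("elizabeth", "james1"),
    ("charles2", "charles1"), ("catherine", "charles1"),
    ("james2", "charles1"), ("sophia", "elizabeth"),
    ("george1", "sophia"),
}


def mothers():
    # mother(X, M) :- parent(X, M), female(M).
    return {(x, m) for (x, m) in PARENT if m in FEMALE}


def fathers():
    # father(X, F) :- parent(X, F), male(F).
    return {(x, f) for (x, f) in PARENT if f in MALE}


def sisters():
    # sister(X, S) :- female(S), parent(S, P), parent(X, P), X \== S.
    return {
        (x, s)
        for (s, p) in PARENT if s in FEMALE
        for (x, p2) in PARENT if p2 == p and x != s
    }
```

Calling `sisters()` returns every (X, S) pair in which S is female and shares a parent with X, which is how a Prolog system would answer the query `sister(X, S)`.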
Object Oriented Models
• Many languages use an “object oriented model” with
• Objects/Instances/Individuals
– Elements of the domain of discourse
– Equivalent to constants in FOL
• Types/Classes/Concepts
– Sets of objects sharing certain characteristics
– Equivalent to unary predicates in FOL
• Relations/Properties/Roles
– Sets of pairs (tuples) of objects
– Equivalent to binary predicates in FOL
• Such languages are/can be:
– Well understood
– Formally specified
– (Relatively) easy to use
– Amenable to machine processing
Why (Formal) Semantics?
• Increased formality makes languages more amenable to
machine processing (e.g. automated reasoning).
• The formal semantics provides an unambiguous interpretation of
the descriptions.
– What does an expression in an ontology language mean?
– The semantics of a language tell us precisely how to interpret a
complex expression.
• Well defined semantics are vital if we are to support machine
interpretability
– They remove ambiguities in the interpretation of the descriptions.
[Figure: nodes “Telephone” and “Black”, with a “?”]
RDF
• RDF stands for Resource Description Framework
• It is a W3C Recommendation
– http://www.w3.org/RDF
• RDF is a graphical formalism ( + XML syntax + semantics)
– for representing metadata
– for describing the semantics of information in a machine-accessible way
• Provides a simple data model based on triples.
The RDF Data Model
• Statements are <subject, predicate, object> triples:
– <Sean,hasColleague,Ian>
• Can be represented as a graph:
Sean --hasColleague--> Ian
• Statements describe properties of resources
• A resource is any object that can be pointed to by a URI:
– The generic set of all names/addresses that are short strings that
refer to resources
– a document, a picture, a paragraph on the Web,
http://www.cs.man.ac.uk/index.html, a book in the library, a real
person (?), isbn://0141184280
• Properties themselves are also resources (URIs)
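The triple model is small enough to sketch directly. The following Python fragment is an illustration, not an RDF library: a graph is a set of (subject, predicate, object) tuples, and `match` is a helper named here for the example. The `ex:` names are hypothetical abbreviations standing in for full URIs.

```python
# A triple is a (subject, predicate, object) tuple; a graph is a set of triples.
# The "ex:" names are hypothetical stand-ins for full URIs.
graph = {
    ("ex:Sean", "ex:hasColleague", "ex:Ian"),
    ("ex:Carole", "ex:hasColleague", "ex:Ian"),
    ("ex:Sean", "ex:hasName", "Sean K. Bechhofer"),  # object is a literal
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Who has Ian as a colleague?
colleagues_of_ian = {s for (s, _, _) in match(graph, p="ex:hasColleague", o="ex:Ian")}
```

Because every statement has the same three-part shape, one generic lookup function serves all properties; no per-property schema is needed.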
Linking Statements
• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
[Figure: directed, labelled graph]
Sean --hasName--> “Sean K. Bechhofer”
Sean --hasColleague--> Ian
Carole --hasColleague--> Ian
Ian --hasHomePage--> http://www.cs.man.ac.uk/~horrocks
• The object of a triple can also be a “literal” (a string)
RDF Syntax
• RDF has an XML syntax that has a specific meaning:
• Every Description element describes a resource
• Every attribute or nested element inside a Description is a
property of that Resource
• We can refer to resources by URIs
<rdf:Description rdf:about="some.uri/person/sean_bechhofer">
  <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
  <o:hasName rdf:datatype="&xsd;string">Sean K. Bechhofer</o:hasName>
</rdf:Description>
<rdf:Description rdf:about="some.uri/person/ian_horrocks">
  <o:hasHomePage>http://www.cs.man.ac.uk/~horrocks</o:hasHomePage>
</rdf:Description>
<rdf:Description rdf:about="some.uri/person/carole_goble">
  <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
</rdf:Description>
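As a rough illustration of how Description elements map back to triples, the sketch below reads a wrapped copy of the fragment with Python's standard `xml.etree.ElementTree`. Several things are assumptions made for the example: the enclosing `rdf:RDF` element, the `http://example.org/ontology#` namespace for the `o:` prefix (the slide gives none), and the omission of the `rdf:datatype` attribute so that no entity declaration is needed. It handles only this flat style of RDF/XML; a real application would use a proper RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The namespace bound to "o:" is a made-up placeholder.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:o="http://example.org/ontology#">
  <rdf:Description rdf:about="some.uri/person/sean_bechhofer">
    <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
    <o:hasName>Sean K. Bechhofer</o:hasName>
  </rdf:Description>
  <rdf:Description rdf:about="some.uri/person/ian_horrocks">
    <o:hasHomePage>http://www.cs.man.ac.uk/~horrocks</o:hasHomePage>
  </rdf:Description>
  <rdf:Description rdf:about="some.uri/person/carole_goble">
    <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(xml_text):
    """Turn each nested property element into a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(DOC)
```

Each Description contributes one triple per nested property, with the `rdf:about` value as the subject.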
What does RDF give us?
• A mechanism for annotating data and resources.
• Single (simple) data model.
• Syntactic consistency between names (URIs).
• Low level integration of data.
RDF(S): RDF Schema
• RDF gives a formalism for metadata annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type (supporting OO-style modelling)
– Interpretation is an arbitrary binary relation
• RDF Schema extends RDF with a schema vocabulary that allows you to define basic vocabulary terms and the relations between those terms
– Class, type, subClassOf,
– Property, subPropertyOf, range, domain
– it gives “extra meaning” to particular RDF predicates and resources
– this “extra meaning”, or semantics, specifies how a term should be interpreted
RDF(S) Inference
[Figure: RDF graph. Person, Academic and Lecturer each have rdf:type rdfs:Class. Asserted: Academic rdfs:subClassOf Person; Lecturer rdfs:subClassOf Academic. Inferred: Lecturer rdfs:subClassOf Person.]
RDF(S) Inference
[Figure: RDF graph. Academic and Lecturer each have rdf:type rdfs:Class. Asserted: Lecturer rdfs:subClassOf Academic; Sean rdf:type Lecturer. Inferred: Sean rdf:type Academic.]
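The two kinds of RDF(S) inference pictured on these slides, subclass transitivity and propagation of rdf:type up the class hierarchy, can be sketched as a small fixed-point computation. This is an illustrative sketch covering only those two rules, not the full set of RDFS entailment rules; the function name `rdfs_closure` and the short node names are chosen for the example.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Apply two RDFS-style rules until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for (a, p, b) in inferred:
            for (c, q, d) in inferred:
                if b != c or q != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    # A subClassOf B, B subClassOf D  =>  A subClassOf D
                    new.add((a, SUBCLASS, d))
                elif p == TYPE:
                    # x type B, B subClassOf D  =>  x type D
                    new.add((a, TYPE, d))
        changed = not new <= inferred
        inferred |= new
    return inferred


asserted = {
    ("Academic", SUBCLASS, "Person"),
    ("Lecturer", SUBCLASS, "Academic"),
    ("Sean", TYPE, "Lecturer"),
}
closure = rdfs_closure(asserted)
```

The difference `closure - asserted` holds exactly the statements that were never written down but follow from the schema.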
What does RDF(S) give us?
• Ability to use simple schema/vocabularies when describing our
resources.
• Consistent vocabulary use and sharing.
• Simple inference
• CS AktiveSpace
– Lightweight schema to integrate data from
University sites
• myGrid
– Service descriptions for e-Science
Problems with RDFS
• RDFS is too weak to describe resources in sufficient detail
– No localised range and domain constraints
• Can’t say that the range of hasChild is person when applied to
persons and elephant when applied to elephants
– No existence/cardinality constraints
• Can’t say that all instances of person have a mother that is also
a person, or that persons have exactly 2 parents
– No transitive, inverse or symmetrical properties
• Can’t say that isPartOf is a transitive property, that hasPart is
the inverse of isPartOf or that touches is symmetrical
• It can be difficult to provide reasoning support
– No “native” reasoners for non-standard semantics
– May be possible to reason via FO axiomatisation
Web Ontology Language Requirements
Desirable features identified for Web Ontology Language:
• Extends existing Web standards
– Such as XML, RDF, RDFS
• Easy to understand and use
– Should be based on familiar KR idioms (e.g. OO-style, frames etc).
• Formally specified
• Of “adequate” expressive power
• Possible to provide automated reasoning support
The OWL Family Tree
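[Figure: the OWL family tree. Frames and Description Logics feed into OIL (OntoKnowledge+Others); DAML-ONT (DAML) builds on RDF/RDF(S); DAML-ONT and OIL are merged into DAML+OIL (Joint EU/US Committee), which leads to OWL (W3C).]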
OWL
• W3C Recommendation (February 2004)
• Well defined RDF/XML serializations
• A family of Languages
– OWL Full
– OWL DL
– OWL Lite
• Formal semantics
– First Order (DL/Lite)
– Relationship with RDF
• Comprehensive test cases for tools/implementations
• Growing industrial takeup.
OWL Basics
• Set of constructors for concept expressions
– Booleans: and/or/not
– Quantification: some/all
• Axioms for expressing constraints
– Necessary and Sufficient conditions on classes
– Disjointness
– Property characteristics: transitivity, inverse
• Facts
– Assertions about individuals
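One way to read the constructors is set-theoretically: a class denotes a set of individuals, and each constructor builds a new set. The Python sketch below evaluates the Boolean and quantifier constructors over one small, made-up interpretation (the individuals, classes and property are hypothetical). It shows only what the expressions mean in a single fixed model; an OWL reasoner has to consider all possible models.

```python
# Hypothetical finite interpretation: a domain, two classes, one property.
DOMAIN = {"sean", "ian", "carole", "tutorial"}
PERSON = {"sean", "ian", "carole"}
LECTURER = {"sean"}
HAS_COLLEAGUE = {("sean", "ian"), ("carole", "ian")}


def and_(c, d):
    return c & d


def or_(c, d):
    return c | d


def not_(c):
    return DOMAIN - c


def some(prop, c):
    # Existential: individuals with at least one prop-successor in c.
    return {x for (x, y) in prop if y in c}


def all_(prop, c):
    # Universal: individuals whose prop-successors are all in c
    # (vacuously true for individuals with no successors).
    return {x for x in DOMAIN if all(y in c for (x2, y) in prop if x2 == x)}


# "Lecturers who have some colleague that is a person"
expr = and_(LECTURER, some(HAS_COLLEAGUE, PERSON))
```

Note that `all_` includes individuals with no colleagues at all, which mirrors the behaviour of universal quantification that often surprises ontology authors.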
Reasoning with OWL
• OWL (DL) has a well defined semantics that tells us how to
interpret expressions in the language.
• This semantics corresponds to “traditional” interpretations given
to first order logic or subsets of FOL like Description Logics.
• OWL DL based on a well understood Description Logic
(SHOIN(Dn))
– Formal properties well understood (complexity, decidability)
– Known reasoning algorithms
– Implemented systems (highly optimised)
• Because of this, we can reason about OWL ontologies, allowing
us to draw inferences from the basic facts that we provide.
Reasoning Tasks
• Subsumption reasoning
– Allows us to infer when one class is a subclass of another
– Can then build concept hierarchies representing the taxonomy.
– This is classification of classes.
• Satisfiability reasoning
– Tells us when a concept is unsatisfiable
• i.e. when it is impossible to have instances of the class.
– Allows us to check whether our model is consistent.
• Instance Retrieval/Instantiation
– What are the instances of a particular class C?
– What are the classes that x is an instance of?
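The flavour of the first two tasks can be shown on a toy ontology. The sketch below is a purely structural check over told subclass links and one disjointness axiom, nowhere near a Description Logic reasoner; the class names, the axiom and the function names are all hypothetical.

```python
# Hypothetical toy ontology: told subclass links plus one disjointness axiom.
SUBCLASS_OF = {
    "Lecturer": {"Academic"},
    "Academic": {"Person"},
    "Student": {"Person"},
    "StudentLecturer": {"Student", "Lecturer"},
    "Person": set(),
}
DISJOINT = {frozenset({"Student", "Academic"})}


def ancestors(cls):
    """All classes that subsume cls, including cls itself."""
    seen, todo = set(), [cls]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(SUBCLASS_OF.get(c, ()))
    return seen


def subsumes(general, specific):
    """Subsumption: is every instance of specific also an instance of general?"""
    return general in ancestors(specific)


def is_satisfiable(cls):
    """False if cls falls under two classes that are declared disjoint."""
    supers = ancestors(cls)
    return not any(pair <= supers for pair in DISJOINT)
```

Here `StudentLecturer` is reported as unsatisfiable because it inherits from both `Student` and `Academic`, which the toy ontology declares disjoint; this is the kind of modelling error a reasoner surfaces during design.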
Classification
Why Reasoning?
• Reasoning can be used as a design support tool
– Check logical consistency of classes
– Compute implicit class hierarchy
• May be less important in small local ontologies
– Can still be useful tool for design and maintenance
– Much more important with larger ontologies/multiple authors
• Valuable tool for integrating and sharing ontologies
– Use definitions/axioms to establish inter-ontology relationships
– Check for consistency and (unexpected) implied relationships
• Basis for answering queries.
• Reasoning can help underpin the provision of the machine
processing required of the Semantic Web.
What does OWL give us?
• Rich language for describing domain models.
• Unambiguous interpretations of complex descriptions.
• The ability to use inference to manage our vocabularies.
• GONG
• VO Formation
• PhosphaBase
More Languages
• RDF, RDF(S) and OWL provide basic representational capabilities.
• We also need mechanisms that allow us to access and query the information.
– RDF has an underlying concrete syntax based on XML. Why not just use something like XPath to query the RDF?
• RDQL, RQL, SeRQL, …
– W3C Data Access Working Group attempting to standardise on SPARQL
• Elements of the earlier languages with a well-defined semantic basis
– OWL-QL Query language for OWL.
• Allow specification of conjunctive queries using OWL concept expressions
• Also investigations into extensions of the expressivity of OWL.
– Rules
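One reason path-based queries over the XML are brittle is that the same RDF graph can be serialised in more than one way, so RDF query languages match patterns against the triples instead. The Python sketch below shows the core idea of a conjunctive triple-pattern query with variables. It is an illustration of the idea only, not SPARQL: the function names and the convention that terms starting with `?` are variables are choices made for this example.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def query(graph, patterns):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for triple in graph:
                candidate = unify(pattern, triple, b)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


graph = {
    ("Sean", "hasColleague", "Ian"),
    ("Carole", "hasColleague", "Ian"),
    ("Ian", "hasHomePage", "http://www.cs.man.ac.uk/~horrocks"),
}

# "Whose colleague has a home page, and what is it?"
results = query(graph, [("?who", "hasColleague", "?c"),
                        ("?c", "hasHomePage", "?page")])
```

The shared variable `?c` is what joins the two patterns: a binding survives only if the same resource is both someone's colleague and the subject of a hasHomePage statement.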
Potential Pitfalls
Conflicting Views
• The Semantic Web community is diverse, with a rough division
between the “neats” and the “scruffies”.
• Neats
– Logic and languages
– Completeness/decidability
– Top down, well-behaved
– Heavyweight
– Rich ontologies
– OWL
• Scruffies
– Practice
– Bottom up/real-world
– Lightweight
– Folksonomies
– FOAF
– RDF
Semantic Web vs Semantic Web
• Semantics/AI/KR community with little attention paid to Web
aspects
– “You’re not doing it properly”
• Web community with little attention paid to Semantics.
– “Just stick everything in a big RDF store and it’ll all be fine”
• Diversity can be healthy, but can also lead to fragmentation and
pointless arguments.
Splitters!
Tools and Services
• We need to provide tools and services to help users to:
– Design and maintain high quality ontologies, e.g.:
• Meaningful — all named classes can have instances
• Correct — captured intuitions of domain experts
• Minimally redundant — no unintended synonyms
• Richly axiomatised — (sufficiently) detailed descriptions
– Store (large numbers) of instances of ontology classes, e.g.:
• Annotations from web pages
– Answer queries over ontology classes and instances, e.g.:
• Find more general/specific classes
• Retrieve annotations/pages matching a given description
– Integrate and align multiple ontologies
How thick is your infrastructure?
• Sharing is about interoperation: ensuring that when you look at or process my data, you do it in a consistent way.
• “Thick” infrastructure can help interoperability. Clients don’t have to guess how to interpret things.
– But can be harder to build
[Figure: Thin Apps on top of a Thick Infrastructure]
How thick is your infrastructure?
• A lightweight infrastructure (e.g. RDF) means that clients/apps have to do more. And may do it differently.
• Metadata can end up being locked away within the applications where others can’t get at it. Is that sharing? Are you exposing the semantics?
[Figure: Thick Apps on top of a Thin Infrastructure]
Trust and Security
• Publishing my information in machine-processable forms may
allow you to:
– Work out what I’m doing
– Integrate across multiple sources to produce new conclusions
• How do I control this?
• We need mechanisms that will allow us to control access to
knowledge
• We need mechanisms that allow us to ascribe provenance and trust information to our knowledge.
– The SW “stack” sees these at the top. Some of this has to come from the bottom though.
Scalability
• Will this stuff work on a web scale?
• Millions of triples/facts
• Thousands of ontologies
• Are you ever going to get global agreements?
Language Summary
• We’ve seen some of the technology being proposed as a basis
for building the Semantic Web
• These languages provide basic machinery that we can use to
represent the extra semantic information needed for the
Semantic Web
– XML
– RDF
– RDF(S)
– OWL
[Figure: the language stack (XML, RDF, RDF(S), OWL) supporting Annotation, Integration and Inference]