Transcript Test

Semantic technologies
(implementation of a use case,
ontology engineering,
inference, triple stores)
Peter Fox (RPI)
ESIP Summer Meeting
Santa Barbara CA, 2009, July 7, 10:30-12:00pm
Semantic Web Methodology and
Technology Development Process
• Establish and improve a well-defined methodology vision for
Semantic Technology based application development
• Leverage controlled vocabularies, etc.
[Figure labels: Use Case; Small Team, mixed skills; Analysis; Develop model/ontology; Use Tools; Science/Expert Review & Iteration; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Adopt Technology Approach; Leverage Technology Infrastructure]
Semantic Web Layers
http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/
Implementation
• Cover language representation choices, and
knowledge engineering
• Pull apart the use case
• Tools and services
• Architecture considerations and design
choices
Languages
• OWL (1 and 2)
• RDFS (+)
• SKOS
• RIF (OWL 2 RL)
• SPARQL
• OWL-S
RDFS
• Note: XML Schema (XMLS) is not an ontology language
– Changes format of DTDs (document schemas) to
be XML
– Adds an extensible type hierarchy
• Integers, Strings, etc.
• Can define sub-types, e.g., positive integers
• RDFS is recognisable as an ontology
language
– Classes and properties
– Sub/super-classes (and properties)
– Range and domain (of properties)
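A minimal sketch, in plain Python with no RDF library, of what these RDFS features let a reasoner conclude. It applies only a small subset of the RDFS entailment rules, and the class, property and individual names (Person, Mammal, hasChild, hasDaughter, alice, carol) are illustrative assumptions, not part of the slides.

```python
# Triples are (subject, predicate, object) tuples; all names are illustrative.
SUBCLASS, SUBPROP, DOMAIN, RANGE, TYPE = (
    "rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range", "rdf:type")

def rdfs_closure(triples):
    """Apply a small subset of the RDFS rules until nothing new appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # subClassOf is transitive
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # an instance of a subclass is an instance of the superclass
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # a statement with a sub-property also holds for the super-property
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
                # domain types the subject, range types the object
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))
        if new <= facts:
            return facts
        facts |= new

schema = [
    ("Person", SUBCLASS, "Mammal"),
    ("hasChild", DOMAIN, "Person"),
    ("hasChild", RANGE, "Person"),
    ("hasDaughter", SUBPROP, "hasChild"),
]
data = [("alice", "hasDaughter", "carol")]
inferred = rdfs_closure(schema + data)
```

From one asserted statement about alice, the sketch derives that alice hasChild carol and that both are Persons and Mammals. Note that domain and range are global to the property here.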
However
• RDFS too weak to describe resources in sufficient
detail
– No localized range and domain constraints
• Can’t say that the range of hasChild is person when applied to
persons and elephant when applied to elephants
– No existence/cardinality constraints
• Can’t say that all instances of person have a mother that is also a
person, or that persons have exactly 2 parents
– No transitive, inverse or symmetrical properties
• Can’t say that isPartOf is a transitive property, that hasPart is the
inverse of isPartOf or that touches is symmetrical
–…
• Difficult to provide reasoning support
– No “native” reasoners for non-standard semantics
– May be possible to reason via First Order axiomatisation
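To make the third gap concrete, here is a hedged sketch in plain Python of what declaring a property transitive, symmetric or inverse would let a reasoner conclude. The property names isPartOf, hasPart and touches come from the slide; the individuals (piston, engine, car, road) and the function name are assumptions for illustration only.

```python
def property_closure(triples, transitive=(), symmetric=(), inverse_of=None):
    """Saturate (s, p, o) triples under declared property characteristics."""
    inverse_of = dict(inverse_of or {})
    # An inverse declaration works in both directions.
    inverse_of.update({q: p for p, q in list(inverse_of.items())})
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                # chain s -p-> o -p-> o2 into s -p-> o2
                new.update((s, p, o2) for s2, p2, o2 in facts
                           if p2 == p and s2 == o)
        if new <= facts:
            return facts
        facts |= new

asserted = [
    ("piston", "isPartOf", "engine"),
    ("engine", "isPartOf", "car"),
    ("car", "touches", "road"),
]
closed = property_closure(
    asserted,
    transitive={"isPartOf"},
    symmetric={"touches"},
    inverse_of={"hasPart": "isPartOf"},
)
```

None of the derived statements (piston isPartOf car, car hasPart piston, road touches car) can be obtained from RDFS alone, because RDFS has no vocabulary for these declarations.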
RDFS+
• Worth taking a look at but there is not a lot of
information readily available (as yet)
OWL requirements
Desirable features identified for Web Ontology
Language:
• Extends existing Web standards
– Such as XML, RDF, RDFS
• Easy to understand and use
– Should be based on familiar KR idioms
• Formally specified
• Of “adequate” expressive power
• Possible to provide automated reasoning support
The OWL language:
• Three species of OWL
– OWL Full is union of OWL syntax and RDF
– OWL DL restricted to FOL fragment (≈ DAML+OIL)
– OWL Lite is “easier to implement” subset of OWL DL
• Semantic layering
– OWL DL ≈ OWL Full within DL fragment
– DL semantics officially definitive
• OWL DL based on SHIQ Description Logic
– In fact it is equivalent to SHOIN(Dn) DL
• OWL DL benefits from many years of DL research
– Well defined semantics
– Formal properties well understood (complexity, decidability)
– Known reasoning algorithms
– Implemented systems (highly optimized)
OWL 1 Class Constructors
OWL 1 axioms
OWL 2
• http://www.w3.org/2007/OWL/wiki/OWL_Working_Group
• http://www.w3.org/2007/OWL/wiki/Image:Owl2-refcard_2008-09-24.pdf
• Semtech slides from the W3 OWL 2 panel –
ask me for these
• Property chaining
• Numerics
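As a rough illustration of property chaining: a chain says that following one property and then another implies a third. The sketch below is plain Python, not an OWL 2 reasoner, and the property and individual names (hasParent, hasBrother, hasUncle, ann, bea, carl) are hypothetical.

```python
def apply_property_chain(triples, chain, implied):
    """Add (a, implied, c) whenever a -chain[0]-> b -chain[1]-> c holds.

    Single pass only: a chain that mentions its own implied property
    would need to be repeated until no new triples appear.
    """
    first, second = chain
    facts = set(triples)
    derived = {
        (a, implied, c)
        for a, p, b in facts if p == first
        for b2, q, c in facts if q == second and b2 == b
    }
    return facts | derived

family = [
    ("ann", "hasParent", "bea"),
    ("bea", "hasBrother", "carl"),
]
result = apply_property_chain(family, ("hasParent", "hasBrother"), "hasUncle")
```

Here the only derived statement is that ann hasUncle carl.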
OWL 2 RL
• http://www.w3.org/TR/owl2-profiles/
• http://ivan-herman.name/2009/04/27/simpleowl-2-rl-service/
SKOS properties
• skos:note
e.g. ‘Anything goes.’
• skos:definition
e.g. ‘A long curved fruit with a yellow skin and soft, sweet white flesh inside.’
• skos:example
e.g. ‘A bunch of bananas.’
• skos:scopeNote
e.g. ‘Historically members of a sheriff's retinue armed with pikes who escorted judges
at assizes.’
• skos:historyNote
e.g. ‘Deleted 1986. See now Detention, Institutionalization (Persons), or
Hospitalization.’
• skos:editorialNote
e.g. ‘Confer with Mr. X. re deletion.’
• skos:changeNote
e.g. ‘Promoted “love” to preferred label, demoted “affection” to alternative label, Joe
Bloggs, 2005-08-09.’
SKOS core and RDFS/OWL
• Disjoint?
– Should skos:Concept be disjoint with …
• rdf:Property ?
• rdfs:Class ?
• owl:Class ?
• DL?
– Should SKOS Core be an OWL DL ontology?
• Means not allowing flexibility in range of
documentation props
– It is now (2008)! OWL-Full
Summaries
• Michael Denny’s Table: (a bit out of date)
• http://www.xml.com/2004/07/14/examples/Ontology_Editor_Survey_2004_Table__Michael_Denny.pdf
• ESW Wiki:
http://esw.w3.org/topic/SemanticWebTools
Engineering an ontology to the ‘ground’
Editors
• Protégé (http://protege.stanford.edu)
• SWOOP (http://mindswap.org/2004/SWOOP; see
also http://www.mindswap.org/downloads/)
• Altova SemanticWorks
(http://www.altova.com/download/semanticworks/semantic_web_rdf_owl_editor.html)
• SWeDE (http://owleclipse.projects.semwebcentral.org/InstallSwede.html), goes with Eclipse
• Medius
• TopBraid Composer and other commercial tools
• CMAP Ontology Editor (COE)
(http://cmap.ihmc.us/coe)
Protégé
•http://protege.stanford.edu/
•http://protegewiki.stanford.edu/index.php/Protege-OWL
•Please check version compatibility when choosing your
development options, e.g. 3.4 vs. 4.0
•Do you have plugins you like? They may not yet be
available in 4.0 (e.g. Prompt is not compatible with version 4.0)
Triple Stores
• Jena (http://jena.sourceforge.net/)
• Sesame/SAIL (http://www.openrdf.org/)
• KOWARI (http://www.kowari.org/) -> Mulgara (http://www.mulgara.org/)
• Redland (http://librdf.org/index.html)
• Oracle (!)
• Many others (relational, object-relational)
Software development tools
• Protégé, w/ plug-ins - some better than others
• SWOOP (OWL analyzer – species validator,
partitioner)
• Jena (http://jena.sourceforge.net/)
• ELMO ()
• Eclipse (full integrated development
environment for Java; http://www.eclipse.org/)
• Top Quadrant suite
• Sandsoft (Sandpiper Software)
• … see Semantic Technologies 2007/8/9
Implementing semantics
• Query
• Reasoning
• Rules
• A combination?
• Use cases are the key to guide you
Reasoners (aka Inference engines)
• Pellet **
• Racer (and Racer Pro) **
• SHER (IBM)
http://www.alphaworks.ibm.com/tech/sher
• Medius KBS
• FACT++
• fuzzyDL
• KAON2
• MSPASS
• QuOnto
• Jess (for Rules)
• …
Implementation Basics
• Review your documented use case with team and
experts
• Go into detail of your ontology; test it using the tools
you have
• We will look at the use case document and examine
the actors, process flow, artifacts, etc.
• You will start to develop a design and an architecture
(more on architecture and middleware next week)
• Keep in mind that it is more flexible to place the
formal semantics between/in your interfaces, i.e.
between layers and components in your architecture,
i.e. between ‘users’ and ‘information’ to mediate the
exchange
Actors
• The initial analysis will often have many
human actors
• Begin to see where these can be replaced
with machine actors – may require additional
semantics, i.e. knowledge encoding
• If you are doing this in a team, take steps to
ensure that actors know their role and what
inputs, outputs and preconditions are
expected of them
• Often, you may be able to ‘run’ the use case
(really the model) before you build anything
Process flow
• Each element in the process flow usually
denotes a distinct stage in what will need to
be implemented
• Often, actors mediate the process flow
• Consider the activity diagram (and often a
state diagram) as a means to turn the written
process flow into a visual one that your
experts can review
• Make sure the artifacts and services have an
entry in the resources section
• This is often the time when you may do some searching
Preconditions
• Often the preconditions are very syntactic
and may not be ready to fit with your
semantically-rich implementation
• Some level of modeling of these
preconditions may be required (often this will
not be in your first pass knowledge encoding
which focuses on the main process flow,
goal, description, etc.)
• Beware of using other entities’ data and
services: policies, access rights, registration,
and ‘cost’
Artifacts
• Add artifacts that the use case generates to the
resources list in the table
• It is often useful to record which artifacts are
critical and which are of secondary importance
• Be thinking of provenance and the way these
were produced, i.e. what semantics went into
them and produce suitable metadata or
annotations
• Engage the actors to determine the names of
these artifacts and who should have
responsibility for them (usually you want the
actors to have responsibility for evolution)
Reviewing the resources
• Apart from the artifacts and actor resources,
you may find gaps
• Your knowledge encoding is also a resource,
make it a first class citizen, i.e. give it a
namespace and a URI
• Sometimes, a test-bed with local data is very
useful as you start the implementation
process, i.e. pull the data, maybe even
implement their service (database, etc.)
Back to the knowledge encoding
• Declarative: in CL, OWL (probably OWL-DL),
RDF, SKOS?
• Need rules?
• Need query?
• Science expert review and iteration
• Means you need something that they can
review, with precise names, properties,
relations, etc.
• The knowledge engineering stage is much
like a software engineering process
Knowledge engineering
• Mostly choose OWL-DL (and OWL 2)
• We may need to go to OWL 2 for numerical
comparisons and if so, separate your OWL 1
from OWL 2 representations
• The interplay between tools like Protégé and
CMAP may be very important in implementing
a knowledge base that has ‘just enough’
Implementation Basics
• Review documented use case now
• Go into detail of the ontology
• Now we will look at the use case document and
examine the actors, process flow, artifacts, etc.
• Start thinking of a design and an architecture
• Semantics between/ in your interfaces
Roles and skill-sets
• Facilitator – changes slightly for implementation; sometimes the facilitator becomes chief architect,
sometimes steps back
• Domain experts are needed for expert review
(domain literate, know resources; data, applications,
tools, etc)
• You are the modeler (to extract objects, triples)
• You are likely to play the role of a software engineer
(architecture, technology) but you can also ask
someone for help with this
• Document, document, document
• It is social – a team effort
Summary
• By now, the reality of going into complete
detail for the knowledge representation
should be apparent
• Keeping it simple is also very important as
you begin to implement
• Being prepared to iterate is really essential
• Now is the time to validate your ontology with
domain experts and your team, use the tools
• The next stage is to choose your technology
components and build and test
E.g. use cases and implementation
• Apple orchard sensor network
– Semantic mediawiki
• Stargazer
– OWL-DL, Jena, ELMO, Pellet
• Diet exchange portal
– RDF, Perl, SPARQL, re-use
• BCO-DMO faceted search
Inventory
• Refer to the Resources
– Files
– Databases
– Catalogs
– Existing UI
– Services
– User database/ security
– Logging
– Backup/ archive
[Figure: applications (Geo App1, Geo App2, App3); services (WSCommon, Web Coverage Service, Web Feature Service, Web Mapping Service); databases (DB1, DB2, DB3, …, DBn); annotation: limited interoperability]
The Astronomy approach; datatypes as a service
[Figure: applications (VO App1, VO App2, VO App3); VO layer (VOTable, Simple Image Access Protocol, Simple Spectrum Access Protocol, Simple Time Access Protocol); databases (DB1, DB2, DB3, …, DBn); annotations: limited interoperability; lightweight semantics; limited meaning, hard coded; limited extensibility; under review]
Fox WHOI: Semantic Data Frameworks March 20, 2008
[Figure: Education, clearinghouses, other services, disciplines, etc.; Web Serv., Web Portal, API; semantic mediation layer - mid-upper-level; semantic interoperability; query, access and use of data; semantic query, hypothesis and inference; data as service; metadata, schema, data; DB1, DB2, DB3, …, DB…]
Semantic mediation layer: Ontology - capturing concepts of Parameters,
Instruments, Date/Time, Space, Event, Feature, Data Product (and associated classes,
properties) and Service Classes. Maps queries to underlying data. Generates access
requests for metadata, data. Allows queries, reasoning, analysis, new hypothesis generation,
testing, explanation, etc.
Implementing
• Let’s take an example
– VSTO
– Representative but does not exercise all
semantic web capabilities
Web Service
Fox RPI: Semantic Data
Frameworks May 14, 2008
Additional middleware
• Web server, Tomcat are essential (Axis)
• MySQL (or similar) is very handy to have
• OPeNDAP – for data access and transport
Web Service
Infrastructure
• Protégé-OWL-API
– http://protege.stanford.edu/plugins/owl/api/index.html
– http://protege.stanford.edu/plugins/owl/api/guide.html
• Jena (Java API for RDF and OWL)
– http://protege.stanford.edu/plugins/owl/jenaintegration.html
– http://jena.sourceforge.net/
– Migrate to other triple stores when needed
Using Protégé
• Load VSTO into Protégé 3.4beta
• Generate Java-OWL classes from Tool menu
• Review other tools for generating code stubs
53
Examine some of the code
• Java Factory class
• Code stubs and ‘myclass’
• VSTO code base: browse it
54
Jena
Infrastructure
• Reasoner – DIG/Pellet
– http://protegewiki.stanford.edu/index.php/ProtegeReasonerAPI
• SPARQL
– http://www.w3.org/2001/sw/DataAccess/tests/implementations
• Spring (Application Framework - optional)
– http://www.springframework.org/
• Eclipse (IDE)
– http://www.eclipse.org/
Software development
• JUnit (generated in Eclipse)
– http://smiprotege.stanford.edu/repos/protege/owl/trunk/junit.properties.template
• Faceted browsing – mspace, jspace
Metadata
• Migrate metadata into ontologies – instances,
choose how you will populate them
– Manual – okay to start with sufficient annotation
– Scripted – preferred
– rdfs:comment: essential
• Choose what you will not, cannot move
• W3C Recommendations like GRDDL are very
useful
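The scripted route can start very small. The sketch below, using only the Python standard library, turns existing metadata records into instance triples in N-Triples syntax, always attaching an rdfs:comment. The namespace http://example.org/instruments#, the record fields (id, kind, name, comment) and the sample record are hypothetical; record ids are assumed to be safe to use in an IRI.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
BASE = "http://example.org/instruments#"  # hypothetical namespace

def literal(text):
    """Quote a plain string as an N-Triples literal."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'

def record_to_ntriples(record):
    """Turn one metadata record (a dict) into N-Triples lines for one instance."""
    subject = f"<{BASE}{record['id']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{BASE}{record['kind']}> .",
        f"{subject} <{RDFS_LABEL}> {literal(record['name'])} .",
        f"{subject} <{RDFS_COMMENT}> {literal(record['comment'])} .",
    ]

# Hypothetical record, standing in for a row pulled from an existing catalog.
records = [
    {"id": "fpi01", "kind": "Interferometer",
     "name": "Fabry-Perot interferometer",
     "comment": 'Migrated from the legacy "instruments" table.'},
]
lines = [line for record in records for line in record_to_ntriples(record)]
```

Writing `lines` to a file gives something most triple stores can load. A real migration would more likely use the API of the chosen store or toolkit.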
Services
• If you are going to put up services, include an
end-point and a link to your WSDL (or
SAWSDL)
• At this point, developing a full services
ontology, e.g. in OWL-S may be beyond the
initial implementation
Semantic Web Services
Semantic Web Services
OWL document returned using VSTO ontology;
can be used either syntactically or semantically
Result/ outcome
• Refer to the use case document
• Check the expected outcome and see if the
test (to verify outcome) is complete
• Document all variations, note alternate flows
• Document in sufficient detail that someone
else could come along and reproduce your
work
• Include URLs for access, etc.
Summary
• Architectural design needs to take into
account existing resources that you will
leverage
• Keeping it simple is also very important as
you begin to implement
• Take time to learn the tools and the
supporting APIs; look at existing examples
and working code
• Being prepared to iterate is really essential
Tutorial Summary
• Many different options for ontology querying; none are standard
• RDF query is most advanced
• Inference needs and choice will depend on
descriptive requirements (e.g. DL, Full, RDF,
etc.)
Reference material
Terminology
• Ontology (n.d.). The Free On-line Dictionary of
Computing.
http://dictionary.reference.com/browse/ontology
– An explicit formal specification of how to
represent the objects, concepts and other entities
that are assumed to exist in some area of interest
and the relationships that hold among them.
• Semantic Web
– An extension of the current web in which
information is given well-defined meaning, better
enabling computers and people to work in
cooperation, www.semanticweb.org
– Primer: http://www.ics.forth.gr/isl/swprimer/
Ontology Spectrum
[Figure: ontology spectrum, in order of increasing formality: Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints]
Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty;
– updated by McGuinness.
Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
Ontology - declarative knowledge
• The triple: {subject-predicate-object}
interferometer is-a optical instrument
Fabry-Perot is-a interferometer
Optical instrument has focal length
Optical instrument is-a instrument
Instrument has instrument operating mode
Data archive has measured parameter
SO2 concentration is-a concentration
Concentration is-a parameter
A query: select all optical instruments which have
operating mode vertical
An inference: infer operating modes for a Fabry-Perot
Interferometer which measures neutral temperature
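A minimal sketch of the same idea in plain Python: the triples from this slide held in a set, a pattern-matching query, and a simple inference that follows is-a links. It is not a real triple store or reasoner. The function names are assumptions, and because the slide gives no instance data (such as an operating mode with the value vertical), the query shown is a simpler one.

```python
TRIPLES = {
    ("interferometer", "is-a", "optical instrument"),
    ("Fabry-Perot", "is-a", "interferometer"),
    ("optical instrument", "has", "focal length"),
    ("optical instrument", "is-a", "instrument"),
    ("instrument", "has", "instrument operating mode"),
    ("data archive", "has", "measured parameter"),
    ("SO2 concentration", "is-a", "concentration"),
    ("concentration", "is-a", "parameter"),
}

def match(triples, s=None, p=None, o=None):
    """Query: return triples matching a pattern; None is a wildcard."""
    return {t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)}

def ancestors(triples, term):
    """Inference: all classes reachable from term by following is-a links."""
    seen, stack = set(), [term]
    while stack:
        for _, _, parent in match(triples, s=stack.pop(), p="is-a"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def inherited(triples, term):
    """Things term 'has', directly or via its is-a ancestors."""
    return {o for cls in {term} | ancestors(triples, term)
            for _, _, o in match(triples, s=cls, p="has")}

# Query: everything that is (directly or indirectly) an optical instrument.
optical = {s for s, _, _ in TRIPLES
           if "optical instrument" in ancestors(TRIPLES, s)}

# Inference: a Fabry-Perot is never stated to have an operating mode,
# but it inherits one through interferometer -> optical instrument -> instrument.
fabry_perot_has = inherited(TRIPLES, "Fabry-Perot")
```

The query relies on the inference: Fabry-Perot is returned as an optical instrument even though no triple says so directly.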