OWL-AA: Enriching OWL with
Instance Recognition Semantics for
Automated Semantic Annotation
2006 Spring Research Conference
Yihong Ding
Semantic Web and
Automated Semantic Annotation
Semantic Web: a web whose data is machine-processable
Semantic Annotation: adds formal metadata to web
pages
Metadata links data in a web page to defined concepts
in an ontology
Annotated data becomes machine-processable
Annotation needs automation to be scalable
2
“Main Drawback” of Current
Automated Semantic Annotation
Problem: “post-processing and mapping of the IE
[information extraction] results to an ontology”
[Kiryakov 2004]
Needs human intervention
Decreases system automation and scalability
Solution: “use ontolog[ies] more directly during the
process of extraction” [Kiryakov 2004]
Does work (as our ontology-based annotation shows)
But …
3
A Hidden Problem:
Compatibility with Standards
A solution should be compatible with semantic web
standards
OWL (Web Ontology Language): standard
Solutions must be OWL-compatible
Current Solution
OSMX (Object-oriented Systems Model in XML): not a
standard, not OWL-compatible
Declarative instance recognition semantics
Needed by automated annotation process
Lacking in OWL
4
Instance Recognition Semantics in
Extraction Ontologies
Instance recognition semantics: machine-processable
recognizers of instances that belong to the extension
of a concept in a specified domain.
Examples in extraction ontologies
External Representation
Price: \d+|\d?\d?\d,\d\d\d
Make: CarMake.lexicon
Contextual Representation
Context phrases (left, right), e.g. \$?
Context keywords: e.g. price | obo | neg(\.|otiable)
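As a rough illustration of how these declarations could drive a recognizer, here is a minimal Python sketch. The patterns are taken from the slide; the function name `recognize_prices` and the acceptance rule (keep a value only if it has a `$` to its left or the text contains a context keyword) are assumptions made for this example, not the behavior of the extraction-ontology system. The alternatives in the value pattern are reordered because Python tries alternatives left to right, and `\d+` would otherwise match only the `12` in `12,500`.

```python
import re

# Value pattern from the slide, with the comma form listed first so that
# Python's leftmost-first alternation does not stop at the plain-digit form.
PRICE_VALUE = r"\d?\d?\d,\d\d\d|\d+"
# Left context phrase from the slide: an optional dollar sign.
LEFT_CONTEXT = r"\$?"
# Context keywords from the slide.
PRICE_KEYWORDS = re.compile(r"price|obo|neg(\.|otiable)", re.IGNORECASE)

PRICE_PATTERN = re.compile(rf"(?P<left>{LEFT_CONTEXT})(?P<value>{PRICE_VALUE})")


def recognize_prices(text):
    """Return candidate Price values found in text.

    Assumed acceptance rule (illustrative only): keep a value if it is
    preceded by '$' or if the text contains a price keyword.
    """
    has_keyword = PRICE_KEYWORDS.search(text) is not None
    values = []
    for match in PRICE_PATTERN.finditer(text):
        if match.group("left") or has_keyword:
            values.append(match.group("value"))
    return values
```

With this rule, the year in "2001 Ford Taurus, $4,500" is skipped because it has no left context and the text has no keyword, while the value after the dollar sign is kept.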
5
OWL: Lacks Instance Recognition
Semantics
In general, OWL
Declares class, property, hierarchical relationship, restriction.
Declares instantiations.
Does not support declaration of “instance recognition”
Consequently,
OWL lacks declarative semantics that automated annotation
can use directly
Mixture of knowledge declaration and knowledge processing
Domain experts must know program implementation;
Or, program developers must be domain experts.
No annotation integrity checking
<carad:Make>Taurus</carad:Make> is legal, though it is
incorrect (Taurus is a model, not a make);
machines cannot catch this error.
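To make the integrity-checking idea concrete, the following Python sketch tests each annotated value against a recognizer declared for its concept. Everything here is hypothetical: the lexicon contents stand in for CarMake.lexicon, which the slides do not show, and `check_annotations` is a name invented for this example rather than part of OWL-AA.

```python
import re

# Hypothetical lexicon contents; the real CarMake.lexicon is not shown in the slides.
CAR_MAKE_LEXICON = {"ford", "honda", "toyota"}

# One recognizer per concept; each returns True when the value belongs to the concept.
RECOGNIZERS = {
    "Make": lambda value: value.strip().lower() in CAR_MAKE_LEXICON,
}

# Matches simple annotations of the form <carad:Concept>value</carad:Concept>.
ANNOTATION = re.compile(r"<carad:(\w+)>(.*?)</carad:\1>")


def check_annotations(markup):
    """Return a (concept, value, ok) tuple for each carad annotation in markup."""
    results = []
    for concept, value in ANNOTATION.findall(markup):
        recognizer = RECOGNIZERS.get(concept)
        # Concepts without a declared recognizer cannot be checked; pass them through.
        ok = True if recognizer is None else recognizer(value)
        results.append((concept, value, ok))
    return results
```

Under these assumptions, the annotation `<carad:Make>Taurus</carad:Make>` is flagged because "Taurus" is not in the lexicon, while `<carad:Make>Ford</carad:Make>` passes.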
6
OWL-AA (RDF Schema)
7
OWL-AA (RDF Schema)
8
OWL-AA Declarations
9
Implementation
OWL-AA ontologies are converted to OSMX ontologies
using the Jena API
The resulting OSMX ontologies drive the automated
annotation
10
Conclusion
OWL-AA is a way to extend OWL to provide for
automated semantic annotation.
OWL-AA overcomes the “main drawback” of
automated semantic annotation.
OWL-AA allows us to separate the creation of domain
knowledge from the implementation of a processor to
use domain knowledge for the purpose of annotating
web pages.
OWL-AA provides for annotation integrity checking.
11
Declaration vs. Instantiation
[Figure labels: Instantiation, Instantiation, Declaration]
12
Instance Recognition Semantics
Machine-processable recognizers of
instances that belong to the extension of a
concept in a specified domain.
IRecS of Left Concept: has line in eye
IRecS of Right Concept: no line in eye
13