Ontology-Based Computing
Kenneth Baclawski
Northeastern University and Jarg
The Onslaught
Increasingly large amounts of information are
becoming accessible electronically.
The information sources are increasingly
complicated.
The diversity of types of information source
is also increasing.
Technologies are emerging to cope with this
onslaught: ontology-based computing.
Ontologies
Shared understanding within a community
of people
Declarative specification of entities and
their relationships with each other
Constraints and rules that permit reasoning
within the ontology
Behavior associated with stated or inferred
facts
Relational Database Schemas
Well-established technique for specifying
the structure of shared data, not for
communication between people or agents
Declarative specification but of tables, not
of entities and relationships
Some constraints are expressible but no
significant rules (such as inheritance)
No explicit behavior
Standard language is SQL.
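The points above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (chemical, reaction_input) are hypothetical and chosen only for illustration. The schema declares tables and a few constraints, but nothing in it expresses a rule such as inheritance or any behavior.

```python
import sqlite3

# In-memory database; the schema below is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE chemical (
    name    TEXT PRIMARY KEY,
    formula TEXT NOT NULL,
    weight  REAL CHECK (weight > 0)
);
CREATE TABLE reaction_input (
    reaction TEXT NOT NULL,
    chemical TEXT NOT NULL REFERENCES chemical(name)
);
""")
conn.execute("INSERT INTO chemical VALUES ('water', 'H2O', 18.015)")
conn.execute("INSERT INTO reaction_input VALUES ('hydrolysis', 'water')")


def try_insert(sql):
    """Run one statement and report whether a declared constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# A dangling reference and a non-positive weight both violate declared constraints.
print(try_insert("INSERT INTO reaction_input VALUES ('oxidation', 'unobtainium')"))
print(try_insert("INSERT INTO chemical VALUES ('bad', 'X', -1)"))
```

Both inserts are rejected by the declared constraints. A statement such as "every enzyme is also a chemical" has no direct counterpart in this schema and would have to be encoded by convention in the tables.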
Object-Oriented Schemas
Emerging technology for communication
between software components
Declarative specifications
Constraints and some rules
Several ways to specify behavior
The Unified Modeling Language (UML) is
the standard OO modeling language.
[Figure: UML class diagram. Classes: Pathway (name : string), Reaction (description : string), Chemical (name : string, formula : string, weight : number), Enzyme (sequence : string). Associations: consists of, input, output, catalyzed by. Multiplicities shown: 1..1, 0..*, 1..*, 2..*, 0..1.]
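As a rough illustration of how an object-oriented schema like the one in this diagram carries declarative structure plus some constraints, here is a sketch using Python dataclasses. The class and attribute names come from the diagram. The assignment of multiplicities to associations is an assumption, since the diagram layout did not survive: the sketch assumes a pathway consists of at least two reactions, a reaction has at least one input and one output, and a reaction is catalyzed by at most one enzyme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chemical:
    name: str
    formula: str
    weight: float


@dataclass
class Enzyme:
    sequence: str


@dataclass
class Reaction:
    description: str
    inputs: List[Chemical]                  # assumed multiplicity 1..*
    outputs: List[Chemical]                 # assumed multiplicity 1..*
    catalyzed_by: Optional[Enzyme] = None   # assumed multiplicity 0..1

    def __post_init__(self):
        if not self.inputs or not self.outputs:
            raise ValueError("a reaction needs at least one input and one output")


@dataclass
class Pathway:
    name: str
    reactions: List[Reaction]               # assumed multiplicity 2..*

    def __post_init__(self):
        if len(self.reactions) < 2:
            raise ValueError("a pathway consists of at least two reactions")


# Illustrative data only; the chemistry is a placeholder.
a = Chemical("A", "C6H12O6", 180.16)
b = Chemical("B", "C6H13O9P", 260.14)
c = Chemical("C", "C3H4O3", 88.06)
step1 = Reaction("A to B", [a], [b], Enzyme("MKT"))
step2 = Reaction("B to C", [b], [c])
pathway = Pathway("example", [step1, step2])
print(len(pathway.reactions))
```

The multiplicity checks live in code here, which is one of the "several ways to specify behavior" available in an object-oriented setting. Whether Enzyme should also be a kind of Chemical cannot be recovered from the flattened diagram, so the sketch keeps the two classes separate.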
Logic
Very expressive but very difficult to use.
Not designed for communication.
Most logical languages are not based on
entities and relationships.
Very powerful inferencing capabilities.
Do not usually have any associated
behavior.
Many examples: Prolog, KIF, Slang, ...
XML DTDs and XML Schema
Defines a hierarchical document type.
XML Schema defines data types. Designed
for communication over the Web.
Good support for entities and hierarchical
relationships; awkward for others.
Constraints can be imposed on the
hierarchical structure and on data types.
Behavior can be specified procedurally.
Knowledge Representations
Very well developed branch of AI. Many
tools, but mostly academic. Not yet used
for communication over the Web.
Powerful language for specifying entities
and their relationships.
Most are linked with inference engines.
Behavior is typically handled in an ad hoc
manner.
RDF and DAML
Resource Description Framework (RDF) is
a knowledge representation language
represented in XML. It is a WWW
Consortium Recommendation.
The DARPA Agent Markup Language
(DAML) is an extension of RDF to serve as
the basis for ontology-based computing
over the Web: the Semantic Web.
Ontological Reasoning in RDF
[Figure: RDF graph. Nodes: Property, Class, Person, Fish, Wendy, Wanda, owns. Edge labels: type, domain, range, owns.]
Type constraint violation: The range of owns is Fish.
OR There is no inconsistency: Wanda is a fish!
Mermaid?
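The two readings of this example can be sketched in plain Python over a set of triples. The exact edges are an assumption reconstructed from the flattened diagram: the sketch assumes that Wendy and Wanda are both typed as Person, that owns has domain Person and range Fish, and that Wendy owns Wanda. The function names are hypothetical.

```python
TYPE, DOMAIN, RANGE = "type", "domain", "range"

# Assumed reconstruction of the diagram as (subject, predicate, object) triples.
triples = {
    ("Person", TYPE, "Class"),
    ("Fish", TYPE, "Class"),
    ("owns", TYPE, "Property"),
    ("owns", DOMAIN, "Person"),
    ("owns", RANGE, "Fish"),
    ("Wendy", TYPE, "Person"),
    ("Wanda", TYPE, "Person"),
    ("Wendy", "owns", "Wanda"),
}


def range_violations(graph):
    """Constraint reading: flag objects not already typed with the range class."""
    ranges = [(s, o) for s, p, o in graph if p == RANGE]
    found = []
    for prop, cls in ranges:
        for s, p, o in graph:
            if p == prop and (o, TYPE, cls) not in graph:
                found.append((s, p, o, cls))
    return sorted(found)


def infer_range_types(graph):
    """Inference reading: conclude that each object has the range class."""
    inferred = set(graph)
    for prop, cls in [(s, o) for s, p, o in graph if p == RANGE]:
        for s, p, o in graph:
            if p == prop:
                inferred.add((o, TYPE, cls))
    return inferred


print(range_violations(triples))                  # constraint reading
closure = infer_range_types(triples)
print(("Wanda", TYPE, "Fish") in closure)         # inference reading
print(range_violations(closure))
```

Under the constraint reading the statement that Wendy owns Wanda is reported as a violation. Under the inference reading Wanda simply acquires the type Fish in addition to Person, and no violation remains.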
DAML
[Figure: DAML graph. Nodes: Student, College, Engineering, Arts & Sciences, George, majors, Restriction, Property, Class, and the value 1. Edge labels: type, domain, range, subClassOf, onProperty, maxCardinality, majors, equivalentTo.]
Cardinality constraint violation: George can’t have two majors
OR There is no inconsistency: Engineering = Arts & Sciences
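The same ambiguity can be sketched for the cardinality case. The facts below are an assumed reading of the flattened diagram: George is a Student, Student is restricted to at most one value of majors, and George majors in both Engineering and Arts & Sciences. The function name and the unique_names flag are hypothetical. The flag selects whether distinct names are taken to denote distinct things.

```python
from itertools import combinations

# Assumed reconstruction of the diagram's facts.
facts = {
    ("George", "type", "Student"),
    ("George", "majors", "Engineering"),
    ("George", "majors", "Arts & Sciences"),
}

# (class, property) pairs restricted to maxCardinality 1.
max_one = {("Student", "majors")}


def check_max_one(graph, restrictions, unique_names):
    """Report violations, or the equivalences needed to avoid them."""
    report = []
    for cls, prop in sorted(restrictions):
        members = sorted(s for s, p, o in graph if p == "type" and o == cls)
        for who in members:
            values = sorted(o for s, p, o in graph if s == who and p == prop)
            if len(values) <= 1:
                continue
            if unique_names:
                # Different names mean different things: too many values.
                report.append(("violation", who, prop, tuple(values)))
            else:
                # Names may co-refer: the values must denote the same thing.
                for first, second in combinations(values, 2):
                    report.append(("equivalentTo", first, second))
    return report


print(check_max_one(facts, max_one, unique_names=True))
print(check_max_one(facts, max_one, unique_names=False))
```

With unique names the checker reports a cardinality violation for George. Without that assumption it instead concludes that Engineering and Arts & Sciences are equivalent, which is logically consistent but probably not what the author of the data intended.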
Representing information
Relational database: records
OO database: objects and links
Logic: facts
XML: documents
Knowledge Representations: annotations
All of these are graph structures: entities
related to other entities by relationships.
Where is the meaning?
Databases: select-project-join queries
Logic: rules determined by unification
XML: XSLT patterns
Knowledge Representations: templates
All of these are forms of graph matching.
The units of meaning are small connected
subgraphs that I call motifs.
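Graph matching against a motif can be sketched as a small backtracking matcher over triples. This is a minimal illustration, not any particular query engine: the graph, the "?" convention for variables, and the function names are all assumptions made for the example.

```python
def is_var(term):
    """Terms starting with '?' are variables in this sketch."""
    return term.startswith("?")


def match(pattern, graph, binding=None):
    """Return every variable binding under which all pattern triples occur in graph."""
    binding = binding or {}
    if not pattern:
        return [binding]
    first, rest = pattern[0], pattern[1:]
    results = []
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for pat, val in zip(first, triple):
            if is_var(pat):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pat, val) != val:
                    ok = False
                    break
            elif pat != val:
                ok = False
                break
        if ok:
            results.extend(match(rest, graph, candidate))
    return results


# Illustrative graph as (subject, predicate, object) triples.
graph = [
    ("Wendy", "type", "Person"),
    ("George", "type", "Person"),
    ("Wendy", "owns", "Wanda"),
    ("Wanda", "type", "Fish"),
]

# A motif: a small connected subgraph, here "a person who owns a fish".
motif = [
    ("?p", "type", "Person"),
    ("?p", "owns", "?x"),
    ("?x", "type", "Fish"),
]

print(match(motif, graph))
```

The shared variables ?p and ?x play the role of join conditions in a select-project-join query, and binding them consistently is a restricted form of unification.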
Ontology Infrastructure
Simply introducing a language is not enough.
There must be an infrastructure to support
ontology-based computing, including:
Ontology development tools
Content creation systems
Storage and retrieval systems
Ontology reasoning, mediation, ...
Integration with applications
Ontology Development
Ontologies can be developed using
graphical tools specifically for ontologies or
by adapting existing tools such as CASE
tools.
Testing ontologies is not easy because they
include constraints and inference rules.
Ontology testing is analogous to type
checking in programming languages.
Content Creation
Databases: Data warehousing technology
Text: Natural Language Processing (NLP)
Image processing
Direct creation of content
No matter how the content is created, it must
be tested using consistency checking.
Storage and Retrieval
Scaling up will require high-performance,
distributed storage and indexing technology.
The natural units for indexing are the motifs
(precomputed joins), but the number of
motifs is large.
Jarg Corporation has developed a scalable,
high-performance indexing technology for
ontology-based knowledge representations.
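To make the idea of indexing motifs as precomputed joins concrete, here is a toy sketch that indexes every two-edge path in a triple graph under its pair of edge labels. It is not a description of Jarg's technology, whose design is not given here. The function name, the choice of two-edge paths as the motif shape, and the sample data are all assumptions.

```python
from collections import defaultdict


def build_motif_index(graph):
    """Index each two-edge path a -p-> b -q-> c under the key (p, q)."""
    by_subject = defaultdict(list)
    for s, p, o in graph:
        by_subject[s].append((p, o))
    index = defaultdict(set)
    for s, p, o in graph:
        # Join this edge with every edge that starts where it ends.
        for q, o2 in by_subject.get(o, []):
            index[(p, q)].add((s, o, o2))
    return index


# Illustrative graph as (subject, predicate, object) triples.
graph = [
    ("Wendy", "owns", "Wanda"),
    ("Wanda", "type", "Fish"),
    ("George", "majors", "Engineering"),
    ("Engineering", "type", "College"),
]

index = build_motif_index(graph)

# The join "owns followed by type" is answered by one lookup instead of a scan.
print(index[("owns", "type")])
print(len(index))
```

Even this toy version shows the cost: the index holds an entry for every joinable pair of edges, so the number of stored motifs can grow much faster than the number of triples, which is why scalability is the central concern.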
Jarg Architecture
[Figure: architecture diagram. Document side: Document, NLP, Knowledge Representation, fragmentation, Knowledge Fragments. Query side: Query, NLP, Knowledge Representation, fragmentation, Knowledge Motifs. Central component: Distributed Index Engine. Output: Matching Documents.]
Conclusion
Ontology-based computing is emerging as a
natural evolution of existing technologies to
cope with the information onslaught.
Ontology-based technology must be
scalable if it is to contribute to the solution
rather than add to the problem.
Consistency checking is important for the
development of ontologies and content.
Bibliography
Semantic Web: www.w3.org/2001/sw
Ontologies: www.ontology.org
Unified Modeling Language: www.omg.org/uml
Knowledge Interchange Format: logic.stanford.edu/kif
Specware and Slang: www.kestrel.edu
XML and XML Schema: www.w3.org/xml
RDF and RDFS: www.w3.org/rdf
DAML: www.daml.org
Notation 3: www.w3.org/DesignIssues/Notation3.html
Consistency checking: vis.home.mindspring.com
Jarg Knowledge Engine: www.jarg.com