FormalModelling

Download Report

Transcript FormalModelling

Formal Modelling
Alan Rector
O p e n G A LE N
BioHealth
Informatics
Group
1
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
When to use a classifier
1. At author time: As a compiler – Pre-coordination
► Ontologies will be delivered as “pre-coordinated” ontologies to be
used without a reasoner
► To make extensions and additions quick, easy, and responsive,
distri but developments, empower users to make changes
► Part of an ontology life cycle
2. At delivery time: As a service: - Post-coordination
► Many fixed ontologies are too big and too small
► Too big to find things; too small to contain what you need
► Create them on the fly
► Part of an ontology service
3. At application time: as a reasoner - Postcoordination & inference
► Decision support, query optimisation, schema integration, …, …, …
O p e n G A LE N
► Part of a reasoning service
BioHealth
Informatics
Group
2
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
When to use a classifier 1:
Pre-coordinated delivery
classifier as compiler
► The life cycle
► Gather requirements, sketch, experiment
► Establish patterns – design a “language”
► Criteria for success: What a subject domain expert can learn in a few days
► Bulk authoring
► Classification
Development & evolution
► Quality assurance
► Commit classifier results to a pre-coordinated ontology & deliver
► Polyhierarchies (Protégé, DAG-Edit, OWL-Lite, RDF(S), Topic Maps, …
► Query and use with you favourite tool
O p e n G A LE N
BioHealth
Informatics
Group
3
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Commit Results to a PreCoordinated Ontology
Assert
(“Commit”)
changes
inferred by
classifier
O p e n G A LE N
BioHealth
Informatics
Group
4
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
When to use a classifier 2:
Post Coordination
► When the ontology too big – “Lazy classification” on
demand
API
kernel
model
Externally available
resource
O p e n G A LE N
BioHealth
Informatics
Group
Big on the outside: small kernel on the inside
5
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Often combined with other
services:
Example - the GALEN Server
Client
Ontology Services
Multilingual
Module
Multilingual Lexicons
Service API
Client
Applications
Users
Ontology
Module
Ontologies & Classifier
External Resources &
“Coding” Module
Resource References
Run time classifier
O p e n G A LE N
BioHealth
Informatics
Group
6
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
The problem
Knowledge is Big, Fractal & Changeable!
But we want to
► Build ontologies cooperatively with different groups
► Extend ontologies smoothly
► Re-use pieces of ontologies
► Build new ontologies on top of old
► Quit starting from scratch
O p e n G A LE N
BioHealth
Informatics
Group
7
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Assertion:
The arrival of logic-based
ontologies/OWL gives new opportunities
to make ontologies more manable and
modular
► Let the ontology authors
► create discrete modules
► describe the links between modules
► Let the logic reasoner
► Organise the result
O p e n G A LE N
BioHealth
Informatics
Group
8
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Logic-based Ontologies:
Conceptual Lego
gene
hand
protein
extremity
cell
body
expression
chronic
Lung
inflammation
infection
acute
abnormal
normal
bacterial
deletion
polymorphism
ischaemic
O p e n G A LE N
BioHealth
Informatics
Group
9
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Logic-based Ontologies:
Conceptual Lego
“SNPolymorphism of CFTRGene causing Defect in MembraneTransport of
Chloride Ion causing Increase in Viscosity of Mucus in CysticFibrosis…”
“Hand which is
anatomically
normal”
O p e n G A LE N
BioHealth
Informatics
Group
10
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Species
Genes
Protein
Function
Gene in humans
Disease
Protein coded by
gene in humans
O p e n G A LE N
BioHealth
Informatics
Group
Logical Constructs
build complex
concepts from
modularised
primitives
11
Function of
Protein coded by
gene in humans
Disease caused by abnormality in
Function of
Protein coded by
gene in humans
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalising (untangling)
Ontologies
Structure
Structure
Part-whole
Function
Function
Part-whole
O p e n G A LE N
BioHealth
Informatics
Group
12
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Rationale for Normalisation
► Explicit distinctions between modules
► Primitives are opaque to the reasoner
►Information implicit in primitive names cannot contribute to modularisation
► Primitives are indivisible to both human and reasoner
►Each primitive should represent a single notion
► Therefore, each primitive must belong to exactly one module
►If a primitive belongs to two modules, they are not modular.
►If a primitive belongs to two modules, it probably conflates two notions
► Therefore concentrate on the “primitive skeleton” of the domain
ontology
O p e n G A LE N
BioHealth
Informatics
Group
13
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation and
Untangling
Let the reasoner do multiple classification
► Tree
► Everything has just one parent
►A ‘strict hierarchy’
► Directed Acyclic Graph (DAG)
► Things can have multiple parents
►A ‘Polyhierarchy’
► Normalisation
► Separate primitives into disjoint trees
► Link the trees with restrictions
►Fill in the values
O p e n G A LE N
BioHealth
Informatics
Group
14
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Tables are easier to manage
than DAGs /
Polyhierarchies
…and get the benefit of inference:
Grass and Leafy_plants are both kinds of
Plant
O p e n G A LE N
BioHealth
Informatics
Group
15
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Remember to add any
closure axioms
Closure
Axiom
Then let the reasoner do the work
O p e n G A LE N
BioHealth
Informatics
Group
16
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation:
From Trees to DAGs

Before classification

A tree

After classification

A DAG

Directed Acyclic Graph
O p e n G A LE N
BioHealth
Informatics
Group
17
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation: Criterion 1
The skeleton should consist
of disjoint trees
► Every primitive concept should have exactly one
primitive parent
► All multiple hierarchies the result of inference by reasoner
O p e n G A LE N
BioHealth
Informatics
Group
18
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation Criterion 2:
No hidden changes of meaning
►Each branch should be homogeneous and
logical
(“Aristotelian”)
► Hierarchical principle should be subsumption
►Otherwise we are “lying to the logic”
► The criteria for differentiation should follow consistent principles in
each branch
eg. structure XOR function XOR cause
O p e n G A LE N
BioHealth
Informatics
Group
19
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
A Non-homogeneous
taxonomy
“On those remote pages it is written that animals are divided into:
O p e n G A LE N
a. those that belong to the Emperor
b. embalmed ones
c. those that are trained
d. suckling pigs
e. mermaids
f. fabulous ones
g. stray dogs
h. those that are included in this classification
i. those that tremble as if they were mad
j. innumerable ones
k. those drawn with a very fine camel's hair brush
l. others
m. those that have just broken a flower vase
n. those that resemble flies from a distance"
From The Celestial Emporium of Benevolent Knowledge, Borges
BioHealth
Informatics
Group
20
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation Criterion 3
Distinguish “Self-standing” and
“Refining” Concepts
►Self-standing concepts
► Roughly Welty & Guarino’s “sortals”
►person, idea, plant, committee, belief,…
► Refining concepts – depend on self-standing
concepts
►mild|moderate|severe, hot|cold, left|right,…
► Roughly Welty & Guarino’s non-sortals
► Closely related to Smith’s “fiat partitions”
► Usefully thought of as Value Types by engineers
► For us an engineering distinction…
O p e n G A LE N
BioHealth
Informatics
Group
21
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation Criterion 3a
Self-standing primitives should be
globally disjoint & open
► Primitives are atomic
► If primitives overlap, the overlap conceals implicit information
► A list of self-standing primitives can never be
guaranteed complete
► How many kinds of person? of plant? of committee? of belief?
► Can’t infer:
Parent & ¬sub1 &…& ¬subn-1  subn
► Heuristic:
► Diagnosis by exclusion about self-standing concepts should NOT
be part of ‘standard’ ontological reasoning
O p e n G A LE N
BioHealth
Informatics
Group
22
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation Criterion 3b
Refining primitives should be locally
disjoint & closed
► Individual
values must be disjoint
► but can be hierarchical
►e.g. “very hot”, “moderately severe”
► Each list can be guaranteed to be complete
► Can infer Parent & ¬sub1 &…& ¬subn-1  subn
► Value types themselves need not be disjoint
► “being hot” is not disjoint from “being severe”
►Allowing Valuetypes to overlap is a useful trick, e.g.
►restriction has_state someValuesFrom (severe and hot)
O p e n G A LE N
BioHealth
Informatics
Group
23
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Normalisation Criterion 4
Axioms
► No axiom should denormalise the ontology
No axiom should imply that a primitive is part of
more than one branch of primitive skeleton
► If all primitives are disjoint, any such axioms will make that
primitive unsatisfiable
►A partial test for normalisation:
►Create random conjunctions of primitives which do not subsume
each other.
►If any are satisfiable, the ontology is not normalised
O p e n G A LE N
BioHealth
Informatics
Group
24
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Consequences 1
► All self-standing primitives are disjoint
► All multiple classification is inferred
► For any two primitive self-standing classes, either
one subsumes the other or they are disjoint
► Every self standing concept is part of exactly one
primitive branch of the skeleton
► Every self-standing concept has exactly one most
specific primitive ancestor
O p e n G A LE N
BioHealth
Informatics
Group
25
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Consequences 2
► Primitives introduced by a conjunction of one class and
a boolean combination of zero or more restrictions
► Tree subclass-of Plant and
restriction isMadeOf someValuesFrom Wood
► Resort subclass-of Accommodation
restriction isIntendedFor someValueFrom Holidays
O p e n G A LE N
BioHealth
Informatics
Group
26
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Consequences 3
►Use of axioms limited (outside of skeleton
construction)
The following are a safe but not exhaustive
guide:
►The right side of subclass axioms limited to
restrictions
►Both sides of disjointness axioms limited to
restrictions
►No equivalence axioms with primitives on either
side
O p e n G A LE N
BioHealth
Informatics
Group
27
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
A real example:
Build a simple treee
easy to maintain
O p e n G A LE N
BioHealth
Informatics
Group
28
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Let the classifier organise it
O p e n G A LE N
BioHealth
Informatics
Group
29
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
If you want more
abstractions,
just add new definitions
(re-use existing data)
O p e n G A LE N
BioHealth
Informatics
Group
“Diseases linked to abnormal
proteins”
30
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
And let the classifier work
again
O p e n G A LE N
BioHealth
Informatics
Group
31
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
And again – even for a quite
different category
“Diseases linked genes
described in the mouse”
O p e n G A LE N
BioHealth
Informatics
Group
32
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
And let classifier check
consistency
(My first try wasn’t)
O p e n G A LE N
BioHealth
Informatics
Group
33
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005
Summary
Why use a Classifier?
► To compose concepts
► Allow conceptual lego
► To manage polyhierarchies
► Adding abstractions (“axes”) as needed
► Normalisation
► Untangling
► labelling of “kinds of is-a”
► To avoid combinatorial explosions
► Keep bicycles from exploding
► To manage context
► Cross species, Cross disciplines, Cross studies
► To check consistency and help users find errors
O p e n G A LE N
BioHealth
Informatics
Group
34
Ontology tutorial, © 2005 Univ. of Manchester
Feb 2005