UncertML - Describing and communicating uncertainty within the


UNCERTML - DESCRIBING AND COMMUNICATING UNCERTAINTY WITHIN THE (SEMANTIC) WEB
Matthew Williams
OVERVIEW
• Introduction.
• Motivation – the Semantic and Sensor Webs.
• UncertML overview & design choices.
• Use case – The INTAMAP project.
• Conclusions.
MOTIVATION
The semantic and sensor webs
THE SEMANTIC WEB
• Most Web content today is designed for humans to read, not computers.
• The Semantic Web will bring structure to the meaningful content of Web pages.
• Adding logic to the Web allows rules to be used for inference.
• Ontologies are used to describe entities and the relations between entities.
HOW UNCERTAINTY IS USED WITHIN THE SEMANTIC WEB
• PR-OWL: a Bayesian Ontology Language for the Semantic Web:
  - Extends OWL to allow probabilistic knowledge to be represented in an ontology.
  - Used for reasoning with Bayesian inference.
  - Random variables are described either by a PR-OWL table (discrete probability) or by a proprietary format that is NOT freely available.
• Other standards looking at similar concepts:
  - BayesOWL.
  - FuzzyOWL.
THE SENSOR WEB
SENSOR WEB ENABLEMENT (SWE)
• Open Geospatial Consortium (OGC) initiative
  - Interoperability interfaces and metadata encodings.
  - Real time integration of heterogeneous sensor webs into the information infrastructure.
• Current SWE standards
  - Observations & Measurements
  - SensorML
  - SWE Common
• No formal standard for quantifying uncertainty

<Quantity id="elevationAngle" fixed="false"
          definition="urn:ogc:def:scanElevationAngle">
  <uom xlink:href="urn:ogc:unit:degree"/>
  <quality>
    <Tolerance definition="urn:ogc:def:tolerance2std">
      <value>-0.02 0.02</value>
    </Tolerance>
  </quality>
  <value>25.3</value>
</Quantity>
WHAT IS MISSING?
• A formal open standard for quantifying complex uncertainties:
  - Distributions.
  - Statistics.
  - Realisations.

UNCERTML
I’ve done it!!
OVERVIEW

Split into three distinct packages (distributions,
statistics & realisations).
STATISTICS
<un:Statistic definition="http://dictionary.uncertml.org/statistics/standard_deviation">
  <un:value>12.08</un:value>
</un:Statistic>
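As a rough illustration of how a client might consume the Statistic element above, the following Python sketch reads the definition URI and the value with the standard library. The namespace URI bound to the `un` prefix is borrowed from the dictionary example elsewhere in this deck and is an assumption here, as is the helper name `read_statistic`.

```python
import xml.etree.ElementTree as ET

# Assumption: the "un" prefix is bound to the namespace used in the
# dictionary example of this deck.
UN_NS = "http://www.intamap.org/uncertml"

STATISTIC_XML = f"""
<un:Statistic xmlns:un="{UN_NS}"
    definition="http://dictionary.uncertml.org/statistics/standard_deviation">
  <un:value>12.08</un:value>
</un:Statistic>
"""

def read_statistic(xml_text):
    """Return (name, value); name is the last path segment of the definition URI."""
    root = ET.fromstring(xml_text)
    name = root.get("definition").rsplit("/", 1)[-1]
    value = float(root.find(f"{{{UN_NS}}}value").text)
    return name, value

name, value = read_statistic(STATISTIC_XML)
print(name, value)
```

The element itself carries no semantics; the meaning of the number comes entirely from the URI in the definition attribute.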
DISTRIBUTIONS
<un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
  <un:parameters>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean">
      <un:value>34.564</un:value>
    </un:Parameter>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance">
      <un:value>67.45</un:value>
    </un:Parameter>
  </un:parameters>
</un:Distribution>
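One way to see what the Distribution encoding buys a consumer is to turn it back into a usable probability distribution. The sketch below, which assumes the same `un` namespace URI as the dictionary example and uses a hypothetical helper `read_parameters`, builds a Python `statistics.NormalDist` from the mean and variance and derives a 95% interval from it.

```python
import math
import xml.etree.ElementTree as ET
from statistics import NormalDist

UN_NS = "http://www.intamap.org/uncertml"  # assumed namespace URI
BASE = "http://dictionary.uncertml.org/distributions/gaussian"

DISTRIBUTION_XML = f"""
<un:Distribution xmlns:un="{UN_NS}" definition="{BASE}">
  <un:parameters>
    <un:Parameter definition="{BASE}/mean"><un:value>34.564</un:value></un:Parameter>
    <un:Parameter definition="{BASE}/variance"><un:value>67.45</un:value></un:Parameter>
  </un:parameters>
</un:Distribution>
"""

def read_parameters(xml_text):
    """Map each parameter's short name (last URI segment) to its value."""
    root = ET.fromstring(xml_text)
    params = {}
    for parameter in root.iter(f"{{{UN_NS}}}Parameter"):
        key = parameter.get("definition").rsplit("/", 1)[-1]
        params[key] = float(parameter.find(f"{{{UN_NS}}}value").text)
    return params

params = read_parameters(DISTRIBUTION_XML)
# NormalDist expects a standard deviation, so take the square root of the variance.
gaussian = NormalDist(mu=params["mean"], sigma=math.sqrt(params["variance"]))
lower, upper = gaussian.inv_cdf(0.025), gaussian.inv_cdf(0.975)
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```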
REALISATIONS
<un:Realisations
    definition="http://dictionary.uncertml.org/realisation"
    samplingMethod="http://dictionary.uncertml.org/realisations/sampling_methods/MCMC"
    realisedFrom="http://dictionary.uncertml.org/distributions/gaussian">
  <un:realisationsCount>100</un:realisationsCount>
  <un:elementCount>100</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values>
    <!-- [100 space separated values] -->
  </swe:values>
</un:Realisations>
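Realisations are only useful once they are decoded, so a minimal Python sketch of that step may help. It splits a values string on the block separator declared in the TextBlock and summarises the samples. The four sample values and the function name `decode_realisations` are hypothetical, since the slide elides the real data.

```python
from statistics import mean, stdev

def decode_realisations(text, block_separator=" ", decimal_separator="."):
    """Turn a block-separated string of realisations into a list of floats."""
    blocks = [b for b in text.strip().split(block_separator) if b]
    # Normalise the declared decimal separator to "." before parsing.
    return [float(b.replace(decimal_separator, ".")) for b in blocks]

# Hypothetical sample values; the slide does not show the real ones.
samples = decode_realisations("34.1 36.0 33.5 35.2")
print(len(samples), mean(samples), stdev(samples))
```

A consumer that receives realisations rather than a parametric distribution can still recover summary statistics this way, at the cost of sampling error.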
UNCERTML
Difficult decisions and design principles
WEAK VS. STRONG
Weak-typed
• Benefits
  - Generic features have generic properties – extensible
• Drawbacks
  - Validation becomes less meaningful

<Feature type="Road">
  <property name="description" type="string">...</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>

Strong-typed
• Benefits
  - Produces relatively simple XML features
• Drawbacks
  - Not easily extended – all domain features must be known a priori

<Road>
  <description>...</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>
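The trade-off between the two designs shows up clearly in client code. In the Python sketch below (function names are illustrative, not part of any UncertML tooling), the weak-typed reader handles any feature type without changes, while the strong-typed reader has to know the Road element and its children in advance.

```python
import xml.etree.ElementTree as ET

WEAK = """
<Feature type="Road">
  <property name="description" type="string">...</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>
"""

STRONG = """
<Road>
  <description>...</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>
"""

def read_weak(xml_text):
    """Generic reader: no prior knowledge of the feature type is needed."""
    root = ET.fromstring(xml_text)
    properties = {p.get("name"): p.text for p in root.findall("property")}
    return root.get("type"), properties

def read_road(xml_text):
    """Strong-typed reader: element names are hard-coded for one feature type."""
    root = ET.fromstring(xml_text)
    if root.tag != "Road":
        raise ValueError(f"unsupported feature type: {root.tag}")
    return root.tag, {
        "description": root.findtext("description"),
        "surfaceTreatment": root.findtext("surfaceTreatment"),
    }

print(read_weak(WEAK))
print(read_road(STRONG))
```

Both readers return the same information, but only the strong-typed one can reject an unknown feature at parse time, which is the sense in which weak typing makes validation less meaningful.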
THE UNCERTML DICTIONARY
• Weak-typed designs rely on dictionaries.
• Includes definitions of key distributions & statistics.
• URIs link to dictionary entry and provide semantics.
• Could be written in Semantic Web standards (OWL, RDF etc).
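To make the role of the dictionary concrete, here is a small Python sketch in which definition URIs are resolved against an in-memory table. The table contents, the parameter lists, and the function names are all assumptions for illustration; a real client would fetch and parse the published dictionary instead.

```python
GAUSSIAN = "http://dictionary.uncertml.org/distributions/gaussian"

# Hypothetical in-memory stand-in for the published dictionary.
DICTIONARY = {
    GAUSSIAN: {
        "names": ["Gaussian", "Normal"],
        "parameters": ["mean", "variance"],
    },
}

def describe(definition_uri):
    """Look up the dictionary entry that gives a definition URI its meaning."""
    entry = DICTIONARY.get(definition_uri)
    if entry is None:
        raise KeyError(f"no dictionary entry for {definition_uri}")
    return entry

def missing_parameters(definition_uri, supplied):
    """List the parameters the dictionary expects but the instance omits."""
    expected = describe(definition_uri)["parameters"]
    return [name for name in expected if name not in supplied]

print(describe(GAUSSIAN)["names"])
print(missing_parameters(GAUSSIAN, {"mean": 0.0}))
```

Because a generic schema cannot say which parameters a Gaussian needs, checks of this kind move from the XML validator to the dictionary lookup.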
UNCERTML – DICTIONARY EXAMPLE
<gml:Dictionary xmlns:gml="http://www.opengis.net/gml" gml:id="DISTRIBUTIONS">
  <gml:name>All Probability Distributions</gml:name>
  <gml:description>This is a dictionary...</gml:description>
  <gml:dictionaryEntry>
    <un:DistributionDefinition xmlns:un="http://www.intamap.org/uncertml"
                               gml:id="Gaussian">
      <gml:description>This is a Gaussian distribution</gml:description>
      <gml:name>Gaussian</gml:name>
      <gml:name>Normal</gml:name>
      <un:functions>
        <un:FunctionDefinition gml:id="Gaussian_Cumulative_Distribution_Function">
          <gml:description>This is a cumulative distribution function</gml:description>
          <gml:name>Cumulative Distribution Function</gml:name>
          <un:mathML>
            <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
              <mml:mfrac>
                <mml:mn>1</mml:mn>
                <mml:mn>2</mml:mn>
              </mml:mfrac>
SEPARATION OF CONCERNS
• Several competing standards already exist that address units and location.
• Geospatial information is not always relevant, for example in systems biology.
• Do what we know – do it well!
UNCERTML WITHIN THE SEMANTIC WEB
• Proprietary software can impede interoperability, which is detrimental to the Semantic Web.
• Discrete probability tables can only provide so much information.
• Provide an open standard for describing the complex probability distributions that are currently lacking within PR-OWL.
UNCERTML WITHIN THE SENSOR WEB
• resultQuality of an O&M Observation.
  - Encode sensor bias and other inherent uncertainties of a sensor observation.
• Quality property of SWE types.
  - Effectively provides a ‘Random Variable’ type.
• Positional uncertainty within GML.
  - Extending GML would allow UncertML to integrate with the geometry types to provide positional uncertainty information.
UNCERTML
Does it actually work??
THE INTAMAP PROJECT
• An automatic, interoperable service providing real time interpolation between observations.
• EURDEP providing radiological data as a case study.
• Provide real time predictions to aid risk management through a Web Processing Service interface.
<om:Observation>
  <om:procedure xlink:href="http://www.mydomain.com/sensor_models/temperature"/>
  <om:resultQuality>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/mean">
          <un:value>0.0</un:value>
        </un:Parameter>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/variance">
          <un:value>3.6</un:value>
        </un:Parameter>
      </un:parameters>
    </un:Distribution>
  </om:resultQuality>
  <om:observedProperty xlink:href="urn:x-ogc:def:phenomenon:OGC:AirTemperature"/>
  <om:featureOfInterest>
    <sa:SamplingPoint>
      <sa:sampledFeature xlink:href="http://www.mydomain.com/sampling_stations/ws-04231"/>
      <sa:position>
        <gml:Point>
          <gml:pos srsName="urn:x-ogc:def:crs:EPSG:4326">52.4773635864 -1.89538836479</gml:pos>
        </gml:Point>
      </sa:position>
    </sa:SamplingPoint>
  </om:featureOfInterest>
  <om:result xsi:type="gml:MeasureType" uom="urn:ogc:def:uom:OGC:degC">19.4</om:result>
</om:Observation>

<un:DistributionArray>
  <un:elementType>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean"/>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance"/>
      </un:parameters>
    </un:Distribution>
  </un:elementType>
  <un:elementCount>5</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values>
    35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24
  </swe:values>
</un:DistributionArray>
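The DistributionArray above declares the parameter order once (mean, then variance) and then packs five Gaussians into a single text block. A minimal Python sketch of the decoding step follows, assuming that each block holds one distribution and each token one parameter; the function name `decode_distribution_array` is illustrative.

```python
import math
from statistics import NormalDist

# Values and parameter order as shown in the DistributionArray.
VALUES = "35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24"
PARAMETERS = ["mean", "variance"]

def decode_distribution_array(values, parameter_names,
                              block_separator=" ", token_separator=","):
    """Map each block onto the parameter names declared in un:elementType."""
    records = []
    for block in values.strip().split(block_separator):
        tokens = [float(t) for t in block.split(token_separator)]
        if len(tokens) != len(parameter_names):
            raise ValueError(
                f"expected {len(parameter_names)} tokens, got {len(tokens)}")
        records.append(dict(zip(parameter_names, tokens)))
    return records

records = decode_distribution_array(VALUES, PARAMETERS)
# NormalDist takes a standard deviation, hence the square root.
gaussians = [NormalDist(r["mean"], math.sqrt(r["variance"])) for r in records]
print(len(gaussians), records[0])
```

Declaring the element type once and streaming only the numbers keeps large arrays compact compared with repeating a full Distribution element per observation.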
UNCERTML IN INTAMAP
• ‘Really clever’ Bayesian inference:
  - Different sensor errors.
  - Change of support.
  - Fast & approximate algorithms.
COMPARING PREDICTIONS WITH AND WITHOUT UNCERTML
[Figure: two prediction maps, labelled "Without UncertML" and "With UncertML"]
CONCLUSIONS
• There is currently no standard for describing uncertainty within the Semantic and Sensor Webs.
• UncertML provides an extensible, weak-typed design that can quantify uncertainty using:
  - Distributions.
  - Statistics.
  - Realisations.
• It provides more information for use in decision support systems, which is especially useful in risk management.