Digital Antiquity CI Feb 7-8, 2013, Arlington VA

Semantics and analytics = making
the data and the decisions smarter?
Digital Antiquity CI
Feb 7-8, 2013, Arlington VA
Peter Fox (RPI and WHOI) [email protected], @taswegian,
http://tw.rpi.edu/web/person/PeterFox
Tetherless World Constellation http://tw.rpi.edu and AOP&E
Analytics – data and visual
Producers
Consumers
Experience
• Analytics
Data
Information
Knowledge
Ecosystem
• Stimulate
Innovation
Research
Exploration
Discovery
Creation
Gathering
Presentation
Organization
Integration
Conversation
Context
Data as Infostructure
Curation for analytics
Producers
Consumers
Quality Control
Quality Assessment
Fitness for Purpose
Fitness for Use
Trustee
Trustor
Others…
Others…
Technical advances
From: C. Borgman, 2008, NSF Cyberlearning Report
Working with knowledge
Rule execution
Expressivity
Implementability
Query
Maintainability/ Extensibility
Inference
For real discovery, we need abduction!
Abduction: a method of logical inference introduced by C. S. Peirce which comes prior to induction and deduction, for which the colloquial name is to have a "hunch"
Importantly, human intuition is needed in interacting with large-scale data
Yes, we need a Knowledge Base
What/when/why/how
Questions
Answer
Knowledge
provenance
Who
Domain specific terms/language
Descriptions of the artifacts
Smart visual exploration
Semantics - Modern informatics enables a new scale-free** framework approach
• Use cases
• Stakeholders
• Distributed authority
• Access control
• Ontologies
• Maintaining Identity
Finally
• Significant opportunities for smart data-as-a-service approaches to ‘scale’ for big data (on the web)
• Delivering ‘products’ allows analytics on the back end, but tools to plug into a framework are lacking
• Exploit late semantic binding for ABDUCTION
• Next generation analytics must accommodate: abduction, translucency, interactivity and retain what they do well!
• So we all need to get cracking!
• Thanks. @taswegian, [email protected]
Back shed
1: Integrating Multiple Data Sources
• The Semantic Web lets us merge statements from different sources
• The RDF Graph Model allows programs to use data uniformly regardless of the source
• Figuring out where to find such data is a motivator for Semantic Web Services
[Diagram labels: #Ionosphere, #magnetic, hasCoordinates, name, “Terrestrial Ionosphere”, hasLowerBoundaryV, “100”, hasLowerBoundaryUnit, “km”]
Different line & text colors represent different data sources
Fox & McGuinness Semantic Technologies
May 21, 2007
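A minimal sketch of the merge idea on this slide, assuming statements are held as plain Python tuples rather than in an RDF library. The property names are copied from the diagram as they appear; which source contributed which statement is an assumption, since the slide only distinguishes sources by color.

```python
# Each statement is a (subject, predicate, object) triple.
# The split between the two sources is assumed for illustration only.
source_a = {
    ("#Ionosphere", "name", "Terrestrial Ionosphere"),
    ("#Ionosphere", "hasCoordinates", "#magnetic"),
}
source_b = {
    ("#Ionosphere", "hasLowerBoundaryV", "100"),   # label as shown on the slide
    ("#Ionosphere", "hasLowerBoundaryUnit", "km"),
    # A statement that both sources happen to make merges without duplication.
    ("#Ionosphere", "name", "Terrestrial Ionosphere"),
}


def merge(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return {predicate: object} for one subject, regardless of source.

    Simplification: keeps a single object per predicate.
    """
    return {p: o for s, p, o in graph if s == subject}


merged = merge(source_a, source_b)
ionosphere = describe(merged, "#Ionosphere")
```

Because every source uses the same triple shape, `describe` never needs to know where a statement came from, which is the uniformity the second bullet refers to.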
2: Drill Down / Focused Perusal
• The Semantic Web uses Uniform Resource Identifiers (URIs) to name things
• These can typically be resolved to get more information about the resource
• This essentially creates a web of data analogous to the web of text created by the World Wide Web
• Ontologies are represented using the same structure as content
– We can resolve class and property URIs to learn about the ontology
[Diagram labels: …#NeutralTemperature, …#Norway, locatedI, measuredby, Internet, ...#ISR, ...#FPI, type, operatedby, …#EISCAT, ...#MillstoneHill]
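A rough sketch of drill-down by resolving names, assuming an in-memory dictionary stands in for resolving URIs over the network. The node and property names come from the diagram, but the edges between them and the `#Instrument` class are hypothetical; the slide does not show how the nodes connect.

```python
# Hypothetical "web of data": resolving a name returns the statements
# made about it. Edges and the #Instrument class are assumed.
WEB = {
    "#NeutralTemperature": [("measuredby", "#FPI")],
    "#FPI": [("type", "#Instrument"), ("operatedby", "#MillstoneHill")],
    "#ISR": [("type", "#Instrument"), ("operatedby", "#EISCAT")],
}


def resolve(uri):
    """Stand-in for dereferencing a URI: return (property, target) pairs."""
    return WEB.get(uri, [])


def drill_down(start, max_depth=3):
    """Breadth-first walk: resolve a name, then resolve what it points to."""
    seen = {start}
    frontier = [start]
    found = []
    for _ in range(max_depth):
        next_frontier = []
        for uri in frontier:
            for prop, target in resolve(uri):
                found.append((uri, prop, target))
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return found


statements = drill_down("#NeutralTemperature")
```

Starting from one parameter name, the walk reaches the instrument that measures it and then the site that operates it, without the caller knowing the graph's shape in advance.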
3: Statements about Statements
• The Semantic Web allows us to make statements about statements
– Timestamps
– Provenance / Lineage
– Authoritativeness / Probability / Uncertainty
– Security classification
– …
• This is an unsung virtue of the Semantic Web
[Diagram labels: #Danny’s, #Aurora, hasSource, hasDateTime, hascolor, 20031031, Red]
Ontologies Workshop, APL May 26, 2006
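One way to picture statements about statements, sketched in plain Python: a statement is a value, so another statement can take it as its subject. The reading of the diagram used here (the aurora's color is the base statement; source and date describe that statement) is an assumption, and the `Triple` type and `about` helper are illustrative names.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: object   # a name, or another Triple
    predicate: str
    obj: object


# Base statement (assumed reading of the diagram).
observation = Triple("#Aurora", "hascolor", "Red")

# Statements whose subject is itself a statement: provenance and timestamp.
annotations = [
    Triple(observation, "hasSource", "#Danny's"),
    Triple(observation, "hasDateTime", "20031031"),
]


def about(statement, triples):
    """Collect everything said about a given statement."""
    return {t.predicate: t.obj for t in triples if t.subject == statement}


metadata = about(observation, annotations)
```

The same pattern would carry the other kinds of annotation listed on the slide, such as uncertainty or security classification.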
8: Proof
• The logical foundations of the Semantic Web allow us to construct proofs that can be used to improve transparency, understanding, and trust
• Proof and Trust are ongoing research areas for the Semantic Web
[Diagram labels: hasCalibration, #Critical Dataset, #FlatField, hasPeerRevie, #Solar Physics, onPaper]
“Critical Dataset has been calibrated with a flat field program that is published in the peer reviewed literature.”
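A small sketch of what a proof could look like for the quoted sentence: the conclusion is returned together with the chain of facts that supports it. Only `hasCalibration` is taken from the diagram; the `describedIn` and `isPeerReviewed` properties, the node names, and the rule itself are hypothetical stand-ins.

```python
# Facts as triples. Property names other than hasCalibration are assumed.
FACTS = {
    ("#CriticalDataset", "hasCalibration", "#FlatField"),
    ("#FlatField", "describedIn", "#SolarPhysicsPaper"),
    ("#SolarPhysicsPaper", "isPeerReviewed", "true"),
}


def prove_trusted_calibration(dataset, facts):
    """Return the supporting chain of facts, or None if no proof exists.

    Rule (assumed): a dataset's calibration is trusted if the calibration
    program is described in a paper that is peer reviewed.
    """
    for s1, p1, program in facts:
        if s1 != dataset or p1 != "hasCalibration":
            continue
        for s2, p2, paper in facts:
            if s2 != program or p2 != "describedIn":
                continue
            step3 = (paper, "isPeerReviewed", "true")
            if step3 in facts:
                return [(s1, p1, program), (s2, p2, paper), step3]
    return None


proof = prove_trusted_calibration("#CriticalDataset", FACTS)
no_proof = prove_trusted_calibration("#OtherDataset", FACTS)
```

Returning the chain rather than a bare yes or no is what supports transparency: a reader can inspect each step that led to the conclusion.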
Knowledge representation
• Statements as triples: {subject-predicate-object}
interferometer is-a optical instrument
Fabry-Perot is-a interferometer
Optical instrument has focal length
Optical instrument is-a instrument
Instrument has instrument operating mode
Instrument has measured parameter
Instrument operating mode has measured parameter
NeutralTemperature is-a temperature
Temperature is-a parameter
• A query*: select all optical instruments which have operating
mode vertical
• An inference: infer operating modes for a Fabry-Perot
Interferometer which measures neutral temperature
• ISWC paper award 2006, IAAI best paper (2007), Fox et al.
2009 in Computers and Geosciences.
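The query and inference above can be sketched in a few lines of Python, assuming the is-a statements form a single-parent chain as listed. The schema terms are taken from the slide; the instance records in `MODES` (including "horizontal" and "NeutralWind") are hypothetical, since the slide gives no instance data.

```python
# is-a statements from the slide, stored child -> parent.
IS_A = {
    "Fabry-Perot": "interferometer",
    "interferometer": "optical instrument",
    "optical instrument": "instrument",
    "NeutralTemperature": "temperature",
    "temperature": "parameter",
}

# Hypothetical instance data:
# (instrument class, operating mode, measured parameter)
MODES = [
    ("Fabry-Perot", "vertical", "NeutralTemperature"),
    ("Fabry-Perot", "horizontal", "NeutralWind"),
]


def ancestors(term):
    """Follow is-a links upward, nearest ancestor first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(term, cls):
    """True if term equals cls or inherits from it through is-a links."""
    return term == cls or cls in ancestors(term)


def optical_instruments_with_mode(mode):
    """Query: optical instruments that have the given operating mode."""
    return sorted({i for i, m, _ in MODES
                   if m == mode and is_a(i, "optical instrument")})


def modes_measuring(instrument_cls, parameter_cls):
    """Inference: operating modes of an instrument class that measure a parameter class."""
    return sorted({m for i, m, p in MODES
                   if is_a(i, instrument_cls) and is_a(p, parameter_cls)})
```

For example, `optical_instruments_with_mode("vertical")` finds the Fabry-Perot only because the is-a chain establishes that it is an optical instrument; nothing in `MODES` says so directly.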
Visual discovery
Traversal for new patterns
However - Skill/ tools?
Summary
• Get the data well structured! Be aware of the distinctions between data, information, knowledge.
• Develop multi-domain KBs
• Use the standards, and tools that are available
• Get familiar with semantic technology but do not let it drive what you explore
And…
• Frameworks more than systems
• Leverage semantic methodologies that are shown to work / be useful
• Vocabulary development … by communities, leverage what you have and for the things that matter
• Exploit late semantic binding for ABDUCTION