HCLSIG$$Meetings$$2009-11


W3C Incubator Group on Provenance
Yolanda Gil (chair)
Information Sciences Institute
and Department of Computer Science
University of Southern California
W3C XG Provenance
Yolanda Gil, www.isi.edu/~gil
Some Context:
W3C Incubator Activity (vs Working Group)

Fosters rapid development of new Web-related concepts
• Exploratory efforts in areas of interest
Lightweight process with W3C support
• Get input and pulse from a community
Duration is one year
Open to invited experts in the broader community, not only W3C members
May result in follow-on activities, groups, or standardization efforts
What is Provenance

Provenance: initial sources of information + entities + processes involved in producing a result
Some uses of provenance
• Making trust judgments when information sources are diverse and of varying quality (the Web!)
• Providing justifications for conclusions
• Establishing attribution
• Enabling repeatability and reproducibility of processes
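As a rough illustration of the definition above, provenance could be captured as a record that names the sources, entities, and processes behind a result. This is only a minimal sketch: the class name `ProvenanceRecord`, its fields, the two helper methods, and the sample values are all hypothetical and are not taken from the slides or from any W3C specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceRecord:
    """Hypothetical record mirroring the slide's definition of provenance."""
    result: str
    sources: List[str] = field(default_factory=list)    # initial sources of information
    entities: List[str] = field(default_factory=list)   # assumed here to be people or organizations
    processes: List[str] = field(default_factory=list)  # steps involved in producing the result

    def attribution(self) -> List[str]:
        """Supports the 'establishing attribution' use: who was involved."""
        return sorted(set(self.entities))

    def is_repeatable(self) -> bool:
        """Crude proxy for repeatability: both sources and processes are recorded."""
        return bool(self.sources) and bool(self.processes)


# Illustrative values only
record = ProvenanceRecord(
    result="summary-table",
    sources=["http://example.org/raw-data"],
    entities=["Lab A", "Curator B"],
    processes=["normalize", "aggregate"],
)
```

A consumer making a trust judgment might inspect `record.sources`, while `record.attribution()` and `record.is_repeatable()` loosely correspond to the attribution and repeatability uses listed above.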
The Need for Provenance is Ubiquitous

Business practice
• Manufacturing processes and providers of a given product
Cultural artifacts
• Origins, owners, processes
Licensing and attribution
• For a document/software that combines permissions and rights
Science applications
• How new results were obtained: from assumptions to conclusions and everything in between
Web search/use
• Making trust judgments on what web content to trust
Immediate Need for Provenance in the Semantic Web Activity

Web of trust
• Making trust judgments based on provenance
Reasoners
• Attribution of assertions from diverse sources
Social web
• Privacy and use policies of sensitive (personal) data
Social trust
• Attribution, authority, propagation
Linked data
• Use of conflicting data of varying degrees of quality
Life sciences and e-Science at large
• Method capture and reproducibility of scientific results
Major Issues in Provenance

What to record
• Depends on use
Granularity
• Finer grained recording has a cost in performance
Verification of provenance information
• “oh yeah button”
Scale
• Provenance information can be much larger than base data/assertions
Integration of provenance with base assertions
• Reification issues
Presentation to end user
• What information and how to make it accessible to users
Some Prior Research

Databases
• Aggregations of data, collections, streaming, queries
Knowledge representation and reasoning
• Justification and explanation of reasoning
Argumentation
• What is taken into account to make a judgment
Workflow Systems
• Computations leading to new data products
Information retrieval
• Question answering when documents are contradictory/complementary
Relevant Activities at W3C

Enabling technologies
• SPARQL Working Group
• RDB2RDF Working Group
• Web Security Activity
Provide requirements and use cases
• E-Government
• Semantic Web Health Care and Life Sciences Interest Group
• Web Security Activity
• Social Web Incubator Group
Goals of the Incubator Group
Provide state-of-the-art understanding and develop a roadmap for development and possible standardization
Articulate requirements for accessing and reasoning about provenance information
• Develop use cases
Identify issues in provenance that are of direct concern to the Semantic Web
• Articulate relationships with other aspects of Web architecture
Report on state-of-the-art work on provenance
Report on a roadmap for provenance in the Semantic Web
• Identify starting points for provenance representations
• Identify elements of a provenance architecture that would benefit from standardization
Status

September 22, 2009 – September 21, 2010
Broadening participation
Weekly telecons (started Oct 30, 2009)
Currently developing a timeline for group activities
• Starting to gather use cases
Resources:
• http://www.w3.org/2005/Incubator/prov/
• http://www.w3.org/2005/Incubator/prov/wiki/

Eric’s pointer in IRC
Scott’s paper
“soft facts” from discourse statements
Confidence measures/uncertainty