Exploiting Diverse Sources of Scientific Data
Re-use or Re-invention - a Roadmap for Data Integration
27th-28th November 2006
Prof. Jessie Kennedy
e-SI Research Theme Leader
e-SI Theme:
Exploiting Diverse Sources of Scientific Data
e-SI Research Theme
Exploiting Diverse Sources of Scientific Data
The aim of the theme is to investigate some of the
issues in, and solutions to, exploiting diverse
sources of scientific data.
Theme Wiki
http://wiki.esi.ac.uk
Workshops hosted in theme:
Spatiotemporal Databases for Geosciences, Biomedical Sciences and Physical Sciences, eSI, 1-2 November 2005
Oracle Corporation and the e-Science Institute Seminar - Temporal Database in Depth: Time and the Data Warehouse, eSI, 3 November 2005
The Second Workshop on Scientific Data Mining, Integration and Visualization (SDMIV2), eSI, 14 December 2005
DIALOGUE meeting, 9-10 February 2006
Integrated Health Records - Practice and Technology, 9-10 March 2006
Taxonomic Databases Working Group (TDWG) Technical Architecture Group meeting, 11 April 2006
Taxonomic Databases Working Group (TDWG) Core Ontology, 16-18 May 2006
RDF, Ontologies and Meta-Data Workshop, 7-9 June 2006
Taxonomic Databases Working Group (TDWG) GUIDs-2, 10-12 June 2006
The Closed World of Databases Meets the Open World of the Semantic Web, 12-13 October 2006
Recurring Issues Focus
Architectures for Data Integration
What strategies are used for data integration?
workflow architectures
grid architectures
Globally Unique Identifiers
What gets a GUID? Who issues them? What technology?
Metadata, Terminologies and Ontologies needed…
Data discovery for sharing/analysis (integrating)
Understanding the content of data sets
Automatic transformations (semantic mediation)
Will these solve the problems?
Metadata and Ontologies
Issues
Standardisation of formats
Creation of metadata and ontologies
Manual and automatic
For whom and by whom?
Many or one ontology (granularity)?
How do we integrate them or map between them?
How do we choose which to use/reuse?
How do we know data is suitable for our purpose?
Re-use or Re-invention - a Roadmap for Data Integration
Investigate data integration issues in the neuroscience community
Data
Collection across sites and populations
Integration of different types of data
Integration of data at different levels of granularity
Ontologies
Harmonisation or Integration
Day 1 - General issues
integration across sites and populations
Day 2 - Special areas of interest
ontologies + information governance
Format
Brief Introductory Talks followed by Roundtable discussions
Enjoy the Workshop!
e-SI Theme:
Exploiting Diverse Sources of Scientific Data