Edikt: e-Science Data, Information and Knowledge Transformation
NeSC Review, 30 September 2003
Dr. Denise Ecklund, edikt technical architect
What is edikt?
[Figure: the Edikt project, surrounded by the labels Requirements analysis, Technology matchmaking, Gap filling, Rigorous engineering, Standards, E-Science Apps, CS Research, and Commercial SW components and skills; output: Grid Services for e-Science Data Management]
• The team: 8 professional software engineers, architect, project manager, and support staff
• SHEFC funded research and development grant
– 3 years funding: May 2002 – 2005
– +3 years funding upon successful project and review
www.edikt.org
Current activities
• Eldas – Enterprise level data access services
– Core data services supporting e-Science virtual organisations
• BinX – Binary XML
– Supports data interchange for astronomy and other applications
• OSAGE – Ontology-based Species Atlas for Gene Expression
– Defines a database schema for storing and annotating 3D anatomy and gene expression data for multiple species
• Technology and research evaluations
Creating a Virtual Organization
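[Figure: researchers holding radio spectrum, optical spectrum and X-ray spectrum data, with a DB2 DB, a Xindice DB and a MySQL DB, connected through ELDAS + Grid Directory Services]
Speech bubbles in the figure:
– "Let's share our data!"
– "It's at X. Get it with Y."
– "Great! I can't read it!"
– "How do I get it?"
– "Great! I can't find it!"
– "Where is it?"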
ELDAS – Extensibility via DACs
[Figure: User1, User2 and User3 connect to ELDAS instances built on the reusable ELDAS Core; DACs link the core to a Xindice DB, a MySQL DB, a DB2 DB and an Oracle 9i DB]
• Data Access Components (DACs) interface to distinct DBMSs
• Multiple DB drivers can be supported
– JDBC, ODBC for relational DBMSs
• Plug-n-Play installation of ELDAS
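The plug-in idea behind DACs can be illustrated with a minimal sketch. It is written in Python for brevity, although the slides describe a Java implementation, and it is not the ELDAS API: the names `DataAccessComponent`, `SqliteDAC` and `Core` are hypothetical, and SQLite stands in for a relational DBMS that a real DAC would reach through a JDBC or ODBC driver.

```python
import sqlite3
from abc import ABC, abstractmethod


class DataAccessComponent(ABC):
    """Hypothetical DAC contract: one implementation per kind of DBMS."""

    @abstractmethod
    def query(self, statement, params=()):
        """Run a query and return the rows as a list of tuples."""


class SqliteDAC(DataAccessComponent):
    """Stand-in relational DAC (assumption: a real DAC would wrap JDBC/ODBC)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, statement, params=()):
        # The connection context manager commits on success.
        with self._conn:
            self._conn.execute(statement, params)

    def query(self, statement, params=()):
        return self._conn.execute(statement, params).fetchall()


class Core:
    """Reusable core: routes each request to the DAC registered for a source."""

    def __init__(self):
        self._dacs = {}

    def register(self, source_name, dac):
        self._dacs[source_name] = dac

    def query(self, source_name, statement, params=()):
        if source_name not in self._dacs:
            raise KeyError(f"no DAC registered for {source_name!r}")
        return self._dacs[source_name].query(statement, params)


# Usage: plug one DAC into the core and query through the core only.
core = Core()
spectra = SqliteDAC()
spectra.execute("CREATE TABLE spectrum (band TEXT, flux REAL)")
spectra.execute("INSERT INTO spectrum VALUES (?, ?)", ("radio", 1.5))
core.register("spectra", spectra)
rows = core.query("spectra", "SELECT band, flux FROM spectrum")
```

Supporting another DBMS in this sketch means adding one more `DataAccessComponent` subclass and registering it; the core and its callers stay unchanged.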
ELDAS – EJB Implementation
[Figure: Grid User1 and Grid User2 connect through a Grid Proxy, and Web User1 through a Web Servlet, to ELDAS (EJB - GDS) running in a Java Framework; DACs link ELDAS to a Xindice DB, a MySQL DB, a DB2 DB and an Oracle 9i DB]
Callouts: ELDAS runs anywhere; suitable for grid & web
• Java 2 Enterprise Edition implements basic server tasks
• Enterprise JavaBeans (EJB) container used to implement the ELDAS core
BinX – accessing legacy binary data
• The Problem:
– Many binary data files
– Applications must "know" the data format
– Binary data formats are machine-specific
• The Solution:
– Write a "stand-aside" format description in XML
– Provide a library to
    • Interpret the description
    • Provide file access across different machines
– Build higher-level services
[Figure: simulations and several binary data files; a BinX file describes the binary file structure; the BinX Library sits between the files and the e-Science Application]
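A minimal sketch of the "stand-aside" description idea follows. It does not use the real BinX schema or the BinX Library: the `<description>` and `<field>` elements, the `byteOrder` attribute values and the type names are hypothetical simplifications, and the sketch is in Python with the standard `struct` module doing the decoding.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical, simplified description of one record; NOT the real BinX schema.
DESCRIPTION = """
<description byteOrder="bigEndian">
  <field name="count" type="int32"/>
  <field name="flux" type="float64"/>
</description>
"""

# Mapping from the assumed type names to struct format characters.
STRUCT_CODES = {"int32": "i", "float64": "d"}
ORDER_PREFIX = {"bigEndian": ">", "littleEndian": "<"}


def build_format(description_xml):
    """Turn the XML description into field names and a struct format string."""
    root = ET.fromstring(description_xml)
    prefix = ORDER_PREFIX[root.get("byteOrder")]
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    fmt = prefix + "".join(STRUCT_CODES[f.get("type")] for f in fields)
    return names, fmt


def read_record(data, description_xml):
    """Decode one record from raw bytes, independent of the host byte order."""
    names, fmt = build_format(description_xml)
    values = struct.unpack(fmt, data[:struct.calcsize(fmt)])
    return dict(zip(names, values))


# Usage: bytes written big-endian decode the same way on any machine.
raw = struct.pack(">id", 3, 2.5)
record = read_record(raw, DESCRIPTION)
```

The application here never hard-codes the layout: the byte order and field types come from the description, which is the point the slide makes about applications no longer needing to "know" the format.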
BinX – format transformation
• Even when we try to agree, we disagree
• Multiple data format standards require conversions
[Figure: two binary data files, each with its own BinX description, pass through the BinX Library and BinX Utilities; labels: FITS data format, VOTable data format, Spectral Analysis Application, 3D Image Data Mining Application]
Caption: Data format transformations based on XML descriptions
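Description-driven transformation can be sketched as decoding a record with one description and re-encoding it with another. This is an illustration under stated assumptions, not the BinX Utilities: the `<description>`/`<field>` vocabulary is hypothetical, the example is in Python, and it handles only flat records, whereas real conversions between formats such as FITS and VOTable involve far more.

```python
import struct
import xml.etree.ElementTree as ET

# Assumed type names and byte-order labels (hypothetical, not the BinX schema).
CODES = {"int32": "i", "float32": "f", "float64": "d"}
ORDER = {"bigEndian": ">", "littleEndian": "<"}


def layout(description_xml):
    """Return (field names, struct format) for a flat record description."""
    root = ET.fromstring(description_xml)
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    fmt = ORDER[root.get("byteOrder")] + "".join(CODES[f.get("type")] for f in fields)
    return names, fmt


def convert(data, source_xml, target_xml):
    """Decode with the source description, re-encode with the target one."""
    src_names, src_fmt = layout(source_xml)
    dst_names, dst_fmt = layout(target_xml)
    record = dict(zip(src_names, struct.unpack(src_fmt, data)))
    return struct.pack(dst_fmt, *(record[name] for name in dst_names))


SOURCE = (
    '<description byteOrder="bigEndian">'
    '<field name="channel" type="int32"/>'
    '<field name="flux" type="float64"/>'
    "</description>"
)
# The target differs in both byte order and field order.
TARGET = (
    '<description byteOrder="littleEndian">'
    '<field name="flux" type="float64"/>'
    '<field name="channel" type="int32"/>'
    "</description>"
)

converted = convert(struct.pack(">id", 7, 0.25), SOURCE, TARGET)
```

Because both layouts are data rather than code, adding a new target format in this sketch means writing a new description, not a new converter.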
OSAGE – Applying Computer Science
• Extend the Edinburgh Mouse Atlas
– Data model to describe multiple species
– Support scientific collaboration via data sharing
• Computer Science theory and best practice
– Generic data model for species anatomy
– Flexible data annotation and versioning with XML
[Figure: CS theory, Data Access Services and a DB2 DB]
The Future – bringing components together
Extended Grid Data Services for Virtual Organisations
• Data Versioning Service
• Constraint Mgmt Service
• User Annotation Service
• ...
• Data Archiving Service
CS research results layered over basic ELDAS services
BinX is an intelligent binary file data source
[Figure: the extended services sit above ELDAS, which connects to a Xindice DB, a MySQL DB and a DB2 DB, and to binary files through the BinX Library]
Thank you!
Questions?
http://www.edikt.org