eBank UK - linking research data, learning & scholarly

eBank UK : linking research data,
learning and scholarly communications.
Dr Liz Lyon, UKOLN, University of Bath
Dr Simon Coles, School of Chemistry, University of Southampton
JISC Joint Programmes Meeting 2005
The wider context
Why create the e-Framework?
The JISC strategic context
Sarah Porter, 2005
JISC Information Environment architecture
[Diagram. Labels:
• Content providers: JISC-funded, institutional, external
• Brokers, aggregators, catalogues, indexes
• Portals (subject, media-specific, institutional), OpenURL link servers, learning management systems
• Shared infrastructure: authentication/authorisation (Athens), service registries, metadata schema registries, identifier services, institutional profiling services, terminology services
• End-user desktop/browser]
© Andy Powell (UKOLN, University of Bath), 2005
This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0
The scholarly knowledge cycle. Liz Lyon, Ariadne, July 2003.
[Diagram. Labels:
• Presentation services: subject, media-specific, data, commercial portals
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Research & e-Science workflows
• Aggregator services: national, commercial
• Repositories: institutional, e-prints, subject, data, learning objects
• Learning & Teaching workflows
• Learning object creation, re-use
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Peer-reviewed publications: journals, conference proceedings
• Quality assurance bodies
• Connecting labels: resource discovery, linking, embedding; searching, harvesting, embedding; harvesting metadata; deposit / self-archiving; validation; publication]
© Liz Lyon (UKOLN, University of Bath), 2005
This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0
eScience - the data deluge
Data overload!
EPSRC National Crystallography Service
How do we disseminate?
[Diagram: the scholarly knowledge cycle repeated, with "Aggregator services: eBank UK" in place of "Aggregator services: national, commercial".]
The eBank UK Project
eBank UK: background
• JISC-funded September 2003, Phase 2 February 2005
• UKOLN at the University of Bath (lead), University of
Southampton, University of Manchester
• Exemplar: e-Science testbed ‘CombeChem’
– Grid-enabled combinatorial chemistry
– Crystallography, laser and surface chemistry examples
– Development of an e-Lab using pervasive computing technology
– National Crystallography Service
• Resource Discovery Network / PSIgate physical
sciences portal
• http://www.ukoln.ac.uk/projects/ebank-uk/
The project team
• UKOLN: Michael Day, Monica Duke, Rachel Heery, Traugott Koch, Liz Lyon + Andy Powell
• Southampton: Les Carr, Simon Coles, Jeremy Frey, Chris Gutteridge, Mike Hursthouse, Andrew Milstead
• Manchester: John Blunden-Ellis
Data Flow in eBank UK
[Diagram. Labels, in approximate flow order:
• Create
• Submit / Deposit: HTML deposition interface
• Store/link: institutional repository (data files, metadata)
• Present (HTML): local archive search interface
• Harvest (XML): OAI-PMH
• eBank aggregator: index and search
• Present (HTML): service provider interfaces, e.g. subject portal]
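The harvest step in the data flow can be sketched in Python. This is a minimal illustration, assuming a repository that exposes the standard OAI-PMH ListRecords verb with the oai_dc metadata prefix. The base URL, the sample response and the function names are hypothetical and are not taken from the eBank UK software; a real harvester would fetch the URL over HTTP and follow resumption tokens.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Hypothetical response, standing in for what a repository might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure</dc:title>
          <dc:type>CrystalStructure</dc:type>
          <dc:identifier>http://repository.example.org/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract a few Dublin Core fields from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            continue  # deleted records carry a header but no metadata
        records.append({
            "oai_id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": dc.findtext("dc:title", namespaces=NS),
            "types": [e.text for e in dc.findall("dc:type", NS)],
            "identifier": dc.findtext("dc:identifier", namespaces=NS),
        })
    return records


harvested = parse_records(SAMPLE_RESPONSE)
```

An aggregator would then index the harvested dictionaries for search; the indexing itself is outside this sketch.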
eBank data model
[Diagram. Model input: Andy Powell, UKOLN. Labels:
• Institutional repository (deposit): crystal structure (data holding) made up of datasets; crystal structure report (HTML)
• ebank_dc record (XML): dc:type=“CrystalStructure” and/or “Collection”; dc:identifier (linking)
• Eprint: “jump-off” page (HTML) and manifestation (e.g. PDF)
• Eprint oai_dc record (XML): dc:type=“Eprint” and/or “Text”; dc:identifier (linking)
• dcterms:references and dcterms:isReferencedBy relate the crystal structure and eprint records
• Harvesting over OAI-PMH (ebank_dc, oai_dc) into the eBank UK aggregator service and the ePrint UK aggregator service
• PSIgate portal and subject service: searching, linking and embedding]
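The linking between data and publications in this model can be sketched as two records that point at each other. The sketch follows the Dublin Core definitions, where dcterms:references names a resource that the described resource cites and dcterms:isReferencedBy names a resource that cites it. Which record carries which property in eBank UK is an assumption here, and the Record class, field names and identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    identifier: str                  # dc:identifier
    dc_type: list                    # dc:type values
    references: list = field(default_factory=list)        # dcterms:references
    is_referenced_by: list = field(default_factory=list)  # dcterms:isReferencedBy


def link(eprint, dataset):
    """Record that an eprint cites a dataset, in both directions."""
    if dataset.identifier not in eprint.references:
        eprint.references.append(dataset.identifier)
    if eprint.identifier not in dataset.is_referenced_by:
        dataset.is_referenced_by.append(eprint.identifier)


def eprints_citing(dataset, records):
    """Resolve a dataset's dcterms:isReferencedBy identifiers to records."""
    by_id = {r.identifier: r for r in records}
    return [by_id[i] for i in dataset.is_referenced_by if i in by_id]


# Hypothetical identifiers, for illustration only.
structure = Record("http://repository.example.org/structure/1", ["CrystalStructure"])
paper = Record("http://eprints.example.org/42", ["Eprint", "Text"])
link(paper, structure)
citing = eprints_citing(structure, [structure, paper])
```

Storing the relation on both records lets a service that holds only one of them still offer a link to the other.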
CombeChem: An EPSRC pilot project
[Diagram. Labels: Simulation, Video, Diffractometer, Analysis, Properties, Structures Database, Properties e-Lab, X-Ray e-Lab, Grid Middleware]
Crystallography data: The publication problem
[Figure: chemical structure diagrams alongside the figures 2,000,000, 25,000,000 and 300,000]
Crystallography workflow
RAW DATA
DERIVED DATA
RESULTS DATA
• Initialisation: mount new sample, set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report
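The eight steps above form a strictly ordered pipeline, which can be sketched as an enumeration. This is an illustration only: the Stage class and helper functions are hypothetical, and the sketch does not assign steps to the raw, derived and results data categories because the slide text does not say where those boundaries fall.

```python
from enum import IntEnum


class Stage(IntEnum):
    INITIALISATION = 1  # mount new sample, set up data collection
    COLLECTION = 2      # collect data
    PROCESSING = 3      # process and correct images
    SOLUTION = 4        # solve structures
    REFINEMENT = 5      # refine structure
    CIF = 6             # produce Crystallographic Information File
    VALIDATION = 7      # chemical and crystallographic checks
    REPORT = 8          # generate Crystal Structure Report


def next_stage(current):
    """Return the stage after `current`, or None once the report is done."""
    if current is Stage.REPORT:
        return None
    return Stage(current + 1)


def remaining(last_completed=None):
    """List the stages still to run; None means nothing has run yet."""
    done = 0 if last_completed is None else int(last_completed)
    return [stage for stage in Stage if stage > done]
```

A deposition tool could use such an ordering to record how far through the workflow each sample has progressed.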
A data repository entry
Access to the underlying data
ecrystals.chem.soton.ac.uk
Harvesting: OAIster
Aggregating: search & discover
Linking data to publications
eBank embedded in a science portal
Current Developments: Deposition and validation tools
[Screenshot labels: Validation; File format manipulation]
Current Developments: Integration into crystallographic publishing practices
[Screenshot label: Publishers' seal of approval]
Current Developments: Ontologies for aggregating, linking & discovery
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition process
• Publish keywords in OAI
• Aggregators use keywords for linking with the broader literature
• Researchers use keyword ontology in search and discovery services
eBank : linking to learning
• Embedding in e-Learning processes
• Evaluating the pedagogical benefits
– MChem course
– Chemical informatics course
Issues and challenges
1. Issues: research data as content
• Sharing it!
• Data diversity
– Homo- or heterogeneous
– Raw and derived / processed
– Sensitivity
– Fast or slow growth in volume
• Repository evolution:
– Likelihood to scale up (from bytes to petabytes)
– Quality assurance (from the start)
– Community-based standards development (“folksonomies”)
– Build robust services
2. Issues: generic data models, metadata schema & terminology
• Validation against other schema
– CCLRC Scientific Data Model Version 2
• Complex digital objects and packaging options
– METS
– MPEG-21 DIDL
• Terminologies
– Domain: crystallography
– Inter-disciplinary e.g. biomaterials
– Metadata enhancement: subject keyword additions to datasets based on knowledge of keywords in related publications
– Meaningful resource discovery?
3. Issues: linking and identifiers
• Links to individual datasets within an experiment
• Links to all datasets associated with an experiment or a data collection
• Links to derived eprints and published literature
• Context sensitive linking: find me
– Datasets by this author / creator
– Datasets related to this subject
– Learning objects by this author / creator
– Learning objects related to this subject
• Identifiers and persistence
– “generic”
– domain: International Chemical Identifier (InChI code)
• Resource discovery : Google Scholar?
• Provenance: authenticity, authority, integrity?
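The four "find me" queries listed under context sensitive linking share one shape: filter items by kind, then by creator or by subject. A minimal sketch of that shape follows, assuming an in-memory list of harvested items. The Item class, the find_me function and all identifiers and names are hypothetical; a real service would query an index rather than scan a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    identifier: str
    kind: str        # "dataset" or "learning object"
    creators: tuple
    subjects: tuple


def find_me(items, kind, creator=None, subject=None):
    """Return identifiers of items of one kind matching creator and/or subject."""
    results = []
    for item in items:
        if item.kind != kind:
            continue
        if creator is not None and creator not in item.creators:
            continue
        if subject is not None and subject not in item.subjects:
            continue
        results.append(item.identifier)
    return results


# Hypothetical holdings, for illustration only.
ITEMS = [
    Item("dataset:1", "dataset", ("A. Author",), ("crystallography",)),
    Item("dataset:2", "dataset", ("B. Author",), ("crystallography", "biomaterials")),
    Item("lo:1", "learning object", ("A. Author",), ("crystallography",)),
]

by_author = find_me(ITEMS, "dataset", creator="A. Author")
by_subject = find_me(ITEMS, "dataset", subject="crystallography")
teaching = find_me(ITEMS, "learning object", creator="A. Author")
```

Matching on free-text creator and subject strings is fragile, which is one reason the slide raises identifiers and terminologies as open issues.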
4. Issues: embedding and workflow
• Into the crystallographic publishing community
– International Union of Crystallography
• Into the chemistry research workflow
– SMART TEA Digital Lab Book e-synthesis Lab
– Other analytical techniques and instrumentation
– RAE procedures?
• Into the curriculum and e-Learning workflows
– MChem course
– Undergraduate Chemical Informatics courses
Next in Phase 2…
• Full embedding into the crystallographic research and publishing communities
• Chemistry workflow embedding
– R4L: Repository for the Laboratory
– Related sub-domains of chemistry: SPECTRa
• e-Learning embedding and pedagogic evaluation
– Assess role in u/g chemical informatics courses
– Introducing school children to e-research
• Enabling interdisciplinary research
– Physical, mathematical, earth, environmental and engineering sciences
Thank you.
Questions?