Digital Library developments
supporting eResearch
Dr Liz Lyon, Director
UKOLN, University of Bath, UK
British Library, November 2004
UKOLN is supported by:
www.ukoln.ac.uk
a centre of expertise in digital information management
www.bath.ac.uk
Overview
1. Background and context
2. Digital curation
3. Scholarly communications
4. eBank UK Project
5. Linking research data and learning
1. Background and context
About UKOLN
• “a centre of expertise in digital information management”
• Funding
– Joint Information Systems Committee (JISC) + Museums, Libraries &
Archives Council (MLA)
– Portfolio of R&D projects
• 27 staff
• Inform the library, information, education and cultural heritage
communities
– Policy and advisory role at national level
– Build innovative Web-based systems & services
– Research and development
– Dissemination: http://www.ukoln.ac.uk/
– e-journal Ariadne
– Events: ECDL 2004
Infrastructure + IIE
• Supporting cross-sectoral engagement
– Common Information Environment Group (CIE)
• Building shared services for the IIE
– Informing strategic development
– Demonstrators and prototypes
– IE Service Registry
– IE Metadata Schemas Registry
– A common framework?
• Embedding open standards
– Dublin Core Metadata Initiative (DCMI) UK Affiliate
Managing Agent
– W3C representative
– OAI Technical Committee
– NISO MetaSearch
• Reviewing approaches to
Resource Discovery
• Digital repositories study
eLearning activities
• Collaborating with CETIS
– Technical Framework to support e-Learning (ELF)
– Contributing to UK LOM Core profile
– Shaping an approach to Persistent Identifiers
– Distributed eLearning Advisory Board
• Partnership with RDN / LTSN
– Enable greater records sharing
– RDN / LTSN LOM application profile
– Shared vocabularies
• Advising Jorum+ learning object repository team
• Facilitating interoperability with digital libraries (IMS)
• Developing eLearning tools (GroupLog Project)
• Supporting the FE community
– Regional Support Centres workshops: RSC Wales, RSC SE
events in November
eResearch activities
• Facilitating knowledge transfer
– Digital library, computer science and disciplinary communities
– NeSC / UKOLN workshops
• Developing Virtual Research Environment (VRE)
concepts
• Supporting the arts & humanities
– AHRB expert group
• Digital Curation Centre partner
• Changing nature of scholarly communications
• Research and development projects
– ePrints UK
– eBank UK
2. Digital Curation
Obsolete media
Images by Philip Hunter, UKOLN
DCC Web site
www.dcc.ac.uk
Consortium partners
• Management & Co-ordination
– Director Designate Chris Rusbridge (University of Glasgow)
– Director (Phase One): Peter Burnhill
with Phase One Project Co-ordinator: Robin Rice
(both from EDINA & Data Library, University of Edinburgh)
• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII [ERPANET], University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
– Led by Professor Peter Buneman (Informatics, University of Edinburgh)
Organisation to Engage & Collaborate
[Diagram labels: management & admin support; community support & outreach; service definition & delivery; research; development co-ordination; curation organisations, e.g. DPC; communities of practice: users; Collaborative Associates Network of Data Organisations; research collaborators; testbeds & tools; Industry; standards bodies]
Draft Home page of new
DCC eJournal
Launch January 2005
1st DCC International Conference
Planning so far:
• Location: Bath, UK
• September 2005 (tbc)
• 2-day event
• Invited speakers
• Peer-reviewed research papers
User requirements analysis
Commissioned study by Leona Carpenter
Desk-based research, focus groups, interviews. Taxonomy of “Users”
R&D issues: Annotation services, Ontology development, Automating
metadata creation, Tools and toolkits, Data Format Description
Language, Identifiers, Registries, Economic and cost-benefits studies
Advisory services: “Ask-a-Curator”, FAQs, reports, briefings,
awareness-raising materials, best practice guidance, Storage media,
“Like Erpanet”, advise Government, Research Councils, funding
bodies
Professional development: Short courses, conferences, seminars,
workshops, secondments to DCC and to working repository services
Outreach: Leadership for the future, case studies, sharing solutions,
collaboration with other partners, international peers, industry links
Associates Network
Goals
Develop understanding, share best practice, advance
research, promote recognition, develop consensus
Membership
HEIs, FEIs, national bodies, industry partners, funders,
research groups, individuals……
Benefits
Early access to R&D outputs, advisory services,
training, input to definition and design, community
participation
Discussion Forum www.dcc.ac.uk
Please join us!
Service definition & delivery
• Advisory services
– Provide on-demand responses to queries, from legal to technical
guidance (HELPDESK)
• Information Services
– Senior Scientist and Administrator Briefing Documents
– Community developed DIGITAL CURATION MANUAL
– Database of FAQs
– Checklist for Compliance with best practices and standards
• Training services
– Residential skills development courses for practitioners and
researchers
• Repository services
– Build catalogue of tools
• Audit & Certification services
– Development of Certification Procedures
– Develop self-certification mechanisms
Development – initial plans
• Approach to Digital Curation “White Paper”
– OAIS reference model
• Registries/Repositories
– offering a repository of tools and technical information,
a focal point for digital curators
– metadata standards
• Testbeds
– for testing and evaluating tools, methods, standards
and policies in realistic settings
• Certification
– standards
– work with hardware and software manufacturers to
certify hardware and software components
The Research Agenda
• Data integration and publishing
– Slowly coming to market. Publishing in community formats is a
new twist
• Annotation
– Everybody agrees this is important. No-one understands it.
• Metadata extraction
– Semantic or otherwise, it’s a key part of annotation
• Archiving and Appraisal
– What do we do about databases – they change!
• Legal issues
– Can we at least help to clarify what is going on?
• Provenance and data quality
– Again, we don’t fully understand it.
• Organisational dynamics of repositories
• Economic analyses of curation
• Ontologies, performance, registries, structure evolution…
Working with Others
• Digital Preservation Coalition
• Digital Library Federation
• The National Archives
• Global Grid Forum
• NARA
• Library of Congress
• Research Libraries Group
• JISC community
• E-Science Community
• Associates Network
• …and many more
Leadership for the Future
3. Scholarly communications
“It is envisaged that the sharing of primary data
would prevent unnecessary repetition of
experiments and enable scientists to build
directly on each others’ work, creating greater
efficiencies and productivity in the research
process.”
“The Government believes that the data underpinning the results
of publicly-funded research should be made available as widely
and as rapidly as possible, along with the results themselves”.
The scholarly knowledge cycle.
Liz Lyon, eBankUK article. Ariadne, July 2003.
[Diagram labels: Data creation / capture / gathering (laboratory experiments, Grids, fieldwork, surveys, media); Research & e-Science workflows; Data analysis, transformation, mining, modelling; Validation; Deposit / self-archiving; Repositories (institutional, e-prints, subject, data, learning objects); Publication; Linking; Data curation: databases & databanks; Peer-reviewed publications: journals, conference proceedings; Harvesting metadata; Aggregator services: national, commercial; Searching, harvesting, embedding; Presentation services: subject, media-specific, data, commercial portals; Resource discovery, linking, embedding]
[Diagram labels, learning and teaching view: Learning object creation, re-use; Learning & Teaching workflows; Validation; Deposit / self-archiving; Repositories (institutional, e-prints, subject, data, learning objects); Peer-reviewed publications: journals, conference proceedings; Harvesting metadata; Aggregator services: national, commercial; Searching, harvesting, embedding; Presentation services: subject, media-specific, data, commercial portals; Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules; Resource discovery, linking, embedding; Quality assurance bodies]
[Diagram: the research and e-Science labels and the learning and teaching labels from the two preceding diagrams, shown together in one cycle.]
[Diagram: the combined research and learning cycle, with eBank UK shown in place of the national and commercial aggregator services.]
4. The eBank UK Project
eBank UK: linking research data to learning
• JISC-funded for 1 year from September 2003 (Phase 1)
• Phase 2 funding secured
• UKOLN at the University of Bath (lead), University of
Southampton, University of Manchester
• e-Science testbed Comb-e-Chem
– Grid-enabled combinatorial chemistry
– Crystallography, laser and surface chemistry
– Development of an e-Lab using pervasive computing technology
– National Crystallography Service
• Resource Discovery Network PSIgate physical sciences
portal
• http://www.ukoln.ac.uk/projects/ebank-uk/
Comb-e-Chem Project
[Diagram labels: Video; Simulation; Diffractometer; Properties; Analysis; Structures Database; X-Ray e-Lab; Properties e-Lab; Grid Middleware]
First steps: establishing common ground…
• Understand the data creation process
• Terminology and definitions
– Data
– Metadata
– Datafile
– Dataset
– Data holding
• Different views
– Digital library researchers, computer scientists, chemists
– Generic vs specific
– Modeller vs practitioner
• Aim for a common ontology
• Modelling the domain
• Creating a metadata schema
Some metadata issues
• Using simple and qualified Dublin Core
• Additional chemical information in schema for
harvesting e.g. empirical formula
• Schema contains International Chemical
Identifier (InChI)
• Links to all datasets associated with an
experiment
• Links to individual datasets within an experiment
• Links to eprints (and other published literature)
derived from the data
• Using vocabularies specific to crystallography
• Will substitute when standards emerge
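The points above can be pictured as one record that carries Dublin Core elements, extra chemical properties, an InChI and outward links. The following is only a minimal Python sketch: the class, its field names, the example values and the example.org URLs are all hypothetical and are not taken from the eBank schema itself.

```python
from dataclasses import dataclass, field


@dataclass
class CrystalRecord:
    """Hypothetical record; field names are illustrative, not the eBank schema."""
    title: str                     # simple Dublin Core: title
    creator: str                   # simple Dublin Core: creator
    empirical_formula: str         # additional chemical information
    inchi: str                     # International Chemical Identifier
    dataset_links: list = field(default_factory=list)  # datasets in the experiment
    eprint_links: list = field(default_factory=list)   # literature derived from the data

    def to_simple_dc(self) -> dict:
        """Reduce the record to simple Dublin Core elements only."""
        return {
            "title": self.title,
            "creator": self.creator,
            "identifier": self.inchi,
            "relation": self.dataset_links + self.eprint_links,
        }


# Example values are invented for illustration.
record = CrystalRecord(
    title="Crystal structure of benzene (example)",
    creator="A. Chemist",
    empirical_formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    dataset_links=["http://example.org/data/1.cif"],
    eprint_links=["http://example.org/eprints/42"],
)
simple = record.to_simple_dc()
```

Reducing to simple Dublin Core drops the chemistry-specific fields, which is one reason a richer, domain-specific schema is useful alongside the generic one.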
Project results so far…..
• Version 2.0 eBank metadata schema
• Pilot institutional e-data repository for harvesting raw,
derived, results data (enhanced ePrints.org software)
• Exports records as ebank_dc and oai_dc
• Pilot eBank UK aggregator service
• Demonstrator with PSIgate physical sciences portal –
embedding eBank UK
• Consultation Workshop outcomes
– Cost-benefit issues for researchers?
– RAE impact?
– Disciplinary differences (A&H, social sciences)?
• Supporting studies on (1) Provenance
(2) Data models and schema
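Assuming the pilot repository exposes its two export formats through OAI-PMH (oai_dc is the format every OAI-PMH repository must support), an aggregator could request either one by changing the metadataPrefix argument. A minimal Python sketch; the base URL is a hypothetical placeholder, not the project's real endpoint.

```python
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical endpoint


def list_records_url(metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH ListRecords request for one metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Generic harvesters would ask for oai_dc; a chemistry-aware
# aggregator could ask for the richer ebank_dc records instead.
generic_url = list_records_url("oai_dc")
richer_url = list_records_url("ebank_dc")
```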
The digital repository
ecrystals.chem.soton.ac.uk
Acknowledgement: Simon Coles
Linking to publications
Acknowledgement:
Simon Coles
eBank embedded in a science portal
Acknowledgement:
Simon Coles
5. Linking research data
and learning
eBank Phase 2: linking to learning
• Embedding in e-Learning
processes
• Evaluating the pedagogical
benefits
– MChem course
– Chemical informatics
course
Future planning….
generic models & metadata schema
• Phase 2 starts 1st February 2005
• Validation against other schema
– CLRC Scientific Metadata Model, version 2
• Complex digital objects
• Investigate packaging options
– METS
– MPEG 21 DIDL
– ??
• Metadata enhancement - subject keyword additions to
datasets based on knowledge of keywords in related
publications
….Identifiers and linking
• Investigate identifiers e.g. International Chemical
Identifier (InChI code)
– Access to scientific (climate) data using DOIs (German
National Library of Science & Technology)
• Explore context sensitive linking: find me
– Datasets by this person
– Journal articles by this person
– Datasets related to this subject
– Journal articles on this subject
– Learning objects by this person
– Learning objects on this subject
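The "find me" queries listed above amount to filtering harvested records by type and by a shared creator or subject. A minimal Python sketch of that idea; the `Item` class, the `find_related` helper and the sample records are hypothetical and do not describe how eBank would actually implement linking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    kind: str      # "dataset", "article" or "learning object"
    title: str
    creator: str
    subject: str


def find_related(items, current, kind, by):
    """Return items of `kind` that share `by` ("creator" or "subject") with `current`."""
    if by not in ("creator", "subject"):
        raise ValueError("by must be 'creator' or 'subject'")
    wanted = getattr(current, by)
    return [
        item for item in items
        if item is not current and item.kind == kind and getattr(item, by) == wanted
    ]


# Invented sample records.
items = [
    Item("dataset", "Structure A", "S. Example", "crystallography"),
    Item("article", "Paper on A", "S. Example", "crystallography"),
    Item("learning object", "Intro to diffraction", "T. Tutor", "crystallography"),
    Item("dataset", "Structure B", "T. Tutor", "surface chemistry"),
]
current = items[0]

articles_by_person = find_related(items, current, "article", "creator")
learning_on_subject = find_related(items, current, "learning object", "subject")
datasets_by_person = find_related(items, current, "dataset", "creator")
```

In practice the match would need persistent identifiers for people and controlled subject vocabularies rather than plain string comparison, which is why the slide pairs linking with the identifier work.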
Embedding
• Into the crystallographic research and publishing
communities
• Into the Chemistry workflow
– SMART TEA Digital Lab Book e synthesis Lab
– Other analytical techniques in chemistry
• Into e-Learning workflows
– MChem course
– Undergraduate Chemical Informatics courses
• Pedagogic evaluation
Service expansion
into other
chemistry areas:
new datasets &
new processes
Feasibility study of other
physical sciences
Potential longer term impact
1. Track data and information flows in e-research and
scholarly communications – knowledge audit??
2. Validate the accuracy and authenticity of derived works
– ideas audit??
3. Facilitate explicit referencing and acknowledgment of
original contributors – intellectual integrity??
4. Raise standards associated with publication of
research outputs – academic publishing rigour??
5. Implement open access to and dissemination of data
and information – enhance the research process??
6. Give students links to original data underpinning
published works – enrich the learning process??
Thank you.
UKOLN receives funding from the Joint Information Systems
Committee (JISC) and the Museums, Libraries & Archives Council
(MLA) and is based at the University of Bath.