Transcript Powerpoint

This work is licensed under a Creative Commons License
Attribution-ShareAlike 2.0
UK Digital Curation Centre
An Introduction
Dr Liz Lyon,
Associate Director Outreach
IACMST MED Forum, November 2005
Funded by:
Digital | Curation | Centre
Repositories and digital curation
For later use?
Static
Data preservation
In use now (and the future)?
Dynamic
Data curation
“maintaining and adding value to a trusted body
of digital information for current and future use”
Assuring permanent access to the records of
science & the humanities?
Long term access to primary data
• Increasing data volumes from eScience and
Grid-enabled / cyberinfrastructure applications
• Changing research paradigm: data-driven
science, “big science”
• Observational data, simulations, large-scale
experimentation
• Multi-media resources, statistical data, surveys,
geo-spatial data……
Facilitate “post-processing” and knowledge
extraction
Enable the acquisition of newly-derived information and
knowledge
• Run complex algorithms over primary datasets
• Mining (data, text, structures)
• Modelling (economic, climate, mathematical, biological)
• Analysis (statistical, lexical, pattern matching, gene)
• Presentation (visualisation, rendering)
Provide additional functionality beyond
digital preservation processes:
adding value
Annotations
• Gene and protein sequences
• e-Lab books (Smart Tea Project in chemistry)
The scholarly knowledge cycle: linking research data to publications
eBank UK Project
http://www.ukoln.ac.uk/projects/ebank-uk/
[Diagram] Components shown:
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Research & e-Science workflows
• Data analysis, transformation, mining, modelling
• Data curation: databases & databanks
• Repositories: institutional, e-prints, subject, data, learning objects
• Aggregator services: national, commercial
• Presentation services: subject, media-specific, data, commercial portals
• Peer-reviewed publications: journals, conference proceedings
• Flows between components: validation, deposit / self-archiving, publication, linking, harvesting metadata, searching, harvesting, embedding, resource discovery
Emerging policy on open access to data
Issues: generic data models, metadata
schema & terminology
• Validation against other schema
• Complex digital objects and packaging options
– METS
– MPEG 21 DIDL
• Terminologies
– Domain: marine?
– Inter-disciplinary e.g. wider environment, bio….
– Metadata and vocabularies
– Meaningful resource discovery?
Ontologies for discovery in an
interdisciplinary world
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition
process
• Aggregators use keywords for linking
with the broader literature
• Researchers use keyword ontology in
search and discovery services
• Formal vs informal “folksonomies”
• Web 2.0???
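As an illustration of the first and fourth bullets, the step from a flat keyword list to an ontology might be sketched in Python as follows. The terms, the `BROADER` mapping and the `expand` and `search` helpers are all hypothetical and are not taken from any DCC or eBank UK vocabulary. The sketch only shows that recording broader/narrower relations lets a search on a general term also find records tagged with more specific ones.

```python
# Hypothetical marine keywords; a flat list carries no relationships.
KEYWORD_LIST = ["oceanography", "salinity", "sea temperature",
                "plankton", "marine biology"]

# A minimal "ontology": each term points to its broader term (None = top level).
# These relationships are illustrative assumptions only.
BROADER = {
    "oceanography": None,
    "marine biology": None,
    "salinity": "oceanography",
    "sea temperature": "oceanography",
    "plankton": "marine biology",
}

def expand(term):
    """Return the term plus every narrower term beneath it."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in BROADER.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

def search(records, term):
    """Return titles of records tagged with the term or any narrower term."""
    wanted = expand(term)
    return [title for title, keywords in records if wanted & set(keywords)]

# Hypothetical deposited records: (title, keywords chosen at deposit time).
RECORDS = [
    ("North Sea salinity survey", ["salinity"]),
    ("Plankton counts 2004", ["plankton"]),
]

print(search(RECORDS, "oceanography"))
```

With the flat list alone, a search for "oceanography" would miss the salinity survey; with the broader-term links it is found.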
Issues: Persistent identifiers for
data (image) citation
• Use cases: depositor, author, service provider,
reader, publisher, ?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Added value services: CrossRef, resolution
service, integration (Globus), look-up service, ?
• Degree of trust or persistence
• Costs
• Future potential: political, ?
• Domain identifiers: e.g. International Chemical
Identifier (InChI) codes
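The "express as http URIs" bullet can be illustrated with a small Python sketch. The resolver addresses below are the public resolvers in common use for each scheme and are an assumption, since the slides do not name any. The `to_http_uri` helper and the sample identifiers are made up for illustration.

```python
# Resolver base URLs per scheme. Assumed, not named in the slides.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}

def to_http_uri(scheme, identifier):
    """Express an identifier as an http(s) URI via its scheme's resolver."""
    scheme = scheme.lower()
    if scheme == "purl":
        # A PURL is already an http URI, so it is returned unchanged.
        return identifier
    if scheme not in RESOLVERS:
        raise ValueError(f"unknown identifier scheme: {scheme}")
    return RESOLVERS[scheme] + identifier

# Made-up identifiers for illustration; none refers to a real object.
print(to_http_uri("doi", "10.1234/example"))
print(to_http_uri("handle", "12345/678"))
print(to_http_uri("ark", "ark:/12345/x9example"))
```

Whatever the scheme, the reader or service ends up with an ordinary web address, which is what makes the identifier usable for citation and linking.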
Issues: Integration into
(marine) research workflows
• R4L Repository for the Laboratory Project (JISC-funded)
automated data capture from instrumentation,
registration of results
• SMART TEA electronic Laboratory notebook +
annotations
• Publishers??
• Research assessment (RAE) process?
UK Digital Curation Centre
• Delivering services
• Development activities
• Research agenda
• Outreach Programme
• http://www.dcc.ac.uk/
DCC people (some of them…)
• Management & Co-ordination
– Director Chris Rusbridge (University of Edinburgh)
• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII, University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services,
CCLRC)
• Research
– Led by Professor Peter Buneman (Informatics, University of
Edinburgh)
User requirements analysis: some
sound bytes…
R&D issues: Annotation services, Ontology development, Automating
metadata creation, Tools and toolkits, Data Format Description
Language, Identifiers, Registries, Economic and cost-benefits studies
Advisory services: “Ask-a-Curator”, FAQs, reports, briefings,
awareness-raising materials, best practice guidance, Storage media,
“Like Erpanet”, advise Government, Research Councils, funding
bodies
Professional development: Short courses, conferences, seminars,
workshops, secondments to DCC and to working repository services
Outreach: Leadership for the future, case studies, sharing solutions,
collaboration with other partners, international peers, industry links
Taxonomy of “Users”
Advisory services
• Responses to queries—from legal to
technical guidance
[email protected]
• FAQs constructed
• Some useful resources…..
Digital Curation Manual
• A world class resource
• Constructed from topic-specific chapters
– written by international experts
– editorial board comprising leading researchers
and practitioners
• 45 initial topics including
– Metadata, Appraisal and Selection; Costs;
Freedom of Information; Interoperability; the OAIS
Reference Model; Preservation Strategies; and
Open Source
• Briefing Papers aimed at senior managers
Workshops and Information Days
• 2005 Workshop Programme
– Persistent identifiers
– Institutional repositories
– Cost models
– Preservation of medical
databases
• Information Days at Bath,
Aberystwyth, London,
Glasgow, Belfast (1st
December)…..???
OAIS Reference Model
DCC: Development
• “DCC Approach to Digital Curation” based on the
Reference Model for an Open Archival Information
System (OAIS), ISO standard 14721:
– Monitoring international standards
– Development of a Representation Information
(RI) registry/repository (DCC-RR)
– Recommendations for tools and methods for
generating Representation Information
– Creating test-beds for digital curation tools
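To make the idea of a Representation Information registry concrete, here is a minimal Python sketch. It assumes only the OAIS notion that Representation Information is what is needed to interpret a data object, covering both its structure and its meaning. The `RepresentationInfo` class, the `REGISTRY` contents, the identifier and the `lookup` function are hypothetical and do not describe the DCC-RR interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RepresentationInfo:
    """Information needed to interpret a data object (OAIS sense)."""
    structure: str  # how the bits are laid out, e.g. the file format
    semantics: str  # what the decoded values mean

# Hypothetical registry contents, keyed by a made-up identifier.
REGISTRY = {
    "ri:example/sea-temp-csv": RepresentationInfo(
        structure="CSV, UTF-8, columns: timestamp, depth, temperature",
        semantics="temperature in degrees Celsius, depth in metres",
    ),
}

def lookup(ri_id):
    """Return the Representation Information for an identifier, or None."""
    return REGISTRY.get(ri_id)

info = lookup("ri:example/sea-temp-csv")
print(info.structure)
print(info.semantics)
```

A dataset that records such an identifier alongside its files gives a future user a way to find out how to read the data, even if the original software is gone.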
Development info: see http://dev.dcc.ac.uk for details of the Wiki and email list, open to all
Trusted digital repositories
• Audit Checklist for Certification
• Draft Report August 2005
• Research Libraries Group RLG-NARA
Taskforce
• Defined criteria under 4 categories
– Organisation
– Functions, processes & procedures
– Designated community & usability
– Technologies & technical infrastructure
The database picture
Source data
Curated data: classified,
cleaned, annotated,
integrated, cross-linked
• www.ijdc.net
• Peer-review
Editorial Board
• Peter Buneman
Editor (research)
• Production editor
Richard Waller
• Papers for
submission are very
welcome!
• 1st issue soon….
DCC Conferences
• 1st International DCC
Conference, Bath, Sept
• Keynote speakers
– Clifford Lynch, CNI
– Graham Cameron, EBI
• Presentations available
• PV 2005 Edinburgh NOW
• 2nd DCC Conf Nov 2006
Associates Network
Goals
Develop understanding, share best practice, advance
research, promote recognition, develop consensus
Membership
International groups, national bodies, industry partners,
funders, research groups, HEIs, FEIs, individuals……
Benefits
Early access to R&D outputs, advisory services, training,
input to definition and design, community participation
Discussion Forum www.dcc.ac.uk
Please join us!
Thank you.
Questions?…..