Changing Roles, Responsibilities and Relationships

Dr Liz Lyon, Director, UKOLN
Associate Director, UK Digital Curation Centre
Opening the research data lifecycle, JISC Conference 2007
This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 Licence
www.ukoln.ac.uk
A centre of expertise in digital information management
Preliminary findings from a JISC study
• Terms of Reference for UKOLN: to define how institutions (collectively and individually) and scientific data centres can together effectively achieve:
– Preservation
– Access – Managed and open
– Re-use – Data citation, data mining and re-interpretation
• October 2006 – March 2007
• N.B. Work in progress!
Some of the data stakeholders?
Funders
• Interviews: 4 Research Councils + 1 charity
• Support for data curation is (still) patchy
• Mixed approaches: proactive to passive
• Gaps in infrastructure support for data outputs
• Limited formal links between programme planning and support infrastructure
• Some data management and sharing policies
• Some use of Data Management Plans
• Wellcome Trust – Policy + Q&A, January 2007
Data Management and Sharing Plan required “if creating or developing a resource for the research community as the primary goal” or “involve the generation of a significant quantity of data that could potentially be shared for added benefit”
Funders 2
• Limited advocacy work
• Funding models for infrastructure support vary
• Funding models for research programmes vary
• Some productive partnerships, e.g. MRC and Wellcome Trust, CCLRC and Wellcome
• Some examples of good practice
Hierarchy of drivers (for data sharing)
Acknowledgement: Mark Thorley, NERC
• Level 0: deliver project.
• Level 1: meet ‘good scientific practice’.
• Level 2: support own science.
• Level 3: employer’s requirements.
• Level 4: funder’s requirements.
• Level 5: public policy requirements.
NERC has:
– 7 designated data centres
– Data Management Co-ordinator
– DataGrid
NATURAL ENVIRONMENT RESEARCH COUNCIL
MRC developing a data support plan
Acknowledgement: Alan Sudlow
Data centres & Data services
• Interviews with 5 data services
• Deep levels of expertise and subject knowledge
• Exemplars of good practice: standards, policies,
manuals, robust curation / preservation practice
• Limited sharing of expertise between centres
• Some effective partnerships:
– AHDS Stormont Papers with Queens Belfast
– BADC with CLADDIER Project
• Wide range of community awareness
• Use of licences but IPR issues: performing arts,
• Technical issues: complexity of data sets, version
control, identifiers, application profiles
Data centres & Data services 2
• Exemplar of good practice
– European Bioinformatics Institute
– Microarray data to inform gene expression
– Consensus on community standards (MIAME)
– Data pipelines at source via Laboratory Information Management Systems (LIMS)
– User tools (MIAMExpress) & value-added services
– Annotation of data using the Gene Ontology
– Submission & deposit is embedded in community
culture: requirement for publication
– Training programme, eLearning materials coming
– This level of data curation is expensive!!
EMBL-Bank: DNA sequences
Reactome
Array-Express: Microarray Expression Data
UniProt: Protein Sequences
EnsEMBL: Genome Annotation
IntAct: Protein Interactions
EMSD: Macromolecular Structure Data
Source: Graham Cameron, EBI
Large resources in related disciplines
– Core biomolecular resources
– Specialist biomolecular data resource examples: BRENDA, IMGT, Pasteur DBs
– Model organism resource examples: SGD, Flybase, Eumorphia/Phenotypes, MGD, Mutants, Mouse Atlas
– Medical data resources
– Biodiversity data resources
– Chemical data resources
Source: Graham Cameron, EBI
General Data Selection Criteria
• Usability
– Quality of data
– Usable data format
– Conditions of Use
– Reputable Author
– Documentation
• Usefulness
– Data quality
– Uniqueness of data
– Potential Strategic Use
– Usefulness of parameters
Institutions & Data Repositories
• Not much data … or duplication … (yet?)
• Departmental audits of research data practice
at University of Southampton to inform
developing institutional data & curation policy
• Barriers to data sharing:
– IPR and geospatial data
– Lack of awareness amongst researchers
– Cultural roots and resistance to change
• Exemplars of good practice: eBank Project
eCrystals ‘Global Federation’ Model
[Diagram]
Components: Data creation & capture in “Smart lab”; Data analysis; Laboratory repository; Institutional data repositories; Subject Repository; Aggregator services; Presentation services / portals; Publishers (peer-review journals, conference proceedings, etc.); Institution Library & Information Services
Flows and functions: Deposit; Validation; Search, harvest; Publication; Data discovery, linking, citation; Curation; Preservation
Roles, Rights & Responsibilities
• ‘Scientist’: Creation and use of data.
• ‘Data centre’: Curation of and access to data.
• ‘User’: Use of 3rd party data.
• ‘Funder’: Set / react to public policy drivers.
• ‘Publisher’: Maintain integrity of the scientific record.
Acknowledgement: Mark Thorley, NERC
Closing thoughts
• Co-ordination and join up
– High level and strategic: Funders
– Operational level and practical: JISC data services & research council data centres
• Funding
– Are current economic models for preservation &
data sharing infrastructure a) appropriate? b)
adequate? c) sustainable?
– Should inform prioritisation and investment
Closing thoughts 2
• Good Practice requirements
– Data management and sharing Policies
– Data Management Plans (peer-reviewed)
– Institutional data curation policies & planning
• Technical interoperability and integration
– Data are diverse and complex
– JISC IIE vision of discovery across repositories
– Contextual linking offers opportunity for data
centres and institutional repositories to realise
synergies and work more closely together
Closing thoughts 3
• Advocacy
– Programmes to reach across sectors
– Harmonisation and consistent messages
– Tailored & targeted to disciplines
– Researcher has some curatorial responsibility
• Training
– Lack of skills
– eLearning opportunity
– Data scientists? Recognition and career
development
– “Native” data scientists are coming….
“Dealing with the Data Deluge”
• JISC Repositories Programme
• Supporting Institutions in the Digital Age
• Digital Repositories Conference
• 5-6 June 2007
• University of Manchester
• Research Data Strand