Research Objects
e-Labs and Research Objects
What is an e-Laboratory?
• A laboratory is a facility that provides controlled
conditions in which scientific research, experiments and
measurements may be performed, offering a work space
for researchers.
• An e-Laboratory is a set of integrated components that,
used together, form a distributed and collaborative space
for e-Science, enabling the planning and execution of in
silico experiments -- processes that combine data with
computational activities to yield experimental results
e-Labs
• An e-Lab consists of:
1. a community;
2. work objects;
3. generic resources for building and transforming work objects.
[Diagram labels: People, Data, Methods]
• Sharing infrastructure and content across projects
Research Objects
• The common currency for e-Labs
• A story about an investigation
• An aggregation of resources
– With a particular purpose, reason or rationale for the
aggregation
• Capturing the investigation process “from soup to nuts”
• Intended to be
– Reusable
– Repeatable
– Replayable
e-Labs + Research Objects
• An e-Lab is built from a collection of services, consuming
and producing Research Objects
[Diagram: a Workbench/RO-driven UI and RO-aware services (Visualisation, Notification, Annotation etc.) connected by an RO Bus]
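The idea of services consuming and producing Research Objects over a shared bus can be sketched in a few lines. This is only an illustration under stated assumptions: the `ResearchObject` and `ROBus` classes, the topic names and the two example services are all hypothetical and do not come from any e-Lab implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Hypothetical minimal RO: an identifier plus free-form metadata."""
    identifier: str
    metadata: dict = field(default_factory=dict)


class ROBus:
    """Toy in-process bus: services subscribe to a topic and receive ROs."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, ro):
        # Deliver the RO to every service registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(ro)


bus = ROBus()
notifications = []


def annotation_service(ro):
    # Consumes an RO and produces a new, annotated RO on another topic.
    annotated = ResearchObject(ro.identifier, {**ro.metadata, "tags": ["obesity"]})
    bus.publish("annotated", annotated)


def notification_service(ro):
    # Consumes annotated ROs and records a message for the user.
    notifications.append(f"{ro.identifier} annotated with {ro.metadata['tags']}")


bus.subscribe("created", annotation_service)
bus.subscribe("annotated", notification_service)
bus.publish("created", ResearchObject("ro:example-1", {"title": "Pilot study"}))
```

In this sketch neither service knows about the other; each only agrees on the RO as the common currency, which is the point the slide makes.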
[Diagram labels: Research Methods Experts; Development e-Lab; Scripts, Data sets, Research Objects, Services, Publications, Workflows; Application e-Lab; Delivery Experts; Policy makers]
Knowledge Burying (Mons)
[Diagram labels: Knowledge, Experiment, Publication, Paper, Text Mining]
• Publishing/mining cycle results in loss of knowledge
– ≥ 40% of information lost
• RIP – Rest in Paper
• ROs as a mechanism for publication of knowledge,
preserving information about the process.
(Current) RO Principles
• Common schema for internal structure
• References + metadata rather than Data
• Graceful degradation of understanding
– Not all services understand everything
– cf RDF/OWL
• Reflective
– Clickable
– Displayable
• Mailable
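Two of these principles, "references + metadata rather than data" and "graceful degradation", can be illustrated with a small sketch. The JSON layout, the field names, the example.org URIs and the `summarise` function below are assumptions made for illustration only; the slides do not define a concrete schema.

```python
import json

# A hypothetical RO serialisation: it holds references (URIs) and metadata,
# not the data itself, so it stays small enough to be mailed around.
ro_document = json.dumps({
    "id": "ro:example-2",
    "title": "Pilot analysis",
    "resources": [
        {"type": "dataset", "uri": "http://example.org/data/1", "label": "Cohort data"},
        {"type": "workflow", "uri": "http://example.org/wf/7", "label": "Analysis workflow"},
        {"type": "hologram", "uri": "http://example.org/h/3", "label": "Unknown kind"},
    ],
})

# Resource types this particular (hypothetical) service understands.
KNOWN_TYPES = {"dataset", "workflow"}


def summarise(document):
    """Process what is understood and skip the rest instead of failing."""
    ro = json.loads(document)
    understood, skipped = [], []
    for resource in ro.get("resources", []):
        if resource.get("type") in KNOWN_TYPES:
            understood.append(resource["label"])
        else:
            skipped.append(resource["uri"])
    return understood, skipped


understood, skipped = summarise(ro_document)
```

The service degrades gracefully: the unknown resource type is set aside rather than causing an error, much as an RDF consumer can ignore vocabulary it does not recognise.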
Anatomy of an RO
Flavours of RO
• RO as encapsulation of a process
– Up to date references to appropriate resources
• RO as a record of what happened
– Curated, “fossilised”, immutable aggregation
• RO as collection
– e.g. tutorial materials
• RO as protocol
– General templates that may be specialised for specific domains/tasks
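The contrast between the first two flavours, a live RO with up-to-date references and a "fossilised" immutable record, might look like the following sketch. The class names `LiveRO` and `FossilisedRO` and the `fossilise` method are hypothetical, chosen only to mirror the wording on the slide.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class FossilisedRO:
    """Immutable record of what an RO referenced at one point in time."""
    identifier: str
    references: Mapping[str, str]
    frozen_at: datetime


class LiveRO:
    """RO as encapsulation of a process: references may be kept up to date."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.references = {}

    def update_reference(self, name, uri):
        self.references[name] = uri

    def fossilise(self):
        # Copy the references and wrap them in a read-only view.
        snapshot = MappingProxyType(dict(self.references))
        return FossilisedRO(self.identifier, snapshot, datetime.now(timezone.utc))


live = LiveRO("ro:example-3")
live.update_reference("dataset", "http://example.org/data/v1")
record = live.fossilise()
# The live RO moves on; the fossilised record keeps the old reference.
live.update_reference("dataset", "http://example.org/data/v2")
```

Copying before freezing is the design choice that matters here: the record must not share state with the RO it was taken from.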
What’s inside?
• A research problem
• A hypothesis
• Experimental design
• Data sets
• Measurements
• Workflows used to analyse data
• Results of data analysis
• Information about ethical approval
• Governance policies
• Publications, e.g. papers, reports, slide-decks
• The investigators involved in the experiment
• References to other SROs that the work depends on or cites
• Descriptions of relationships between resources
– Lilly experiment ontology
– SWAN/SIOC
– Scholarly discourse
– OBO relations ontology
RO Lifecycle
• ROs have a lifecycle: they may be created, manipulated,
edited, interrogated and published.
• Appropriate services
support this lifecycle
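One way to picture a lifecycle that services can support is a small state machine. The stages below are taken loosely from the verbs on the slide, but the specific transition rules (for example, that a published RO can no longer be edited) are an assumption made for this sketch, not something the slides state.

```python
from enum import Enum


class Stage(Enum):
    CREATED = "created"
    EDITED = "edited"
    PUBLISHED = "published"


# Assumed transition rules: editing may repeat, publishing is final.
ALLOWED = {
    Stage.CREATED: {Stage.EDITED, Stage.PUBLISHED},
    Stage.EDITED: {Stage.EDITED, Stage.PUBLISHED},
    Stage.PUBLISHED: set(),
}


class LifecycleError(Exception):
    """Raised when a transition is not permitted by ALLOWED."""


class TrackedRO:
    """Hypothetical RO wrapper that records its lifecycle history."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.stage = Stage.CREATED
        self.history = [Stage.CREATED]

    def move_to(self, stage):
        if stage not in ALLOWED[self.stage]:
            raise LifecycleError(f"{self.stage.value} -> {stage.value} not allowed")
        self.stage = stage
        self.history.append(stage)

    def interrogate(self):
        # Interrogation reads the RO's state without changing it.
        return {
            "identifier": self.identifier,
            "stage": self.stage.value,
            "steps": len(self.history),
        }


ro = TrackedRO("ro:example-4")
ro.move_to(Stage.EDITED)
ro.move_to(Stage.PUBLISHED)
status = ro.interrogate()
```

A registry or repository service could enforce rules of this kind centrally, so that every e-Lab component sees the same lifecycle.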
e-Labs services
• Registry
• Repository
• Workflow Monitoring
• Event Logging
– News feeds, activities
• Social Metadata
– Tagging, groups, users, sharing
• Annotation
• Search
• Visualisation
• Notification
• Authentication, Authorisation and Role-based Access
• Job Execution
– Workflow engine, HPC scripts etc.
• Naming and Identity
– Centralised vs. distributed
• Synchronisation
– To support on-line and offline working
• Anonymisation
– e.g. for health records
• Text Mining
e-Labs activity
e-Labs TAG
• Obesity e-Lab (details next)
• myExperiment
– Packs as a precursor to
ROs
– Sharing/Social networking
services
• Biocatalogue
– Curated collection of bio
web services
• LifeGuide
– myExperiment for
storing/sharing Internet
interventions
• NW eHealth
– e-Labs as a “sense-making
layer” on top of NHS
Information Systems
• ONDEX
– Linking bio data sets
• Sysmo-DB
– Web-based exchange of
data
• Shared Genomics
– HPC Infrastructure for
analysis of large-scale
genetic data
Evolution
1st Generation
• Current practice of early adopters of e-Labs tools such as Taverna
• Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline.
• Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data.
• Provenance is recorded but not shared and re-used.
• Science is accelerated and practice beginning to shift to emphasise in silico work
2nd Generation
• Designing and delivering now, e.g. Obesity e-Lab
• Based on our experience with Taverna and myExperiment and research results arising from these activities.
• Key characteristic is re-use of the increasing pool of tools, data and methods across areas/disciplines.
• Contain some freestanding, recombinant, reproducible research objects. Provenance analytics plays a role.
• New scientific practices are established and opportunities arise for completely new scientific investigations.
3rd Generation
• The vision of the e-Labs we'll be delivering in 5 years, illustrated by open science.
• Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher.
• Key characteristic is radical sharing
• Research is significantly data driven - plundering the backlog of data, results and methods.
• Increasing automation and decision-support for the researcher - the e-Laboratory becomes assistive.
• Provenance assists design
• Curation is autonomic and social
ROs and e-Labs
• Research Objects
– Aggregations of resources (people + data + methods)
– Rationale, purpose, story
– Lifecycle
– Share and Exchange: Reuse, Replay, Repeat
• e-Labs
– Collection of services consuming and producing
Research Objects
A dream…
Problem
http://www.flickr.com/photos/fatdeeman/2879894
E-Lab