PLANETS, OPF & SCAPE
A summary of the tools from these
preservation projects, and where their
development is heading
www.openplanetsfoundation.org
PLANETS
• A big project to build digital preservation tools...
OPF’s Challenge
• The Open Planets Foundation was set up to sustain
the PLANETS outputs into the future.
– But the tools are
• Numerous, often complex, & of mixed quality/maturity
• Require complex technology stacks (JEE)
– So, how do we make the code sustainable?
• Selection, modularisation, simplification
• Aim for a flexible suite of modular tools, rather than a
monolithic system
SCAPE
• http://www.scape-project.eu/
• Many PLANETS partners
– Including OPF
• Many new partners too
• Driven by data
– Web archiving, science data, large-scale
• Cluster computing for scale
– Based on the Hadoop platform
PLATO
The PLANETS Testbed
The PLANETS Testbed:
Too Many Good Ideas In One Place
• Designing experiments
– Web GUI for complex workflows
• Running experiments
– All services hosted centrally, plus test corpora
• Analysing the results
– Per-experiment automated & manual analysis
– Multi-experiment aggregation & data mining
• Sharing all of the above
Re-imagining The PLANETS Testbed:
A Modular Approach
• Use separate tools in each role
– Experiment Design
– Execution
– Analysis
• Publish results from each
– Loosely coupled instead of all-in-one
• i.e. sharing is built into the design
Experiment Design:
SCAPE Workflows In Taverna
• As part of SCAPE
Experiment Design Support:
SCAPE Service Registry
Experiment Design Support:
OPF Shared Test Corpora
• Simple collections accessed over HTTP
– No special browser software required
• Publicly hosted by HATII
– May also be mirrored by OPF members
• Stabilise corpora from PLANETS
– Absorb corpora from SCAPE & elsewhere
• Look for Open Source CMS/Annotation tools
– Layer on top of HTTP collections
Experiment Design Support:
Sharing & Publishing Via myExperiment
Experiment Execution Support:
SCAPE’s Lightweight Tool Wrapping
• PIT: Preservation-action Invocation Tool
– Uses XML ‘tool specification’ documents that
describe preservation actions
• Command-line templates, Java classes, PLANETS/SCAPE
web services, etc.
– Built to be shared
• Can be published via, e.g., myExperiment
• Should lead to more reproducible results
– Re-using PLANETS interoperability code
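To make the idea of a shareable tool specification concrete, here is a minimal sketch of how a command-line template stored in an XML document could be turned into a runnable command. The element names, attribute names, and the `build_command` helper are all hypothetical and chosen for illustration; they do not reproduce the actual PIT schema or API.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool specification: the element and attribute names are
# illustrative only and are NOT the real PIT schema.
TOOL_SPEC = """\
<toolspec name="example-converter">
  <action id="convert" type="migrate">
    <command>convert ${input} ${output}</command>
  </action>
</toolspec>
"""


def build_command(spec_xml, action_id, **params):
    """Return the argv list for one action described in a tool specification."""
    root = ET.fromstring(spec_xml)
    for action in root.iter("action"):
        if action.get("id") == action_id:
            template = Template(action.findtext("command"))
            # Quote each value so paths containing spaces stay one argument.
            quoted = {key: shlex.quote(value) for key, value in params.items()}
            return shlex.split(template.substitute(quoted))
    raise KeyError(f"no action {action_id!r} in tool specification")


argv = build_command(TOOL_SPEC, "convert", input="in file.tif", output="out.png")
```

Because the specification is plain data, the same document could in principle be published alongside a workflow and re-used by anyone who wants to repeat the run.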
Experiment Execution:
Multi-platform Tool & Workflow Invocation
• Shared tool specifications make multi-platform
execution easier
– From the command line
– From within Taverna
– From the SCAPE cluster platform
– From a simplified web interface
• Run local-first, remote/service as needed
• Collect results in a standard form, using Testbed code
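The "local-first, remote as needed" rule and the standard result form can be sketched roughly as follows. This is only an illustration under assumptions: `RunResult`, `choose_backend`, and `plan_run` are hypothetical names, the set of remotely available tools is supplied by the caller, and nothing here reflects the actual Testbed or SCAPE code.

```python
import shutil
from dataclasses import dataclass


@dataclass
class RunResult:
    """Hypothetical standard result record shared by every platform."""
    tool: str
    backend: str  # "local" or "remote"
    argv: list


def choose_backend(tool, remote_tools, which=shutil.which):
    """Prefer a locally installed tool; fall back to a remote service."""
    if which(tool) is not None:
        return "local"
    if tool in remote_tools:
        return "remote"
    raise LookupError(f"{tool!r} is not available locally or remotely")


def plan_run(argv, remote_tools, which=shutil.which):
    """Describe how a command would be run, without executing it."""
    backend = choose_backend(argv[0], remote_tools, which)
    return RunResult(tool=argv[0], backend=backend, argv=list(argv))
```

The `which` parameter is injectable so the selection logic can be exercised without depending on what happens to be installed on a given machine.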
Experiment Execution:
Publishing Experimental Results Via REF
• OPF Results Evaluation Framework: REF
– Hard-coded experiments of common interest
• Can run the experiment automatically
– Publishes results as linked data
• http://data.openplanetsfoundation.org/ref/extension/
• Built by Dave Tarrant, based on P2 format registry
– Will come up again in the Identification session
– SCAPE aims to publish much more data
Analysing Results:
Linked Data & Future Plans
• REF allows data to be inspected
– Concentrating on collecting data at present
• Will expose SPARQL endpoint for data queries
– Analysis and visualisation can be built upon that
• Please add analysis Issues for your Datasets and
preservation processes to the wiki!
– e.g. what graphs and statistics would be useful?
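As a rough idea of what a data query might look like once an endpoint exists, the sketch below builds a SPARQL Protocol GET request URL. The endpoint address is a placeholder (the slides do not give one), and the query deliberately assumes nothing about the REF vocabulary: it only counts how often each predicate is used, which is a common first step when exploring an unfamiliar linked-data set.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the slides say an endpoint will be exposed,
# but do not give its address.
ENDPOINT = "http://example.org/sparql"

# Vocabulary-neutral query: list the 20 most frequently used predicates.
QUERY = """\
SELECT ?p (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?n)
LIMIT 20
"""


def sparql_get_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
```

The resulting URL could be fetched with any HTTP client; graphs and statistics would then be built from the returned bindings.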
Summary
• PLATO
– SCAPE will add Preservation Watch & more
• The PLANETS Testbed
– Re-imagined as a gateway to a complementary
suite of preservation tools and data services
– SCAPE leveraging work from Taverna, IMPACT
• Development driven by user needs
– SCAPE Scenarios, AQuA/Hackathon Issues