Taverna in e-Lico
e-Lico is an EU Project (2009-2012) to create a virtual
laboratory for data mining and data-intensive sciences
Main partners:
– University of Geneva – project coordination
– University of Manchester – Taverna and text mining
– Institut National de la Santé et de la Recherche Médicale
(Toulouse) – biological data
– Greece – image mining
– Rapid-I – RapidMiner/RapidAnalytics
– University of Zurich – planner
Who in Manchester
Robert Stevens
Simon Jupp
Rishi Ramgolam
James Eales
Alan Williams
What has been done so far?
Populous – adaptation of RightField. Used for data entry
and ontology creation.
Taverna plugin to expose RapidMiner operators as
Taverna services
Taverna extension to create data mining plans that are
translated into workflows
RapidAnalytics repository browser for choosing and
uploading data, and so obtaining its metadata
Skinning of myExperiment – done by Don Cruickshank
What will be done – 1?
Saving of provenance, data and workflow together – call it a
run.
Uploading the run to e-Lico’s myExperiment
Ensuring semantics so that the data is known to be the
inputs to port “fred” of the workflow. Even better,
provenance item X is for invocation 237 of service Y in
run 93 of workflow…
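One way to picture the semantics wanted for a saved run is as a small record that ties each data value and provenance item to a named port, service invocation and run. The sketch below is only illustrative: the class names, field names and the workflow name "example_workflow" are hypothetical and are not taken from Taverna, myExperiment or the e-Lico code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvenanceItem:
    # Hypothetical record: one provenance item tied to one service invocation.
    item_id: str
    service: str
    invocation: int


@dataclass
class WorkflowRun:
    # Hypothetical "run": workflow + input data + provenance saved together.
    workflow: str
    run_number: int
    inputs: dict = field(default_factory=dict)      # port name -> data reference
    provenance: list = field(default_factory=list)  # ProvenanceItem records

    def describe(self, item: ProvenanceItem) -> str:
        # Spell out the relationship the slide asks for.
        return (
            f"provenance item {item.item_id} is for invocation {item.invocation} "
            f"of service {item.service} in run {self.run_number} "
            f"of workflow {self.workflow}"
        )


# Illustrative values only; "example_workflow" and the data reference are made up.
run = WorkflowRun(workflow="example_workflow", run_number=93)
run.inputs["fred"] = "data/input-1"   # this data is the input to port "fred"
item = ProvenanceItem(item_id="X", service="Y", invocation=237)
run.provenance.append(item)
print(run.describe(item))
```

Keeping the port name and the invocation number as explicit fields, rather than encoding them in file names, is what would let an uploaded run be queried and reopened later.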
What will be done – 2?
Re-opening of a run from myExperiment
Integration of provenance from RapidMiner’s operators
with the workflow provenance
Tweaking of the RapidMiner and planning extensions
Additional/improved renderers – desire for Cytoscape
SPARQL service plugin
Development and documentation of workflow patterns
What will be done – 3?
Coping with “dominating operators”
Training workshops
Benchmarking and documentation of results
What will go back into Taverna?
Plugin to expose RapidMiner services
Plugin to allow planning of data mining
SPARQL service
Workflow run upload/browse/reload
Repository browser possibly as additional data source
New renderers
What is e-Lico doing that is affected by other projects?
The saving of a workflow run depends upon
– myExperiment having semantic pack capabilities
– Some stronger mechanism to specify relationships
(wf4ever?)
Browsing/view of workflow run may leverage
– Polish MSc student’s work
Integration of provenance
– Overlap with eScience Central
What is e-Lico doing that affects other projects?
e-Lico workflow run may be a good test case for wf4ever
Renderers
– Do not know specific projects – should start list
Data gathering mechanism
– Ditto
Populous
– Highly related to RightField
– What about TavernaLC (Taverna in a spreadsheet)?
SPARQL service
– Harmonization/relationship with SADI
How could e-Lico be of use?
Data mining services
– Use in data-oriented projects e.g. MethodBox or SysMO?
Metadata about data
– MethodBox or SysMO?
Planning of workflows
– Does it work in practice?