Cloud computing to assist Surveillance of Neglected Tropical


This project is funded by
the European Union
Cloud computing to assist Surveillance of
Neglected Tropical Diseases: The Leishmaniasis
Virtual Laboratory in EUBrazilCC
Ignacio Blanquer (UPV)
Israel Cruz (ISCIII)
What is EUBrazilCC?
A project funded in the 2nd EU-Brazil coordinated
call
EUBrazil Cloud Connect (614048) is a Small or medium-scale focused research project (STREP) funded by the
European Commission under the Cooperation Programme,
Framework Programme Seven (FP7)
This project results from Call MCT/CNPq No. 13/2012, Brazil-European Union Cooperation Programme in the Area
of Information and Communication Technologies (ICT)
A team of 12 institutions co-led by UPV and UFCG.
20/10/14
What does EUBrazilCC aim at?
Foster EU-Brazil international cooperation in
cloud infrastructures at three levels:
Heterogeneous (especially cloud) infrastructure
federation.
Integration of programming services to efficiently
access infrastructure resources.
User applications, creating shared spaces for the
benefit of international collaborations.
Common (EU+Brazil) involvement in cloud
Main pillars of EUBrazilCC
The use cases
Three applications coming from the EU-Brazil cooperation:
• Leishmaniasis Virtual Laboratory
• Climate change & ecology
• Vascular system simulation
The platform
Programming environments, workflows and scientific gateways: mc2, PDAS, eSC, COMPSs, PMES-COMPSs
The infrastructure
Heterogeneous: CSGRID, Fogbow, ONE, IM / VMRC, OStack, HPC, clusters
EUBrazilCC – FP7-614048
The EUBrazilCC Infrastructure
EUBrazilCC gathers resources from 6 centres
Clusters, ONE & Ostack on-premise clouds.
Centres: BSC, CMCC, LNCC, UNEW, UFCG, UPVLC
Cloud >1000 cores + <500 op cores
HPC > 5500 cores
Federated at the level of resource provision (fogbow)
and at the level of the services (PMES-COMPSs
and CSGRID).
The EUBrazilCC Software Architecture
CSGRID
• A homogeneous interface for clusters and HPC infrastructures.
• Manages distributed & heterogeneous environments.
eScience Central
• A workflow-based platform for data analysis.
• Supports blocks in R, Java, Octave or JavaScript.
COMPSs
• A framework for the execution of parallel applications.
• Offers a BES interface and talks OCCI directly.
PDAS
• A Parallel Data Analysis service for processing Big Data cubes.
• Especially for multidimensional data.
IM / VMRC
• Deployment and configuration of VAs.
• A rich-metadata repository of VMIs.
fogbow
• A federation middleware for on-premise clouds and opportunistic resources.
• OCCI-compliant.
Interoperability
EUBrazilCC services are compatible with other
infrastructure providers
Fogbow implemented an OCCI-compliant interface
(fOCCI), tested at the CloudWATCH Cloud Plugfest
and Standards Profile Workshop.
PMES-COMPSs can talk through rOCCI with EGI
Federated Cloud infrastructures.
Authentication through VOMS
The EUBrazilCC Use Cases
A virtual lab to improve the monitoring of leishmaniasis
Every year 1-2 million new cases of leishmaniasis occur. The LVL aims to
improve surveillance and research activities in the field of this
neglected tropical disease by integrating data on parasite and
vector distribution from different databases (CLIOC, COLFLEB, ISCIII,
speciesLink, GenBank, PubMed), together with molecular data and
bioinformatics processing pipelines.
An integrated environment for blood flow and heart simulation
Simulating a heartbeat is a complex, multi-scale problem. EUBrazilCC will
deploy a complete blood simulation system with an accuracy beyond the
state of the art by integrating the heart simulation system (ALYA) with
a complete arterial simulation system (ADAN).
A scientific app to understand how biodiversity affects climate change
Understanding the mutual interaction at a global scale between climate
change & biodiversity dynamics is needed. EUBrazilCC will integrate two
workflows combining models of plant species distribution and multi-level
imaging data and processing in a scientific gateway.
Selected case: Leishmaniasis
Virtual Laboratory
www.who.int/tdr
Selected case: Leishmaniasis
Virtual Laboratory
Davies et al 2003
98 countries
350 million people at risk
Prevalence: 12 million
Incidence: 2 million
Mortality: 60 000 /yr
DALYs: 2 357 000
3rd parasitic disease
9th infectious disease
Changes immune status
Environmental variation
Drug resistance
Spread
Selected case: Leishmaniasis
Virtual Laboratory
This will require three main components
Integrating high-quality Databases
CLIOC (http://clioc.fiocruz.br/) and the ISCIII-WHO-CCL
collection for enriching molecular samples with
associated clinical information.
COLFLEB (http://colfleb.fiocruz.br/), ISCIII-WHO-CCL
collection and speciesLink (http://splink.cria.org.br/) for
the georeferenced information of vectors.
GenBank® (www.ncbi.nlm.nih.gov/genbank/) for the
LVL collection main data
The Leishmaniasis Virtual Laboratory will comprise
information about Leishmania parasites
(GenBank + CLIOC/ISCIII) and sand fly vectors (speciesLink +
COLFLEB/ISCIII).
Filtering will make it possible to identify the set of samples that are of
interest for a given study.
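The filtering step can be pictured with a minimal sketch. The record fields, the `filter_samples` helper and the sample entries below are hypothetical illustrations, not the actual LVL schema or data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical, simplified view of one LVL entry (not the real schema)."""
    sample_id: str
    source: str    # e.g. "GenBank", "CLIOC", "COLFLEB", "speciesLink"
    kind: str      # "parasite" or "vector"
    species: str
    country: str


def filter_samples(records: Iterable[SampleRecord],
                   kind: Optional[str] = None,
                   species: Optional[str] = None,
                   country: Optional[str] = None) -> List[SampleRecord]:
    """Keep the records that match every criterion that was supplied."""
    selected = []
    for rec in records:
        if kind is not None and rec.kind != kind:
            continue
        if species is not None and rec.species != species:
            continue
        if country is not None and rec.country != country:
            continue
        selected.append(rec)
    return selected


# Made-up entries, for illustration only.
RECORDS = [
    SampleRecord("demo-1", "GenBank", "parasite", "Leishmania infantum", "Spain"),
    SampleRecord("demo-2", "CLIOC", "parasite", "Leishmania braziliensis", "Brazil"),
    SampleRecord("demo-3", "COLFLEB", "vector", "Lutzomyia longipalpis", "Brazil"),
]

if __name__ == "__main__":
    for rec in filter_samples(RECORDS, kind="parasite", country="Brazil"):
        print(rec.sample_id, rec.species)
```

Criteria left as None are ignored, so the same helper serves both broad and narrow selections.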
LVL experiment
A typical experiment may consist of checking
whether the DNA sequence from a Leishmania
isolate obtained from an outbreak was already
described in other locations.
The available DNA sequences from different
collections plus the user’s sample
(outbreak isolate) are stored in a FASTA
file and sent to the processing
pipeline.
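Assembling that input can be sketched as follows. This is a minimal illustration of the FASTA layout (a ">" header line followed by the sequence); the helper names and the toy sequences are assumptions, not part of the LVL code.

```python
from typing import Dict


def to_fasta(sequences: Dict[str, str], line_width: int = 60) -> str:
    """Serialise {identifier: sequence} pairs as FASTA text."""
    lines = []
    for seq_id, seq in sequences.items():
        lines.append(f">{seq_id}")
        # Wrap long sequences so that no line exceeds line_width characters.
        for start in range(0, len(seq), line_width):
            lines.append(seq[start:start + line_width])
    return "\n".join(lines) + "\n"


def build_experiment_input(collection: Dict[str, str],
                           sample_id: str,
                           sample_seq: str) -> str:
    """Append the user's isolate to the collection sequences and serialise."""
    if sample_id in collection:
        raise ValueError(f"identifier already in use: {sample_id}")
    merged = dict(collection)       # leave the caller's dictionary untouched
    merged[sample_id] = sample_seq
    return to_fasta(merged)


if __name__ == "__main__":
    # Toy sequences, far shorter than real ones.
    collection = {"ref_A": "ACGTACGTAC", "ref_B": "ACGTTCGTAC"}
    print(build_experiment_input(collection, "outbreak_isolate", "ACGTACGTAA"), end="")
```

The resulting text could be written to a file and handed to whichever pipeline is selected.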
LVL experiment
eScienceCentral processes the set of stages of a
phylogeny study
Multiple pipelines will be offered, including multiple
algorithms, such as maximum parsimony, maximum
likelihood and neighbor-joining.
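Distance-based methods such as neighbor-joining start from a matrix of pairwise distances between aligned sequences. The sketch below computes only that input, using the simple p-distance (proportion of differing sites); it does not build a tree and is not the eScience Central pipeline. It assumes the sequences are already aligned to equal length, and all names and toy sequences are illustrative.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping sites with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances, keyed by name pairs in sorted order."""
    return {(i, j): p_distance(aligned[i], aligned[j])
            for i, j in combinations(sorted(aligned), 2)}


def closest_relative(aligned: Dict[str, str], query: str) -> str:
    """Name of the sequence with the smallest p-distance to the query."""
    others = [name for name in aligned if name != query]
    return min(others, key=lambda name: p_distance(aligned[query], aligned[name]))


if __name__ == "__main__":
    toy = {"outbreak_isolate": "ACGT", "ref_A": "ACGA", "ref_B": "TTGA"}
    print(distance_matrix(toy))
    print(closest_relative(toy, "outbreak_isolate"))
```

A real study would use a substitution model and a dedicated phylogenetics tool; the p-distance is only the simplest possible choice.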
Execution may take hours,
depending on the arguments.
The result will be a phylogenetic
tree showing how this new
isolate is related to the other sequences.
LVL collection view - heatmap
Different branches of the tree define similarity
subsets, which can be explored separately.
Geographic maps will give a view of the “hot spots”
where more
entries are
found for a
given sequence.
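One simple way to obtain such hot spots is to bin the georeferenced entries into grid cells and count them. The sketch below assumes entries are available as (latitude, longitude) pairs in decimal degrees; the grid size, function names and coordinates are illustrative and not taken from the LVL.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Cell = Tuple[int, int]


def grid_cell(lat: float, lon: float, cell_deg: float = 1.0) -> Cell:
    """Index of the grid cell (cell_deg degrees wide) containing the point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hot_spots(points: Iterable[Tuple[float, float]],
              cell_deg: float = 1.0,
              top: int = 3) -> List[Tuple[Cell, int]]:
    """The `top` most populated grid cells with their entry counts."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return counts.most_common(top)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    entries = [(-22.9, -43.2), (-22.5, -43.8), (40.4, -3.7)]
    for cell, count in hot_spots(entries):
        print(cell, count)
```

The per-cell counts are what a mapping layer would then colour to render the heat map.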
LVL collection view – sand flies
Heat maps could be compared with maps of georeferenced occurrences of the vectors from COLFLEB
and other collections.
Analysing the interactions of specific strains and
specific vector species will improve knowledge
of disease control management.
Ecological niche modelling could help identify
potential risk areas under future
climate conditions.
LVL
Gathered information on parasite, vector and host: location, species, host immune status, host preferences, clinical form, genotype.
Surveillance System
Mapping Distribution /Ecological Niche Modelling
Assess Spread of Parasites and Vectors
Parasite-Vector-Host Profiles
Outbreak-associated Traits
Emergence Preparedness
Dissemination and Future
The EUBrazilCC web site has
additional information.
There are other ways to get
informed:
Twitter: @EUBrazilCC
www.facebook.com/EUBrazilcloudconnect
www.linkedin.com/in/eubrazilcloudconnect
Join our newsletter mailing list!
http://www.eubrazilcloudconnect.eu/content/stay-touch
Conclusions
EUBrazilCC aims at demonstrating the use of
cloud infrastructures for research as a basis for
international cooperation
All the activities are based on joint EU-Brazil work.
The LVL is a use case with an important social
impact that requires cloud computing
infrastructures for production usage.