2010 - Bicocca
Bioinformatics and Natural Computing
DISCo Departmental Workshop
2010-06-03
Outline
• BIMIB: BIoinformatics MIlano Bicocca (http://bimib.disco.unimib.it)
• Research areas and new directions
• People
• Cooperations
2010-06-03
DISCo UNIMIB Departmental
Workshop
2
Research areas and directions
• Systems Biology
– Models of biological systems
– Stochastic Simulation of Biochemical Processes
• Data Mining
– Statistical Analysis of Biological Experiments
– Association Studies, Microarray Analysis, Clustering, Redescriptions
• Algorithmics
– Approximation Algorithms for Combinatorial Problems in Computational Biology (MAST, LCS, Fingerprint clustering …)
• Sequence Analysis
– Motif Finding, SNP classification, Haplotyping, Alternative Splicing Prediction
• Biomedical Ontologies
– Collaborative Association Studies, Phenotype Ontology Development
• Natural Computing
– Theory and applications of Membrane Systems
– Splicing Systems and Formal Languages
– DNA Word Design
– Evolutionary computing
Natural Computing
Natural computing
• The work conducted in this area concerns the study of models
of computation that are inspired by nature
• The most important research lines that the BIMIB group is
pursuing are centered on
– DNA computing
– Membrane systems
– Evolutionary and Genetic computing
Natural Computing:
basic research
• Much of the research done in these areas can be
characterized as theoretical computer science, where
questions of decidability, computational complexity and
expressive power are paramount
• In particular:
– Relations with languages in the usual Chomsky hierarchy
– Comparison with other computational models
– Complexity aspects related to time and space resources
– Application of the model to the solution of computationally hard problems
– Fitness-driven Importance Sampling techniques for evolutionary algorithms
– Operators-Driven Distance Measures
Natural Computing:
applications
• Some applications include:
– Description of cellular phenomena or cellular structures (e.g.,
Mechanosensitive channels, Sodium-Potassium pump, …)
– Analysis of the behaviour of complex systems, by means of
stochastic models
– Design of software simulators to return meaningful
information to biologists
– Automatic assessment of systems biology parameters
– Automatic mining of microarray datasets
Bioinformatics
Bioinformatics: sequence
analysis applications
• One of the major applications of informatics to
molecular biology is the use of string analysis
algorithms to study nucleic acid and protein
sequences
Bioinformatics: sequence
analysis applications
• Alternative splicing prediction
– Alternative splicing (AS) is considered one of the main
mechanisms able to explain the huge gap between the
number of predicted genes and the high complexity of
the human proteome.
– The main goal is the development of fast and reliable
computational tools for analyzing and predicting AS from
Expressed Sequence Tags (ESTs) and other genomic data
– ASPIC (Alternative Splicing PredICtion) tool
Bioinformatics: sequence
analysis applications
• Approximate Pattern Discovery
– Given a set of nucleotide or protein sequences, find all the
motifs or conserved patterns, i.e.:
• All patterns that occur (with a maximum allowed number of mutations,
insertions or deletions) in every sequence of the set
• All patterns that occur (as above) in a “surprisingly” high number of
sequences
• The pattern “closest” to the sequences under some distance measure
– Pattern discovery: The WeederWeb System
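The first formulation above (patterns occurring in every sequence with a bounded number of mutations) can be illustrated with a brute-force Python sketch. It is not the algorithm used by WeederWeb: it only counts substitutions (Hamming distance), ignores insertions and deletions, and enumerates all 4^k candidate patterns, so it is usable only for small k. The function names and the toy sequences are assumptions made for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_approx(pattern: str, sequence: str, max_mismatches: int) -> bool:
    """True if some window of `sequence` is within `max_mismatches` of `pattern`."""
    k = len(pattern)
    return any(
        hamming(pattern, sequence[i:i + k]) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


def common_motifs(sequences, k, max_mismatches, alphabet="ACGT"):
    """All length-k patterns that occur approximately in every sequence."""
    motifs = []
    for letters in product(alphabet, repeat=k):  # exhaustive: 4**k candidates
        pattern = "".join(letters)
        if all(occurs_approx(pattern, s, max_mismatches) for s in sequences):
            motifs.append(pattern)
    return motifs


# Made-up toy data, not from the slides.
seqs = ["ACGTACGT", "TTACGAAC", "GGACGTTT"]
exact = common_motifs(seqs, k=3, max_mismatches=0)
relaxed = common_motifs(seqs, k=3, max_mismatches=1)
```

Allowing one mismatch can only enlarge the result, which is why practical tools need a statistical score to rank the "surprising" patterns.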
Bioinformatics: sequence
analysis applications
• Phylogenetic Reconstruction and Comparison
– Computational complexity and algorithmic solution of
optimization problems derived from specific instances of the
more general problem of comparing phylogenies (or
evolutionary networks) to combine them into a single
representation (i.e. an evolutionary tree or network).
– A basic problem we investigate in comparative phylogenetics
is the reconciliation (or inference) of a species tree from gene
trees
Bioinformatics: sequence
analysis applications
• Haplotype Inference (HI) and Genetic Variation
Analysis
– Design and experimentation of algorithms for solving
combinatorial problems related to haplotype inference and
genetic variation analysis.
– Specific computational problems of interest are:
• inferring the complete information on haplotypes from
(incomplete or partial) haplotypes or genotypes
• efficient reconstruction of the perfect phylogeny describing the
evolutionary history of Single Nucleotide Polymorphism
(SNP) data in the presence of recurrent mutations
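As background for the perfect phylogeny problem, here is a minimal Python sketch of the classical four-gamete test. For binary SNP data with no recurrent mutations and no fixed ancestral state, a haplotype matrix admits a perfect phylogeny exactly when no pair of SNP columns shows all four combinations 00, 01, 10 and 11. The sketch covers only this base case; the variant with recurrent mutations mentioned above is harder and is not addressed here. The function name and the example matrices are illustrative assumptions.

```python
from itertools import combinations


def admits_perfect_phylogeny(matrix):
    """Four-gamete test on a binary matrix (rows = haplotypes, columns = SNPs)."""
    if not matrix:
        return True
    n_sites = len(matrix[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (allele_i, allele_j) pairs seen across haplotypes.
        gametes = {(row[i], row[j]) for row in matrix}
        if len(gametes) == 4:
            return False  # all of 00, 01, 10, 11 present: sites are incompatible
    return True


# Made-up toy matrices.
compatible = [[0, 0], [0, 1], [1, 0]]
conflicting = compatible + [[1, 1]]
```

A failed test means that at least one of the two sites must have mutated more than once (or that recombination occurred), which is where models with recurrent mutations come in.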
Statistical Data Analysis of High
Throughput Data
Statistical Data Analysis of
Biological Experiments
• The amount of data generated by high-throughput
(non-sequencing) biotechnology apparatuses is huge
– Microarray
– microRNA
– Proteomic machinery (cf. mass spectrometry)
Statistical Data Analysis of
Biological Experiments
• Statistical methods of various kinds are necessary to validate
hypotheses and perform data mining operations
• The research pursued by the group in this area concentrated on
– Time course data analysis with kernel methods and evaluation of
ontological “enrichments”
– Multiple data sources integration for mass-spectrometry data with
mutual information scoring
– Application of Evolutionary and Genetic computing for the
assessment of features (biological markers and combination of
biological markers) in gene assays
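The slides do not say how mutual information is used to score mass-spectrometry data sources. As a generic illustration, this Python sketch computes the empirical mutual information (in bits) between two discretized variables, I(X;Y) = Σ p(x,y) log2( p(x,y) / (p(x) p(y)) ). The function name and the toy vectors are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two equal-length discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Made-up toy data: identical variables share 1 bit, independent ones share 0.
same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Continuous intensities would have to be binned first, and the choice of bins affects the score.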
Biomedical Ontologies
Engineering
Biomedical Ontologies
• The need for common vocabularies and “ontologies”
used to label and/or model data has been recognized
as a cornerstone of community research by biologists
and physicians
• The BIMIB group worked on using ontologies for two
applications
– Enrichment studies (cf. statistical analysis)
– Definition of new ontologies for clinical applications and
genotype-phenotype associations
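The slides do not specify the statistic used for enrichment studies. A common choice, assumed here for illustration, is the one-sided hypergeometric test: given a population of genes, some of which carry an ontology term, it asks how likely it is that a selected sample contains at least the observed number of annotated genes. The function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, annotated_in_sample):
    """P(X >= annotated_in_sample) for X ~ Hypergeometric(population, annotated, sample)."""
    upper = min(annotated, sample)
    total = comb(population, sample)  # number of ways to draw the sample
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Made-up toy numbers: 10 genes, 5 annotated with a term, 5 selected.
p_all_annotated = enrichment_p_value(10, 5, 5, 5)
p_at_least_zero = enrichment_p_value(10, 5, 5, 0)
```

When many ontology terms are tested at once, the resulting p-values need a multiple-testing correction.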
Biomedical Ontologies
NeuroWEB
• The NeuroWEB project was concluded in 2009
– The aim of the NEUROWEB project is to support association
studies in the field of neurovascular medicine, with a special
commitment to genotype-phenotype relations
– In particular, in the NEUROWEB project, the phenotype is
formulated on the basis of the patients’ clinical data,
eventually leading to the comprehensive assessment of the
patients’ pathological state
Biomedical Ontologies
NeuroWEB
• Three main ontological layers (10 Top Phenotypes, ~200 Low Phenotypes,
~300 Core Data Set elements), organized in taxonomies
• A set of ontological relations (17 object properties) to:
– Connect the leaves of the three layers
– Enable complex phenotype construction
• Accessory layers (anatomical parts,
quantitative/qualitative attributes, …)
Biomedical Ontologies
NeuroWEB
[Figure: diagram of the NeuroWEB ontological layers. Labels: CDS, Onto Relations, LOW PHENOTYPE, Onto Relations, TOP PHENOTYPE]
Systems Biology
Simulation and Analysis
Simulation of biological systems
• Systems biology is the study of the emergent
properties of a biological system once it is modeled (and
simulated) as a set of interacting parts
• Different kinds of simulations are possible
– Deterministic (differential equations)
– Stochastic (Gillespie’s algorithm, a Monte Carlo
method)
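As a minimal sketch of the stochastic option, the following Python code applies Gillespie's direct method to a toy birth-death process (constant production, first-order degradation). The model, the function name and the rate values are illustrative assumptions, not taken from the BIMIB simulator. The deterministic counterpart, dx/dt = k_prod - k_deg * x, settles at k_prod / k_deg, while the stochastic trajectory keeps fluctuating around that value.

```python
import random


def gillespie_birth_death(x0, k_prod, k_deg, t_end, rng):
    """Direct-method SSA for: 0 -> X (rate k_prod), X -> 0 (rate k_deg * x)."""
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while True:
        a_prod = k_prod            # propensity of the production reaction
        a_deg = k_deg * x          # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                  # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory


# Toy parameters: the deterministic steady state would be 10.0 / 0.1 = 100.
traj = gillespie_birth_death(x0=0, k_prod=10.0, k_deg=0.1, t_end=50.0,
                             rng=random.Random(0))
```

Each event changes the molecule count by exactly one, so the cost grows with the number of reaction firings. This is the main motivation for approximate schemes such as τ-leaping.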
Stochastic Simulation
• The modeling formalism:
– Membrane (P) systems
• The simulator
– C language
– Desktop PC
– Cluster DISCo and CINECA with MPI implementation
– Algorithm: modified Gillespie’s algorithm with τ-leaping
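The slides give no details of the group's modified algorithm, and the simulator itself is written in C. Purely to illustrate the idea behind τ-leaping, this Python sketch advances a toy birth-death process in fixed steps of length tau, firing each reaction a Poisson-distributed number of times per step instead of simulating every event. The fixed step, the clamp at zero, the Poisson sampler and all names are simplifying assumptions; practical implementations usually choose tau adaptively.

```python
import random


def poisson(lam, rng):
    """Sample Poisson(lam) by counting unit-rate exponential arrivals in [0, lam)."""
    count, total = 0, rng.expovariate(1.0)
    while total < lam:
        count += 1
        total += rng.expovariate(1.0)
    return count


def tau_leap_birth_death(x0, k_prod, k_deg, tau, n_steps, rng):
    """Fixed-step tau-leaping for: 0 -> X (rate k_prod), X -> 0 (rate k_deg * x)."""
    x = x0
    trajectory = [(0.0, x)]
    for step in range(1, n_steps + 1):
        # Propensities are frozen at their values at the start of the leap.
        births = poisson(k_prod * tau, rng)
        deaths = poisson(k_deg * x * tau, rng)
        x = max(x + births - deaths, 0)  # crude guard against negative counts
        trajectory.append((step * tau, x))
    return trajectory


# Toy parameters, matching a deterministic steady state of 10.0 / 0.1 = 100.
leap = tau_leap_birth_death(x0=0, k_prod=10.0, k_deg=0.1, tau=0.1,
                            n_steps=500, rng=random.Random(0))
```

The approximation is acceptable only while propensities change little within one leap; a larger tau is faster but less accurate.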
Studying stochasticity in
biological systems
• 2 kinds of noise:
– intrinsic noise - due to the inherent nature of the biochemical
interactions
– extrinsic noise - due to the external environmental conditions
• Complex systems such as biological ones are
non-linear and often exhibit multiple steady states,
bifurcations or chaotic behavior
Stochastic simulations:
applications
• Molecular and cellular scale:
– transport proteins
• Na+/K+ pump, Ca2+ channels, mechanosensitive channels
– chemical reactions
• Belousov-Zhabotinsky, Michaelis-Menten
– cellular signaling pathways
• EGFR, Ras/cAMP/PKA
– bacterial colonies
• Vibrio fischeri, Pseudomonas aeruginosa
Biological systems simulations:
Colorectal Crypts
Three-dimensional schematic of a crypt in the mouse small intestine. The positions of the individual cells show how things might look in a typical crypt. The Paneth cells tend toward the bottom, where they contribute to innate immunity by responding to bacterial infection (Ayabe et al. 2000). The numbers on the cells show the transit cell generation i, as in the Ti of Figure 12.6. The stem cells vary in actual cellular position in the range 3–7, but on average appear to be around cell position 4 when numbered from the bottom. The figure only shows the bottom 7 cell positions of the approximately 15 positions. CSC abbreviates "clonogenic stem cell" (see Figure 12.6). Redrawn from Marshman et al. (2002). Copied from NCBI Frank’s online book
People BIMIB DISCo
• Marco Antoniotti
• Paola Bonizzoni
• Claudio Ferretti
• Alberto Leporati
• Giancarlo Mauri
• Raffaella Rizzi
• Leonardo Vanneschi
• Claudio Zandron
• Italo Zoppis
• Roslyn Sagaya Mary Antonath
• Stefano Beretta
• Mauro Castelli
• Paolo Cazzaniga
• Gianluca Colombo
• Antonella Farinaccio
• Luca Manzoni
• Dario Pescini
• Yuri Pirola
• Antonio Enrico Porreca
• Andrea Valsecchi
Other People
• Francesco Archetti, DISCo
• Enza Messina, DISCo
• Enzo Martegani, BtBs
• Marco Vanoni, BtBs
• Riccardo Dondi, Un. Bergamo
• Gianluca Della Vedova, Statistica, UNIMIB
• Daniela Besozzi, Un. Milano
• Giulio Pavesi, Un. Milano
• Graziano Pesole, Un. Bari
• Mario Giacobini, Un. Torino
• Paolo Provero, Un. Torino
• Manuela Gariboldi, IFOM-IEO
• James Reid, IFOM-IEO
• Luciano Milanesi, ITB CNR
• Marco Pierotti, Istituto Nazionale dei Tumori
• Giovanna Castoldi, Medicina, UNIMIB
• Fulvio Magni, Medicina, UNIMIB
Other People International
• Daniele Merico – Un. Toronto, Toronto, Canada
• Gary Bader – Un. Toronto, Toronto, Canada
• Bud Mishra – NYU, New York, USA
• Naren Ramakrishnan – Virginia Tech, Blacksburg, VA, USA
• Victor Moreno – ICOncologia, Barcelona, Spain
• Miguel-Angel Pujana – ICOncologia, Barcelona, Spain
• Laura Slaughter – Norwegian University of Science and Technology (NTNU), Norway
• Aristotelis Chatzioannou – EIE, Athens, Greece
• Viktor Malyshkyn – Center for Supercomputing, Russian Academy of Sciences, Novosibirsk, Russia
Conferences and Workshops
• Signs, Symptoms and Findings Workshop 2009, September 2009, Milan, Italy
International cooperation
• BIMIB DISCo is the institutional contact point
for all initiatives concerning the EC Virtual
Physiological Human Network of
Excellence (www.vph-noe.eu)
Funding
• Ongoing
– FAR
– EnviGP - Improving Genetic Programming for the Environment and Other Applications, Programa Operacional Factores de Competitividade, Fundação para a Ciência e a Tecnologia (FCT), Portugal (PTDC/EIACCO/103363/2008)
– ProteomeNet - Rete Nazionale per lo studio della proteomica umana, FIRB
• Pending
– EU FP7 ICT Virtual Physiological Human
• CRControl (coordinator)
• BioBridge (partner)
– Regione Lombardia, Programma ASTIL
– Regione Lombardia, Programma Quadro/Università
– PRIN 2009
Publications
• All publications authored by BIMIB affiliates and
collaborators are listed on the group web site and on
the digidisco platform
http://bimib.disco.unimib.it/index.php/Special:Publications/en
THANK YOU