
Bioinformatics:
Cool stuff you can do with Computers and Biology
Oded Magger
Tel Aviv University / Autodesk Inc.
GIP course 2010
When two fields come together
• Computer science has applications in many
fields (economics, physics, history, you name it.)
• The field of biology is being transformed:
• Tons of biological data.
• “Human genome project”
• “Large scale experiments”
• Complex computations and algorithms.
• Biologists who know how to send an email.
Genetics 101
DNA → mRNA → Protein
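The DNA to mRNA to protein flow on this slide can be sketched in a few lines of Python. This is a simplified illustration, not part of the original talk: it assumes the input is the coding strand, reads from the first base, and ignores splicing and every other real-world complication. The function names and the example sequence are made up for this sketch.

```python
# Toy illustration of DNA -> mRNA -> protein (all names are hypothetical).
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map each mRNA codon (U instead of T) to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA: same letters, with T replaced by U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = transcribe("ATGGCCTGA")  # made-up example sequence
protein = translate(mrna)
print(mrna, protein)
```

Each amino acid is written as a single letter, so a protein comes out as a short string.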
Challenge examples
• Find a drug for a disease.
– Infer its side effects.
– Understand how it influences the body.
• Find out what causes genetic disease.
• Figure out what a gene actually does.
• Decipher the secrets of evolution!!!11
– How similar are two genes?
– Construct the tree of life.
• Medical image analysis
( = “sir, I’m afraid you have cancer”).
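As a rough idea of what "How similar are two genes?" can mean computationally, here is a minimal sketch that counts how many single-letter edits turn one sequence into another (edit distance). It is a stand-in chosen for brevity: real sequence comparison uses alignment with biologically motivated scoring. The sequences and function names are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Crude similarity score in [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


gene_a = "GATTACA"  # made-up sequences
gene_b = "GACTATA"
print(edit_distance(gene_a, gene_b), round(similarity(gene_a, gene_b), 3))
```

Pairwise scores like this are also one possible starting point for grouping genes or species into a tree.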
Case study #1
PRINCE:
Associating genes and
protein complexes with
hereditary disease
Background
• Objective: to correctly predict the molecular
causes of hereditary diseases.
• Causal genes
• How are causal genes identified today?
• Association studies.
• Prioritizing genes in an interval.
– Done computationally using sequence similarity,
functional similarity, network data and more.
• Important insight: proteins causing diseases with
similar phenotypes tend to lie close together in the PPI network.
PPI (protein-protein interaction) network – “Protein Facebook”
From a problem to an algorithm
• Important insight: proteins causing similar
diseases tend to lie close together in the PPI network.
• If many of your friends, or friends of friends, on
Facebook study computer science, chances are
that you do too.
• Similar diseases – found with Natural Language Processing
(the computer reads the disease encyclopedia).
From a problem to an algorithm
• If related parts of a car break down, the
‘symptoms’ will be similar.
How does PRINCE work?
[Figure: network diagram with proteins p1–p11, diseases d1–d5, a query disease q, and a region labeled “Interval 10P”; numeric labels 0.1, 0.3, 0.7, 0.1, 0.9.]
Strengths of PRINCE
• A network based method.
• Fast, propagation-based method that converges to
the accurate matrix-based solution.
• Global inference:
• Inference not limited to the direct vicinity of
genes for which prior knowledge exists.
• Smooth function over the network.
• Smart normalizations:
• Logistic transformation of disease similarity
metric.
• Edges between two high-degree nodes have
lowered weight.
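To make the propagation idea concrete, here is a minimal sketch of generic score propagation over a small network. It is an illustration under stated assumptions, not the PRINCE implementation: the network, the prior scores, the value of `alpha`, the stopping rule and all function names are invented for the example. The edge normalization (dividing by the square root of the product of the two node degrees) is one common way to give edges between two high-degree nodes a lowered weight.

```python
import math


def normalize(weights):
    """Divide each edge weight by sqrt(deg(u) * deg(v)), so that edges
    between two high-degree nodes count for less."""
    degree = {}
    for (u, v), w in weights.items():
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w
    return {
        (u, v): w / math.sqrt(degree[u] * degree[v])
        for (u, v), w in weights.items()
    }


def propagate(weights, prior, alpha=0.9, tol=1e-9, max_iter=1000):
    """Repeat: score = alpha * (weighted neighbor scores) + (1 - alpha) * prior,
    until the scores stop changing. alpha=0.9 is an assumed value."""
    neighbors = {}
    for (u, v), w in normalize(weights).items():
        neighbors.setdefault(u, []).append((v, w))
        neighbors.setdefault(v, []).append((u, w))
    nodes = set(neighbors) | set(prior)
    score = {n: prior.get(n, 0.0) for n in nodes}
    for _ in range(max_iter):
        new = {
            n: alpha * sum(w * score[m] for m, w in neighbors.get(n, []))
            + (1 - alpha) * prior.get(n, 0.0)
            for n in nodes
        }
        change = max(abs(new[n] - score[n]) for n in nodes)
        score = new
        if change < tol:
            break
    return score


# Made-up toy network: a chain p1 - p2 - p3 - p4 - p5.
edges = {("p1", "p2"): 1.0, ("p2", "p3"): 1.0,
         ("p3", "p4"): 1.0, ("p4", "p5"): 1.0}
prior = {"p1": 1.0}         # p1 is assumed known to cause a similar disease
candidates = ["p3", "p5"]   # proteins in a hypothetical interval

scores = propagate(edges, prior)
ranked = sorted(candidates, key=scores.get, reverse=True)
print(ranked)
```

Because every protein receives a score, even ones far from the prior knowledge, the inference is global rather than limited to direct neighbors; candidates closer to p1 in the network end up ranked higher.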
PRINCE beats the competition!
Case study #2
Metabolic models:
Simulation of life!
(Parts stolen from Tomer Shlomi and Eytan Ruppin)
Metabolism
• The body is a huge factory for
assembling and disassembling
molecules ( = stuff).
• Factory workers: special
proteins called enzymes.
• A single transformation of
one set of materials into
another is called a “reaction”.
A reaction has a rate.
(“One worker can turn 2 cows into 50 steaks in one
hour.”)
Metabolic network
Metabolic simulation!
• Start with a set of molecules and their amounts, a
set of enzymes and their amounts – and press
“play”…
• If you want details – wait until your third year…
• Relies on material from the Linear Algebra and
Algorithms courses.
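The "press play" idea can be illustrated with a toy time-stepping sketch built on the cows-and-steaks example. This is only an illustration under simplifying assumptions (whole-number amounts, one-hour steps, reactions processed in list order) and is not the method behind real metabolic models. The data layout and the function name are invented for the example.

```python
def simulate(amounts, reactions, hours):
    """Advance a toy 'factory' in one-hour steps.

    amounts:   {molecule: quantity available}
    reactions: list of dicts with 'inputs' and 'outputs' ({molecule: count
               per run}), 'enzymes' (number of workers) and 'rate' (runs per
               worker per hour).
    """
    amounts = dict(amounts)  # do not modify the caller's dictionary
    for _ in range(hours):
        for r in reactions:
            wanted = r["enzymes"] * r["rate"]
            # A reaction cannot run more often than its inputs allow.
            possible = min(amounts.get(m, 0) // n
                           for m, n in r["inputs"].items())
            runs = min(wanted, possible)
            for m, n in r["inputs"].items():
                amounts[m] = amounts.get(m, 0) - n * runs
            for m, n in r["outputs"].items():
                amounts[m] = amounts.get(m, 0) + n * runs
    return amounts


# One reaction: 2 cows -> 50 steaks, 1 run per worker per hour, 3 workers.
butchering = {"inputs": {"cow": 2}, "outputs": {"steak": 50},
              "enzymes": 3, "rate": 1}
result = simulate({"cow": 10}, [butchering], hours=2)
print(result)
```

In the first hour the three workers process 6 cows; in the second hour only 4 cows remain, so only two runs are possible.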
What is it good for?
• Medicine:
• Diagnosis.
• Metabolic diseases (any diabetics in the crowd?)
• New cures for cancer.
• Biotechnology:
• Genetically engineered bacteria and yeast
manufacture products of interest with high
efficiency (insulin, biofuels, beer).
• Easier to run experiments on a computer than in
the lab (1 computer = 1000 lazy biochemists).
Questions?
Thank you for
listening!