Bioinformatics tools as JAWB (Just another Western Blot)


Social behavior of proteins?
Rui Alves
Organization of the talk
• Social behavior of the protein?!?!?!?
• Using meta text analysis
• Using phylogenetic profiling
• Using pathway homology
• Using protein docking
• Using microarray data
• Using protein interaction data
Proteins do not work alone!
Networks of “interactions” predict global function
• Having the network of proteins/genes in which your protein/gene is inserted provides predictive information:
– Which cellular pathways or processes your protein/gene is likely to be involved in
Publication databases are a source of information
Meta text databases create social models from publication analysis
iHOP is a sophisticated context analysis engine
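How does meta-text analysis create networks?
[Diagram: your genes are submitted to a server/scripts program, which queries a literature database and returns the list of entries mentioning your gene. Each entry is checked against a gene names database (e.g. Ste20), giving a gene list, and against a language rules database (e.g. activate, inhibit, rescue), giving a rule list.]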
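The flow on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `GENE_NAMES` and `RULE_WORDS` are tiny hypothetical stand-ins for the gene names and language rules databases, the two sentences are made-up example text, and real systems such as iHOP use far richer language analysis than simple word matching.

```python
import re
from itertools import combinations

# Hypothetical stand-ins for the gene names and language rules databases.
GENE_NAMES = {"Ste20", "Cdc42", "Ste11"}
RULE_WORDS = {"activate", "activates", "inhibit", "inhibits", "rescue", "rescues"}

def extract_relations(sentences):
    """Return (gene, rule word, gene) triples for sentences that mention
    at least two known genes and at least one rule word."""
    relations = []
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        # Keep each gene once, in order of first appearance.
        genes = list(dict.fromkeys(t for t in tokens if t in GENE_NAMES))
        rules = [t.lower() for t in tokens if t.lower() in RULE_WORDS]
        if len(genes) >= 2 and rules:
            for gene_a, gene_b in combinations(genes, 2):
                relations.append((gene_a, rules[0], gene_b))
    return relations

# Made-up literature entries, for illustration only.
abstracts = [
    "Cdc42 activates Ste20 in the pheromone pathway.",
    "Ste20 is a protein kinase.",
]
edges = extract_relations(abstracts)
```

Each triple in `edges` is one candidate edge of the network. The second sentence adds nothing because it names only one gene and no rule word.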
Organization of the talk
• Social behavior of the protein?!?!?!?
• Meta text analysis
• Evolution-based protein interaction prediction
• Using pathway homology
• Using protein docking
• Using microarray data
• Using protein interaction data
Proteins that have co-evolved share a function
• If protein A has co-evolved with protein B, they are likely to be involved in the same process
• Looking for proteins that co-evolved will help predict the social networks of proteins
• There are many methods to look for co-evolution of proteins
– Phylogenetic profiling, gene neighbourhoods, gene fusion events, phylogenetic trees…
Creating phylogenetic profiles
[Diagram: take the sequence of each protein from the database of proteins in one genome and run a homology search against each genome in the database of proteins in fully sequenced genomes. The results fill the database of profiles for each protein in each organism.]

Target genome   Homologue in Genome 1?   Homologue in Genome 2?   …
Protein A       Y                        N                        …
…               …                        …                        …
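A profile like the row for Protein A above is just a presence/absence vector. The sketch below assumes the homology search has already been reduced to a set of (protein, genome) pairs; the name `hits` and its contents are hypothetical, and in practice they would be derived from the output of a homology search tool.

```python
def build_profiles(proteins, genomes, hits):
    """Build one presence/absence vector per protein.

    hits is a set of (protein, genome) pairs for which a homologue was found.
    A 1 corresponds to Y and a 0 to N in the profile table.
    """
    return {
        protein: [1 if (protein, genome) in hits else 0 for genome in genomes]
        for protein in proteins
    }

genomes = ["Genome 1", "Genome 2"]
hits = {("Protein A", "Genome 1")}  # hypothetical homology search result
profiles = build_profiles(["Protein A"], genomes, hits)
```

Here `profiles["Protein A"]` is `[1, 0]`, matching the Y / N row in the table.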
Using phylogenetic profiles to predict protein interactions
[Diagram: your sequence (A) is submitted to a server/program, which uses the database of proteins in fully sequenced genomes and the database of profiles for each protein in each organism to calculate a coincidence index against every other protein.]

Protein id   Homologue in Genome 1?   Homologue in Genome 2?   …
A            Y                        N                        …
B            N                        Y                        …
C            Y                        N                        …
…            …                        …                        …

Coincidence index with A: A = 1, C = 0.9, …, B = 0.11
[Plot labels: i/number of genomes, j/number of genomes.]

Proteins (A and C) that are present and absent in the same set of genomes are likely to be involved in the same process and therefore interact.
Similarly, if protein A is absent in all genomes in which protein B is present, there is a likelihood that they perform the same function!
How to do it?
• Download genomes
• Use BLAST for homology
• Use Perl for homology processing and coincidence index calculations
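The slides do not give the formula for the coincidence index, so the sketch below assumes a simple definition: the fraction of genomes in which two presence/absence profiles agree. It is written in Python rather than the Perl suggested above, and the ten-genome profiles are hypothetical.

```python
def coincidence_index(profile_a, profile_b):
    """Fraction of genomes in which both profiles agree (assumed definition)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return matches / len(profile_a)

# Hypothetical profiles over ten genomes (1 = homologue present, 0 = absent).
profiles = {
    "A": [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "B": [0, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    "C": [1, 0, 1, 1, 0, 1, 0, 1, 1, 1],
}
index_with_a = {
    name: coincidence_index(profiles["A"], profile)
    for name, profile in profiles.items()
}
```

Under this definition a value close to 1 flags proteins present and absent in the same genomes, while a value close to 0 flags the complementary pattern, which the slide also treats as informative.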
Synteny/Conservation of gene neighborhoods
[Diagram: order of the genes for Proteins A, B, C and D in Genome 1, Genome 2, Genome 3, Genome …]
Which of these proteins “interact”?
Proteins A and B are in a conserved relative position in most genomes, which is an indication that they are likely to interact
How to do it?
• Download genomes
• Use Perl for analysis
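One minimal way to score neighbourhood conservation is to count, for every pair of genes, the fraction of genomes in which the two are adjacent. The sketch below uses Python rather than Perl, and the gene orders are hypothetical and not taken from the slide.

```python
def adjacent_pairs(gene_order):
    """Unordered pairs of genes that sit next to each other in one genome."""
    return {frozenset(pair) for pair in zip(gene_order, gene_order[1:])}

def neighbourhood_conservation(genomes):
    """Fraction of genomes in which each gene pair is adjacent."""
    counts = {}
    for order in genomes.values():
        for pair in adjacent_pairs(order):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / len(genomes) for pair, n in counts.items()}

# Hypothetical gene orders, for illustration only.
genomes = {
    "Genome 1": ["A", "B", "C", "D"],
    "Genome 2": ["C", "A", "B", "D"],
    "Genome 3": ["B", "A", "D", "C"],
}
scores = neighbourhood_conservation(genomes)
```

In this toy data A and B are neighbours in every genome, so their score is 1.0, while every other pair scores lower.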
Gene fusion events
[Diagram: genes for Proteins A, B, C and D in Genome 1, Genome 2, Genome 3, Genome …; in some genomes Protein A and Protein B are drawn as a single fused gene.]
Which of these proteins interact?
Proteins A and B have undergone gene fusion events in at least some genomes, which is an indication that they are likely to interact
How to do it?
• Download genomes
• Use Perl for analysis
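A fusion search can be sketched as follows, assuming each gene has already been annotated with the protein families it encodes, so that a fused gene carries more than one family. That annotation step, the data layout and the example genomes are all assumptions made for illustration; Python is used here rather than Perl.

```python
from itertools import combinations

def fusion_pairs(genomes):
    """Map each pair of protein families to the genomes where one gene encodes both.

    genomes maps a genome name to a list of genes; each gene is a tuple of the
    protein families it encodes (hypothetical annotation).
    """
    fused = {}
    for name, genes in genomes.items():
        for gene in genes:
            for pair in combinations(sorted(set(gene)), 2):
                fused.setdefault(pair, []).append(name)
    return fused

# Hypothetical annotated genomes.
genomes = {
    "Genome 1": [("A", "B"), ("C",), ("D",)],        # A and B fused into one gene
    "Genome 2": [("A",), ("B",), ("C",), ("D",)],    # all genes separate
}
fused = fusion_pairs(genomes)
```

Any pair that appears in `fused` is a candidate interaction; here only A and B qualify, because of the fused gene in Genome 1.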
Building phylogenetic trees of proteins
[Diagram: Proteins A, B, C and D and their homologues in Genome 1, Genome 2, Genome 3, Genome …]
Phylogenetic trees represent the evolutionary history of homologue genes/proteins based on their sequence
Get the sequences of all homologues, align them and build a phylogenetic tree
Distance based phylogenetic trees
Alignment:
A1  ACTDEEGGGGSRGHI…
A2  A-TEEDGGAASRGHI…
A3  ACFDDEGGGGSRGHL…
…

Pairwise differences:
A1 vs A2: 5 substitutions
A1 vs A3: 3 substitutions
A2 vs A3: 8 substitutions

[Tree diagram: A1, A3 and A2, with branch labels 5 and 3.]
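The substitution counts on this slide can be reproduced by comparing the aligned sequences column by column. The sketch below uses only the visible part of each sequence (the slide truncates them with "…") and counts a gap against a residue as one difference, which is the convention that matches the slide's numbers.

```python
from itertools import combinations

# Visible part of the alignment from the slide.
alignment = {
    "A1": "ACTDEEGGGGSRGHI",
    "A2": "A-TEEDGGAASRGHI",
    "A3": "ACFDDEGGGGSRGHL",
}

def count_differences(seq_a, seq_b):
    """Number of alignment columns in which the two sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

distances = {
    (a, b): count_differences(alignment[a], alignment[b])
    for a, b in combinations(alignment, 2)
}
# The closest pair is the one a distance method would join first.
closest = min(distances, key=distances.get)
```

The result is 5 for A1/A2, 3 for A1/A3 and 8 for A2/A3, so A1 and A3 are the closest pair.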
Maximum likelihood phylogenetic trees
Alignment:
ACTDEEGGGGSRGHI…
A-TEEDGGAASRGHI…
ACFDDEGGGGSRGHL…
…

Probability of aa substitution:
     A      -       E       D       …
A    1      0.01    0.2     0.09    …
-    0.01   1       0.0001  0.0001  …
E    0.2    0.0001  1       0.5
D    0.09   0.0001  0.5     1
…
Maximum likelihood phylogenetic trees
Alignment:
A1  ACTDEEGGGGSRGHI…
A2  A-TEEDGGAASRGHI…
A3  ACFDDEGGGGSRGHL…
…

Pair       Substitutions   Probability
A1 vs A2   5               p(1,2)
A1 vs A3   3               p(1,3)
A2 vs A3   8               p(2,3)

p(2,3)<p(1,2)<p(1,3)
[Diagram: alternative tree arrangements of A1, A2 and A3.]
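Similarity of phylogenetic trees indicates “interaction” between proteins
[Diagram: phylogenetic trees of protein A (A1, A2, A3, …), protein B (B1, B2, B3, …), protein C (C1, C2, C3, …) and protein D (D1, D2, D3, …).]
Proteins A and B have similar evolutionary trees and thus are likely to “interact”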
How to do it?
• Download genomes
• Use BLAST,… for analysis
• Use Clustal, PHYLIP, PAUP, … for tree building
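The slides do not say how tree similarity is measured. One common shortcut, used here as an assumption and not as the method of the talk, is to correlate the pairwise distances between homologues of one protein with those of another across the same genomes. The distance values below are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical distances between the homologues found in genomes 1, 2 and 3,
# listed for the genome pairs (1,2), (1,3), (2,3).
dist_a = [5, 3, 8]
dist_b = [6, 4, 9]
dist_c = [2, 9, 4]

similarity_ab = pearson(dist_a, dist_b)
similarity_ac = pearson(dist_a, dist_c)
```

With these numbers A and B correlate almost perfectly while A and C do not, which would single out A and B as the candidate “interacting” pair.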
Pathway homology
[Diagram: your sequence is submitted to a server/program that searches a database of protein sequences in genomes, a database of pathways in genomes and a database of interactions in genomes. Output: homologue(s).]
Pathway homology complements protein homology
What is protein docking?
[Diagram: Protein B is placed at different positions around Protein A. Labels: Best docking; Positive; Negative; Same area of interaction; Negative.]
Caveats of using protein docking to predict interaction
[Diagram: Protein A, Protein B and Protein C, with the labels Glycolysis and DNA synthesis.]
• Proteins may never come into contact in the cell, even though they could interact if they did
• Computationally very heavy
When should we use protein docking to predict network structure?
• When we have a group of proteins that are known to be involved in the same function and we want to predict how the different proteins interact with each other
How to do it?
• Download structures or create structure predictions
• Use GRAMM, HEX, …
Predicting protein interactions using microarray data
[Diagram: two populations of cells, one of them exposed to a stimulus. Purify cDNA from each population and compare the cDNA levels of corresponding genes in the different populations. This sorts genes into three sets: genes overexpressed as a result of the stimulus, genes underexpressed as a result of the stimulus, and genes with expression independent of the stimulus. The genes that respond identify the group of proteins involved in the response to the stimulus.]
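The three gene sets can be obtained by comparing expression levels between the two populations. The sketch below assumes already normalised, positive expression values and a twofold cutoff (a log2 ratio of 1); the cutoff, the gene names and the values are hypothetical choices for illustration.

```python
from math import log2

def classify_genes(control, stimulated, threshold=1.0):
    """Sort genes by the log2 ratio of stimulated to control expression."""
    groups = {"overexpressed": [], "underexpressed": [], "independent": []}
    for gene in control:
        change = log2(stimulated[gene] / control[gene])
        if change >= threshold:
            groups["overexpressed"].append(gene)
        elif change <= -threshold:
            groups["underexpressed"].append(gene)
        else:
            groups["independent"].append(gene)
    return groups

# Hypothetical expression levels in the two cell populations.
control = {"gene1": 100.0, "gene2": 100.0, "gene3": 100.0}
stimulated = {"gene1": 400.0, "gene2": 25.0, "gene3": 110.0}
groups = classify_genes(control, stimulated)
```

Genes landing in the first two groups are the candidates for the set of proteins involved in the response to the stimulus.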
Predicting protein networks using protein interaction data
[Diagram: your sequence (A) is submitted to a server/program that queries a database of protein interactions; the network grows from A through proteins B, C, D, E and F.]
Continue until you are satisfied or have completed the network
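Repeating the query for every newly found partner is a breadth-first search over the interaction database. In the sketch below `INTERACTIONS` is a small hypothetical stand-in for such a database (its wiring is not taken from the slide), and `max_rounds` plays the role of "until you are satisfied".

```python
from collections import deque

# Hypothetical interaction database: protein -> known interaction partners.
INTERACTIONS = {
    "A": ["C", "D"],
    "C": ["A", "E"],
    "D": ["A", "B"],
    "E": ["C"],
    "B": ["D", "F"],
    "F": ["B"],
}

def expand_network(start, interactions, max_rounds):
    """Collect proteins and edges reachable from start within max_rounds queries."""
    seen = {start}
    edges = set()
    queue = deque([(start, 0)])
    while queue:
        protein, depth = queue.popleft()
        if depth == max_rounds:
            continue  # stop expanding once the round limit is reached
        for partner in interactions.get(protein, []):
            edges.add(frozenset((protein, partner)))
            if partner not in seen:
                seen.add(partner)
                queue.append((partner, depth + 1))
    return seen, edges

first_round, _ = expand_network("A", INTERACTIONS, max_rounds=1)
full_network, full_edges = expand_network("A", INTERACTIONS, max_rounds=3)
```

One round returns only the direct partners of A; three rounds are enough to reach every protein in this toy database.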
Summary
• Social behavior of the protein?!?!?!?
• Using meta text analysis
• Using phylogenetic profiling
• Using pathway homology
• Using protein docking
• Using microarray data
• Using protein interaction data