phylogeny orthologous prediction
Phylogenetic analysis
To infer and study the evolutionary history of homologous gene families
Manuel Ruiz (CIRAD, Data Integration team)
Alexis Dereeper (IRD)
Nathalie Chantret (INRA)
Jean-François Dufayard (CIRAD, Data Integration team)
Species phylogeny, and molecular phylogeny
Species evolutionary history
Species phylogeny
Modeling (trees)
Molecular evolutionary history
Molecular phylogeny
Modeling (sequences, trees)
Species phylogeny, and molecular phylogeny
Wheat
Rice
Tomato
Sequence 1 (wheat)
Sequence 2 (rice)
Sequence 3 (tomato)
Inferring the evolutionary history of genes...
Does it allow us to infer the evolutionary history of species?
What signal do the sequences support?
Are speciations the only important events?
Species phylogeny, and molecular phylogeny
Wheat
Rice
Tomato
Duplication
Sequence 1 (wheat)
Sequence 2 (rice)
Sequence 3 (tomato)
Sequence 4 (wheat)
Sequence 5 (rice)
Sequence 6 (tomato)
Species phylogeny, and molecular phylogeny
Wheat
Rice
Tomato
Gene losses
Sequence 1 (wheat)
Loss (rice)
Sequence 3 (tomato)
Loss (wheat)
Sequence 5 (rice)
Loss (tomato)
Species phylogeny, and molecular phylogeny
Wheat
Rice
Tomato
Losses
Sequence 1 (wheat)
Sequence 3 (tomato)
Sequence 5 (rice)
Species phylogeny, and molecular phylogeny
Wheat
Rice
Tomato
Tree reconciliation
Sequence 1 (wheat)
Loss (rice)
Sequence 3 (tomato)
Sequence 1 (wheat)
Sequence 3 (tomato)
Loss (wheat)
Sequence 5 (rice)
Loss (tomato)
Sequence 5 (rice)
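The reconciliation idea above can be sketched in a few lines. This is a minimal illustration, not the method used by the authors: it assumes the species tree ((wheat, rice), tomato) suggested by the figure, a hypothetical "gene|species" leaf naming, and the classical LCA mapping rule (a gene-tree node mapped to the same species-tree node as one of its children is labelled a duplication). Losses are not counted here.

```python
# Species tree assumed from the slide figure: ((wheat, rice), tomato).
SPECIES_TREE = (("wheat", "rice"), "tomato")


def clades(tree):
    """List the leaf set under every node, children before parents."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    found, under = [], frozenset()
    for child in tree:
        sub = clades(child)
        found.extend(sub)
        under |= sub[-1]  # last entry is the child's own clade
    found.append(under)
    return found


SPECIES_CLADES = clades(SPECIES_TREE)


def lca(species):
    """Smallest species-tree clade that contains all given species."""
    return min((c for c in SPECIES_CLADES if species <= c), key=len)


def reconcile(gene_tree, events):
    """Map a gene-tree node onto the species tree and record its event.

    Leaves are written "gene|species" (a hypothetical naming convention).
    """
    if isinstance(gene_tree, str):
        return lca(frozenset([gene_tree.split("|")[1]]))
    child_maps = [reconcile(child, events) for child in gene_tree]
    here = lca(frozenset().union(*child_maps))
    # LCA rule: a node mapped to the same place as one of its children
    # is a duplication; otherwise it is a speciation.
    kind = "duplication" if here in child_maps else "speciation"
    events.append((sorted(here), kind))
    return here


# Observed gene tree once the lost copies are removed (assumed topology).
observed = (("Sequence1|wheat", "Sequence3|tomato"), "Sequence5|rice")
events = []
reconcile(observed, events)
for clade, kind in events:
    print(clade, kind)
```

With this input, the root of the gene tree is labelled a duplication even though only three sequences remain, which is the situation the slide illustrates: the gene tree disagrees with the species tree, and a duplication followed by losses explains the disagreement.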
Definitions
Sequence 1 (wheat)
Sequence 2 (rice)
Sequence 3 (tomato)
Sequence 4 (wheat)
Sequence 5 (rice)
Sequence 6 (tomato)
In silico
Similarity: mathematical measure of similarity between two sequences (quantitative concept).
Coverage: relative or absolute length along which two sequences are comparable.
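As a small illustration of these two in silico measures, the sketch below computes a percent identity and a coverage from a pairwise alignment. The slides do not say which similarity measure is meant, so identity over aligned columns is an assumption, and the function name and example sequences are hypothetical.

```python
def identity_and_coverage(aligned_a, aligned_b):
    """Return (identity, coverage of a, coverage of b) for two aligned strings.

    Identity is the fraction of matching residues among columns where
    neither sequence has a gap. Coverage is the fraction of each
    sequence's residues that fall in such columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    paired = [(a, b) for a, b in zip(aligned_a, aligned_b)
              if a != "-" and b != "-"]
    matches = sum(a == b for a, b in paired)
    len_a = sum(c != "-" for c in aligned_a)
    len_b = sum(c != "-" for c in aligned_b)
    identity = matches / len(paired) if paired else 0.0
    coverage_a = len(paired) / len_a if len_a else 0.0
    coverage_b = len(paired) / len_b if len_b else 0.0
    return identity, coverage_a, coverage_b


# Hypothetical aligned pair ("-" marks a gap).
identity, cov_a, cov_b = identity_and_coverage("ATGCCGTA--", "ATGACG--TT")
print(f"identity={identity:.2f} coverage_a={cov_a:.2f} coverage_b={cov_b:.2f}")
```

Both values are needed together: a high identity over a very short comparable region says little about homology.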
In vivo
Homology: two molecules are homologous if they derive from a common ancestral molecule (qualitative concept).
Orthologous genes: genes that derive from a common ancestor and diverged after a speciation event. They belong to different species.
Paralogous genes: genes that derive from a common ancestor and diverged after a duplication event. They can belong to the same species.
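The definitions above can be turned into a small classifier: two genes are orthologs if their last common ancestor in the gene tree is a speciation, and paralogs if it is a duplication. The sketch below labels nodes with a species-overlap heuristic (a node whose two child clades share a species is treated as a duplication), which is an assumption and not necessarily the approach used in the course. The tree topology and the "gene|species" leaf names are also assumed for illustration.

```python
# Assumed gene tree for the six sequences: one ancient duplication,
# then each copy follows the species tree ((wheat, rice), tomato).
GENE_TREE = (
    (("Sequence1|wheat", "Sequence2|rice"), "Sequence3|tomato"),
    (("Sequence4|wheat", "Sequence5|rice"), "Sequence6|tomato"),
)


def leaves(node):
    """All leaf names under a node of a nested-tuple binary tree."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def species_of(leaf):
    """Species part of a "gene|species" leaf name."""
    return leaf.split("|")[1]


def classify_pairs(node, relations=None):
    """Label every leaf pair by the event at its last common ancestor."""
    if relations is None:
        relations = {}
    if isinstance(node, str):
        return relations
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = ({species_of(x) for x in left_leaves}
              & {species_of(x) for x in right_leaves})
    # Species overlap between the two child clades suggests a duplication.
    label = "paralogs" if shared else "orthologs"
    for a in left_leaves:
        for b in right_leaves:
            relations[frozenset((a, b))] = label
    classify_pairs(left, relations)
    classify_pairs(right, relations)
    return relations


relations = classify_pairs(GENE_TREE)
for pair, label in sorted((sorted(p), l) for p, l in relations.items()):
    print(pair, label)
```

Note that Sequence 1 (wheat) and Sequence 5 (rice) come out as paralogs although they belong to different species: what matters is the event at their common ancestor, not the species they are found in.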
Starting data
4 sequences from the reference transcriptome of Oryza sativa
1 mapped consensus sequence for each individual (10 cultivated African rice and 1 wild African rice).
Os2g25612
RC1
RC2
RC3
...
RS7
Os3g52163
Os4g66669
Workflows
Oryza sativa transcriptome:
Sequence1_SATIVA
Sequence2_SATIVA
...
Consensus Oryza glaberrima / Oryza barthii:
Sequence1_RC1
Sequence2_RS1
...
Clustering:
Retrieve homologous gene families
Spot paralogies that predate the Oryza sativa / Oryza glaberrima divergence.
Alignment and cleaning:
Align similar regions between sequences, and select conserved blocks to support the homology hypothesis.
Phylogenetic reconstruction:
Find a tree likely to explain the alignment under a given model of evolution.
Interspecific / intergeneric
Analysis:
Display, rooting, reconciliation
Intraspecific / intrageneric
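The "alignment and cleaning" step of the workflow above can be illustrated with a very simple column filter. This is only a sketch under stated assumptions: the gap-fraction criterion, the 0.2 threshold, the function names, and the example sequences are all hypothetical, and dedicated trimming tools apply more elaborate criteria.

```python
def conserved_columns(alignment, max_gap_fraction=0.2):
    """Indices of alignment columns with at most the given gap fraction."""
    sequences = list(alignment.values())
    keep = []
    for i in range(len(sequences[0])):
        gaps = sum(seq[i] == "-" for seq in sequences)
        if gaps / len(sequences) <= max_gap_fraction:
            keep.append(i)
    return keep


def clean(alignment, max_gap_fraction=0.2):
    """Return the alignment restricted to its conserved columns."""
    keep = conserved_columns(alignment, max_gap_fraction)
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Hypothetical aligned sequences, named as in the workflow slide.
alignment = {
    "Sequence1_SATIVA": "ATG--CGTAC",
    "Sequence1_RC1":    "ATGA-CGTAC",
    "Sequence2_RS1":    "ATG--CGT-C",
}
cleaned = clean(alignment)
for name, seq in cleaned.items():
    print(name, seq)
```

Columns dominated by gaps carry little evidence of homology, so removing them before tree reconstruction keeps only the positions where the sequences are comparable.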
Workflows
Oryza sativa transcriptome:
Sequence1_SATIVA
Sequence2_SATIVA
...
Similar sequences from distant species:
Sequence1_MAIZE
Sequence2_SORBI
...
Consensus Oryza glaberrima / Oryza barthii:
Sequence1_RC1
Sequence2_RS1
...
BLAST on public databanks:
Search for homologous genes from several distant species
Clarify the dates of duplication events
Clustering:
Retrieve homologous gene families
Spot paralogies that predate the Oryza sativa / Oryza glaberrima divergence.
Alignment and cleaning:
Align similar regions between sequences, and select conserved blocks to support the homology hypothesis.
Phylogenetic reconstruction:
Find a tree likely to explain the alignment under a given model of evolution.
Interspecific / intergeneric
Analysis:
Display, rooting, reconciliation
Intraspecific / intrageneric
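The clustering step of this workflow groups sequences into candidate homologous families. The slides do not name the clustering method, so the sketch below swaps in plain single-linkage clustering (connected components of a similarity graph) as an assumption. The hit pairs are hypothetical and stand for sequence pairs that passed some similarity and coverage threshold.

```python
def cluster_families(hit_pairs):
    """Group sequences linked by similarity hits into families.

    Single-linkage: two sequences end up in the same family as soon as
    a chain of pairwise hits connects them (union-find structure).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in hit_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for seq in list(parent):
        families.setdefault(find(seq), set()).add(seq)
    return sorted(families.values(), key=sorted)


# Hypothetical pairs of similar sequences, named as in the slide.
hits = [
    ("Sequence1_SATIVA", "Sequence1_RC1"),
    ("Sequence1_RC1", "Sequence1_MAIZE"),
    ("Sequence2_SATIVA", "Sequence2_RS1"),
    ("Sequence2_RS1", "Sequence2_SORBI"),
]
families = cluster_families(hits)
for family in families:
    print(sorted(family))
```

Single-linkage is easy to implement but can merge distinct families through a single spurious hit, so the thresholds used to build the pairs matter.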
Homology search in public databanks
BLAST: