Transcript PPT
Comparative Biology with focus on 8 examples
•Comparative Biology
•The Domain of Comparative Biology
•Co-modeling in Comparative Biology
•The purpose of Comparative Biology
•Examples of Stochastic Comparative Modeling
•Gene Frequencies in Populations
•Genome Structure Evolution
•Stemmatology: Manuscript Evolution
•RNA Secondary Structure Evolution
•Protein Structure Evolution
•Movement Evolution
•Shape Evolution
•Pattern Evolution
Comparative Biology
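[Figure: a phylogeny running in the time direction from the Most Recent Common Ancestor (unobserved, marked "?") to the observable sequences at the leaves, e.g. ATTGCGTATATAT….CAG]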
Key Questions:
•Which phylogeny?
•Which ancestral states?
•Which process?
Key Generalisations:
•Homologous objects
•Co-modelling
•Genealogical Structures?
Comparative Biology: Evolutionary Models
Object
Nucleotides/Amino Acids/codons
Continuous Quantities
Sequences
Gene Structure
Genome Structure
Population
Structure
RNA
Protein
Networks
Metabolic Pathways
Protein Interaction
Regulatory Pathways
Signal Transduction
Macromolecular Assemblies
Motors
Shape
Patterns
Tissue/Organs/Skeleton/….
Dynamics
MD movements of proteins
Locomotion
Culture
Manuscripts (stemmatology)
Language
Vocabulary
Grammar
Phonetics
Semantics
Phenotype
Dynamical Systems
Type
CTFS continuous time finite states
CTCS continuous time continuous states
CTUS continuous time countable states
Matching
CTCS MM
Brownian Motion/Diffusion
SCFG-model like
non-evolutionary: extreme variety
CTCS
CTFS
CTCS
CTCS
CTCS
?
?
- (non-evolutionary models)
- (non-evolutionary models)
- (non-evolutionary models)
Reference
Jukes-Cantor 69 +500 others
Felsenstein 68 + 50 others
Thorne, Kishino, Felsenstein, 91 + 40 others
DeGroot, 07
Miklos,
Fisher, Wright, Haldane, Kimura, ….
Holmes, I. 06 + few others
Lesk, A;Taylor, W.
Snijders, T. (sociological networks)
Mithani, 2009a,b
Stumpf, Wiuf, Ideker
Quayle and Bullock, 06, Teichmann
Soyer et al.,06
Dryden and Mardia, 1998, Bookstein, Jones & Moriarty
Turing, 52;
Grenander,
analogous to genetic models
analogous to sequence models
Biggins 05, Munz 10,
Cavalli-Sforza & Feldman, 83
Chris J Howe, http://www.cs.helsinki.fi/u/ttonteri/casc/
“Infinite Allele Model” (CTCS)
Swadesh, 52, Sankoff, 72, Gray & Atkinson, 2003
Dunn 05
Bouchard-Côté 2007
Sankoff,70
Brownian Motion/Diffusion
-
The Purpose of Comparative Biology
To describe evolution:
• Make realistic model (pass goodness-of-fit (GOF) test)
• Estimate Parameters
• Make statements about the path of evolution – ancestral analysis
Analyse homologous pairs or sets
• What is the equilibrium distribution?
• Integrate over histories
Biological Questions:
• Rate of Evolution
• Heterogeneity (in time, in state space)
• Selection
• Co-Evolution of different components within a level
• Dependence among different levels (co-modelling)
Most of these questions have not been addressed beyond the sequence level:
• Primarily due to lack of data
• Secondarily due to lack of models
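At the sequence level, where these questions have been addressed, the Jukes-Cantor model cited in the table above gives closed-form answers for two of them: the equilibrium distribution and a parameter estimate for a homologous pair. The sketch below is illustrative only. The function names are hypothetical, and it assumes the usual parameterisation in which `rate` is the expected number of substitutions per site per unit time.

```python
import math


def jc69_transition_prob(t, rate=1.0, same=True):
    """P(state at time t | state at time 0) under Jukes-Cantor.

    `same=True` gives the probability of observing the starting nucleotide,
    `same=False` the probability of one specific different nucleotide.
    """
    decay = math.exp(-4.0 * rate * t / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def jc69_distance(seq_a, seq_b):
    """Estimated substitutions per site between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences are saturated; distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

As t grows, every transition probability tends to 1/4, which is the equilibrium distribution of this model.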
X_t is a diffusion with infinitesimal mean μ(x) = 0 and infinitesimal variance σ²(x) = x(1 - x)
Famous Models:
• Continuous Time Continuous States Markov Process, specifically Diffusion.
• For instance Ornstein-Uhlenbeck, which has a Gaussian equilibrium distribution
E. Thompson (1975) Human Evolutionary Trees CUP
Population Gene Frequencies
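The gene-frequency diffusion can be explored numerically. The following is a minimal Euler-Maruyama sketch. It assumes that the coefficient x(1-x) on the slide is the infinitesimal variance and that the boundaries 0 and 1 (loss and fixation) are absorbing. The function name, step size and clamping rule are choices made for this illustration, not part of the original material.

```python
import math
import random


def simulate_wright_fisher(x0, t_end, dt=1e-3, rng=None):
    """Simulate one path of dX = sqrt(X(1-X)) dW on [0, 1] with zero drift."""
    rng = rng or random.Random()
    x = x0
    path = [x]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        if 0.0 < x < 1.0:
            # Zero drift; the noise scale is sqrt(variance * dt).
            x += math.sqrt(x * (1.0 - x) * dt) * rng.gauss(0.0, 1.0)
            # Clamp overshoots; 0 and 1 are then absorbing.
            x = min(1.0, max(0.0, x))
        path.append(x)
    return path
```

Because the drift is zero, the average frequency over many replicate paths should stay close to `x0`, while individual paths eventually hit 0 or 1.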
Genome Structure Evolution
• Evolutionary events:
Duplication
Inversion
Transposition
Deletion
[Figure: gene orders over genes 1, 2, 3, ..., k before and after each event]
• Inference Principles
• Shortest Path (Parsimony)
• Sum over paths with probabilities (ML)
• Extensions:
• Directions of Genes Unknown
• A set of chromosomes related by a phylogeny
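The shortest-path (parsimony) principle can be illustrated on a small state space. The sketch below searches breadth-first over gene orders connected by single inversions. It is a simplification: genes are unsigned (directions ignored), only inversions are modelled, and the function names are hypothetical. Exhaustive search is only feasible for a handful of genes.

```python
from collections import deque


def inversions(order):
    """Yield every gene order reachable by reversing one contiguous segment."""
    n = len(order)
    for i in range(n):
        for j in range(i + 2, n + 1):
            yield order[:i] + order[i:j][::-1] + order[j:]


def inversion_distance(start, target):
    """Minimum number of inversions turning `start` into `target`."""
    start, target = tuple(start), tuple(target)
    if sorted(start) != sorted(target):
        raise ValueError("both orders must contain the same genes")
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return distance[current]
        for nxt in inversions(current):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    raise RuntimeError("unreachable: inversions connect all gene orders")
```

A likelihood treatment would instead weight every path between the two orders by its probability and sum, rather than keep only the shortest one.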
Genome Structure Evolution
• Full graph for 5 genes
• Genomic reconstruction for
human, mouse and rat.
Ashmole 59: Buryed at Caane thus seythe the Croniculer
Digby 186: Beryed att Cane & thus says the cronyclere
BL Ad 31042: Beryed at caene so seyth the cronyclere
Lansd. 762: Buried at cane this saith the croneclere
de Worde, R. Wyer:
And is buried at Cane as the Cronycle sayes
And buryed at cane as the Cronycle sayes
Phylogeny of “Canterbury Tales”:
Howe et al., 2001
Phylogenetics of Medieval Manuscripts by Christopher Howe
Stemmatology: Evolution of Manuscripts
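One simple way to compare manuscript readings like those above is a pairwise edit distance, which can then feed a distance-based tree method. The sketch below uses character-level Levenshtein distance on lower-cased text for the four readings with unambiguous labels. This is an illustrative choice, not necessarily the method used in the cited phylogenetic analyses, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions between a and b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete item_a
                current[j - 1] + 1,                    # insert item_b
                previous[j - 1] + (item_a != item_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


READINGS = {
    "Ashmole 59": "Buryed at Caane thus seythe the Croniculer",
    "Digby 186": "Beryed att Cane & thus says the cronyclere",
    "BL Ad 31042": "Beryed at caene so seyth the cronyclere",
    "Lansd. 762": "Buried at cane this saith the croneclere",
}


def distance_matrix(readings):
    """Pairwise distances between readings, ignoring letter case."""
    names = list(readings)
    return {
        (x, y): levenshtein(readings[x].lower(), readings[y].lower())
        for x in names for y in names
    }
```

Since `levenshtein` accepts any sequences, passing `text.split()` instead of the raw string gives a word-level distance.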
Tree Representations of RNA Structure
Basic Edit Operations
A Tree Distance Pairwise Edit Algorithm
How Do RNA Folding Algorithms Work?. S.R. Eddy. Nature Biotechnology, 22:1457-1458, 2004.
Average complexity of the Jiang-Wang-Zhang pairwise tree alignment algorithm and of a RNA secondary structure alignment algorithm. Claire Herrbach, Alain Denise and Serge Dulucq.
RNA Structure Evolution
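A tree representation of an RNA secondary structure can be built directly from a bracket string. The sketch below assumes dot-bracket notation as input (a common convention, not stated on the slides) and produces a nested tree in which each base pair is an internal node and each unpaired base is a leaf. The node labels and function names are hypothetical. Tree edit and tree alignment algorithms operate on trees of this kind.

```python
def dot_bracket_to_tree(structure):
    """Parse a dot-bracket string into a nested (label, children) tree."""
    root = ("root", [])
    stack = [root]
    for position, symbol in enumerate(structure):
        if symbol == "(":
            node = ("pair", [])
            stack[-1][1].append(node)
            stack.append(node)           # descend into the new base pair
        elif symbol == ")":
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' at position {position}")
            stack.pop()                  # close the innermost base pair
        elif symbol == ".":
            stack[-1][1].append(("unpaired", []))
        else:
            raise ValueError(f"unexpected symbol {symbol!r}")
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return root


def count_nodes(tree, label):
    """Count the nodes with a given label in the tree."""
    kind, children = tree
    return (kind == label) + sum(count_nodes(child, label) for child in children)
```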
Protein Structure Evolution
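[Figure: α-globin (structure known) and myoglobin (structure known), separated by 1.4 Gyr, 300 amino acid changes, 800 nucleotide changes and 1 structural change; the ancestral and intermediate structures (?) are unknown]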
1. Given Structure, what are the possible events that could happen?
2. What are their probabilities? Old-fashioned substitution + indel process with bias.
Bias: Folding (Sequence → Structure) & Fitness of Structure
3. Summation over all paths.
Trajectories between two Secondary Structures
• Observation: two structures with sequence and secondary structure information
• Space of Protein Structures is large and complicated – both continuous and discrete
• Approximated by a series of stepping stones and a continuous time Markov chain
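For a continuous time Markov chain on a finite set of stepping-stone structures, the sum over all paths between two structures is given by the matrix exponential P(t) = exp(Qt). The sketch below assumes a hypothetical chain in which structure S_i can only move to its neighbours S_(i-1) and S_(i+1) at a common rate. The real rates and neighbourhood structure are not specified in the slides. The exponential is computed in pure Python by a Taylor series with scaling and squaring, which is adequate for small matrices.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def chain_rate_matrix(n, rate=1.0):
    """Rate matrix for nearest-neighbour moves between n stepping stones."""
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                q[i][j] = rate
        q[i][i] = -sum(q[i])  # rows of a rate matrix sum to zero
    return q


def transition_matrix(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t), via Taylor series on Q t / 2**squarings, then squaring."""
    n = len(q)
    scale = t / 2 ** squarings
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # A^k / k!
        result = [[r + s for r, s in zip(row_r, row_t)]
                  for row_r, row_t in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result
```

Entry `[i][j]` of the result is the probability of being at stepping stone j at time t having started at i, with all intermediate trajectories integrated out.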
[Figure: stepping-stone structures S1, S2, S3, ..., Sk, ..., Sn. Labels: 3D Structure, 2D Structure, 1D Structure, 1 structure, Set of sequences. Example sequences: HQYWYWLLATIVVAWMCM, HSGHPPMCWFFWFLLIVIC, FYYRKKNQEDDNERPMTSG, QYYWWWFCTNSPPHYHRQ, DEEDNKRRKLWWAFFCCV, FIIAILLMVAGSTGVMMLMP]
The Evolution/Comparison of Molecular Movements
Molecular Movements of Homologous Proteins are themselves homologous
The full problem: 2 times 1000 atoms observed at 10^6 time points.
Reductions:
i. Only α-carbons: 100 space points
ii. Only correlated pairwise movements: 1-dimensional summary for each amino acid pair
Dynamic Fingerprint Matrix (DFM)
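The second reduction, one number per residue pair, can be sketched with a standard dynamic cross-correlation matrix computed from α-carbon coordinates. This is a stand-in chosen for illustration: the slides do not define the Dynamic Fingerprint Matrix, so the normalisation and the function name below are assumptions. The input layout `trajectory[frame][residue] = (x, y, z)` is also hypothetical.

```python
import math


def cross_correlation_matrix(trajectory):
    """Correlation of displacements from the mean position for each residue pair.

    Every residue must move in at least one frame, otherwise its variance
    is zero and the correlation is undefined.
    """
    n_frames = len(trajectory)
    n_res = len(trajectory[0])
    # Mean position of each residue over the trajectory.
    mean = [[sum(frame[i][d] for frame in trajectory) / n_frames for d in range(3)]
            for i in range(n_res)]
    # Displacement of each residue from its mean, per frame.
    disp = [[[frame[i][d] - mean[i][d] for d in range(3)] for i in range(n_res)]
            for frame in trajectory]

    def covariance(i, j):
        return sum(sum(a * b for a, b in zip(f[i], f[j])) for f in disp) / n_frames

    var = [covariance(i, i) for i in range(n_res)]
    return [[covariance(i, j) / math.sqrt(var[i] * var[j]) for j in range(n_res)]
            for i in range(n_res)]
```

Two homologous proteins could then be compared through the entries of their matrices at aligned residue pairs.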
Shapes and Shape Evolution
David F. Wiley, Nina Amenta, Dan A. Alcantara, Deboshmita Ghosh, Yong J Kil, Eric Delson, Will
Harcourt-Smith, F. James Rohlf, Katherine St. John, Bernd Hamann, Ryosuke Motani, Steven Frost, Alfred
L. Rosenberger, Lissa Tallman, Todd Disotell, and Rob O'Neill
• Landmarks
• Semilandmarks
Abstract
We propose to develop software tools for the analysis, interpretation and visualization of three-dimensional shape data from living and extinct organisms, using the statistical framework of geometric morphometrics. While this software will be widely useful in biology and paleontology, we plan to focus our work by concentrating on one significant problem: incorporating fossils into evolutionary trees. Evolutionary trees for groups of living species are usually estimated using DNA sequence data. Since this is usually not available for extinct species, we need to use morphology (the shapes of fossil and modern specimens) to decide how the extinct species should be included in a tree whose framework is based on molecular studies. Specifically, we plan to estimate a well-supported evolutionary tree for the mainly African papionin monkeys, an inherently interesting group that includes about as many extinct as living clusters of species. Our analysis will be based on a large existing database of three-dimensional data (mostly skull surfaces) at the American Museum of Natural History.
This end-to-end analysis project should produce research results at all levels. The interactive graphics, visualization and statistical analysis tools we propose are ever more widely needed as the amount of three-dimensional morphology data increases. We expect that the close interaction of geometric morphometrics and computer graphics will lead to new ideas about the representation of shape. We have new approaches to the problem of integrating morphology with molecular data in the study of evolution, which we hope will be successful and applicable in many parts of the tree of life. And with massive amounts of new data, new processing and analytic software, and new approaches to integrating morphology, we hope to be able to answer specific questions about the evolution of African monkeys, which have remained elusive up until now.
Our proposed work will have a variety of broader impacts beyond our own research agendas. A large part of
Gunz (2009) Early modern human diversity suggests subdivided population structure and a complex out-of-Africa scenario
Comparison of cranial ontogenetic trajectories among great apes and humans. Philipp Mitteroecker,
Evolutionary Morphing David F. Wiley
http://graphics.idav.ucdavis.edu/research/projects/EvoMorph
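In geometric morphometrics, landmark configurations are compared after removing position, size and orientation. The sketch below computes the full Procrustes distance between two 2D landmark sets, treating each landmark as a complex number. It assumes the landmarks are listed in corresponding order and it does not allow reflections. The restriction to 2D and the function name are choices made for this illustration, whereas the data described above are three-dimensional.

```python
import math


def procrustes_distance(shape_a, shape_b):
    """Full Procrustes distance between two lists of (x, y) landmarks."""
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes need the same number of landmarks")

    def preshape(shape):
        z = [complex(x, y) for x, y in shape]
        centre = sum(z) / len(z)
        z = [p - centre for p in z]                     # remove position
        size = math.sqrt(sum(abs(p) ** 2 for p in z))   # centroid size
        if size == 0.0:
            raise ValueError("all landmarks coincide")
        return [p / size for p in z]                    # remove scale

    a, b = preshape(shape_a), preshape(shape_b)
    inner = sum(pb * pa.conjugate() for pa, pb in zip(a, b))
    # Minimising |b - beta * a|^2 over complex beta (rotation and scale)
    # leaves a residual of 1 - |inner|^2 for unit-size, centred shapes.
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))
```

A distance of zero means the two configurations differ only by translation, rotation and scaling. Under a Brownian-type model of shape evolution, such distances would be expected to grow with divergence time.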
Evolutionary Morphing
The Evolution/Comparison of Molecular Movements
http://www.stats.ox.ac.uk/__data/assets/file/0015/3327/brooks.pdf
The Phylogenetic Turing Patterns I
The Phylogenetic Turing Patterns II
Reaction-Diffusion Equations:
Stripes: p small
Spots: p large
Analysis Tasks:
1. Choose Class of Mechanisms
2. Observe Empirical Patterns
3. Choose Closest set of Turing Patterns T1, T2, .., Tk
4. Choose parameters p1, p2, .., pk (sets?) behind T1, ..
Evolutionary Modelling Tasks:
1. p(t1) - p(t2) ~ N(0, (t1 - t2)Σ)
2. Non-overlapping intervals have independent increments
I.e. Brownian Motion
Scientific Motivation:
1. Is there evolutionary information on pattern mechanisms?
2. How do patterns evolve?
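The Brownian motion model for the pattern parameter p can be sketched for the simplest case of a scalar parameter and two sister species. The covariance matrix in the increment distribution then reduces to a single variance rate, called `sigma2` here. The two-species setup and all function names are assumptions made for this illustration.

```python
import math
import random


def evolve(p, t, sigma2, rng):
    """Brownian increment over a branch of length t: change ~ N(0, t * sigma2)."""
    return p + rng.gauss(0.0, math.sqrt(t * sigma2))


def simulate_sister_species(p_root, t, sigma2, rng):
    """Evolve the parameter independently along two branches of length t."""
    return evolve(p_root, t, sigma2, rng), evolve(p_root, t, sigma2, rng)


def log_likelihood(p_a, p_b, t, sigma2):
    """Log density of the observed tip difference, which is N(0, 2 * t * sigma2)."""
    variance = 2.0 * t * sigma2
    diff = p_a - p_b
    return -0.5 * (math.log(2.0 * math.pi * variance) + diff * diff / variance)
```

Because increments on the two branches are independent, the difference between the tips does not depend on the unknown ancestral value, so `sigma2` can be estimated from the fitted parameters of the observed patterns alone.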
Summary
•Comparative Biology
•The Domain of Comparative Biology
•Co-modeling in Comparative Biology
•The purpose of Comparative Biology
•Examples of Stochastic Comparative Modeling
•Gene Frequencies in Populations
•Genome Structure Evolution
•Stemmatology: Manuscript Evolution
•RNA Secondary Structure Evolution
•Protein Structure Evolution
•Movement Evolution
•Shape Evolution
•Pattern Evolution