CIPRES.2006.algorthms_sr
Download
Report
Transcript CIPRES.2006.algorthms_sr
Algorithms at Berkeley
(partial overview).
Group leader: Satish Rao
Overview
1. Distance Methods
• Using Singular value decompositions. (Erickson,Sturmfels)
• Neighbor Joining Criteria. (Mihaescu, Pachter.)
• Lower bounding techniques for tree search using matrix. (Chen et al.)
2. Statistical Research
• Breakthrough: Optimal logarithmic sequence length tree reconstruction
(Daskalakis, Mossel, Roch 05). Simplified version (Mihaescu et al. 06).
Preliminary Implementation [Adkins et al.].
• Maximum likelihood. NP-hard. Relationship to Maximum Parsimony
(Roch05,Tul05).
• Standard chains of MCMC method scan be problematic (DMV05).
• Learning with other Evolutionary Models: (Mossel, Roch). Only
nonsingular hmodels can be learned at all.
Overview
3. Alignment Related Issues
• Alignment Event Histories. Insertion/deletion histories. (Pachter, Snir
• Deriving Internal Sequence in a copy/delete model. (Chen et al.)
• Multiple Alignment. MAVID (Bray, Pachter.)
• Alignment Accuracy Metrics: (Schwartz, Meyer, Pachter). Pieces out
unalignable sequence better than previous metrics.
4.
Other Activities
• Gene Order. Can reconstruct phylogenies from gene order data with
logarithmic number of genes. (Adkins et al.)
• Supertree Methods. Reconstructing trees from Triplets, Quartets. (Snir
et al.) Combining above into supertree method.
• Reticulate Evolution: (Karp, Riesenfeld). Produces Galled trees from
high degree input trees. (Simultaneous.)
Distance Methods: Tree versus Forests
Forests (instead of tree):
Mossel: finds almost all edges even with logarithmic length sequence data.
Hill et al.: better run times, implementation.
Sequence Methods: Phase Transitions.
Phase Transition:
If and only if mutation rate less than ≈ 78%, can learn with
logarithmic length sequence data. (Mossel 2002,
Daskalakis, Mossel, Roch06. )
Takeaways:
• Lovely results, techniques. Applicable to other
computational biology and learning problems.
• Infer internal sequences (confirms algorithms for
MCMC, maximum likelihood, parsimony).
Statistical Methods
• Likelihood. NP-completeness implies that one should
analyse or rely on properties of model based
instances.
MCMC methods
• “Average Likelihood”. DMV is among the first to
understand the convergence conditions of these
methods.