Transcript Slides
Species Tree Workshop
January 14, 2012
Practice with BEST
Please download MrBayes 3.2 for
Windows, Macintosh, or UNIX from
http://mrbayes.sourceforge.net/
Agenda
The MrBayes with BEST (v 3.2)
implementation (work in progress)
Run the finch example (download finch.nex)
Run a multiple allele data set
(yeast with 4 genes, 22 taxa, 6 species)
…or try your own data
Previous Implementation:
MrBayes with BEST
Step 1: Use MrBayes to propose vectors of joint gene trees (unlinked
and rooted with outgroup).
Step 2: Given those gene trees, propose a compatible species tree.
Step 3: Implement the chain fully within MrBayes using the usual
properties of the MCMC as proposed by the user.
Program found at www.stat.osu.edu/~dkp/BEST
New Implementation:
MrBayes 3.2 integrated with BEST
Assumes molecular clock for gene trees as part of a full model
including Coalescent for gene trees|species tree
Program found at http://mrbayes.sourceforge.net/
Implementation: MrBayes 3.2
As always
Wide variety of nucleotide, amino acid, and codon models
Variety of proposal distribution options
Parallel “hot” and “cold” chains to balance efficiency while
covering large tree spaces.
Checkpointing to allow stops and restarts
New speed improvements
BEST can use MPI for Mac and UNIX
GPU (NVIDIA graphics card) support
Steps for Any Bayesian Run
Read the data
Set the model (data|gene tree)
Set the Prior (including gene|species)
Set the MCMC rules
Run the MCMC
Check convergence
Summarize results
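The seven steps above might map onto a MrBayes batch block roughly as follows. This is a minimal Python sketch that only assembles the text of such a block. The helper name build_mrbayes_block, the substitution model (nst=6 rates=gamma), and the MCMC settings are illustrative assumptions, not values from the workshop. The prset species-tree options are written as understood from the MrBayes 3.2 documentation, so check them with "help prset" before relying on them.

```python
def build_mrbayes_block(data_file, ngen=1_000_000, samplefreq=1000,
                        nruns=2, nchains=4):
    """Return the text of a MrBayes block that follows the seven steps.

    All settings are placeholders (assumptions); adjust them to your data.
    """
    lines = [
        "begin mrbayes;",
        # Step 1: read the data
        f"  execute {data_file};",
        # Step 2: set the model (data | gene tree); GTR + gamma is an assumption
        "  lset nst=6 rates=gamma;",
        # Step 3: set the prior, including gene tree | species tree
        "  unlink topology=(all) brlens=(all);",
        "  prset topologypr=speciestree;",
        "  prset brlenspr=clock:speciestree;",
        # Step 4: set the MCMC rules
        f"  mcmcp ngen={ngen} samplefreq={samplefreq} "
        f"nruns={nruns} nchains={nchains};",
        # Step 5: run the MCMC
        "  mcmc;",
        # Step 6: check convergence (parameter summaries)
        "  sump;",
        # Step 7: summarize results (tree summaries)
        "  sumt;",
        "end;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_mrbayes_block("finch.nex"))
```

A speciespartition must also be defined and activated before the species-tree priors will be accepted; that part is left out here to keep the sketch short.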
Files created
ckp (Checkpoint file for restarting)
tree5.run2.t (trees saved for locus 5 in run 2)
tree5.parts (partitions seen for tree 5)
tree5.trprobs (tree probabilities)
tree5.con.tre (consensus tree)
tree5.tstat (partition statistics)
tree5.vstat (branch and node statistics)
Remember
Use a separate folder for each analysis
Remember the “taxset” and
“speciespartition” statements in MrBayes
with at least one taxon per species
Remember to allow variable population
sizes
With n loci, the species tree shows up as
files labeled n+1
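The reminders above can be sketched in Python as a small helper that writes the speciespartition commands, refuses a species with no taxa, and reports which file index holds the species tree. The function names and the example species-to-taxon mapping are hypothetical. The "name = Species: taxa, ..." layout and the popvarpr option follow the MrBayes 3.2 documentation as best recalled, so confirm them with "help speciespartition" and "help prset".

```python
def speciespartition_statement(species_to_taxa, name="species"):
    """Build the commands that define and activate a species partition.

    species_to_taxa maps a species label to a list of taxon numbers or names.
    Every species needs at least one taxon.
    """
    empty = [sp for sp, taxa in species_to_taxa.items() if len(taxa) == 0]
    if empty:
        raise ValueError("species without any taxon: " + ", ".join(empty))
    groups = ", ".join(
        f"{sp}: {' '.join(str(t) for t in taxa)}"
        for sp, taxa in species_to_taxa.items()
    )
    return "\n".join([
        f"speciespartition {name} = {groups};",
        f"set speciespartition={name};",
        # allow population size to vary across branches of the species tree
        "prset popvarpr=variable;",
    ])


def species_tree_index(n_loci):
    """With n loci, the species tree output is labeled n + 1."""
    return n_loci + 1


if __name__ == "__main__":
    # Hypothetical mapping: taxon numbers are placeholders, not the real data.
    mapping = {"SpeciesA": [1, 2], "SpeciesB": [3], "SpeciesC": [4, 5, 6]}
    print(speciespartition_statement(mapping))
    print("species tree files are labeled tree", species_tree_index(30))
```

For the 30-locus finch data, this puts the species tree in the files labeled 31.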
Remember to unlink
Gene tree topologies and branch lengths for
sure
unlink topology=(all) brlens=(all);
Parameters of the model as appropriate
unlink statefreq=(all) revmat=(all);
Issues
Requiring gene trees to follow a molecular
clock is too restrictive
Some outputs still need to be modified for
species tree use
Species Tree Notation
Topology, branch lengths, & population sizes:
(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25
θ_AB = 0.3, θ_ABC = 0.2, θ_ABCD = 0.25
[Figure: species tree with tips A, B, C, D; branch lengths A = 0.02, B = 0.02, C = 0.03, D = 0.035, internal branches 0.01 and 0.005]
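To make the notation concrete, here is a minimal Python sketch that reads a species tree written this way: Newick with ":" for branch lengths and "#" for the population size of each ancestral population. It assumes the commas that standard Newick places between sibling clades, and the parser and helper names are illustrative, not part of MrBayes or BEST.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None   # branch length above this node (after ':')
    theta: Optional[float] = None    # population size of this node (after '#')
    children: List["Node"] = field(default_factory=list)


def parse_species_tree(text: str) -> Node:
    """Recursive-descent parser for Newick extended with '#theta' labels."""
    s = text.strip().rstrip(";")
    pos = 0

    def read_number() -> float:
        nonlocal pos
        start = pos
        while pos < len(s) and (s[pos].isdigit() or s[pos] in ".eE+-"):
            pos += 1
        return float(s[start:pos])

    def read_node() -> Node:
        nonlocal pos
        node = Node()
        if pos < len(s) and s[pos] == "(":
            pos += 1                                  # consume '('
            node.children.append(read_node())
            while pos < len(s) and s[pos] == ",":
                pos += 1                              # consume ','
                node.children.append(read_node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                                  # consume ')'
        else:
            start = pos
            while pos < len(s) and s[pos] not in ":#,()":
                pos += 1
            node.name = s[start:pos]
        if pos < len(s) and s[pos] == ":":
            pos += 1
            node.length = read_number()
        if pos < len(s) and s[pos] == "#":
            pos += 1
            node.theta = read_number()
        return node

    root = read_node()
    if pos != len(s):
        raise ValueError(f"unexpected text at position {pos}")
    return root


def tip_depths(node: Node, depth: float = 0.0) -> Dict[str, float]:
    """Distance from the root to every tip."""
    if not node.children:
        return {node.name: depth}
    out: Dict[str, float] = {}
    for child in node.children:
        out.update(tip_depths(child, depth + (child.length or 0.0)))
    return out


def clade_thetas(root: Node) -> Dict[str, float]:
    """Map each ancestral population (named by its tips) to its theta."""
    out: Dict[str, float] = {}

    def walk(n: Node) -> List[str]:
        if not n.children:
            return [n.name]
        tips = [t for c in n.children for t in walk(c)]
        if n.theta is not None:
            out["".join(sorted(tips))] = n.theta
        return tips

    walk(root)
    return out


def is_ultrametric(root: Node, tol: float = 1e-9) -> bool:
    """True when all tips are equally far from the root (clock-like tree)."""
    depths = list(tip_depths(root).values())
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)


if __name__ == "__main__":
    tree = parse_species_tree(
        "(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25"
    )
    print(clade_thetas(tree))
    print(is_ultrametric(tree))
```

The ultrametric check reflects the clock assumption on this slide's example: 0.02 + 0.01 + 0.005 for A and B, 0.03 + 0.005 for C, and 0.035 for D all sum to the same root height.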
Three lineages of grassfinches (Poephila)
Long-tailed (acuticauda)
Long-tailed (hecki)
Black-throated (cincta)
30 gene trees from Australian finches
P. acuticauda P. hecki P. cincta
Jennings & Edwards (2005) Evolution 59, 2033-2047.
Estimated species tree distribution using BEST
[Figure: estimated species tree; values shown on the slides: 1.0, 0.94, 1.0, 0.03]