Slide - iPlant Pods - iPlant Collaborative

Introductory Phylogenetic Workflows in the Discovery Environment
Sheldon McKay
iPlant Collaborative,
DNALC, Cold Spring Harbor Laboratory
Feb 8, 2012
Why is the tree of life important?
“Knowledge of evolutionary relationships is
fundamental to biology, yielding new insights
across the plant sciences, from comparative
genomics and molecular evolution, to plant
development, to the study of adaptation,
speciation, community assembly, and
ecosystem functioning.”
We like to put things into categories
Classifications often represent evolutionary relationships*
* But not always
[Figure: example with items labeled A to F]
Phylogenetic trees are representations of evolutionary history
What is the difference between taxonomy and phylogeny?
Consider primates: Do humans make up a monophyletic group?
Hylobatidae
Pongidae
Hominidae
(E) human
Phylogeny based on  - globin pseudogene suggests that
humans and chimpanzees make up a single monophyletic group
outgroup
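The monophyly question above can be made concrete with a small sketch. The tree below is a hypothetical toy encoding as nested tuples, not data from the slides: it assumes the human + chimpanzee grouping stated above, places gorilla, orangutan and gibbon as successively more distant relatives, and treats Pongidae in the traditional sense (orangutan, gorilla, chimpanzee). A group is monophyletic if some node in the tree has exactly that set of tips below it.

```python
# Toy tree as nested tuples (illustrative assumption, not the slide's data).
TREE = ("outgroup",
        ("gibbon",
         ("orangutan",
          ("gorilla",
           ("chimpanzee", "human")))))


def tips(node):
    """Return the set of tip names at or below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result


def is_monophyletic(node, group):
    """True if some node's tips are exactly `group`, i.e. the group is a clade."""
    group = set(group)
    if tips(node) == group:
        return True
    if isinstance(node, str):
        return False
    return any(is_monophyletic(child, group) for child in node)


# Human + chimpanzee form a clade in this tree.
print(is_monophyletic(TREE, {"human", "chimpanzee"}))
# Traditional Pongidae excludes humans, so it is not a clade here.
print(is_monophyletic(TREE, {"orangutan", "gorilla", "chimpanzee"}))
```

Under these assumptions a taxonomic family can fail the monophyly test, which is one way taxonomy and phylogeny differ.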
Trait Evolution
Image courtesy of Brian Omeara
How can iPlant help with phylogenetic tree building?
[Figure: factorial growth in the number of possible trees, compared with the number of atoms in the universe]
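To give a feel for this growth, here is a minimal sketch that counts the distinct unrooted, fully bifurcating trees for n labeled taxa using the standard formula (2n - 5)!!. The figure of 10^80 atoms is a commonly quoted rough estimate and is an assumption here, as are the function names.

```python
ATOMS_IN_UNIVERSE = 10 ** 80  # rough, commonly quoted estimate (assumption)


def unrooted_tree_count(n_taxa):
    """Number of unrooted bifurcating trees for n labeled taxa: (2n - 5)!!"""
    if n_taxa < 3:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def smallest_n_exceeding(limit):
    """Smallest number of taxa whose tree count is larger than `limit`."""
    n = 3
    while unrooted_tree_count(n) <= limit:
        n += 1
    return n


for n in (4, 5, 10, 20):
    print(n, unrooted_tree_count(n))

print("taxa needed to exceed the atom estimate:",
      smallest_n_exceeding(ATOMS_IN_UNIVERSE))
```

Because the count grows this fast, exhaustive search over all trees is impractical for all but very small data sets, which is why heuristic tree-building methods matter.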
Big Trees
It can take weeks or months to analyze data sets with >100,000 species
Example of iPlant contribution:
NINJA/WINDJAMMER
-- NINJA: 216K species, ~8 days
-- WINDJAMMER: 216K species, ~4 hours
How can we scale up phylogenetic tree visualization?
Largest Published Tree (73,060 species)
Goloboff et al. 2009
13,533 names
HD TV: 1920 × 1080
Largest computer monitors: 3280 × 2048 (can be tiled)
Laser printer: effectively 3600 × 4725 (can be tiled)
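A rough sketch of why these resolutions fall short for a 73,060-species tree. It assumes a simple rectangular layout with one tip per pixel row, tiled vertically; the layout, the px_per_tip parameter, and the function names are illustrative assumptions, not a description of the iPlant viewer.

```python
import math

TIPS = 73_060  # species in the largest published tree cited above

# (label, width, height) in pixels, as listed above
DISPLAYS = [
    ("HD TV", 1920, 1080),
    ("Large monitor", 3280, 2048),
    ("Laser printer page", 3600, 4725),
]


def tips_per_pixel_row(n_tips, height_px):
    """How many tips would share one pixel row if the tree fit on one screen."""
    return n_tips / height_px


def tiles_needed(n_tips, height_px, px_per_tip=1):
    """Vertical tiles needed to give every tip `px_per_tip` rows of its own."""
    return math.ceil(n_tips * px_per_tip / height_px)


for label, _width, height in DISPLAYS:
    print(f"{label}: {tips_per_pixel_row(TIPS, height):.1f} tips per pixel row, "
          f"{tiles_needed(TIPS, height)} tiles at 1 px per tip")
```

Readable labels need several pixel rows per tip, so raising px_per_tip to a realistic value multiplies the tile count accordingly.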
Prototype iPlant tree viewer
Scalability of Data
Phylogenetic workflows in the Discovery Environment
Overview: The introductory phylogenetic workflows training module is designed
to provide hands-on experience in using the phylogenetic and related
applications of the iPlant Discovery Environment (DE), while also developing
familiarity with the general use of the DE user interface.
Question: How are tRNA_leu genes related in species of Magnoliophyta?
What are the implications of using gene families for phylogenetic inference?
Specific Objectives: By the end of this module, participants should:
1) Be familiar with the DE user interface
2) Understand the starting data for phylogenetic analysis
3) Be able to perform a multiple sequence alignment in the DE
4) Be able to perform a simple phylogenetic analysis in the DE
5) Be able to use the DE as a portal for visualizing phylogenetic trees