Presentation - Broad Institute


Luke Alden Yancy, Jr.
Mentor: Robert Riley
Broad Institute of MIT & Harvard
Cambridge, MA
Source: http://staff.vbi.vt.edu/pathport/pathinfo_images/Mycobacterium_tuberculosis/AerosolTransmission.jpg
Deaths Caused by TB (Estimated by WHO)
1998: 1,751,858
2006: 1,654,805
Source: WHO Stop TB Department, website: www.who.int/tb
Learn more about Mycobacterium tuberculosis (Mtb) using analysis of gene expression data

Biclustering
◦ Bimax (Prelic et al. 2006)
◦ CC (Cheng and Church, 2000)
◦ Plaid Model (Turner et al. 2003)
◦ Spectral (Kluger et al. 2003)
◦ Xmotifs (Murali and Kasif, 2003)
Traditional Clustering
◦ K-Means (MacQueen, 1967)
◦ Hierarchical (Eisen et al. 1998)
                            | Traditional Clustering | Biclustering
Gene Clusters Based on:     | All Experiments        | Subsets of Experiments
Genes Assigned to Clusters: | One-to-One             | Many-to-Many / One-to-Many
Reproducibility:            | Yes                    | No (due to random steps in algorithm)
Source: Machine Learning and Its Applications to Biology, Tarca et al. 2007. (Editor: Fran Lewitter, Whitehead Institute)
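The difference in cluster membership can be illustrated with a small sketch. It assumes a made-up 4-gene, 4-experiment matrix, a hypothetical binarization cutoff (`THRESHOLD`), and hand-picked clusters; Bimax (Prelic et al. 2006) works on a binarized matrix and reports all-ones submatrices, but nothing below is the project's actual code or data.

```python
from collections import defaultdict

THRESHOLD = 1.0  # hypothetical cutoff, chosen only for this toy example

# Made-up expression values: rows are genes, columns are experiments 0..3.
expression = {
    "geneA": [2.1, 2.0, 0.1, 0.0],
    "geneB": [1.9, 2.2, 0.0, 0.1],
    "geneC": [0.1, 2.1, 2.0, 1.9],
    "geneD": [0.0, 0.1, 2.2, 2.0],
}

# Traditional clustering: one label per gene, judged over all experiments.
partition = {"geneA": 0, "geneB": 0, "geneC": 1, "geneD": 1}

# Biclustering: each cluster pairs a gene subset with an experiment subset,
# so a gene may belong to several biclusters.
biclusters = [
    ({"geneA", "geneB"}, {0, 1}),
    ({"geneC", "geneD"}, {2, 3}),
    ({"geneA", "geneB", "geneC"}, {1}),
]


def binarize(matrix, threshold):
    """Replace each value with 1 if it reaches the threshold, else 0."""
    return {g: [1 if v >= threshold else 0 for v in row] for g, row in matrix.items()}


def is_all_ones(binary, genes, experiments):
    """Check that the selected submatrix of the binary matrix contains only 1s."""
    return all(binary[g][e] == 1 for g in genes for e in experiments)


def memberships(clusters):
    """Count how many biclusters each gene appears in."""
    counts = defaultdict(int)
    for genes, _ in clusters:
        for g in genes:
            counts[g] += 1
    return dict(counts)


binary = binarize(expression, THRESHOLD)
```

Here `memberships(biclusters)` reports geneC in two biclusters, while `partition` gives every gene exactly one label.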
[Figure: Bimax and K-Means applied to the Boshoff Data (Processed: 3924 Genes, 359 Experiments) to produce Clusters of Genes]
Source: The Transcriptional Responses of Mycobacterium tuberculosis to Inhibitors of Metabolism. (Boshoff et al. 2004)
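For readers unfamiliar with the K-Means step, a minimal sketch follows. It is not the implementation used in this project: the function names, the toy profiles, the seed, and the iteration count are all assumptions made for illustration. Each gene's profile across all experiments is assigned to the nearest of k centers, and each center is then moved to the mean of its members.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, seed=0, iterations=20):
    """Assign each expression profile to exactly one of k clusters."""
    rng = random.Random(seed)  # fixed seed (an assumption) so repeated runs match
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest center, measured over ALL experiments.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in profiles
        ]
        # Update step: move each center to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up toy data: 4 genes (rows) x 4 experiments (columns).
toy_profiles = [
    [2.1, 2.0, 0.1, 0.0],
    [1.9, 2.2, 0.0, 0.1],
    [0.1, 0.0, 2.0, 1.9],
    [0.0, 0.1, 2.2, 2.0],
]
toy_labels = kmeans(toy_profiles, k=2, seed=0)
```

On this toy input the first two genes share one label and the last two share the other; which cluster is numbered 0 depends on the seeded starting centers.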
[Figure: proS loci of Mtb; gene pairs in a Cluster (m) and in an Operon (n) overlap in k, out of N]
Significance of overlap k estimated using hypergeometric distribution
(Source: http://www.nature.com/nature/journal/v409/n6823/full/4091007a0.html)
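The overlap test can be sketched as an upper-tail hypergeometric probability. This assumes the slide's symbols mean N items in total, m of them in the cluster, n of them in the operon, and k in both; the function name is hypothetical, and the slides do not state which tail or software was actually used.

```python
from math import comb


def overlap_p_value(N, m, n, k):
    """P(overlap >= k) when m items are drawn at random from N, n of which are marked.

    Sums the hypergeometric probabilities
        C(n, i) * C(N - n, m - i) / C(N, m)
    for i = k .. min(m, n).
    """
    total = comb(N, m)
    upper = min(m, n)
    tail = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, upper + 1))
    return tail / total


# Made-up numbers: 10 items, a cluster of 5, an operon of 5, all 5 shared.
# Only one of the C(10, 5) = 252 possible clusters achieves this, so p = 1/252.
example_p = overlap_p_value(N=10, m=5, n=5, k=5)
```

A small p-value means an overlap of at least k would rarely arise if cluster membership were unrelated to operon membership.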
Bimax Biclustering Operon Overlap
Source: Prolinks: a database of protein functional linkages derived from coevolution (Bowers et al. 2005)

Random step – lacks reproducibility

No biological soundness

Artificial arrangement of data
◦ Large data sets produce statistically significant but small clusters

Practicality
◦ Implementation
◦ Large Input Data Sets



K-Means clustering performs better than biclustering on our data set
Next, use motif recognition methods to identify regulatory motifs in clusters
Further development of improved biclustering algorithms


Project Team
Robert Riley (Mentor)
Brian Weiner
The Broad Institute
Eric Lander
Core Members
SRPG Program Members

Summer Research Program in Genomics (SRPG)
Shawna Young
Bruce Birren
Lucia Vielma
Maura Silverstein