20060511_microarray_..
Epistasis Analysis
Using Microarrays
Chris Workman
Experiments with Microarrays
Cool technology, but how do we use it? How
is it useful?
Identify “marker genes” in disease tissues
Toxicology, stress response
Classification, diagnostics
Drug candidate screens, basic science
Genetic factors
Measuring interactions (ChIP-on-chip)
Overview
Expression profiling in single-deletions
Epistasis analysis using single- and double-deletions
Epistasis analysis, genetic and environmental
factors
Reconstructing pathways that explain the
genetic relationships between genes
Expression Profiling in 276 Yeast
Single-Gene Deletion Strains
(“The Rosetta Compendium”)
Only 19% of yeast genes are essential in rich media
Giaever et al. Nature (2002)
Clustered Rosetta Compendium Data
Gene Deletion Profiles Identify Gene
Function and Pathways
Principle of Epistasis Analysis
Experimental Design
Compare single-gene deletions to wild type
Compare the double knockout to wild type
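The comparison above can be sketched in a few lines. In classical epistasis analysis, the gene whose single-mutant phenotype shows up in the double mutant is called epistatic; with microarrays, the phenotype is the expression profile. The sketch below is only an illustration: the function names, the use of Pearson correlation as the similarity measure, and the log2-ratio values are all assumptions, not taken from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def epistatic_gene(single_a, single_b, double_ab):
    """Return which single deletion ('a' or 'b') the double deletion resembles more.

    Each argument is a list of log2 expression ratios versus wild type.
    Correlation is an assumed similarity measure (hypothetical choice).
    """
    ra = pearson(single_a, double_ab)
    rb = pearson(single_b, double_ab)
    return ("a" if ra > rb else "b"), ra, rb

# Hypothetical log2 ratios (deletion strain vs wild type) for five genes.
delta_a = [1.2, -0.8, 0.1, 2.0, -1.5]
delta_b = [-0.9, 1.1, 0.0, -1.8, 1.4]
delta_ab = [1.0, -0.7, 0.2, 1.9, -1.3]

winner, ra, rb = epistatic_gene(delta_a, delta_b, delta_ab)
print(winner, round(ra, 3), round(rb, 3))
```

Here the double deletion tracks the profile of deletion a, so a would be read as epistatic to b under these assumptions.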
Experimental Design:
Single vs Double-Gene Deletions
Classical Epistasis Analysis Using
Microarrays to Determine the Molecular
Phenotypes
Time series expression (0-24 hrs), every 2 hrs
Mixing Genetic and Environmental
Factors
Expression in Single-Gene Deletions
(yeast mec1 and dun1 deletion strains)
Chen-Hsiang Yeang, PhD
MIT
UC Santa Cruz
Craig Mak
UCSD
Yeang, Jaakkola, Ideker. J Comp Bio (2004)
Yeang, Mak, et al. Genome Biol (2005)
Measurements
“Systems level” understanding
Treat disease
Networks
Synthetic biology
Test & Refine
In silico cells
Displaying deletion effects
Published work: “Epistasis analysis using expression profiling” (2005)
Relevant Interactions
Subset of Rosetta compendium used
28 deletions were TFs (red circles)
355 differentially expressed genes (white boxes), P < 0.005
755 TF-deletion effects (grey squiggles)
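As a rough illustration of how a list of TF-deletion effects could be derived from per-gene p-values at the P < 0.005 cutoff, consider the sketch below. The data layout, the function name, and the gene labels are hypothetical; only the cutoff comes from the slide.

```python
P_CUTOFF = 0.005  # significance threshold quoted on the slide

def deletion_effects(pvalues, cutoff=P_CUTOFF):
    """Collect (deleted TF, affected gene) pairs below the cutoff.

    pvalues: {deleted_tf: {gene: p_value}} (assumed layout, not from the source).
    """
    return sorted(
        (tf, gene)
        for tf, genes in pvalues.items()
        for gene, p in genes.items()
        if p < cutoff
    )

# Hypothetical p-values for two deletion strains.
example = {
    "TF_A": {"gene1": 0.0001, "gene2": 0.2, "gene3": 0.004},
    "TF_B": {"gene1": 0.03, "gene4": 0.0049},
}

effects = deletion_effects(example)
print(effects)
```

Each pair that survives the cutoff corresponds to one TF-deletion effect in the figure.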
Network Measurements
Yeast under normal growth conditions
Promoter binding
ChIP-chip / location analysis
Lee et al. Science (2002)
Protein-protein interaction
Yeast 2-hybrid
Database of Interacting Proteins (DIP)
Deane et al. Mol Cell Proteomics (2002)
ChIP Measurement of Protein-DNA
Interactions (Chromatin Immunoprecipitation)
Step 1: Network connectivity (ChIP-chip analysis)
~5k genes (white boxes)
~20k interactions (green lines)
Step 2: Network annotation
(gene expression analysis)
Measure variables that are a function of
the network (gene expression).
Monitor these effects after perturbing the
network (TF knockouts).
What parts are wired together
How and why the parts are wired together the way they are
Inferring regulatory paths
Direct or indirect paths
Annotate each interaction: inducer or repressor
Computational methods
Problem Statement:
Find regulatory paths consisting of physical
interactions that “explain” functional relationship
Method:
A probabilistic inference approach
Yeang, Ideker et al. J Comp Bio (2004)
To assign annotations
Formalize problem using a factor graph
Solve using max product algorithm
Kschischang et al. IEEE Trans. Information Theory (2001)
Mathematically similar to Bayesian inference, Markov random
fields, belief propagation
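To make the max-product idea concrete, here is a minimal sketch on the simplest possible factor graph: a chain of three binary variables, where state 0 stands for "inducer" and state 1 for "repressor". The chain structure, the potential values, and the function names are assumptions chosen for illustration; the published method works on a much richer graph. A brute-force search is included as a cross-check.

```python
from itertools import product

def max_product_chain(unary, pairwise):
    """Most probable joint assignment for a chain of discrete variables.

    unary[i][s]       : potential of variable i in state s
    pairwise[i][s][t] : potential linking variable i (state s) to i+1 (state t)
    """
    n, k = len(unary), len(unary[0])
    best = [list(unary[0])]  # best[i][t]: max score of a prefix ending in state t
    back = []                # back-pointers for recovering the assignment
    for i in range(1, n):
        scores, ptrs = [], []
        for t in range(k):
            cands = [best[i - 1][s] * pairwise[i - 1][s][t] for s in range(k)]
            s_star = max(range(k), key=cands.__getitem__)
            scores.append(cands[s_star] * unary[i][t])
            ptrs.append(s_star)
        best.append(scores)
        back.append(ptrs)
    # Backtrack from the best final state.
    state = max(range(k), key=best[-1].__getitem__)
    assignment = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        assignment.append(state)
    assignment.reverse()
    return assignment, max(best[-1])

def brute_force(unary, pairwise):
    """Exhaustive search over all assignments, used only as a check."""
    n, k = len(unary), len(unary[0])
    def score(x):
        p = 1.0
        for i in range(n):
            p *= unary[i][x[i]]
        for i in range(n - 1):
            p *= pairwise[i][x[i]][x[i + 1]]
        return p
    return list(max(product(range(k), repeat=n), key=score))

# Hypothetical potentials for three interactions along one path.
unary = [[0.6, 0.4], [0.5, 0.5], [0.3, 0.7]]
pairwise = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.3, 0.7], [0.6, 0.4]],
]

assignment, score = max_product_chain(unary, pairwise)
print(assignment, score, brute_force(unary, pairwise))
```

On a chain, max-product reduces to a Viterbi-style forward pass with back-pointers, which is why it matches the exhaustive search while avoiding the exponential enumeration.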
Inferred Network Annotations
A network with ambiguous annotation
Test & Refine
Which deletion experiments should
we do first?
A mutual information based score
For each candidate experiment (deletion of gene e)
Variability of predicted expression profiles
Predict profile for each possible set of annotations
More variable = more information from experiment
Reuse network inference algorithm to compute effect
of deletion!
$$I(M; Y_e) = H(M) - H(M \mid Y_e)$$
$$= H(M) + \sum_{m,y} P(M = m, Y_e = y)\,\log_2 P(M = m \mid Y_e = y)$$
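A mutual-information score of this kind can be computed directly when each candidate model (set of annotations) makes a single predicted outcome for the deletion. The sketch below assumes exactly that deterministic prediction, a discrete outcome label per model, and a known prior over models; these simplifications, the function name, and the example values are illustrative assumptions.

```python
import math
from collections import defaultdict

def mutual_information(prior, predicted):
    """I(M; Y_e) in bits for one candidate experiment e.

    prior:     {model: P(M = m)}
    predicted: {model: outcome y that model m predicts for the deletion}
               (assumed deterministic, so P(m, y) = P(m) when y = predicted[m])
    """
    p_y = defaultdict(float)
    for m, p in prior.items():
        p_y[predicted[m]] += p
    h_m = -sum(p * math.log2(p) for p in prior.values() if p > 0)
    # H(M | Y_e) = -sum_{m,y} P(m, y) log2 P(m | y)
    h_m_given_y = -sum(
        p * math.log2(p / p_y[predicted[m]])
        for m, p in prior.items()
        if p > 0
    )
    return h_m - h_m_given_y

# Four equally likely hypothetical models.
prior = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}

# Three hypothetical experiments with different predicted outcomes per model.
splits_in_two = {"m1": "up", "m2": "up", "m3": "down", "m4": "down"}
uninformative = {"m1": "up", "m2": "up", "m3": "up", "m4": "up"}
fully_resolving = {"m1": "a", "m2": "b", "m3": "c", "m4": "d"}

scores = {
    "splits_in_two": mutual_information(prior, splits_in_two),
    "uninformative": mutual_information(prior, uninformative),
    "fully_resolving": mutual_information(prior, fully_resolving),
}
print(scores)
```

An experiment whose predicted profile is the same under every model scores zero, while one that gives a different prediction for each model scores the full entropy of the model prior.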
Ranking candidate experiments
Rank  Gene    Function                                        Score    Downstream genes  Model
1     HHF1    Histone                                         52.1429  74                2
2     SOK2*   regulator for meiosis and PKA pathway           45.0279  64                1
3     CKA1    protein kinase of cell cycle                    45.0075  64                5
4     A2      mating response                                 40.9023  58                4
5     YAP6*   stress response regulator                       35.1652  50                1, 3
6     NRG1    regulator of glucose dependent genes            31.6501  45                3
7     FKH1    regulator of cell cycle                         29.1194  41                2
8     FKH2    regulator of cell cycle                         26.7131  38                7
9     SLT2    protein kinase of cell wall integrity pathway   23.4727  31                8
10    MSN4*   regulator of stress response                    21.8224  31                1
34    HAP4*   regulator of cellular respiration               6.3310   9                 1
We target experiments to one region of
network
Expression for: SOK2, HAP4, MSN4, YAP6
Expression of Msn4 targets
Average signed z-score
$$Z_e = \frac{1}{N} \sum_{i=1}^{N} \operatorname{sgn}(r_{ie})\, z_{ie}$$
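An average signed z-score over a TF's targets can be sketched as below. The reading of the symbols is an assumption: r is taken to be the expression ratio (log scale) of each target gene in the experiment, and z the magnitude of its z-score, so that the sign of r restores the direction of change. The function name and values are hypothetical.

```python
def sgn(x):
    """Sign of x: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def average_signed_z(r, z):
    """Mean of sgn(r_i) * z_i over the N target genes.

    r: log expression ratios of the targets (assumed meaning of r)
    z: z-score magnitudes of the targets (assumed meaning of z)
    """
    if len(r) != len(z) or not z:
        raise ValueError("r and z must be non-empty and of equal length")
    return sum(sgn(ri) * zi for ri, zi in zip(r, z)) / len(z)

# Hypothetical values for three target genes in one deletion experiment.
ratios = [0.5, -1.0, 2.0]
zscores = [2.0, 1.0, 3.0]

z_e = average_signed_z(ratios, zscores)
print(z_e)
```

A value near zero suggests the targets did not shift coherently, which is the pattern described for the Yap6 targets.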
Expression of Hap4 targets
Yap6 targets are unaffected
Refined Network Model
Caveats
Assumes target genes
are correct
Only models linear paths
Combinatorial effects
missed
Measurements are for
rich media growth
Using this method of choosing
the next experiment
Is it better than other methods?
How many experiments?
Run simulations vs:
Random
Hubs
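The two baseline strategies named above can be sketched as simple selection rules. This is an illustrative guess at what "Random" and "Hubs" mean (pick any untested regulator, or pick the untested regulator with the most outgoing interactions); the function names, the edge-list layout, and the tie-breaking rule are assumptions.

```python
import random
from collections import Counter

def pick_hub(edges, done):
    """Choose the untested regulator with the most outgoing edges.

    edges: list of (regulator, target) pairs (assumed layout)
    done:  set of regulators already deleted
    Ties are broken by name so the choice is deterministic.
    """
    degree = Counter(src for src, _ in edges)
    candidates = [g for g in degree if g not in done]
    if not candidates:
        return None
    return max(candidates, key=lambda g: (degree[g], g))

def pick_random(edges, done, rng):
    """Choose an untested regulator uniformly at random."""
    candidates = sorted({src for src, _ in edges} - set(done))
    return rng.choice(candidates) if candidates else None

# Hypothetical network: TF_A regulates three genes, TF_B two, TF_C one.
edges = [
    ("TF_A", "g1"), ("TF_A", "g2"), ("TF_A", "g3"),
    ("TF_B", "g1"), ("TF_B", "g4"),
    ("TF_C", "g5"),
]

first_hub = pick_hub(edges, done=set())
second_hub = pick_hub(edges, done={"TF_A"})
random_pick = pick_random(edges, done={"TF_A"}, rng=random.Random(0))
print(first_hub, second_hub, random_pick)
```

In a simulation, each strategy would be run repeatedly against a known network and compared on how many deletion profiles it needs before the network is recovered.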
Simulation results
x-axis: number of simulated deletion profiles used to learn a “true” network
Current Work
Measurements
“Systems level” understanding
Treat disease
Networks
Test & Refine
Transcriptional response to DNA damage
Acknowledgments
Trey Ideker
Craig Mak
Chen-Hsiang Yeang
Tommi Jaakkola
Scott McCuine
Maya Agarwal
Mike Daly
Ideker lab members
Tom Begley
Leona Samson
Funding grants from NIGMS, NSF, and NIH