Location analysis of
transcription factor
binding sites
Guy Naamati
Andrei Grodzovky
A brief history
• Two weeks ago Masha and Michal told us
about gene expression and gene clusters.
• Last week, Lior and Ofer told us about transcription factor
binding sites (TFBSs), and how to identify them.
 What about today?
Today!


A revolutionary new method that identifies where and
when in the genome a binding factor actually binds!
We will talk about the method that reveals the
genome wide localization, and provide several
important examples from the world of yeast cells.

The star of the show
Motivation no 1

What can location analysis give us that microarrays alone can’t?
• Microarrays identify changes in mRNA
levels, but cannot distinguish direct from
indirect effects.
Motivation no 2

What advantage does localization have over
trying to identify the binding site?
• Right! We don’t have to handle the many cases
in which it “looks” like we identified a
binding site, but in vivo nothing binds there.
The Method



Developed by the group of Richard A. Young
in Cambridge, Massachusetts.
A combination of location and expression
profile.
Allows protein-DNA interactions to be
monitored across the entire yeast genome.
The Method




A modified ChIP (chromatin immunoprecipitation), combined with
microarray analysis.
DNA was taken from a cell, and broken with
sound waves (sonication).
Proteins of interest were tagged with myc.
Fragments cross-linked to those proteins were
enriched by immunoprecipitation (IP).
What now?




Cross links were reversed, and the enriched
DNA was amplified and labeled (Cy5).
Cy5 labeled DNA was hybridized to a microarray, together with non-enriched DNA labeled
with Cy3.
Gene expression was also analyzed.
Three independent experiments, for accuracy.
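The two-color comparison above can be sketched in a few lines. This is only an illustration, not the procedure used in the study: it assumes each spot has a Cy5 (IP-enriched) and a Cy3 (non-enriched) intensity, and it combines the three replicates with a plain median rather than an error model. All names and intensity values are hypothetical.

```python
import math
from statistics import median

def log2_enrichment(cy5, cy3):
    """log2(IP-enriched / non-enriched) for one spot; intensities must be positive."""
    return math.log2(cy5 / cy3)

def combined_enrichment(replicates):
    """Median log2 ratio over replicate (cy5, cy3) measurements of one spot."""
    return median(log2_enrichment(cy5, cy3) for cy5, cy3 in replicates)

# Hypothetical intensities for two spots, three replicate experiments each.
spots = {
    "promoter_A": [(800.0, 100.0), (1600.0, 200.0), (400.0, 100.0)],
    "promoter_B": [(100.0, 100.0), (220.0, 200.0), (90.0, 100.0)],
}

scores = {name: combined_enrichment(reps) for name, reps in spots.items()}
for name, score in scores.items():
    print(name, round(score, 2))
```

A spot with a high positive score is a candidate bound region; a score near zero means the IP did not enrich that fragment.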
Handling noise

A single-array error model was used.
How accurate is it?

This method can identify factors binding to
DNA, but cannot recognize the exact
location of the binding site. Why?
• The sonication breaks the DNA into
fragments 500-1000 bases long. Not very
specific.
Testing if it works



Used to identify sites bound by Gal4 in the
yeast genome.
Found seven genes previously reported to be
regulated by Gal4.
In addition, 3 more genes were found!
An important reminder

The consensus binding site for Gal4 was
found in many places in the genome where
Gal4 did not bind. Why is that?
• Previous studies of Gal4 have suggested
that chromatin structure also has a big role.
Confirmation
The next investigation


Ste12 functions in the response of haploid
yeast to mating pheromones.
More than 200 genes are activated in a Ste12
dependent fashion. Which are directly
regulated?
• By this method, only 29!
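The reasoning here is a set intersection: a gene counts as directly regulated only if its expression depends on the factor and the factor is found bound at it in vivo. A minimal sketch, with hypothetical gene identifiers rather than real Ste12 targets:

```python
def split_targets(expression_dependent, bound):
    """Split regulator-dependent genes into direct and indirect targets."""
    direct = expression_dependent & bound      # dependent AND bound in vivo
    indirect = expression_dependent - bound    # dependent but not bound
    return direct, indirect

# Hypothetical gene identifiers, for illustration only.
dependent_genes = {"geneA", "geneB", "geneC", "geneD"}
bound_genes = {"geneB", "geneD", "geneE"}

direct, indirect = split_targets(dependent_genes, bound_genes)
print(sorted(direct))    # ['geneB', 'geneD']
print(sorted(indirect))  # ['geneA', 'geneC']
```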
What’s next?



This method can identify the global set of
genes that are regulated directly in vivo.
Gives us accurate information about where and
when transcription factors bind.
Opens a new pathway into regulation
analysis…
Transcriptional regulatory networks in yeast
Lee et al.
Just as there are networks
of metabolic pathways…
There are networks of regulator-gene interactions
But the network consists of building blocks:
Those are…
How do we identify them?


Using genome wide location
analysis
Identification of a set of
promoter regions that are bound
by specific regulators allowed us
to predict sequence motifs that
are bound by these regulators
Auto-Regulation
Provides reduced response time to
environmental stimuli
Multi-Component Loop
Offers the potential to produce bistable systems that can switch
between two alternative states.
Feed-Forward Loop
Provides a form of multistep ultrasensitivity, as small changes in
the level of activity of the master regulator at the top of the loop
might be amplified at the ultimate target.
Single Input Motif
Single-input motifs are potentially useful for coordinating a discrete
unit of biological function, such as a set of genes that code for the
subunits of a biosynthetic apparatus or enzymes of a metabolic pathway.
Multi Input Motif
This motif offers the potential for coordinating gene
expression across a wide variety of growth conditions.
Regulator Chain
The chain represents the simplest circuit logic
for ordering transcriptional events in a
temporal sequence.
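Once location data is reduced to "regulator binds gene" pairs, some of these motifs can be found by simple graph searches. The sketch below is an illustration only and is not the search procedure from Lee et al.; the network is a hypothetical dictionary mapping each regulator to the set of genes it binds.

```python
def autoregulated(network):
    """Regulators that bind their own gene."""
    return {reg for reg, targets in network.items() if reg in targets}

def feed_forward_loops(network):
    """Triples (master, intermediate, target): the master binds both others,
    and the intermediate regulator also binds the target."""
    loops = set()
    for master, targets in network.items():
        for mid in targets:
            if mid == master or mid not in network:
                continue
            for target in network[mid]:
                if target in targets and target not in (master, mid):
                    loops.add((master, mid, target))
    return loops

def single_input_targets(network):
    """Map each regulator to the genes bound by that regulator alone."""
    bound_by = {}
    for reg, targets in network.items():
        for gene in targets:
            bound_by.setdefault(gene, set()).add(reg)
    result = {}
    for gene, regs in bound_by.items():
        if len(regs) == 1:
            (reg,) = regs
            result.setdefault(reg, set()).add(gene)
    return result

# Hypothetical toy network: regulator -> genes it binds.
network = {
    "R1": {"R1", "R2", "g1"},
    "R2": {"g1", "g2"},
    "R3": {"g3", "g4"},
}

print(autoregulated(network))
print(feed_forward_loops(network))
print(single_input_targets(network)["R3"])
```

In the toy network, R1 is autoregulated, R1 → R2 → g1 together with R1 → g1 forms a feed-forward loop, and g3 and g4 are bound by R3 alone.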
Example
FHL1 – Ribosomal proteins regulator.
Genome wide location analysis
Single Input Motif
Forms a single input regulatory motif
consisting of essentially all ribosomal
protein genes
Assembling motifs into network
structures

An algorithm based on genome wide
location data and expression data from over
500 experiments was developed in order to
identify groups of genes that are both
coordinately bound and coordinately expressed.
Network assembly algorithm




1. Define a set of genes G bound by a set of regulators S.
2. Find a subset of G with a similar expression pattern.
3. Go over the genes in G and drop genes with a
significantly different expression pattern.
4. Scan the remaining genome for genes with a similar
expression profile and check if they’re bound by
factors from S.
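The four steps can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: it assumes Pearson correlation as the similarity measure, a fixed cutoff min_r, a "seed gene" heuristic for step 2, and a second, more permissive binding table (relaxed) for step 4. All of these, and all gene and regulator names, are assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def mean_profile(genes, expr):
    """Average expression profile of a group of genes."""
    columns = zip(*(expr[g] for g in genes))
    return [sum(c) / len(c) for c in columns]

def assemble_module(S, strict, relaxed, expr, min_r=0.8):
    # Step 1: G = genes bound by every regulator in S (strict binding calls).
    G = sorted(g for g, regs in strict.items() if S <= regs)
    if len(G) < 2:
        return set(G)

    # Step 2: pick the gene most correlated with the rest of G as a seed,
    # and take the genes that resemble it as the co-expressed core.
    def avg_r(g):
        return sum(pearson(expr[g], expr[h]) for h in G if h != g) / (len(G) - 1)
    seed = max(G, key=avg_r)
    core = [g for g in G if pearson(expr[g], expr[seed]) >= min_r]

    # Step 3: drop genes whose profile differs from the core average.
    profile = mean_profile(core, expr)
    kept = {g for g in core if pearson(expr[g], profile) >= min_r}

    # Step 4: scan the rest of the genome for co-expressed genes that are
    # bound by S under the relaxed binding calls.
    added = {
        g for g in expr
        if g not in G
        and S <= relaxed.get(g, set())
        and pearson(expr[g], profile) >= min_r
    }
    return kept | added

# Hypothetical data: expression profiles and binding calls.
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 9],
    "g3": [4, 3, 2, 1],          # bound, but expressed in the opposite pattern
    "g4": [1, 2, 3, 5],          # co-expressed, bound only under relaxed calls
    "g5": [1.5, 2.5, 3.5, 4.5],  # co-expressed, but not bound
    "g6": [1, 3, 5, 7],
}
strict = {g: {"A", "B"} for g in ("g1", "g2", "g3", "g6")}
relaxed = dict(strict, g4={"A", "B"})

module = assemble_module({"A", "B"}, strict, relaxed, expr)
print(sorted(module))
```

In the toy data, g3 is dropped in steps 2 and 3 because its expression runs opposite to the rest, g4 is picked up in step 4, and g5 is ignored because it is not bound.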
What have we got ?

The resulting sets of
genes and regulators are
multi input motifs.

But they are refined for
common expression (MIM-CEs).
Multi Input Motif
MIM-CEs: What are they good for?

Using MIM-CEs, the yeast cell cycle
network was constructed by an
automated method, without prior
knowledge of the regulators that
control transcription.
The process

Check for MIM-CEs significantly enriched
in genes whose expression oscillates during
the cell cycle.

Align MIM-CEs around the cell cycle on
the basis of peak expression of the genes in
each MIM-CE.
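The alignment step can be illustrated with a small sketch. It assumes that each gene's peak expression time is already known as a fraction of one cell cycle (0 to 1) and summarizes each MIM-CE by the circular mean of its genes' peaks, so that a group peaking around the end and start of the cycle is not averaged to the middle. The module names and phase values are hypothetical, and this is not necessarily how the original study computed the ordering.

```python
import math

def circular_mean_phase(phases):
    """Mean of phases given as fractions of one cycle in [0, 1)."""
    x = sum(math.cos(2 * math.pi * p) for p in phases)
    y = sum(math.sin(2 * math.pi * p) for p in phases)
    return (math.atan2(y, x) / (2 * math.pi)) % 1.0

def order_modules(peak_phases):
    """Module names sorted by the circular mean peak phase of their genes."""
    return sorted(peak_phases, key=lambda m: circular_mean_phase(peak_phases[m]))

# Hypothetical peak phases of the genes in three MIM-CEs.
peak_phases = {
    "module_late": [0.70, 0.75, 0.80],
    "module_early": [0.05, 0.10],
    "module_wrap": [0.90, 0.00],   # straddles the end and start of the cycle
}

order = order_modules(peak_phases)
print(order)
```

An ordinary average would place module_wrap at 0.45, in the middle of the cycle; the circular mean places it at 0.95, just before the cycle restarts.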
The outcome
Yeast cell cycle
transcriptional
regulatory network.
Features of the network model:

First, the computational positioning of regulators agrees
with previous studies.

Second, regulators whose function was not known before could be
positioned in the network on the basis of direct binding
data.

Third, and most important, reconstruction of the regulatory
architecture was automatic and required no prior
knowledge of the regulators that control transcription
during the cell cycle.
Serial Regulation of Transcriptional Regulators
in the Yeast Cell Cycle
Simon et al.
Many transcriptional regulatory networks in yeast…
Why the cell cycle network ?
Cyclins regulate the cell cycle
Regulation of the cell cycle clock
is effected through activity of the
cyclin-dependent kinase (CDK)
family of protein kinases.
But who regulates the regulators?

Nine
transcriptional
regulators were
identified
The method

Using genome wide
location analysis to
identify the binding sites
for each of the factors
in vivo.
The results
ChIP
Micro Array
These results confirm the stage specific
regulation of gene expression by those
factors.
The results also confirm that
genes encoding several of the
cell cycle transcriptional
regulators are themselves
bound by other cell cycle
regulators
In this way a full regulatory network
is formed.
And of course the cell cycle regulators themselves,
the cyclins/CDKs, are also regulated by those
factors.
Functional redundancy



Each of the factors
binds a critical cell
cycle gene.
Deletion mutants with
one of the factors
deleted survive…
Why?
What for?
Ensures that the cell cycle completes
efficiently.
On the other hand, devoting the two
members of the pair to distinct functional
groups of genes enables coordinated regulation
of those functions.
The Genome-Wide
Localization of Rsc9
Damelin et al., 2002
A bit of background



Recent studies identified a common set of genes
that are repressed/induced in response to stress
(in yeast).
Generalized the roles of Msn2 and Msn4 in the
stress response.
Do they account for all the observed changes
in transcription response to stress?
Evidently not


Must account for extensive gene repression as
well as activation.
Previous evidence (Gasch et al., 2000): many
genes that depend on Msn2/4 are activated only in
some stress conditions.
• Tempting to consider a role for general
transcription factors in the stress response.
Along came RSC



Regulation of gene expression is closely
connected to changes in chromatin structure.
RSC: a 15-protein complex that uses ATP
energy to reposition nucleosomes.
Rsc9: a stable component of the RSC complex.
Genome wide localization


The exact method we talked about was used
for Rsc9.
Two categories with significant enrichment:
1. Genes coding for the cytoplasmic and
mitochondrial ribosomal proteins.
2. Genes involved in the stress response.
What kind of stress?
Both sets of genes are affected by many
types of stress.
The question is raised whether Rsc9 responds
to specific or general stress. How do we find
out?

• Localization to the rescue!!
Two Stress Treatments


Hydrogen peroxide (elicits a transcriptional
response similar to many other stresses).
Rapamycin (cell response is similar to
starvation).
Similar changes of Rsc9 localization after
both treatments suggest a general stress
response.
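One simple way to put a number on "similar changes" is to compare, gene by gene, the direction in which binding moves after each treatment. The sketch below is only an illustration with made-up log2 enrichment values and generic treatment names; it is not the analysis from Damelin et al.

```python
def localization_change(untreated, treated):
    """Per-gene change in binding enrichment (treated minus untreated)."""
    return {g: treated[g] - untreated[g] for g in untreated if g in treated}

def sign_agreement(change_a, change_b):
    """Fraction of shared genes whose binding moves in the same direction."""
    shared = [g for g in change_a if g in change_b]
    same = sum((change_a[g] > 0) == (change_b[g] > 0) for g in shared)
    return same / len(shared)

# Hypothetical log2 enrichment values for four genes.
untreated = {"g1": 0.2, "g2": 1.5, "g3": 0.1, "g4": 1.0}
treatment_1 = {"g1": 1.4, "g2": 0.3, "g3": 1.1, "g4": 0.2}
treatment_2 = {"g1": 1.0, "g2": 0.5, "g3": 0.9, "g4": 1.3}

agreement = sign_agreement(
    localization_change(untreated, treatment_1),
    localization_change(untreated, treatment_2),
)
print(agreement)  # 0.75
```

High agreement between two different stresses points toward a general response; the same comparison against an unrelated treatment serves as the control.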
A question
• How would we know that the changes
wouldn’t occur from an unrelated
treatment?

Right. Genome wide localization was repeated
after treatment with the mating pheromone
alpha factor. The results were:
Conclusion

We have seen how genome wide
localization helps us recognize regulation
motifs and networks

Also we’ve seen a computational method to
create a whole regulatory network without
prior knowledge of the factors involved.
Conclusion


The changes in Rsc9 localization suggest that
the genome itself is conditioned during
widespread transcriptional regulation.
Raises new and interesting questions about
transcriptional regulation.
Bibliography
Lee et al. Transcriptional Regulatory Networks in
Saccharomyces cerevisiae. Science. 2002 298:799-804
Damelin et al. The Genome-Wide Localization of Rsc9,
a Component of the RSC Chromatin-Remodeling
Complex, Changes in Response to Stress. Mol Cell.
2002 9:563-573
Simon et al. Serial Regulation of Transcriptional Regulators
in the Yeast Cell Cycle. Cell. 2001 106:697-708
Ren et al. Genome-Wide Location and Function of DNA Binding
Proteins. Science. 2000 290:2306-2309
Hope you had
fun!