Transcript Chapter 7B

Chap. 7 Transcriptional Control of Gene
Expression (Part B)
Topics
• RNA Polymerase II Promoters and General Transcription
Factors
• Regulatory Sequences in Protein-coding Genes and the Proteins
Through Which They Function
Goals
• Learn about transcription
control elements and methods
of promoter analysis.
• Learn how the RNA
polymerase II pre-initiation
complex assembles at a
promoter.
• Learn about the structures of
eukaryotic transcription
factors.
Transcriptionally active polytene
chromosomes
Overview of Eukaryotic Promoters
The promoter of a eukaryotic gene can be defined as a sequence
that sets the transcription start site for RNA polymerase. Strong
RNA Pol II promoters contain an A/T-rich sequence known as the
TATA box located 26-31 bp upstream of the start site (Fig.
7.14). Other genes have alternative sequence elements known as
initiators (Inr) which also serve as promoters that set the RNA Pol
II start site. Finally, CG-rich repeat sequences (CpG islands) are
used by RNA Pol II as promoters in 60-70% of genes. Most of
these genes are weakly expressed.
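The positional definition of the TATA box can be illustrated with a short Python sketch that checks whether a TATA-like sequence begins 26-31 bp upstream of a known start site. The simplified consensus TATA[AT]A[AT], the function name find_tata_box, and the example promoter are assumptions made for illustration; real TATA boxes vary and the lecture does not specify a consensus.

```python
import re

# Assumed simplified consensus; lookahead lets overlapping matches be found.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")

def find_tata_box(seq, tss_index, window=(-31, -26)):
    """Return (position relative to start site, matched sequence) for the
    first TATA-like match whose first base lies in the window, else None.

    seq       -- coding-strand DNA sequence, 5' to 3'
    tss_index -- 0-based index of the transcription start site in seq
    """
    seq = seq.upper()
    for match in TATA_PATTERN.finditer(seq):
        relative = match.start(1) - tss_index  # negative means upstream
        if window[0] <= relative <= window[1]:
            return relative, match.group(1)
    return None

# Hypothetical promoter: 20 bp filler, TATA box, 23 bp spacer, start site.
promoter = "GC" * 10 + "TATAAAA" + "C" * 23 + "ATGGCT"
tss = 20 + 7 + 23
print(find_tata_box(promoter, tss))
```

A promoter that relies on an initiator or a CpG island instead would simply return None in this sketch.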
Transcription Initiation by RNA Pol II
RNA Pol II requires general TFs in addition to tissue-specific
transcription factors for transcription of most genes in vivo.
General TFs position RNA Pol II at start sites and assist the
enzyme in melting promoter DNA. General TFs are highly
conserved across species. The general TFs used at TATA box
promoters are TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. TFIIA is required for
transcription only in vivo. TFIID consists of TBP (TATA box
binding protein) and 13 TBP-associated factors (TAFs). While
the complete TFIID complex is required for transcription in
vivo, only TBP is required in vitro. Formation of the pre-initiation
complex in vitro is illustrated in the next two slides
(Fig. 7.17).
Pol II Pre-initiation Complex Formation (I)
The sequential steps leading to
the assembly of the RNA Pol
II pre-initiation complex in
vitro are shown in Fig. 7.17.
First, TBP binds to the TATA
box and bends DNA near the
promoter. Next, TFIIB binds,
and then a complex between
Pol II and TFIIF loads onto
the promoter. TFIIF positions
the Pol II active site at the
mRNA start site. TFIIF also
possesses histone acetylase
activity and helps maintain
chromatin at the promoter in
an uncondensed state. TFIIE
then binds creating a TFIIH
docking site (next slide).
Pol II Pre-initiation Complex Formation (II)
With the addition of TFIIH,
the assembly of the pre-initiation
complex is complete.
Subsequently, one subunit of
TFIIH melts DNA at the
promoter, obtaining energy by
ATP hydrolysis. Pol II then
begins transcribing the
mRNA. Another subunit of
TFIIH phosphorylates the Pol
II CTD, making Pol II highly
processive. In vitro, all
factors except TBP dissociate
from the promoter region
after Pol II moves
downstream. Tissue-specific
TFs bound to enhancers and
promoter-proximal elements
also play important roles in
transcription initiation in vivo.
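The in vitro assembly order described on these two slides can be captured as a toy Python model. It treats each step as strictly dependent on the previous one, which is a simplification of the real biochemistry. The names ASSEMBLY_ORDER and assemble are hypothetical.

```python
# Order of in vitro assembly as described in the text (Fig. 7.17).
ASSEMBLY_ORDER = ["TBP", "TFIIB", "Pol II-TFIIF", "TFIIE", "TFIIH"]

def assemble(available):
    """Add factors in order and stop at the first one that is missing.

    available -- set of factor names present in the reaction
    Returns (list of bound factors, True if the complex is complete).
    """
    bound = []
    for factor in ASSEMBLY_ORDER:
        if factor not in available:
            break  # assumption of this toy model: later steps cannot proceed
        bound.append(factor)
    return bound, len(bound) == len(ASSEMBLY_ORDER)

# Complete reaction versus one lacking the Pol II-TFIIF complex.
print(assemble(set(ASSEMBLY_ORDER)))
print(assemble({"TBP", "TFIIB", "TFIIE", "TFIIH"}))
```

In the second call, assembly stalls after TFIIB even though TFIIE and TFIIH are present, mirroring the sequential dependence shown in the figure.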
Linker Scanning Mutagenesis Analysis of
Gene Regulatory Sites
A technique called linker
scanning mutagenesis
commonly is used to
identify transcription
control regions known as
promoter-proximal
elements that lie within
100-200 bp of a start
site (Fig. 7.21). These
elements are required for
transcription but are not
directly involved in start
site selection. Large
changes in the locations of
these elements can
interfere with
transcription. Promoter-proximal
elements are
commonly important for
cell type-specific
transcription of genes.
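In linker scanning, consecutive short segments of the region under study are each replaced by a linker of the same length, so that spacing is preserved, and each mutant is then tested for transcriptional activity. The Python sketch below shows the idea. The linker sequence, the 50% activity cutoff, the function names, and the activity values are all illustrative assumptions, not data from Fig. 7.21.

```python
def linker_scan_mutants(region, linker="GGATCCGGAT"):
    """Return (start, mutant) pairs, one per non-overlapping window,
    with that window replaced by the linker (length is preserved)."""
    size = len(linker)
    mutants = []
    for start in range(0, len(region) - size + 1, size):
        mutant = region[:start] + linker + region[start + size:]
        mutants.append((start, mutant))
    return mutants

def required_windows(activity_by_start, wild_type_activity, cutoff=0.5):
    """Return window starts whose mutant activity falls below
    cutoff * wild-type activity (hypothetical threshold)."""
    return sorted(
        start for start, activity in activity_by_start.items()
        if activity < cutoff * wild_type_activity
    )

# Hypothetical 40-bp region and made-up reporter activities per mutant.
region = "ACGT" * 10
for start, mutant in linker_scan_mutants(region):
    print(start, mutant)
print(required_windows({0: 95, 10: 12, 20: 88, 30: 30}, wild_type_activity=100))
```

Windows whose replacement sharply lowers reporter activity are candidates for promoter-proximal elements.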
Summary of Gene Control Elements
A spectrum of control elements regulate transcription by RNA Pol
II in eukaryotes. Their locations relative to the exons of a gene
are summarized in Fig. 7.22. Enhancers are transcription control
elements of 50-200 bp in length that can act from sites distant
from the regulated gene. They often are important for cell
type-specific regulation of transcription. Enhancers can be positioned
upstream, downstream, or even within introns while still being
functional. They may even be located as far as 50 kb away from a
transcription start site. Enhancers are composed of ~6-10 bp
DNA modules that are bound by transcription factors. Promoter-proximal
elements typically need to be within ~200 bp of the
transcription start site to be functional. Yeast genes usually
contain only upstream activating sequences (UAS) and a TATA box
for control of transcription. UASs act similarly to enhancers and
promoter-proximal elements in higher eukaryotes.
DNase I Footprinting
The human genome encodes ~2,000 transcription factors (TFs). A
method (DNase I footprinting) for determining locations of TF
binding sites in DNA is shown in Fig. 7.23. First, DNA labeled on
one strand is incubated with the protein of interest. Then the
complex is treated with a small amount of DNase I, which cleaves
DNA where it is not masked by the TF (Fig. 7.23a). A control
DNA sample lacking the TF is treated under parallel conditions.
Finally, the banding patterns from the two samples are compared
by gel electrophoresis to locate the "footprint" region where the
TF has shielded the DNA from cleavage (Fig. 7.23b).
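The comparison step of footprinting can be sketched in Python as a search for runs of positions where bands present in the control lane are missing or weak in the TF lane. The intensity values, the 20% protection threshold, the minimum run length, and the function name find_footprints are assumptions for illustration only.

```python
def find_footprints(control, bound, min_ratio=0.2, min_len=3):
    """Return inclusive (start, end) index ranges of protected positions.

    control, bound -- band intensity per nucleotide position for the lanes
                      without and with the TF (equal-length lists)
    A position counts as protected when bound < min_ratio * control.
    """
    if len(control) != len(bound):
        raise ValueError("lanes must cover the same positions")
    regions, start = [], None
    for i, (c, b) in enumerate(zip(control, bound)):
        protected = c > 0 and b < min_ratio * c
        if protected and start is None:
            start = i                      # a protected run begins
        elif not protected and start is not None:
            if i - start >= min_len:       # keep only runs that are long enough
                regions.append((start, i - 1))
            start = None
    if start is not None and len(control) - start >= min_len:
        regions.append((start, len(control) - 1))
    return regions

# Made-up band intensities: positions 3-7 are shielded by the TF.
control_lane = [10] * 12
tf_lane = [10, 9, 10, 1, 0, 1, 1, 0, 9, 10, 10, 9]
print(find_footprints(control_lane, tf_lane))
```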
Analysis of TF Activity in vivo
TFs can be assayed for their
ability to bind to DNA control
elements and regulate gene
expression by in vivo
transfection assays (Fig.
7.25). In this method, a
plasmid encoding the putative
TF (protein X) is introduced
into an animal cell along with
a second vector encoding a
reporter gene and the
putative protein X binding
site. If protein X binds to
the site and is a transcription
activator, then the reporter
gene is switched on. Note
that the host cells must not
express protein X themselves.
Modular Structure of Activators I
Transcription
activators are modular
proteins composed of
distinct functional
domains. They typically
contain both DNA-binding
and activation
domains. A deletion
analysis performed
with the yeast GAL4
activator illustrating
that it contains these
two types of domains is
shown in Fig. 7.26.
The N-terminal amino
acids of GAL4
mediate DNA binding,
whereas its C-terminal
region contains an
activation domain.
Modular Structure of Activators II
Functional domains in activators are joined by flexible protein linker
sequences (Fig. 7.27). Due to the presence of linkers, the spacing
and location of DNA control elements often can be shifted without
interfering with DNA binding and regulation of promoters. The
evolution of gene control regions through shuffling of DNA binding
sequences between genes may have been favored due to the lack of
strong requirements for control element spacing and location. The
evolution of new activator protein genes through domain swapping
has probably also been facilitated by linker sequences.
Note that transcription of some genes is controlled by repressors.
Repressors typically contain a DNA-binding domain and a repression
domain. The repression domain interacts with other TFs at a control
site, inhibiting their activity. The inactivation of a repressor can
lead to constitutive expression of the gene it controls.
Secondary Structure Motifs
Secondary structure motifs are evolutionarily conserved
collections of secondary structure elements which have a defined
conformation. They also have a consensus sequence because the
amino acid sequence ultimately determines structure. A given motif can
occur in a number of proteins where it carries out the same or
similar functions. Some well-known examples such as the
coiled-coil, EF hand/helix-loop-helix, and zinc-finger motifs are
illustrated in Fig. 3.9. These motifs typically mediate
protein-protein association, calcium/DNA binding, and DNA or RNA
binding, respectively.
Helix-turn-helix TFs
DNA-binding proteins bind
specifically to DNA via noncovalent
interactions. α-Helices
are one of the most
common types of DNA-binding
elements (Fig.
7.28). The side-chains of
residues within the α-helix
often bind to the surfaces
of bases exposed in the
major groove of double-helical
DNA. Binding to
phosphates and bases in the
minor groove typically is less
important. One of the most
common DNA-binding
structure motifs is the
helix-turn-helix. The second
helix in this motif (the DNA
recognition helix) typically
binds to a specific sequence of bases in DNA. The recognition
helices in the dimeric bacteriophage 434 repressor are
indicated with asterisks in Fig. 7.28a. Helix-turn-helix TFs
are common in bacteria.
Zinc-finger TFs
The most common DNA-binding
motif in the TFs of humans and other
multicellular animals is the
zinc finger. Two types of zinc
finger TFs are discussed
here--C2H2 zinc finger TFs
(Fig. 7.29a) and C4 zinc
finger TFs (Fig. 7.29b). Most
TFs that contain C2H2 zinc
fingers are monomeric. In each
finger, 2 cysteine and 2 histidine
residues bind a zinc ion
(Zn2+) (Fig. 7.29a), and the
α-helix containing the 2
histidines binds to bases in
the major groove. Much less
common are TFs containing C4 zinc fingers. Most TFs containing
this motif are dimeric. Nuclear receptors, which bind steroid
hormones and other compounds, contain this motif. The
glucocorticoid receptor is shown in Fig. 7.29b. Zinc ions are
bound to the DNA recognition helix of this motif, which contacts
bases in the major groove.
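Because the C2H2 finger is defined by two cysteines and two histidines at characteristic spacing, candidate fingers can be located in a protein sequence with a regular expression. The Python sketch below uses a simplified spacing rule (C, 2-4 residues, C, 12 residues, H, 3-5 residues, H) that is an assumption for illustration; curated motif databases use stricter patterns, and a match here does not prove a functional finger. The test protein is invented.

```python
import re

# Assumed simplified C2H2 spacing; not a curated database pattern.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_c2h2_fingers(protein):
    """Return (start index, matched segment) for each non-overlapping
    candidate C2H2 zinc finger in a one-letter amino acid sequence."""
    return [(m.start(), m.group(0)) for m in C2H2_PATTERN.finditer(protein.upper())]

# Invented sequence with two candidate fingers separated by a short linker.
finger = "CAAC" + "A" * 12 + "HAAAH"
protein = "MK" + finger + "GG" + finger
for start, segment in find_c2h2_fingers(protein):
    print(start, segment)
```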
Leucine-zipper TFs
Leucine-zipper TFs contain extended α-helices wherein every 7th
amino acid is leucine. This periodicity creates a nonpolar face on
one side of the helix that is ideal for dimerization with another
such protein via a coiled-coil motif (Fig. 7.29c). So-called basic
zipper (bZip) TFs have a similar structure except that some
leucines are replaced by other nonpolar amino acids. The
N-terminal ends of both leucine-zipper and bZip proteins contain
basic amino acids that interact with bases in the major groove
(Fig. 7.29c). Leucine zipper proteins are now considered to be a
subclass of bZip proteins.
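The "leucine at every 7th position" rule can be checked directly on a sequence. The Python sketch below finds the longest run of leucines spaced exactly 7 residues apart. The minimum of 4 repeats, the function name leucine_heptad_run, and the example helix are illustrative assumptions.

```python
def leucine_heptad_run(seq, min_repeats=4):
    """Return (start index, number of leucines) for the longest run of
    leucines spaced 7 residues apart, or None if no run reaches min_repeats."""
    seq = seq.upper()
    best = None
    for start in range(len(seq)):
        if seq[start] != "L":
            continue
        count, pos = 0, start
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7                       # step one heptad along the helix
        if count >= min_repeats and (best is None or count > best[1]):
            best = (start, count)
    return best

# Invented sequence: leucines at indices 2, 9, 16, 23, and 30.
helix = "MK" + "LEDKVEE" * 4 + "L"
print(leucine_heptad_run(helix))
```

A bZip protein in which some heptad positions hold other nonpolar residues would break the run in this strict version; relaxing the test at each step to a set of nonpolar residues would be one way to accommodate that.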
Another class of TFs, the
basic helix-loop-helix (bHLH)
proteins, is similar to bZip
proteins but contains a loop
between the DNA recognition
helix and the coiled-coil
region (Fig. 7.29d). bZip and
bHLH proteins commonly
form heterodimeric TFs.
Regulation of TF Activity
Many TFs bind ligands or co-activator/co-repressor proteins that
modulate their structure and activity. In the yeast GAL4 TF, the
"acidic activation domain" adopts an essentially random-coil
structure until it binds to a co-activator protein. This control
mechanism keeps the TF turned off until the appropriate cofactor
is present in the nucleus. Nuclear receptors such as the estrogen
receptor contain partially structured activation domains that
undergo conformational changes to the active structural state on
binding to hormone (e.g., estrogen) (Fig. 7.30b). In the active
conformation, the estrogen receptor can bind to co-activator
proteins required for transcription. The estrogen antagonist
tamoxifen, which is used in breast cancer therapy, locks the
receptor in an inactive conformation that cannot bind co-activator
proteins (Fig. 7.30c).
Heterodimeric TFs
The formation of heterodimeric
TFs by bZip and bHLH TFs is
important in increasing the
complexity of transcriptional
regulation of genes. Sometimes
monomers within the
heterodimer recognize the same
DNA element, but have
different activation domains
(Fig. 7.31a). Different
regulatory responses result from
the different combinations of
activation domains bound to the
site. In other cases, monomers
within the heterodimer bind
different DNA elements (Fig.
7.31b). Each site then binds a
unique species of heterodimer.
Lastly, an inhibitory factor that
binds to only one type of
monomer will only affect sites
used by that monomer (Fig.
7.31c).
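The gain in regulatory complexity from dimerization can be made concrete by counting the dimers a small set of monomers can form. The Python sketch below assumes, for illustration only, that every monomer can pair with every other monomer and with itself; real bZip and bHLH proteins have pairing preferences. The monomer names and function names are hypothetical.

```python
from itertools import combinations_with_replacement

def possible_dimers(monomers):
    """Return all unordered pairs, including homodimers.
    Assumes any two monomers can dimerize (an idealization)."""
    return list(combinations_with_replacement(sorted(monomers), 2))

def active_dimers(monomers, inhibited):
    """Return dimers that contain no monomer bound by an inhibitory factor."""
    return [pair for pair in possible_dimers(monomers)
            if not (set(pair) & set(inhibited))]

# Three hypothetical monomers give six dimers: three homo, three hetero.
print(possible_dimers(["A", "B", "C"]))
# An inhibitor specific to monomer C removes only the C-containing dimers.
print(active_dimers(["A", "B", "C"], inhibited={"C"}))
```

With n monomers the count is n(n+1)/2, so a handful of genes can supply many distinct regulators.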
Cooperative Binding of TFs to DNA
In many cases, a TF will bind to a DNA element with high
affinity only when complexed with a second TF (Fig. 7.32a).
Such cooperative binding to DNA adds further complexity to
gene regulation. Namely, a certain TF will bind to its DNA
element only if its interacting partner also is expressed in that
cell type. In addition, expression levels of interacting TFs can
be varied between tissues to adjust gene transcription rates.
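The dependence of one TF on its partner can be illustrated with a simple equilibrium model of two adjacent sites, written here in Python. This model is an assumption added for illustration and does not come from the lecture or Fig. 7.32: each TF binds its own site with a dissociation constant, and a cooperativity factor omega greater than 1 favors the state in which both are bound. All parameter values are made up.

```python
def occupancy_a(conc_a, conc_b, kd_a, kd_b, omega):
    """Fraction of DNA molecules with TF A bound, in a two-site
    equilibrium model (assumed model, not from the lecture).

    omega -- cooperativity factor; 1 means independent binding
    """
    x = conc_a / kd_a                      # weight of the A-only state
    y = conc_b / kd_b                      # weight of the B-only state
    z = 1 + x + y + omega * x * y          # sum over all four states
    return (x + omega * x * y) / z

# TF A alone binds weakly; with partner B and strong cooperativity it binds well.
alone = occupancy_a(conc_a=1, conc_b=0, kd_a=10, kd_b=1, omega=100)
with_partner = occupancy_a(conc_a=1, conc_b=1, kd_a=10, kd_b=1, omega=100)
print(round(alone, 3), round(with_partner, 3))
```

Setting omega to 1 makes the occupancy of A independent of B, which is one way to see that the partner dependence comes from the interaction term alone.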
Multiprotein Complexes at Enhancers
Enhancers typically contain several DNA sequence elements that
are recognized by different DNA binding proteins (e.g., the
β-interferon enhancer, Fig. 7.33). The resulting nucleoprotein
complex is called an enhanceosome. Enhancers often are involved
in tissue-specific control of transcription. Given the complex
structures of enhanceosomes, it is easy to see how the absence
of even one of the factors that bind to the enhancer in a certain
tissue could change the expression level of the gene.