Transcript Interaction

Activators in general:
protein-DNA interaction
MBV4230
The sequence specific activators:
transcription factors

Modular design with a minimum of two
functional domains




1. DBD - DNA-binding domain
2. TAD - transactivation domain
DBD: several structural motifs
  classification into TF-families
TAD - a few different types
  Three classical categories
    Acidic domains (Gal4p, steroid receptor)
    Glutamine-rich domains (Sp1)
    Proline-rich domains (CTF/NF1)
  Mutational analyses - bulky hydrophobic residues more important than acidic ones
  Unstructured in free state - 3D structure in contact with target?
Most TFs more complex
  Regulatory domains, ligand binding domains etc.
[Figure: domain layout N - DBD - TAD - C]
TF classification based on
structure of DBD
Zinc finger
bHelix-Loop-Helix (Max)
Leucine zipper (Gcn4p)
p53 DBD
NF-κB
STAT dimer
Two levels of recognition
1. Shape recognition
  An α-helix fits into the major groove in B-DNA. This is used in most interactions
2. Chemical recognition
  Negatively charged sugar-phosphate chain involved in electrostatic interactions
  Hydrogen-bonding is crucial for sequence recognition
Alternative classification of TFs
on the basis of their regulatory role

Classification questions




Is the factor constitutively active, or does it require a signal for activation?
Does the factor, once synthesized, automatically enter the nucleus to
act in transcription?
If the factor requires a signal to become active in transcriptional
regulation, what is the nature of that signal?
Classification system
I. Constitutively active nuclear factors
II. Regulatory transcription factors
  Developmental TFs
  Signal dependent
    Steroid receptors
    Internal signals
    Cell surface receptor controlled
      Nuclear
      Cytoplasmic
Classification - regulatory function
Brivanlou and Darnell (2002) Science 295, 813 -
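
The three classification questions can be read as a small decision procedure. The following is a minimal sketch in Python, not a scheme taken from the cited paper: the field names (`needs_signal`, `developmental`, `signal`, `resting_location`) are assumptions introduced for illustration, and the class labels simply mirror the outline on the previous slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Factor:
    # All field names are illustrative assumptions, not terms from the slides.
    name: str
    needs_signal: bool                      # Q1: constitutively active, or signal required?
    developmental: bool = False             # expression set by a developmental program
    signal: Optional[str] = None            # Q3: "steroid", "internal" or "cell_surface"
    resting_location: Optional[str] = None  # Q2: "nuclear" or "cytoplasmic" before the signal


def classify(f: Factor) -> str:
    """Map the answers to the three questions onto the outline's classes."""
    if f.developmental:
        return "II. Regulatory TF: developmental"
    if not f.needs_signal:
        return "I. Constitutively active nuclear factor"
    if f.signal == "steroid":
        return "II. Regulatory TF: signal dependent, steroid receptor"
    if f.signal == "internal":
        return "II. Regulatory TF: signal dependent, internal signal"
    if f.signal == "cell_surface":
        return ("II. Regulatory TF: signal dependent, "
                f"cell surface receptor controlled ({f.resting_location})")
    raise ValueError(f"cannot classify {f.name}: unknown signal {f.signal!r}")


# Illustrative input: a STAT protein waits in the cytoplasm until a
# cell surface receptor is activated.
stat = Factor("STAT", needs_signal=True, signal="cell_surface",
              resting_location="cytoplasmic")
print(classify(stat))
```

The order of the checks is a design choice of this sketch: the developmental branch is tested first so that a factor flagged as developmental never falls through to class I.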
Sequence specific DNA-binding
- essential for activators


TFs create nucleation sites in promoters for
activation complexes
Sequence specific DNA-binding crucial role
Principles of sequence
specific DNA-binding
How is a sequence (cis-element)
recognized from the outside?
Shape recognition: form/geometry
Chemical recognition: electrostatic interaction, hydrogen bonds, hydrophobic interaction
Complementary forms
The dimensions of an α-helix
fit the dimensions of the
major groove in B-DNA
Side chains point outwards and
are ideally positioned to
engage in hydrogen bonds
Direct reading of DNA-sequence
Recognition of form



The dimensions of an α-helix
fit the dimensions of the
major groove in B-DNA
Most common type of
interaction
Usually multiple domains
participate in recognition
  dimers of same motif
  tandem repeated motif
  interaction of two different motifs
Recognition: detailed fit of
complementary surfaces
  Hydration: water participates
  Sequence-specific variation of DNA structure
Example

Steroid receptor
Recognition by
complementary forms
434 phage repressor
DNA form:
B-DNA most common
B-form
  Major groove: wide geometry, fits α-helix; each base pair has a unique H-bonding pattern
  Minor groove: deep and narrow geometry; each base pair has a binary H-bonding pattern
DNA form:
A-form more used in RNA-binding
A-form
  Major groove: deep and narrow geometry
  Minor groove: wide and shallow
Next level: chemical recognition
- reading of sequence information

Negatively charged
sugar-phosphate chain
= basis for electrostatic
interaction


Equal everywhere - no sequence recognition
Still a main contributor to the
strength of binding
Electrostatic interaction
Entropy-driven binding
[Figure: negatively charged DNA backbone surrounded by Na+ counter ions]
Negative phosphate chain
partially neutralized by a
cloud of counter ions
Counter ions liberated:
entropy-driven binding
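
A worked relation may make "entropy-driven" concrete. This is a sketch using the standard Gibbs relation. The symbol n for the number of released counter ions and P for the protein are assumptions introduced here for illustration, not values from the slides.

```latex
\Delta G = \Delta H - T\,\Delta S
\qquad\text{with}\qquad
\mathrm{P} + \mathrm{DNA}\cdot(\mathrm{Na^+})_n
\;\rightleftharpoons\;
\mathrm{P}\cdot\mathrm{DNA} + n\,\mathrm{Na^+}
```

Releasing the n ordered counter ions into the bulk solution raises their entropy, so the ΔS term is positive and −TΔS makes ΔG more negative. Under these assumptions binding can be favourable even when ΔH is small, and it weakens at high salt, where the released ions gain less entropy.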
Recognition by Hydrogen bonding

Hydrogen-bonding is a
key element in
sequence specific
recognition

  10-20 hydrogen bonds in the contact surface
  Hydrogen-bonding capacity is not exhausted by base pairing in
  duplex DNA; free positions point
  outwards in the major groove
Unexploited H-bonding possibilities
in the grooves
Major groove
AT-base pair Minor groove
Major groove
GC-base pair Minor groove
A “bar code” in the grooves
(A = H-bond acceptor, D = H-bond donor)
Unique “bar code” in major groove
  AT-pair [AD-A] ≠ TA-pair [A-DA]
  GC-pair [AA-D] ≠ CG-pair [D-AA]
  Unique recognition of a base pair requires TWO hydrogen bonds in the major groove
Binary “bar code” in minor groove
  AT-pair [A-A] = TA-pair [A-A]
  GC-pair [ADA] = CG-pair [ADA]
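
The difference between a unique and a binary bar code can be checked mechanically. The sketch below is a minimal illustration, assuming that A stands for acceptor, D for donor and "-" for a gap, and that flipping a base pair (AT to TA) reads the same pattern in reverse. The dictionaries copy the patterns from the slide; the function names are hypothetical.

```python
# Groove patterns as written on the slide (A = acceptor, D = donor, "-" = gap).
MAJOR = {"AT": "AD-A", "GC": "AA-D"}
MINOR = {"AT": "A-A", "GC": "ADA"}


def groove_codes(table):
    """Return the pattern for all four base-pair orientations.

    Assumption of this sketch: the flipped pair (TA, CG) presents the
    same groups in reverse order.
    """
    codes = {}
    for pair, pattern in table.items():
        codes[pair] = pattern
        codes[pair[::-1]] = pattern[::-1]
    return codes


def all_distinguishable(codes):
    """True if every base-pair orientation has its own pattern."""
    return len(set(codes.values())) == len(codes)


major = groove_codes(MAJOR)
minor = groove_codes(MINOR)

print(major)                       # four different patterns
print(all_distinguishable(major))  # True: unique bar code
print(all_distinguishable(minor))  # False: AT = TA and GC = CG
```

In the minor groove both patterns are palindromic, so only AT-type versus GC-type can be told apart, which is what "binary" means on the slide.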
Docked protein side chains exploit the
H-bonding possibilities for interaction
Hydrogen-bonding is
essential for
sequence specific
recognition
  10-20 hydrogen bonds in the contact interface
  Most contacts in major groove
  Purines most important
A Zif example
Interaction:
Protein side chain - DNA bp

Close up
  Amino acid side chains point
  outwards from the α-helix
  and are optimally positioned
  for base interaction
  Still no “genetic code” in the
  form of side chain-base rules
    docking of the entire protein
A network of H-bonds


Example:
c-Myb - DNA
Protein
DNA
Hydrophobic contact points
Ile
Homeodomains
The Homeodomain-family:
common DBD-structure

Homeotic genes - biology
  Regulation of Drosophila development
  Striking phenotypes of mutants - body parts move
  Control genetic developmental program


Homeobox / homeodomain
  Conserved DNA-sequence “homeobox” in a large
  number of genes
  Encodes a 60 aa “homeodomain”
  A stably folded structure that binds DNA
  Similarity with prokaryotic helix-turn-helix


3D-structure determined for several
HDs



  Drosophila Antennapedia HD (NMR)
  Drosophila Engrailed HD-DNA complex (crystal)
  Yeast MATα2
Homeodomain-family: common
DBD-structure

Major groove contact via a 3 α-helix structure
  helix 3 enters major groove (“recognition helix”)
  helix 1+2 antiparallel across helix 3
16 α-helical aa conserved
  9 in hydrophobic core
  some in DNA-contact interface (common docking mechanism?)
Positions important for sequence recognition
  N51 invariant: H-bonding to adenine, role in positioning
  I47 (en, Antp) hydrophobic base contact
  Q50 (en), S50 (α2) H-bond to adenine, determining specificity
  R53 (en), R54 (α2): DNA-contact
Engrailed
Antennapedia
Homeodomain-family: common
DBD-structure

Minor groove contacted via N-terminal
flexible arm
  R3 and R5 in engrailed and R7 in MATα2 contact AT in
  minor groove
  R5 conserved in 97% of HDs
  Deletions and mutants impair DNA-binding
    ftz HD (∆6aa N-term) 130-fold weaker DNA-binding
    MATα2 (R7A) impaired repressor
    POU (∆4,5) DNA-binding lost
Loop between helix 1 and 2 determines
Ubx versus Antp function
  Close to DNA
  exposed for protein-protein interaction
HD-paradox:
what determines sequence specificity?


Drosophila Ultrabithorax (Ubx), Antennapedia (Antp),
Deformed (Dfd) and Sex combs reduced (Scr):
closely similar HDs, biological roles very different
  Minor differences in DNA-binding in vitro
  TAAT-motif bound by most HD-factors
  contrast between promiscuity in vitro and specific effects in vivo
Swaps reveal that surprisingly much of the
specificity is determined by the N-terminal arm, which
contacts the minor groove
  Swaps: Antp with Scr-type N-term arm shows Scr-type specificity in vivo
  Swaps: Dfd with Ubx-type N-term arm shows Ubx-type specificity in vivo
N-terminal arm more divergent than the rest of HD
  R5 and R7 (contacting DNA) are present in all of Ubx, Antp, Dfd, and Scr
  Other tail aa diverge much more
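
A rough count shows why a 4-bp core such as TAAT cannot by itself explain specific effects in vivo. The sketch below is a back-of-the-envelope estimate, assuming a random sequence with equal base frequencies and independent positions. Neither assumption holds for a real genome, and the function name and the sequence length are made up for illustration.

```python
def expected_sites(motif_len: int, seq_len: int, both_strands: bool = True) -> float:
    """Expected number of exact matches of one fixed motif in random DNA.

    Assumes equal base frequencies (0.25 each) and independent positions.
    Counting both strands doubles the number for a non-palindromic motif
    such as TAAT (reverse complement ATTA).
    """
    p_match = 0.25 ** motif_len
    positions = seq_len - motif_len + 1
    strands = 2 if both_strands else 1
    return p_match * positions * strands


region = 1_000_000  # hypothetical 1 Mb stretch of DNA
print(f"4-bp motif (TAAT-like):    {expected_sites(4, region):.0f} chance sites")
print(f"8-bp motif (octamer-like): {expected_sites(8, region):.0f} chance sites")
```

Under these assumptions a 4-bp motif turns up thousands of times per megabase by chance alone, so additional information (N-terminal arm contacts, partner proteins) is needed to single out the functional sites.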
Solutions of the paradox

Conformational effects mediated by N-term arm


  Even if the α-helical HDs are very similar, a much larger diversity is found in
  the N-terminal arms that contact the minor groove
Protein-protein interaction with other TFs through the
N-terminal arm - enhanced affinity/specificity - the
basis of combinatorial control
  MATα2 interaction with MCM1 - cooperative interactions
  Ultrabithorax-Extradenticle in Drosophila
  Hox-Pbx1 in mammals
Combinatorial TFs give enhanced
specificity


TFs encoded by the
homeotic (Hox) genes
govern the choice between
alternative developmental
pathways along the anterior–
posterior axis.
Hox proteins, such as
Drosophila Ultrabithorax,
have low DNA-binding
specificity by themselves but
gain affinity and specificity
when they bind together with
the homeoprotein
Extradenticle (or Pbx1 in
mammals).
N-tail in protein-protein interaction
- adopts different conformations
[Figure labels: HD, MATα2/Mcm1]
Conformation determined
by protein-protein interaction
The partner may also be a linker
histone


Repression of the
mouse MyoD gene by
the linker histone H1b
and the homeodomain
protein Msx1.
The first evidence that a
linker histone subtype
operates in a gene-specific fashion to
regulate tissue
differentiation
It works impressively well

Hox genes
POU family
POU-family: common DBD-structure

The POU-name:
  Pit-1: pituitary-specific TF
  Oct-1 and Oct-2: lymphoid TFs
  Unc-86: TF that regulates neuronal development in C. elegans
A bipartite 160 aa homeodomain-related DBD
  a POU-type HD subdomain (C-terminally located)
  a POU-specific subdomain (N-terminally located)
  coupled by a variable linker (15-30 aa)
POU is a structurally bipartite motif that arose
by the fusion of genes encoding two different
types of DNA-binding domain.
POU: Two independent subdomains

POU-HD subdomain
  60 aa closely similar to the classical HD
  Only weakly DNA-binding by itself (<HD)
  contacts 3´-half site (AAAT in the Oct-1 site ATGCAAAT)
  docking similar to engrailed, Antp etc.
  Main contribution to non-specific backbone contacts
POU-spec subdomain
  75 aa POU-specific domain
  enhances DNA-affinity 1000x
  contacts 5´-half site (ATGC in the Oct-1 site ATGCAAAT)
  contacts opposite side of DNA relative to HD
  structure similar to prokaryotic λ- and 434-repressors
The two-part DNA-binding domain
partially encircles the DNA.
Flexible DNA-recognition

POU-domains have intrinsic conformational flexibility
  and this feature appears to confer functional diversity in DNA recognition
The subdomains are able to assume a variety of conformations,
dependent on the DNA element.
A POU prototype: Oct-1


Ubiquitously expressed Oct-1 (≠ cell type specific Oct-2)
Oct-1 performs many divergent roles in cellular transcription
regulation
  partly owing to its flexibility in DNA binding and ability to associate with
  multiple and varied co-regulators
Oct-1 activates transcription of genes that are
involved in basic cellular processes
Oct-1 activates
  small nuclear RNA (snRNA) and S-phase histone H2B gene transcription
  cell-specific promoters, particularly in the immune and nervous systems
  immunoglobulin (Ig) heavy and light chains
Activates target genes by binding to the “octamer”
cis-element ATGCAAAT

Hence the name “Octamer-motif binding protein”
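
Locating octamer elements in a sequence means checking both strands, since the motif is not palindromic. The following is a minimal sketch with exact matching only; the test sequence is made up for illustration and `find_sites` is a hypothetical helper, not part of any library.

```python
OCTAMER = "ATGCAAAT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = OCTAMER):
    """Return (0-based start, strand) for each exact motif match on either strand."""
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Made-up sequence carrying one octamer on each strand.
promoter = "GGCATGCAAATCCTTATTTGCATGG"
print(find_sites(promoter))  # [(3, '+'), (15, '-')]
```

Exact matching is a simplification: as the later slides on flexibility point out, Oct-1 also recognizes variant sites, which a real search would score with a weight matrix rather than a fixed string.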
Flexibility


On the natural high-affinity Oct-1 octamer
(ATGCAAAT) binding
site, the two Oct-1 POU subdomains lie on
opposite sides of the
DNA
The unstructured linker
permits flexible
subdomain positioning
and hence diversity in
Oct-1 sequence
recognition.
Oct-1: associates with multiple and
varied co-regulators

Oct-1 associates with a B-cell specific coregulator OCA-B (OBF-1). OCA-B stabilizes
Oct-1 on DNA and provides a transcriptional
activation domain.






B-cell specific activation of immunoglobulin genes - long a paradox
  Depends on octamer cis-elements
  B-cells express both the ubiquitous Oct-1 and the cell type specific Oct-2
  Hypothesis: Oct-2 activates Ig genes (Wrong!)
  Oct-2 deficient mice show normal development of early B-cells, and cell
  lines without Oct-2 produce abundant amounts of Ig
  A B-cell specific coactivator mediates Oct-1 transactivation
VP16 - a virus strategy to exploit a host TF
Many viruses use Oct-1 to promote
infection




When herpes simplex virus
(HSV) infects human cells, a
virion protein called VP16 forms
a transcription regulatory complex with
Oct-1 and the cell-proliferation
factor HCF-1
VP16 = a strong transactivator,
not itself DNA-binding, but
becomes associated with DNA
through Oct-1
The specificity of Oct-1 is altered
from the octamer sequence to the virus
cis-element TAATGARAT
The VP16-induced complex has
served as a model for
combinatorial mechanisms of transcription
regulation
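
The R in TAATGARAT is the IUPAC code for a purine (A or G), so the element stands for two concrete sequences. The sketch below expands such a degenerate motif into a regular expression. The small `IUPAC` table covers only a few codes and, like the helper names, is an assumption made for illustration.

```python
import re

# Partial IUPAC nucleotide table (only the codes needed here plus a few common ones).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",   # purine
    "Y": "[CT]",   # pyrimidine
    "N": "[ACGT]",
}


def motif_to_regex(motif: str):
    """Compile a degenerate DNA motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


VP16_ELEMENT = motif_to_regex("TAATGARAT")


def matches(site: str) -> bool:
    """True if the whole site is an instance of TAATGARAT."""
    return VP16_ELEMENT.fullmatch(site.upper()) is not None


for site in ("TAATGAAAT", "TAATGAGAT", "TAATGACAT", "ATGCAAAT"):
    print(site, matches(site))
```

The last test sequence is the cellular octamer, which does not fit the viral pattern. That is consistent with the slide's point that the VP16-induced complex shifts Oct-1 to a different cis-element.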
Pax family
Paired domain
Paired domain DBD
[Figure: paired domain on DNA; labels: PAI, RED, major groove interaction (twice), minor groove interaction, Flex?]