Riboswitches: the oldest regulatory system?
Evolution of bacterial
regulatory systems
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
January 2008
Plan
• Individual sites
• Transcription factors and their binding
signals
• Regulatory systems and regulons
Birth and death of sites
is a very dynamic process
NadR-binding sites upstream of pnuB seem absent in
Klebsiella pneumoniae and Serratia marcescens
… but there are candidate sites further upstream …
… and they are clearly different (not simply misaligned).
Cryptic sites and loss of regulators
Loss of RbsR in Y. pestis
(ABC-transporter also is lost)
RbsR binding site
Start codon of rbsD
Unexpected conservation of non-consensus
positions in orthologous sites
regulatory site of LexA upstream of lexA
consensus nucleotides are in caps
Escherichia coli        TgCTGTATATActcACAGcA
Salmonella typhi        aACTGTATATActcACAGcA
Yersinia pestis         agCTGTATATActcACAGcA
Haemophilus influenzae  atCTGTATAcAatacCAGTt
Pasteurella multocida   TtCTGTATATAataACAGTt
Vibrio cholerae         cACTGgATATActcACAGTc
wrong consensus?
TF PurR, gene purL
Escherichia coli        ACGCAAACGgTTtCGT
Salmonella typhi        ACGCAAACGgTTtCGT
Yersinia pestis         ACGCAAACGgTTtCGT
Haemophilus influenzae  AtGCAAACGTTTGCtT
Pasteurella multocida   ACGCAAACGTTTtCGT
Vibrio cholerae         ACGCAAACGgTTGCtT
TF PurR, gene purM
Escherichia coli        tCGCAAACGTTTGCtT
Salmonella typhi        tCGCAAACGTTTGCtT
Yersinia pestis         tCGCAAACGTTTGCcT
Haemophilus influenzae  tCGCAAACGTTTGCtT
Pasteurella multocida   tCGCAAACGTTTGCtT
Vibrio cholerae         ACGCAAACGTTTtCcT
Non-consensus positions are more conserved than
synonymous codon positions
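One way to put a number on this observation is to take a reference site, pick its non-consensus positions (lowercase in the LexA alignment shown earlier), and count how often an orthologous site carries the same base there. The sketch below does this for three of the LexA sites; the function name `nonconsensus_identity` is hypothetical, and this is an illustration rather than the procedure behind the slide.

```python
def nonconsensus_identity(reference, ortholog):
    """Fraction of the reference's non-consensus (lowercase) positions
    at which the orthologous site carries the same base."""
    if len(reference) != len(ortholog):
        raise ValueError("sites must be aligned to the same length")
    # Non-consensus positions are those written in lowercase in the reference.
    positions = [i for i, ch in enumerate(reference) if ch.islower()]
    if not positions:
        raise ValueError("reference has no non-consensus positions")
    same = sum(reference[i].upper() == ortholog[i].upper() for i in positions)
    return same / len(positions)


# LexA sites upstream of lexA, copied from the alignment above.
e_coli = "TgCTGTATATActcACAGcA"
s_typhi = "aACTGTATATActcACAGcA"
y_pestis = "agCTGTATATActcACAGcA"
h_influenzae = "atCTGTATAcAatacCAGTt"

for name, site in [("S. typhi", s_typhi), ("Y. pestis", y_pestis),
                   ("H. influenzae", h_influenzae)]:
    print(name, nonconsensus_identity(e_coli, site))
```

To compare with synonymous codon positions, the same identity measure would have to be computed over third codon positions of orthologous genes, which is not shown here.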
Regulators and their motifs
• Cases of motif conservation at
surprisingly large distances
• Subtle changes at close evolutionary
distances
• Correlation between contacting
nucleotides and amino acid residues
• Changes in symmetry patterns
NrdR (regulator of ribonucleotide reductases
and some other replication-related genes):
conservation at large distances
DNA motifs and protein-DNA interactions
Entropy at aligned sites and the number of contacts
(heavy atoms in a base pair at a distance <cutoff from a protein atom)
CRP
PurR
IHF
TrpR
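The per-position entropy mentioned on this slide can be sketched as the Shannon entropy of each column in a set of aligned sites. The example below is a minimal illustration that assumes equal-length, gap-free sites and uses no pseudocounts or background correction; the function name `column_entropy` is hypothetical, and the six LexA sites from the earlier slide serve only as sample input (the slide's own data are sites of CRP, PurR, IHF and TrpR, which are not reproduced in this transcript).

```python
import math
from collections import Counter


def column_entropy(sites):
    """Shannon entropy (in bits) of every column of aligned sites."""
    sites = [s.upper() for s in sites]  # case only marks consensus; ignore it
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to the same length")
    entropies = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        total = sum(counts.values())
        h = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


lexa_sites = [
    "TgCTGTATATActcACAGcA",
    "aACTGTATATActcACAGcA",
    "agCTGTATATActcACAGcA",
    "atCTGTATAcAatacCAGTt",
    "TtCTGTATATAataACAGTt",
    "cACTGgATATActcACAGTc",
]
entropy = column_entropy(lexa_sites)
for position, h in enumerate(entropy, start=1):
    print(position, round(h, 2))
```

Low-entropy columns are the strongly conserved ones; plotting these values against the number of protein contacts per base pair would be the comparison the slide describes.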
The LacI family:
subtle changes in motifs at close distances
Specificity-determining positions
in the LacI family
Training set: 459 sequences, average length 338 amino acids,
85 specificity groups
– 44 SDPs
10 residues contact NPF (analog of the effector)
7 residues in the effector contact zone (5Å<dmin<10Å)
6 residues in the intersubunit contacts
5 residues in the intersubunit contact zone (5Å<dmin<10Å)
7 residues contact the operator sequence
6 residues in the operator contact zone (5Å<dmin<10Å)
LacI from E.coli
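A common way to look for specificity-determining positions is to ask which alignment columns are most informative about the specificity group of each sequence. The sketch below scores columns by mutual information between residue and group label. It is a simplified illustration under stated assumptions: the toy alignment and group labels are invented, the function names are hypothetical, and no significance correction is applied, so it should not be read as the exact procedure used for the LacI family.

```python
import math
from collections import Counter


def mutual_information(column, groups):
    """Mutual information (bits) between residues in one alignment column
    and the specificity-group labels of the same sequences."""
    n = len(column)
    joint = Counter(zip(column, groups))
    residue_counts = Counter(column)
    group_counts = Counter(groups)
    mi = 0.0
    for (residue, group), count in joint.items():
        p_joint = count / n
        p_indep = (residue_counts[residue] / n) * (group_counts[group] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi


def score_columns(alignment, groups):
    """Mutual information for every column of an alignment."""
    length = len(alignment[0])
    return [mutual_information([seq[i] for seq in alignment], groups)
            for i in range(length)]


# Hypothetical toy alignment: six sequences, two specificity groups.
toy_alignment = ["ARK", "ARQ", "ARK", "GRQ", "GRK", "GRQ"]
toy_groups = ["X", "X", "X", "Y", "Y", "Y"]

scores = score_columns(toy_alignment, toy_groups)
print(scores)
```

In the toy data the first column separates the two groups perfectly and gets the highest score, the invariant second column scores zero, and the third column is only weakly associated with the grouping.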
The CRP/FNR family of regulators
CooA (Desulfovibrio)  TGTCGGCnnGCCGACA
CRP (Gamma)           TTGTGAnnnnnnTCACAA
FNR (Gamma)           TTGATnnnnATCAA
HcpR (Desulfovibrio)  TTGTgAnnnnnnTcACAA
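All four motifs on this slide are inverted repeats: each reads the same as its own reverse complement, with `n` standing for any nucleotide. A minimal check is sketched below; the helper names are hypothetical, and the direct-repeat string at the end is an invented counterexample, not a motif from the talk.

```python
# Complement table covering upper- and lowercase bases plus the n wildcard.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(motif):
    """Reverse complement of a motif; n stays n."""
    return motif.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(motif):
    """True if the motif equals its own reverse complement (case-insensitive)."""
    return reverse_complement(motif).upper() == motif.upper()


family_motifs = [
    "TGTCGGCnnGCCGACA",
    "TTGTGAnnnnnnTCACAA",
    "TTGATnnnnATCAA",
    "TTGTgAnnnnnnTcACAA",
]
for motif in family_motifs:
    print(motif, is_inverted_repeat(motif))

# Invented direct repeat, for contrast only.
print("TGTCAnnTGTCA", is_inverted_repeat("TGTCAnnTGTCA"))
```

The same test distinguishes the inverted and direct repeats discussed later for the GalR family and C-proteins.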
Correlation between contacting
nucleotides and amino acid residues
• CooA in Desulfovibrio spp.
• CRP in Gamma-proteobacteria
• HcpR in Desulfovibrio spp.
• FNR in Gamma-proteobacteria
Contacting residues: REnnnR
TG: 1st arginine
GA: glutamate and 2nd arginine
COOA  DD  ALTTEQLSLHMGATRQTVSTLLNNLVR
COOA  DV  ELTMEQLAGLVGTTRQTASTLLNDMIR
CRP   EC  KITRQEIGQIVGCSRETVGRILKMLED
CRP   YP  KXTRQEIGQIVGCSRETVGRILKMLED
CRP   VC  KITRQEIGQIVGCSRETVGRILKMLEE
HCPR  DD  DVSKSLLAGVLGTARETLSRALAKLVE
HCPR  DV  DVTKGLLAGLLGTARETLSRCLSRMVE
FNR   EC  TMTRGDIGNYLGLTVETISRLLGRFQK
FNR   YP  TMTRGDIGNYLGLTVETISRLLGRFQK
FNR   VC  TMTRGDIGNYLGLTVETISRLLGRFQK
Motifs:
CooA  TGTCGGCnnGCCGACA
CRP   TTGTGAnnnnnnTCACAA
HcpR  TTGTgAnnnnnnTcACAA
FNR   TTGATnnnnATCAA
The correlation holds for other factors in the family
NrtR (regulator of NAD metabolism):
systematic search for correlated positions
• analysis of correlated positions in proteins and sites
• analysis of specificity determining positions
• the same positions in one alpha-helix identified
• plans for experimental verification
NiaR: changed dimer structure?
The GalR family and C-proteins of RM-systems: direct and inverted repeats
BirA: changed spacing
What are the events leading
to the present-day state?
• Expansion and contraction of regulons
• New regulators (where from?)
• Duplications of regulators with or
without regulated loci
• Loss of regulators with or without
regulated loci
• Re-assortment of regulators and
structural genes
• … especially in complex systems
• Horizontal transfer
Trehalose/maltose catabolism
in alpha-proteobacteria
Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding motifs are very similar (the blue branch is
somewhat different: to avoid cross-recognition?)
Utilization of an unknown galactoside
in gamma-proteobacteria
Yersinia and Klebsiella: two regulons, GalR and LacI-X
Erwinia: one regulon, GalR
Loss of regulator and merger of
regulons: it seems that LacI-X was
present in the common ancestor
(Klebsiella is an outgroup)
Utilization of maltose/maltodextrin
in Firmicutes
Displacement: invasion of a regulator from a
different subfamily (horizontal transfer from a
related species?) – blue sites
Orthologous TFs with
completely different regulons
(alpha-proteobacteria and
Xanthomonadales)
Catabolism of gluconate in proteobacteria
Extreme variability of the regulation
of “marginal” regulon members
β
Pseudomonas spp.
γ
Regulation of amino acid biosynthesis
in Firmicutes
• Interplay between regulatory RNA
elements and transcription factors
• Expansion of T-box systems (normally
– RNA structures regulating
aminoacyl-tRNA-synthetases)
Three regulatory systems for methionine biosynthesis
A. SAM-dependent riboswitch
B. Met-T-box
C. MtaR: repressor of transcription
Methionine regulatory systems:
loss of S-box regulons
• S-boxes (SAM-1 riboswitch)
– Bacillales
– Clostridiales
– the Zoo:
• Petrotoga
• actinobacteria (Streptomyces, Thermobifida)
• Chlorobium, Chloroflexus, Cytophaga
• Fusobacterium
• Deinococcus
• proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator)
+ SAM-2 riboswitch for metK
– Lactobacillales
• MET-boxes (candidate transcription signal)
– Streptococcales
[Tree labels: Lact., Strep., Bac., Clostr.]
Recent duplications and bursts:
Arg-T-box in Clostridium difficile
[Figure: tree of ARG-specific T-box regulatory sites, with clades Lactobacillales, Clostridiales and Bacillales (argS in each clade).
Leaf labels: LR_ARGS, CPE_ARGS, CAC_ARGS, CB_ARGS, CBE_ARGS, CTC_ARGS, LP_ARGS, LME_ARGS, LJ_ARGS, CDF_YQIXYZ, LGA_ARGS, RDF02391, PPE_ARGS, LSA_ARGS, CDF_ARGC, BC_ARGS2, EF_ARGS, BH_ARGS, CDF_ARGH.
Branches marked NEW: yqiXYZ and others.
Legend: ARG-specific T-box regulatory site; gene categories: aminoacyl-tRNA synthetase, biosynthetic genes, amino acid transporters.
Clostridium difficile: RDF02391, argCJBDF, argH; others: argG. Categories shown: predicted amino acid transporters; amino acid biosynthetic genes.]
… following transcription factor loss
Gram+ bacteria:
Clostridium
difficile:
AhrC regulatory protein
(negative regulation of arginine biosynthesis,
positive regulation of arginine catabolism)
Binding to 5’ UTR gene region
regulation of gene expression
5’
...
AhrC site
AhrC is lost
Expansion of T-box regulon
regulation of expression of
arginine biosynthetic
and transport genes by
T-box antitermination
Other clostridia spp.
(CA, CTC, CTH, CPE, CB, CPE)
yqiXYZ
yqiXYZ
argC
argH
argC
argH
argG
: AhrC binding site
: ARG-specific T-box regulatory site
Regulon expansion, or
how FruR has become CRA
• CRA (a.k.a. FruR) in Escherichia coli:
– global regulator
– well-studied in experiment
(many regulated genes known)
• Going back in time: looking for candidate
CRA/FruR sites upstream of (orthologs of)
genes known to be regulated in E.coli
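The search for candidate sites upstream of orthologous genes is typically done by building a positional weight matrix from known sites and scanning upstream regions with it. The sketch below illustrates the idea under several assumptions: since no CRA/FruR sites are listed in this transcript, the PurR sites from the earlier slides stand in as the training set; the upstream sequence is invented; the function names, the pseudocount and the uniform background are illustrative choices, not the settings used in the study; only one strand is scanned.

```python
import math


def build_pwm(sites, pseudocount=1.0):
    """Log-odds weight matrix from aligned sites (uniform background)."""
    sites = [s.upper() for s in sites]
    length = len(sites[0])
    pwm = []
    for i in range(length):
        column = [s[i] for s in sites]
        total = len(column) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in "ACGT"
        })
    return pwm


def best_hit(pwm, upstream):
    """Best-scoring window as (score, offset, window) on the given strand."""
    upstream = upstream.upper()
    width = len(pwm)
    hits = []
    for start in range(len(upstream) - width + 1):
        window = upstream[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((score, start, window))
    return max(hits)


# Stand-in training set: PurR sites upstream of purL and purM (see above).
training_sites = [
    "ACGCAAACGgTTtCGT", "ACGCAAACGgTTtCGT", "ACGCAAACGgTTtCGT",
    "AtGCAAACGTTTGCtT", "ACGCAAACGTTTtCGT", "ACGCAAACGgTTGCtT",
    "tCGCAAACGTTTGCtT", "tCGCAAACGTTTGCtT", "tCGCAAACGTTTGCcT",
    "tCGCAAACGTTTGCtT", "tCGCAAACGTTTGCtT", "ACGCAAACGTTTtCcT",
]
pwm = build_pwm(training_sites)

# Invented upstream region with one site embedded at offset 11.
upstream_region = "GGATCCTTAGC" + "ACGCAAACGTTTTCGT" + "AATGCCGTA"
score, offset, window = best_hit(pwm, upstream_region)
print(round(score, 2), offset, window)
```

In a comparative setting the same scan would be repeated over the upstream regions of orthologs in each genome, and a site would be trusted more when it is found consistently across related species.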
Common ancestor of gamma-proteobacteria
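[Figure: metabolic map with candidate CRA/FruR-regulated genes.
Sugars: Mannose, Glucose, Mannitol, Fructose.
Genes: manXYZ, ptsHI-crr, edd, epd, eda, adhE, aceEF, mtlA, gapA, fbp, pykF, mtlD, fruBA, fruK, pfkA, pgk, gpmA, icdA, ppsA, pckA, aceA, tpiA, aceB.]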
Gamma-proteobacteria
Common ancestor of the Enterobacteriales
[Figure: the same metabolic map, with the same sugars and genes as on the previous slide.]
Gamma-proteobacteria
Enterobacteriales
Common ancestor of Escherichia and Salmonella
[Figure: the same metabolic map, with the same sugars and genes as on the previous slide.]
Gamma-proteobacteria
Enterobacteriales
E. coli and Salmonella spp.
Life without Fur
Regulation of iron homeostasis
(the Escherichia coli paradigm)
Iron:
• essential cofactor (limiting in many environments)
• dangerous at high concentrations
FUR (responds to iron):
• synthesis of siderophores
• transport (siderophores, heme, Fe2+, Fe3+)
• storage
• iron-dependent enzymes
• synthesis of heme
• synthesis of Fe-S clusters
Similar in Bacillus subtilis
Regulation of iron homeostasis in α-proteobacteria
[Figure: regulation scheme.
Regulators: RirA and Irr (each shown in [−Fe] and [+Fe] states; labels: FeS, heme, degraded), Fur ([−Fe] and [+Fe] states; label: Fe), IscR (label: FeS).
Regulated functions: siderophore uptake; Fe2+/Fe3+ uptake; iron uptake systems; iron storage (ferritins); FeS synthesis; heme synthesis; iron-requiring enzymes [iron cofactor].
Other labels: transcription factors; FeS status of cell.]
Experimental studies:
• FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
• RirA (Rrf2 family): Rhizobium and Sinorhizobium
• Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
Distribution of transcription factors in genomes
Search for candidate motifs and binding sites using standard comparative genomic techniques
Regulation of genes in functional subsystems
Rhizobiales
Bradyrhizobiaceae
Rhodobacteriales
The Zoo (likely ancestral state)
Reconstruction of history
Frequent co-regulation with Irr
Strict division of function with Irr
Appearance of the iron-Rhodo motif
All logos and Some Very Tempting Hypotheses:
1. Cross-recognition of FUR and IscR motifs in the ancestor.
2. When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging RirA (from the Rrf2 family, with a rather different general consensus) took over their sites.
3. Iron-Rhodo boxes are recognized by IscR: directly testable
Summary and open problems
• Regulatory systems are very flexible
– easily lost
– easily expanded (in particular, by duplication)
– may change specificity
– rapid turnover of regulatory sites
• With more stories like these, we can start thinking about
a general theory
– catalog of elementary events; how frequent?
– mechanisms (duplication, birth e.g. from enzymes, horizontal
transfer)
– conserved (regulon cores) and non-conserved (marginal regulon
members) genes in relation to metabolic and functional
subsystems/roles
– (TF family-specific) protein-DNA recognition code
– distribution of TF families in genomes; distribution of regulon
sizes; etc.
People
• Andrei A. Mironov – software, algorithms
• Alexandra Rakhmaninova – SDP, protein-DNA correlations

• Anna Gerasimova (now at U. Michigan) – NadR
• Olga Kalinina (on loan to EMBL) – SDP
• Yuri Korostelev – protein-DNA correlations
• Ekaterina Kotelnikova (now at Ariadne Genomics) – evolution of sites
• Olga Laikova – LacI
• Dmitry Ravcheev – CRA/FruR
• Dmitry Rodionov (on loan to Burnham Institute) – iron etc.
• Alexei Vitreschak – T-boxes and riboswitches

• Andy Johnston (U. of East Anglia) – experimental validation (iron)
• Leonid Mirny (MIT) – protein-DNA, SDP
• Andrei Osterman (Burnham Institute) – experimental validation
Howard Hughes Medical Institute
Russian Foundation of Basic Research
Russian Academy of Sciences, program “Molecular and Cellular Biology”
INTAS