Domain Assignment to Transcription Factors
Flowchart:
416 proteins with at least one DBD assignment (SCOP DBD assignments; PFAM assignments)
-> remove 145 proteins (transposases, replication/repair proteins and restriction enzymes)
-> 271 transcription factors (113 with regulated gene information + 158 with DBD)
To gain insight into the evolution and organisation of the transcription factors (TFs) in E. coli, we used
the information available in RegulonDB (Salgado et al., 2001) on TFs and their regulated genes. To
determine the domain architecture and family membership of E. coli TFs, we used the
SUPERFAMILY (Gough et al., 2001) database of structural assignments to predicted proteins from
the E. coli genome, and extracted all those with DNA-binding domains (DBDs). Using the functional
annotation for the E. coli genome, we then removed proteins involved in replication and repair, restriction
enzymes and transposases.
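The selection step described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' actual pipeline: the record fields (`has_dbd`, `annotation`), the category labels and the toy protein identifiers are all hypothetical, and real SUPERFAMILY or RegulonDB data would need its own parsing.

```python
# Hypothetical labels for the functional categories the text says were excluded.
EXCLUDED = {"transposase", "replication/repair", "restriction enzyme"}


def select_tfs(proteins):
    """Keep proteins that carry a DBD and are not in an excluded category."""
    return [
        p for p in proteins
        if p["has_dbd"] and p["annotation"] not in EXCLUDED
    ]


# Toy records; identifiers and annotations are illustrative only.
toy = [
    {"id": "P1", "has_dbd": True, "annotation": "transcription"},
    {"id": "P2", "has_dbd": True, "annotation": "transposase"},
    {"id": "P3", "has_dbd": False, "annotation": "metabolism"},
    {"id": "P4", "has_dbd": True, "annotation": "restriction enzyme"},
]
tfs = select_tfs(toy)
print([p["id"] for p in tfs])

# Consistency check on the counts reported in the flowchart.
assert 416 - 145 == 271
assert 113 + 158 == 271
```

The two assertions at the end simply confirm that the numbers in the flowchart agree with each other: removing 145 of the 416 proteins leaves 271, which is also the sum of the 113 and 158 subsets.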
Repertoire of DNA Binding Domains in E. coli TFs
[Figure: representative structures of the DBD families.
Families shown: “Winged” helix; Lambda repressor-like; Trp repressor; IHF-like DNA-binding proteins; C-terminal effector domain of the bipartite response regulator; Homeodomain-like; FIS-like; Met repressor-like; Nucleic acid binding protein; Flagellar transcriptional activator FlhD; Putative DNA binding protein.
PDB entries shown: 1efa, 1jhf, 1jhg, 1ihf, 1rnl, 1etx, 1bl0, 1cmb, 1mjc, 1g8e, 1exi.]
Domain Analysis of E. coli TFs
[Pie charts: TFs by number of domains (1, 2, 3 and 4 domain TFs) and domains by functional class (DNA Binding Domain, Small Molecule Binding Domain, Enzyme Domain, Receiver Domain, Protein Interaction Domain, Domain of unknown function).
Percentage labels in the charts: 2%, 13%, 9%, 22%, 44%, 76%, 12%, 7%, 11%, 5%.]
On average, each TF has two domains: one DBD and a partner domain, which is generally the control domain.
The activity of ~90% of transcription factors is controlled by a second domain; only ~10% are controlled at the transcriptional level.
In E. coli, there are 271 TFs that belong to 11 DBD families and contain 46 different partner domains, as
identified in SUPERFAMILY. The domains can be broadly classified into the six functional classes
listed above. About two thirds of the 271 proteins have the same domain architecture as at
least one other protein, and have thus arisen as a consequence of duplication of a complete gene
(Apic et al., 2001, show that proteins with identical domain architecture are highly likely to be direct
duplicates). On average, each TF is made up of two domains, and the domains neighbouring the
DBD are of the small molecule binding type in 120 of the 240 multidomain proteins.
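One way to estimate how many proteins share a domain architecture is to group them by their ordered list of domains and count the members of groups with more than one protein. The sketch below assumes each protein is represented as an ordered sequence of domain names; the function name, the input format and the toy data are hypothetical and are not taken from the poster.

```python
from collections import defaultdict


def shared_architecture_fraction(architectures):
    """Return (fraction, groups) for a dict of protein id -> ordered domains.

    fraction is the share of proteins whose exact domain architecture
    is also found in at least one other protein.
    """
    groups = defaultdict(list)
    for protein_id, domains in architectures.items():
        groups[tuple(domains)].append(protein_id)
    shared = sum(len(members) for members in groups.values() if len(members) > 1)
    return shared / len(architectures), dict(groups)


# Toy input; protein identifiers and domain assignments are illustrative only.
toy_architectures = {
    "A": ("Winged helix", "Small molecule binding"),
    "B": ("Winged helix", "Small molecule binding"),
    "C": ("Receiver", "C-terminal effector"),
    "D": ("Homeodomain-like",),
}
fraction, groups = shared_architecture_fraction(toy_architectures)
print(fraction)
```

Order matters in this sketch: two proteins with the same domains in a different N- to C-terminal order are treated as having different architectures.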