Baker - International School of Crystallography
New Drug Targets from Mycobacterium
tuberculosis: Strategies, Progress and
Pitfalls from a Structural Genomics
Enterprise
Ted Baker
School of Biological Sciences
University of Auckland
New Zealand
On behalf of TB Structural
Genomics Consortium
The challenge posed by
complete genome sequences
The Mycobacterium tuberculosis
genome
Approx. 3900 open reading frames (ORFs)
~60% of gene products have an inferred
function (mostly by homology)
~25% are “conserved hypotheticals”
~15% are “unknowns”
~30% can be related to proteins of known 3D
structure - but only ~25 TB protein structures
Many metabolic pathways appear incomplete
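The approximate percentages on this slide can be converted into rough ORF counts. The sketch below assumes the total of about 3900 ORFs and takes the rounded percentages at face value; the names `TOTAL_ORFS`, `ANNOTATION_PERCENT` and `approx_counts` are illustrative and not from the talk.

```python
# Rough ORF counts implied by the approximate percentages on the slide.
TOTAL_ORFS = 3900  # "Approx. 3900" on the slide

# Percentages quoted on the slide (all approximate).
ANNOTATION_PERCENT = {
    "inferred function": 60,
    "conserved hypothetical": 25,
    "unknown": 15,
}


def approx_counts(total, percentages):
    """Return rounded ORF counts for each annotation class."""
    return {name: round(total * pct / 100) for name, pct in percentages.items()}


counts = approx_counts(TOTAL_ORFS, ANNOTATION_PERCENT)
for name, n in counts.items():
    print(f"{name}: ~{n} ORFs")

# ORFs that can be related to a protein of known 3D structure (~30%).
structurally_related = round(TOTAL_ORFS * 30 / 100)
print(f"related to known 3D structure: ~{structurally_related} ORFs")
```

Set against the roughly 25 TB protein structures mentioned on the slide, these counts give a sense of how much of the genome was structurally uncharacterised at first hand.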
Function from structure?
Relationships that are hidden at the sequence level
SpeB – virulence factor
from S. pyogenes
Actinidin – plant cysteine protease
- < 10% sequence identity
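Sequence identity figures such as the one above are computed from an alignment. A minimal sketch follows, assuming identity is counted over aligned columns where neither sequence has a gap (other denominators are also in common use). The two toy fragments are made up for illustration and are not SpeB or actinidin sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_a = "MKT-AYLLGCW"
toy_b = "MRSQAF-LNCY"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

At identities below about 10%, a score like this is generally too low for sequence comparison alone to establish a relationship, which is why the slide describes the relationship as hidden at the sequence level.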
Structural Genomics
The use of genomic information to guide protein
structure discovery
- and its inverse
The use of protein structure analysis to add value
to genomic sequence data – to deduce function
Reversal of the ‘traditional’ direction of structural
analysis
Many targets – whole genomes, pathways,
functional classes, folds
Beginnings…~1998
A pilot programme – Pyrobaculum aerophilum
Using laboratory-scale approaches
- PCR cloning
- Expression in E. coli, cleavable affinity tags
- Variation of expression temperature
- Purification by affinity chromatography and
gel filtration
Genomic approach – most tractable first
Results – P. aerophilum
Cloned        25 (274)
Expressed     20 (168)
Soluble       12 (80)
Purified      12 (43)
Crystallized  6 (24)
Structures    4 (11)
Main bottlenecks
- solubility
- crystallization
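One way to see where targets are lost is to compute the fraction surviving each step. The sketch below assumes the six values pair with the six stages in the order listed, and treats the plain and bracketed numbers as two separate series; the slide does not state what the two series represent, so the labels here are placeholders.

```python
STAGES = ["Cloned", "Expressed", "Soluble", "Purified", "Crystallized", "Structures"]

# Assumed pairing of slide values with stages, in the order listed.
FIRST = [25, 20, 12, 12, 6, 4]
BRACKETED = [274, 168, 80, 43, 24, 11]


def step_yields(counts):
    """Fraction of targets surviving each step relative to the previous step."""
    return [counts[i] / counts[i - 1] for i in range(1, len(counts))]


def report(label, counts):
    """Print the per-step yield for one series of counts."""
    print(label)
    for stage, fraction in zip(STAGES[1:], step_yields(counts)):
        print(f"  {stage:<13}{fraction:.2f}")


report("first values", FIRST)
report("bracketed values", BRACKETED)
```

Under this pairing, the lowest yields in the first series fall at the Soluble and Crystallized steps, which matches the two bottlenecks named on the slide.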
Pa_989 (TB homologue)
HisF (imidazoleglycerol phosphate synthase)
Banfield et al. Acta Cryst. D (2001)
Pa_2307 (unknown)
‘Ancient conserved domain’ found in bacteria
and archaea. No functional annotation
Reproducible crystals with Li2SO4
- but twinned
Two crystals grown from PEG/phosphate
1.5 Å native data from one, SAD data from
Pt(NO2)4 derivative of the other (used gel shift)
Structure solved: SAD/Solve/Resolve/ARP
Pa_2307
The next phase – larger
enterprises
Publicly funded
- NIH Protein Structure Initiative (USA)
- Initiatives in Japan, Germany, UK,
France, Canada
Biotech companies
- Structural Genomix, Syrrx
NIH Protein Structure Initiative
10 groups (consortia) funded
Aim to develop methods and
tools for “high throughput”
structure determination
Goals primarily structural
- representative structures for
all protein sequence families
- discover novel folds (cover
“fold space”)
- estimate 10,000 structures
needed
But evolving
Mycobacterium tuberculosis
Causative agent of TB
One-third of world’s population affected
- approximately 3 million deaths annually
Five front-line drugs (isoniazid, pyrazinamide,
ethambutol, rifampin, streptomycin) but…
- effective only against actively-growing
bacteria
- very long treatment regime (6-9 months)
- resistance rising
- need for new drugs
Peculiarities of the organism
Very slow-growing Gram-positive organism
Complex waxy cell wall – outer layer rich
in unusual lipids, glycolipids, polysaccharides
Novel biosynthetic pathways
Complex lifestyle - persistence
- enters dormant state within
active macrophages
- survives through switches
in metabolism
- can be reactivated years later
Led in United States by:
- Tom Terwilliger (Los Alamos NL)
- David Eisenberg (UCLA)
- Jim Sacchettini (Texas A&M)
- Bill Jacobs (Albert Einstein Coll. of Med.)
- Tom Alber (UC Berkeley)
… and many others
Aims are focused on function:
- understanding TB biology
- discovery and structural analysis of
novel drug targets
http://www.doe-mbi.ucla.edu/TB/
Philosophy and policies
Open participation - to all with an interest in TB
Operates as a wider consortium of >30
participating labs in 13 countries worldwide
Collaboration between structural biologists,
TB biologists, chemists…
Commitment to common policies
- collaboration and cooperation
- shared database for logging progress
- sharing of data and materials
- structures to be placed in public domain
Operational aspects
Central facilities for
- bioinformatic analysis and data storage
- protein expression and evolution
- crystallization
- synchrotron data collection
- gene knockouts
Technologies and facilities available to all
Individuals choose their own targets according to
their own interests – and assign priorities
Targeting scores determine priorities of facilities
Parallel efforts in individual labs
Progress to date
Most structural results to date come from
efforts in individual labs
But - availability of high-throughput facilities gives
flexible options for individual labs
and for efforts in the facilities
Within facilities – 688 genes cloned (out of 720
targeted to date)
First phase – concentrate on soluble proteins
Next phase – the insoluble proteins
Dealing with insoluble proteins
GFP fusions as reporter of solubility – G. Waldo
Folding Reporter - GFP
• Function of R (GFP) depends on solubility of X-L-R.
• Solubility of X-L-R depends on X.
Express fusion protein X-L-R
[Figure: schematic of fusion protein X-L-R. Insoluble: non-functional R. Soluble: detect function of R. Panel labels: X-L-GFP fusion, fluorescence (cell colonies; in vitro transcription + translation); X (non-fusion), SDS-PAGE (soluble fraction; pellet fraction).]
Using GFP-fusions to engineer proteins
for solubility
[Figure: Insoluble protein -> mutate gene -> FORWARD EVOLUTION (recombine optima, clone, select) -> BACKCROSSING (recombine optima and wild type, clone, select) -> soluble protein. G. Waldo]
Solubilisation by evolution
Rv2002 – Se Won Suh
Putative ketoacyl ACP
reductase
Rendered soluble by
3 random mutations
I6T and T69K
mutations are on
the molecular surface
V47M mutation
enhances a semi-exposed hydrophobic
contact
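The mutation names above use the usual shorthand of wild-type residue, position, then new residue (I6T means isoleucine 6 replaced by threonine). A minimal sketch of parsing and applying such codes follows. The Rv2002 sequence is not given in the talk, so the example uses a made-up 10-residue peptide, and the helper names `parse_mutation` and `apply_mutations` are illustrative.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a code such as 'I6T' into (wild_type, position, replacement)."""
    match = MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a point-mutation code: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, codes):
    """Apply point mutations, checking each wild-type residue first."""
    residues = list(sequence)
    for code in codes:
        wild_type, position, replacement = parse_mutation(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: sequence has {residues[position - 1]} there")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical peptide, not Rv2002: I at position 6, V at position 9.
toy_peptide = "MKAGSIPLVT"
mutant = apply_mutations(toy_peptide, ["I6T", "V9M"])
print(mutant)
```

Checking the wild-type residue before substituting catches numbering mismatches, which are a common source of error when mutation lists and sequences come from different sources.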
Potential new
TB drug targets
Early results from the TB
Structural Genomics Consortium
Target ORF Selection in
Mycobacterium tuberculosis
Selection of ORFs: (a) potential drug targets
and (b) to understand TB biology
Biosynthetic enzymes for essential amino
acids, cofactors, lipids, polysaccharides
Secreted proteins
Proteins implicated in antibiotic resistance
or response
Proteins implicated in persistence
1. Cell wall biosynthesis
- mycolic acids (Sacchettini lab)
Long chain branched lipids - form dense waxy
outer layer of the mycobacterial cell wall
Contribute to its impenetrability
Implicated in both virulence and persistence
Either covalently attached to cell wall
or released as trehalose dimycolate
(“cord factor”)
Modification of mycolic acids, e.g. cyclopropanation
– varies between pathogenic and non-pathogenic
species
Cyclopropanation of mycolic acid
chains
Cyclopropane groups introduced by methylation
Three cyclopropane synthases
(C. Smith, J. Sacchettini – Texas A&M)
CmaA1
CmaA2
PcaA
2. Secreted proteins
(Eisenberg lab)
Secreted proteins attractive drug targets
for M. tuberculosis because:
Often determinants of virulence or persistence
- involved in cell wall modification
- role in survival in macrophages
M. tuberculosis secretes large number of proteins
Cell wall is impermeable to many antibacterial agents
Secreted proteins
(C. Goulding, D. Anderson, H. Gill, D. Eisenberg – UCLA)
C
Rv2220
Glutamine synthetase
- Synthesis of
poly-(L-Glu-L-Gln)
for cell wall
N
Rv1926c
Unknown, resembles
cell surface binding
proteins (invasin,
adaptin, arrestin)
Rv1886c
Antigen 85B
Mycolyl transferase
3. Targets against persistence
(Sacchettini lab)
Persistence within activated macrophages
facilitated by switch in metabolism
Glycolysis downregulated – instead
glyoxylate shunt allows use of C2 substrates
generated by β-oxidation of fatty acids
Enzymes isocitrate lyase and malate synthase
are drug targets for persistent bacteria
Glyoxylate shunt enzymes
(V. Sharma, J. Sacchettini - Texas A&M)
Rv0467
Isocitrate lyase
Rv1837c
Malate synthase
4. Antibiotic resistance
- Isoniazid response genes
DNA microarray analysis
of TB ORFs upregulated by
exposure to isoniazid
Some code for proteins of
known function – cell wall
biosynthesis
Others represent ‘unknowns’
The proteins encoded by
these ORFs may represent
the bacterial response to the
toxic effects of the antibiotic
Wilson et al., PNAS 96:12833-12838 (1999)
Putative INH response operon
Four ORFs appear to make up part of a
putative operon in the TB genome: Rv0340,
Rv0341, Rv0342, Rv0343.
Rv0340
Rv0341
Rv0342
Rv0343
None of the four ORFs have detectable
sequence homologues in other organisms.
Rv0340 and Rv0341 are paralogues, as are
Rv0342 and Rv0343
Same genes also upregulated by ethambutol.
Isoniazid response – Rv0340
Moyra Komen, Vic Arcus, Shaun Lott
Crystallization attempts
Oil
Spherulites
NMR – shows only partially folded
Limited proteolysis – gives N-terminal fragment
with excellent NMR spectrum
NMR spectrum – Rv0340
(residues 1-131)
Indicates helical
bundle with flexible
tail
Possible homology
with acyl carrier
protein
Gives putative
role in cell wall
biosynthesis
Problems of partial or incorrect
functional annotation
Rv1347c
Widespread in bacteria, but
not eukaryotes
No clearly indicated function
- closest sequence homologs:
malonyl CoA decarboxylase
siderophore biosynthesis
aminoglycoside acetyltransferase
No structure prediction
Rv1347c structure - Graeme Card
Rv1347c
Acetyl-CoA dependent
aminoglycoside
acetyltransferase (11%
identity)
Rv1347c
Aminoglycoside N-acetyl
transferase (GCN5 family)
~ 11% sequence identity
Problem of partial or incorrect
functional annotations
Rv3853 - “menG”
Putative SAM-dependent methyltransferase
catalysing final step in menaquinone biosynthesis
Potential drug target – menaquinone pathway is
essential and is not present in humans
Genome also includes ubiE (Rv0558) - catalyses
this step in both menaquinone and ubiquinone
biosynthesis (menG is specific for menaquinone)
Expressed, refolded, crystallized, solved to 1.9Å by
SIRAS
Common methyltransferase fold
MenG structure – Jodie Johnston
Structure does not
look like a
methyltransferase
Resembles a
phosphate transfer
domain?
Incorrect annotation
Challenges for the future
Membrane proteins
Solubility of expressed proteins
Hetero-oligomeric proteins
Protein-protein interactions
Assignment of function to “unknowns”
Cellular pathways - metabolic pathways
- signalling pathways
Conclusions
Structural biology is being transformed by
new technologies – some driven by genomics
Less effort in solving initial structures – more
emphasis on “downstream” studies
TB structural genomics consortium – a different
model for large scale structure determination
- access to centralised facilities
- international effort on a common goal
- collaboration rather than competition
- opportunities for smaller labs
Thanks
Mycobacterium tuberculosis structural genomics
consortium
Members of Auckland Structural Biology
Laboratory – Vic Arcus, Kristina Backbro, Mark
Banfield, Heather Baker, Graeme Card, Jodie Johnston,
Rainer Knijff, Moyra Komen, Shaun Lott, Andrew
McCarthy, Clyde Smith
Marsden Fund
Health Research Council
New Economy Research Fund