Baker - International School of Crystallography


New Drug Targets from Mycobacterium tuberculosis: Strategies, Progress and Pitfalls from a Structural Genomics Enterprise
Ted Baker
School of Biological Sciences
University of Auckland
New Zealand
On behalf of TB Structural Genomics Consortium
The challenge posed by complete genome sequences
The Mycobacterium tuberculosis genome
• Approx. 3900 open reading frames (ORFs)
• ~60% of gene products have an inferred function (mostly by homology)
• ~25% are “conserved hypotheticals”
• ~15% are “unknowns”
• ~30% can be related to proteins of known 3D structure - but only ~25 TB protein structures
• Many metabolic pathways appear incomplete
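The percentages above translate into rough ORF counts. The short sketch below only does that arithmetic; the category labels are shorthand for the slide's wording, and the results are as approximate as the "approx. 3900" total they start from.

```python
# Approximate ORF counts implied by the percentages quoted on the slide.
TOTAL_ORFS = 3900  # "Approx. 3900 open reading frames"

# Percentages as quoted; the first three categories partition the genome.
functional_classes = {
    "inferred function": 60,
    "conserved hypothetical": 25,
    "unknown": 15,
}
RELATED_TO_KNOWN_3D_PERCENT = 30  # overlaps the classes above


def approx_count(total, percent):
    """Integer count for a whole-number percentage of the total."""
    return total * percent // 100


counts = {label: approx_count(TOTAL_ORFS, pct)
          for label, pct in functional_classes.items()}
related_to_known_3d = approx_count(TOTAL_ORFS, RELATED_TO_KNOWN_3D_PERCENT)

for label, n in counts.items():
    print(f"{label}: ~{n} ORFs")
print(f"related to a known 3D structure: ~{related_to_known_3d} ORFs")
```

Set against roughly 1170 ORFs that can be related to a known fold, the ~25 TB protein structures quoted on the slide show how little of the genome had been characterised structurally.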
Function from structure?
Relationships that are hidden at the sequence level
SpeB – virulence factor from S. pyogenes
Actinidin – plant cysteine protease
- < 10% sequence identity
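For readers unfamiliar with the "< 10% sequence identity" figure, the sketch below shows one common way to compute percent identity from a pairwise alignment. It is a minimal illustration: the denominator convention (columns where both sequences have a residue) is an assumption, since the slide does not say which definition was used, and the two aligned strings are invented toy sequences, not SpeB or actinidin.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment ('-' marks a gap).
toy_a = "ACDEFGHIKL"
toy_b = "ACDQFG-IKV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

At identities below roughly 10%, as quoted for SpeB and actinidin, sequence comparison alone gives no reliable signal, which is why the relationship only becomes visible at the level of 3D structure.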
Structural Genomics
• The use of genomic information to guide protein structure discovery
- and its inverse
• The use of protein structure analysis to add value to genomic sequence data – to deduce function
• Reversal of the ‘traditional’ direction of structural analysis
• Many targets – whole genomes, pathways, functional classes, folds
Beginnings…~1998
A pilot pilot programme – Pyrobaculum aerophilum
• Using laboratory-scale approaches
- PCR cloning
- Expression in E. coli, cleavable affinity tags
- Variation of expression temperature
- Purification by affinity chromatography and gel filtration
• Genomic approach – most tractable first
Results – P. aerophilum
Cloned        25 (274)
Expressed     20 (168)
Soluble       12 (80)
Purified      12 (43)
Crystallized  6 (24)
Structures    4 (11)
Main bottlenecks
- solubility
- crystallization
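The two bottlenecks can be read directly from the stage counts. The sketch below computes the step-to-step yield for the first number in each pair (25, 20, 12, 12, 6, 4); the numbers in parentheses are not explained on the slide and are left out. The pairing of labels with counts follows the order in which they appear in the transcript, which is an assumption about the original table layout.

```python
# Stage counts as read from the slide (first number of each pair).
stages = [
    ("Cloned", 25),
    ("Expressed", 20),
    ("Soluble", 12),
    ("Purified", 12),
    ("Crystallized", 6),
    ("Structures", 4),
]


def step_yields(stages):
    """Fraction of targets surviving each transition between consecutive stages."""
    result = []
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        result.append((f"{prev_name} -> {name}", n / prev_n))
    return result


yields = step_yields(stages)
for step, fraction in yields:
    print(f"{step}: {fraction:.0%}")

# The two lowest-yield transitions are the main bottlenecks.
worst_two = sorted(yields, key=lambda item: item[1])[:2]
print("Lowest yields:", [step for step, _ in worst_two])
```

Under these assumptions the two weakest transitions are into the Crystallized and Soluble stages, which agrees with the bottlenecks named on the slide.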
Pa_989 (TB homologue)
• HisF (imidazoleglycerol phosphate synthase)
• Banfield et al. Acta Cryst. D (2001)
Pa_2307 (unknown)
• ‘Ancient conserved domain’ found in bacteria and archaea. No functional annotation
• Reproducible crystals with Li2SO4
- but twinned
• Two crystals grown from PEG/phosphate
• 1.5 Å native data from one, SAD data from Pt(NO2)4 derivative of the other (used gel shift)
• Structure solved: SAD/Solve/Resolve/ARP
Pa_2307
The next phase – larger enterprises
• Publicly funded
- NIH Protein Structure Initiative (USA)
- Initiatives in Japan, Germany, UK, France, Canada
• Biotech companies
- Structural Genomix, Syrrx
NIH Protein Structure Initiative
• 10 groups (consortia) funded
• Aim to develop methods and tools for “high throughput” structure determination
• Goals primarily structural
- representative structures for all protein sequence families
- discover novel folds (cover “fold space”)
- estimate 10,000 structures needed
But evolving
Mycobacterium tuberculosis
Causative agent of TB
One-third of world’s population affected
- approximately 3 million deaths annually
Five front-line drugs (isoniazid, pyrazinamide, ethambutol, rifampin, streptomycin) but…
- effective only against actively-growing bacteria
- very long treatment regime (6-9 months)
- resistance rising
- need for new drugs
Peculiarities of the organism
• Very slow-growing Gram-positive organism
• Complex waxy cell wall – outer layer rich in unusual lipids, glycolipids, polysaccharides
• Novel biosynthetic pathways
• Complex lifestyle - persistence
- enters dormant state within active macrophages
- survives through switches in metabolism
- can be reactivated years later
 Led in United States by:
- Tom Terwilliger (Los Alamos NL)
- David Eisenberg (UCLA)
- Jim Sacchettini (Texas A&M)
- Bill Jacobs (Albert Einstein Coll. of Med.)
- Tom Alber (UC Berkeley) … and many others
• Aims are focused on function:
- understanding TB biology
- discovery and structural analysis of novel drug targets
http://www.doe-mbi.ucla.edu/TB/
Philosophy and policies
• Open participation - to all with an interest in TB
• Operates as a wider consortium of >30 participating labs in 13 countries worldwide
• Collaboration between structural biologists, TB biologists, chemists…
• Commitment to common policies
- collaboration and cooperation
- shared database for logging progress
- sharing of data and materials
- structures to be placed in public domain
Operational aspects
• Central facilities for
- bioinformatic analysis and data storage
- protein expression and evolution
- crystallization
- synchrotron data collection
- gene knockouts
• Technologies and facilities available to all
• Individuals choose their own targets according to their own interests – and assign priorities
• Targeting scores determine priorities of facilities
• Parallel efforts in individual labs
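The slide does not say how targeting scores are calculated. As a purely illustrative sketch, the code below assumes that each lab attaches a numeric priority to the ORFs it is interested in and that a facility ranks ORFs by the summed priority. The scoring rule, the lab names and the ORF names are all hypothetical.

```python
from collections import defaultdict


def rank_targets(requests):
    """Rank ORFs by summed priority.

    requests: iterable of (lab, orf, priority) tuples, where a higher
    priority means the lab wants the target more (hypothetical convention).
    """
    scores = defaultdict(int)
    for _lab, orf, priority in requests:
        scores[orf] += priority
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical requests from three labs.
requests = [
    ("lab_A", "orf_1", 3),
    ("lab_B", "orf_2", 5),
    ("lab_C", "orf_1", 4),
    ("lab_A", "orf_3", 1),
]
ranking = rank_targets(requests)
for orf, score in ranking:
    print(orf, score)
```

Summing priorities favours targets that several labs care about; a real scheme could just as well weight by tractability or drug-target potential.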
Progress to date
• Most of the structural results to date come as a result of efforts in individual labs
• But - availability of high-throughput facilities gives flexible options for individual labs and for efforts in the facilities
• Within facilities – 688 genes cloned (out of 720 targeted to date)
• First phase – concentrate on soluble proteins
• Next phase – the insoluble proteins
Dealing with insoluble proteins
GFP fusions as reporter of solubility – G. Waldo
Folding Reporter - GFP
• Function of R (GFP) depends on solubility of X-L-R.
• Solubility of X-L-R depends on X.
[Diagram: express fusion protein X-L-R (labels N, C, L, X). Insoluble: non-functional R. Soluble: detect function R.]
[Figure panels: cell colonies and in vitro transcription + translation (X-L-GFP fusion, fluorescence); SDS-PAGE of X (non-fusion), soluble fraction and pellet fraction.]
Using GFP-fusions to engineer proteins for solubility
[Diagram (G. Waldo): insoluble protein → mutate gene → FORWARD EVOLUTION (recombine optima, clone, select) → BACKCROSSING (recombine optima and wild type, clone, select) → soluble protein.]
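The forward-evolution and backcrossing cycle can be pictured with a toy model. The sketch below is not the laboratory protocol, which relies on random mutagenesis, recombination and screening of colony fluorescence. It replaces all of that with a deterministic stand-in: a variant is a set of mutation labels, a hypothetical score function plays the role of GFP fluorescence, forward evolution keeps any mutation that does not lower the score, and backcrossing drops mutations whose removal does not lower it. Mutation names and effect sizes are invented.

```python
# Hypothetical per-mutation effects on a solubility score (stand-in for fluorescence).
EFFECTS = {"m1": 2, "m2": 0, "m3": 1, "m4": -1}


def score(variant):
    """Toy additive score; real mutations need not combine additively."""
    return sum(EFFECTS[m] for m in variant)


def forward_evolution(start, candidates):
    """Accept each candidate mutation unless it lowers the score.

    Neutral 'passenger' mutations are accepted too, as can happen in
    a real selection.
    """
    current = frozenset(start)
    for mutation in candidates:
        trial = current | {mutation}
        if score(trial) >= score(current):
            current = trial
    return current


def backcross(evolved):
    """Remove mutations whose loss does not lower the score.

    Mimics recombining the optimum with wild type and reselecting.
    """
    current = frozenset(evolved)
    for mutation in sorted(evolved):
        trial = current - {mutation}
        if score(trial) >= score(current):
            current = trial
    return current


evolved = forward_evolution(set(), ["m1", "m2", "m3", "m4"])
minimal = backcross(evolved)
print("after forward evolution:", sorted(evolved))
print("after backcrossing:", sorted(minimal))
```

In this toy run the neutral mutation is carried through forward evolution and removed by backcrossing, leaving only the changes that contribute to the score.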
Solubilisation by evolution
Rv2002 – Se Won Suh
• Putative ketoacyl ACP reductase
• Rendered soluble by 3 random mutations
• I6T and T69K mutations are on the molecular surface
• V47M mutation enhances a semi-exposed hydrophobic contact
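Mutation names such as I6T follow the usual convention of wild-type residue, position (1-based), mutant residue. The sketch below parses that notation and applies substitutions to a sequence. The ten-residue sequence in the example is a made-up toy string, not the Rv2002 sequence, so only I6T is applied to it; the other two codes are parsed only.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a code like 'I6T' into (wild_type, position, mutant)."""
    match = _MUTATION.match(code)
    if match is None:
        raise ValueError(f"not a point-mutation code: {code!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_mutations(sequence, codes):
    """Return the sequence with each substitution applied (positions are 1-based)."""
    residues = list(sequence)
    for code in codes:
        wild_type, position, mutant = parse_mutation(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: sequence has {residues[position - 1]} at {position}")
        residues[position - 1] = mutant
    return "".join(residues)


# The three mutations named on the slide, parsed only.
for code in ("I6T", "T69K", "V47M"):
    print(code, parse_mutation(code))

# Hypothetical toy sequence with isoleucine at position 6.
toy_sequence = "MKTAYIAKQR"
print(apply_mutations(toy_sequence, ["I6T"]))
```

Checking the wild-type residue before substituting catches off-by-one numbering errors, which are common when construct numbering differs from genome numbering.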
Potential new TB drug targets
Early results from the TB Structural Genomics Consortium
Target ORF Selection in Mycobacterium tuberculosis
• Selection of ORFs: (a) potential drug targets and (b) to understand TB biology
• Biosynthetic enzymes for essential amino acids, cofactors, lipids, polysaccharides
• Secreted proteins
• Proteins implicated in antibiotic resistance or response
• Proteins implicated in persistence
1. Cell wall biosynthesis
- mycolic acids (Sacchettini lab)
• Long chain branched lipids - form dense waxy outer layer of the mycobacterial cell wall
• Contribute to its impenetrability
• Implicated in both virulence and persistence
• Either covalently attached to cell wall or released as trehalose dimycolate (“cord factor”)
• Modification of mycolic acids, e.g. cyclopropanation – varies between pathogenic and non-pathogenic species
Cyclopropanation of mycolic acid chains
• Cyclopropane groups introduced by methylation
Three cyclopropane synthases
(C. Smith, J. Sacchettini – Texas A&M)
CmaA1
CmaA2
PcaA
2. Secreted proteins
(Eisenberg lab)
Secreted proteins are attractive drug targets for M. tuberculosis because:
• Often determinants of virulence or persistence
- involved in cell wall modification
- role in survival in macrophages
• M. tuberculosis secretes a large number of proteins
• Cell wall is impermeable to many antibacterial agents
Secreted proteins
(C. Goulding, D. Anderson, H. Gill, D. Eisenberg – UCLA)
Rv2220: Glutamine synthetase - synthesis of poly-(L-Glu-L-Gln) for cell wall
Rv1926c: Unknown, resembles cell surface binding proteins (invasin, adaptin, arrestin)
Rv1886c: Antigen 85B, mycolyl transferase
3. Targets against persistence
(Sacchettini lab)
• Persistence within activated macrophages facilitated by switch in metabolism
• Glycolysis downregulated – instead glyoxylate shunt allows use of C2 substrates generated by β-oxidation of fatty acids
• Enzymes isocitrate lyase and malate synthase are drug targets for persistent bacteria
Glyoxylate shunt enzymes
(V. Sharma, J. Sacchettini - Texas A&M)
Rv0467: Isocitrate lyase
Rv1837c: Malate synthase
4. Antibiotic resistance
- Isoniazid response genes
• DNA microarray analysis of TB ORFs upregulated by exposure to isoniazid
• Some code for proteins of known function – cell wall biosynthesis
• Others represent ‘unknowns’
• The proteins encoded by these ORFs may represent the bacterial response to the toxic effects of the antibiotic
Wilson et al., PNAS 96:12833-12838 (1999)
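"Upregulated" in a microarray experiment usually means that the treated-to-control signal ratio exceeds some cut-off. The sketch below illustrates that idea with a log2 fold-change threshold. It is not the analysis used by Wilson et al.: the threshold, the ORF names and the intensity values are hypothetical, and real analyses add normalisation and replicate statistics.

```python
import math


def upregulated(control, treated, min_log2_fold=1.0):
    """Return {orf: log2 fold change} for ORFs at or above the threshold.

    control, treated: dicts mapping ORF name to a positive signal intensity.
    min_log2_fold=1.0 corresponds to a two-fold increase (assumed cut-off).
    """
    hits = {}
    for orf, baseline in control.items():
        log2_fold = math.log2(treated[orf] / baseline)
        if log2_fold >= min_log2_fold:
            hits[orf] = log2_fold
    return hits


# Hypothetical intensities without and with drug exposure.
control = {"orf_a": 100.0, "orf_b": 200.0, "orf_c": 50.0}
treated = {"orf_a": 400.0, "orf_b": 210.0, "orf_c": 100.0}

hits = upregulated(control, treated)
for orf, value in sorted(hits.items()):
    print(f"{orf}: log2 fold change {value:.2f}")
```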
Putative INH response operon
• Four ORFs appear to make up part of a putative operon in the TB genome: Rv0340, Rv0341, Rv0342, Rv0343.
[Operon diagram: Rv0340, Rv0341, Rv0342, Rv0343]
• None of the four ORFs have detectable sequence homologues in other organisms.
• Rv0340 and Rv0341 are paralogues, as are Rv0342 and Rv0343
• Same genes also upregulated by ethambutol.
Isoniazid response – Rv0340
Moyra Komen, Vic Arcus, Shaun Lott
• Crystallization attempts – oil, spherulites
• NMR – shows only partially folded
• Limited proteolysis – gives N-terminal fragment with excellent NMR spectrum
NMR spectrum – Rv0340 (residues 1-131)
• Indicates helical bundle with flexible tail
• Possible homology with acyl carrier protein
• Gives putative role in cell wall biosynthesis
Problems of partial or incorrect functional annotation
Rv1347c
• Widespread in bacteria, but not eukaryotes
• No clearly indicated function
- closest sequence homologs: malonyl CoA decarboxylase, siderophore biosynthesis, aminoglycoside acetyltransferase
• No structure prediction
Rv1347c structure - Graeme Card
Rv1347c
Acetyl-CoA dependent aminoglycoside acetyltransferase (11% identity)
Rv1347c
Aminoglycoside N-acetyl transferase (GCN5 family), ~11% sequence identity
Problem of partial or incorrect functional annotations
Rv3853 - “menG”
• Putative SAM-dependent methyltransferase catalysing final step in menaquinone biosynthesis
• Potential drug target – menaquinone pathway is essential and is not present in humans
• Genome also includes ubiE (Rv0558) - catalyses this step in both menaquinone and ubiquinone biosynthesis (menG is specific for menaquinone)
• Expressed, refolded, crystallized, solved to 1.9 Å by SIRAS
Common methyltransferase fold
MenG structure – Jodie Johnston
• Structure does not look like a methyltransferase
• Resembles a phosphate transfer domain?
• Incorrect annotation
Challenges for the future
• Membrane proteins
• Solubility of expressed proteins
• Hetero-oligomeric proteins
• Protein-protein interactions
• Assignment of function to “unknowns”
• Cellular pathways
- metabolic pathways
- signalling pathways
Conclusions
• Structural biology is being transformed by new technologies – some driven by genomics
• Less effort in solving initial structures – more emphasis on “downstream” studies
• TB structural genomics consortium – a different model for large scale structure determination
- access to centralised facilities
- international effort on a common goal
- collaboration rather than competition
- opportunities for smaller labs
Thanks
• Mycobacterium tuberculosis structural genomics consortium
• Members of Auckland Structural Biology Laboratory – Vic Arcus, Kristina Backbro, Mark Banfield, Heather Baker, Graeme Card, Jodie Johnston, Rainer Knijff, Moyra Komen, Shaun Lott, Andrew McCarthy, Clyde Smith
• Marsden Fund
• Health Research Council
• New Economy Research Fund