Drug Discovery: Proteomics, Genomics


Drug Discovery: Proteomics,
Genomics
Philip E. Bourne
Professor of Pharmacology UCSD
[email protected] 858-534-8301
SPPS273
Agenda
• Where my perspective comes from
• The interplay between omics, IT and drug
discovery
• The omics revolution
• Changes in IT and open science and software
licensing
• Applying the new biology to drug discovery
– Example 1 – Drug repositioning
– Example 2 - Determining side-effects
• Words of caution
Some Background
• We work in the area of structural
bioinformatics
• We distribute the equivalent of ¼ of
the Library of Congress to approx.
250,000 scientists each month
• We are interested in improving
the drug discovery process
through computationally driven
hypotheses on the complete
biological system
• Personally:
– Open science advocate
– Started 4 companies
– Spent whole life in the ivory tower
The Source of My Perspective
Observations
• Glass ½ Empty: drug
discovery in the
traditional sense is in a
woeful state
• Glass ½ Full:
– We have an explosion of
data and hence a new
emerging understanding of
complex biological systems
– Information technology is
advancing rapidly
The Take Home Message
• Let optimism rule – let
traditional computational
chemistry and
cheminformatics meet
bioinformatics, systems
biology and information
science to discover drugs
in new ways
The Drivers of Change – Data & IT
[Figure: Biological Experiment → Data → Information → Knowledge → Discovery, with the steps Collect, Characterize, Compare, Model and Infer]
[Figure: biological complexity (sequence, structure, sub-cellular assembly, cellular, organ, higher-life) and technology (computing power, sequencing, data, number of people per web site) plotted by year (90, 95, 00, 05). Labeled milestones: ESTs, gene chips, the E. coli, yeast, C. elegans and human genomes, the Human Genome Project, GWAS, virus structure, ribosome, genetic circuits, model metabolic pathway of E. coli, neuronal modeling, cardiac modeling, brain mapping, virtual communities, blogs, Facebook.]
The Omics Revolution
It's Not Just About Numbers, It's About Complexity
[Figure: number of released entries by year. Courtesy of the RCSB Protein Data Bank]
Metagenomics - 2007
• New type of genomics
• New data (and lots of it)
and new types of data
– 17M new (predicted)
proteins! 4-5x growth
in just a few months and
much more coming
– New challenges and
exacerbation of old
challenges
Metagenomics: Early Results
• More than 99.5% of DNA
in every environment
studied represents
unknown organisms
– Culturable organisms are
exceptions, not the rule
• Most genes represent
distant homologs of known
genes, but there are
thousands of new families
• Everything we touch
turns out to be a gold
mine
• Environments studied:
– Water (ocean, lakes)
– Soil
– Human body (gut, oral
cavity, human
microbiome)
Metagenomics New Discoveries
Environmental (red) vs. Currently Known PTPases (blue)
The Good News and the Bad News
• Good news
– Data pointing towards function are growing at
near exponential rates
– IT can handle it on a per dollar basis
• Bad news
– Data are growing at near exponential rates
– Quality is highly variable
– Accurate functional annotation is sparse
Example of the Interplay Between Bioinformatics &
Proteomics - The Structural Genomics Pipeline
Structural biology moves from being functionally driven to genomically driven
Basic Steps
• Target Selection
• Crystallomics: Isolation, Expression, Purification, Crystallization
• Data Collection
• Structure Solution
• Structure Refinement
• Functional Annotation
• Publish
Annotations: Fill in protein fold space, Robotics, -ve data, Software engineering, Functional prediction, Not necessarily
Towards Open Science
• Open access publishing
• Open source software
• Generation of scientists weaned on social
networks
• Blogs, wikis, social bookmarking etc. are
becoming a valid form of scientific discourse
http://www.osdd.net/
University Tech Transfer Offices are
Slow to Embrace this Change
• Overvalue disclosures
• Inability to market disclosures appropriately
• Protracted negotiations in a fast moving
market
• Disable rather than enable startups
So Why is All of This So
Important to Drug Discovery?
We are beginning to piece together a
complex living system and we need
to understand that to do better
Why Don’t we Do Better?
A Couple of Observations
• Gene knockouts only affect phenotype in 10-20%
of cases. Why?
– redundant functions
– alternative network routes
– robustness of interaction networks
A.L. Hopkins Nat. Chem. Biol. 2008 4:682-690
• 35% of biologically active compounds bind to
more than one target
Paolini et al. Nat. Biotechnol. 2006 24:805–815
Why Don’t we Do Better?
A Couple of Observations
• Tykerb – Breast cancer
• Gleevec – Leukemia, GI
cancers
• Nexavar – Kidney and liver
cancer
• Staurosporine – natural product
(alkaloid) with many uses, e.g.,
antifungal, antihypertensive
Collins and Workman 2006 Nature Chemical Biology 2 689-700
Implications
• Ehrlich’s philosophy of magic bullets targeting
individual chemoreceptors has not been
realized
• Stated another way – The notion of one drug,
one target, one disease is a little naïve in a
complex system
So How Can We Exploit All The New
Data We are Collecting on This
Complex System?
Let's Work Through a Couple of
Examples
What if…
• We can characterize a protein-ligand
binding site from a 3D structure
(primary site) and search for that site
on a proteome wide scale?
• We could perhaps find alternative
binding sites (off-targets) for existing
pharmaceuticals and NCEs?
Exploiting the Structural Proteome
What Do These Off-targets Tell Us?
• Potentially many things:
1. Nothing
2. How to optimize an NCE
3. A possible explanation for a side-effect of a drug
already on the market
4. A possible repositioning of a drug to treat a
completely different condition
5. The reason a drug failed
6. A multi-target strategy to attack a pathogen
Need to Start with a 3D Drug-Receptor Complex
- The PDB Contains Many Examples
Name | Other Name | Treatment | PDB ID
Lipitor | Atorvastatin | High cholesterol | 1HWK, 1HW8…
Testosterone | Testosterone | Osteoporosis | 1AFS, 1I9J ..
Taxol | Paclitaxel | Cancer | 1JFF, 2HXF, 2HXH
Viagra | Sildenafil citrate | ED, pulmonary arterial hypertension | 1TBF, 1UDT, 1XOS..
Digoxin | Lanoxin | Congestive heart failure | 1IGJ
A Reverse Engineering Approach to
Drug Discovery Across Gene Families
1. Characterize ligand binding site of primary target (Geometric Potential)
2. Identify off-targets by ligand binding site similarity (Sequence order independent profile-profile alignment)
3. Extract known drugs or inhibitors of the primary and/or off-targets
4. Search for similar small molecules
…
5. Dock molecules to both primary and off-targets
6. Statistical analysis of docking score correlations
Xie and Bourne 2009
Bioinformatics 25(12) 305-312
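The last step of the workflow above, analysis of docking score correlations, can be sketched as follows. This is only an illustration and not the method of Xie and Bourne: the scores are made-up placeholder values, the function name `pearson` is hypothetical, and Pearson correlation is an assumed choice of statistic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Placeholder docking scores for the same five molecules docked to the
# primary target and to one off-target (illustrative only, not real data).
primary_scores = [-9.1, -8.4, -7.9, -6.5, -5.2]
off_target_scores = [-8.7, -8.0, -7.1, -6.9, -4.8]

r = pearson(primary_scores, off_target_scores)
print(f"docking score correlation r = {r:.2f}")
```

A high correlation would suggest that molecules scoring well against the primary target also tend to score well against the off-target, which is the kind of signal this step looks for.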
The Problem with Tuberculosis
• One third of global population infected
• 1.7 million deaths per year
• 95% of deaths in developing countries
• Anti-TB drugs hardly changed in 40 years
• MDR-TB and XDR-TB pose a threat to
human health worldwide
• Development of novel, effective, and
inexpensive drugs is an urgent priority
Example 1 – Repositioning The TB Story
Found..
• Evolutionary linkage between:
– NAD-binding Rossmann fold
– S-adenosylmethionine (SAM)-binding domain of SAM-dependent methyltransferases
• Catechol-O-methyl transferase (COMT) is a SAM-dependent methyltransferase
• Entacapone and tolcapone are used as COMT
inhibitors in Parkinson’s disease treatment
• Hypothesis:
– Further investigation of NAD-binding proteins may uncover
a potential new drug target for entacapone and tolcapone
Kinnings et al. 2009 PLoS Comp Biol 5(7) e1000423
Functional Site Similarity between COMT
and InhA
• Entacapone and tolcapone docked onto 215 NAD-binding
proteins from different species
• M. tuberculosis enoyl-acyl carrier protein reductase, ENR (InhA),
discovered as potential new drug target
• InhA is the primary target of many existing anti-TB drugs but
all are very toxic
• InhA catalyses the final, rate-determining step in the fatty acid
elongation cycle
• Alignment of the COMT and InhA binding sites revealed
similarities ...
Kinnings et al. 2009 PLoS Comp Biol 5(7) e1000423
Binding Site Similarity between COMT and
InhA
COMT
SAM (cofactor)
BIE (inhibitor)
InhA
NAD (cofactor)
641 (inhibitor)
Kinnings et al. 2009 PLoS Comp Biol 5(7) e1000423
Summary of the TB Story
• Entacapone and tolcapone shown to have potential for
repositioning
• Direct mechanism of action avoids M. tuberculosis
resistance mechanisms
• Possess excellent safety profiles with few side effects –
already on the market
• In vivo support
• Assay of direct binding of entacapone and tolcapone to
InhA reveals a possible lead with no chemical relationship
to existing drugs
Kinnings et al. 2009 PLoS Comp Biol 5(7) e1000423
Summary from the TB Alliance –
Medicinal Chemistry
• The minimal inhibitory concentration (MIC) of
260 uM is higher than usually considered
• MIC is 65x the estimated plasma
concentration
• Have other InhA inhibitors in the pipeline
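The two figures above imply an estimated plasma concentration that the slide does not state. A minimal sketch of the arithmetic, assuming "65x" means the MIC divided by the estimated plasma concentration (the variable names are illustrative):

```python
# Values taken from the slide.
mic_um = 260.0            # minimal inhibitory concentration, in uM
fold_over_plasma = 65.0   # MIC is 65x the estimated plasma concentration

# Assumption: fold = MIC / plasma concentration, so plasma = MIC / fold.
implied_plasma_um = mic_um / fold_over_plasma
print(f"implied plasma concentration: {implied_plasma_um:.1f} uM")
```

Under that assumption the estimated plasma concentration is about 4 uM, which puts the gap between achievable exposure and the MIC in concrete terms.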
Kinnings et al. 2009 PLoS Comp Biol 5(7) e1000423
Predicted protein-ligand interaction network of M. tuberculosis. Proteins that are
predicted to have similar binding sites are connected. Squares represent the 18
most connected proteins.
Bioinformatics 2009 25(12) 305-312
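The idea of "most connected proteins" in such a network can be illustrated with a degree count. This is a sketch under stated assumptions: the edge list and protein names (`P1` to `P5`) are placeholders rather than real predictions, and `top_connected` is a hypothetical helper, not part of any published tool.

```python
from collections import Counter


def top_connected(edges, k):
    """Return the k nodes with the highest degree as (node, degree) pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree (descending), then by name, for a deterministic order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Each edge joins two proteins predicted to have similar binding sites.
# Placeholder data only.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]

hubs = top_connected(edges, k=2)
print(hubs)
```

With the real network, `k=18` would correspond to the proteins drawn as squares in the figure.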
The TB Druggome
SMAP p-value < 1e-5
drugs
TB proteins
The TB Druggome
p < 1e-7
p < 1e-6
p < 1e-5
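The effect of tightening the p-value cutoff on the drug-target network can be sketched as a simple filter. The prediction tuples below are invented placeholders (hypothetical drug and protein names and p-values), and `links_below` is an illustrative helper, not part of SMAP.

```python
def links_below(predictions, cutoff):
    """Keep (drug, protein, p_value) predictions with p_value < cutoff."""
    return [(d, p, pv) for d, p, pv in predictions if pv < cutoff]


# Placeholder predictions: (drug, TB protein, binding-site similarity p-value).
predictions = [
    ("drugA", "proteinX", 3e-8),
    ("drugA", "proteinY", 4e-7),
    ("drugB", "proteinX", 2e-6),
    ("drugC", "proteinZ", 5e-4),
]

# The three cutoffs shown on the slide.
for cutoff in (1e-5, 1e-6, 1e-7):
    kept = links_below(predictions, cutoff)
    print(f"p < {cutoff:g}: {len(kept)} links")
```

A stricter cutoff keeps fewer, higher-confidence links, so the network becomes sparser as the threshold moves from 1e-5 to 1e-7.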
New Ways of Thinking
• Polypharmacology – One or multiple drugs
binding to multiple targets for a collective
effect aka Dirty Drugs
• Network Pharmacology – Measuring that
effect on the whole biological network
Example 2 - The Torcetrapib Story
PLoS Comp Biol 2009 5(5) e1000387
Cholesteryl Ester Transfer Protein (CETP)
CETP inhibitor
X
CETP
LDL
Bad Cholesterol
HDL
Good Cholesterol
• Collects triglycerides from very low density or low density lipoproteins (VLDL
or LDL) and exchanges them for cholesteryl esters from high density
lipoproteins (and vice versa)
• A long tunnel with two major binding sites. Docking studies suggest that it is
possible that torcetrapib binds to both of them.
• The torcetrapib binding site is unknown. Docking studies show that both
sites can bind torcetrapib with a docking score of around -8.0.
PLoS Comp Biol 2009 5(5) e1000387
Docking Scores (eHiTS / AutoDock)
Off-target | PDB IDs | Torcetrapib | Anacetrapib | JTT705 | Complex ligand
CETP | 2OBD | -11.675 / -5.72 | -11.375 / -8.15 | -7.563 / -6.65 | -8.324 (PCW)
Retinoid X receptor | 1YOW, 1ZDT | -11.420 / -6.600, -6.74 | -8.696 / -7.68, -7.35 | -6.276 / -7.28, -6.95 | -9.113 (POE)
PPAR delta | 1Y0S | -10.203 / -8.22 | -10.595 / -7.91 | -7.581 / -8.36 | -10.691 (331)
PPAR alpha | 2P54 | -11.036 / -6.67 | -0.835 / -7.27 | -9.599 / -7.78 | -11.404 (735)
PPAR gamma | 1ZEO | -9.515 / -7.31 | >0.0 / -8.25 | -7.204 / -8.11 | -8.075 (C01)
Vitamin D receptor | 1IE8 | >0.0 / -4.73 | >0.0 / -6.25 | -6.628 / -9.70 | -8.354 (KH1) -7.35
Glucocorticoid
Receptor
1NHZ
1P93
Fatty acid
binding protein
2F73
2PY1
2NNQ
>0.0/ -4.33
>0.0/-6.13
/-6.40
>0.0/ -7.81
>0.0/ -6.98
/-7.64
-7.191 / -8.49
/-6.33
/6.35
???
T-Cell CD1B
1GZP
-8.815 / -7.02
-13.515 / -7.15
-7.590 / -8.02
-6.519 (GM2)
IL-10 receptor
1LQS
/ -4.59
/ -6.77
GM-2 activator
2AG9
-9.345 / -6.26
-9.674 / -6.98
(3CA2+) CARDIAC
TROPONIN C
1DTL
/-5.83
/-6.71
/-5.79
cytochrome bc1
complex
1PP9 (PEG)
/-6.97
/-9.07
/-6.64
1PP9 (HEM)
/-7.21
/8.79
/-8.94
1V5H
/-4.89
/-7.00
/-4.94
human cytoglobin
/-4.43
/-5.63
/-7.08
/-0.58
/-7.09
/-9.42
/ -5.95
-8.617 / -6.17
???
??? (MYR) -4.16
PLoS Comp Biol 2009 5(5) e1000387
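Cells in the docking score table pair an eHiTS score with an AutoDock score, and some cells have a blank or a ">0.0" in place of a number. A minimal sketch of how such cells could be read into a uniform form follows. The function `parse_cell` is hypothetical, and keeping ">0.0" verbatim instead of converting it is an assumption, since the slide does not define it.

```python
def parse_cell(cell):
    """Split an 'eHiTS / AutoDock' cell into two values.

    Each value is a float, the string '>0.0' (kept verbatim), or None if blank.
    """
    left, sep, right = cell.partition("/")
    if not sep:
        raise ValueError(f"expected 'eHiTS / AutoDock', got {cell!r}")

    def convert(text):
        text = text.strip().replace(" ", "")
        if not text:
            return None          # score missing from the table
        if text.startswith(">"):
            return text          # e.g. '>0.0': keep as-is rather than guess
        return float(text)

    return convert(left), convert(right)


# Cell formats that occur in the table above.
for cell in ["-11.675 / -5.72", ">0.0/ -4.73", "/-6.40", "> 0.0 / -8.25"]:
    print(cell, "->", parse_cell(cell))
```

Reading the cells this way makes it straightforward to compare the two programs per off-target, for example to flag cases where only one of them reports a favorable score.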
[Figure: off-target pathway diagram linking torcetrapib, anacetrapib and JTT705 to VDR, RAS, RXR, PPARα, PPARδ, PPARγ, FA, FABP, and the JNK/IKK and JNK/NF-κB pathways. Effects labeled: high blood pressure, anti-inflammatory function, immune response to infection.]
PLoS Comp Biol 2009 5(5) e1000387
The Future?
Chang et al. 2009 Mol Sys Biol Submitted
Modifications to Early Stage Drug Discovery
Off-targets
http://www.celgene.com/images/celgene_drug_arrow.gif
Systems Biology
Some Known Limitations
• Structural coverage of the given proteome
• False hits / poor docking scores
• Literature searching
• It's a hypothesis – need experimental
validation
• Money
Perceived Limitations
• Mistrust of computational approaches
• Bioinformatics was previously oversold
• Omics was previously oversold
• Still too cutting edge
• No interest in drug resistance
[email protected]
Questions?