Overview of the Biotech Industry

Srinivasan Seshadri, CEO, Strand Genomics
The drug discovery process is currently being transformed by emerging technologies
EMERGING DISCOVERY PROCESS
Basic process, with the current impact of new technologies on each step:
•Identify disease mechanism, define targets: fundamentally new approach
•Identify lead compounds to interact with target: better/faster
•Optimise leads: better/faster
•Biological validation: no major breakthrough
Identify disease mechanism, define targets
Old world
•Molecular biology
•Physiology
•Biochemistry
New world
•Genomics
•Combinatorial chemistry
•High throughput screening
Identify lead compounds and optimise leads
Old world
•Slow, largely manual chemical synthesis of leads
•Slow, manual screening on limited range of assays
New world
•More sophisticated/automated (high throughput) screening has increased lead identification productivity by 30 times
•Rate of compound generation increased by a factor of more than 1,000 through combinatorial chemistry techniques (although it is not clear what percentage are useful)
Biological validation
•Still reliant on in vitro/in vivo models of disease
•Greater use of more sophisticated ‘genetic’ models, but currently complex and slow
Genomics is central to this evolving landscape. The goal of genomics is to unravel the genetic basis of
health and disease, providing a huge array of potential drug targets. Given the complexity of the genome
and the volumes of data being generated, significant challenges exist in accessing and leveraging this
data effectively. New IT-based arenas are emerging to do this.
GENOMICS: A BIOLOGICAL DEFINITION
Genomics is the study of the genetic composition of an organism and provides information on
the structure, role and genetic linkage of genes. Some gene function is implicated in disease
and it is therefore believed that better, more specific information about the origins of a disease
will lead to more effective treatments.
[Figure: paired DNA bases (A-T, T-A, G-C, C-G)]
The characteristics (phenotype) of each individual, and of their organs and tissues, are determined by their genetic makeup (genotype). Every cell contains the full complement of an individual’s genetic material: the genome.
The genome consists of a length of double-stranded DNA to which are attached 4 types of molecule, the bases. There are 3 billion of these in total.
“Genomics represents a paradigm shift in disease treatment from ‘underlying mechanism’ to ‘root cause’”
CSO, Genomic-co
Some of these bases code for proteins (the cell manufactures protein using the DNA template). Others fill in the gaps and have no other function. A gene represents a section of DNA which codes for a protein or other functional piece of cellular machinery (e.g.,
Bioinformatics describes one of two information driven arenas within pharmaceutical drug discovery. . .
Focus of this document: Bioinformatics
BIOINFORMATICS: A DEFINITION
Informatics: Bioinformatics
Description
•Information technology designed/used to generate and access genetic data and derive information from it
Key applications
•Searching external genomic databases
•Constructing and managing proprietary databases
•Extracting information from data
–Gene expression in health and disease
–Gene function in health and disease
Relevance for pharmacos
•Pharmacos need to be able to effectively access external sources
•Pharmacos need to create proprietary databases (derived from data from multiple sources) so that they can be tailored to the needs of internal discovery function
•Targets need to be identified and their role defined leveraging genetic data
Informatics: Cheminformatics
Description
•Information technology used to design molecular libraries to interact with identified targets
Key applications
•Molecular structure design
•Structure-activity relationships
•Molecular library management/manipulation
Relevance for pharmacos
•IT solutions required to manage the increasing scale of molecule generation with discovery process
•Predominantly addressed within pharmacos although may require leveraging links with partners and across
. . . and currently assists in leveraging genetic data. As the arena develops, the current boundaries with
cheminformatics are likely to blur
INFORMATICS USED IN DRUG DISCOVERY
Bioinformatics
Stages: define target and target role, moving from gene sequence to protein sequence to protein structure to function/active site
Activity
•Trawl genomic databases for genes of interest
•Define amino acid sequence derived from gene
•Determine structure of protein and how it folds into active molecule
•Define protein activity
Cheminformatics
Stages: identify lead compounds which interact with targets, then optimise leads
Activity
•Define likely molecule structures to interact with target
•Trawl molecular databases for likely activity against target
•Refine search/development of lead compound
•Links with combinatorial chemistry and high throughput screening
“Analysing sequence data is just the starting point for bioinformatics – the key step will be relating that data back to protein structure”
J. Craig Venter, TIGR
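As an illustration of the activity "define amino acid sequence derived from gene", here is a minimal Python sketch. It assumes the standard genetic code and a DNA coding sequence read from its first base; the names `CODON_TABLE` and `translate` are hypothetical and do not come from the original slides.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reading starts at the first base and stops at the first stop codon
    or when fewer than three bases remain.
    """
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


protein = translate("ATGGCCTGGTAA")
```

Real gene sequences need more care than this (introns, reading frame selection, alternative genetic codes), which is part of why the later steps in the list are harder than sequence lookup.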
Many significant hurdles remain before the value of bioinformatics can be fully exploited
KEY TECHNOLOGICAL AND SCIENTIFIC HURDLES/CHALLENGES
Key challenges for bioinformatics (size of challenge scale: High, Medium, Low)
Technology/IT-based
•Developing tools which improve human-computer interface
–Why important: Lower the barriers for effective use of computers by multiple disciplines to access databases and translate data into user-friendly format
•Allowing disparate systems to interface with genomic databases
–Why important: IT architectural differences constrain access to databases
•Developing industry standard low cost infrastructure to access databases (internal and external)
–Why important: Developing proprietary infrastructure is often too costly/time consuming
•Increasing the efficiency and effectiveness of database mining
–Why important: Current database data mining generates vast quantities of data irrelevant to search criteria
Science-based
•Predicting tertiary protein structure, currently carried out using laborious methodologies of moderate efficacy, e.g., X-ray crystallography
–Why important: Most drug targets are proteins; need to have clear understanding of role of protein to drug design
•Predicting protein function, currently expensive and time consuming
–Why important: Experimental methodologies being developed, e.g., gene knockouts, although time consuming and ill-developed
•Understanding how genes and proteins are expressed/modified in vivo, currently unknown
–Why important: Gene structure predicted from genetic sequences may not reflect the gene expressed in vivo, and proteins can be modified into alternative structures with differing function to those predicted
Source: interviews, press search
The current activity is only a small part of what bioinformatics (integrating with the other emerging
technologies) could contribute to the way we understand and treat disease
BIOINFORMATICS HIERARCHY OF POSSIBILITIES
In silico research
•Replacing animal-based ‘wet biology’ with computer-based predictive models (status: very preliminary)
•Replacing crystallography to determine protein structure with predictive models (status: in early development)
Increasing the productivity of discovery
•Generating better information about disease and patient populations allowing better targeting of clinical trials (status: in medium stage development)
•Mining genetic databases from normal and diseased populations to elucidate gene function (status: ongoing)
Increasing the effectiveness and efficiency of genomic database mining
•Increasing the effectiveness/efficiency of generating information from genomic data (status: ongoing)
Source: interviews, articles
To date, bioinformatics has developed symbiotically with genomics. It is now emerging as a field in its
own right
BIOINFORMATICS INDUSTRY EVOLUTION: A DESCRIPTION
Academia-driven
Multiple academic groups leveraging existing IT competencies to develop insights into identification and role of genes
Genomics-driven
As genomic data becomes easier to generate, genomic companies (positional cloners and sequencers) develop IT systems to facilitate access to genome sequences
Key differentiating factor is scope and scale of gene databases to which clients have access
Gene function-driven
Focus of effort becomes role and function of genes, and in particular, gene products. Organisations develop IT skills to push the boundaries of knowledge (e.g., predicting protein structure-activity relationships).
Key differentiating factor is becoming a company’s ability to provide bioinformatic solutions to extract ‘information’ from genes
Source: Team interviews
BIOINFORMATICS INDUSTRY EVOLUTION: KEY MILESTONES
Phases: academia-driven, then genomics-driven, then gene function-driven
Key Scientific Milestones
•1972: First DNA cloning (Boyer & Cohen)
•1977: Chemical method for sequencing DNA devised (Gilbert & Maxam)
•1981: First 579 human genes mapped
•1983: Method for automated DNA sequencing (Caruthers & Hood)
•1983: Huntington’s disease gene demonstrated to be on chromosome 4 (Gusella)
•1991: Expressed sequence tags (ESTs) created (Venter)
•1992: First genetic linkage map of entire human genome published, and first whole human chromosome physical maps (Y and 21)
Genomics/Bioinformatics Industry Activity
•1977: First genetic engineering company, Genentech, founded
•1982: GenBank established
•1988: Human Genome Organisation (HUGO) founded
•1990: Human Genome Project launched
•1993: Incyte goes public, the first of many U.S. genomic companies to do so
•1996/7: Genomic industry broadening value proposition
•1997: Emergence of bioinformatic players with no genomic heritage, e.g., Pangea, MAG, NetGenics
SUMMARY
TECHNOLOGIES
Three broad enabling technologies are driving progress in drug R&D :
• Genomics leads to better disease understanding and target
identification
• Combinatorial chemistry generates more lead compounds
• High throughput screening tests more leads on a greater number of
targets
BIOINFORMATICS
The explosion of data and the increasing demand for sophisticated
analytical tools have given rise to a rapidly growing bioinformatics market
with three major service areas :
• Database providers who generate and organize genome and discovery
data
• Discovery software providers who provide cutting-edge IT solutions to
elements of the discovery process
• Research enterprise ASPs who integrate multiple databases and analysis
tools into a single platform
THREE BROAD TECHNOLOGIES ARE DRIVING DRUG DISCOVERY
GENOMICS
• Study of both structural and functional aspects of the genome, including both genes and proteins, leading to a greater understanding of cellular processes and disease
CATALYTIC/COMBINATORIAL CHEMISTRY
• Rapid and systematic generation of a variety of molecular entities, or building blocks, in many different or unique combinations
HIGH THROUGHPUT SCREENING
• Use of robotic automation to allow for massively parallel experimentation and testing of many compounds or targets
Supported by BIOINFORMATICS
MANY SPECIFIC EMERGING TECHNOLOGIES HAVE LED TO THE ADVANCES IN
GENOMICS, COMBINATORIAL CHEMISTRY AND HIGH THROUGHPUT SCREENING
GENOMICS
• Antisense
• Transgenics
• Gene therapy
• Pathway mapping
• Surrogate markers
• Animal-free disease models
• Genetic networks
• HT DNA sequencing
• HT proteomics
• Biochip microarrays
• Pharmacogenomics
• Biosensors
CATALYTIC/COMBINATORIAL CHEMISTRY
• Synthetic biopolymers
• Biochemical drug delivery and encapsulation systems
• Intelligent chemical systems
• CC library arrays
• Chemical chips
HIGH THROUGHPUT SCREENING
• Lab automation
• Micromachines/miniaturization
• Advanced biophysical assays
Note : HT = High Throughput; CC = Combinatorial Chemistry
BIOINFORMATICS IS THE “BRAINS OF BIOTECHNOLOGY”
In order for Genomics, HTS, and combinatorial chemistry to have impact, they must increasingly rely on
bioinformatic capabilities.
BIOINFORMATICS
“Broad science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information” (1)
[Diagram: GENOMICS, CATALYTIC/COMBINATORIAL CHEMISTRY and HIGH THROUGHPUT SCREENING, supported by BIOINFORMATICS]
(1) Science, “Bioinformatics in the Information Age”, April 2000; 287; 1221
Source : “Brains of Biotechnology” is from Karl Thiel, Biospace.com
NEW TECHNOLOGIES ARE DRIVING THE NEED FOR BIOINFORMATICS
DATA AND ANALYSIS CAPABILITIES
[Diagram: the new technologies (GENOMICS, CATALYTIC/COMBINATORIAL CHEMISTRY, HIGH THROUGHPUT SCREENING) drive both an explosion of data and a need for analysis tools]
EXPLOSION OF DATA (demand for different types of databases)
• Gene (DNA) sequences
• Protein sequences
• SNP mapping and disease mapping
• Gene expression profiles by tissue, species, and drug influence
• Protein expression profiles
• Protein:protein interaction profiles
• Protein structure information
• CC libraries
• Screening activity data (SAR)
• Toxicology databases
ANALYSIS TOOLS (demand for discovery software)
• Sequence alignment searches (BLAST)
• Relational alignment programs (phylogeny)
• Virtual lab processes software (PCR, elongation)
• Protein folding algorithms
• Structure-based target design using virtual SAR modeling
• Virtual CC generation and screening
• ADME and toxicology profiling software
The global market for bioinformatics is expected to show significant growth
over the next five years. However, the industry is still in its infancy, which
casts doubt on the credibility of estimates from research houses
Growth in Global Bioinformatics
[Chart: market size of $300m in 1998, rising to $5Bln in 2003, with estimates as high as $10-20Bln]
•Numbers likely to include
–Software solutions
–Automation tools
–“Hardware” such as microarrays
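The growth implied by these figures can be checked with a short Python sketch. It assumes the chart is read as $300m in 1998 and somewhere between $5Bln and $20Bln in 2003; that reading, and the helper name `cagr`, are assumptions rather than anything stated on the slide.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


START_1998 = 300e6   # $300m
YEARS = 5            # 1998 to 2003

# Assumed 2003 end points taken from the chart labels.
for label, end_2003 in [("$5Bln", 5e9), ("$10Bln", 10e9), ("$20Bln", 20e9)]:
    rate = cagr(START_1998, end_2003, YEARS)
    print(f"{label} in 2003 implies about {rate:.0%} growth per year")
```

Even the lowest end point implies the market growing by roughly three quarters every year, which helps explain the caution about research-house estimates.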
The current biotechnology market in India is focused on the AgBio, Industrial
and Vaccine sectors, but will see emerging opportunities in Bioinformatics and
vaccines
Growth in Indian Biotechnology
[Chart: 1998, 100% = $500m: Industrial 22%, Ag Products 25%, Health Products 47%. 2003, 100% = ??: adds Bioinformatics, Genome Technologies and Vaccines]
The future will witness additional opportunities in Informatics and related genome-based technologies
Source: Biosupportindia
Projected growth in Pharma and Biotech R&D spending will enable the
industry to attain its projected targets
Pharmaceutical R&D Budgets
[Chart: 100% = $100Bln. Segment values: $46B, $20B, $13B, $13B, $7B. Category labels: Discovery, PreClinical, Clinical, CMC, Clinical Trials, Production/manufacturing]
Typical IT budgets will be 10-20% of total R&D
Source: PhRMA
The Lehman Report consisted of interviews with decision makers in Pharma
and Biotech and highlighted some interesting findings
Summary of Findings
•New Biology will significantly increase R&D costs, a large chunk of which are technology driven
–Companies will see substantial pressure on earnings
•Attempts to use “today's relatively immature technology” will result in higher failure rates amongst “novel” targets. These failures will likely also stretch out the time period for the arrival of the new drugs that Genomics promises
•High risk of “novel target failure”
–Less understood (only 8 publications per novel target vs. 100 for those generated by conventional methods)
–Companies pushing these less understood targets through the drug pipeline
–Traditional chemical technologies will not be sufficient to identify novel chemical entities that can interact with a target; this could have adverse outcomes during the clinical trial process
Source: & Company
Genomics-influenced increase in R&D costs
Assuming no increase in technology: costs more than double from the current level
[Chart: total annual R&D budget, NCE output, and annual R&D budget per NCE for 1995, 2000, 2005 and 2010]
Genomics-influenced increase in R&D costs
Assuming moderate increase in technology: promise of productivity expansion
[Chart: total annual R&D budget, NCE output, and annual R&D budget per NCE for 1995, 2000, 2005 and 2010]
Most technologies are likely to make an impact only 5-10 years from today
5-10 YEARS FROM IMPACT
Integrated Technologies*
[Chart: value (scale 0 to 7) against time, 2000 to 2015. Milestones plotted: map out biological pathways; sequence human genome; delineate disease mechanisms; map out human proteome; map out human genome]
“We are still years away from the real impact of Genomics technology. Most of them have just got started”
-Biotech Executive
*Integrated technologies include both experimental and informatics approaches
Protein Chips
[Chart: value (scale 0 to 6) against time, 2000 to 2015. Milestones plotted: identify differential expression; profile complex diseases; identify some cellular proteins; identify key post-translational modifications]
“It will be a few years before we have a protein chip that is cheap, fast and accurate”
-Biotech Executive
“Proteomics will be a big help with target validation. However, we still need to increase speed and improve productivity”
-Pharma R&D executive
Bioinformatics data mining
[Chart: value (scale 0 to 7) against time, 2000 to 2015. Milestones plotted: assign single function based on functional genomics data; basic protein structure; homology queries; correlate expression data and protein interaction data; correlate gene/protein expression data with function]
“Most of the data mining algorithms are pretty primitive and straightforward today”
-Biotech Executive
“We are facing more explosive data produced by Genomics technologies. Unfortunately, the informatics tools are still not there to allow us to explore them fully”
-Pharma R&D executive
Large investments are necessary to reap the benefits of technology
THRESHOLD LEVEL OF INVESTMENT NECESSARY
Threshold annual expenditures, and key means/technologies to achieve impact at each bottleneck:
INFORMATICS: $20-40m
•Bioinformatics
•Chemoinformatics
•Clinical Informatics
TARGET VALIDATION: $20-40m
•Functional Genomics tools
•Database subscriptions
LEAD OPTIMIZATION: $10-20m
•Closed loop chemistry
•ADME
•HTS
EXPLORATORY DEVELOPMENT: $20-30m
•Process improvements
•Pharmacogenomics
•Computer aided trial design
There are three broad organizational models emerging
BIOINFORMATICS PRODUCT/SERVICE MODELS
Gene Database Designer
•Provide user friendly access to proprietary and public gene databases compatible with customer IT architecture
•Requires bioinformatic and genomic competencies/assets
•Assumes customer does not need to develop significant in-house capabilities
Discovery Services Provider
•Conduct discrete stages of discovery process
•Requires broad informatic and drug discovery capabilities
•Value proposition built on superior informatic capabilities
IT Architects
•Provide off-the-shelf/bespoke informatic solutions
•Requires leading edge bioinformatic capabilities
•Assumes customer has in-house skills and competencies to be able to leverage and manipulate genetic data
“The trouble is that bioinformatics is so new, and the market so ill-defined, that companies are having difficulty settling on the business model they will follow”
In Vivo
Source: ; press search
THESE DEMANDS FOR BIOINFORMATICS ARE ADDRESSED BY
THREE MAJOR SERVICE MODELS...
[Chart: service models positioned by database focus (narrow to integrated) and analytical capabilities (simple to complex)]
RESEARCH ENTERPRISE ASPs (integrated database focus)
• Provide user friendly interface that can access both off-the-shelf bioinformatics software and more sophisticated IT solutions
• Require extensive IT capabilities
DATABASE PROVIDERS (narrow database focus, simple analytical capabilities)
• Provide access to proprietary and public databases, e.g., gene and protein sequences
• Require data acquisition assets (e.g., Genomics heritage) along with solid bioinformatics capabilities
DISCOVERY SOFTWARE PROVIDERS (narrow database focus, complex analytical capabilities)
• Provide cutting-edge computational solutions to discrete components of the discovery process
• Require extensive expertise in drug discovery and bioinformatics capabilities
Source: analysis
...AND MANY PLAYERS HAVE ADOPTED EACH SERVICE MODEL
[Chart: companies positioned by database focus (narrow to integrated) and analytical capabilities (simple to complex), grouped as RESEARCH ENTERPRISE ASPs, DATABASE PROVIDERS and DISCOVERY SOFTWARE]
Companies shown: eBioinformatics, DoubleTwist, NetGenics, Base4, Viaken, Bioreason, Strand, Celera Genomics, Incyte, Structural GenomiX, Compugen, Tripos, Hyseq, Molecular Simulations, Spotfire
Source: analysis; company websites
It is not yet clear which, if any, of the current approaches will prove sustainable
CORE BELIEFS AND CHALLENGES FOR EACH BUSINESS MODEL
Gene Database Designer
Core beliefs
•Databases are sufficiently fragmented to make it inefficient for pharmacos to ‘go it alone’
•Ability to remain ahead of pharmacos vis-à-vis technological innovation
•Genomic heritage a prerequisite for success
Issues
•Multiple public databases challenging role of proprietary databases
•Pharmacos are developing skills to create bespoke databases in-house
•Real risk that skill could become a commodity (e.g., cost of sequencing a bacterial genome fell from $12m to $0.5m in 1997)
IT Architects
Core beliefs
•IT skills, not knowledge of Genomics, are the defining basis of competition
•IT solution will not emerge from existing pharmaco IT suppliers
•Ability to remain ahead of other entities vis-à-vis technological innovation
Issues
•Unclear who are the natural owners/developers (“several pharmacos have thought about this longer than we have . . . we need to stay on the cutting edge” VP S&M Molecular Applications Group)
•Clear potential for non-pharma IT players to enter market
•Potential commoditisation of services
Discovery Services Provider
Core beliefs
•Pharmacos will increasingly seek discovery-oriented solutions requiring broader skill set (increasing proportion of research investments are external)
•Value creating in longer term as provides a base for full integration
Issues
•Not clear under which conditions pharmacos will outsource discovery functions
•Issues of skills, critical mass and focus present real challenges to companies developing from a Genomics/IT heritage
Source: Team interviews; articles
The traditional genomic companies are polarising into two categories: those that design databases, and
those that are broadening their value proposition to encompass ‘discovery’ offerings. The new breed of
bioinformatic companies are establishing themselves in a third category: IT architects
CATEGORISING TODAY’S BIOINFORMATICS COMPANIES
Product services providers
Gene database designers
Building and distributing annotated gene databases and services from public and private sources
•Alphagene
•Digital Gene Technologies Inc.
•Genome Therapeutics Corp.
•Human Genome Sciences Inc.
•Hyseq
•Incyte Pharmaceuticals
•Myriad Genetics
•Sequana Therapeutics
IT architects
Building IT systems to enable the sequencing, synthesis and access of genomic data
•Base 4 Bioinformatics
•Genecodes
•GeneTrace Systems
•Genomica Corp
•Informax Inc.
•MDL Information Systems Inc.
•Molecular Applications Group
•Molecular Informatics Inc.
•NetGenics
•Oncormed
•Oxford Molecular
•Pangea Systems Inc.
•PE Applied Biosystems
Discovery Services Provider
Conducting discrete stages of the drug discovery process using proprietary systems and knowledge
•Acacia Biosciences
•Affymetrix
•Ariad Pharmaceuticals
•Chiroscience (acq. Darwin Molecular)
•Exelixis Pharmaceuticals Inc.
•Genelogic
•Genetech
•Millennium
•Mitokor
•Ontogeny
•Pharmagene
•Progenitor
•Structural Bioinformatics Inc.
•Xenometrix
Source: Annual reports; text lines; interviews; team analysis
MOST OF THE LATEST R&D TECHNOLOGIES WERE DEVELOPED OUTSIDE BIG
PHARMA
Genomics
Cheminformatics
Bioinformatics
Transgenic animals
High throughput screening
Pharmacogenomics
Combinatorial chemistry
Proteomics
Molecular modelling
Antisense
HT DNA Sequencing
Technology basics
• Typically, a sample of DNA is amplified using PCR* with specific fluorescent probes for A, G, T and C, separated by electrophoresis using automated technology, and the DNA sequence is then analyzed.
• Used for sequencing of both genomic DNA and expressed genes (cDNA)
• Supplements DNA mapping and positional or functional cloning
Competitive landscape
• Many players are involved in sequencing the genome, contributing to both proprietary and public databases :
– Public : Human Genome Project
– Human Genome Sciences
– Incyte Genomics
– Celera Genomics
Status and current issues
• Entire human genome will be sequenced by end of 2001 (Celera appears to be leading the way)
– All 3 billion nucleotides, on 23 pairs of chromosomes, composing about 100,000 genes
• Sequencing does not provide any insights about gene function, merely a blueprint for proteins
• Viability of business model for companies only sequencing DNA is questionable. Most recognize need to move towards functional Genomics and protein studies
• Patents on genes or gene fragments (expressed sequence tags, or ESTs), without annotated function data, are not likely to be approved
DNA SEQUENCING TECHNOLOGY
[Chart: nucleotides sequenced per day, from 1000s in 1990 (old method) to 1,000,000s** in 2000]
* PCR refers to Polymerase Chain Reaction, a technique for amplifying specific sequences of DNA
**Celera’s shotgun approach and powerful computers can sequence 11,000,000 nucleotides per day
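The throughput figures above translate directly into sequencing time. The Python sketch below divides the 3 billion nucleotides of the genome by a daily rate; the `coverage` parameter (how many times each base is read) and the function name are assumptions added for illustration, and the default of a single pass is a simplification.

```python
import math

GENOME_SIZE = 3_000_000_000  # nucleotides in the human genome, per the slide


def days_to_sequence(nucleotides_per_day, coverage=1):
    """Whole days needed to read the genome `coverage` times at a given rate."""
    return math.ceil(GENOME_SIZE * coverage / nucleotides_per_day)


old_method_days = days_to_sequence(1_000)       # order of magnitude for 1990
celera_days = days_to_sequence(11_000_000)      # rate quoted in the footnote

print(f"1990 rate: about {old_method_days / 365:,.0f} years")
print(f"Quoted Celera rate: {celera_days} days for a single pass")
```

Shotgun sequencing reads each region several times over, so a realistic estimate would use a `coverage` greater than 1 and would add the time spent on assembly.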
HT Proteomics
Technology basics
• Analysis of proteins and protein expression in diseased and normal states
• Proteomics deals with two areas:
– Protein sequence, expression, and modification analysis using techniques of protein separation, including 2-dimensional electrophoresis (2-DE) and protein chips, and identification, typically involving mass spectrometry
– 3D structure analysis by X-ray crystallography and nuclear magnetic resonance (NMR), as well as complex computer modeling. These structures are useful for structure-based drug design.
Competitive landscape
• Fewer companies are engaged in HT proteomics work than HT DNA sequencing
• Key players in HT proteomics :
– Oxford GlycoSciences
– Large Scale Proteomics Corp.
– Proteome Inc.
– Ciphergen Biosystems
• Players in 3D protein folding (mostly software) include :
– Structural GenomiX, Inc
– Structural Bioinformatics
– Bio-IT Ltd.
Status and current issues
• Protein function depends on 3-D structure and, at present, even the best computer software is not good at modeling protein structure
• Understanding how proteins are modified after expression, especially in the presence of drugs and/or disease, will dramatically aid drug development
PROTEIN ANALYSIS TECHNOLOGY
[Chart: proteins analyzed per day, from fewer than 1 in 1990 to 1000s in 2000 and 100,000s with prototypes*]
* Prototypes, which should be commercial within 2 years, involve high throughput separation techniques (HPLC) and advanced mass spectrometry (MALDI-TOF)
Source: Science journals, popular press, public biotechnology reports
Biochip Microarrays
Technology basics
• Biochip microarrays are ordered sets of known molecules (DNA, proteins, etc.) attached to a solid support (silica, fibers, etc.) that allow for a vast number of parallel experiments in miniature.
• DNA chips are made either by “building” short sequences of DNA on chips or by attaching pre-made oligonucleotides (short pieces of DNA) to the chip
• Expressed cDNA prepared from samples is then allowed to interact with the DNA on the chips and these interactions are detected.
• This same principle can be applied with proteins and small molecules
Competitive landscape
• Current market for biochips is about $175 million, and is dominated by Affymetrix; however, many new players are entering the market with alternative chip technologies :
– Nanogen (electroactive chips)
– Illumina (fiber optic bead-based)
– Sequenom (“industrial Genomics” with mass spectroscopy)
– Ciphergen (protein chips)
• Affymetrix business model : It nearly “gives away” a detection machine ($175,000) and then hopes to make money from the sale of its disposable GeneChips (razor blade approach)
Status and current issues
• As of today, chips with ~250,000 probes are commercially available; in the near future, probes representing entire genomes should be available
• “The use of DNA arrays to interrogate biological information represents a paradigm change that will profoundly alter biology and medicine”
Dr. Leroy Hood, University of Washington
• Uses for biochip microarrays are exploding :
– gene sequencing
– polymorphism identification
– genetic testing
– gene expression profiling
– toxicology analysis
– forensics
– immunoassays
– proteomics
– drug screening
Source: Literature, BioInsight
GENE CHIP MICROARRAYS ARE SMALL GRIDS CONTAINING PIECES
OF DNA
Technology Basics
• Gene (or DNA) chips are grids
• Each square (feature) on the grid contains the same known repeating DNA sequence
[Figure: a single feature holding many copies of the same sequence, TTA]
• Different squares contain different sequences
[Figure: grid of features, each holding a different three-base sequence, e.g. TTA, TTC, TTG, TTT, TAA, TAC]
• Add mixture of unknown fluorescently labeled probes to DNA chip
• Probes stick (hybridize) to squares that have a matching (complementary) sequence to the probe
• A laser reads out which squares the probes stick to
• Software makes the information intelligible
Because the DNA sequence is known at each location on the DNA chip, unknown probe sequences can be determined by monitoring where on the DNA chip these probes stick
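The readout logic of a DNA chip can be sketched in a few lines of Python. The sketch assumes perfect base pairing (a probe sticks only where it contains the exact reverse complement of a feature) and ignores signal intensity; the chip layout, probe sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence that base-pairs with `seq`, read in the conventional direction."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridize(chip, probes):
    """Return the grid positions whose feature is bound by at least one probe."""
    lit = set()
    for position, feature in chip.items():
        target = reverse_complement(feature)
        if any(target in probe for probe in probes):
            lit.add(position)
    return lit


# Hypothetical 2 x 2 chip: (row, column) -> known feature sequence.
chip = {(0, 0): "TTA", (0, 1): "TTC", (1, 0): "TAG", (1, 1): "GCA"}
# Unknown labeled probes washed over the chip.
probes = ["GGTAACC", "CTGCA"]

lit_squares = hybridize(chip, probes)
```

Because every position maps to a known feature, the set of lit squares tells us which short sequences are present in the probe mixture, which is the inference described in the last paragraph above.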
BIOCHIPS CAN BE USED IN GENE EXPRESSION MONITORING AS A
POWERFUL TOOL FOR IDENTIFYING KEY GENES INVOLVED IN OR
AFFECTED BY DISEASE PROCESSES
Technology basics
Approach (steps in order)
• Find healthy and diseased individuals
• Isolate healthy and diseased tissues
• Isolate RNA from each sample (RNA tells us which genes are turned on)
• Make fluorescently labeled probes from RNA (probes are pieces of DNA that represent genes which are turned on)
• Expose DNA chips (which have thousands of known genes on them) to probes – probes will only stick to DNA chips in certain locations (see next page)
[Diagram: for healthy tissue and for diseased tissue, cell DNA is transcribed to RNA, probes are made from the RNA, and the probes are applied to a DNA chip]
• Compare readouts from chips exposed to healthy and diseased samples (probes)
• Differences (dashed boxes) indicate genes that may be involved in the disease process
• Gene products (proteins) from these genes may serve as good disease targets, therapeutics, or markers
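The comparison of healthy and diseased readouts can be sketched as a fold-change filter in Python. The two-fold threshold, the intensity floor, the gene names and the readout values are all assumptions for illustration; real analyses also normalise the chips and test for statistical significance.

```python
def differential_genes(healthy, diseased, min_fold=2.0, floor=1.0):
    """Flag genes whose signal differs by at least `min_fold` in either direction.

    `healthy` and `diseased` map gene names to fluorescence intensities.
    Intensities below `floor` are raised to it to avoid dividing by zero.
    Returns {gene: diseased / healthy ratio} for the flagged genes.
    """
    flagged = {}
    for gene in healthy.keys() & diseased.keys():
        healthy_signal = max(healthy[gene], floor)
        diseased_signal = max(diseased[gene], floor)
        fold = diseased_signal / healthy_signal
        if fold >= min_fold or fold <= 1 / min_fold:
            flagged[gene] = fold
    return flagged


# Hypothetical chip readouts for three genes.
healthy_readout = {"gene_1": 100, "gene_2": 100, "gene_3": 400}
diseased_readout = {"gene_1": 110, "gene_2": 450, "gene_3": 50}

candidates = differential_genes(healthy_readout, diseased_readout)
```

Genes flagged this way are only candidates: as the slide says, they may be involved in the disease process, and establishing that takes follow-up validation.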
THE COMPETITIVE LANDSCAPE FOR BIOCHIP ARRAYS IS
HEATING UP AS THE TECHNOLOGY RAPIDLY EVOLVES
Competitive Landscape
Five example companies and their technologies
Affymetrix
• Disposable GeneChip array has oligos* attached to it by photolithography
• Early leader in biochip development
• Oligos are bound by fluorescent probes
Nanogen
• Pre-made oligos are bound to reusable semiconductor chip
• Electroactive spots on chip direct and move attached oligos, which interact with fluorescent probes
Illumina
• Oligos (or drugs, proteins) are attached to microbeads, which self-assemble onto the tips of fibers in an optical fiber bundled microarray
• Analyzed by fluorescence with fiber optics
Sequenom
• MassArray chips have oligos attached to them
• Analysis by laser-ionization and mass spectroscopy
• Called “industrial Genomics”
Ciphergen
• ProteinChip array has defined proteins (like antibodies) bound to it which interact with ligands in the sample
• Analyzed by laser-ionization and mass spectroscopy
Uses of biochip microarrays continue to explode:
• Gene sequencing
• Polymorphism identification
• Genetic testing (genotyping)
• Gene expression profiling
• Toxicology analysis
• Forensics
• Immunoassays
• Proteomics
• Drug screening
• Many others
Over 75 public and private biotech firms make biochip technology
* Oligos are oligonucleotides, or short (25 bases) sequences of DNA
Sources : Press reviews, scientific journals, company reports
WHILE AFFYMETRIX HAS DOMINATED THE BIOCHIP MARKETPLACE,
STRONG COMPETITION FROM NEW BIOCHIP TECHNOLOGIES WILL LIKELY
FRAGMENT THE SECTOR FURTHER
Competitive Landscape
Market share, percent, 1999 (1999 market ~$176 million)
[Pie chart. Segments: Other, Affymetrix, Homemade, Ciphergen, ACLARA, Caliper, Incyte, Phase-1. Values as extracted: 11%, 43%, 24%, 2%, 2%, 6%, 9%, 3%]
Trends in competitive landscape
• New biochip technology players will cut into Affymetrix’s market share
• Biochip market expected to grow to ~$1 billion by 2005
• Use of homemade chips will likely decrease as complexity and versatility of commercial chips increase
• The market for hardware and bioinformatic software for chip detection and data collection/analysis will also explode
Source: Literature; BioInsights
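The growth rate implied by the two market figures on this slide (~$176 million in 1999, ~$1 billion by 2005) can be worked out with a short calculation. The function name is illustrative, and both inputs are the slide's approximate figures, so the result is a rough estimate only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Slide figures, in $ millions: ~176 in 1999, ~1,000 expected by 2005.
implied = cagr(176.0, 1000.0, 2005 - 1999)
print(f"Implied biochip market growth: {implied:.1%} per year")
```

On these inputs the market would have to grow by roughly a third every year for six years to reach the forecast.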
Pharmacogenomics
Technology basics
• Every individual has a distinct set of “polymorphisms”, or gene variants. These variants could lead to enhanced or diminished responses to therapy.
• Pharmacogenomics applies genetic testing techniques to identify the variants that are predictive of a patient’s response to a therapeutic agent
• Pharmacogenomics can be used to:
– increase the likelihood of a drug’s success in the clinic by identifying patients who are more likely to respond to the drug
– rescue drugs that previously failed or were taken off the market for safety concerns by identifying safe patient populations
Competitive landscape
• Key players include:
– Genset is working on a map of SNPs for clinical testing (with Abbott Labs)
– Other companies include:
· Affymetrix
· Celera
· GeneLogic
· Incyte
· LJL Biosystems
· Lynx Therapeutics
· Millennium Predictive Medicine
Status and current issues
• Pharma community is in agreement that pharmacogenomics is important, but its effects are uncertain:
– “The FDA has asked us (senior pharma people) to come in and discuss pharmacogenomic testing with them”
B. Michael Silber, Director of Clinical Diagnostics, Pfizer
– “Rescuing drugs has the potential to absolutely take off, or it might not”
Greg Miller, Head of Molecular Profiling, Genzyme
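The idea of identifying patients who are more likely to respond can be illustrated with a small stratification sketch. Everything in it is hypothetical: the genotype labels, the patient records and the 50% cut-off are made up, and a real analysis would need proper statistics and far larger samples.

```python
from collections import defaultdict


def response_rate_by_genotype(patients):
    """Return the fraction of responders for each genotype.

    patients: iterable of (genotype, responded) pairs, responded being a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # genotype -> [responders, total]
    for genotype, responded in patients:
        counts[genotype][1] += 1
        if responded:
            counts[genotype][0] += 1
    return {g: r / n for g, (r, n) in counts.items()}


# Made-up trial records: (variant at one hypothetical SNP, responded to drug?)
patients = [
    ("A/A", True), ("A/A", True), ("A/A", False),
    ("A/G", True), ("A/G", False),
    ("G/G", False), ("G/G", False),
]
rates = response_rate_by_genotype(patients)
likely_responders = {g for g, rate in rates.items() if rate >= 0.5}
print(rates, likely_responders)
```

The same grouping applied to adverse events instead of responses would correspond to the "safe patient populations" use mentioned on the slide.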
Lab Automation
Technology basics
• With the explosion of compounds from combinatorial chemistry and the accelerated identification of gene targets from Genomics, the ability to analyze and screen compounds becomes the critical rate-limiting step. Highly automated lab technologies have therefore developed in four major areas:
– Microplate readers and equipment
– Liquid handling, manipulating, and dispensing devices
– Robotics
– Software to control the process
Competitive landscape
• Key players include:
– Robotics: LJL Biosystems, Robocon, Zymark
– Microplate: Perkin Elmer, Molecular Devices, Dynex
– Liquid: Beckman Coulter, Gilson
– Software: Oxford Molecular Group, Tripos, MSI, MDL Information Systems
Status and current issues
• Likely to see high growth in the next few years as lab automation increases
• Miniaturization will lead to lower reagent costs; likely value shift to equipment and software
• Huge need for quality bioinformatics software that is capable of data acquisition/collection as well as data analysis and storage.
Market breakdown by sector, 1998 (total market $1.1 billion)
[Pie chart. Segments: Robotics and Software; Microplate-related equipment; Liquid Handling/Manipulating/Dispensing (dispensers, workstations, organic synthesizers, solid-phase extraction devices). Values as extracted: 16%, 41%, 43%]
Lab Automation market (WW), $ millions, 13% CAGR
1993: 577
1998: 1,100
2003E: 2,100
Source: Literature, Genetic Engineering News
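As a quick consistency check, the chart's three data points (577 in 1993, 1,100 in 1998, 2,100 estimated for 2003, in $ millions) can be compared with the quoted 13% CAGR. The year-to-value pairing is read from the chart labels and the variable names are illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# (year, worldwide lab automation market in $ millions), as read from the chart
market = [(1993, 577.0), (1998, 1100.0), (2003, 2100.0)]

interval_rates = {}
for (y0, v0), (y1, v1) in zip(market, market[1:]):
    interval_rates[(y0, y1)] = cagr(v0, v1, y1 - y0)

for (y0, y1), rate in interval_rates.items():
    print(f"{y0}-{y1}: {rate:.1%} per year")
```

Both five-year intervals come out a little under 14% per year, in line with the roughly 13% shown on the chart.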
Database suppliers/designers
Technology basics
• Provides remote access to their proprietary database, as well as public ones; typically using an internet or intranet platform
• Data acquisition skills (e.g., DNA sequencing heritage) are a prerequisite for success in this segment
• Generally, three main revenue models:
– Subscription-based access
– Royalties-based and shared risk
– Fee-for-service
Competitive landscape
• Key players include:
– Celera Genomics (subscriptions to gene database, ESTs)
– Incyte Genomics (online “Incyte 2.0”: LifeSeq and LifeExpress databases)
– Human Genome Sciences (exclusive databases for Human Gene Consortium)
– GeneLogic (Expression databases)
– AlphaGene (DNA)
– Hyseq (GeneSolutions.com provides access to proprietary data)
– Myriad Genetics (ProNet, a protein:protein interaction database)
– Sequana
– Genset (SNPs database)
– Orchid Biosciences (SNPs)
– Oxford GlycoSciences (LifeExpress with Incyte)
Status and current issues
• Multiple public databases, like GenBank, are challenging the role and importance of proprietary databases in many areas (especially Genomics).
• Many pharmacos/biotechs are developing their own bioinformatics skills to handle databases in-house
• Large risk that gene data acquisition skills could become a commodity (and therefore limit the value of proprietary databases), e.g., cost of sequencing a bacterial genome fell from $12m to $0.5m by 1997
• Belief that current databases are fragmented and inefficient, leading many pharmaco/biotech firms to outsource database management
Source: Literature; press releases
Discovery Software Providers
Technology basics
• Provides cutting-edge informatics solutions to discrete components of the discovery process, e.g., protein folding or CC (combinatorial chemistry) library selection and screening
• Drug discovery process is increasingly seeking more sophisticated IT solutions/software that require a specialized skill set
• Requires deep expertise in drug discovery as well as leading edge bioinformatics/IT capabilities
• Simply put, these are drug discovery tool kit companies
Competitive landscape
• Key players include:
– Structural Bioinformatics, Inc. (structure-based target id using sophisticated protein structure modeling and database)
– Tripos (offers several discovery tools, including FlexX, a virtual CC library software)
– Molecular Simulations, Inc. (Pharmacopeia subsidiary, software simulates molecular interactions of drugs, proteins)
– Compugen’s LabOnWeb.com (aimed at early gene sequence PCR work)
– Bioreason (chemical entity analysis programs)
– Spotfire (decision analytic software aimed at researcher productivity)
– Molecular Mining Corp.
Status and current issues
• Not clear which activities will be outsourced and which will be developed in-house
• Critical mass, skills, and focus are important issues for firms developing from a data acquisition heritage.
• Value proposition must include superior IT tools.
Source: Literature; press releases
Research Enterprise ASPs
Technology basics
• Offer ASP platforms that integrate broad databases and sophisticated IT applications, coupled with “research portal” functionality.
• Provides user-friendly interface that offers a suite of off-the-shelf bioinformatics solutions enabling users to access broad range of applications for data and analysis
• Requires leading edge IT capabilities, but does not rely on any specific drug discovery or data acquisition knowledge.
Competitive landscape
• Key players include:
– DoubleTwist (leader in the research enterprise ASP space; formerly Pangea)
– eBioinformatics
– Base4 (collaborative knowledge and project management platform with database handling applications)
– NetGenics (subscription ASP distributing computing platform with broad discovery applications)
– Genomica (Discovery Manager software suite)
– Viaken (a premier life science ASP for database hosting and analytic software)
Status and current issues
• Clear potential entry point for non-life science IT players
• Potential threat of commoditization of services
• Unclear who are the natural owners of this space
Source: Literature; press releases