Beyond Gene Prediction
Lessons from the
Human Genome Project
Tim Hubbard, TACD, October 2002
Everybody’s Genome
but
Benefits for Everyone?
The Human Genome Project
• Sequencing 3 billion bases was an unprecedented technical and
logistical challenge for biology.
• The initiative started in concept in 1985, in principle in 1990, and in
earnest in 1995.
• Obtaining the sequence is just the beginning. We now need to
interpret the sequence. Biology is acquiring a foundation in
information comparable to its foundation in chemistry. “Informatics is
to biology what mathematics is to physics”.
HGP Data release policy
(Bermuda Rules - 1996)
• Assembled sequence greater than 1000bp long is deposited in a public
database (GenBank/EMBL/DDBJ) every 24 hours
• No patents are filed
• This applies to large scale production centres funded for systematic
sequencing, and was a joint commitment of the sequencing centres
and funding agencies
• Derives from principle that sequence will be of greatest public benefit
if freely accessible without restriction
– Also good for cooperation of competitive institutions and for relations with the
wider research community
[Figure: Human Genome Sequence in GenBank. Megabases (0 to 3500) of draft and finished sequence by date, Jan-95 to May-00. Marked: CSH 1999.]
[Figure: Human Genome Sequence in GenBank, same axes. Marked: CSH 1999; CSH 2000, ca. 35,000 genes predicted.]
Human Genome Sequencing
[Chart legend, contributing centres: CHGC, JFCR, CSHL, CGM, GBF, Tokai, others, Max Planck, TIGR, UTSW, Oklahoma, Keio, Stanford (Davis), U. Wash (Hood), U. Wash (Olson), Stanford (Myers), Beijing, Jena, GTC, Genoscope, Whitehead, RIKEN, Baylor, DOE JGI, Sanger Centre, Washington University]
WIBR Template and Sequencing Room
ES40 Clusters, F.C Storage
Half way to corporate structure
• Strong structure of line-management; goals; planning
meetings
• Weekly international conference calls of grantholders with
funders
– Funders provide performance stats for ‘discussion’
• Competitive atmosphere between Sequencing Centres,
within a framework of open data, information exchange
• Academic salaries, no profit motive
Celera/HGP controversy
• Danger of commercial organisation obtaining
monopoly control over human genome sequence
• Controversy over whether private is more efficient
than public
• Controversy over whether “whole genome shotgun
method” is a panacea for large genomes
HGP sequencing strategy
• Chromosomes: 24
• Overlapping BACs: 354,510
• Tiling set: 29,298
• 4-5x shotgun sequence & computer assembly
– Draft sequence (……..TAGCTGTGTACGATGATC……….): ~15 contigs per clone
• 4-5x more shotgun; gap closure; problem solving, i.e. “Finishing”
– Finished sequence: 1 contig, less than one error in 10,000
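The "shotgun sequence & computer assembly" step can be illustrated with a toy example. The sketch below is not the assembler used by the HGP or by Celera; it is a minimal greedy overlap merge, assuming error-free reads from one strand, and the names `overlap` and `greedy_assemble` are made up for illustration. It uses the example sequence from the slide.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumption: reads are error-free and all from the same strand.
    Returns the list of contigs left when no overlaps remain.
    """
    # Drop exact duplicates (keeping order) and reads contained in other reads.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Three overlapping reads covering the slide's example sequence.
reads = ["TAGCTGTGT", "GTGTACGAT", "ACGATGATC"]
print(greedy_assemble(reads))  # ['TAGCTGTGTACGATGATC']

# Without the middle read there is a gap, so two contigs remain.
print(greedy_assemble(["TAGCTGTGT", "ACGATGATC"]))
```

The second call shows why a draft clone ends up in several contigs: wherever no read bridges two regions, the pieces cannot be joined until more sequence is generated, which is what gap closure in "Finishing" addresses.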
Celera assembly strategy
• Reads are assembled into contigs
• Read pairs link contigs into scaffolds
• Then order scaffolds on the chromosomes using the HGP clone map and
other publicly available maps
Inputs to the Celera Assembler:
• Public version of human genome (93% of genome, 7.5x coverage):
2.69 billion bases, 149,821 pieces
• Celera shotgun data (99% of genome, 5.1x coverage)
Output:
• Private version of human genome: 2.65 billion bases, 170,033 pieces
Took the public data and ended up with a version that was shorter
and in more pieces!
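The comparison can be made concrete with a little arithmetic. The sketch below takes only the figures quoted on the slide; the `Assembly` class and its field names are illustrative, and because the base counts are rounded to "billions" on the slide, the averages are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assembly:
    name: str
    total_bases: int  # rounded figure from the slide
    pieces: int

    @property
    def mean_piece_length(self) -> float:
        """Average number of bases per piece."""
        return self.total_bases / self.pieces


public = Assembly("public draft (HGP)", 2_690_000_000, 149_821)
celera = Assembly("private draft (Celera)", 2_650_000_000, 170_033)

for asm in (public, celera):
    print(f"{asm.name}: {asm.pieces:,} pieces, "
          f"about {asm.mean_piece_length:,.0f} bases per piece")

print(f"Celera has {celera.pieces - public.pieces:,} more pieces "
      f"and {public.total_bases - celera.total_bases:,} fewer bases")
```

On these figures the public draft averages roughly 18,000 bases per piece and the Celera draft roughly 15,600, which is the sense in which the private version was "shorter and in more pieces".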
Beware Greeks bearing gifts
Beware Scientists bearing press releases
The full story
“The Common Thread”
Sulston & Ferry
To be published in US
soon.
Lessons
• Twin track required
– Advocacy: Campaign against private ownership
– Action: Release data as fast as possible, to prevent
private ownership
• Advocacy to media critical
– Private domain was allowed to talk unopposed to media
for >1 year; results of this have still not gone away
• Competition between ‘public domain’ groups
important
The Human Genome now
• Initial draft HGP assembly had 149,821 fragments
• Initial draft Celera assembly had 170,033 fragments
• Celera stopped sequencing in April 2000
• HGP has since June 2000 ‘finished’ 50% of the Human
Genome, reducing the number of gaps by more than half.
• No Celera sequence is used by the public domain
• Finishing of Drosophila is also being carried out in the
public domain
Academic and commercial partnership
• The SNP Consortium (TSC) is a partnership of 12 companies and the Wellcome
Trust
– $3 million membership
– 1 million SNPs by Autumn 2000
– all data freely accessible
• Mouse Sequencing Consortium
– $6.5 million each from SKB, Merck; $3.5 million from Affymetrix
– $34 million from NIH; $7.75 million from Wellcome Trust
– Sanger, Whitehead, Washington University will complete 3x coverage of mouse
by end March 2001
– Traces available from NCBI/EBI
• Structural Genomics Consortium (under discussion)
Effect of restrictions on access to
biological data
• Biology is too complex for any organisation to have a
monopoly of ideas or data
• When a company starts a new project:
“Most research is being done elsewhere”
• If blocks of biological data are held privately, companies,
even if they pay for access, miss out on the analysis that
other scientists would publish if they too had access to
this data.
• The fewer people analysing a block of data, the less
valuable it is.
Collective intelligence
• Eric Raymond’s “The Cathedral and the Bazaar”
broadened interest in the benefits of collaborative
research models.
GNU General Public License (GPL)
• Copyrighted material can be used freely, even for
commercial purposes, but modified code must be shared
when it is distributed.
• Efforts are under way to extend this to patents and data.
“open” versus “closed”
• Open Source Software (LINUX)/Microsoft
• Public Library of Science/Commercial Journal
publishers
• Public/Private Scientific Databases
• Napster/Digital rights management
• Public Health Care/Drug patents
Biological data “libraries”
• Maintaining the Ensembl analysis of the human
genome sequence takes hundreds of computers,
terabytes of disk space and a team of ~25
developers
• Currently around 150,000 hits per day from 80
different countries in any one week
The public interest
– too much regulation
– not enough reward for investors
– fewer benefits for society
– too little regulation
– too much power in the hands of commerce
– fewer benefits for society
History of a classical patent
• Company A patents “mouse trap”
• Company B thinks it can do better
– attempts to license “mouse trap” from A
– develops and patents “improved mouse trap”
• Company A licenses “mouse trap” technology
to Company C
• Healthy competition
History of a gene patent
• Company A patents “gene145” (identifies function X
and possible application Y)
• Company B thinks it can do better
– attempts to license “gene145” from A
– only gene145 is involved in function X; there is no
“improved gene145” to patent
• Company A refuses to license “gene145” technology to
anyone
• Unhealthy monopoly
History of a gene patent (2)
• Company A patents “gene145” (identifies
function X and possible application Y)
• Company B patents “gene145” (identifies
function J and possible application K)
• Company A refuses to allow Company B to use
its 2nd “gene145” patent
• Unhealthy monopoly
History of a gene patent (3)
• Company A patents “gene145” (identifies
function X and possible application Y)
• Company B discovers sequence variant of
“gene145”, present only in ethnic group d and
specific for a disease unique to that ethnic group
• Company B is blocked from applying this
knowledge by Company A, which has no
interest in ethnic group d
Consequences of Patents?
Software Patents:
“If people had understood how patents would be
granted when most of today's ideas were invented
and had taken out patents, the industry would be at a
complete stand-still today.”
Bill Gates
Does society need IP any more?
• In the past
– Few scientists, innovators
– Significant benefit from disclosure in exchange for patents
• Today
– Millions doing R&D
– Everyone goes to the same meetings, hears the same talks, gets
stimulated in the same way, goes away and has the same ‘ideas’
– Ideas belong less to individuals (let alone corporations) than ever
before.
Does industry need IP to get the job
done?
• Other industries operate economic models which
rely much less heavily on IP, e.g. microchips.
• Issue is how to fund research
• Médecins Sans Frontières (Doctors without Borders)
– DNDi (Drugs for Neglected Diseases initiative)
– Fund R&D from public domain
– Charge drugs at cost (no IP)
Context
• There is an ongoing “War” between forces of
ownership and openness
• The “battle” for openness of the human genome
sequence was won
but it nearly went the other way…
• Increased data ownership limits freedom
• Rich world ownership inhibits world equality
The future of ideas
• Creativity and innovation always builds on the past
• The past always tries to control the creativity that builds on it
• Free societies enable the future by limiting the past
• Ours is less and less a free society
– Copyright
– Patents
– Digital control
Lawrence Lessig, OSCON 2002
http://randomfoo.net/oscon/2002/lessig/
http://www.aaronsw.com/weblog/000438
“open” versus “closed”
• Software: Open Source (LINUX) v Microsoft
• Publishing: Public Library of Science v Commercial
Journal publishers
• Data: Public v Private Scientific Databases
• Media: Napster v Digital rights management
• Public Health Care: Access to Drugs v Patents
Options to moderate monopoly effects
of gene patents
• Make compulsory licensing easier and less costly
– A government or a judge issues a non-voluntary license
to use a patent.
– Compulsory licensing can introduce competition and
lower prices.
– Compulsory licensing can prevent a patent holder from
blocking R&D and/or the development of new products.
• Do not allow gene based patents
– Already much more difficult
– Patent law currently being reviewed at WIPO, WTO
Funding structures
[Diagram; labels: R&D, Drugs, R&D, Fund, Public drug payments, Generics]
Public domain and IP issues
• World Business Council for Sustainable Development Project on
Intellectual Property Rights
• Royal Society brainstorming on the future of IPR
• EU discussions concerning open source software
• OECD working group on Issues of Access to Publicly Funded
Research Data
• Aventis scenarios workshop on Sustainable Health Care
• Médecins Sans Frontières IP policy for DNDi
• Rockefeller Foundation workshop on Collective Management of
Intellectual Property
Example of change
• European dairy farming, changed from:
– “Make as much milk as possible”, to
– “Make a fixed quota of milk”
• Consequences
– Market for milk quota
– 50% of high protein feed manufacturers went out of
business
– Industry survived
Access to Drugs
[Chart; axes: Cost, People treated; levels marked: Cost + Profit, + R&D, + Marketing, Free]
Challenging the traditional view of
academic/commercial public/private in science
• Traditional view:
– academic science is open, but slow
– commercialisation needed to do anything serious; leads to IP
• Juggernaut of IP has developed a life of its own:
– WTO/US calls for patents in more and more areas
– EPO is expanding to address these areas
• Information is infrastructure
– Open alternatives may work better for society
King Canute
• Pressures
– drug prices
– pressure on IP
– generic drugs
• Other industries suffering similar public pressure
– Microsoft
– Music; Films
• Side effects of monopoly based economic model for
funding R&D
Change? What, me?
Lots of people agree there is a problem but…
…say there is nothing that can be done about it
“apologists” for the existing model
Aventis:
“Scenarios for Sustainable Health Care”
Breast cancer screening and the
BRCA1 gene patent
• International patents granted to Myriad Genetics
covering BRCA1 and BRCA2 breast and ovarian
cancer genes
• Myriad test is expensive: Euros 990 ($869)
• Cheaper tests have been developed, ranging from
Euros 122 ($107) to Euros 689 ($605), but are
blocked by the Myriad patent
• Much of the work was carried out in the public domain