PGP-10 - Church Lab


Personal Genome Project
9-9:30 AM 20-Oct NRB Room 350
Thanks to:
1
Agenda
9:00 am Introduction: George Church
9:30 am PGP-10 comment period
10:00 am Genomic Consultation: Joseph Thakuria MD, MGH
10:30 am PGP Cells: Jay Lee MD, In Hyun Park
11:00 am Response of Patients to ApoE info: Robert Green MD, Boston Univ.
11:30 am PGP-10 & Staff: saliva specimen collection for microbiomics
12:00 pm Lunch
12:45 pm Tour featuring Polonator instruments
1:00 pm Sharing: John Wilbanks, Science Commons
1:30 pm Association Studies: David Altshuler MD, Broad Institute
2:00 pm PGP-10 & Staff Flex Time
3:00 pm International PGP: Jeantine Lunshof, Amsterdam; Margret Hoehe MD, Berlin
3:30 pm PGP-10 comment period and public data release
4:00 pm Closing Remarks: George Church
4:30 pm Press conference NRB Rotunda
6:00 pm Public event sponsored by Nova scienceNow
2
Suggestions are welcome
#1:
#2:
3
Major points
#1: Thank you
#2: Today is a start, not a final product
#3: PGP is research, not a genetics service
#4: We are providing some interpretations, but mainly to initiate study & discussion. Decisions about releasing data should be largely based on other considerations.
#5: Today: PGP-10: 50K exons, SNPs, CNVs,
#6: 2009: PGP>100: 200K exons, RNA, microbiome, VDJome; full genomes for 10.
4
PGP Education, ELSI, International
Meetings:
PG-0 28-Jun-06 Toronto, Ann Arbor
PG-1 18-Jul-07 Boston, Brookline
PG-2 20-Oct-08 Boston
PG-3 ??
Education: pgEd.org, NecessaryFilms.com, Oppenheimerfoundation.org
PGP inquiries: UCSD, JCVI, UNM, Yale, ISB, Berlin, Toronto, Seoul, Shenzhen, Singapore
5
PersonalGenomes.org: gene/environment/trait data
1) Open access (very low barrier to participation)
2) Avoid over-promising on de-identification
3) 100% on Exam to assure informed consent
4) Low cost coding sequence + regulatory data
5) Multi-traits: imaging, iPS stem cell RNA, microbes
6) Cells available for personal functional genomics
7) IRB approval for 100,000 diverse volunteers
2003 to 2008 International consortium
[Image labels: 0431, 1660, 1070, 1846, 1677, 1730, 1731, 1687, 1833, 1781]
Lunshof JE, Chadwick R, Vorhaus DB, Church GM (2008) From genetic privacy to open consent. Nat Rev Genet.
Lunshof JE, Chadwick R, Church GM (2008) Hippocrates revisited? Old ideals and new realities. Genomic Med. 2(1-2):1-3.
6
Inherited Genomics
Once in a life-time genome sequence
to Predictive Medicine
[Diagram labels: PERSONAL GENOME (1 to 98%); TRAITS (Phenome)]
7
Inherited + Environmental Genomics
Once in a life-time genome + yearly (to daily) tests
Public Health Bio-weather map: Allergens, Microbes, Viruses
[Diagram labels: PERSONAL GENOME (1 to 98%); Multitissue Epigenome (RNA, mC); VDJ-ome; TRAITS (Phenome); Microbiome]
8
Omic combinatorics
(Alleles^n * environments^m) vs. (lumping via pathways)
[Diagram labels: 9K chem/drugs; VDJ-ome, 1M receptors; PERSONAL GENOME, 3M alleles; >>250 tissues epigenome (RNA, mC); 4000 disorders + non-medical (quant) traits; Microbiome, 1M species]
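To get a feel for the scale of the combinatorics on this slide, a minimal sketch can multiply out the Alleles^n * environments^m expression. It takes the slide's 3M alleles as the allele count and, as an assumption, uses the 9K chem/drugs figure as the environment count; the function name and the choices of n and m are illustrative only.

```python
# Counts are read off the slide; treating "9K chem/drugs" as the number of
# environments is an assumption made for this illustration.
ALLELES = 3_000_000
ENVIRONMENTS = 9_000


def hypothesis_count(n: int, m: int) -> int:
    """Number of combinations for n interacting alleles and m environments."""
    return ALLELES ** n * ENVIRONMENTS ** m


for n, m in [(1, 0), (1, 1), (2, 0), (2, 1), (3, 1)]:
    print(f"n={n}, m={m}: {hypothesis_count(n, m):.1e} combinations")
```

Two interacting alleles plus a single exposure already give roughly 8E16 hypotheses under these counts, which is the motivation for lumping via pathways.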
9
Multiple hypothesis testing
[Plot: Y = Number of Sib Pairs (Association), 0 to 1,600; X = Number of Alleles (Hypotheses) Tested, 1E+4 to 1E+22, annotated "Allele & environment combinations"]
GRR = 1.5 (GRR = genotypic relative risk), p = 0.5 (population frequency)
Pool some alleles by pathway & mutation type (not LD or chromosome position)
Based on Risch & Merikangas (1996) Science 273: 1516
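The shape of this curve (sample size growing slowly while the number of hypotheses grows by many orders of magnitude) can be illustrated with a simplified sketch. This is not the Risch & Merikangas sib-pair calculation: it assumes a one-sample proportion test on transmissions from heterozygous parents, a multiplicative model in which the risk allele is transmitted with probability GRR / (1 + GRR), a Bonferroni correction, and 80% power. It also ignores the allele frequency p, which determines how many parents are heterozygous. The function name and defaults are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()


def transmissions_needed(grr: float, n_hypotheses: float,
                         family_alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate heterozygous-parent transmissions needed (simplified model).

    Null: the risk allele is transmitted with probability 0.5.
    Alternative (assumed multiplicative model): probability grr / (1 + grr).
    """
    tau = grr / (1.0 + grr)
    alpha = family_alpha / n_hypotheses        # Bonferroni-corrected, one-sided
    z_alpha = -_STD.inv_cdf(alpha)             # lower tail, so tiny alpha stays representable
    z_beta = _STD.inv_cdf(power)
    numerator = z_alpha * 0.5 + z_beta * (tau * (1.0 - tau)) ** 0.5
    return (numerator / (tau - 0.5)) ** 2


for exponent in (4, 7, 10, 13, 16, 19, 22):
    n = transmissions_needed(grr=1.5, n_hypotheses=10.0 ** exponent)
    print(f"1E+{exponent} hypotheses: ~{n:.0f} transmissions")
```

Because the required z-score grows only with the square root of the log of the number of tests, going from 1E+4 to 1E+22 hypotheses raises the requirement by a single-digit factor in this model rather than by orders of magnitude.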
10
Sequencing tracked Moore’s law (2X / 2 yr) until 2004-8 (10X / yr)
[Plot: log-scale y-axis from 10 down to 0.0000001; x-axis years 1990 to 2010]
40X 98% genome $5K in 2008 ($50 for 1%?)
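The two improvement rates on this slide can be compared with a short worked calculation. This is a sketch assuming steady exponential improvement over the four years 2004-8; the starting cost is left as a relative value of 1 because the plot's axis units are not given, and the function name is hypothetical.

```python
def relative_cost(years: float, fold_per_period: float, period_years: float) -> float:
    """Cost relative to the starting cost after `years` of steady improvement."""
    return fold_per_period ** (-years / period_years)


moore = relative_cost(4, fold_per_period=2, period_years=2)   # 2X / 2 yr
fast = relative_cost(4, fold_per_period=10, period_years=1)   # 10X / yr
print(f"4 years at 2X / 2 yr: cost falls to {moore:.2f} of the starting level")
print(f"4 years at 10X / yr:  cost falls to {fast:.0e} of the starting level")

# The slide's closing estimate, scaled linearly: $5K for the genome, 1% of it.
genome_cost_2008 = 5_000
one_percent_cost = genome_cost_2008 // 100
print(f"1% of the genome at the same per-base price: ~${one_percent_cost}")
```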
11
Multiplex Cyclic Sequencing by Synthesis
Polonator: multiple chemistries: polonies on slides or beads
Polymerase: Mitra, et al. 2003 Analyt. Biochem.; 1999 NAR
-or-
Ligase: Shendure, Porreca, et al. 2005 Science
Platforms: AB-SOLiD*, CGI*; Illumina, IBS*
[Diagram base labels: A, G, C, T]
12
Open-architecture hardware, software, wetware
e.g. 1981 IBM PC
Polonator: $150K - 2 billion beads/run
Rich Terry
13
6 Next Generation Sequencing Platforms
Platforms (as listed): Roche, Illumina, AB-SOLiD, Helicos, Polonator
Instrument prices (as listed): $500K, $680K, $690K, $1350K, $155K
Throughput (as listed): .001G/0.03h, 0.2 G/2.6h, 0.3 G/4h, 2.8 G/2h, 2G/2h
Notes (as listed): VDJ-grant; Co-develop SAB; Exomes; SAB & PGP10 in 2009; Co-develop
14
Association Studies & Direct to Consumer
Price       Genome coverage   Notes
$350K       98% genome
$2500       0.02%             23 clinical
$999        0.02%             31 diseases
$400        0.02%             10 clinical, 68 research
subsidized  0.02%             de-identified
subsidized  0.02%             researcher subset
subsidized  Varies            no trait data
[Provider names appeared as logos and are not recoverable]
15
DTC: 23andMe
[Table: #genes per entry; values as extracted: 3, 8, 1, 6, 3, 8, 5, 1, 2, 9; row labels not recoverable]
16
The number of human genes
Broad Inst.: 20,500 genes with conservation pattern indicative of function
GeneCards annotated: 29,479; name/location: 38,891
GeneTests: 1347; # included in today’s exome: 953
1% of the genome is protein coding = 2x30Mbp
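The "2x30Mbp" figure follows from simple arithmetic, sketched below. The 3 Gbp haploid genome size is an assumption (it is the round number implied by 1% = 30 Mbp); the gene counts are taken from the slide, and the variable names are illustrative.

```python
HAPLOID_GENOME_BP = 3_000_000_000   # assumption: round 3 Gbp haploid genome
CODING_PERCENT = 1                  # "1% of the genome is protein coding"
PLOIDY = 2                          # two copies of the genome per cell

coding_bp_haploid = HAPLOID_GENOME_BP * CODING_PERCENT // 100
coding_bp_diploid = PLOIDY * coding_bp_haploid
print(f"Coding sequence: {PLOIDY} x {coding_bp_haploid // 1_000_000} Mbp "
      f"= {coding_bp_diploid // 1_000_000} Mbp")

# Share of GeneTests genes covered by the exome described on the slide.
genetests_genes = 1347
genes_in_exome = 953
coverage = genes_in_exome / genetests_genes
print(f"GeneTests genes included in the exome: {coverage:.0%}")
```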
17
Selective genome sequencing
3 ways to capture alleles from genomic or cDNA
1. In vitro paired-end tags (PET), Science 2005
2. Hybridization selection
3. Gap fill, Nat Methods 2007
For rearrangements
Red=Synthetic; Yellow=genome/cDNA
Shendure et al. (2005) Science 309:1728
Porreca et al. (2007) Nat Methods 4:931
Nilsson et al. (2006) Trends Biotechnol 24:83
How do we optimize >100K 100mers?
Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church
18
99% Concordance : GapFill Sequencing & HapMap
19
Concordance: GapFill Exon PGP Sequencing & Affymetrix SNP chip data (4+ reads)
20
RNA/epigenome challenge:
Multiple cell types from adults
3mm skin sample
21
Induced Pluripotent Stem Cell Generation & Transdifferentiation (Oct4/Sox2/Myc/Klf4)
Retroviral Infection
Adenoviral Infection
Tissue Culture on a Mouse Feeder Layer
ES Cell Colony Identification
Clonal Isolation and Propagation
Embryoid Body Induction & Guided Differentiation
2 months
Multiple integration sites
Yamanaka, Daley, Thomson, Hochedlinger, Jaenisch labs
Mixture of differentiated cell types & Guided Differentiation
1 week
In Hyun Park, Jay Lee
22
Reprogramming reproducibility
23
Inherited + Environmental Genomics
Once in a life-time genome + yearly (to daily) tests
Public Health Bio-weather map: Allergens, Microbes, Viruses
[Diagram labels: PERSONAL GENOME (1 to 98%); Multitissue Epigenome (RNA, mC); VDJ-ome; TRAITS (Phenome); Microbiome]
24
PGP Microbiome-Resistome: 18 Antibiotics
Dantas, Sommer, Church (unpublished)
25
Bacteria Subsisting on 18 Antibiotics
Dantas, Sommer, Church. Science 2008
26
Antibody VDJ regions
Lefranc, The Immunoglobulin FactsBook; Janeway, Immunobiology 2001
27
Human B & T lymphocyte cDNA: VDJ Polonies
2-4 E6 / ml * 5L = 1E10 cells (blood)
46*23*6*67*5 = 2M combinations (24 bits vs 750 bp)
Locus   V       D    J     C
IGH     38-46   23   6     9
IGK     31-35   -    5     1
IGL     29-32   -    4-5   4-5
TRA     45-47   -    50    1
TRB     39-46   2    13    2
TRD/A   5       3    4     1
TRG     4-6     -    5     2
Uri Laserson,
Francois Vigneault
http://www.infobiogen.fr/services/chromcancer/Genes/TCRBID24.html
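The arithmetic on this slide can be checked with a short sketch. The five factors are copied as written; the slide does not label each one, so they are treated here simply as independent choices. Reading "24 bits" as one fixed-width index per factor is an assumption, chosen because it reproduces the slide's number, and the 2 bits per base used for the 750 bp comparison is likewise an assumption.

```python
from math import prod

# Factors exactly as written on the slide (unlabeled there).
FACTORS = (46, 23, 6, 67, 5)

combinations = prod(FACTORS)

# Bits needed if each factor is stored as its own fixed-width index (assumption).
bits_per_field = [(f - 1).bit_length() for f in FACTORS]
index_bits = sum(bits_per_field)

# Raw sequence for comparison: 750 bp at an assumed 2 bits per base.
sequence_bits = 750 * 2

# Lymphocytes in blood: 2-4 E6 cells/ml over 5 L (5,000 ml).
cells_low, cells_high = (per_ml * 5_000 for per_ml in (2_000_000, 4_000_000))

print(f"{combinations:,} combinations")
print(f"{index_bits} index bits vs {sequence_bits} sequence bits")
print(f"{cells_low:.0e} to {cells_high:.0e} cells in blood")
```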
28
VDJ(H)
16 antigens & 3 PG-B cells
combinations 24x86
ImMunoGeneTics database
http://imgt.cines.fr/
29
32
Is promising anonymity realistic? Are we in denial?
Trends in laws to make data public (not just at elite institutions): e.g. H.R. 2764, SEC. 218, 26-Dec-07: open access for all NIH-funded research. SEC, GINA, etc.
(12) Identifying individual case/control status from pooled SNP data. Homer et al. PLoS Genetics 2008.
(11) Re-identification after “de-identification” using public data. A Group Insurance list of birth date, gender, and zip code was sufficient to re-identify the medical records of Governor Weld & family via voter-registration records (1998).
Self-identification trend (genome-altruists)
(10) Unapproved self-identification, e.g. Celera IRB (Kennedy, Science 2002).
(9) Obtaining data about oneself via FOIA or sympathetic researchers.
(8) DNA data: CODIS data in the public domain, even if acquitted.
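The linkage attack in item (11) can be sketched as a join on quasi-identifiers. All records below are invented toy data and the function name is hypothetical; the only element taken from the slide is the choice of birth date, gender, and zip code as the linking fields.

```python
from collections import defaultdict

# Hypothetical toy records, invented for illustration only.
deidentified_medical = [
    {"birth_date": "1950-01-02", "gender": "M", "zip": "02100", "diagnosis": "condition A"},
    {"birth_date": "1962-07-15", "gender": "F", "zip": "02100", "diagnosis": "condition B"},
]
voter_roll = [
    {"name": "Person One", "birth_date": "1950-01-02", "gender": "M", "zip": "02100"},
    {"name": "Person Two", "birth_date": "1962-07-15", "gender": "F", "zip": "02100"},
    {"name": "Person Three", "birth_date": "1962-07-15", "gender": "F", "zip": "02199"},
]

QUASI_IDENTIFIERS = ("birth_date", "gender", "zip")


def link(records, public):
    """Return (name, diagnosis) pairs whose quasi-identifiers match exactly one public entry."""
    by_key = defaultdict(list)
    for person in public:
        by_key[tuple(person[q] for q in QUASI_IDENTIFIERS)].append(person["name"])
    matches = []
    for record in records:
        names = by_key.get(tuple(record[q] for q in QUASI_IDENTIFIERS), [])
        if len(names) == 1:   # a unique match re-identifies the record
            matches.append((names[0], record["diagnosis"]))
    return matches


reidentified = link(deidentified_medical, voter_roll)
for name, diagnosis in reidentified:
    print(f"{name}: {diagnosis}")
```

No names appear in the medical records, yet every record that is unique on the three fields is attached to a name once a public list carrying the same fields is available.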
33
Is promising anonymity realistic? Are we in denial?
Accessing “Secure data”
(7) Laptop loss. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(6) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center -- 4000 files (names, conditions, etc., 2000).
(5) Combination of surnames from genotype with geographical info. An anonymous sperm donor was traced on the internet in 2005 by his 15-year-old son, who used his own Y chromosome data.
(4) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(3) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, geographical features, dysmorphologies, etc. are known & the list is growing.
(2) “Abandoned” DNA-bearing samples (e.g. hair, dandruff, hand-prints, etc.).
(1) Government subpoena. False positive IDs and/or family coercion.
34