From the Data to the Integrome. How fusing different data


Translational Case Histories
Harvard Medical School
Center for Biomedical Informatics
i2b2 National Center for Biomedical Computing
i2b2
Informatics for Integrating Biology & the Bedside
Isaac S. Kohane, MD, PhD
John Glaser, PhD
Susanne Churchill, PhD
A National Center for Biomedical Computing
First signal:
• 1 year after Celecoxib
• 8 months after Rofecoxib
• For every million prescriptions, 0.5% increase in MI (95% CI 0.1 to 0.9)
• 50.3% of the deviance explained
Effect on patient age
• Negative association between mean age at MI and prescription volume
• Spearman correlation -0.67, P<0.05
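As a minimal sketch of how a Spearman rank correlation like the one quoted above is computed: both series are converted to ranks (ties share the mean rank) and a Pearson correlation is taken on the ranks. The function names and the five data points below are invented for illustration and are not the study's data.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical illustration only: prescription volume vs. mean age at MI.
prescription_volume = [1, 2, 3, 4, 5]
mean_age_at_mi = [70, 69, 71, 66, 65]
rho = spearman(prescription_volume, mean_age_at_mi)
```

A negative `rho` means that periods with higher prescription volume tend to have a lower mean age at MI, which is the direction of the association reported on the slide.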
I2B2: Test RelNet Project
• Correlate available GEO expression data for the GPL96 platform, containing expressions for more than 22K human genes
• Number of gene pairs for this gene chip: ~250 million
• Multi-threaded application to run on the high-performance cluster environment from HP
• Bottleneck: the back-end database
• Current, fine-tuned version of the application takes about 2-3 months to complete one data set calculation
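The "~250 million" figure follows from counting unordered pairs, n(n-1)/2. A short sketch of that arithmetic, assuming a probe-set count of 22,283 (the number commonly cited for the GPL96 array; the slide itself only says "more than 22K"). The throughput helper is a rough back-of-envelope that assumes one data set calculation covers every pair once, which the slide does not state.

```python
def unordered_pairs(n):
    """Number of distinct unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pairs_per_second(n_pairs, days):
    """Average throughput needed to finish n_pairs within the given days."""
    return n_pairs / (days * 24 * 60 * 60)


# Assumption: 22,283 probe sets on the chip (the slide says "more than 22K").
N_PROBES = 22_283
n_pairs = unordered_pairs(N_PROBES)  # 248,254,903 pairs, i.e. roughly 250 million

# Assumption: a 2-3 month run is taken as about 75 days for illustration.
throughput = pairs_per_second(n_pairs, days=75)
```

Under these assumptions the implied rate is only tens of pairs per second, which is consistent with the slide's remark that the back-end database, not the compute cluster, is the bottleneck.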
[Figure: Average BMI by Age. Axes: Age, Average BMI. Annotation: Obesity.]
[Figure: Distribution of Highest BMI of each Patient. Axes: BMI, Total Patients.]
Recurrent Themes
• Access to large numbers of phenotyped specimens
• Inadequacy of informatics at the cutting edge
– Inadequacy of software solutions alone
• A persistent multidisciplinary requirement
Overall Remission Rate with Citalopram
[Figure: Percent (%) by Last QIDS-SR Score category: no depression, mild symptoms, moderate symptoms, severe symptoms, very severe symptoms.]
• 32.9% (N = 943/2876)
• QIDS: Quick Inventory of Depressive Symptoms, self report
Trivedi MH, et al. Am J Psychiatry 2006;163:28-40.
Aims:
• Identify a cohort of patients with TRD, and a matched cohort with SSRI-responsive MDD.
– Data-mining tools
– Natural language processing
• Conduct the first genomewide association study of TRD.
• Scan computerized medical records (DataMart)
– ICD9 RA x 3 plus one of:
    • CCP or RF
    • Erosions on x-ray
    • DMARD treatment
– Adds >95% specificity
• Crimson "discarded" blood samples (cases and controls)
• CCP on all samples (and bank serum)
• DNA on all samples for genetic studies
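The record-scanning rule above can be sketched as a simple predicate. This is one reading of the slide, not the project's actual implementation: "ICD9 RA x 3" is interpreted here as at least three RA-coded entries, and the record type and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical summary of one patient's record; all fields are assumed."""
    ra_icd9_count: int        # number of ICD9-coded RA entries
    ccp_or_rf_positive: bool  # positive CCP or RF laboratory result
    erosions_on_xray: bool    # erosions reported on x-ray
    on_dmard: bool            # DMARD treatment recorded


def is_candidate_ra_case(record, min_codes=3):
    """Apply the rule: enough RA codes plus at least one supporting criterion."""
    has_codes = record.ra_icd9_count >= min_codes
    has_support = (
        record.ccp_or_rf_positive
        or record.erosions_on_xray
        or record.on_dmard
    )
    return has_codes and has_support


# Illustrative records only, not real patients.
coded_and_treated = PatientRecord(4, False, False, True)
coded_only = PatientRecord(5, False, False, False)
selected = [is_candidate_ra_case(r) for r in (coded_and_treated, coded_only)]
```

Requiring a supporting criterion on top of the diagnosis codes is what the slide credits with raising specificity above 95%; codes alone would admit the second record.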
www.i2b2.org/disease/arthritis.html
Association in population samples
Affecteds
Controls
SNP frequency in cases compared to controls
Positive controls: MHC, PTPN22, STAT4, TRAF1-C5, TNFAIP3
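Comparing SNP frequency in cases and controls is often done with a chi-square test on a 2x2 table of allele counts. The sketch below shows that standard test as an illustration; the slide does not say which statistic the study used, and the counts are invented.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) on allele counts in cases vs. controls.

    Returns (chi2, p_value). Counts are numbers of alleles, not of people.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_cases, row_ctrls = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    denom = row_cases * row_ctrls * col_a * col_b
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # Upper tail of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP (not study data).
chi2, p_value = allelic_chi_square(case_a=60, case_b=40, ctrl_a=40, ctrl_b=60)
```

In a genomewide scan this test is repeated for every SNP, so the per-SNP threshold has to be far stricter than 0.05; known loci such as the positive controls listed above serve as a check that the pipeline recovers established signals.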
[Figure: allele-specific methylation assay. Allele ‘A’ (cgt…ggaatac…) and Allele ‘B’ (cgt…ggattac…) with NspI sites; MSRE digested samples compared with control (no digestion). Outcomes shown: A_ (‘A’ methylated, ‘B’ expressed); _B (‘B’ methylated, ‘A’ expressed); AB (control); __ (both unmethylated, both expressed).]
Methodology Dev
Making the numbers better
Gene Network Enrichment Analysis
[Figure: inputs are microarray data and a protein-protein interaction network; panels labeled Biological Process and Molecular Function.]
Diabetes Genome Anatomy Project:
Mouse Models of Insulin Resistance, Insulin Deficiency and Obesity
• Knockouts
– Insulin receptor
– Insulin receptor substrates
– Leptin
– PGC1A
• Environmental
– High fat diets
– Drug treatments (Streptozotocin)
• Tissues
67 Conditions Total
Three Functional Sets Are Consistently Over-represented In Disease Models
1. Insulin signaling, interleukins, and nuclear receptors.
2. Insulin signaling is consistent with the given disease models. It was not identified using standard techniques.
3. Interleukins and nuclear receptors are consistent with the inflammation and disordered metabolism associated with type 2 diabetes.
[Figure labels: Insulin signaling, Interleukins, Nuclear Receptors.]
Insulin signaling: 45 of 67.
Interleukins: 38 of 67.
Nuclear receptors: 31 of 67.
Early evidence of signature from WBC
Predicting CAG Length in HD
Thank you