October 29th, 2014 University of Pennsylvania
TIES Cancer Research Network
Y3 Face to Face Meeting
U24 CA 180921
Session 3 New Partner Introductions
October 9th, 2015
University of Pennsylvania
Overview of Research Biobanking
and Data Analytics
Sidney Kimmel Cancer Center at
Thomas Jefferson University
John Reber
Systems Development Manager
Translational Pathology Core, Jefferson
SKCC
TCRN F2F, Washington DC, October 9,
2015
Overview of the Sidney Kimmel Cancer
Center (SKCC) at Jefferson
Background
Jefferson Medical College was founded in 1824, and the Hospital a year later. JMC is the second
largest private medical school in the U.S.
The Sidney Kimmel Cancer Center at Jefferson was founded in 1991 with approximately 30
investigators in the basic sciences.
Today, the SKCC has approximately 400 members that include physicians and scientists
dedicated to discovery and development of novel approaches for cancer treatment.
About SKCC
SKCC’s mission is to make transformational discoveries of the cellular and molecular biology of
the malignant process, and effectively translate the latest research discoveries into clinical trials
to provide the highest quality of care to all patients including those of diverse ethnic and racial
populations.
The SKCC’s Jefferson Cancer Network oversees clinical trials research at over 20 community
hospitals and practices in Pennsylvania, Delaware, New Jersey, and New York.
SKCC’s IT infrastructure
GE Centricity inpatient EMR
Allscripts outpatient (ambulatory care) EHR
Cerner A/P lab system
EPIC inpatient and outpatient
EPIC Beaker
OpenSpecimen research biobank management
TIES clinical text extraction
i2b2 research data mart
TriNetX data analytics network
Specimen annotation management
TJUH clinical paraffin block archive
Pathology Department research tissue bank (J. Evans, PI)
Pancreatic tumor bank (C. Yeo, PI)
Breast tumor bank (J. Palazzo, PI)
Thyroid tumor bank (E. Pribitkin, PI)
Brain tumor bank (D. Andrews, PI)
Liver tumor bank (V. Navarro, PI)
Brain tumor bank
Jefferson integrated Research Specimen management (OpenSpecimen)
> 230,000 patients, > 650,000 specimens
> 100,000 patients via i2b2 RDM: cancer patients having comprehensive annotation
from the Tumor Registry and banked specimens
Text Information Extraction System (TIES) deployment
at SKCC
Version 5.3 of TIES has been deployed at Jefferson
SKCC. To date, ~450,000 pathology reports are
available from TIES.
Pathology reports are automatically communicated from
the Cerner A/P system to the TIES database via an HL7
feed.
De-identification of the reports is accomplished using the
De-ID Corporation product.
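As a rough illustration of what receiving a pathology report over an HL7 feed involves, the sketch below splits a pipe-delimited HL7 v2 message into segments and collects the report text from the OBX segments. The sample message, the accession number, and the function names are invented for illustration; they are not taken from the Jefferson feed or from TIES, and a production interface would normally use an interface engine and run de-identification before storage.

```python
# Hypothetical sample message; not taken from any real feed.
SAMPLE = "\r".join([
    "MSH|^~\\&|LAB|HOSP|TIES|HOSP|201510090830||ORU^R01|MSG0001|P|2.3",
    "PID|1||MRN12345||DOE^JANE",
    "OBR|1||S15-0001|SURGPATH^Surgical Pathology Report",
    "OBX|1|TX|FINALDX^Final Diagnosis||Thyroid, left lobe: papillary carcinoma.",
    "OBX|2|TX|GROSS^Gross Description||Received fresh, a 12 g thyroid lobe.",
])


def parse_segments(message):
    """Split an HL7 v2 message into (segment_id, fields) pairs."""
    segments = []
    for line in message.split("\r"):      # segments end with a carriage return
        if line:
            fields = line.split("|")      # fields are pipe-delimited
            segments.append((fields[0], fields))
    return segments


def extract_report(message):
    """Return the accession number and the concatenated OBX-5 report text."""
    accession, sections = None, []
    for seg_id, fields in parse_segments(message):
        if seg_id == "OBR" and len(fields) > 3:
            accession = fields[3]         # OBR-3: filler order number
        elif seg_id == "OBX" and len(fields) > 5:
            sections.append(fields[5])    # OBX-5: observation value
    return accession, "\n".join(sections)


accession, text = extract_report(SAMPLE)
print(accession)
print(text)
```

The sketch ignores HL7 escape sequences, repeated fields, and component separators, all of which a real parser has to handle.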
Jefferson’s i2b2 Research Data Mart
• Built on “informatics for integrating biology and the
bedside” (i2b2) framework from the NIH-funded National
Center for Biomedical Computing based at Partners
HealthCare System (Harvard).
• RDM data are de-identified. Re-identification is possible
via an honest broker, who has access to a re-identification application.
• Currently ~ 45 million observations on > 450,000
patients. Data refreshed weekly.
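To make the "observations on patients" idea concrete, here is a minimal sketch of a cohort count against an i2b2-style fact table. The table and column names (observation_fact, patient_dimension, patient_num, concept_cd) follow the i2b2 star schema, but only a small subset of columns is shown, and the rows and concept codes are invented toy data, not Jefferson RDM content.

```python
import sqlite3

# In-memory toy database with a cut-down i2b2-style star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dimension (patient_num INTEGER PRIMARY KEY, sex_cd TEXT);
CREATE TABLE observation_fact (
    patient_num INTEGER,
    concept_cd  TEXT,
    start_date  TEXT
);
""")
conn.executemany("INSERT INTO patient_dimension VALUES (?, ?)",
                 [(1, "F"), (2, "F"), (3, "M")])
conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?)", [
    (1, "ICD9:193", "2014-03-02"),
    (1, "ICD9:193", "2014-06-11"),    # repeat observation, same patient
    (2, "ICD9:174.9", "2013-11-20"),
    (3, "ICD9:193", "2015-01-07"),
])


def count_patients(conn, concept_cd):
    """Count distinct patients with at least one observation of a concept."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT patient_num) "
        "FROM observation_fact WHERE concept_cd = ?",
        (concept_cd,),
    ).fetchone()
    return row[0]


print(count_patients(conn, "ICD9:193"))
```

Counting distinct patient_num values rather than rows is what separates a patient count from an observation count: one patient can contribute many facts.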
Current Jefferson Data Resource Landscape
i2b2 RESEARCH DATA MART
OPEN SPECIMEN: biospecimen annotation (SNOMED)
TJUH CLINICAL DATA WAREHOUSE:
DEMOGRAPHICS (gender, race, age, vital status, ethnicity)
DIAGNOSES (ICD9)
PROCEDURES (ICD9)
CLINICAL LABS (LOINC)
MEDICATIONS
IMPAC METRIQ: cancer registry site, stage, histology, treatment, survival (ICD-O-3)
CERNER A/P: “omic” data
FORTE ONCORE: clinical trial data
Patient data obtained from TJUH EMR
DEMOGRAPHICS
Age
Ethnicity
Gender
Race
Vital Status (alive/dead)
DIAGNOSES
Disease systems --> diseases (organized by ICD9 coding)
CLINICAL LAB RESULTS
Chemistry
Coagulation
Hematology
MEDICATIONS
Anti-neoplastic
INPATIENT PROCEDURES
Diagnostic and Treatment procedures (organized by ICD9
coding)
Patient mutation data obtained from Pathology Molecular
Diagnostic Testing (both outsourced and in-house)
Gene     cDNA change            Protein change
ALK      rearrangement
BRAF     c.1782T>G              p.D594E
BRAF     c.1801A>G              p.K601E
BRAF     c.1799T>A              p.V600E
EGFR     Deletion in exon 19
EGFR     Insertion in exon 20
EGFR     c.2236G>A              p.E746K
EGFR     c.2236_2250del15       p.E746_A750delELREA
EGFR     c.2156G>C              p.G719A
EGFR     c.2155G>T              p.G719C
EGFR     c.2155G>A              p.G719S
EGFR     c.2573T>G              p.L858R
EGFR     c.2582T>A              p.L861Q
EGFR     c.2303G>T              p.S768I
JAK2     c.1849G>T              p.V617F
JAK3     c.2164G>A              p.V722I
KRAS     c.35G>C                p.G12A
KRAS     c.34G>T                p.G12C
KRAS     c.35G>A                p.G12D
KRAS     c.34G>C                p.G12R
KRAS     c.34G>A                p.G12S
KRAS     c.35G>T                p.G12V
KRAS     c.38G>A                p.G13D
NRAS     c.183A>T               p.Q61H
NRAS     c.181C>A               p.Q61K
NRAS     c.182A>T               p.Q61L
NRAS     c.182A>G               p.Q61R
PIK3CA   c.1633G>A              p.E545K
PIK3CA   c.3140A>T              p.H1047L
PIK3CA   c.3140A>G              p.H1047R
TP53     c.843C>A               p.D281E
TP53     c.811G>T               p.E271*
TP53     c.857A>C               p.E286A
TP53     c.400T>C               p.F134L
TP53     c.734G>A               p.G245D
TP53     c.388C>G               p.L130V
TP53     c.524G>A               p.R175H
TP53     c.817C>T               p.R273C
TP53     c.818G>A               p.R273H
TP53     c.318C>G               p.S106R
TP53     c.659A>G               p.Y220C
TP53     c.707A>G               p.Y236C
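The cDNA (c.) and protein (p.) notations listed above describe the same change at two levels, and the two positions are linked by simple arithmetic: coding position n falls in codon ceil(n / 3). The sketch below uses that relation as a sanity check on a few of the listed pairs. The helper names are hypothetical, and the check covers only the position, not the nucleotide or amino-acid change itself.

```python
import re


def codon_from_cdna(cdna):
    """Codon number of the first coding position in a c. string."""
    pos = int(re.match(r"c\.(\d+)", cdna).group(1))
    return (pos + 2) // 3                 # integer form of ceil(pos / 3)


def codon_from_protein(protein):
    """Residue number of the first affected amino acid in a p. string."""
    return int(re.match(r"p\.[A-Z](\d+)", protein).group(1))


# A few pairs taken from the mutation list above.
PAIRS = [
    ("c.1799T>A", "p.V600E"),                        # BRAF
    ("c.2573T>G", "p.L858R"),                        # EGFR
    ("c.2236_2250del15", "p.E746_A750delELREA"),     # EGFR
    ("c.35G>A", "p.G12D"),                           # KRAS
    ("c.843C>A", "p.D281E"),                         # TP53
]

for cdna, protein in PAIRS:
    consistent = codon_from_cdna(cdna) == codon_from_protein(protein)
    print(cdna, protein, "consistent" if consistent else "MISMATCH")
```

A check like this is useful when variant data arrive from several laboratories, since a transposed digit in either notation shows up as a mismatch.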
Specimen annotation from campus biobanks
Eight biobanks, including the TJUH paraffin block archive of ~400,000 cases
since 1990.
Anatomic origin (SNOMED)
Class (tissue, fluid)
Type (frozen, FFPE)
Pathology (normal, malignant, diseased)
Slide images
Patient data from Jefferson Tumor Registry
Over 100,000 cases since 1990.
Primary Cancer Diagnosis
Age at diagnosis/date of diagnosis
Survival (months) from diagnosis
Tumor histology and behavior
Stage (AJCC/TNM, clinical and pathological)
Grade
Recurrence (local, distant)
Treatment (chemotherapy, radiation, surgery, transplant, palliative)
Disease-specific factors (e.g., prostate --> Gleason score)
Example data summaries from the i2b2
RDM
CLINICAL DIAGNOSES OF TJUH PATIENTS WITH
THYROID SPECIMENS
Pathology images are available via i2b2 query tool
TriNetX application offers an alternative query tool
with enhanced data visualization
Google-like query
interface
Graphic result
display
TriNetX application offers an alternative query
tool with enhanced data visualization
Interactive display
capability
Selected areas of research using
RDM:
Hallgeir Rui, MD, PhD: Molecular Cancer Epidemiology, cancer
pharmacogenetics, individualised cancer risk assessment and
prognostication.
Hushan Yang, PhD: Molecular Cancer Epidemiology.
Scott Waldman, MD, PhD: Pharmacology and experimental
therapeutics.
Ron Myers, PhD: Gene-environment risk assessment.
Stephen Peiper, MD: Biomarker discovery using Next Generation
Sequencing.
Future Plans:
Allow access to TIES reports directly from i2b2-selected
cohorts (as we do with slide images).
Provide direct investigator access to the TIES
application.
Provide direct investigator access to the TCRN.
Cancer Research @ SB: Program Development
Programs under development:
• Metastasis and Experimental Therapeutics
• Metabolomics/Lipidomics
• Precision Approaches to Cancer: Imaging, informatics, genomics:
Wei Zhao and Helene Benveniste, Joel Saltz, Scott Powers
• Prevention and Population Research
• GI Cancer
Integrative Multi-scale Analysis in Biomedical Informatics
• Predict treatment outcome,
select, monitor treatments
• Computer assisted exploration of
new classification schemes
• Integrated analysis and
presentation of observations,
features, and analytical results –
human and machine generated
Current ITCR
Specific Aim 2:
Database infrastructure to manage and query image data and image analysis
results.
Specific Aim 3:
HPC software that targets clusters, cloud computing, and leadership scale
systems.
Specific Aim 4:
Develop visualization middleware for 2D/3D image and feature data and
for integrated image and “omic” data.
Image Analysis
Quantitative Feature Analysis in Pathology:
Emory In Silico Center for Brain Tumor Research
Dan Brat (PI), Joel Saltz (PD)
NLM/NCI: Integrative Analysis/Digital Pathology
R01LM011119, R01LM009239
Joel Saltz and David Foran
NCI: Tools to Analyze Morphology and Spatially
Mapped Molecular Data, 1U24CA180924-01A1
Joel Saltz
Marcus Foundation Grant
Ari Kaufman, Joel Saltz
Nuclear and feature segmentations yield new
determinants/correlates for outcomes.
Current ITCR: Storage and Visualization
Scalable storage architectures allow for effective and efficient
comparison of multiple image analysis methods.
Current ITCR: Classification with CNNs
Le Hou, Dimitris Samaras, Tahsin Kurc, Yi Gao, Liz Vanner, James Davis, Joel Saltz
How do we use TIES?
• IT is actively deploying TIES for consortium and internal use.
• Regulatory progress
– IRB submission in preparation.
– Accept Terms of Agreement.
– Assigned roles.
• Extract additional morphology features for integrative research.
Using TIES to derive Morphology Features
A simple extraction pipeline:
Process pathology reports using TIES.
Learn which concepts indicate morphology features.
Identify modifiers that specify values for these features.
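The three steps above can be sketched with a toy, rule-based stand-in. The vocabularies below are hand-written and hypothetical; in the pipeline described on the slide they would be learned from concepts that TIES produces, and nothing here calls TIES itself. The sketch shows only the shape of the output: one value per morphology feature, taken from a nearby modifier or a negation cue.

```python
# Hypothetical vocabularies; a real pipeline would learn these from
# annotated reports rather than hard-code them.
FEATURE_TERMS = {
    "necrosis": "necrosis",
    "mitoses": "mitotic_activity",
    "mitotic": "mitotic_activity",
}
MODIFIER_TERMS = {"extensive": "high", "brisk": "high",
                  "focal": "low", "rare": "low"}
NEGATIONS = {"no", "without", "absent"}


def extract_features(sentence, window=2):
    """Map each feature mention to 'absent', a modifier value, or 'present'."""
    tokens = [t.strip(".,;:").lower() for t in sentence.split()]
    results = {}
    for i, tok in enumerate(tokens):
        feature = FEATURE_TERMS.get(tok)
        if feature is None:
            continue
        context = tokens[max(0, i - window):i]   # words just before the mention
        if any(t in NEGATIONS for t in context):
            results[feature] = "absent"
            continue
        # Use the closest preceding modifier, if any.
        value = next((MODIFIER_TERMS[t] for t in reversed(context)
                      if t in MODIFIER_TERMS), "present")
        results[feature] = value
    return results


print(extract_features("Extensive necrosis with rare mitoses."))
print(extract_features("No necrosis identified."))
```

The three outcomes (mention found, absent or not, value assigned) correspond to three separate decisions, each of which can be scored on its own.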
Using TIES to derive Morphology Features
How good is this?
Feature mention detection F1 = 0.93
Feature is absent or not F1 = 0.74
Feature value assignment F1 = 0.67
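For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it from true-positive, false-positive, and false-negative counts. The counts in the example are illustrative only and are not the counts behind the scores reported above.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * P * R / (P + R), with P = tp/(tp+fp) and R = tp/(tp+fn)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 8 correct detections, 2 spurious, 2 missed.
print(round(f1_score(tp=8, fp=2, fn=2), 2))
```

Because it is a harmonic mean, F1 is pulled toward the weaker of precision and recall, so a low score on value assignment can come from either missed values or wrong ones.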
Collaboration Opportunities with TCRN
• Develop new tools that combine human generated features and machine
generated features.
– Collaboration on TIES analysis service enhancements.
• Joint engagement with cancer research partners to exploit these tools.
• Institutional collaboration with TIES in biomarker and clinical study research
Our Vision for Collaboration
Goal: Facilitate integrated analysis of whole slide images and pathology
reports on the same specimen.
Proposal: A micro-services composition for a global API ecosystem.
Integrated Image / NLP analysis
Context User Interfaces
Image Analysis
HTTP REST API
NLP
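One way to picture the proposed composition is two independent services, one for image analysis and one for NLP, whose JSON responses are joined on a shared specimen identifier. The sketch below uses in-process functions as stand-ins for HTTP endpoints. Every name, field, and value in it is hypothetical and does not describe an existing TIES or image-analysis API.

```python
import json

# Stand-ins for two HTTP services; payloads and field names are hypothetical.
_IMAGE_DB = {"S-001": {"nuclei_count": 15230, "mean_nuclear_area_um2": 41.7}}
_NLP_DB = {"S-001": {"necrosis": "present", "histology": "glioblastoma"}}


def image_analysis_service(specimen_id):
    """Pretend GET on an image-analysis endpoint; returns a JSON string."""
    return json.dumps({"specimen_id": specimen_id,
                       "image_features": _IMAGE_DB.get(specimen_id, {})})


def nlp_service(specimen_id):
    """Pretend GET on a report-NLP endpoint; returns a JSON string."""
    return json.dumps({"specimen_id": specimen_id,
                       "report_features": _NLP_DB.get(specimen_id, {})})


def integrated_view(specimen_id):
    """Join machine-generated and report-derived features for one specimen."""
    image = json.loads(image_analysis_service(specimen_id))
    report = json.loads(nlp_service(specimen_id))
    return {
        "specimen_id": specimen_id,
        "image_features": image["image_features"],
        "report_features": report["report_features"],
    }


print(json.dumps(integrated_view("S-001"), indent=2))
```

Keeping the two services separate and joining at the client lets either side change independently, as long as both agree on the specimen identifier.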