Extending Genomic Data Sharing Policies

National Cancer Institute
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
National Institutes of Health
Importance of Semantics
in Precision Oncology at
NCI
Sherri de Coronado, MS, MBA
NCI CBIIT
May 15, 2015
Mind Map of Precision Oncology Space (May 12, 2015)
[Figure: mind map; surviving node labels: Reusable, BD2K]
Semantics Related Opportunities
New Genomic Data Sharing Policy
• The new Genomic Data Sharing (GDS) Policy was released
in draft form in September 2013 (NOT-OD-13-119)
• The draft policy was published in the Federal Register for a
60-day public comment period
• In November 2013, public comments were collected by the
Office of Science Policy. The policy was modified with feedback
from the IC Directors and the NIH GWAS data sharing governance
committees (TSDS, PPDM, SOC)
• The final Genomic Data Sharing (GDS) Policy was
released on August 27, 2014 (NOT-OD-14-124)
Trans-NCI Data Sharing WG
• Responsible for the activities necessary for the Institute to
implement and maintain the GDS policy framework
• Develop a plan & recommend any resources needed
• Propose governance needs
• Develop and disseminate materials for implementation
• Focus Areas
• Data Standards: Define baseline expectations,
including data types & timelines
• Process: Develop processes and resources to facilitate
implementation and compliance.
• Resources: Consider all resource needs to implement
and oversee policy expectations.
• Governance: Consider governance needs and
procedures for adjudication of implementation issues,
and oversight.
Extending Genomic Data Sharing Policies
(GWAS Policy compared with GDS Policy)
• Scope
– GWAS Policy: Applies to human GWAS data
– GDS Policy: Applies to all genomic data types, human and non-human
• Consent Standard, Existing Collections*
– GWAS Policy: If research consent exists, IRB reviews for consistency. If no research consent exists, data may still be submitted to NIH databases.
– GDS Policy: Same
• Consent Standard, Future Collections**
– GWAS Policy: N/A
– GDS Policy: Samples or cell lines should be consented for research use and broad data sharing. Exceptions can be requested.
• Data Submission
– GWAS Policy: Data submitted as soon as quality control procedures are completed
– GDS Policy: Timelines vary by data type, but generally as soon as quality control procedures are complete
• Data Release
– GWAS Policy: Immediate data release. 12 month publication embargo
– GDS Policy: 6 month deferral of data release. No publication embargo
*Before the effective date of the GDS policy
**After the effective date of the GDS policy
New NCI MATCH TRIAL
"Precision Medicine uses genetic
information from a person’s
cancer to determine a patient’s
treatment with a treatment
targeted to that particular genetic
abnormality."
NCI MATCH trial
• Question: Can molecular markers predict
response to targeted therapies in patients
with advanced cancer resistant to standard
treatment?
• Biopsies from tumors from up to 3,000 patients to
undergo DNA/RNA extraction; assay workflow to
identify actionable mutations.
• ECOG-ACRIN leading study with NCI; Multiple arms,
matching particular molecular profile to specific
available drugs.
• Objectives: Assess response and time to progression
based on tumor profile, regardless of tumor origin.
TCGA History
• About three years post-Human Genome
Project – Large scale tumor profiling in a
systematic way.
• Initiated in 2005, piloted in 2006, extended in 2009
• Collaboration of NHGRI and NCI to
examine GBM, Lung and Ovarian cancer
using genomic techniques in 2006.
• Expanded to 20+ tumor types
TCGA Drivers
• Provide high quality reference sets for 20+
tissue types
• Provide a platform for systems biology and
hypothesis generation
• Provide a test bed for understanding the real
world implications of consent and data access
policies on genomic and clinical data.
• Data collection is now over, but there are MANY users and
many pan-cancer and other papers (>2700).
• Kinds of questions we want to ask and CAN
ask have changed and grown.
Genomic Data Commons
(GDC)
• In transition from The Cancer Genome
Atlas (TCGA) to GDC, a Commons to
host TCGA, TARGET and other future
genomic data sets
• University of Chicago and NCI
collaborating to initiate the Genomic
Data Commons (GDC) (Robert Grossman, Director)
• To enable any researcher to test their
ideas, to bring their analytics to the
data.
NCI Cancer Genomics Data Commons
[Figure: diagram; surviving labels: "Cancer information donor", "Genomic + clinical data", "GDC", "NCI Genomics Data Commons"]
NCI Genomic Data Commons
• Unified repository for cancer genomics data
– Accept from both NCI Center for Cancer Genomics
(CCG) and external projects
– Including submissions from small laboratories
• Unifying repository for cancer genomics data
– Perform reproducible, consistent bioinformatics
pipelines to generate standard higher-level data (e.g.,
tumor variant calls)
– Pipelines designed and updated with community input
to represent the best practices of the field
• The availability of genomic data will make it
possible for researchers to better classify disease.
GDC Context
From: Mark Jensen, GDC
GDC ConOps
From: Mark Jensen, GDC
Clinical Data at GDC
• Key issues:
– Low barriers to data submission
• Minimal number of required data elements
– Ongoing curation and semantic assignment
• Balance acceptance of submitter-provided semantic
information with GDC curation
– Provide cross-project searches over clinical data
elements to filter genomic data
• Allow users to acquire data intuitively, but also provide
semantic sources and IDs as available
• Ideal:
– Expose clinical data intuitively, but manage with
rigorous semantic information
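
The clinical data goals above (a minimal set of required elements, curation that adds semantic IDs while keeping the submitter's wording, and cross-project search over clinical elements to filter genomic data) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the GDC's implementation: the required element names, the curation table, the "EX:" concept IDs, and the sample records are all hypothetical.

```python
# Hypothetical minimal set of required clinical data elements.
REQUIRED_ELEMENTS = {"case_id", "project", "diagnosis"}

# Hypothetical curation table: submitter wording -> (concept ID, preferred label).
# The "EX:" IDs are placeholders, not codes from any real terminology.
CURATED_TERMS = {
    "gbm": ("EX:0001", "Glioblastoma"),
    "glioblastoma": ("EX:0001", "Glioblastoma"),
    "glioblastoma multiforme": ("EX:0001", "Glioblastoma"),
    "ovarian cancer": ("EX:0002", "Ovarian Carcinoma"),
}


def missing_elements(record):
    """Return the required elements absent from a submitted record, sorted."""
    return sorted(REQUIRED_ELEMENTS - set(record))


def annotate(record):
    """Return a copy of the record with a semantic ID attached when known.

    The submitter's original wording is kept. Unknown terms get a
    concept_id of None so that a curator can review them later.
    """
    concept = CURATED_TERMS.get(record["diagnosis"].strip().lower())
    annotated = dict(record)
    annotated["concept_id"] = concept[0] if concept else None
    annotated["concept_label"] = concept[1] if concept else None
    return annotated


def find_files(records, concept_label):
    """Cross-project search: file names for all cases with a given label."""
    return [
        file_name
        for rec in records
        if rec["concept_label"] == concept_label
        for file_name in rec.get("files", [])
    ]


# Hypothetical submissions from two projects using different wording.
submissions = [
    {"case_id": "A-1", "project": "P1", "diagnosis": "GBM",
     "files": ["a1.bam"]},
    {"case_id": "B-7", "project": "P2", "diagnosis": "Glioblastoma multiforme",
     "files": ["b7.vcf"]},
    {"case_id": "B-8", "project": "P2", "diagnosis": "Ovarian cancer",
     "files": ["b8.vcf"]},
]

accepted = [annotate(rec) for rec in submissions if not missing_elements(rec)]
glioblastoma_files = find_files(accepted, "Glioblastoma")
print(glioblastoma_files)
```

In this sketch the user searches by an intuitive label, while each record still carries a concept ID and the submitter's original term, so the semantic source stays available to anyone who needs it.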
Cancer Genome Cloud Pilots
Three pilots, initiated Fall 2014, to be public
"cancer knowledge clouds" in which data
repositories would be co-located with
advanced computing resources.
• Broad Institute, UCSC, UC Berkeley
• ISB-led team, Google, SRA
• Seven Bridges Genomics
• Begin piloting components and gathering
feedback required by Jan 2016
Cancer Genome Cloud Pilots
• Goals:
– democratize access to large-scale data
repositories and computational infrastructure
– co-locate data and compute to minimize
unnecessary data transfer
– integrate public and private datasets
– allow web-based exploration of hosted data
– transform and accelerate collaborative cancer
research
Cancer Genome Cloud Pilots
• People can register at any or all of these
sites, if they are interested in getting involved:
• Seven Bridges
cancergenomicscloud.org
• Broad
Firecloud.org
• Institute for Systems Biology
cgc.systemsbiology.net
Precision Medicine Opportunities
involve Semantics
The era of precision medicine and precision oncology is predicated
on the integration of research, care, and molecular medicine and
the availability of data for modeling, risk analysis, and optimal
care
Warren Kibbe
The promise of precision medicine will only be fully realized if
the research community can adapt its clinical trials
methodology to study molecularly characterized tumors
instead of the traditional histologic classification.
» Abrams et al, National Cancer Institute's Precision Medicine Initiatives
for the New National Clinical Trials Network, 2014
Semantic Opportunities:
Heard from this meeting and beyond
• Imaging
– Pathology Imaging ontology gaps - terms/formal defs to characterize histopathology
images and algorithms.
– NLP effort to automate image annotation with ontologies to create metadata for large
image collections by training classifiers.
– QHIO - terms/relationships for the whole lifecycle of images
• Proteomics, Chris Kinsinger, CPTAC – better clinical biospecimen annotation
• Cancer Phenotypes
– Cohorts/ finding patients
– Cancer Pathology Protocol changes
• Modeling tumor microenvironments – integration of multiscale cancer
data – effort to model cancer state as an ecological problem
• Cancer classification
• Data Needs vs Ontological Classification
• Pan Cancer analyses can be improved using DO (Hive)
Semantic Opportunities (2):
Heard from this meeting and beyond
• Tools/ Resources/Standards
– Getting usable, effective, efficient software into
people's hands will increase uptake of semantically
well described metadata, terms and ontologies, and
better integration of metadata and terminology
– Integrated use of a variety of ontologies
– Ways to manage research and clinical data streams,
bridge
– Tools to help harmonize and use metadata and
terminology
– Provenance – use of checklists early on. Bottom up.
– Research Commons
Thank you
Sherri de Coronado
[email protected]
Thanks to content contributors:
Gilberto Fragoso, Mark Jensen, Warren Kibbe, Juli Klemm, Elizabeth
Gillanders and others.