big data - ndmctsgh.edu.tw

Integrated Biomedical Informatics for Translational Clinical Research and Precision Medicine
Yang C. Fann, Ph.D.
Director, IT and Bioinformatics Program
NINDS/NIH/HHS
Currently a Visiting Professor at NYMU and TMU
Outline
• Challenges of Translational Research
• NIH Integrated Biomedical Informatics
Infrastructure for Translational Research
• NIH Biomedical Informatics Initiatives:
• BIG Data to Knowledge (BD2K)
• Precision Medicine Initiatives
• From BIG DATA Analytics to Future “Smart”
Medical Care…
• Summary
From Translational Research to Health Care
Bedside
Bench
Bedside
From curative
to predictive, preventive and preemptive
to personalized medicine
From 2004 NIH Roadmap Projects…
Why Is Translational Research Challenging?
• Data sharing (culture change)
• Large volume and complexity of heterogeneous data
(geno-pheno data, hypotheses, experimental
designs and cohorts… quality of datasets)
• Lack of common data standards
• Availability of analytical tools for BIG Data
discovery
• Informatics (data) scientists with domain knowledge
• Informatics infrastructure for clinical research,
collaboration and discovery (research networks,
including international ones)
Biomedical Informatics!
Evolution of Biomedical Informatics
Study Management
• Data Collection
• Clinical Trials
• Research Network
Interoperability
• Integration
• Sharing
• Repository
Innovation Services
• “Intelligence”
• BIG Data Analytics
• Discovery
NIH-Funded Biomedical Informatics Projects
Current active projects (2016)
• A total of 339 projects with over $1 billion in funding
• Of those 339 projects, 106 are “database” related,
with a total of about $400 million
http://projectreporter.nih.gov/reporter.cfm
NIH Clinical Center
World’s largest biomedical research complex
EHR – Clinical Research Information System (CRIS)
NIH Clinical Center
Research Activities 2010-14
What is BTRIS DW?
(Biomedical Translational Research Information System)
[Diagram labels: Institute System, Personal System, CRIS/MIS, Lab System, BTRIS]
What Is in BTRIS?
CRIS (2004 - )
– Alerts
– Allergies
– Anatomic Pathology
– Blood Bank
– Clinical Documents (unstructured)
– Clinical Documents (structured)
– Demographics
– Diagnoses/Problems
– Echocardiograms
– Electrocardiograms
– Lab Tests and Panels (Location)
– Medication Administration
– Medication Orders
– Microbiology (Links to Mass Spec)
– PDF Documents
– Radiology Reports
– Radiology Images
– Vital Signs
MIS (1976-2004)
– Blood Bank
– Demographics
– Lab Tests and Panels
– Medications
– Microbiology
– Radiology Reports
– Vital Signs
Other institutes
– NIAID (CRIMSON: Labs, Meds, Problems)
– NIAAA (Assessments)
– NCI (Labmatrix, C3D, Biospecimens)
– NICHD (CTDB: Forms)
– NHGRI (Labmatrix, Exome Data)
Genomics Data (2014 - )
– NIAAA (SNP array data)
– NHGRI (exome sequencing data)
– NIAMS (exome sequencing data)
– NIDDK (genotyping data)
BTRIS Data Statistics 2015
Next Frontier of Biomedical Informatics
Big Data to Knowledge (BD2K) Initiatives
Four Scientific Areas
• Facilitating Broad Use of Biomedical
Big Data (Data Discovery Index)
• Developing and Disseminating
Analysis Methods and Software
• Enhancing Training for Biomedical Big
Data (Data Scientist)
• Establishing Centers of Excellence for
Biomedical Big Data
https://datascience.nih.gov/
Biomedical Informatics Infrastructure: The Commons
[Diagram: The Why: Data Sharing Plans. The How: The Commons. The End Game: Scientific Discovery, Knowledge. Other labels: Data; The Long Tail; NIH Awardees; Government; Private Sector; Core Facilities/HS Centers; Rest of Academia; Clinical/Patient; Software; Index; Standards; Data Discovery Index; BD2K Centers; Usability; Quality; Security/Privacy; Metrics/Standards; Sustainable Storage; Cloud, Research Objects, Business Models]
Data Sharing
NIH Genomic and Human Data Sharing Policy
What is the Precision Medicine Initiative?
To enable a new era of medicine through research,
technology, and policies that empower patients,
researchers, and providers to work together toward
development of individualized treatments.
Precision Medicine
vs.
Personalized Medicine
Why Is Personalized Medicine Progress Slow?
• Incomplete knowledge of disease causation in
individuals and the factors that dictate their
variable responses to therapy.
• Preventing the development of disease would require
the ability to recognize individuals at high risk of
developing specific disorders and the development
of new interventions that can prevent subsequent
development of overt disease.
• Diseases of high population burden that currently
do not have specific predictive biomarkers include
Alzheimer’s disease and type II diabetes mellitus.
The NIH’s Perspective in Precision Medicine
Precision medicine is an emerging approach for
disease prevention and treatment that takes into
account people’s individual variations in genes,
environment, and lifestyle.
The Precision Medicine Initiative
will generate the scientific
evidence needed to move the
concept of precision medicine
into clinical practice.
Precision Medicine is a new pathway to Personalized Medicine!
http://www.nejm.org/doi/full/10.1056/NEJMp1500523?query=featured_home&
PMI: Short-term Goals
“Precision oncology”: targeting unexplained drug resistance,
genomic heterogeneity of tumors, insufficient means for monitoring
responses and tumor recurrence, and limited knowledge about the
use of drug combinations
PMI: Long-term Goals
Create a research cohort of > 1 million American
volunteers who will share genetic data, biological
samples, and diet/lifestyle information, all linked to
their electronic health records (follow up for 10
years)…
Research based upon the cohort data will:
• Advance pharmacogenomics, the right drug for the right
patient at the right dose
• Identify new targets for treatment and prevention
• Test whether mobile devices can encourage healthy
behaviors
• Lay scientific foundation for precision medicine for many
diseases
Why Do PMI Now?
Ten years ago vs. now (2014, most recent data):
• Cost of sequencing a human genome: $22,000,000 vs. $1,000 - $5,000
• Amount of time to sequence a human genome: 2 years vs. <1 day
• Number of smart phones in the United States: 1 million (<2%) vs. 160 million (58%)
• EHR adoption (% hospitals): 20-30% vs. >90%
• Computing power: n vs. n x 16
Deep Learning
PMI – Information flow
[Diagram labels: Direct Volunteers; HPO Volunteers; Consent; Self-report Measures; mHealth Data; EHR Data; Baseline Exam; Biological Samples]
Scientific Opportunities in the PMI-Cohort
• Discover new biomarkers predictive of future disease risk
• Discover determinants of individual variation in response to
therapeutics
• Determine quantitative risk estimates in the population by
integrating environmental exposures, genetic factors, and
gene-environment interactions
• Integrate mHealth and sensor technologies
• Determine clinical impact of loss-of-function mutations on
clinical outcome
• Discover new classifications and relationships among diseases
• Enable targeted clinical trials of subjects with rich clinical data
• Make ‘big data’ broadly available to investigators
BIG Data
Personalized Medicine
NIH Funding on Big Data Projects
[Bar chart: funding in millions of dollars ($0 to $800) by year, 2010 to 2015]
Over 75% are projects involving biomedical informatics
NIH Big Data Funding Categories (2014-15)
Current Taiwan BIG Data Research in Medicine
• NHRI Data Warehouse
• Insurance claim data (limited datasets!)
• Only useful for certain types of research
• Clinical Research Data (silos?)
• Cancer, or Disease-based Registries
• Trials, Consortia, Research Networks, CoE, etc.
• Health Care Information Hub
• EHR Integration (history, medicine, labs, imaging,
etc.) and data warehouse
• Biospecimen (Hospitals and Taiwan Biobanks)
Next Frontier of BIG Data Research?
TMU EHR DW
BIG Data Analytics from EHR
• Are people who have a medical history of X and Y
more likely to develop or be diagnosed with Z?
(association analysis)
• What symptoms and tests give the diagnosis of
disorder A? And which groups are likely to
have it? (regression and classification analysis)
• What relationships (e.g. similarity or data elements) exist among
stroke, PTSD and TBI patients?
(social network/correlation analysis)
• Based on the patient’s past medical history (with
continuing tests), is s/he likely to develop disorder
P, and how can it be prevented? (machine learning)
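The first question above (association analysis) can be sketched as an odds ratio over a 2x2 table. This is a minimal illustration, not the method used in the talk: the codes "X", "Y", "Z", the toy patient list, and the function name `odds_ratio` are all hypothetical, and the sketch ignores the order in which diagnoses occurred.

```python
from math import exp, log, sqrt


def odds_ratio(records, exposure, outcome):
    """Return (odds ratio, 95% CI low, 95% CI high) for outcome given exposure.

    records  : iterable of sets of diagnosis codes, one set per patient
    exposure : set of codes that must ALL be present (e.g. {"X", "Y"})
    outcome  : a single code (e.g. "Z")
    """
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for codes in records:
        exposed = exposure <= codes
        has_outcome = outcome in codes
        if exposed and has_outcome:
            a += 1
        elif exposed:
            b += 1
        elif has_outcome:
            c += 1
        else:
            d += 1
    # Haldane-Anscombe correction: avoid division by zero in sparse tables.
    if 0 in (a, b, c, d):
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    ratio = (a * d) / (b * c)
    # Standard error of log(OR), used for a Wald 95% confidence interval.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ratio, exp(log(ratio) - 1.96 * se), exp(log(ratio) + 1.96 * se)


# Hypothetical toy cohort: 10 patients with both X and Y, 30 without.
patients = (
    [{"X", "Y", "Z"}] * 6 + [{"X", "Y"}] * 4
    + [{"Z"}] * 3 + [{"X", "Z"}] * 2
    + [{"X"}] * 10 + [set()] * 15
)
ratio, low, high = odds_ratio(patients, {"X", "Y"}, "Z")
print(f"OR = {ratio:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

An odds ratio above 1 whose interval excludes 1 suggests an association, but real EHR analyses would also need to handle confounders, timing of diagnoses, and multiple testing.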
How about patient privacy and confidentiality?
ICD-10 diagnoses from the Danish National
Patient Registry (6.2 million patients)
Jensen et al, Nat Commun. 2014 Jun 24; 5: 4022.
Cardiovascular Disease Trajectory Clusters
De-Identification via GUID
Informatics for Translational Clinical Research
[Diagram: workflow stages shown: protocol authoring; protocol review and approval; patient recruitment; data collection & monitoring; study management; data analysis & report; FDA approval & publication; data sharing. Supporting systems shown: PTMS, IBIS, CSIS, STAMS]
IBIS System Modules and Tools
Clinical Research
• Defining electronic case
report forms
• Scheduling and collecting
clinical data
• Exporting, analyzing and
reporting on collected
data.
Study Management and
Data Submission
• Defining and managing
study information and
access
• Contributing, uploading, and
storing the research data
• Defining federated data stores
Defining and Validating Data
• Creating, managing, and
searching data elements and
form structures
• Validating research data against
the defined validation rules
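One way to picture validating research data against defined data elements is sketched below. It is an illustration only: the slides do not describe how IBIS stores its rules, so the `DataElement` class, the three example elements, and the `validate` function are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DataElement:
    """A hypothetical data element definition with one validation rule."""
    name: str
    required: bool
    check: Callable[[Any], bool]
    message: str


# Hypothetical element definitions; real dictionaries would be far larger.
ELEMENTS = [
    DataElement("age_years", True,
                lambda v: isinstance(v, int) and 0 <= v <= 120,
                "must be an integer from 0 to 120"),
    DataElement("sex", True,
                lambda v: v in {"F", "M", "Unknown"},
                "must be one of F, M, Unknown"),
    DataElement("systolic_bp_mmhg", False,
                lambda v: isinstance(v, (int, float)) and 50 <= v <= 300,
                "must be a number from 50 to 300"),
]


def validate(record, elements=ELEMENTS):
    """Return a list of error strings; an empty list means the record passes."""
    errors = []
    for el in elements:
        value = record.get(el.name)
        if value is None:
            if el.required:
                errors.append(f"{el.name}: missing required value")
            continue
        if not el.check(value):
            errors.append(f"{el.name}: {el.message}")
    # Flag fields that are not part of the defined form structure.
    defined = {el.name for el in elements}
    for name in sorted(set(record) - defined):
        errors.append(f"{name}: not a defined data element")
    return errors


print(validate({"age_years": 140, "sex": "F", "weight": 70}))
```

Keeping the rules as data, separate from the checking loop, is what lets the same elements be reused across forms and studies.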
Querying, Reporting
and Exporting Data
• Including locally collected
research data
• Including research data
from defined, federated
sites.
Global Unique Identifier System
• Allows researchers to share data
specific to study participants
• Correlates participants across
studies without exposing
personally identifiable information
(PII)
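The idea of correlating participants without exposing PII can be sketched with a keyed one-way hash. This is an assumption-laden illustration, not the actual GUID algorithm, which the slides do not describe: the choice of input fields, the `ID-` prefix, and the function names `normalize` and `pseudonymous_id` are hypothetical.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Uppercase and keep only letters and digits so spelling variants match."""
    text = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in text if ch.isalnum()).upper()


def pseudonymous_id(first_name, last_name, birth_date, birth_city,
                    secret_key: bytes) -> str:
    """Derive a stable identifier from PII without storing the PII itself.

    The same participant yields the same identifier at every site that
    shares secret_key; the identifier cannot be reversed to recover the PII
    without the key and a guess at the inputs.
    """
    fields = [normalize(v) for v in
              (first_name, last_name, birth_date, birth_city)]
    message = "|".join(fields).encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return "ID-" + digest[:16].upper()


# Hypothetical usage: two sites enter the same person slightly differently.
key = b"shared-secret-held-by-the-id-service"
site_a = pseudonymous_id("Mary", "O'Neil", "1970-01-31", "Taipei", key)
site_b = pseudonymous_id(" mary ", "ONEIL", "1970/01/31", "taipei", key)
print(site_a == site_b)
```

A keyed hash (HMAC) is used rather than a plain hash because names and birth dates are easy to enumerate; without the key, an attacker cannot precompute identifiers for known people.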
MIPAV Imaging Tool
• Image submission tool
• Image analysis tools
• 3D Image visualization
User Account Management
• Creating, approving, and
managing user accounts
• Managing access controls,
roles & permissions
• Single sign-on
IBIS Web UI
Repository Query
Neuro-Grid BIG Data Cloud
Informatics-driven Translational Research
Stroke DR, PD DR, Alzheimer DR, Neuro DR, TBI DR
From Biomedical Informatics to Future Medical Care
“Information-Driven” ?
BIG Data Analytics?
Internet of Things (IoT)?
Smart Technology (sensor/control, etc.)?
Information Driven and Value added…
e.g. What can I use the BIG Data for?
• My (and my loved ones’) health status
• Long-term care
• Lifestyle coaching
Already happening!
Connected Health
Consumer technology driven Health Innovations
Another BIG Data contributor!
Internet of Things (IoT)
“the network of physical objects— devices,
vehicles, buildings and other items —
embedded with electronics, software, sensors,
and network connectivity that enables these
objects to collect and exchange data!”
What can we do with it?
• Prevent medical errors
• Provide proactive patient care through active
monitoring and feedback
• Lifestyle coaching and well-being
Data (information) -> Services
Medical Errors – Human!
No. 3 killer in the U.S.!
Now - A.I. @2016
“1997: IBM Deep Blue defeated the world chess champion”
What’s Next for AlphaGo?
e.g. Real-time alert of a patient; physician's task manager
“Smart” Technology, “Smart” Hospital -> Better Health Care!
“Google DeepMind granted access to 1.6m NHS
patients' confidential records, including the past 5
years of historical records plus real-time data
on hospital visits, test results, diagnoses, and
more…”
“Streams” will help detect cases of acute kidney injury
in real time. Acute kidney injury is a factor in as many as 20
percent of emergency hospital admissions; saving
over 100K people per year!
Future Medicine – Are we ready for it?
Now
Upcoming
Future?
?
EHR DW
Data Repository
BIG Data Analytics
EHR A.I.
“Smart” Doctor
“Smart” Hospital
Virtual Hospital & Doctor?
Innovator? Active Participant? or, Follower?
Future “Smart” Medical Care?
Informatics
+ Services
Live Well, Healthy & Happy
Challenges: Cost, Standards and People adoption!
(Security & Confidentiality)
Summary
• Biomedical big data is not only massive and
complex but also diverse and not well organized,
which presents both challenges and opportunities
• Both BD2K and Precision Medicine are new
data-driven approaches to accelerate biomedical
discovery and pave a pathway toward future
personalized medicine
• Building a sustainable informatics infrastructure is
the key to biomedical discovery as well as the
foundation for future smart hospitals and intelligent
health care!
Questions?
Email: [email protected]