Biomedical Informatics and
Clinical NLP in Translational
Science Research
Piet C. de Groen, M.D.
Overview - Examples
Patient-specific research – N=1 study
Understanding a disease
Finding the right MD, diagnosis and
treatment
Renal Transplant patient
May, 2005
Hepatobiliary Clinic Consultation
Abnormal liver tests – using Lipitor™
Diarrhea and weight loss
Challenge
Very complex medical history
Nobody understands the case
HUGE history with hundreds of notes
Patient January 16, 2006
Total weight of printed pages presented for review:
5 lbs.
Patient January 16, 2006
Total number of X-rays presented for review:
16,902
Questions
What exactly is the patient’s problem?
– Are liver tests and weight loss due to Lipitor?
– When did she use Lipitor?
– What was the weight on what date?
Impossible to review all notes!
– Which notes are relevant to current symptoms?
– Which notes have weights and drug
information?
What I need
I need to see trends over time
– Weight
– Lipitor use
– Effects of Lipitor on lipids and liver tests
But I cannot see trends over time
– EMR does not have structured data for weight or
Lipitor use
– EMR only allows for display of laboratory test results
in very large tables or simple graphs
Data Warehouse to the
Rescue!
Demographics
– MC # = xx-xxx-xxx
Clinical Notes
– Patient Vitals
– Weight exists
Result
– 243 notes: 43 had weight
[Figure: weight in kg (40 to 70) by year, 1996 to 2005, annotated with Start Dialysis, Transplant, and New Problem.]
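The weight trend comes from free-text notes rather than a structured field. A minimal sketch of that kind of extraction, assuming notes are available as (date, text) pairs and that weights are written in a form like "Weight: 62.5 kg". The `extract_weights` helper, the regular expression, and the sample notes are hypothetical illustrations, not the warehouse's actual query:

```python
import re
from datetime import date

# Hypothetical pattern for phrases such as "Weight: 62.5 kg" or "weight 58 kg".
WEIGHT_RE = re.compile(r"\bweight\b\D{0,10}(\d{2,3}(?:\.\d+)?)\s*kg\b", re.IGNORECASE)

def extract_weights(notes):
    """Return (date, kg) pairs for notes that state a weight, oldest first."""
    series = []
    for note_date, text in notes:
        match = WEIGHT_RE.search(text)
        if match:
            series.append((note_date, float(match.group(1))))
    return sorted(series)

# Made-up sample notes, for illustration only.
sample_notes = [
    (date(2004, 3, 2), "Vitals: weight 52 kg. Lipitor discontinued."),
    (date(1997, 6, 10), "Patient Vitals. Weight: 66.5 kg, BP 130/80."),
    (date(2001, 1, 15), "Follow-up after transplant; no vitals recorded."),
]
weight_series = extract_weights(sample_notes)
```

Notes without a stated weight are simply skipped, which mirrors the slide's result that only some of the notes carried a weight.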
What happened to
Cholesterol?
She was on Lipitor, but:
– When was it discontinued?
– Did it do anything to her lipid levels?
NLP to the rescue!
Sort 33 identified Clinical Notes by date
First note is from 1997
– Lipitor is highlighted in the note
– …Dr. X recommended discontinuation of Pravachol
and initiation of Lipitor … have written a prescription
for Lipitor …
Last note is from 2005
– … Lipitor was discontinued in 2004 …
– March 2004 note confirms discontinuation
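The search described here (find the notes that mention a drug, sort them by date, highlight the mention) can be sketched as follows. This is a simple keyword match standing in for the NLP step, under the assumption that notes are (date, text) pairs; `notes_mentioning` and `highlight` are hypothetical helpers, and the day and month in the sample dates are placeholders because the slides give only the years:

```python
import re
from datetime import date

def notes_mentioning(notes, term):
    """Return the notes whose text contains `term` as a whole word, oldest first."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    hits = [(d, text) for d, text in notes if pattern.search(text)]
    return sorted(hits, key=lambda item: item[0])

def highlight(text, term):
    """Mark each mention of `term` with surrounding asterisks."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"*{m.group(0)}*", text)

# Note text is quoted from the slides; day and month are placeholders.
sample_notes = [
    (date(2005, 1, 1), "Lipitor was discontinued in 2004"),
    (date(2001, 1, 1), "Creatinine stable after transplant"),
    (date(1997, 1, 1), "recommended discontinuation of Pravachol and initiation of Lipitor"),
]
lipitor_notes = notes_mentioning(sample_notes, "Lipitor")
first_note, last_note = lipitor_notes[0], lipitor_notes[-1]
```

Reading the first and last hits is enough to bracket the period of use, which is how the slides arrive at a start in 1997 and a discontinuation in 2004.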
Warehouse to the Rescue!
Demographics
– MC # = xx-xxx-xxx
Tests
– Cholesterol exists
Clinical Notes
– “Lipitor”
Result
– 22 cholesterol levels
– 243 notes: 33 mentioned “Lipitor”
[Figure: cholesterol in mg/dL (0 to 350) by year, 1993 to 2006, with Lipitor marked.]
Recommendations
72 hour stool fat on 100 gram fat diet
– 689 gram, 23 gram fat/day (2-7 Normal)
EGD/EUS with biopsies and aspirate
– Esophagitis - ? Candida – biopsy negative
– Duodenal diverticula, normal pancreas
– Duodenal biopsy normal
– Aerobes: Gram-negative bacillus > 100,000 cfu/mL
– Anaerobes: Bacteroides fragilis > 10,000 cfu/mL
– Yeast 1,000-10,000 cfu/mL
Small Bowel X-ray
– Numerous diverticula
Understanding a disease
Hepatocellular Cancer in
Obesity
Spring 2006
Based on simple queries of MCLSS
• For NASH the ICD-9 code 571.8 was
used; this code may include other
diagnoses, but the vast majority are NASH
• For Primary Liver Cancer the ICD-9 codes
155.0 and 155.1 were used
• For Obesity ICD-9 code 278.0 was used,
or the Diagnosis section of Clinical Notes
• BMI was retrieved from Clinical Notes;
the maximum value during the lifetime was used
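A minimal sketch of this kind of simple query, assuming each patient record carries a list of ICD-9 codes and a list of BMI values pulled from notes. The record layout, the `select_cohort` helper, and the sample patients are hypothetical; only the liver cancer codes and the use of the lifetime maximum BMI come from the slides:

```python
def max_bmi(bmi_values):
    """Lifetime maximum BMI, or None when no BMI was recorded."""
    return max(bmi_values) if bmi_values else None

def select_cohort(patients, diagnosis_codes, bmi_threshold=30):
    """IDs of patients with any of `diagnosis_codes` whose maximum BMI exceeds the threshold."""
    cohort = []
    for p in patients:
        has_dx = bool(diagnosis_codes & set(p["icd9"]))
        peak = max_bmi(p["bmi"])
        if has_dx and peak is not None and peak > bmi_threshold:
            cohort.append(p["id"])
    return cohort

LIVER_CANCER_CODES = {"155.0", "155.1"}  # ICD-9 codes named in the slides

# Made-up patients, for illustration only.
patients = [
    {"id": "A", "icd9": ["155.0", "278.0"], "bmi": [27.5, 33.1, 31.0]},
    {"id": "B", "icd9": ["155.1"], "bmi": [24.0, 25.2]},
    {"id": "C", "icd9": ["278.0"], "bmi": [35.4]},
    {"id": "D", "icd9": ["155.0"], "bmi": []},
]
obese_liver_cancer = select_cohort(patients, LIVER_CANCER_CODES)
```

Patients with no recorded BMI are excluded rather than guessed at; a real query would also need the note-based obesity diagnosis that the slides mention.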
Primary Liver Cancer
NASH Cases with BMI>30
[Figure: number of cases per year (0 to 40), 1992 to 2006, plotted separately for males and females.]
Cancers with Increasing Incidence
2012 report US: 1999 through 2008
CA: A Cancer Journal for Clinicians
Volume 62, Issue 2, pages 118-128, 4 JAN 2012 DOI: 10.3322/caac.20141
http://onlinelibrary.wiley.com/doi/10.3322/caac.20141/full#fig2
Finding the right MD,
diagnosis and treatment
Interval Colorectal Cancer
Time Line
Example of Interval Colorectal Cancer
[Figure: time line over years 1 to 5 with rows for Pathology, Endoscopy, and Diagnoses; entries are marked as Benign Colon, Colon Cancer, or Non-Colon Disease, and an interval of < 3 years is indicated.]
[Flow chart: case selection]
Pathology data (Colon Cancer 1993-2006)
– 4,203,857 specimens
– Part description = “COL/RECT” AND Valid MCN: 238,177 specimens
– Diagnosis_code = one of 50 identified cancer diagnosis codes: 19,259 specimens
– Unique? (One specimen may have multiple diagnosis codes): 13,477 specimens (10,136 patients)
– Remove patients with Research Authorization = ‘No’
Endoscopy data (Colonoscopy 1992-2004)
– 325,370 procedures
Linked data
– Patients with CC diagnosis and C procedure: 2,692 patients
– Extract all C procedures, the date and other features: 4,743 procedures (date, other features)
– Compare the CC diagnosis and C dates
– Missed Lesions (anatomic location, tumor size, other characteristics)
Methods
Pathology = Colorectal Cancer
Negative History
– No lesions at colonoscopy: Truly Missed, Probably Missed
– Lesions at colonoscopy: Seen, removed; Seen, not removed
Colorectal Cancer History
– Recurrent, 2nd, 3rd cancer not prevented
[Figure: each category is shown on a time line of years 1 to 5.]
Results Summary
• Truly missed case
– 90 days to 3 years: 82
• Probably missed case
– 3 to 5 years: 54
• A lesion was seen
– removed <5 years: 95
– not removed <5 years: 8
• Local recurrence or 2nd, 3rd cancer: >44
Total: >283
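The categories in this summary can be expressed as a small classification rule. The sketch below is one reading of the slides, not the study's actual code: the function name, the flags, the handling of the exact boundaries (for example, whether a cancer at exactly 3 years counts as truly missed), and the choice to return `None` for anything outside the stated windows are all assumptions.

```python
from datetime import date

def classify_interval_cancer(colonoscopy_date, cancer_date, lesion_seen=False,
                             lesion_removed=False, prior_cancer=False):
    """Assign a colorectal cancer to one of the summary categories, or None."""
    days = (cancer_date - colonoscopy_date).days
    if days < 0:
        return None  # cancer diagnosed before the colonoscopy
    years = days / 365.25
    if prior_cancer:
        return "recurrent, 2nd, 3rd"
    if lesion_seen:
        if years < 5:
            return "seen, removed" if lesion_removed else "seen, not removed"
        return None
    if days >= 90 and years <= 3:
        return "truly missed"      # 90 days to 3 years after a negative exam
    if 3 < years <= 5:
        return "probably missed"   # 3 to 5 years after a negative exam
    return None

# Made-up dates, for illustration only.
exam = date(2000, 1, 10)
labels = [
    classify_interval_cancer(exam, date(2001, 6, 1)),
    classify_interval_cancer(exam, date(2004, 1, 10)),
    classify_interval_cancer(exam, date(2000, 2, 1)),
    classify_interval_cancer(exam, date(2002, 1, 10), lesion_seen=True, lesion_removed=True),
]
```

A cancer found within 90 days of the colonoscopy falls outside every window here, on the assumption that it was detected by that examination rather than missed.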
©Ralph A. Clevenger
Tumor Growth Curves
3 Months Doubling Time
[Figure: tumor size (mm, 0 to 100) versus time interval (days, 0 to 1800), with t = 3 yrs marked; series: Truly Missed, Probably Missed, Seen & Removed, and Recurrent, 2nd, 3rd.]
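A growth curve with a fixed doubling time can be sketched as simple exponential growth. The slides do not say whether the 3-month doubling time refers to volume or to diameter; the sketch below assumes it refers to volume and that the tumor is roughly spherical, so the diameter grows with the cube root of the volume. The function names and the 91-day value for 3 months are likewise assumptions.

```python
import math

def diameter_after(d0_mm, days, doubling_time_days=91.0):
    """Diameter after `days`, assuming the volume doubles every `doubling_time_days`."""
    volume_factor = 2.0 ** (days / doubling_time_days)
    return d0_mm * volume_factor ** (1.0 / 3.0)

def days_since_size(current_mm, earlier_mm, doubling_time_days=91.0):
    """Back-calculate how many days ago the tumor had diameter `earlier_mm`."""
    return 3.0 * doubling_time_days * math.log2(current_mm / earlier_mm)

# Three volume doublings (273 days) double the diameter.
grown = diameter_after(5.0, 273.0)
elapsed = days_since_size(40.0, 5.0)
```

Under these assumptions the back-calculation gives a rough estimate of how large a tumor was at the time of an earlier colonoscopy, which is the kind of reasoning the curves support.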
Numbers for each Endoscopist
[Figure: two panels by endoscopist, with axis ticks up to 46. Upper panel: number not detected (0 to 20), split into Truly Missed, Probably Missed, Seen & Removed, and Recurrent, 2nd, 3rd. Lower panel: number seen (0 to 100).]
Miss Rate for each Endoscopist
[Figure: % not detected (0 to 25) by endoscopist, with axis ticks up to 46, split into Truly Missed, Probably Missed, and Seen & Removed.]
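The slides do not define how the percentage is computed. One plausible reading, given the preceding counts of cancers not detected and cancers seen, is the share of all cancers that were not detected; the sketch below uses that assumed definition, with a hypothetical `miss_rate_percent` helper and made-up counts.

```python
def miss_rate_percent(not_detected, seen):
    """Percent of cancers not detected, or None when there were no cancers."""
    total = not_detected + seen
    if total == 0:
        return None
    return 100.0 * not_detected / total

# Made-up (not detected, seen) counts per endoscopist, for illustration only.
counts = {1: (3, 47), 2: (0, 20), 3: (0, 0)}
rates = {e: miss_rate_percent(missed, seen) for e, (missed, seen) in counts.items()}
```

Endoscopists with very few cancers produce unstable percentages, so a rate of this kind is best read alongside the underlying counts.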
[Figure: two panels by endoscopist, each on a scale of 0 to 8. Upper panel: detection of cancers in previously seen patients (self). Lower panel: detection of cancers in patients seen by colleagues (others).]
Overview - Examples
Patient-specific research – N=1 study
Understanding a disease
Finding the right MD, diagnosis and
treatment