
Information Extraction from Clinical Reports
Wendy W. Chapman, PhD
University of Pittsburgh, Department of Biomedical Informatics
Background
• 1994: B.A. in Linguistics & Chinese
– University of Utah
• 2000: Ph.D. in Medical Informatics
– University of Utah
– Peter Haug
• 2003: Postdoctoral Fellowship
– University of Pittsburgh
– Bruce Buchanan
– Greg Cooper
• 2003-present: Faculty
– University of Pittsburgh
Problems Being Addressed with IE
My work
• Identifying patients with pneumonia from chest radiograph reports
• Understanding the components of a clearly written radiology report
– Train radiologists to dictate
• Classifying patients into syndrome categories from chief complaints (see the sketch after this list)
– Cough/SOB → Respiratory patient
• Characterizing the patient’s clinical state from ED reports
– Outbreak detection
– Outbreak investigation
• NLP-assisted ontology learning
• Locating pathology specimens
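A toy illustration of this kind of chief-complaint-to-syndrome mapping (the keyword list and function name are made up for this sketch, not the deployed classifier):

```python
# Toy sketch of keyword-based syndrome classification from a chief complaint;
# the keyword list and category names are illustrative only.
RESPIRATORY_KEYWORDS = {"cough", "sob", "shortness of breath", "dyspnea"}

def classify_chief_complaint(text):
    """Map a free-text chief complaint to a coarse syndrome category."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in RESPIRATORY_KEYWORDS):
        return "Respiratory"
    return "Other"

print(classify_chief_complaint("Cough/SOB"))  # Respiratory
```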
Problems Being Addressed with IE
Future areas of application I would like to work on
• Learning genotype-phenotype patterns for diseases
• Quality control
– Ensure physicians are complying with core measures required by Medicare
– Look for medical errors
• Automatically assigning billing codes
Where is the Field Now?
• Field mainly focused on sentence-level problems
– Identifying clinical conditions, therapies, medications
• A few systems encode information characterizing a condition
• Less work on discourse-level tasks—these are crucial for successful annotation of clinical texts
– Contextual features
• Negation
• Uncertainty
• Experiencer
• Temporality
• Finding validation
– Coreference resolution
– Inference
What Technologies Work?
IE of clinical concepts (80% “simple”, 20% difficult)
• Shallow parsing is quite effective
– MetaMap can identify many of the UMLS concepts in texts
• Concept-value pairs are important—regular expressions are quite effective (see the sketch below)
– “temperature 39C”
• Structure of the report is important
– Neck: no lymphadenopathy → cervical lymphadenopathy
– CXR: evidence of pneumonia → radiological evidence of pneumonia
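The concept-value extraction can be as simple as the sketch below (my own illustration, not the actual system's patterns; the pattern and function names are made up for this example):

```python
import re

# Illustrative only: a regular expression for temperature concept-value
# pairs such as "temperature 39C" or "temp 101.2 F".
TEMP_PATTERN = re.compile(
    r"\b(temp(?:erature)?)\s*[:=]?\s*(\d{2,3}(?:\.\d)?)\s*([CF])\b",
    re.IGNORECASE,
)

def extract_temperatures(text):
    """Return (value, unit) pairs found in the report text."""
    return [(float(m.group(2)), m.group(3).upper())
            for m in TEMP_PATTERN.finditer(text)]

print(extract_temperatures("Vitals: temperature 39C, pulse 92."))  # [(39.0, 'C')]
```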
Where Do We Need More Work?
• Non-contiguous information
– Needs a deep parse
• Inference
– “pain when press on left side of sternum” → non-pleuritic chest pain
• Semantic networks
– Opacity consistent with pneumonia → localized infiltrate
• Bayesian networks
What Technologies Work?
Contextual Features (80% “simple”, 20% difficult)
• Rules based on trigger terms work quite well
– NegEx
– ConText
Three Contextual Features
• Negation: Is the condition negated? (Negated, Affirmed)
• Experiencer: Did the patient experience the condition? (Yes, No)
• Temporality: When did the condition occur? (Historical, Recent, Hypothetical)
ConText Algorithm
• Four elements
– Trigger terms
– Pseudo-trigger terms
– Scope of the trigger term
– Termination terms
• Assign the appropriate value to contextual features for clinical conditions within the scope of trigger terms
• Scope usually extends to the end of the sentence or to a termination term
ConText: Determine Values for
Contextual Features
Based on the negation algorithm NegEx

Example: “Patient denies cough but complains of headache.”
– “denies” is a trigger term; its scope extends up to the termination term “but”
– Clinical condition: cough; Negation: Negated

Example: “No change in the patient’s chest pain.”
– “No change” is a pseudo-trigger term, so chest pain is not negated
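A minimal sketch of how trigger, pseudo-trigger, and termination terms interact on these example sentences (my own illustration with a toy term list, not the released NegEx/ConText code):

```python
# Toy illustration of NegEx/ConText-style negation scoping; the term lists
# here are made up for the example, not the algorithm's actual lexicons.
TRIGGER_TERMS = {"no", "denies", "denied"}
PSEUDO_TRIGGER_TERMS = {"no change"}        # look like triggers but open no scope
TERMINATION_TERMS = {"but", "however"}      # close an open scope

def is_negated(sentence, condition):
    """True if `condition` falls inside the scope of a negation trigger."""
    words = sentence.lower().replace(".", "").split()
    in_scope = False
    for i, word in enumerate(words):
        if " ".join(words[i:i + 2]) in PSEUDO_TRIGGER_TERMS:
            continue                        # pseudo-trigger: ignore, no scope opened
        if word in TERMINATION_TERMS:
            in_scope = False                # scope ends at a termination term
        elif word in TRIGGER_TERMS:
            in_scope = True                 # scope opens at a trigger term
        elif in_scope and word == condition.lower():
            return True
    return False

print(is_negated("Patient denies cough but complains of headache.", "cough"))     # True
print(is_negated("Patient denies cough but complains of headache.", "headache"))  # False
print(is_negated("No change in the patient's chest pain.", "pain"))               # False
```

ConText applies the same scope logic to the historical, hypothetical, and experiencer features, each with its own trigger and termination terms.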
Evaluation of ConText
• Test set
– 90 ED reports
• Reference standard
– Physician annotations with NLP-assisted review
– 55 conditions
– 3 contextual features
• Outcome measures
– Recall
– Precision
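For concreteness, these outcome measures reduce to simple ratios over true positives, false negatives, and false positives; a small illustrative helper (the counts below are hypothetical, not the study's data):

```python
# Illustrative outcome-measure helpers; the counts used are hypothetical.
def recall(tp, fn):
    """Share of reference-standard annotations the system found."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Share of system annotations that match the reference standard."""
    return tp / (tp + fp)

print(recall(tp=90, fn=10))    # 0.9
print(precision(tp=90, fp=5))  # ~0.947
```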
ConText’s Performance
1,620 annotations

Feature             Recall    Precision
Negation (773)        97%        97%
Historical (98)       67%        74%
Hypothetical (40)     83%        94%
Experiencer (8)      100%       100%
What is Needed for the 20%?
• More knowledge modeling
– Historicity often depends on the condition, not on explicit time triggers
– Coreference resolution needs fine-grained semantic knowledge
• Statistical techniques
– Integrating sentence-level and discourse-level information
• Annotated data sets
Why Haven’t We Implemented Many NLP Applications?
• Are we addressing the best application areas?
• Do we need more semi-automated applications?
Sharing Clinical Reports
University of Pittsburgh IRB
– Chief complaints are non-human-subjects data
• Can be shared openly as long as the patient cannot be triangulated
– Chief complaint + age + hospital → patient
– To use clinical reports, must apply De-ID software
• Once De-ID is applied, reports are considered de-identified
• caBIG project: can share de-identified reports
• I hope to establish a repository
National Sharing
• Maybe as some institutions begin sharing,
others will follow?
• Can the NLM help?
– Apply de-identification
– Encrypted hospital information
– Password protected
– Repository of texts and annotations
– Folk annotations?
Our Annotation Sets
Chief Complaints (40,000)
• Syndrome classifications
ED Reports
• Syndrome classification
• 55 respiratory-related clinical conditions
– Negation
– Experiencer
– Historical
– Hypothetical
6 report types
• All clinical conditions
– Contextual features
Annotation Evaluation
• Measuring annotators’
– Reliability
– Agreement
• More difficult if measuring agreement on what text was marked
– F-measure (see the sketch below)
• Measuring quality of the annotation schema
– Dependent variable = agreement between annotators
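When annotators mark free-text spans rather than choosing from a fixed set of items, agreement is often scored as the F-measure of one annotator against the other; a minimal sketch assuming exact span matching (partial-match credit is a common variant):

```python
# Minimal sketch: F-measure between two annotators' marked spans,
# treating annotator 1 as the reference; exact span matching assumed.
def f_measure(reference_spans, other_spans):
    ref, other = set(reference_spans), set(other_spans)
    if not ref or not other:
        return 0.0
    matches = len(ref & other)
    precision = matches / len(other)
    recall = matches / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical character-offset spans (start, end) from the same report:
annotator_1 = {(10, 15), (42, 60), (100, 112)}
annotator_2 = {(10, 15), (42, 59), (100, 112)}
print(round(f_measure(annotator_1, annotator_2), 2))  # 0.67
```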
Mean F-Score for Pairs of Annotators
[Bar chart: mean F-scores for pairs of physician and lay-person annotators at the Baseline, Annotation Schema Train 1, Train 2, Train 3, and Post Schema stages; scores range from about 0.79 to 0.90.]
[Chart: pairwise values for annotators A, B, and C at the Baseline Schema and Annotation Schema stages, ranging from 0.10 to 0.24.]
Photos courtesy Brian Chapman
http://web.mac.com/desertlight/iWeb/Reflections,%20Rotations,%20Symmetries/Welcome.html