Negation - Mayo Clinic

Identifying Negation/Uncertainty
Attributes for SHARPn NLP
Presentation to SHARPn Summit “Secondary Use”
June 11-12, 2012
Cheryl Clark, PhD
MITRE Corporation
Approved for Public Release: 12-2751
© 2012 The MITRE Corporation. All rights reserved.
The Challenge:
Text Mentions versus Clinical Facts
■ Negation: event has not occurred or entity does not exist
She had no fever yesterday.
■ Uncertainty: a measure of doubt
The symptoms are not inconsistent with renal failure.
■ Conditional: could exist or occur under certain circumstances
The patient should come back to the ED if any rash occurs.
■ Subject: person the observation is on; experiencer
Mother had lung cancer.
■ Generic: no clear subject/experiencer
E. coli is sensitive to Cipro but enterococcus is not
HOSPITAL-PEDIATRIC DISCHARGE SUMMARY
NAME – #####
DATE OF ADMISSION – ####
LOCATION – #####
BIRTH DATE - #####
(REASON FOR ADMISSION)
SWOLLEN, PAINFUL HANDS. VOMITING. SYMPTOMS OF 18 HOURS DURATION.
(ABSTRACT)
PATIENT, 1 YEAR OLD. IS KNOWN TO HAVE SICKLE CELL DISEASES AND 2 EPISODES OF MENINGITIS. DEVELOPED SWOLLEN, PAINFUL AND WARM HANDS. HAD SEVERAL EPISODES OF VOMIINT PRIOR TO ADMISSION. LABORATORY STUDIES DID NOT REVEAL ANEMIA OR SYSTEMIC INFECTION. HYDRATION THERAPY AND BED REST WERE PROVIDED, WITH IMPORVEMENT IN 48 HOURS. WAS DISCHARGED IMPROVED. TO BE FOLLOWED IN HEMATOLOGY CLINIC.
Mentions and their attributes:
fever: no
renal infarction: uncertain
rash: conditional
lung cancer: family member
Cipro: generic
…
Background: Assertion Analysis Tool, Version 1
■ Input docs
■ Negation & Uncertainty Cue/Scope Tagger
■ Compute scope enclosures by rule
■ Identify sections
■ Extract words, concepts, locations
■ i2b2 concepts
■ Identify word classes and ordering
■ Assertion Classifier (Maximum Entropy)
■ i2b2 assertions
Independent Evaluation: i2b2/VA 2010 Clinical NLP Challenge, Assertion Status Task
F Score = 0.93
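The cue/scope tagging and rule-based scope steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the cue lexicon, the rule that a scope runs from the cue to the next clause boundary, and the names `CUES`, `BOUNDARY`, `tag_cues_and_scopes`, and `enclosing_cue_types` are hypothetical and are not taken from the tool itself.

```python
import re

# Hypothetical cue lexicon; the tool's actual cue lists are not given in the slides.
CUES = {
    "negation": ["no", "not", "denies", "without"],
    "uncertainty": ["possible", "may", "unable to determine"],
}

# Assumed clause boundaries that close a scope (punctuation and two conjunctions).
BOUNDARY = re.compile(r"[.;,]|\bbut\b|\band\b", re.IGNORECASE)


def tag_cues_and_scopes(sentence):
    """Return (cue_type, cue_text, scope_start, scope_end) tuples, sorted by scope start."""
    spans = []
    for cue_type, cues in CUES.items():
        for cue in cues:
            pattern = r"\b" + re.escape(cue) + r"\b"
            for match in re.finditer(pattern, sentence, re.IGNORECASE):
                # Assumed rule: the scope starts right after the cue and
                # ends at the next boundary (or at the end of the sentence).
                boundary = BOUNDARY.search(sentence, match.end())
                scope_end = boundary.start() if boundary else len(sentence)
                spans.append((cue_type, match.group(0), match.end(), scope_end))
    return sorted(spans, key=lambda span: span[2])


def enclosing_cue_types(sentence, concept):
    """Return the set of cue types whose scope fully encloses the concept mention."""
    start = sentence.lower().find(concept.lower())
    if start < 0:
        return set()
    end = start + len(concept)
    return {
        cue_type
        for cue_type, _, scope_start, scope_end in tag_cues_and_scopes(sentence)
        if scope_start <= start and end <= scope_end
    }


print(enclosing_cue_types("She had no fever yesterday.", "fever"))
```

In the tool described here, scope enclosure is only one input to the maximum entropy classifier, alongside sections, words, concepts, locations, and word-class ordering. A span rule of this kind is not a substitute for that classifier.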
Assertion Status Integration within SHARPn Clinical Document Pipeline
■ Input docs
■ cTAKES analysis engines …
■ Annotations: all annotations are UIMA Common Analysis Structure (CAS)
■ Negation & Uncertainty Cue/Scope Tagger
■ Compute scope enclosures by rule
■ Identify sections
■ Extract words, concepts, locations
■ Identify word classes and ordering
■ Assertion Classifier (Maximum Entropy)
■ Updated attribute annotations
…
i2b2 Assertion Categories
■ Assertion classification system designed to meet requirements of 2010 i2b2/VA Challenge Assertion subtask
– Present: default category
Example: Patient had a stroke
– Absent: problem does not exist in the patient
Example: History inconsistent with stroke
– Possible: uncertainty expressed
Example: We are unable to determine whether she has leukemia
– Conditional: patient experiences the problem only under certain conditions
Example: Patient reports shortness of breath upon climbing stairs
– Hypothetical: medical problems the patient may develop (corresponds to SHARPn conditional)
Example: If you experience wheezing or shortness of breath
– Not Patient: problem associated with someone who is not the patient
Example: Family history of prostate cancer
Re-architecting Assertions
■ i2b2 assertion output values
– defined for medical problems
– closed set of values
– mutually exclusive (fixed priority when multiple values apply)
– single, multi-way classifier over: present, absent, possible, hypothetical, not patient, conditional (no SHARPn equivalent)
■ SHARPn assertion attributes
– apply to various entities, events, relations
– independent
– attributes can have multiple values
– additional attributes may be added
– multiple classifiers, some binary:
negation: yes/no
uncertainty: yes/no
conditional: yes/no
subject: multi-valued (patient, family, donor, other…)
…
Assertion Module Refactoring: Phase 1
■ Simple mapping from i2b2 assertion classes to SHARPn attributes
– Uses existing i2b2-trained single classifier model
– Identifies i2b2/SHARPn equivalences
– Maps to SHARPn attribute values
Please call physician [if you develop shortness of breath].
i2b2 assertion status = “hypothetical”
SHARPn conditional attribute = “true”
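The Phase 1 mapping can be sketched as a lookup from an i2b2 assertion label to a set of SHARPn attribute values. This is a minimal sketch under stated assumptions, not the module's actual code: the names `SharpnAttributes`, `I2B2_TO_SHARPN`, and `map_i2b2_to_sharpn` are hypothetical, and so are the `"other"` subject value used for "not patient" and the fallback to default values for i2b2 "conditional" (which has no SHARPn equivalent).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharpnAttributes:
    """Independent SHARPn-style attributes; defaults describe a plain 'present' mention."""
    negation: bool = False
    uncertainty: bool = False
    conditional: bool = False
    subject: str = "patient"


# Equivalences named in the slides: absent ~ negated, possible ~ uncertainty,
# hypothetical ~ conditional. The remaining rows are illustrative assumptions.
I2B2_TO_SHARPN = {
    "present": SharpnAttributes(),
    "absent": SharpnAttributes(negation=True),
    "possible": SharpnAttributes(uncertainty=True),
    "hypothetical": SharpnAttributes(conditional=True),
    # Assumption: i2b2 "not patient" does not say who the subject is.
    "not patient": SharpnAttributes(subject="other"),
    # Assumption: no SHARPn equivalent exists, so defaults are returned.
    "conditional": SharpnAttributes(),
}


def map_i2b2_to_sharpn(label):
    """Map one i2b2 assertion label to SHARPn attribute values."""
    return I2B2_TO_SHARPN[label.lower()]


attrs = map_i2b2_to_sharpn("hypothetical")
print(attrs.conditional)  # True, matching the slide's example
```

Because the single i2b2 label is mutually exclusive, a mapping of this kind can set at most one non-default attribute per mention.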
Assertion Module Refactoring: Phase 2
■ Direct assignment of SHARPn attribute values
■ Will use multiple classifiers trained on SHARPn data
– Will identify attribute values directly
■ Benefits
– Aligns with SHARPn concept attributes requirements
– Aligns with SHARPn clinical data annotation
– Enables more accurate meaning representation
Example: He does not smoke, has no hypertension, and has no family history of coronary artery disease.
■ i2b2 2010 Paradigm
– Choose one: present, absent, possible, hypothetical, conditional, not patient
– negator → absent
– family → not patient
■ SHARPn Attribute Paradigm
– negation = present
– subject = family_member
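The difference between the two paradigms can be sketched with one independent function per attribute, so that several attributes can be set for the same mention. This is only an illustration: the keyword rules stand in for the trained classifiers, which the slides do not specify, and the names `detect_negation`, `detect_subject`, `ATTRIBUTE_CLASSIFIERS`, and `assign_attributes` are hypothetical.

```python
import re

# Assumed cue words; stand-ins for features a trained classifier would learn.
NEGATION_CUES = {"no", "not", "denies"}
FAMILY_CUES = {"family", "mother", "father"}


def _tokens(context):
    """Lowercased alphabetic tokens of the context window."""
    return re.findall(r"[a-z]+", context.lower())


def detect_negation(context):
    """Stand-in for a binary negation classifier."""
    return any(token in NEGATION_CUES for token in _tokens(context))


def detect_subject(context):
    """Stand-in for a multi-valued subject classifier."""
    if any(token in FAMILY_CUES for token in _tokens(context)):
        return "family_member"
    return "patient"


# Each attribute has its own classifier, and the outputs do not compete.
ATTRIBUTE_CLASSIFIERS = {
    "negation": detect_negation,
    "subject": detect_subject,
}


def assign_attributes(context):
    """Run every attribute classifier independently on the same context."""
    return {name: classify(context) for name, classify in ATTRIBUTE_CLASSIFIERS.items()}


print(assign_attributes("has no family history of coronary artery disease"))
```

For the slide's example, both attributes are set at once (negation and a family-member subject), which a single mutually exclusive label cannot express.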
System Errors => Need for Better Linguistic Analysis for Assertions
■ Need for phrasal structure; scope extent not always enough
She had [no chest pain or chest pressure] with this and this was deemed a negative test.
Callouts: negated; not negated
Syntactic Approaches*
■ Insert a signifier node into constituency parse above entity
■ Use tree kernel methods to compare similarity with negated
sentences in training data (can be used on other modifiers as well
with varying degrees of success)
* Slide courtesy of Tim Miller, Children’s Hospital Boston
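The first step, inserting a signifier node above the entity, can be sketched on a toy constituency tree. The tree encoding (nested lists of the form `[label, child, ...]` with string leaves), the node label `ENTITY`, and the function names are assumptions made for illustration. The tree kernel comparison itself is not implemented here.

```python
def leaves(tree):
    """Return the leaf tokens of a nested-list tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


def insert_signifier(tree, entity_tokens, label="ENTITY"):
    """Return a copy of the tree with a `label` node wrapped around the
    lowest constituent whose leaves are exactly `entity_tokens`."""
    if isinstance(tree, str):
        return tree
    new_children = [insert_signifier(child, entity_tokens, label) for child in tree[1:]]
    rebuilt = [tree[0]] + new_children
    # Wrap this node only if it spans the entity and no descendant was already wrapped.
    if leaves(tree) == entity_tokens and new_children == list(tree[1:]):
        return [label, rebuilt]
    return rebuilt


def to_bracketed(tree):
    """Render a nested-list tree in bracketed (Penn Treebank style) notation."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(to_bracketed(child) for child in tree[1:]) + ")"


# Hand-written toy parse of "She had no fever"; not the output of a real parser.
parse = ["S",
         ["NP", ["PRP", "She"]],
         ["VP", ["VBD", "had"],
          ["NP", ["DT", "no"], ["NN", "fever"]]]]

marked = insert_signifier(parse, ["fever"])
print(to_bracketed(marked))
# (S (NP (PRP She)) (VP (VBD had) (NP (DT no) (ENTITY (NN fever)))))
```

Marking the entity in the tree lets a tree-based similarity measure see where the entity sits relative to a negation cue such as "no".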
Tree kernel fragment mining*
■ Use TK model to extract tree fragment features (Pighin & Moschitti
07)
■ Allows interaction with other feature types
■ Faster to find fragments than do whole-tree comparisons
* Slide courtesy of Tim Miller, Children’s Hospital Boston
Next Steps: Assertions for Relations
■ Some assertion attributes apply to relations, too.
– negation
– uncertainty
– conditional
Example: The fundal AVMs are a potential site of bleeding although do not explain the extent of bleeding.
– location relation: uncertain
– causal relation: negated
Next Steps: Classifier Retraining and Component Evaluation
■ Model Retraining
– Models for individual attributes
– Linguistic features based on parser output
– Training on SHARPn data
– Enhancements to parsers
■ Evaluation
– Accuracy on i2b2 gold annotations vs. accuracy on SHARPn gold annotations
  ■ i2b2 absent vs. SHARPn negated
  ■ i2b2 possible vs. SHARPn uncertainty
  ■ i2b2 hypothetical vs. SHARPn conditional
– Evaluation based on system-generated entity annotations
– Evaluation on CEM concept rather than on individual mentions
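A per-attribute comparison of this kind can be scored with precision, recall, and F score for each binary attribute. The sketch below assumes gold and system values are aligned lists of booleans for one attribute (for example, SHARPn negated). The function name `precision_recall_f1` and the toy labels are invented for illustration and are not results from the system.

```python
def precision_recall_f1(gold, predicted):
    """Score one binary attribute given aligned gold and system values."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be aligned")
    true_pos = sum(1 for g, p in zip(gold, predicted) if g and p)
    false_pos = sum(1 for g, p in zip(gold, predicted) if not g and p)
    false_neg = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy values only: five mentions, with True meaning "negated".
gold_negated = [True, True, False, False, True]
system_negated = [True, False, False, True, True]

precision, recall, f1 = precision_recall_f1(gold_negated, system_negated)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Running the same function once per attribute, on i2b2-derived labels and on SHARPn gold labels, would give the side-by-side comparison listed above.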
Thank you!
SHARPn Negation/Uncertainty Team
John Aberdeen
David Carrell
Cheryl Clark
Matt Coarr
Scott Halgrim
Lynette Hirschman
Donna Ihrke
Tim Miller
Guergana Savova
Ben Wellner
Backup Slides
Clarifying Definitions
■ Negation and temporal
The patient had the tumor removed.
– The text span “removed” indicates the tumor was there but does not exist anymore.
– Originally annotated as negated; no longer annotated as negated.
– Course: degree_of (tumor, CHANGED (span for “removed”))
■ Circumstantial negation (i2b2 calls this conditional)
While smoking, he does not use his nicotine patch
– Annotated as negated
■ Allergens
ALLERGIES: PCN, Sulpha, Zocor, Asendin, Rocephin
– Medications mentioned as allergens originally negated
– Allergen status distinguished from negation: Allergy_indicator_class
System Errors => Need for Better Linguistic Analysis for Assertions
Legend: absent = negated; present = should not be negated
She had no signs of infection [on her leg wounds] and she did have some mild erythema around her right great toe
Issue is structure and not simply span extent:
She had [no chest pain or chest pressure] with this and this was deemed a negative test.
Callouts: negated; not negated
MASTIF-Generated SHARPn attributes in cTAKES Output
■ [Add screenshot] (callouts: default values; calculated value)
Assertions for Different Concept Types
[Figure callout: polarity = -1 (negated)]
Issues: Differences in training data annotation
■ UMLS CUI-driven annotation (SHARPn)
– UMLS contains some concept-internal negation and concept-internal subject
Cigarette smoker: Concept [C0337667] (finding)
Never smoked: Concept [C0425293] Never smoked tobacco (finding)
Non-smoker: Concept [C0337672] Non-smoker (finding)
Mother smokes: Concept [C0424969] (finding)
Father smokes: Concept [C0424968] (finding)
Mother does not smoke: Concept [C2586137] (finding)
Father does not smoke: Concept [C2733448] (finding)
■ i2b2 concept excludes contextual cues; SHARPn concept includes them.
The patient has never smoked.
– i2b2 concept: smoked (negated)
– SHARPn concept: never smoked (not negated)
Issue: Differences in training data annotation
No known allergies
Concept: [C0262580] No known allergies
i2b2: concept = known allergies; type = problem; assertion = absent
SHARPn: concept = no known allergies; type = disease/disorder (finding in UMLS); assertion = present
NKA
i2b2: concept = nka; type = problem; assertion = absent
Abstract
We describe a methodology for identifying negation and uncertainty in clinical
documents and a system that uses that information to assign assertion values to
medical problems mentioned in clinical text. This system was among the top
performing systems in the assertion subtask of the 2010 i2b2/VA community
evaluation Challenges in natural language processing for clinical data, and has
subsequently been packaged as a UIMA module called the MITRE Assertion Status
Tool for Interpreting Facts (MASTIF), which can be integrated with cTAKES. We
describe the process of extending MASTIF, which uses a single multi-way classifier to
select among a closed set of mutually exclusive assertion categories, to a system
that uses individual, independent classifiers to assign values to independent negation
and uncertainty attributes associated with a variety of clinical concepts (e.g.,
medications, procedures, and relations) as specified by SHARPn requirements. We
discuss the benefits that result from this new representation and the challenges
associated with generating it automatically. We compare the accuracy of MASTIF on
i2b2 data with accuracy on a subset of SHARPn clinical documents, and discuss the
contribution of linguistic features to accuracy and generalizability of the
system. Finally, we discuss our plans for future development.