Open-labeled long-term study of the subcutaneous sumatriptan
Noun compounds (NCs)
Any sequence of nouns that itself
functions as a noun
asthma hospitalizations
asthma hospitalization rates
health care personnel hand wash
Technical text is rich with NCs
Open-labeled long-term study of the subcutaneous
sumatriptan efficacy and tolerability in acute
migraine treatment.
1
NCs: 3 computational tasks
Identification
Syntactic analysis (attachments)
[Baseline [headache frequency]]
[[Tension headache] patient]
Semantic analysis
Headache treatment: treatment for headache
Corticosteroid treatment: treatment that uses corticosteroid
2
Two approaches
Treat it as a classification problem
(and use a machine learning algorithm)
Linguistically motivated: consider the
“semantics” of the nouns which will
determine the relations between
them
3
First approach
Extraction of NCs from titles and
abstracts of Medline
Part-of-Speech Tagger
Extraction of sequences of units tagged as
nouns
Collection of 2245 NCs with 2 nouns
A manual annotation of the NCs found 38
semantic relations
Collection of labeled NCs and a set of
semantic relations
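The extraction step above can be sketched as follows. This is a minimal illustration, not the tagger or pipeline actually used: it assumes the tagger output is a list of (word, tag) pairs with Penn Treebank noun tags, and the function name `extract_noun_sequences` and the sample sentence are hypothetical.

```python
# Assumption: Penn Treebank noun tags; the slides do not name the tagset.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def extract_noun_sequences(tagged, min_len=2, max_len=None):
    """Return maximal runs of consecutive noun-tagged words as tuples."""
    sequences = []
    current = []
    # A sentinel non-noun token at the end flushes the final run.
    for word, tag in list(tagged) + [("", "<END>")]:
        if tag in NOUN_TAGS:
            current.append(word)
            continue
        long_enough = len(current) >= min_len
        short_enough = max_len is None or len(current) <= max_len
        if long_enough and short_enough:
            sequences.append(tuple(current))
        current = []
    return sequences

# Hypothetical tagged input, for illustration only.
tagged_title = [
    ("asthma", "NN"), ("hospitalization", "NN"), ("rates", "NNS"),
    ("rose", "VBD"), ("among", "IN"),
    ("headache", "NN"), ("patients", "NNS"),
]
all_ncs = extract_noun_sequences(tagged_title)
two_noun_ncs = extract_noun_sequences(tagged_title, max_len=2)
```

Setting `max_len=2` keeps only the runs that are exactly two nouns long, which mirrors the restriction to two-noun NCs in the collection described above.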
4
Semantic relations
Frequency/time of: influenza season, headache interval
Measure of: relief rate, asthma mortality, hospital survival
Defect: hormone deficiency, csf fistulas, gene mutation
“Purpose”: headache drugs, hiv medications, influenza treatment
Instrument: aciclovir therapy, laser irradiation, aerosol treatment
Inhibitor: adrenoreceptor blockers, influenza prevention
5
Semantic relations
Cause: asthma hospitalization, aids death
Change: papilloma growth, disease development
Person Afflicted: aids patients, headache group
Activity/Physical Process: bile delivery, virus reproduction
….
6
Features
Lexical (words)
MeSH descriptors
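As a rough illustration of the two feature types, the sketch below encodes a two-noun NC either by its words or by the MeSH codes of its nouns. Everything here is an assumption made for illustration: the slides do not specify the encoding, `TOY_MESH` is a hypothetical four-entry lexicon using codes that appear elsewhere in this deck, and the function names are invented.

```python
# Hypothetical toy lexicon: noun -> MeSH tree code (codes taken from this deck).
TOY_MESH = {
    "shoulder": "A01",
    "eye": "A01",
    "patient": "M01.643",
    "physician": "M01.526",
}

def truncate_code(code, depth):
    """Keep only the first `depth` levels of a dotted MeSH tree code."""
    return ".".join(code.split(".")[:depth])

def lexical_features(nc):
    """One indicator feature per word, keyed by position in the NC."""
    n1, n2 = nc
    return {f"w1={n1}": 1, f"w2={n2}": 1}

def mesh_features(nc, lexicon, depth=2):
    """One indicator feature per noun's MeSH code, truncated to `depth` levels."""
    feats = {}
    for pos, noun in enumerate(nc, start=1):
        code = lexicon.get(noun)
        if code is not None:  # nouns missing from the lexicon contribute nothing
            feats[f"mesh{pos}={truncate_code(code, depth)}"] = 1
    return feats

word_feats = lexical_features(("shoulder", "patient"))
mesh_feats = mesh_features(("eye", "physician"), TOY_MESH)
coarse_feats = mesh_features(("eye", "physician"), TOY_MESH, depth=1)
```

The `depth` parameter shows one way the MeSH hierarchy could trade specificity for generality: shallower codes let unseen nouns share features with nouns in the same category.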
7
Classification method and results
Multi-class (18) classification problem
Multilayer neural networks to classify
across all relations simultaneously.
Results
Features                      Accuracy
Words                         62%
MeSH                          61%
Baselines                     Accuracy
Guessing                      5%
Most frequent relation        31%
Vanderwende94 (13 classes)    52%
Lapata00 (binary)             80%
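The two simple baselines can be computed as in the sketch below. The function names and the toy label list are hypothetical and do not reproduce the real data; the only figure taken from the slides is the class count of 18, for which uniform guessing gives 1/18, about 5.6%, in line with the roughly 5% reported.

```python
from collections import Counter

def uniform_guess_accuracy(num_classes):
    """Expected accuracy when picking one of `num_classes` labels at random."""
    return 1.0 / num_classes

def most_frequent_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)

guess_acc = uniform_guess_accuracy(18)

# Hypothetical label distribution, for illustration only.
toy_labels = ["Purpose"] * 3 + ["Instrument"] * 2 + ["Cause"]
top_label, top_acc = most_frequent_baseline(toy_labels)
```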
8
Second approach
Linguistic Motivation
Head noun has argument structure
Meaning of the head noun determines
what kinds of things can be done to it,
what it is made of, what it is a part of…
9
Linguistic Motivation
Material + Cutlery: Made of
steel knife, plastic fork, wooden spoon
Food + Cutlery: Used on
meat knife, dessert spoon, salad fork
Profession + Cutlery: Used by
chef's knife, butcher's knife
10
Linguistic Motivation
Hypothesis:
A particular semantic relation holds
between all 2-word NCs that can be
categorized by a MeSH pair.
Use the classes of MeSH to identify
semantic relations
11
Grouping the NCs
A02 C04 (Musculoskeletal System, Neoplasms)
skull tumors, bone cysts, bone metastases, skull osteosarcoma…
B06 B06 (Plants, Plants)
eucalyptus trees, apple fruits, rice grains, potato plants
A01 M01 (Body Regions, Persons)
shoulder patient, eye physician, eye donor
Too different; need to be more specific: go down the
hierarchy
A01 M01.643 (Body Regions, Patients)
shoulder patient
A01 M01.526 (Body Regions, Occupational Groups)
eye physician, chest physicians
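The grouping step, and the effect of descending the hierarchy, can be sketched as below. This is an illustration under assumptions: `TOY_MESH` is a hypothetical lexicon built from the codes on these slides, and `group_by_mesh_pair` is an invented helper, not the original implementation.

```python
from collections import defaultdict

# Hypothetical toy lexicon: noun -> MeSH tree code (codes taken from this deck).
TOY_MESH = {
    "shoulder": "A01",
    "eye": "A01",
    "patient": "M01.643",
    "physician": "M01.526",
    "donor": "M01.898",
}

def truncate_code(code, depth):
    """Keep only the first `depth` levels of a dotted MeSH tree code."""
    return ".".join(code.split(".")[:depth])

def group_by_mesh_pair(ncs, lexicon, depth1=1, depth2=1):
    """Group two-noun NCs by the MeSH codes of their nouns at the given depths."""
    groups = defaultdict(list)
    for n1, n2 in ncs:
        key = (truncate_code(lexicon[n1], depth1),
               truncate_code(lexicon[n2], depth2))
        groups[key].append(f"{n1} {n2}")
    return dict(groups)

ncs = [("shoulder", "patient"), ("eye", "physician"), ("eye", "donor")]
coarse = group_by_mesh_pair(ncs, TOY_MESH)            # one mixed group
finer = group_by_mesh_pair(ncs, TOY_MESH, depth2=2)   # one group per relation
```

At depth 1 all three compounds land in the single A01 M01 group; descending one level on the second noun separates them, which is the behaviour the slide describes.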
12
Classification Decisions + Relations
A02 C04 Location of Disease
B06 B06 Kind of Plants
C04 M01
    C04 M01.643 Person afflicted by Disease
    C04 M01.526 Person who treats Disease
A01 H01
    A01 H01.770
    A01 H01.671
        A01 H01.671.538
        A01 H01.671.868
A01 M01
    A01 M01.643 Person afflicted by Disease
    A01 M01.526 Specialist of
    A01 M01.898 Donor of
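One way to read the table above is as a lookup keyed by MeSH code pairs, where a compound is assigned the relation of the most specific pair that matches its nouns' codes. The sketch below assumes that reading. The rule table copies the labeled rows from the slide; the `classify` function, its matching order, and the deeper codes used in the example calls are hypothetical.

```python
# Relations for MeSH pairs, copied from the labeled rows of the slide.
RULES = {
    ("A02", "C04"): "Location of Disease",
    ("B06", "B06"): "Kind of Plants",
    ("C04", "M01.643"): "Person afflicted by Disease",
    ("C04", "M01.526"): "Person who treats Disease",
    ("A01", "M01.643"): "Person afflicted by Disease",
    ("A01", "M01.526"): "Specialist of",
    ("A01", "M01.898"): "Donor of",
}

def prefixes(code):
    """List a dotted code and its ancestors, most specific first."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]

def classify(code1, code2, rules=RULES):
    """Return the relation of the first matching pair, or None if no rule applies."""
    for p1 in prefixes(code1):
        for p2 in prefixes(code2):
            if (p1, p2) in rules:
                return rules[(p1, p2)]
    return None

# The deeper codes below are hypothetical stand-ins for full MeSH tree numbers.
rel_specialist = classify("A01.456", "M01.526.485")
rel_plants = classify("B06.388", "B06.388")
rel_undecided = classify("A01", "M01")
```

The last call returns `None` because A01 M01 has no relation of its own in the table: the decision is only made one level further down the hierarchy.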
13
Evaluation
Accuracy:
Anatomy: 91% accurate
Natural Science: 79%
Neoplasm: 100%
Total Accuracy : 90.8%
14
Conclusion of NCs
Problem of assigning semantic relations to
two-word technical NCs
Important problem: many NCs in technical
text
Especially difficult because of the lack of
syntactic clues
State-of-the-art results
One of very few working systems to tackle
this task for NCs
15