Open-labeled long-term study of the subcutaneous sumatriptan


Noun compounds (NCs)

Any sequence of nouns that itself functions as a noun:
  asthma hospitalizations
  asthma hospitalization rates
  health care personnel hand wash

Technical text is rich with NCs:
  Open-labeled long-term study of the subcutaneous sumatriptan efficacy and tolerability in acute migraine treatment.
NCs: 3 computational tasks


Identification
Syntactic analysis (attachments):
  [Baseline [headache frequency]]
  [[Tension headache] patient]
Semantic analysis:
  Headache treatment: treatment for headache
  Corticosteroid treatment: treatment that uses corticosteroid
Two approaches


Treat it as a classification problem (and use a machine learning algorithm)
Linguistically motivated: consider the “semantics” of the nouns, which will determine the relations between them
First approach
Extraction of NCs from titles and abstracts of Medline:
  Part-of-Speech Tagger
  Extraction of sequences of units tagged as nouns
Collection of 2245 NCs with 2 nouns
A manual annotation of the NCs found 38 semantic relations
Collection of labeled NCs and a set of semantic relations
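The extraction step above can be sketched as follows. This is a minimal illustration, not the tagger or pipeline used in the study: it assumes the text has already been tagged with Penn Treebank-style noun tags, and the function name and the hand-tagged sentence are hypothetical.

```python
from typing import List, Tuple

# Assumption: Penn Treebank-style noun tags; the slides do not name the tagset.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def extract_noun_sequences(
    tagged: List[Tuple[str, str]], min_len: int = 2
) -> List[List[str]]:
    """Return maximal runs of consecutive noun-tagged tokens of length >= min_len."""
    sequences: List[List[str]] = []
    current: List[str] = []
    for word, tag in tagged:
        if tag in NOUN_TAGS:
            current.append(word)
        else:
            # A non-noun token ends the current run.
            if len(current) >= min_len:
                sequences.append(current)
            current = []
    # Flush a run that reaches the end of the input.
    if len(current) >= min_len:
        sequences.append(current)
    return sequences


# Hypothetical hand-tagged input, for illustration only.
tagged_sentence = [
    ("asthma", "NN"), ("hospitalization", "NN"), ("rates", "NNS"),
    ("increased", "VBD"), ("among", "IN"),
    ("headache", "NN"), ("patients", "NNS"),
]

noun_compounds = extract_noun_sequences(tagged_sentence)
# Keep only the 2-noun compounds, as in the collection described above.
two_word = [nc for nc in noun_compounds if len(nc) == 2]
```

Here `noun_compounds` holds both the three-noun and the two-noun sequence, and `two_word` keeps only `["headache", "patients"]`.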
Semantic relations

Frequency/time of
  influenza season, headache interval
Measure of
  relief rate, asthma mortality, hospital survival
Instrument
  aciclovir therapy, laser irradiation, aerosol treatment
“Purpose”
  headache drugs, hiv medications, influenza treatment
Defect
  hormone deficiency, csf fistulas, gene mutation
Inhibitor
  Adrenoreceptor blockers, influenza prevention
Semantic relations

Cause
  Asthma hospitalization, aids death
Change
  Papilloma growth, disease development
Activity/Physical Process
  Bile delivery, virus reproduction
Person Afflicted
  Aids patients, headache group
….
Features

Lexical (words)

MeSH descriptors
Classification method and results



Multi-class (18) classification problem
Multi-layer neural networks to classify across all relations simultaneously
Results
Features                      Accuracy
Words                         62%
MeSH                          61%
Baselines
Guessing                      5%
Most frequent relation        31%
Vanderwende94 (13 classes)    52%
Lapata00 (binary)             80%
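The two simple baselines in the table can be computed as sketched below. This is an illustration under assumptions: the function names are hypothetical, and the toy label list is made up purely to show the calculation; it is not the study's data.

```python
from collections import Counter
from typing import Sequence


def guessing_baseline(num_classes: int) -> float:
    """Expected accuracy of picking one of num_classes uniformly at random."""
    return 1.0 / num_classes


def most_frequent_baseline(labels: Sequence[str]) -> float:
    """Accuracy of always predicting the most frequent label in `labels`."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)


# With 18 classes, uniform guessing gives 1/18, about 0.056,
# which is roughly the 5% listed in the table.
uniform = guessing_baseline(18)

# Toy labels (hypothetical, for illustration only).
toy_labels = ["Purpose", "Purpose", "Instrument", "Defect"]
majority = most_frequent_baseline(toy_labels)  # 2 of 4 labels are "Purpose"
```

Both baselines ignore the nouns entirely, so any feature-based classifier should clear them by a wide margin to be informative.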
Second approach


Linguistic Motivation
Head noun has argument structure

Meaning of the head noun determines what kinds of things can be done to it, what it is made of, what it is a part of…
Linguistic Motivation

Material + Cutlery → Made of
  steel knife, plastic fork, wooden spoon
Food + Cutlery → Used on
  meat knife, dessert spoon, salad fork
Profession + Cutlery → Used by
  chef's knife, butcher's knife
Linguistic Motivation

Hypothesis: a particular semantic relation holds between all 2-word NCs that can be categorized by a MeSH pair.
Use the classes of MeSH to identify semantic relations
Grouping the NCs

A02 C04 (Musculoskeletal System, Neoplasms)
  skull tumors, bone cysts, bone metastases, skull osteosarcoma…
B06 B06 (Plants, Plants)
  eucalyptus trees, apple fruits, rice grains, potato plants
A01 M01 (Body Regions, Persons)
  shoulder patient, eye physician, eye donor
  Too different: need to be more specific, so go down the hierarchy
A01 M01.643 (Body Regions, Patients)
  shoulder patient
A01 M01.526 (Body Regions, Occupational Groups)
  eye physician, chest physicians
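The grouping and the "go down the hierarchy" step can be sketched as below. This is a minimal illustration, assuming each noun has already been mapped to a single MeSH tree number; the word-to-code pairs reuse the codes shown on these slides, simplified to top-level codes for the modifiers, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (word, MeSH tree number) for modifier and head; codes follow the slides,
# with modifiers simplified to their top-level category for illustration.
NounCompound = Tuple[Tuple[str, str], Tuple[str, str]]


def truncate(tree_number: str, depth: int) -> str:
    """Keep only the first `depth` levels of a dotted MeSH tree number."""
    return ".".join(tree_number.split(".")[:depth])


def group_by_mesh_pair(
    ncs: List[NounCompound], modifier_depth: int = 1, head_depth: int = 1
) -> Dict[Tuple[str, str], List[str]]:
    """Group noun compounds by the (modifier, head) category pair at the given depths."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for (modifier, mod_code), (head, head_code) in ncs:
        key = (truncate(mod_code, modifier_depth), truncate(head_code, head_depth))
        groups[key].append(f"{modifier} {head}")
    return dict(groups)


ncs = [
    (("shoulder", "A01"), ("patient", "M01.643")),
    (("eye", "A01"), ("physician", "M01.526")),
    (("eye", "A01"), ("donor", "M01.898")),
]

# At the top level all three fall under (A01, M01), which is too heterogeneous.
coarse = group_by_mesh_pair(ncs)
# Descending one level on the head noun separates patients, physicians, donors.
fine = group_by_mesh_pair(ncs, head_depth=2)
```

Descending only where a group is too mixed keeps the number of category pairs small, which is the point of stopping as high in the hierarchy as possible.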
Classification Decisions + Relations





A02 C04 → Location of Disease
B06 B06 → Kind of Plants
C04 M01
  C04 M01.643 → Person afflicted by Disease
  C04 M01.526 → Person who treats Disease
A01 H01
  A01 H01.770
  A01 H01.671
    A01 H01.671.538
    A01 H01.671.868
A01 M01
  A01 M01.643 → Person afflicted by Disease
  A01 M01.526 → Specialist of
  A01 M01.898 → Donor of
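One way to apply such decisions is a lookup keyed on the category pair, trying the most specific codes first. The slides do not describe how the lookup was implemented, so this is a sketch under that assumption; the rule table copies only the pairs that have a relation listed above, and `classify` is a hypothetical name.

```python
from typing import Dict, List, Optional, Tuple

# Rules copied from the list above; pairs without a stated relation are omitted.
RULES: Dict[Tuple[str, str], str] = {
    ("A02", "C04"): "Location of Disease",
    ("B06", "B06"): "Kind of Plants",
    ("C04", "M01.643"): "Person afflicted by Disease",
    ("C04", "M01.526"): "Person who treats Disease",
    ("A01", "M01.643"): "Person afflicted by Disease",
    ("A01", "M01.526"): "Specialist of",
    ("A01", "M01.898"): "Donor of",
}


def prefixes(tree_number: str) -> List[str]:
    """All ancestors of a dotted tree number, most specific first."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def classify(
    mod_code: str, head_code: str, rules: Dict[Tuple[str, str], str] = RULES
) -> Optional[str]:
    """Return the relation for the most specific matching category pair, if any."""
    for m in prefixes(mod_code):
        for h in prefixes(head_code):
            if (m, h) in rules:
                return rules[(m, h)]
    # No rule at any level: the pair needs a decision further down the hierarchy.
    return None


relation = classify("A01", "M01.526")
undecided = classify("A01", "M01")
```

`relation` is "Specialist of", while `undecided` is `None` because the top-level pair A01 M01 has no single relation and must be split by descending on the head noun.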
Evaluation

Accuracy:




Anatomy: 91% accurate
Natural Science: 79%
Neoplasm: 100%
Total accuracy: 90.8%
Conclusion of NCs





Problem of assigning semantic relations to two-word technical NCs
Important problem: many NCs in technical text
Especially difficult because of the lack of syntactic clues
State-of-the-art results
One of very few working systems to tackle this task for NCs