Selectional Restrictions (Cont.)


Ambiguity Resolution
Allen’s Chapter 10
J&M’s Chapters 16 and 17
1
Ambiguity Resolution
• The word bridge has at least four distinct senses:
the structure of a raised passage or road
the card game
the dental device
the abstract notion of providing a connection
• To distinguish between these readings, distinct sense
names can be used:
STRUCTURE1
GAME27
DENTAL-DEV37
CONNECT12
• The sentence The bridge crosses the river only makes
sense with STRUCTURE1
• How can we select the correct sense during semantic
interpretation?
2
Word Sense Hierarchy
• Word senses can be categorized based on the
object classes they describe
• Some senses are disjoint (e.g., DOG1, the sense of a
dog, and CAT1, the sense of a cat, are disjoint)
• Some senses are subclasses of others (e.g., DOG1 is
a subclass of MAMMAL1 and PET1)
• Some senses overlap (e.g., MAMMAL1 and
PET1)
• This general knowledge can be used for semantic
disambiguation
3
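The relations above (disjointness and subclassing) can be sketched in code. This is a minimal illustration, assuming a hypothetical table of direct superclasses for the example senses (the ANIMATE parent is an added assumption); a real system would consult a full ontology.

```python
# Each sense lists its direct superclasses (the subclass relation).
# The table is a toy stand-in for a real word-sense hierarchy.
SUPERCLASSES = {
    "DOG1": {"MAMMAL1", "PET1"},
    "CAT1": {"MAMMAL1", "PET1"},
    "MAMMAL1": {"ANIMATE"},
    "PET1": {"ANIMATE"},
    "ANIMATE": set(),
}

def ancestors(sense):
    """All senses above `sense` in the hierarchy (transitive closure)."""
    result = set()
    for parent in SUPERCLASSES.get(sense, set()):
        result.add(parent)
        result |= ancestors(parent)
    return result

def isa(sense, ancestor):
    """True if `sense` equals or is a (transitive) subclass of `ancestor`."""
    return sense == ancestor or ancestor in ancestors(sense)

print(isa("DOG1", "MAMMAL1"))   # True: DOG1 is a subclass of MAMMAL1
print(isa("DOG1", "CAT1"))      # False: DOG1 and CAT1 are disjoint
```

A check like `isa` is what lets a disambiguator decide whether a candidate sense satisfies a restriction such as "must be ANIMATE".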
Word Sense Hierarchy
4
Selectional Restrictions
• The subset relation defines an abstraction
hierarchy of the word senses, and can be
used to restrict sense choices
• An adjective such as purple should only
modify physical objects; purple ideas or
purple events don’t make sense
• The modifier precise should be applied to
ideas and actions; unfortunate normally
modifies events and situations
5
Selectional Restrictions (Cont.)
• The verb read has two principal arguments: the
agent (of type PERSON) and the theme (of type
TEXTOBJ)
• The noun dishwasher has two senses: a machine
of type DISHWASH/MACH, or a person of
type DISHWASH/PERS
• The noun article has two senses: a paper
(ARTICLE/TEXT), or a part of speech (ARTICLE1)
6
Selectional Restrictions (Cont.)
• The sentence the dishwasher read the article
seems to have four semantic readings, but
only one of them makes sense:
(READS1 [AGENT <THE d1 DISHWASH/PERS>]
[THEME <THE p1 ARTICLE/TEXT>])
• A semantic interpreter can perform this
disambiguation using selectional restrictions
• Selectional restrictions define valid
combinations of senses that can co-occur
7
Selectional Restrictions (Cont.)
• To incorporate SRs into the semantic interpreter,
we need to extract the type information inherent in
the Logical Form
• Examine the unary and binary predicates involved
in the LF:
(READS1 r1 [AGENT <THE d1 {DISHWASH/MACH DISHWASH/PERS}>]
[THEME <THE p1 {ARTICLE/TEXT ARTICLE1}>])
• The unary and binary predicates involved
are:
(READS1 r1), ({DISHWASH/MACH DISHWASH/PERS} d1),
({ARTICLE/TEXT ARTICLE1} p1), (AGENT r1 d1), (THEME r1 p1)
8
Selectional Restrictions (Cont.)
• Finding a valid combination of senses can be
viewed as a constraint satisfaction
problem
• The constraints are relationships between the
unary predicates, expressed as Selectional
Restrictions on the argument types
of the binary relations
9
Selectional Restrictions (Cont.)
• The selectional restrictions for READS1 are
expressed as:
(AGENT READS1 PERSON)
(THEME READS1 TEXTOBJ)
(AGENT r1 d1) is satisfied if d1 is a person
• So the unary constraint on d1 is simplified
to (DISHWASH/PERS d1)
• Similarly, the constraint on p1 is simplified to
(ARTICLE/TEXT p1)
10
There Might Be More Ambiguities
• The verb read also has another sense, READS2, for
understanding someone’s intentions:
• Jill can read John’s mind
• Selectional Restrictions on READS2 can be:
(AGENT READS2 PERSON)
(THEME READS2 MENTAL-STATE)
• The LF of The dishwasher read the article would
be
({READS1 READS2} r1
[AGENT <THE d1 {DISHWASH/MACH DISHWASH/PERS}>]
[THEME <THE p1 {ARTICLE/TEXT ARTICLE1}>])
• The additional ambiguity doesn’t affect the final
result
11
Adjectives Act like Verbs
• The technique can be applied to proper names,
pronouns, and adjectives
• Adjectives, like verbs, impose restrictions on their
arguments
• Adjectives can be handled by introducing a state
variable for each adjective and a thematic relation
MOD
• For happy dishwasher, instead of (HAPPY1 d1),
we use (HAPPY-STATE h1) and (MOD h1 d1)
• The Selectional Restriction for happy would be:
(MOD HAPPY-STATE ANIMATE)
12
Semantic Interpretation as
Constraint Satisfaction
• Every time a logical form is produced, the
set of unary and binary semantic relations it
contains can be checked against the
Selectional Restrictions
• If there is no interpretation that satisfies the
constraints, then the interpretation is
anomalous and the constituent is discarded
• Otherwise the simplified logical form can
be used as the SEM of the constituent
13
A simple Constraint Satisfaction Algorithm
14
The dishwasher reads the article
Types(r1) = READS1, READS2
Types(p1) = ARTICLE/TEXT, ARTICLE1
Types(d1) = DISHWASH/PERS, DISHWASH/MACH
Binary relations are: (AGENT r1 d1) and (THEME r1 p1)
Relevant Selectional Restrictions are:
(AGENT READS1 PERSON)
(THEME READS1 TEXTOBJ)
(AGENT READS2 PERSON)
(THEME READS2 MENTAL-STATE)
For (AGENT r1 d1), we iterate through senses of r1, both of which need a
PERSON agent, so types(d1) is reduced to:
DISHWASH/PERS
For (THEME r1 p1), we iterate through the senses of r1; READS2 is eliminated
because the types of p1 don’t include a MENTAL-STATE, and the types of p1 are
reduced to ARTICLE/TEXT
In the next cycle no changes occur, so the correct interpretation is found
15
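The walkthrough above can be sketched as a small constraint-propagation loop. The `satisfies` table below is a stand-in for the sense hierarchy (e.g., DISHWASH/PERS being a subclass of PERSON); the sense sets, binary relations, and restrictions follow the slide.

```python
# Candidate senses for each discourse variable (from the logical form).
types = {
    "r1": {"READS1", "READS2"},
    "d1": {"DISHWASH/PERS", "DISHWASH/MACH"},
    "p1": {"ARTICLE/TEXT", "ARTICLE1"},
}

# Binary relations extracted from the logical form: (role, event, argument).
relations = [("AGENT", "r1", "d1"), ("THEME", "r1", "p1")]

# Selectional restrictions: (role, verb sense) -> required argument type.
restrictions = {
    ("AGENT", "READS1"): "PERSON",
    ("THEME", "READS1"): "TEXTOBJ",
    ("AGENT", "READS2"): "PERSON",
    ("THEME", "READS2"): "MENTAL-STATE",
}

# Stand-in for the sense hierarchy: which senses satisfy each required type.
satisfies = {
    "PERSON": {"DISHWASH/PERS"},
    "TEXTOBJ": {"ARTICLE/TEXT"},
    "MENTAL-STATE": set(),
}

changed = True
while changed:          # repeat until no sense set shrinks (a fixed point)
    changed = False
    for role, ev, arg in relations:
        # Keep a verb sense only if some argument sense satisfies its SR.
        ok_ev = {s for s in types[ev]
                 if types[arg] & satisfies[restrictions[(role, s)]]}
        # Keep an argument sense only if it satisfies the SR of some
        # surviving verb sense.
        ok_arg = {a for a in types[arg]
                  if any(a in satisfies[restrictions[(role, s)]]
                         for s in ok_ev)}
        if ok_ev != types[ev] or ok_arg != types[arg]:
            types[ev], types[arg] = ok_ev, ok_arg
            changed = True

print(types)   # only READS1, DISHWASH/PERS, and ARTICLE/TEXT survive
```

As on the slide, the AGENT relation prunes d1 to DISHWASH/PERS, the THEME relation eliminates READS2 and prunes p1 to ARTICLE/TEXT, and the next cycle makes no changes.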
Selectional Restrictions (Cont.)
• Selectional Restrictions are also useful for refining the type
of unknown objects
• In interpreting He read it, type of it will be constrained to
TEXTOBJ
• Selectional Restrictions provide an important tool for
disambiguation
• However, the main problem is that semantic well-formedness is a continuous scale rather than an all-or-nothing decision
I ate the pizza. I ate the box. I ate the car. I ate the thought.
• A concrete constraint (THEME = FOOD) rejects many
possible interpretations, and an abstract one (THEME =
PHYSOBJ) eliminates almost nothing
16
Selectional Restrictions (Cont.)
• Selectional Restrictions are weak for disambiguation
• They are also sensitive to the context of propositions
E.g., in a negated context, restrictions can be violated:
I could not eat a car is acceptable even though eat requires an edible theme
• They are not applicable to My car drinks gasoline, in
which a metaphor is used
• Despite all these problems, they are still extremely useful
in actual systems, especially in limited domains
• An alternative is to view Selectional Restrictions as
preferences rather than absolute requirements
• Interpretations are sorted based on the number of SRs that
they violate, preferring fewer violations
17
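The preference-based alternative can be sketched as follows: instead of rejecting an interpretation outright, count its violated restrictions and rank. The sense names (EATS1, PIZZA1, CAR1) and the FOOD membership table are hypothetical stand-ins for the eat examples above.

```python
# Selectional restrictions as soft preferences rather than hard constraints.
restrictions = {("THEME", "EATS1"): "FOOD"}

# Stand-in for the sense hierarchy: which senses fall under each type.
subsumes = {"FOOD": {"PIZZA1"}}

def violations(relations):
    """Number of (role, head, arg) triples that violate a restriction."""
    count = 0
    for role, head, arg in relations:
        required = restrictions.get((role, head))
        if required is not None and arg not in subsumes[required]:
            count += 1
    return count

candidates = {
    "I ate the pizza": [("THEME", "EATS1", "PIZZA1")],
    "I ate the car":   [("THEME", "EATS1", "CAR1")],
}
# Prefer the interpretation with the fewest violated restrictions.
best = min(candidates, key=lambda s: violations(candidates[s]))
print(best)   # "I ate the pizza"
```

Because ranking never discards a reading entirely, a violated restriction (as in I could not eat a car) only lowers a reading's rank rather than making it uninterpretable.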
Semantic Filtering Using SR
• There are at least two ways that SRs can be added
to a parser (sequential model vs incremental one)
• The sequential model involves running the parser
and then checking all the complete S
interpretations it finds
• The incremental model involves checking each
constituent as it is suggested by the parser, and
discarding it if it is semantically ill-formed
• The incremental model can be significantly more
efficient
18
He booked a flight to the city for me
This sentence has five different structures:
The PP to the city may modify
1. booked
2. flight
The PP for me may modify
1. city
2. flight
3. booked
If you consider the semantic meaning of the sentence, there is only one
plausible reading:
The flight is to the city, and it was booked for me
This intuition can be captured by the Selectional Restrictions
19
Relevant Selectional Restrictions
• (AGENT BOOK1 PERSON)
• (THEME BOOK1 FLIGHT1)
• (BENEFICIARY BOOK1 PERSON1)
• (DESTINATION FLIGHT1 CITY1)
• (NEARBY PHYSOBJ PHYSOBJ)
• (NEARBY ACTION PHYSOBJ)
20
He booked the flight to the city for me
• A constituent that is suggested by the parser and is
rejected by semantic filtering is
(VP SEM (BOOKS1 v258
[AGENT ?semsubj]
[THEME <DEF1 v260 (FLIGHT1 v260)>]
[DESTINATION <THE v263 (CITY1 v263)>])
VAR v258
SUBJ ?semsubj)
• The unary constraints on the variables are:
v258 BOOKS1, v260 FLIGHT1, v263 CITY1
• From these, the binary relations generated from the SEM
are:
1. (THEME BOOKS1 FLIGHT1)
2. (DESTINATION BOOKS1 CITY1)
• The second one doesn’t match any Selectional
Restrictions and is rejected
21
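The filtering step on this slide can be sketched directly: generate the binary relations from a constituent's SEM and discard the constituent if any relation matches no restriction. The restriction list follows the earlier slide (writing the verb sense as BOOK1 throughout for simplicity); the two candidate relation lists are illustrative.

```python
# Selectional restrictions from the booking example: (role, head, argument).
restrictions = [
    ("AGENT", "BOOK1", "PERSON"),
    ("THEME", "BOOK1", "FLIGHT1"),
    ("BENEFICIARY", "BOOK1", "PERSON1"),
    ("DESTINATION", "FLIGHT1", "CITY1"),
]

def matches(relation):
    """True if the relation matches some selectional restriction."""
    role, head, arg = relation
    return any(role == r and head == h and arg == t
               for r, h, t in restrictions)

def well_formed(relations):
    """A constituent survives filtering only if every relation matches."""
    return all(matches(rel) for rel in relations)

# VP with "to the city" attached to booked: (DESTINATION BOOK1 CITY1)
bad = [("THEME", "BOOK1", "FLIGHT1"), ("DESTINATION", "BOOK1", "CITY1")]
# VP with the PP attached to flight: (DESTINATION FLIGHT1 CITY1)
good = [("THEME", "BOOK1", "FLIGHT1"), ("DESTINATION", "FLIGHT1", "CITY1")]

print(well_formed(bad))    # False: this VP reading is discarded
print(well_formed(good))   # True: this reading stays on the chart
```

In the incremental model this check runs each time the parser proposes a constituent, so the ill-formed attachment never spawns further chart entries.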
Semantic Filtering (Cont.)
• A standard chart parser finds all five structures and
generates 52 constituents on the chart
• With semantic filtering it finds only one interpretation
and generates only 33 constituents
• The saving becomes more significant when more
complex sentences are considered:
He booked a flight to the city near the college for me
• Using the sequential model, the parser finds 14 different
interpretations and generates 116 constituents.
• With incremental semantic filtering only 3
interpretations are found and 63 constituents are
generated
22
Statistical Word Sense Disambiguation
• Selectional Restrictions provide only a
coarse classification of acceptable and non-acceptable forms
• Many cases of ambiguity cannot be
resolved
• Using statistical information (if available),
more predictive models can be developed
• The simplest technique is based on simple
unigram statistics
23
Statistical Word Sense Disambiguation (Cont.)
• Given a suitably labeled corpus, we can collect
information on the usage of different senses of
each word
• We might find 5845 uses of the word bridge:
5651 uses of STRUCTURE1
194 uses of DENTAL-DEV37
• This simple strategy would be correct about 70%
of the time for a broad range of English
• We can do better by including some effects of
context
24
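The simplest unigram strategy amounts to always picking the most frequent sense in the labeled corpus. A minimal sketch, using the hypothetical bridge counts from this slide:

```python
from collections import Counter

# Sense frequencies collected from a suitably labeled corpus
# (the hypothetical bridge counts from the slide above).
sense_counts = {
    "bridge": Counter({"STRUCTURE1": 5651, "DENTAL-DEV37": 194}),
}

def most_frequent_sense(word):
    """Always pick the sense seen most often in the corpus."""
    return sense_counts[word].most_common(1)[0][0]

print(most_frequent_sense("bridge"))   # STRUCTURE1
```

This baseline ignores context entirely, which is exactly why it tops out around 70%: it can never choose a rare sense, no matter how strongly the surrounding text suggests it.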
Statistical Word Sense Disambiguation (Cont.)
• Although DENTAL-DEV37 occurs very rarely in the
corpus, in certain texts (e.g., dentistry), it will be the most
common sense of bridge
• In such texts, there would be many uses of words such as
teeth, dentist, cavity, orthodontics, etc.
• We want to prefer the DENTAL-DEV37 sense in the
presence of such words
• Such information is concerned with word collocations
(i.e., what words tend to appear together)
• We may consider bigrams, trigrams, …, the entire sentence, or
even larger spans (say the nearest 100 words)
• The amount of text examined for each word is called
the window
25
Using Word Senses as POS Tags
• One might suggest adapting POS-tagging techniques to
word senses rather than syntactic categories
• To do this, we need a corpus of words tagged with
their senses
• The unigram statistics (the probability that word w has
sense S), bigrams, etc. can then be computed
• There are two problems: 1) there are many more
senses than syntactic categories, and 2) to obtain
reasonable results you must use a larger context
than simple bigrams or trigrams.
26
Estimating Probability of a Particular Word Sense
A different approach is to estimate the probability of each sense of a word w
relative to a window of n words in the text centered on w:
w1 w2 … wn/2 w wn/2+1 … wn-1
We want to compute the sense S of word w that maximizes
PROB(w/S | w1 w2 … wn/2 w wn/2+1 … wn-1)
Using Bayes’ rule, this is equal to:
PROB(w1 w2 … wn/2 w wn/2+1 … wn-1 | w/S) * PROB(w/S) /
PROB(w1 w2 … wn/2 w wn/2+1 … wn-1)
PROB(w1 w2 … wn/2 w wn/2+1 … wn-1 | w/S) is approximated by
Π i=1..n-1 PROBn(wi | w/S),
where PROBn(wi | w/S) is the probability that word wi occurs in an n-word
window centered on word w with sense S
27
Estimating Probability of a Particular Word Sense
• So the best sense S of word w is the one that
maximizes
PROB(w/S) * Π i=1..n-1 PROBn(wi | w/S)
• Because each event is assumed independent of
all the others, the larger the
window used, the less training data is needed,
since more counts can be collected from
each window
28
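The scoring rule above can be sketched as a tiny naive-Bayes classifier. The window counts below are invented for illustration (they are not the Figure 10.11 data, though they loosely follow the dentistry example), and add-one smoothing is an added assumption so unseen context words don't zero out the product.

```python
# Count(times wi occurs in an n-word window centered on w/S), per sense.
# These counts are invented for illustration.
window_counts = {
    "STRUCTURE1":   {"the": 2000, "river": 300, "teeth": 1},
    "DENTAL-DEV37": {"the": 60,   "river": 0,   "teeth": 55},
}
# Count(times w/S is the center of a window): the sense frequencies.
sense_counts = {"STRUCTURE1": 5651, "DENTAL-DEV37": 194}
total = sum(sense_counts.values())

def prob_n(word, sense):
    """PROBn(wi | w/S), with add-one smoothing (an added assumption)."""
    return (window_counts[sense].get(word, 0) + 1) / (sense_counts[sense] + 1)

def best_sense(context_words):
    """The sense maximizing PROB(w/S) * product of PROBn(wi | w/S)."""
    def score(sense):
        p = sense_counts[sense] / total        # PROB(w/S)
        for wi in context_words:
            p *= prob_n(wi, sense)             # PROBn(wi | w/S)
        return p
    return max(sense_counts, key=score)

print(best_sense(["the", "river"]))    # STRUCTURE1
print(best_sense(["the", "teeth"]))    # the rare dental sense wins
```

With a neutral context word like the, the prior dominates and the common sense wins; a single strong content word like teeth is enough to flip the decision to the rare sense.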
Estimating Probability of a Particular Word Sense
• PROBn(wi | w/S) can be estimated via
Count(#times wi occurs in window centered on w/S) /
Count(#times w/S is the center of a window)
• Now consider the hypothetical information in
Figure 10.11 for the word bridge
• It uses a window of size 11 on a corpus of 10 million
words
• Applying the above formula to the data in Figure 10.11
yields the following estimates:
29
A hypothetical Corpus
30
Statistical Word Sense Disambiguation (Cont.)
The context-independent probabilities:
31
Statistical Word Sense Disambiguation (Cont.)
• Consider the probability estimates for the senses
in a window that contains just the word the
• Probability estimates in a window that contains just the word
teeth reverse the preferences
• Content words, like dentist here, have the most dramatic
effects:
32
Larger Windows improve the accuracy of estimation
• With a larger window, there are many more
chances for content words that strongly
affect the decision
• E.g., the dentist put a bridge on my teeth
• Dentist and teeth combine to strongly
prefer the rare sense of bridge
33