Transcript Ambiguity
Linguistics 187 Week 4
Ambiguity and Robustness
Language has pervasive ambiguity
Tokenization
Morphology
Syntax
Semantics
Entailment
Discourse
Bill fell. John kicked him.
because or after?
John didn’t wait to go.
now or never?
Every man loves a woman.
The same woman or each their own?
John told Tom he had to go.
Who had to go?
The duck is ready to eat.
Cooked or hungry?
walk
Noun or Verb?
I like Jan.
|Jan|.| or |Jan.|.| (sentence end or abbreviation)
untieable knot
(untie)able or un(tieable)?
bank?
river or financial?
Ambiguity
Syntactically legitimate ambiguity
(vs. spurious ambiguity: “boys and girls” & pushup)
Sources:
– Alternative c-structure rules
– Disjunctions in f-structure description
– Lexical categories
XLE’s display/computation of ambiguity
Dealing with ambiguity
– Recognize legitimate ambiguity
– OT marks for preferences (later in the course)
– Stochastic disambiguation
Syntactic Ambiguity
Lexical
– part of speech
– subcategorization frames
Syntactic
– attachments
– coordination
Implemented system highlights interactions
Lexical Ambiguity: POS
verb-noun
I saw her duck.
I saw [NP her duck].
I saw [NP her] [VP duck].
noun-adjective
the [N/A mean] rule
that child is [A mean].
he calculated the [N mean].
Morphology and POS ambiguity
English has impoverished morphology and
hence extreme POS ambiguity
– leaves: leave +Verb +Pres +3sg
leaf +Noun +Pl
leave +Noun +Pl
– will: +Noun +Sg; +Aux; +Verb +base
Even languages with extensive morphology
have ambiguities
Lexical ambiguity: Subcat frames
Words often have more than one
subcategorization frame
– transitive/intransitive
I broke it./It broke.
– intransitive/oblique
He went./He went to London.
– transitive/transitive with infinitive
I want it./I want it to leave.
Subcat-Rule interactions
OBL vs. ADJUNCT with intransitive/oblique
– He went to London.
[ PRED ‘go<(^ SUBJ)(^ OBL)>’
SUBJ [PRED ‘he’]
OBL [PRED ‘to<(^ OBJ)>’
OBJ [ PRED ‘London’]]]
[ PRED ‘go<(^ SUBJ)>’
SUBJ [PRED ‘he’]
ADJUNCT { [PRED ‘to<(^ OBJ)>’
OBJ [ PRED ‘London’]]}]
OBL-ADJUNCT cont.
Passive by phrase
– It was eaten by the boys.
[ PRED ‘eat<(^ OBL-AG)(^ SUBJ)>’
SUBJ [PRED ‘it’]
OBL-AG [PRED ‘by<(^ OBJ)>’
OBJ [PRED ‘boy’]]]
– It was eaten by the window.
[ PRED ‘eat<NULL(^ SUBJ)>’
SUBJ [PRED ‘it’]
ADJUNCT { [PRED ‘by<(^ OBJ)>’
OBJ [PRED ‘window’]]}]
XCOMP-ADJUNCT
to infinitives can be arguments or adjuncts
(purpose clauses)
– I want her to leave.
[ PRED ‘want<(^ SUBJ)(^ XCOMP)>(^ OBJ)’
SUBJ [ PRED ‘I’ ]
OBJ [ PRED ‘her’ ]1
XCOMP [ PRED ‘leave<(^ SUBJ)>’
SUBJ [ 1 ] ] ]
XCOMP-ADJUNCT cont.
– I want money to buy that.
[ PRED ‘want<(^ SUBJ)(^ OBJ)>’
SUBJ [ PRED ‘I’ ]
OBJ [ PRED ‘money’ ]
ADJUNCT { [ PRED ‘buy<(^ SUBJ)(^ OBJ)>’
SUBJ [ PRED ‘pro’ ]
OBJ [ PRED ‘that’ ] ] } ]
But both sentences get both analyses
– The syntax does not have world knowledge
OBJ-TH and Noun-Noun compounds
Many OBJ-TH verbs are also transitive
– I took the cake. I took Mary the cake.
The grammar needs a rule for noun-noun
compounds
– the tractor trailer, a grammar rule
These can interact
– I took the grammar rules
– I took [NP the grammar rules]
– I took [NP the grammar] [NP rules]
Syntactic Ambiguities
Even without lexical ambiguity, there is
legitimate syntactic ambiguity
– PP attachment
– Coordination
Want to:
– constrain these to legitimate cases
– make sure they are processed efficiently
PP Attachment
PP adjuncts can attach to VPs and NPs
Strings of PPs in the VP are ambiguous
– I see the girl with the telescope.
I see [the girl with the telescope].
I see [the girl] [with the telescope].
This ambiguity is reflected in:
– the c-structure (constituency)
– the f-structure (ADJUNCT attachment)
PP attachment cont.
This ambiguity multiplies with more PPs
– I saw the girl with the telescope
– I saw the girl with the telescope in the garden
– I saw the girl with the telescope in the garden on
the lawn
The syntax has no way to determine the
attachment, even if humans can.
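The growth in attachments can be illustrated with a small count. This is only a sketch, not anything XLE does internally: it assumes each new PP may attach to the VP, to the object NP, or to the NP inside an earlier PP, as long as that site is still on the right edge of the tree (no crossing attachments). The function name count_pp_attachments is hypothetical.

```python
def count_pp_attachments(num_pps, frontier_size=2):
    """Count attachment patterns for `num_pps` PPs following V NP.

    `frontier_size` is the number of open attachment sites on the right
    edge of the tree. It starts at 2: the VP and the object NP.
    """
    if num_pps == 0:
        return 1
    total = 0
    for site in range(frontier_size):
        # Attaching at `site` closes every site below it and opens one new
        # site (the NP inside the PP that was just attached).
        total += count_pp_attachments(num_pps - 1, site + 2)
    return total


if __name__ == "__main__":
    for n in range(4):
        print(n, count_pp_attachments(n))
```

Under these assumptions the counts for 0, 1, 2 and 3 PPs are 1, 2, 5 and 14.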
Ambiguity in coordination
Vacuous ambiguity of non-branching trees
– this can be avoided (pushup)
Legitimate ambiguity
– old men and women
old [N men and women]
[NP old men ] and [NP women ]
– I turned and pushed the cart
I [V turned and pushed ] the cart
I [VP turned ] and [VP pushed the cart ]
Grammar Engineering and ambiguity
Large-scale grammars will have lexical and
syntactic ambiguities
With real data they will interact, resulting in
many parses
– these parses are (syntactically) legitimate
– they are not intuitive to humans
(but more plausible words can make them better)
XLE provides tools to manage ambiguity
– grammar writer interfaces
– computation
XLE display
Four windows
– c-structure (top left)
– f-structure (bottom left)
– packed f-structure (top right)
– choice space (bottom right)
C-structure and f-structure “next” buttons
Other two windows are packed
representations of all the parses
– clicking on a choice will display that choice in the
left windows
Example
I see the girl in the garden
PP attachment ambiguity
– both ADJUNCTS
– difference in ADJUNCT-TYPE
Packed F-structure and Choice space
Sorting through the analyses
“Next” button on c-structure and then f-structure windows
– impractical with many choices
– independent vs. interacting ambiguities
– hard to detect spurious ambiguity
The packed representations show all the
analyses at once
– (in)dependence more visible
– click on choice to view
– spurious ambiguities appear as blank choices
» but legitimate ambiguities may also do so
Ambiguity Demo
– eng-week4-demo.lfg
– eng-week4-demo-test.lfg
Attachment
– the girl ate the banana with the monkey
Subcategorization
– the girl thought about the banana
Feature
– the sheep laughed
All three (2 c-structures; 8 analyses)
– the girl thought about the banana with the monkey
XLE Ambiguity Management
The sheep liked the fish.
How many sheep?
How many fish?
Options multiplied out
The sheep-sg liked the fish-sg.
The sheep-pl liked the fish-sg.
The sheep-sg liked the fish-pl.
The sheep-pl liked the fish-pl.
Options packed
The sheep {sg | pl} liked the fish {sg | pl}
Packed representation is a “free choice” system
– Encodes all dependencies without loss of information
– Common items represented, computed once
– Key to practical efficiency
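The relation between the packed and the multiplied-out forms can be sketched in a few lines. This is an illustration only, with hypothetical names (packed, multiply_out); it assumes the choices are fully independent, which is exactly the "free choice" case.

```python
from itertools import product

# Each ambiguous position lists its alternatives once (the packed form).
packed = [
    ("sheep", ["sg", "pl"]),
    ("fish", ["sg", "pl"]),
]


def multiply_out(packed_choices):
    """Expand a free-choice packed representation into full readings."""
    names = [name for name, _ in packed_choices]
    alternatives = [alts for _, alts in packed_choices]
    return [dict(zip(names, combo)) for combo in product(*alternatives)]


readings = multiply_out(packed)

if __name__ == "__main__":
    for reading in readings:
        print(reading)
```

The packed form stores two alternatives per position, while the expansion has one reading per combination.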
Dependent choices
Das Mädchen {nom | acc} sah die Katze {nom | acc}
The girl saw the cat
Again, packing avoids duplication … but it’s wrong.
It doesn’t encode all dependencies; the choices are not free.
Das Mädchen-nom sah die Katze-nom: bad
Das Mädchen-nom sah die Katze-acc: The girl saw the cat
Das Mädchen-acc sah die Katze-nom: The cat saw the girl
Das Mädchen-acc sah die Katze-acc: bad
Who do you want to succeed?
I want to succeed John (want intrans, succeed trans)
I want John to succeed (want trans, succeed intrans)
Solution: Label dependent choices
Das Mädchen {p:nom | ¬p:acc} sah die Katze {q:nom | ¬q:acc}
Das Mädchen-nom sah die Katze-nom: bad
Das Mädchen-nom sah die Katze-acc: The girl saw the cat
Das Mädchen-acc sah die Katze-nom: The cat saw the girl
Das Mädchen-acc sah die Katze-acc: bad
φ = (p ∧ ¬q) ∨ (¬p ∧ q)
• Label each choice with distinct Boolean variables p, q, etc.
• Record acceptable combinations as a Boolean expression φ
• Each analysis corresponds to a satisfying truth-value assignment
(free choice from the true lines of φ’s truth table)
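The labelling idea can be sketched by enumerating truth assignments. This sketch assumes p means "Mädchen is nominative", q means "Katze is nominative", and the constraint is that exactly one of the two NPs is nominative. The function names are hypothetical and this is not XLE's actual machinery.

```python
from itertools import product


def phi(p, q):
    """Assumed constraint: exactly one of the two NPs is nominative."""
    return (p and not q) or (not p and q)


def satisfying_readings():
    """List the case assignments for the true lines of phi's truth table."""
    readings = []
    for p, q in product([True, False], repeat=2):
        if phi(p, q):
            readings.append({
                "Mädchen": "nom" if p else "acc",
                "Katze": "nom" if q else "acc",
            })
    return readings


if __name__ == "__main__":
    for reading in satisfying_readings():
        print(reading)
```

Only two of the four assignments survive, corresponding to "The girl saw the cat" and "The cat saw the girl".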
Ambiguity and Robustness
Large-scale grammars are massively
ambiguous
Grammars parsing real text need to be robust
– "loosening" rules to allow robustness increases
ambiguity even more
Need a way to control the ambiguity
– version of Optimality Theory (OT)
Theoretical OT
Grammar has a set of violable constraints
Constraints are ranked by each language
– This gives cross-linguistic variation
Candidates (analyses) compete
– John waited for Mary. vs. John waited for 3 hours.
Constraint ranking determines winning candidate
Issues for XLE
– Candidates can be very ungrammatical
» we have a grammar to produce grammatical analyses
» even with robust, ungrammatical analyses, these are controlled
– Generation, not parsing direction
» we know what the string is already
» for generation we have a very specified analysis
XLE OT
Incorporate idea of ranking and (dis)preference
Filter syntactic and lexical ambiguity
Reconcile robustness and accuracy
Allow parsing grammar to be used for
generation
XLE OT Implementation
OT marks in
– grammar rules
– templates
– lexical entries
CONFIG states
– preference vs. dispreference
– ranking
– parsing vs. generation orders
The o:: projection
OT marks are not f-structure features
OT marks are in their own projection
f-structure
c-structure
o-structure
(set of OT marks)
The o:: projection
The o-structure is just a set of marks
{ PPadj GuessedN }
Instead of ^ and !, have o::*
(NB: ! = f::*)
PP: (^ ADJUNCT)=!
PPadj $ o::* ;
– the f-structure is exactly the same
– there is now an additional o-structure
Ranking analyses
Specify relative importance of OT marks in the CONFIG
OPTIMALITYORDER Mark3 Mark2 +Mark1.
Importance
Comparing analyses
– Find most important mark where the analyses differ
– Prefer the analysis with the
» fewest dispreference marks (no +)
» most preference marks (+)
Ranking analyses (continued)
OPTIMALITYORDER Mark3 Mark2 +Mark1.
Importance
an analysis with Mark2 is preferred over an analysis
with Mark3
an analysis with no mark is preferred over an analysis
with Mark2 or Mark3
an analysis with one Mark2 is preferred over one with
two Mark2
an analysis with Mark1 is preferred over an analysis
with no mark
an analysis with two Mark1 is preferred over an
analysis with one Mark1
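The comparison procedure can be sketched as a sort key. This is a toy model, not XLE's implementation: it assumes marks are listed from most to least important, and the names parse_order, rank_key and best_analyses are hypothetical. An analysis is represented simply as a list of mark names.

```python
from collections import Counter


def parse_order(order):
    """Turn 'Mark3 Mark2 +Mark1' into [(name, is_preference), ...]."""
    return [(m.lstrip("+"), m.startswith("+")) for m in order.split()]


def rank_key(marks, order):
    """Sort key for one analysis; smaller keys are better.

    ASSUMPTION: marks are listed from most to least important.
    """
    counts = Counter(marks)
    key = []
    for name, is_preference in parse_order(order):
        n = counts[name]
        # More preference marks is better; fewer dispreference marks is better.
        key.append(-n if is_preference else n)
    return tuple(key)


def best_analyses(analyses, order):
    """Return the analyses (lists of mark names) that are optimal."""
    best = min(rank_key(a, order) for a in analyses)
    return [a for a in analyses if rank_key(a, order) == best]


ORDER = "Mark3 Mark2 +Mark1"

if __name__ == "__main__":
    print(best_analyses([["Mark2"], ["Mark3"]], ORDER))
    print(best_analyses([["Mark1"], []], ORDER))
```

Under that assumption the sketch reproduces the five preferences listed above.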
Difference with Theoretical OT
Theoretical OT: only dispreference marks
XLE OT:
– dispreference marks: Mark1
– preference marks: +Mark1
– NOTE: + is only indicated in the CONFIG
only the name (Mark1) appears in the
grammar
Deciding which to use can be difficult
Example: PP ambiguities
John waited for Mary.
John waited for 3 hours.
Rule with OT marks
Using template OT(_mark) = _mark $ o::*.
VP --> V
       (NP: (^ OBJ)=!)
       PP*: { (^ OBL)=!
              @(OT PPobl)
            | ! $ (^ ADJUNCT)
              @(OT PPadj)}.
Basic Structures
John waited for Mary
f-str:
[ PRED 'wait<SUBJ>'
SUBJ [ PRED 'John']
ADJ {[ PRED 'for<OBJ>'
OBJ [ PRED 'Mary' ]]}]
o-str:
{ PPadj }
John waited for Mary
f-str:
[ PRED 'wait<SUBJ OBL>'
SUBJ [ PRED 'John']
OBL [ PRED 'for<OBJ>'
OBJ [ PRED 'Mary' ]]]
o-str:
{ PPobl }
Ranking for Example
Disprefer ADJUNCTs
– OPTIMALITYORDER PPadj.
– Problem: will disprefer adjuncts even when no
OBL analysis is possible
Prefer OBLs
– OPTIMALITYORDER +PPobl.
– Problem: will prefer OBL even when the other
analysis was not an ADJUNCT
– Still probably better than dispreferring ADJUNCTs
– Solution: local OT marks (not discussed here)
Special OT marks in XLE
Separate other marks into fields
Marks preceding
– NOGOOD: remove parts of the grammar
for debugging or specializing
– STOPPOINT: apply on a second pass
for extending grammar on failure
– CSTRUCTURE: filter when the c-structure is built
for speed
There is lots of discussion in the XLE
documentation; the reading on the web is a
bit out of date for these marks
The NOGOOD Mark
OT marks can be used to remove parts of the
grammar
– rules or rule parts
– templates or template parts
– lexical items or parts of them
Use for
– grammar adaptation/sharing
– grammar development
Example
– OPTIMALITYORDER FrontMatter NOGOOD.
NOGOOD Example
ROOT rule allows for front matter for special
corpus
ROOT --> (FR-MAT: (^ ID)=!
@(OT FrontMatter))
S.
FR-MAT --> NUMBER
(PERIOD).
1. The light flashes.
FR-MAT
Grammars for corpora with front matter will
not rank the OT mark FrontMatter
(unranked marks are neutral)
Grammars for corpora without front matter will
make the OT mark a NOGOOD
OPTIMALITYORDER FrontMatter NOGOOD.
Effective ROOT rule: ROOT --> S.
Allows rule sharing across grammars
Can also be used for debugging
Robustness
What to do if the grammar doesn't provide an
analysis?
Graceful failure
– FRAGMENTs
– Specific relaxations
Ungrammatical analysis only if no
grammatical one
Avoid ungrammatical analyses in generation
Robustness: STOPPOINT
On first pass, STOPPOINT is treated as
NOGOOD
Small, fast grammar for standard constructions
If first pass fails, ignore STOPPOINT and
extend grammar
– Relaxation possibilities precede STOPPOINT
– OPTIMALITYORDER BadDetNAgr STOPPOINT.
STOPPOINT Mark example
Example: NP: this boy
NP: this boys
Template call with OT mark:
DEMON(_P _N) = (^ SPEC PRED)='_P'
{ (^ NUM)=c _N
|(^ NUM)~= _N
@(OT BadDetNAgr)}.
Lexical entry:
this DET XLE @(DEMON %stem sg).
Ranking
OPTIMALITYORDER BadDetNAgr STOPPOINT.
Structures for STOPPOINT example
NP: this boy
f-str
[ PRED 'boy'
NUM sg
SPEC [ PRED 'this' ]]
o-str
NP: this boys
f-str
[ PRED 'boy'
NUM pl
SPEC [ PRED 'this' ]]
o-str
{ BadDetNAgr }
Parsing this boys will be slow: the grammar
has to parse a second time
But the ungrammatical input gets a parse
Only put OT marks behind the STOPPOINT
if they will be rarely triggered
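The two-pass behaviour can be sketched as follows. This is a toy model covering dispreference marks only, not XLE's implementation: each candidate analysis is represented by its set of OT marks, and the name two_pass_parse is hypothetical.

```python
def two_pass_parse(analyses, stoppoint_marks):
    """Return (pass_number, surviving analyses).

    `analyses` is a list of sets of OT mark names, one set per candidate.
    `stoppoint_marks` are the marks listed before STOPPOINT.
    """
    # Pass 1: marks before STOPPOINT act like NOGOOD, so any analysis
    # carrying one of them is unavailable.
    first_pass = [a for a in analyses if not (a & stoppoint_marks)]
    if first_pass:
        return 1, first_pass
    # Pass 2: only reached when pass 1 yields nothing; the marks are now
    # allowed, so the relaxed analyses become available.
    return 2, analyses


STOPPOINT_MARKS = {"BadDetNAgr"}
this_boy = [set()]             # agreement satisfied, no mark
this_boys = [{"BadDetNAgr"}]   # agreement relaxed, mark added

if __name__ == "__main__":
    print(two_pass_parse(this_boy, STOPPOINT_MARKS))
    print(two_pass_parse(this_boys, STOPPOINT_MARKS))
```

The grammatical input is handled in one pass, while the ungrammatical input needs the second pass, which is why rarely triggered marks are the best candidates for this field.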
Preference marks and STOPPOINT
Preference marks behind the STOPPOINT
are tried first (counter to intuition)
– OPTIMALITYORDER +MWE STOPPOINT.
Use MWE readings if at all possible
If fail, do a second pass with the analytic
(non-MWE) structure (inefficient if fail)
Example:
print` quality N * @(NOUN %STEM) @(OT MWE).
The [N print quality] is excellent.
I want to [V print] [NP quality documents].
CSTRUCTURE Marks
Apply marks before f-structure constraints are
processed
– OPTIMALITYORDER NoCloseQuote Guessed
CSTRUCTURE.
Improve performance by filtering early
May lose some analyses
– coverage/efficiency tradeoff
CSTRUCTURE example: Guessed
Only use guessed form if another form is not
found in the morphology/lexicon
– OPTIMALITYORDER Guessed CSTRUCTURE.
Trade-off: lose some parses, but much faster
The foobar is good.
no entry for foobar ==> parse with guessed N
The audio is good.
audio: only A in morphology ==> no parse
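The effect on lexical lookup can be sketched as below. The toy lexicon and the function lexical_categories are hypothetical, and the sketch only mimics the outcome described here rather than how XLE filters c-structures.

```python
def lexical_categories(word, lexicon, guessed_category="N"):
    """Return (category, marks) pairs for a word.

    The guessed reading is offered only when the word has no entry at all,
    mimicking the effect of ranking Guessed before CSTRUCTURE.
    """
    known = lexicon.get(word, [])
    if known:
        return [(category, set()) for category in known]
    return [(guessed_category, {"Guessed"})]


# Toy lexicon (hypothetical): 'audio' is listed only as an adjective.
LEXICON = {"the": ["D"], "is": ["V"], "good": ["A"], "audio": ["A"]}

if __name__ == "__main__":
    print(lexical_categories("foobar", LEXICON))
    print(lexical_categories("audio", LEXICON))
```

Because audio already has an adjective entry, no guessed noun is offered for it, so "The audio is good." is left without the noun reading it needs.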
CSTRUCTURE example: Quote
Only allow unbalanced quote marks if there is
no other quote mark
Then I left."
vs.
He said, "they appeared."
METARULEMACRO: …
_CAT QT: @(OT NoCloseQt);
…
XLE only tries balanced version, not double
unbalanced version
– failure when really needed two unbalanced quotes
Combining the OT marks
All the types of OT marks can be used in one
grammar
– ordering of NOGOOD, CSTRUCTURE, and
STOPPOINT is important
Example
OPTIMALITYORDER
Verbmobil NOGOOD
Guessed CSTRUCTURE
+MWE Fragment STOPPOINT
RareForm StrandedP +Obl.
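How the special marks divide the statement into fields can be sketched with a small reader. This is an illustration under simplifying assumptions: it ignores grouping parentheses and the * prefix, and split_fields is a hypothetical helper, not an XLE command.

```python
SPECIAL = ("NOGOOD", "CSTRUCTURE", "STOPPOINT")


def split_fields(order_text):
    """Group the marks of an OPTIMALITYORDER statement by field.

    Marks are assigned to the next special mark that follows them; marks
    after the last special mark go in the 'ranked' field.
    """
    tokens = order_text.replace("OPTIMALITYORDER", "").rstrip(". \n").split()
    fields = {name: [] for name in SPECIAL}
    fields["ranked"] = []
    pending = []
    for token in tokens:
        if token in SPECIAL:
            fields[token].extend(pending)
            pending = []
        else:
            pending.append(token)
    fields["ranked"].extend(pending)
    return fields


EXAMPLE = """OPTIMALITYORDER
Verbmobil NOGOOD
Guessed CSTRUCTURE
+MWE Fragment STOPPOINT
RareForm StrandedP +Obl."""

if __name__ == "__main__":
    print(split_fields(EXAMPLE))
```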
Other Features
Grouping: have marks treated as being of
equal importance
– OPTIMALITYORDER (Paren Appositive) Adjunct.
Ungrammatical markup: have XLE report
analyses with this mark with a *
– these are treated like any dispreference mark for
determining the optimal analyses
– OPTIMALITYORDER *NoDetAgr STOPPOINT.
Generation
XLE uses the same basic grammar to parse
and generate
Do not always want to generate all the
possibilities that can be parsed
Put in special OT marks for generation to block
or prefer certain strings
– fix up bad subject-verb agreement
– only allow certain adverb placements
– control punctuation options
GENOPTIMALITYORDER
OT Marks: Main points
Ambiguity: broad coverage results in
ambiguity – OT marks allow preferences
Robustness: want fall back parses only when
regular parses fail – OT marks allow
multipass grammar
XLE provides for complex orderings of OT
marks
– NOGOOD, CSTRUCTURE, STOPPOINT
– preference, dispreference, ungrammatical
– see the XLE documentation for details
FRAGMENT grammar
What to do when the grammar does not get a
parse
– always want some type of output
– want the output to be maximally useful
Why might it fail:
– construction not covered yet
– "bad" input
– took too long (XLE parsing parameters)
Grammar engineering approach
First try to get a complete parse
If fail, build up chunks that get complete
parses (c-str and f-str)
Have a fall back for things without even chunk
parses
Link these chunks and fall backs together in a
single f-structure
Basic idea
XLE has a REPARSECAT which it tries if
there is no complete parse
Grammar writer specifies what category the
possible chunks are
OT marks are used to
– build the fewest chunks possible
– disprefer using the fall back over the chunks
Sample output
the the dog appears.
Split into:
– "token" the
– sentence "the dog appears"
– ignore the period
C-structure
F-structure
How to get this
FRAGMENTS -->
{ NP: (^ FIRST)=!
@(OT-MARK Fragment)
|S: (^ FIRST)=!
@(OT-MARK Fragment)
|TOKEN: (^ FIRST)=!
@(OT-MARK Fragment) }
(FRAGMENTS: (^ REST)=! ).
Lexicon: -token TOKEN * (^ TOKEN)=%stem
@(OT-MARK Token).
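The combined effect of the Fragment and Token marks can be sketched as a search for the cheapest segmentation. This is a toy model, not XLE's algorithm. It assumes Token is ranked as worse than Fragment, so the cost is the pair (number of Token marks, number of Fragment marks), compared in that order. The name best_chunking and the span set are hypothetical.

```python
from functools import lru_cache


def best_chunking(tokens, parsable_spans):
    """Pick a segmentation minimising (Token marks, Fragment marks).

    `parsable_spans` is a set of (start, end) pairs that the grammar can
    parse as a chunk; any single word may also be a fallback TOKEN.
    """
    n = len(tokens)

    @lru_cache(maxsize=None)
    def solve(start):
        if start == n:
            return (0, 0), ()
        options = []
        for end in range(start + 1, n + 1):
            if (start, end) in parsable_spans:
                (tok, frag), rest = solve(end)
                options.append(((tok, frag + 1), (("CHUNK", start, end),) + rest))
        # Fallback: consume one word as a TOKEN (one Token and one Fragment mark).
        (tok, frag), rest = solve(start + 1)
        options.append(((tok + 1, frag + 1), (("TOKEN", start, start + 1),) + rest))
        return min(options)

    return solve(0)


if __name__ == "__main__":
    words = ["the", "the", "dog", "appears"]
    # Assumed chunk parses: "the dog" (NP) and "the dog appears" (S).
    print(best_chunking(words, {(1, 3), (1, 4)}))
```

For "the the dog appears" with those assumed spans, the cheapest result is one TOKEN followed by one sentence chunk, matching the sample output.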
Why First-Rest?
FIRST-REST
[ FIRST [ PRED …]
REST [ FIRST [ PRED … ]
REST … ] ]
– Efficient
– Encodes order
Possible alternative: set
{ [ PRED … ]
[ PRED … ] }
– Not as efficient (copying)
– Even less efficient if scope facts must be marked
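The FIRST-REST encoding can be sketched with nested dictionaries standing in for f-structures. The helper names first_rest and flatten are hypothetical; the point is only that the nesting preserves the order of the chunks.

```python
def first_rest(chunks):
    """Build a nested FIRST/REST structure from an ordered list of chunks."""
    structure = None
    # Build from the right so each REST holds the remainder of the list.
    for chunk in reversed(chunks):
        node = {"FIRST": chunk}
        if structure is not None:
            node["REST"] = structure
        structure = node
    return structure


def flatten(structure):
    """Recover the chunks in their original order."""
    chunks = []
    while structure is not None:
        chunks.append(structure["FIRST"])
        structure = structure.get("REST")
    return chunks


if __name__ == "__main__":
    example = [{"TOKEN": "the"}, {"PRED": "appear"}]
    print(first_rest(example))
    print(flatten(first_rest(example)))
```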
Accuracy?
Evaluation against gold standard
“PARC 700” f-structure bank for Wall Street Journal
Measure: F-score on dependency triples
– F-score: harmonic mean of precision and recall
– Dependency triples: separate f-structure features
Subj(run, dog) Tense(run, past)
Results for best-matching f-structure:
– Full parses: F=88.5
– Fragment parses: F=76.7
(Riezler et al., 2002)
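The measure can be sketched on sets of dependency triples. The F-score is computed here as the harmonic mean of precision and recall, the usual definition. The triples below are made up for illustration and are not taken from the PARC 700, and f_score is a hypothetical helper rather than the evaluation code used in the cited work.

```python
def f_score(gold, predicted):
    """Precision, recall and F-score over sets of dependency triples."""
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Illustrative triples only: (feature, head, value).
gold = {("Subj", "run", "dog"), ("Tense", "run", "past")}
predicted = {("Subj", "run", "dog"), ("Tense", "run", "pres")}

if __name__ == "__main__":
    print(f_score(gold, predicted))
```

With one of two triples matching on each side, precision, recall and F-score all come out at 0.5.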
Fragments summary
XLE has a chunking strategy for when the
grammar does not provide a full analysis
Each chunk gets full c-str and f-str
The grammar writer defines the chunks based
on what will be best for that grammar and
application
Quality
– Fragments have reasonable but degraded f-scores
– Usefulness in applications is being tested