Unification Based Parsing

Download Report

Transcript Unification Based Parsing

Chapter 16: Features and
Unification
Heshaam Faili
[email protected]
University of Tehran
Overview




Feature Structures and Unification
Unification-Based Grammars
Chart Parsing with Unification-Based
Grammars
Type Hierarchies
2
Agenda



we introduce the idea that grammatical categories like VPto,
Sthat, Non3sgAux, or 3sgNP, as well as the grammatical rules
like S→NP VP that make use of them, should be thought of as
objects that can have complex sets of properties associated with
them
The information in these properties is represented by
constraints, and so these kinds of models are often called
constraint based formalisms
Use these constraints for




Agreement (this flights*)
Subcategorization ()
Adding properties (features) to words
Adding some operation to rules in order to test equalities
3
Feature structures

We had a problem adding agreement to CFGs. What
we needed were features, e.g., a way to say:


A structure like this allows us to state properties, e.g.,
about a noun phrase


[number sg
person 3 ]
[cat NP
number sg
person 3 ]
Each feature (e.g., ‘number’) is paired with a value
(e.g., ‘sg’)

A bundle of feature-value pairs can be put into an attributevalue matrix (AVM)
4
examples
5
Feature paths

Values can be atomic (e.g. ‘sg’ or ‘NP’ or ‘3’), or can
be complex, and thus we can define feature paths
[cat NP
agreement [number sg
person 3]]


The value of the path [agreement number] is ‘sg’
A grammar with only atomic feature values can be
converted to a CFG


e.g. AVM on previous page  NP3,sg
However, when the values are complex, it is more expressive
than a CFG  can represent more linguistic phenomena
6
An Example for FS
7
Reentrancy (structure-sharing)


Feature structures embedded in feature structures
can share the same values
That is, two features have the exact same value—
they share precisely the same object as their value



we’ll indicate this with a tag like *1
In this example, the agreement features of both the
matrix sentence and the embedded subject are
identical
This is referred to as reentrancy
8
FS with shared value
(Reentrant)
9
Feature structures as graphs


Technically, feature structures are directed acyclic graphs
(DAGs)
So, the feature structure represented by the attribute-value
matrix (AVM):
[cat NP
agreement [number sg
person 3]]

is really the graph:
CAT
 np

sg

3
NUM

AGR

PER
10
Unification





Unification (U) = a basic operation to merge two
feature structures into a resultant feature structure
(FS)
The two feature structures must be compatible, i.e.,
have no values that conflict
Identical FSs:
 [number sg] U [number sg] = [number sg]
Conflicting FSs:
 [number sg] U [number pl] = Fail
Merging with an unspecified FS:
 [number sg] U [number []] = [number sg]
11
Unification (cont.)

Merging FSs with different features specified:
 [number sg] U [person 3] =
[number sg
person 3]

More examples:

[cat NP] U [agreement [number sg]] =
[cat NP
agreement [number sg]]

[agr [num sg]
subj [agr [num sg]]]
[agr [num sg]
subj [agr [num sg]]]
U
[subj [agr [num sg]]]=
12
Unification with Reentrancies

Remember that structure-sharing means they are the same
object:
13
Unification with Reentrancies

When unification takes place, shared values are copied over:
14
example
15
Subsumption

We can see that a more general feature structure
(less values specified) subsumes a more specific
feature structure
(1) [num sg]
(2) [per 3]
(3) [num sg
per 3]

So, we have the following subsumption relations,
where



(1) subsumes (3)
(2) subsumes (3)
(1) does not subsume (2), and (2) does not subsume (1)
16
Subsumption

F ⊑ G if and only if


1. For every feature x in F, F(x) ⊑ G(x)
(where F(x) means “the value of the
feature x of feature structure F”).
2. For all paths p and q in F such that F(p)
= F(q), it is also the case that G(p) = G(q).
17
Subsumption
18
Subsumption (partial order)
19
Semilattice, unification


F [] ⊑ F , so we can model the
sumsumption as a lattice , which [] at
the top of it
Unification F⊔G = most general H such
that F ⊑ H and G ⊑ H
20
Overview




Feature Structures
and Unification
Unification-Based
Grammars
Chart Parsing with
Unification-Based
Grammars
Type Hierarchies
21
Grammars with Feature Structures


CFG skeleton augmented with feature structure path
equations, i.e., each category has a feature structure
CFG skeleton
S  NP VP

Path equations
<NP agreement> = <VP agreement>
1. There can be zero or more path equations for each rule skeleton 
no longer atomic
2. When a path equation references constituents, they can only be
constituents from the CFG rule
e.g., <D agreement> = <Nom agreement> is an illegal equation
for the above rule! (But it would be fine for NP  Det Nom)
22
FEATURES STRUCTURES IN THE
GRAMMAR
23
agreement

subject-verb agreement and determiner
nominal agreement.
24
Agreement in Feature-Based Grammars
S  NP VP

Compare with the CFG case:
<S head> = <VP head>
S  3sgNP 3sgVP
S  PluralNP PluralVP
3sgVP 3sgVerb
3sgVP  3sgVerb NP
3sgVP  3sgVerb NP PP
3sgVP  3sgVerb PP
<NP head agr> = <VP head agr>
VP  V NP
<VP head> = <V head>
NP  Det Nom
<NP head> = <Nom head>
<Det head agr> = <Nom head agr>
Nom  Noun
etc.
<Nom head> = <Noun head>
Noun  flights
<Noun head agr num> = pl
25
Percolating Agreement Features
S  NP VP
<NP head agr> = <VP head agr>
VP  V NP
<VP head> = <V head>
NP  Det Nom
<NP head> = <Nom head>
<Det head agr> = <Nom head agr>
Nom  Noun
<Nom head> = <Noun head>
26
agreement
27
Head features in the grammar

An important concept shown in the previous
rules is that heads of grammar rules share
properties with their mothers

VP  V NP
<VP head> = <V head>

Knowing the head will tell you about the
whole phrase

This is important for many parsing techniques
28
Sub-categorization

We could specify subcategorization like so:

VP  V
<VP head subcat> = intrans

VP  V NP
<VP head subcat> = trans

VP  V NP NP
<VP head subcat> = ditrans


But values like ‘intrans’ do not correspond to anything
that the rules actually look like
To make SUBCAT better match the rules, we can
make it a list of a verb’s arguments, e.g. <NP,PP>
29
Handling Subcategorization
VP  V NP PP
<VP head> = <Verb head>
<VP head subcat> = <NP,PP>
V  leaves
<V head agr num> = sg
<V head subcat> = <NP,PP>
[head: *1[subcat < *2, *3> ]]
VP
PP
V
NP
[cat *2]
[cat *3]
leaves
There is also a longer, more formal
[head: *1[agr [num sg]
way to specify lists:
subcat < [cat np],
[cat pp] >
<NP,PP> is equivalent to:
[FIRST NP
REST [FIRST= PP
REST = <>]]
30
Subcategorization
31
Subcategorization frames


Subcategorization, or valency, or dependency is a very
important notion in capturing syntactic regularity And
there is a wide variety of arguments that a verb (or
noun or adjective) can take.
Some subcategorization frames for ask:







He
He
He
He
He
He
He
asked [Q “What was it like?”]
asked [S[wh] what it was like]
asked [NP her] [S[wh] what it was like]
asked [VP[to] to see you]
asked [NP her] [VP[to] to tell you]
asked [NP a question]
asked [NP her] [NP a question]
32
Subcategorization frame
33
Long Distance Dependencies




What cities does Continental service?
What flights do you have from Boston to Baltimore?
What time does that flight leave Atlanta?
wh-non-subject-question:




S → Wh-NP Aux NP VP
Should WH-NP agreed with NP
Gap-list : implemented by feature “GAP” are
passed through different trees
Fillers: (what cities) are put on the top of
gap list and should be unified to the
subcategorization frame of the verb
34
Long-Distance Dependencies
What is the earliest flight that you have _?
TOP (fill gap):
S  WH-word Be-copula NP
<NP gap> = <WH-word head>
MIDDLE (pass gap):
NP  D Nom
<NP gap> = <Nom gap>
Nom  Nom RelClause
<Nom gap> = <RelClause gap>
RelClause  RelPro NP VP
<RelClause gap> = <VP gap>
BOTTOM (identify gap):
VP  V
<VP gap> = <V subcat second>
S
Wh-wd
NP [gap *1]
Be-copula
[head *1]
D
Nom
[gap *1]
RelClause
Nom
RelPro
[gap *1]
NP
VP
[gap *1]
V
[subcat <NP, *1>]
35
Overview




Feature Structures
and Unification
Unification-Based
Grammars
Chart Parsing with
Unification-Based
Grammars
Type Hierarchies
36
Implementing Unification

How do we implement a check on unification?


i.e., given feature structures F1 and F2, return F, the unification of
F1 and F2
Unification is a recursive operation:

If a feature has an atomic value, see if the other FS has that feature
with the same value


If a feature has a complex value, follow the paths to see if they’re
compatible and have the same values at bottom


[F a] unifies with [], [F ], and [F a]
Does [F G1] unify with [F G2]? We have to inspect G1 and G2 to find
out.
To avoid cycles, we have to do an occur check to see if
we’ve seen a FS before or not
37
38
39
Base case
•One or both of the arguments has a null value.
•The arguments are identical.
•The arguments are non-complex and non-identical.
40
An Example
41
42


original arguments are neither identical, nor null, nor
atomic,
These arguments are also non-identical, non-null, and
non-atomic so the loop is entered again leading to a
recursive check of the values of the AGREEMENT features
43
f1 and f2 after the recursion adds the value of the new PERSON feature
44
The final structures of f1 and f2 at the end
45
Modifying a Early Parser to handle
Unification



Our grammar still has a context-free
backbone, so we could just parse a sentence
with a CFG and use the features to filter out
the ungrammatical sentences
But by utilizing unification as we parse, we
can eliminate parses that won’t work in the
end
e.g., we’ll eliminate NPs that don’t match in
agreement features with their VPs as we
parse, instead of ruling them out later
46
Changes to the Chart Representation


Each state will be extended to include the LHS
DAG (which can get augmented as it goes
along).
i.e., Add a feature structure (in DAG form) to
each state



So, S   NP VP, [0,0]
Becomes S   NP VP, [0,0], DagS
The predictor, scanner, and completer have to
pass in the DAG, so all three operations have
to be altered
47
Earley Parser with Unification
48
Predictor, Scanner, Completer
49
Unify States
50
Example (That flight )
51
Change to ENQUEUE

The enqueue procedure should also be
changed to use a subsumption test:


Do not add a state to the chart if an equivalent or
more general state is already there.
So, if Enqueue wants to add a singular
determiner state at [x, y], and the chart already
has a determiner state at [x, y] unspecified for
number, then Enqueue will not add it.
52
Why a Subsumption Test?

If we don't impose a subsumption restriction,
enqueue will add two states at [x, y], one expecting
to see a singular determiner, the other just a
determiner.



On seeing a singular determiner, the parser will advance the
dot on both rules, creating two edges (since singular will
unify with both singular and with unspecified).
As a result, we would get duplicate edges.
If we impose the restriction, and we see either a
single or plural determiner, and we advance the dot,
only one edge (singular or plural) gets created at [x,
y].
53
The Need for Copying

show me morning flights
54
Overview




Feature Structures
and Unification
Unification-Based
Grammars
Chart Parsing with
Unification-Based
Grammars
Type Hierarchies
55
Using Type Hierarchies


Instead of simple feature structures, formalisms like
Head-Driven Phrase Structure Grammar (HPSG) use
typed feature structures
Two problems right now:



What prevents us right now from specifying the following?
<number feminine>
How can we capture the fact that all values of NUMBER are
the same sort of thing, i.e., make a generalization?
Solution: use types
56
Type Systems

1. Each feature structure is labeled by a type.
[noun
CASE case]

2. Each type has appropriateness conditions
specifying what features are appropriate for it.




noun  [CASE case]
verb  [VFORM vform]
3. Types are organized into a type hierarchy.
4. Unification is modified to allow two different types
to unify.
57
Simple Type Hierarchy
58
Type Hierarchy

So, if



CASE is appropriate for noun, and
the value of CASE is case, and
we have the following type hierarchy
case
nom acc dat

Then, the following are possible feature structures:
[noun
CASE nom]
[noun
CASE acc]
[noun
[CASE dat]
59
Unification of types

Now, when we unify feature structures, we have to
unify types, too:
[CASE case] U [CASE nom] = [CASE nom]
[CASE nom] U [CASE acc] = fail

Let’s also assume that acc and dat have a common
subtype, obj
acc dat
obj

Then, we have the following unification:
[CASE acc] U [CASE dat] = [CASE obj]
60
Practices

16.2, 16.3, 16.6, 16.7,
61