Max-Margin Matching for Semantic Role Labeling
David Vickrey
James Connor
Daphne Koller
Stanford University
Overview
We consider two complementary models for the same task
In our case, one discriminative, one generative
Combine the models by feeding their output predictions into a discriminative classifier
Try two different classifiers for combining: a multi-class SVM and max-margin matching
Semantic Role Labeling
Label the arguments of a verb in context
Example: "I gave the dog a bone." (I = Giver, the dog = Recipient, a bone = Gift)
PropBank: 1M labeled words (Wall Street Journal)
Syntactic Model
Most useful features:
Argument word / part of speech
Path from the argument to the verb in the parse tree
[Parse tree for "I gave the dog a bone": (S (NP I) (VP (V gave) (NP dog) (NP bone)))]
Use a standard classifier, e.g. SVM
One vs. All classifier for each possible argument type
Trained across all verbs at once
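As a minimal sketch of this one-vs-all setup (not the authors' code), the toy candidates, features, and sklearn classifier below stand in for the real feature set and the full PropBank training data:

```python
# Minimal sketch of the one-vs-all syntactic classifier; the toy features
# and data below are illustrative, not the authors' feature set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

# Candidate arguments of "gave" in "I gave the dog a bone", described by the
# two key feature types: head word / part of speech, and parse-tree path.
candidates = [
    {"word": "I",    "pos": "PRP", "path": "NP↑S↓VP↓VBD"},
    {"word": "dog",  "pos": "NN",  "path": "NP↑VP↓VBD"},
    {"word": "bone", "pos": "NN",  "path": "NP↑VP↓VBD#2"},
]
gold = ["Giver", "Recipient", "Gift"]  # argument types, shared across verbs

vec = DictVectorizer()
X = vec.fit_transform(candidates)

# One binary SVM per argument type ("one vs. all"); its decision_function
# value is the margin that later feeds the combining classifier.
classifiers = {}
for arg in set(gold):
    y = [1 if g == arg else -1 for g in gold]
    classifiers[arg] = LinearSVC().fit(X, y)

margins = {arg: clf.decision_function(X) for arg, clf in classifiers.items()}
print(margins)  # margin of each classifier on each candidate word
```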
Semantic Model
Google Sets problem: given a few words (dog, cat, he, …), find the category they belong to
Our idea: use a word hierarchy to find categories of words
Data set: words occurring as "Eater" for the verb eat
Usually, these will be either a person or an animal
We want to generalize to unseen animals or people
Used WordNet as the hierarchy
Selected categories using a Bayesian score (to be presented tomorrow: "Bayesian Methods for Natural Language Processing")
On its own, improves log-likelihood of test sets on PropBank
Train one model for each argument of each verb
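A minimal sketch of how such a per-(verb, argument) model over WordNet categories might score a word; the category set, mixture weights, and smoothing below are hypothetical stand-ins for the Bayesian selection described above:

```python
# Minimal sketch of scoring words under a semantic model for ("eat", "Eater");
# the categories and weights are hypothetical, not the learned model.
import math
from nltk.corpus import wordnet as wn  # needs: nltk.download("wordnet")

# Hypothetical WordNet categories selected for the Eater of "eat".
categories = {"person.n.01": 0.6, "animal.n.01": 0.4}

def in_category(word, category):
    """True if some noun sense of `word` has `category` as a hypernym."""
    cat = wn.synset(category)
    return any(
        s == cat or cat in s.closure(lambda x: x.hypernyms())
        for s in wn.synsets(word, pos=wn.NOUN)
    )

def log_score(word, smoothing=1e-4):
    # Unnormalized mixture: mass of the categories containing the word,
    # plus smoothing so unseen words get a finite log-score.
    # (A proper model would also spread each category's mass over its words.)
    p = sum(wt for c, wt in categories.items() if in_category(word, c))
    return math.log(p + smoothing)

print(log_score("dog"))     # covered by animal.n.01 -> relatively high
print(log_score("pickle"))  # in neither category   -> near log(smoothing)
```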
Combining Models
For each word w in a sentence with verb v, for each possible argument (Giver, Gift, Recipient, etc.), compute:
Margin of the One vs. All classifier using syntactic features
log P_v(Arg | w) using the semantic model trained on Arg and verb v
Use these as inputs to a multi-class SVM
One weight for each argument for each model (not specific to v)
Example: "I wanted my dog to eat the pickle." Is "dog" the Eater or the Food?

Syntax (One vs. All margins):
        I      dog    pickle
Eater   1.0    1.0    -1.0
Food   -1.5    1.2     1.2

Semantic (log P_v(Arg | w)):
        I      dog    pickle
Eater  -0.05  -0.29   -4.61
Food   -3.0   -1.39   -0.01

Combined scores for "dog": Eater: 1.0a + (-0.29)b; Food: 1.2c + (-1.39)d
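As a minimal sketch, here is the combined score for "dog" computed from the numbers above; the weight values (a, b) and (c, d) are hypothetical, standing in for what the multi-class SVM learns (one pair per argument type, shared across verbs):

```python
# Combined score for the word "dog" under each argument type; the margins and
# log-probabilities come from the example tables, the weights are hypothetical.
syntax_margin = {"Eater": 1.0,   "Food": 1.2}    # one-vs-all margins for "dog"
semantic_logp = {"Eater": -0.29, "Food": -1.39}  # log P_v(Arg | "dog"), verb "eat"
weights = {"Eater": (1.0, 0.5), "Food": (1.0, 0.5)}  # hypothetical (a, b), (c, d)

scores = {
    arg: a * syntax_margin[arg] + b * semantic_logp[arg]
    for arg, (a, b) in weights.items()
}
prediction = max(scores, key=scores.get)  # per-word argmax, before any matching
print(scores, prediction)  # Eater wins: 0.855 vs. 0.505
```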
Results
Tested on first 500 frames (~ ½ of the data)
Features            Training          F1
Syntax              None              78.9
Syntax + Semantic   None              78.9
Syntax              Multi-class SVM   80.2
Syntax + Semantic   Multi-class SVM   80.4
Max-Margin Matching
Each argument should be assigned to only one word
Complete bipartite graph between the words {I, dog, pickle} and the arguments {Eater, Food}
[Figure: bipartite graph; e.g., the dog–Eater edge has weight 1.0a + (-0.29)b and the dog–Food edge has weight 1.2c + (-1.39)d]
Weight of edge from a word w to argument a:
Syntax only: margin of the classifier for a applied to w
Syntax + Semantic: (weighted) sum of the confidences of each model
Same set of weights as in the multi-class SVM
Apply max-margin matching learning to train these weights
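Decoding under the one-word-per-argument constraint can be posed as a maximum-weight bipartite matching; below is a minimal sketch using scipy's assignment solver, with hypothetical combined scores as edge weights (the max-margin training of the weights is not shown, and a real system would also let words and arguments go unmatched, e.g. via dummy nodes):

```python
# Decode by maximum-weight bipartite matching: each argument is assigned to
# a distinct word. Edge weights are hypothetical combined scores.
import numpy as np
from scipy.optimize import linear_sum_assignment

words = ["I", "dog", "pickle"]
args = ["Eater", "Food"]

# weight[i, j] = combined score of labeling words[i] with args[j]
weight = np.array([
    [ 0.4, -3.0],   # I
    [ 1.1,  0.5],   # dog
    [-3.3,  1.2],   # pickle
])

# linear_sum_assignment minimizes cost, so negate to maximize total weight;
# with more words than arguments, some words stay unassigned.
rows, cols = linear_sum_assignment(-weight)
for i, j in zip(rows, cols):
    print(f"{args[j]} -> {words[i]}")   # Eater -> dog, Food -> pickle
```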
Results
Can do matching for any type of training of weights
Training of Weights    Matching?   Syntax   Syntax + Semantic
None                   No          78.9     78.9
None                   Yes         80.2     79.9
Multi-class SVM        No          80.2     80.4
Multi-class SVM        Yes         81.4     82.5
Max-Margin Matching    Yes         81.4     82.4
Results Summary
Classifying using the confidences of the One vs. All classifiers can help
Combining the two models in a classifier worked
Improving the One vs. All classifier may remove this benefit
Able to improve using WordNet
However, this only worked with both matching and training!
Matching helped, but training the max-margin matching did not
Why? Not that much data (weights are shared across all verbs), and not that many parameters
Future Directions
Previous work* used a Markov random field over classification decisions
Can't do exact inference
Can include potentials besides one word per argument
We could try to extend max-margin matching similarly
Unlabeled data
Bootstrap between different classifiers
When to include an example? High confidence under a single classifier, or high confidence under the combined classifier?

* Toutanova, Haghighi, Manning, ACL 2005