Max-Margin Matching for Semantic Role Labeling
David Vickrey
James Connor
Daphne Koller
Stanford University
Overview
We consider two complementary models for the same task
In our case, one discriminative, one generative
Combine the models by feeding their output predictions into a discriminative classifier
Try two different classifiers for combining: a multi-class SVM and max-margin matching
Semantic Role Labeling
Label the arguments of a verb in context
Example: "I gave the dog a bone." (I = Giver, the dog = Recipient, a bone = Gift)
PropBank: 1M labeled words (Wall Street Journal)
Syntactic Model
Most useful features:
Argument word / part of speech
Path from the argument to the verb in the parse tree
[Parse tree for "I gave the dog a bone": (S (NP I) (VP (V gave) (NP dog) (NP bone)))]
Use a standard classifier, e.g. SVM
One vs. All classifier for each possible argument type
Trained across all verbs at once
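As a minimal sketch of this one-vs-all setup (not the authors' code), the toy candidates, features, and sklearn classifier below stand in for the real feature set and the full PropBank training data:

```python
# Minimal sketch of the one-vs-all syntactic classifier; the toy features
# and data below are illustrative, not the authors' feature set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

# Candidate arguments of "gave" in "I gave the dog a bone", described by the
# two key feature types: head word / part of speech, and parse-tree path.
candidates = [
    {"word": "I",    "pos": "PRP", "path": "NP↑S↓VP↓VBD"},
    {"word": "dog",  "pos": "NN",  "path": "NP↑VP↓VBD"},
    {"word": "bone", "pos": "NN",  "path": "NP↑VP↓VBD#2"},
]
gold = ["Giver", "Recipient", "Gift"]  # argument types, shared across verbs

vec = DictVectorizer()
X = vec.fit_transform(candidates)

# One binary SVM per argument type ("one vs. all"); its decision_function
# value is the margin that later feeds the combining classifier.
classifiers = {}
for arg in set(gold):
    y = [1 if g == arg else -1 for g in gold]
    classifiers[arg] = LinearSVC().fit(X, y)

margins = {arg: clf.decision_function(X) for arg, clf in classifiers.items()}
print(margins)  # margin of each classifier on each candidate word
```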
Semantic Model
Google Sets problem: given a few words (dog, cat, he, …), find the category they belong to
Our idea: use a word hierarchy to find categories of words
Data set: words occurring as "Eater" for the verb eat
Usually, these will be either a person or an animal
We want to generalize to unseen animals or people
Used WordNet as the hierarchy
Selected categories using a Bayesian score (to be presented tomorrow: "Bayesian Methods for Natural Language Processing")
On its own, improves log-likelihood of test sets on PropBank
Train one model for each argument of each verb
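A minimal sketch of how such a per-(verb, argument) model over WordNet categories might score a word; the category set, mixture weights, and smoothing below are hypothetical stand-ins for the Bayesian selection described above:

```python
# Minimal sketch of scoring words under a semantic model for ("eat", "Eater");
# the categories and weights are hypothetical, not the learned model.
import math
from nltk.corpus import wordnet as wn  # needs: nltk.download("wordnet")

# Hypothetical WordNet categories selected for the Eater of "eat".
categories = {"person.n.01": 0.6, "animal.n.01": 0.4}

def in_category(word, category):
    """True if some noun sense of `word` has `category` as a hypernym."""
    cat = wn.synset(category)
    return any(
        s == cat or cat in s.closure(lambda x: x.hypernyms())
        for s in wn.synsets(word, pos=wn.NOUN)
    )

def log_score(word, smoothing=1e-4):
    # Unnormalized mixture: mass of the categories containing the word,
    # plus smoothing so unseen words get a finite log-score.
    # (A proper model would also spread each category's mass over its words.)
    p = sum(wt for c, wt in categories.items() if in_category(word, c))
    return math.log(p + smoothing)

print(log_score("dog"))     # covered by animal.n.01 -> relatively high
print(log_score("pickle"))  # in neither category   -> near log(smoothing)
```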
Combining Models
For each word w in a sentence with verb v, for each possible argument (Giver, Gift, Recipient, etc.), compute:
Margin of the One vs. All classifier using syntactic features
log P_v(Arg | w) using the semantic model trained on Arg and verb v
Use these as inputs to a multi-class SVM
One weight for each argument for each model (not specific to v)
Example: "I wanted my dog to eat the pickle." Is "dog" the Eater or the Food?

Syntax (One vs. All margins):
        I      dog    pickle
Eater   1.0    1.0    -1.0
Food   -1.5    1.2     1.2

Semantic (log P_v(Arg | w)):
        I      dog    pickle
Eater  -0.05  -0.29   -4.61
Food   -3.0   -1.39   -0.01

Combined scores for "dog": Eater: 1.0a + (-0.29)b; Food: 1.2c + (-1.39)d
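As a minimal sketch, here is the combined score for "dog" computed from the numbers above; the weight values (a, b) and (c, d) are hypothetical, standing in for what the multi-class SVM learns (one pair per argument type, shared across verbs):

```python
# Combined score for the word "dog" under each argument type; the margins and
# log-probabilities come from the example tables, the weights are hypothetical.
syntax_margin = {"Eater": 1.0,   "Food": 1.2}    # one-vs-all margins for "dog"
semantic_logp = {"Eater": -0.29, "Food": -1.39}  # log P_v(Arg | "dog"), verb "eat"
weights = {"Eater": (1.0, 0.5), "Food": (1.0, 0.5)}  # hypothetical (a, b), (c, d)

scores = {
    arg: a * syntax_margin[arg] + b * semantic_logp[arg]
    for arg, (a, b) in weights.items()
}
prediction = max(scores, key=scores.get)  # per-word argmax, before any matching
print(scores, prediction)  # Eater wins: 0.855 vs. 0.505
```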
Results
Tested on first 500 frames (~ ½ of the data)
Features            Training          F1
Syntax              None              78.9
Syntax + Semantic   None              78.9
Syntax              Multi-class SVM   80.2
Syntax + Semantic   Multi-class SVM   80.4
Max-Margin Matching
Each argument should be assigned to only one word
Complete bipartite graph between the words {I, dog, pickle} and the arguments {Eater, Food}
[Figure: bipartite graph; e.g., the dog–Eater edge has weight 1.0a + (-0.29)b and the dog–Food edge has weight 1.2c + (-1.39)d]
Weight of edge from a word w to argument a:
Syntax only: margin of the classifier for a applied to w
Syntax + Semantic: (weighted) sum of the confidences of each model
Same set of weights as in the multi-class SVM
Apply max-margin matching learning to train these weights
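Decoding under the one-word-per-argument constraint can be posed as a maximum-weight bipartite matching; below is a minimal sketch using scipy's assignment solver, with hypothetical combined scores as edge weights (the max-margin training of the weights is not shown, and a real system would also let words and arguments go unmatched, e.g. via dummy nodes):

```python
# Decode by maximum-weight bipartite matching: each argument is assigned to
# a distinct word. Edge weights are hypothetical combined scores.
import numpy as np
from scipy.optimize import linear_sum_assignment

words = ["I", "dog", "pickle"]
args = ["Eater", "Food"]

# weight[i, j] = combined score of labeling words[i] with args[j]
weight = np.array([
    [ 0.4, -3.0],   # I
    [ 1.1,  0.5],   # dog
    [-3.3,  1.2],   # pickle
])

# linear_sum_assignment minimizes cost, so negate to maximize total weight;
# with more words than arguments, some words stay unassigned.
rows, cols = linear_sum_assignment(-weight)
for i, j in zip(rows, cols):
    print(f"{args[j]} -> {words[i]}")   # Eater -> dog, Food -> pickle
```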
Results
Can do matching for any type of training of weights
Training of Weights    Matching?   Syntax   Syntax + Semantic
None                   No          78.9     78.9
None                   Yes         80.2     79.9
Multi-class SVM        No          80.2     80.4
Multi-class SVM        Yes         81.4     82.5
Max-Margin Matching    Yes         81.4     82.4
Results Summary
Classifying using the confidences of the One vs. All classifiers can help
Combining the two models in a classifier worked
Improving the One vs. All classifier may remove this benefit
Able to improve using WordNet
However, this only worked with both matching and training!
Matching helped, but training the max-margin matching did not
Why? Not that much data (weights are shared across all verbs), and not that many parameters
Future Directions
Previous work* used a Markov random field over classification decisions
Can't do exact inference
Can include potentials besides one word per argument
We could try to extend max-margin matching similarly
Unlabeled data
Bootstrap between different classifiers
When to include an example? High confidence under a single classifier, or high confidence under the combined classifier?

* Toutanova, Haghighi, Manning, ACL 2005