Filtered Ranking for Bootstrapping in Event Extraction
FILTERED RANKING FOR BOOTSTRAPPING IN EVENT EXTRACTION
Shasha Liao
Ralph Grishman
@New York University
CONTENT
Introduction
Related work
Ranking methods in bootstrapping
System description
Experiments
Conclusion
INTRODUCTION
The goal of event extraction is to identify instances of a class of events, including their occurrences and arguments.
In this paper, we focus on identifying the occurrence of an event
Annotating large corpora to train supervised event extractors is expensive
Semi-supervised methods are trained from a small seed set and an unannotated corpus
Semi-supervised methods can greatly reduce human labor.
INTRODUCTION
Most semi-supervised event extractors seek to learn sets of patterns
Patterns typically consist of a predicate and some lexical or semantic constraints on its arguments.
Such patterns indicate that there is an event
For example: “ORG appointed PER as the vice president…”
An effective semi-supervised extractor should have good performance over a range of extraction tasks and corpora.
FLOW CHART
A typical bootstrapping approach
[Flow chart: Seeds feed a Pattern Ranking Function over the Untagged Corpus; accepted New Patterns are added back to the seed set; a Stop? test either exits (Yes) or iterates (No).]
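The loop can be summarized in a few lines of Python. This is a schematic sketch, not the authors' code; the candidate extractor, the stopping criterion, and top_k are illustrative assumptions.

```python
def bootstrap(seeds, extract_candidates, rank, stop, top_k=5):
    """Grow a pattern set from seeds by repeatedly accepting the
    top-ranked candidate patterns until the stopping test fires.

    extract_candidates: callable returning a set of candidate patterns
    found near the accepted patterns in the untagged corpus (assumed);
    rank: pattern ranking function; stop: stopping criterion.
    """
    accepted = set(seeds)
    while not stop(accepted):
        candidates = extract_candidates(accepted) - accepted
        new = sorted(candidates, key=rank, reverse=True)[:top_k]
        if not new:
            break  # nothing left to accept: exit
        accepted.update(new)
    return accepted
```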
RELATED WORK
Document-centric method
Riloff (1996)
Yangarber et al. (2000)
Surdeanu et al. (2006)
Patwardhan and Riloff (2007)
Similarity-centric method
Stevenson and Greenwood (2005) (S&G)
Greenwood and Stevenson (2006)
RANKING METHODS IN BOOTSTRAPPING
Document-centric method
Find patterns with high frequency in relevant documents and low frequency in irrelevant documents.
Good for extracting patterns for a scenario, which involves related events (hiring and firing, attacks and injuries).
Corpus selection is quite important.
Similarity-centric method
Find patterns with high lexical similarities.
Good for extracting patterns of the same event type
No extra corpus is needed, although one can be used
Suffers from polysemy when computing lexical similarities
RANKING METHODS IN BOOTSTRAPPING
Our assumption is more restrictive:
patterns that appear in relevant documents and are lexically similar are most likely to be relevant.
We limit the effect of ambiguous patterns by narrowing the search to relevant documents
We limit irrelevant patterns in relevant documents by a word similarity restriction.
Many combinations are possible; we propose one that uses word similarity as a filter:
$$\mathit{Filter}(p) = \begin{cases} \mathit{DocumentRanking}(p) & \text{if } \mathit{SimilarityRanking}(p) \ge t \\ 0 & \text{otherwise} \end{cases}$$
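A minimal sketch of this filter in Python, assuming document_ranking and similarity_ranking are scoring functions and the threshold t is tuned on held-out data (all names here are illustrative):

```python
def filtered_rank(pattern, document_ranking, similarity_ranking, t):
    """Score a pattern by the document-centric ranker, but only if it
    passes the word-similarity filter; otherwise drop it to 0."""
    if similarity_ranking(pattern) >= t:
        return document_ranking(pattern)
    return 0.0
```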
SYSTEM DESCRIPTION
Pre-processing:
Tokenization, stemming, name tagging, semantic labeling
GLARF – logical grammatical and predicate-argument
representation
SURFACE: from the parse tree
LOGIC1: grammatical logical roles; regularizes phenomena like passive, relative clauses, etc.
LOGIC2: predicate-argument roles, corresponding to PropBank & NomBank; generally “arg0” for SBJ (agent) and “arg1” for OBJ (patient)
Example: “John is hit by Tom’s brother.”
<Arg1 hit John> <Arg0 hit brother> <T-pos brother Tom>
After name tagging replaces names with their types:
<Arg1 hit PER> <T-pos brother PER>
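One plausible way to represent these tuples in code, sketched here as (role, predicate, argument) triples; the Relation type and name_types mapping are illustrative assumptions, not the authors' data structures:

```python
from typing import NamedTuple

class Relation(NamedTuple):
    role: str        # e.g. "Arg0", "Arg1", "T-pos"
    predicate: str   # head word, e.g. "hit"
    argument: str    # argument head, e.g. "John", or a type like "PER"

# GLARF relations for "John is hit by Tom's brother."
relations = [
    Relation("Arg1", "hit", "John"),
    Relation("Arg0", "hit", "brother"),
    Relation("T-pos", "brother", "Tom"),
]

# Name tagger output (hypothetical): surface names mapped to their types.
name_types = {"John": "PER", "Tom": "PER"}

# Generalize names to semantic types to obtain matchable patterns.
patterns = [r._replace(argument=name_types.get(r.argument, r.argument))
            for r in relations]
# patterns: ("Arg1", "hit", "PER"), ("Arg0", "hit", "brother"),
#           ("T-pos", "brother", "PER")
```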
SYSTEM DESCRIPTION
Document-based ranking
Patterns in the seed set have precision scores of 1; other patterns have precision scores of 0.
$$Rel^i(d) = 1 - \prod_{p \in K(d)} \bigl(1 - Prec^i(p)\bigr)$$
$$Prec^{i+1}(p) = \frac{Sup(p)}{|H(p)|}, \qquad Sup(p) = \sum_{d \in H(p)} Rel^i(d)$$
$$RankFun_{Yangarber}(p) = Prec(p) \cdot \log Sup(p)$$
H(p) is the set of documents which contain pattern p.
K(d) is the set of accepted patterns in document d.
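A compact sketch of the iterative scoring defined above, assuming H and K are given as dicts and every tracked pattern occurs in at least one document; this illustrates Yangarber-style ranking, not the authors' implementation:

```python
import math

def yangarber_scores(H, K, seed_patterns, iterations=3):
    """H: pattern -> set of documents containing it;
       K: document -> set of accepted patterns in it.
       Returns RankFun(p) = Prec(p) * log Sup(p) for each pattern."""
    # Seed patterns start with precision 1, everything else 0.
    prec = {p: 1.0 if p in seed_patterns else 0.0 for p in H}
    sup = {}
    for _ in range(iterations):
        # Rel(d) = 1 - prod over accepted patterns p in d of (1 - Prec(p))
        rel = {d: 1.0 - math.prod(1.0 - prec.get(p, 0.0) for p in K[d])
               for d in K}
        # Sup(p) = sum of Rel(d) over documents containing p
        sup = {p: sum(rel.get(d, 0.0) for d in H[p]) for p in H}
        # Prec(p) = Sup(p) / |H(p)|
        prec = {p: sup[p] / len(H[p]) for p in H}
    return {p: prec[p] * math.log(sup[p]) for p in H if sup[p] > 0}
```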
SYSTEM DESCRIPTION
Pattern similarity
For two words, we use the Information Content (IC) from WordNet (same as S&G 2005)
S&G focus only on patterns headed by verbs; we include verbs, nouns, and adjectives
They record only the subject and object of a verb; we record all argument relations between verbs, nouns, and adjectives
We use only a predicate and one constraint (we do not currently build multi-constraint patterns)
$$Sim(p_1, p_2) = Sim(f_1, f_2) \cdot Sim(r_1, r_2) \cdot Sim(a_1, a_2)$$
where $f$ is the predicate, $r$ the argument role, and $a$ the argument of each pattern.
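A sketch of this measure using NLTK's WordNet interface with Lin's IC-based similarity, which is one plausible IC measure; S&G's exact formula may differ, and error handling plus POS handling for noun and adjective predicates are elided:

```python
from nltk.corpus import wordnet as wn, wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')  # IC counts from the Brown corpus

def word_sim(w1, w2, pos):
    """IC-based (Lin) similarity over the best synset pair; 0 if none."""
    pairs = ((a, b) for a in wn.synsets(w1, pos) for b in wn.synsets(w2, pos))
    return max((a.lin_similarity(b, brown_ic) for a, b in pairs), default=0.0)

def pattern_sim(p1, p2):
    """Product of predicate, role, and argument similarities for two
    (predicate, role, argument) patterns."""
    (f1, r1, a1), (f2, r2, a2) = p1, p2
    role_sim = 1.0 if r1 == r2 else 0.0   # roles compared by identity (assumption)
    arg_sim = 1.0 if a1 == a2 else word_sim(a1, a2, wn.NOUN)  # types match exactly
    return word_sim(f1, f2, wn.VERB) * role_sim * arg_sim
```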
SYSTEM FLOW CHART
Our process follows Yangarber, while incorporating word similarity into the pattern ranker.
[Flow chart: the Untagged Corpus passes through a Pre-Processor; Seeds feed a Document Ranking Function and a Pattern Ranking Function that also consults Word Similarity; accepted New Patterns are added back; a Stop? test either exits (Yes) or iterates (No).]
EXPERIMENTS - DATA
MUC-6 Evaluation
Task: hiring and firing of executives
Bootstrapping data: Reuters corpus (Rose et al. 2002); preselected, 6,000 documents, half relevant and half irrelevant
Evaluation data: MUC-6, 200 documents
ACE Evaluation
Task: multiple elementary event types, like attack, die, hire
Bootstrapping data: Agence France Presse (AFP) from the Gigaword corpus; non-preselected, 14,171 documents
Evaluation data: ACE 2005, 589 documents
EXPERIMENTS - MUC EVALUATION
Filtered ranking performs better.
Metric: F-measure of finding relevant sentences
Why does our conclusion differ from S&G’s experiment?
Does the corpus matter?
Reuters (6,000)
WSJ (18,734)
Gigaword (14,171)
Is this conclusion general?
EXPERIMENTS - ACE 2005 EVALUATION
Three event types to be tested
Die
Attack
Start-Position
Two kinds of evaluations
Sentence level: if a pattern matches in sentence s, tag s as relevant; otherwise, irrelevant.
Word level: if the pattern matches a trigger word, it is correct; otherwise, incorrect.
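For concreteness, a small sketch of the sentence-level scoring; sets of sentence ids are an assumed representation, and the word-level version is analogous, comparing matched words to annotated trigger words:

```python
def sentence_level_f(matched_sentences, gold_sentences):
    """matched_sentences: sentences where some pattern fired (predicted
       relevant); gold_sentences: sentences annotated as relevant.
       Returns the F-measure of finding relevant sentences."""
    tp = len(matched_sentences & gold_sentences)
    precision = tp / len(matched_sentences) if matched_sentences else 0.0
    recall = tp / len(gold_sentences) if gold_sentences else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```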
Comparison to a simple supervised method
For training, for every pattern we count how many times it contains an event trigger and how many times it does not. If it contains an event trigger more than 50% of the time, we treat it as a positive pattern.
We did 5-fold cross-validation on the ACE 2005 data and report the average results.
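A minimal sketch of this baseline; the (pattern, contains_trigger) input format is an assumption about how the counts are gathered from the annotated data:

```python
from collections import Counter

def train_supervised_baseline(occurrences):
    """occurrences: iterable of (pattern, contains_trigger) pairs, one per
       pattern match in the training data. Returns the positive patterns:
       those containing an event trigger more than 50% of the time."""
    total, with_trigger = Counter(), Counter()
    for pattern, contains_trigger in occurrences:
        total[pattern] += 1
        if contains_trigger:
            with_trigger[pattern] += 1
    return {p for p in total if with_trigger[p] / total[p] > 0.5}
```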
EXPERIMENTS - ACE 2005 EVALUATION
[Results: sentence-level evaluation and word-level evaluation]
CONCLUSIONS
We propose a new ranking method in bootstrapping for event extraction
This new method can block some irrelevant patterns coming from relevant documents
This new method, by preferring patterns from relevant documents, can eliminate some lexical ambiguity.
Experiments show that this new ranking method performs better than previous ranking methods and is more stable across different corpora.