Unsupervised Learning of Narratives - Researcher


Learning Narrative Schemas
Nate Chambers, Dan Jurafsky
Stanford University
IBM Watson Research Center Visit
Two Joint Tasks
Events in a Narrative
Semantic Roles
• suspect, criminal, client, immigrant, journalist, government, …
• police, agent, officer, authorities, troops, official, investigator, …
Scripts
Schank and Abelson. 1977. Scripts, Plans, Goals, and Understanding. Lawrence Erlbaum.
Mooney and DeJong. 1985. Learning Schemata for NLP. IJCAI-85.
• Background knowledge for language understanding
Restaurant Script
• Hand-coded
• Domain dependent
Applications
• Coreference
• Resolve pronouns (he, she, it, etc.)
• Summarization
• Inform sentence selection with event confidence scores
• Aberration Detection
• Detect surprising or unexpected events in text
• Story Generation
• McIntyre and Lapata (ACL-2009)
• Textual Inference
• Does a document entail other events?
• Selectional Preferences
• Use chains to inform argument types
The Protagonist
protagonist (noun):
1. the principal character in a drama or other literary work
2. a leading actor, character, or participant in a literary work or real event
Inducing Narrative Relations
Chambers and Jurafsky. Unsupervised Learning of Narrative Event Chains. ACL-08
Narrative Coherence Assumption
Verbs sharing coreferring arguments are semantically connected by virtue of narrative discourse structure.
1. Dependency parse a document.
2. Run coreference to cluster entity mentions.
3. Count pairs of verbs with coreferring arguments.
4. Use pointwise mutual information to measure relatedness.
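As a minimal sketch of steps 3 and 4, assuming steps 1 and 2 have already produced, for each document, groups of verbs whose arguments corefer (the input layout and the function name are illustrative, not the authors' implementation):

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(documents):
    """Score verb-pair relatedness by pointwise mutual information.

    `documents` is an iterable where each item is a document's list of
    coreference chains, and each chain is the collection of verbs whose
    arguments corefer (a hypothetical pre-processing output).
    """
    pair_counts = Counter()
    verb_counts = Counter()
    for chains in documents:
        for chain in chains:
            verbs = sorted(set(chain))
            verb_counts.update(verbs)
            # Step 3: count pairs of verbs sharing a coreferring argument.
            pair_counts.update(combinations(verbs, 2))

    total_pairs = sum(pair_counts.values())
    total_verbs = sum(verb_counts.values())

    # Step 4: pmi(v1, v2) = log P(v1, v2) / (P(v1) * P(v2)).
    return {
        (v1, v2): math.log((n / total_pairs) /
                           ((verb_counts[v1] / total_verbs) *
                            (verb_counts[v2] / total_verbs)))
        for (v1, v2), n in pair_counts.items()
    }
```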
Chain Example (ACL-08)
Schema Example (new)
Example role clusters: {police, agent, authorities}, {prosecutor, attorney}, {judge, official}, {plea, guilty, innocent}, {suspect, criminal, terrorist, …}
Narrative Schemas
Integrating Argument Types
• Use verb relations to learn argument types.
• Record head nouns of coreferring arguments.
The typhoon was downgraded Sunday as it moved inland from the coast, where it killed two people.
downgrade-o, move-s, typhoon
move-s, kill-s, typhoon
downgrade-o, kill-s, typhoon
• Use argument types to learn verb relations.
• Include argument counts in relation scores.
sim(e_i, e_j)  →  sim(e_i, e_j, a)
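A sketch of the two variants, using the pmi-plus-weighted-log-frequency form given later in the talk; the dictionary inputs and the λ value are assumptions:

```python
import math

LAMBDA = 0.1  # illustrative weight; the talk does not give a value

def sim(pmi, freq, e_i, e_j, a=None):
    """Untyped similarity is plain PMI; the typed variant adds credit
    for how often the two events share the specific argument `a`.
    `pmi` and `freq` are assumed precomputed lookup tables."""
    base = pmi.get((e_i, e_j), 0.0)
    if a is None:
        return base                          # sim(e_i, e_j)
    count = freq.get((e_i, e_j, a), 0)
    if count == 0:
        return base
    return base + LAMBDA * math.log(count)   # sim(e_i, e_j, a)
```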
Learning Schemas
narsim(N, v_j) = Σ_{d ∈ D_{v_j}} max_{C_i} chainsim(C_i, ⟨v_j, d⟩)
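Read as code: for each dependency slot d of verb v_j, take the best-matching chain in the schema, then sum over slots. A sketch under that reading, with `chainsim` as defined on the event-scoring slide and all argument shapes assumed:

```python
def narsim(schema_chains, verb, deps, chainsim):
    """Score how well `verb` fits a schema: for each grammatical slot d
    (e.g., 'subj', 'obj'), find the chain that best accepts <verb, d>,
    then sum the per-slot maxima."""
    return sum(
        max(chainsim(chain, (verb, d)) for chain in schema_chains)
        for d in deps
    )
```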
Argument Induction
• Induce semantic roles by scoring argument head words with score(C, a).
• Candidate heads: suspect, government, journalist, Monday, member, citizen, client, …
Training Data
• 1.2 million New York Times articles
• NYT portion of the Gigaword Corpus
• David Graff. 2002. English Gigaword. Linguistic Data Consortium.
• Stanford Parser
• http://nlp.stanford.edu/software/lex-parser.shtml
• OpenNLP coreference
• http://opennlp.sourceforge.net
• Lemmatize verbs and noun arguments.
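The talk does not name the lemmatizer; as one concrete stand-in, NLTK's WordNet lemmatizer can normalize both verbs and argument heads:

```python
# Requires: nltk.download('wordnet')
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def normalize(verb, arg_head):
    """Lemmatize a verb and its argument head noun so that, e.g.,
    'killed' and 'kill' count as the same event."""
    return (lemmatizer.lemmatize(verb, pos="v"),
            lemmatizer.lemmatize(arg_head, pos="n"))
```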
Learned Examples
court, judge, justice, panel, Osteen, circuit, Nicolau, Sporkin, majority
law, ban, rule, constitutionality, conviction, ruling, lawmaker, tax
Learned Examples
company, inc, corp, Microsoft, Iraq, co, unit, maker, …
drug, product, system, test, software, funds, movie, …
Database of Schemas
• ~500 unique schemas, 10 events each
• Temporal ordering data
• Available online soon.
Evaluations
• Compared to FrameNet
• High precision when overlapping
• New type of knowledge not included
• Cloze Evaluation
• Predict missing events
• 36% better than a vanilla distributional baseline (see Cloze Results)
Future Work
• Improved information extraction
• Extract information across multiple predicates.
• Knowledge Organization
• Link news articles describing subsequent events.
• Core AI Reasoning
• Automatic approach to learning causation?
• NLP specific tasks
• Coreference, summarization, etc.
Thanks!
• Nathanael Chambers and Dan Jurafsky. Unsupervised Learning of Narrative Schemas and their Participants. ACL-09, Singapore. 2009.
• Nathanael Chambers and Dan Jurafsky. Unsupervised Learning of Narrative Event Chains. ACL-08, Ohio, USA. 2008.
• Nathanael Chambers and Dan Jurafsky. Jointly Combining Implicit Constraints Improves Temporal Ordering. EMNLP-08, Waikiki, Hawaii, USA. 2008.
• Nathanael Chambers, Shan Wang, and Dan Jurafsky. Classifying Temporal Relations Between Events. ACL-07, Prague. 2007.
Cloze Evaluation
1. Choose a news article at random.
2. Identify the protagonist.
3. Extract the narrative event chain.
4. Randomly remove one event from the chain.
• Predict which event was removed.
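A minimal sketch of this procedure, assuming a chain is a list of events and `chain_score` is any relatedness scorer (e.g., summed PMI against the remaining events); names and interfaces are illustrative:

```python
import random

def cloze_rank(chain, vocabulary, chain_score):
    """Narrative cloze: hide one event, then see where the model ranks
    it among all candidate events (the held-out event is assumed to be
    in `vocabulary`). Lower rank means better prediction."""
    held_out = random.choice(chain)                 # step 4
    remaining = [e for e in chain if e != held_out]
    ranked = sorted(vocabulary,
                    key=lambda ev: chain_score(remaining, ev),
                    reverse=True)
    return ranked.index(held_out)
```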
Cloze Results
• Outperform the baseline distributional learning approach by 36%
• Including participants improves results by a further 10%
Comparison to FrameNet
• Narrative Schemas
• Focus on events that occur together in a narrative.
• Schemas represent larger situations.
• FrameNet (Baker et al., 1998)
• Focuses on events that share core roles.
• Frames typically represent single events.
Comparison to FrameNet
1. How similar are schemas to frames?
• Find the "best" FrameNet frame by event overlap.
2. How similar are schema roles to frame elements?
• Evaluate argument types as FrameNet frame elements.
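A toy sketch of the mapping step in question 1, reducing FrameNet to a map from frame names to the verbs in their lexical units (a simplification, not the real FrameNet data or API):

```python
def best_frame(schema_verbs, frames):
    """Pick the FrameNet frame whose lexical units overlap most with
    the schema's verbs. `frames` maps frame names to verb sets."""
    schema_verbs = set(schema_verbs)
    return max(frames, key=lambda name: len(schema_verbs & frames[name]))

# e.g., best_frame({"trade", "rise", "fall"},
#                  {"Exchange": {"trade", "swap"}, "Commerce_buy": {"buy"}})
```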
FrameNet Schema Similarity
1. How many schemas map to frames?
• 13 of 20 schemas mapped to a frame
• 26 of 78 (33%) verbs are not in FrameNet
2. Verbs present in FrameNet
• 35 of 52 (67%) matched a frame
• 17 of 52 (33%) did not match
FrameNet Schema Similarity
• Why were 33% unaligned?
• FrameNet represents subevents as separate frames
• Schemas model sequences of events.
One schema ({trade, rise, fall}) corresponds to two FrameNet frames: Exchange and Change Position on a Scale.
FrameNet Argument Similarity
2. Argument role mapping to frame elements
• 72% of arguments appropriate as frame elements
Example: for the FrameNet frame Enforcing and frame element Rule, the learned arguments were: law, ban, rule, constitutionality, conviction, ruling, lawmaker, tax (with inappropriate fillers marked INCORRECT on the slide).
Event Scoring

chainsim(C, ⟨acquit, subj⟩) = max_a [ score(C, a) + Σ_{i=1}^{n} sim(⟨acquit, subj⟩, e_i, a) ]
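A sketch of this scorer, with `score` and `sim` as in the neighboring slides; the calling conventions are assumptions, not the authors' code:

```python
def chainsim(chain, event, candidates, score, sim):
    """Score how well `event` (a <verb, dependency> pair such as
    <acquit, subj>) fits `chain`: choose the argument type a that
    jointly maximizes the chain's argument score and the event's typed
    similarity to every event already in the chain."""
    return max(
        score(chain, a) + sum(sim(event, e_i, a) for e_i in chain)
        for a in candidates
    )
```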
Argument Induction
• Induce semantic roles by scoring argument head words (e.g., is a = criminal?).

score(C, a) = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} [ pmi(e_i, e_j) + λ log freq(e_i, e_j, a) ]

The pmi term measures how often the two events share any coreferring arguments; the freq term measures how often they share argument a specifically.
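Transcribed directly into code, assuming `pmi` and `freq` are lookup tables precomputed from the coreference counts (the λ value is illustrative):

```python
import math
from itertools import combinations

LAMBDA = 0.1  # illustrative weight; the talk does not give a value

def score(chain, a, pmi, freq):
    """Score argument head word `a` for a chain: each pair of chain
    events contributes its PMI (sharing any coreferring argument) plus
    a weighted log count of sharing `a` specifically."""
    total = 0.0
    for e_i, e_j in combinations(chain, 2):
        total += pmi.get((e_i, e_j), 0.0)
        n = freq.get((e_i, e_j, a), 0)
        if n > 0:
            total += LAMBDA * math.log(n)
    return total
```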
Results
[Chart: cloze performance of chains, schemas, typed chains, and typed schemas; typed schemas give a 10.1% improvement.]
Results
1. We learned rich narrative structure.
• 10.1% improvement over previous work
2. Induced semantic roles characterizing the participants in a narrative.
3. Verb relations and their semantic roles can be jointly learned and improve each other's results.
• Selectional preferences improve verb relation learning.
Semantic Role Induction
• Supervised Learning
• PropBank (Palmer et al., 2005)
• FrameNet (Baker et al., 1998)
• VerbNet (Kipper et al., 2000)
• Bootstrapping from a seed corpus
• (Swier and Stevenson, 2004), (He and Gildea, 2006)
• Unsupervised, pre-defined roles
• (Grenager and Manning, 2006)
• WordNet inspired
• (Green and Dorr, 2005), (Alishahi and Stevenson, 2007)