A Database of Narrative Schemas

A 2010 paper by Nathaniel Chambers
and Dan Jurafsky
Presentation by Julia Kelly
Natural Language Understanding (NLU)
• A sub-field of Natural Language Processing (NLP)
• Covers the parsing (understanding) side of NLP rather than
the generation side
Chambers and Jurafsky:
What are they trying to solve?
Using unsupervised learning (a machine learning
technique) together with coreference chains
to extract rich event structure and
produce better narrative schemas
What came before?
• FrameNet: Baker (also a
UPenn affiliate) et al.
– 1997
– Frame definition
– Annotation
• TimeBank and TempEval: Pustejovsky et al.
– TimeBank is a corpus
annotated with TimeML
– TempEval is a shared evaluation task on
temporal relations between events and times
TimeBank
• TimeML is a robust specification language for events and
temporal expressions in natural language. It is designed to
address four problems in event and temporal expression
markup:
• Time stamping of events (identifying an event and
anchoring it in time);
• Ordering events with respect to one another (lexical versus
discourse properties of ordering);
• Reasoning with contextually underspecified temporal
expressions (temporal functions such as 'last week' and
'two weeks before');
• Reasoning about the persistence of events (how long does
an event or the outcome of an event last).
We specify three separate tasks that involve identifying event-time and event-event
temporal relations. A restricted set of temporal relations will be used, which includes
only the relations BEFORE, AFTER, and OVERLAP (defined to encompass all cases where event
intervals have non-empty overlap).
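The restricted relation set can be illustrated with a minimal sketch. It assumes each event is a closed interval given as a hypothetical `(start, end)` pair of numbers; the function name `temporal_relation` is invented for illustration and is not part of TempEval, which works from annotated text rather than explicit intervals.

```python
from typing import Tuple

# Hypothetical representation: a closed interval (start, end) with start <= end.
Interval = Tuple[float, float]


def temporal_relation(a: Interval, b: Interval) -> str:
    """Return the relation of interval a to interval b.

    BEFORE  : a ends strictly before b starts
    AFTER   : a starts strictly after b ends
    OVERLAP : every other case, i.e. the intervals share at least one point
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "BEFORE"
    if b_end < a_start:
        return "AFTER"
    return "OVERLAP"
```

Because OVERLAP absorbs every case with a non-empty intersection, containment and partial overlap both map to the same label under this assumption.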
TASK A:
For a restricted set of event terms, identify temporal
relations between events and all time expressions
appearing in the same sentence.
(NOTE: The restricted set of event terms is to be specified
by providing a list of root forms. Time expressions will be
annotated in the source, in accordance with TIMEX3.)
TASK B:
For a restricted set of event terms, identify temporal
relations between events and the Document Creation Time
(DCT).
(NOTE: The restricted set of events will be the same as for
Task A. DCTs will be explicitly annotated in the source.)
TASK C:
Identify the temporal relations between contiguous pairs
of matrix verbs.
(NOTE: matrix verbs, i.e. the main verb of the matrix clause
in each sentence, will be explicitly annotated in the source.)
Chambers and Jurafsky:
Narrative Schema
• Narrative schemas contain:
– Sets of related events
– A temporal ordering of events
– The semantic roles of the participants
• Based on scripts.
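The three components listed above could be held in a structure like the one sketched here. The class `NarrativeSchema`, its field names, and the arrest example are all hypothetical and written for this illustration; they are not the format of the Chambers and Jurafsky database.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class NarrativeSchema:
    # Set of related events, represented here simply as verb strings.
    events: Set[str] = field(default_factory=set)
    # Temporal ordering as (earlier_verb, later_verb) pairs.
    before: Set[Tuple[str, str]] = field(default_factory=set)
    # Participant role -> set of (verb, dependency) slots that role fills.
    roles: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def add_order(self, earlier: str, later: str) -> None:
        """Record that one event tends to precede another."""
        self.events.update((earlier, later))
        self.before.add((earlier, later))

    def add_role(self, role: str, verb: str, dependency: str) -> None:
        """Record that a participant fills a given slot of a verb."""
        self.events.add(verb)
        self.roles.setdefault(role, set()).add((verb, dependency))


# Illustrative, hand-written content (not taken from the database).
example = NarrativeSchema()
example.add_order("arrest", "charge")
example.add_order("charge", "convict")
example.add_role("police", "arrest", "subject")
example.add_role("suspect", "arrest", "object")
example.add_role("suspect", "convict", "object")
```

Keeping the ordering as explicit pairs, rather than a single list, leaves room for events whose relative order is unknown.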
Narrative Schema
A brief example
Hand-selected examples from
the Database
Chambers and Jurafsky:
A better approach
• A schema is not just a chain of synonyms
• Counts how many times verbs appear before or after
one another
• Combines both schema and temporal data
within the database
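The before/after counting idea can be sketched as follows. This assumes the ordering judgments already exist as hypothetical `(earlier_verb, later_verb)` observations; how those judgments are produced is outside the sketch, and the function names are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def count_orderings(observations: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each (earlier_verb, later_verb) ordering was observed."""
    return Counter(observations)


def preferred_order(counts: Counter, a: str, b: str) -> Optional[Tuple[str, str]]:
    """Return the more frequently observed ordering of a and b.

    Returns None when the two orderings are tied or neither was observed.
    """
    a_first = counts[(a, b)]
    b_first = counts[(b, a)]
    if a_first > b_first:
        return (a, b)
    if b_first > a_first:
        return (b, a)
    return None
```

A simple majority vote is only one possible decision rule; a confidence threshold on the count difference would be a natural variation.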
Narrative Coherence Assumption
• Predicates sharing coreferring arguments are related by virtue of
narrative discourse structure.
• To find related events:
1. Parse all sentences into dependency graphs.
2. Run coreference over the document.
3. Count all pairs of verbs and dependencies (e.g. subject,
object) that are filled by coreferring entities.
4. Record with each pair the head word of the shared
argument.
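Steps 3 and 4 above can be sketched as a counting routine. This assumes steps 1 and 2 (parsing and coreference) were already run by external tools, and that their output is available in a hypothetical form: a list of `(entity_id, verb, dependency)` mentions plus one head word per coreferent entity. The function name and input layout are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple

Mention = Tuple[int, str, str]  # (entity_id, verb, dependency), hypothetical layout
Slot = Tuple[str, str]          # (verb, dependency)


def count_shared_argument_pairs(
    mentions: Iterable[Mention],
    heads: Dict[int, str],
) -> Tuple[Counter, Dict[Tuple[Slot, Slot], Counter]]:
    """Count pairs of (verb, dependency) slots filled by the same entity.

    Returns the pair counts and, for each pair, a count of the head words
    of the shared arguments.
    """
    # Group the slots each coreferent entity fills within the document.
    slots_by_entity = defaultdict(set)
    for entity_id, verb, dependency in mentions:
        slots_by_entity[entity_id].add((verb, dependency))

    pair_counts: Counter = Counter()
    pair_heads: Dict[Tuple[Slot, Slot], Counter] = defaultdict(Counter)
    for entity_id, slots in slots_by_entity.items():
        # Sorting gives each unordered pair a single canonical key.
        for slot_a, slot_b in combinations(sorted(slots), 2):
            pair = (slot_a, slot_b)
            pair_counts[pair] += 1
            pair_heads[pair][heads[entity_id]] += 1
    return pair_counts, pair_heads
```

For example, if entity 1 ("suspect") is the object of both "arrest" and "convict", the pair `(("arrest", "object"), ("convict", "object"))` is counted once and "suspect" is recorded as its shared head word.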
Why did other approaches not make
the grade?
• FrameNet approaches the problem from the
perspective of frames
• TimeBank covers temporal placement but
does not include schemas
Performance
• The Chambers and Jurafsky database has very
similar results to the hand-labeled FrameNet
data.
• The temporal aspect of the database did not
seem to perform as well as TimeBank
• Somewhat specialized