Transcript 7 - CLAIR
EECS 595 / LING 541 / SI 661
Natural Language Processing
Fall 2005
Lecture Notes #7
Natural Language Generation
What is NLG?
• Mapping meaning to text
• Stages:
– Content selection
– Lexical selection
– Sentence structure: aggregation, referring
expressions
– Discourse structure
Systemic grammars
• Language is viewed as a resource for
expressing meaning in context (Halliday,
1985)
• Layers: mood, transitivity, theme
              The system   will      save         the document
Mood          subject      finite    predicator   object
Transitivity  actor        process                goal
Theme         theme        rheme
Example
(
:process save-1
:actor system-1
:goal document-1
:speechact assertion
:tense future
)
Input is underspecified
The Functional Unification
Formalism (FUF)
• Based on Kay’s (83) formalism
• partial information, declarative, uniform,
compact
• same framework used for all stages:
syntactic realization, lexicalization, and text
planning
Functional analysis
• Functional vs. structured analysis
• “John eats an apple”
• actor (John), affected (apple), process (eat)
• NP VP NP
• suitable for generation
Partial vs. complete specification
action = eat
actor = John
object = apple
• Voice: An apple is eaten by John
• Tense: John ate an apple
• Mode: Did John eat an apple?
• Modality: John must eat an apple
• Prolog: p(X,b,c)
Unification
• Target sentence
• input FD
• grammar
• unification process
• linearization process
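The unification step can be sketched as follows. This is a minimal illustration, assuming FDs are encoded as nested Python dicts; the function name unify and the dict encoding are assumptions made for this example, not FUF's API, and the sketch ignores paths, alt, and pattern.

```python
def unify(fd1, fd2):
    """Unify two functional descriptions given as nested dicts.

    Returns the merged FD, or None if two atomic values clash.
    """
    result = dict(fd1)
    for feature, value in fd2.items():
        if feature not in result:
            # Partial information: a missing feature is simply added.
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            sub = unify(result[feature], value)
            if sub is None:
                return None
            result[feature] = sub
        elif result[feature] != value:
            return None  # atomic clash: unification fails
    return result


# A partial input FD and one (simplified) grammar branch.
partial_input = {"cat": "s", "prot": {"n": {"lex": "john"}}}
grammar_branch = {"cat": "s", "prot": {"cat": "np"}, "goal": {"cat": "np"}}

merged = unify(partial_input, grammar_branch)
failed = unify({"cat": "s"}, {"cat": "np"})
```

Here merged carries the features of both FDs, while failed is None because the two cat values are incompatible.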
Sample input
((cat s)
 (prot ((n ((lex john)))))
 (verb ((v ((lex like)))))
 (goal ((n ((lex mary))))))
Sample grammar
((alt top (((cat s)
            (prot ((cat np)))
            (goal ((cat np)))
            (verb ((cat vp)
                   (number {prot number})))
            (pattern (prot verb goal)))
           ((cat np)
            (n ((cat noun)
                (number {^ ^ number})))
            (alt (((proper yes)
                   (pattern (n)))
                  ((proper no)
                   (pattern (det n))
                   (det ((cat article)
                         (lex "the")))))))
           ((cat vp)
            (pattern (v))
            (v ((cat verb))))
           ((cat noun))
           ((cat verb))
           ((cat article)))))
Sample output
((cat s)
 (goal ((cat np)
        (n ((cat noun)
            (lex mary)
            (number {goal number})))
        (pattern (n))
        (proper yes)))
 (pattern (prot verb goal))
 (prot ((cat np)
        (n ((cat noun)
            (lex john)
            (number {verb number})))
        (number {verb number})
        (pattern (n))
        (proper yes)))
 (verb ((cat vp)
        (pattern (v))
        (v ((cat verb)
            (lex like))))))
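The linearization step that follows unification can be sketched roughly as below. This is an illustration only: the output FD is re-encoded by hand as nested Python dicts (number features omitted), the function name linearize is an assumption, and morphology is left out, so the result is the uninflected word string rather than a finished sentence.

```python
def linearize(fd):
    """Collect lex values by visiting the constituents named in pattern, left to right."""
    if "pattern" not in fd:
        # A leaf constituent contributes its lexical item, if it has one.
        return [fd["lex"]] if "lex" in fd else []
    words = []
    for constituent in fd["pattern"]:
        words.extend(linearize(fd[constituent]))
    return words


# Hand-encoded version of the sample output FD.
output_fd = {
    "cat": "s",
    "pattern": ["prot", "verb", "goal"],
    "prot": {"cat": "np", "proper": "yes", "pattern": ["n"],
             "n": {"cat": "noun", "lex": "john"}},
    "verb": {"cat": "vp", "pattern": ["v"],
             "v": {"cat": "verb", "lex": "like"}},
    "goal": {"cat": "np", "proper": "yes", "pattern": ["n"],
             "n": {"cat": "noun", "lex": "mary"}},
}

sentence = " ".join(linearize(output_fd))
print(sentence)  # john like mary
```

Agreement ("likes") and capitalization would be handled by a separate morphology step, which this sketch does not attempt.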
Comparison with Prolog
• Similarities:
– both have unification at the core
– Prolog program = FUF grammar
– Prolog query = FUF input
• Differences:
– Prolog: first order term unification
– FUF: arbitrarily rooted directed graphs are unified
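For contrast with FD unification, here is a minimal sketch of first-order term unification of the kind Prolog performs. The encoding is an assumption made for this example: compound terms are Python tuples (functor, arg, ...), atoms are lowercase strings, and variables are strings starting with an uppercase letter. There is no occurs check.

```python
def is_var(term):
    """Variables are strings that start with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    """Follow variable bindings until reaching a non-variable or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify_terms(a, b, subst=None):
    """Return a substitution (dict) that makes a and b equal, or None on failure."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify_terms(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


# p(X, b, c) unified with p(a, b, c) binds X to a.
bindings = unify_terms(("p", "X", "b", "c"), ("p", "a", "b", "c"))
# Terms of different arity never unify.
mismatch = unify_terms(("p", "X", "b"), ("p", "a", "b", "c"))
```

Terms are positional and have a fixed arity, whereas the features of an FD are named, unordered, and may be added freely during unification.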
The SURGE grammar
• Syntactic realization front-end
• variable level of abstraction
• 5600 branches and 1600 alts
Lexical chooser → Lexicalized FD → SURGE → Syntactic FD → Linearizer → Morphology → Text
Systems developed using
FUF/SURGE
• COMET
• MAGIC
• ZEDDOC
• PLANDOC
• FLOWDOC
• SUMMONS
CFUF
• Fast implementation by Mark Kharitonov
(C++)
• Up to 100 times faster than Lisp/FUF
• Speedup higher for larger inputs
References
• Cole, Mariani, Uszkoreit, Zaenen, Zue (eds.) Survey of the State of the
Art in Human Language Technology, 1995
• Elhadad, Using Argumentation to Control Lexical Choice: A
Functional Unification Implementation, 1993
• Elhadad, FUF: the Universal Unifier, User Manual, 1993
• Elhadad and Robin, SURGE: a Comprehensive Plug-in Syntactic
Realization Component for Text Generation, 1999
• Kharitonov, CFUF: A Fast Interpreter for the Functional Unification
Formalism, 1999
• Radev, Language Reuse and Regeneration: Generating Natural
Language Summaries from Multiple On-Line Sources, Department of
Computer Science, Columbia University, October 1998
Path notation
• You can view an FD as a tree
• To specify features, you can use a path
– {feature feature … feature} value
– e.g. {prot number}
• You can also use relative paths
– {^ number} value => the feature number for the current
node
– {^ ^ number} value => the feature number for the node
above the current node
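The path notation above can be sketched in a few lines. This is an illustration under stated assumptions: FDs are nested Python dicts, paths are lists of feature names, and the helper names get_path and resolve are made up for this example. Following the description above, each leading ^ climbs one level, starting from the position of the feature whose value is the path.

```python
def get_path(fd, path):
    """Follow a sequence of feature names down from the root FD."""
    node = fd
    for feature in path:
        node = node[feature]
    return node


def resolve(path, position):
    """Turn a possibly relative path into an absolute one.

    position is the absolute path of the feature whose value is the path;
    each leading "^" climbs one level up from that position.
    """
    up = 0
    while up < len(path) and path[up] == "^":
        up += 1
    if up == 0:
        return list(path)  # already absolute
    if up > len(position):
        raise ValueError("path climbs above the root")
    return list(position[:len(position) - up]) + list(path[up:])


fd = {"prot": {"number": "singular", "n": {"lex": "john"}}}

# The path {^ ^ number} written as the value of number inside n of prot.
where = ["prot", "n", "number"]
absolute = resolve(["^", "^", "number"], where)   # number of the node above n
same_node = resolve(["^", "number"], where)       # number of n itself
value = get_path(fd, absolute)
```

With this reading, {^ ^ number} inside n points at the number feature of the enclosing noun phrase, which is how the sample grammar shares number between a noun phrase and its head noun.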
Unification Example
Unify Prot
Unify Goal
Unify vp
Unify verb
Finish
Discourse Analysis
The problem
• Discourse
• Monologue and Dialogue (dialog)
• Human-computer interaction
• Example: John went to Bill’s car dealership to
check out an Acura Integra. He looked at it for
about half an hour.
• Example: I’d like to get from Boston to San
Francisco, on either December 5th or December
6th. It’s okay if it stops in another city along the
way.
Information extraction and
discourse analysis
• Example: First Union Corp. is continuing to
wrestle with severe problems unleashed by a
botched merger and a troubled business strategy.
According to industry insiders at Paine Webber,
their president, John R. Georgius, is planning to
retire by the end of the year.
• Problems with summarization and generation
Reference resolution
• The process of reference (associating
“John” with “he”).
• Referring expressions and referents.
• Needed: discourse models
• Problem: many types of reference!
Example (from Webber 91)
• According to John, Bob bought Sue an Integra,
and Sue bought Fred a Legend.
• But that turned out to be a lie. - referent is a
speech act.
• But that was false. - proposition
• That struck me as a funny way to describe the
situation. - manner of description
• That caused Sue to become rather poor. - event
• That caused them both to become rather poor. - combination of several events.
Reference phenomena
• Indefinite noun phrases: I saw an Acura Integra
today.
• Definite noun phrases: The Integra was white.
• Pronouns: It was white.
• Demonstratives: this Acura.
• Inferrables: I almost bought an Acura Integra
today, but a door had a dent and the engine
seemed noisy.
• Mix the flour, butter, and water. Knead the dough
until smooth and shiny.
Constraints on coreference
• Number agreement: John has an Acura. It is red.
• Person and case agreement: (*) John and Mary have
Acuras. We love them (where We=John and Mary)
• Gender agreement: John has an Acura. He/it/she is
attractive.
• Syntactic constraints:
– John bought himself a new Acura.
– John bought him a new Acura.
– John told Bill to buy him a new Acura.
– John told Bill to buy himself a new Acura.
– He told Bill to buy John a new Acura.
Preferences in pronoun
interpretation
• Recency: John has an Integra. Bill has a Legend. Mary
likes to drive it.
• Grammatical role: John went to the Acura dealership with
Bill. He bought an Integra.
• (?) John and Bill went to the Acura dealership. He bought
an Integra.
• Repeated mention: John needed a car to go to his new job.
He decided that he wanted something sporty. Bill went to
the Acura dealership with him. He bought an Integra.
Preferences in pronoun
interpretation
• Parallelism: Mary went with Sue to the Acura
dealership. Sally went with her to the Mazda
dealership.
• ??? Mary went with Sue to the Acura dealership.
Sally told her not to buy anything.
• Verb semantics: John telephoned Bill. He lost his
pamphlet on Acuras. John criticized Bill. He lost
his pamphlet on Acuras.
An algorithm for pronoun
resolution
• Two steps: discourse model update and
pronoun resolution.
• Salience values are introduced when a noun
phrase that evokes a new entity is
encountered.
• Salience factors: set empirically.
Salience weights in Lappin and Leass
Factor                                            Weight
Sentence recency                                  100
Subject emphasis                                  80
Existential emphasis                              70
Accusative emphasis                               50
Indirect object and oblique complement emphasis   40
Non-adverbial emphasis                            50
Head noun emphasis                                80
Lappin and Leass (cont’d)
• Recency: weights are cut in half after each
sentence is processed.
• Examples:
– An Acura Integra is parked in the lot. (subject)
– There is an Acura Integra parked in the lot. (existential
predicate nominal)
– John parked an Acura Integra in the lot. (object)
– John gave Susan an Acura Integra. (indirect object)
– In his Acura Integra, John showed Susan his new CD
player. (demarcated adverbial PP)
Algorithm
1. Collect the potential referents (up to four sentences back).
2. Remove potential referents that do not agree in number or
gender with the pronoun.
3. Remove potential referents that do not pass intrasentential
syntactic coreference constraints.
4. Compute the total salience value of the referent by adding
any applicable values for role parallelism (+35) or
cataphora (-175).
5. Select the referent with the highest salience value. In case
of a tie, select the closest referent in terms of string
position.
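Steps 2 to 5 above can be sketched as follows. This is an illustration only: the Candidate record and all of its field names are assumptions made for this example, step 1 is taken as already done (the caller passes in the candidate list), and step 3 is represented by a precomputed flag because checking syntactic constraints needs a parser. The salience values in the usage example are the ones the worked example in these notes reaches after its second sentence; the token positions are made up.

```python
from dataclasses import dataclass

ROLE_PARALLELISM = 35
CATAPHORA = -175


@dataclass
class Candidate:
    name: str
    number: str                     # "sg" or "pl"
    gender: str                     # "m", "f", or "n"
    salience: float                 # value stored in the discourse model
    position: int                   # token index of the most recent mention
    syntactically_ok: bool = True   # outcome of step 3, assumed precomputed
    parallel_role: bool = False     # same grammatical role as the pronoun
    follows_pronoun: bool = False   # cataphora: referent comes after the pronoun


def resolve_pronoun(number, gender, pronoun_position, candidates):
    """Return the preferred referent, or None if every candidate is filtered out."""
    # Steps 2 and 3: agreement and syntactic filters.
    pool = [c for c in candidates
            if c.number == number and c.gender == gender and c.syntactically_ok]
    if not pool:
        return None

    def total(c):
        # Step 4: adjust stored salience for parallelism and cataphora.
        score = c.salience
        if c.parallel_role:
            score += ROLE_PARALLELISM
        if c.follows_pronoun:
            score += CATAPHORA
        return score

    # Step 5: highest total wins; ties go to the closest candidate.
    return max(pool, key=lambda c: (total(c), -abs(pronoun_position - c.position)))


# Resolving the pronouns of "He bought it."
model = [
    Candidate("John", "sg", "m", 232.5, 12, parallel_role=True),
    Candidate("Integra", "sg", "n", 210, 14),
    Candidate("Bill", "sg", "m", 135, 16),
    Candidate("dealership", "sg", "n", 57.5, 8),
]
he = resolve_pronoun("sg", "m", 18, model)
it = resolve_pronoun("sg", "n", 20, model)
```

In this run he resolves to John and it to the Integra; Bill loses to John on salience even though he was mentioned more recently.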
Example
• John saw a beautiful Acura Integra at the
dealership last week. He showed it to Bill. He
bought it.
             Rec   Subj   Exist   Obj   Ind Obj   Non Adv   Head N   Total
John         100   80                             50        80       310
Integra      100                  50              50        80       280
dealership   100                                  50        80       230
Example (cont’d)
Referent     Phrases                        Value
John         {John}                         155
Integra      {a beautiful Acura Integra}    140
dealership   {the dealership}               115
Example (cont’d)
Referent     Phrases                        Value
John         {John, he1}                    465
Integra      {a beautiful Acura Integra}    140
dealership   {the dealership}               115
Example (cont’d)
Referent     Phrases                             Value
John         {John, he1}                         465
Integra      {a beautiful Acura Integra, it1}    420
dealership   {the dealership}                    115
Example (cont’d)
Referent     Phrases                             Value
John         {John, he1}                         465
Integra      {a beautiful Acura Integra, it1}    420
Bill         {Bill}                              270
dealership   {the dealership}                    115
Example (cont’d)
Referent     Phrases                             Value
John         {John, he1}                         232.5
Integra      {a beautiful Acura Integra, it1}    210
Bill         {Bill}                              135
dealership   {the dealership}                    57.5
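The salience values in the example above can be reproduced with a short sketch. The weights are the ones from the Lappin and Leass table; the dictionary keys and the helper names add_mention and end_of_sentence are assumptions made for this illustration, and the pronoun resolutions (he1 = John, it1 = Integra) are entered by hand rather than computed.

```python
WEIGHTS = {
    "recency": 100,
    "subject": 80,
    "existential": 70,
    "object": 50,
    "indirect_object": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}


def add_mention(model, referent, *factors):
    """Add the weights of the applicable factors to the referent's salience."""
    model[referent] = model.get(referent, 0) + sum(WEIGHTS[f] for f in factors)


def end_of_sentence(model):
    """Recency decay: every value is cut in half once a sentence is processed."""
    return {referent: value / 2 for referent, value in model.items()}


model = {}

# Sentence 1: John saw a beautiful Acura Integra at the dealership last week.
add_mention(model, "John", "recency", "subject", "non_adverbial", "head_noun")
add_mention(model, "Integra", "recency", "object", "non_adverbial", "head_noun")
add_mention(model, "dealership", "recency", "non_adverbial", "head_noun")
first_totals = dict(model)          # 310, 280, 230
model = end_of_sentence(model)      # 155, 140, 115

# Sentence 2: He showed it to Bill. (he1 = John, it1 = Integra)
add_mention(model, "John", "recency", "subject", "non_adverbial", "head_noun")
add_mention(model, "Integra", "recency", "object", "non_adverbial", "head_noun")
add_mention(model, "Bill", "recency", "indirect_object", "non_adverbial", "head_noun")
second_totals = dict(model)         # 465, 420, 270, 115
model = end_of_sentence(model)      # 232.5, 210, 135, 57.5
```

The final model holds the values against which the pronouns of the third sentence, "He bought it.", are resolved.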
Observations
• Lappin & Leass - tested on computer manuals: 86% accuracy on unseen data.
• Centering (Grosz, Joshi, Weinstein): additional
concept of a “center” – at any time in discourse, an
entity is centered.
• Backwards looking center; forward looking
centers (a set).
• Centering has not been automatically tested on
actual data.
Discourse structure
• (*) Bill went to see his mother. The trunk is
what makes the bonsai, it gives it both its
grace and power.
• Coherence principle:
– John hid Bill’s car keys. He was drunk.
– ?? John hid Bill’s car keys. He likes spinach.
• Rhetorical Structure Theory (Mann,
Matthiessen, and Thompson)
Sample rhetorical relations
Relation      Nucleus                                         Satellite
Antithesis    ideas favored by the author                     ideas disfavored by the author
Background    text whose understanding is being facilitated   text for facilitating understanding
Concession    situation affirmed by author                    situation which is apparently inconsistent but also affirmed by author
Elaboration   basic information                               additional information
Purpose       an intended situation                           the intent behind the situation
Restatement   a situation                                     a reexpression of the situation
Summary       text                                            a short summary of that text
Example (from MMT)
1) Title: Bouquets in a basket - with living flowers
2) There is a gardening revolution going on.
3) People are planting flower baskets with living plants,
4) mixing many types in one container for a full summer of floral beauty.
5) To create your own "Victorian" bouquet of flowers,
6) choose varying shapes, sizes and forms, besides a variety of complementary colors.
7) Plants that grow tall should be surrounded by smaller ones and filled with others that tumble over
the side of a hanging basket.
8) Leaf textures and colors will also be important.
9) There is the silver-white foliage of dusty miller, the feathery threads of lotus vine floating down
from above, the deep greens, or chartreuse, even the widely varied foliage colors of the coleus.
Christian Science Monitor, April, 1983
Example (cont’d)
Cross-document structure
Number  Relationship type           Level  Description
1       Identity                    Any    The same text appears in more than one location
2       Equivalence (paraphrasing)  S, D   Two text spans have the same information content
3       Translation                 P, S   Same information content in different languages
4       Subsumption                 S, D   One sentence contains more information than another
5       Contradiction               S, D   Conflicting information
6       Historical background       S      Information that puts current information in context
7       Cross-reference             P      The same entity is mentioned
8       Citation                    S, D   One sentence cites another document
9       Modality                    S      Qualified version of a sentence
10      Attribution                 S      One sentence repeats the information of another while adding an attribution
11      Summary                     S, D   Similar to Summary in RST: one sentence summarizes another
12      Follow-up                   S      Additional information which reflects facts that have happened since the previous account
13      Elaboration                 S      Additional information that wasn’t included in the last account
14      Indirect speech             S      Shift from direct to indirect speech or vice-versa
15      Refinement                  S      Additional information that is
16      Agreement                   S      One source expresses agreement with another
17      Judgement                   S      A qualified account of a fact
18      Fulfilment                  S      A prediction turned true
19      Description                 S      Insertion of a description
20      Reader profile              S      Style and background-specific change
21      Contrast                    S      Contrasting two accounts of facts
22      Parallel                    S      Comparing two accounts of facts
23      Generalization              S      Generalization
24      Change of perspective       S, D   The same source presents a fact in a different light