Tutorial for the annotation of the Penn Discourse Treebank
Annotation Guidelines for the
Penn Discourse Treebank
Part A
Eleni Miltsakaki, Rashmi Prasad,
Aravind Joshi, Bonnie Webber
1
Discourse relations (1)
Discourse relations hold between parts of text
One way of marking discourse relations is by
use of explicit markers
Markers: discourse connectives
Textual spans they relate: arguments
2
Example
(1) On the one hand, John loves Barolo.
(2) So he went and ordered three cases.
(3) On the other hand, he didn’t have
much money.
(4) So then he had to cancel the order.
3
Discourse relations (2)
Discourse relations may also hold between adjacent
textual spans with no explicit marker; these must be inferred.
In such cases, we establish the presence of an
implicit connective
Example
(5a) You should never lend any books to
John.
(5b) He never returns them.
4
Goals of PDTB
To produce a large-scale, reliably annotated
corpus, which
Encodes discourse relations associated with
discourse connectives
Including implicit connectives
5
Corpus
Penn Treebank
Approx. 1 million words
Wall Street Journal
25 sections
100 files in each section
6
Annotation tasks
Annotation of explicit connectives
Annotation of implicit connectives
7
Annotation tool
Wordfreak
Allows you to search for specific connectives
Keeps record of connectives and arguments
More later…
8
Explicit connectives (1)
Subordinate conjunctions
‘because’, ‘although’, ‘when’, etc.
Arguments found locally
Subordinate clauses can be preposed
(6a) John failed the exam because he was
lazy.
(6b) Because he was lazy, John failed the
exam.
9
Explicit connectives (2)
Coordinate conjunctions
‘and’, ‘but’, ‘or’, ‘so’
Arguments found locally
Preposing is not allowed
(7a) John is very smart but he failed the
exam.
(7b) # But he failed the exam, John is very
smart.
10
Explicit connectives (3)
Adverbials
‘therefore’, ‘however’, ‘as a result’, etc.
One argument found locally
One argument may or may not be found locally
(1) On the one hand, John loves Barolo.
(2) So he went and ordered three cases.
(3) On the other hand, he didn’t have much
money.
(4) So then he had to cancel the order.
11
Annotation of explicit connectives
We have grouped explicit connectives in sets of 10.
Your task is to:
• Identify all instances of a given set of connectives in
the corpus.
• Mark their arguments.
Proceed one file at a time.
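The first task, finding every instance of a given set of connectives in a file, can be sketched roughly as below. This is only an illustration and not how Wordfreak searches: the function name `find_connectives` and the sample set are assumptions, and every hit is just a candidate that an annotator still has to check (for example, 'after' is often a plain preposition).

```python
import re

# Hypothetical sample set; the real sets are defined by the project.
CONNECTIVE_SET = ["as soon as", "unless", "as long as", "after", "until"]

def find_connectives(text, connectives=CONNECTIVE_SET):
    """Return (start, end, matched_text) for each candidate connective in text."""
    # Try longer connectives first so multi-word forms are not split up.
    ordered = sorted(connectives, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(c) for c in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]

# Usage: list the candidate connectives found in one sentence.
for start, end, word in find_connectives("Unless it rains, we will stay until noon."):
    print(start, end, word)
```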
12
Sets of explicit connectives
In progress: Set 3
Adverbials
• Indeed, for example
Coordinate conj.
• And, but, or
Subordinate conj.
• As soon as, unless, as long as, after, until
Already annotated:
Adverbials:
• instead, otherwise, therefore, as
a result, nevertheless, in fact,
then, on the other hand,
however, furthermore/further
Subordinate conjunctions
• Because, although, even though,
when, so that, if, while, since
In progress: Set 4
• Section 00, Section 06
Adverbials
• Though, yet, so, on the contrary, conversely
Coordinate conj.
Empty
• nor
Subordinate conj.
• Whereas, as, insofar as, till
13
Implicit connectives
Implicit connectives describe relations that hold
between adjacent textual spans and that must be
inferred.
In PDTB we will only annotate implicit connectives
between sentences in the same paragraph.
We will initially ignore implicit connectives across
paragraphs or within a sentence.
(8) John walked across the room, waving at
everybody.
14
Annotation of implicit connectives
Relation between two adjacent sentences.
Both sentences belong to the same paragraph.
Second sentence does not contain a connective.
! Preposed subordinate conjunctions in the second sentence do not count
(9) Mary stayed until late. (IMPLICIT) Although she
was very tired, she had to finish the report today.
Mark the period as a placeholder for an implicit connective.
Mark the arguments.
Provide an explicit connective that best expresses the relation.
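The conditions above can be read as a simple filter over the sentences of one paragraph. The sketch below is an assumption-laden illustration, not part of the annotation tool: the word lists are hypothetical and incomplete, and "contains a connective" is approximated by a plain word match, so the output is only a list of candidate sites for an annotator to review.

```python
import re

# Hypothetical, incomplete word lists used only for illustration.
SUBORDINATORS = ("although", "because", "when", "while", "if", "since")
OTHER_CONNECTIVES = ("however", "therefore", "but", "so", "and", "then", "as a result")

def starts_with(sentence, words):
    """True if the sentence begins with one of the given words."""
    s = sentence.strip().lower()
    return any(re.match(re.escape(w) + r"\b", s) for w in words)

def contains(sentence, words):
    """True if any of the given words occurs anywhere in the sentence."""
    s = sentence.lower()
    return any(re.search(r"\b" + re.escape(w) + r"\b", s) for w in words)

def implicit_sites(paragraph_sentences):
    """Return each index i where an implicit connective may hold
    between sentence i and sentence i + 1 of the same paragraph."""
    sites = []
    for i in range(len(paragraph_sentences) - 1):
        second = paragraph_sentences[i + 1]
        if starts_with(second, SUBORDINATORS):
            # A preposed subordinate conjunction does not count.
            sites.append(i)
        elif not contains(second, OTHER_CONNECTIVES + SUBORDINATORS):
            sites.append(i)
    return sites

print(implicit_sites([
    "Mary stayed until late.",
    "Although she was very tired, she had to finish the report today.",
]))
```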
15
What is a legal argument
Multiple sentences
Sentences
Main clause + subordinate clauses
(10) It is cold, although the sun is shining.
(11) John walked across the hall, waving his hand
cheerfully.
Clauses
Grammatical unit that contains a predicate and its
arguments
Tensed
Non-tensed
16
Predicates and propositions
Predicates
Verb
Says something about the subject
(12) John is sleeping
May require one, two, or three arguments (‘sleep’, ‘eat’,
‘give’)
Propositions (expressions of events or states)
Predicate and its arguments
Semantic objects, constant across syntactic variability
(13) John ate the banana.
(14) The banana was eaten by John.
(15) Did John eat the banana?
17
Attention!
Discourse relations hold between propositions.
When annotating arguments, include a predicate.
(16) Everybody considered Einstein's contribution to
be a breakthrough because he discovered the
theory of relativity.
Do not separate a predicate from its arguments.
BUT: Implicit arguments are OK in non-tensed clauses.
(17) John crossed the hall, waving his hand
cheerfully.
ALSO: If the only thing left from the clause that contains your selection is
a non-verbal element, include it in your selection.
(18) * In Geneva, however, [they supported Iran’s
proposal].
18
What is not a legal argument
Textual spans that do not contain (or refer to)
propositional material (usually a verb and its
arguments at minimum).
Verbs separated from their arguments.
You can select a clause that is the argument of a verb,
excluding the verb.
You cannot select the verb and leave out its arguments.
(19) John said [that Mary left]. OK
(20) [John said] that Mary left. NOT OK
(21) [John said that Mary left]. OK
19
NPs as arguments?!
Discourse deictic expressions are NPs and they may
be selected as arguments because they may refer to
propositional material.
Discourse deictic expressions are ‘this’ and ‘that’
when they refer to textual spans in the preceding
discourse.
(21) ABC is firing 1,000 employees. That (is)
because they have huge debts.
20
What is a legal argument: summary
A single clause
[John left].
Because [John left]…
While [watching TV]….
John wants [to leave].
A single sentence
[John wants to leave because he’s sick].
Multiple sentences
NPs that refer to clauses
[This] because…
Some nominal forms expressing events or states (but make a
note)
After [the sudden price increase]…
21
What is ARG1/ARG2
The clause that contains the connective is always
Arg2.
The other argument of the connective is Arg1.
Note that with subordinate conjunctions it is
possible for Arg2 to precede Arg1.
(22) Because [Arg2 he was sick], [Arg1 John left
early].
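The labeling rule can be illustrated with a small record type. This is a sketch under stated assumptions: the `Relation` class and `label_arguments` function are hypothetical names, not the PDTB file format, and the connective is located by a simple case-insensitive substring search.

```python
from dataclasses import dataclass

@dataclass
class Relation:
    connective: str
    arg1: str
    arg2: str

def label_arguments(connective, clause_a, clause_b):
    """Arg2 is the clause that contains the connective; the other clause is Arg1."""
    in_a = connective.lower() in clause_a.lower()
    in_b = connective.lower() in clause_b.lower()
    if in_a == in_b:
        raise ValueError("exactly one clause should contain the connective")
    arg2_raw, arg1 = (clause_a, clause_b) if in_a else (clause_b, clause_a)
    # Leave the connective itself out of Arg2, as the brackets in (22) do.
    i = arg2_raw.lower().index(connective.lower())
    arg2 = " ".join((arg2_raw[:i] + arg2_raw[i + len(connective):]).split())
    return Relation(connective=connective, arg1=arg1, arg2=arg2)

# Example (22): Arg2 precedes Arg1 because the subordinate clause is preposed.
print(label_arguments("because", "Because he was sick", "John left early"))
```

The order of the two clauses in the text does not matter here, which mirrors the point that Arg2 may precede Arg1.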
22
ARG and SUP annotations
When deciding what to mark as an argument of the connective,
you should select what is ‘minimally’ necessary to interpret the
relation established by the connective. Mark that as ARG.
This is a good principle to follow.
However, sometimes you may feel you want to mark/include
material which provides useful, even if not crucial,
information about the interpretation of an argument. Mark this
as SUP. SUP annotations are optional.
23
SUP: Example 1
Lawyers and their clients who frequently bring
business to a country courthouse can expect to appear
before the same judge year after year. [Fear of
alienating that judge is pervasive], says Maurice
Geiger, founder and director of the Rural Justice
Center in Montpelier, Vt., a public interest group that
researches rural justice issues.
As a result, lawyers think twice before appealing a
judge’s ruling, are reluctant to mount, or even
support, challenges against him for reelection and
usually loath to file complaints that might impugn a
judge’s integrity.
24
SUP: Example 2
While dividends have risen smartly, [their
expansion hasn’t kept pace with even stronger
advances in stock prices].
25
Connectives are not part of their
arguments
When annotating the second argument of a
connective do not include the connective itself.
(23) He failed the exam although he had studied hard.
Connectives may appear in an argument of a
connective that you are annotating. Include that
connective in the selection of the argument.
(24) When the stock market dropped nearly 7% Oct. 13,
for instance, the Mexico Fund plunged about 18%
and the Spain Fund fell 16%.
26
What about sentence medial
connectives?
If a connective is sentence medial you exclude
it from your selection of the argument.
Wordfreak allows you to select discontinuous
text and enter it as a single argument.
27
Using the discontinuous text
selection feature in the tool
On-line demo
Basic steps
Press Control
Select span 1
Holding Control pressed,
Select span 2, 3, etc.
Then click on the Arg button to enter your selection
All selected spans will show up in the Arg window in the
order that they were selected
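One way to picture a discontinuous argument is as a list of character spans kept in selection order. The sketch below is an assumption for illustration only and says nothing about how Wordfreak stores selections; `span_of` and `join_spans` are hypothetical helpers, and the punctuation kept in each span is an arbitrary choice.

```python
def span_of(text, fragment):
    """Return the (start, end) character span of the first occurrence of fragment."""
    start = text.index(fragment)
    return (start, start + len(fragment))

def join_spans(text, spans):
    """Concatenate the selected spans in the order they were selected."""
    return " ".join(text[start:end].strip() for start, end in spans)

sentence = "In Geneva, however, they supported Iran's proposal."

# Two selections that skip the sentence-medial connective 'however'.
spans = [
    span_of(sentence, "In Geneva,"),
    span_of(sentence, "they supported Iran's proposal."),
]
print(join_spans(sentence, spans))
```

Because the list preserves selection order, selecting the spans in a different order would produce a different argument string.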
28
Examples with discontinuous text
selections
Connectives
(25) In Geneva, however, they supported
Iran’s proposal.
Modifications
(26) Mary, who is a friend of mine, just arrived in
Philadelphia.
Parentheticals
(27) The price of the stock -many had
expected this- was rising.
…
29
What not to annotate!
Do not annotate connectives that are followed by a preposition.
Out:
(28) Instead of teaming up, GE Capital staffers and Kidder
investment bankers have bickered.
In:
(29) The Hopkinsian universal disinterested
benevolence, although holding to original sin and
the doctrine of election, inspired its adherents
to heroic endeavours for others, ...
(30) Its 1,400-member brokerage operation reported
an estimated $5 million loss last year, although
Kidder expects it to turn a profit this year.
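The "followed by a preposition" rule can be approximated with a check on the word right after the connective. This is a rough sketch: the preposition list is a hypothetical, incomplete assumption, and `should_annotate` is an illustrative name, so borderline cases still need an annotator's judgment.

```python
import re

# Hypothetical, incomplete preposition list for illustration.
PREPOSITIONS = ("of", "to", "from", "for")

def should_annotate(text, connective_end):
    """Return False when the word right after the connective is a preposition."""
    m = re.match(r"\s*(\w+)", text[connective_end:])
    return not (m is not None and m.group(1).lower() in PREPOSITIONS)

out_example = "Instead of teaming up, GE Capital staffers and Kidder investment bankers have bickered."
in_example = "It reported an estimated $5 million loss last year, although Kidder expects it to turn a profit this year."

print(should_annotate(out_example, len("Instead")))
print(should_annotate(in_example, in_example.index("although") + len("although")))
```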
30
Connectives and relations
Think of a discourse relation that ‘then’ can
express.
Think of another discourse relation that ‘then’
can express.
Think of a connective that expresses a
‘contrastive’ relation.
Think of another connective that expresses a
‘contrastive’ relation.
Other discourse relations?
31
Practice: Test your understanding
of legal arguments
1. When Sophie and Joanna got to the supermarket they
went their separate ways.
2. At the end of the road there was a sharp bend, known
as Captain’s Bend.
3. People seldom went that way except on the weekend.
4. Sophie tried to imagine herself shaking hands and
introducing herself as Lillemor Amundsen, but it
seemed all wrong. It was someone else who kept
introducing himself.
5. ‘I’m Sophie Amundsen,’ she said.
6. Sophie tried to beat her reflection to it with a
lightning movement but the girl was just as fast.
7. Sophie pressed her index finger to the nose in the
mirror and said, ‘You are me.’ As she got no answer to
this, she turned the sentence around and said, ‘I’m
you.’
32
Wordfreak
On-line demonstration
33