Tutorial for the annotation of the Penn Discourse Treebank
Annotation Guidelines for the
Penn Discourse Treebank
Part B
Eleni Miltsakaki, Rashmi Prasad,
Aravind Joshi, Bonnie Webber
1
Brief summary
Annotation of discourse connectives and their
arguments.
Discourse connectives: subordinate conjunctions,
coordinate conjunctions, adverbials, empty.
Discourse connectives express relations between at least 2
events or states.
Legal argument: a clause at minimum.
Annotation tool: WordFreak
Tags: CONN, ARG1, ARG2, SUP1, SUP2
Features: Search, discontinuous text selection, comment
box.
2
Basic types of clauses
Tensed
  Main
  Subordinate: Complement, Adverbial, Relative
Non-tensed
  Subordinate: Infinitival, Participial
3
Examples
Main: Tom left
Complement: Mary said that Tom left
Adverbial: Tom left when he finished
Relative: Tom, who finished early, left
Infinitival: Tom wants to leave
Participial: Tom spent the day watching TV
4
“Small clauses”
Complements (“objects”) of certain verbs
Verb “be” is understood but may not be
explicit
(1) I consider Mary a smart student.
5
“Small clauses” as arguments
Selecting just the small clause is sufficient
even though there is no explicit verb
(2) I consider Mary a smart student
although she failed her exams.
6
Relative clauses as arguments
Selecting just the relative clause is sufficient
Syntactic information from Treebank will help
us identify the head of the relative clause
(3) I bought some books_i which (null_i)
were very expensive, even though they
were second hand.
7
Distinction between a relative
clause and an NP
… a reference to spiders that attract males and
then kill them after mating.
‘that attract males and then kill them after
mating’: relative clause, OK as ARG
‘spiders that attract males and then kill them
after mating’: noun phrase, NOT OK as ARG
8
Modified connectives
Like other syntactic categories, connectives can be
modified.
E.g., only when, largely because, especially after, etc.
In such cases, select both the connective and the
modifier.
When you see a comma before the connective, select
just the connective. In such cases, the modifier does
not modify the connective.
(4) He wears jeans only, because he wants to have a casual
look.
(5) He wears jeans only because he wants to have a casual
look.
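The comma rule above can be sketched as a small heuristic. This is only an illustration, not part of the annotation tool: the function name `conn_selection` and the modifier list (taken from the examples on this slide) are assumptions, and a real decision still rests with the annotator.

```python
import re

# Illustrative modifier list, taken from the examples on this slide (assumption).
MODIFIERS = {"only", "largely", "especially"}

def conn_selection(sentence: str, connective: str) -> str:
    """Return the text to select as CONN for the first occurrence of `connective`.

    The preceding word is included only if it is a known modifier and
    no comma separates it from the connective.
    """
    pattern = r"(\w+)(,?)\s+" + re.escape(connective) + r"\b"
    m = re.search(pattern, sentence)
    if m and m.group(1).lower() in MODIFIERS and not m.group(2):
        return f"{m.group(1)} {connective}"
    return connective

ex4 = "He wears jeans only, because he wants to have a casual look."
ex5 = "He wears jeans only because he wants to have a casual look."
print(conn_selection(ex4, "because"))  # comma present: connective alone
print(conn_selection(ex5, "because"))  # no comma: modifier + connective
```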
9
Words that look like discourse
connectives
Reminder:
Discourse relations require clausal interpretations.
Ignore instances of “connectives” in your set if they are
not associated with a clause
Examples
(6) These mainly involved such areas as materials -- advanced
soldering machines, for example -- and medical devices
derived from experimentation in space.
(7) They bought wine and beer.
(8) Mary, also John, will leave late today.
10
Not all adverbials are connectives
Some adverbials do not express a discourse relation. The
clause that contains them is sufficient for the interpretation.
An adverbial counts as a connective when it expresses a
relation between at least TWO situations in the discourse.
In:
(9) John did not finish the report. Therefore, we will
postpone today’s meeting.
Out:
(10) John was hungry. Strangely, he only ordered a
fruit salad.
11
ARG1 and ARG2 for double
connectives
For double connectives such as On one hand … on
the other hand, If…then
Select the two connectives using the discontinuous text
selection feature and enter them together under CONN.
Mark as ARG1 the clause that contains the first connective.
Mark as ARG2 the clause that contains the second
connective.
(11) If you finish your homework before noon, then you may
go to the movies.
ARG1= you finish your homework before noon
ARG2=you may go to the movies
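One way to picture the result for example (11) is as a record whose slots each hold one or more text spans, so that a discontinuous CONN selection is simply two spans. The sketch below is hypothetical: the `Annotation` class, `span_of`, and `text_of` are names invented for illustration and do not describe WordFreak's actual storage format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

@dataclass
class Annotation:
    """Hypothetical record: each slot holds one or more spans."""
    conn: List[Span]
    arg1: List[Span] = field(default_factory=list)
    arg2: List[Span] = field(default_factory=list)

def span_of(source: str, piece: str) -> Span:
    """Locate the first occurrence of `piece` in `source`."""
    start = source.index(piece)
    return (start, start + len(piece))

def text_of(source: str, spans: List[Span]) -> str:
    """Show the selected text, marking gaps between discontinuous spans."""
    return " ... ".join(source[s:e] for s, e in spans)

sentence = "If you finish your homework before noon, then you may go to the movies."
ann = Annotation(
    conn=[span_of(sentence, "If"), span_of(sentence, "then")],  # two spans, one CONN
    arg1=[span_of(sentence, "you finish your homework before noon")],
    arg2=[span_of(sentence, "you may go to the movies")],
)
print(text_of(sentence, ann.conn))
print(text_of(sentence, ann.arg1))
print(text_of(sentence, ann.arg2))
```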
12
A few more conventions
Exclude punctuation marks appearing at the
end of the clause that you are selecting.
When selecting the content of a complement
clause, include ‘that’ in your selection.
When selecting a relative clause, include the
relative pronoun in your selection.
When a connective appears in the clause
you’re selecting as an argument, include that
connective.
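The first convention (no clause-final punctuation in the selection) can be checked mechanically. A minimal sketch, assuming selections are character offsets into the text; `trim_selection` and the punctuation set are illustrative choices, not part of the guidelines.

```python
# Punctuation marks treated as clause-final (an assumption for this sketch).
CLAUSE_FINAL = ".,;:!?"

def trim_selection(source: str, start: int, end: int):
    """Shrink [start, end) so it excludes trailing whitespace and punctuation."""
    while end > start and (source[end - 1] in CLAUSE_FINAL or source[end - 1].isspace()):
        end -= 1
    return start, end

text = "Mary said that Tom left."
start = text.index("that")              # complement clause: 'that' is included
s, e = trim_selection(text, start, len(text))
print(text[s:e])
```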
13
The case of VP coordination
Do not annotate connectives that relate two
verb phrases.
Diagnostic: the second (tensed) verb has no
subject of its own.
(12) OUT: Mary finished her food and left.
(13) IN: Mary finished her food and she left.
14
Some hard cases: “As”
Multiple meanings of “as”
Temporal
(14) He tripped over a bunch of plugs as he was leaving the room.
Causal
(15) W. U. had major losses as its telex business faltered in the face
of competition from facsimile machines.
Annotate “as” only if it has a temporal or causal
interpretation.
Do NOT annotate, for example:
(16) As she puts it, there’s no hope.
(17) We do as we are told, as is the rule.
15
Some hard cases: “So”
“So” expresses a consequence relation.
But it’s not always easy to identify the
consequence:
(18) She flunked the exam. So, what’s next?
(19) You said she didn’t work hard. So, if you
believe this, you must be right.
Sometimes it may be hard to identify ARG.
If you run into such cases, let us know and
make a comment in the comment box.
16
Some hard cases: “Nor”
“Nor” can be found:
By itself: annotate as a regular connective.
(20) This has nothing to do with you. Nor will it ever.
In a “neither … nor” construction: annotate as a double
connective.
Sometimes it may be hard to identify ARG1. In this
case, leave the ARG1 slot empty and make a
comment in the comment box.
“Neither … nor” in VP coordination: Do not annotate
(21) In doing so, he neither rejected a socialist planned
economy nor embraced the free market.
17
When to exclude conns from args
Do not include a connective in the selection of an
argument if it does not belong with the clause
selected as an argument.
Some such hard cases include the coordinate
conjunctions “and” and “but”.
But make sure you include the connective when it
belongs with the selected clause even if it’s at a
distance.
(22) But, says Mr. Dinkins, he did get an office. Therefore he
shouldn’t complain.
18
Implicit conns: multiple
interpretations
If you identify more than one relation between
adjacent sentences, and can therefore provide more
than one explicit connective, put the one that you
think most likely first and add the rest as follows:
;;;CONN=because ;;;CONN=nevertheless
;;;CONN=moreover
If you think there is NO relation between the two
sentences that can be expressed with a connective,
type NONE in the comment box. Do not just leave the
comment box empty.
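Because the comment-box entries follow a fixed pattern, they can be read back automatically. The sketch below assumes the comment is available as a plain string; `parse_comment` is a hypothetical helper, and treating an empty comment as an error simply mirrors the instruction not to leave the box empty.

```python
from typing import List

def parse_comment(comment: str) -> List[str]:
    """Return the connectives listed in a comment, most likely first.

    'NONE' yields an empty list; an empty comment is rejected, since the
    guidelines ask annotators never to leave the box empty.
    """
    text = comment.strip()
    if not text:
        raise ValueError("empty comment: type NONE or list connectives")
    if text == "NONE":
        return []
    values = []
    for chunk in text.split(";;;"):
        chunk = chunk.strip()
        if chunk.startswith("CONN="):
            values.append(chunk[len("CONN="):].strip())
    return values

entry = ";;;CONN=because ;;;CONN=nevertheless ;;;CONN=moreover"
print(parse_comment(entry))
print(parse_comment("NONE"))
```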
19