Tutorial for the annotation of the Penn Discourse Treebank



Annotation Guidelines for the
Penn Discourse Treebank
Part B
Eleni Miltsakaki, Rashmi Prasad,
Aravind Joshi, Bonnie Webber
1
Brief summary
 Annotation of discourse connectives and their
arguments.
 Discourse connectives: subordinate conjunctions,
coordinate conjunctions, adverbials, empty.
 Discourse connectives express relations between at least 2
events or states.
 Legal argument: a clause at minimum.
 Annotation tool: WordFreak
 Tags: CONN, ARG1, ARG2, SUP1, SUP2
 Features: Search, discontinuous text selection, comment
box.
2
Basic types of clauses
Tensed
  Main
  Subordinate: Complement, Adverbial, Relative
Non-tensed
  Subordinate: Infinitival, Participial
3
Examples
Main: Tom left
Complement: Mary said that Tom left
Adverbial: Tom left when he finished
Relative: Tom, who finished early, left
Infinitival: Tom wants to leave
Participial: Tom spent the day watching TV
4
“Small clauses”
 Complements (“objects”) of certain verbs
 Verb “be” is understood but may not be
explicit
(1) I consider Mary a smart student.
5
“Small clauses” as arguments
 Selecting just the small clause is sufficient
even though there is no explicit verb
(2) I consider Mary a smart student
although she failed her exams.
6
Relative clauses as arguments
 Selecting just the relative clause is sufficient
 Syntactic information from Treebank will help
us identify the head of the relative clause
(3) I bought some books_i which (null_i)
were very expensive, even though they
were second hand.
7
Distinction between a relative
clause and an NP
 … a reference to spiders that attract males and
then kill them after mating.
 ‘that attract males and then kill them after
mating’ → relative clause → OK ARG
 ‘spiders that attract males and then kill them
after mating’ → Noun Phrase → NOT OK
ARG
8
Modified connectives
 Like other syntactic categories, connectives can be
modified.
 E.g., only when, largely because, especially after, etc.
 In such cases, select both the connective and the
modifier.
 When you see a comma before the connective, select
just the connective. In such cases, the modifier does
not modify the connective.
(4) He wears jeans only, because he wants to have a casual
look.
(5) He wears jeans only because he wants to have a casual
look.
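The comma rule above can be sketched as a small check. This is only an illustration: the function name, the whitespace tokenization, and the modifier list (taken from the examples on this slide, and certainly not exhaustive) are assumptions, not part of the annotation tool.

```python
# Hypothetical helper illustrating the "modified connective" rule.
# MODIFIERS only lists the modifiers named on the slide; it is not exhaustive.
MODIFIERS = {"only", "largely", "especially"}


def select_connective(tokens, conn_index):
    """Return the (start, end) token span to tag as CONN, end exclusive.

    The modifier is included only when it immediately precedes the
    connective. With whitespace tokenization, a comma stays attached to
    the preceding word ("only,"), so that word no longer matches a
    modifier and only the connective itself is selected.
    """
    start = conn_index
    if conn_index > 0 and tokens[conn_index - 1].lower() in MODIFIERS:
        start = conn_index - 1
    return start, conn_index + 1


ex4 = "He wears jeans only, because he wants to have a casual look.".split()
ex5 = "He wears jeans only because he wants to have a casual look.".split()

s4, e4 = select_connective(ex4, ex4.index("because"))
s5, e5 = select_connective(ex5, ex5.index("because"))
print(" ".join(ex4[s4:e4]))  # because
print(" ".join(ex5[s5:e5]))  # only because
```

A real annotator still has to judge whether the word before the connective actually modifies it; the sketch only captures the comma diagnostic.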
9
Words that look like discourse
connectives
 Reminder:
 Discourse relations require clausal interpretations.
 Ignore instances of “connectives” in your set if they are
not associated with a clause
 Examples
(6) These mainly involved such areas as materials -- advanced
soldering machines, for example -- and medical devices
derived from experimentation in space.
(7) They bought wine and beer.
(8) Mary, also John, will leave late today.
10
Not all adverbials are connectives
 Some adverbials do not express a discourse relation. The
clause that contains them is sufficient for the interpretation.
 An adverbial counts as a connective when it expresses a
relation between at least TWO situations in the discourse.
In:
(9) John did not finish the report. Therefore, we will
postpone today’s meeting.
Out:
(10) John was hungry. Strangely, he only ordered a
fruit salad.
11
ARG1 and ARG2 for double
connectives
 For double connectives such as On one hand … on
the other hand, If…then
 Select the two connectives using the discontinuous text
selection feature and enter them together under CONN.
 Mark as ARG1 the clause that contains the first connective.
 Mark as ARG2 the clause that contains the second
connective.
(11) If you finish your homework before noon, then you may
go to the movies.
ARG1= you finish your homework before noon
ARG2=you may go to the movies
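One way to picture the result for example (11) is as a record in which each tag holds a list of character spans, so that a discontinuous selection is simply a list with more than one span. The class and helper below are hypothetical and are not WordFreak's actual storage format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def find_span(text: str, fragment: str) -> Span:
    """Locate the first occurrence of fragment in text (illustrative helper)."""
    start = text.index(fragment)
    return (start, start + len(fragment))


@dataclass
class Annotation:
    """Hypothetical record for one annotated connective and its arguments."""
    text: str
    conn: List[Span] = field(default_factory=list)  # several spans = discontinuous
    arg1: List[Span] = field(default_factory=list)
    arg2: List[Span] = field(default_factory=list)

    def selected(self, spans: List[Span]) -> str:
        """Join the text covered by the given spans with single spaces."""
        return " ".join(self.text[s:e] for s, e in spans)


sentence = "If you finish your homework before noon, then you may go to the movies."
ann = Annotation(
    text=sentence,
    # Both halves of the double connective go together under CONN.
    conn=[find_span(sentence, "If"), find_span(sentence, "then")],
    # ARG1 is the clause with the first connective, ARG2 the clause with the second.
    arg1=[find_span(sentence, "you finish your homework before noon")],
    arg2=[find_span(sentence, "you may go to the movies")],
)

print(ann.selected(ann.conn))  # If then
print(ann.selected(ann.arg1))  # you finish your homework before noon
print(ann.selected(ann.arg2))  # you may go to the movies
```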
12
A few more conventions
 Exclude punctuation marks appearing at the
end of the clause that you are selecting.
 When selecting the content of a complement
clause, include ‘that’ in your selection.
 When selecting a relative clause, include the
relative pronoun in your selection.
 When a connective appears in the clause
you’re selecting as an argument, include that
connective.
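The first convention (no clause-final punctuation in the selection) can be sketched as a span-trimming step. The function name and the set of characters treated as punctuation are assumptions made for this illustration.

```python
# Characters treated as clause-final punctuation; an assumption for this sketch.
TRAILING = set(".,;:!?")


def trim_trailing_punctuation(text, start, end):
    """Shrink the (start, end) character span so it does not end in
    punctuation or whitespace. The start offset is left untouched."""
    while end > start and (text[end - 1] in TRAILING or text[end - 1].isspace()):
        end -= 1
    return start, end


sentence = "Mary said that Tom left."
# A selection of the complement clause that accidentally includes the period.
start = sentence.index("that")
start, end = trim_trailing_punctuation(sentence, start, len(sentence))
print(sentence[start:end])  # that Tom left
```

Note that the trimmed selection still begins with ‘that’, in line with the convention for complement clauses.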
13
The case of VP coordination
 Do not annotate connectives that relate two
verb phrases.
 Diagnostic: the second (tensed) verb is missing
its subject.
(12) OUT: Mary finished her food and left.
(13) IN: Mary finished her food and she left.
14
Some hard cases: “As”
 Multiple meanings of “as”
 Temporal
(14) He tripped over a bunch of plugs as he was leaving the room.
 Causal
(15) W. U. had major losses as its telex business faltered in the face
of competition from facsimile machines.
 Annotate “as” only if it has a temporal or causal
interpretation.
 Do NOT annotate, for example:
(16) As she puts it, there’s no hope.
(17) We do as we are told, as is the rule.
15
Some hard cases: “So”
 “So” expresses a consequence relation.
 But it’s not always easy to identify the
consequence:
(18) She flunked the exam. So, what’s next?
(19) You said she didn’t work hard. So, if you
believe this, you must be right.
 Sometimes it may be hard to identify ARG.
 If you run into such cases, let us know and
make a comment in the comment box.
16
Some hard cases: “Nor”
 “Nor” can be found:
 By itself: annotate as regular connective.
(20) This has nothing to do with you. Nor will it ever.
 In a “neither … nor” construction: annotate as a double
connective.
 Sometimes it may be hard to identify ARG1. In this
case, leave the ARG1 slot empty and make a
comment in the comment box.
 “Neither … nor” in VP coordination: Do not annotate
(21) In doing so, he neither rejected a socialist planned
economy nor embraced the free market.
17
When to exclude conns from args
 Do not include a connective in the selection of an
argument if it does not belong with the clause
selected as an argument.
 Some such hard cases include the coordinate
conjunctions “and” and “but”.
 But make sure you include the connective when it
belongs with the selected clause even if it’s at a
distance.
(22) But, says Mr. Dinkins, he did get an office. Therefore he
shouldn’t complain.
18
Implicit conns: multiple
interpretations
 If you identify more than one relation between
adjacent sentences
 And therefore are able to provide more than one
explicit connective
 Put the one that you think most likely first and add
the rest as follows:
 ;;;CONN=because ;;;CONN=nevertheless
;;;CONN=moreover
 If you think there is NO relation between the two
sentences that can be expressed with a connective, type
NONE in the comment box. Do not just leave the
comment box empty.
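For anyone post-processing the annotations, the comment-box convention above could be read back with a few lines of code. This is a minimal sketch that assumes the comment contains either the literal NONE or a sequence of ;;;CONN= fields exactly as shown on the slide; the function name is hypothetical.

```python
def parse_implicit_connectives(comment):
    """Return the connectives listed in a comment box, most likely first.

    Assumes the slide's format: either the word NONE (no relation that a
    connective can express) or fields of the form ;;;CONN=<connective>.
    An empty comment is rejected, since the guidelines ask annotators
    never to leave the box empty.
    """
    comment = comment.strip()
    if not comment:
        raise ValueError("empty comment box: type NONE when no connective fits")
    if comment == "NONE":
        return []
    connectives = []
    for part in comment.split(";;;"):
        part = part.strip()
        if part.startswith("CONN="):
            connectives.append(part[len("CONN="):].strip())
    return connectives


example = ";;;CONN=because ;;;CONN=nevertheless\n;;;CONN=moreover"
print(parse_implicit_connectives(example))
# ['because', 'nevertheless', 'moreover']
```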
19