Survey of Annotation Work
Joint session
Thursday afternoon, April 14
Chair: Eduard Hovy, ISI
Phenomena (from OntoBank)
| Level | Who | Phenomenon |
|---|---|---|
| L1 | Penn Treebank | bracketing/grouping of predications |
| L1 | Propbank | verb sense creation and annotation (including copula) |
| L1 | Propbank, Framenet, Verbnet, LCS, ILIT | verb sense frames & predicate structure (what labels?) |
| L1 | Propbank+Omega, IAMTC+Omega, ILIT, Scone | semantic term repository: conversion of senses to concepts(/clusters), axiom creation, insertion into ontology |
| L1,L2 | NomBank, ACE | noun senses, NP structure, propositions, (genitives, …) |
| L1 | Gazetteers | repository of instances (people, places, events…) |
| L1 | BBN, (ACE) | co-reference links (including events) |
| L2 | | pronoun (and empty trace?) classification (ref, bound, event, generic, other) (proposition vs. event?) |
| L2 | Propbank II, ILIT | event identification |
| L1 | | direct quotation and reported speech |
| L1 | | simple quantifier phrases and numerical exprs |
| L1,L2 | TimeBank, TIMEX, ISI (Hobbs), ILIT | inter-predicate relations: temporal, spatial, manner, etc. (incl. effects from discourse and aspect) |
| L2+ | WordNetPlus, Pantel, CYC | entailments |
| L2+ | | comparatives |
| L2 | | coordination |
| L2/L3 | Penn Discourse Treebank, RST Treebank, ILIT | discourse structure |
| L2/L3 | U Pitt, ISI | opinions |
| L3 | | identifying propositions and simple modality |
| L3/L4 | | other adverbials (epistemic modals, evidentials) |
| L3/L4 | | polarity (more advanced than plain “neg” in L1) |
| L3+ | Steedman, Hajicova, Sgall | information structure (theme/rheme), focus |
| L4 | ILIT | pragmatics/speech acts, style |
| L4 | | presuppositions |
| ? | CYC, Scone | axioms and reasoning |
| ? | Framenet | metaphor |
Notional goal
| phenomenon | annot speed | annot reliability | functionality | funder need |
|---|---|---|---|---|
| noun senses | 25 wph | 86/90% | IE,MT,QA... | high |
| verb senses | 70 wph | ~87% | MT,QA,WSD | high |
| verb frames | 80 w/week | 87% | MT,QA,IE… | high |
| time exprs | 18 wpm | 96% | QA,IR,Summ | med-hi |
| discourse | 100K in 400h | ~90/80% | Summ,QA | med |
| gazetteers | ? | ~95/90% | QA,IE | high |
| opinions | 100K in 400h | ~76% | QA,Summ | med-hi |
| number exprs | ? | ? | IE,QA,Summ | med |
| hypotheticals | ? | ? | QA,Summ | low? |
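The annotation speeds on this slide are quoted in mixed units (wph, wpm, w/week, "100K in 400h"), which makes them hard to compare at a glance. The sketch below normalizes them to items per hour. It rests on assumptions that the slides do not state: that "wph" and "wpm" mean words per hour and per minute, that "100K in 400h" means 100,000 words in 400 hours, and that a working week is 40 hours. The function name `per_hour` and the constant `HOURS_PER_WEEK` are hypothetical, introduced only for illustration.

```python
import re
from typing import Optional

HOURS_PER_WEEK = 40.0  # assumption: the slides do not say how long a week is


def per_hour(rate: str, hours_per_week: float = HOURS_PER_WEEK) -> Optional[float]:
    """Convert a slide-style rate string to items per hour (None if unknown)."""
    text = rate.strip().lower().replace(" ", "")
    if text in ("", "?"):
        return None

    # Totals such as "100K in 400h": amount divided by hours.
    total = re.fullmatch(r"([\d.]+)(k?)in([\d.]+)h", text)
    if total:
        amount = float(total.group(1)) * (1000 if total.group(2) else 1)
        return amount / float(total.group(3))

    # Simple rates such as "25 wph", "18 wpm", "80 w/week".
    simple = re.fullmatch(r"([\d.]+)(wph|wpm|w/week)", text)
    if simple:
        value, unit = float(simple.group(1)), simple.group(2)
        if unit == "wpm":
            return value * 60
        if unit == "w/week":
            return value / hours_per_week
        return value

    raise ValueError(f"unrecognized rate: {rate!r}")


if __name__ == "__main__":
    # Speeds copied from the slide, in slide order.
    speeds = {
        "noun senses": "25 wph",
        "verb senses": "70 wph",
        "verb frames": "80 w/week",
        "time exprs": "18 wpm",
        "discourse": "100K in 400h",
        "gazetteers": "?",
        "opinions": "100K in 400h",
    }
    for phenomenon, rate in speeds.items():
        print(phenomenon, per_hour(rate))
```

Under these assumptions, 18 wpm corresponds to 1080 items per hour and "100K in 400h" to 250 per hour, so the quoted speeds span several orders of magnitude; the units of what is being counted may also differ between projects.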
Agenda I
• Predicate/verb level:
– PropBank I and II: Martha Palmer, UPenn
– OntoBank corefs: Lance Ramshaw, BBN
– IAMTC consortium: Steve Helmreich, NMSU
– FrameNet: Charles Fillmore, UC Berkeley
– Extended LCS: Bonnie Dorr, U Maryland
• Nominal level:
– NomBank: Adam Meyers, NYU
– ACE: Ralph Grishman, NYU
• Terminology banks:
– WordNet: Christiane Fellbaum, Princeton
– Omega: Eduard Hovy, USC/ISI
Agenda II
• Discourse level:
– RST treebank: Lynn Carlson, DoD
– Penn discourse treebank: Aravind Joshi, UPenn
• Specific semantic phenomena:
– TIMEX: Lisa Ferro, MITRE & Beth Sundheim, SPAWAR
– ILIT: Sergei Nirenburg, UMBC
– Opinions: Jan Wiebe, U Pitt
– Gazetteers: Beth Sundheim, SPAWAR
• Inference and reasoning:
– WN Entailments: Christiane Fellbaum, Princeton
– CYC: Dave Schneider
– Scone: Scott Fahlman
Summary of annot work
phenomenon
pred-arg
who
Propbank
IAMTC
ACE
FrameNet
entities
task
V frame annot
frame creation
V/N senses
frame roles
inter-N relations
accuracy
speed 1
~85%
70/h
.83/.66kappa
.52 kappa
~77%
speed 2
100/week
60/h
annotated
corpus size
1M+ Eng
250K Chi
3Kw (10x) Eng
23Kw Eng
Nombank
ACE
N sense annot
N types
coref
BBN
event parts
coref
apposition
ILIT
Nirenburg
numerous
phenomena
gazetteer
Sundheim
time exprs
discourse
Ferro,
Sundheim
Joshi
opinions
Wiebe et al.
link? Y/N
same gaz entry
value of expr
86%
90%
~57%
84–90%
~92%
95%
87–99%
96%
25/h
# individuals
annotated
135K preds
uses
IE,QA,MT
MT
IE,QA,MT
130K verbs in sents various
150K noun tokens IE,IR,QA,Summ
190Kw Eng
190Kw Chi
190Kw Ara
100K in 60 h
300Kw Eng
IE,Summ
100K in 2080 h
2.5Kw Eng?
MT,Summ,IE
QA,IE
100K in 93 h
explicit
implicit args
impl rel type
92%
90%
80%
100K in 450 h
350Kw Eng
220Kw Chi
16Kw Eng
8Kw Eng
8Kw Eng
opinion frame
~76%-95%
100K in 400 h
15Ksent Eng
QA,IR,Summ
Summ,QA
QA,Summ