ACE Annotation Practices and Quality Control
Biomedical information extraction
at the University of Pennsylvania
Mark Liberman
[email protected]
Linguistic Data Consortium
http://www.ldc.upenn.edu
Text mining for biology and medicine: Glasgow, Feb. 21-22, 2008
Outline
The PennBioIE project:
Background, accomplishments, future
Public service announcement:
Publishing data via the LDC
The parable of Yang Jin
Annotation as “common law semantics”
a serviceable technology that will improve
are there better long-term alternatives?
PennBioIE Project
Goals:
Learn to strip-mine the bibliome:
better NLP tools for text data mining
Publish biomedical text annotation:
Treebanks, entities, relations
Participants:
Penn NLP researchers
Biomedical researchers
(Penn, GSK, CHoP)
Penn BioIE Project
Domains:
CYP
• inhibition of cytochrome P-450 enzymes
• 1100 abstracts
• collaboration with GSK
Onco
• genomic variations associated with cancer
• 1158 abstracts
• collaboration with Children’s Hospital of Philadelphia
Annotation sequence
1. pretagging (document segmentation etc.)
2. named entities
3. POS
4. treebanking
5. relations
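As a rough sketch of how these five stages feed one another (toy stand-in implementations only; the names and logic below are invented for illustration and are not the PennBioIE tooling):

# Hypothetical sketch of the five annotation stages above, with trivial
# stand-in implementations; not the actual PennBioIE tools.
import re
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    sentences: list = field(default_factory=list)   # 1. pretagging / segmentation
    entities: list = field(default_factory=list)    # 2. named entities
    pos_tags: list = field(default_factory=list)    # 3. POS
    parses: list = field(default_factory=list)      # 4. treebanking
    relations: list = field(default_factory=list)   # 5. relations

def pretag(text):
    # crude sentence segmentation on sentence-final punctuation
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_entities(sentences):
    # placeholder: treat capitalized tokens as candidate entity mentions
    return [[tok for tok in s.split() if tok[:1].isupper()] for s in sentences]

def tag_pos(sentences):
    # placeholder: every token gets an UNK tag
    return [[(tok, "UNK") for tok in s.split()] for s in sentences]

def treebank(tagged):
    # placeholder: a flat bracketing per sentence
    return ["(S " + " ".join(tok for tok, _ in sent) + ")" for sent in tagged]

def tag_relations(entities, parses):
    # placeholder: pair adjacent entity mentions within each sentence
    return [list(zip(ents, ents[1:])) for ents in entities]

def annotate(text):
    doc = Document(text)
    doc.sentences = pretag(doc.text)                          # 1
    doc.entities = tag_entities(doc.sentences)                # 2
    doc.pos_tags = tag_pos(doc.sentences)                     # 3
    doc.parses = treebank(doc.pos_tags)                       # 4
    doc.relations = tag_relations(doc.entities, doc.parses)   # 5
    return doc

print(annotate("Ketoconazole inhibits CYP3A4. The effect was dose-dependent.").entities)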
Penn BioIE Project
Results:
Some improved techniques
Some published data
get rel. 0.9 from http://bioie.ldc.upenn.edu
rel. 1.0 soon to be published by LDC
Some applications -- e.g. FABLE
Some questions
• How to break the F-measure ceiling?
• How to decrease annotation burden?
• How to increase semantic coverage?
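(For reference, the F-measure here is the standard harmonic mean of precision P and recall R, F = 2PR / (P + R), computed against a gold-standard annotation.)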
A note on the LDC
The Linguistic Data Consortium is
an open consortium
of universities,
companies,
and government laboratories;
founded in 1992
with seed money from DARPA;
run by the University of Pennsylvania
with 45 full-time staff in Philadelphia.
But really, the LDC is…
a specialized digital publisher,
which has distributed
>50,000 copies
of >750 corpora and other resources
to ~2,500 research organizations
in 62 countries.
… and might want to publish your data.
Why publish with LDC?
It’s a publication!
LDC pubs have:
• authors
• ISBN numbers
• standard bibliographic citation formats
• editions
IPR, licensing are handled your way
(from “all rights reserved” to open access)
LDC deals with the hassle
of reproduction, distribution, maintenance
The parable of Yang Jin
The annotation conundrum
“Natural” annotation is inconsistent
poor agreement for entities, worse for relations
task-internal metrics are noisy
“Top down” specification is even worse
(e.g. existing elaborate ontologies)
Solution: iterative refinement of rules
via interaction with annotation practice
result: complex accretion of “common law”
slow to develop, hard to learn
more consistent -- but is it correct?
complexity may re-create inconsistency
new types and sub-types introduce ambiguity and confusion
ACE 2005 consistency
ACE Value Score    1P vs. 1P    ADJ vs. ADJ
English
  Entity             73.40%       84.55%
  Relation           32.80%       52.00%
  Timex2             72.40%       86.40%
  Value              51.70%       63.60%
  Event              31.50%       47.75%
Chinese
  Entity             81.20%       85.90%
  Relation           50.40%       61.95%
  Timex2             84.40%       82.75%
  Value              78.70%       71.65%
  Event              41.10%       32.00%
1P vs. 1P: independent first passes by junior annotators, with no QC
ADJ vs. ADJ: the outputs of two parallel, independent dual first-pass annotations, each adjudicated by an independent senior annotator
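The ACE value score itself is a weighted, task-specific metric; purely as an illustration of what a pairwise comparison like "1P vs. 1P" measures, here is a toy exact-match F1 over two annotators' entity sets (hypothetical code, not the ACE scorer):

# Hypothetical illustration of a pairwise agreement check between two
# annotators' entity annotations, scored as exact-match F1.
# This is NOT the ACE value scorer (a weighted, task-specific metric).

def pairwise_f1(ann_a, ann_b):
    """ann_a, ann_b: sets of (start, end, type) entity mentions from two annotators."""
    a, b = set(ann_a), set(ann_b)
    if not a or not b:
        return 0.0
    matched = len(a & b)
    if matched == 0:
        return 0.0
    precision = matched / len(a)    # fraction of A's mentions also marked by B
    recall = matched / len(b)       # fraction of B's mentions also marked by A
    return 2 * precision * recall / (precision + recall)

# Two independent first passes ("1P vs. 1P") over the same passage
first_pass_1 = {(0, 12, "PER"), (20, 28, "ORG"), (35, 41, "GPE")}
first_pass_2 = {(0, 12, "PER"), (20, 28, "GPE"), (50, 57, "ORG")}
print(round(pairwise_f1(first_pass_1, first_pass_2), 3))   # 0.333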
Iterative improvement
From ACE 2005 (Ralph Weischedel):
Repeat until criteria met or until time has expired:
1. Analyze performance of previous task & guidelines
Scores, confusion matrices, etc.
2. Hypothesize & implement changes to tasks/guidelines
3. Update infrastructure as needed
DTD, annotation tool, and scorer
4. Annotate texts
5. Evaluate inter-annotator agreement
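A toy, runnable rendering of that loop (every helper below is a simplistic stand-in for what is in reality a manual, team-level process):

# Toy skeleton of the guideline-refinement loop described above.
import random

def annotate_corpus(corpus, guidelines):
    # stand-in for step 4: consistency loosely tracks guideline "maturity"
    return [random.random() < guidelines["maturity"] for _ in corpus]

def evaluate_agreement(annotations):
    # stand-in for step 5 and the score/confusion analysis of step 1
    agreement = sum(annotations) / len(annotations)
    confusions = {"ambiguous triggers": 1 - agreement}
    return agreement, confusions

def revise(guidelines, confusions):
    # stand-in for steps 2-3: change guidelines, update DTD / tool / scorer
    guidelines["maturity"] = min(1.0, guidelines["maturity"] + 0.1)
    return guidelines

def refine(corpus, target=0.85, max_rounds=10):
    guidelines = {"maturity": 0.3}
    agreement = 0.0
    for _ in range(max_rounds):                       # "until time has expired"
        annotations = annotate_corpus(corpus, guidelines)
        agreement, confusions = evaluate_agreement(annotations)
        if agreement >= target:                       # "criteria met"
            break
        guidelines = revise(guidelines, confusions)
    return guidelines, round(agreement, 2)

print(refine(range(1000)))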
NLP as Law School
Rules, Notes, Fiats and Exceptions
Many complex rules
Plus Wiki
Plus Listserv
Task         #Pages    #Rules
Entity           34        20
Value            10         5
TIMEX2           75        50
Relations        36        25
Events           77        50
Total           232       150
Example Decision Rule (Event p33)
Note: For Events where a single common trigger is ambiguous between the types LIFE (i.e. INJURE and DIE) and CONFLICT (i.e. ATTACK), we will only annotate the Event as a LIFE Event in case the relevant resulting state is clearly indicated by the construction.
The above rule will not apply when there are independent triggers.
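Purely as an illustration of how such a "common law" rule might be operationalized in a consistency checker, the note above amounts to a conditional like the following (the ACE type names come from the guidelines; the function and its arguments are invented for this sketch):

# Hypothetical encoding of the decision rule above; the ACE type names are
# from the guidelines, everything else is invented for this sketch.

def event_type_for_trigger(single_common_trigger, ambiguous_life_vs_conflict,
                           resulting_state_clear):
    """Decide LIFE vs. CONFLICT for one trigger.

    single_common_trigger: one shared trigger (no independent triggers)
    ambiguous_life_vs_conflict: ambiguous between LIFE (INJURE/DIE) and CONFLICT (ATTACK)
    resulting_state_clear: the construction clearly indicates the resulting state
    """
    if single_common_trigger and ambiguous_life_vs_conflict:
        # annotate as LIFE only when the resulting state is clearly indicated
        return "LIFE" if resulting_state_clear else "CONFLICT"
    return None   # rule does not apply; fall back to the general guidelines

print(event_type_for_trigger(True, True, False))   # CONFLICT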
BioIE case law
Guidelines for oncology tagging (local)
Discussion
How to make it better
Integrating multiple information sources
text, bioinformatic databases, microarray data, …
less-supervised learning
• inferring useful features from untagged text
• active learning, information markets, etc. (see the sketch below)
create a “basis set” of ready-made entity types
How to make it different
the analogy to translation
the lure of systematic semantics
(machine) learning: who is learning what?
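One concrete form of the "less-supervised" idea above is pool-based active learning with uncertainty sampling, sketched here on toy data (the threshold "model" and the oracle are invented for illustration):

# Toy pool-based active learning with uncertainty sampling; the data, the
# threshold "model", and the oracle are all invented for illustration.
import random

def fit_threshold(labeled):
    # toy model: learn a decision boundary from labeled (value, label) pairs
    pos = [x for x, y in labeled if y]
    neg = [x for x, y in labeled if not y]
    if not pos:
        return max(neg) + 1          # explore above everything seen so far
    if not neg:
        return min(pos) - 1          # explore below everything seen so far
    return (max(neg) + min(pos)) / 2

def uncertainty(x, threshold):
    return -abs(x - threshold)       # least confident nearest the boundary

def active_learning(pool, oracle, rounds=8, batch=5):
    random.shuffle(pool)
    labeled = [(x, oracle(x)) for x in pool[:batch]]    # small random seed set
    pool = pool[batch:]
    for _ in range(rounds):
        t = fit_threshold(labeled)
        pool.sort(key=lambda x: uncertainty(x, t), reverse=True)
        queried, pool = pool[:batch], pool[batch:]      # most uncertain items
        labeled += [(x, oracle(x)) for x in queried]    # send to the annotators
    return fit_threshold(labeled)

pool = list(range(100))
print(active_learning(pool, oracle=lambda x: x > 63))   # typically near 63.5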