Analysis of inter-annotator agreement

TEXT MINING & REG.
ANNOTATION
ANALYSIS OF INTER-ANNOTATOR AGREEMENT
(TEXT MINING & REG. ANNOTATION)
RegCreative Jamboree, Friday, December 1st, 2006
MARTIN KRALLINGER, 2006
MAIN ASPECTS
Explore annotation overlap
Discuss variability in annotation
Text mining and regulatory element annotation: needs, limits, tasks
SOCIOLOGY OF GENOME ANNOTATION (Lincoln Stein 2001)
• Models of annotation
  • Museum model: small group of specialized curators
  • Jamboree model: a group of biologists and bioinformaticians come together for a short intensive annotation workshop
  • Cottage industry: decentralized effort of annotators among the recruited community
  • Factory model: highly automated methods
(Elsik et al., 2006)
WHY PRE-JAMBOREE QUEUE?
• Get familiar with the annotation system (before the jamboree)!
• Understand the content and annotation strategy of ORegAnno
• Detect aspects that require improvement, such as incompleteness, ambiguity or wrong structures in the annotation strategy, guidelines or documentation -> active feedback (questionnaire and wiki)
• Assess consistency of the current annotation procedures
• Explore which aspects affect annotation agreement
• Estimate difficulty of the task (alternative interpretations, uncertainty, etc.)
SIMILARITY MEASURES
• Similarity calculation is a popular subject in computer science
• Different entities considered:
  • Feature vectors: Alignment, Cosine, Dice, Euclidean, …
  • Strings or sequences of strings (text): averaged String Matching, TFIDF
  • Sets: Jaccard, Loss of Information, Resemblance
  • Sequences: Levenshtein Edit Distance
  • Trees: Bottom-up/Top-down Maximum Common Subtree, Tree Edit Distance
  • Graphs: Conceptual Similarity, Graph Isomorphism, Subgraph Isomorphism, Maximum Common Subgraph Isomorphism, Graph Isomorphism Covering, Shortest Path
  • Information theory: Jiang & Conrath, Lin, Resnik
• Bioinformatics: sequence similarity, structural similarity, similarity of gene expression
• Here: similarity between human annotations
(refer to SimPack project examples)
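Three of the measures named on this slide are simple enough to sketch directly. The following is a minimal illustration, not the SimPack implementation: Jaccard and Dice are written in their set form, Levenshtein uses unit costs, and the function names and toy inputs are assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def dice(a, b):
    """Dice coefficient 2|A & B| / (|A| + |B|) of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # same convention as in jaccard()
    return 2 * len(a & b) / (len(a) + len(b))


def levenshtein(s, t):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Toy inputs, invented for illustration.
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))  # 2 shared / 4 total = 0.5
print(levenshtein("kitten", "sitting"))           # 3
```

For annotation records, the set measures fit unordered fields (e.g. the set of evidence codes a curator assigned), while edit distance fits ordered fields such as extracted sequences.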
MEASUREMENT OF OBSERVER AGREEMENT
• Assumption: when independent annotators agree, they are correct?!
• Statistical agreement measures for categorical data
  • Overall proportion of agreement
  • Pairwise comparison; Cohen's kappa; Pearson Chi-square
  • Weighted kappa for multiple categories
• High accuracy implies high agreement
• Kappa is sometimes inconsistent with accuracy measured as AROC (area under the ROC curve)
[Figure: Kappa coefficient]
Measurement of Observer Agreement, Kundel and Polansky, Statistical Concepts Series (2003)
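Cohen's kappa corrects the overall proportion of agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch for two annotators and unweighted categories follows; the function name and the label lists are hypothetical, invented only to show the arithmetic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotation-type labels for ten items (not RegCreative data).
a = ["TFBS", "TFBS", "RR", "RR", "TFBS", "RR", "TFBS", "TFBS", "RR", "TFBS"]
b = ["TFBS", "TFBS", "RR", "TFBS", "TFBS", "RR", "TFBS", "RR", "RR", "TFBS"]
print(round(cohens_kappa(a, b), 3))  # p_o = 0.8, p_e = 0.52, kappa = 0.583
```

In this toy case the raw agreement of 0.8 drops to a kappa of about 0.58 once chance agreement is removed, which is why the two numbers should be reported together.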
ANNOTATOR AGREEMENT FOR WSD
• Word Sense Disambiguation (WSD): a central problem in NLP
• WSD: discerning the meaning of a word in context
• Two human annotators may disagree in their sense assignment
• Agreement of human annotators is often the baseline for evaluation of automated approaches
• Case study using more than 30,000 instances of the most frequently occurring nouns and verbs in English
• Words in sentences were manually sense-tagged to WordNet by two groups of annotators
• Used the Kappa score to measure inter-annotator agreement, considering the effect of chance agreement
• Difficult to achieve high agreement when annotators have to assign refined sense tags
• Importance of example sentences for the usage of each word sense
A Case Study on Inter-Annotator Agreement for Word Sense Disambiguation, Ng et al.
AGREEMENT OF SPEECH CORPORA
• Phonetically annotated speech corpora
• Quality of manual annotations affected by:
  • Implicit incoherence: labeling is incoherent due to human variability in perceptual capacities and other factors
  • Lack of consensus on coding schema: manual annotations reflect the variability of the interpretation and application of the coding schema by the annotators
  • Annotator characteristics: individual characteristics of coders such as familiarity with the material, amount of former training, motivation, interest and fatigue-induced errors
Measuring the Reliability of Manual Annotations of Speech Corpora, Gut and Bayerl
CHALLENGES FOR OREGANNO ANNOTATION
• Complexity of gene regulation
• Need for ontologies and lexical resources
• Deep inference by domain expert curators
• Spatial, temporal, experimental conditions
• Range of entity types: genes, regulatory sequences, proteins
• Distinction between gene family and individual gene member
• TF binding site sequence extraction and mapping to genome
• TF mapping to normalized database entries (NCBI, Ensembl)
• Archeology-like annotation: annotation of old papers
• BUT GENE REGULATION IS ONE OF THE MAIN BIOLOGICAL INFORMATION (ANNOTATION) ASPECTS!
SOURCES FOR ANNOTATION VARIABILITY (1)
• Curator background (biologist, bioinformatician, ...)
• Familiarity with the annotation system
• Number of previously annotated papers or proteins
• Prior knowledge of the regulated gene or TF
• Prior knowledge (experience) of the experimental types
• Sub-domain knowledge (e.g. developmental biology or OS)
• Publication date (reflects the state of knowledge)
SOURCES FOR ANNOTATION VARIABILITY (2)
• Number of papers annotated the same day (fatigue effect)
• Unclear or partial documentation of certain annotation aspects
• Annotation type (ontology of annotation types? CV?)
• Number of pages, figures, tables, references, …
• Consultation of additional resources (material, databases, web)
• Different degrees of granularity in annotation
• Differences in the recall of manually extracted annotations (all?)
• Sequence (paper/database, strand, typos, length)
REGCREATIVE CASE STUDY: PREJAMBOREE (1)
• Relatively few articles -> only exploratory examination
• Annotation type: 9/11 agree (disagreement in 2071609 and 10674400: RR vs. TFBS)
• Considerable difference in the average number of annotations per paper
• Some curators extracted only a single annotation, others basically every annotation mentioned in the paper
• Almost perfect agreement on organism source (1 case of human vs. mouse disagreement), but genes correct!
• Very high agreement on the gene names, only a few user-defined cases (which are difficult to evaluate)
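With only 11 articles, the 9/11 agreement on annotation type carries wide uncertainty, which is one way to read "only exploratory examination". A minimal sketch that attaches a Wilson score interval to that proportion; choosing the Wilson interval and z = 1.96 (roughly 95%) is an assumption of this example, not something reported in the case study.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


agreed, total = 9, 11  # annotation-type agreement reported on this slide
low, high = wilson_interval(agreed, total)
print(f"observed agreement {agreed / total:.2f}, interval {low:.2f} to {high:.2f}")
```

The interval spans roughly half to nearly all of the papers, so the figure is best treated as a rough indication rather than a stable estimate.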
REGCREATIVE CASE STUDY: PREJAMBOREE (2)
• Some disagreement in TF names, many are user-defined!
• Evidence class: high agreement, many "Transcription regulator site" and "unknown"
• Evidence type: high agreement, some more complete than others (again, some curators annotate all the types, others only some of them)
• Evidence sub-type: similar to evidence types, but in general slightly lower agreement than for the evidence type
Transcription factor names: PREJAMBOREE
[Chart legend: User defined, NCBI, Ensembl, Unknown]
Example case 1: TF annotation variance
7534794
Curator      TF name                  Source
Curator B    UNKNOWN                  USER DEFINED
Curator B    AP-1                     USER DEFINED
Curator B    AP-1                     USER DEFINED
Curator A    c-Rel/p65 heterodimer    USER DEFINED
Curator A    UNKNOWN                  USER DEFINED
Curator A    UNKNOWN                  USER DEFINED
Example case 2: TF annotation variance
1718972
Curator      TF name             Source
Curator A    Tcf1                NCBI
Curator B    Tcf1                NCBI
Curator B    C/EBP family        USER DEFINED
Curator B    C/EBP and NF-1      USER DEFINED
Curator B    Tcf1                NCBI
Curator B    UNKNOWN             USER DEFINED
Curator B    UNKNOWN             USER DEFINED
Curator B    UNKNOWN             USER DEFINED
Example case: difference in evidence types
10674400
Curator A    REGULATORY REGION
Curator B    TRANSCRIPTION FACTOR BINDING SITE
Curator B    TRANSCRIPTION FACTOR BINDING SITE
2071609
Curator A    REGULATORY REGION
Curator B    TRANSCRIPTION FACTOR BINDING SITE
Curator B    TRANSCRIPTION FACTOR BINDING SITE
Different amount of annotation extracted
3038906
Curator A
  TRANSCRIPTION FACTOR BINDING SITE
  Col1a2 UNKNOWN
  TCCAAACTTGGCAAGGGCGAGA
  CLASS: OREGEC00001  TYPE: OREGET00003  SUBTYPE: OREGES00015
  CLASS: OREGEC00001  TYPE: OREGET00001  SUBTYPE: OREGES00003
Curator B
  1->TRANSCRIPTION FACTOR BINDING SITE
  Col1a2 Nfia
  TTCCAAACTTGGCAAGGGCGAGAGAGGGCGA
  CLASS: OREGEC00001  TYPE: OREGET00003  SUBTYPE: OREGES00033
  CLASS: OREGEC00001  TYPE: OREGET00003  SUBTYPE: OREGES00015
  CLASS: OREGEC00002  TYPE: OREGET00001  SUBTYPE: OREGES00003
  CLASS: OREGEC00002  TYPE: OREGET00001  SUBTYPE: OREGES00003
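The gap between the two curators' records for 3038906 can be put into numbers. A minimal sketch using the evidence codes and sequences listed on this slide; treating each (class, type, subtype) triple as one set element and using Jaccard overlap are assumptions of this example, not the procedure used in the case study.

```python
# Evidence triples (class, type, subtype) as listed for paper 3038906.
curator_a = {
    ("OREGEC00001", "OREGET00003", "OREGES00015"),
    ("OREGEC00001", "OREGET00001", "OREGES00003"),
}
curator_b = {
    ("OREGEC00001", "OREGET00003", "OREGES00033"),
    ("OREGEC00001", "OREGET00003", "OREGES00015"),
    # The slide lists this triple twice for Curator B; a set keeps it once.
    ("OREGEC00002", "OREGET00001", "OREGES00003"),
}

seq_a = "TCCAAACTTGGCAAGGGCGAGA"
seq_b = "TTCCAAACTTGGCAAGGGCGAGAGAGGGCGA"

shared = curator_a & curator_b
evidence_jaccard = len(shared) / len(curator_a | curator_b)

# Is the shorter sequence fully contained in the longer one, and where?
a_within_b = seq_a in seq_b
offset = seq_b.find(seq_a)  # 0-based start of seq_a inside seq_b, -1 if absent

print(f"shared evidence triples: {len(shared)}, Jaccard: {evidence_jaccard:.2f}")
print(f"sequence A contained in sequence B: {a_within_b} (offset {offset})")
```

Only one evidence triple is shared (Jaccard 0.25), while Curator A's sequence sits entirely inside Curator B's longer one. The two curators therefore largely agree on the site and differ mainly in how much they extracted.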
REGCREATIVE CASE STUDY: JAMBOREE
• Intensive annotation strategy: face to face with other curators and expert annotators
• Get direct feedback and provide suggestions
• Promote integration of additional aspects in the annotation structure as well as annotated information types
• Populate the database with new annotation records
• Explore efficient curation training strategies
• Create a Gold Standard collection of annotation records, which may be useful for example-based annotation training/evaluation
• Explore demands of biologists / curators on the text mining community -> where it would be useful
REGCREATIVE CASE STUDY: POST-JAMBOREE
• Monitor improvements in annotation consistency
• Allow consistent community-based annotation
• Promote integration of additional aspects in the annotation structure as well as annotated information types
• Increase efficiency in populating the database
• Construction of a text mining training collection
ANNOTATION CONSISTENCY
• For selection as relevant paper for curation
• For the evidence class
• For the evidence types
• For the evidence subtypes
• For the regulated genes
• For the transcription factors
• For cell types
• How to structure comments
• Other aspects: ...
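Consistency can be tracked per annotation field rather than as a single score. A minimal sketch, assuming each curator's output is reduced to one value per field per paper (a simplification, since real records can hold several annotations per paper); the function name, field names and records are hypothetical.

```python
def field_agreement(records_a, records_b, fields):
    """Fraction of shared papers on which two curators gave the same value, per field.

    records_*: dict mapping paper id -> dict mapping field name -> value.
    A field missing from both records of a paper counts as agreement.
    """
    shared = sorted(records_a.keys() & records_b.keys())
    if not shared:
        raise ValueError("the two curators share no papers")
    return {
        field: sum(records_a[p].get(field) == records_b[p].get(field)
                   for p in shared) / len(shared)
        for field in fields
    }


# Hypothetical records for illustration only (not RegCreative data).
curator_a = {
    "paper_1": {"evidence_class": "EC1", "gene": "geneX", "tf": "UNKNOWN"},
    "paper_2": {"evidence_class": "EC1", "gene": "geneY", "tf": "tfQ"},
    "paper_3": {"evidence_class": "EC2", "gene": "geneZ", "tf": "tfR"},
}
curator_b = {
    "paper_1": {"evidence_class": "EC1", "gene": "geneX", "tf": "tfP"},
    "paper_2": {"evidence_class": "EC2", "gene": "geneY", "tf": "tfQ"},
    "paper_3": {"evidence_class": "EC2", "gene": "geneZ", "tf": "UNKNOWN"},
}

agreement = field_agreement(curator_a, curator_b, ["evidence_class", "gene", "tf"])
for field, value in agreement.items():
    print(f"{field}: {value:.2f}")
```

Reporting one number per field makes it visible which parts of the annotation scheme (here, the invented TF column) drive disagreement.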
TEXT MINING TASKS FOR GENE REGULATION EXTRACTION
• Detection of relevant articles: abstracts or full text
• Extraction of a ranked list of regulated genes: mention or normalized gene (database entries)
• Extraction of a ranked list of TFs
• Extraction of a ranked list of evidence type IDs together with name and text passage (sentence)
• Extraction of ranked associations between these genes and TFs
• Extraction of associations to other controlled vocabularies or ontologies
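As a rough illustration of what a "ranked list" output could look like, the sketch below ranks names from a small lexicon by how often they are mentioned in a passage. Ranking by raw mention count, the lexicon and the example passage are all assumptions made for this example; a real system would also need normalization to database entries.

```python
import re
from collections import Counter


def rank_mentions(text, lexicon):
    """Rank lexicon names by number of whole-token, case-insensitive matches in text."""
    counts = Counter()
    for name in lexicon:
        # Do not match inside longer tokens such as "Col1a2-like" or "xNfia".
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if hits:
            counts[name] = hits
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented passage and lexicon, for illustration only.
passage = (
    "Nfia binds the Col1a2 promoter. Mutation of the Nfia site reduced "
    "Col1a2 expression, and Col1a2 reporter activity dropped."
)
print(rank_mentions(passage, ["Col1a2", "Nfia", "Tcf1"]))
# [('Col1a2', 3), ('Nfia', 2)]
```

The same shape of output (name plus score, optionally with the supporting sentence) would apply to the TF and evidence-type lists named on this slide.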