Semantic distance in WordNet:
An experimental, application-oriented
evaluation of five measures
Written by
Alexander Budanitsky
Graeme Hirst
Retold by
Keith Alcock
Definitions
Semantic relatedness
– General term involving many relationships
– Examples: car-wheel (meronymy), hot-cold (antonymy), pencil-paper (functional), penguin-Antarctica (association)
Semantic similarity
– More specific term involving likeness
– Example: bank-trust company (synonymy)
Distance
– Inverse of either one
– reldist(x) = semantic relatedness⁻¹(x)
– simdist(x) = semantic similarity⁻¹(x)
Evaluation
Theoretical examination
Comparison with human judgment
– Coarse filter
– Lack of data
Performance in NLP applications
– Many different applications (with potentially conflicting results): word sense disambiguation, discourse structure, text summarization and annotation, information extraction and retrieval, automatic indexing, and automatic correction of word errors in text
Equation: Hirst–St-Onge

\mathrm{rel}_{HS}(c_1, c_2) = C - \mathrm{len}(c_1, c_2) - k \cdot d

(equivalently: \mathrm{rel}_{HS}(c_1, c_2) = k_1 - \mathrm{len}(c_1, c_2) - k_2 \cdot \mathrm{dirChanges}(c_1, c_2), with k_1 = C and k_2 = k)

c_1, c_2: synsets
len(c_1, c_2): length of the path between c_1 and c_2
d = dirChanges(c_1, c_2): number of changes of direction in the path
C, k: constants
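A minimal Python sketch of this formula, assuming the path length and the number of direction changes have already been extracted from WordNet; the default values C = 8 and k = 1 are illustrative only, not prescribed by the slide.

```python
def rel_hs(path_length, dir_changes, C=8, k=1):
    """Hirst-St-Onge relatedness: rel_HS = C - len(c1, c2) - k * d.

    path_length: number of links on the path between the two synsets
    dir_changes: number of changes of direction (d) along that path
    C, k: constants (C = 8, k = 1 are example values only)
    """
    return C - path_length - k * dir_changes

# A short, straight path scores higher than a long, twisty one.
print(rel_hs(path_length=2, dir_changes=0))  # 6
print(rel_hs(path_length=5, dir_changes=2))  # 1
```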
Equation: Leacock–Chodorow

\mathrm{sim}_{LC}(c_1, c_2) = -\log \frac{\mathrm{len}(c_1, c_2)}{2D} = \log 2 + \log D - \log \mathrm{len}(c_1, c_2)

c_1, c_2: synsets
len(c_1, c_2): length of the path between c_1 and c_2
D: overall depth of the taxonomy
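A similar sketch for Leacock–Chodorow; the taxonomy depth D = 16 in the example calls is just a placeholder value, not one taken from the slides.

```python
import math

def sim_lc(path_length, D):
    """Leacock-Chodorow similarity: -log(len(c1, c2) / (2 * D)),
    where D is the overall depth of the taxonomy."""
    return -math.log(path_length / (2 * D))

# Shorter paths give larger similarity scores.
print(sim_lc(path_length=1, D=16))   # ~3.47
print(sim_lc(path_length=10, D=16))  # ~1.16
```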
Equation: Resnik

\mathrm{sim}_{R}(c_1, c_2) = -\log p(\mathrm{lso}(c_1, c_2))

c_1, c_2: synsets
p(x): probability of encountering x in a specific corpus
lso(x, y): lowest super-ordinate of x and y
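A sketch of the information-content idea; the probability table and the choice of lowest super-ordinate below are made up for illustration and would in practice come from corpus counts over the WordNet hierarchy.

```python
import math

# Hypothetical corpus probabilities, for illustration only.
p = {"coin": 1e-4, "nickel": 1e-6, "dime": 1e-6}

def sim_resnik(p_lso):
    """Resnik similarity: -log p(lso(c1, c2)),
    i.e. the information content of the lowest super-ordinate."""
    return -math.log(p_lso)

# If "coin" were the lowest super-ordinate of "nickel" and "dime":
print(sim_resnik(p["coin"]))  # ~9.21
```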
Equation: Jiang–Conrath

\mathrm{dist}_{JC}(c_1, c_2) = 2 \log p(\mathrm{lso}(c_1, c_2)) - \bigl(\log p(c_1) + \log p(c_2)\bigr)

c_1, c_2: synsets
p(x): probability of encountering x in a specific corpus
lso(x, y): lowest super-ordinate of x and y

Equivalently: \mathrm{simdist}_{JC}(c_1, c_2) = \log \frac{p^2(\mathrm{lso}(c_1, c_2))}{p(c_1)\, p(c_2)}
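A companion sketch for the Jiang–Conrath distance, using the same kind of made-up probabilities; note that this is a distance, so smaller values mean more closely related synsets.

```python
import math

def dist_jc(p_c1, p_c2, p_lso):
    """Jiang-Conrath distance:
    2 * log p(lso(c1, c2)) - (log p(c1) + log p(c2))."""
    return 2 * math.log(p_lso) - (math.log(p_c1) + math.log(p_c2))

# With p(lso) = 1e-4 and p(c1) = p(c2) = 1e-6 (made-up values):
print(dist_jc(1e-6, 1e-6, 1e-4))  # ~9.21
```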
Equation: Lin

\mathrm{sim}_{L}(c_1, c_2) = \frac{2 \log p(\mathrm{lso}(c_1, c_2))}{\log p(c_1) + \log p(c_2)}

c_1, c_2: synsets
p(x): probability of encountering x in a specific corpus
lso(x, y): lowest super-ordinate of x and y

Equivalently: \mathrm{sim}_{L}(c_1, c_2) = \frac{\log p^2(\mathrm{lso}(c_1, c_2))}{\log\bigl(p(c_1)\, p(c_2)\bigr)}
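Lin's measure turns the same quantities into a ratio between 0 and 1; a short sketch with the same kind of made-up probabilities:

```python
import math

def sim_lin(p_c1, p_c2, p_lso):
    """Lin similarity: 2 * log p(lso(c1, c2)) / (log p(c1) + log p(c2)).
    Ranges from 0 (unrelated) to 1 (identical concepts)."""
    return (2 * math.log(p_lso)) / (math.log(p_c1) + math.log(p_c2))

# Same made-up probabilities as in the Resnik and Jiang-Conrath sketches:
print(sim_lin(1e-6, 1e-6, 1e-4))  # ~0.667
```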
Calibration: Step 1
Rubenstein & Goodenough (1965)
– Humans judged semantic synonymy
– 51 subjects
– 65 pairs of words
– 0 to 4 scale
Miller & Charles (1991)
– Different humans, subset of words
– 38 subjects
– 30 pairs of words
– 10 low (0-1), 10 medium (1-3), 10 high (3-4)
Calibration: Step 2
[Calibration figure: scatter plot of similarity as calculated (vertical axis, 0-10) against similarity as judged by humans (horizontal axis, 0-5)]
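The calibration step amounts to checking how well a measure's computed scores track the human judgments; one way to quantify that, sketched below with invented numbers rather than the actual Rubenstein & Goodenough or Miller & Charles data, is a simple correlation coefficient.

```python
# Invented example data: human judgments (0-4 scale) and a measure's output
# for the same word pairs; real calibration would use the R&G / M&C pairs.
human    = [3.92, 3.84, 3.05, 1.68, 0.84, 0.08]
computed = [9.10, 8.70, 6.20, 3.90, 2.10, 0.40]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(pearson(human, computed), 3))
```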
Testing: Simulation
Malapropism
– Real-word spelling error
– *He lived on a diary farm.
– When, after insertion, deletion, or transposition of intended letters, a real word results (see the sketch after this list)
Material
– 500 articles from Wall Street Journal corpus
– 1 in 200 words replaced with a spelling variation
– 1408 malapropisms
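A rough sketch of how such malapropisms can be generated, assuming a dictionary to check candidates against; the tiny LEXICON below is a stand-in for a full word list, and the exact procedure used to produce the 1408 malapropisms may differ.

```python
import string

# Stand-in lexicon; the real procedure would use a full dictionary.
LEXICON = {"dairy", "diary", "farm"}

def real_word_variants(word):
    """Spelling variations of `word` (one insertion, deletion, or
    transposition of letters) that happen to be real words."""
    variants = set()
    for i in range(len(word) + 1):                              # insertions
        for ch in string.ascii_lowercase:
            variants.add(word[:i] + ch + word[i:])
    for i in range(len(word)):                                  # deletions
        variants.add(word[:i] + word[i + 1:])
    for i in range(len(word) - 1):                              # transpositions
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    return (variants & LEXICON) - {word}

print(real_word_variants("dairy"))  # {'diary'}
```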
Testing: Assumptions
The writer's intended word will be semantically related to nearby words
A malapropism is unlikely to be semantically related to nearby words
An intended word that is not related is unlikely to have a spelling variation that is related to nearby words
Testing: Suspicion
A suspect is a word that is unrelated to other nearby words
A true suspect is a suspect that really is a malapropism

P_S = \mathrm{Precision}_S = \frac{\text{number of true suspects}}{\text{number of suspects}}

R_S = \mathrm{Recall}_S = \frac{\text{number of true suspects}}{\text{number of malapropisms in text}}

F\text{-measure}_S = \frac{(\beta^2 + 1)\, P_S R_S}{\beta^2 P_S + R_S}, which for \beta = 1 reduces to \frac{2 P_S R_S}{P_S + R_S}
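A small sketch of these scores computed from raw counts (the counts in the example call are made up); the same arithmetic applies to the detection-level precision, recall, and F-measure on the next slide.

```python
def prf(true_hits, flagged, malapropisms, beta=1.0):
    """Precision, recall, and F-measure from raw counts.

    true_hits:    true suspects (or true alarms)
    flagged:      all suspects (or all alarms)
    malapropisms: malapropisms actually present in the text
    """
    p = true_hits / flagged
    r = true_hits / malapropisms
    f = (beta ** 2 + 1) * p * r / (beta ** 2 * p + r)
    return p, r, f

# Made-up counts: 60 true suspects out of 300 flagged, 120 malapropisms present.
print(prf(60, 300, 120))  # (0.2, 0.5, ~0.286)
```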
Testing: Detection
An alarm is a spelling variation (of a suspect) that is related to nearby words
A true alarm is a malapropism that has been detected

P_D = \mathrm{Precision}_D = \frac{\text{number of true alarms}}{\text{number of alarms}}

R_D = \mathrm{Recall}_D = \frac{\text{number of true alarms}}{\text{number of malapropisms in text}}

F\text{-measure}_D = \frac{(\beta^2 + 1)\, P_D R_D}{\beta^2 P_D + R_D}, which for \beta = 1 reduces to \frac{2 P_D R_D}{P_D + R_D}
Results: Suspicion
Results: Detection
Conclusion
Measures are significantly different
– simdist_JC on a single paragraph is best (18% precision, 50% recall)
– rel_HS is worst
Relatedness doesn't outperform similarity
WordNet gives obscure senses the same prominence as more frequent senses
Discussion
Calibration of relatedness with similarity data
Calibration point inaccurate
Substitution errors untested
Semantic bias in human typing errors not addressed
Binary threshold not best choice
Frequency on synset, word, or word sense