Automatic generation of UW Dictionary through WordNet


Automatic generation of UW
Dictionary through WordNet
By
Nitin Verma
Under the guidance of
Prof. Pushpak Bhattacharyya
Department of Computer Science and Engineering
Indian Institute of Technology
Bombay
June 1, 2002
Outline of the talk
Introduction
Universal Word Dictionary and its use in
Machine Translation
WordNet
  • Synsets (basic building blocks of WordNet).
  • Relations in WordNet.
Extraction of semantic attributes through
WordNet.
Conclusion.
Introduction
Importance of Universal Word Dictionary
  • Used by both the “enconverter” and “deconverter” software of the UNL.
Problem with its construction
  • Requires tremendous manual effort.
  • Takes a large amount of time.
Automatic generation of UW-Dictionary through WordNet.
Universal Word Dictionary
Plays an important role in machine translation.
It contains the mapping between head words
and their corresponding Universal Words.
The Format of a UW-Dictionary entry
[HW] “UW(restriction)”(ATTR1, ATTR2,…)<FLG,FRE,PRI>
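
As a minimal sketch of the entry format above, the following Python dataclass renders one dictionary line. The class name `UWEntry`, its field names, and the `render` method are assumptions made for illustration. The `{}` after the head word follows the sample entries shown in this talk, and the `<FLG,FRE,PRI>` block is treated as three opaque strings that are emitted only when supplied.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class UWEntry:
    """One UW-Dictionary line (hypothetical helper, not part of any UNL tool)."""
    head_word: str                      # HW
    universal_word: str                 # UW including its restriction
    attributes: List[str] = field(default_factory=list)  # ATTR1, ATTR2, ...
    # Optional (FLG, FRE, PRI) triple, kept as opaque strings (assumption).
    trailer: Optional[Tuple[str, str, str]] = None

    def render(self) -> str:
        attrs = ", ".join(self.attributes)
        # "{{}}" in an f-string produces the literal "{}" seen in the samples.
        line = f'[{self.head_word}]{{}}"{self.universal_word}"({attrs})'
        if self.trailer is not None:
            line += "<" + ",".join(self.trailer) + ">"
        return line


entry = UWEntry(
    head_word="mango",
    universal_word="mango(icl>plant{>living thing})",
    attributes=["N", "ANIMT", "FLORA", "TREE"],
)
print(entry.render())
```

Keeping the attributes as a list rather than a pre-joined string makes it easy for a generator to append attributes one rule at a time.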
Extraction of Semantic attributes
through WordNet
[Figure: head words (machine, mango, ...) are passed through WordNet and the Attribute Generator, which output entries such as the following.]
[machine]{}"machine(icl>solid{>matter})"(N, INANI, OBJCT, ARTFCT)
[machine]{}"machine(icl>organization{>group})"(N, INANI, GRP)
[machine]{}"machine(icl>living thing{,icl>volitional thing})"(N, ANIMT, FAUNA, MML, PRSN)
[machine]{}"machine(icl > --)"(V, VOA-CRTE)
…………………..
[mango]{}"mango(icl>plant{>living thing})"(N, ANIMT, FLORA, TREE)
[mango]{}"mango(icl>matter{>concrete thing})"(N, INANI, OBJCT, EDBL, STE, PHSCL, SLD)
Extraction of Semantic attributes
through WordNet(cont’d)
There is a rule for generating each attribute.
Most noun and verb attributes can be
generated using the hypernymy relation
of WordNet.
Attributes for adjectives and adverbs
can be generated using the synonymy
relation.
Algorithm for generating noun
Attributes
  • In WordNet, animate things are represented by the synset {organism,
    being, living thing}.
  • If a word sense has “organism” among its hypernyms, we generate the
    ANIMT attribute for it; otherwise we generate INANI.
  • For example, the wood sense of “teak” never reaches {organism, being,
    living thing}, so it gets INANI:

teak, teakwood
  => wood
    => material, stuff
      => substance, matter
        => object, physical object
          => entity
Algorithm for generating noun
Attributes (cont’d)

The tree sense of “teak” does reach {organism, being, living thing}, so it gets ANIMT:

teak, Tectona grandis
  => tree
    => woody plant, ligneous plant
      => plant, flora
        => organism, being, living thing
          => entity
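
The ANIMT/INANI rule can be sketched in a few lines of Python. To stay self-contained, this sketch does not query WordNet itself: the `HYPERNYM` table is a toy copy of the two "teak" chains from the slides, and it assumes each synset has a single parent, which real WordNet does not guarantee. The names `HYPERNYM`, `ANIMATE_ROOT`, `hypernym_chain`, and `animacy_attribute` are hypothetical.

```python
# Toy hypernym table copied from the two "teak" chains on the slides.
# Each synset is a tuple of its words and maps to its (single) parent synset.
HYPERNYM = {
    ("teak", "teakwood"): ("wood",),
    ("wood",): ("material", "stuff"),
    ("material", "stuff"): ("substance", "matter"),
    ("substance", "matter"): ("object", "physical object"),
    ("object", "physical object"): ("entity",),
    ("teak", "Tectona grandis"): ("tree",),
    ("tree",): ("woody plant", "ligneous plant"),
    ("woody plant", "ligneous plant"): ("plant", "flora"),
    ("plant", "flora"): ("organism", "being", "living thing"),
    ("organism", "being", "living thing"): ("entity",),
}

# The synset that marks a sense as animate.
ANIMATE_ROOT = ("organism", "being", "living thing")


def hypernym_chain(synset):
    """Return the synset followed by all of its ancestors up to the root."""
    chain = [synset]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def animacy_attribute(synset):
    """ANIMT if the animate synset is on the hypernym chain, else INANI."""
    return "ANIMT" if ANIMATE_ROOT in hypernym_chain(synset) else "INANI"


print(animacy_attribute(("teak", "teakwood")))         # wood sense
print(animacy_attribute(("teak", "Tectona grandis")))  # tree sense
```

Because the rule is applied per sense, the same head word can receive ANIMT in one dictionary entry and INANI in another.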
Extraction of Semantic attributes
(cont’d)

An alternative for generating verb attributes:
  • All verbs in the change category are stored in the verb.change
    lexical file, so lexical file names can serve as a clue for
    generating verb attributes.
  • The verb lexical file names are listed below:
verb.change
verb.cognition
verb.communication
verb.competition
verb.consumption
verb.contact
verb.creation
verb.emotion
verb.motion
verb.perception
verb.possession
verb.social
verb.stative
verb.weather
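
The lexical-file idea can be sketched as a simple table lookup. This is only an illustration under stated assumptions: the talk does not give the actual mapping from lexical files to attributes, so `LEXFILE_TO_ATTRIBUTE` is a hypothetical table with a single guessed pairing (verb.creation with the VOA-CRTE attribute that appears in the sample "machine" verb entry). The function name `verb_attributes` is also hypothetical, and the file names are spelled in WordNet's lowercase form.

```python
# Verb lexical file names from the list above.
VERB_LEXICAL_FILES = [
    "verb.change", "verb.cognition", "verb.communication",
    "verb.competition", "verb.consumption", "verb.contact",
    "verb.creation", "verb.emotion", "verb.motion", "verb.perception",
    "verb.possession", "verb.social", "verb.stative", "verb.weather",
]

# ASSUMPTION: hypothetical lookup table. Only one pairing is filled in,
# and even that pairing is a guess; the real rules would supply the rest.
LEXFILE_TO_ATTRIBUTE = {
    "verb.creation": "VOA-CRTE",
}


def verb_attributes(lexical_file):
    """Return the attribute list for a verb sense stored in lexical_file."""
    if lexical_file not in VERB_LEXICAL_FILES:
        raise ValueError(f"not a verb lexical file: {lexical_file}")
    attrs = ["V"]  # every verb entry carries the part-of-speech attribute
    extra = LEXFILE_TO_ATTRIBUTE.get(lexical_file)
    if extra is not None:
        attrs.append(extra)
    return attrs


print(verb_attributes("verb.creation"))
print(verb_attributes("verb.motion"))
```

A lookup of this kind needs no traversal of the hypernymy hierarchy, which is what makes the lexical file name a cheap first clue.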