
Standardization of Lexicon
Team Members:
Jaya Saraswati
Gajanan K. Rane
Kunal K. Patel
INTRODUCTION:
• The dictionary is the major source of information in the Enconversion and Deconversion processes.
• The current Hindi dictionary contains about 80,000 common words, and there are about 200 morphological, grammatical and semantic attributes.
FORMAT OF THE DICTIONARY:
• [HW]{} "UW(icl>restriction)" (attributes);
  HW: HeadWord
  UW: Universal Word
  attributes: Grammatical, Morphological and Semantic Attributes
• Example: [Am]{} "mango(icl>fruit)" (N,MALE,EDBL,OBJCT,INANI,Na);
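The entry format above can be read mechanically. The following is a minimal sketch in Python, not the project's actual tooling: the names `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical, and the pattern assumes that every entry fits on one line and that the braces after the headword carry nothing that needs to be kept.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    """One dictionary entry (hypothetical container, not from the slides)."""
    headword: str
    universal_word: str
    attributes: tuple


# Matches: [HW]{} "UW(icl>restriction)" (ATTR1,ATTR2,...);
# Curly quotes are accepted as well, since they show up in copied entries.
ENTRY_RE = re.compile(
    r'\[(?P<hw>[^\]]*)\]\s*\{[^}]*\}\s*'
    r'["“”](?P<uw>[^"“”]*)["“”]\s*'
    r'\((?P<attrs>[^()]*)\)\s*;'
)


def parse_entry(line: str) -> Entry:
    """Split one entry line into headword, Universal Word and attribute list."""
    match = ENTRY_RE.search(line)
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = tuple(a.strip() for a in match.group("attrs").split(",") if a.strip())
    return Entry(match.group("hw"), match.group("uw"), attrs)


entry = parse_entry('[Am]{} "mango(icl>fruit)"(N,MALE,EDBL,OBJCT,INANI,Na);')
print(entry.headword, entry.universal_word, entry.attributes)
```

Keeping the attributes as a tuple preserves their order, which makes it straightforward to write an entry back out after it has been standardized.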
THE NEED FOR STANDARDIZING THE DICTIONARIES:
• The dictionary contains Universal Words, which represent concepts present in all languages.
• Currently, the dictionaries contain different restrictions for the same concept.
• The semantic attributes also differ from one dictionary to another.
e.g.: The boy is running
English Dictionary –
[run]{} "run(icl>walk)" (V,VINT);
[boy]{} "boy(icl>living thing)" (N,ANI,CONCRETE);
UNL: agt(run(icl>walk), boy(icl>living thing))
Hindi Dictionary –
[xOdZ]{} "run(icl>act)" (V,VINT,Va,VOA-MOT);
[ladZak]{} "boy(icl>person)" (N,MALE,ANIMT,MML,PRSN,NAA);
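The mismatch in this example can be detected automatically. The sketch below is illustrative only: `split_uw` and `restriction_mismatches` are hypothetical helpers, and pairing entries by the bare word before the parenthesis is a simplifying assumption that ignores words with several senses.

```python
def split_uw(uw):
    """Split 'run(icl>walk)' into ('run', 'icl>walk')."""
    base, sep, rest = uw.partition("(")
    restriction = rest[:-1] if sep and rest.endswith(")") else rest
    return base.strip(), restriction


def restriction_mismatches(uws_a, uws_b):
    """Return {word: (restriction_a, restriction_b)} where the two differ."""
    a = dict(split_uw(uw) for uw in uws_a)
    b = dict(split_uw(uw) for uw in uws_b)
    return {
        word: (a[word], b[word])
        for word in a.keys() & b.keys()
        if a[word] != b[word]
    }


# Universal Words taken from the two dictionaries in the example above.
english = ["run(icl>walk)", "boy(icl>living thing)"]
hindi = ["run(icl>act)", "boy(icl>person)"]

mismatches = restriction_mismatches(english, hindi)
for word, (en, hi) in sorted(mismatches.items()):
    print(f"{word}: English has ({en}), Hindi has ({hi})")
```

Both words are reported, which is the situation standardization is meant to remove: the same concept should carry one restriction in every dictionary.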
KNOWLEDGE BASE TO BE USED FOR STANDARDIZING THE DICTIONARIES
• The UNU, Tokyo, has sent a knowledge base, which is a hierarchy of concepts.
• We have created a set of semantic attributes and incorporated them into the knowledge base.
  e.g.: "glass" – ARTFCT, OBJCT
• Our task is to map each word of the dictionary to a concept provided in the knowledge base.
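One way to picture such a knowledge base is as a parent table plus an attribute table. The sketch below is a toy model under stated assumptions: the real knowledge base format is not described here, the parent links between "boy", "person" and "living thing" are assumed for illustration, and only the "glass" attribute row comes from the example above. The names `PARENT`, `ATTRIBUTES`, `path_to_root` and `attributes_for` are hypothetical.

```python
# Hypothetical fragment of the concept hierarchy (child -> parent).
# The concept names appear in the slides; the links between them are assumed.
PARENT = {
    "boy": "person",
    "person": "living thing",
    "living thing": None,
}

# Semantic attributes attached to concepts. Only the "glass" row is
# taken from the slides; a real table would cover every concept.
ATTRIBUTES = {
    "glass": ("ARTFCT", "OBJCT"),
}


def path_to_root(concept):
    """List the concept followed by each of its ancestors in turn."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT.get(concept)
    return path


def attributes_for(concept):
    """Semantic attributes stored for a concept (empty if none recorded)."""
    return ATTRIBUTES.get(concept, ())


print(path_to_root("boy"))
print(attributes_for("glass"))
```

Under these assumed links, both restrictions seen earlier for "boy" (person and living thing) lie on the same path, so mapping the word to a single node of the hierarchy would settle which one every dictionary uses.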
CURRENT ACTIVITIES
• The dictionary is divided into four parts: Nouns, Verbs, Adjectives and Adverbs.
• To standardize the Noun part, a program has been created that lets the user quickly select a restriction for a dictionary entry.
• For each restriction selected, the corresponding semantic attributes are automatically entered in the dictionary entry.
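The automatic filling of attributes might look roughly like the following. This is a sketch, not the program described above: `SEMANTIC_ATTRIBUTES` and `standardize_noun` are hypothetical names, the table has a single row built from the mango entry, and treating EDBL, OBJCT and INANI as the semantic attributes of that entry is an assumption.

```python
# Hypothetical lookup table: restriction -> semantic attributes.
# The "fruit" row reuses attributes from the mango entry; which of that
# entry's attributes are semantic is an assumption made for illustration.
SEMANTIC_ATTRIBUTES = {
    "fruit": ("EDBL", "OBJCT", "INANI"),
}


def standardize_noun(headword, base, restriction, other_attributes):
    """Build an entry whose semantic attributes follow from the restriction."""
    attributes = list(other_attributes)
    for attr in SEMANTIC_ATTRIBUTES[restriction]:
        if attr not in attributes:  # avoid duplicating an existing attribute
            attributes.append(attr)
    attr_text = ",".join(attributes)
    return f'[{headword}]{{}} "{base}(icl>{restriction})" ({attr_text});'


# The user picks the restriction "fruit"; the rest is filled in.
line = standardize_noun("Am", "mango", "fruit", ("N", "MALE", "Na"))
print(line)
```

Because the semantic attributes come from the table rather than from the person editing the entry, two dictionaries that choose the same restriction end up with the same semantic attributes.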
• Efforts are being made to standardize the Verb, Adjective and Adverb parts of the dictionary automatically.
• For the Adverb part, adverbs that end with "-ly" are given the restriction (icl>how), while those that do not end with "-ly" are given the restriction (icl>how(obj>thing)).
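The adverb rule is simple enough to state as code. In this sketch the function names `adverb_restriction` and `adverb_uw` are hypothetical, the sample words "quickly" and "fast" are illustrative rather than taken from the dictionary, and the suffix test is assumed to apply to the English word used in the Universal Word.

```python
def adverb_restriction(adverb):
    """Pick the restriction for an adverb from its ending, as described above."""
    if adverb.lower().endswith("ly"):
        return "icl>how"
    return "icl>how(obj>thing)"


def adverb_uw(adverb):
    """Build the Universal Word string for an adverb."""
    return f"{adverb}({adverb_restriction(adverb)})"


for word in ("quickly", "fast"):
    print(adverb_uw(word))
```

A plain suffix test would also catch words that end in "ly" without being derived adverbs, so the output of such a rule would still need a review pass.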
FINAL GOAL
All the dictionaries should have uniform restrictions and semantic attributes for similar concepts.