Standardization of Lexicon
Team Members:
Jaya Saraswati
Gajanan K. Rane
Kunal K. Patel
INTRODUCTION:
The dictionary is the major source of
information in the Enconversion and
Deconversion processes
The current Hindi Dictionary contains
about 80,000 common words and there
are about 200 Morphological,
Grammatical and Semantic Attributes
FORMAT OF THE DICTIONARY:
[HW]{} "UW(icl>restriction)" (attributes);
HW: HeadWord
UW: Universal Word
attributes: Grammatical, Morphological
and Semantic Attributes
e.g.: [Am]{} "mango(icl>fruit)" (N,MALE,EDBL,OBJCT,INANI,Na);
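The entry format above can be split into its three parts mechanically. The following is a minimal sketch, assuming one entry per line and no nested braces or quotes inside a field; the names ENTRY_RE, Entry and parse_entry are hypothetical and not part of any existing UNL tool.

```python
import re
from typing import NamedTuple

# Pattern for one entry: [HW]{} "UW(icl>restriction)" (attr1,attr2,...);
ENTRY_RE = re.compile(
    r'^\[(?P<hw>[^\]]*)\]\{[^}]*\}\s*'   # [HeadWord]{}
    r'"(?P<uw>[^"]*)"\s*'                # "Universal Word"
    r'\((?P<attrs>[^)]*)\)\s*;$'         # (attributes);
)

class Entry(NamedTuple):
    headword: str
    universal_word: str
    attributes: tuple

def parse_entry(line: str) -> Entry:
    """Split one dictionary line into headword, UW and attribute list."""
    # Normalise curly quotes, which can appear when entries are copied from slides.
    line = line.strip().replace("\u201c", '"').replace("\u201d", '"')
    m = ENTRY_RE.match(line)
    if m is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = tuple(a.strip() for a in m.group("attrs").split(",") if a.strip())
    return Entry(m.group("hw"), m.group("uw"), attrs)

entry = parse_entry('[Am]{} "mango(icl>fruit)"(N,MALE,EDBL,OBJCT,INANI,Na);')
print(entry.headword, entry.universal_word, entry.attributes)
```

The parser accepts the entry with or without a space before the attribute list, since both spellings occur in the examples.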
THE NEED FOR STANDARDIZING THE
DICTIONARIES:
The dictionary contains Universal Words
which represent concepts present in all the
languages
Currently, the dictionaries contain
different restrictions for the same concept
The semantic attributes also differ
from one dictionary to another
e.g.: The boy is running
English Dictionary –
[run]{} "run(icl>walk)" (V,VINT);
[boy]{} "boy(icl>living thing)" (N,ANI,CONCRETE);
UNL: agt(run(icl>walk), boy(icl>living thing))
Hindi Dictionary –
[xOdZ]{} "run(icl>act)" (V,VINT,Va,VOA-MOT);
[ladZak]{} "boy(icl>person)" (N,MALE,ANIMT,MML,PRSN,NAA);
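The effect of the differing restrictions can be checked with a short sketch. It uses only the four Universal Words from the example and assumes, as a simplification, that two UWs name the same concept when the label before the parenthesis is identical. The helpers label, find_mismatches and agt are hypothetical.

```python
def label(uw: str) -> str:
    """Return the part of a UW before its restriction, e.g. 'run'."""
    return uw.split("(", 1)[0]

def find_mismatches(dict_a, dict_b):
    """Pair up UWs that share a label but carry different restrictions."""
    by_label = {label(uw): uw for uw in dict_b}
    mismatches = []
    for uw in dict_a:
        other = by_label.get(label(uw))
        if other is not None and other != uw:
            mismatches.append((uw, other))
    return mismatches

def agt(verb_uw: str, noun_uw: str) -> str:
    """Format the agent relation between a verb UW and a noun UW."""
    return f"agt({verb_uw}, {noun_uw})"

english = ["run(icl>walk)", "boy(icl>living thing)"]
hindi = ["run(icl>act)", "boy(icl>person)"]

for pair in find_mismatches(english, hindi):
    print("mismatch:", pair)
print(agt(*english))
print(agt(*hindi))
```

Because both UWs differ, the agt relation built from the English entries does not match the one built from the Hindi entries, although the sentence is the same.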
KNOWLEDGE BASE TO BE USED FOR
STANDARDIZING THE DICTIONARIES
UNU, Tokyo has sent a knowledge base, which is
a hierarchy of concepts
We have created a set of semantic attributes and
these semantic attributes have been incorporated
into the knowledge base
e.g.: “glass” – ARTFCT, OBJCT
Our task is to map each word of the dictionary to the
concepts provided in the knowledge base
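One way to picture a concept hierarchy that carries semantic attributes is sketched below. The three-node fragment (thing, artifact, glass) and the idea that a concept inherits the attributes of its ancestors are assumptions made for illustration only; the actual structure of the knowledge base is not described here. Only the result for "glass" (ARTFCT, OBJCT) comes from the example above.

```python
# Hypothetical fragment: concept -> (parent concept, attributes attached at this node).
KB = {
    "thing": (None, ["OBJCT"]),
    "artifact": ("thing", ["ARTFCT"]),
    "glass": ("artifact", []),
}

def attributes_for(concept, kb=KB):
    """Collect attributes from the concept up to the root, nearest node first."""
    attrs = []
    while concept is not None:
        parent, own = kb[concept]
        for a in own:
            if a not in attrs:
                attrs.append(a)
        concept = parent
    return attrs

print(attributes_for("glass"))
```

Mapping a dictionary word to a concept then amounts to choosing the node it belongs to; the attributes follow from the node.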
CURRENT ACTIVITIES
The dictionary is divided into four parts - Noun,
Verbs, Adjectives and Adverbs
To standardize the Noun part, a program has been
created that lets the user quickly select a
restriction for a dictionary entry
For each restriction selected, the semantic attributes
corresponding to that restriction are also
automatically entered in the dictionary entry
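The automatic attribute entry can be sketched as a table lookup. The table RESTRICTION_ATTRS and the function build_entry are hypothetical, and the choice of EDBL, OBJCT and INANI as the semantic attributes tied to icl>fruit is an assumption drawn from the mango example, not a statement of what the actual program stores.

```python
# Hypothetical table: restriction -> semantic attributes it implies.
RESTRICTION_ATTRS = {
    "icl>fruit": ["EDBL", "OBJCT", "INANI"],
}

def build_entry(headword, concept, restriction, other_attrs, table=RESTRICTION_ATTRS):
    """Format an entry, appending the semantic attributes tied to the restriction."""
    attrs = list(other_attrs)
    for a in table[restriction]:
        if a not in attrs:  # avoid duplicating an attribute already present
            attrs.append(a)
    return f'[{headword}]{{}} "{concept}({restriction})" ({",".join(attrs)});'

# The user picks the restriction; the semantic attributes are filled in automatically.
print(build_entry("Am", "mango", "icl>fruit", ["N", "MALE", "Na"]))
```

In this sketch the semantic attributes are appended after the grammatical and morphological ones, so the order differs from the hand-written mango entry.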
Efforts are being made to automatically
standardize the verb, adjective and adverb
parts of the dictionary
For the Adverb part, adverbs that end
with "-ly" are given the restriction (icl>how),
while those that do not end with "-ly" are
given the restriction (icl>how(obj>thing))
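The adverb rule is simple enough to state as code. This is a minimal sketch, assuming the test is applied to the English label of the Universal Word; the function names and the sample words "quickly" and "often" are illustrative and not taken from the dictionary.

```python
def adverb_restriction(adverb: str) -> str:
    """Pick the restriction for an adverb from its ending."""
    if adverb.lower().endswith("ly"):
        return "icl>how"
    return "icl>how(obj>thing)"

def adverb_uw(adverb: str) -> str:
    """Build the Universal Word for an adverb."""
    return f"{adverb}({adverb_restriction(adverb)})"

for word in ["quickly", "often"]:
    print(adverb_uw(word))
```

A purely suffix-based rule will also catch words that merely happen to end in "ly", so the results would still need a manual check.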
FINAL GOAL
All the dictionaries should have uniform
restrictions and semantic attributes for
similar concepts