Introduction to Computational Linguistics

Introduction to
Computational Linguistics
Lecture 2
What is Computational Linguistics?
• CL: Computational Linguistics, NLP: Natural
Language Processing, NLE: Natural Language
Engineering, HLT: Human Language Technology
etc.
• Formal definition: CL is a discipline between
linguistics and computer science which is
concerned with the computational aspects of
human language (Uszkoreit, 2000).
Scientific and Technological
Aspects of CL
• Humans use natural language to communicate
– Formal Theories
– Linguistic Knowledge
• Is linguistic information helpful for doing Natural
Language Processing?
• How can machines communicate with humans?
– CL focuses on the practical outcome of modeling human
language
– The main obstacle between humans and computers is
communication
– Computational linguistics develops formal models of
human language and implements them as programs.
Linguistic Knowledge
Focusing on Words
• Phonology: sounds (plural -s in cats vs. dogz), homophones
(bare/bear), rhythm and stress (CONvert, conVERT)
• Morphology: related word forms (e.g., plural)
• Syntax: how to use the word in a sentence
• Lexical Meaning: meaning of words
• Compositional Semantics: the construction of
the meaning of complex words from the meanings
of their parts (e.g., untruthfulness)
Words and Sentences Identification
• String x is a word in a text if and only if x is
delimited by white space.
• In order to tell whether a string is a word, look
it up in a dictionary.
• String w is a sentence in a text if and only if w
starts with a capital letter and finishes with a
full stop.
Example
Next week, Mr. Ali will visit our department
and he is planning to provide an amount of
Rs. 12,000,000 to our bright students for
their further studies. His company ‘XYZ’
has a huge name in the field of
constructions.
• Plurals (chairs’)
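The whitespace rule for words and the capital-letter/full-stop rule for sentences can be tried directly on the example text. The following is a minimal sketch, not a real tokenizer; the function names (naive_words, naive_chunks, is_naive_sentence) are made up for illustration.

```python
# Example text from the slide, joined into one string.
EXAMPLE_TEXT = (
    "Next week, Mr. Ali will visit our department and he is planning "
    "to provide an amount of Rs. 12,000,000 to our bright students for "
    "their further studies. His company ‘XYZ’ has a huge name in the "
    "field of constructions."
)


def naive_words(text):
    """Naive rule: a word is any string delimited by white space."""
    return text.split()


def naive_chunks(text):
    """Cut the text after every token that ends in a full stop."""
    chunks, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith("."):
            chunks.append(" ".join(current))
            current = []
    if current:  # trailing material without a final full stop
        chunks.append(" ".join(current))
    return chunks


def is_naive_sentence(chunk):
    """Naive rule: starts with a capital letter and ends with a full stop."""
    return chunk[:1].isupper() and chunk.endswith(".")


if __name__ == "__main__":
    print(naive_words(EXAMPLE_TEXT))
    for chunk in naive_chunks(EXAMPLE_TEXT):
        print(is_naive_sentence(chunk), "|", chunk)
```

Under these rules punctuation stays glued to words ("week,"), and the abbreviations "Mr." and "Rs." are treated as sentence ends, so the two real sentences are cut into four pieces. This is why practical tokenizers need abbreviation lists or learned models.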
Major Syntactic Constituents
• Noun Phrase (NP): referring expressions
• Verb Phrase (VP): verbs plus complements
• Prepositional Phrase (PP): direction, location etc.
• Adjectival Phrase (AdjP): complemented adjectives
• Adverbial Phrase (AdvP): modified adverbs (very
rapidly)
• Complementizers (Comp): (that, whether)
Examples
• The man saw a frog with a telescope
• The mouse was caught by the cat sat on
the table.
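The first example is structurally ambiguous: the PP "with a telescope" can attach to the verb phrase (the man used the telescope) or to the noun phrase (the frog has the telescope). A small sketch can count the analyses with a CKY-style chart. The toy grammar and lexicon below are assumptions made for illustration and cover only this sentence.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (assumed for illustration).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attached to a noun phrase
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attached to a verb phrase
    ("PP", "P", "NP"),
]

# Toy lexicon (assumed for illustration).
LEXICON = {
    "the": ["Det"], "a": ["Det"],
    "man": ["N"], "frog": ["N"], "telescope": ["N"],
    "saw": ["V"],
    "with": ["P"],
}


def count_parses(tokens, start="S"):
    """Count distinct parse trees for tokens using a CKY chart."""
    n = len(tokens)
    chart = defaultdict(int)  # (i, j, label) -> number of trees over tokens[i:j]
    for i, token in enumerate(tokens):
        for tag in LEXICON.get(token.lower(), []):
            chart[(i, i + 1, tag)] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += (
                        chart[(i, k, left)] * chart[(k, j, right)]
                    )
    return chart[(0, n, start)]


if __name__ == "__main__":
    sentence = "The man saw a frog with a telescope"
    print(count_parses(sentence.split()))
```

With this grammar the sentence receives two analyses, one per attachment site, while the shorter "The man saw a frog" receives one.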
Applications
• Automatic Tokenization
• Automatic Part of Speech Tagger
• Named Entity Recognition System
• Machine Translation
– Word Sense Disambiguation
– Example: River bank erosion is a growing problem.
High-street bank is performing well to provide financial
solutions.
• Question Answering System
– QED: The Edinburgh TREC-2003 Question Answering System.
Available at
(www.iccs.informatics.ed.ac.uk/~s0239229/documents/Leidneretal-2003-TREC.pdf)
• Query Searching (in search engines like Google)
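The two "bank" sentences in the word sense disambiguation example can be told apart by looking at the surrounding words. The sketch below uses a simplified overlap method in the spirit of the Lesk algorithm. The sense labels and signature word sets are hypothetical and hand-written for this example; a real system would draw them from a dictionary or a corpus.

```python
import re

# Hypothetical signature words for two senses of "bank" (illustrative only).
SENSE_SIGNATURES = {
    "bank (river side)": {"river", "water", "erosion", "shore", "flood"},
    "bank (financial institution)": {
        "money", "financial", "loan", "account", "high-street",
    },
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens, allowing inner hyphens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def disambiguate(sentence):
    """Pick the sense whose signature shares the most words with the sentence."""
    context = set(tokenize(sentence))
    scores = {
        sense: len(context & signature)
        for sense, signature in SENSE_SIGNATURES.items()
    }
    best = max(scores, key=scores.get)
    # No overlap at all means the context gives no evidence either way.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(disambiguate("River bank erosion is a growing problem."))
    print(disambiguate(
        "High-street bank is performing well to provide financial solutions."
    ))
```

A machine translation system needs this step because the two senses are usually translated by different words in the target language.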