Computational Linguistics


LELA300431
Language and Computers
Harold Somers
Professor of Language Engineering
1/16
Computers and Language?
• Getting computers to handle language in a natural way
– As part of the interface with the user (e.g. alternatives to keyboard/mouse
input, text output, support for the disabled)
– For consulting databases to get information (e.g. library catalogue, train
timetable, banking)
– As a task in itself: specifically linguistic purposes (e.g. dictation,
translation, summarizing, report writing, language teaching)
• Using computers to “do” linguistics
• Synonyms (or components?)
– Natural Language Processing
– Computational Linguistics
– Language Engineering
• Basic tools, techniques and models
• Applications
2/16
Syllabus
• Survey of applications
• Elements of language, levels of linguistic processing
– Sound (speech recognition, synthesis)
– Writing (OCR, handwriting, character sets, spelling)
– Words (lexicon, morphology)
– Sentences (syntax)
– Semantics, Pragmatics
• Branches of language/linguistics
– Psycholinguistics and AI
– Applied linguistics (teaching, translation)
– Information retrieval
– Or: your choice of anything else relevant (historical, socio, etc.)
3/16
Lectures: Mon 10am Newman/G16; Fri 10am AlanTuring/G205

1. Introduction (Sept 24, 28)
   Mon: Introduction, explanation of syllabus; What makes language hard for computers?
   Fri: Applications: overview
2. Phonetics (Oct 1, 5)
   Mon: Speech recognition
   Fri: Speech synthesis
3. Writing (Oct 8, 12)
   Mon: Character sets, Unicode, input methods
   Fri: Spell checkers, grammar checkers
4. Words (Oct 15, 19)
   Mon: Dictionaries, Thesauri
   Fri: WordNet
5. Morphology (Oct 22, 26*)
   Mon: Morphology
   Fri: Tagging
READING WEEK
6. Syntax (Nov 5, 9)
   Mon: Parsing I
   Fri: Parsing II
7. Semantics, Pragmatics (Nov 12, 16)
   Mon: Dialogue understanding
   Fri: Text generation, Cooperative responses
8. Branches (Nov 19, 23)
   Mon: Psycholinguistics: language and AI
   Fri: Applied linguistics: CALL
9. Applied linguistics (Nov 26, 30*)
   Mon: CALL
   Fri: Translation aids
10. Translation (Dec 3, 7)
   Mon: Machine translation I
   Fri: Machine translation II
11. Information retrieval (Dec 10, 14)
   Mon: Text retrieval; Summarization
   Fri: Question answering, text mining

* These lectures to be rearranged due to HS’s absence
4/16
Assessment
Examination: 50 multiple choice (or short phrase) answers
Essay: 3000-word essay, due at the start of the exam period
• Choose a particular NLP application, and explain the difficulties that
natural language poses to the computer, and how (or whether) they are
addressed.
• How is the study of language relevant to NLP? Focus on one or two
areas of NLP only.
• Choose one branch of language study (e.g. historical, sociolinguistics,
psycholinguistics, child language, applied linguistics, etc.) and discuss
what role the computer, and in particular computational linguistics,
could play.
(Note that all involve taking the lectures as a starting point and going into
more detail, based on your own “research”)
5/16
What is NLP?
[Diagram: NLP shown among related fields: Language and Linguistics, Logic,
Philosophy, Psychology, Artificial Intelligence, Phonetics, Signal Processing,
Electrical Engineering, HCI, Language Engineering, Computer Science]
6/16
Language and AI
• “language ability” is an integral part of
Artificial Intelligence, which relates to
robotics
• Computers in SciFi use language readily:
how realistic is this? What are the
problems?
• HAL in “2001: A Space Odyssey” (1968)
7/16
How realistic is HAL?
Non-linguistic functions include...
• monitoring and controlling the spaceship
• playing chess
• vision
• general reasoning about the world …
• especially this particular mission
8/16
HAL’s use of language
He has to use language to communicate with
the crew, including ...
• chatting sociably
• discussing general and specific conditions
in the spaceship
• understanding commands
• initiating conversations, whether “work” or
“play”
9/16
What is language?
• spoken vs. written language
• speaker vs. hearer = production vs. analysis
• different levels of language
• different functions of language
10/16
Computer speech
• for natural-sounding speech, the computer must
get individual sounds right, but also
combine them correctly
• intonation
• stress (pitch, loudness, length)
• pace (pauses can be significant)
11/16
Speech understanding
• signal processing (acoustic physics)
• separating speech from background noise
• recognizing individual speech sounds
(humans can make very fine distinctions)
It’s hard to wreck a nice beach (= It’s hard to recognize speech)
What dime’s a neck’s drain to stop port? (= What time’s the next train to Stockport?)
• variability in human voices
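
One way to see why recognizing sounds is not enough: the same sound sequence can often be split into words in more than one way. The sketch below is only an illustration. It uses a made-up four-word pronunciation lexicon with informal ASCII sound labels, and a simpler example than the slide's (I scream / ice cream), because that pair matches sound for sound, whereas the slide's examples are only near matches. Real recognizers work with probabilities over noisy audio, not exact lookup.

```python
# Hypothetical toy lexicon: word -> sequence of informal sound labels.
PRONUNCIATIONS = {
    "I": ("ay",),
    "ice": ("ay", "s"),
    "scream": ("s", "k", "r", "iy", "m"),
    "cream": ("k", "r", "iy", "m"),
}

def segmentations(phones):
    """Return every way of splitting the sound sequence into lexicon words."""
    phones = tuple(phones)
    if not phones:
        return [[]]  # one way to segment nothing: the empty word list
    results = []
    for word, pron in PRONUNCIATIONS.items():
        # Does this word's pronunciation match the start of the input?
        if phones[:len(pron)] == pron:
            for rest in segmentations(phones[len(pron):]):
                results.append([word] + rest)
    return results

for words in segmentations("ay s k r iy m".split()):
    print(" ".join(words))
```

Both word sequences are consistent with the sounds, so something beyond acoustics (grammar, context, subject matter) has to choose between them.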
12/16
Problems with language in general
• Words are ambiguous (bank, round, take)
• Sentences are ambiguous
The chicken is ready to eat
Visiting relatives can be boring
End to free school looms
The man saw the girl with a telescope
Remove bulb, cover, and replace
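
The telescope sentence can be made concrete with a small parser that counts how many structures a grammar assigns to a sentence. This is a minimal sketch using CKY-style chart counting. The grammar and lexicon are toy assumptions invented for this example, not taken from the lecture. The phrase "with a telescope" can attach either to the verb phrase (the seeing was done with a telescope) or to the noun phrase (the girl has a telescope).

```python
from collections import defaultdict

# Toy grammar in binary-branching form (illustrative assumption).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to a noun phrase
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to a verb phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "the": ["Det"], "a": ["Det"],
    "man": ["N"], "girl": ["N"], "telescope": ["N"],
    "saw": ["V"], "with": ["P"],
}

def count_parses(sentence, root="S"):
    """Count the distinct parse trees for the sentence under the toy grammar."""
    words = sentence.lower().split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, word in enumerate(words):
        for cat in LEXICON.get(word, []):
            chart[(i, i + 1, cat)] += 1
    # Build longer spans from pairs of shorter adjacent spans.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, root)]

print(count_parses("The man saw the girl with a telescope"))
```

Under this grammar the sentence receives two analyses, one per attachment, while "the man saw the girl" receives one. The grammar alone cannot say which reading was intended.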
13/16
Pragmatic problems
• We don’t always say what we mean
Can you pass the salt?
It’s cold in here, isn’t it?
I’m sorry (= Say it again)
Do you want some more? You’re alright.
• We don’t always mean what we say
It’s raining cats and dogs
I could murder a sandwich
14/16
Solutions
• Linguistics (“grammar”) can often tell us which interpretations are
possible, including limited aspects of meaning, e.g. The man saw the girl with a hat
• Restricted domain: if we know what the subject is, a lot of ambiguity
disappears
• Context
• Real-world knowledge
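
The restricted-domain point can be sketched as a word-sense lookup filtered by subject area. The sense inventory and domain labels below are invented for illustration. A real system would draw on a proper lexical resource and would rarely have such clean labels.

```python
# Hypothetical sense inventory: word -> {domain label: gloss}.
SENSES = {
    "bank": {
        "banking": "financial institution",
        "geography": "side of a river",
        "aviation": "tilt of an aircraft in a turn",
    },
}

def senses(word, domain=None):
    """Return candidate glosses for a word, optionally limited to one domain."""
    entries = SENSES.get(word, {})
    if domain is None:
        return sorted(entries.values())
    return [gloss for label, gloss in entries.items() if label == domain]

print(senses("bank"))             # all candidate senses
print(senses("bank", "banking"))  # only the sense relevant to the domain
```

With no domain, "bank" has three candidate senses here. Once the application is known to be about banking, only one remains.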
15/16
Next up
• A whistle-stop tour of applications and what
they involve
• Then down to business!
16/16