
Historical Perspectives on
Natural Language Processing
Mike Rosner
Dept Artificial Intelligence
[email protected]
April 2008
Outline
• What is NLP?
• What makes natural languages special?
• Classic computational models
– Knowledge Free NLP
– Knowledge Based NLP
• Issues
• Demos
References
• Dan Jurafsky and Jim Martin, Speech and Language Processing, Prentice Hall, 2000.
• www.cs.um.edu.mt/~mros/nlpworld/historical/index.html
What is NLP?
• NLP aims to get computers to process/use
natural language like people.
• Motivation
– performance goal: make computers more
human friendly.
– scientific goal: understand how language
works by building computational models.
Language-Enabled Programs
• Spelling and Style Correction
• Parsing and Generation
• Document Processing
  – Classification
  – Summarisation
  – Retrieval/Extraction
• Translation
• Dialogue
• Question Answering
• Speech
• Multimodal Communication
NLP is Interdisciplinary
• Computer Science +
• Linguistics +
• Artificial Intelligence +
• Software Engineering +
• Signal Processing +
• Knowledge Representation
Overall History
• 1950-1965: Machine Translation
• 1970-1980: Artificial Intelligence
approaches based on semantics
• 1980-1995: Increasingly sophisticated use
of syntactic models
• 1995-Present: Data driven statistical
models.
Machine Translation
Typical Problem:
• Les soldats sont dans le café
• The soldiers are in the coffee. (intended: "The soldiers are in the café")
Natural Languages
What is so Special?
• Not designer languages
• Universal application
• Context dependence
• Linguistic + para-linguistic elements
• Ambiguity
Ambiguity
• Word-sense ambiguity
– The board balanced the decision
– The board balanced the lead weights
• Syntactic Ambiguity
– I shot an elephant in my underpants
• Semantic Ambiguity
– Every girl likes a sailor
• Pragmatic Ambiguity
– Can you pass the salt?
ELIZA
• Joseph Weizenbaum coded ELIZA at MIT during
the years 1964-1966
• The aim was to produce a very simple program,
based on the presence of keywords in the input,
capable of typed NL conversation.
• ELIZA simulates a psychotherapist who reflects
the patient's statements back to him and
encourages the patient to keep talking.
• The role of the computer was carefully chosen to
disguise the program's lack of knowledge.
ELIZA Interaction
• Men are all alike
• IN WHAT WAY?
• They're always bugging us about something or other
• CAN YOU THINK OF A SPECIFIC EXAMPLE?
• Well my boyfriend made me come here
• YOUR BOYFRIEND MADE YOU COME HERE?
• He says I'm depressed much of the time
• I'M SORRY TO HEAR YOU ARE DEPRESSED
• Perhaps I could learn to get on with my mother.
• TELL ME MORE ABOUT YOUR FAMILY
ELIZA – How it works
• Database of keywords, each associated with
• A decomposition rule to match against input
(0  you  0  me)
 1   2   3   4
• A reassembly rule that constitutes an output specification
e.g.
(what makes you think that I 3 you?)
– I wonder if you really like me
– WHAT MAKES YOU THINK THAT I REALLY LIKE YOU
• Various heuristics for selecting next topic, keeping
conversation going, avoiding repetition etc.
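
As a rough illustration, the keyword/decomposition/reassembly cycle can be sketched in a few lines of Python. The rule table, fallback replies and the respond helper below are invented for this example and stand in for ELIZA's much larger script; this is not Weizenbaum's original implementation.

import random
import re

# Toy rule table: each entry pairs a decomposition pattern (whose numbered
# groups are the "components" of the input) with reassembly templates that
# splice those components back into a canned reply.
RULES = [
    # decomposition (0 you 0 me): component {2} is the text between "you" and "me"
    (re.compile(r"(.*)\byou\b(.*)\bme\b", re.IGNORECASE),
     ["WHAT MAKES YOU THINK THAT I {2} YOU?"]),
    # decomposition (0 I'm 0): component {1} is whatever follows "I'm"
    (re.compile(r"\bi(?:'m| am)\s+(.*)", re.IGNORECASE),
     ["I'M SORRY TO HEAR YOU ARE {1}", "HOW LONG HAVE YOU BEEN {1}?"]),
    # family keywords trigger a fixed topic-shifting reply
    (re.compile(r"\b(mother|father|boyfriend)\b", re.IGNORECASE),
     ["TELL ME MORE ABOUT YOUR FAMILY"]),
]

# Content-free continuations used when no keyword matches: one of the
# heuristics that keeps the conversation going.
FALLBACKS = ["IN WHAT WAY?", "CAN YOU THINK OF A SPECIFIC EXAMPLE?", "PLEASE GO ON."]

def respond(utterance: str) -> str:
    for pattern, templates in RULES:
        match = pattern.search(utterance)
        if match:
            # components[1], components[2], ... are the matched fragments
            components = [""] + [g.strip().upper() for g in match.groups()]
            return random.choice(templates).format(*components)
    return random.choice(FALLBACKS)

print(respond("I wonder if you really like me"))
# -> WHAT MAKES YOU THINK THAT I REALLY LIKE YOU?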
ELIZA - Comments
• Remarkable example of robust NLP which
always produces an answer.
• Contains no interesting knowledge about
language or the world
• Good example of creating apparently intelligent
behaviour using programming tricks and an
artificial context
• J. Weizenbaum, "ELIZA – A Computer Program for the Study of Natural Language Communication between Man and Machine", CACM, vol. 9, no. 1, pp. 36-45, 1966
Eliza Anecdote [1]
• Student: May I use your computer time from the office?
• Response: Do you want to use my computer time?
• Student: Yes.
• Response: I understand.
• Student: Are you planning to come into the office tonight?
• Response: Why do you want to know whether I am planning to come into the office tonight?
• Student: Because I want to use your computer time!
• Response: I see.
• Student: Well?
• Response: What do you mean, "Well"?
• Student: Well, what's your answer to my questions?
• Response: Let's explore further why you want to use my computer time.
Eliza Anecdote [2]
• After about ten minutes of increasing
frustration and anger, the student rushed
to a telephone and dialed Bobrow's home,
rousing him from a deep sleep.
• Student: What the heck do you think
you're doing?
• Bobrow: What do you mean, What the
heck do I think I'm doing?
• http://www.kurzweilai.net/meme/frame.html?main=/articles/art0319.html
SHRDLU (T. Winograd 1971)
Three Basic Principles
• A computer cannot deal reasonably with
language unless it can understand the subject it
is discussing.
• In modelling language understanding, different
levels of language processing must be
integrated.
• Humans do not understand isolated sentences.
Sentences always occur in a context which
includes discourse context, spatio-temporal
context, and shared knowledge.
Winograd's Approach
• It is difficult or impossible to isolate one aspect of language use
(syntax, morphology, semantics, pragmatics) from another:
He gave the house plants to charity vs
He gave the boy plants to water.
• It is difficult to draw the line between use of
linguistic knowledge and other kinds of
knowledge
• How can the problem be contained?
Blocks World
• Winograd's strategy was to pick a microworld to
talk about, the "blocks world" of a toy robot. In
the course of a dialogue:
– Manipulate objects
– Answer questions
– Discuss events
– Accept new facts
• Key idea: by limiting the subject area, it
becomes possible to address general issues of
how language is used in a framework of physical
objects, events and continuing discourse.
Main Components
• Syntactic Parser – complex grammar of
English
• Semantic routines – knowledge of
concepts attached to words
• Cognitive deductive system for exploring
the consequences of actions
• Simple set of programs for generating
appropriate English responses
Sample Interaction I
• Pick up a big red block
• OK
• Grasp the pyramid
• I DON'T KNOW WHICH PYRAMID YOU MEAN
• Find a block which is taller than the one you are holding and put it in the box
• BY IT I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING
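
A minimal sketch of why the microworld helps (invented names and data, nothing like Winograd's actual system): with a small, explicit world model the program can resolve "a big red block" to a unique object, and report ambiguity when it cannot, much as in the exchange above.

from dataclasses import dataclass

@dataclass
class Block:
    name: str
    colour: str
    size: str

# The world model: the shared context against which noun phrases are resolved.
WORLD = [Block("b1", "red", "big"),
         Block("b2", "green", "little"),
         Block("b3", "red", "little")]

COLOURS = {"red", "green"}
SIZES = {"big", "little"}

def interpret(command: str) -> str:
    words = set(command.lower().rstrip(".!?").split())
    if not {"pick", "grasp", "find"} & words:
        return "I DON'T KNOW HOW TO DO THAT"
    # Crude "semantic routine": keep only objects consistent with every
    # property the user actually mentioned.
    candidates = [b for b in WORLD
                  if (b.colour in words or not (COLOURS & words))
                  and (b.size in words or not (SIZES & words))]
    if len(candidates) == 1:
        return "OK"                              # unique referent: act on it
    if not candidates:
        return "I CAN'T FIND ANYTHING LIKE THAT"
    return "I DON'T KNOW WHICH ONE YOU MEAN"     # ambiguity triggers a clarification

print(interpret("Pick up a big red block"))   # -> OK
print(interpret("Grasp the pyramid"))         # -> I DON'T KNOW WHICH ONE YOU MEAN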
Sample Interaction II
• Had you touched any pyramid before you put
the little green one on the little cube?
• YES, THE GREEN ONE
• When did you pick it up?
• WHILE I WAS STACKING UP THE RED CUBE,
A LARGE RED BLOCK AND A LARGE GREEN
CUBE
• Why?
• TO GET RID OF IT
• Why did you do that?
• TO CLEAN OFF THE RED CUBE
Issues
• Does NLP have to be knowledge-based?
• What kind of knowledge? How is it
represented?
• How does the knowledge get there?
– Expert Systems approach – knowledge is
entered explicitly in the form of rules
– Data driven approach – take large quantities
of data and use machine learning techniques
to acquire knowledge automatically
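
The contrast can be made concrete with a toy sketch (invented functions and data; real systems are far richer): the café/coffee choice from the earlier translation example can be decided either by a rule someone typed in, or by counts gathered automatically from sense-labelled examples.

from collections import Counter

# Expert-systems route: a linguist enters the disambiguation rule by hand.
def rule_based_sense(context_words):
    # e.g. "dans le café" / "in the café" -> the place, otherwise the drink
    return "café (place)" if {"dans", "in"} & set(context_words) else "coffee (drink)"

# Data-driven route: the same decision is read off co-occurrence counts
# collected automatically from labelled examples.
def learned_sense(context_words, labelled_examples):
    scores = Counter()
    for words, sense in labelled_examples:
        scores[sense] += len(set(words) & set(context_words))
    return scores.most_common(1)[0][0]

examples = [({"soldats", "dans"}, "café (place)"),
            ({"boire", "tasse"}, "coffee (drink)")]
print(rule_based_sense(["les", "soldats", "sont", "dans", "le"]))         # -> café (place)
print(learned_sense(["les", "soldats", "sont", "dans", "le"], examples))  # -> café (place)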