Natural Language Processing Course
Amirkabir University of Technology
Computer Engineering Faculty
Natural Language Processing
Course
Dr. Ahmad Abdollahzadeh
Session Agenda
Artificial Intelligence
Natural Language Processing
History of NLP
Applications of NLP
2
AI Concepts and Definitions
• Encompasses Many Definitions
• AI Involves Studying Human Thought Processes
• Representing Thought Processes on Machines
3
Artificial Intelligence
• Behavior by a machine that, if performed by a human being, would be considered intelligent
• “…study of how to make computers do things at which, at the moment, people are better” (Rich and Knight [1991])
• Theory of how the human mind works (Mark Fox)
4
Decision Support Systems and Intelligent Systems, Efraim Turban and Jay E. Aronson
6th ed, Copyright 2001, Prentice Hall, Upper Saddle River, NJ
AI Objectives
• Make machines smarter (primary goal)
• Understand what intelligence is (Nobel Laureate purpose)
• Make machines more useful (entrepreneurial purpose)
(Winston and Prendergast [1984])
5
Signs of Intelligence
• Learn or understand from experience
• Make sense out of ambiguous or contradictory messages
• Respond quickly and successfully to new situations
• Use reasoning to solve problems
6
More Signs of Intelligence
• Deal with perplexing situations
• Understand and infer in ordinary, rational ways
• Apply knowledge to manipulate the environment
• Think and reason
• Recognize the relative importance of different elements in a situation
7
Turing Test for Intelligence
A computer can be considered to be smart
only when a human interviewer,
“conversing” with both an unseen human
being and an unseen computer, cannot
determine which is which
8
Symbolic Processing
• Use Symbols to Represent Problem Concepts
• Apply Various Strategies and Rules to Manipulate these Concepts
9
AI Represents Knowledge as
Sets of Symbols
A symbol is a string of characters that stands for
some real-world concept
Examples
• Product
• Defendant
• 0.8
• Chocolate
10
Symbol Structures
(Relationships)
• (DEFECTIVE product)
• (LEASED-BY product defendant)
• (EQUAL (LIABILITY defendant) 0.8)
• tastes_good (chocolate).
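The symbol structures above can be sketched in code. The following is a minimal illustration, assuming a nested-tuple encoding in Python (relation name first, arguments after); the encoding and the helper names `relation`, `arguments` and `mentions` are assumptions made for this example, not notation from the slides.

```python
# Symbol structures from the slide, written as nested Python tuples.
# Assumption: the first element is the relation, the rest are its arguments.
facts = [
    ("DEFECTIVE", "product"),
    ("LEASED-BY", "product", "defendant"),
    ("EQUAL", ("LIABILITY", "defendant"), 0.8),
    ("tastes_good", "chocolate"),
]


def relation(fact):
    """Return the relation name (the first symbol) of a structure."""
    return fact[0]


def arguments(fact):
    """Return the arguments the relation is applied to."""
    return fact[1:]


def mentions(fact, symbol):
    """True if `symbol` occurs anywhere inside the (possibly nested) structure."""
    if isinstance(fact, tuple):
        return any(mentions(part, symbol) for part in fact)
    return fact == symbol


# Every structure that refers to the symbol "defendant", at any nesting depth.
about_defendant = [f for f in facts if mentions(f, "defendant")]
```

Here `about_defendant` collects the LEASED-BY and EQUAL structures, because the symbol appears directly in one and inside the nested LIABILITY term in the other.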
11
• AI Programs Manipulate Symbols to Solve Problems
• Symbols and Symbol Structures Form Knowledge Representation
• Artificial Intelligence Deals Primarily with Symbolic, Nonalgorithmic Problem-Solving Methods
12
AI Computing
• Based on symbolic representation and manipulation
• A symbol is a letter, word, or number representing objects, processes, and their relationships
• Objects can be people, things, ideas, concepts, events, or statements of fact
• Creates a symbolic knowledge base
13
AI Computing (cont’d)
• Manipulates symbols to generate advice
• AI reasons or infers with the knowledge base by search and pattern matching
• Hunts for answers (via algorithms)
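Search and pattern matching over a knowledge base can be sketched as follows. This is a minimal illustration rather than a description of any particular AI system: facts are assumed to be flat or nested tuples, and a string beginning with "?" is assumed to be a variable. The names `match` and `query` are hypothetical.

```python
def is_variable(term):
    """Assumption: variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings=None):
    """Match `pattern` against `fact`; return variable bindings, or None on failure."""
    bindings = dict(bindings or {})
    if is_variable(pattern):
        if pattern in bindings:
            # A variable already bound must match the same value again.
            return bindings if bindings[pattern] == fact else None
        bindings[pattern] = fact
        return bindings
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for sub_pattern, sub_fact in zip(pattern, fact):
            bindings = match(sub_pattern, sub_fact, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None


def query(knowledge_base, pattern):
    """Search the whole knowledge base and collect the bindings of every match."""
    results = []
    for fact in knowledge_base:
        found = match(pattern, fact)
        if found is not None:
            results.append(found)
    return results


knowledge_base = [
    ("LEASED-BY", "product", "defendant"),
    ("DEFECTIVE", "product"),
    ("tastes_good", "chocolate"),
]

# "Who leased what?" expressed as a pattern with two variables.
answers = query(knowledge_base, ("LEASED-BY", "?item", "?who"))
```

The search here is a plain linear scan; real systems typically index the knowledge base, but the matching step has the same shape.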
14
Major AI Areas
Expert Systems
Natural Language Processing
Speech Understanding
Robotics and Sensory Systems
Computer Vision and Scene Recognition
Intelligent Computer-Aided Instruction
Neural Computing
15
Additional AI Areas
News Summarization
Language Translation
Fuzzy Logic
Genetic Algorithms
Intelligent Software Agents
16
Natural Language?
Natural language is the language we write and speak in
everyday social interaction.
There are, of course, many varieties of natural language.
It is quite possible to argue that the spoken and the
written forms of the language are different and may be
largely independent.
There are systems of vocabulary, syntax and semantics
which can be observed (or similarly discovered) and
recorded.
Those working in NLP also would claim (or at least
hope) that it is possible to "automate" these
descriptions to produce useful systems that are based
on these descriptions.
17
Natural Language Processing
(NLP)
Natural language processing concerns the development of
computational models of aspects of human language
processing such as:
• Reading and interpreting a textbook
• Writing a letter
• Holding a conversation
• Translating a document
• Searching for useful information
Such models are useful in order to write computer programs
to perform useful tasks involving language processing and in
order to develop a better understanding of human
communication.
18
Other Titles
The most common titles, apart from Natural Language Processing, include:
• Automatic Language Processing
• Computational Linguistics
• Natural Language Understanding
19
Computational Linguistics
This is the application of computers to the scientific
study of human language.
This definition suggests that there are connections
with Cognitive Science, that is to say, the study of
how humans produce and understand language.
Historically, Computational Linguistics has been
associated with work in Generative Linguistics and
formerly included the study of formal languages (e.g.
finite state automata) and programming languages.
The computer is used as a tool on which models can
be developed and evaluated, for instance
implementations of theories of child language
acquisition.
20
Natural Language Understanding
This title distinguishes a particular approach to Natural
Language Processing.
The people using this title tend to lay much
emphasis on the meaning of the language being
processed, in particular getting the computer to
respond to the input in an apparently intelligent
fashion.
At one time, those who belonged to the Natural
Language Understanding camp avoided the use of
any syntactic processing, but textbooks that bear this
title now include significant sections on syntactic
processing, which suggests that the edge of the title
has been rather blunted. (For instance, see Allen
(1987; part 1).)
21
NLP History (1)
The first recognisable NLP application was a
dictionary look-up system developed at
Birkbeck College, London in 1948.
NLP from 1966-1980
Augmented Transition Networks
The Augmented Transition Network (ATN) is a piece of
searching software that is capable of using very powerful
grammars to process syntax.
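The idea of searching a transition network while filling registers can be sketched as below. This is a heavily reduced illustration and not a full ATN: it has no recursive calls to sub-networks and no tests on arcs. The lexicon, the state names and the register names are all hypothetical.

```python
# Hypothetical toy lexicon: word -> syntactic category.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "sees": "VERB", "chases": "VERB",
}

# state -> list of arcs (category to consume, register to set or None, next state)
NETWORK = {
    "S0": [("DET", None, "S1"), ("NOUN", "subject", "S2")],
    "S1": [("NOUN", "subject", "S2")],
    "S2": [("VERB", "verb", "S3")],
    "S3": [("DET", None, "S4"), ("NOUN", "object", "END")],
    "S4": [("NOUN", "object", "END")],
}
FINAL_STATES = {"END"}


def parse(words, state="S0", registers=None):
    """Depth-first search over the arcs; return the filled registers or None."""
    registers = registers or {}
    if not words:
        return registers if state in FINAL_STATES else None
    category = LEXICON.get(words[0])
    for arc_category, register, next_state in NETWORK.get(state, []):
        if arc_category == category:
            new_registers = dict(registers)
            if register is not None:
                # The "augmentation": record what was found on this arc.
                new_registers[register] = words[0]
            result = parse(words[1:], next_state, new_registers)
            if result is not None:
                return result
    return None


analysis = parse("the dog chases a cat".split())
```

Copying the registers on every arc keeps backtracking simple: a failed branch cannot leave stale values behind for the next alternative.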
Case Grammar
The significance of the proposal for NLP is that it contributed a
relatively easily implementable theory which could contribute much
semantic information with little processing effort. It also
contributed to the solution of one of the intractable problems of
Machine Translation: the translation of prepositions.
22
NLP History (2)
NLP from 1966-1980
Semantic representations
Schank and his co-workers introduced the notion of Conceptual
Dependency, a method of expressing language in terms of
semantic primitives. Systems were written which included no
syntactic processing.
Quillian's work on memory introduced the idea of the semantic
network, which has been used in varying forms for knowledge
representation in many systems.
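A semantic network of the kind described above can be sketched as a set of nodes joined by labelled links, with properties inherited along "is-a" links. This is a minimal illustration only: the nodes, the link labels and the function name `lookup` are hypothetical and are not taken from Quillian's system.

```python
# Hypothetical toy network: node -> {link label -> target or value}.
SEMANTIC_NETWORK = {
    "canary": {"is-a": "bird", "colour": "yellow"},
    "bird": {"is-a": "animal", "can": "fly"},
    "animal": {"has": "skin"},
}


def lookup(node, link):
    """Find `link` on `node`, climbing is-a links until it is found or the chain ends."""
    while node is not None:
        properties = SEMANTIC_NETWORK.get(node, {})
        if link in properties:
            return properties[link]
        node = properties.get("is-a")
    return None


canary_ability = lookup("canary", "can")
```

The property "can: fly" is stored once on "bird" and is found for "canary" by following the is-a link, which is the economy of storage that makes such networks attractive for knowledge representation.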
William Woods used the idea of procedural semantics to act as an
intermediate representation between a language processing
system and a database system.
The key systems were:
SHRDLU
LUNAR: A database interface system that used ATNs and Woods'
Procedural Semantics.
LIFER/LADDER: One of the most impressive of NLP systems. It was
designed as a natural language interface to a database of information
about US Navy ships.
23
NLP History (3)
NLP from 1980-1990
- Grammar Formalisms
NLP from 1990-now
- Multilinguality and Multimodality
24
NLP Applications
Applications can be classified in different
ways, e.g. medium/modality; depth of
analysis; degree of interaction
Text-based applications
NL Understanding
Dialogue Systems
Multimodal
25
Text-based Applications
Processing of written texts such as books, news, papers, reports:
Finding appropriate documents on certain topics from a text
database
Extracting information from messages, articles, Web pages, etc.
Translating documents from one language to another
Text summarisation
Note: Not all such applications require NLP
Keyword-based techniques can suffice for identifying particular
subject areas, e.g. legal, financial, etc.
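A keyword-based approach to identifying subject areas can be sketched as below. This is a minimal illustration with no linguistic analysis at all; the keyword lists and the function names `tokenize` and `classify` are hypothetical, chosen only for this example.

```python
import re

# Hypothetical keyword lists for two subject areas.
SUBJECT_KEYWORDS = {
    "legal": {"court", "defendant", "contract", "liability", "lawsuit"},
    "financial": {"bank", "loan", "interest", "shares", "dividend"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text):
    """Return the subject whose keywords occur most often, or None if none occur."""
    tokens = tokenize(text)
    scores = {
        subject: sum(1 for token in tokens if token in keywords)
        for subject, keywords in SUBJECT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


subject = classify("The court found the defendant liable under the contract.")
```

Counting keyword hits ignores word order and meaning entirely, which is why it works for coarse topic labels but not for requests that need deeper analysis.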
26
NL Understanding
Other kinds of request require a deeper level of analysis
Find me all articles concerning car accidents involving more than
two cars in Malta during the first half of 2001
Here the system must extract enough information to determine
whether the article meets the criterion defined by the query.
A crucial characteristic of an understanding system is that it can
compute some representation of the information that can be used
for later inference
A crucial question for an NLP system is how much understanding is
necessary to achieve the purpose of the system.
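The role of a stored representation in answering the example query can be sketched as below. This is a minimal illustration that assumes the hard part, extracting the fields from the article text, has already been done. The `AccidentReport` record, its field names and the sample data are invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccidentReport:
    """Hypothetical representation extracted from one article."""
    article_id: str
    location: str
    event_date: date
    cars_involved: int


def meets_query(report):
    """Car accidents involving more than two cars in Malta in the first half of 2001."""
    in_first_half_2001 = date(2001, 1, 1) <= report.event_date <= date(2001, 6, 30)
    return (
        report.location == "Malta"
        and report.cars_involved > 2
        and in_first_half_2001
    )


# Invented sample data, for illustration only.
reports = [
    AccidentReport("a1", "Malta", date(2001, 3, 14), 3),
    AccidentReport("a2", "Malta", date(2001, 9, 2), 4),
    AccidentReport("a3", "Malta", date(2001, 5, 20), 2),
    AccidentReport("a4", "Italy", date(2001, 2, 1), 5),
]

matching = [r.article_id for r in reports if meets_query(r)]
```

Once the information is held in this form, the query reduces to simple comparisons; a keyword search for "Malta", "car" and "2001" could not distinguish the four sample articles.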
27
Dialogue-based Applications
Dialogue-based applications involve man-machine
communication
NL database query systems
Automated customer services, e.g. banking services
General NL mediated problem solving systems
Some of the differences between dialogue and text-based systems:
Language used is less formal
System needs to act proactively in order to maintain smooth conversation
Use of acknowledgements and clarification sub-dialogues
28
Multimodal Applications
Involve two or more modalities of communication
Text
Speech
Gesture
Image
Text to speech
Speech to text
Multimodal document generation
Spoken translation systems
Spoken dialogue systems
30