Lecture 36-40 - Computer Science Department Staff Website

Chapter 15
Natural Language Processing (cont)
323-670 Artificial Intelligence
Dr. Wiphada Wettayaprasit (ดร.วิภาดา เวทย์ประสิทธิ์), Department of Computer Science, Faculty of Science, Prince of Songkla University
NLP Problems
 Figure 15.1 P. 378
 English sentences are incomplete descriptions of the
information that they are intended to convey.
 The same expression means different things in
different contexts.
 No natural language program can be complete because
new words, expressions, and meanings can be
generated quite freely.
 There are lots of ways to say the same thing.
NLP Problems
 1) Processing written text
– using lexical, syntactic, and semantic knowledge
of the language
– plus the required real-world information
 2) Processing spoken language
– using all the information needed above
– plus additional knowledge about phonology
– handling the additional ambiguities that arise in speech
NLP
 Natural language processing
 Language translation / multilingual translation
 Language understanding
– Figure 14.5 p. 365 Interaction among
components
– Figure 14.6 p. 366 A speech waveform
Steps in NLP
 1) Morphological Analysis
 2) Syntactic Analysis
 3) Semantic Analysis
 4) Discourse Integration
 5) Pragmatic Analysis
– The boundaries between these five phases are
often fuzzy.
1. Morphological Analysis
 Individual words are analyzed into components
 Nonword tokens such as punctuation are
separated from the words
 Example: I want to print Bill’s .int file.
– “Bill” is a proper noun
– “’s” is a possessive suffix
– “.int” is a file extension
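The splitting described above can be sketched with a small tokenizer. This is a minimal illustration, not the analyzer from the textbook: the pattern `TOKEN`, the function `analyze`, and the rule that a capitalized word that does not start the sentence is a proper noun are all assumptions made for this example.

```python
import re

# Hypothetical token pattern; alternatives are tried in the order listed.
TOKEN = re.compile(r"""
    (?P<file_extension>\.[A-Za-z]+)      # a dot followed by letters, such as .int
  | (?P<possessive_suffix>['’]s\b)       # apostrophe followed by s
  | (?P<word>[A-Za-z]+)                  # an ordinary word
  | (?P<punctuation>[^\sA-Za-z])         # any other non-space character
""", re.VERBOSE)

def analyze(sentence):
    """Split a sentence into (text, kind) pairs."""
    tokens = []
    for match in TOKEN.finditer(sentence):
        kind = match.lastgroup
        text = match.group()
        # Assumed heuristic: a capitalized word that is not sentence-initial
        # is treated as a proper noun.
        if kind == "word" and text[0].isupper() and tokens:
            kind = "proper_noun"
        tokens.append((text, kind))
    return tokens

for token in analyze("I want to print Bill's .int file."):
    print(token)
```

The capitalization heuristic is deliberately crude (it would mislabel a mid-sentence "I"); a real morphological analyzer would consult a lexicon instead.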
2. Syntactic Analysis
 A linear sequence of words is transformed into
structures that show how the words relate to each other
 English syntactic analyzer
 If a word sequence does not pass the syntactic analyzer → reject
(e.g., “Boy the go to store the”)
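As a rough illustration of turning a word sequence into a structure and rejecting sequences that do not fit, the sketch below hard-codes a tiny grammar (S = NP VP, NP = Det N, VP = V or V PP, PP = P NP). The grammar, the `LEXICON` table, and the function `parse` are assumptions for this example only and cover just a handful of words.

```python
# Hypothetical lexicon: word -> syntactic category.
LEXICON = {"the": "Det", "boy": "N", "store": "N",
           "go": "V", "goes": "V", "to": "P"}

def parse(sentence):
    """Return a nested-tuple parse tree, or None if the sentence is rejected."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON.get(w) for w in words]

    def np(i):                      # NP -> Det N
        if tags[i:i + 2] == ["Det", "N"]:
            return ("NP", words[i], words[i + 1]), i + 2
        return None

    def pp(i):                      # PP -> P NP
        if tags[i:i + 1] == ["P"]:
            result = np(i + 1)
            if result:
                return ("PP", words[i], result[0]), result[1]
        return None

    def vp(i):                      # VP -> V PP | V
        if tags[i:i + 1] != ["V"]:
            return None
        result = pp(i + 1)
        if result:
            return ("VP", words[i], result[0]), result[1]
        return ("VP", words[i]), i + 1

    result = np(0)                  # S -> NP VP
    if result is None:
        return None
    subject, i = result
    result = vp(i)
    if result is None or result[1] != len(words):
        return None                 # leftover or missing words: reject
    return ("S", subject, result[0])

print(parse("The boy goes to the store"))
print(parse("Boy the go to store the"))   # None: rejected
```

The first call returns a tree showing how the words group together; the scrambled sentence from the slide fails at the very first noun phrase and is rejected.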
2. Syntactic Analysis
 Example of syntactic analysis
– Figure 15.2 p. 382 → RM2, RM5, RM5
 A knowledge base fragment
– Figure 15.3 p. 383
– User073, F1, Printing, File_Structure, Waiting
– Mental Event/ Physical Event Animate/Event
 Partial meaning for a sentence
– Figure 15.4 p. 384
3. Semantic Analysis
 The structures created by the syntactic analyzer are assigned
meanings
 A mapping is made between the syntactic structures and objects in the task
domain
 If no mapping is possible → reject (e.g., “Colorless green ideas sleep
furiously”)
 1) It must map individual words into appropriate objects in the
knowledge base or database.
 2) It must create the correct structures to correspond to the way the
meanings of the individual words combine with each other.
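The two requirements above can be sketched as a lookup followed by a compatibility check. The tables below are a made-up miniature knowledge base: `Waiting`, `File_Structure`, and `Animate` echo names from the knowledge base fragment on the earlier slide, while `Idea`, `Sleeping`, the `agent_must_be` restriction, and the function `interpret` are assumptions for illustration. The input is assumed to be a (subject, verb) pair already produced by syntactic analysis.

```python
# Hypothetical knowledge base: nouns map to typed objects,
# verbs map to events that restrict the type of their agent.
KB_NOUNS = {
    "user":  {"object": "User",           "isa": "Animate"},
    "file":  {"object": "File_Structure", "isa": "Information_Object"},
    "ideas": {"object": "Idea",           "isa": "Abstract"},
}
KB_VERBS = {
    "waits": {"event": "Waiting",  "agent_must_be": "Animate"},
    "sleep": {"event": "Sleeping", "agent_must_be": "Animate"},
}

def interpret(subject, verb):
    """Return a meaning structure, or None if no mapping exists."""
    noun = KB_NOUNS.get(subject)
    event = KB_VERBS.get(verb)
    if noun is None or event is None:
        return None          # requirement 1 fails: a word has no object in the KB
    if noun["isa"] != event["agent_must_be"]:
        return None          # requirement 2 fails: the meanings do not combine
    return {"instance": event["event"], "agent": noun["object"]}

print(interpret("user", "waits"))
print(interpret("ideas", "sleep"))   # None: ideas are not animate
```

A sentence can therefore pass the syntactic analyzer and still be rejected here, which is the situation the "colorless green ideas" example describes.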
4. Discourse Integration
 The meaning of an individual sentence may depend on the
sentences that precede it and may influence the meanings of
the sentences that follow it.
 (Ex. John wants it.) → what “it” refers to depends on the previous
sentences.
 The current user, who typed the word “I”, is
– User068 = Susan_Black
 We get F1, with its filename in the /wsmith/ directory
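A minimal sketch of this kind of context tracking follows. The class `Discourse` and its methods are hypothetical, and resolving "it" to the most recently mentioned entity is a simplifying assumption; real reference resolution needs much more than recency.

```python
class Discourse:
    """Keeps the context needed to resolve 'I' and 'it' (illustrative only)."""

    def __init__(self, current_user):
        self.current_user = current_user
        self.last_entity = None          # most recently mentioned entity, if any

    def mention(self, entity):
        """Record an entity introduced by an earlier sentence."""
        self.last_entity = entity

    def resolve(self, word):
        """Replace a pronoun with its referent; other words pass through."""
        lowered = word.lower()
        if lowered == "i":
            return self.current_user
        if lowered == "it":
            return self.last_entity      # None when nothing has been mentioned yet
        return word

context = Discourse(current_user="User068")
context.mention("F1")                    # an earlier sentence introduced file F1
print(context.resolve("I"))              # User068
print(context.resolve("it"))             # F1
```

The same word "it" resolves to different objects depending on what was mentioned before, which is why a sentence cannot always be interpreted in isolation.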
5. Pragmatic Analysis
 The structure representing what was said is
reinterpreted to determine what was actually meant.
 (Ex. Do you know what time it is?)
 The system must understand the sentence well enough to decide what to do as a result
 Representing the intended meaning
– Figure 15.5 P. 385
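One way to picture the reinterpretation step is a rule that recognizes an indirect request. The sketch below is an assumption-heavy toy: the pattern `INDIRECT`, the act labels `request_answer` and `literal`, and the function `intended_meaning` are invented for illustration and work on raw strings, whereas a real system would operate on the structure built by the earlier phases.

```python
import re

# Hypothetical rule: "Do you know <question>?" usually asks for the answer
# to the embedded question, not for a yes/no reply.
INDIRECT = re.compile(
    r"^(do|can|could) you (know|tell me) (?P<question>.+?)\??$",
    re.IGNORECASE,
)

def intended_meaning(utterance):
    """Map what was said to what was (probably) meant."""
    text = utterance.strip()
    match = INDIRECT.match(text)
    if match:
        return {"act": "request_answer", "question": match.group("question")}
    return {"act": "literal", "content": text}

print(intended_meaning("Do you know what time it is?"))
print(intended_meaning("Print the file."))
```

For the slide's example the result is a request to report the time, so the appropriate action is to answer with the time rather than with "yes".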
Syntactic Processing
 Top-down Parsing
– Begin with the start symbol and apply the grammar rules
forward until the symbols at the terminals of the tree
correspond to the components of the sentence being parsed.
 Bottom-up Parsing
– Begin with the sentence to be parsed and apply the
grammar rules backward until a single tree whose terminals
are the words of the sentence and whose top node is the
start symbol has been produced.
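The two directions can be compared on one toy grammar. Everything named below (`GRAMMAR`, `LEXICON`, `tag`, `top_down`, `try_reduce`, and the two `accepts_*` functions) is a hypothetical sketch. The bottom-up version is a greedy shift-reduce recognizer with no backtracking, so it only works because longer rules are listed before shorter ones; it is not a general parser.

```python
# Hypothetical toy grammar and lexicon, for illustration only.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "Adj", "N"], ["Det", "N"]],
    "VP": [["Aux", "V"], ["V"]],
}
LEXICON = {"the": "Det", "long": "Adj", "file": "N", "has": "Aux", "printed": "V"}

def tag(sentence):
    """Look up the category of each word (unknown words become None)."""
    return [LEXICON.get(w) for w in sentence.lower().rstrip(".").split()]

def top_down(symbol, tags, i=0):
    """Yield each position j such that `symbol` can derive tags[i:j]."""
    if symbol not in GRAMMAR:                  # word category: compare with the input
        if i < len(tags) and tags[i] == symbol:
            yield i + 1
        return
    for rhs in GRAMMAR[symbol]:                # apply each rule forward
        ends = [i]
        for part in rhs:
            ends = [k for j in ends for k in top_down(part, tags, j)]
        yield from ends

def accepts_top_down(sentence):
    tags = tag(sentence)
    return len(tags) in top_down("S", tags)    # S must cover the whole input

def try_reduce(stack):
    """Apply one rule backward: replace a right-hand side on top of the stack."""
    for lhs, rules in GRAMMAR.items():
        for rhs in rules:
            if stack[-len(rhs):] == rhs:
                stack[-len(rhs):] = [lhs]
                return True
    return False

def accepts_bottom_up(sentence):
    stack, queue = [], tag(sentence)
    while True:
        if try_reduce(stack):
            continue                           # keep reducing while possible
        if not queue:
            return stack == ["S"]              # success only if a single S remains
        stack.append(queue.pop(0))             # otherwise shift the next word

print(accepts_top_down("The long file has printed."))
print(accepts_bottom_up("The long file has printed."))
```

Both functions accept the same sentences for this grammar; they differ only in whether the search starts from S and works toward the words or starts from the words and works toward S.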
ATN : Augmented Transition Network
 Similar to a finite state machine
– Figure 15.8 p.392 An ATN network
– Figure 15.9 p.3923 An ATN Grammar in List Form
– sentence → “The long file has printed.”
– S → NP → Q1 → AUX → Q3 → V → Q4 (F) halt
– NP → Det Q6 → Adj Q6 → N → Q7 (F)
(S DCL (NP (FILE (LONG) DEFINITE)) HAS (VP PRINTED))
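The traversal listed above can be imitated in a few lines. This sketch follows only the arcs named on the slide (NP, AUX, V for the S network; Det, a looping Adj, N for the NP network) and is not the full grammar of Figure 15.9. The register names `DETERMINER`, `ADJS`, and `NOUN`, the `LEXICON` table, and the helper `to_sexpr` are assumptions for illustration.

```python
# Hypothetical lexicon for the example sentence.
LEXICON = {"the": "Det", "a": "Det", "long": "Adj",
           "file": "N", "has": "Aux", "printed": "V"}

def category(words, i):
    """Category of word i, or None past the end of the input."""
    return LEXICON.get(words[i]) if i < len(words) else None

def parse_np(words, i):
    """NP network: Det -> Q6, Adj loops on Q6, N -> Q7 (final)."""
    registers = {"ADJS": []}
    if category(words, i) != "Det":
        return None
    registers["DETERMINER"] = "DEFINITE" if words[i] == "the" else "INDEFINITE"
    i += 1
    while category(words, i) == "Adj":         # stay in Q6 for each adjective
        registers["ADJS"].append(words[i].upper())
        i += 1
    if category(words, i) != "N":
        return None
    registers["NOUN"] = words[i].upper()
    np = ["NP", [registers["NOUN"], registers["ADJS"], registers["DETERMINER"]]]
    return np, i + 1

def parse_s(sentence):
    """S network: NP -> Q1, AUX -> Q3, V -> Q4 (final)."""
    words = sentence.lower().rstrip(".").split()
    result = parse_np(words, 0)
    if result is None:
        return None
    subject, i = result
    if category(words, i) != "Aux":
        return None
    aux = words[i].upper()
    if category(words, i + 1) != "V" or i + 2 != len(words):
        return None
    return ["S", "DCL", subject, aux, ["VP", words[i + 1].upper()]]

def to_sexpr(node):
    """Render nested lists in the parenthesized list form used on the slide."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return node

print(to_sexpr(parse_s("The long file has printed.")))
```

What distinguishes an ATN from a plain finite state machine is visible in `parse_np`: arcs store values in registers as they are followed, and the final state assembles the output structure from those registers.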