Inteligencia Artificial

Artificial Intelligence
Communication by natural
language
• Fall 2008
• professor: Luigi Ceccaroni
Communication
• Communication is the intentional
exchange of information
– brought about by the production and perception
of signs drawn from a shared system of
conventional signs.
• What sets humans apart from other animals
and machines is the complex system of
structured messages known as natural
language.
– It enables us to communicate most of what we
know about the world.
Natural language processing
• In contrast with formal languages, natural
languages, such as Spanish, French and
English, have no strict definition.
– They are used by a community of speakers.
• Natural language processing (NLP) treats
natural languages as if they were formal
languages
– to build computational systems able to
understand and generate human language in
all its forms.
Understanding speech acts
• The action of producing language is called
a speech act.
• The problem of understanding speech
acts is much like other understanding
problems
– such as understanding images or diagnosing
illnesses.
• We are given a set of ambiguous inputs,
– from them we have to work backwards to
decide what state of the world could have
created these inputs.
Fundamentals of language
• A formal language is defined as a
(possibly infinite) set of strings.
• Each string is a concatenation of terminal
symbols, sometimes called words.
• Formal languages such as first-order logic
and Java have strict mathematical
definitions.
• A grammar is a finite set of rules that
specifies a language.
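As a minimal sketch of how a finite set of rules can specify an infinite set of strings, the Python fragment below enumerates a toy language over the terminal symbols "a" and "b". The two rewrite rules (S -> a S b and S -> a b) and all function names are illustrative assumptions, not part of the course material.

```python
from itertools import islice

def derive(depth):
    """Apply the rule S -> a S b (depth - 1) times, then S -> a b once."""
    if depth == 1:
        return ["a", "b"]
    return ["a"] + derive(depth - 1) + ["b"]

def language():
    """Yield the strings of the toy language, shortest first (never terminates)."""
    depth = 1
    while True:
        yield derive(depth)
        depth += 1

def is_member(symbols):
    """Return True if the list of terminal symbols belongs to the toy language."""
    half, odd = divmod(len(symbols), 2)
    return (not odd and half >= 1
            and symbols[:half] == ["a"] * half
            and symbols[half:] == ["b"] * half)

# Only a finite prefix of the infinite language can ever be listed.
first_three = [" ".join(s) for s in islice(language(), 3)]
```

Here `first_three` holds the three shortest strings, while `is_member` decides membership for any string without enumerating the set.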
Fundamentals of language
• Formal languages always have an official
grammar, specified in some document.
• Natural languages have no official
grammar.
– Linguists strive to discover properties of the
language and then to codify their discoveries
in a grammar.
– To date, no linguist has succeeded
completely.
Fundamentals of language
• Linguists attempt to describe a language as
it is.
• Prescriptive grammarians try to dictate
how a language should be.
• Prescriptive rules are sometimes printed in
style guides, but have little relevance to
actual language usage.
Fundamentals of language
• Both formal and natural languages associate
a meaning or semantics with each valid string.
• In natural languages, it is also important to
understand the pragmatics of a string:
– the actual meaning of the string as it is spoken in
a given situation:
• There are very different ways to say “please”.
• The meaning is not just in the words
themselves, but in the interpretation of the
words in situ.
Fundamentals of language
• Most grammar rule formalisms are based
on the idea of phrase structure:
– Strings are composed of substrings called
phrases, which come in different categories.
• Examples of the category noun phrase,
or NP:
– “the king”
– “the agent in the corner”
Fundamentals of language
1. Phrases usually correspond to natural
semantic elements
– from which the meaning of an utterance can
be constructed; for example:
• Noun phrases refer to objects in the world.
2. Categorizing phrases helps us to describe
the allowable strings of the language.
– Any of the noun phrases can combine with a
verb phrase (or VP) such as “is dead” to form
a phrase of category sentence (or S).
Fundamentals of language
• Without the intermediate notions of NP and
VP, it would be difficult to explain why “the
king is dead” is a sentence whereas “king the
dead is” is not.
• Category names such as NP, VP and S are
called nonterminal symbols.
• Grammars define nonterminals using rewrite
rules:
S → NP VP
An S may consist of any NP followed by any VP.
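The role of rewrite rules can be illustrated with a small recognizer. The sketch below is a hedged example only: the grammar is a hypothetical fragment that covers just the phrases quoted in these slides, and the names `GRAMMAR`, `ends` and `is_sentence` are assumptions introduced for illustration.

```python
# Hypothetical toy grammar: each nonterminal maps to its alternative right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal symbol (a word).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "PP": [["P", "NP"]],
    "VP": [["V", "Adj"]],
    "Det": [["the"]],
    "N": [["king"], ["agent"], ["corner"]],
    "P": [["in"]],
    "V": [["is"]],
    "Adj": [["dead"]],
}

def ends(symbol, words, start):
    """Return every position at which a phrase of category `symbol`
    that begins at position `start` can end."""
    if symbol not in GRAMMAR:  # terminal: it must match the next word
        if start < len(words) and words[start] == symbol:
            return {start + 1}
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        positions = {start}
        for part in rhs:  # match the right-hand side from left to right
            positions = {e for p in positions for e in ends(part, words, p)}
        result |= positions
    return result

def is_sentence(text):
    """Return True if the whole text can be rewritten from the symbol S."""
    words = text.split()
    return len(words) in ends("S", words, 0)
```

With this grammar, `is_sentence("the king is dead")` is true and `is_sentence("king the dead is")` is false, because no NP can start with "king". The recursive strategy only works because the toy grammar has no left-recursive rules.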
Levels of analysis in NLP
• Lexico-morphological
– Detecting lexical units and their morphological
information
• Syntactic
– Checking whether a sentence is syntactically valid
• Semantic
– Extracting global meaning from individual
meanings and from the relations between them
• Pragmatic
– Relating a sentence to the line of discussion
• Illocutionary
– Relating a sentence to the speaker's intentions
Problems in NLP: examples
• Lexical ambiguity
• “reinventing the front wheel”
• “wheel” can be a noun or a verb (part-of-speech
tagging or POS-tagging)
• “she saw the bank”
• Building of a financial institution? Sloping land?
Supply held in reserve for future use? (word sense
disambiguation or WSD)
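To make the part-of-speech ambiguity concrete, the following sketch enumerates the tag sequences that a toy lexicon allows for the first example. The lexicon entries, the tag names and the single contextual constraint are all assumptions chosen for illustration; a real tagger would learn such preferences from a corpus.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to the parts of speech it may take.
LEXICON = {
    "reinventing": ["VERB"],
    "the": ["DET"],
    "front": ["NOUN", "ADJ", "VERB"],
    "wheel": ["NOUN", "VERB"],
}

def tag_sequences(words):
    """Return every tag sequence that the lexicon allows for the given words."""
    return list(product(*(LEXICON[word] for word in words)))

def plausible(tags):
    """Toy contextual constraint: a verb may not directly follow a determiner."""
    return all(not (a == "DET" and b == "VERB") for a, b in zip(tags, tags[1:]))

words = "reinventing the front wheel".split()
candidates = tag_sequences(words)                       # all lexically possible taggings
remaining = [tags for tags in candidates if plausible(tags)]  # after the constraint
```

The lexicon alone allows six taggings of the four words; the one contextual constraint already removes two of them, which is the basic idea behind using context to disambiguate.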
Problems in NLP: examples
• Syntactic ambiguity
• “He saw a man on the mountain top with
binoculars”
• Who’s got the binoculars?
• “The seller of newspapers of the
neighborhood”
• What is the prepositional phrase attached to?
(prepositional-phrase attachment or PP-attachment)
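The attachment ambiguity can be counted mechanically. The sketch below is a simplified illustration under stated assumptions: it uses a hypothetical grammar in Chomsky normal form, a shortened version of the example sentence ("mountain" instead of "mountain top"), and a CKY-style chart that stores the number of analyses per category rather than the trees themselves.

```python
from collections import defaultdict

# Hypothetical binary rules, indexed by their right-hand side: (B, C) -> [A] means A -> B C.
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # the PP modifies the verb phrase
    ("NP", "PP"): ["NP"],   # the PP modifies the noun phrase
    ("Det", "N"): ["NP"],
    ("P", "NP"): ["PP"],
}
# Hypothetical lexicon: word -> categories.
LEXICON = {
    "he": ["NP"], "binoculars": ["NP"],
    "saw": ["V"],
    "a": ["Det"], "the": ["Det"],
    "man": ["N"], "mountain": ["N"],
    "on": ["P"], "with": ["P"],
}

def count_parses(words, goal="S"):
    """Count the distinct parse trees of category `goal` spanning all the words."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(int))  # (start, end) -> category -> count
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for left, n_left in list(chart[(start, mid)].items()):
                    for right, n_right in list(chart[(mid, end)].items()):
                        for parent in BINARY.get((left, right), []):
                            chart[(start, end)][parent] += n_left * n_right
    return chart[(0, n)][goal]
```

Under these assumptions, "he saw a man with binoculars" has 2 analyses and "he saw a man on the mountain with binoculars" has 5. Each additional prepositional phrase increases the count further, which is why long sentences can have very many analyses.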
Problems in NLP: examples
• Semantic ambiguity
• “He gave the children a cake”
• A cake in total or one to each child? (scope of the
quantification)
• “Colorless green ideas sleep furiously”
• Sentence composed by Noam Chomsky in 1957
as an example of a sentence whose grammar is
correct but whose meaning is nonsensical.
• It was used to show inadequacy of the then-popular
probabilistic models of grammar, and the
need for more structured models.
Problems in NLP: examples
• References, ellipsis, pragmatics
• “She gave him a book”
• "We gave the monkeys the bananas because
they were hungry"
• "We gave the monkeys the bananas because
they were over-ripe"
• Same surface grammatical structure. However, the
pronoun they refers to monkeys in one sentence
and bananas in the other, and it is impossible to tell
which without a knowledge of the properties of
monkeys and bananas.
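The dependence on world knowledge can be sketched as follows. This is a deliberately tiny illustration: the table `CAN_BE` is a hypothetical stand-in for knowledge about monkeys and bananas, and `resolve_they` is an assumed helper name, not an established algorithm.

```python
# Hypothetical world knowledge: which properties can sensibly apply to which things.
CAN_BE = {
    "monkeys": {"hungry", "asleep"},
    "bananas": {"over-ripe", "yellow"},
}

def resolve_they(candidates, predicate):
    """Return the candidate antecedents of "they" that are compatible with the predicate."""
    return [c for c in candidates if predicate in CAN_BE.get(c, set())]

antecedents = ["monkeys", "bananas"]  # both are grammatically possible
hungry = resolve_they(antecedents, "hungry")
over_ripe = resolve_they(antecedents, "over-ripe")
```

The grammar offers both antecedents in both sentences; only the knowledge table separates them. If the table has no entry for a predicate, the function returns an empty list and the pronoun stays unresolved.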
Problems in NLP: examples
• Illocution (Where is the stress? What intentions?)
• "I never said she stole my money" (stress on "I") - Someone else
said it, but I didn't.
• "I never said she stole my money" (stress on "never") - I simply didn't
ever say it.
• "I never said she stole my money" (stress on "said") - I might have
implied it in some way, but I never explicitly said it.
• "I never said she stole my money" (stress on "she") - I said someone
took it; I didn't say it was she.
• "I never said she stole my money" (stress on "stole") - I just said she
probably borrowed it.
• "I never said she stole my money" (stress on "my") - I said she stole
someone else's money.
• "I never said she stole my money" (stress on "money") - I said she stole
something, but not my money.
Statistical natural-language
processing
• It uses stochastic, probabilistic and statistical methods
to resolve some of the difficulties discussed above,
especially those which arise because longer
sentences are highly ambiguous when processed
with realistic grammars, yielding thousands or millions
of possible analyses.
• Methods for disambiguation often involve the use of
corpora and Markov models.
• Statistical NLP comprises all quantitative approaches
to automated language processing, including
probabilistic modeling and information theory.
• The technology for statistical NLP comes mainly from
machine learning and data mining, both of which are
fields of artificial intelligence that involve learning from
data.
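As a hedged illustration of the corpus-plus-Markov-model idea, the sketch below estimates a first-order (bigram) Markov model from a three-sentence toy corpus and uses it to compare two word orders. The corpus, the add-one smoothing and the function names are assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would use a large collection of text.
CORPUS = [
    "the king is dead",
    "the queen is alive",
    "the king is alive",
]

def train(sentences):
    """Count contexts and bigrams, with <s> and </s> marking sentence boundaries."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    vocabulary = {t for s in sentences for t in s.split()} | {"</s>"}
    return contexts, bigrams, len(vocabulary)

def log_prob(sentence, model):
    """Log-probability of a sentence under the bigram model with add-one smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total

model = train(CORPUS)
good = log_prob("the king is dead", model)
bad = log_prob("king the dead is", model)
```

Here `good` is larger than `bad`, because every word pair in the first sentence was observed in the corpus and none in the second was. The same kind of score can be used to rank competing analyses of an ambiguous input.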
Major tasks and applications in
NLP
• Automatic summarization
• Foreign language reading aid
• Foreign language writing aid
• Information extraction
• Information retrieval (IR)
• IR is concerned with storing, searching and
retrieving information.
• It is a separate field within computer science
(closer to databases), but IR relies on some NLP
methods (for example, stemming).
• Some current research and applications seek to
bridge the gap between IR and NLP.
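Stemming, mentioned above as an NLP method used in IR, can be sketched with naive suffix stripping. The code below is not the Porter stemmer or any other published algorithm; the suffix list, the minimum stem length and the function names are assumptions for illustration.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # tried in this order

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def matches(query, document):
    """Return True if any stemmed query term occurs among the stemmed document terms."""
    document_stems = {naive_stem(w) for w in document.split()}
    return any(naive_stem(w) in document_stems for w in query.split())
```

With these rules, "searching", "searched" and "searches" all reduce to "search", so the query "retrieving documents" matches a document containing "retrieved a document" even though no word form is shared. Naive stripping also makes mistakes that real stemmers handle with more careful rules.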
Major tasks and applications in
NLP
• Machine translation
• Automatically translating from one human
language to another.
• Named entity recognition (NER)
• Given a stream of text, determining which
items in the text map to proper names, such
as people or places.
• Although named entities in English are
usually capitalized, many other languages
do not use capitalization to distinguish
named entities.
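The capitalization cue and its limits can be seen in a small sketch. The heuristic below is an assumption made for illustration, not a real NER method: it collects runs of capitalized words that do not start a sentence. The example sentences are invented.

```python
import re

def capitalized_candidates(text):
    """Return runs of capitalized words that are not the first word of a sentence."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        run = []
        for index, word in enumerate(sentence.split()):
            token = word.strip(".,;:!?\"'()")
            if index > 0 and token[:1].isupper():
                run.append(token)          # extend the current capitalized run
            else:
                if run:
                    candidates.append(" ".join(run))
                run = []
        if run:
            candidates.append(" ".join(run))
    return candidates

english = capitalized_candidates("Yesterday Noam Chomsky spoke in Boston.")
german = capitalized_candidates("Der Hund schläft im Garten.")
```

On the English sentence the heuristic finds "Noam Chomsky" and "Boston". On the German sentence it returns the common nouns "Hund" and "Garten", because German capitalizes all nouns, so the cue carries no information about named entities there. It also misses names at the start of a sentence.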
Major tasks and applications in
NLP
• Natural language generation
• Natural language understanding
• Optical character recognition (OCR)
• Question answering
• Given a human language question, the task of
producing a human-language answer.
• The question may be closed-ended (such as
"What is the capital of Canada?") or open-ended
(such as "What is the meaning of life?").
Major tasks and applications in
NLP
• Speech recognition
• Given a sound clip of a person or people
speaking, the task of producing a textual
transcription of what was said.
• (The opposite of text-to-speech.)
• Spoken dialogue system
• Text simplification
• Text-to-speech
• Text-proofing
Resources
• Natural language processing (in Spanish)
[http://es.geocities.com/lenguajenatural/]
• Introductory book
[http://www.gelbukh.com/clbook/]
• Resources for text, speech and language
processing
[http://www.cs.technion.ac.il/~gabr/resources/resources.html]
• Natural language processing blog
[http://nlpers.blogspot.com/]
Resources
• About Opinion, Language, and Blogs
[http://opinlab.wordpress.com/]
• A comprehensive list of resources,
classified by category
[http://www.proxem.com/]
• ACL Wiki for natural language processing
and computational linguistics
[http://aclweb.org/aclwiki/index.php?title=Main_Page]
Research and development
groups
• IBM NLP Research Area
[http://domino.watson.ibm.com/comm/research.nsf/pages/r.nlp.html]
• Microsoft Research: NLP
[http://research.microsoft.com/nlp/]
• Language Technologies Institute at Carnegie Mellon
University [http://www.lti.cs.cmu.edu/]
• Natural Language Group at the Information Sciences
Institute [http://www.isi.edu/natural-language/]
• Natural Language Generation Group at the Open
University [http://mcs.open.ac.uk/nlg/]
Research and development
groups
• Survey of the State of the Art in Human Language
Technology [http://cslu.cse.ogi.edu/HLTsurvey/]
• University of Edinburgh Natural Language Processing
Group [http://www.iccs.informatics.ed.ac.uk/]
• Natural Language and Information Processing Group at
the University of Cambridge
[http://www.cl.cam.ac.uk/research/nl/]
• Stanford Natural Language Processing Group
[http://nlp.stanford.edu/]
• UPC center for research and technology development
on language and speech processing (TALP)
[http://www.talp.cat/talp/]