Communication & NL

By
Rohana Mahmud
(NLP week 1-2)
Study of Language

Language:
- Written: a long-term record of knowledge passed from one generation to another
- Spoken: the primary means of coordinating day-to-day behavior with others
- Natural (e.g. Malay, English) vs. artificial (e.g. Java, Prolog, coding languages)
Communication:
- Uses signs, natural language, or body language
- Involves a sender and a receiver
Studied in several disciplines:
- Linguists: the structure of language
- Psycholinguists: the processes of human language production and comprehension
- Philosophers: how words can mean anything and how they identify objects in the world; what it means to have beliefs, goals, and intentions; how cognitive capabilities relate to language
- Computational linguists (CL): developing a computational theory of language, using the notions of algorithms and data structures from CS
History

Speech & Language Processing spans several overlapping fields:
- Computational Linguistics (Linguistics)
- Natural Language Processing (CS)
- Speech Recognition (EE)
- Computational Psycholinguistics (Psychology)
History

- 1940s & 1950s: foundational insights
  - The automaton and probabilistic / information-theoretic models (Turing 1936, Shannon 1948)
  - Formal language theory (Chomsky 1956)
  - Probabilistic algorithms
- 1957-1970: two camps, symbolic and stochastic
- 1970-1983: four paradigms: stochastic, logic-based, NLU (SHRDLU by Winograd, LUNAR by Woods), and discourse modelling (Grosz), including Belief-Desire-Intention (BDI) models
- Current: the field comes together, with probabilistic and data-driven models, increases in the speed and memory of computers, applications to Augmentative and Alternative Communication (AAC), and, with the rise of the Web, a need for language-based information retrieval (IR) and information extraction (IE)
Goal

- Scientific goal: cognitive science, an interdisciplinary research area
- Technological/practical goal: NLP could revolutionize the way computers are used. Computers that understand NL could access all information (human knowledge)
- NL interfaces to computers: allow complex systems to be accessible to everyone, and make them more flexible and intelligent
Speech and Language Processing

- Understanding humans: speech recognition & NLU (lip-reading)
- Communicating with humans: NLG & speech synthesis
- Information retrieval, information extraction, inference (drawing conclusions based on facts)
- Spelling correction, grammar checking, machine translation
- Data processing vs. language processing (language processing requires knowledge of language, e.g. what it means to be a word)
Application of NLU

- NLU represents the meaning of sentences in some representation language that can be used later for further processing
- Text-based applications: processing written text (books, newspapers, reports, manuals, email, SMS), i.e. reading-based tasks:
  - Searching a database of texts
  - Extracting information from text
  - Translating documents
  - Summarizing texts for a particular purpose
  - Story understanding
Application of NLU

- Dialogue-based applications: involve human-machine communication (spoken, or via keyboard, mouse, or recognizer)
  - Q&A systems, e.g. querying a database
  - Automated customer service (phone)
  - Tutoring systems (interaction with students)
  - Spoken language control of a machine
  - General cooperative problem-solving systems
  - Speech recognition is not the same as language understanding: a speech recognizer only identifies the words spoken in a given speech signal, not how those words are used to communicate
  - Discuss the ELIZA system
ELIZA system

- Developed at MIT in the mid-1960s (Weizenbaum, 1966); the system plays a therapist and the user plays a patient
- Algorithm:
  - Keep a database of particular words (keywords)
  - For each keyword, store an integer (its rank), a pattern to match against the input, and a specification of the output
  - Given a sentence S, find a keyword in S whose pattern matches S
  - If more than one keyword matches, pick the one with the highest integer value
  - Use the output specification associated with this keyword to generate the next sentence
  - If there are no keywords, generate an innocuous continuation statement, e.g. "Tell me more", "Go on" (see Allen, Figures 1.2 and 1.3)
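
The keyword-matching steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not Weizenbaum's original program: the rule table `RULES`, the rank values, the patterns, the response templates, and the function name `respond` are all made up for this example, and the pronoun swapping that a fuller ELIZA performs ("my" to "your") is left out.

```python
import re

# Hypothetical keyword database: each entry holds an integer rank,
# a pattern to match against the input, and an output specification.
RULES = [
    (10, re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (5, re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
    (1, re.compile(r"\bcomputers?\b", re.IGNORECASE), "Do machines worry you?"),
]

# Innocuous continuation statements used when no keyword matches.
FALLBACKS = ["Tell me more.", "Go on."]


def respond(sentence, turn=0):
    """Return a reply to `sentence` using ranked keyword patterns."""
    matches = []
    for rank, pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            matches.append((rank, match, template))

    # No keyword found: fall back to a continuation statement.
    if not matches:
        return FALLBACKS[turn % len(FALLBACKS)]

    # More than one keyword: pick the one with the highest integer value.
    _, match, template = max(matches, key=lambda item: item[0])

    # Fill the output specification with the matched fragments.
    fragments = [group.strip(" .!?") for group in match.groups()]
    return template.format(*fragments)


if __name__ == "__main__":
    print(respond("I am sad."))
    print(respond("My mother likes computers"))
    print(respond("Hello there"))
```

Note that the program never builds a representation of meaning. It only reacts to surface patterns, which is why it is a useful starting point for discussing what "understanding" requires.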
Representations and Understanding

- Understanding means computing a representation of the meaning of sentences and texts (the notion of representation)
- Why can't we use the sentence itself as a representation of its meaning? Most words have multiple meanings (senses), e.g. "cook", "bank", "still" (verb or noun). Whole sentences can be ambiguous too: "I made her duck."; "I saw a man in the park with a telescope."
- Thus, ambiguity inhibits the system from making the appropriate inferences needed to model understanding. The ambiguity needs to be resolved (disambiguated), e.g. by lexical disambiguation: POS tagging, word-sense disambiguation
- A program must explicitly consider each sense of a word to understand a sentence
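
To make the last point concrete, here is a minimal sketch of what "considering each sense" means for "I made her duck." The toy lexicon `LEXICON`, its sense labels, and the function `candidate_readings` are assumptions invented for this illustration. The sketch simply enumerates every combination of senses; a real system would then prune the combinations that are ungrammatical or implausible.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to its possible senses.
LEXICON = {
    "i": ["pronoun"],
    "made": ["verb:create", "verb:cause"],
    "her": ["pronoun:object", "determiner:possessive"],
    "duck": ["noun:bird", "verb:lower-head"],
}


def candidate_readings(sentence):
    """List every assignment of one sense per word in the sentence."""
    words = sentence.lower().rstrip(".").split()
    sense_lists = [LEXICON[word] for word in words]
    return [list(zip(words, combo)) for combo in product(*sense_lists)]


if __name__ == "__main__":
    readings = candidate_readings("I made her duck.")
    print(len(readings), "candidate readings")
    for reading in readings:
        print(reading)
```

Even with four words the number of candidates is the product of the sense counts (1 x 2 x 2 x 2), which is why disambiguation steps such as POS tagging are applied early.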



- To represent meaning, we must have a more precise language
- Mathematics and logic provide formally specified representation languages (formal languages), built on the notion of an atomic symbol
- Useful representation languages have two properties:
  - They are precise and unambiguous
  - They capture the intuitive structure of the natural language sentences that they represent
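
As a rough illustration of these two properties, the ambiguous sentence "I saw a man in the park with a telescope." can be written as two different sets of facts over atomic symbols. The notation below (tuples of strings, an event symbol `e1`, role names such as `agent` and `instrument`) is an assumption chosen for this sketch, not a standard representation language, and "in the park" is attached to the man in both readings to keep the example small.

```python
# Facts shared by both readings: a seeing event e1 with an agent and a theme.
SHARED = {
    ("see", "e1"),
    ("agent", "e1", "speaker"),
    ("theme", "e1", "man1"),
    ("in", "man1", "park1"),
}

# Reading 1: the telescope is the instrument of the seeing event.
READING_INSTRUMENT = SHARED | {("instrument", "e1", "telescope1")}

# Reading 2: the telescope belongs to the man who was seen.
READING_POSSESSION = SHARED | {("has", "man1", "telescope1")}


def difference(reading_a, reading_b):
    """Return the facts that distinguish two readings."""
    return reading_a ^ reading_b


if __name__ == "__main__":
    for fact in sorted(difference(READING_INSTRUMENT, READING_POSSESSION)):
        print(fact)
```

Each reading is unambiguous on its own, and the shared facts mirror the structure of the sentence; the one English sentence maps to two distinct representations.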
Models and Algorithm

- Toolkit: state machines, formal rule systems, logic, probability theory, machine learning
- State machines: states, transitions among states, and an input representation
- Basic procedural models: deterministic and non-deterministic finite-state automata, finite-state transducers; extended to weighted automata, Markov models, and HMMs
- Formal rules: regular grammars, regular relations, context-free grammars (used for phonology, morphology, syntax)
- These models involve a search through a space of states representing hypotheses about an input: depth-first, best-first, and A* search
- Logic: first-order logic and predicate calculus, semantic networks, and conceptual dependency as logical representations (used for semantics, pragmatics, and discourse)
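
The simplest item in this toolkit, a deterministic finite-state automaton, can be sketched as a transition table plus a loop over the input. The particular automaton here is a made-up example (it accepts strings of the form "b", then two or more "a", then "!"); the state names and the function `accepts` are assumptions for illustration only.

```python
# Transition table of a hypothetical deterministic finite-state automaton.
# Keys are (current state, input symbol); values are the next state.
TRANSITIONS = {
    ("q0", "b"): "q1",
    ("q1", "a"): "q2",
    ("q2", "a"): "q3",
    ("q3", "a"): "q3",  # loop: any number of extra "a" symbols
    ("q3", "!"): "q4",
}
START = "q0"
ACCEPTING = {"q4"}


def accepts(text):
    """Return True if the automaton ends in an accepting state on `text`."""
    state = START
    for symbol in text:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for candidate in ["baa!", "baaaa!", "ba!", "baa"]:
        print(candidate, accepts(candidate))
```

Because every (state, symbol) pair has at most one next state, no search is needed. The non-deterministic and weighted variants listed above are where search strategies such as depth-first or best-first come in.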
Bibliography

- ACL (Association for Computational Linguistics) / EACL
- COLING (International Conference on Computational Linguistics)
- Applied NLP
- Workshop on Human Language Technology
- Journals: CL & NLE
- IEEE ICASSP: Acoustics, Speech and Signal Processing
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IJCAI: International Joint Conference on AI
- Journals: AI, Computational Intelligence, Cognitive Science
TUTORIAL – WEEK 2

- Submit your first-week tutorial
- Present your findings:
  - Definitions (NLP/NLG/NLU) (CL) (LE)
  - History?
  - Key works / research done in the area
  - References
Tutorial 2 - task

- Practical
- Hands-on with the ELIZA system or a similar system
- Software?
- Algorithm?
- Domain?
PROJECT / COURSEWORK

System Development

- Human-Computer Interface
  - Database Retrieval
  - Expert System Interface
  - Word Analyzer (morpheme, suffix)
  - CALL
  - Dictionary & Senses
  - Meaning Postulate
- Back-end Engine
  - Discourse Segmentation (essay, paragraph, sentence, word)
  - POS Recognizer (Malay / English)
  - Information Retrieval (web / book)
  - Frequency-checker
  - Parser
Assessment

- 25% out of 50% (5% tutorial, 20% mid-exam/test)
- Topic: open
- System: 15% (demonstration)
- Report: 10% (SD, TE)
- Use a high-level language (HLL): Java, Prolog, C, and MM software
- Show the NL processor tasks and the interface (major/minor)
- Submit:
  - Week 3/4: topic and proposal
  - Week 9: demonstration and report