Communication & NL
By
Rohana Mahmud
(NLP week 1-2)
Study of Language
Language:
Written: long-term record of knowledge passed from one generation to another
Spoken: primary means of coordinating day-to-day behavior with others
Natural (e.g. Malay, English) vs. artificial (e.g. Java, Prolog, coding)
Communication
Uses signs, natural language, or body language
Sender and Receiver
Studied in several disciplines:
Linguists: the structure of language
Psycholinguists: the processes of human language production and comprehension
Philosophers: how words can mean anything and how they identify objects in the world; what it means to have beliefs, goals and intentions; how cognitive capabilities relate to language
Computational linguists (CL): developing a computational theory of language, using the notions of algorithms and data structures from computer science (CS)
History
Speech & Language Processing:
overlapping fields:
Computational Linguistics (Linguistics)
Natural Language Processing (CS)
Speech Recognition (EE)
Computational Psycholinguistics (Psychology)
History
1940s & 1950s: Foundational insights
The automaton and probabilistic / information-theoretic models (Turing 1936, Shannon 1948)
Formal language theory (Chomsky 1956)
Probabilistic algorithms
1957-1970: Two camps, symbolic and stochastic
1970-1983: Four paradigms: stochastic, logic-based, NLU (SHRDLU by Winograd, LUNAR by Woods), discourse modelling (Grosz) & Belief-Desire-Intention
Current: The field comes together, with probabilistic and data-driven models, increases in the speed and memory of computers, applications to Augmentative and Alternative Communication (AAC), and the rise of the Web, which creates a need for language-based IR & IE.
Goal
Scientific goal: cognitive science, interdisciplinary research
Technological/practical goal: NLP could revolutionize the way computers are used. Computers that understand NL could access all information (human knowledge)
NL interfaces to computers: allow complex systems to be accessible to everyone, and to be more flexible and intelligent
Speech and Language Processing
Understanding humans: speech recognition & NLU (lip-reading)
Communicating with humans: NLG & speech synthesis
Information retrieval, information extraction, inference (drawing conclusions based on facts)
Spelling correction, grammar checking, machine translation
Data processing vs. language processing (language processing requires knowledge of language, e.g. what it means to be a word)
Application of NLU
It represents the meaning of sentences in some
representation language that can be used later
for further processing
Text-based applications
Written text processing (books, newspapers, reports, manuals, email, SMS) = reading-based tasks
Searching/finding from database of text
Extracting information from text
Translating documents
Summarizing texts for a certain purpose
Story understanding
Application of NLU
Dialogue-based applications: involve human-machine communication (spoken / keyboard / mouse / recognizer)
Q&A systems, e.g. querying a database
Automated customer service (phone)
Tutoring systems (interaction with students)
Spoken language control of machines
General cooperative problem-solving systems
Speech recognition is not the same as a language understanding system (a speech recognizer only identifies the words spoken from a given speech signal, not how the words are used to communicate)
Discuss ELIZA system
ELIZA system
Mid-1960s, MIT (Weizenbaum, 1966): the system plays a therapist and the user plays a patient
Algorithm:
Has a database of particular words (keywords)
For each keyword, it stores an integer, a pattern to match against the input, and a specification of the output
Given a sentence S, find a keyword in S whose pattern matches S
If more than one keyword matches, pick the one with the highest integer value
Use the output specification associated with this keyword to generate the next sentence
If there are no keywords, generate an innocuous continuation statement, e.g. "Tell me more", "Go on" (Figures 1.2 and 1.3 in Allen)
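The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not Weizenbaum's program: the three rules, their integer ranks, the regular-expression patterns and the names `Rule`, `RULES`, `FALLBACKS` and `respond` are all made up for the example, and the pronoun swapping that the real ELIZA performs (e.g. "my" to "your") is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    keyword: str   # word or phrase to look for in the input
    rank: int      # integer stored with the keyword; higher wins
    pattern: str   # regular expression matched against the whole input
    response: str  # output specification; {0} is filled from the match


# Hypothetical keyword database, invented for this sketch.
RULES = [
    Rule("mother", 3, r".*\bmy mother\b(.*)", "Tell me more about your family."),
    Rule("i am", 2, r".*\bi am (.*)", "How long have you been {0}?"),
    Rule("always", 1, r".*\balways\b.*", "Can you think of a specific example?"),
]

# Innocuous continuations used when no keyword applies.
FALLBACKS = ["Tell me more.", "Go on."]


def respond(sentence, turn=0):
    """Return a reply to one input sentence."""
    text = sentence.lower().strip().rstrip(".!?")
    best = None
    for rule in RULES:
        if rule.keyword in text:
            match = re.match(rule.pattern, text)
            # Keep the matching rule with the highest integer value.
            if match and (best is None or rule.rank > best[0].rank):
                best = (rule, match)
    if best is None:
        return FALLBACKS[turn % len(FALLBACKS)]
    rule, match = best
    groups = [g.strip() for g in match.groups()]
    return rule.response.format(*groups)


print(respond("I am unhappy."))
print(respond("My mother is always angry"))
print(respond("Hello"))
```

The second input contains two keywords ("mother" and "always"); the rule with the larger integer is chosen. Nothing in the sketch represents what the sentences mean, which is the point of discussing ELIZA before turning to representations.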
Representations and Understanding
Computing a representation of the meaning of sentences
and texts (Notion of representation)
Why can't we use the sentence itself as a representation of its meaning? Most words have multiple meanings (senses), e.g. cook, bank, still (verb or noun). Whole sentences can be ambiguous too: "I made her duck." "I saw a man in the park with a telescope."
Thus, ambiguity inhibits the system from making the appropriate inferences needed to model understanding (it needs to be resolved, i.e. disambiguated, e.g. by lexical disambiguation: POS tagging, word-sense disambiguation)
A program must explicitly consider each sense of a word to understand a sentence
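To see why each sense has to be considered explicitly, the following sketch enumerates every combination of word senses for "I made her duck." The lexicon and its sense labels are hypothetical, invented for this example rather than taken from any dictionary, and the function `readings` is likewise an assumed name.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to its candidate senses.
LEXICON = {
    "i": ["PRON"],
    "made": ["VERB:create", "VERB:cook", "VERB:cause"],
    "her": ["PRON:object", "DET:possessive"],
    "duck": ["NOUN:bird", "VERB:lower-head"],
}


def readings(sentence):
    """Return every assignment of one sense per word."""
    words = sentence.lower().rstrip(".").split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, combo)) for combo in product(*options)]


all_readings = readings("I made her duck.")
print(len(all_readings))  # 1 * 3 * 2 * 2 = 12 combinations
for reading in all_readings[:3]:
    print(reading)
```

Most of the 12 combinations are not sensible readings. Lexical disambiguation (POS tagging and word-sense disambiguation) is what prunes this set down to the few a human reader would accept.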
Representing meaning: must have a more precise language
Mathematics and logic use formally specified representation languages (formal languages), built on the notion of an atomic symbol
Useful representation languages have 2
properties:
Precise and unambiguous
Capture the intuitive structure of the natural language
sentences that it represents
Models and Algorithms
Toolkit: state machines, formal rule systems, logic, probability theory, machine learning
State machines: states, transitions among states, and an input representation
Basic procedural models: deterministic and non-deterministic finite-state automata, finite-state transducers -> weighted automata, Markov models, HMMs
Formal rules: regular grammars, regular relations, context-free grammars (phonology, morphology, syntax)
These models involve a search through a space of states representing hypotheses about an input: depth-first, best-first and A* search
Logic: first-order logic and predicate calculus, semantic networks and conceptual dependency; logical representation (semantics, pragmatics and discourse)
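As a small illustration of the first item in the toolkit, here is a sketch of a deterministic finite-state automaton. The toy language (a "b", then two or more "a"s, then "!") and the names `TRANSITIONS`, `START`, `ACCEPT` and `accepts` are assumptions chosen for the example; the slides do not prescribe a particular automaton.

```python
# Transition table: (current state, input symbol) -> next state.
# Any missing entry means the automaton rejects.
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,  # loop: any number of extra a's
    (3, "!"): 4,
}
START = 0
ACCEPT = {4}


def accepts(string):
    """Run the automaton over the input, one symbol per transition."""
    state = START
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False  # no transition defined for this symbol
    return state in ACCEPT


for s in ["baa!", "baaaa!", "ba!", "baa", "abaa!"]:
    print(s, accepts(s))
```

Because the automaton is deterministic, each input symbol leads to at most one next state and no search is needed. A non-deterministic automaton may offer several next states, which is where the depth-first, best-first and A* strategies listed above come in.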
Bibliography
ACL (Association for Computational Linguistics) / EACL
COLING (International Conference on Computational Linguistics)
Applied NLP
Workshop on Human Language Technology
Journals: Computational Linguistics (CL) & Natural Language Engineering (NLE)
IEEE ICASSP: Acoustics, Speech and Signal Processing
IEEE Transactions on Pattern Analysis and Machine Intelligence
IJCAI: International Joint Conference on AI
Journals: Artificial Intelligence, Computational Intelligence, Cognitive Science
TUTORIAL – WEEK 2
Submit your first-week tutorial
Present your findings
Definitions (NLP/NLG/NLU) (CL) (LE)
History?
Key works / research done in the area
References
Tutorial 2 - task
Practical
Hands-on with ELIZA system or similar
system
Software?
Algorithm?
Domain?
PROJECT / COURSEWORK
System Development
Human-Computer Interface
Database Retrieval
Expert System Interface
Word Analyzer (morpheme, suffix)
CALL
Dictionary & Senses
Meaning Postulate
Back-end Engine
Discourse Segmentation (Essay – Paragraph – sentence – word)
POS Recognizer (Malay / English)
Information Retrieval (web/ book)
Frequency-checker
Parser
Assessment
25 % out of 50% (5% - tutorial, 20% mid-exam/test)
Topic = Open
System = 15 % (Demonstration)
Report = 10% (SD, TE)
Using an HLL (Java, Prolog, C) & MM software
Showing the NL processor tasks and the interface (major/minor)
Submit:
Week 3/4 – topic & proposal
Week 9 – demonstration & report