CHAPTER 9
LANGUAGE PROCESSING:
HUMANS AND COMPUTERS
(363-408)
PowerPoint by Don L. F. Nilsen
to accompany
An Introduction to Language (8e, 2007)
by Victoria Fromkin, Robert Rodman
and Nina Hyams
BOTTOM-UP AND TOP-DOWN PROCESSING


Bottom-up processing relates to
decoding. You start with the actual
sounds, letters, morphemes, etc. and
figure out the words, phrases,
clauses, sentences, paragraphs, etc.
Top-down processing is based on
reasoning. You make a generalization
and see how well the sounds, letters,
morphemes, etc. support your
generalization.
 (Fromkin Rodman Hyams 369)

Top-down reasoning is powerful,
but it can be dangerous if it is
not accompanied by bottom-up
reasoning.
For example, Otto Jespersen
assumed that men were better
thinkers than women.
He conducted an experiment in
which men and women read a
story and were given a quiz.

The women responded more quickly
and more accurately than the men,
which was not what Jespersen had
expected.
So he concluded that women’s minds
have “vacant chambers” that men’s
minds don’t have.
This allowed Jespersen to account for
his evidence while at the same time
not disproving his original hypothesis
that men were better thinkers than
women.
COMPUTER WORDS AND METAPHORS


COMPUTER WORDS: bits, bytes, code police,
cyberspace, future shock, hackers, hard
copy, menu, third wave, user-friendly
COMPUTER METAPHORS: 42, bug, cookies,
GIGO, great runes, heavy wizardry, Lotus
Software, Melvyl (California Library
System), a sagan, snail mail, Sorcerer’s
Apprentice mode, Trojan Horse, UTSL,
vulcan nerve pinch, web, YABA compatible
CONCORDANCE


A Concordance checks for word
frequency and word associations.
Word associations are
determined by providing a
window of three words (more or
less) on each side of the targeted
word.
 (Fromkin Rodman Hyams 379)
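
The idea of a concordance can be sketched in a few lines of Python. This is only an illustration, not the method of any particular concordance program: the sample text and the function names (tokenize, concordance, associations) are made up for the example, and the window of three words follows the description above.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, target, window=3):
    """Return (left context, target, right context) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, tok, right))
    return hits

def associations(tokens, target, window=3):
    """Count the words that fall inside the window around the target."""
    counts = Counter()
    for left, _, right in concordance(tokens, target, window):
        counts.update(left + right)
    return counts

# Made-up sample text, used only to exercise the functions.
sample = "The doctor saw the nurse. The nurse called the doctor about the flower."
tokens = tokenize(sample)
frequency = Counter(tokens)            # word frequency
neighbors = associations(tokens, "doctor")  # word associations

print(frequency.most_common(3))
print(neighbors.most_common(3))
```

Changing the window argument makes the window wider or narrower, which is the "more or less" in the description.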
ELIZA


ELIZA is a program designed to pass
the Turing test. If a person
communicating with ELIZA cannot tell
whether or not ELIZA is a human,
then ELIZA passes the Turing test.
ELIZA claims to be a psychiatrist
talking to a patient. ELIZA, like a
psychiatrist, is attempting to get
rather than give information:

PATIENT: Men are all alike.
ELIZA: In what way?
PATIENT: They’re always bugging me
about something or other.
ELIZA: Can you think of a specific
example?
PATIENT: Well, my boyfriend made
me come here.
ELIZA: Your boyfriend made you come
here?
PATIENT: He says I’m depressed
much of the time.
ELIZA: I am sorry to hear that you are
depressed.
 (Fromkin Rodman Hyams 389)

Victor Raskin fooled ELIZA by
saying, “I don’t like my mother
and I don’t want to talk about
her.”
ELIZA responded, “Tell me more
about your mother.”
 (Nilsen Nilsen & Combs 199)
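
Why ELIZA can be fooled this way is easier to see in a small sketch. The rules below are hypothetical and are not the original ELIZA script; they only assume that the program reacts to keywords without understanding the rest of the sentence.

```python
import re

# Hypothetical keyword rules (pattern, response template).
# These are illustrations only, not the rules of the original ELIZA.
RULES = [
    (re.compile(r"\bmy (mother|father|boyfriend)\b", re.I),
     "Tell me more about your {0}."),
    (re.compile(r"\bi am (.+)", re.I),
     "How long have you been {0}?"),
    (re.compile(r"\balways\b", re.I),
     "Can you think of a specific example?"),
]
DEFAULT = "Please go on."

def respond(utterance):
    """Answer with the first rule whose keyword pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Strip final punctuation from any captured words.
            groups = [g.rstrip(".!?") for g in match.groups()]
            return template.format(*groups)
    return DEFAULT

# The keyword "my mother" fires even though the speaker refuses the topic.
print(respond("I don't like my mother and I don't want to talk about her."))
```

Because the first rule looks only for the words "my mother," the refusal in the second half of the sentence is never examined.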
CIRCUIT FIX-IT SHOP





Twenty-five years after ELIZA, a program
named “Circuit Fix-It Shop” was developed:
COMPUTER: This is the circuit fix-it shop.
How may I help you?
TECHNICIAN: I need to fix the RS111 circuit.
COMPUTER: I am familiar with that circuit.
The LED is supposed to be displaying
alternately blinking one and seven.
TECHNICIAN: Where is the LED?

COMPUTER: In the middle right of the
blue region is the LED.
TECHNICIAN: No light.
COMPUTER: What is the switch at
when the LED is off?
TECHNICIAN: The switch is down.
COMPUTER: You put up the switch.
 (Fromkin Rodman Hyams 390)
MORE SOPHISTICATED PROGRAMS


Today more sophisticated programs
are needed. One such program is the
little Paperclip guy that answers
questions in Microsoft Word.
Another sophisticated program is
“Script Model Grammar” designed by
Roger Schank and Robert Abelson and
modified by linguist Victor Raskin and
others at Purdue University and
elsewhere.
SAM: SCRIPT APPLIER MECHANISM


Of course sentences need to be
parsed in Artificial Intelligence. But
constituents larger than a sentence
must be parsed as well.
One of the devices for doing this
discourse parsing is the “Script
Applier Mechanism” devised by Roger
Schank and Robert Abelson and
modified for humor and language play
by Victor Raskin and others.

Note that a play or a movie has a
script for the actors to follow.
The script in Artificial
Intelligence is the same, but it is
much simpler. It is a “mundane
script.”
The “Restaurant Script,” for
example, involves a customer, a
server, a cashier, etc.

Props in the “Restaurant Script” include the
restaurant, the table, the menu, the food, the check,
the payment, the tip, etc.
The sequence of actions is as follows:
1. Customer goes to restaurant.
2. Customer goes to table.
3. Server brings menu.
4. Customer orders food.
5. Server brings food.
6. Customer eats food.
7. Server brings check.
8. Customer leaves tip for server.
9. Customer gives payment to cashier.
10. Customer leaves restaurant.
 (Hendrix and Sacerdoti 654)
 (Nilsen Nilsen & Combs 199)

There are two exciting things about
the Script Applier Mechanism. First, it
will be able to spot anything that is
missing, added, or out of place in the
sequence of events and ask, “What’s
up?”
Second, it is able to handle two scripts
at the same time, so that it is capable
of dealing with jokes, language play,
satire, irony, sarcasm, parody,
paradox and double entendre in
general.
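
The first of these two abilities can be sketched as follows. This is a simplified illustration and not the Schank and Abelson program: the event strings come from the Restaurant Script above, while check_story and the sample story are made up for the example. Handling two scripts at once is not attempted here.

```python
RESTAURANT_SCRIPT = [
    "customer goes to restaurant",
    "customer goes to table",
    "server brings menu",
    "customer orders food",
    "server brings food",
    "customer eats food",
    "server brings check",
    "customer leaves tip for server",
    "customer gives payment to cashier",
    "customer leaves restaurant",
]

def check_story(story, script=RESTAURANT_SCRIPT):
    """Compare the events of a story with a script and report deviations."""
    position = {event: i for i, event in enumerate(script)}
    added = [e for e in story if e not in position]
    known = [e for e in story if e in position]
    missing = [e for e in script if e not in known]
    # An event is out of place if it shows up after an event
    # that the script puts later in the sequence.
    out_of_place = []
    highest = -1
    for e in known:
        if position[e] < highest:
            out_of_place.append(e)
        else:
            highest = position[e]
    return {"missing": missing, "added": added, "out_of_place": out_of_place}

# Made-up story with one added event and one event out of order.
story = [
    "customer goes to restaurant",
    "customer goes to table",
    "customer eats food",
    "customer orders food",
    "customer juggles plates",
    "customer leaves restaurant",
]
report = check_story(story)
for kind, events in report.items():
    print(kind, events)
```

Any non-empty list in the report is a point where the program could ask, “What’s up?”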
PARSING PROBLEMS





GARDEN PATH:
The horse raced past the barn fell.
After the child visited the doctor prescribed a course
of injections.
The doctor said the patient will die yesterday.
 (Fromkin Rodman Hyams 365, 373)
EMBEDDING: “Never imagine yourself not to be
otherwise than what it might appear to others…to be
otherwise.”
 (Lewis Carroll’s Alice’s Adventures in Wonderland)
 (Fromkin Rodman Hyams 365)
RIGHT-BRANCHING VS. EMBEDDING



RIGHT BRANCHING: This is the dog that
worried the cat that killed the rat that ate
the malt that lay in the house that Jack
built.
EMBEDDING: Jack built the house that the
malt that the rat that the cat that the dog
worried killed ate lay in.
NOTE: Multiple embedding is OK for a
computer, but not OK for the human brain.
 (Fromkin Rodman Hyams 373-374)

ANOMALOUS WORDS: A sniggle blick
is procking a slar.
 (Fromkin Rodman Hyams 368)
METANALYSIS (incorrect phrase
breaking):
grade A vs. grey day
night rate vs. nitrate
 (Fromkin Rodman Hyams 370)
NOTE: English “adder” and “apron”
arose from incorrect breaks: Middle
English “a nadder” became “an adder,”
and “a napron” (from Old French
“naperon”) became “an apron.”

AMBIGUOUS SYNTAX IN NEWSPAPER
HEADLINES:

Teacher Strikes Idle Kids

Enraged Cow Injures Farmer with Ax


Killer Sentenced to Die for Second
Time in 10 Years
Stolen Painting Found by Tree
 (Fromkin Rodman Hyams 372)
REAL-WORLD KNOWLEDGE




Explain why the following sentences are
ambiguous to a computer but not to a
human:
A cheesecake was on the table. It was
delicious and was soon eaten.
SIGN IN A CHURCH: For those of you who
have children and don’t know it, we have a
nursery downstairs.
NEWSPAPER AD: Our bikinis are exciting;
they are simply the tops.
 (Fromkin Rodman Hyams 403)
ANTISMOKING CAMPAIGN SLOGAN:
 It’s time we make smoking history.
 Do you know the time?
 Concerned with spreading violence,
the president called a press
conference.
 The ladies of the church have cast off
clothing of every kind and they may
be seen in the church basement
Friday.
 (Fromkin Rodman Hyams 403)
AMBIGUOUS NEWSPAPER HEADLINES

Red Tape Holds Up New Bridge

Kids Make Nutritious Snacks

Sex Education Delayed, Teachers
Request Training
 (Fromkin Rodman Hyams 403)
SEMANTIC PRIMING



In the human brain, the word “doctor” is
more easily and more completely processed
if it is preceded by “nurse” than if it is
preceded by “flower.”
This is because “doctor” and “nurse” “are
located in the same part of the mental
lexicon.”
 (Fromkin Rodman Hyams 371)
This same feature could easily be built into
Artificial Intelligence.
SPEECH RECOGNITION
& SPEECH SYNTHESIS



“Computational phonetics and phonology has two
concerns. The first is with programming computers
to analyze the speech signal into its component
phones and phonemes.
The second is to send the proper signals to an
electronic speaker so that it enunciates the phones of
the language and combines them into morphemes
and words.
The first of these is speech recognition; the second is
speech synthesis.”
 (Fromkin Rodman Hyams 384)

“Machines which…imitate human
speech, are the most difficult to
construct, so many are the
agencies engaged in uttering even
a single word—so many are the
inflections and variations of tone
and articulation, that the
mechanician finds his ingenuity
taxed to the utmost to imitate
them.”
 (Fromkin Rodman Hyams 385)

TO SYNTHESIZE SPEECH:
1. Start with a tone at the same frequency as vibrating
vocal cords (higher if a woman’s or child’s voice is being
synthesized, lower for a man’s)

2. Emphasize the harmonics corresponding to the formants
required for a particular vowel, liquid, or nasal quality.

3. Add hissing or buzzing for fricatives.

4. Add nasal resonances for nasal sounds.


5. Temporarily cut off sound to produce stops and
affricates….

(Fromkin Rodman Hyams 386)
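
Steps 1 and 2 above can be sketched in Python. This is a rough illustration under stated assumptions, not a real speech synthesizer: the formant values (700, 1200, and 2600 Hz), the bandwidth, the roll-off of higher harmonics, and the function names are all illustrative choices rather than measurements or a published method. Steps 3 through 5 are not covered.

```python
import math

def harmonic_weights(f0, formants, bandwidth, rate):
    """Weight each harmonic of f0 by how close it lies to a formant."""
    n_harmonics = int((rate / 2) // f0)  # no harmonic above rate / 2
    weights = []
    for k in range(1, n_harmonics + 1):
        freq = k * f0
        # Resonance-like boost: largest when freq is near a formant.
        boost = sum(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2)
                    for f in formants)
        weights.append(boost / k)  # assumed roll-off for higher harmonics
    return weights

def synthesize_vowel(f0=120.0, formants=(700.0, 1200.0, 2600.0),
                     bandwidth=100.0, duration=0.2, rate=16000):
    """Step 1: a tone at f0. Step 2: emphasize harmonics near the formants."""
    weights = harmonic_weights(f0, formants, bandwidth, rate)
    total = sum(weights)
    samples = []
    for n in range(int(duration * rate)):
        t = n / rate
        value = sum(w * math.sin(2 * math.pi * (k + 1) * f0 * t)
                    for k, w in enumerate(weights))
        samples.append(value / total)  # keep every sample within [-1, 1]
    return samples

# A lower f0 suggests a man's voice; a higher f0 a woman's or child's.
samples = synthesize_vowel(f0=120.0)
print(len(samples), "samples generated")
```

The samples could be written to a sound file for listening; that step is left out to keep the sketch short.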
A Sound Spectrogram will give an indication of some of the
variables of analyzing or synthesizing speech:
SOUND SPECTROGRAM
(Fromkin Rodman Hyams 366)
SPELL CHECKER
I have a spelling checker.
 It came with my PC.
 It plane lee marks four my revue
 Miss steaks aye can knot sea.
 (Fromkin Rodman Hyams 381)


Explain why the spell checker is
not working in the poem above.
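
A short sketch shows the problem. It assumes that a basic spell checker only looks each word up in a word list; the tiny DICTIONARY below is a made-up stand-in for a full dictionary, and misspelled is a name chosen for the example.

```python
import re

# A tiny stand-in word list; a real checker would load a full dictionary.
DICTIONARY = {
    "i", "have", "a", "spelling", "checker", "it", "came", "with", "my",
    "pc", "plane", "lee", "marks", "four", "revue", "miss", "steaks",
    "aye", "can", "knot", "sea",
    "plainly", "for", "review", "mistakes", "cannot", "not", "see",
}

def misspelled(text, dictionary=DICTIONARY):
    """Return the words of the text that are not in the dictionary."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in dictionary]

poem = "It plane lee marks four my revue. Miss steaks aye can knot sea."

# Every word in the poem is a real word, so nothing is flagged.
print(misspelled(poem))

# A genuine non-word is caught.
print(misspelled("It plainly marks for my reveiw."))
```

The checker tests spelling one word at a time, so it cannot notice that a correctly spelled word is the wrong word for its context.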
THEORIES AND MODELS


In The Physicist’s Conception of Nature,
Manfred Eigen said, “A theory has only the
alternative of being right or wrong. A model
has a third possibility: it may be right, but
irrelevant.”
 (Fromkin Rodman Hyams 397)
Explain why a theory for Artificial
Intelligence must be rigorous and at the
same time allow for language play. In AI,
are rigor and language play compatible
concepts or not?
TRANSLATION

“Translation is more than word-for-word
replacement. Often there is no
equivalent word in the target
language, and the order of words may
differ, as in translating from an SVO
language like English to an SOV
language like Japanese. There is also
difficulty in translating idioms,
metaphors, jargon, and so on.”
 (Fromkin Rodman Hyams 382)

“Machine translation is often impeded
by lexical and syntactic ambiguities,
structural disparities between the two
languages, morphological
complexities, and other cross-linguistic
differences.”
 (Fromkin Rodman Hyams 382)
In the following examples consider
what information must be taken into
consideration for better machine
translation:

BUCHAREST HOTEL: The lift is being fixed for the next
day. During that time we regret that you will be
unbearable.

SWISS NUNNERY HOSPITAL: The nuns harbor all
diseases and have no respect for religion.

GERMAN HOTEL: All the water has been passed by the
manager.


ZURICH HOTEL: Because of the impropriety of
entertaining guests of the opposite sex in the bedroom,
it is suggested that the lobby be used for this purpose.
TURKEY: The government bans the smoking of children.
 (Fromkin Rodman Hyams 382)
Having Fun
with Computer
Terminology

1024



When Alan Schoenfeld of the
University of California at Berkeley
attended a conference on Artificial
Intelligence, he was given Hotel
Room Number 1024.
“Wow!” he said.
“1024 is 2 to the tenth power. It is
the number of bytes in a kilobyte.”
 (Nilsen & Nilsen 98)
ACRONYMS




Acronyms are so common in computer
terminology that programmers make
fun of them.
“TLA” stands for “Three Letter
Acronym.”
“YABA” stands for “Yet Another
Bloody Acronym.”
“YABA Compatible” means that the
initials can be pronounced easily and
are not obscene.
 (Nilsen & Nilsen 99)
CHAT GROUPS











Linguist Susan Herring at the University of Texas,
Arlington studied the humor in chat groups. Her
results were as follows:
imaginary situations: 20 percent
a mock persona: 14 percent
teasing: 13 percent
irony: 6 percent
name play: 5 percent
silliness: 4 percent
real situations: 3 percent
riddles: 2 percent
pretended misunderstandings: 2 percent
puns: 1 percent
 (Nilsen & Nilsen 167)
EMOTICONS










In conversation we can show our emotions, but on
the internet this is difficult, so we use emoticons:
:-) Smiling
:-)))))))))) Really Smiling
;-) Winking
:-* Kissing
|-O Yawning
:-& Tongue-Tied
:’-{ Crying
:-/ Undecided
:-|| Angry
 (Nilsen & Nilsen 100)
SCIENCE FICTION AND FANTASY






Many computer terms come from Science Fiction and
Fantasy:
A huge network packet is a “Godzillagram” from
Godzilla
Teenage hackers are “Munchkins” from The Wizard of
Oz
A mischievous program is called a “wabbit” from
Elmer Fudd’s “You wascawwy wabbit.”
A program that repeats itself indefinitely is said to be
in “Sorcerer’s Apprentice Mode” from Fantasia
The answer to life, the universe, and everything is
“42,” from a computer in Douglas Adams’ novel The
Hitchhiker’s Guide to the Galaxy.
 (Nilsen & Nilsen 99)

When someone goes onto the internet
to get information that is easily
available from a manual, etc. the Cyber
Police might say, “UTSL.” This means
“Use the Source, Luke!” from Star Wars.
Another term from Star Wars is an
“Obi-Wan Error.” This comes from the
name “Obi-Wan Kenobi” and refers to
an “off-by-one error.” A similar
one-step shift appears in 2001: A
Space Odyssey, where the computer is
named “HAL”: each of its letters comes
one place before I, B, and M.
 (Nilsen & Nilsen 99)
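
Both ideas fit in a few lines of Python. The function names are made up for the example: shift_letters shows the one-letter shift between “IBM” and “HAL,” and the two summing functions show a typical off-by-one error in a loop range.

```python
def shift_letters(word, offset):
    """Shift each uppercase letter by offset places, wrapping within A-Z."""
    return "".join(
        chr((ord(c) - ord("A") + offset) % 26 + ord("A")) for c in word
    )

def sum_first_n_buggy(n):
    """Off-by-one error: range(1, n) stops at n - 1, so n is left out."""
    return sum(range(1, n))

def sum_first_n(n):
    """Corrected version: range(1, n + 1) includes n."""
    return sum(range(1, n + 1))

print(shift_letters("IBM", -1))   # one letter back from I, B, and M
print(sum_first_n_buggy(10), sum_first_n(10))
```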

In computer terminology a soft boot refers
to the hitting of “Control,” “Alternate” and
“Delete” at the same time.
This is referred to as the “Vulcan Nerve
Pinch” from Star Trek.
“Droid” from “Android” has become a suffix
in such words as “trendroids,” who follow
trends, and “sales droids,” who promise
customers things that cannot be delivered or
are useless.
The “code police” and “net police” are
named after the “thought police” in George
Orwell’s 1984.
SIGNATURES


People like to create enigmatic and
puzzling signatures. One user named
Eddie follows his signature with “Ceci
n’est pas une signature.”
This is an allusion to a painting of a
pipe by René Magritte with the
disclaimer, “Ceci n’est pas une pipe.”
 (Nilsen & Nilsen 166)
TEXT MESSAGING








Since numbers and letters require
more than a single stroke on cell
phones, acronyms are often used:
AFAIK: As far as I know
BTW: By the way
CUL or CUL8R: See you later
GIGO: Garbage In Garbage Out
GFR: Grim File Reaper
LOL: Lots of Laughs
OIC: Oh, I see

POS: Parent Over Shoulder
ROTF: Rolling on the Floor
ROTFLMAO: Rolling on the Floor
Laughing My Ass Off
RUOK: Are you OK?
TIA: Thanks in Advance
WYSIWYG: What You See Is What You
Get
and
BCNU: Be Seein’ you
 (Nilsen & Nilsen 99)
TWENTE, NETHERLANDS
• There is an annual
workshop on Language Technology
at the University of Twente.
• In 1996 this workshop was devoted
to “Automatic Interpretation and
Generation of Verbal Humor.”
• The papers at this conference had
such titles as:

“Why do People Use Irony?”
“Password Swordfish: Verbal Humour in the
Interface.”
“Computer Implementation of the General Theory of
Verbal Humor.”
“Humor Theory beyond Jokes.”
“Speculations on Story Puns.”
“Relevance Theory and Humorous Interpretations.”
“What Sort of a Speech Act is the Joke?”
“A Neural Resolution of the Incongruity-Resolution
Theory of Humor”
“Humorous Analogy: Modeling the Devil’s Dictionary.”
“Why Is a Riddle Not Like a Metaphor?” and
“An Attempt at Natural Humor from a Natural
Language Robot.”
 (Nilsen & Nilsen 98)
VIRUS JOKES


AT&T Virus: Every three minutes
it tells you what great service
you are getting.
MCI Virus: Every three minutes it
reminds you that you’re paying
too much for the AT&T virus.

Paul Revere Virus: This
revolutionary virus does not
horse around. It warns you of
impending hard disk attack—once
if by LAN, twice if by C:\>.
New World Order Virus: Probably
harmless, but it makes a lot of
people really mad just thinking
about it.
 (Nilsen & Nilsen 177)
KURT VONNEGUT ON THE INTERNET



In August of 1997 a piece appeared
on the Internet by Kurt Vonnegut.
When Vonnegut’s wife was given a
copy of the article, she was so pleased
with her clever husband that she
forwarded a copy to their children.
Vonnegut said that it was “funny and
wise and charming,” but he said he
never wrote it.

The article had actually been published by Mary
Schmich in the Chicago Tribune and then picked up
and redistributed by a computer hacker.
Ian Fisher of The New York Times said that as long as
readers thought the piece was Vonnegut’s, they
viewed the Internet as a wonderful tool that could
keep people in touch with each other.
But when they learned it was a hoax, their perception
of the internet changed. The internet was now an
unreliable hotbed of hoaxes and wild-eyed
conspiracies.
Probably both opinions are true.
 (Nilsen & Nilsen 168)
References # 1.
Clark, Virginia, Paul Eschholz, and Alfred Rosa. Language: Readings in
Language and Culture, 6th Edition. New York, NY: St. Martin’s
Press, 1998.
English, Katharine, ed. Most Popular Web Sites: The Best of the Net
from A2Z. Indianapolis, IN: Lycos Press, 1996.
Fromkin, Victoria, Robert Rodman, and Nina Hyams. “Language
Processing: Humans and Computers.” An Introduction to
Language, 8th Edition. Boston, MA: Thomson Wadsworth, 2007,
363-408.
Gralla, Preston. How the Internet Works. Emeryville, CA: Ziff-Davis
Press, 1997.
Hendrix, Gary G., and Earl D. Sacerdoti. “Natural-Language Processing:
The Field in Perspective.” in Language: Introductory Readings, 4th
edition. Eds. Virginia P. Clark, Paul A. Eschholz and Alfred F. Rosa.
New York, NY: St. Martin’s, 1985.
Hulstijn, J., and A. Nijholt eds. Twente Workshop on Language
Technology 12: Automatic Interpretation and Generation of Verbal
Humor. Twente, Netherlands: Univ of Twente Dept of Computer
Science, 1996.

References # 2:
Nilsen, Alleen Pace, and Don L. F. Nilsen. “Computer
Humor,” and “Internet Influences.” Encyclopedia of
20th Century American Humor. Westport, CT:
Greenwood, 2000, 97-100 and 165-168.
Nilsen, Don L. F., Alleen Pace Nilsen, and Nathan H.
Combs. “Teaching a Computer to Speculate.”
Computers and the Humanities. 22 (1988): 193-201.
Nilsen, Kelvin, and Alleen Pace Nilsen. “Literary
Metaphors and Other Linguistic Innovations in
Computer Language” (Clark, 166-176).
Raskin, Victor. Semantic Mechanisms of Humor. Boston,
MA: Reidel/Kluwer, 1985.
References # 3:
Raymond, Eric S. The New Hacker’s Dictionary, 2nd Edition.
Cambridge, MA: MIT Press, 1993.
Roberts, Steven K. “Artificial Intelligence.” in Writing and
Reading Across the Curriculum, 2nd Edition. Laurence
Behrens and Leonard J. Rosen. Boston, MA: Little, Brown,
1985, 214-222.
Rosch, Eleanor. “On the Internal Structure of Perceptual and
Semantic Categories.” in Cognitive Development and the
Acquisition of Language. Ed. T. Moore. New York, NY:
Academic Press, 1973.
Schank, Roger C., and Robert Abelson. Scripts, Plans, Goals,
and Understanding: An Inquiry Into Human Knowledge
Structures. Hillsdale, NJ: Lawrence Erlbaum, 1977.
Siegel, David. Creating Killer Web Sites. Indianapolis, IN:
Hayden Books, 1996.