COM1070: Introduction to
Artificial Intelligence: lecture 3
Yorick Wilks
Computer Science Department
University of Sheffield
www.dcs.shef.ac.uk/~yorick
The Turing Test
There are at least two alternative positions which criticise
AI with respect to the Turing Test:
i. ‘Too hard’ definition of Artificial Intelligence.
Computers not likely to be able to pass the test.
ii. Hollow shell criticism.
Computer may pass test, but computers still won’t be able
to think.
As we shall see (on i), computers aren't doing badly and are
getting better.
On (ii), the answer just begs the question as to what thinking
is--which was Turing's point in the first place!
Turing’s own objections:
Turing considered, and dismissed, some possible
objections to the idea that computers can think.
Some of these objections might still be raised today.
Some objections are easier to refute than others.
Objections considered by Turing:
1. The theological objection
2. The ‘heads in the sand’ objection
3. The mathematical objection
4. The argument from consciousness
5. Arguments from various disabilities
6. Lady Lovelace’s objection
7. Argument from continuity in the nervous system
(8.) The argument from informality of behaviour
(9.) The argument from extra-sensory perception
The theological objection
‘…Thinking is a function of man’s
immortal soul. God has given an
immortal soul to every man and woman,
but not to any other animal or to
machines. Hence no animal or machine
can think…’
BUT: Why not believe that God could
give a soul to a machine if He had
wished to?
Heads in the sand objection
i.e. The consequence of machines
thinking would be too dreadful. Let us
hope and believe that they cannot do so.
This is related to the theological argument;
idea that Humans are superior to the rest
of creation, and must stay so…
‘.. Those who believe in ..(this and the
previous objection).. would probably not
be interested in any criteria.[for deciding if
machines could think].’
The mathematical objection
There are results in mathematical logic which can be
used to show that there are limitations to the
powers of discrete-state machines.
The Halting Problem: will the execution of a
program P eventually halt or will it run for ever?
Turing (1936) proved that for any algorithm H
that purports to solve the halting problem there will
always be a program Pi such that H will not be
able to answer the halting question correctly
(see the sketch below).
Hence certain questions cannot be answered
correctly by any formal system (cf. Gödel).
But, similar limitations may also apply to the
human intellect.
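To make the diagonal argument above concrete, here is a minimal sketch in Python (not from the lecture); halts() is the hypothetical decider assumed for contradiction, not a real function.

```python
# Hypothetical decider, assumed (for contradiction) to return True
# exactly when program(argument) would eventually halt.
def halts(program, argument):
    ...  # no correct implementation can exist -- that is the point

def troublemaker(program):
    # Do the opposite of whatever halts() predicts about the
    # program applied to its own source.
    if halts(program, program):
        while True:   # halts() said "halts", so loop forever
            pass
    else:
        return        # halts() said "runs forever", so halt at once

# Does troublemaker(troublemaker) halt?  If halts() answers True,
# troublemaker loops forever; if it answers False, troublemaker
# halts.  Either way halts() is wrong about some program, so no
# algorithm H can decide halting for every program Pi.
```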
The argument from consciousness
‘…This argument is very well expressed in
Professor Jefferson’s Lister Oration for 1949,
from which I quote. “Not until a machine can
write a sonnet or compose a concerto because
of thoughts and emotions felt, and not by the
chance fall of symbols, could we agree that
machine equals brain – that is not only write it
but know that it had written it. No mechanism
could feel (and not merely artificially signal, an
easy contrivance) pleasure at its successes,
grief when its valves fuse, be warmed by
flattery, be made miserable by its mistakes, be
charmed by sex, be angry or depressed when it
cannot get what it wants”..’
But…...
The only way one could be sure that a machine
thinks is to be that machine and feel oneself
thinking.
- similarly, the only way to be sure someone else
thinks is to be that person.
How do we know that anyone is conscious?
(=solipsism).
Instead, we assume that others can think and are
conscious----it is a polite convention. Similarly,
could assume that machine which passes Turing
test is so too.
Consciousness
Thought and consciousness do not always go
together:
Freud and unconscious thought.
Thought we cannot introspect about (eg
searching for a forgotten name).
Blindsight (Weiskrantz) – removal of visual
cortex, blind in certain areas, but can still
locate spot without consciousness of it.
Arguments from various disabilities
ie 'I grant that you can make machines do all the
things you have mentioned but you will never be
able to make one do X'.
eg be kind, resourceful, beautiful, friendly, have
initiative, have a sense of humour, tell right from
wrong, make mistakes, fall in love, enjoy
strawberries and cream, make someone fall in love
with it, learn from experience, use words properly,
be the subject of its own thought, have as much
diversity of behaviour as a man, do something
really new.
These criticisms are often disguised forms of the
argument from consciousness.
Lady Lovelace’s objection:
(memoir from Lady Lovelace about Babbage’s
Analytical Engine)
Babbage (1791-1871) and the Analytical Engine: a
general-purpose calculator. Entirely mechanical.
The entire contraption was never built – engineering not up
to it, and no electricity!
‘..The Analytical Engine has no pretensions to
originate anything. It can do whatever we know
how to order it to perform..’
Objection: A computer cannot be creative, it cannot
originate anything, only carry out what was given to
it by the programmer.
But…...
computers can surprise their
programmers, ie by producing
answers that were not expected.
Original data may have been given to the
computer, but it may then be able to work
out their consequences and implications
(cf. the level of chess programs relative to their
programmers).
Argument from continuity in the nervous
system
The nervous system is continuous: the digital
computer is a discrete-state machine.
I.e. in the nervous system a small error in
the information about the size of a nervous
impulse impinging on a neuron may make
a large difference to the size of the
outgoing impulse.
Discrete state machines: move by
sudden jumps and clicks from one
state to another. For example,
consider the ‘convenient fiction’ that
switches are either definitely on, or
definitely off.
However, a discrete-state machine can
still give answers that are
indistinguishable from those of a
continuous machine.
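An illustrative sketch (not from the lecture): discretise a continuous response finely enough and the difference falls below any measurement noise. The sigmoid 'neuron' here is just an arbitrary continuous function.

```python
import math

def continuous_response(x):
    # A smooth, continuous input-output curve (an arbitrary sigmoid).
    return 1.0 / (1.0 + math.exp(-x))

def discrete_response(x, levels=10**6):
    # A discrete-state version: only `levels` distinct output states,
    # reached by sudden jumps rather than continuous variation.
    return round(continuous_response(x) * levels) / levels

# With a fine enough grid the two are indistinguishable to any
# observer whose measurements carry more than ~1e-6 of noise.
x = 0.4217
print(abs(continuous_response(x) - discrete_response(x)))  # < 5e-7
```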
Other objections
Copeland (1993) [see 'Artificial Intelligence: a
philosophical introduction'] discusses four further
objections to the Turing Test. The first three of these
he dismisses, and the fourth he incorporates into a
modified version of the Turing Test.
1. Too conservative: Chimpanzee objection
Chimpanzees, dolphins, dogs, and pre-linguistic
infants can all think (?) but could not pass the Turing
Test.
But this only means that the Turing Test cannot be a
litmus test (red = acid; not red = not acid):
- nothing definite follows if a computer/animal/baby
fails the test,
ie a negative outcome does not mean the computer
cannot think.
(In philosophical terms: the TT gives a sufficient, not a
necessary, condition of thought.)
2. Too easy: Sense organs objection
Turing Test focuses only on verbal responses:
the computer is able to use words without really
knowing their meanings.
(like taking a driving test that consists only of
answering questions).
Possible answer: Equip it with sense organs,
and you can then test its knowledge of the world
(remember turning the turtle over in 'Blade
Runner' when detecting robots).
But there are huge numbers of words which can be
investigated without having sense organs, eg
'teasing'.
Think about getting to know people through
webchat--can you?
3. Simulation objection
The simulation of X is never an X.
In the original Victorian parlour game, if a
man misleads the interrogator into thinking
he is a woman, it doesn’t mean he IS a
woman.
But there are two kinds of simulation:
a) simulation1 which lacks essential
features of what is being simulated. Eg in
simulated death the person is not dead.
b) simulation2, exactly like what is being
simulated, but not produced in the standard
way (eg aircraft in wind tunnels).
One could claim that a program that passed
the Turing Test was a simulation2.
4. Black box objection.
This objection is presented by Copeland (1993), and not
rejected by him.
A black box is a device whose inner workings are
unknown. The Turing Test relies only on outward
behaviour.
But one might look inside computer, to see how it ‘thinks’.
ie one might have a program which just has a list of
sentences to be output in response to the sentences it
encounters. All possibilities covered, and the computer
not thinking at all, just matching sentences (see the
sketch below).
Consider Weizenbaum’s ELIZA
Joseph Weizenbaum's program simulating a therapeutic
interviewer has been available for decades:
Men are all alike.
IN WHAT WAY?
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE?
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE?
He says I’m depressed much of the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED.
It’s true that I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU
NOT TO BE UNHAPPY?
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME
HELP?
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY.
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU?
My father.
YOUR FATHER
You are like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE?
You are not very aggressive, but I think you don’t want me
to notice that.
WHAT MAKES YOU THINK I AM NOT VERY
AGGRESSIVE?
You don’t argue with me.
WHY DO YOU THINK I DON’T ARGUE WITH YOU?
You are afraid of me.
DOES IT PLEASE YOU TO THINK I AM AFRAID OF
YOU?
My father is afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF
YOUR FATHER
Bullies
DOES THAT HAVE ANYTHING TO DO WITH THE
FACT THAT YOUR BOYFRIEND MADE YOU COME
HERE?
But, when we look at how Eliza works, this conversation seems
less impressive.
1. Transform input to change point of view,
eg change you to I etc.
2. If a pattern matches, give the stored response.
3. Otherwise give a general-purpose response.
Pattern *need* [can you do without]
eg I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT
SOME HELP?
Pattern *mother* [tell me more about your family]
eg Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY.
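A minimal sketch of this scheme in Python, assuming simplified versions of the two rules above (Weizenbaum's actual script had many more keywords):

```python
import re

# Step 1: reflect the point of view (you -> I, my -> your, ...).
REFLECTIONS = {"i": "you", "me": "you", "my": "your",
               "you": "i", "your": "my", "am": "are"}

def reflect(text):
    return " ".join(REFLECTIONS.get(word, word) for word in text.split())

# Step 2: keyword patterns with stored responses; '(.*)' plays the
# role of '*' in the slide's notation.
RULES = [
    (re.compile(r".*\bi need (.*)"), "What would it mean to you if you got {0}?"),
    (re.compile(r".*\bmother\b.*"), "Tell me more about your family."),
]

def respond(sentence):
    cleaned = re.sub(r"[^\w\s]", "", sentence.lower())
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(*map(reflect, match.groups())).upper()
    return "PLEASE GO ON."  # Step 3: general-purpose fallback

print(respond("I need some help, that much seems certain."))
# WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP THAT MUCH SEEMS CERTAIN?
print(respond("Perhaps I could learn to get along with my mother."))
# TELL ME MORE ABOUT YOUR FAMILY.
```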
Eliza lacks
A means of recognizing the grammatical structure of
language.
A means of converting the user's query into a
representation of the underlying knowledge from which
inferences can be drawn.
But notice here the AI theory
being snuck in……...
We don't have any INDEPENDENT
evidence that people do assign
grammatical structures to what they hear
in dialogue,
nor that they use a great deal of
background knowledge when carrying on
a superficial conversation…
Consider PARRY, a much better
program:
PARRY had far more to say than ELIZA
and wanted to tell you things
It modelled a psychiatrically disordered
person (PARanoid), and hence it was
easy to excuse it when it didn't make
sense.
It remembered what you had said and
got ‘annoyed’ and ‘frightened’
It was claimed it had passed the TT.
Parry, the paranoid program:
Colby, K.M., Weber, S., Hilf, F.D. (1971)
Artificial Paranoia, Artificial Intelligence,
2, 1-25.
The computer is in capital letters
Why are you in the hospital?
I SHOULDN’T BE HERE.
Who brought you here?
THE POLICE.
What trouble did you have with the police?
COPS DON'T DO THEIR JOB.
What do they do instead?
COPS ARREST THE WRONG PEOPLE.
They arrest the right ones sometimes.
THE LAW SHOULD GET THOSE ITALIAN
CROOKS.
Are you Italian?
MY BACKGROUND IS BRITISH BUT I
WAS BORN IN THIS COUNTRY.
Do you know any Italian crooks?
I TRY TO AVOID THE UNDERWORLD.
Are you afraid of the underworld?
THE MAFIA ARE THE WORST
GANGSTERS.
Are you connected with the Mafia?
YOU KNOW, THEY KNOW ME.
How do they know you?
THEY SPY ON ME.
Why do they?
THEY WANT TO GET FALSE EVIDENCE
TO FRAME ME.
What do they have against you?
THEY ARE OUT TO GET ME.
What for?
I HAVE INCRIMINATING EVIDENCE
AGAINST THE MAFIA.
Are you sure of all this?
YOU DON’T BELIEVE ME?
I am doing my best.
…silence
CAN’T YOU THINK OF A DECENT
QUESTION TO ASK?
You seem hostile to me.
SOME OF YOUR REMARKS ANNOY ME.
I am sorry.
FORGET THE EXCUSES, YOU GUYS
ARE ALL ALIKE.
Colby’s PARRY
Perhaps the best-ever performance: many
users, robust, but not a normal subject
(ie a paranoid)
primitive individual models, some
control of dialogue process; but it had
lots to say!
Primitive simulation of intentionality
not syntax but fast pattern matching
much better than ELIZA
How is this achieved in PARRY?
WHAT IS YOUR MAIN PROBLEM
reduces to the canonical form
WHAT BE YOU MAIN PROBLEM
and its single-word deletions are also tried:
BE YOU MAIN PROBLEM
WHAT BE MAIN PROBLEM
WHAT BE YOU PROBLEM
WHAT BE YOU MAIN
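A plausible reading of this slide, sketched in Python; the canonicalisation table and the stored pattern are illustrative, not Colby's actual data:

```python
import re

# Illustrative canonicalisation table: collapse inflections and
# second-person forms, as in WHAT IS YOUR -> WHAT BE YOU.
CANON = {"IS": "BE", "ARE": "BE", "AM": "BE", "YOUR": "YOU"}

def canonicalise(sentence):
    words = re.sub(r"[^\w\s]", "", sentence).upper().split()
    return [CANON.get(w, w) for w in words]

def variants(words):
    # The canonical form itself, then every single-word deletion --
    # the five lines shown on the slide.
    yield tuple(words)
    for i in range(len(words)):
        yield tuple(words[:i] + words[i + 1:])

# Stored patterns map to keys into the response model
# (one illustrative entry).
STORED = {("WHAT", "BE", "YOU", "MAIN", "PROBLEM"): "main-problem"}

def match(sentence):
    for v in variants(canonicalise(sentence)):
        if v in STORED:
            return STORED[v]
    return None

print(match("What is your main problem?"))     # main-problem
print(match("What is your main problem, eh?")) # still matches, via deletion
```

The deletion variants buy robustness cheaply: one garbled, misspelled, or unknown word no longer prevents a stored pattern from firing.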
Loebner Test [Turing test + money]
A competition started in 1991 by Hugh
Loebner.
Judges type questions in at a terminal:
these are replied to either by a human or by a
computer contestant.
Judges rate responses according to how
human-like they appear: Prize of 2000
dollars to most human-like. Grand prize
of 100,000 dollars if a program does
better than one of the humans.
1997 winner: Intelligent Research Ltd,
designed and largely programmed by
members of the Natural Language
Processing group in this Department.
Fragment of winning transcript
Program: hi, my name is Catherine, what’s
yours?
Judge2: My name is Nathan.
Program: Well, its nice to meet you
Nathan. Did you see that story on CNN
last night about the lesbian couple who
came out at a White House party on
Sunday.
Judge2: No. I’m sorry I missed that. How
did the Prez take it?
Program: Ellen Degeneres was one of
them - she was kissing her lover in front
of Bill Clinton. Do you think he was very
embarassed?
Judge2: You tell me, I didn’t see it.
Program: Are you still a friend of Bill
Clinton
Judge2: Yes. I would like to talk about
something else. Do you know anything
about sculpture.
Program: Well I suppose he’s doing his
best like most of us, but he does seem
to attract an UNUSUAL amount of bad
publicity, wouldn’t you say -- can it
Judge2: Do you know what sculpture is?
Program: Well you’re more of a fan of his
than I am I guess --- though I can’t
SWEAR what I’d have done in that
locked hotel room, I must say.
SOME DAYS IT’S BETTER THAN
OTHERS AT CHANGING THE
SUBJECT!
Maybe the idea of the Turing Test should be abandoned.
Reason 1: A unitary notion of 'intelligence' is too
simplistic. It is too simplistic to think it useful to
assess whether computers possess 'intelligence', or
the ability to think, as a single whole.
Better to break down this question into smaller
questions.
- similar to idea that unitary measure of intelligence
(ie intelligence as measured by IQ tests) is not very
useful
- better to have tests that reveal the relative strengths
and weaknesses of individuals.
Could assess computers in terms of more specific
abilities; eg ability of robot to navigate across a
room, ability of computer to perform logical
reasoning, metaknowledge (knowledge of own
limitations).
Reason 2: Too anthropocentric.
It is too anthropocentric to insist that a program
should work in the same way as humans do.
Dogs are capable of cognition, but would
not pass Turing Test. Still, producing
machine with cognitive and communicative
abilities of a dog would be (another)
challenge for AI.
But how can we NOT be anthropocentric
about intelligence? We are the only really
intelligent things we know, and language is
closer to our intelligence than any other
function we have…?
Perhaps for now (till opening
heads helps) behaviour is all we
have.
Increasingly complex programs mean
that looking inside machines doesn't tell
you why they are behaving the way they
are.
Those who don’t think the TT effective
must show why machines are in a
different position from our fellow humans
(I.e. not from OURSELVES!). Solipsism
again.
The Turing Test (as now interpreted!) suggests that we base our
decision about whether a machine can think on its outward
behaviour, and on whether we confuse it with humans.
Concept of Intelligence in humans
We talk about people being more or less intelligent. Perhaps
examining the concept of intelligence in humans will provide
an account of what it means to be intelligent.
What is intelligence? Intelligence is what is measured by
intelligence tests.
Potted history of IQ tests
Early research began into individual differences:
1796: an assistant at Greenwich Observatory was recording when stars
crossed the field of the telescope. He consistently reported observations
eight-tenths of a second later than the Astronomer Royal.
Discharged! It was later realized that observers respond to stimuli at different
speeds – the assistant wasn't misbehaving, he just couldn't do it as
quickly as the Astronomer Royal.
Francis Galton, in the latter half of the 19th century: interested in individual
differences.
He developed measures of keenness of senses and of mental imagery:
early precursors of intelligence tests. He found evidence of genius
occurring often in certain families.
Stanford-Binet IQ test
Alfred Binet (1857-1911) tried devising tests to find out how “bright” and
“dull” children differ.
His aim was educational – to provide appropriate education depending
on ability of child.
Emphasis on general intelligence.
Idea of quantifying the amount of intelligence a person has.
Stanford-Binet test makes use of concept of mental age versus chronological
age.
Intelligence quotient (IQ) produced as ratio of mental age to chronological
age.
Items in the test are age-graded, and mental age corresponds to the level
achieved in the test. A bright child's mental age is above his or her
chronological age; a slow child's mental age is below his or her chronological age.
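For concreteness (the multiplication by 100 is the standard convention, though not stated on the slide):
IQ = (mental age / chronological age) × 100
eg a child of chronological age 10 who performs at the level of an average
12-year-old has IQ = 12/10 × 100 = 120.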
Move of emphasis from general to specific abilities
World War 1: US test ‘Army Alpha’. Tested simple reasoning, ability to
follow directions, arithmetic and information. Used to screen thousands of
recruits, sorting into high/low/intermediate responsibilities.
Beginning of measures of specialized abilities:
Realisation that rating on single dimension not very informative. ie different
jobs require different aptitudes.
eg 1919 Seashore: Measures of Musical Talent.
Tested ability to discriminate pitch, timbre, rhythm etc.
1939: Wechsler-Bellevue scale: goes beyond composite performance to
separate scores on different tasks. eg mazes, recall of information, memory
for digits etc.
Items divided into performance scale and verbal scale.
Block design: pictured
designs must be copied
with blocks; tests ability to
perceive and analyse
patterns.
Verbal item:
Arithmetic. Verbal problems
testing arithmetic reasoning.
Nature of Intelligence
Binet and Wechsler assumed that intelligence is a general capacity.
Spearman: also proposed individuals possess a general intelligence
factor g in varying amounts, together with specific abilities.
Thurstone (1938): believed intelligence could be broken down into a
number of primary abilities. Used factor analysis to identify 7 factors
(a sketch of factor analysis is given after this list):
verbal comprehension
word fluency
number
space
memory
perceptual speed
reasoning
Thurstone devised a test based on these factors:
the Test of Primary Mental Abilities.
But the predictive power of the Test of Primary Mental Abilities was no
greater than that of the Wechsler and Binet tests, and several of these factors
correlated with each other.
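As an illustrative sketch of what factor analysis does here (invented scores, two factors rather than Thurstone's seven, using scikit-learn):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Invented scores for 500 test-takers on five subtests: two driven by
# a latent 'verbal' ability, three by a latent 'number' ability.
verbal = rng.normal(size=(500, 1))
number = rng.normal(size=(500, 1))
noise = rng.normal(scale=0.3, size=(500, 5))
scores = np.hstack([verbal, verbal, number, number, number]) + noise

# Factor analysis recovers the latent structure: each row of the
# loadings shows which subtests cluster on which underlying factor.
fa = FactorAnalysis(n_components=2).fit(scores)
print(fa.components_.round(2))  # rows = factors, columns = subtests
```

Thurstone's claim was that real subtest scores split into such clusters (his seven primary abilities); the caveat on the slide is that, unlike in this toy example, the recovered factors themselves still correlated.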
IQ tests: provide one view of what intelligence
is.
The history of intelligence testing shows that our
conception of intelligence is subject to
change.
Change from assuming there is a general
intelligence factor, to looking at specific abilities.
But emphasis is still on quantification, and
measuring how much intelligence a person
possesses – doesn’t really say what intelligence
is.
Specific and general theories seem to have similar
predictive abilities regarding individual outcomes.
Try this right now:
PICK OUT THE ODD ONE
Cello
Harp
Drum
Violin
Guitar
Limitations of ability tests:
1. IQ scores do not predict achievement very well, although they can
make gross discriminations. The predictive value of tests is better at
school (correlations between .4 and .6 between IQ scores on the Stanford-Binet and Wechsler tests and school grades), but less good at university.
Possible reasons for poor prediction: Difficult to devise tests
which are culturally fair, and independent of educational experience.
E.g. pick one word that doesn’t belong with the others.
Cello harp drum violin guitar
Children from higher income families chose ‘drum’; those from lower
income families picked ‘cello’.
Tests do not assess motivation or creativity.
2. Human-centred: Animals might possess an intelligence, in a way
that a computer does not, but it is not something that will show up in an
IQ test.
3. Tests are only designed to predict future performance; they do not help
to define what intelligence is. But again, the search for definitions is
rarely helpful.
What things possess intelligence?
What are examples of things which
are and are not intelligent?
What are their characteristics (what
determines which group they fall into)?