Com1005: Machines and Intelligence
Amanda Sharkey
COM1005: MACHINES AND INTELLIGENCE
Where were we?
Turing test: how can we decide if something is intelligent?
Traditional (symbolic) AI
early programs, and knowledge representation and search
GPS, microworlds, expert systems
Chinese room
Functionalists: thought is symbol manipulation
Searle and Chinese room – computers can manipulate symbols, but
that is not enough for real understanding.
Even a computer that passes the Turing test will not really
understand or be intelligent.
Strong AI – appropriately programmed computer really is a mind
Weak AI – using computers to model and understand human
intelligence.
AI and mind
AI and applications
AI and illusion of intelligence
Alternatives to traditional AI
Neural Computing
also known as
Connectionism
Artificial Neural Nets
Parallel Distributed Processing
Brain-like computing
Biologically-inspired AI
Paradigm shift
Kuhn: “The structure of scientific revolutions”
(1962)
A paradigm – shared set of assumptions
Period of normal science
Paradigm shift – begins as data collected that
doesn’t fit with current assumptions
Shift to new viewpoint and assumptions
Idea related to “Zeitgeist”
Examples in science
Shift from Ptolemaic cosmology to Copernican
(from the earth as the centre of the universe, to the
earth revolving round the sun)
Examples in social science
Shift from behaviourism to emphasis on cognition
and thought
Examples of changing Artificial Intelligence
approaches to mind
Traditional symbolic AI
Neural computing
Embodied AI
Intro to Artificial Neural Nets
The human brain
Contains approximately 10 thousand
million basic neurons
Each neuron connected to many others
Each neuron accepts many inputs
If enough active inputs received at once,
neuron will be activated and fire
Soma – body of neuron
Dendrites - Long filaments attached to soma
Inputs arrive through dendrites
Axon - Output for neuron
How are they connected?
Synapse – where axon and dendrite meet.
Indirect chemical linkage – neurotransmitters
released which activate gates on dendrites.
Activated gates allow charged ions to flow
The flow of charged ions alters the dendritic potential,
providing a voltage pulse which is conducted to the next
neuron body.
Human brain – capable of impressive tasks
E.g. vision, speech recognition, learning
Fault tolerant – distributed processing, many simple
elements sharing each job.
Graceful degradation – performance gradually falls with
damage
Neural computing – by modelling major features of brain
and its operation, we can capture useful properties of the
brain.
Knowledge in the brain?
Connections between neurons can be
strengthened.
Simple model of neuron – proposed in 1943 by
McCulloch and Pitts
Simplified – omits the complex patterns and timings of
activity in real nervous systems
Neuron on or off
Output depends on inputs – needs enough activation to fire (threshold).
Basic model – weighted sum of inputs, compared to internal threshold, turn on if
above threshold
McCulloch and Pitts (1943) Brain-like
mechanisms – showing how artificial neurons
could compute Boolean logical functions like
AND and OR
Simplification 1: neurons have a threshold – on or off.
Simplification 2: Synapses, equivalent to weighted
connections
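The weighted-sum-and-threshold unit can be sketched in a few lines of Python (an illustrative sketch, not McCulloch and Pitts' own notation; the weights and thresholds used for AND and OR are one standard choice):

```python
# A McCulloch-Pitts unit: fire (output 1) if the weighted sum of
# binary inputs reaches the threshold, otherwise stay off (output 0).
def mp_unit(inputs, weights, threshold):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# AND: both inputs must be on (weights 1, 1; threshold 2)
def AND(x0, x1):
    return mp_unit([x0, x1], [1, 1], threshold=2)

# OR: either input on is enough (weights 1, 1; threshold 1)
def OR(x0, x1):
    return mp_unit([x0, x1], [1, 1], threshold=1)
```

The synapses correspond to the weights, and the on/off firing to the thresholded output.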
Rosenblatt (1962) Perceptron.
Single layer net which can learn to produce an
output given an input.
Learning?
Stronger connection between neurons is
captured by multiplicative weight
Essentials of Perceptron learning algorithm
Set the weights and thresholds randomly
Present an input
Calculate actual output by taking thresholded value of weighted
sum of inputs
Alter weights to reinforce correct decisions and discourage
incorrect decisions – to reduce the error
Supervised learning – uses target of what we want it to
achieve.
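The steps of the learning algorithm above can be sketched as follows (a minimal illustration: the learning rate, epoch count, and the AND training set are assumed choices, and the threshold is folded into a bias term rather than stored separately):

```python
import random

def step(x):
    return 1 if x >= 0 else 0

def train_perceptron(data, epochs=100, lr=0.1, seed=0):
    """data: list of (inputs, target) pairs. Returns (weights, bias)."""
    rng = random.Random(seed)
    n = len(data[0][0])
    # Set the weights (and threshold, here a bias) randomly
    w = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    b = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        # Present each input in turn
        for x, t in data:
            # Actual output: thresholded value of the weighted sum
            y = step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Alter weights to reduce the error (t - y)
            err = t - y
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Supervised learning: the targets say what we want the net to achieve.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
```

Because AND is linearly separable, the weights converge to a correct decision boundary.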
1969 Minsky and Papert – Perceptrons book.
Minsky and Rosenblatt were old rivals
They showed there were some problems that
could not be solved by a 1-layer net (e.g. XOR),
and there was no learning mechanism for 2-layer
nets
Loss of interest in neural nets – heyday of
symbolic AI
AND (weights 1, 1)
x0 x1 | output
 0  0 |   0
 0  1 |   0
 1  0 |   0
 1  1 |   1

XOR
x0 x1 | t
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

A solution to XOR problem
[Figure: a two-layer net. Both inputs connect with weight 1 to a
hidden unit (threshold 1.5) and to the output unit (threshold 0.5);
the hidden unit has an inhibitory (negative) connection to the
output.]
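The two-layer solution can be checked in code. This is a sketch: the exact weight values in the slide's figure are not fully recoverable from the transcript, so the inhibitory weight of -2 below is an assumed value that makes the standard construction work.

```python
def unit(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

def xor(x0, x1):
    # Hidden unit fires only when both inputs are on (threshold 1.5).
    h = unit([x0, x1], [1, 1], threshold=1.5)
    # Output fires if either input is on (threshold 0.5), unless the
    # hidden unit inhibits it. The inhibitory weight -2 is an assumed
    # value; the slide's figure is too garbled to recover the exact one.
    return unit([x0, x1, h], [1, 1, -2], threshold=0.5)
```

A single-layer net cannot compute XOR because no single line separates {01, 10} from {00, 11}; the hidden unit supplies the missing "both on" feature.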
Second wave of Artificial Neural Nets
Backpropagation (Rumelhart et al, 1986)
Learning method for 2 layer nets
Error contribution of hidden units is computed by
transmitting the delta error on the output units back to
the hidden units
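The delta-transmission idea can be sketched for a tiny 2-2-1 network trained on XOR (a pure-Python illustration; the architecture, sigmoid activation, learning rate, and epoch count are assumed choices, not Rumelhart et al's exact setup):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(1)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
b1 = [0.0, 0.0]
W2 = [random.uniform(-1, 1) for _ in range(2)]                      # hidden -> output
b2 = 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
lr = 0.5

def forward(x):
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y

def total_error():
    return sum((t - forward(x)[1]) ** 2 for x, t in data)

error_before = total_error()
for _ in range(2000):
    for x, t in data:
        h, y = forward(x)
        # Delta error on the output unit...
        dy = (y - t) * y * (1 - y)
        # ...transmitted back to the hidden units through the weights:
        dh = [dy * W2[j] * h[j] * (1 - h[j]) for j in range(2)]
        for j in range(2):
            W2[j] -= lr * dy * h[j]
            b1[j] -= lr * dh[j]
            for i in range(2):
                W1[j][i] -= lr * dh[j] * x[i]
        b2 -= lr * dy
error_after = total_error()
```

The key step is computing each hidden unit's error contribution (`dh`) from the output delta (`dy`), which is exactly what a single-layer rule cannot do.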
1986 Rumelhart and McClelland
2 books on Parallel Distributed Processing.
Presented many NN models including Past-
tense model.
Cognitive models – a model of some aspect
of human cognition.
1986 Rumelhart and McClelland: multi-layer
perceptron
Training MLPs requires repeated presentations
An inexact science – no guarantee that net will
converge
Can involve long training times
Little guidance about parameters – e.g. number of
hidden units
Also need to find a good input-output representation
of task.
Generalisation
Main feature of NNs: ability to generalise and to
go beyond the patterns they have been trained
on.
Unknown pattern will be classified with others
that are similar
Therefore learning by example is possible.
Examples of ANN applications
Gorman and Sejnowski, 1988 classified mines
versus rocks using sonar signals
Le Cun et al 1989: recognising handwritten
postcodes
Pomerleau, 1989: navigation of car on
winding road.
Pattern recognition
Pandemonium:
Oliver Selfridge 1959
McClelland and Rumelhart: Interactive activation model
and context effects. (1981)
Example applications of Neural Nets
NETtalk: Sejnowski and Rosenberg 1987
Learned to pronounce English text.
Takes text, maps text onto phonemes, produces
sounds using electronic speech generator
Difficult to specify rules mapping text onto
speech
E.g. x in box or axe, differs from xylophone
203 input units, 80 hidden units, 26 output units (phonemes)
Window of 7 letters moved over text – learning to pronounce middle
letter.
29 input units per letter position: one for each of 26 letters, and 3
for punctuation (29 x 7 = 203)
Trained on 1024 words.
After 50 passes NETtalk can perform at 95% accuracy on training
set, and generalise to unseen words 78%
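The sliding 7-letter window and one-of-29 encoding described above can be sketched as follows (a hypothetical helper, not Sejnowski and Rosenberg's code; the choice of which 3 punctuation characters to include is an assumption):

```python
# 26 letters plus 3 assumed punctuation/boundary characters = 29 units
ALPHABET = "abcdefghijklmnopqrstuvwxyz" + ".,_"

def windows(text, size=7):
    """Slide a 7-letter window over the text; the net learns to
    pronounce the middle letter of each window."""
    pad = "_" * (size // 2)          # pad so every letter can be in the middle
    text = pad + text + pad
    return [text[i:i + size] for i in range(len(text) - size + 1)]

def encode(window):
    """One-of-29 encoding per position: 7 x 29 = 203 input units."""
    vec = []
    for ch in window:
        unit = [0] * len(ALPHABET)
        unit[ALPHABET.index(ch)] = 1
        vec.extend(unit)
    return vec
```

For the word "box" this yields three windows, one per letter, each encoded as a 203-element input vector.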
Influential example – recorded NETtalk beginning by
babbling and gradually improving.
Past-tense learning model
A model of human ability to learn past-tenses
of verbs
Developed by Rumelhart and McClelland
(1986).
Past-tenses?
Today I look at you, yesterday I ? at you.
Today I speak to you, yesterday I ? to you.
Today I wave at you, yesterday you ? at me.
rick – yesterday he ? the sheep
Many regular examples:
E.g walk -> walked, look -> looked
Many irregular examples:
E.g. bring -> brought, sing -> sang
Children learning to speak
Baby: DaDa
Toddler: Daddy
Very young child: Daddy home!
Slightly older child: Daddy came home!
Older child: Daddy comed home!
Even older child: Daddy came home!
Stages of acquisition
Stage 1
Past tense of a few specific verbs, some regular e.g. looked,
needed
Most irregular e.g. came, got, went, took, gave
As if learned by rote (memorised)
Stage 2
Evidence of general rule for past-tense – add ed to stem of verb
E.g. camed or comed
Also for the past-tense of a nonsense word, e.g. rick
They added ed – ricked
Stage 3
Correct forms for both regular and irregular verbs
Verb type       | Stage 1 (youngest) | Stage 2 (older) | Stage 3 (older again)
Early verbs     | correct            | regularised     | correct
Regular         | –                  | correct         | correct
Other irregular | –                  | regularised     | correct or regularised
Novel           | –                  | regularised     | regularised
U shaped curve – correct form in stage 1,
errors in stage 2, few errors in stage 3.
Suggests rule acquired in stage 2, and
exceptions learned in stage 3.
Rumelhart and McClelland – aim to
demonstrate that connectionist network
would show same stages and learning
patterns.
Trained net by presenting
Input – root form of word e.g. walk
Output – phonological structure of correct past-tense version
of word e.g. walked
Test model by presenting root form as input, and
see what past-tense form it generates as output.
Used Wickelfeature method to encode words
Wickelphone: target phoneme together with its context
E.g. came → #Ka, kAm, aM#
Coarse coded onto Wickelfeatures, 16
wickelfeatures for each wickelphone
Input and output of net: 460 units each
Shows need for input representation
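The Wickelphone decomposition can be sketched for a phoneme string (a hypothetical helper that reproduces the "came" example above; the further coarse coding onto 16 Wickelfeatures per Wickelphone is not shown):

```python
def wickelphones(phonemes):
    """Decompose a phoneme string into Wickelphones: each phoneme
    (upper-cased) together with its left and right neighbours,
    with '#' marking the word boundaries."""
    padded = "#" + phonemes + "#"
    return [padded[i - 1] + padded[i].upper() + padded[i + 1]
            for i in range(1, len(padded) - 1)]

# /kam/ ("came") decomposes into #Ka, kAm, aM#
```

Representing each phoneme with its context gives the net positional information without any explicit notion of letter order.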
Training: used perceptron convergence procedure
(problem linearly separable)
Target used to tell output unit what value it should have.
If output is 0 and target is 1, need to increase weights from active
input units
If output is 1 and target is 0, need to reduce weights from active
units.
560 verbs divided into High, Medium, and Low
frequency (regular and irregular)
1. Train on 10 high frequency verbs for 10 epochs
Live-lived, look-looked, come-came, get-got, give-
gave, make-made, take-took, go-went, have-had,
feel-felt
2. 410 medium frequency verbs added, trained
for 190 more epochs
Net showed a dip in performance – making errors like
children, e.g. come → comed
3. Tested on 86 low frequency verbs not used for
training
Got 92% regular verbs right, 84% irregular right.
Model illustrates:
Neural net training – repeated examples of input-
output pairs
Generalisation – correct outputs produced for
untrained words
E.g. input guard -> guarded
Input cling -> clung
Past-tense model: Showed
- a neural net could be used to model an aspect of
human learning
- same u-shaped curve shown as found in children.
- the neural net discovered the relationship
between inputs and outputs, not programmed.
- that it is possible to capture apparently rule-governed behaviour in a neural net.
Strengths of connectionism
Help in understanding how a mind, and thought,
emerges from the brain
A better account of how we learn something like the
past-tense than explicit programming of a rule?
Is this a good model of how we learn past-tenses?
Fierce criticisms: Steve Pinker and Alan Prince
(1988)
More than 150 journal articles followed on the debate
Net can only produce past-tense forms, cannot
recognise them.
Model presents pairs of verb+past tense,
children don’t get this.
Model only looks at past-tense, not the rest of
language
Getting similar performance to children was the
result of decisions made about:
Training algorithm
Number of hidden units
How to represent the task
Input and output representation
Training examples, and manner of presentation
Assignment: due on Monday November 22nd, Week 9.
Write an essay on one of the following, in 1000-2500 words.
1. Why is it so difficult to produce a computer program that can pass the
Turing Test?
2. Explain Searle’s Chinese Room Argument, and consider whether it is
still relevant to Artificial Intelligence today.
3. Which of the following can be more easily claimed to be intelligent: a
Chatbot, a robot, a pig and a human baby?
4. Discuss the ways in which the relationship between chess and
Artificial Intelligence has changed over time.
5. Explain the main differences between traditional symbolic Artificial
Intelligence and Connectionism, and consider their respective strengths
and limitations.
1. Comprehensiveness and Relevancy (is the
question satisfactorily addressed?)
2. Argument structure (is the argument clear and
well developed?)
3. Use of resources (is the essay well
researched?)
4. Technical (references, spelling, grammar,
punctuation).
References
In text: (1)
In References:
(1) Searle, J.R. (1980) Minds, brains and programs. Behavioral and Brain Sciences, 3, 417-424
Or for a book...
In text: Pfeifer and Scheier (2001)
In references:
Pfeifer, R., and Scheier, C. (2001) Understanding Intelligence, MIT Press, Cambridge
Massachusetts
Or for a journal
In text: Walrus et al (2009)
In references: Walrus, W., Lamb, B., Wolf, W., Sheep, S., Rat, R., Slug, W. (2009) Animals
are intelligent, Journal of Incredible Research, 3, 111-116.
Differences between Connectionism and traditional Symbolic AI
Knowledge – represented by weighted
connections and activations, not explicit
propositions
Learning – Artificial Neural Nets (ANNs)
trained versus programmed. Also greater
emphasis on learning.
Emergent behaviour – rule-like behaviour,
without explicit rules
More Differences
Examinability:
you can look at a symbolic program to see how it
works.
Artificial Neural net – black box ...consists of numbers
representing activations, and weighted links.
Symbols:
Symbolic AI – manipulating symbols
connectionism has no explicit symbols
Relationship to the brain:
‘Brain-style’ computing
versus manipulation of symbols.
Connectionism versus Symbolic AI
Which is better?
Which provides a better account of thought?
Which is more useful?
Artificial neural nets more like the brain than
traditionally programmed computer?
Are Brains like Computers?
Parallel operation – 100 step argument
Neurons are slower than the flip-flop switches in computers:
a neuron takes a thousandth of a second to respond, instead
of a thousand-millionth of a second.
Brain running AI program would take 1000th of a
second for each instruction
Brain can extract meaning from sentence, or recognise
visual pattern in 1/10th of a second.
Means program should only be 100 instructions long.
But AI programs contain 1000s of instructions
Suggests parallel operation
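The arithmetic behind the 100-step argument, spelled out (units in milliseconds to keep the division exact):

```python
# One neuron "step": roughly a thousandth of a second to respond
neuron_response_ms = 1
# Extracting meaning from a sentence, or recognising a visual
# pattern: roughly a tenth of a second
task_ms = 100

# So at most ~100 sequential neural steps fit in the time the brain
# actually takes...
max_sequential_steps = task_ms // neuron_response_ms

# ...yet serial AI programs for such tasks contain thousands of
# instructions - suggesting the brain works in parallel.
```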
Connectionism scores
Are brains like computers?
Computer memory – exists at specific physical
location in hardware.
Human memory – distributed
E.g. Lashley and the search for the engram.
Trained rats to learn route through maze. Could
destroy 10% of brain without loss of memory.
Lashley (1950) “there are no special cells reserved for
special memories….The same neurons which retain
memory traces of one experience must also
participate in countless other activities”
Connectionism scores
Graceful degradation
When damaged, brain degrades gradually,
computers crash.
Phineas Gage
Railway worker – iron rod through the anterior and
middle left lobes of cerebrum, but lived for 13 years
– conscious, collected and speaking.
Connectionism scores
Brains are unlike von Neumann machines
with:
Sequential processor
Symbols stored at specific memory locations
Access to memory via address
Single seat of control, CPU
Connectionism and the brain
Units in net like neurons
Learning in NNs like learning in brain
Nets and brains work in parallel
Both store information in distributed fashion
NNs degrade gracefully – if connections, or
some neurons removed, can still produce
output.
But ….
Connectionism is only “brain-style” computing
Neurons simplified – only one type
Learning with backpropagation biologically
implausible.
Little account of brain geometry and structure
Also ….
Artificial neural nets are simulated on
computers
e.g. past-tense network
Rumelhart and McClelland simulated neurons,
and their connections on a computer.
Connectionism
Getting closer to real intelligence?
Closer to the computation that occurs in the brain
than standard symbolic AI
Provides a way of relating brain, and AI
Connectionism and thought
Can connectionism provide an account of
mind?
Symbolicists arguing that only a symbol
system can provide an account of cognition
Functionalists: Not interested in hardware, only in
software
Connectionists arguing that you need to
explain how thought occurs in the brain.
Rule-like behaviour
Past tense learning
Create rules, and exceptions
Or show that rule-like behaviour can result from a
neural net, even though no rule is present.
Symbolic AI vs Connectionism
Different strengths
Connectionism – good at low level pattern
recognition, in domains where there are many
examples, and it’s hard to formulate the rule.
Symbolic AI – good at conscious planning,
reasoning, early stages of learning a skill (rules).
Hybrid system
Connectionist account of lower level processes
Symbolic account of higher level processes.
Connectionism and Strong AI
Searle – Chinese room shows program does
not understand, any more than operator of
Chinese room.
Problem of symbol grounding …..
But does a neural net understand?
It learns…..
Chinese Gym – a room full of English-speaking
people carrying out neural processes, and
outputting Chinese. Do they understand?
Presentations
Who was Alan Turing?
Computers versus Humans: the important differences
Is the mind a computer?
Artificial Intelligence and Games
What challenges are left for Artificial Intelligence?
The social effects of Artificial Intelligence: the good, the bad and the ugly
Chatbots
Computers and emotions
AI and the media
Fact or Fiction?: Artificial Intelligence in the movies
- groups of 5
acknowledgements - research, presentation, delivery etc.
Next week (week 6) Reading week.
Week 7: Guru lecture
Text processing
Why hasn’t the Turing Test been passed yet? And
applications.
Implementational connectionism
A different way of implementing symbolic structures,
different level of description
Eliminative connectionism
Radical position, cognition and thought can only be
properly described at connectionist level
Revisionist connectionism
Symbolic and connectionist are both legitimate levels
of description.
Hybrid approach – choose the best level depending on
what is being investigated.