Instrumental Conditioning
• Basic Concepts
– classical conditioning
• Learning about relationships between stimuli
• Pavlovian Conditioning is Stimulus learning
– operant conditioning
• learning about consequences of one’s own behavior
• Instrumental Conditioning is Response learning
– Instrumental behavior is behavior that occurs because it is instrumental in
producing certain consequences
• The behavior is instrumental in producing an outcome, such as getting food
– Putting a coin in a vending machine to get candy
– Turning a key to open a lock
• This type of learning became known as “instrumental conditioning” or “goal
directed” behavior
Early influences on Thorndike and Skinner
• George John Romanes (1848–1894)
– evolutionary biologist and physiologist
– compared cognitive processes between humans and other animals
– The beginning of comparative psychology
– Protégé of Charles Darwin, he invented the term neo-Darwinism
– Animal Intelligence, 1892 [1st Pub. 1882]
– Mental Evolution in Animals, 1883.
• C. Lloyd Morgan (1852–1936)
– a British ethologist and psychologist
– Morgan's canon
• experimental approach to animal psychology
• “higher mental faculties should only be considered as explanations if lower faculties
could not explain a behaviour”
• John Broadus Watson (1878 –1958)
– Established the psychological school of behaviorism
– Psychology as the Behaviorist Views it, 1913
– conducted the controversial "Little Albert" experiment
Edward L. Thorndike (1874-1949)
• A Biographical Memoir by ROBERT S. WOODWORTH
• Published Animal Intelligence in 1911, which describes experiments
testing animal intelligence by putting cats in a puzzle box
– These experiments were a response to George Romanes’ book, also titled
“Animal Intelligence”, which offered anecdotal explanations of animal behavior
that included insight, reasoning and inference
• for example Thorndike wrote "It also suffices as a rebuke to those who would
have the kitten ratiocinate about the matter, but it fails to tell what real mental
content is present." p22
– Historically this is an important distinction
• the anecdotal approach was much criticized as unscientific, and rightly so
• However, Thorndike did not have any experimental evidence that cats are not
capable of using cognitive mechanisms such as insight or reasoning
– See Different Approaches to the Explanation of Animal Intelligence for
additional comments on the distinction that Thorndike is making about the
Romanes explanations.
Puzzle Box Procedure
• In one type of puzzle box (see figure 5.1) the cats had to manipulate
a latch to open the door
• Initially the cats would move around and paw at things, which is
typical cat behavior, until they accidentally opened the door
• The latency to escape (see figure 5.2) would decrease over
successive trials
• Thorndike considered this learning to escape by trial and error, not
by insight (i.e., cognition)
• For more historical background see Thorndike’s puzzle boxes and
the origins of the experimental analysis of behavior by Paul Chance
FIGURE 5.1
Two of Thorndike’s puzzle boxes, A and I. In Box A, the participant had to pull a loop to release the
door. In Box I, pressing down on a lever released a latch on the other side. (Left: Based on
“Thorndike’s Puzzle Boxes and the Origins of the Experimental Analysis of Behavior,” by P. Chance,
1999, Journal of the Experimental Analysis of Behavior, 72, pp. 433–440. Right: Thorndike, Animal
Intelligence Experimental Studies, 1898.)
The Principles of Learning and Behavior, 7e by Michael Domjan
Copyright © 2015 Wadsworth Publishing, a division of Cengage Learning. All rights reserved.
Thorndike's Interpretation
• Thorndike interpreted the results of his experiment as reflecting the
learning of an S-R association
– the cats learned an association between the stimuli inside the puzzle box and
the escape response
• Note: it is not clear what constitutes the box stimuli
– The consequence of the successful response – escaping the box –
strengthened the association between the box stimuli and that response
• On the basis of his work, Thorndike formulated the law of effect
– Law of Effect
• if a response in the presence of a stimulus is followed by a satisfying event, the
association between the stimulus (S) and the response (R) is strengthened
• if the response is followed by an annoying event, the S-R association is weakened
– according to the law of effect
• animals learn an association between the response and the stimuli present at the
time of the response
• the consequence of the response (escape) is not one of the elements in the
association
• the satisfying or annoying consequence simply serves to strengthen or weaken the
association between the response and the stimulus situation
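The law of effect as stated above can be sketched as a simple update rule. This is only an illustration: the linear update, the learning rate `rate`, and the function name `update_strength` are assumptions made for this sketch, not Thorndike's own formulation. Note that the outcome only strengthens or weakens the S-R association and is not itself stored as part of it.

```python
def update_strength(strength, outcome, rate=0.2):
    """Return the new S-R association strength after one trial.

    Hypothetical linear rule (an assumption for illustration):
    a satisfying event moves the strength toward 1,
    an annoying event moves it toward 0.
    """
    if outcome == "satisfying":
        return strength + rate * (1.0 - strength)
    if outcome == "annoying":
        return strength - rate * strength
    raise ValueError(f"unknown outcome: {outcome!r}")


# Five successive escapes from the puzzle box (each a satisfying event).
strength = 0.1
history = [strength]
for _ in range(5):
    strength = update_strength(strength, "satisfying")
    history.append(strength)

print([round(s, 3) for s in history])
```

Under this assumed rule the association grows on every reinforced trial but never exceeds 1, which is one way to picture why escape latency falls gradually over successive trials.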
Modern Approaches to Instrumental Conditioning
• Discrete-Trial Procedures
– Example 1: Each placement of the cat in the box and subsequent escape is a
trial
• So the instrumental response occurs only during a specified period, determined
by presentation of a stimulus or placement of the animal in the experimental
apparatus
– Example 2: Rats placed in a maze start box eventually move to the goal box
• Runway or straight-alley maze: running speed or latency to leave the start-box
• T-maze – to study choice behavior see Figure 5.3
– Involves a single response performed only at a certain time
• Such as rat runs to goal arm
• only one discrete behavior (running speed) is recorded for each trial
• then rat removed from the apparatus
• after ITI the animal is placed in the start arm again for another trial
– So each response is a discrete action
• The speed and onset of the behavior are determined by the subject
• However, the experimenter determines when the subject may begin the action
(usually by putting the rat in the start arm), which could bias the behavior
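The two runway measures named above (latency to leave the start box and running speed) can be computed from three timestamps per trial. The sketch below is illustrative only: the `RunwayTrial` record, its field names, the 2 m runway length, and the timing values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunwayTrial:
    """Timestamps (in seconds) for one discrete trial; hypothetical layout."""
    placed_at: float        # experimenter places the rat in the start box
    left_start_at: float    # rat leaves the start box
    reached_goal_at: float  # rat enters the goal box


def latency(trial):
    """Seconds between placement and leaving the start box."""
    return trial.left_start_at - trial.placed_at


def running_speed(trial, runway_length_m):
    """Metres per second from leaving the start box to reaching the goal box."""
    return runway_length_m / (trial.reached_goal_at - trial.left_start_at)


# One made-up trial on an assumed 2 m runway.
trial = RunwayTrial(placed_at=0.0, left_start_at=4.0, reached_goal_at=9.0)
print(latency(trial), running_speed(trial, runway_length_m=2.0))
```

Each trial yields exactly one latency and one speed, which is what makes the procedure discrete: the experimenter sets `placed_at`, and the subject determines the other two timestamps.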
B.F. Skinner (1904-1990)
• 1938: The Behavior of Organisms
• 1953: Science and Human Behavior
– “All we need to know in order to describe and explain behavior is this:
actions followed by good outcomes are likely to recur , and actions
followed by bad outcomes are less likely to recur.” (Skinner, 1953)
• Dealt only with observable behavior
– The task of scientific inquiry:
– To establish functional relationships between experimenter-controlled
stimulus and organism’s response
• Single subject design
– Large numbers of subjects not necessary
– statistical comparisons of group means not necessary
– requires "sufficient" data collected under well-controlled experimental
conditions
• 1990: Vigorously attacked the growth of cognitive psychology
Free-Operant Procedures
• Invented by Skinner to make experiments more efficient
– Concept of operant to divide up behavior into measurable units
• Permit continuous performance of the instrumental (operant)
response
• The experimenter decides which behavior is the operant, but the subject
determines when the behavior will be executed
• For example, a pigeon is put in an operant chamber and allowed to respond at
its own pace
• An operant response, such as a lever press by rats (Figure 5.5):
– is defined in terms of the effect that it has on the environment
– the critical thing is not the muscles involved in the behavior but the way in
which the behavior ‘operates’ on the environment
– So the rat could press the bar with its tail or its paw and either would be an
operant response
• “However, any response that is required to produce a desired
consequence is an instrumental response” p. 128 Domjan
Magazine Training and Shaping
• Need to have Magazine Training before operant behavior can be
studied
– Animal needs to learn the relationship between stimuli in the environment
and getting food which is classical conditioning
– the food-delivery device is called the food magazine so the preliminary phase
is called magazine training
– when food is delivered into the magazine it makes a mechanical noise
– The animal learns to associate this noise with food delivery
– after enough pairings of this noise with food delivery, the rat will go to the
food cup when it hears the noise which is goal tracking
• After the association between food and the magazine stimuli is
established operant conditioning can occur
• For instrumental conditioning to occur, the subject must make the
desired response (e.g., pushing the lever) prior to receiving the food
Response Shaping
• Response shaping is used to ‘teach’ the operant response
– Pecking at key light for pigeons
– Pressing the lever for rats
• Shaping involves:
– reinforcement of successive approximations to the required response
– gradual nonreinforcement of earlier response forms
• Shaping is usually not required for lever pressing by rats
– just place a hungry rat into an operant chamber and it will figure it out
– as rats naturally explore the operant chamber, they will push the lever and
get food
• Common examples with people would be
– sport coaches
– piano teacher
– driving instructor
– drug abstinence behavior
Shaping “New” Behavior
• Shaping behavior that is part of the typical behavior of rats is easier
than shaping behavior that is rarely or never done spontaneously
– Rats naturally explore their environment with their paws, so lever pressing is
easy to train
– More training (“shaping”) is required to have them do an unusual behavior such
as playing basketball
• Shaping requires inherent variability of behavior
– Which makes it possible to shape a variety of behaviors
• Shaping can result in new response forms that have never been
performed previously by the individual
– New response forms are made up of smaller parts of naturally occurring
behavior. See Figure 5.6
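The shaping procedure described above (reinforcing successive approximations, no longer reinforcing earlier response forms, and relying on the variability of behavior) can be sketched as a toy simulation. Everything numeric here is an assumption for illustration: the one-dimensional "response form", the Gaussian variability, the step sizes, and the function name `shape` do not come from the source.

```python
import random


def shape(target=10.0, trials=500, variability=1.0, seed=0):
    """Toy shaping loop; returns (typical response form, final criterion).

    A response is reinforced only if it meets the current criterion.
    After each reinforced response the criterion is raised, so earlier
    response forms stop earning reinforcement.
    """
    rng = random.Random(seed)
    mean = 0.0       # typical form of the response before training
    criterion = 1.0  # first approximation the trainer will accept
    for _ in range(trials):
        response = rng.gauss(mean, variability)  # variability of behavior
        if response >= criterion:
            # Reinforcement shifts typical behavior toward the reinforced form.
            mean += 0.5 * (response - mean)
            # Demand a closer approximation next time, up to the target.
            criterion = min(target, criterion + 0.5)
    return mean, criterion


print(shape())                 # with variability, the criterion can be raised
print(shape(variability=0.0))  # without variability, nothing is ever reinforced
```

With `variability=0.0` every response equals the starting form, none meets the first criterion, and the function returns the starting values unchanged, which is one way to see why shaping depends on variability.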
Response Rate as a Measure of Operant Behavior
• Cumulative Recorder
• way of presenting data in free-operant procedures
• one response builds on the previous response
• Response Measures
– with discrete-trial procedures can measure speed, and latency to make the
response
– with free-operant procedures can measure rate of responding (number bar
presses/min)
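A cumulative record and a response rate can both be derived from a list of response times. The sketch below is a minimal illustration; the function names and the lever-press times are hypothetical, not data from any experiment.

```python
def cumulative_record(response_times):
    """Return (time, cumulative count) points, one per response.

    Each response adds one step to the running total, so the curve
    never decreases; steeper stretches mean faster responding.
    """
    return [(t, i + 1) for i, t in enumerate(sorted(response_times))]


def response_rate(response_times, session_minutes):
    """Responses per minute over the whole session."""
    return len(response_times) / session_minutes


# Made-up lever-press times (seconds) from an assumed 1-minute session.
presses = [5, 12, 20, 31, 33, 40, 58]
record = cumulative_record(presses)
rate = response_rate(presses, session_minutes=1.0)
print(record)
print(rate)
```

Plotting the `record` points as a step curve gives the familiar cumulative-record shape, where the slope at any stretch of the curve reflects the local rate of responding.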
Cumulative Recorder