Operant Conditioning
III. Operant Conditioning
E.L. Thorndike and B.F. Skinner
Operant Conditioning
A. At the same time that Pavlov (and later Watson) was experimenting with what came to be known as “Classical” conditioning, E.L. Thorndike was experimenting with “Operant” or “Instrumental” conditioning. His research served as the basis for B.F. Skinner’s research.
Edward L. Thorndike (1874–1949)
Operant Conditioning
B. Describe a puzzle box:
Clip - http://www.youtube.com/watch?v=yigW-izs8oc
Law of Effect
Thorndike’s principle that:
1. Behaviors followed by favorable consequences become more likely
2. Behaviors followed by unfavorable consequences become less likely
Early Operant Conditioning
E. L. Thorndike (1898): puzzle boxes and cats
[Figure: a cat's responses in the puzzle box on the first trial compared with after many trials. Situation in both panels: stimuli inside of puzzle box. Responses shown: scratch at bars, push at ceiling, dig at floor, howl, etc., and press lever.]
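The Law of Effect in the puzzle box can be sketched as a toy simulation. This is only an illustration, not Thorndike's procedure: the equal starting strengths, the step sizes, and the function name simulate_puzzle_box are assumptions made for the example. Pressing the lever opens the box (a favorable consequence), so that response is strengthened; responses that do not open the box are weakened slightly.

```python
import random


def simulate_puzzle_box(trials=50, step=0.5, seed=0):
    """Toy Law of Effect model (hypothetical numbers, for illustration only)."""
    rng = random.Random(seed)
    # Assumption: every response starts out equally likely.
    strengths = {
        "scratch at bars": 1.0,
        "push at ceiling": 1.0,
        "dig at floor": 1.0,
        "howl": 1.0,
        "press lever": 1.0,
    }
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            responses = list(strengths)
            weights = [strengths[r] for r in responses]
            choice = rng.choices(responses, weights=weights)[0]
            if choice == "press lever":
                # Favorable consequence (escape): response becomes more likely.
                strengths[choice] += step
                break
            # No escape: response becomes a little less likely (floor at 0.1).
            strengths[choice] = max(0.1, strengths[choice] - 0.1 * step)
        attempts_per_trial.append(attempts)
    return strengths, attempts_per_trial


strengths, attempts = simulate_puzzle_box()
print(strengths["press lever"])   # strength of the escape response after training
print(attempts[:5], attempts[-5:])  # attempts needed on early vs. late trials
```

Comparing the first and last entries of attempts shows how many responses the simulated cat tries before escaping early in training versus late in training.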
Operant Conditioning
B.F. Skinner (1904–1990)
elaborated Thorndike’s Law of Effect
developed behavioral technology
Operant Conditioning
type of learning in which behavior is strengthened if followed by reinforcement or diminished if followed by punishment
*Also called “Instrumental” conditioning because the behavior is instrumental in producing its consequence. In this class, the “instrument” is the reinforcement or punishment used to shape behavior.
Operant Conditioning
Operant Behavior
operates (acts) on environment
produces consequences
Respondent Behavior
occurs as an automatic response to stimulus
behavior learned through classical conditioning
Operant Chamber (Skinner Box)
chamber with a bar or key that an animal manipulates to obtain a food or water reinforcer
contains devices to record responses
What is the instrument?
What behavior is strengthened?
Instruments of Conditioning
Types of Conditioning Instruments
Positive = stimulus is given; Negative = stimulus is removed
Reinforcement = increases desired behavior; Punishment = decreases undesirable behavior
Positive Reinforcement: give something, increase behavior
Negative Reinforcement: remove something, increase behavior
Positive Punishment: give something, decrease behavior
Negative Punishment: remove something, decrease behavior
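The four instruments above come from two questions: is a stimulus given or removed, and does the behavior increase or decrease? A minimal sketch of that lookup follows; the function name classify_instrument and its argument values are made up for this example.

```python
def classify_instrument(stimulus_change, behavior_change):
    """Name the instrument from the two questions in the table.

    stimulus_change: "given" or "removed"
    behavior_change: "increase" or "decrease"
    """
    sign = {"given": "Positive", "removed": "Negative"}[stimulus_change]
    kind = {"increase": "Reinforcement", "decrease": "Punishment"}[behavior_change]
    return f"{sign} {kind}"


print(classify_instrument("given", "increase"))    # Positive Reinforcement
print(classify_instrument("removed", "increase"))  # Negative Reinforcement
print(classify_instrument("given", "decrease"))    # Positive Punishment
print(classify_instrument("removed", "decrease"))  # Negative Punishment
```

Note that “positive” and “negative” describe only whether something is given or removed, not whether the consequence is pleasant.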
Our Class:
Experiment 1:
What is the instrument?
Experiment 2:
What is the instrument?
Which worked best?
Operant Conditioning
Classroom Practice
Sheldon Choc 1.mp4
Sheldon Choc 2.mp4
Sheldon Choc 3.mp4
1. What is the desired behavior?
2. What is the instrument?
https://www.youtube.com/watch?v=LhI5h5JZi-U
Operant Conditioning
In shaping, successively closer versions of a desired response are reinforced (as in learning to play tennis).
In chaining, each part of a sequence is reinforced; the different parts are put together into a whole (as in learning the steps to a dance).
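Shaping can be sketched as a loop that reinforces any response at least as close to the target as the last reinforced one, then tightens the requirement. This is a toy model under stated assumptions: responses are represented as numbers, behavior varies randomly by up to 1.0 per trial, and the name shape and all parameter values are hypothetical. It does not model chaining.

```python
import random


def shape(target=10.0, start=0.0, trials=200, seed=1):
    """Toy shaping loop: reinforce successively closer versions of the target."""
    rng = random.Random(seed)
    typical = start                   # the learner's current typical response
    criterion = abs(target - start)   # how far from the target still earns reinforcement
    for _ in range(trials):
        # Behavior varies a little from trial to trial (assumed variation).
        response = typical + rng.uniform(-1.0, 1.0)
        error = abs(target - response)
        if error <= criterion:
            # Reinforce: this version becomes the new typical response,
            # and the next reinforcer requires a version at least this close.
            typical = response
            criterion = error
    return typical, criterion


final_response, final_criterion = shape()
print(final_response, final_criterion)
```

Because the criterion only ever tightens, the typical response can never drift further from the target than it was at the last reinforcement.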
Punishment
aversive event that decreases the behavior that it follows
powerful controller of unwanted behavior
Problems with Punishment
Does not teach or promote alternative, acceptable behavior
May produce undesirable results such as hostility, passivity, fear
Effects are likely to be temporary
May model aggression
Samples
Positive Reinforcement
Negative Reinforcement
Positive Punishment
Negative Punishment
Operant Conditioning Processes
Primary Reinforcement is unlearned and usually necessary for survival. Food is the best example of a primary reinforcer. Examples?
Secondary Reinforcement is anything that comes to represent a primary reinforcer, such as praise from a friend or a gold star on a homework assignment. Also called a conditioned reinforcer. Examples?
Common Terminology (Classical vs. Operant)
Acquisition
Classical: repeatedly pair CS with UCS
Operant: behavior repeatedly FOLLOWED BY reinforcement/punishment
Extinction
Classical: CS no longer paired with UCS, CR extinguished
Operant: behavior no longer reinforced/punished, so extinguished
Spontaneous Recovery
Classical: after period of extinction, CR returns in presence of CS
Operant: behavior suddenly reappears (after being extinguished) in presence of reinforcer
Generalization
Classical: will respond to stimuli similar to CS
Operant: will respond to reinforcement/punishment similar to original
Discrimination
Classical: will ONLY respond to CS
Operant: will ONLY respond to original reinforcement/punishment
Schedules of Reinforcement
Immediate Reinforcers
To our detriment, small but immediate reinforcements are sometimes more alluring than big but delayed reinforcements
Continuous Reinforcement
reinforcing the desired response each time it occurs
Partial (Intermittent) Reinforcement
reinforcing a response only part of the time
results in slower acquisition but greater resistance to extinction
Schedules of Reinforcement
Fixed Ratio (FR)
reinforces a response only after a specified number of responses
the faster you respond, the more rewards you get
different ratios
very high rate of responding
like piecework pay
Schedules of Reinforcement
Variable Ratio (VR)
reinforces a response after an unpredictable number of responses
average ratios
like gambling, fly fishing
very hard to extinguish because of unpredictability
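The two ratio schedules can be compared with a short sketch. The function names and the uniform draw used for the variable schedule are assumptions made for illustration; real VR schedules only require that the number of responses be unpredictable around an average.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return which response numbers earn a reinforcer on an FR schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Reinforce after an unpredictable number of responses averaging mean_ratio."""
    rng = random.Random(seed)
    reinforced = []
    # Assumption: required count is drawn uniformly from 1..(2*mean_ratio - 1),
    # which has a mean of mean_ratio.
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count == required:
            reinforced.append(i)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


print(fixed_ratio(20, 5))      # [5, 10, 15, 20]
print(variable_ratio(20, 5))   # irregular response numbers, set by the seed
```

On both schedules the reinforcers depend on the number of responses, so responding faster earns them sooner; only the VR schedule hides which response will pay off.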
Schedules of Reinforcement
Fixed Interval (FI)
reinforces a response only after a specified time has elapsed
response occurs more frequently as the anticipated time for reward draws near
Schedules of Reinforcement
Variable Interval (VI)
reinforces a response at unpredictable time intervals
produces slow steady responding
like pop quiz, fishing
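The interval schedules can be sketched the same way: a reinforcer becomes available once a wait has elapsed, and the first response after that point earns it. The helper names and the uniform draw for the variable waits are assumptions for illustration only.

```python
import random


def interval_schedule(response_times, intervals):
    """Reinforce the first response made after each interval has elapsed.

    response_times: increasing times (in seconds) at which responses occur.
    intervals: iterator yielding the wait before the next reinforcer is available.
    """
    reinforced = []
    available_at = next(intervals)
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(intervals)
    return reinforced


def fixed_intervals(seconds):
    """FI: the wait is always the same."""
    while True:
        yield seconds


def variable_intervals(mean_seconds, seed=0):
    """VI: unpredictable waits (assumed uniform between 0 and twice the mean)."""
    rng = random.Random(seed)
    while True:
        yield rng.uniform(0, 2 * mean_seconds)


slow = list(range(1, 31))                  # one response per second for 30 s
fast = [i * 0.5 for i in range(1, 61)]     # two responses per second for 30 s
print(interval_schedule(slow, fixed_intervals(10)))  # [10, 20, 30]
print(interval_schedule(fast, fixed_intervals(10)))  # [10.0, 20.0, 30.0]
print(interval_schedule(slow, variable_intervals(10)))
```

Doubling the response rate earns no extra reinforcers on the fixed interval schedule, which is the key contrast with ratio schedules.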
Practice Worksheet
Fixed or Variable?
Ratio or Interval?
Consideration of Future Consequences Scale
1 = Extremely Uncharacteristic
2 = Somewhat Uncharacteristic
3 = Uncertain
4 = Somewhat Characteristic
5 = Extremely Characteristic
Number 1 - 12
Delay of Gratification
Delay of Gratification Scale
Marshmallow Study
Walter Mischel (Columbia University)
Tracked children longitudinally
Children who could wait went on to achieve more
Marshmallow Test