Operant Conditioning


III. Operant Conditioning: E.L. Thorndike and B.F. Skinner
Operant Conditioning
A. At the same time that Pavlov (and later Watson) was experimenting with what came to be known as "Classical" conditioning, E.L. Thorndike was experimenting with "Operant" (or "Instrumental") conditioning. His research served as the basis for B.F. Skinner's research.
Edward L. Thorndike (1874–1949)
Operant Conditioning
B. Describe a puzzle box:
Clip - http://www.youtube.com/watch?v=yigW-izs8oc
Law of Effect
Thorndike’s principle that:
1. Behaviors followed by favorable consequences become more likely.
2. Behaviors followed by unfavorable consequences become less likely.
Early Operant Conditioning
E. L. Thorndike (1898)
Puzzle boxes and cats
[Figure: On the first trial in the box, the situation (the stimuli inside the puzzle box) evokes many responses: scratch at bars, push at ceiling, dig at floor, howl, etc. After many trials in the box, the same situation comes to evoke mainly one response: press lever.]
Operant Conditioning
B.F. Skinner (1904–1990)
- elaborated Thorndike's Law of Effect
- developed behavioral technology
Operant Conditioning
A type of learning in which behavior is strengthened if followed by reinforcement or diminished if followed by punishment.
*Called "instrumental" conditioning because you use an "instrument" (reinforcement or punishment) to shape behavior.
Operant Conditioning
Operant Behavior: operates (acts) on the environment; produces consequences.
Respondent Behavior: occurs as an automatic response to a stimulus; behavior learned through classical conditioning.
Operant Chamber
Skinner Box: a chamber with a bar or key that an animal manipulates to obtain a food or water reinforcer; contains devices to record responses.
What is the instrument?
What behavior is strengthened?
Instruments of Conditioning
Types of Conditioning: Reinforcement (increases the desired behavior) or Punishment (decreases the undesirable behavior).
Instruments: Positive (a stimulus is given) or Negative (a stimulus is removed).
- Positive Reinforcement: give something, increase behavior
- Negative Reinforcement: remove something, increase behavior
- Positive Punishment: give something, decrease behavior
- Negative Punishment: remove something, decrease behavior
Our Class:
Experiment 1:
What is the instrument?
Experiment 2:
What is the instrument?
Which worked best?
Operant Conditioning
Classroom Practice
Sheldon Choc 1.mp4
Sheldon Choc 2.mp4
Sheldon Choc 3.mp4
1. What is the desired behavior?
2. What is the instrument?
https://www.youtube.com/watch?v=LhI5h5JZi-U
Operant Conditioning
In shaping, successively closer versions of a desired response are reinforced (as in learning to play tennis).
In chaining, each part of a sequence is reinforced; the different parts are put together into a whole (as in learning the steps to a dance).
Punishment
An aversive event that decreases the behavior that it follows; a powerful controller of unwanted behavior.
Punishment
Problems with Punishment
- Does not teach or promote an alternative, acceptable behavior
- May produce undesirable results such as hostility, passivity, or fear
- Likely to be temporary
- May model aggression
Samples
Positive Reinforcement
Negative Reinforcement
Positive Punishment
Negative Punishment
Operant Conditioning Processes
Primary Reinforcement is unlearned and usually necessary for survival. Food is the best example of a primary reinforcer. Examples?
Secondary Reinforcement is anything that comes to represent a primary reinforcer, such as praise from a friend or a gold star on a homework assignment. Also called a conditioned reinforcer. Examples?
Common Terminology
Classical vs. Operant
Acquisition
  Classical: the CS is repeatedly paired with the UCS.
  Operant: the behavior is repeatedly FOLLOWED BY reinforcement/punishment.
Extinction
  Classical: the CS is no longer paired with the UCS, so the CR is extinguished.
  Operant: the behavior is no longer reinforced/punished, so it is extinguished.
Spontaneous Recovery
  Classical: after a period of extinction, the CR returns in the presence of the CS.
  Operant: the behavior suddenly reappears (after being extinguished) in the presence of the reinforcer.
Generalization
  Classical: will respond to stimuli similar to the CS.
  Operant: will respond to reinforcement/punishment similar to the original.
Discrimination
  Classical: will respond ONLY to the CS.
  Operant: will respond ONLY to the original reinforcement/punishment.
Schedules of Reinforcement
- Immediate Reinforcers: to our detriment, small but immediate reinforcements are sometimes more alluring than big but delayed reinforcements.
- Continuous Reinforcement: reinforcing the desired response each time it occurs.
- Partial/Delayed/Intermittent Reinforcement: reinforcing a response only part of the time; results in slower acquisition but greater resistance to extinction.
Schedules of Reinforcement
- Fixed Ratio (FR): reinforces a response only after a specified number of responses.
  - The faster you respond, the more rewards you get.
  - Different ratios can be used.
  - Produces a very high rate of responding.
  - Like piecework pay.
Schedules of Reinforcement
- Variable Ratio (VR): reinforces a response after an unpredictable number of responses.
  - Ratios vary around an average.
  - Like gambling or fly fishing.
  - Very hard to extinguish because of the unpredictability.
Schedules of Reinforcement
- Fixed Interval (FI): reinforces a response only after a specified time has elapsed.
  - Responding occurs more frequently as the anticipated time for reward draws near.
Schedules of Reinforcement
- Variable Interval (VI): reinforces a response at unpredictable time intervals.
  - Produces slow, steady responding.
  - Like a pop quiz or fishing.
Practice Worksheet
Fixed or Variable?
Ratio or Interval?
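For the worksheet, it can help to see the four partial schedules as simple rules. Below is a minimal Python sketch, not from the lecture; the function names, parameters, and example numbers are illustrative assumptions. Each function answers one question: does this response earn reinforcement?

```python
import random

# Illustrative sketch of the four partial-reinforcement schedules.
# Function names, parameters, and the example numbers are assumptions
# for demonstration; they are not part of the lecture material.

def fixed_ratio(response_count, n=5):
    # FR: reinforce only after a specified number of responses (every nth press).
    return response_count % n == 0

def variable_ratio(avg_n=5):
    # VR: reinforce after an unpredictable number of responses,
    # averaging one reinforcer per avg_n responses (like a slot machine).
    return random.random() < 1.0 / avg_n

def fixed_interval(seconds_since_last_reward, interval=60):
    # FI: reinforce the first response after a specified time has elapsed.
    return seconds_since_last_reward >= interval

def variable_interval(seconds_since_last_reward, avg_interval=60):
    # VI: reinforce after an unpredictable amount of time has elapsed
    # (like a pop quiz); the required wait is drawn around the average.
    required_wait = random.expovariate(1.0 / avg_interval)
    return seconds_since_last_reward >= required_wait

# Examples: the 10th lever press on an FR-5 schedule is reinforced;
# a response 45 seconds into an FI-60 schedule is not.
print(fixed_ratio(10, n=5))    # True
print(fixed_interval(45))      # False
```

Note how the "ratio" rules depend on a response count while the "interval" rules depend on elapsed time; that distinction is exactly what the worksheet asks you to identify.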
Consideration of Future Consequences Scale
1 = Extremely Uncharacteristic
2 = Somewhat Uncharacteristic
3 = Uncertain
4 = Somewhat Characteristic
5 = Extremely Characteristic
Number 1 - 12
Delay of Gratification
Delay of Gratification Scale
Marshmallow Study
Walter Mischel (Columbia University)
Tracked children longitudinally
Kids who could wait went on to achieve more
Marshmallow Test