Instrumental Conditioning II



Delay of Reinforcement
Grice (1948)
[Diagram of the delayed-reinforcement procedure: from the start box the animal makes a correct or incorrect choice, passes through a delay section, and reaches the goal box, where a correct choice yields reward and an incorrect choice yields no reward.]
Grice (1948) Results
[Figure: percent correct (y-axis, 20 to 100) as a function of trials (x-axis, 100 to 700) for groups trained with reinforcement delays of 0 s, 0.5 s, 1.2 s, 2 s, 5 s, and 10 s. Acquisition becomes slower and poorer as the delay increases.]
Overcoming the effects of delay
• Secondary reinforcers
• “Marking” procedure (Lieberman, McIntosh & Thomas, 1979)
Reinforcement versus Punishment
Positive contingency: reinforcement = chocolate bar; punishment = electric shock
Negative contingency: reinforcement = excused from chores; punishment = no TV privileges
Effect on rate of behavior: reinforcement increases it; punishment decreases it.
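To keep the four cells straight, here is a minimal lookup sketch in Python; the function name and labels are illustrative, not part of the lecture.

```python
# Toy lookup for the reinforcement/punishment 2x2 classification.
# "positive"/"negative" = stimulus presented vs. removed (the contingency);
# reinforcement increases the behavior, punishment decreases it.

def classify(stimulus_change: str, behavior_change: str) -> str:
    """stimulus_change: 'presented' or 'removed'; behavior_change: 'increases' or 'decreases'."""
    contingency = {"presented": "positive", "removed": "negative"}[stimulus_change]
    process = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{contingency} {process}"

# The four cells of the table above (everyday examples in parentheses):
print(classify("presented", "increases"))  # positive reinforcement (chocolate bar)
print(classify("presented", "decreases"))  # positive punishment    (electric shock)
print(classify("removed", "increases"))    # negative reinforcement (excused from chores)
print(classify("removed", "decreases"))    # negative punishment    (no TV privileges)
```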
Anticipatory Contrast - Crespi (1942)
Rats run down a maze to find food pellets in the goal arm.
[Figure: running speed (ft/sec, 0 to 4.5) across trials for three groups: 256→16 pellets, 16→16 pellets, and 1→16 pellets.]
What is a reinforcer?
Thorndike: A stimulus that produces a “satisfying state of affairs.”
Operational Definition (behaviorists): That which increases the probability of the response that preceded it.
Drive Reduction Theory
Compare with Set Point
[Diagram: the amount of H2O in the body drives whether the organism seeks water or does not seek water.]
Drive Reduction Considered: Are reinforcers necessary for survival?
– Eating to excess
– Drugs of Abuse
– “Pleasure centers” of the brain
Behavioral Regulation View: The Premack Principle
• Behaviors are reinforcing, not stimuli
• To predict what will be reinforcing, observe the baseline frequency of different behaviors
• Highly probable behaviors will reinforce less probable behaviors
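One way to see the Premack prediction concretely is to compare baseline probabilities. A minimal Python sketch with made-up baseline observations (the activities and minutes are illustrative, not from the lecture):

```python
# Premack principle sketch: a behavior with a higher baseline probability
# is predicted to reinforce a behavior with a lower baseline probability.
# Baseline data are hypothetical minutes spent on each activity during free access.

baseline_minutes = {
    "playing video games": 120,
    "doing homework": 20,
    "practicing piano": 40,
}

total = sum(baseline_minutes.values())
baseline_prob = {b: m / total for b, m in baseline_minutes.items()}

def premack_predicts_reinforcement(contingent: str, instrumental: str) -> bool:
    """Will access to `contingent` reinforce `instrumental`?
    Premack: only if the contingent behavior is more probable at baseline."""
    return baseline_prob[contingent] > baseline_prob[instrumental]

print(premack_predicts_reinforcement("playing video games", "doing homework"))  # True
print(premack_predicts_reinforcement("doing homework", "playing video games"))  # False
```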
Premack Revised: The Response Deprivation Hypothesis
Timberlake & Allison (1974)
• Low-frequency behaviors can reinforce high-frequency behaviors (and vice versa)
• All behaviors have a preferred frequency: the behavioral bliss point
• Deprivation below that frequency is aversive, and organisms will work to remedy it
Response deprivation hypothesis
[Figure: the ice cream scale, in pints per night, from 0.25 to 2.5. The bliss point is 1.0 pints/night; below it, the organism will work to obtain ice cream, and above it, the organism will work to avoid ice cream.]
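A minimal sketch of the response-deprivation logic, using the 1.0 pints/night bliss point from the figure; the schedule values are illustrative.

```python
# Response deprivation hypothesis sketch: a behavior restricted below its
# bliss point (preferred baseline level) can serve as a reinforcer, while
# access forced above the bliss point is something the organism works to avoid.

BLISS_POINT_PINTS = 1.0  # preferred ice cream intake per night (from the figure)

def response_deprivation(scheduled_pints: float) -> str:
    """What does the hypothesis predict for a given nightly allotment?"""
    if scheduled_pints < BLISS_POINT_PINTS:
        return "deprived below bliss point: will work to obtain more ice cream"
    if scheduled_pints > BLISS_POINT_PINTS:
        return "forced above bliss point: will work to avoid ice cream"
    return "at bliss point: no reinforcing or punishing effect predicted"

for pints in (0.25, 1.0, 2.5):
    print(f"{pints:4} pints/night -> {response_deprivation(pints)}")
```

The contrast with Premack: even a low-probability behavior can act as a reinforcer if the schedule restricts it below its bliss-point level.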
Contiguity versus Contingency in operant conditioning
Degraded Contingency Effect
[Diagram: event records of bar presses and food deliveries. With a perfect contingency, food follows only bar presses and responding is strong; with a degraded contingency, food is also delivered independently of pressing and responding is weak.]
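One standard way to quantify what “degraded” means, not spelled out on the slide, is the contingency ΔP = P(food | press) - P(food | no press). A minimal sketch with made-up probabilities:

```python
# Contingency as delta-P: P(outcome | response) - P(outcome | no response).
# A perfect contingency has delta-P near 1; delivering "free" food in the
# absence of responding degrades the contingency even though each earned
# pellet is still contiguous with a press.

def delta_p(p_food_given_press: float, p_food_given_no_press: float) -> float:
    return p_food_given_press - p_food_given_no_press

perfect = delta_p(p_food_given_press=1.0, p_food_given_no_press=0.0)
degraded = delta_p(p_food_given_press=1.0, p_food_given_no_press=0.8)

print(f"perfect contingency:  delta-P = {perfect:.1f} -> strong responding")
print(f"degraded contingency: delta-P = {degraded:.1f} -> weak responding")
```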
G.V. Thomas (1983)
Contiguity pitted against contingency
“Free” reinforcers are given every 20 s. A lever press advances delivery of the pellet but cancels the pellet for the next 20-s interval.
[Timeline: free pellets scheduled at 20 s, 40 s, and 60 s.]
So if you press at second 2, you get a pellet immediately, but you get no pellet during seconds 3-20 and 21-40.
G.V. Thomas (1983)
Contiguity pitted against contingency
[Diagram: the same timeline, annotated to show that a lever press early in the first interval produces an immediate pellet but loses the pellet scheduled for the next interval.]
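To make the conflict concrete, here is a toy walk-through of the schedule as described above; the helper name and the handling of repeated presses are illustrative assumptions.

```python
# Toy walk-through of the Thomas (1983) schedule over the 60-s timeline shown
# on the slide. Free pellets are scheduled at 20 s, 40 s, and 60 s. A press
# delivers the current interval's pellet immediately (good contiguity) but
# cancels the pellet scheduled for the following interval (negative contingency).

def pellets(press_times, horizon=60, interval=20):
    """Return the times at which pellets are delivered."""
    deliveries = []
    cancelled = set()                      # intervals whose pellet is lost
    pressed = set()                        # intervals whose pellet came early
    for t in sorted(press_times):
        k = t // interval                  # which 20-s interval the press is in
        if k not in cancelled and k not in pressed:
            deliveries.append(t)           # immediate, well-contiguous pellet
            pressed.add(k)
        cancelled.add(k + 1)               # next interval's free pellet is lost
    for k in range(horizon // interval):   # deliver any surviving free pellets
        if k not in cancelled and k not in pressed:
            deliveries.append((k + 1) * interval)
    return sorted(deliveries)

print("no presses:        ", pellets([]))   # [20, 40, 60] -> 3 pellets
print("one press at t=2 s:", pellets([2]))  # [2, 60]      -> 2 pellets
```

Pressing still wins on contiguity (food arrives immediately after the response) even though the overall contingency is negative: the presser ends up with fewer pellets than the non-presser.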
“Superstitious Behavior”
• Skinner suggested that temporal contiguity is more important than contingency
• Food delivered on a 15-s fixed-time (FT) schedule, with no response requirement
• “Adventitious reinforcement”
“In 6 out of 8 cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making 2 or 3 turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage….”
[Figure: examples of behaviors recorded under the fixed-time schedule: orienting toward the feeder, pecking near the feeder, moving along the wall, and quarter turns.]
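A toy simulation of how adventitious reinforcement could produce this; the behavior list, weights, and strengthening rule are illustrative assumptions, not Skinner's analysis.

```python
import random

# Food arrives every 15 s regardless of behavior, but whatever the bird happens
# to be doing right before food gets strengthened. Over a session, one
# arbitrary behavior tends to snowball.

random.seed(0)
behaviors = ["orient toward feeder", "peck near feeder", "move along wall", "quarter turn"]
weights = {b: 1.0 for b in behaviors}          # start with no preference

def emit() -> str:
    """Pick what the bird does this second, in proportion to current weights."""
    return random.choices(behaviors, weights=[weights[b] for b in behaviors])[0]

for second in range(1, 30 * 60 + 1):           # a 30-minute session
    doing_now = emit()
    if second % 15 == 0:                       # fixed-time food delivery
        weights[doing_now] *= 1.5              # accidental strengthening

total = sum(weights.values())
print({b: round(w / total, 2) for b, w in sorted(weights.items(), key=lambda kv: -kv[1])})
# Typically one arbitrarily "lucky" behavior ends up with most of the probability mass.
```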
“Misbehavior” and the limits of operant conditioning
Limits of Operant Conditioning
• Some behaviors can’t be conditioned
– Yawning
– Scratching
• Belongingness
– Presentation of a female won’t reinforce biting
• “Misbehavior”
Marian Breland Bailey – How to train a chicken
The famous dancing chicken
What is learned in operant conditioning?
What is learned?
Edwin Guthrie: mere contiguity of a stimulus and a behavior stamps in that S-R connection; reinforcement is not necessary.
[Diagram: S → R]
What is learned?
Thorndike: reinforcement “stamps in” the S-R connection.
[Diagram: S → R, with the reinforcer strengthening the connection]
What is learned?
[Diagram: S, R, and O (the outcome), with a question mark over which associations are learned]
2-Process Theory
[Diagram: an operant S-R association together with a Pavlovian S-O association; through the Pavlovian association, S also comes to evoke a CR.]
Evidence for 2-process theory
Pavlovian-Instrumental Transfer
Phase 1 (operant): Lever → Food
Phase 2 (Pavlovian): Light → Food
Test: number of lever presses measured with the Light on versus with no CS
[Figure: more presses during the Light than during no CS.]
The presence of the CS intensifies operant responding.
What is learned?
[Diagram: S, R, and O, with question marks over the nature of the associations]
Does the Pavlovian S-O association activate a vague emotional state or a specific mental representation of the outcome?
Specific Outcome Representations
Trapold
Phase 1 (operant): Right lever → Pellet; Left lever → Sucrose
Phase 2 (classical): Tone → Pellet; Light → Sucrose
Test: During the Tone, does the animal press left or right? During the Light, left or right?
[Figure: number of left versus right presses during each CS at test.]