Reinforcement & Punishment:
What is an SR?
Lesson 8
What is an SR?
Thorndike’s Law of Effect
Satisfiers & annoyers
Skinner
determined by how B changes
reinforcer: ↑ B
punisher: ↓ B
Primary reinforcers & punishers
biologically important stimuli ~
What is an SR? (continued)
Secondary reinforcers & punishers
money
praise
How do they become an SR?
Classical Conditioning
Higher order learning ~
Drive Reduction View
(50s & 60s)
Similar to Law of Readiness
Relative state of deprivation required
for a basic drive
thought to always be true
Drive → motivation → B → reduction of drive state (SR)
~
But...
Sometimes hard to identify drive
What drive is this? ~
Sensory reinforcement
Sensory stimulus unrelated to
biological drive
monkeys learn response
reward is watching toy train
rats learn to bar press
reward = turning on a light
or turning off light ~
Premack Principle
Commonly used in educational setting
impractical or unethical to use food
Thought of reinforcers as responses
press bar → eating response
wider application of I/O conditioning
Differential probability principle
High probability responses
reinforce low probability responses ~
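The differential probability principle can be sketched as a simple comparison of baseline response probabilities. This is only an illustration: the function name `can_reinforce` and the baseline proportions are hypothetical, chosen to mirror the bar press / eating example.

```python
# Hypothetical baseline proportions of free time spent on each response.
# These numbers are illustrative assumptions, not data from the lecture.
baseline = {"eating": 0.50, "bar_pressing": 0.02}


def can_reinforce(reinforcer, target, probs):
    """Return True if `reinforcer` is the more probable response at baseline.

    Under the differential probability principle, only a higher-probability
    response is expected to reinforce a lower-probability one.
    """
    return probs[reinforcer] > probs[target]


# Eating (high probability) should reinforce bar pressing (low probability),
# but not the other way around.
print(can_reinforce("eating", "bar_pressing", baseline))
print(can_reinforce("bar_pressing", "eating", baseline))
```

The check depends only on relative baseline probabilities, so it gives a different answer whenever those probabilities shift.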
Premack Principle
Homme et al (1963)
Unruly 3 year olds
High probability behaviors
ignored teacher
screaming
pushing furniture
Low probability behavior
sitting quietly ~
Premack Principle: Homme et al
Rewarded sitting quietly with...
3 min of running around screaming
Results: sitting quietly increased
Particular behaviors observed in different kids
different responses were effective reinforcers for different kids ~
Premack Principle
Charlop, Kurtz, & Casey (1990)
autistic children
High probability behaviors
echolalia
perseveration
Low probability behaviors
adding up coins
judging objects: same or different ~
Premack Principle: Charlop et al
[Figure: % correct responses (y-axis) by # of sessions (x-axis), comparing echolalia RFT and food RFT]
Premack Principle: Problems
Fluctuation of response probabilities
e.g., sometimes kid would rather
play outside than play video games
Solution: token economies
Does not explain how reinforcer
increases response probability ~
Behavioral Regulation Approach
Response deprivation
limit access to a response
does not require high vs. low probability
Behavioral homeostasis
preferred distribution of activities
operant conditioning imposes limits
behavioral bliss point
e.g., time spent studying vs. video games ~
Behavioral Regulation Approach
A behavior is limited below bliss point
disturbance of behavioral homeostasis
analogous to increased biological drive
Contingency set during I/O procedure
establishes relationship between responses
B moves toward bliss point (baseline) ~
Behavioral Regulation Approach
Low probability behaviors as reinforcers
observe baseline rate of behavior
limit activity below baseline
require a response to engage in deprived behavior
contingency
Increase toward bliss point
costs vs. benefits determine how much ~
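The procedure above (observe baseline, restrict below baseline, require a response) can be sketched as a check on whether a contingency actually restricts an activity. This is a rough illustration under stated assumptions: the function names, the minutes-per-day units, and all numbers are hypothetical, and the check simply asks whether performing the required response at its own baseline level would earn less of the restricted activity than its baseline.

```python
def earned_at_baseline(baseline_required, required_per_unit, earned_per_unit):
    """Amount of the restricted activity earned if the required response
    is performed only at its own baseline level."""
    return baseline_required * earned_per_unit / required_per_unit


def is_deprived(baseline_required, baseline_restricted,
                required_per_unit, earned_per_unit):
    """True if the contingency holds the restricted activity below baseline,
    so access to it should act as a reinforcer."""
    earned = earned_at_baseline(baseline_required, required_per_unit,
                                earned_per_unit)
    return earned < baseline_restricted


# Hypothetical baselines (minutes per day): studying 60, video games 120.
# Contingency A: every 30 min of studying earns 15 min of video games.
print(is_deprived(60, 120, 30, 15))

# Contingency B: every 60 min of video games earns 10 min of studying.
# Studying is the lower-probability activity but is still restricted below
# its baseline, so it could serve as the reinforcer.
print(is_deprived(120, 60, 60, 10))
```

Nothing in the check compares the two baselines with each other, which is why this approach does not require a high vs. low probability response.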
What Becomes Connected?
Skinner?
refused to consider associations
Thorndike: S-R view (SD-B)
association between stimulus context and response
NOT the outcome (SR)
no representation of reinforcer ~
S-R-O (SD-B-SR) view: Tinklepaugh (1928)
Goal-oriented responding
respond with idea of getting reward
The monkey and the hidden banana
2 cups, put banana under 1
task: choose cup with banana
Secretly substituted rotten lettuce
monkey became agitated
Expected banana reward (outcome) ~
S-R vs. S-R-O
Adams & Dickinson (1981)
Taste aversion paradigm
Associate sucrose (sweetener)
w/ lithium chloride (LiCl)-induced illness
Will rats press bar to get something that
makes them sick? ~
S-R vs. S-R-O
Phase 1:
Trained rats to bar press for sucrose
Phase 2:
associate sucrose w/ illness
Phase 3:
Will rats press bar now?
no sucrose delivered ~
S-R vs. S-R-O : Results
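Predictions?
If S-R-O: rats should not press bar (outcome is now aversive)
If S-R: rats should still press bar (outcome is not represented)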
Results
Rats did not press bar
Supports S-R-O ~
S-R vs. S-R-O
Use different levels of training
Phase 1: Same procedure but…
some get 100 RFTs
some get 500 RFTs ~
Results & Conclusions
Less training → low response rate
Little training → outcome important
S-R-O
Extensive training → high response rate
outcome less important
response is well established
S-R ~
Parallel learning in humans
Learning a skill
e.g., to drive a car
Early trials
consider consequences
must think about what you are doing
After extensive experience
becomes automatic
after many trials ~
Extrinsic Reward vs Intrinsic Motivation
Early trials
expectation of reinforcer
extrinsic reward
CER = positive affect
Well-established behavior
no expectation of reward
intrinsic motivation
CER = positive affect ~