Evidence of RO Associations


S-R; S-O; S-(R-O); OH! OH!
What motivates and directs instrumental behavior?
• Two different approaches:
– Associative Structure
• used by Thorndike and Pavlov
• focus on "molecular" short-term mechanisms
– Response Allocation
• used by Skinner
• focus on long-range goals or functions of the behavior
• is anchored in ecology and economics
• Instrumental conditioning limits the free flow of activity
– Choice with commitment
– Self-control
The Associative Structure of Instrumental Conditioning
• Three-Term Contingency (Skinner)
– The instrumental response (R) occurs in the presence of distinctive stimuli (S)
and results in the delivery of the outcome (O)
– Three components in instrumental learning situation
• ( S ) any environmental stimuli signaling
– Specific cues, tone, light, odor, etc.
– Context: a complex of stimuli for a place and/or time
• ( R ) behavior producing the outcome
• ( O ) either appetitive or aversive outcomes
• Several Types of Associations
– Instrumental conditioning permits the development of several types of
associations:
• S-R association: discriminative stimulus can become directly associated with the
response
• S-O association: discriminative stimulus can become associated with the
outcome (basically a Pavlovian association)
• R-O association: response becomes associated with the outcome
– See Figure 7.1
The Principles of Learning and Behavior, 7e by Michael Domjan
Copyright © 2015 Wadsworth Publishing, a division of Cengage Learning. All rights reserved.
The S-R Association and the Law of Effect
• Thorndike: an S-R association forms between the stimuli present in the
experimental situation and the instrumental response
• Law of Effect: the role of the reinforcer (or response outcome) is to
‘stamp in’ an association between the contextual cues (S) and the
instrumental response (R)
– The outcome (O) did not enter into any associations with either S or R.
• S - R can explain maintenance of habitual behavior (routines) such
as drug taking
– Associations form between responses (behavior) and context stimuli
• Context stimuli such as “physical settings” or “particular people” preceding a
behavior sequence
– perception of context stimuli or behavioral sequences activates the response
– Once formed, habits occur automatically
• without regard to outcome
• Such as compulsive eating, gambling or drug addiction
Expectancy of Reward and the S-O Association
• In Pavlovian conditioning
– animals learn about stimuli that signal some important event
– CS tone then US Food
• Reward expectancy
– Role of Pavlovian processes in instrumental conditioning
– Like a Pavlovian CS, (S) also becomes associated with outcomes
• Two-Process Theory: Hull (1930s) and Spence (1950s)
– thought that both S-R and S-O associations are acquired
Two-Process Theory
• (Rescorla & Solomon 1967)
– A ‘central emotional state’ such as expectancy
• classically conditioned stimuli (CSs) signal the arrival of the US
– The instrumental response is motivated by two factors
• the stimulus (e.g., a tone) comes to evoke the response directly, through a
Thorndikian S-R association
• the instrumental response comes to be made in response to the expectancy of
reward because of an S-O association
– S comes to motivate the instrumental behavior by activating a central
emotional state
• This emotional state can be positive or negative
• can therefore facilitate or interfere with instrumental conditioning
• Use transfer-of-control experimental design to test two-process theory
Transfer-of-control experiments
• Transfer-of-control experimental design (see Table 7.1)
– Using appetitive food US
– phase 1: operant conditioning, press Lever to get Food
– phase 2: Pavlovian conditioning, Tone paired with Food
– phase 3: Transfer phase
• while subject is pressing lever to get food occasionally turn on the tone
• rate of lever pressing will increase
• Transfer-of-control experimental design
– Suppression of responding with aversive shock US
– phase 1: operant conditioning, press Lever to get Food
– phase 2: Pavlovian conditioning, Tone paired with Footshock
– phase 3: Transfer phase
• while subject is pressing lever to get food occasionally turn on the tone
• what happens to the rate of lever pressing?
Conditioned Emotional States or Reward-specific Expectancies?
• Evidence for Reward-Specific Expectancies
– A CS paired with a specific reward, such as solid food, can facilitate
instrumental behavior; however, it has
• more influence on instrumental behavior reinforced with solid food
• less influence on instrumental behavior reinforced with sugar water
– Kruse et al. (1983)
• Some animals have CS1 --- solid food during Pavlovian conditioning
• Other animals have CS2 --- sucrose in water during Pavlovian conditioning
• Then tested with food or sucrose rewarded instrumental responding
• Facilitation was greatest when outcomes were the same
– So these results are inconsistent with two-process theory
• If it was just a central emotional state effect then this preferential facilitation
would not be present
• There is an emotional state effect plus specific reward type
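The contrast between the two accounts can be laid out as a small Python sketch of the 2 x 2 design (CS outcome by instrumental reinforcer). This is only an illustration of the qualitative predictions summarized above: the function names and the labels "facilitation", "stronger facilitation", and "weaker facilitation" are hypothetical, and no numbers from Kruse et al. (1983) are reproduced.

```python
from itertools import product

OUTCOMES = ["solid food", "sucrose"]  # both appetitive


def general_state_prediction(cs_outcome, instrumental_outcome):
    """Central emotional state only: any appetitive CS should help equally.

    The arguments are deliberately ignored, because on this account the
    identity of the reward does not matter, only its appetitive valence.
    """
    return "facilitation"


def reward_specific_prediction(cs_outcome, instrumental_outcome):
    """Reward-specific expectancy: a matching outcome helps more."""
    if cs_outcome == instrumental_outcome:
        return "stronger facilitation"
    return "weaker facilitation"


def design_table():
    """Return one row per cell of the 2 x 2 design with both predictions."""
    rows = []
    for cs, instr in product(OUTCOMES, OUTCOMES):
        rows.append(
            (cs, instr,
             general_state_prediction(cs, instr),
             reward_specific_prediction(cs, instr))
        )
    return rows


if __name__ == "__main__":
    for cs, instr, general, specific in design_table():
        print(f"CS -> {cs:10s} | R -> {instr:10s} | {general} | {specific}")
```

Only the second function distinguishes matched from mismatched cells, which is the pattern the slide reports ("Facilitation was greatest when outcomes were the same").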
R-O and S(R-O) Relationships
• Evidence of R-O Associations
• R-O association intuitively makes sense
– turn key to open the lock
• Use the reinforcer devaluation procedure to assess R-O
associations
– When the outcome is food, devalue it by
• reducing hunger by feeding before testing
• conditioning a taste aversion so that the food becomes aversive
– Substantial evidence of R-O associations
• Hogarth and Chase (2011); see Figure 7.4
– Student smokers
• Responding for either pictures of cigarettes or chocolate
• Devalue cigarettes by smoking before the test
– Reduced responding for cigarette pictures
• Devalue chocolate by eating chocolate before testing
– Reduced responding for chocolate pictures
R-O and S(R-O) Relationships
• Hierarchical S(R-O) Relations
• In addition to the simple associations of 2 elements (i.e., S-R, S-O,
R-O), can also have hierarchical associations
• the (S) signals the relationship between a response and its outcome
S -> (R ---- O)
• the (S) becomes an occasion setter that signals when a specific
response will be followed by a specific reinforcer
– For example (S) can be a context, a place such as a casino or a restaurant
– Experimental approach
• One (S) tone signals (R) lever push – (O) food
• Another (S) light signals (R) pull string – (O) sucrose
• Then switch the (S) – (R-O) combinations
– Tone -> (pull string – (O) sucrose) produces less responding
• considerable support for S(R-O) relationships
Response Interactions in Pavlovian Instrumental Transfer
• Response Interactions
– Pavlovian CS tone can produce overt behaviour related to food
– Example of sign-tracking behavior
– Tendency to move towards signals for food as part of foraging
• Krank (2008)
– Instrumental conditioning for access to alcohol
• Rats push a lever to get alcohol
– Setup with two levers, both on a VI 20-second schedule, to train consistent behavior
– Then Pavlovian conditioning
• CS light located above the levers followed by access to alcohol
• Produces sign-tracking behavior
• Rats approach and sniff the light
– Finally, tested light effect while lever pressing for alcohol
• If light was above the lever it increased responding on that lever
• See Figure 7.2
Response Allocation and Behaviour Economics
• Theoretical perspective on instrumental (operant) behavior that is
not based on molecular S-R, S-O, R-O associations.
• Based on the tradition of Skinnerian free-operant responding
• Also based on Behavior Systems theory approach
– Ethological perspective of animals distributing their behavior during foraging
– so this is a more molar approach, looking at behavior across some time span
Antecedents of Behavior Regulation
• Do reinforcers have “special stimulus properties” that satisfy
biological need states, such as the need for food or water?
• Thorndike: A stimulus that produces a “satisfying state of affairs”
– the problem with this definition is that it is circular
• Operational Definition (behaviorists):
– That which increases the probability of the response that preceded it.
• Problems with reinforcers as special types of stimuli
– Food will not always reinforce behavior, e.g., when the animal is not hungry
– Saccharin (an artificial sweetener) works as a reinforcer even though it is
non-nutritive, i.e., it is not food
– Movies and thrill rides are also reinforcers, although they satisfy no
biological need
– So anything that you need, want, or like could be a reinforcer; there are no
special reinforcer stimuli
• But all of these examples do have “special response properties”
• Both Consummatory Response Theory and the Premack Principle
approached reinforcers from this perspective
Consummatory Response Theory
• Sheffield (1954) Consummatory Response Theory
– Consummatory behavior as part of behavior systems approach
– species-typical consummatory responses (chewing, licking, swallowing, etc)
– "reinforcer responses" any behavior that completes the sequence
• Study consummatory response not the reinforcer stimuli
– special because they occur in the presence of a reinforcer.
– Eating behavior such as chewing and swallowing can occur for any reinforcing
“food” item.
• A problem with this approach in instrumental conditioning
– Instrumental behaviors are “typically low probability” unusual behaviors
• Bar pressing is unusual and not consummatory
• Pecking at a key light is more typical but not consummatory
• David Premack (1965) had a solution to this problem
The Premack Principle
• Reinforcer responses (e.g., consummatory behaviors) are special because they are
more likely to occur than other behaviors
– Hungry animals are likely to eat whereas thirsty animals are likely to drink
– To predict what will be reinforcing, observe the baseline frequency of
different behaviors
– Differential Probability Principle
• Highly probable behaviors will reinforce less probable behaviors
• OR: Highly desirable behaviors will reinforce less desirable behaviors
• Reinforcer responses are special because they are more probable than
instrumental responses
• For example: Put a hungry rat into a Skinner box
– If free food is available, how much bar pressing will occur?
– But when food access is restricted and made contingent on bar pressing then
what will the rat do?
– Bar Pressing is Low Probability behavior while Eating Food is High Probability
behavior
• Make high probability eating contingent on low probability bar pressing
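The differential probability principle above can be sketched as a short Python function that ranks behaviors by baseline probability and lists which response is predicted to reinforce which. This is a minimal illustration, not Premack's procedure: the function name `premack_pairs` and the baseline proportions are hypothetical and were chosen only to mirror the hungry-rat example.

```python
def premack_pairs(baseline):
    """Return (reinforcer, instrumental) pairs predicted by the Premack principle.

    baseline maps each behavior to its probability (e.g., proportion of
    session time) measured when the subject is free to respond.
    A behavior is predicted to reinforce any behavior with a strictly
    lower baseline probability; ties yield no prediction.
    """
    ranked = sorted(baseline, key=baseline.get, reverse=True)
    pairs = []
    for i, high in enumerate(ranked):
        for low in ranked[i + 1:]:
            if baseline[high] > baseline[low]:
                pairs.append((high, low))
    return pairs


if __name__ == "__main__":
    # Hypothetical baseline for a hungry rat with free access to food
    hungry_rat = {"eating": 0.60, "bar_pressing": 0.02}
    for reinforcer, instrumental in premack_pairs(hungry_rat):
        print(f"{reinforcer} should reinforce {instrumental}")
```

Because the prediction depends only on the baseline ranking, changing the deprivation conditions (and therefore the baseline) can reverse which behavior acts as the reinforcer.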
The Premack Principle
• Tested using two different methods
• Manipulate response probability by changing deprivation conditions for
different responses (e.g., drinking and wheel-running) with rats
– Rats like to drink sugar water and they like to run in wheels so both of these are
high probability behaviors see Figure 7.5
– Either sucrose (sugar water) or running will work as a reinforcer for lever pressing
– Rats required to drink water to get access to running (Premack, 1964)
• Young children given a choice between eating candy and playing a pinball
machine (Premack, 1965)
– Measure response probability during a baseline phase when a subject is free to
respond
– allows for predictions to be made of whether one response will reinforce another
in the response contingency phase
– arrange the contingency so that the low probability behavior is required to access
the high probability behavior
• for some children this would be eating candy to get access to the pinball machine
• for other children this would be playing pinball to get access to eating candy
FIGURE 7.5
Rate of lever pressing during successive 5-minute periods of a fixed interval 30-second schedule
reinforced with access to a running wheel or access to various concentrations of sucrose. (Based on
Belke & Hancock, 2003.)
Application of the Premack Principle
• Working with schizophrenic patients who prefer to sit (Mitchell, 1973)
– Sitting should work as a reinforcer
– Could use sitting as a reinforcer for doing some work
• patients were required to work on a task to get a chance to sit down
• increased the amount of time they were willing to engage in the task
• Working with autistic children
– high rates of delayed echolalia or perseverative behavior see Figure 7.6
– Train academic skills by using either food or perseverative behavior as reinforcers
– academic skills improved when delayed echolalia or perseverative behavior was
used as a reinforcer but not with food
• Encouraging Sammy to eat new foods by using preferred foods to
reinforce eating new foods
– a child with learning difficulties and chronic food refusal
– would only eat a limited number of food items
– the preferred food items were used as the incentive for the new food items
– for example a familiar flavor of yogurt would be given if Sammy ate a little bit of a
new flavor of yogurt
– this was effective in increasing eating across several food items
FIGURE 7.6
Task performance for two children with autism. One student’s behavior was reinforced with food or the
opportunity to engage in delayed echolalia. Another student’s behavior was reinforced with food or the
opportunity to engage in perseverative responding. (Responding during baseline periods was also
reinforced with food.) (Based on “Using Aberrant Behaviors as Reinforcers for Autistic Children,” by M.
H. Charlop, P. F. Kurtz, and F. G. Casey, Journal of Applied Behavior Analysis, 23, pp. 163–181.)
The Response-Deprivation Hypothesis
• Timberlake and Allison (1974)
– revised the Premack Principle
– response deprivation is responsible for instrumental behavior
– the opportunity to engage in an activity will become reinforcing if you are
currently deprived of that activity
• Similar to choice behavior concepts
• Several behaviors you want to do such as going to a movie, sleeping, etc.
• However you need to study for a test
• Studying for the test deprives you of sleep and other activities
• Those deprived activities will become reinforcers
– If you study for two hours then you can do those activities
– all organisms have an ideal distribution of behaviors that would occur in the
absence of restrictions
– These ideas were further developed in The Response Allocation Approach
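The response-deprivation idea can be sketched as a simple check in Python. The slides do not give a formula; the sketch assumes the usual formal statement of the hypothesis, in which a schedule requiring I units of the instrumental response for C units of the contingent response deprives the subject of the contingent response when I/C is greater than the baseline ratio Oi/Oc. The function name and the study/leisure numbers are hypothetical.

```python
def is_response_deprived(required_instrumental, allowed_contingent,
                         baseline_instrumental, baseline_contingent):
    """Return True if the schedule restricts the contingent response
    below its baseline level (assumed condition: I/C > Oi/Oc).

    All four quantities must be in the same units (e.g., minutes) and
    non-negative. The inequality is cross-multiplied so that a zero
    denominator does not raise an error.
    """
    values = (required_instrumental, allowed_contingent,
              baseline_instrumental, baseline_contingent)
    if any(v < 0 for v in values):
        raise ValueError("all quantities must be non-negative")
    return (required_instrumental * baseline_contingent
            > allowed_contingent * baseline_instrumental)


if __name__ == "__main__":
    # Hypothetical unrestricted evening: 60 min studying, 180 min leisure
    baseline_study, baseline_leisure = 60, 180

    # Schedule A: 120 min of studying earns 60 min of leisure
    print(is_response_deprived(120, 60, baseline_study, baseline_leisure))

    # Schedule B: 10 min of studying earns 60 min of leisure
    print(is_response_deprived(10, 60, baseline_study, baseline_leisure))
```

Under these assumed numbers, Schedule A holds leisure below its baseline share and so leisure should act as a reinforcer for studying, whereas Schedule B does not restrict leisure and no reinforcement effect is predicted.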