reinforcement
OPERANT CONDITIONING
Changing Behavior Through Reinforcement and Punishment
OPERANT CONDITIONING
•
Learning Objectives:
1. Outline the principles of operant conditioning.
2. Explain how learning can be shaped through the use of reinforcement schedules and
secondary reinforcers.
OPERANT CONDITIONING
•
Operant Conditioning
– learning based on the consequences of behavior
– may involve the learning of new behaviors
HOW REINFORCEMENT AND PUNISHMENT INFLUENCE
BEHAVIOR
•
Edward L. Thorndike
– First scientist to systematically study
operant conditioning
– Observed cats trying to escape from
puzzle boxes
– Developed law of effect:
• Responses that produce a
pleasant outcome are likely to be
repeated in a similar situation.
• Responses that produce an
unpleasant outcome are less
likely to be repeated in a similar
situation.
•
B. F. Skinner
– Expanded on Thorndike’s ideas to
develop a more complete set of
principles to explain operant
conditioning
– Created specially designed
environments called operant chambers
or Skinner boxes to study learning
systematically
HOW REINFORCEMENT AND PUNISHMENT INFLUENCE
BEHAVIOR
•
Operant chamber or “Skinner box”
– Cage large enough for a rodent or bird
– Contains a bar or key that the
organism can press or peck to release
food or water
– Contains a device to record the
animal’s responses
HOW REINFORCEMENT AND PUNISHMENT INFLUENCE
BEHAVIOR
•
Positive reinforcement is a more effective way to change behavior than punishment, for two reasons:
– Punishment creates only a temporary change in behavior.
– Punishment creates a negative and adversarial relationship with the individual providing
the punishment.
CREATING COMPLEX BEHAVIORS THROUGH OPERANT
CONDITIONING
•
Continuous reinforcement
– A response is reinforced each
time it occurs.
• example: Each time a dog
rolls over, it receives a
biscuit.
– Leads to rapid initial learning,
but also to poor resistance to
extinction
•
Partial (or intermittent)
reinforcement
– A response is sometimes
reinforced, sometimes not.
• example: When you hold a
door for someone, sometimes
you are reinforced with a
smile or a “thank you,” but
sometimes you aren’t.
– Leads to slower initial learning,
but also to greater resistance to
extinction
CREATING COMPLEX BEHAVIORS THROUGH OPERANT
CONDITIONING
•
Schedules based on the number of responses (ratio types) induce a greater
response rate than do schedules based on elapsed time (interval types).
•
Unpredictable schedules (variable types) also produce stronger responses than
do predictable schedules (fixed types).
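The four partial reinforcement schedules can be sketched as small decision rules that answer one question: should this response be reinforced? The Python below is only an illustrative sketch, not a model from the slides. The function names, the use of seconds as the time unit, and the uniform random draws used for the variable schedules are all assumptions made for the example.

```python
import random

def fixed_ratio(n):
    """Fixed ratio: reinforce every n-th response."""
    count = 0
    def on_response(time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return on_response

def variable_ratio(mean_n, rng):
    """Variable ratio: reinforce after an unpredictable number of
    responses that averages mean_n (uniform draw is an assumption)."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)
    def on_response(time):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response

def fixed_interval(seconds):
    """Fixed interval: reinforce the first response made at least
    `seconds` after the previous reinforcer."""
    last = 0.0
    def on_response(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False
    return on_response

def variable_interval(mean_seconds, rng):
    """Variable interval: like fixed interval, but the required wait
    changes unpredictably around mean_seconds (uniform draw is an assumption)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)
    def on_response(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return on_response
```

Each function returns a rule that is called once per response with the time of that response. For example, `press = fixed_ratio(3)` reinforces the third, sixth, and ninth bar presses, and `fixed_ratio(1)` is the same as continuous reinforcement. The ratio rules ignore the clock and the interval rules ignore the response count, which is the distinction the slide draws.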
CREATING COMPLEX BEHAVIORS THROUGH OPERANT
CONDITIONING
•
Shaping
– Process of guiding an organism’s behavior to the desired outcome through the use of
successive approximations to a final desired behavior
• allows the creation of complex behaviors
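Shaping by successive approximations can be pictured as a loop in which the criterion for reinforcement tightens each time the organism meets it. The sketch below is a deliberately simple toy, not a validated learning model. It assumes that behavior can be reduced to a single number, that responses vary randomly around the learner's current tendency, and that a reinforced response becomes the new tendency. The function `shape` and all of its parameters are hypothetical.

```python
import random

def shape(target, start, tolerance, rng, max_trials=10_000):
    """Toy shaping loop: reinforce only responses that come closer to
    the target than any response reinforced so far."""
    tendency = start                  # the learner's typical response (assumed model)
    criterion = abs(target - start)   # how far from the target still earns a reinforcer
    reinforced = []
    for _ in range(max_trials):
        # Behavior varies naturally around the current tendency.
        response = tendency + rng.uniform(-1.0, 1.0)
        error = abs(target - response)
        if error < criterion:
            reinforced.append(response)
            tendency = response       # the reinforced response becomes more typical
            criterion = error         # require a closer approximation next time
        if criterion <= tolerance:
            break
    return tendency, reinforced
```

Calling `shape(10.0, 0.0, 0.1, random.Random(1))` moves the tendency from 0 toward 10 in small reinforced steps. No single step has to produce the final behavior, which is why shaping can build behaviors that would rarely occur on their own.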
CREATING COMPLEX BEHAVIORS THROUGH OPERANT
CONDITIONING
• Primary reinforcers
– Stimuli that are naturally
preferred by the organism
• examples include food,
water, relief from pain
• Secondary reinforcers
– Neutral events that have
become associated with a
primary reinforcer through
classical conditioning
• one example is money
OPERANT CONDITIONING
•
Key Takeaways
– Edward Thorndike developed the law of effect: the principle that responses that create a
pleasant outcome in a particular situation are more likely to occur again in a similar
situation, whereas responses that produce an unpleasant outcome are less likely to occur
again.
– B. F. Skinner expanded on Thorndike’s ideas to develop a set of principles to explain
operant conditioning.
– Positive reinforcement strengthens a response by presenting something pleasant after the
response, whereas negative reinforcement strengthens a response by reducing or removing
something unpleasant.
OPERANT CONDITIONING
•
Key Takeaways, continued
– Positive punishment weakens a response by presenting something unpleasant after the
response, whereas negative punishment weakens a response by reducing or removing
something pleasant.
– Reinforcement may be either partial or continuous. Partial reinforcement schedules are
determined by whether the reinforcement is presented on the basis of the time that
elapses between reinforcements (interval) or on the basis of the number of responses that
the organism engages in (ratio), and by whether the reinforcement occurs on a regular
(fixed) or unpredictable (variable) schedule.
– Complex behaviors may be created through shaping, the process of guiding an organism’s
behavior to the desired outcome through the use of successive approximations to a final
desired behavior.
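The reinforcement and punishment takeaways above reduce to a two-by-two table: whether the stimulus is pleasant or unpleasant, and whether it is presented or removed after the response. The lookup below is a small sketch of that table. The enum and function names are illustrative choices, not terms from the slides.

```python
from enum import Enum

class Stimulus(Enum):
    PLEASANT = "pleasant"
    UNPLEASANT = "unpleasant"

class Change(Enum):
    PRESENTED = "presented"
    REMOVED = "reduced or removed"

# (stimulus, change) -> (name of the consequence, effect on the response)
CONSEQUENCES = {
    (Stimulus.PLEASANT, Change.PRESENTED): ("positive reinforcement", "strengthens"),
    (Stimulus.UNPLEASANT, Change.REMOVED): ("negative reinforcement", "strengthens"),
    (Stimulus.UNPLEASANT, Change.PRESENTED): ("positive punishment", "weakens"),
    (Stimulus.PLEASANT, Change.REMOVED): ("negative punishment", "weakens"),
}

def classify(stimulus, change):
    """Return the consequence type and its effect on the response."""
    return CONSEQUENCES[(stimulus, change)]
```

For instance, `classify(Stimulus.UNPLEASANT, Change.REMOVED)` returns `("negative reinforcement", "strengthens")`. In this table "positive" and "negative" refer only to presenting or removing a stimulus, while "reinforcement" and "punishment" refer to whether the response becomes more or less likely.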