learning - Frazier
LEARNING
Chapter 8
Daisy, daisy, who shall it be?
Who shall it be who will marry me?
Rich man, poor man, beggarman, thief,
Doctor, lawyer, merchant, chief,
Tinker, tailor, soldier, sailor,
Who shall it be who will marry me?
(Old children’s nursery rhyme)
What is NOT Learning?
• Reflexes
• Instincts
• Imprinting (certain animals form
attachments during a critical period early in
life)
Researched by Konrad Lorenz
What is Learning?
•Any relatively permanent change in behaviour
potential that occurs because of experience
•We learn by association
•Conditioning - basic kind of learning
Ivan Petrovich Pavlov, 1849-1936
• “What the hell difference does a revolution make ...
when you’ve got work to do in the lab?”
• Classical or Pavlovian
Conditioning
Classical Conditioning
•Studied digestion
•Noticed salivation
Components of Classical conditioning
•Environmental conditioning of involuntary
behaviour
–Unconditioned stimulus (UCS, US)
–Unconditioned response (UCR, UR)
•unconditioned = unlearned
–Conditioned stimulus (CS)
–Conditioned response (CR)
Processes of Conditioning
•Acquisition
•best learned when the CS is presented about 1/2 sec
before the UCS
Processes of Conditioning
•Extinction
Processes of Conditioning
•Spontaneous Recovery
How Long Does it Take?
• John Garcia’s experiments found that conditioning
can take just one pairing
• The Garcia effect: when you experience
nausea after eating a food, that food
becomes a CS to provoke nausea as a CR.
The Study of Learning
•Behaviourism
–approach to psychology, U.S., 20th century
–led by John Watson, carried further by B.F.
Skinner
–Emphasized observable behaviour, NOT the
unconscious (which was Freud’s focus).
John Broadus Watson, 1878-1958
•1913 – He declared
psychology a failure
•“failed to establish
itself as a natural
science”
•“should be a purely
objective experimental
branch of natural
science”
Environmental Determinism
•“Give me a dozen healthy infants well formed
and my own specified world to bring them up in
and I’ll guarantee to take any one at random and
train him to become any type of specialist I
might select - doctor, lawyer, artist, merchant,
chief, and yes, even beggarman and thief, regardless of
the talents, tendencies, abilities,
vocations, and race of his ancestors.”
–1925
Fear conditioning
•Pairing of a neutral stimulus (NS) with a fear-provoking object
–very powerful
–resistant to extinction
–can occur after only 1 pairing
–model for phobias
–Little Albert (Watson/Rayner)
Generalization vs. Discrimination
• Generalization: Responding to stimuli that are
similar to the conditioned stimulus
• Discrimination: Responding ONLY to the
exact conditioned stimulus
Other applications
Instrumental Conditioning
•Edward Thorndike (1874-1949)
– The Law of Effect
“Behavior that is rewarded will be repeated”
Burrhus Frederic Skinner, 1904-1990
Radical Behaviourism
Operant Conditioning (Skinner)
•Conditioning or learning of voluntary behaviors
Operant Conditioning
• Terminology
– Reinforcement: a consequence that
INCREASES the likelihood of the behavior
– Punishment: a consequence that DECREASES
the likelihood of the behavior
Operant Conditioning
•Negative Reinforcement: Two variations
–Avoidance Behavior/Conditioning: behaving so that
an unpleasant consequence is avoided (it never starts)
–Escape Behavior/Conditioning: behaving so that an
unpleasant consequence stops
Schedules of Reinforcement
•Continuous Reinforcement—consequence occurs
for every target behavior
•Partial or Intermittent Reinforcement—
consequence does NOT occur for every behavior,
but for some
•*Actually better for keeping the desired behavior going once it is learned*
Continuous vs Partial Reinforcement
•Partial reinforcement
–slower acquisition, but
–MORE resistant to extinction
–Skinner’s findings:
–A pigeon on a schedule of continuous reinforcement
may peck a key 50 to 100 times after reinforcement
has been cut off
–after some types of intermittent reinforcement, the
bird will peck from 4000 to 10,000 times before
responding extinguishes
Continuous vs Partial Reinforcement
• Partial Reinforcement Effect
– Parents often learn this the hard way!
– “I want candy.” “No.” “I want candy.” “No.”
“I want candy.” “No.” “I want candy.” “Well,
okay, just this once.”
– The demanding has just been REINFORCED!
Schedules of Reinforcement
•Ratio = reinforcing according to the number of
responses/behaviors
•Interval = reinforcing according to time elapsed
•Different combinations:
•Fixed-Ratio (FR): getting paid by the piece
•Variable-Ratio (VR): slot machines
•Fixed-Interval (FI): candy on Fridays
•Variable-Interval (VI): a realtor sells a house
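The four schedules above can be sketched as a small simulation. This is only an illustrative toy, not anything from Skinner's procedures: the class names RatioSchedule and IntervalSchedule, the helper count_reinforcers, the "one response per time unit" rule, and the uniform distributions used for the variable schedules are all assumptions made for the example.

```python
import random
from typing import Callable


class RatioSchedule:
    """Reinforce after a required NUMBER of responses (hypothetical helper)."""

    def __init__(self, next_requirement: Callable[[], int]):
        self._next = next_requirement
        self._required = next_requirement()
        self._count = 0

    def respond(self, t: float) -> bool:
        # Time is ignored: only the response count matters.
        self._count += 1
        if self._count >= self._required:
            self._count = 0
            self._required = self._next()
            return True
        return False


class IntervalSchedule:
    """Reinforce the first response after a required TIME has elapsed."""

    def __init__(self, next_wait: Callable[[], float]):
        self._next = next_wait
        self._wait = next_wait()
        self._last = 0.0  # time of the last reinforcer

    def respond(self, t: float) -> bool:
        if t - self._last >= self._wait:
            self._last = t
            self._wait = self._next()
            return True
        return False


def count_reinforcers(schedule, n_responses: int = 100) -> int:
    # Assumption: the learner responds once per time unit (t = 1, 2, ...).
    return sum(schedule.respond(float(t)) for t in range(1, n_responses + 1))


rng = random.Random(0)  # seeded so the variable schedules are repeatable
fr5 = RatioSchedule(lambda: 5)                          # fixed ratio
vr5 = RatioSchedule(lambda: rng.randint(1, 9))          # variable ratio, mean 5
fi10 = IntervalSchedule(lambda: 10.0)                   # fixed interval
vi10 = IntervalSchedule(lambda: rng.uniform(0.0, 20.0)) # variable interval, mean 10

print(count_reinforcers(fr5))   # 20: every 5th of 100 responses
print(count_reinforcers(fi10))  # 10: one reinforcer per 10 time units
print(count_reinforcers(vr5), count_reinforcers(vi10))
```

Fixed versus variable only changes how the next requirement is drawn; ratio versus interval changes whether that requirement counts responses or elapsed time.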
Reinforcers
• Primary
– Rewards that have biological or “natural”
value (food)
• Vs.
• Secondary
– Rewards that we must learn to value (money,
approval from others)
Reinforcers
• Immediate: follow directly after the
behavior (cookie for picking up toys)
• Vs.
• Delayed: a period of time passes
(paychecks)
Motivation
• Skinner would object to discussing this: not
observable!
• Extrinsic Motivation
– For rewards that originate from outside
• vs.
• Intrinsic Motivation
– For rewards that bring personal satisfaction
Motivation
• Overjustification
• When extrinsic rewards are emphasized too
greatly, they can short-circuit the development
of intrinsic motivation
Shaping
•Rewarding successive approximations
•
•
•
•
(“baby steps”)
toward the
overall desired
behavior
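The idea of rewarding successive approximations can be sketched in a few lines. This is a toy model under stated assumptions, not a description of any actual training protocol: behavior is reduced to a single number, the function name shape and the parameters start_criterion and tighten are hypothetical, and the criterion is simply halved after every reward.

```python
def shape(responses, target, start_criterion, tighten=0.5):
    """Return (response, reinforced) pairs for a toy shaping session.

    A response is reinforced when it lies within `criterion` of `target`.
    After each reinforcement the criterion shrinks by the factor `tighten`,
    so only closer approximations earn the next reward.
    """
    criterion = start_criterion
    log = []
    for r in responses:
        reinforced = abs(r - target) <= criterion
        log.append((r, reinforced))
        if reinforced:
            criterion *= tighten  # demand a closer "baby step" next time
    return log


session = shape([2, 5, 4, 8, 7, 10], target=10, start_criterion=8)
for response, reinforced in session:
    print(response, "reinforced" if reinforced else "not reinforced")
```

In this run the responses 2, 8, and 10 are reinforced: each one meets a criterion that is stricter than the one before it.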
Biological Predispositions
•Skinner underestimated importance of biology
•Conditioning principles are constrained by
biological predispositions of each species
•(what animals can and can’t do, no matter how
rewarding your reinforcement)
•“Never try to teach a pig to sing. It wastes your
time and annoys the pig.” (commonly attributed to Mark Twain;
the line appears in Robert A. Heinlein’s Time Enough for Love)
Observational Learning
•Bandura - Social Learning Theory
–we learn by watching what others do and what
happens to them for doing it
–vicarious learning
–vicarious reinforcement
–vicarious punishment
•Imitation: children tend to imitate adult models
–classic Bobo doll study
Observational Learning
•People may learn antisocial behavior simply by
observing it on TV