PSY402 Theories of Learning


PSY402
Theories of Learning
Monday
December 1, 2003
Chapter 9 – Contemporary Theories
Contemporary Theories

Shift from global theories (e.g.,
Hull’s drive theory) to theories
about specific aspects of learning.



Global theories were about operant
responding, not classical conditioning.
An animal’s biology influences whether,
what, and how fast it can learn.
Cognitive view requires emphasis on
specific cognitive processes.
Contemporary Theories (Cont.)

Classical Conditioning:



Nature of the CR – stimulus
substitution theory and SOP theory
Predictiveness of the CS – Rescorla-Wagner associative model, comparator
theory, attentional theory, retrospective
processing approach.
Operant Conditioning:


Nature of reinforcement
Behavioral economics
Stimulus-Substitution Theory


What is the nature of the CR – is it
just the UCR or is it different?
Pavlov – stimulus-substitution
theory:


The CS stimulates the same areas of
the brain as the UCS, producing the
same response.
Activation of CS with UCS establishes
neural connection between brain areas.
Conditioned Opponent Response

The CR and UCR are often different:


CR of fear is different from UCR of pain.
Siegel – best evidence of difference:




Morphine (UCS) produced analgesia,
reduced pain (UCR)
Light or tone (CS) produced
hyperalgesia, increased pain (CR).
Rats remove paws from heat quickly
with CS, slowly with UCS.
Insulin works the same way: the drug (UCS) lowers blood sugar, but the CS raises it.
Drug Tolerance Overdoses

Elimination of a CS results in a
stronger response to the UCS (the drug).


Extinction of responding to environmental cues strengthens drug response
Changing the context in which a
drug is administered increases
response to the drug.

Novel environment does not elicit an
opponent CR.
SOP Theory


Sometimes Opponent-Process
theory (SOP) – explains why CR
varies.
UCS elicits primary A1 (fast) and
secondary A2 (longer) responses.


A1 & A2 can be same or different.
Conditioning only occurs to A2 – the
CR is always an A2 response.

When A1 & A2 differ, UCR & CR differ.
Two-Phase Reactions

Shock – results in:




A1 – Initial agitated hyperactivity
A2 – Long-lasting hypoactivity
(freezing)
CER elicited by CS is A2
Morphine – results in:



A1 – sedation or hypoactivity
A2 – hyperactivity two hours later
CR elicited by CS is hyperactivity
More Support for SOP Theory


Rabbit eyeblink mechanisms
support the idea of two-phases.
Backward conditioning – learning
occurs if the CS is presented just
before the peak of the A2 response.

Larew – conditioning occurred with a
31 sec lapse but not 60 sec or 1 sec.
Affective Extension of SOP Theory


Why do different A2 responses have
different optimal CS-UCS intervals?
Two distinct UCR sequences activate
distinct A1 & A2 sequences:



Sensory
Emotive
These distinct sequences can have
different strengths, time scales
(latencies), or eliciting CS’s.
Rescorla-Wagner Theory

There is a maximum associative
strength between CS and UCS.


Strength gained on each training
trial depends on prior training.



UCS determines the limit
More learning early, less later on
Rate of conditioning varies.
Conditioning of a CS depends on
prior conditioning to other stimuli.
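
These points follow from the usual statement of the Rescorla-Wagner rule, ΔV = αβ(λ − ΣV): λ is the maximum strength the UCS supports, α and β are learning-rate terms for the CS and UCS, and ΣV is the summed strength of every cue present on the trial. The sketch below is only an illustration; the function name `rw_trial`, the cue names, and all parameter values are assumptions, not part of the slides.

```python
def rw_trial(strengths, present, alpha, beta=1.0, lam=1.0):
    """Apply one Rescorla-Wagner trial in place.

    strengths: dict cue -> current associative strength V
    present:   cues presented on this trial
    alpha:     dict cue -> salience (learning rate for that cue)
    beta:      learning rate tied to the UCS
    lam:       maximum strength the UCS supports
    """
    total = sum(strengths[c] for c in present)   # summed V of all cues present
    error = lam - total                          # room left below the limit
    for c in present:
        strengths[c] += alpha[c] * beta * error
    return strengths


# Single cue: more learning early, less later on.
V = {"tone": 0.0}
gains = []
for _ in range(10):
    before = V["tone"]
    rw_trial(V, ["tone"], alpha={"tone": 0.3})
    gains.append(V["tone"] - before)
print([round(g, 3) for g in gains])

# Prior conditioning to another cue: pretrain the light, then train light + tone.
V2 = {"light": 0.0, "tone": 0.0}
a2 = {"light": 0.3, "tone": 0.3}
for _ in range(30):
    rw_trial(V2, ["light"], a2)
for _ in range(30):
    rw_trial(V2, ["light", "tone"], a2)
print(round(V2["light"], 3), round(V2["tone"], 3))
```

In the second run the light already holds almost all of the available strength, so the tone gains very little even though it is paired with the UCS just as often.
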
UCS Preexposure Effect



If the UCS is encountered without
the CS prior to pairing of the two,
less learning occurs.
UCS becomes associated with other
environmental stimuli (without CS).
Since there is a limit to association
strength, some is drained off by
such prior associations.

CS-UCS association is weakened.
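
The "draining off" argument can be made concrete by treating the context as one more cue in the Rescorla-Wagner update, ΔV = α(λ − ΣV). This is a rough sketch under stated assumptions: the cue names, the salience values, and the trial counts are all hypothetical.

```python
def rw_update(V, present, alpha, lam=1.0):
    """One Rescorla-Wagner trial: every cue present shares the same error term."""
    error = lam - sum(V[c] for c in present)
    for c in present:
        V[c] += alpha[c] * error


def train_cs(preexposure_trials, pairing_trials=10):
    """Return CS strength after optional UCS-alone trials in the context."""
    V = {"context": 0.0, "cs": 0.0}
    alpha = {"context": 0.1, "cs": 0.3}      # assumed saliences
    for _ in range(preexposure_trials):      # UCS without the CS
        rw_update(V, ["context"], alpha)
    for _ in range(pairing_trials):          # CS + UCS in the same context
        rw_update(V, ["context", "cs"], alpha)
    return V["cs"]


control = train_cs(preexposure_trials=0)
preexposed = train_cs(preexposure_trials=20)
print(round(control, 3), round(preexposed, 3))
```

With these values the preexposed CS ends up with far less strength than the control CS, because the context has already claimed most of the limit set by the UCS.
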
Problems with Rescorla-Wagner

Overshadowing – salient cues have
more associative strength.




Sometimes a salient cue potentiates
another cue instead of overshadowing.
Garcia says cues are indexed.
R-W says cues are seen as a unitary
stimulus.
Unclear which explanation is
correct.
More Problems

CS preexposure effect – appearance
of CS without UCS prior to learning
weakens learning.


Shouldn’t have any effect according to
Rescorla-Wagner theory, but it does.
Cue-deflation effect – extinction of a
more salient cue enhances learning
for the less salient cue.

Should be no change according to R-W.
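
Both "no effect" predictions can be checked directly against the Rescorla-Wagner update, ΔV = α(λ − ΣV), with λ = 0 on trials without the UCS. The sketch below is illustrative only; the cue names and saliences are assumed values.

```python
def rw_update(V, present, alpha, lam):
    """One Rescorla-Wagner trial; only the cues present are updated."""
    error = lam - sum(V[c] for c in present)
    for c in present:
        V[c] += alpha[c] * error


alpha = {"salient": 0.4, "weak": 0.1}        # assumed saliences

# CS preexposure: CS alone, no UCS (lam = 0), starting from zero strength.
V = {"salient": 0.0, "weak": 0.0}
for _ in range(20):
    rw_update(V, ["weak"], alpha, lam=0.0)
after_preexposure = V["weak"]                # no error term, so no change

# Cue deflation: train the compound, then extinguish the salient cue alone.
V = {"salient": 0.0, "weak": 0.0}
for _ in range(20):
    rw_update(V, ["salient", "weak"], alpha, lam=1.0)
weak_before = V["weak"]
for _ in range(20):
    rw_update(V, ["salient"], alpha, lam=0.0)
weak_after = V["weak"]                       # the weak cue was absent, so untouched

print(after_preexposure, weak_before == weak_after)
```

In the model, preexposure leaves the CS at zero and extinction of the salient cue leaves the weak cue exactly where it was, which is why the observed effects count as problems for the theory.
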
Comparator Theory

If two CS’s are associated,
extinction of one should reduce
responding to the other.


Sometimes true, other times not.
CS-UCS associations exist for many
stimuli but are exhibited only for
the strongest.

CS’s are judged in relation to each
other.
Attentional View


Mackintosh – learned irrelevance
occurs during preexposure of CS.
Animals exposed to a novel stimulus
exhibit an orienting response.



No orienting with preexposure.
Habituation results in failure of
conditioning.
Pairing of CS/UCS in novel context
results in learning.
Retrospective Processing


Most theories assume the level of
responding will be constant after
learning.
Baker & Mercier suggest association
can change after learning.



Retrospective processing – CS-UCS
contingency reevaluated after learning.
Backward blocking – support for theory
Suggests animals have mental
representations, memory for events.
Operant Conditioning

Nature of reinforcement:



Premack’s probability differential theory
Response deprivation theory
Behavioral economics:




Behavioral allocation – blisspoint
Choice behavior – Herrnstein’s
matching law.
Momentary maximization theory
Delay-reduction theory
Probability-Differential Theory

Premack – a reinforcer can be any
activity that is more likely to occur
than the reinforced behavior.


Manipulators vs eaters
High probability behaviors can be
used as reinforcers of low
probability behaviors.

Frequency of the reinforcer decreases
when it is made contingent on another
response.
Response Deprivation Theory

Timberlake & Allison – deprivation
occurs when an activity is used as a
reinforcer and is not freely emitted.



The activity is reinforcing because it
satisfies the deprivation created.
The animal tries to return to its pre-deprivation level of responding.
Activities can be reinforcing even if
their baselines were not higher.
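
The response deprivation condition is usually written I/C > Oi/Oc: the schedule requires I units of the instrumental response for C units of access to the contingent activity, and Oi and Oc are the freely emitted baseline levels of the two. A small Python sketch follows; the function name and the numbers (minutes of drinking and running) are hypothetical.

```python
def creates_deprivation(instrumental_req, contingent_access,
                        baseline_instrumental, baseline_contingent):
    """True if the schedule holds the contingent activity below its baseline.

    Condition: I / C > Oi / Oc, cross-multiplied to avoid dividing by zero.
    """
    return (instrumental_req * baseline_contingent
            > contingent_access * baseline_instrumental)


# Assumed baseline: 40 min drinking (instrumental), 10 min running (contingent).
# Running has the LOWER baseline, yet it can still act as a reinforcer.
strict = creates_deprivation(10, 1, 40, 10)    # 10 min drinking buys 1 min running
lenient = creates_deprivation(4, 2, 40, 10)    # 4 min drinking buys 2 min running
print(strict, lenient)
```

The strict schedule restricts running below its baseline, so running should reinforce drinking; the lenient schedule does not restrict it, so no reinforcement effect is expected.
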
Behavioral Allocation

Blisspoint (paired basepoint) – the
free operant level of two responses.



Unrestricted responding with two
choices of behaviors.
Blisspoint is used to figure out how
much behavior an animal will
engage in to obtain a reward.
Animals try to get as close to the
blisspoint as possible.
Problems with Contingencies


Blisspoint is established by looking
at behavior before a contingency is
established.
The established contingency must
take blisspoint into account or it
may not increase desired behavior.
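
One way to read "as close to the blisspoint as possible" is a minimum-distance model: the contingency confines behavior to a line, and the animal settles at the point on that line nearest the blisspoint. The sketch below assumes straight-line (Euclidean) distance and a simple ratio schedule; the function name and all numbers are hypothetical.

```python
def closest_to_blisspoint(bliss_instrumental, bliss_contingent, ratio):
    """Point on the schedule line nearest the blisspoint.

    The schedule forces instrumental = ratio * contingent, a line through
    the origin with direction (ratio, 1). The nearest point is the
    orthogonal projection of the blisspoint onto that line.
    """
    t = (bliss_instrumental * ratio + bliss_contingent) / (ratio * ratio + 1.0)
    return ratio * t, t   # (instrumental, contingent)


# Assumed blisspoint: 10 units of the instrumental response, 40 of the reward.
effective = closest_to_blisspoint(10, 40, ratio=1.0)
ineffective = closest_to_blisspoint(10, 40, ratio=0.25)
print(effective, ineffective)
```

With a one-to-one contingency the instrumental response rises above its free level of 10. With a ratio of 0.25 the schedule line passes through the blisspoint itself, so the animal can stay there and the desired behavior does not increase.
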
Choice Behavior

Herrnstein’s matching law –
describes how animals act when
they have two or more choices.



Different responses have different
schedules of reinforcement.
Responding to each choice is
proportionate to the reinforcement for
each choice – after learning.
This can be expressed mathematically.
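
In its simplest two-choice form the matching law is usually written B1 / (B1 + B2) = R1 / (R1 + R2), where B is the rate of responding and R the rate of reinforcement on each alternative. The Python sketch below applies the same proportion to any number of choices; the function name and the reinforcement rates are illustrative assumptions.

```python
def matching_shares(reinforcement_rates):
    """Predicted share of responding for each alternative.

    Strict matching: share_i = R_i / (sum of all R).
    """
    total = sum(reinforcement_rates)
    if total <= 0:
        raise ValueError("at least one alternative must deliver reinforcement")
    return [r / total for r in reinforcement_rates]


# Two schedules delivering 40 and 20 reinforcers per hour (assumed values).
shares = matching_shares([40, 20])
print([round(s, 3) for s in shares])
```

Under strict matching, the alternative that pays off twice as often receives two thirds of the responses.
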
Delayed Gratification

Why does anyone choose a smaller
reward part of the time?


Animals and people typically choose a
small immediate reward over a larger
delayed reward.
Large rewards are selected when:


The choice is made in advance of
reward.
Reinforcers are not visible or reward is
already present (pleasurable activity).
Complexities of the Matching Law

Maximizing law – sometimes the
aim is to obtain as many rewards as
possible.




Explains FR-10 vs FR-40 schedules.
Doesn’t work for VI vs VR schedules.
Momentary maximization theory –
choose best alternative at the time.
Delay reduction theory – choose
what will get the reward the fastest.