Transcript SDT

Signal Detection Theory
March 1, 2016
Some Psychometrics!
• Response data from a perception experiment is usually
organized in the form of a confusion matrix.
• Data from Peterson & Barney (1952)
• Each row corresponds to the stimulus category
• Each column corresponds to the response category
Detection
• In a detection task (as opposed to an identification task),
listeners are asked to determine whether or not a signal
was present in a stimulus.
• For example: do the following clips contain release
bursts?
• Potential response categories:
                      Signal                   Response
Hit:                  Present (in stimulus)    “Present”
Miss:                 Present                  “Absent”
False Alarm:          Absent                   “Present”
Correct Rejection:    Absent                   “Absent”
Confusion, Simplified
• For a detection task, the confusion matrix boils down to
just two stimulus types and two response options…
                              (Response Options)
                              Present        Absent
(Stimulus Types)  Present     Hit            Miss
                  Absent      False Alarm    Correct Rejection
• Notice that a bias towards “present” responses will
increase totals of both hits and false alarms.
• (Just like increasing the  criterion!)
• Likewise, a bias towards “absent” responses will
increase the number of both misses and correct rejections.
Canned Examples
• From the text: in session 1, listeners are rewarded for
“hits”. The resultant confusion matrix looks like this:
             Present   Absent
Present         82        18
Absent          46        54
• The “correct” responses (hits and correct rejections) = 82 + 54 = 136
Canned Examples
• In session 2, the listeners are rewarded for “correct
rejections”…
             Present   Absent
Present         55        45
Absent          19        81
• The “correct” responses (hits and correct rejections) = 55 + 81 = 136
• Moral of the story: simply counting the number of
“correct responses” does not satisfactorily tell you what
the listener is doing…
• And response bias is not determined by what they can
or cannot perceive in the signal.
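To make the moral concrete, here is a small sketch that tallies both sessions. The dictionary layout and the `summarize` helper are illustrative choices, not something from the text; only the four counts per session come from the examples above.

```python
# Confusion matrices from the two sessions: keys are stimulus types,
# values are response counts in the order ("present", "absent").
sessions = {
    "session 1": {"present": (82, 18), "absent": (46, 54)},
    "session 2": {"present": (55, 45), "absent": (19, 81)},
}

def summarize(matrix):
    """Return the total correct plus the hit and false-alarm rates."""
    hits, misses = matrix["present"]
    false_alarms, correct_rejections = matrix["absent"]
    return {
        "correct": hits + correct_rejections,
        "hit_rate": hits / (hits + misses),
        "false_alarm_rate": false_alarms / (false_alarms + correct_rejections),
    }

summaries = {name: summarize(m) for name, m in sessions.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Both sessions total 136 correct responses, but the hit and false-alarm rates differ sharply, which is exactly what the raw count hides.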
Detection Theory
• Signal Detection Theory: a “parametric” model that
predicts when and why listeners respond with each of the
four different response types in a detection task.
• “Parametric” = response proportions are derived
from underlying parameters
• Assumption #1: listeners base response decisions on
the amount of evidence they perceive in the stimulus for
the presence of a signal.
• Evidence = gradient variable.
[Figure: a continuous scale of perceptual evidence]
The Criterion
• Assumption #2: listeners respond positively when the
amount of perceptual evidence exceeds some internal
criterion measure.
criterion ()
“absent” responses
“present” responses
perceptual evidence
• evidence > criterion  “present” response
• evidence < criterion  “absent” response
The Distribution
• Assumption #3: the amount of perceived evidence for a
particular stimulus includes random variation…
• and the variation is distributed normally.
[Figure: normal curve; x-axis: perceptual evidence, y-axis: frequency]
→ The categorization of a particular stimulus will vary
between trials.
Normal Facts
• The normal distribution is defined by two parameters:
• mean (= “average”) ()
• standard deviation ()
• The mean = center point of values in the distribution
• The standard deviation = “spread” of values around the
mean in the distribution.
[Figure: normal distribution with the standard deviation (σ) marked]
Comparisons
• Assumption #4: the perceptual evidence for both
“absent” and “present” stimuli in a detection task will be
distributed normally.
• Generally speaking:
• the mean of the “present” distribution will be higher
on the evidence scale than that of the “absent”
distribution.
• Assumption #5: both “absent” and “present” distributions
will have the same standard deviation.
• (This is the simplest version of the model.)
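The five assumptions can be sketched as a small simulation. Everything numeric here is an illustrative assumption rather than a value from the slides: the “absent” mean is set to 0, the “present” mean to 2, the shared standard deviation to 1, the criterion to 1, and the random seed is fixed only so the run is repeatable. The `simulate` function is a hypothetical helper.

```python
import random

def simulate(mean_present, criterion, mean_absent=0.0, sd=1.0,
             n_trials=10_000, seed=0):
    """Simulate a detection task under the equal-variance normal model.

    Returns (hit rate, false-alarm rate).
    """
    rng = random.Random(seed)
    # Assumptions 3-5: evidence on each trial is normally distributed,
    # with the same standard deviation for both stimulus types.
    present = [rng.gauss(mean_present, sd) for _ in range(n_trials)]
    absent = [rng.gauss(mean_absent, sd) for _ in range(n_trials)]
    # Assumption 2: respond "present" when evidence exceeds the criterion.
    hit_rate = sum(e > criterion for e in present) / n_trials
    fa_rate = sum(e > criterion for e in absent) / n_trials
    return hit_rate, fa_rate

hit_rate, fa_rate = simulate(mean_present=2.0, criterion=1.0)
print(f"hits: {hit_rate:.3f}, false alarms: {fa_rate:.3f}")
```

With the criterion midway between the two means, the hit rate should land near .84 and the false-alarm rate near .16, give or take sampling noise.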
Interpretation
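[Figure: overlapping “absent” and “present” distributions split by the criterion. “Absent” stimuli: correct rejections below the criterion, false alarms above it. “Present” stimuli: misses below the criterion, hits above it.]
• Important: the criterion level is the same for both
types of stimuli…
• …but the means of the two distributions differ.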
Sensitivity
• The distance (on the perceptual evidence scale)
between the means of the distributions reflects the
listener’s sensitivity to the distinction.
• Q: How can we estimate this distance?
• A: We measure the distance of the criterion from
each mean.
• In normal distributions, this distance:
• determines the proportion of responses on either
side of the criterion
• We can use z-scores to standardize our distance
measures!
Z-Scores
[Figure: normal curve split at the criterion into hits and misses]
• Example 1: criterion at the mean (μ)
• Z-score = 0
• 50% hits, 50% misses
• 50% present responses, 50% absent responses
Z-Scores
[Figure: normal curve split at the criterion into hits and misses]
• Example 2: criterion one standard deviation below the
mean (μ)
• Z-score = -1
• 84.1% hits, 15.9% misses
Z-Scores
[Figure: normal curve split at the criterion into hits and misses]
• Note: P(Hits) = 1 - P(Misses)
• → z(P(Hits)) = z(1 - P(Misses)) = -z(P(Misses))
• In this case: z(.841) = -z(.159) = 1
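The two examples and the symmetry note can be checked numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper names `hit_proportion` and `z` are made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def hit_proportion(criterion_z):
    """Share of the distribution lying above a criterion placed
    criterion_z standard deviations from the mean."""
    return 1.0 - std_normal.cdf(criterion_z)

def z(p):
    """Convert a proportion (strictly between 0 and 1) to a z-score."""
    return std_normal.inv_cdf(p)

print(round(hit_proportion(0.0), 3))    # Example 1: criterion at the mean
print(round(hit_proportion(-1.0), 3))   # Example 2: one SD below the mean
print(round(z(0.841), 2), round(-z(0.159), 2))  # symmetry of z-scores
```

The last line prints the same value twice, which is the z(P(Hits)) = -z(P(Misses)) identity in action.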
D-Prime
• D-prime (d’) is a measure of sensitivity.
• = perceptual distance between the means of the
“present” and “absent” distributions.
• This perceptual distance is expressed in terms of z-scores.
[Figure: two overlapping distributions labeled n and s; d’ is the distance between their means]
D-Prime
[Figure: distributions n and s with the hits and false-alarms regions marked; d’ spans the distances labeled -z(P(FA)) and z(P(H))]
• d’ combines the z-score for the percentage of hits…
• with the z-score for the percentage of false alarms.
• d’ = z(P(H)) - z(P(FA))
D-Prime Examples
1.           Present   Absent
   Present      82        18
   Absent       46        54
d’ = z(P(H)) - z(P(FA)) = z(.82) - z(.46) = .915 - (-.1) = 1.015
2.           Present   Absent
   Present      55        45
   Absent       19        81
d’ = z(P(H)) - z(P(FA)) = z(.55) - z(.19) = .125 - (-.878) = 1.003
• Note: there is no absolute meaning to the value of d-prime
• Also: NORMSINV() is the Excel function that converts
proportions to z-scores. (qnorm() works in R.)
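For readers working outside Excel or R, the same calculation can be sketched in Python, where `statistics.NormalDist().inv_cdf` plays the role of NORMSINV() and qnorm(). The `d_prime` function name and its argument order are illustrative choices.

```python
from statistics import NormalDist

def z(p):
    """Convert a proportion (strictly between 0 and 1) to a z-score."""
    return NormalDist().inv_cdf(p)

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(P(H)) - z(P(FA)), computed from raw response counts."""
    p_hit = hits / (hits + misses)
    p_fa = false_alarms / (false_alarms + correct_rejections)
    return z(p_hit) - z(p_fa)

session_1 = d_prime(82, 18, 46, 54)
session_2 = d_prime(55, 45, 19, 81)
print(f"{session_1:.3f} {session_2:.3f}")
```

The results agree with the worked values above to about two decimal places; the small differences come from the slides rounding each z-score before subtracting.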
Near Zero Correction
• Note: the z-score is undefined at 100% and 0%.
• Fix: replace perfect scores with a minimal deviation from
the limit (.5% or 99.5%)
             Present   Absent
Present        100         0
Absent          72        28
d’ = z(P(H)) - z(P(FA)) = z(.995) - z(.72) = 2.57 - .58 = 1.99
Near Zero Correction
• Also note that we do not normally deal with sets of
responses that total to 100 in our experimental data!
• Here’s another example of the “fix” in which perfect scores
are replaced with scores that are just half a response unit
above or below the minimum and maximum scores,
respectively.
             Present   Absent
Present         20         0
Absent           6        14
• Replace 20 with 19.5, so P(H) = 19.5/20 = .975
d’ = z(P(H)) - z(P(FA)) = z(.975) - z(.3) = 1.96 - (-.52) = 2.48
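The half-a-response-unit fix can be sketched as follows. The helper names `corrected_proportion` and `d_prime_corrected` are hypothetical, and the sketch only adjusts counts that sit exactly at 0 or at the total, as described above.

```python
from statistics import NormalDist

def corrected_proportion(count, total):
    """Proportion of responses, with perfect scores pulled in by half
    a response unit so that the z-score stays defined."""
    if count == 0:
        count = 0.5
    elif count == total:
        count = total - 0.5
    return count / total

def d_prime_corrected(hits, misses, false_alarms, correct_rejections):
    """d' = z(P(H)) - z(P(FA)) with the near-zero correction applied."""
    z = NormalDist().inv_cdf
    p_hit = corrected_proportion(hits, hits + misses)
    p_fa = corrected_proportion(false_alarms,
                                false_alarms + correct_rejections)
    return z(p_hit) - z(p_fa)

small_sample = d_prime_corrected(20, 0, 6, 14)
large_sample = d_prime_corrected(100, 0, 72, 28)
print(f"{small_sample:.2f} {large_sample:.2f}")
```

With 100 responses per stimulus type, half a response unit is the same as the .5% / 99.5% replacement, so one rule covers both examples.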
Calculating Bias
• An unbiased criterion would fall halfway between the
means of both distributions.
• No bias (λu): P(Hits) = P(Correct Rejections)
• Bias (λb): P(Hits) ≠ P(Correct Rejections)
[Figure: unbiased criterion λu and biased criterion λb marked on the two distributions]
Calculating Bias
• Bias = distance (in z-scores) between the ideal
criterion and the actual criterion

[Figure: distance between the unbiased criterion λu and the biased criterion λb]
• Bias = -1/2 * (z(P(H)) + z(P(FA)))
For Instance
Let’s say: d’ = 2
[Figure: criterion midway between the two means, with z(P(FA)) = -1 and z(P(H)) = 1]
• An unbiased criterion would be one standard
deviation from both means…
• z(P(H)) = 1 → P(H) = 84.1%
• z(P(FA)) = -1 → P(FA) = 15.9%
• Bias = -1/2 * (z(P(H)) + z(P(FA)))
= -1/2 * (1 + (-1)) = -1/2 * (0) = 0
Wink Wink, Nudge Nudge
Now let’s move the criterion over 1/2 a standard deviation…
[Figure: criterion shifted so that z(P(FA)) = -.5 and z(P(H)) = 1.5]
• z(P(H)) = 1.5 → P(H) = 93.3% (cf. 84.1%)
• z(P(FA)) = -.5 → P(FA) = 30.9% (cf. 15.9%)
• Bias = -1/2 * (z(P(H)) + z(P(FA)))
= -1/2 * (1.5 + (-.5)) = -1/2 * (1) = -.5
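The two scenarios can also be run in the other direction: start from d’ and the bias value and recover the response proportions. The sketch below simply solves the d’ and bias formulas for z(P(H)) and z(P(FA)); the `predicted_rates` function is a hypothetical helper, not something from the text.

```python
from statistics import NormalDist

def predicted_rates(d_prime, bias):
    """Hit and false-alarm proportions implied by d' and bias.

    Solving d' = z(P(H)) - z(P(FA)) and
    bias = -1/2 * (z(P(H)) + z(P(FA))) gives:
        z(P(H))  =  d'/2 - bias
        z(P(FA)) = -d'/2 - bias
    """
    phi = NormalDist().cdf  # converts a z-score back to a proportion
    z_hit = d_prime / 2 - bias
    z_fa = -d_prime / 2 - bias
    return phi(z_hit), phi(z_fa)

unbiased = predicted_rates(d_prime=2.0, bias=0.0)
shifted = predicted_rates(d_prime=2.0, bias=-0.5)
print([round(p, 3) for p in unbiased])
print([round(p, 3) for p in shifted])
```

Sensitivity is identical in both calls; only the criterion has moved, yet hits and false alarms both go up.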
Calculating Bias: Examples
1.           Present   Absent
   Present      82        18
   Absent       46        54
Bias = -1/2 * (z(P(H)) + z(P(FA))) = -1/2 * (z(.82) + z(.46)) = -1/2 * (.915 + (-.1)) = -.407
2.           Present   Absent
   Present      55        45
   Absent       19        81
Bias = -1/2 * (z(P(H)) + z(P(FA))) = -1/2 * (z(.55) + z(.19)) = -1/2 * (.125 + (-.878)) = .376
• The higher the criterion is set, the more positive this
number will be.
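A matching sketch for the bias calculation, again with `NormalDist().inv_cdf` standing in for NORMSINV() or qnorm(). The `bias` function name and its argument order are illustrative.

```python
from statistics import NormalDist

def bias(hits, misses, false_alarms, correct_rejections):
    """Bias = -1/2 * (z(P(H)) + z(P(FA))), computed from raw counts."""
    z = NormalDist().inv_cdf
    p_hit = hits / (hits + misses)
    p_fa = false_alarms / (false_alarms + correct_rejections)
    return -0.5 * (z(p_hit) + z(p_fa))

session_1_bias = bias(82, 18, 46, 54)   # listeners rewarded for hits
session_2_bias = bias(55, 45, 19, 81)   # rewarded for correct rejections
print(f"{session_1_bias:.3f} {session_2_bias:.3f}")
```

The sign flips between the two sessions: negative when the criterion is set low (many “present” responses), positive when it is set high.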
Peach Colo(u)rs
• Listeners could replay stimuli as many times as they
liked.
• Order of pictures was counterbalanced across
presentations.
• Target identification significantly better than chance
(p < .001)
• Difference in accuracy between IDS and ADS utterances
was nearly significant (p = .056).
• In terms of sensitivity (d’):
• Sensitivity significantly greater in IDS utterances!
(p = .003)
• → The properties of infant-directed speech provide cues
to syntactic disambiguation.
• In terms of bias:
• IDS utterances induced a significantly greater bias
towards NV responses (p = .032)
• Why? Perhaps duration differences between utterance
types provide a clue…