Invited Lunchtime Session: Groves

Nonresponse Rates and
Nonresponse Bias In Surveys
Robert M. Groves
University of Michigan and Joint Program in Survey Methodology, USA
Emilia Peytcheva
University of Michigan, USA
Funding from the Methodology, Measurement, and Statistics Program of the US National
Science Foundation, Grant 0297435
Four Mutually-Problematic
Observations
1. With 100% response rates, probability sampling
offers an inferential paradigm with measurable
uncertainties for unbiased estimates
2. Response rates are declining
3. Keeter et al. (2000), Curtin et al. (2000),
Merkle and Edelman (2002) show no
nonresponse bias associated with varying
nonresponse rates
4. Practitioners are urged to achieve high
response rates
Result: Confusion among practitioners
Assembly of Prior Studies of
Nonresponse Bias
• Search of peer-reviewed and other publications
• 47 articles reporting 59 studies
• About 959 separate estimates (566
percentages)
– mean nonresponse rate is 36%
– mean bias is 8% of the full sample estimate
• We treat this as 959 observations, weighted by
sample sizes and multiply-imputed for item-missing
data, with standard errors reflecting clustering into 59
studies and imputation variance
Percentage Absolute Relative Bias
$$100 \times \frac{|\bar{y}_r - \bar{y}_n|}{\bar{y}_n}$$
where $\bar{y}_r$ is the unadjusted respondent mean and
$\bar{y}_n$ is the unadjusted full sample mean
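As a minimal sketch of the measure defined above, the helper below computes the percentage absolute relative bias from a respondent mean and a full sample mean. The function name and the input numbers are hypothetical and are not taken from the 59 studies; taking the absolute value of the whole ratio is an assumption that matches the definition whenever the full sample mean is positive.

```python
def pct_abs_relative_bias(respondent_mean, full_sample_mean):
    """Return 100 * |(y_r - y_n) / y_n| for unadjusted means."""
    if full_sample_mean == 0:
        raise ValueError("full sample mean must be nonzero")
    return 100.0 * abs((respondent_mean - full_sample_mean) / full_sample_mean)


# Hypothetical inputs: respondents average 54, the full sample averages 50.
example_bias = pct_abs_relative_bias(54.0, 50.0)
print(example_bias)
```

With these illustrative inputs the respondent mean sits 4 units above a full sample mean of 50, which is a relative bias of 8%.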
Percentage Absolute Relative
Nonresponse Bias by Nonresponse
Rate for 959 Estimates from 59 Studies
[Figure: scatterplot; vertical axis: Percentage Absolute Relative Bias (0 to 100); horizontal axis: Nonresponse Rate (0 to 80)]
Conclusions from 959 Estimates
• Examples of large nonresponse bias exist
• Variation in nonresponse bias lies mostly
among estimates within the same survey
• The nonresponse rate by itself is not a
good predictor of nonresponse bias
• [Note: We cannot infer from the scatterplot
what would happen within a study if
response rates were increased]
Thinking Causally About Nonresponse
Rates and Nonresponse Error
• Key scientific question concerns mechanisms of
response propensity that create covariance with
survey variable
$$E(\bar{y}_r - \bar{y}_n) \approx \frac{\sigma_{yp}}{\bar{p}}$$
where $\sigma_{yp}$ is the covariance between the survey
variable, y, and the response propensity, p, and $\bar{p}$ is the mean response propensity
• What mechanisms produce the covariance?
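The covariance expression can be checked numerically. The sketch below is illustrative only: the population, the linear propensity function, and all variable names are assumptions, and the expected respondent mean is approximated by the propensity-weighted mean of y. Under that approximation the bias equals the population covariance of y and p divided by the mean propensity.

```python
import random

random.seed(0)
N = 10_000

# Hypothetical population: survey variable y, propensity p rising with y.
y = [random.gauss(50.0, 10.0) for _ in range(N)]
p = [min(0.95, max(0.05, 0.6 + 0.01 * (yi - 50.0))) for yi in y]

y_bar = sum(y) / N
p_bar = sum(p) / N

# Population covariance between y and p.
cov_yp = sum((yi - y_bar) * (pi - p_bar) for yi, pi in zip(y, p)) / N

# Expected respondent mean, approximated by the propensity-weighted mean.
resp_mean = sum(pi * yi for pi, yi in zip(p, y)) / sum(p)

bias_direct = resp_mean - y_bar      # respondent mean minus full mean
bias_formula = cov_yp / p_bar        # covariance over mean propensity

print(bias_direct, bias_formula)
```

The two printed values agree up to floating-point error, and the bias is positive here only because the assumed propensity increases with y.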
Alternative Causal Models for Studies of
Nonresponse Rates and Nonresponse Bias
1. Separate Causes Model
2. Common Cause Model
3. Survey Variable Cause Model
4. Nonresponse-Measurement Error Model
5. Nonresponse Error Attenuation Model
[Figure: one path diagram per model, drawn with the nodes Z, X, P, Y, and Y*]
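To make the first three model names concrete, here is a small simulation sketch. The causal structures are assumptions read off the names, not a reproduction of the diagrams: in the separate causes case Z drives P and X drives Y, in the common cause case Z drives both, and in the survey variable cause case Y itself drives P. All coefficients are hypothetical, and the bias is approximated by the propensity-weighted respondent mean minus the full mean. Models 4 and 5 are not simulated.

```python
import random


def expected_bias(y, p):
    """Propensity-weighted respondent mean minus the full-sample mean."""
    y_bar = sum(y) / len(y)
    return sum(pi * yi for pi, yi in zip(p, y)) / sum(p) - y_bar


def clip(v):
    """Keep an assumed propensity inside [0.05, 0.95]."""
    return min(0.95, max(0.05, v))


random.seed(1)
N = 20_000
z = [random.gauss(0, 1) for _ in range(N)]
x = [random.gauss(0, 1) for _ in range(N)]
e = [random.gauss(0, 1) for _ in range(N)]

# 1. Separate causes (assumed): Z -> P, X -> Y, with Z and X independent.
p_sep = [clip(0.6 + 0.15 * zi) for zi in z]
y_sep = [xi + ei for xi, ei in zip(x, e)]

# 2. Common cause (assumed): Z -> P and Z -> Y.
p_com = p_sep
y_com = [zi + ei for zi, ei in zip(z, e)]

# 3. Survey variable cause (assumed): Y -> P.
y_svc = y_sep
p_svc = [clip(0.6 + 0.15 * yi) for yi in y_svc]

bias_sep = expected_bias(y_sep, p_sep)
bias_com = expected_bias(y_com, p_com)
bias_svc = expected_bias(y_svc, p_svc)
print(bias_sep, bias_com, bias_svc)
```

Under these assumptions the separate causes case yields a bias near zero, while the other two yield clearly positive biases at the same overall response propensity, which is one way to see why the response rate alone need not predict bias.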
Types of Hypotheses about
Influences on Nonresponse Error
• Influences on response rates
– urbanicity, gender, age
– topic of survey, population’s interest in topic
– mode of data collection
– prenotification, incentives
• Sponsorship
– prior involvement of population with sponsor
– government vs. other sponsor
• Types of measures
– attitudinal vs. behavioral
– questions related to topic of survey vs. others
• Type of statistic
– means on counts/continuous variables vs. percentages
– differences of subclass means
Types of Hypotheses about
Influences on Nonresponse Error
Attributes of Surveys
• Influences on response rates
– urbanicity, gender, age
– topic of survey, population’s interest in topic
– mode of data collection
– prenotification, incentives
• Sponsorship
– prior involvement of population with sponsor
– government vs. other sponsor
Attributes of Estimates
• Types of measures
– attitudinal vs. behavioral
– questions related to topic of survey vs. others
• Type of statistic
– mean vs. percentages
– differences of subclass means
Exploratory Analysis in Two Steps
Examine only estimates that are percentages,
using standardized values of the percentages
• Step 1: pooling all 566 estimates, examine
($\bar{y}_r - \bar{y}_m$), difference of respondent
and nonrespondent means
• Step 2: separating the estimates by their
survey’s response rate, examine
($\bar{y}_r - \bar{y}_n$), nonresponse bias
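The two quantities examined in these steps are linked by a simple identity: with r respondents and m nonrespondents, the respondent mean minus the full sample mean equals the nonresponse rate m / (r + m) times the respondent-nonrespondent difference. The sketch below checks this with hypothetical counts and percentages; the function names are illustrative, and no standardization is applied because the slides do not state how the percentages were standardized.

```python
def full_sample_mean(y_r, y_m, r, m):
    """Combine respondent and nonrespondent means into the full-sample mean."""
    return (r * y_r + m * y_m) / (r + m)


def nonresponse_bias(y_r, y_m, r, m):
    """Respondent mean minus full-sample mean."""
    return y_r - full_sample_mean(y_r, y_m, r, m)


# Hypothetical survey: 640 respondents at 55%, 360 nonrespondents at 45%.
r, m = 640, 360
y_r, y_m = 0.55, 0.45

bias = nonresponse_bias(y_r, y_m, r, m)
via_identity = (m / (r + m)) * (y_r - y_m)  # nonresponse rate x difference
print(bias, via_identity)
```

Both expressions give about 0.036 for these inputs, so a large respondent-nonrespondent difference produces little bias when the nonresponse rate is small.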
Respondent-Nonrespondent Functions on
Standardized Percentage Estimates by
Type of Population
[Figure: two bar-chart panels comparing Specific and General populations; vertical axis: Proportion of Standard Deviation]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=.075, ste=.033
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=.013, ste=.0092
23-38%: diff=.0013, ste=.0058
39%+: diff=.045, ste=.020
Respondent-Nonrespondent Functions on
Standardized Percentage Estimates by
Mode
[Figure: two bar-chart panels comparing Self-Administered and Interviewer-Administered surveys; vertical axis: Proportion of Standard Deviation]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=.040, ste=.024
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=.00091, ste=.0011
23-38%: diff=.0012, ste=.0055
39%+: diff=.025, ste=.011
Respondent-Nonrespondent Functions on
Standardized Percentage Estimates by
Involvement with Sponsor
[Figure: two bar-chart panels comparing Involvement with Sponsor and No Involvement with Sponsor; vertical axis: Proportion of Standard Deviation]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=.052, ste=.022
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=.017, ste=.0079
23-38%: diff=.0014, ste=.0063
39%+: diff=.029, ste=.010
Respondent-Nonrespondent Functions on
Standardized Percentage Estimates by
Type of Measure
[Figure: two bar-chart panels comparing Behavioral and Attitudinal measures; vertical axis: Proportion of Standard Deviation]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=.14, ste=.022
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=.084, ste=.0074
23-38%: -------
39%+: diff=.070, ste=.016
Respondent-Nonrespondent Functions on
Standardized Percentage Estimates by
Statistic’s Relevance to Topic
[Figure: two bar-chart panels comparing Not Relevant and Relevant statistics; vertical axis: Proportion of Standard Deviation]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=.006, ste=.023
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=.000, ste=.0074
23-38%: diff=-.0014, ste=.0091
39%+: diff=.014, ste=.025
Respondent-Nonrespondent Functions on
All Estimates by Type of Estimator
[Figure: two bar-chart panels comparing Mean and Percentage estimators; the second panel is labeled Nonresponse bias]
($\bar{y}_r - \bar{y}_m$), all estimates pooled: diff=4.37, ste=0.72
($\bar{y}_r - \bar{y}_n$), by rate category:
<23%: diff=1.07, ste=0.38
23-38%: diff=1.38, ste=0.39
39%+: diff=2.14, ste=0.51
Do Differences of Subclass Means
have Lower Nonresponse Bias?
• When estimating subclass differences, we hope
that nonresponse biases of the two estimates
cancel
• 120 reported estimates of subclass means and
their differences
• Only 45 of them have bias of the differences of
subclass means lower than average bias of the
two subclass means
– this comports with only 45 having two subclass
means with biases of the same sign
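The cancellation argument can be written out directly: the bias of a difference of two subclass means is the difference of their biases. The sketch below is a hypothetical illustration; the function names are invented, and "average bias" is assumed to mean the average of the two absolute biases. Under that reading, biases of opposite sign can never satisfy the criterion, and biases of the same sign satisfy it only when they are close enough in size.

```python
def difference_bias(bias_a, bias_b):
    """Bias of (mean_a - mean_b) given the bias of each subclass mean."""
    return bias_a - bias_b


def difference_beats_average(bias_a, bias_b):
    """True if |bias of difference| is below the average absolute bias."""
    average_abs_bias = (abs(bias_a) + abs(bias_b)) / 2
    return abs(difference_bias(bias_a, bias_b)) < average_abs_bias


# Hypothetical subclass biases, in percentage points.
same_sign_close = difference_beats_average(3.0, 2.0)     # biases mostly cancel
opposite_sign = difference_beats_average(3.0, -2.0)      # biases add up
same_sign_unequal = difference_beats_average(6.0, 1.0)   # too unequal to cancel
print(same_sign_close, opposite_sign, same_sign_unequal)
```

Only the first case meets the criterion, which is consistent with the observation that same-sign biases are what make cancellation possible.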
Absolute Value of Bias of Difference of Subclass
Mean by Absolute Value of Subclass Mean
[Figure: scatterplot; vertical axis: Absolute Value of Bias of Difference of Two Subclass Means (0 to 16); horizontal axis: Absolute Value of Bias of Subclass Mean (0 to 16)]
Five Summary Statements
1. Large nonresponse biases exist
2. Most variation in nonresponse biases lies among
estimates in the same survey
3. Some types of surveys are more susceptible to biases
(e.g., interviewer-administered, studies of populations
without prior involvement with the sponsor, general
population surveys)
4. Some types of estimates seem more susceptible to
biases (e.g., measures of attitudes, percentages,
and maybe estimates related to survey topic)
5. Differences of subclass means do not tend to have
lower biases than the individual subclass means
Note: preliminary and exploratory analysis; there is
much more to do