The estimation strategy of the National Household Survey


The estimation strategy
of the National Household
Survey (NHS)
François Verret,
Mike Bankier, Wesley Benjamin & Lisa Hayden
Statistics Canada
Presentation at the ITSEW 2011
June 21, 2011
Outline of the presentation
1. Introduction
2. Handling non-response error
3. Simulation set-up
4. Results
5. Limits of the study
6. Conclusion
7. Future work
1. Introduction
• 2006 Census: 20% long form, 80% short form
• 2011:
  • 100% Census mandatory short form
  • 30% sampled to voluntarily complete the NHS long form
• Objectives of the long form: get data to plan, deliver and support government programs directed at target populations
• 2011 topics common to both forms: demography, family structure, language
• Additional 2011 long form topics: education, ethnicity, income, immigration, mobility…
• NHS sample size is 4.5 million dwellings (f = 30%)
1. Introduction
• Non-response error in the NHS:
  • Survey now voluntary => expect significant non-response
  • To minimize the impact, after a fixed date restrict the collection efforts to a Non-Response Follow-Up (NRFU) random sub-sample
[Diagram: population U; 1st phase sample s split into respondents sr and non-respondents snr; the NRFU sub-sample of snr splits into NRFUr and NRFUnr]
• Set-up developed by Hansen & Hurwitz (1946), sketched in the code below:
  1. Select 1st phase sample s from population U
  2. Non-response snr observed in s
  3. NRFU selected from snr
  4. Response NRFUr and non-response NRFUnr observed in the NRFU (HH assumed 100% resp. rate)
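A minimal sketch of this two-phase set-up; the population size, response rates and variable names below are illustrative assumptions, not NHS figures (only the 30% sampling fraction and 41% NRFU sub-sampling fraction come from the slides).

```python
import numpy as np

rng = np.random.default_rng(2011)

# Hypothetical two-phase set-up following the slide's notation
N = 10_000                                            # population U
U = np.arange(N)
s = rng.choice(U, size=int(0.30 * N), replace=False)  # 1st phase sample (f = 30%)

responds = rng.random(s.size) < 0.60                  # assumed 1st phase response
s_r, s_nr = s[responds], s[~responds]                 # respondents / non-respondents

# NRFU: random sub-sample of the 1st phase non-respondents
nrfu = rng.choice(s_nr, size=int(0.41 * s_nr.size), replace=False)

responds_nrfu = rng.random(nrfu.size) < 0.80          # assumed NRFU response
nrfu_r, nrfu_nr = nrfu[responds_nrfu], nrfu[~responds_nrfu]
```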
1. Introduction
[Diagram repeated: U, s, sr, snr, NRFU, NRFUr, NRFUnr]
• When 100% of the NRFU responds (as in Hansen and Hurwitz's original setting), the NRFU can be used to estimate the total in snr without non-response bias
• This is not the case in the NHS
• However, focusing the collection efforts on the NRFU converts part of the non-response bias (that would be observed in the full snr) into sub-sampling error
2. Handling non-response error
• The estimation method chosen to minimize the remaining non-response bias should have the following properties:
  • As few bias assumptions as possible should be made
  • The method should be simple to explain and to implement in production
• Available micro-level auxiliary data to adjust for non-response:
  • 2011 Census short form
  • Tax data
• Calibration: agreement with Census totals is desirable from a user’s perspective
2. Handling non-response error
• First class of contenders: Reweighting
  • Usual method used to compensate for total non-response in social surveys
  • The Hansen & Hurwitz estimator of a total,

      \hat{t}_{HH} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}} a_k \, a_{k|s_{nr}} \, y_k ,

    where a_k is the 1st phase design weight and a_{k|s_{nr}} the NRFU sub-sampling weight, is unbiased if 100% of the NRFU answers
• When the assumption does not hold, we must model the last non-response mechanism/phase and reweight accordingly…
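As an illustration, the two-phase total above can be computed directly from the weights; this is only a sketch under the weight notation just described, and the array names are hypothetical.

```python
import numpy as np

def hansen_hurwitz_total(y_sr, a_sr, y_nrfu, a_nrfu, a_nrfu_sub):
    """Two-phase (Hansen & Hurwitz) estimate of a total.

    y_sr, a_sr:     values and 1st phase weights of the 1st phase respondents
    y_nrfu, a_nrfu: values and 1st phase weights of the NRFU units
    a_nrfu_sub:     NRFU sub-sampling weights a_{k|snr}
    Unbiased only when every NRFU unit responds.
    """
    return np.sum(a_sr * y_sr) + np.sum(a_nrfu * a_nrfu_sub * y_nrfu)
```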
2. Handling non-response error
• Scores method:
  • Model the probability of response with a logistic regression
  • Form Response Homogeneity Groups (RHG) of respondents and non-respondents with similar predicted response probabilities
  • Calculate the response rate in each RHG and assign these new predicted response probabilities to respondents
  • Divide the NRFUr weights by this probability:

      \hat{t}_{scores} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}_r} \frac{a_k \, a_{k|s_{nr}}}{\hat{p}_k^{RHG}} \, y_k
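A minimal sketch of the scores adjustment on an NRFU file. The column names ('responded', 'a_k', 'a_k_sub') are hypothetical, and a plain logistic fit stands in for the stepwise selection used in the study.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def scores_weights(nrfu: pd.DataFrame, x_cols, n_groups=13):
    """Sketch of the scores adjustment on the NRFU file.

    Assumed (hypothetical) columns: 'responded' (0/1 NRFU response indicator),
    'a_k' (1st phase weight), 'a_k_sub' (NRFU sub-sampling weight a_{k|snr}),
    plus the predictor columns listed in x_cols.
    """
    model = LogisticRegression(max_iter=1000).fit(nrfu[x_cols], nrfu["responded"])
    p_hat = model.predict_proba(nrfu[x_cols])[:, 1]

    # Response Homogeneity Groups: units with similar predicted probabilities
    rhg = pd.qcut(p_hat, q=n_groups, labels=False, duplicates="drop")

    # Observed response rate within each RHG becomes the adjustment factor
    p_rhg = nrfu.groupby(rhg)["responded"].transform("mean")

    resp = nrfu["responded"] == 1
    # Divide the two-phase weight of the NRFU respondents by the RHG response rate
    return nrfu.loc[resp, "a_k"] * nrfu.loc[resp, "a_k_sub"] / p_rhg[resp]
```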
2. Handling non-response error
• Second class of contenders: Imputation
  • Usual method to compensate for item non-response
  • We will consider nearest-neighbour imputation using the CANadian Census Edit & Imputation System (CANCEIS) only
  1. Partial imputation: impute only the non-respondents to the sub-sample (NRFUnr) and use reweighting to take sampling into account

      \hat{t}_{partial} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}_r} a_k \, a_{k|s_{nr}} \, y_k + \sum_{k \in \mathrm{NRFU}_{nr}} a_k \, a_{k|s_{nr}} \, \hat{y}_k

  2. Mass imputation: impute all non-respondents (snr \ NRFUr)

      \hat{t}_{mass} = \sum_{k \in s_r \cup \mathrm{NRFU}_r} a_k y_k + \sum_{k \in s_{nr} \cap \mathrm{NRFU}_r^{c}} a_k \, \hat{y}_k
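The two imputation-based totals can likewise be written as weighted sums; a sketch with hypothetical array names (observed values y, imputed values yhat, and the weights defined above).

```python
import numpy as np

def partial_imputation_total(y_sr, a_sr, y_nrfur, a_nrfur, a_nrfur_sub,
                             yhat_nrfunr, a_nrfunr, a_nrfunr_sub):
    """Partial imputation: NRFU respondents keep their observed values, NRFU
    non-respondents get imputed values; both keep the two-phase weight."""
    return (np.sum(a_sr * y_sr)
            + np.sum(a_nrfur * a_nrfur_sub * y_nrfur)
            + np.sum(a_nrfunr * a_nrfunr_sub * yhat_nrfunr))

def mass_imputation_total(y_resp, a_resp, yhat_nonresp, a_nonresp):
    """Mass imputation: every unit of the 1st phase sample keeps its design
    weight a_k; units outside sr and NRFUr carry an imputed value."""
    return np.sum(a_resp * y_resp) + np.sum(a_nonresp * yhat_nonresp)
```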
2. Handling non-response error
• Some pros & cons

| Criterion                                             | Scores | Partial imputation | Mass imputation |
|-------------------------------------------------------|--------|--------------------|-----------------|
| Preserves micro-level information of non-respondents  |        | √                  | √√              |
| Does not create synthetic information                 | √√     | √                  |                 |
| Uses less heavy non-response hypotheses               | √√     | √√                 |                 |
| Fully takes sub-sampling design into account          | √√     | √√                 |                 |
| Census systems available                              |        | √√                 | √√              |
| More calibration to known Census totals can be done   |        | √                  | √√              |
3. Simulation set-up
• Use 2006 Census 20% long form sample data
• Restricted to the Census Metropolitan Area (CMA) of Toronto
• Simulation aimed at preserving the properties of the NHS (except for the f = 30%); see the sketch after this list:
  • Non-response to the 1st phase was simulated by deterministically blanking out the data of the 63% of respondents who answered last in 2006
  • Of these non-respondents, the 78% who answered first will have their response restored if they are selected in the NRFU sub-sample
  • NRFU sub-sampling was simulated by selecting a stratified random sample of 41% of snr
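A rough sketch of this set-up on a hypothetical data frame of 2006 long-form households; the column names ('response_order', 'stratum') and the use of pandas are assumptions for illustration only.

```python
import pandas as pd

def simulate_nhs_nonresponse(df: pd.DataFrame, seed: int = 2011) -> pd.DataFrame:
    """Sketch of the deterministic non-response simulation.

    Assumed (hypothetical) columns: 'response_order' (how late the household
    answered in 2006, larger = later) and 'stratum' (NRFU sub-sampling stratum).
    """
    df = df.copy()

    # 1st phase: blank out the 63% of households that answered last in 2006
    late = df["response_order"].rank(pct=True) > 0.37
    df["first_phase_resp"] = ~late

    # Of these non-respondents, the 78% who answered first would respond
    # if they are selected in the NRFU sub-sample
    df["would_resp_nrfu"] = False
    df.loc[late, "would_resp_nrfu"] = (
        df.loc[late, "response_order"].rank(pct=True) <= 0.78
    )

    # NRFU sub-sampling: stratified random sample of 41% of the non-respondents
    nrfu = (df[late].groupby("stratum", group_keys=False)
                    .sample(frac=0.41, random_state=seed))
    df["in_nrfu"] = df.index.isin(nrfu.index)
    df["nrfu_resp"] = df["in_nrfu"] & df["would_resp_nrfu"]
    return df
```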
3. Simulation set-up
• Estimators calculated
  • As points of reference, unbiased estimators:

      \hat{t}_{2006} = \sum_{k \in s} a_k y_k

      \hat{t}_{HH} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}} a_k \, a_{k|s_{nr}} \, y_k

  • As contenders:

      \hat{t}_{scores} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}_r} \frac{a_k \, a_{k|s_{nr}}}{\hat{p}_k^{RHG}} \, y_k

      \hat{t}_{partial} = \sum_{k \in s_r} a_k y_k + \sum_{k \in \mathrm{NRFU}_r} a_k \, a_{k|s_{nr}} \, y_k + \sum_{k \in \mathrm{NRFU}_{nr}} a_k \, a_{k|s_{nr}} \, \hat{y}_k

      \hat{t}_{mass} = \sum_{k \in s_r \cup \mathrm{NRFU}_r} a_k y_k + \sum_{k \in s_{nr} \cap \mathrm{NRFU}_r^{c}} a_k \, \hat{y}_k
3. Simulation set-up
• The scores method
  • A single logistic regression was done for the whole CMA of Toronto
  • Household response probability was predicted
  • Considered for stepwise selection: household-level variables, our best attempt at summarizing the person-level information and one paradata variable
  • R-square of 26%
  • 13 RHGs formed, with predicted probabilities ranging from 29% to 95%
3. Simulation set-up
• Imputation methods
  • Nearest-neighbour imputation done with CANCEIS
  • RHG is defined by household size
  • The distance between non-respondents and donors (respondents) is defined by weighting each household-level, person-level and paradata characteristic in the distance function
  • Preference is given to donors who are geographically close
  • For each non-respondent, a list of donors is made and one is randomly selected with probability proportional to a measure of size (1st phase weight for mass imputation, scores method weights for partial imputation); a schematic version follows
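A schematic version of this donor selection, not CANCEIS itself: a weighted distance ranks the donors, a short candidate list is kept, and one donor is drawn with probability proportional to a measure of size. All names and the numeric coding of the matching variables are illustrative assumptions.

```python
import numpy as np

def pick_donor(recipient, donors, var_weights, size_measure, rng, n_candidates=5):
    """Schematic nearest-neighbour donor selection (not CANCEIS syntax).

    recipient:    1-d array of (numerically coded) matching variables
    donors:       2-d array, one row per potential donor (a respondent)
    var_weights:  weight of each matching variable in the distance function
    size_measure: measure of size per donor (e.g. 1st phase or scores weight)
    """
    dist = np.sum(var_weights * np.abs(donors - recipient), axis=1)
    candidates = np.argsort(dist)[:n_candidates]          # closest donors
    p = size_measure[candidates] / size_measure[candidates].sum()
    return candidates[rng.choice(len(candidates), p=p)]   # chosen donor's row index
```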
3. Simulation set-up
• M = 84 non-short-form characteristics over the various topics
• Average relative difference:
  • Calculated at the CMA level:

      \frac{100}{M} \sum_{j=1}^{M} \frac{|\hat{t}_j - \hat{t}_{2006,j}|}{\hat{t}_{2006,j}}
      \qquad \text{and} \qquad
      \frac{100}{M} \sum_{j=1}^{M} \frac{|\hat{t}_j - \hat{t}_{HH,j}|}{\hat{t}_{HH,j}}

  • At the Weighting Area (953 WAs in total) level within the CMA:

      \frac{100}{953\,M} \sum_{i=1}^{953} \sum_{j=1}^{M} \frac{|\hat{t}_{ij} - \hat{t}_{2006,ij}|}{\hat{t}_{2006,ij}}
      \qquad \text{and} \qquad
      \frac{100}{953\,M} \sum_{i=1}^{953} \sum_{j=1}^{M} \frac{|\hat{t}_{ij} - \hat{t}_{HH,ij}|}{\hat{t}_{HH,ij}}
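The same measure as a small helper, following the reconstruction above: absolute relative differences in percent, averaged over the characteristics (and, for the WA version, over the 953 areas).

```python
import numpy as np

def avg_relative_difference(t_hat, t_ref):
    """Average absolute relative difference, in %, of an estimator against a
    reference (full first-phase or Hansen & Hurwitz).  Pass the M CMA-level
    totals, or the flattened 953 x M WA-level totals, as 1-d arrays."""
    t_hat = np.asarray(t_hat, dtype=float)
    t_ref = np.asarray(t_ref, dtype=float)
    return 100.0 * np.mean(np.abs(t_hat - t_ref) / t_ref)
```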
4. Results
• Errors at the CMA and WA levels for Toronto (average relative difference, in %)

| Estimator                  | CMA: vs full first-phase | CMA: vs Hansen & Hurwitz | WA: vs full first-phase | WA: vs Hansen & Hurwitz |
|----------------------------|--------------------------|--------------------------|-------------------------|-------------------------|
| Hansen & Hurwitz estimator | 0.94                     | 0.00                     | 22.98                   | 0.00                    |
| Mass imputation            | 2.97                     | N/A                      | 24.56                   | N/A                     |
| Partial imputation         | 2.25                     | 1.52                     | 26.69                   | 13.22                   |
| Scores method              | 2.03                     | 1.45                     | 26.77                   | 18.67                   |
5. Limits of the study
• Results:
  • The simulation only includes one replication of the sub-sampling and non-response mechanisms
  • Non-response bias is the measure of interest, but errors were presented
  • Non-response mechanisms were generated deterministically. Should they be generated probabilistically?
  • The 2011 sampling, non-response and available data (e.g. paradata) cannot be replicated exactly
  • Only totals were studied. What about other parameters such as correlations?
5. Limits of the study
• Possible confounding effects:
  • Logistic regression was done at the aggregated level of the CMA and no WA effects or interactions were considered
  • Paradata for imputation is more closely related to the non-response mechanism (gives preference to late respondents in the distance)
  • Weighting of donors in imputation has an impact
  • Calibration was done from the sample to U; calibration at inner levels/phases could help scores and partial imputation
6. Conclusion
• With these preliminary results, it seems the scores method does well at aggregate levels, while partial imputation does better than scores at finer levels
  • Mass imputation: can you override the known sub-sample design with an imputation model?
  • Partial imputation: can include more information (person-level, paradata) than scores, but the weighting of each component in the distance is partially data driven and not straightforward
  • Scores method: more difficult to include the information, but variable selection to explain non-response is direct
7. Future Work
• Possible:
  • Replicate sub-sampling and imputation more than once to isolate bias components
  • Consider other levels of calibration in the comparisons
  • Hybrid of scores and partial imputation
• Definite:
  • Implement a method into NHS production
  • Estimate the errors and variances (multi-phase, large sampling fractions, errors due to modeling, …) and educate data users
• Important to get a good model for the last non-response mechanism. Whatever the method, the quality of the results is a function of the auxiliary data available.
For more information,
please contact:
François Verret - SSMD/DMES
[email protected]
(613) 951-7318