The Role of Local Specificity in the
Interpretation of Small Area
Estimation
Benmei Liu
Scott Gilkeson
Gordon Willis
Rocky Feuer
2012 FCSM Statistical Policy Seminar
December 4, 2012
Outline
I. Overview of small area estimation
II. The importance of local specificity and how it
could affect data use
III. An example from a recent project to estimate
cancer risk factors and screening behavior
IV. Discussion
I. Overview of Small Area Estimation (SAE)
• The demand for survey estimates for small areas (small geographic areas or domains) has increased across many areas of application (e.g., income and poverty, education, health, substance use) over the past several decades
• Standard direct estimation methods for survey data cannot provide reliable estimates for these areas because their sample sizes are small
• Model-based methods that combine information from multiple related sources have been developed to increase precision
Basic SAE Model and Estimates
• The Fay-Herriot model (1979) is widely regarded as the fundamental approach
• The final estimate for area $i$ derived from the Fay-Herriot class of models:
$$\theta_i = (1 - p_i) D_i + p_i M_i, \quad i = 1, \ldots, m$$
where:
$D_i$ is the direct estimate;
$M_i$ is a regression-based synthetic estimate;
$p_i$ is the proportion of the final estimate due to the regression-based synthetic estimate, and so a measure of the borrowed strength; $0 \le p_i \le 1$.
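As a rough illustration of the composite form above, the following Python sketch computes the final estimate for one area from a direct and a synthetic estimate. The function name, argument names, and numeric inputs are hypothetical. The weight is taken as sampling variance over total variance, which is the usual shrinkage weight in the basic Fay-Herriot model with known variances; the slides do not spell out how the weight is obtained, so treat that choice as an assumption.

```python
def composite_estimate(direct, synthetic, sampling_var, model_var):
    """Combine a direct and a synthetic estimate for one small area.

    Returns (theta, p), where p is the weight on the synthetic estimate.
    Assumption: p = sampling_var / (sampling_var + model_var), the usual
    shrinkage weight in the basic Fay-Herriot model with known variances.
    """
    if sampling_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = sampling_var + model_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    p = sampling_var / total  # share borrowed from the regression model
    theta = (1 - p) * direct + p * synthetic
    return theta, p


# Hypothetical inputs: a noisy direct estimate is pulled toward the model.
theta, p = composite_estimate(direct=0.60, synthetic=0.70,
                              sampling_var=0.03, model_var=0.01)
print(f"p = {p:.2f}, theta = {theta:.3f}")
```

A larger sampling variance (typically a smaller local sample) pushes the weight toward 1, so the estimate leans more on borrowed information and less on local data.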
II. The Importance of Local Specificity
• We use the label local specificity for information about how much an SAE technique relies on local versus borrowed data
• We propose local specificity as a generalizable and intuitively understandable term for the degree to which local data contribute to the small area estimate for a specified area
The Importance of Local Specificity (Cont’d)
• Local specificity can be an important indicator of fitness for use
• We argue that local specificity provides unique information that is not otherwise available
• For local data users, a measure of local specificity could be useful
• No government website that releases small area estimates (e.g., SAIPE, NAAL, NSDUH) provided a measure of local specificity
III. Communicating Local Specificity to End Users: An Example
• Combining information from two health surveys to enhance small-area estimation (Raghunathan et al. 2007; Davis et al. 2010)
• Project led by the National Cancer Institute, in collaboration with:
  - National Center for Health Statistics
  - National Center for Chronic Disease Prevention and Health Promotion
  - University of Michigan
  - University of Pennsylvania
  - Information Management Services
Motivation for the Project
• Cancer screening and risk factor data are of great interest to cancer control planners at the state and sub-state level, but accurate local statistics have been difficult to obtain
• Different surveys have different strengths
• Combining information from surveys and borrowing strength from other sources (e.g., Census or administrative records) using a small area modeling approach could improve small-area estimates
Surveys Used
• Behavioral Risk Factor Surveillance System (BRFSS) – the largest U.S. survey tracking health conditions and risk behaviors at the state and sub-state level since 1984
  Limitations: potential nonresponse bias; undercoverage of households without landline phones
• National Health Interview Survey (NHIS) – the principal source of information on the health of the civilian noninstitutionalized population of the U.S. since 1957
  Limitations: smaller sample size; only includes data on about ¼ of U.S. counties
Project Description
• Bayesian methods were developed to combine information from the two surveys; telephone coverage rates from the Census were also incorporated
• The National Cancer Institute released estimates for two time periods, 1997-99 and 2000-03 (http://sae.cancer.gov/)
  - Smoking, mammography, and Pap smear
  - Counties, health service areas, and states
• Current work involves adding a component for cellphone-only households and covering more recent periods
Focus Group Suggestions
• Conducted two focus groups with cancer control planners and public health professionals at the Comprehensive Cancer Control Leadership Institute in June 2010
• Recommendations:
  - Include these estimates within NCI’s State Cancer Profiles website (http://statecancerprofiles.cancer.gov/), a comprehensive system of interactive maps and graphs enabling the investigation of cancer trends at the national, state, and county level
  - Need a way to describe the differences between the bias-adjusted model-based estimates and existing direct estimates
  - Data users would appreciate an indicator like local specificity to validate the estimate against local evidence
Issues in Communicating Local Specificity
1) How should it be measured?
2) What should it be labeled?
3) What thresholds should be set in assigning values to it?
1) Measuring Local Specificity
• The bias-adjusted SAE model is complex and lacks an explicit shrinkage factor
• The concept of borrowed strength still applies, depending primarily on the combined BRFSS and NHIS sample size within the area
• The NHIS sample size is confidential, but the combined sample size is close to the BRFSS sample size
• The BRFSS sample size is published, and on its own was the best practical measure of the amount of local data
2) Labeling Local Specificity
• Presenting the BRFSS sample size as a number alongside the estimates did not convey the message of local specificity
• Developed the term local specificity and selected qualitative descriptors (i.e., high, medium, and low) rather than quantitative ones
3) Assigning Thresholds
• Selected a BRFSS sample size of 50 as the threshold for low local specificity
• Determining break points for the categories of local specificity deserves further study
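The labeling scheme described above could be sketched roughly as follows. Only the cutoff of 50 comes from the slides; treating sample sizes strictly below 50 as "low" is an assumption, and the medium/high break point is not given, so the hypothetical `high_cutoff` parameter must be supplied by the caller. The function name and the example sample sizes are also made up for illustration.

```python
def local_specificity(brfss_n, high_cutoff, low_cutoff=50):
    """Map a BRFSS sample size to a qualitative local specificity label.

    low_cutoff=50 follows the slide; treating sizes strictly below it as
    "low" is an assumption. high_cutoff is hypothetical: the slides give
    no medium/high break point, so the caller must choose one.
    """
    if brfss_n < 0:
        raise ValueError("sample size must be non-negative")
    if high_cutoff <= low_cutoff:
        raise ValueError("high_cutoff must exceed low_cutoff")
    if brfss_n < low_cutoff:
        return "low"
    if brfss_n < high_cutoff:
        return "medium"
    return "high"


# Hypothetical county sample sizes and a made-up medium/high cutoff of 200.
for n in (12, 50, 199, 640):
    print(n, local_specificity(n, high_cutoff=200))
```

Keeping the cutoffs as parameters reflects the point that the break points are still an open question and may need to differ across models or outcomes.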
Figure: Ratios of the model-based county-level current mammography screening rate to the bias-corrected BRFSS direct estimate
Figure: Small area estimates of mammography screening by county in Pennsylvania, with a mini-map showing local specificity
Warren County, 2000-2003: percentage = 65.9 (56.6-75.2)
Westmoreland County, 2000-2003: percentage = 64.8 (57.5-72.2)
IV. Discussion
• Our experience has convinced us that a measure of local specificity is critical for end users in their use and interpretation of results
• The potential importance of local specificity should not be under-emphasized, given that users demand more from SAEs than from the results of most other statistical models
• No single computational formula for calculating levels of local specificity applies generally across models; further research is needed
• Whenever estimates are based on non-ignorable levels of borrowed strength, it is vitally important to disseminate analyses so that local specificity, as an important index of fitness for use, is conveyed to data users in a clear and unbiased manner
Thank you!
Contact information:
Benmei Liu, Ph.D.
Survey Statistician
National Cancer Institute
[email protected]