Transcript BIOS 124
Surveys and Population-Based Studies
Definition of a "Survey"
A method of collecting information about a human
population in which direct (or indirect) contact is made
with the units of the study (e.g., individuals,
organizations, communities, etc.) by using systematic
methods of measurements like questionnaires and
interview schedules. (Warwick and Lininger, 1975)
Examples of well-known surveys:
– U.S. Decennial Census
– Current Population Survey (n=60,000 HHs/mo.)
– Health Interview Survey (n=50,000 HHs/yr.)
– Other Examples in Groves, et al. (2004)
Nonprobability Sampling
Selection by nonrandom methods
Membership in the sample is ultimately left
to human judgment
No basis for assuming stochastic behavior
of sample estimates
One method: quota sampling
Quota Sampling
Quota control/allocation for each interviewer:
Category   Age   Gender   Interviewer Assignment
1          <40   M        15
2          <40   F        15
3          >40   M        10
4          >40   F        10
TOTAL                     50
Filling category is left to interviewer's discretion (i.e., judgment)
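As a minimal sketch of how the quota allocation above gets filled, the Python below accepts people in whatever order the interviewer happens to meet them until every category is full. The `encounters` list and the `fill_quotas` helper are hypothetical and only illustrate the idea; the point is that nothing in the process assigns a selection probability to anyone.

```python
from collections import Counter

# Quota allocation from the table above: (age group, gender) -> required count.
QUOTAS = {("<40", "M"): 15, ("<40", "F"): 15, (">40", "M"): 10, (">40", "F"): 10}

def fill_quotas(encounters, quotas):
    """Accept people in encounter order until each quota category is full."""
    filled = Counter()
    sample = []
    for person in encounters:
        cell = (person["age_group"], person["gender"])
        # A person is kept only if their category still has room.
        if cell in quotas and filled[cell] < quotas[cell]:
            filled[cell] += 1
            sample.append(person)
        if all(filled[c] == q for c, q in quotas.items()):
            break
    return sample, filled

# Hypothetical encounter stream: its order stands in for interviewer judgment.
cells = list(QUOTAS)
encounters = [
    {"id": i, "age_group": age, "gender": gender}
    for i, (age, gender) in enumerate(cells * 20)
]
sample, filled = fill_quotas(encounters, QUOTAS)
print(len(sample), dict(filled))
```

Changing the order of `encounters` changes who ends up in the sample, which is why sample estimates from this design have no known stochastic behavior.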
Probability Sampling
Ultimate selection left to some randomized
(i.e., chance) mechanism
Two types:
– Random sampling
– Survey sampling
Random Sampling
"Population" is infinite and abstract;
distribution of measurements follows some
assumed form (e.g., a normal distribution)
Sample is the result of independently
selecting a measurement at random from the
assumed distribution, with sample size as
the number of selections
Random Sampling
“Random sample” as defined by Hogg & Craig:
"Let X1, X2, . . ., Xn denote n mutually statistically independent
random variables, each of which has the same but possibly
unknown probability density function, f(x). The random variables
X1, X2, . . ., Xn are then said to constitute a random sample from
a distribution that has pdf, f(x)."
Example: f(x) for the normal distribution:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right], \quad -\infty < x < \infty$$
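A rough illustration of random sampling in this sense: the sketch below evaluates the normal density, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2), and draws n independent values from that distribution. The function names, the default mu = 0 and sigma = 1, and the seed are assumptions made for the example.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(
        2 * math.pi * sigma ** 2
    )

def random_sample(n, mu=0.0, sigma=1.0, seed=None):
    """Draw n independent values, each from the same normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

draws = random_sample(5, seed=124)
print(draws)
print(normal_pdf(0.0))
```

Each draw is made without reference to the others, so the sample size is simply the number of selections.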
Population-Based Sampling
Population is finite (i.e., made up of a
countable set of members)
Distribution of measurements usually does
not follow a neat mathematical form
– Ex: Number of health care visits in the past 12
months
Randomization is used, but selections may not
be made independently
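To make the last point concrete, here is a small sketch of selecting from a finite population without replacement. The member IDs and visit counts are made-up values, and `select_without_replacement` is a hypothetical helper. Chance decides who is selected, but the draws are not independent: once a member is chosen, that member cannot be chosen again.

```python
import random

# Hypothetical finite population (made-up values): number of health care
# visits in the past 12 months for each of N = 8 members.
VISITS = {"m1": 0, "m2": 3, "m3": 1, "m4": 12, "m5": 0, "m6": 2, "m7": 7, "m8": 1}

def select_without_replacement(member_ids, n, seed=None):
    """Draw n distinct members by chance; later draws depend on earlier ones."""
    rng = random.Random(seed)
    return rng.sample(list(member_ids), n)

sample_ids = select_without_replacement(sorted(VISITS), n=3, seed=124)
sample_visits = [VISITS[m] for m in sample_ids]
print(sample_ids, sample_visits)
```

The visit counts are skewed and discrete, which is typical of finite-population measurements that do not follow a neat mathematical form.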
Probability Sampling
Each population element has a known and nonzero
probability of being selected into the sample
EPSEM sample design:
– Sample in which the selection probability for each
element is equal
– Stands for Equal Probability Selection Method
– Also called "self-weighting"
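A short sketch of what "known and nonzero" and "equal" mean in practice, assuming simple random sampling within strata, where an element in a stratum of size N_h with n_h selections has selection probability n_h / N_h. The stratum sizes and the helper names are hypothetical.

```python
from fractions import Fraction

def selection_probability(population_size, sample_size):
    """Selection probability of one element under simple random sampling."""
    return Fraction(sample_size, population_size)

def is_epsem(selection_probs):
    """True if every probability is nonzero and all are equal."""
    probs = list(selection_probs)
    return all(p > 0 for p in probs) and len(set(probs)) == 1

# Hypothetical strata of 600 and 400 elements.
proportional = [selection_probability(600, 60), selection_probability(400, 40)]
equal_sizes = [selection_probability(600, 50), selection_probability(400, 50)]

# Sampling weights are the inverse of the selection probabilities.
weights_proportional = [1 / p for p in proportional]
weights_equal_sizes = [1 / p for p in equal_sizes]

print(is_epsem(proportional), weights_proportional)
print(is_epsem(equal_sizes), weights_equal_sizes)
```

In the first allocation every element has the same weight, which is where the term "self-weighting" comes from; in the second, both probabilities are still known and nonzero, so it remains a probability sample, just not an EPSEM one.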
Advantages of Probability Sampling
Statistical theory (including sampling
theory) assumes this method
Not subject to biases of human judgment
Can directly measure the precision (i.e.,
statistical quality) of estimates produced
from sample
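As one hedged example of measuring precision directly from the sample, the sketch below assumes a simple random sample drawn without replacement and estimates the standard error of the sample mean as sqrt((1 - n/N) * s^2 / n), where s^2 is the sample variance. The data values and population size are made up, and `srs_mean_and_se` is a hypothetical helper; other designs need other variance formulas.

```python
import math
import statistics

def srs_mean_and_se(sample_values, population_size):
    """Sample mean and its estimated standard error under SRS without replacement."""
    n = len(sample_values)
    mean = statistics.mean(sample_values)
    s2 = statistics.variance(sample_values)  # sample variance, divisor n - 1
    fpc = 1 - n / population_size            # finite population correction
    se = math.sqrt(fpc * s2 / n)
    return mean, se

# Made-up sample of n = 4 from a population of N = 40.
mean, se = srs_mean_and_se([2, 4, 4, 6], population_size=40)
print(mean, se)
```

No comparable calculation is available for a nonprobability sample, because the selection mechanism there has no known probabilities.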
Utility of Sampling Theory
Basis for settling on ways to estimate
population parameters and the precision of
those estimates
Basis for much of the decision making in
designing the sample
Inference in Population-Based Studies
Circle of inference:
Population (values to be estimated)
-> Sample Design (probability sampling)
-> Selected Sample (data collected)
-> Analysis (population values estimated)
-> back to Population
Population Hierarchy
Population
Member
Population Hierarchy: Some Examples
First grade students in NC schools
Residents of the United States
Components of a Population-Based Study
Planning
– Study specifications
Target population vs. survey population
– Budget considerations
– Staff communication
– Sample size
Components of a Population-Based Study
Sampling
– Preliminary activities
– Search for sampling frame(s)
List(s) of units to be sampled
– Develop the sample design
Plan to choose the sample
Consists of a sequence of statistical issues and
decisions
– Select the sample
Components of a Population-Based Study
Data collection instrument
– Design questionnaire and forms
– Small-scale testing
– Manuals for training
Data collection
– Preparation (e.g., hiring and training)
– Field operations (e.g., monitoring and supervision)
Manual editing/coding
– Preparation
– Operations
Components of a Population-Based Study
Data entry
– Preparation
– Operations
Machine editing/coding and file processing
– Preparation
– Run edits
– Prepare analysis work files
Analysis and dissemination