Lecture 3 - Tulane University

Survey Methodology
Sampling, Part 2
EPID 626
Lecture 3
1/26/00
Random digit dialing
• Delineate the geographic boundaries of
the sampling area
• Identify all of the exchanges used in the
geographic area
• Identify the distribution of prefixes within
the sampling area
– Example: There may be 8 exchanges, but
you may find that 3 of them are used for
nearly two-thirds of residential lines.
Random digit dialing
• You may stratify based on the
distribution of prefixes
– Ex. Take more samples of the 3 exchanges
that account for the most residential lines
• Try to identify vacuous suffixes
– These are suffixes not yet assigned or
assigned in large groups to a business
– Usually consider suffixes in 100s
• ex. 0000-0099, 0100-0199
Random digit dialing
• May randomly select the four-digit
suffixes
– ex. use a random-numbers table
• Alternatively, you may use a plus-one
approach
– When you reach a residence, use the
number as a seed, and add fixed digits
(one or two) to get the next sample
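
The two ways of generating suffixes described above can be sketched as follows. This is a minimal illustration, not a production dialer: the exchange codes, the stratum weights, and the wrap-around at 9999 in the plus-one step are all assumptions made for the example.

```python
import random

def rdd_random_suffix(exchanges, weights, n, rng):
    """Pick an exchange in proportion to its weight, then a random four-digit suffix."""
    numbers = []
    for _ in range(n):
        exchange = rng.choices(exchanges, weights=weights, k=1)[0]
        suffix = rng.randrange(10000)  # 0000 through 9999
        numbers.append(f"{exchange}-{suffix:04d}")
    return numbers

def plus_one(seed_number, step=1):
    """Plus-one approach: add a fixed digit to a number that reached a residence."""
    exchange, suffix = seed_number.split("-")
    next_suffix = (int(suffix) + step) % 10000  # assumption: wrap around after 9999
    return f"{exchange}-{next_suffix:04d}"

rng = random.Random(626)            # fixed seed so the draw is repeatable
exchanges = ["555", "556", "557"]   # hypothetical exchanges
weights = [0.5, 0.3, 0.2]           # hypothetical share of residential lines
sample = rdd_random_suffix(exchanges, weights, 5, rng)
print(sample)
print(plus_one("555-0199"))         # 555-0200
```

Weighting the choice of exchange is one simple way to take more samples from the exchanges that carry most residential lines.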
Random digit dialing
• What are the advantages and
disadvantages of the random method?
• What are the advantages and
disadvantages of the plus-one method?
Random digit dialing
• Provides a nonzero chance of reaching
any household within a sampling area
that has a telephone line regardless of
whether the number is listed
• Is the probability of reaching every
household equal?
– No. Households with more than one phone
line will have a greater probability than
households with one phone line.
– Adjust for unequal probability by weighting
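
One common way to make this adjustment is to weight each household by the inverse of its number of phone lines. The sketch below assumes that approach; the responses and line counts are invented for illustration.

```python
def line_weight(n_lines):
    """Weight inversely proportional to the number of lines that reach the household."""
    if n_lines < 1:
        raise ValueError("a sampled household must have at least one line")
    return 1.0 / n_lines

def weighted_mean(values, weights):
    """Weighted mean of the responses."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical responses: (answer coded 0/1, number of phone lines)
responses = [(1, 1), (0, 2), (1, 1), (0, 2)]
values = [v for v, _ in responses]
weights = [line_weight(k) for _, k in responses]

print(sum(values) / len(values))       # unweighted mean: 0.5
print(weighted_mean(values, weights))  # weighted mean: about 0.667
```

Two-line households were twice as likely to be reached, so each counts half as much in the weighted estimate.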
Random Digit Dialing
• Advantages: Inexpensive and easy to
do
• Disadvantages:
1. Large number of unfruitful calls
2. Will exclude individuals without
phones
3. May be difficult to ascertain
geographic area
Cluster sampling
• Each member of the study population is
assigned to a group or cluster, then
clusters are selected at random and all
members of a selected cluster are
included in the sample
• Clusters are often naturally-occurring
groupings such as schools, households,
or city blocks
(Henry, 1990)
Multistage sampling strategy
• Underlying concept is similar to cluster
sampling
• Clusters are selected as in the cluster
sample, then sample members are
selected from the cluster members by
simple random sampling
• Clustering may be done at more than
one stage
(Henry, 1990)
School example
• Assume 20,000 students; 40 schools
• Desired sample size=2,000 students
(sampling interval i=10, i.e., 1 in 10 students)
Possible Approaches
• Select all schools, list all students, and
select 1/10 of students (SRS)
• Select 1/2 of schools, then select 1/5 of
students (multistage)
• Select 1/5 of schools, then select 1/2 of
students (multistage)
• Select 1/10 of schools, then collect data
on all students in those schools (cluster)
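
A quick check that all four approaches give the same overall sampling fraction of 1/10. The expected sample size of 2,000 assumes the selected schools are, on average, the same size as the rest (20,000 / 40 = 500 students each); that assumption is made for the example only.

```python
from fractions import Fraction

TOTAL_STUDENTS = 20_000
TOTAL_SCHOOLS = 40

# (fraction of schools selected, fraction of students selected within a school)
approaches = {
    "SRS": (Fraction(1), Fraction(1, 10)),
    "multistage 1/2 x 1/5": (Fraction(1, 2), Fraction(1, 5)),
    "multistage 1/5 x 1/2": (Fraction(1, 5), Fraction(1, 2)),
    "cluster": (Fraction(1, 10), Fraction(1)),
}

results = {}
for name, (f_school, f_student) in approaches.items():
    overall = f_school * f_student              # overall sampling fraction
    n_schools = int(f_school * TOTAL_SCHOOLS)   # schools that must be visited
    expected_n = int(overall * TOTAL_STUDENTS)  # expected number of students
    results[name] = (overall, n_schools, expected_n)
    print(f"{name}: {n_schools} schools, fraction {overall}, about {expected_n} students")
```

The designs differ in how many schools must be visited (40, 20, 8, or 4), which is what drives the trade-off between cost and precision.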
Area probability sampling
• Geographic basis
• Divide a total geographic area into
exhaustive, mutually exclusive subareas
• Sample subareas
• List all housing units within selected
subareas
• Sample from list or collect data from
entire list
(Fowler, 1993)
Example
• City has 400 blocks with a total of
20,000 housing units
• Sample size is 2,000 housing units
(i=10)
Approaches
• Sample 80 blocks, list housing units,
then sample 1/2 of the housing units
• Sample 40 blocks, list housing units,
then sample all of them
Area probability sampling
proportional to size
• Choose a cluster size for the last stage
of sampling (for example, 10 housing
units)
• Estimate the number of housing units
on each block
• Order the blocks so that similar ones
are contiguous
• Create a cumulative count of housing
units
• Determine the interval between clusters
(I=i*cluster size)
• Choose a random start between 1 and I.
• Proceed through the cumulative count,
selecting the block that contains every
Ith housing unit
• List units on the selected block and
select cluster size (ex. 10)
(Fowler, 1993)
Example
Block   Est. HU   Cum. HU   Hits
1       43        43        -
2       87        130       70
3       99        229       170
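
The selection in this example can be reproduced with a short sketch. It assumes an interval of I = 10 * 10 = 100 and a random start of 70; the start is inferred from the first hit in the table rather than stated on the slide, and the function name is hypothetical.

```python
from itertools import accumulate

def pps_systematic(est_units, interval, start):
    """Return (block number, hit) pairs for systematic selection proportional to size."""
    cum = list(accumulate(est_units))  # cumulative count of housing units
    hits = []
    hit = start
    block = 0
    while hit <= cum[-1]:
        # advance to the first block whose cumulative count reaches the hit
        while cum[block] < hit:
            block += 1
        hits.append((block + 1, hit))  # blocks are numbered from 1
        hit += interval
    return hits

est_units = [43, 87, 99]   # estimated housing units on blocks 1 to 3
interval = 10 * 10         # I = i * cluster size
print(pps_systematic(est_units, interval, start=70))  # [(2, 70), (3, 170)]
```

Block 1 is skipped because its cumulative count (43) falls short of the first hit, while blocks 2 and 3 each contain one hit.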
• What if your estimate of the number of
housing units on the block was wrong?
Sample within each block at the rate
(cluster size/estimated units on block),
applied to the units actually listed
• The rate is 10/43 for block 1,
10/87 for block 2,
and 10/99 for block 3.
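
A short check of why this rate works: the chance that a block is hit is proportional to its estimated size, and the within-block rate is inversely proportional to it, so the estimate cancels. The sketch assumes I = 100 and blocks no larger than the interval; the function name is hypothetical.

```python
from fractions import Fraction

CLUSTER_SIZE = 10
INTERVAL = 100  # I = i * cluster size = 10 * 10

def overall_probability(est_units, cluster_size=CLUSTER_SIZE, interval=INTERVAL):
    """Overall chance that a housing unit is selected (assumes est_units <= interval)."""
    p_block = Fraction(est_units, interval)           # chance the block is hit
    within_rate = Fraction(cluster_size, est_units)   # sampling rate inside the block
    return p_block * within_rate

for block, est in [(1, 43), (2, 87), (3, 99)]:
    print(block, overall_probability(est))  # 1/10 for every block
```

Every housing unit ends up with the same 1-in-10 chance, whatever the block estimate was; a wrong estimate changes how many units the block yields, not the selection probability.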
Respondent selection
• Once you have selected a housing unit,
how do you select the individual
respondent?
• Who is the best person to provide the
desired information?
• For self-reporting surveys, we use
probability sampling to select the
respondent.
• Ascertain the number of eligible
individuals in the housing unit
• Number them
• Randomly select a respondent
• You may need to weight the response
by the number of eligible individuals in
the housing unit
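
The steps above can be sketched as follows. The household roster is hypothetical. The weight equals the number of eligible individuals because a person in a household with k eligible members is chosen with probability 1/k.

```python
import random

def select_respondent(eligible, rng):
    """Randomly pick one eligible person and return (person, weight)."""
    if not eligible:
        raise ValueError("no eligible individuals in this housing unit")
    chosen = rng.choice(eligible)  # each person has probability 1/len(eligible)
    weight = len(eligible)         # inverse of the within-household probability
    return chosen, weight

rng = random.Random(3)
household = ["adult 1", "adult 2", "adult 3"]  # hypothetical numbered roster
person, weight = select_respondent(household, rng)
print(person, weight)
```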
Nonprobability sampling
designs
• Convenience: select cases based on
their availability for the study
• Most similar/dissimilar cases: select
cases that are judged to represent
similar conditions or, alternatively, very
different conditions
• Typical cases: select cases that are
known beforehand to be useful and not
to be extreme
Nonprobability sampling
designs
• Critical cases: select cases that are key
or essential for overall acceptance or
assessment
• Snowball: group members identify
additional members to be included in
sample
Nonprobability sampling
designs
• Quota: interviewers select sample that
yields the same proportions as the
population proportions on easily
identified variables
(Henry, 1990)
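
As an illustration of how quota targets might be derived from population proportions, here is a minimal sketch. The population counts and category labels are invented, and the largest-remainder rounding is one reasonable choice for making the targets sum to the sample size, not a method prescribed by the source.

```python
def quota_targets(pop_counts, n):
    """Allocate n interviews across categories in proportion to population counts."""
    total = sum(pop_counts.values())
    # whole-number share for each category
    targets = {k: c * n // total for k, c in pop_counts.items()}
    leftover = n - sum(targets.values())
    # hand out remaining interviews to the categories with the largest remainders
    by_remainder = sorted(pop_counts, key=lambda k: (pop_counts[k] * n) % total, reverse=True)
    for k in by_remainder[:leftover]:
        targets[k] += 1
    return targets

# Hypothetical population counts on easily identified variables (sex by age group)
population = {"women 18-44": 2600, "women 45+": 2500, "men 18-44": 2700, "men 45+": 2200}
targets = quota_targets(population, 200)
print(targets)
```

Interviewers would then fill each quota with whichever eligible cases they find, which is why the design remains a nonprobability one.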
Terminology
• Universe
• Population
• Survey population
• Sampling frame
• Sampling unit
• Observation unit