Some kind of sampling
Cluster Sampling
Module 3
Session 8
Purpose of the session
To demonstrate how a cluster sample is
selected in practice
To demonstrate how parameters are
estimated under cluster sampling
We do this for clusters of the same size and
clusters of different sizes.
The practicalities of cluster sampling are
also discussed.
Introduction - Simple random
sampling not always appropriate!
Example
Population of N=324
households
Households arranged into 36
“villages” of 9 households
each
Costly to travel between
villages
Cheap to travel between
households in a village
Taking an SRS of n=27 households is a “costly” strategy, because the sampled households are likely to be spread over many villages
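A rough check of why the SRS is costly: the expected number of distinct villages that an SRS of 27 households touches can be worked out directly. The sketch below is only an illustration; the function name is made up here and is not part of the session material.

```python
from math import comb

def expected_villages_visited(n_villages, village_size, n_sample):
    """Expected number of distinct villages hit by an SRS of households."""
    n_households = n_villages * village_size
    # Probability that a given village contributes no household to the sample
    p_missed = comb(n_households - village_size, n_sample) / comb(n_households, n_sample)
    return n_villages * (1 - p_missed)

# Example from the slide: 36 villages of 9 households, SRS of n=27
srs_villages = expected_villages_visited(36, 9, 27)
# The same 27 households taken as whole villages
cluster_villages = 27 // 9
print(round(srs_villages, 1), cluster_villages)
```

Under these assumptions the SRS is expected to involve travel to about 20 villages, against 3 villages if whole villages are sampled.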
Cluster sampling
Example (cont.)
Each village is a primary
sampling unit (PSU)
Each household in a village
is a secondary sampling unit
(SSU)
Take a sample of villages
Sample all households
within the selected villages
This is one-stage cluster
sampling.
Cluster sampling
Cluster sampling is useful when:
• The structure of the units is hierarchical (e.g. villages and households within villages)
• A sampling frame may not exist at SSU level (may only exist at PSU level)
• Cost is a concern
  - e.g. in the example, cluster sampling is cheaper than SRS for the same sampling effort.
Illustration: Estimation
Cluster sampling: 3 villages
out of 36 selected using
SRS.
Income from sale of goods
recorded for each
household, and totalled up
for village.
Estimates: Mean village income is 256.7
Total income for area is 9240 (36 villages × mean village income)
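The two estimates can be reproduced with a few lines of code. The transcript does not show the three village totals, so the values below are hypothetical, chosen only so that their mean matches the 256.7 on the slide; the function name is also illustrative.

```python
def cluster_estimates(sample_village_totals, n_villages_in_population):
    """Equal-size clusters selected by SRS: mean village total and estimated area total."""
    m = len(sample_village_totals)
    mean_village_total = sum(sample_village_totals) / m
    # Scale the mean village total up to all villages in the area
    estimated_total = n_villages_in_population * mean_village_total
    return mean_village_total, estimated_total

# HYPOTHETICAL village totals (not from the slides), summing to 770
hypothetical_totals = [240, 250, 280]
mean_income, total_income = cluster_estimates(hypothetical_totals, 36)
print(round(mean_income, 1), round(total_income))  # 256.7 9240
```

Note that the total uses the unrounded mean (770/3); multiplying the rounded 256.7 by 36 gives 9241.2 instead.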
In practice…
Units in a cluster tend to be more similar to
each other than to units in other clusters
Cluster sampling therefore often leads to less precise
estimates than an SRS of the same size
(the opposite of stratification)
Trade-off between convenience and precision:
If cluster sampling is cheap to do, a larger
sample could be taken to help improve precision.
Selecting the PSUs
In this first (unrealistic) example, the villages all
have the same number of households, hence we
select villages using simple random sampling.
In general the PSUs (villages) may not have the
same number of SSUs (households). We might then
want to select PSUs with
probability proportional to size (PPS).
This gives a large PSU a greater probability of being
selected than a small PSU.
PPS Sampling (with replacement)
Example:
M=8 Villages (PSUs) of different sizes.
Want to sample 3 of them (m=3).
Assume interest is still in income from sale of
goods (recorded for households and totalled for
each village).
Larger villages are likely to have higher incomes,
and smaller villages lower incomes.
PPS sampling (cont)
240 households (SSUs) in the population, arranged
in the villages as follows:
PSU (e.g. village no.)          1     2     3     4     5    6    7     8
SSUs (e.g. no. of households)   10    10    20    20    40   40   50    50
Probability of village being selected (p_i) is:
PSU    1     2     3     4     5    6    7     8
p_i    1/24  1/24  1/12  1/12  1/6  1/6  5/24  5/24
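The selection probabilities are simply each village's share of the 240 households. A minimal sketch that reproduces the table, using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

# Village number -> number of households, as in the table above
village_sizes = {1: 10, 2: 10, 3: 20, 4: 20, 5: 40, 6: 40, 7: 50, 8: 50}
total_households = sum(village_sizes.values())

# p_i = (households in village i) / (households in the population)
selection_prob = {
    village: Fraction(size, total_households)
    for village, size in village_sizes.items()
}

for village, p in selection_prob.items():
    print(village, p)
```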
PPS sampling (cont)
Step 1: Calculate the cumulative sum of the SSUs
PSU   1    2    3    4    5     6     7     8
Sum   10   20   40   60   100   140   190   240
Step 2: Draw a number at random from 1, 2, …, 240
This determines which village is selected: it is the first
village whose cumulative sum is at least the number drawn
e.g. 48 would be in Village 4, and 190 in Village 7.
PPS sampling (cont)
Step 3: Replace number and repeat to select other
villages
Three numbers may be 33, 174, 137
to give Villages 3, 7 and 6
Step 4: Sample all households in the selected
villages
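Steps 1 to 3 can be sketched in code as follows. This is one possible way to do the lookup, using a binary search on the cumulative sums; the function names are illustrative, and the random draw at the end will differ from the numbers on the slide.

```python
import random
from bisect import bisect_left
from itertools import accumulate

village_sizes = [10, 10, 20, 20, 40, 40, 50, 50]
# Step 1: cumulative sum of households
cumulative = list(accumulate(village_sizes))

def village_for(number):
    """Village (numbered from 1) whose cumulative range contains the drawn number."""
    return bisect_left(cumulative, number) + 1

def pps_sample(m, rng):
    """Steps 2-3: m independent draws from 1..240, i.e. sampling with replacement."""
    draws = [rng.randint(1, cumulative[-1]) for _ in range(m)]
    return [village_for(d) for d in draws]

# The three numbers from the slide
print([village_for(d) for d in (33, 174, 137)])  # [3, 7, 6]
# A fresh random selection (seed chosen arbitrarily)
print(pps_sample(3, random.Random(0)))
```

Because the draws are made with replacement, the same village can be selected more than once.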
The estimated total income for the area is then calculated by
weighting each sampled village's total according to the size of the village.
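The slides do not give the formula. One standard choice for PPS sampling with replacement is the Hansen-Hurwitz estimator, which divides each sampled village total by its selection probability p_i and averages over the m draws. The sketch below assumes that estimator; the village incomes are hypothetical, since the transcript does not report them.

```python
from fractions import Fraction

village_sizes = {1: 10, 2: 10, 3: 20, 4: 20, 5: 40, 6: 40, 7: 50, 8: 50}
total_households = sum(village_sizes.values())

def pps_total_estimate(sampled_totals):
    """Hansen-Hurwitz estimate of the population total.

    sampled_totals: list of (village, village_total) pairs, one per draw.
    """
    m = len(sampled_totals)
    weighted = [
        Fraction(total) / Fraction(village_sizes[village], total_households)
        for village, total in sampled_totals
    ]
    return sum(weighted) / m

# HYPOTHETICAL incomes for Villages 3, 7 and 6 (not from the slides)
hypothetical = [(3, 210), (7, 480), (6, 420)]
estimate = pps_total_estimate(hypothetical)
print(estimate)  # 2448
```

Dividing by p_i is what weights by village size: a total from a small village is scaled up more than one from a large village, which offsets the large villages' higher chance of selection.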