Transcript Slide 1

Sampling: Basic concepts
Overview
• Why do sampling?
• Steps for deciding sampling methodology
• Sampling methods
• Representative vs. bias
• Probability vs. non-probability
• Simple random, systematic and cluster sampling
What is the objective of sampling?
The objective of sampling is to estimate an indicator for the larger population if we cannot measure everybody.
Population of Papua New Guinea
• 726,680 children less than 5 years of age
• 1,298,503 women 15-49 years of age
With 6 teams who each measure 13 women and 13 children per day, data collection would take 16,648 days or 45.6 years
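The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that every team measures both 13 women and 13 children each day, so the larger group (women) determines the duration, and that a year has 365 days. The variable and function names are illustrative only.

```python
import math

women = 1_298_503      # women 15-49 years of age
children = 726_680     # children less than 5 years of age
teams = 6
per_team_per_day = 13  # women (and children) measured per team per day

def days_needed(people, teams, per_team_per_day):
    """Whole days needed to measure everybody in one group."""
    return math.ceil(people / (teams * per_team_per_day))

# Both groups are measured in parallel, so the larger group sets the duration.
total_days = max(days_needed(women, teams, per_team_per_day),
                 days_needed(children, teams, per_team_per_day))
years = total_days / 365

print(total_days)       # 16648
print(round(years, 1))  # 45.6
```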
What is necessary to achieve this objective?
The sample must be representative of the larger population.
Representative versus bias…
Representative
All members of a population have an equal chance of being included in the sample
→ Results will be close to the population’s true value
Bias
Some members have a greater chance of being included than others (e.g. interviewer bias, main road bias).
→ Results will differ from the actual population prevalence
→ This error cannot be corrected during the analysis
Random or biased sample?
A survey of child malnutrition is conducted by measuring the children of women who were advised over the radio to bring their under-fives to the health clinic on Tuesday morning.
BIASED
Random or biased sample?
The proportion of the population affected by HIV/AIDS is reported as 5.8%, based on statistics from health facilities that frequently take blood samples from pregnant women.
BIASED
Steps for deciding sampling methodology
1. Define objectives and geographic area
2. Identify what info to collect
3. Determine sampling method
4. Calculate sample size
Additional factors: time available, financial resources, physical access (security)
Types of sampling
• Non-probability sampling
• Probability sampling
non-probability sampling…
sampling that doesn’t use random selection to choose units to be examined or measured: non-representative results
non-probability sampling…
When is it used?
• Rapid appraisal methods (e.g. key informant/community group interviews/focus group discussions)
• Often used in rapid assessments
• Sampling with “a purpose” in mind: generally one or more pre-defined groups or areas to assess
• Useful to reach targeted sample quickly
probability sampling…
sampling that uses random selection to choose units. Results are representative of the larger population
Pros and Cons of Probability and Non-Probability Sampling
factor                                 probability                non-probability
precision:                             ++                         +
time:                                  ++                         +
cost:                                  ++                         +
if lack of access due to insecurity:   +                          ++
skill requirements:                    statistics skills needed   qualitative analysis skills needed
key concepts for probability sampling
population: the group of people for which indicators are measured
sampling frame: the population list from which the sample is to be drawn
sample: the randomly selected subset of the population
sampling unit: the unit that is selected during the process of sampling (e.g. first stage: community, second stage: household)
Example
A food security and nutrition survey is conducted in Flexiland. 100,000 households live in the area in 1,000 villages. First, 30 villages will be selected. In each village 15 households will be visited. The household head or spouse reports on all food items consumed by the household over the last 7 days. In addition, all children 6-59 months are measured. On average, households have 1.5 children in this age group.
Identify:
• Population
• Sampling frame
• Sample
• Respondent
• Sampling units
Example cont.
Population: Flexiland
Sampling frame:
• First stage: List of villages
• Second stage: List of households within villages
Sample:
• 450 HHs (30*15)
• 675 children (450*1.5)
Respondent: Household head or spouse
Sampling units:
• Primary: Villages
• Secondary: Households, children (6-59 months)
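The sample sizes in the Flexiland example follow directly from the design figures. The short sketch below recomputes them. The sampling fraction is an extra figure derived from the 100,000 households stated in the example, and the variable names are illustrative only.

```python
total_households = 100_000     # households living in Flexiland
villages_selected = 30         # first-stage sample
households_per_village = 15    # second-stage sample per village
children_per_household = 1.5   # average number of children 6-59 months

sample_households = villages_selected * households_per_village
expected_children = sample_households * children_per_household
sampling_fraction = sample_households / total_households

print(sample_households)           # 450
print(expected_children)           # 675.0
print(f"{sampling_fraction:.2%}")  # 0.45%
```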
Types of probability sampling
A: Simple random
B: Systematic
C: Cluster
A: Simple Random Sampling
Each household/person is randomly selected from the population list.
Easier to use when the population of interest is small and confined to a small geographic area.
Steps:
1. Number each sampling unit
2. Choose a new random number for each selection (random number table or lottery)
Example: Select 5 people out of 10
Number   Household
1        Edmond
2        Daniel
3        Jyoti
4        Victor
5        Anne
6        Sheriff
7        Vandi
8        Iye
9        Victor
0        Rauf
Random number table
2352 6959 7678 1937
2554 6804 9098 4316
4318 2346 7276 1880
7136 9603 0163 3152
7000 2865 8357 4475
9804 0042 1106 7949
2932 9958 9582 2235
1140 1164 7841 1688
4097 8995 5030 1785
5420 0125 4953 1332
5540 6278 1584 4392
3258 1374 1617 7427
Example: 1st person = 2 (Daniel)
Example: 2nd person = 3 (Jyoti)
Example: 3rd person = 5 (Anne)
Example: 4th person = 6 (Sheriff); the repeated 2 at the end of 2352 is skipped, so the next digit comes from 6959
Example: 5th person = 9 (Victor)
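The five selections in this example (2, 3, 5, 6, 9) can be reproduced by reading the first row of the random number table one digit at a time and skipping any digit that was already drawn. The sketch below is one way to express that reading rule. The function name pick_from_digits is illustrative, and starting at the top-left of the table is an assumption that happens to match the slide's results.

```python
# First row of the slide's random number table, read left to right.
TABLE_ROW = "2352 6959 7678 1937"

households = {1: "Edmond", 2: "Daniel", 3: "Jyoti", 4: "Victor", 5: "Anne",
              6: "Sheriff", 7: "Vandi", 8: "Iye", 9: "Victor", 0: "Rauf"}

def pick_from_digits(digits, valid_numbers, n):
    """Read one digit at a time; keep valid, not-yet-selected numbers."""
    selected = []
    for ch in digits:
        if not ch.isdigit():
            continue  # skip the spaces between table entries
        number = int(ch)
        if number in valid_numbers and number not in selected:
            selected.append(number)
        if len(selected) == n:
            break
    return selected

picks = pick_from_digits(TABLE_ROW, households.keys(), 5)
print(picks)                           # [2, 3, 5, 6, 9]
print([households[p] for p in picks])  # ['Daniel', 'Jyoti', 'Anne', 'Sheriff', 'Victor']
```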
Using Random Number Tables
• If units < 10, then use 1 digit of table numbers
• If units < 100, then use 2 digits of table numbers
• If units < 1000, then use 3 digits of table numbers
Example: You want to randomly select 6 out of 71 towns
1. You number them from 1 to 71.
2. Close eyes and place fingertip on the table to start
3. Decide if you want to move right, left, up or down
4. Select first two digits of each number in the table
5. Cross out those that start with 72 or higher
TABLE OF RANDOM NUMBERS
39634 62349 74088 65564 16379 19713 39153 69459 17986 24537
14595 35050 40469 27478 44526 67331 93365 54526 22356 93208
30734 71571 83722 79712 25775 65178 07763 82928 31131 30196
64628 89126 91254 99090 25752 03091 39411 73146 06089 15630
42831 95113 43511 42082 15140 34733 68076 18292 69486 80468
80583 70361 41047 26792 78466 03395 17635 09697 82447 31405
00209 90404 99457 72570 42194 49043 24330 14939 09865 45906
05409 20830 01911 60767 55248 79253 12317 84120 77772 50103
95836 22530 91785 80210 34361 52228 33869 94332 83868 61672
65358 70469 87149 89509 72176 18103 55169 79954 72002 20582
→ 6 towns are selected
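The five steps for the 71 towns can be sketched as follows. The slide leaves the starting point and the direction to chance; this sketch assumes a start at the top-left entry and movement to the right, and it copies only the first two rows of the table. Repeated numbers are skipped as well, which the steps imply but do not state. The name select_units is illustrative.

```python
# First two rows of the table of random numbers shown on the slide.
TABLE = """
39634 62349 74088 65564 16379 19713 39153 69459 17986 24537
14595 35050 40469 27478 44526 67331 93365 54526 22356 93208
"""

def select_units(table_text, population_size, n, digits=2):
    """Take the first `digits` digits of each entry; keep numbers in 1..population_size."""
    selected = []
    for entry in table_text.split():
        number = int(entry[:digits])
        # Cross out numbers above the population size, 00, and repeats.
        if 1 <= number <= population_size and number not in selected:
            selected.append(number)
        if len(selected) == n:
            break
    return selected

towns = select_units(TABLE, population_size=71, n=6)
print(towns)  # [39, 62, 65, 16, 19, 69]
```

A different starting point or direction gives a different but equally valid selection.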
Class exercise
Randomly select 4 members of this class using the random number table
Random number table
2352 6959 3647 1937 2554 6804 9098
4316 4318 2346 7276 1880 7136 9603
0163 3152 7000 2865 8357 4475 9804
0042 1106 7949 2932 9958 9582 2235
1140 1164 7841 1688 4097 8995 5030
1785 5420 0125 4953 1332 5540 6278
1584 4392 3258 1374 1617 7427 3320
Using SPSS
SPSS can help to randomly select cases by using the “Select Cases” function
• Data → Select Cases → Random sample of cases (option 1: xx% of all cases; option 2: x cases from the first y cases)
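For readers without SPSS, the two options can be approximated in a few lines of Python. This is only a rough stand-in and not SPSS's own algorithm: the function names, the case IDs and the fixed seed are all assumptions made for illustration.

```python
import random

def sample_percent(cases, percent, rng):
    """Option 1: keep each case with probability percent/100 (size is approximate)."""
    return [c for c in cases if rng.random() < percent / 100]

def sample_exact(cases, n, first_m, rng):
    """Option 2: draw exactly n cases, without replacement, from the first first_m cases."""
    return rng.sample(cases[:first_m], n)

rng = random.Random(42)      # fixed seed only so the sketch is repeatable
cases = list(range(1, 101))  # hypothetical case IDs 1..100

approx = sample_percent(cases, 10, rng)  # roughly 10% of all cases
exact = sample_exact(cases, 5, 50, rng)  # exactly 5 of the first 50 cases
print(len(approx), exact)
```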
Simple Random Sampling
B: Systematic Random Sampling
Similar to simple random sampling; works well in well-organized refugee/IDP camps or neighborhoods
• First person chosen randomly
• Systematic selection of subsequent people
• Statistics same as simple random sampling
Steps:
• List or map all units in the population
• Compute sampling interval (population size / sample size)
• Select random start between 1 and sampling interval
• Repeatedly add sampling interval to select subsequent sampling units
Example 1 (household list): selection of 15 households in a
community of 47 households
1. Peter Smith
2. John Edward
3. Mary McLean
4. George Williams
5. Morris Tamba
6. Sayba Kolubah
7. James Tamba
8. Clifford Howard
9. Thomas Tarr
10. Jerry Morris
11. Jules Sana
12. Lisa Miller
13. David Harper
14. Peter Smith
15. John Edward
16. Mary McLean
17. George Williams
18. Morris Tamba
19. Sayba Kolubah
20. James Tamba
21. Clifford Howard
22. Thomas Tarr
23. Jerry Morris
24. Lisa Miller
25. David Harper
26. Hilary Scott
27. Smith Suba
28. Zoe Mulbah
29. Roosevelt Hill
30. Johnson Snow
31. Salif Jensen
32. Fassou Clements
33. Massa Kru
34. Emanuel Liberty
35. Stella Morris
36. Peter Smith
37. John Edward
38. Mary McLean
39. George Williams
40. Morris Tamba
41. Sayba Kolubah
42. James Tamba
43. Clifford Howard
44. Thomas Tarr
45. Jerry Morris
46. Lisa Miller
47. David Harper
Sampling interval: 47/15 ≈ 3
Randomly select the starting point: 1, 2 or 3 (counting, lottery)
→ 15 HHs are selected
Example 2 (refugee camp): selection of 40 households in a
camp made up of 480 households
Systematic sampling: sampling interval = 480/40 = 12
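Both systematic sampling examples follow the same steps: compute the interval, pick a random start between 1 and the interval, then keep adding the interval. A minimal sketch is shown below. It assumes the interval is rounded down to a whole number (47/15 becomes 3), which keeps every selected number inside the list. The function name and the fixed seed are illustrative.

```python
import random

def systematic_sample(population_size, sample_size, rng):
    """Return the list positions (1-based) chosen by systematic random sampling."""
    interval = population_size // sample_size  # e.g. 47 // 15 = 3
    start = rng.randint(1, interval)           # random start: 1..interval
    return [start + i * interval for i in range(sample_size)]

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

community = systematic_sample(47, 15, rng)  # Example 1: household list
camp = systematic_sample(480, 40, rng)      # Example 2: refugee camp

print(community)
print(camp[:5], "...")
```

Whatever the random start, the first example always yields 15 households between 1 and 45, and the second yields 40 households between 1 and 480.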
Example 1: Which sampling method if no registration has taken place yet?
Stankovic I camp, Macedonia
Example 2: Which sampling method if registration has already taken place?
Chaman camp, Pakistan
Example 3: Which sampling method?
Kabumba camp, Zaire
What is required for both simple
and systematic random sampling?
Both require a complete list of
sampling units arranged in some order.
What do we do when no accurate list of all basic sampling units is available?
C: Cluster Sampling
Used when sampling frame or geographic area is large
→ Saves time and resources
Objective: To choose smaller geographic areas in which simple or systematic random sampling can be done
Two-stage Cluster Sampling
1st stage: sites (= “clusters”) are selected using ‘probability proportional to size (PPS)’ methodology
2nd stage: within each cluster, households are randomly selected
Example 1: 25 clusters per district, 15 households per cluster = 375 households in each district
Two-stage Cluster Sampling in Flexiland
Step 1: Randomly select 25 communities
Step 2: Within each cluster (community), select 15 households using random or systematic random sampling
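The two stages can be sketched in code. The slides name PPS for the first stage but do not describe the procedure. The sketch below uses the common cumulative-total (systematic PPS) approach as an assumption, with a made-up frame of 1,000 communities of 50 to 150 households each. Function and variable names are illustrative.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n_clusters, rng):
    """Return indices of clusters chosen with probability proportional to size."""
    cumulative = list(accumulate(sizes))    # running total of households
    interval = cumulative[-1] / n_clusters  # interval on the cumulative scale
    start = rng.random() * interval         # random start in [0, interval)
    selected, idx = [], 0
    for i in range(n_clusters):
        point = start + i * interval
        # Advance to the cluster whose cumulative range contains this point.
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
# Hypothetical frame: 1,000 communities with 50 to 150 households each.
community_sizes = [rng.randint(50, 150) for _ in range(1000)]

# Stage 1: select 25 communities (clusters) by PPS.
clusters = pps_systematic(community_sizes, 25, rng)

# Stage 2: select 15 households in each cluster by simple random sampling.
sample = {c: rng.sample(range(1, community_sizes[c] + 1), 15) for c in clusters}

print(len(clusters), sum(len(h) for h in sample.values()))  # 25 375
```

In this made-up frame every community is smaller than the sampling interval, so no community is drawn twice. With very large communities the same cluster can be selected more than once, and the sketch would need to handle that case.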
Example 4: Which sampling method?
1500 kms
Stratification
• Stratification is the process of grouping members of the population into relatively homogeneous subgroups (e.g. regions, districts, livelihood zones)
• The strata should be mutually exclusive: every element in the population must be assigned to only one stratum
• Within each stratum, random, systematic or two-stage cluster sampling is applied
• Advantages:
  → Sub-groups can be compared
  → Representativeness is improved as each stratum is more homogeneous than the population as a whole
• During the analysis, weighting is used to generate results that are representative at the aggregate level (e.g. nation, rural/urban population)
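The weighting step can be illustrated with a small worked example. The three strata, their population sizes and their prevalence estimates below are invented for illustration. The sketch assumes each stratum's estimate is weighted by its share of the total population.

```python
# Hypothetical strata: population size and estimated prevalence in each.
strata = {
    "North": {"population": 500_000, "prevalence": 0.20},
    "South": {"population": 300_000, "prevalence": 0.10},
    "Coast": {"population": 200_000, "prevalence": 0.30},
}

total_population = sum(s["population"] for s in strata.values())

# Weight each stratum's estimate by its share of the total population.
weighted = sum(s["population"] / total_population * s["prevalence"]
               for s in strata.values())

# Unweighted mean of the stratum estimates, shown for comparison only.
unweighted = sum(s["prevalence"] for s in strata.values()) / len(strata)

print(f"weighted: {weighted:.1%}")      # weighted: 19.0%
print(f"unweighted: {unweighted:.1%}")  # unweighted: 20.0%
```

The two figures differ because the strata are not equally large. Ignoring the weights would overstate the contribution of the smallest stratum.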
Example 5: How many strata?
Example 6: How many strata?
Final panel exercise:
Which sampling method would you choose?
• Rapid emergency food security assessments following a flood in the Northern Atlantic Coast region of Nicaragua?
• Nutrition survey in an IDP camp in Darfur?
• Comprehensive Food Security and Vulnerability Analysis (CFSVA) in Zambia?
• Market assessment in Yemen?
Questions