Systematic Sampling (SYS)

Up to now, we have only considered one design:
    SRS of size n from a population of size N
New design: SYS

DEFN: A 1-in-k systematic sample is a
sample obtained by randomly selecting
one sampling unit from the first k sampling
units in the sampling frame, and
every k-th sampling unit thereafter
Sampling procedure for SYS

Have a frame, or list, of N SUs
    Assume SU = OU for now
Determine the sampling interval, k
    k is N/n rounded up to the next integer
Select the first SU in the list
    Choose a random number, R, between 1 and k
    The R-th SU is the first SU to be included in the sample
Select every k-th SU after the R-th SU
    Sample includes unit R, unit R + k, unit R + 2k, …, unit R + (n-1)k

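The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `systematic_sample` is hypothetical, units are labelled 1 to N, and k is assumed to be N/n rounded up.

```python
import math
import random

def systematic_sample(N, n, rng=random):
    """Return (k, R, unit labels) for a 1-in-k systematic sample.

    Assumes units are labelled 1..N and k = ceil(N / n).
    """
    k = math.ceil(N / n)                 # sampling interval
    R = rng.randint(1, k)                # random start, 1..k inclusive
    units = list(range(R, N + 1, k))     # R, R + k, R + 2k, ...
    return k, R, units

# Seeded generator so the run is repeatable.
k, R, units = systematic_sample(500, 75, random.Random(1))
print(k, R, units[:5], len(units))
```

The only random step is the draw of R; everything after it is fixed by the frame order.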
Example

Telephone survey of members of an organization about the organization’s website use
    N = 500 members
    Have resources to do n = 75 calls
    N / n = 500/75 = 6.67
    k = 7
Random number table entry: 52994
    Rule: if the digit picked is 1, 2, …, 7, assign it as R; otherwise discard it
    Select R = 5
Take SU 5, then SU 5 + 7 = 12, then SU 12 + 7 = 19, then 26, 33, 40, 47, …

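The arithmetic in this example can be checked with a short script. It is a sketch under one assumption: the digits of the table entry are read left to right, which is how the rule above is interpreted here.

```python
import math

N, n = 500, 75
k = math.ceil(N / n)                     # 500/75 = 6.67, rounded up to 7

entry = "52994"                          # random number table entry from the example
# Take the first digit in 1..k as R and discard the others.
R = next(int(d) for d in entry if 1 <= int(d) <= k)

units = list(range(R, N + 1, k))
print(k, R)           # 7 5
print(units[:7])      # [5, 12, 19, 26, 33, 40, 47]
print(len(units))     # 71
```

With k rounded up to 7, the sample contains 71 units rather than the planned 75.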
Example – 2

Arrange population in rows of length k = 7

                    R
  i      1    2    3    4    5    6    7
  1      1    2    3    4    5    6    7
  2      8    9   10   11   12   13   14
  3     15   16   17   18   19   20   21
  4     22   23   24   25   26   27   28
  …      …
 71    491  492  493  494  495  496  497
 72    498  499  500

Properties of systematic sampling – 1

Number of possible SYS samples of size n is k
Only 1 random act: selecting R
    After selecting the 1st SU, all other SUs to be included in the sample are predetermined
A SYS is a cluster sample with sample size 1 (one cluster is selected)
    Cluster = set of SUs separated by k units
Unlike SRS, some sample sets of size n have no chance of being selected given a frame
A SU belongs to 1 and only 1 sample

Properties of systematic sampling – 2

Number of possible SYS samples of size n is k
Probability of selecting a sample
    Are these samples equally likely to be selected?
    P{S} = 1/k
Inclusion probability for a SU
    P{SU i ∈ S} = 1/k

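These properties can be verified by listing every possible sample. The sketch below reuses N = 500 and k = 7 from the earlier example; the helper name `all_sys_samples` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def all_sys_samples(N, k):
    """All k possible 1-in-k systematic samples, keyed by the random start R."""
    return {R: list(range(R, N + 1, k)) for R in range(1, k + 1)}

N, k = 500, 7
samples = all_sys_samples(N, k)

# Count how many of the k samples contain each unit.
membership = Counter(u for units in samples.values() for u in units)

# R is uniform on 1..k, so each sample has probability 1/k and a unit's
# inclusion probability is (number of samples containing it) / k.
inclusion_prob = {u: Fraction(membership[u], k) for u in range(1, N + 1)}

print(len(samples))                   # 7
print(set(membership.values()))       # {1}
print(set(inclusion_prob.values()))   # {Fraction(1, 7)}
```

Every unit appears in exactly one of the k samples, so the samples partition the frame.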
Properties of systematic sampling – 3

Plan for a sample size of n, but the actual sample size may vary
    If N / k is an integer, then n = N / k
    If N / k is NOT an integer, then n is either the integer part of (N / k) or the integer part of (N / k) + 1

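A quick way to see how the actual sample size depends on the random start R, again with N = 500 and k = 7. The closed form `(N - R) // k + 1` is an illustration added here, not a formula from the slides.

```python
def realized_sample_size(N, k, R):
    """Number of units in the sample R, R + k, R + 2k, ... that do not exceed N."""
    return (N - R) // k + 1

N, k = 500, 7
sizes = {R: realized_sample_size(N, k, R) for R in range(1, k + 1)}
print(sizes)   # {1: 72, 2: 72, 3: 72, 4: 71, 5: 71, 6: 71, 7: 71}
```

Here N / k = 71.4, so the sample size is either 71 (the integer part) or 72, depending on R.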
Properties of systematic sampling – 4

Because only the starting SU of a SYS sample is randomized, the variance of the sampling distribution cannot be estimated directly from the sample
    Under SRS, the variance of the sampling distribution was a function of the population variance, $S^2$
    Have no such relationship for SYS

Estimation for SYS

Use SRS formulas to estimate population parameters and variance of estimator

Estimate pop MEAN $\bar{y}_U$ with $\bar{y}$ and $\hat{V}[\bar{y}] = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}$

Estimate pop TOTAL $t$ with $\hat{t}$ and $\hat{V}[\hat{t}] = N^2\,\hat{V}[\bar{y}]$

Estimate pop PROPORTION $p$ with $\hat{p}$ and $\hat{V}[\hat{p}] = \left(1 - \frac{n}{N}\right)\frac{\hat{p}(1 - \hat{p})}{n - 1}$

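The three SRS formulas can be written as small functions. This is a sketch only: the function names and the data are hypothetical, and the total is computed as N times the sample mean, the usual SRS estimator, which the slide does not spell out.

```python
from statistics import mean, variance

def srs_mean_estimate(y, N):
    """Sample mean and its estimated variance, (1 - n/N) * s^2 / n."""
    n = len(y)
    return mean(y), (1 - n / N) * variance(y) / n

def srs_total_estimate(y, N):
    """Estimated total (assumed N * ybar) and its variance, N^2 * V[ybar]."""
    ybar, v = srs_mean_estimate(y, N)
    return N * ybar, N ** 2 * v

def srs_proportion_estimate(flags, N):
    """Sample proportion and its variance, (1 - n/N) * p(1 - p) / (n - 1)."""
    n = len(flags)
    p = sum(flags) / n
    return p, (1 - n / N) * p * (1 - p) / (n - 1)

# Hypothetical data: 4 sampled values from a population of N = 20.
y = [2, 4, 4, 6]
uses_website = [1, 0, 1, 1]
N = 20

mean_est = srs_mean_estimate(y, N)
total_est = srs_total_estimate(y, N)
prop_est = srs_proportion_estimate(uses_website, N)
print(mean_est, total_est, prop_est)
```

`statistics.variance` uses the n - 1 divisor, which matches $s^2$ in the formulas.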
Properties of systematic sampling – 5

Properties of SRS estimators depend on frame ordering
    SRS estimators for population parameters usually have little or no bias under SYS
    Precision of SRS estimators under SYS depends on ordering of the sampling frame

Order of sampling frame

Random order
    SYS acts very much like SRS
    SRS variance formula is a good approximation
Ordered in relation to y
    Improves representativeness of sample
    SRS formula overestimates sampling variance (estimate is more precise than indicated by SE)
Periodicity in y = sampling interval k
    Poor quality estimates
    SRS formula underestimates sampling variance (overstates precision of estimate)

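The three cases can be illustrated with a small numerical experiment. Everything here is an assumption made for illustration: a made-up population of N = 700 values, k = 7, and N an exact multiple of k so that all samples have the same size. Because there are only k possible samples, the true variance of the SYS mean can be computed exactly and compared with the average of the SRS-formula estimates.

```python
import random
from statistics import mean, pvariance, variance

def compare_variances(y, k):
    """Return (true SYS variance of the mean, average SRS-formula estimate).

    Assumes len(y) is a multiple of k so every sample has the same size.
    """
    N = len(y)
    samples = [y[start::k] for start in range(k)]      # all k possible samples
    true_var = pvariance([mean(s) for s in samples])   # each has probability 1/k
    estimates = [(1 - len(s) / N) * variance(s) / len(s) for s in samples]
    return true_var, mean(estimates)

k, n = 7, 100
N = k * n
values = list(range(1, N + 1))                         # hypothetical y-values

shuffled = values[:]
random.Random(0).shuffle(shuffled)                     # frame in random order
ordered = sorted(values)                               # frame ordered by y
periodic = [10 if i % k == 0 else 0 for i in range(N)] # period equal to k

for label, frame in [("random", shuffled), ("ordered", ordered), ("periodic", periodic)]:
    true_var, avg_est = compare_variances(frame, k)
    print(f"{label:8s} true={true_var:10.3f}  avg SRS-formula estimate={avg_est:10.3f}")
```

In the ordered frame the SRS formula reports far more variance than the design actually has. In the periodic frame every sample consists of identical values, so the formula reports zero variance even though the sample means differ widely across starts.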
Example – 3


Suppose X [age of member] is correlated with Y [use of org website]
Sort list by X before selecting sample

                    k
  i      1    2    3    4    5    6    7     X
  1      1    2    3    4    5    6    7     young
  2      8    9   10   11   12   13   14
  3     15   16   17   18   19   20   21
  4     22   23   24   25   26   27   28
  …      …                                   mid
 71    491  492  493  494  495  496  497
 72    498  499  500                         old

Practicalities



Another building block (like SRS) used in combination with other designs
SYS is more likely to be used than SRS if there is no stratification or clustering
Useful when a full frame cannot be enumerated at beginning of study
    Exit polls for elections
    Entrance polls for parks

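When the frame is not known in advance, as in an exit poll, the same rule can be applied to people as they arrive. The sketch below is illustrative only: `systematic_stream` and the voter labels are hypothetical names, and the arrivals are simulated with a generator.

```python
import random
from itertools import islice

def systematic_stream(arrivals, k, rng=random):
    """Yield every k-th arrival, starting from a random R between 1 and k."""
    R = rng.randint(1, k)
    # islice uses 0-based positions: start at R - 1, then step by k.
    yield from islice(arrivals, R - 1, None, k)

# Simulated stream of 30 voters leaving a polling place; its length is not
# needed up front, only the interval k.
voters = (f"voter-{i}" for i in range(1, 31))
sample = list(systematic_stream(voters, 5, random.Random(42)))
print(sample)
```

Only k has to be fixed before the study begins; N is learned when the stream ends.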
Practicalities – 2

Best if you can sort the sampling frame by an auxiliary variable X that is related to Y
    Improve representativeness of sample (relative to SRS)
    Improve precision of estimates
    Essentially offers implicit form of stratification
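A sketch of sorting the frame by X before sampling, using the earlier example of age and website use. The member ids and ages are made up for illustration, and k = 7 is carried over from the earlier example.

```python
import random

rng = random.Random(7)

# Hypothetical frame of 500 members: (member_id, age).
frame = [(member_id, rng.randint(18, 80)) for member_id in range(1, 501)]

# Sort the frame by the auxiliary variable X = age, then take a 1-in-k sample.
frame_sorted = sorted(frame, key=lambda member: member[1])
k = 7
R = rng.randint(1, k)
sample = frame_sorted[R - 1::k]          # positions R, R + k, R + 2k, ... (1-based)

ages = [age for _, age in sample]
print(len(sample), min(ages), max(ages))
```

Each run of k consecutive members in the sorted frame contributes one sampled member (a shorter final run may contribute none), so the sample is spread across the whole age range. That is the sense in which sorting acts like implicit stratification.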