20.Additional Topics in Sampling
Statistics for
Business and Economics
6th Edition
Chapter 20
Sampling:
Additional Topics in Sampling
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chapter Goals
After completing this chapter, you should be
able to:
Explain the basic steps of a sampling study
Describe sampling and nonsampling errors
Explain simple random sampling and stratified sampling
Analyze results from simple random or stratified
samples
Determine sample size when estimating population
mean, population total, or population proportion
Describe other sampling methods
Cluster Sampling, Two-Phase Sampling, Nonprobability Samples
Steps of a Sampling Study
Step 1: Information Required?
Step 2: Relevant Population?
Step 3: Sample Selection?
Step 4: Obtaining Information?
Step 5: Inferences From Sample?
Step 6: Conclusions?
Sampling and
Nonsampling Errors
A sample statistic is an estimate of an unknown
population parameter
Sample evidence from a population is variable
Sample-to-sample variation is expected
Sampling error results from the fact that we only
see a subset of the population when a sample
is selected
Statistical statements can be made about
sampling error
It can be measured and interpreted using confidence
intervals, probabilities, etc.
Sampling and
Nonsampling Errors
(continued)
Nonsampling error results from sources not
related to the sampling procedure used
Examples:
The population actually sampled is not the relevant
one
Survey subjects may give inaccurate or dishonest
answers
Nonresponse to survey questions
Types of Samples
Probability Sample
Items in the sample are chosen on the
basis of known probabilities
Nonprobability Sample
Items included are chosen without
regard to their probability of occurrence
Types of Samples
(continued)
Samples
Probability Samples: Simple Random, Stratified, Systematic, Cluster
Non-Probability Samples: Judgement, Convenience, Quota
Simple Random Samples
Suppose that a sample of n objects is to be selected
from a population of N objects
A simple random sample procedure is one in which
every possible sample of n objects is equally likely to be
chosen
Only sampling without replacement is considered here
Random samples can be obtained from a table of random
numbers or from a computer random number generator
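As a minimal sketch of the computer-generated route, the draw below uses Python's standard `random` module. The frame of account IDs, the helper name `simple_random_sample`, and the seed are illustrative assumptions, not part of the chapter.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded so the draw can be reproduced
    return rng.sample(population, n)   # sampling without replacement

accounts = list(range(1, 1001))        # hypothetical frame of N = 1000 IDs
sample = simple_random_sample(accounts, 80, seed=1)
print(len(sample), len(set(sample)))   # both counts equal n: no repeats
```

Because `sample` never returns the same element twice, this matches the without-replacement case considered here.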
Systematic Sampling
Decide on sample size: n
Divide frame of N individuals into groups of j
individuals: j=N/n
Randomly select one individual from the 1st
group
Select every jth individual thereafter
Example: N = 64, n = 8, so j = N/n = 8; one individual is chosen at random from the first group of 8
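The steps above can be sketched in a few lines of Python. The function name `systematic_sample` and the seed are assumptions made for illustration, and the sketch assumes n divides N evenly, as in the N = 64, n = 8 case.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Random start in the first group of j = N // n, then every j-th item."""
    j = len(frame) // n                         # group size j = N / n
    start = random.Random(seed).randrange(j)    # random position in group 1
    return frame[start::j][:n]                  # every j-th individual after it

frame = list(range(1, 65))                      # N = 64 individuals
picked = systematic_sample(frame, 8, seed=0)
print(picked)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined.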
Finite Population
Correction Factor
Suppose sampling is without replacement and
the sample size is large relative to the
population size
Assume the sample size is large enough to
apply the central limit theorem
Apply the finite population correction factor
when estimating the population variance
finite population correction factor:
\[ \frac{N - n}{N} \]
Estimating the Population Mean
Let a simple random sample of size n be
taken from a population of N members with
mean μ
The sample mean is an unbiased estimator of
the population mean μ
The point estimate is:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
Estimating the Population Mean
(continued)
An unbiased estimation procedure for the variance
of the sample mean yields the point estimate
\[ \hat{\sigma}_{\bar{x}}^2 = \frac{s^2}{n} \times \frac{N - n}{N} \]
Provided the sample size is large, 100(1 - α)%
confidence intervals for the population mean are
given by
\[ \bar{x} - z_{\alpha/2}\hat{\sigma}_{\bar{x}} < \mu < \bar{x} + z_{\alpha/2}\hat{\sigma}_{\bar{x}} \]
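A short sketch of this interval, working from summary statistics. The function name `srs_mean_ci` is an assumption, and the inputs are borrowed from the account-balance example that appears later in these slides, used here only for illustration.

```python
import math

def srs_mean_ci(xbar, s, n, N, z=1.96):
    """Large-sample CI for the mean under simple random sampling."""
    var_xbar = (s ** 2 / n) * (N - n) / N   # s^2/n times the correction (N - n)/N
    half_width = z * math.sqrt(var_xbar)
    return xbar - half_width, xbar + half_width

lo, hi = srs_mean_ci(xbar=87.6, s=22.3, n=80, N=1000)
print(round(lo, 2), round(hi, 2))
```

Dropping the factor (N - n)/N would widen the interval slightly, since 8% of the population is in the sample here.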
Estimating the Population Total
Consider a simple random sample of size
n from a population of size N
The quantity to be estimated is the
population total Nμ
An unbiased estimation procedure for the
population total Nμ yields the point
estimate \(N\bar{x}\)
Estimating the Population Total
An unbiased estimator of the variance of the
population total is
\[ N^2\hat{\sigma}_{\bar{x}}^2 = N(N - n)\frac{s^2}{n} \]
Provided the sample size is large, a 100(1 - α)%
confidence interval for the population total is
\[ N\bar{x} - z_{\alpha/2}N\hat{\sigma}_{\bar{x}} < N\mu < N\bar{x} + z_{\alpha/2}N\hat{\sigma}_{\bar{x}} \]
Confidence Interval for
Population Total: Example
A firm has a population of 1000 accounts and
wishes to estimate the total population value
A sample of 80 accounts is selected with an
average balance of $87.60 and a standard
deviation of $22.30
Find the 95% confidence interval estimate of
the total balance
Example Solution
\(N = 1000\), \(n = 80\), \(\bar{x} = 87.6\), \(s = 22.3\)
\[ N^2\hat{\sigma}_{\bar{x}}^2 = N(N - n)\frac{s^2}{n} = (1000)(920)\frac{(22.3)^2}{80} = 5718835 \]
\[ N\hat{\sigma}_{\bar{x}} = \sqrt{5718835} = 2391.41 \]
\[ N\bar{x} \pm z_{\alpha/2}N\hat{\sigma}_{\bar{x}} = (1000)(87.6) \pm (1.96)(2391.41) \]
\[ 82912.84 < N\mu < 92287.16 \]
The 95% confidence interval for the population total
balance is $82,912.84 to $92,287.16
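The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the variable names are chosen for illustration, and z = 1.96 is entered directly rather than looked up.

```python
import math

N, n, xbar, s, z = 1000, 80, 87.6, 22.3, 1.96

var_total = N * (N - n) * s ** 2 / n     # N^2 times the estimated variance of x-bar
se_total = math.sqrt(var_total)          # standard error of the estimated total
estimate = N * xbar                      # point estimate of the total balance
lower = estimate - z * se_total
upper = estimate + z * se_total
print(f"{se_total:.2f}  {lower:.2f}  {upper:.2f}")
```

Both limits sit the same distance, 1.96 times the standard error, from the point estimate of 87,600.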
Estimating the
Population Proportion
Let the true population proportion be P
Let p̂ be the sample proportion from n
observations from a simple random sample
The sample proportion, p̂ , is an unbiased
estimator of the population proportion, P
Estimating the
Population Proportion
(continued)
An unbiased estimator for the variance of the
population proportion is
\[ \hat{\sigma}_{\hat{p}}^2 = \frac{\hat{p}(1 - \hat{p})}{n - 1} \times \frac{N - n}{N} \]
Provided the sample size is large, a 100(1 - α)%
confidence interval for the population proportion is
\[ \hat{p} - z_{\alpha/2}\hat{\sigma}_{\hat{p}} < P < \hat{p} + z_{\alpha/2}\hat{\sigma}_{\hat{p}} \]
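A minimal sketch of the proportion interval follows. The function name `srs_proportion_ci` and the survey figures (80 of 200 sampled from a population of 5000) are hypothetical, chosen only to exercise the formula.

```python
import math

def srs_proportion_ci(p_hat, n, N, z=1.96):
    """Large-sample CI for a proportion under simple random sampling."""
    var_p = p_hat * (1 - p_hat) / (n - 1) * (N - n) / N   # note n - 1, not n
    half_width = z * math.sqrt(var_p)
    return p_hat - half_width, p_hat + half_width

lo, hi = srs_proportion_ci(p_hat=0.4, n=200, N=5000)      # hypothetical data
print(round(lo, 4), round(hi, 4))
```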
Stratified Sampling
Overview of stratified sampling:
Divide population into two or more subgroups (called
strata) according to some common characteristic
A simple random sample is selected from each subgroup
Samples from subgroups are combined into one
[Figure: a population divided into 4 strata, with a sample drawn from each stratum]
Stratified Random Sampling
Suppose that a population of N individuals can be
subdivided into K mutually exclusive and collectively
exhaustive groups, or strata
Stratified random sampling is the selection of
independent simple random samples from each
stratum of the population.
Let the K strata in the population contain \(N_1, N_2, \ldots, N_K\)
members, so that \(N_1 + N_2 + \cdots + N_K = N\)
Let the numbers in the samples be \(n_1, n_2, \ldots, n_K\).
Then the total number of sample members is
\(n_1 + n_2 + \cdots + n_K = n\)
Estimation of the Population Mean,
Stratified Random Sample
Let random samples of nj individuals be taken from
strata containing Nj individuals (j = 1, 2, . . ., K)
Let
\[ \sum_{j=1}^{K} N_j = N \quad \text{and} \quad \sum_{j=1}^{K} n_j = n \]
Denote the sample means and variances in the strata
by \(\bar{x}_j\) and \(s_j^2\) and the overall population mean by μ
An unbiased estimator of the overall population mean
μ is:
\[ \bar{x}_{st} = \frac{1}{N} \sum_{j=1}^{K} N_j \bar{x}_j \]
Estimation of the Population Mean,
Stratified Random Sample
(continued)
An unbiased estimator for the variance of the overall population
mean is
\[ \hat{\sigma}_{\bar{x}_{st}}^2 = \frac{1}{N^2} \sum_{j=1}^{K} N_j^2 \hat{\sigma}_{\bar{x}_j}^2 \]
where
\[ \hat{\sigma}_{\bar{x}_j}^2 = \frac{s_j^2}{n_j} \times \frac{N_j - n_j}{N_j} \]
Provided the sample size is large, a 100(1 - α)% confidence
interval for the population mean for stratified random samples is
\[ \bar{x}_{st} - z_{\alpha/2}\hat{\sigma}_{\bar{x}_{st}} < \mu < \bar{x}_{st} + z_{\alpha/2}\hat{\sigma}_{\bar{x}_{st}} \]
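The stratified estimator and its interval can be sketched as below. The function name `stratified_mean_ci` and the two-stratum data are hypothetical; each stratum is given as a tuple of its size, sample size, sample mean, and sample standard deviation.

```python
import math

def stratified_mean_ci(strata, z=1.96):
    """strata: list of (N_j, n_j, xbar_j, s_j), one tuple per stratum."""
    N = sum(Nj for Nj, _, _, _ in strata)
    xbar_st = sum(Nj * xj for Nj, _, xj, _ in strata) / N
    # Sum of N_j^2 times each stratum's estimated variance, divided by N^2
    var_st = sum(Nj ** 2 * (sj ** 2 / nj) * (Nj - nj) / Nj
                 for Nj, nj, _, sj in strata) / N ** 2
    half_width = z * math.sqrt(var_st)
    return xbar_st, xbar_st - half_width, xbar_st + half_width

strata = [(600, 60, 20.0, 4.0), (400, 40, 30.0, 6.0)]   # hypothetical data
est, lo, hi = stratified_mean_ci(strata)
print(est, round(lo, 3), round(hi, 3))
```

Multiplying the estimate and both limits by N gives the corresponding interval for the population total.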
Estimation of the Population Total,
Stratified Random Sample
Suppose that random samples of nj individuals from
strata containing Nj individuals (j = 1, 2, . . ., K) are
selected and that the quantity to be estimated is the
population total, Nμ
An unbiased estimation procedure for the population
total Nμ yields the point estimate
\[ N\bar{x}_{st} = \sum_{j=1}^{K} N_j \bar{x}_j \]
Estimation of the Population Total,
Stratified Random Sample
(continued)
An unbiased estimation procedure for the variance of
the estimator of the population total yields the point
estimate
\[ N^2\hat{\sigma}_{\bar{x}_{st}}^2 = \sum_{j=1}^{K} N_j^2 \hat{\sigma}_{\bar{x}_j}^2 \]
Provided the sample size is large, 100(1 - α)%
confidence intervals for the population total for
stratified random samples are obtained from
\[ N\bar{x}_{st} - z_{\alpha/2}N\hat{\sigma}_{\bar{x}_{st}} < N\mu < N\bar{x}_{st} + z_{\alpha/2}N\hat{\sigma}_{\bar{x}_{st}} \]
Estimation of the Population
Proportion, Stratified Random Sample
Suppose that random samples of nj individuals from
strata containing Nj individuals (j = 1, 2, . . ., K) are
obtained
Let \(P_j\) be the population proportion, and \(\hat{p}_j\) the
sample proportion, in the jth stratum
If P is the overall population proportion, an unbiased
estimation procedure for P yields
\[ \hat{p}_{st} = \frac{1}{N} \sum_{j=1}^{K} N_j \hat{p}_j \]
Estimation of the Population
Proportion, Stratified Random Sample
(continued)
An unbiased estimation procedure for the
variance of the estimator of the overall population
proportion is
\[ \hat{\sigma}_{\hat{p}_{st}}^2 = \frac{1}{N^2} \sum_{j=1}^{K} N_j^2 \hat{\sigma}_{\hat{p}_j}^2 \]
where
\[ \hat{\sigma}_{\hat{p}_j}^2 = \frac{\hat{p}_j(1 - \hat{p}_j)}{n_j - 1} \times \frac{N_j - n_j}{N_j} \]
is the estimate of the variance of the sample proportion in
the jth stratum
Estimation of the Population
Proportion, Stratified Random Sample
(continued)
Provided the sample size is large, 100(1 - α)%
confidence intervals for the population proportion for
stratified random samples are obtained from
\[ \hat{p}_{st} - z_{\alpha/2}\hat{\sigma}_{\hat{p}_{st}} < P < \hat{p}_{st} + z_{\alpha/2}\hat{\sigma}_{\hat{p}_{st}} \]
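The proportion case follows the same pattern as the stratified mean. In this sketch the function name `stratified_proportion_ci` and the stratum figures are hypothetical; each stratum is given as its size, sample size, and sample proportion.

```python
import math

def stratified_proportion_ci(strata, z=1.96):
    """strata: list of (N_j, n_j, p_hat_j), one tuple per stratum."""
    N = sum(Nj for Nj, _, _ in strata)
    p_st = sum(Nj * pj for Nj, _, pj in strata) / N
    # Each stratum contributes N_j^2 times its estimated variance
    var_st = sum(Nj ** 2 * pj * (1 - pj) / (nj - 1) * (Nj - nj) / Nj
                 for Nj, nj, pj in strata) / N ** 2
    half_width = z * math.sqrt(var_st)
    return p_st, var_st, (p_st - half_width, p_st + half_width)

strata = [(600, 50, 0.5), (400, 40, 0.25)]     # hypothetical data
p_st, var_st, (lo, hi) = stratified_proportion_ci(strata)
print(round(p_st, 3), round(lo, 3), round(hi, 3))
```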
Proportional Allocation:
Sample Size
One way to allocate sampling effort is to make the
proportion of sample members in any stratum the same
as the proportion of population members in the stratum
If so, for the jth stratum,
\[ \frac{n_j}{n} = \frac{N_j}{N} \]
The sample size for the jth stratum using proportional
allocation is
\[ n_j = \frac{N_j}{N} \times n \]
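As a small illustration of proportional allocation, the sketch below splits a total sample across three strata. The function name `proportional_allocation` and the stratum sizes are assumptions for the example.

```python
def proportional_allocation(stratum_sizes, n):
    """Give each stratum the share n_j = (N_j / N) * n of the total sample."""
    N = sum(stratum_sizes)
    return [n * Nj / N for Nj in stratum_sizes]

alloc = proportional_allocation([500, 300, 200], n=100)   # hypothetical strata
print(alloc)
```

In practice the shares are rarely whole numbers, so they need rounding in a way that still sums to n.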
Optimal Allocation
To estimate an overall population mean or total, if the
population variances in the individual strata are
denoted \(\sigma_j^2\), the most precise estimators are obtained
with optimal allocation
The sample size for the jth stratum using optimal
allocation is
\[ n_j = \frac{N_j \sigma_j}{\sum_{i=1}^{K} N_i \sigma_i} \times n \]
Optimal Allocation
(continued)
To estimate the overall population proportion, estimators
with the smallest possible variance are obtained by
optimal allocation
The sample size for the jth stratum for population
proportion using optimal allocation is
\[ n_j = \frac{N_j \sqrt{P_j(1 - P_j)}}{\sum_{i=1}^{K} N_i \sqrt{P_i(1 - P_i)}} \times n \]
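Both optimal-allocation rules weight each stratum by its size times its standard deviation; for a proportion that standard deviation is the square root of P_j(1 - P_j). The sketch below assumes the stratum standard deviations or proportions are known or guessed in advance. The function names and figures are hypothetical.

```python
import math

def optimal_allocation(stratum_sizes, stratum_sds, n):
    """n_j proportional to N_j * sigma_j (mean or total)."""
    weights = [Nj * sj for Nj, sj in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

def optimal_allocation_proportion(stratum_sizes, stratum_props, n):
    """Same rule with sigma_j = sqrt(P_j * (1 - P_j))."""
    sds = [math.sqrt(P * (1 - P)) for P in stratum_props]
    return optimal_allocation(stratum_sizes, sds, n)

alloc = optimal_allocation([500, 300, 200], [10, 20, 40], n=100)
alloc_p = optimal_allocation_proportion([500, 300, 200], [0.5, 0.2, 0.1], n=100)
print([round(a, 1) for a in alloc], [round(a, 1) for a in alloc_p])
```

In the first call the smallest stratum receives the largest share because its standard deviation is four times that of the largest stratum.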
Determining Sample Size
The required sample size is tied directly to the
variance allowed for the estimator: the smaller the
allowable variance, the larger the sample needed
If the researcher sets the allowable size of
the variance in advance, the necessary
sample size can be determined
Sample Size, Mean,
Simple Random Sampling
Consider estimating the mean of a population of N
members, which has variance \(\sigma^2\)
If the desired variance, \(\sigma_{\bar{x}}^2\), of the sample mean is
specified, the required sample size to estimate the
population mean through simple random sampling is
\[ n = \frac{N\sigma^2}{(N - 1)\sigma_{\bar{x}}^2 + \sigma^2} \]
Sample Size, Mean,
Simple Random Sampling
(continued)
Often it is more convenient to specify directly the
desired width of the confidence interval for the
population mean rather than \(\sigma_{\bar{x}}^2\)
Thus the researcher specifies the desired margin of error for
the mean
Calculations are simple since, for example, a 95%
confidence interval for the population mean will
extend an approximate amount \(1.96\,\sigma_{\bar{x}}\) on each side
of the sample mean, \(\bar{x}\)
Required Sample Size Example
2000 items are in a population. If σ = 45,
what sample size is needed to estimate the
mean within ± 5 with 95% confidence?
N = 2000, \(1.96\,\sigma_{\bar{x}} = 5\) → \(\sigma_{\bar{x}} = 2.551\)
\[ n = \frac{N\sigma^2}{(N - 1)\sigma_{\bar{x}}^2 + \sigma^2} = \frac{(2000)(45)^2}{(1999)(2.551)^2 + (45)^2} = 269.39 \]
So the required sample size is n = 270
(Always round up)
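The calculation in this example can be reproduced with a short sketch. The function name `srs_sample_size_mean` is an assumption; the target standard error is obtained by dividing the desired margin of error by z = 1.96.

```python
import math

def srs_sample_size_mean(N, sigma, sigma_xbar):
    """Required n for a target standard error of the sample mean."""
    return N * sigma ** 2 / ((N - 1) * sigma_xbar ** 2 + sigma ** 2)

margin, z = 5, 1.96
n_raw = srs_sample_size_mean(N=2000, sigma=45, sigma_xbar=margin / z)
n_required = math.ceil(n_raw)            # always round up
print(round(n_raw, 2), n_required)
```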
Sample Size, Proportion,
Simple Random Sampling
(continued)
Consider estimating the proportion P of individuals
in a population of size N who possess a certain
attribute
If the desired variance, \(\sigma_{\hat{p}}^2\), of the sample proportion
is specified, the required sample size to estimate the
population proportion through simple random
sampling is
\[ n = \frac{NP(1 - P)}{(N - 1)\sigma_{\hat{p}}^2 + P(1 - P)} \]
Sample Size, Proportion,
Simple Random Sampling
(continued)
The largest possible value for this expression occurs
when P = 0.5, so that P(1 - P) = 0.25
\[ n_{max} = \frac{0.25N}{(N - 1)\sigma_{\hat{p}}^2 + 0.25} \]
A 95% confidence interval for the population proportion
will extend an approximate amount \(1.96\,\sigma_{\hat{p}}\) on each
side of the sample proportion
Required Sample Size Example
How large a sample would be necessary
to estimate the true proportion of voters
who will vote for proposition A, within ±3%,
with 95% confidence, from a population of
34,000 voters?
Required Sample Size Example
(continued)
Solution:
N = 34000
For 95% confidence, use z = 1.96
\(1.96\,\sigma_{\hat{p}} = .03\) → \(\sigma_{\hat{p}} = .015306\)
\[ n_{max} = \frac{0.25N}{(N - 1)\sigma_{\hat{p}}^2 + 0.25} = \frac{(0.25)(34000)}{(33999)(.0153)^2 + 0.25} = 1035.47 \]
So use n = 1036
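A sketch of this calculation follows. The function name `srs_sample_size_proportion` is an assumption, and the standard error is entered as .0153, the rounded value used in the arithmetic above.

```python
import math

def srs_sample_size_proportion(N, sigma_p, P=0.5):
    """Required n for a target standard error; P = 0.5 is the worst case."""
    pq = P * (1 - P)
    return N * pq / ((N - 1) * sigma_p ** 2 + pq)

n_raw = srs_sample_size_proportion(N=34000, sigma_p=0.0153)
n_required = math.ceil(n_raw)            # always round up
print(round(n_raw, 2), n_required)
```

The result is sensitive to how the standard error is rounded: passing the unrounded 0.03 / 1.96 gives a slightly smaller value, so it is safer to carry full precision and round only at the end.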
Sample Size, Mean,
Stratified Sampling
Suppose that a population of N members is subdivided
into K strata containing \(N_1, N_2, \ldots, N_K\) members
Let \(\sigma_j^2\) denote the population variance in the jth stratum
An estimate of the overall population mean is desired
If the desired variance, \(\sigma_{\bar{x}_{st}}^2\), of the sample estimator is
specified, the required total sample size, n, can be
found
Sample Size, Mean,
Stratified Sampling
(continued)
For proportional allocation:
\[ n = \frac{\sum_{j=1}^{K} N_j \sigma_j^2}{N\sigma_{\bar{x}_{st}}^2 + \frac{1}{N}\sum_{j=1}^{K} N_j \sigma_j^2} \]
For optimal allocation:
\[ n = \frac{\frac{1}{N}\left(\sum_{j=1}^{K} N_j \sigma_j\right)^2}{N\sigma_{\bar{x}_{st}}^2 + \frac{1}{N}\sum_{j=1}^{K} N_j \sigma_j^2} \]
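The two formulas can be compared side by side with a short sketch. The function names, stratum sizes, standard deviations, and target variance below are all hypothetical.

```python
def stratified_n_proportional(sizes, sds, var_target):
    """Total n under proportional allocation for a target estimator variance."""
    N = sum(sizes)
    sum_var = sum(Nj * sj ** 2 for Nj, sj in zip(sizes, sds))
    return sum_var / (N * var_target + sum_var / N)

def stratified_n_optimal(sizes, sds, var_target):
    """Total n under optimal allocation for the same target variance."""
    N = sum(sizes)
    sum_sd = sum(Nj * sj for Nj, sj in zip(sizes, sds))
    sum_var = sum(Nj * sj ** 2 for Nj, sj in zip(sizes, sds))
    return (sum_sd ** 2 / N) / (N * var_target + sum_var / N)

sizes, sds = [500, 300, 200], [10, 20, 40]      # hypothetical strata
n_prop = stratified_n_proportional(sizes, sds, var_target=1.0)
n_opt = stratified_n_optimal(sizes, sds, var_target=1.0)
print(round(n_prop, 1), round(n_opt, 1))
```

With these figures optimal allocation reaches the target variance with a noticeably smaller total sample, because the strata differ widely in variability.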
Cluster Sampling
Population is divided into several “clusters,”
each representative of the population
A simple random sample of clusters is selected
Generally, all items in the selected clusters are examined
An alternative is to choose items from selected clusters using
another probability sampling technique
[Figure: a population divided into 16 clusters, with randomly selected clusters forming the sample]
Estimators for Cluster Sampling
A population is subdivided into M clusters and a simple
random sample of m of these clusters is selected and
information is obtained from every member of the
sampled clusters
Let \(n_1, n_2, \ldots, n_m\) denote the numbers of members in
the m sampled clusters
Denote the means of these clusters by \(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m\)
Denote the proportions of cluster members possessing
an attribute of interest by \(P_1, P_2, \ldots, P_m\)
Estimators for Cluster Sampling
(continued)
The objective is to estimate the overall population mean
µ and proportion P
Unbiased estimation procedures give
Mean:
\[ \bar{x}_c = \frac{\sum_{i=1}^{m} n_i \bar{x}_i}{\sum_{i=1}^{m} n_i} \]
Proportion:
\[ \hat{p}_c = \frac{\sum_{i=1}^{m} n_i P_i}{\sum_{i=1}^{m} n_i} \]
Estimators for Cluster Sampling
(continued)
Estimates of the variance of these estimators, following from
unbiased estimation procedures, are
Mean:
\[ \hat{\sigma}_{\bar{x}_c}^2 = \frac{M - m}{Mm\bar{n}^2} \times \frac{\sum_{i=1}^{m} n_i^2(\bar{x}_i - \bar{x}_c)^2}{m - 1} \]
Proportion:
\[ \hat{\sigma}_{\hat{p}_c}^2 = \frac{M - m}{Mm\bar{n}^2} \times \frac{\sum_{i=1}^{m} n_i^2(P_i - \hat{p}_c)^2}{m - 1} \]
where
\[ \bar{n} = \frac{\sum_{i=1}^{m} n_i}{m} \]
is the average number of individuals in the sampled clusters
Estimators for Cluster Sampling
(continued)
Provided the sample size is large, 100(1 - α)%
confidence intervals using cluster sampling are
for the population mean
\[ \bar{x}_c - z_{\alpha/2}\hat{\sigma}_{\bar{x}_c} < \mu < \bar{x}_c + z_{\alpha/2}\hat{\sigma}_{\bar{x}_c} \]
for the population proportion
\[ \hat{p}_c - z_{\alpha/2}\hat{\sigma}_{\hat{p}_c} < P < \hat{p}_c + z_{\alpha/2}\hat{\sigma}_{\hat{p}_c} \]
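The cluster estimators can be sketched as follows. The function name `cluster_mean_ci` and the data are hypothetical, and three clusters are far too few for the large-sample interval to be trusted; they only keep the arithmetic easy to follow.

```python
import math

def cluster_mean_ci(M, sizes, means, z=1.96):
    """M clusters in the population; sizes and means of the m sampled ones."""
    m = len(sizes)
    n_bar = sum(sizes) / m                          # average cluster size
    x_c = sum(ni * xi for ni, xi in zip(sizes, means)) / sum(sizes)
    spread = sum(ni ** 2 * (xi - x_c) ** 2 for ni, xi in zip(sizes, means))
    var_c = (M - m) / (M * m * n_bar ** 2) * spread / (m - 1)
    half_width = z * math.sqrt(var_c)
    return x_c, var_c, (x_c - half_width, x_c + half_width)

x_c, var_c, (lo, hi) = cluster_mean_ci(M=50, sizes=[10, 20, 30],
                                       means=[4.0, 5.0, 6.0])
print(round(x_c, 3), round(var_c, 4), round(lo, 3), round(hi, 3))
```

Passing the cluster proportions in place of the cluster means gives the corresponding estimate and interval for the population proportion.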
Two-Phase Sampling
Sometimes sampling is done in two steps
An initial pilot sample can be done
Disadvantage:
takes more time
Advantages:
Can adjust survey questions if problems are noted
Additional questions may be identified
Initial estimates of response rate or population
parameters can be obtained
Non-Probability Samples
Samples
Probability Samples: Simple Random, Stratified, Systematic, Cluster
Non-Probability Samples: Judgement, Convenience, Quota
Non-Probability Samples
(continued)
It may be simpler or less costly to use a nonprobability-based sampling method
Judgement sample
Quota sample
Convenience sample
These methods may still produce good
estimates of population parameters
But they:
Are more subject to bias
Provide no valid way to determine reliability
Chapter Summary
Reviewed basic steps in a sampling study
Defined sampling and nonsampling errors
Examined probability sampling methods
Simple Random Sampling, Systematic Sampling, Stratified
Random Sampling, Cluster Sampling
Identified Estimators for the population mean, population
total, and population proportion for different types of
samples
Determined the required sample size for specified
confidence interval width
Examined nonprobabilistic sampling methods