Sampling Distributions
Chapter 18
Sampling Distributions
A parameter is a number that describes the
population. In statistical practice, the value of a
parameter is unknown. (µ, σ, and now p or π)
A statistic is a number that can be computed
from the sample data without making use of
any unknown parameters. We often use a
statistic to estimate an unknown parameter.
(x̄, s_x, and now p̂)
Sampling Distributions
• A proportion is computed from a set of
categorical data. It is a random quantity
that has a distribution. We call that the
sampling distribution for the proportion.
Sampling Distributions
• Sampling variability: in repeated random
sampling, the value of the statistic will
vary. This makes sense: the proportions
vary from sample to sample because the
samples are composed of different individuals.
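A minimal Python sketch of sampling variability. The population proportion (0.30), the sample size (50), and the helper name `sample_proportion` are all made up for illustration; none of them come from the slides.

```python
import random

def sample_proportion(p, n, rng):
    """Draw n independent yes/no outcomes with success chance p; return p-hat."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

rng = random.Random(18)  # fixed seed so the run is repeatable
p_true = 0.30            # hypothetical population proportion (assumption)
n = 50                   # hypothetical sample size (assumption)

# Five samples drawn from the same population, one p-hat per sample.
p_hats = [sample_proportion(p_true, n, rng) for _ in range(5)]
print(p_hats)
```

The printed values of p̂ will typically differ from one another even though every sample came from the same population.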
Sampling Distributions
• To describe sampling distributions, use the
same descriptions as for any other distribution:
overall shape, outliers, center, and spread.
(CUSS & BS)
Sampling Distributions
• Bias suggests that a sampling technique favors
a certain outcome (the sampling technique is
unfair). Bias in a sampling distribution is the
idea that the center of the sampling distribution
is not the same as the value of the population parameter.
• A statistic is unbiased if the mean of its
sampling distribution is equal to the true value
of the parameter it is estimating.
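One way to see what "unbiased" means is to simulate many samples and look at the center of the resulting p̂ values. This is only a sketch: the population proportion, the sample size, and the number of repetitions are hypothetical choices.

```python
import random
import statistics

rng = random.Random(18)   # fixed seed so the run is repeatable
p_true = 0.30             # hypothetical population proportion (assumption)
n, reps = 50, 5000        # hypothetical sample size and number of samples

p_hats = []
for _ in range(reps):
    successes = sum(1 for _ in range(n) if rng.random() < p_true)
    p_hats.append(successes / n)

# Center of the simulated sampling distribution of p-hat.
center = statistics.fmean(p_hats)
print(round(center, 3))
```

If p̂ is unbiased, the printed center should land very close to the assumed population proportion of 0.30.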
Sampling Distributions
• The variability of a statistic (σ) is
described by the spread of its sampling
distribution. The spread is determined by
the sampling design and the sample size.
Larger samples give less variability.
Sampling Distributions of
Proportions
• Sampling Distribution of a Sample
Proportion – Categorical Data
Choose an SRS of size n from a large population
with population proportion p having some
characteristic of interest. Let p̂ be the proportion
of the sample having that characteristic. Then
the sampling distribution of p̂ is approximately
normal as long as the conditions on the
following page are met.
Sampling Distributions of
Proportions
• Conditions:
1) Randomization. The sample should be a
simple random sample (SRS) of the
population. (This is often difficult to
achieve in reality. We at least need to be
very confident that the sampling method was
unbiased and that the sample is
representative of the population.)
Sampling Distributions of
Proportions
• Conditions:
2) 10% Rule. To ensure independence when
sampling without replacement, the sample
cannot be too large relative to the
population. As long as our sample is no
more than 10% of our population size, we
protect independence.
Sampling Distributions of
Proportions
• Conditions:
3) Success/Failure. To ensure that the
sample size is large enough for the
sampling distribution to be approximately
normal, we must expect at least 10
successes and at least 10 failures.
np ≥ 10 and n(1 – p) ≥ 10
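The two numeric conditions (the 10% rule and success/failure) can be checked mechanically. The sketch below assumes a hypothetical helper name, `proportion_conditions`, and made-up inputs. Randomization cannot be verified from numbers alone, so it is left out.

```python
def proportion_conditions(n, p, population_size):
    """Report whether the two numeric conditions hold for a sample proportion."""
    return {
        # Sample is no more than 10% of the population.
        "ten_percent_rule": n <= 0.10 * population_size,
        # Expect at least 10 successes and at least 10 failures.
        "success_failure": n * p >= 10 and n * (1 - p) >= 10,
    }

# Hypothetical inputs (assumptions, not from the slides).
print(proportion_conditions(n=50, p=0.30, population_size=10_000))
print(proportion_conditions(n=20, p=0.30, population_size=10_000))
```

With n = 20 and p = 0.30 only 6 successes are expected, so the success/failure condition fails and the normal approximation should not be trusted.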
Sampling Distributions of Sample
Means
• Sampling Distribution of a Sample
Mean – Quantitative Data
• A distribution is created from the means of many
samples. Data is quantitative. What is the
purpose?
Averages are less variable and more normal
than individual observations
Sampling Distributions of Sample
Means
• The shape of the distribution of x-bar
depends on the shape of the
population.
** If the population is normal, then the
distribution of the sample mean will be
normal (regardless of sample size).
Sampling Distributions of Sample
Means
• The shape of the distribution of x-bar
depends on the shape of the
population.
** For skewed or oddly shaped distributions, if
the sample size is large enough, the
sampling distribution will be approximately
normal. This idea leads us to…
Sampling Distributions of
Sample Means
The Central Limit Theorem (CLT)
CLT addresses two things in a distribution,
shape and spread.
As the sample size increases:
• The shape of the sampling distribution
becomes more normal
• The variability of the sampling distribution
decreases
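Both effects can be seen in a small simulation. The sketch below assumes a right-skewed exponential population with mean 1 (an illustrative choice, not from the slides). The helper names `skewness` and `simulate_sample_means` are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric, normal-looking shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def simulate_sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(18)  # fixed seed so the run is repeatable
results = {}
for n in (2, 30):
    means = simulate_sample_means(n, reps=4000, rng=rng)
    # Store (spread, skewness) of the simulated sampling distribution.
    results[n] = (statistics.pstdev(means), skewness(means))
    print(n, results[n])
```

Going from n = 2 to n = 30, both the spread and the skewness of the simulated sample means should shrink, which matches the two bullet points above.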
Sampling Distributions of Sample
Means
• The Law of Large Numbers
Draw observations at random from any
population with given mean µ. As n
increases, the mean of the observed
values (x-bar) gets closer and closer to the
true mean, µ.
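A short sketch of the Law of Large Numbers, assuming the population is rolls of a fair six-sided die (true mean 3.5). The die example and the checkpoints are illustrative choices, not from the slides.

```python
import random

rng = random.Random(18)  # fixed seed so the run is repeatable
mu = 3.5                 # true mean of a fair six-sided die

total = 0
running_means = {}
for i in range(1, 100_001):
    total += rng.randint(1, 6)          # one more observation
    if i in (10, 1_000, 100_000):
        running_means[i] = total / i    # x-bar after i observations
print(running_means)
```

The mean after 100,000 rolls should sit very close to 3.5, while the mean after only 10 rolls can be noticeably off.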
Sampling Distributions of Sample
Means
• Conditions: (the first two are the same
for both kinds of data)
1) Randomization.
2) 10% rule.
Sampling Distributions of Sample
Means
• Conditions:
3) Large Enough Sample. There is no “for
sure” way to tell if your sample is large
enough. A common rule of thumb is that if
your sample size is at least 30 (n ≥ 30), it is
reasonable to treat the sampling distribution
as approximately normal.
(Remember, if the population is given as
normal, then any sample size is OK)
Sampling Distributions of Sample
Means
• When conditions are met, and the data is
quantitative, the sampling distribution is
normal with a center at the population
mean, μ, and a standard deviation
of σ/√n, where σ is the population standard deviation.
So… x-bar ~ N(μ, σ/√n)
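As a worked example of using the model N(μ, σ/√n) for a sample mean, the sketch below assumes a hypothetical population with μ = 100 and σ = 15 and a sample of n = 36. These values are made up for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 15, 36           # hypothetical values (assumption)

sd_xbar = sigma / math.sqrt(n)       # standard deviation of x-bar: 15 / 6
xbar_dist = NormalDist(mu, sd_xbar)  # model for the sample mean

# Probability that a sample mean exceeds 105 (two SDs above mu here).
p_above_105 = 1 - xbar_dist.cdf(105)
print(sd_xbar, round(p_above_105, 4))
```

Under these assumptions the sample mean has a standard deviation of 2.5, so a sample mean above 105 is two standard deviations out and has a probability of roughly 0.023.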
Sampling Distributions of Sample
Means
• Since the standard deviation of x-bar is
proportional to 1/√n, we must take a sample 4
times as large to cut the standard
deviation in half.
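The arithmetic behind the "4 times as large" rule can be checked directly. The population standard deviation of 12 and the helper name `sd_of_mean` are hypothetical.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 12  # hypothetical population standard deviation (assumption)
for n in (25, 100, 400):
    # Each step multiplies n by 4, which divides the result by 2.
    print(n, sd_of_mean(sigma, n))
```

Each quadrupling of n (25 to 100 to 400) halves the standard deviation of the sample mean (2.4 to 1.2 to 0.6).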
Sampling Distributions
• We said at the beginning that in most
real-life cases, we will not know the population
parameters (µ, σ, p or π), so we will have
to use the sample statistics as estimates
of those. Our terminology changes just a
little…