sampling distribution


Sampling
Distributions
Parameter
A number that describes the population.
Symbols we will use for parameters include:
$\mu$ – mean
$\sigma$ – standard deviation
$p$ – proportion ($\pi$ is also used)
$\alpha$ – y-intercept of LSRL
$\beta$ – slope of LSRL
Statistic
A number that can be computed from sample data without making use of any unknown parameter.
Symbols we will use for statistics include:
$\bar{x}$ – mean
$s$ – standard deviation
$\hat{p}$ – proportion
$a$ – y-intercept of LSRL
$b$ – slope of LSRL
Identify each boldface value (the two mean diameters, 2.5003 cm and 2.5009 cm) as a parameter or a statistic.
A carload lot of ball bearings has mean
diameter 2.5003 cm. This is within the
specifications for acceptance of the lot by
the purchaser. By chance, an inspector
chooses 100 bearings from the lot that
have mean diameter 2.5009 cm. Because
this is outside the specified limits, the lot
is mistakenly rejected.
Why do we take samples instead of taking a census?
• A census is not always accurate.
• Censuses are difficult or impossible to do.
• Censuses are very expensive to do.
A distribution is all the
values that a variable can be.
The sampling distribution
of a statistic is the
distribution of values taken
by the statistic in all
possible samples of the
same size from the same
population.
Sampling Distribution of Means
Consider the population: the lengths of the fish (in inches) in my pond, consisting of the values
2, 7, 10, 11, 14
What is the mean and standard deviation of this population?
$\mu_x = 8.8$
$\sigma_x = 4.0694$
Let’s take samples of size 2 (n = 2) from this population.
How many samples of size 2 are possible?
5C2 = 10
Find all 10 of these samples and record the sample means.
What is the mean and standard deviation of the sample means?
$\mu_{\bar{x}} = 8.8$
$\sigma_{\bar{x}} = 2.4919$
Repeat this procedure with sample size n = 3.
How many samples of size 3 are possible?
5C3 = 10
Find all of these samples and record the sample means.
What is the mean and standard deviation of the sample means?
$\mu_{\bar{x}} = 8.8$
$\sigma_{\bar{x}} = 1.66132$
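The enumeration above can be checked by brute force. The following is a minimal sketch using only the Python standard library; the function and variable names are illustrative and not from the slides. It lists every sample of size n drawn without replacement from the five fish lengths, then reports the mean and standard deviation of the resulting sample means.

```python
from itertools import combinations
from statistics import mean, pstdev

# Population from the slides: fish lengths in inches.
population = [2, 7, 10, 11, 14]

def sampling_distribution(values, n):
    """Return the sample mean of every size-n sample drawn without replacement."""
    return [mean(sample) for sample in combinations(values, n)]

for n in (2, 3):
    means = sampling_distribution(population, n)
    # pstdev divides by the number of sample means, treating the list
    # as the complete sampling distribution rather than a sample of it.
    print(n, len(means), round(mean(means), 4), round(pstdev(means), 5))
```

Both sample sizes yield 10 samples, and the mean of the sample means stays at the population mean of 8.8 while the spread shrinks as n grows.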
What do you notice?
• The mean of the sampling distribution EQUALS the mean of the population.
  $\mu_{\bar{x}} = \mu$ (unbiased estimator)
• As the sample size increases, the standard deviation of the sampling distribution decreases.
  As $n \uparrow$, $\sigma_{\bar{x}} \downarrow$
• Activity: http://onlinestatbook.com/stat_sim/index.html
A statistic used to estimate a
parameter is unbiased if the
mean of its sampling
distribution is equal to the
true value of the parameter
being estimated.
Thought to Ponder:
In real life, is it possible to create a
sampling distribution for every
study?
How many times can we afford to
take a sample?
General Properties for
Sample Means
Rule 1: $\mu_{\bar{x}} = \mu$
Rule 2: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
This rule is approximately correct as long as no more than 10% of the population is included in the sample.
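To see why the 10% condition matters, the sketch below compares Rule 2 with the exact standard deviation for the fish example, where a sample of 2 is already 40% of the population. The exact figure uses the finite population correction factor sqrt((N - n)/(N - 1)), a standard result that is not stated on the slides; the function names are illustrative.

```python
from math import sqrt
from statistics import pstdev

population = [2, 7, 10, 11, 14]   # fish lengths from the earlier slide
N = len(population)
sigma = pstdev(population)        # population standard deviation, about 4.0694

def rule2_sd(sigma, n):
    """Rule 2: sigma / sqrt(n), which assumes independent draws."""
    return sigma / sqrt(n)

def exact_sd(sigma, n, N):
    """Rule 2 times the finite population correction (sampling without replacement)."""
    return rule2_sd(sigma, n) * sqrt((N - n) / (N - 1))

for n in (2, 3):
    print(n, round(rule2_sd(sigma, n), 4), round(exact_sd(sigma, n, N), 4))
```

For n = 2, Rule 2 alone gives about 2.88, noticeably larger than the enumerated 2.4919; the corrected value matches the enumeration. When the sample is under 10% of the population the correction factor is close to 1, which is why Rule 2 is then a good approximation.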
Rule 3:
When the population distribution is normal, the sampling distribution of $\bar{x}$ is also normal for any sample size n.
Rule 4: Central Limit Theorem
When n is sufficiently large, the sampling distribution of $\bar{x}$ is approximated by a normal curve, even when the population distribution is not itself normal.
Rule of thumb: the CLT can be applied when n ≥ 30.
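A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under assumptions that are not in the slides: the population is taken to be exponential with mean 1 and standard deviation 1 (strongly right-skewed), and 5000 samples of size n = 30 are drawn with a fixed seed. It checks that the sample means center near 1, spread near 1/sqrt(30), and behave roughly like a normal curve.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, sd 1, right-skewed)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

n = 30
means = [sample_mean(n) for _ in range(5000)]

# Rule 1 predicts a center near 1; Rule 2 predicts a spread near 1/sqrt(30).
print(round(mean(means), 3), round(stdev(means), 3))

# For a normal curve, about 95% of values lie within 1.96 sd of the center.
center, spread = 1.0, 1 / n ** 0.5
inside = sum(abs(m - center) <= 1.96 * spread for m in means) / len(means)
print(round(inside, 3))
```

Rerunning with a smaller n (say 2 or 5) shows the skew of the population still visible in the sample means, which is the reason for the n ≥ 30 guideline.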
If n is large or the population distribution is normal, then
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
has approximately a standard normal distribution.
Sampling Distribution of Proportions
• Activity: Reese’s Pieces
• This is a proportion activity where the sample proportion ($\hat{p}$) of orange pieces to the total is calculated based on a sample that we can vary.
• http://www.rossmanchance.com/applets/Reeses3/ReesesPieces.html
• Take multiple samples to see the trends (shape, center and spread).
• Increase sample size, and repeat.
What do you notice?
• The mean of the sampling distribution of the proportion EQUALS the population proportion as the number of samples increases.
  $\mu_{\hat{p}} = p$ (or $\pi$)
• As the sample size increases, the standard deviation of the sampling distribution decreases.
  As $n \uparrow$, $\sigma_{\hat{p}} \downarrow$
General Properties of
Sample Proportions
Rule 1: $\mu_{\hat{p}} = p$ (or $\pi$) (unbiased estimator)
Rule 2: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
This rule is approximately correct as long as no more than 10% of the population is included in the sample.
Rule 3:
As n increases, the sampling distribution of $\hat{p}$ becomes approximately normal. You need to check the Normal condition before you perform Normal calculations:
np ≥ 10 and n(1-p) ≥ 10
Review of Sample Restrictions
• Central Limit Theorem: if n ≥ 30, the sampling distribution of the mean can be assumed to be approximately Normal.
• Normal condition for the sampling distribution of proportions: np ≥ 10 and n(1-p) ≥ 10.
Review of Sample Restrictions (cont’d)
• Sample sizes must be less than 10% of the population. When we sample without replacement, the probabilities change from one draw to the next (the draws are not independent), but the standard deviation formula assumes the same probability on every draw (independence). Keeping the sample small in comparison to the population allows us to assume independence.
To Solve Sampling Distributions of Proportions
• State: State the probability, e.g. $P(\hat{p} \ge 0.70)$
• Plan: Determine if all sample requirements are met to use Normal calculations
• Do: the math
• Conclude: State the answer in the proper context, e.g. "Approximately 90% of all SRSs of sample size xxx will have $\hat{p} \ge 0.70$"
Ex 1) The army reports that the distribution of head circumference among soldiers is approximately normal with mean 22.8 inches and standard deviation of 1.1 inches.
a) What is the probability that a randomly selected soldier’s head will have a circumference that is greater than 23.5 inches?
P(X > 23.5) = .2623
b) What is the probability that a random sample of five soldiers will have an average head circumference that is greater than 23.5 inches?
Do you expect the probability to be more or less than the answer to part (a)? Explain.
What normal curve are you now working with?
$P(\bar{x} > 23.5) = .0774$
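Both parts of the head-circumference example can be reproduced with the standard library's `statistics.NormalDist` (Python 3.8+). This is a sketch of the calculation, not the calculator steps the slides assume; variable names are illustrative. The only difference between the two parts is the standard deviation of the curve: 1.1 for one soldier, 1.1/sqrt(5) for the mean of five.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 22.8, 1.1   # population mean and sd (inches), from the example

# (a) One soldier: X ~ N(22.8, 1.1)
single = NormalDist(mu, sigma)
p_single = 1 - single.cdf(23.5)

# (b) Mean of n = 5 soldiers: x-bar ~ N(22.8, 1.1 / sqrt(5)) by Rules 1 to 3
n = 5
xbar = NormalDist(mu, sigma / sqrt(n))
p_mean = 1 - xbar.cdf(23.5)

print(round(p_single, 4), round(p_mean, 4))
```

The probability for the sample mean is smaller because averaging narrows the curve, so 23.5 inches sits further out in the tail.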
 Ex
2) A polling organization asks an SRS of 1500 first-year college students how far away their home is. Suppose that 35% of all first-year students (the population) actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?
• State: $P(0.33 \le \hat{p} \le 0.37)$
• Plan: The sample size must be less than 10% of the population. There are 1.7 million first-year students, and 10(1500) = 15,000 is far fewer. Check!!
  Can we assume a Normal distribution? np = 1500(0.35) = 525 ≥ 10? Check!! n(1-p) = 1500(0.65) = 975 ≥ 10? Check!!
• Do: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{(0.35)(0.65)}{1500}} = 0.0123$
  normalcdf(0.33, 0.37, 0.35, 0.0123) = 0.8961
• Conclude: Approximately 89.6% of all SRSs of size 1500 will give a result within 2 percentage points of the true population proportion.
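The Plan and Do steps for this proportion problem can be sketched in Python as follows. `NormalDist` stands in for the calculator's normalcdf; the variable names are illustrative, and the 10% condition is left out of the code because it depends on the population size.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1500, 0.35   # sample size and population proportion from the example

# Normal condition: both expected counts must be at least 10.
assert n * p >= 10 and n * (1 - p) >= 10

sd = sqrt(p * (1 - p) / n)      # Rule 2 for sample proportions
p_hat = NormalDist(p, sd)       # approximate sampling distribution of p-hat
prob = p_hat.cdf(0.37) - p_hat.cdf(0.33)

print(round(sd, 4), round(prob, 4))
```

The result is about 0.896. It can differ from the slide's 0.8961 in the fourth decimal place because the slide rounds the standard deviation to 0.0123 before computing the area.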
Suppose a team of biologists has been studying the Pinedale children’s fishing pond. This group of biologists has determined that the length of the trout has a normal distribution with mean of 10.2 inches and standard deviation of 1.4 inches. Let x represent the length of a single trout taken at random from the pond.
a) What is the probability that a single trout taken at random from the pond is between 8 and 12 inches long?
P(8 < X < 12) = .8427
b) What is the probability that the mean length of five trout taken at random is between 8 and 12 inches long?
Do you expect the probability to be more or less than the answer to part (a)? Explain.
$P(8 < \bar{x} < 12) = .9978$
c) What sample mean would be at the 95th percentile? (Assume n = 5)
$\bar{x} = 11.23$ inches
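The three trout answers can be checked the same way. In this sketch (illustrative names, standard library only), `inv_cdf` plays the role of the calculator's inverse-normal function for the percentile question.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10.2, 1.4, 5   # population mean, sd (inches), and sample size

single = NormalDist(mu, sigma)            # length of one trout
xbar = NormalDist(mu, sigma / sqrt(n))    # mean length of five trout

p_single = single.cdf(12) - single.cdf(8)   # P(8 < X < 12)
p_mean = xbar.cdf(12) - xbar.cdf(8)         # P(8 < x-bar < 12)
pct95 = xbar.inv_cdf(0.95)                  # 95th percentile of x-bar

print(round(p_single, 4), round(p_mean, 4), round(pct95, 2))
```

Here the probability for the mean is larger than for a single trout, because the interval from 8 to 12 surrounds the center and the narrower curve puts more of its area inside it.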
A soft-drink bottler claims that, on average, cans contain 12 oz of soda. Let x denote the actual volume of soda in a randomly selected can. Suppose that x is normally distributed with $\sigma = .16$ oz. Sixteen cans are selected, and their mean volume is 12.1 oz. What is the probability that the average of 16 cans will exceed 12.1 oz?
$P(\bar{x} > 12.1) = .0062$
Do you think the bottler’s claim is correct?
No. A sample mean this large is not likely to happen by chance alone, yet the sample did have this mean, so I do not think the claim that the average is 12 oz is correct.
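The value .0062 comes from standardizing the sample mean under the bottler's claim ($\mu = 12$), using Rule 2 for the standard deviation of $\bar{x}$. Written out as a worked equation:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.16}{\sqrt{16}} = 0.04,
\qquad
z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{12.1 - 12}{0.04} = 2.5,
\qquad
P(\bar{x} > 12.1) = P(z > 2.5) \approx 0.0062
```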
A hot dog manufacturer asserts that one of its brands of hot dogs has an average fat content of 18 grams per hot dog with standard deviation of 1 gram. Consumers of this brand would probably not be disturbed if the mean was less than 18 grams, but would be unhappy if it exceeded 18 grams. An independent testing organization is asked to analyze a random sample of 36 hot dogs. Suppose the resulting sample mean is 18.4 grams. Does this result indicate that the manufacturer’s claim is incorrect?
Yes, this is not likely to happen by chance alone.
What if the sample mean was 18.2 grams? Would you think the claim was incorrect?
No.
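The slides give only the yes/no answers for the hot dog example. The sketch below shows one way to back them up, assuming the manufacturer's claim (mean 18, standard deviation 1) is true and relying on the CLT since n = 36 is at least 30. The tail probabilities it prints are computed here, not taken from the slides, and the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 18.0, 1.0, 36   # claimed mean, sd, and sample size from the example

# If the claim is true, x-bar is approximately N(18, 1/6) by the CLT.
xbar = NormalDist(mu0, sigma / sqrt(n))

for observed in (18.4, 18.2):
    z = (observed - mu0) / xbar.stdev
    upper_tail = 1 - xbar.cdf(observed)   # P(x-bar >= observed) under the claim
    print(observed, round(z, 2), round(upper_tail, 4))
```

A sample mean of 18.4 is 2.4 standard deviations above the claimed mean, with a tail probability under 1%, so chance alone is an unlikely explanation. A sample mean of 18.2 is only 1.2 standard deviations above, with a tail probability above 10%, which is not strong evidence against the claim.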