Chapter 9.1: Sampling
Distributions
ACTIVITY 9 Young Women’s Heights
1. Use N(64.5,2.5) as the distribution.
2. On your calculator clear L1/list 1.
3. Simulate the heights of 100 randomly
selected women and store the heights in
L1/list 1
4. On the TI-83/84, press MATH; choose PRB;
6:randNorm(μ, σ, n)
5. Complete the command:
randNorm(64.5,2.5,100) ENTER
6. Plot a histogram of the 100 heights
7. Clear any functions in Y=
8. Turn off STAT PLOTS
9. Set Window to X[57,72]2.5 and Y[-10,45]5
10. Define PLOT 1 to be a Histogram using the
heights in list 1.
11. GRAPH
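For readers without a TI-83/84, steps 1 to 11 can be approximated in Python. This is a minimal sketch using only the standard library: random.gauss stands in for randNorm, and the seed, the helper name histogram_counts, and the text-based display are assumptions made for illustration, not part of the original activity.

```python
import random

MU, SIGMA, N = 64.5, 2.5, 100     # N(64.5, 2.5), 100 simulated women

rng = random.Random(9)            # fixed seed (an assumption) so the run is repeatable
heights = [rng.gauss(MU, SIGMA) for _ in range(N)]   # plays the role of L1/list 1


def histogram_counts(data, xmin=57.0, xmax=72.0, width=2.5):
    """Count values in bins of the given width, matching the X[57,72]2.5 window."""
    n_bins = int((xmax - xmin) / width)
    counts = [0] * n_bins
    for x in data:
        if xmin <= x < xmax:      # values outside the window are not drawn
            counts[int((x - xmin) // width)] += 1
    return counts


counts = histogram_counts(heights)
for i, c in enumerate(counts):
    left = 57.0 + 2.5 * i
    print(f"{left:5.1f} to {left + 2.5:5.1f} | {'*' * c}")
```

Each row of asterisks corresponds to one bar of the calculator histogram, so the symmetry question can be answered by reading the rows from top to bottom.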
Is your histogram fairly symmetric or clearly
skewed?
Approximately how many heights should be within
3σ of the mean, i.e. within 64.5 ± 3(2.5)?
What are these values?
Use TRACE to count the number of heights
within 3σ. How many heights should
there be within 1σ of the mean? Within
2σ of the mean? Find these counts.
Standard deviation    1σ      2σ      3σ
Counts                ____    ____    ____
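The counts in the table can be checked against what a normal distribution predicts. The sketch below is an illustration only: it regenerates 100 heights with an assumed seed, counts how many fall within 1, 2, and 3 standard deviations of 64.5, and compares each count with the expected count computed from statistics.NormalDist. The helper name expected_count is hypothetical.

```python
import random
from statistics import NormalDist

MU, SIGMA, N = 64.5, 2.5, 100
rng = random.Random(9)                        # assumed seed, for repeatability
heights = [rng.gauss(MU, SIGMA) for _ in range(N)]

population = NormalDist(MU, SIGMA)


def expected_count(k, n=N):
    """Expected number of n heights within k standard deviations of the mean."""
    low, high = MU - k * SIGMA, MU + k * SIGMA
    return n * (population.cdf(high) - population.cdf(low))


for k in (1, 2, 3):
    low, high = MU - k * SIGMA, MU + k * SIGMA
    observed = sum(low <= h <= high for h in heights)
    print(f"within {k} sd: {low} to {high}, "
          f"observed {observed}, expected about {expected_count(k):.1f}")
```

The expected counts follow the 68-95-99.7 rule; the observed counts will differ a little from run to run, which is the sampling variability this chapter is about.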
Use 1-Var Stats to find the mean, median and
standard deviation of your data. Compare
your mean to the population mean μ = 64.5.
Compare the standard deviation of your
data to the standard deviation of the
population σ= 2.5.
How do the mean and median for your
100 heights compare? What does it tell
you about the shape of a distribution
when the mean is close to the median?
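The 1-Var Stats comparison can also be sketched in Python. This assumes the same simulated setup as the activity (100 heights from N(64.5, 2.5)) with an arbitrary seed; statistics.stdev divides by n - 1, which corresponds to Sx on the calculator.

```python
import random
import statistics

rng = random.Random(9)                         # assumed seed
heights = [rng.gauss(64.5, 2.5) for _ in range(100)]

sample_mean = statistics.mean(heights)
sample_median = statistics.median(heights)
sample_sd = statistics.stdev(heights)          # sample standard deviation (n - 1 divisor)

print(f"mean   = {sample_mean:.2f}  (population mean 64.5)")
print(f"median = {sample_median:.2f}")
print(f"sd     = {sample_sd:.2f}  (population sd 2.5)")
```

A mean and median that land close together are what one would expect from a roughly symmetric distribution.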
Define PLOT 2 to be a boxplot using list 1.
Graph it. The boxplot should be plotted
above the histogram.
Would you say the distribution is
nonsymmetric, moderately symmetric, or
very symmetric?
POST-IT HISTOGRAM OF CLASS DATA
SUMMARY:
What is the approximate shape of the
distribution of the xbars?
Where is the center of the distribution of
xbar?
How does this center compare with the mean
of heights of the population of all young
women?
How does the spread of the distribution of
xbar compare with the original distribution
(σ = 2.5)?
Enter the xbar values in list 2.
Turn off PLOT 1. Define PLOT 3 to be a
boxplot of the xbar data. How do the
distributions of X and xbar compare
visually? Use 1-Var Stats to calculate
the standard deviation of the xbars.
Compare this number with σ/√100.
Fill in the blanks with the appropriate
function of μ or σ.
“The distribution of xbar is
approximately normal with mean
μ(xbar) = _______ and standard deviation
σ(xbar) = ________.”
Fill in the blanks with the appropriate
function of μ or σ.
“The distribution of xbar is
approximately normal with mean
μ(xbar) = μ and standard deviation
σ(xbar) = σ/√n.”
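The class post-it histogram can be imitated by simulation. In the sketch below, each simulated "student" draws 100 heights from N(64.5, 2.5) and reports xbar; the number of students (2000) and the seed are assumptions chosen so the pattern is easy to see, and a real class would have far fewer values.

```python
import math
import random
import statistics

MU, SIGMA = 64.5, 2.5
SAMPLE_SIZE = 100         # heights per student, as in the activity
NUM_SAMPLES = 2000        # assumed number of simulated students

rng = random.Random(2024)  # assumed seed


def sample_mean(n):
    """xbar for one simulated sample of n heights."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))


xbars = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(xbars)           # should be close to mu
spread = statistics.stdev(xbars)           # should be close to sigma / sqrt(n)
theory = SIGMA / math.sqrt(SAMPLE_SIZE)    # 2.5 / 10

print(f"mean of xbars = {center:.3f}  (mu = {MU})")
print(f"sd of xbars   = {spread:.3f}  (sigma/sqrt(n) = {theory})")
```

The center of the xbars sits near the population mean, while their spread is about one tenth of the spread of individual heights.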
In this activity, xbar is a sample statistic while
μ is the population parameter.
• Sample Statistic is a number that describes a
sample. The value is known but can change
from sample to sample.
• Population parameter is a fixed number that
describes the population but we do not
know it because we cannot examine the
entire population.
• Sampling distribution of a statistic is the
distribution of values taken by the statistic
in all possible samples of the same size
from the same population (like the
distribution of all of the means from each of
you in the previous activity).
• Bias: The sampling distribution lets
us describe the bias of a statistic itself,
rather than only the bias of a sampling method.
• Bias concerns the center of the
sampling distribution.
• A statistic used to estimate a parameter
is unbiased if the mean of sampling
distribution is equal to the true value of
the parameter being estimated.
• Variability of a statistic is described by
the spread of its sampling distribution.
Larger samples give smaller spread. As
long as the population is much larger
than the sample, the spread of the
sampling distribution is approximately
the same for any population size.
Sample proportion of viewers who watched Survivor II in samples of
n = 100
Sampling distribution of the sample proportion from SRSs of size 1000
Sampling distribution for samples of size 1000 redrawn with
expanded scale to better display the shape.
The approximate sampling distributions for sample
proportions for SRSs of 100 with population p=.37
The approximate sampling distributions for sample
proportions for SRSs of 1000 with population p=.37
Both statistics are unbiased because the means of their
distributions equal the true population value p =0.37.
The statistic from the larger sample is less variable.
The desired balance is low bias, low variability.
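The comparison between samples of size 100 and size 1000 can be reproduced by simulation. This sketch assumes p = 0.37 as on the slides and treats each observation as an independent draw, which approximates an SRS from a population much larger than the sample. The number of repeated samples and the seed are arbitrary choices.

```python
import random
import statistics

P = 0.37                  # population proportion from the slides
NUM_SAMPLES = 1000        # assumed number of repeated samples per size

rng = random.Random(37)   # assumed seed


def sample_proportion(n):
    """Proportion of 'successes' in one simulated sample of size n."""
    return sum(rng.random() < P for _ in range(n)) / n


results = {}
for n in (100, 1000):
    phats = [sample_proportion(n) for _ in range(NUM_SAMPLES)]
    results[n] = (statistics.fmean(phats), statistics.stdev(phats))
    print(f"n = {n:4d}: mean of p-hat = {results[n][0]:.3f}, "
          f"sd of p-hat = {results[n][1]:.4f}")
```

Both means should land near 0.37 (low bias), while the standard deviation for n = 1000 should be noticeably smaller than for n = 100 (lower variability).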
Describe these sampling distributions.
high bias, high variability
low bias, low variability
low bias, high variability
high bias, low variability
• Do problems 9.1, 9.2, 9.3, 9.4, and 9.5
Chapter 9.2: Sample Proportions
• The objective of some statistical
applications is to reach a conclusion about a
population proportion, p. For example, we
may try to estimate an approval rating
through a survey, or test a claim about a
proportion of defective light bulbs in a
shipment based on a random sample. Since
p is unknown to us, we must base our
conclusion on a sample proportion, p-hat.
However, the value of p-hat will vary from
sample to sample. The amount of
variability depends on the size of the
sample.
The Sampling Distribution for Proportions:
If we take repeated random samples of size
n from a population, the sample proportion,
p-hat, will have the following distribution and
properties:
(1) p-hat is an unbiased estimate of p since
μ(p-hat) = p
(2) sampling variability σ(p-hat) = √(p(1-p)/n)
(3) the shape of p-hat approximates a
normal distribution when np ≥ 10 and n(1-p) ≥ 10
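The three properties translate directly into a small helper. The function name phat_distribution is hypothetical; it simply returns the mean, the standard deviation, and whether the np and n(1-p) rule of thumb is met, using p = 0.37 from the earlier slides as an example.

```python
import math


def phat_distribution(p, n):
    """Mean, standard deviation, and normality check for the sample proportion."""
    mean = p                                    # property (1): unbiased
    sd = math.sqrt(p * (1 - p) / n)             # property (2)
    normal_ok = n * p >= 10 and n * (1 - p) >= 10   # property (3)
    return mean, sd, normal_ok


for n in (100, 1000):
    mean, sd, ok = phat_distribution(0.37, n)
    print(f"n = {n:4d}: mean = {mean}, sd = {sd:.4f}, normal approximation ok: {ok}")
```

Multiplying the sample size by 10 shrinks the standard deviation by a factor of √10, not 10, because n sits under the square root.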
Census example
Based on Census data, we know 11% of US
adults are Black. Therefore, p = .11. We
would expect a sample to contain roughly
11% Black representation. Suppose a
sample of 1500 adults contains 138 Black
individuals. Should we suspect
“undercoverage” in the sampling method?
Note: p(hat) = 138/1500= .092. Is this lower
than what would be expected by chance?
Census example
If we expect to get a certain sample result,
e.g. .11, and we don’t get it, this could be due
to sampling variability (remember, in repeated
samples from the same population we will
get different sample results). It could also be
caused by sampling from a different
population than we thought. If the result is far
from the expected value, then we think that
something other than chance is operating and
the result is statistically significant.
Census example
• We know it is possible for a sample to
contain 9.2% Black representation…but is it
likely that would happen due to natural
variation in random sampling methods?
• (1) Check assumptions:
Is np ≥ 10? (1500 × .11 = 165)
Is n(1-p) ≥ 10? (1500 × .89 = 1335)
• (2) Assume the Sampling distribution of
p(hat) is approximately Normal.
Census example
• (3) Calculate the probability: P(p(hat) ≤
.092) = P(z ≤ -2.23) = 0.0129
• (4) Interpret results…what does it
mean?
Only about 1.29% of samples of size 1500
would have 9.2% or less Black
representation. Since this is very
unlikely, we have reason to suspect
possible undercoverage in this sample.
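The four steps of the Census example can be retraced numerically. This is a sketch under the slide's own assumptions (p = .11, n = 1500, 138 Black adults in the sample); it uses statistics.NormalDist from the standard library for the normal probability, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n, count = 0.11, 1500, 138

p_hat = count / n                                    # observed sample proportion
assumptions_ok = n * p >= 10 and n * (1 - p) >= 10   # step (1)
sd = math.sqrt(p * (1 - p) / n)                      # sd of p-hat under p = .11
z = (p_hat - p) / sd                                 # standardized value
prob = NormalDist().cdf(z)                           # step (3): P(p-hat <= observed)

print(f"p-hat = {p_hat:.3f}, assumptions ok: {assumptions_ok}")
print(f"sd of p-hat = {sd:.5f}, z = {z:.3f}")
print(f"P(p-hat <= {p_hat:.3f}) = {prob:.4f}")
```

A probability this small is what motivates the suspicion of undercoverage; it does not by itself prove that the sampling method was flawed.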
The normal approximation to the sampling distribution of
p(hat)
Do problems 9.12, 9.13 and 9.29
in the book
Chapter 9.3: Sample Means
When the objective of a statistical
application is to reach a conclusion
about a population mean, μ, we
must consider the sample mean,
xbar. However, as we have noted
several times, the value of xbar will
vary from sample to sample. The
amount of variability will depend
on the size of our sample.
The Sampling Distribution for means:
If we take repeated random samples of size n from a
population, the sample mean, xbar, will have
the following distribution and properties:
(1) xbar is an unbiased estimate of μ, and the set of all
sample means is approximately normal
(2) The mean of the set of sample means is equal to the mean
μ of the population.
(3) The standard deviation of xbar is approximately
equal to σ/√n
(4) xbar is less spread out than individual observations: its
standard deviation is smaller by a factor of √n
The Sampling Distribution for means:
• 5) averages are less variable than individual
observations
• 6) averages are more normal than individual
observations.
Are you smarter than a 5th grader?
• Let x represent the time it takes a 5th grader
to complete a math problem. Suppose the
mean and standard deviation are μ = 2 min
and σ = .8 min, respectively.
• Let xbar be the sample average time for
9 students. Describe the sampling
distribution of xbar.
• xbar is approximately normal with mean
2 min and std dev .8/√9 ≈ .267 min
Are you smarter than a 5th grader?
• Suppose we take a SRS of 20 students.
Describe the sampling distribution of xbar.
Use it to find the probability that xbar is
greater than 2.5 min for the sample of 20
students.
• xbar is approximately normal with mean 2 minutes
• std dev = .8/√20 = .1789
• P(xbar > 2.5) = P(z > (2.5 - 2)/.1789) = P(z > 2.795)
• P = 1 - .9974 = .0026, very small
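Both 5th-grader questions use the same recipe, so they can be checked with one helper. The sketch assumes, as the slides do, that xbar can be treated as normal with mean μ and standard deviation σ/√n; the function name xbar_distribution is hypothetical.

```python
import math
from statistics import NormalDist

mu, sigma = 2.0, 0.8      # minutes, from the slides


def xbar_distribution(n):
    """Normal model for the mean of n completion times."""
    return NormalDist(mu, sigma / math.sqrt(n))


nine = xbar_distribution(9)
twenty = xbar_distribution(20)
prob = 1 - twenty.cdf(2.5)          # P(xbar > 2.5) for n = 20

print(f"n = 9:  mean = {nine.mean}, sd = {nine.stdev:.4f}")
print(f"n = 20: mean = {twenty.mean}, sd = {twenty.stdev:.4f}")
print(f"P(xbar > 2.5) for n = 20: {prob:.4f}")
```

An individual student taking more than 2.5 minutes would be unremarkable; an average of 20 students exceeding 2.5 minutes is rare because averages vary much less than individual times.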
The Central Limit Theorem
• As the sample size increases, the sampling distribution of xbar gets
closer and closer to a normal distribution. This is
true no matter what shape the population
distribution has, as long as the population has a
finite standard deviation σ. More observations
are required if the shape is far from normal.
• When n is large, the sampling distribution of
the sample mean xbar is close to the normal
distribution N(μ,σ/√n) with mean μ and
standard deviation σ/√n.
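A quick simulation makes the theorem concrete. The sketch below is an illustration, not the classroom activity: it assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), draws many samples of size 1, 5, and 30, and reports the skewness of the resulting sample means. The helper names, the seed, and the number of samples are all assumptions.

```python
import random
import statistics

rng = random.Random(5)    # assumed seed


def skewness(data):
    """Average cubed z-score; 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean(((x - m) / s) ** 3 for x in data)


def sample_means(n, num_samples=5000):
    """Means of num_samples samples of size n from an exponential population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(num_samples)]


skew_by_n = {}
for n in (1, 5, 30):
    means = sample_means(n)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: mean {statistics.fmean(means):.3f}, "
          f"sd {statistics.stdev(means):.3f}, skewness {skew_by_n[n]:.2f}")
```

As n grows, the skewness of the sample means shrinks toward 0 and their standard deviation shrinks roughly like 1/√n, which is the behavior the theorem describes.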
To demonstrate the Central Limit
Theorem
• Activity: A Penny for your Thoughts