Section 8.1 ~ Sampling Distributions
Introduction to Probability and Statistics
Ms. Young
Sec. 8.1
Objective
After this section you will understand the
fundamental ideas of sampling distributions
and how the distribution of sample means and
distribution of sample proportions are
formed. You will also learn the notation used
to represent sample means and proportions.
Reporting Statistics
How does someone come up with these statistics?
The mean daily protein consumption by Americans is 67 grams
Nationwide, the mean hospital stay after delivery of a baby
decreased from 3.2 days in 1980 to the current mean of 2.0 days
Thirty percent of high school girls in this country believe they would
be happier being married than not married
About 5% of all American children live with a grandparent
A sample is drawn from the population, the sample statistic is found, and
an inference is made about the entire population based on what was found
in the sample
What is the difference between the first two statements
and the last two statements?
The first two statements give estimates of a mean of a quantity
The last two statements say something about a proportion of the
population
Notation Review
n            Sample size
μ (mu)       Population mean
x̄ (x-bar)    Sample mean
σ (sigma)    Population standard deviation
s            Sample standard deviation
z            Z-score
r            Correlation coefficient
∑x (sigma)   Sum of x values
P(x)         Probability of x
Sample Means: The Basic Idea
A distribution of sample means is a histogram
that shows the distribution of a sample
statistic, such as a mean, taken from ALL
samples of a particular size
Ex. ~ refer to supplemental activity
Concluding questions for the activity
What do you notice about the mean of the distribution of
sample means in comparison to the population mean
(242.4)?
It is the same as the population mean
What do you notice about the histogram?
As the sample size increases, the distribution narrows and
clusters around the population mean
Sample Means Cont’d…
When you work with ALL possible samples of a
population of a given size, the mean of the distribution
of sample means is always the population mean
Typically, the population size is too large to calculate the
means for all possible samples, so we calculate the mean of a
sample, x̄, to estimate the population mean, μ
So when all you have is the mean of a sample, then that is your best
estimate for the population mean
When you are working with very large populations, as your sample
size increases, the distribution of sample means will look more and
more like a normal distribution, and its mean will approach the
population mean
This allows us to make inferences about a population
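The claim that the mean of ALL possible sample means equals the population mean can be checked directly on a population small enough to enumerate. The sketch below is only an illustration: the five-value population is hypothetical (it is not the data from the supplemental activity), and samples are drawn without replacement.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy population (an assumption, not data from the slides).
population = [2, 4, 6, 8, 10]
population_mean = mean(population)


def all_sample_means(values, n):
    """Return the mean of every possible sample of size n (no replacement)."""
    return [mean(sample) for sample in combinations(values, n)]


for n in (2, 3, 4):
    sample_means = all_sample_means(population, n)
    # The mean of the sample means matches the population mean for every n,
    # while the spread (min to max) narrows as n grows.
    print(n, len(sample_means), mean(sample_means),
          min(sample_means), max(sample_means))
```

For each sample size the printed mean of the sample means agrees with the population mean (up to floating-point rounding), and the range of sample means shrinks as n increases, which matches the narrowing histogram described in the activity.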
In working with samples from a large
population, you cannot expect the estimate of
the population mean to be perfect
This is known as a sampling error – the error that is
introduced by working with a sample
The more samples that you gather, the better your
estimate will be, but if you can only gather one sample,
that is your best estimate
Example of sampling error:
The following values are results from a survey of 400 students
who were asked how many hours they spend per week using a
search engine on the Internet.
n = 400
μ = 3.88
σ = 2.40
Suppose these were the values that were
randomly selected to obtain a sample of 32
students:
Sample 1
1.1  3.8  1.7  7.8  5.7  2.1  6.8  6.5
1.2  4.9  2.7  0.3  3.0  2.6  0.9  6.5
1.4  2.4  5.2  7.1  2.5  2.2  5.5  7.8
5.1  3.1  3.4  5.0  4.7  6.8  7.0  6.5
The mean of this sample is x̄ = 4.17.
We say that x̄ is a sample statistic because it
comes from a sample of the entire population.
Now suppose a different sample of 32 students
was selected from the 400:
Sample 2
1.8  0.4  4.0  5.2  5.7  6.5  0.5  3.9
3.1  2.4  1.2  5.8  0.8  5.4  2.9  6.2
5.7  7.2  0.8  7.2  0.9  6.6  5.1  4.0
5.7  3.2  7.9  2.5  3.6  3.1  5.0  3.1
For this sample, x̄ = 3.98.
Now you have two sample means that don’t agree
with each other (4.17 & 3.98 respectively), and
neither one agrees with the true population mean
(3.88). This is an example of sampling error.
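The two sample means can be reproduced from the values listed for Sample 1 and Sample 2. The following is a minimal sketch using Python's standard library; the variable names are illustrative, and the population mean of 3.88 is taken from the survey summary rather than recomputed, since the full set of 400 values is not reproduced here.

```python
from statistics import mean

population_mean = 3.88  # reported mean of all 400 students

sample_1 = [1.1, 3.8, 1.7, 7.8, 5.7, 2.1, 6.8, 6.5,
            1.2, 4.9, 2.7, 0.3, 3.0, 2.6, 0.9, 6.5,
            1.4, 2.4, 5.2, 7.1, 2.5, 2.2, 5.5, 7.8,
            5.1, 3.1, 3.4, 5.0, 4.7, 6.8, 7.0, 6.5]

sample_2 = [1.8, 0.4, 4.0, 5.2, 5.7, 6.5, 0.5, 3.9,
            3.1, 2.4, 1.2, 5.8, 0.8, 5.4, 2.9, 6.2,
            5.7, 7.2, 0.8, 7.2, 0.9, 6.6, 5.1, 4.0,
            5.7, 3.2, 7.9, 2.5, 3.6, 3.1, 5.0, 3.1]

for label, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    x_bar = mean(sample)
    # Sampling error here means the gap between x-bar and the population mean.
    error = x_bar - population_mean
    print(f"{label}: n = {len(sample)}, x-bar = {x_bar:.2f}, "
          f"sampling error = {error:+.2f}")
```

Both samples overshoot the population mean, by roughly 0.29 and 0.10 hours respectively; a different random draw could just as easily fall below it.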
In summary, when including all possible samples of
size n, the characteristics of the distribution of
sample means are as follows:
• The distribution of sample means is approximately a
normal distribution.
• The mean of the distribution of sample means is the
mean of the population.
• The standard deviation of the distribution of
sample means depends on the population standard
deviation and the sample size: it equals σ / √n.
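The standard deviation of the distribution of sample means is the population standard deviation divided by the square root of the sample size, σ / √n. A rough simulation can illustrate this. The sketch below rests on several assumptions: the population is a hypothetical set of 400 uniformly generated values (not the survey data), samples are drawn with replacement so that σ / √n applies without a finite-population correction, and only a limited number of random samples is used rather than all possible samples.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 400 values (an assumption for illustration).
population = [rng.uniform(0, 8) for _ in range(400)]
sigma = pstdev(population)  # population standard deviation


def simulated_sd_of_sample_means(n, trials=5000):
    """Standard deviation of the means of `trials` random samples of size n."""
    means = [fmean(rng.choices(population, k=n)) for _ in range(trials)]
    return pstdev(means)


for n in (4, 16, 64):
    simulated = simulated_sd_of_sample_means(n)
    predicted = sigma / sqrt(n)
    print(f"n = {n:2d}: simulated = {simulated:.3f}, sigma/sqrt(n) = {predicted:.3f}")
```

The simulated and predicted values should agree closely, and quadrupling the sample size should roughly halve the spread of the sample means.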
Example 1 - Sampling Farms
Texas has roughly 225,000 farms, more than any other state in
the United States. The actual mean farm size is μ = 582 acres and
the standard deviation is σ = 150 acres. For random samples of n =
100 farms, find the mean and standard deviation of the
distribution of sample means. What is the probability of selecting
a random sample of 100 farms with a mean greater than 600
acres?
Solution: The mean of the distribution of sample means is the same
as the mean of the entire population, which is 582 acres.
The standard deviation of the sampling distribution is
σ / √n = 150 / √100 = 150 / 10 = 15 acres
A sample mean of 600 acres therefore has a standard score of
z = (sample mean – pop. mean) / standard deviation = (600 – 582) / 15 = 1.2
According to the z-score table, this standard score is in the 88th
percentile, so the probability of selecting a sample with a mean
less than 600 acres is about 0.88.
Thus, the probability of selecting a sample with a mean greater
than 600 acres is about 0.12.
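The arithmetic in Example 1 can be checked without a z-score table. The sketch below uses `statistics.NormalDist` from Python's standard library (available since Python 3.8) and assumes, as the example does, that the distribution of sample means is normal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 582        # population mean farm size, in acres
sigma = 150     # population standard deviation, in acres
n = 100         # farms per sample
threshold = 600 # sample mean of interest, in acres

# Standard deviation of the distribution of sample means: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# Standard score of a sample mean of 600 acres.
z = (threshold - mu) / standard_error

# Model the distribution of sample means as normal.
sampling_dist = NormalDist(mu=mu, sigma=standard_error)
p_less = sampling_dist.cdf(threshold)
p_greater = 1 - p_less

print(f"standard error = {standard_error} acres")
print(f"z = {z:.2f}")
print(f"P(sample mean < 600) = {p_less:.4f}")
print(f"P(sample mean > 600) = {p_greater:.4f}")
```

The computed probabilities come out near 0.885 and 0.115, which round to the 0.88 and 0.12 read from the table.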
Sample Proportions
Much of what you have learned about distribution of
sample means carries over to distributions of sample
proportions
Suppose instead of being interested in knowing how
many hours per week students spend using search
engines, we took those same 400 students and asked
them a simple Yes or No question, “Do you own a car?”
(refer to the raw data on P.341)
If you counted carefully, you would find that 240 of
the 400 responses are Y’s, so the exact proportion, or
population proportion is p = 0.6 (240/400)
This would be a population parameter
Sample Proportions Cont’d…
When you take a sample to estimate the
population proportion, you follow the same
process as you do when taking a sample to
estimate the population mean
The sample proportion is represented with p̂ (read
as p-hat)
Example 2 ~ Analyzing a Sample Proportion
Consider the distribution of sample proportions shown on P.341.
Assume that its population proportion is p = 0.6 and its standard
deviation is 0.1. Suppose you randomly select the following sample
of 32 responses:
YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY
Compute the sample proportion, p̂, for the number of Y’s in
this sample. How far does it lie from the population
proportion? What is the probability of selecting another
sample with a proportion greater than the one you selected?
Solution: The proportion of Y responses in this sample is
p̂ = 24/32 = 0.75
Using a population proportion of 0.6 and a standard deviation of
0.1, we find that the sample statistic, p̂ = 0.75, has a standard
score of
z = (sample proportion – pop. proportion) / standard deviation = (0.75 – 0.6) / 0.1 = 1.5
The sample proportion is 1.5 standard deviations above the mean
of the distribution.
Using the z-score chart, we see that a standard score of 1.5
corresponds to the 93rd percentile. The probability of selecting
another sample with a proportion less than the one we selected is
about 0.93.
Thus, the probability of selecting another sample with a proportion
greater than the one we selected is about 1 – 0.93 = 0.07.
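Example 2 can be verified the same way. This sketch counts the Y responses in the 32-character sample, then uses `statistics.NormalDist` for the percentile. The population proportion of 0.6 and the standard deviation of 0.1 are the values given in the example; treating the distribution of sample proportions as normal is an assumption carried over from it.

```python
from statistics import NormalDist

responses = "YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY"

p = 0.6           # population proportion given in the example
std_dev = 0.1     # standard deviation given in the example

# Sample proportion: share of Y responses in the sample.
y_count = responses.count("Y")
p_hat = y_count / len(responses)

# Standard score of the sample proportion.
z = (p_hat - p) / std_dev

# Probability that another sample has a smaller / larger proportion.
p_less = NormalDist().cdf(z)
p_greater = 1 - p_less

print(f"p-hat = {y_count}/{len(responses)} = {p_hat}")
print(f"z = {z:.2f}")
print(f"P(proportion < p-hat) = {p_less:.4f}")
print(f"P(proportion > p-hat) = {p_greater:.4f}")
```

The result is about 0.933 below and 0.067 above, consistent with the 0.93 and 0.07 obtained from the z-score chart.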