chapter 7 power point

Chapter 7 ~ Sample Variability
[Figure: Empirical Distribution of Sample Means. Histogram of Frequency (0 to 9) versus Sample Mean (6.8 to 11.2).]
Chapter Goals
• Investigate the variability in sample statistics
from sample to sample
• Find measures of central tendency for sample
statistics
• Find measures of dispersion for sample statistics.
• Find the pattern of variability for sample
statistics
7.1 ~ Sampling Distributions
• To make inferences about a population, we need
to understand sampling
• The sample mean varies from sample to sample
• The sample mean has a distribution; we need to
understand how the sample mean varies and the
pattern (if any) in the distribution
Sampling Distribution of a Sample Statistic
• Sampling Distribution of a Sample Statistic: The
distribution of values for a sample statistic obtained from
repeated samples, all of the same size and all drawn from the
same population
Example: Consider the set {1, 2, 3, 4}:
1) Make a list of all samples of size 2 that can be drawn
from this set (Sample with replacement)
2) Construct the sampling distribution for the sample mean
for samples of size 2
3) Construct the sampling distribution for the minimum for
samples of size 2
Table of All Possible Samples
This table lists all possible samples of size 2, the mean for each sample, the
minimum for each sample, and the probability of each sample occurring
(all equally likely)

Sample    \(\bar{x}\)    Minimum    Probability
{1, 1}    1.0    1    1/16
{1, 2}    1.5    1    1/16
{1, 3}    2.0    1    1/16
{1, 4}    2.5    1    1/16
{2, 1}    1.5    1    1/16
{2, 2}    2.0    2    1/16
{2, 3}    2.5    2    1/16
{2, 4}    3.0    2    1/16
{3, 1}    2.0    1    1/16
{3, 2}    2.5    2    1/16
{3, 3}    3.0    3    1/16
{3, 4}    3.5    3    1/16
{4, 1}    2.5    1    1/16
{4, 2}    3.0    2    1/16
{4, 3}    3.5    3    1/16
{4, 4}    4.0    4    1/16
Sampling Distribution
• Summarize the information in the previous table to obtain the
sampling distribution of the sample mean and the sample
minimum:
Sampling Distribution of the Sample Mean

\(\bar{x}\)    \(P(\bar{x})\)
1.0    1/16
1.5    2/16
2.0    3/16
2.5    4/16
3.0    3/16
3.5    2/16
4.0    1/16

[Figure: Histogram of the sampling distribution of the sample mean, \(P(\bar{x})\) (0.00 to 0.25) versus \(\bar{x}\) (1.0 to 4.0).]
Sampling Distribution
Sampling Distribution of the Sample Minimum:

m    P(m)
1    7/16
2    5/16
3    3/16
4    1/16

[Figure: Histogram of the sampling distribution of the sample minimum, P(m) (0.0 to 0.5) versus m (1 to 4).]
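The two sampling distributions above can be reproduced by listing every sample and tallying the statistic. The following is a minimal Python sketch, assuming ordered samples drawn with replacement as in the example; the helper names `sampling_distribution` and `sample_mean` are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution(population, n, statistic):
    """Exact distribution of `statistic` over all ordered samples of size n
    drawn with replacement, each sample being equally likely."""
    samples = list(product(population, repeat=n))
    counts = Counter(statistic(sample) for sample in samples)
    return {value: Fraction(count, len(samples))
            for value, count in sorted(counts.items())}


def sample_mean(sample):
    """Mean of one sample, kept as an exact fraction."""
    return Fraction(sum(sample), len(sample))


population = [1, 2, 3, 4]
mean_dist = sampling_distribution(population, 2, sample_mean)
min_dist = sampling_distribution(population, 2, min)

for value, prob in mean_dist.items():
    print(f"mean {float(value):.1f}: P = {prob}")
for value, prob in min_dist.items():
    print(f"min  {value}: P = {prob}")
```

Python prints fractions in lowest terms, so 2/16 appears as 1/8 and 4/16 as 1/4; the probabilities are the same as in the tables.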
Example 1
Example: Consider the population consisting of six equally likely integers: 1,
2, 3, 4, 5, and 6. Empirically investigate the sampling distribution
of the sample mean. Select 50 samples of size 5, find the mean for
each sample, and construct the empirical distribution of the sample
mean.
The Population: Theoretical Probability Distribution
\(\mu = 3.5\), \(\sigma = 1.7078\)
[Figure: bar chart of the population distribution, P(x) (0.00 to 0.18) versus x (1 to 6).]
Empirical Distribution of the Sample Mean
• Samples of Size 5
\(\bar{\bar{x}} = 3.352\), \(s_{\bar{x}} = 0.714\)
[Figure: Histogram of Frequency (0 to 14) versus Sample Mean \(\bar{x}\) (1.8 to 5.3).]
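The experiment in Example 1 can be repeated in software. The following is a rough Python sketch, assuming the standard library's pseudo-random generator stands in for physically drawing samples; the seed value and the variable names are arbitrary choices, not part of the slides.

```python
import random
from statistics import mean, pstdev, stdev

population = [1, 2, 3, 4, 5, 6]

# Population parameters (each value equally likely).
mu = mean(population)        # 3.5
sigma = pstdev(population)   # about 1.7078

# Fixed seed (an arbitrary choice) so the run is repeatable.
rng = random.Random(7)

# Draw 50 samples of size 5 with replacement and record each sample mean.
sample_means = [mean(rng.choices(population, k=5)) for _ in range(50)]

grand_mean = mean(sample_means)   # mean of the sample means
s_xbar = stdev(sample_means)      # standard deviation of the sample means

print(f"mu = {mu}, sigma = {sigma:.4f}")
print(f"mean of sample means = {grand_mean:.3f}")
print(f"std. dev. of sample means = {s_xbar:.3f}")
```

Because the 50 samples are random, the printed values will not match 3.352 and 0.714 exactly; they should land near 3.5 and noticeably below the population standard deviation.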
Important Notes & Random Sample
1. \(\bar{\bar{x}}\): the mean of the sample means
2. \(s_{\bar{x}}\): the standard deviation of the sample means
3. The theory involved with sampling distributions described in
the remainder of this chapter requires random sampling
Random Sample: A sample obtained in such a way that each
possible sample of a fixed size n has an equal probability of being
selected
– (Every possible handful of size n has the same probability of
being selected)
7.2 ~ Where Does This Lead Us?
• Introduces the most important idea in all of
statistics
• Describes the sampling distribution of the sample
mean
• Examples suggest: the sample mean (and sample
total) tend to be normally distributed
Important Definition & Theorem
Sampling Distribution of Sample Means
If all possible random samples, each of size n, are taken from any
population with a mean \(\mu\) and a standard deviation \(\sigma\), the sampling
distribution of sample means will:
1. have a mean \(\mu_{\bar{x}}\) equal to \(\mu\)
2. have a standard deviation \(\sigma_{\bar{x}}\) equal to \(\sigma / \sqrt{n}\)
Further, if the sampled population has a normal distribution, then the
sampling distribution of \(\bar{x}\) will also be normal for samples of all
sizes
Central Limit Theorem
The sampling distribution of sample means will more closely resemble a
normal distribution as the sample size increases.
Summary
• The mean of the sampling distribution of \(\bar{x}\) is equal to the mean of the
original population: \(\mu_{\bar{x}} = \mu\)
• The standard deviation of the sampling distribution of \(\bar{x}\) (also called the
standard error of the mean) is equal to the standard deviation of the
original population divided by the square root of the sample size: \(\sigma_{\bar{x}} = \sigma / \sqrt{n}\)
Notes:
– The distribution of \(\bar{x}\) becomes more compact as n increases. (Why?)
– The variance of \(\bar{x}\): \(\sigma_{\bar{x}}^2 = \sigma^2 / n\)
• The distribution of \(\bar{x}\) is (exactly) normal when the original population
is normal
• The CLT says: the distribution of \(\bar{x}\) is approximately normal regardless
of the shape of the original distribution, when the sample size is large
enough!
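The first two summary results can be checked exactly on the small population {1, 2, 3, 4} from Section 7.1. The following is a minimal Python sketch, assuming samples of size 2 drawn with replacement; the helper `mean_and_variance` is an illustrative name, not something from the slides.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2


def mean_and_variance(values):
    """Mean and variance of equally likely values, as exact fractions."""
    values = [Fraction(v) for v in values]
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, var


mu, sigma_sq = mean_and_variance(population)

# Every ordered sample of size n (with replacement) is equally likely.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mu_xbar, var_xbar = mean_and_variance(sample_means)

print(mu, mu_xbar)             # both 5/2
print(sigma_sq / n, var_xbar)  # both 5/8
```

The variance of the sample mean equals the population variance divided by n, so its square root, the standard error, equals the population standard deviation divided by the square root of n.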
Standard Error of the Mean
Standard Error of the Mean: The standard deviation of
the sampling distribution of sample means: \(\sigma_{\bar{x}} = \sigma / \sqrt{n}\)
Notes:
• The n in the formula for the standard error of the mean is
the size of the sample
• The proof of the Central Limit Theorem is beyond the
scope of this course
• The following example illustrates the results of the
Central Limit Theorem
Graphical Illustration of the Central Limit Theorem
[Figure: four panels showing the Original Population and the Distribution of \(\bar{x}\) for n = 2, n = 10, and n = 30; horizontal axes marked 10, 20, 30.]
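The same effect can be seen numerically. The following is a rough Python sketch, assuming a right-skewed exponential population with mean 1 and standard deviation 1 (a stand-in chosen for illustration, not the population pictured on the slide); the seed, the repetition count, and the helper names are also arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


def simulate_sample_means(n, repetitions, rng):
    """Means of `repetitions` samples of size n from the exponential population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(repetitions)]


rng = random.Random(2024)  # fixed seed so the run is repeatable
results = {}
for n in (2, 10, 30):
    means = simulate_sample_means(n, 4000, rng)
    results[n] = (pstdev(means), skewness(means))
    print(f"n = {n:2d}: sd of means = {results[n][0]:.3f} "
          f"(sigma/sqrt(n) = {1 / sqrt(n):.3f}), skewness = {results[n][1]:.2f}")
```

As n grows, the spread of the sample means should track sigma/sqrt(n) and the skewness should shrink toward 0, which is the pattern the panels illustrate.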
7.3 ~ Applications of the Central Limit Theorem
• When the sampling distribution of the sample
mean is (exactly) normally distributed, or
approximately normally distributed (by the CLT),
we can answer probability questions using the
standard normal distribution or the TI-83/84
functions for the normal distribution,
such as normalcdf and invNorm.
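For readers working in software rather than on a calculator, the same two operations are available in Python's standard library. The following is a small sketch; the wrapper names `normal_cdf` and `inv_norm` are hypothetical, chosen here to mirror the calculator functions, while `statistics.NormalDist` with its `cdf` and `inv_cdf` methods is the actual library class.

```python
from statistics import NormalDist


def normal_cdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(area, mu=0.0, sigma=1.0):
    """Value that has the given area to its left."""
    return NormalDist(mu, sigma).inv_cdf(area)


print(round(normal_cdf(-1.96, 1.96), 4))  # about 0.95
print(round(inv_norm(0.975), 2))          # about 1.96
```

For a sample mean, pass the standard error sigma/sqrt(n) as `sigma`, not the population standard deviation.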
Example 2
Example: Consider a normal population with \(\mu = 50\) and
\(\sigma = 15\). Suppose a sample of size 9 is selected at
random. Find:
1) \(P(45 \le \bar{x} \le 60)\)
2) \(P(\bar{x} \le 47.5)\)
Solutions: Since the original population is normal, the
distribution of the sample mean is also (exactly) normal
1) \(\mu_{\bar{x}} = \mu = 50\)
2) \(\sigma_{\bar{x}} = \sigma / \sqrt{n} = 15 / \sqrt{9} = 15 / 3 = 5\)
Example 2
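[Figure: normal curve with \(\bar{x}\) = 45, 50, 60 aligned with z = -1.00, 0, 2.00; the areas on either side of the mean are 0.3413 and 0.4772.]
\(z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}\);
\(P(45 \le \bar{x} \le 60) = P\left(\dfrac{45 - 50}{5} \le z \le \dfrac{60 - 50}{5}\right)\)
\(= P(-1.00 \le z \le 2.00)\)
\(= 0.3413 + 0.4772 = 0.8185\)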
Example 2
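[Figure: normal curve with \(\bar{x}\) = 47.5, 50 aligned with z = -0.50, 0; the left tail has area 0.3085 and the area between 47.5 and 50 is 0.1915.]
\(z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}\);
\(P(\bar{x} \le 47.5) = P\left(\dfrac{\bar{x} - 50}{5} \le \dfrac{47.5 - 50}{5}\right)\)
\(= P(z \le -0.50)\)
\(= 0.5000 - 0.1915 = 0.3085\)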
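Both parts of Example 2 can be checked in software. The following is a minimal Python sketch using the standard library's `NormalDist`, assuming the sampling distribution of the mean is exactly normal as stated in the solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 15, 9

# Sampling distribution of the sample mean: normal with mean 50, std. error 5.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_between = xbar_dist.cdf(60) - xbar_dist.cdf(45)  # P(45 <= xbar <= 60)
p_below = xbar_dist.cdf(47.5)                      # P(xbar <= 47.5)

print(f"{p_between:.4f}")
print(f"{p_below:.4f}")
```

The first result can differ from the table-based 0.8185 in the last digit, because the table areas are each rounded to four decimal places before being added.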
Example 3
Example: A recent report stated that the day-care cost per week in Boston is
$109. Suppose this figure is taken as the mean cost per week and
that the standard deviation is known to be $20.
1) Find the probability that a sample of 50 day-care centers would show a
mean cost of $105 or less per week.
2) Suppose the actual sample mean cost for the sample of 50 day-care centers
is $120. Is there any evidence to refute the claim of $109 presented in the
report?
Solutions:
• The shape of the original distribution is unknown, but the sample size, n, is
large. The CLT applies.
• The distribution of \(\bar{x}\) is approximately normal
\(\mu_{\bar{x}} = \mu = 109\)
\(\sigma_{\bar{x}} = \sigma / \sqrt{n} = 20 / \sqrt{50} \approx 2.83\)
Example 3
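1)
[Figure: normal curve with \(\bar{x}\) = 105, 109 aligned with z = -1.41, 0; the left tail has area 0.0793 and the area between 105 and 109 is 0.4207.]
\(z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}\);
\(P(\bar{x} \le 105) = P\left(z \le \dfrac{105 - 109}{2.83}\right)\)
\(= P(z \le -1.41)\)
\(= 0.5000 - 0.4207 = 0.0793\)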
Example 3
2)
• To investigate the claim, we need to examine how likely a sample mean
of $120 is
• Consider how far out in the tail of the distribution of the sample mean
$120 lies
\(z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}\);
\(P(\bar{x} \ge 120) = P\left(z \ge \dfrac{120 - 109}{2.83}\right)\)
\(= P(z \ge 3.89)\)
\(= 0.5000 - 0.4999 = 0.0001\)
• Since the probability is so small, this suggests the observation of $120 is
very rare (if the mean cost is really $109)
• There is evidence (the sample) to suggest the claim of \(\mu\) = $109 is likely
wrong
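Both parts of Example 3 can also be computed without a table. The following is a minimal Python sketch using the standard library's `NormalDist`, assuming the CLT approximation described in the solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 109, 20, 50

se = sigma / sqrt(n)            # standard error of the mean, about 2.83
xbar_dist = NormalDist(mu, se)  # approximate distribution of the sample mean

p_105_or_less = xbar_dist.cdf(105)      # part 1: P(xbar <= 105)
p_120_or_more = 1 - xbar_dist.cdf(120)  # part 2: P(xbar >= 120)

print(f"standard error = {se:.2f}")
print(f"P(xbar <= 105) = {p_105_or_less:.4f}")
print(f"P(xbar >= 120) = {p_120_or_more:.6f}")
```

Part 1 comes out near 0.079, slightly below the table answer of 0.0793, because the table lookup rounds z to -1.41. Part 2 comes out below 0.0001, which supports the same conclusion: a sample mean of $120 would be very rare if the true mean were $109.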