“Teach A Level Maths”
Statistics 1
The Central Limit Theorem
© Christine Crisp
S1: The Central Limit Theorem
AQA
Normal Distribution diagrams in this presentation have been drawn using FX Draw
( available from Efofex at www.efofex.com )
"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with
permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"
How Good are Estimates?
In the previous presentation we met a question where
the sample was small. So, we couldn’t be sure that
the estimate of the population mean was a good one.
A numerical measure of the accuracy of an estimate
can be made if we know the standard deviation of the
population.
To explain this, we’ll look again at the diagrams showing
the means of 1000 samples from a population of weights
of hens’ eggs.
[Figure: histograms of the population of egg weights and of the means of 1000 samples, for samples of size 5 and of size 20.]
Population: mean $\mu = 60$, standard deviation $\sigma = 2.94$
Samples of size 5 ($n = 5$): mean of means 60·0, standard deviation of means 1·34
Samples of size 20 ($n = 20$): mean of means 60·0, standard deviation of means 0·67
We want to concentrate on the standard deviations.
It can be shown that the standard deviation of the
sample means is given by $\frac{\sigma}{\sqrt{n}}$.
For samples of size 5: $\frac{\sigma}{\sqrt{n}} = \frac{2.94}{\sqrt{5}} \approx 1.31$
For samples of size 20: $\frac{\sigma}{\sqrt{n}} = \frac{2.94}{\sqrt{20}} \approx 0.66$
( Our values were 1·34 and 0·67 but we didn’t have all
possible samples. )
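The comparison between simulated and predicted values can be tried out with a short simulation. This is only a sketch: it assumes the population of egg weights is Normal with mean 60 and standard deviation 2·94 (the slides do not state the shape of the population), and the function name `sd_of_sample_means` is made up for this illustration.

```python
import random
import statistics

def sd_of_sample_means(mu, sigma, n, num_samples=1000, seed=0):
    """Draw num_samples samples of size n and return the s.d. of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]
    return statistics.pstdev(means)

MU, SIGMA = 60, 2.94  # assumed Normal population (an assumption, see above)
for n in (5, 20):
    simulated = sd_of_sample_means(MU, SIGMA, n)
    predicted = SIGMA / n ** 0.5
    print(f"n = {n:2d}: simulated {simulated:.2f}, sigma/sqrt(n) = {predicted:.2f}")
```

As in the slides, the simulated values should be close to, but not exactly equal to, the predicted ones, because only 1000 of all possible samples are drawn.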
The standard deviation of the distribution of the sample
means is called the standard error of the sample mean,
often shortened to the standard error (s.e.).
The standard error is given by $\frac{\sigma}{\sqrt{n}}$
where,
$\sigma$ is the population standard deviation and
$n$ is the size of each sample.
Since the distribution of sample means is Normal,
approximately 68% of the sample means lie within ±1 s.e.
of the population mean and 95% within ±2 s.e.
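The 68% and 95% figures can be checked against the standard Normal distribution. The sketch below uses Python's standard library; the helper names `standard_error` and `fraction_within` are invented for this example.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def fraction_within(k):
    """Probability that a Normal variable lies within k s.d. of its mean."""
    z = NormalDist()  # standard Normal, mean 0 and s.d. 1
    return z.cdf(k) - z.cdf(-k)

print(round(standard_error(2.94, 5), 2))  # 1.31
print(round(fraction_within(1), 4))       # 0.6827
print(round(fraction_within(2), 4))       # 0.9545
```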
We now know the following facts about the distribution
of the means of samples of size n from a population with
an approximately Normal distribution:
• The distribution of means is approximately Normal.
• The mean of the means is equal to the population
mean, $\mu$.
• The standard deviation is called the standard error
and is equal to the population standard deviation,
$\sigma$, divided by $\sqrt{n}$.
We write $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Very importantly, the above also hold when the population
is not Normal but in this case n should be greater than
30. This is the Central Limit Theorem (C.L.T.)
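To see the theorem at work on a population that is clearly not Normal, here is a small simulation. It is a sketch under stated assumptions: the exponential population (mean 1, s.d. 1), the sample size of 40 and the 5000 repetitions are all choices made for this illustration, not values from the slides.

```python
import random
import statistics
from math import sqrt

def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 40                   # greater than 30, as the C.L.T. rule requires
means = sample_means(n, 5000, rng)

mu, sigma = 1.0, 1.0     # mean and s.d. of the assumed exponential population
se = sigma / sqrt(n)     # predicted standard error
within_1 = sum(abs(m - mu) <= se for m in means) / len(means)
within_2 = sum(abs(m - mu) <= 2 * se for m in means) / len(means)

print(f"mean of means {statistics.fmean(means):.3f} (population mean {mu})")
print(f"s.d. of means {statistics.pstdev(means):.3f} (predicted s.e. {se:.3f})")
print(f"within 1 s.e.: {within_1:.1%}, within 2 s.e.: {within_2:.1%}")
```

The proportions within 1 and 2 standard errors should come out near 68% and 95%, even though the population itself is strongly skewed.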
e.g.1. A General Studies test was given to 720 students
in a College. The standard deviation of the marks is 20.
The following marks are from a random sample of 12
students:
35, 23, 17, 38, 20, 25, 29, 32, 28, 31, 33, 24
(a) Estimate the mean mark of all the students.
(b) Find the standard error of your estimate.
(c) What size sample would be needed to halve the
standard error?
The sample size is less than 30, so we must assume the
population is Normal.
Solution: (a) $\bar{x} = \frac{\sum x}{n} = \frac{35 + 23 + \dots + 24}{12} \approx 27.9$
This is the estimate of $\mu$.
(b) The standard error $= \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{12}} \approx 5.77$
(c) We need $\frac{\sigma}{\sqrt{n}} = \frac{5.77}{2}$, so $\frac{20}{\sqrt{n}} = 2.89 \Rightarrow \sqrt{n} = 6.93 \Rightarrow n = 48$
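The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch using only the data given in the question; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import fmean

marks = [35, 23, 17, 38, 20, 25, 29, 32, 28, 31, 33, 24]
sigma = 20  # population standard deviation, given in the question

mean_estimate = fmean(marks)            # (a) estimate of the population mean
se = sigma / sqrt(len(marks))           # (b) standard error of that estimate
target_se = se / 2                      # (c) we want half the standard error
n_needed = (sigma / target_se) ** 2     # rearranged from s.e. = sigma / sqrt(n)

print(round(mean_estimate, 1))  # 27.9
print(round(se, 2))             # 5.77
print(round(n_needed))          # 48
```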
Part (c) of the last question illustrates a useful principle.
We halved the standard error from 5·77 to 2·89 by
increasing the sample size from 12 to 48.
To halve the standard error we must multiply the
sample size by 4.
Can you say directly what the sample size would need to
be if we wanted the standard error to be a third of its
original value?
ANS: We need $9 \times 12 = 108$
The rule is:
To divide s.e. by 2, multiply sample size by $2^2$ ($= 4$)
To divide s.e. by 3, multiply sample size by $3^2$ ($= 9$) etc.
The reason is that the formula for the s.e. contains
division by $\sqrt{n}$.
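The rule can be checked algebraically. In the short derivation below, k stands for the factor by which the standard error is to be divided; k is a symbol introduced here, not one used in the slides.

```latex
\[
\frac{\sigma}{\sqrt{k^{2} n}}
  = \frac{\sigma}{k\sqrt{n}}
  = \frac{1}{k}\cdot\frac{\sigma}{\sqrt{n}}
\]
```

So multiplying the sample size by $k^2$ divides the standard error by $k$. With $n = 12$ and $k = 3$ this gives a sample size of $3^2 \times 12 = 108$, as above.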
e.g.2. The heights of plants grown from a particular
variety of seeds are claimed to have a Normal distribution
with mean 90 cm. and standard deviation of 10 cm.
(a) Find the probability that a randomly selected plant is
less than 100 cm.
(b) A random sample of 5 plants is selected. Find the
probability that the sample mean is less than 85 cm.
Solution:
Let X be the r.v. “height of a plant (cm)”. Then $X \sim N(90, 10^2)$.
(a) We have just 1 plant so this part is not dealing with
sample means.
We want $P(X < 100)$.
Standardising, $z = \frac{100 - 90}{10} = 1$
$P(Z < 1) = \Phi(1) = 0.8413$
N.B. z = 1 because 100 is one s.d. above the mean, 90.
Solution: (b) The sample mean has distribution $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
With a sample size of 5, $\bar{X} \sim N\left(90, \frac{10^2}{5}\right) \Rightarrow \bar{X} \sim N(90, 20)$
We want $P(\bar{X} < 85)$.
Standardising: $z = \frac{85 - 90}{\sqrt{20}} = -1.12$
$P(\bar{X} < 85) = P(Z < -1.12) = 1 - \Phi(1.12) = 1 - 0.8686 = 0.1314$
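Both parts of the plant-height example can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library; the variable names are chosen for this illustration. Because the software does not round z to two decimal places, the answer to (b) comes out at about 0·132 rather than the table value 0·1314.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 90, 10  # claimed mean and s.d. of plant heights (cm)

# (a) A single plant: X ~ N(90, 10^2)
single_plant = NormalDist(mu, sigma)
p_a = single_plant.cdf(100)

# (b) Mean of 5 plants: X-bar ~ N(90, 10^2 / 5), so its s.d. is 10 / sqrt(5)
n = 5
mean_of_sample = NormalDist(mu, sigma / sqrt(n))
p_b = mean_of_sample.cdf(85)

print(round(p_a, 4))  # 0.8413
print(round(p_b, 4))
```

Note that `NormalDist` takes the standard deviation, not the variance, as its second argument, so the variance 20 has to be entered as its square root.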
Exercise
1. The length of telephone calls received by an
organization is known to have a standard deviation of
13 mins. The table gives the lengths of 50 randomly
selected telephone calls.
Length (min)    1-2    3-5    6-8    9-11    12-17    18-25
Frequency        14     12     10      8        4        2
(a) Use the sample to calculate an estimate of $\mu$, the
mean length of calls.
(b) Find the standard error of your estimate.
Solution:
(a) $\bar{x} = 6.4$
(b) s.e. $= \frac{\sigma}{\sqrt{n}} = \frac{13}{\sqrt{50}} \approx 1.84$
N.B. The sample size is greater than 30. The Central
Limit Theorem (C.L.T.) tells us we need not assume the
population is Normal.
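The answers to this exercise can be reproduced as follows. This is a sketch that assumes each class is represented by the midpoint of its stated limits (for example 1·5 for the class 1-2); the slides do not show the working, but this assumption gives the stated mean of 6·4.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the table
classes = [(1, 2, 14), (3, 5, 12), (6, 8, 10), (9, 11, 8), (12, 17, 4), (18, 25, 2)]
sigma = 13  # population standard deviation (min), given in the question

n = sum(freq for _, _, freq in classes)
# Assumption: every call in a class is treated as having the class midpoint length
total = sum((low + high) / 2 * freq for low, high, freq in classes)

mean_estimate = total / n   # (a) estimate of the mean length of calls
se = sigma / sqrt(n)        # (b) standard error of the estimate

print(n, mean_estimate, round(se, 2))  # 50 6.4 1.84
```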
Exercise
2. A Normal distribution has a mean of 40 and a
variance of 6. Find the probability that
(a) the average of 10 observations exceeds 41 and
(b) the average of 50 observations exceeds 41.
Interpret your answers to (a) and (b), using sketches
to help you.
3. The random variable X has a distribution $X \sim N(50, 81)$.
(a) Write down the distribution of $\bar{X}$, the mean of
random samples of size 9 taken from X.
(b) Find the probability that $\bar{X}$ is less than 45.
2. A Normal distribution has a mean of 40 and a variance
of 6. Find the probability that
(a) the average of 10 observations exceeds 41 and
(b) the average of 50 observations exceeds 41.
Interpret your answers to (a) and (b), using sketches
to help you.
Solution:
Let X be the r.v. Then, $X \sim N(40, 6)$
(a) $\bar{X} \sim N\left(40, \frac{6}{10}\right) \Rightarrow \bar{X} \sim N(40, 0.6)$
We want $P(\bar{X} > 41)$.
$z = \frac{41 - 40}{\sqrt{0.6}} = 1.29$
$P(\bar{X} > 41) = P(Z > 1.29) = 1 - \Phi(1.29) = 1 - 0.9015 = 0.0985$
(b) Method as (a) with $\bar{X} \sim N(40, 0.12)$
Ans: 0.0019
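The same method can be wrapped in a small helper and applied to both sample sizes. The function name `prob_mean_exceeds` is made up for this sketch. Its results should agree with the table-based answers 0·0985 and 0·0019 to about three decimal places; small differences arise because tables use z rounded to two decimal places.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, variance, n):
    """P(X-bar > threshold) when X-bar ~ N(mu, variance / n)."""
    x_bar = NormalDist(mu, sqrt(variance / n))  # NormalDist takes the s.d.
    return 1 - x_bar.cdf(threshold)

for n in (10, 50):
    p = prob_mean_exceeds(41, 40, 6, n)
    print(f"n = {n}: P(sample mean > 41) = {p:.4f}")
```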
[Figure: sketches of the distribution of $\bar{X}$ for n = 10 and n = 50, each centred on 40 with the area above 41 shaded: 0·0985 for n = 10 and 0·0019 for n = 50.]
With a sample size of 10, about 10% of the sample
means will lie above 41 but with a sample size of 50 only
about 0·2% will do so.
3. The random variable X has a distribution $X \sim N(50, 81)$.
(a) Write down the distribution of $\bar{X}$, the mean of
random samples of size 9 taken from X.
(b) Find the probability that $\bar{X}$ is less than 45.
Solution:
(a) $\bar{X} \sim N\left(50, \frac{81}{9}\right) \Rightarrow \bar{X} \sim N(50, 9)$
(b) We want $P(\bar{X} < 45)$.
$z = \frac{45 - 50}{\sqrt{9}} = -1.67$
$P(Z < -1.67) = 1 - \Phi(1.67) = 1 - 0.9525 = 0.0475$