chapter2(1) - Portal UniMAP


CHAPTER 2
Statistical Inference
2.1 Estimation
- Confidence Interval Estimation for Mean and Proportion
- Determining Sample Size
2.2 Hypothesis Testing
- Tests for one and two means
- Tests for one and two proportions
Statistical Inference
- The field of statistical inference consists of those methods used to make decisions or to draw conclusions about a population.
- These methods utilize the information contained in a sample from the population in drawing conclusions.
- Statistical inference may be divided into two major areas: parameter estimation and hypothesis testing.
In estimation, you will learn:
- To construct and interpret confidence interval estimates for the mean and the proportion.
- How to determine the sample size necessary to develop a confidence interval for the mean or proportion.
Estimation
- In interval estimation, an interval is constructed around the point estimate, and it is stated that this interval is likely to contain the corresponding population parameter.
- Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by $(1 - \alpha)100\%$.
[Figure: a confidence interval centred on the point estimate, running from the lower confidence limit to the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
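The confidence level can be read as a long-run coverage rate: if many samples were drawn and an interval built from each, roughly $(1 - \alpha)100\%$ of those intervals would contain the true parameter. The sketch below illustrates this by simulation. It assumes a normal population with known standard deviation; the values `mu=50`, `sigma=10`, `n=25` and the function name `coverage` are illustrative choices, not taken from the slides.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    margin = z * sigma / n ** 0.5                   # half-width of each interval
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(f"Observed coverage: {rate:.3f}")  # expected to be close to 0.95
```

The observed rate varies slightly from run to run when the seed changes, but it should stay near the stated confidence level.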
Confidence Intervals
[Diagram: confidence intervals are constructed for the population mean (σ known or σ unknown) and for the population proportion.]
EQT 373
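Confidence Interval Estimates for Population Mean, μ
i) $\sigma$ is known:
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\text{or}\quad \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
ii) $\sigma$ is unknown, $n \ge 30$:
$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \quad\text{or}\quad \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$
iii) $\sigma$ is unknown, $n < 30$:
$$\bar{x} \pm t_{n-1,\,\alpha/2}\frac{s}{\sqrt{n}} \quad\text{or}\quad \bar{x} - t_{n-1,\,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\,\alpha/2}\frac{s}{\sqrt{n}}$$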
Example
A random sample of size $n = 20$ from a normal population with variance $\sigma^2 = 225$ has mean $\bar{x} = 64.3$. Construct a 95% confidence interval for the population mean, μ.
Solution:
Given $n = 20$, $\bar{x} = 64.3$ and $\sigma = 15$.
For a 95% confidence interval, we have
$95\% = 100(1 - \alpha)\%$
$1 - \alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025$
From the standard normal table: $z_{\alpha/2} = z_{0.025} = 1.96$
Hence, the 95% CI is
$$\bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = 64.3 \pm 1.96\left(\frac{15}{\sqrt{20}}\right) = 64.3 \pm 6.57 = [57.73, 70.87]$$
or, equivalently, $57.73 < \mu < 70.87$.
Thus, we are 95% confident that the population mean μ is between 57.73 and 70.87.
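The arithmetic in this solution can be checked with a few lines of Python. This is a minimal sketch for the σ-known case only; the helper name `z_interval_mean` is hypothetical, and the critical value comes from the standard library's `NormalDist` rather than from a printed table, so it carries more decimal places than 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_mean(xbar, sigma, n, level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)                   # z_{alpha/2} * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Values from the example: n = 20, sample mean 64.3, sigma = sqrt(225) = 15.
low, high = z_interval_mean(64.3, 15, 20, 0.95)
print(f"[{low:.2f}, {high:.2f}]")  # [57.73, 70.87]
```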
Example
A publishing company has just published a new textbook.
Before the company decides the price at which to sell this
textbook, it wants to know the average price of all such textbooks
in the market.
The research department at the company took a sample of 36
comparable textbooks and collected the information on their
prices. This information produced a mean price RM 70.50 for this
sample. It is known that the standard deviation of the prices of all
such textbooks is RM4.50.
Construct a 90% confidence interval for the mean price of all
such college textbooks.
Solution:
Given $n = 36$, $\bar{x} = \text{RM}70.50$ and $\sigma = \text{RM}4.50$.
For a 90% confidence interval, we have
$90\% = 100(1 - \alpha)\%$
$1 - \alpha = 0.90 \Rightarrow \alpha = 0.1 \Rightarrow \alpha/2 = 0.05$
From the standard normal table: $z_{\alpha/2} = z_{0.05} = 1.65$
Hence, the 90% CI is
$$\bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = 70.50 \pm 1.65\left(\frac{4.50}{\sqrt{36}}\right) = 70.50 \pm 1.24 = [\text{RM}69.26, \text{RM}71.74]$$
Thus, we are 90% confident that the mean price of all such college textbooks is between RM69.26 and RM71.74.
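Confidence Interval Estimates for the Difference between Two Population Means, $\mu_1 - \mu_2$
i) $\sigma_1$ and $\sigma_2$ are known:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
ii) $\sigma_1$ and $\sigma_2$ are unknown, $n_1 \ge 30$, $n_2 \ge 30$:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
iii) $\sigma_1$ and $\sigma_2$ are unknown, $n_1 < 30$, $n_2 < 30$:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$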
Example
A scientist wondered whether there was a difference in the average daily intakes of dairy products between men and women. He took a sample of n = 50 adult women and recorded their daily intakes of dairy products in grams per day. He did the same for adult men. A summary of his sample results is listed below.

                             Men                  Women
Sample size                  50                   50
Sample mean                  780 grams per day    762 grams per day
Sample standard deviation    35                   30

Construct a 95% confidence interval for the difference in the average daily intakes of dairy products for men and women. Can you conclude that there is a difference in the average daily intakes of dairy products for men and women?
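Solution:
For a 95% confidence interval, we have
$95\% = 100(1 - \alpha)\%$
$1 - \alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025$
$z_{\alpha/2} = z_{0.025} = 1.96$
Taking men as sample 1 and women as sample 2:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (780 - 762) \pm 1.96\sqrt{\frac{35^2}{50} + \frac{30^2}{50}} = 18 \pm 12.78 = [5.22, 30.78]$$
Thus, we conclude that there is a difference in the average daily intakes for men and women: the interval does not contain 0, so $\mu_1 - \mu_2 \neq 0$.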
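The two-sample calculation follows the same pattern as the one-sample case, with the standard error built from both samples. A minimal sketch for the large-sample case (both sample sizes at least 30); the helper name `z_interval_diff_means` is hypothetical, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_diff_means(x1, s1, n1, x2, s2, n2, level):
    """Large-sample confidence interval for mu1 - mu2."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    se = sqrt(s1**2 / n1 + s2**2 / n2)             # standard error of x1 - x2
    diff = x1 - x2
    return diff - z * se, diff + z * se

# Dairy intake example: men are sample 1, women are sample 2.
low, high = z_interval_diff_means(780, 35, 50, 762, 30, 50, 0.95)
print(f"[{low:.2f}, {high:.2f}]")            # [5.22, 30.78]
print("Interval contains 0:", low <= 0 <= high)  # False
```

Checking whether 0 lies inside the interval is what supports the conclusion about a difference between the two means.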
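Confidence Interval Estimates for Population Proportion
- The CI for p for $n \ge 30$:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
or
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$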
Example
According to the analysis of Women Magazine in June 2005, “Stress has become a common part of everyday life among working women in Malaysia. The demands of work, family and home place an increasing burden on average Malaysian women”. According to this poll, 40% of working women included in the survey indicated that they had a limited amount of time to relax. The poll was based on a randomly selected sample of 1502 working women aged 30 and above.
Construct a 95% confidence interval for the corresponding population proportion.
Solution:
Let p be the proportion of all working women aged 30 and above who have a limited amount of time to relax, and let $\hat{p}$ be the corresponding sample proportion. From the given information, $n = 1502$ and $\hat{p} = 0.40$.
Hence, the 95% CI is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.40 \pm 1.96\sqrt{\frac{0.40(0.60)}{1502}} = 0.40 \pm 0.02478 = [0.375, 0.425] \text{ or } 37.5\% \text{ to } 42.5\%$$
Thus, we can state with 95% confidence that the proportion of all working women aged 30 and above who have a limited amount of time to relax is between 37.5% and 42.5%.
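The proportion interval can be reproduced in the same way. A minimal sketch, assuming the large-sample normal approximation used in the slides; the helper name `z_interval_proportion` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_proportion(p_hat, n, level):
    """Large-sample confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)     # z * estimated standard error
    return p_hat - margin, p_hat + margin

# Poll example: 40% of 1502 working women.
low, high = z_interval_proportion(0.40, 1502, 0.95)
print(f"[{low:.3f}, {high:.3f}]")  # [0.375, 0.425]
```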
Confidence Interval Estimates for the Difference between Two Population Proportions
The CI for $p_1 - p_2$ given $n_1 \ge 30$ and $n_2 \ge 30$:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
Example
A researcher wants to estimate the difference between the percentages of users of two toothpastes who will never switch to another toothpaste.
In a sample of 500 users of Toothpaste A taken by this researcher, 100 said that they will never switch to another toothpaste. In another sample of 400 users of Toothpaste B taken by the same researcher, 68 said that they will never switch to another toothpaste.
Construct a 97% confidence interval for the difference between the proportions of all users of the two toothpastes who will never switch.
Solution:
Toothpaste A: $n_1 = 500$ and $x_1 = 100$, so $\hat{p}_1 = \frac{100}{500} = 0.20$
Toothpaste B: $n_2 = 400$ and $x_2 = 68$, so $\hat{p}_2 = \frac{68}{400} = 0.17$
97% confidence interval, with $\alpha/2 = 0.015$ and $z_{0.015} = 2.17$:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = (0.20 - 0.17) \pm 2.17\sqrt{\frac{0.20(0.80)}{500} + \frac{0.17(0.83)}{400}} = 0.03 \pm 0.05628 = [-0.026, 0.086]$$
Thus, with 97% confidence we can state that the difference between the proportions of all users of the two toothpastes who will never switch is between -0.026 and 0.086.
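A 97% level is less common in printed tables, so it is a useful case to verify numerically. The sketch below computes the critical value directly and reproduces the interval; the helper name `z_interval_diff_proportions` is hypothetical, and the large-sample normal approximation from the slides is assumed.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_diff_proportions(x1, n1, x2, n2, level):
    """Large-sample confidence interval for p1 - p2 from success counts."""
    p1, p2 = x1 / n1, x2 / n2                      # sample proportions
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}, about 2.17 at 97%
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Toothpaste example: 100 of 500 users of A, 68 of 400 users of B.
low, high = z_interval_diff_proportions(100, 500, 68, 400, 0.97)
print(f"[{low:.3f}, {high:.3f}]")  # [-0.026, 0.086]
```

Because this interval includes 0, the data do not rule out equal proportions for the two toothpastes at the 97% level.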
Determining Sample Size
[Diagram: the sample size is determined either for the mean or for the proportion.]
Sampling Error
- The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence $(1 - \alpha)$.
- The margin of error is also called sampling error:
  - the amount of imprecision in the estimate of the population parameter
  - the amount added and subtracted to the point estimate to form the confidence interval
Determining Sample Size for population mean problems
Confidence interval:
$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Sampling error (margin of error):
$$e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
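The sample-size formula for the mean follows from solving the margin-of-error expression for n. A short derivation, using the same symbols as the sampling error formula:

```latex
\begin{aligned}
e &= Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{Z_{\alpha/2}\,\sigma}{e} \\
n &= \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}}
\end{aligned}
```

Since n must be a whole number and a smaller sample would give a larger margin of error, the result is rounded up.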
Example
- If $\sigma = 45$, what sample size is needed to estimate the mean within ±5 with 90% confidence?
$$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$
- So the required sample size is n = 220 (always round up).
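The rounding-up step is easy to get wrong by hand, so here is a minimal sketch of the calculation. The helper name `sample_size_mean` is hypothetical; the critical value is computed rather than taken from a table, so the unrounded result differs slightly from 219.19.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, e, level):
    """Smallest n so that the margin of error for the mean is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    return ceil((z * sigma / e) ** 2)              # always round up

# Example: sigma = 45, margin of error 5, 90% confidence.
n_required = sample_size_mean(45, 5, 0.90)
print(n_required)  # 220
```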
Determining Sample Size for population proportion problems
$$e = Z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Now solve for n to get
$$n = \frac{Z^2\,\hat{p}(1-\hat{p})}{e^2}$$
Example
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence? (Assume a sample yields $\hat{p} = 0.12$.)
Solution:
For 95% confidence, we have $Z_{\alpha/2} = 1.96$, $e = 0.03$ and $\hat{p} = 0.12$.
$$n = \frac{Z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{e^2} = \frac{(1.96)^2(0.12)(1 - 0.12)}{(0.03)^2} = 450.74$$
So use n = 451.
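The same calculation for a proportion, as a minimal sketch. The helper name `sample_size_proportion` is hypothetical, and `p_hat` is the preliminary estimate assumed in the example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(p_hat, e, level):
    """Smallest n so that the margin of error for a proportion is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    return ceil(z**2 * p_hat * (1 - p_hat) / e**2)  # always round up

# Example: p_hat = 0.12, margin of error 0.03, 95% confidence.
n_required = sample_size_proportion(0.12, 0.03, 0.95)
print(n_required)  # 451
```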
Example
A team of efficiency experts intends to use the mean of a random sample of size n = 150 to estimate the average mechanical aptitude of assembly-line workers in a large industry (as measured by a certain standardized test). If, based on experience, the efficiency experts can assume that $\sigma = 6.2$ for such data, what can they assert with probability 0.99 about the maximum error of their estimate?
Solution:
Substituting $n = 150$, $\sigma = 6.2$ and $z_{0.005} = 2.575$ into the expression for the maximum error, we get
$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.575 \cdot \frac{6.2}{\sqrt{150}} \approx 1.30$$
Thus, the efficiency experts can assert with probability 0.99 that their error will be less than 1.30.
Example
A study is made to determine the proportion of voters in a sizable community who favor the construction of a nuclear power plant. If 140 of 400 voters selected at random favor the project and we use $\hat{p} = \frac{140}{400} = 0.35$ as an estimate of the actual proportion of all voters in the community who favor the project, what can we say with 99% confidence about the maximum error?
Solution:
Substituting $n = 400$, $\hat{p} = 0.35$ and $z_{0.005} = 2.575$ into the formula, we get
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 2.575\sqrt{\frac{(0.35)(0.65)}{400}} \approx 0.061$$
Thus, if we use $\hat{p} = 0.35$ as an estimate of the actual proportion of voters in the community who favor the project, we can assert with 99% confidence that the error is less than 0.061.
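Both maximum-error examples (the aptitude test and the voter survey) reduce to multiplying a critical value by a standard error. A minimal sketch covering the two cases; the helper names `max_error_mean` and `max_error_proportion` are hypothetical, and the critical value is computed rather than taken as the tabulated 2.575.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def max_error_mean(sigma, n, level):
    """Maximum error of the sample mean when sigma is known."""
    return z_critical(level) * sigma / sqrt(n)

def max_error_proportion(p_hat, n, level):
    """Maximum error of a sample proportion (large-sample approximation)."""
    return z_critical(level) * sqrt(p_hat * (1 - p_hat) / n)

e_mean = max_error_mean(6.2, 150, 0.99)         # aptitude test example
e_prop = max_error_proportion(0.35, 400, 0.99)  # voter survey example
print(f"{e_mean:.2f}")  # 1.30
print(f"{e_prop:.3f}")  # 0.061
```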