Estimation - Lyle School of Engineering


BOBBY B. LYLE
SCHOOL OF ENGINEERING
EMIS - SYSTEMS ENGINEERING PROGRAM
SMU
EMIS 7370 STAT 5340
Department of Engineering Management, Information and Systems
Probability and Statistics for Scientists and Engineers
Estimation Basic Concepts &
Estimation of Proportions
Dr. Jerrell T. Stracener
1
Estimation
• Types of estimates and methods of estimation
• Estimation - Binomial distribution
- Estimation of a Proportion
- Estimation of the difference between two
proportions
• Estimation - Normal distribution
- Estimation of the mean
- Estimation of the standard deviation
- Estimation of the difference between two
means
3
Estimation
• Estimation - Normal distribution (continued)
- Estimation of the ratio of the two standard
deviations
- Tolerance intervals
• Estimation - Lognormal distribution
• Estimation - Weibull distribution
• Estimation - Unknown distribution Types
- Continuous populations
- Finite populations
4
Estimation
Types of Estimates & Methods of Estimation
5
Definition - Statistic
A statistic is a function of only the values of a
random sample, X1, X2, …, Xn.
For example,
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
is a statistic.
6
Properties of Estimates
A statistic $\hat{\theta}$ is said to be an unbiased estimator
of the parameter $\theta$ if
$\mu_{\hat{\theta}} = E(\hat{\theta}) = \theta$
If we consider all possible unbiased estimators of
some parameter $\theta$, the one with the smallest
variance is called the most efficient estimator
of $\theta$.
7
Types of Estimates
• Point Estimate
A function of the values of a random sample that
yields a single value, i.e., a point
• Interval Estimate
An interval, whose end points are functions of the
values of a random sample, for which one can assert
with a specified confidence that the interval contains
the parameter being estimated
8
Types of Estimates & Methods of Estimation
If we use a sample mean to estimate the mean
of a population, a sample proportion to estimate
the probability of success on an individual trial,
or a sample variance to estimate the variance of
a population, we are in each case using a point
estimate of the parameter in question. These
estimates are called point estimates since they are
single numbers, single points, used, respectively,
to estimate $\mu$, $p$, and $\sigma^2$. Since we can hardly
expect the point estimates based on samples to
hit the parameters they are supposed to estimate
exactly ‘on the nose’, it is often desirable to give an
interval rather than a single number.
9
Types of Estimates & Methods of Estimation
We can then assert with a certain probability (or
degree of confidence) that such an interval
contains the parameter it is intended to estimate.
For instance, when estimating the average IQ of
all college students in the US, we might arrive at
a point estimate of 117, or we might arrive at an
interval estimate to the effect that the interval
from 113 to 121 contains the ‘true’ average IQ of
all college students in the US.
10
Interval Estimates of $\mu$ for Different Samples
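The idea behind this slide, that different samples give different intervals and most of them contain the parameter, can be checked by simulation. The sketch below is illustrative only: it assumes a normal population with known standard deviation and uses the interval x̄ ± z·σ/√n, which this section does not derive. The function name and the values mu = 117, sigma = 15, and n = 25 are hypothetical choices, not taken from the slides.

```python
import random
from statistics import NormalDist, mean

def coverage_of_mean_intervals(mu=117.0, sigma=15.0, n=25, reps=1000,
                               conf=0.95, seed=1):
    """Fraction of simulated intervals for the mean that contain mu.

    Assumes a normal population with KNOWN sigma (an assumption made
    only for this illustration).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.96 for 95%
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Count the interval as a "hit" when it contains the true mean.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    print(coverage_of_mean_intervals())
```

The returned fraction should land near the stated confidence level; it varies with the seed and the number of repetitions.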
11
Method of Maximum Likelihood
Given independent observations x1, x2, ..., xn from
a probability density function (continuous case)
$f(x; \theta)$ or probability mass function (discrete case)
$p(x; \theta)$, the maximum likelihood estimator $\hat{\theta}$ is
that which maximizes the likelihood function
$L(X_1, X_2, \ldots, X_n; \theta) = f(X_1; \theta) \cdot f(X_2; \theta) \cdots f(X_n; \theta)$, if X is continuous,
$L(X_1, X_2, \ldots, X_n; \theta) = p(X_1; \theta) \cdot p(X_2; \theta) \cdots p(X_n; \theta)$, if X is discrete.
12
Method of Maximum Likelihood
Let x1, x2, ..., xn denote observed values in a
sample. In the case of a discrete random variable
the interpretation is very clear. The quantity
$L(x_1, x_2, \ldots, x_n; \theta)$, the likelihood of the sample,
is the following joint probability:
P(X1 = x1, X2 = x2, ... , Xn = xn)
This is the probability of obtaining the sample
values x1, x2, ..., xn. For the discrete case the
maximum likelihood estimator is one that results
in a maximum value for this joint probability, or
maximizes the likelihood of the sample.
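For the discrete case this can be tried numerically. The sketch below assumes independent 0/1 (success/failure) observations with success probability p, evaluates the log of the joint probability on a grid of candidate values, and returns the maximizer. The function names and the sample are hypothetical, and a grid search is used only for transparency, not as a recommended optimizer.

```python
import math

def log_likelihood(p, xs):
    """Log of P(X1 = x1, ..., Xn = xn) for independent 0/1 observations."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in xs)

def mle_by_grid(xs, steps=999):
    """Return the grid value of p in (0, 1) with the largest log-likelihood."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]  # 0.001, ..., 0.999
    return max(grid, key=lambda p: log_likelihood(p, xs))

if __name__ == "__main__":
    xs = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]   # hypothetical sample: 6 successes in 10
    print(mle_by_grid(xs), sum(xs) / len(xs))
```

For this sample the grid maximizer coincides with the sample proportion of successes, 0.6.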
13
Estimation of Proportions
14
Estimation of Proportions
Estimation of the proportion, p, based on a
random sample of the Binomial Distribution B(n,p)
• Point Estimation
• Interval Estimation
• Approximate Method
• Sample Size
• Exact Method
Estimation of the difference in two proportions
p1 - p2 based on random samples from
B(n1, p1) and B(n2, p2).
15
Estimation of Proportions
• Point Estimation
• Interval Estimation
16
Estimation - Binomial Distribution
Estimation of a Proportion, p
• X1, X2, …, Xn is a random sample of size n from
B(n, p), where
$X_i = \begin{cases} 1 & \text{if success} \\ 0 & \text{if failure} \end{cases}$
for $i = 1, \ldots, n$
• Point estimate of p:
$\hat{p} = \frac{f_s}{n} = \bar{X}$
where $f_s$ = # of successes
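As a short check, this point estimate can be tied back to the earlier definitions. The derivation below is a sketch, assuming the number of successes $f_s$ follows B(n, p): it shows that $\hat{p}$ maximizes the likelihood and is unbiased.

```latex
\begin{align*}
L(p) &= p^{f_s} (1 - p)^{n - f_s} \\
\frac{d}{dp} \ln L(p) &= \frac{f_s}{p} - \frac{n - f_s}{1 - p} = 0
  \;\Longrightarrow\; \hat{p} = \frac{f_s}{n} \\
E(\hat{p}) &= \frac{E(f_s)}{n} = \frac{np}{n} = p
\end{align*}
```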
17
Estimation - Binomial Distribution
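• Approximate (1 - $\alpha$)·100% confidence interval
for p:
$(p'_L, p'_U)$
where
$p'_L = \hat{p} - \Delta p$
and
$p'_U = \hat{p} + \Delta p$,
$\Delta p = Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$, with $\hat{q} = 1 - \hat{p}$,
and $Z_{\alpha/2}$ is the value of the standard normal
random variable Z such that
$P(Z > z_{\alpha/2}) = \frac{\alpha}{2}$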
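The approximate interval for a proportion can be sketched in a few lines. The function name `approx_proportion_ci` is hypothetical, and the normal quantile comes from Python's standard-library `statistics.NormalDist` rather than a printed table, so results can differ from hand calculations in the last digits.

```python
from statistics import NormalDist

def approx_proportion_ci(successes, n, conf=0.95):
    """Approximate (normal-theory) confidence interval for a proportion p."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # Z_{alpha/2}
    delta_p = z * (p_hat * q_hat / n) ** 0.5        # half-width of the interval
    return p_hat - delta_p, p_hat + delta_p

if __name__ == "__main__":
    # 340 successes in 500 trials, the sample used in this section's examples.
    low, high = approx_proportion_ci(340, 500)
    print(round(low, 3), round(high, 3))
```

The interval is only as trustworthy as the normal approximation, so it should not be relied on for small n or proportions near 0 or 1.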
18
Note
When n is small and the unknown proportion p is
believed to be close to 0 or to 1, the approximate
confidence interval procedure established here is
unreliable and, therefore, should not be used. To be
on the safe side, one should require both
$n\hat{p} \ge 5$
and
$n\hat{q} \ge 5$
19
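Error in Estimating p by $\hat{p}$
[Figure: number line marking $\hat{p} - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, $p$, $\hat{p}$, and $\hat{p} + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, with the error shown as the distance between $p$ and $\hat{p}$]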
20
Error in Estimating p by $\hat{p}$
• If $\hat{p}$ is used as an estimate of p, we can be
(1 - $\alpha$)·100% confident that the error will not
exceed $z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
• If $\hat{p}$ is used as an estimate of p, we can be
(1 - $\alpha$)·100% confident that the error will be less
than a specified amount e when the sample size is
$n = \frac{z_{\alpha/2}^2 \, \hat{p}\hat{q}}{e^2}$
21
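Error in Estimating p by $\hat{p}$
• If no preliminary value of $\hat{p}$ is used to choose the sample size, we can be
at least (1 - $\alpha$)·100% confident that the error will
not exceed a specified amount e when the sample
size is
$n = \frac{z_{\alpha/2}^2}{4e^2}$
This bound follows because $\hat{p}\hat{q} \le 1/4$ for every value of $\hat{p}$.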
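Both sample-size rules can be sketched as small helpers. The function names are hypothetical, the result is rounded up to the next whole observation, and the normal quantile is computed with `statistics.NormalDist` instead of the tabled 1.96.

```python
import math
from statistics import NormalDist

def sample_size_with_estimate(p_hat, e, conf=0.95):
    """Sample size when a preliminary estimate p_hat is available."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / e ** 2)

def sample_size_worst_case(e, conf=0.95):
    """Conservative sample size, using p_hat * q_hat <= 1/4."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 / (4 * e ** 2))

if __name__ == "__main__":
    print(sample_size_with_estimate(0.68, 0.02))   # preliminary estimate 0.68
    print(sample_size_worst_case(0.02))            # no preliminary estimate
```

With a preliminary estimate of 0.68 and e = 0.02 at 95% confidence, the helpers return 2090 and 2401 respectively.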
22
Example
In a random sample of n = 500 families owning
television sets in the city of Hamilton, Canada, it
is found that x = 340 subscribed to HBO.
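a. How large a sample is required if we want to
be 95% confident that our estimate of p, $\hat{p}$, is within
0.02 of p, treating these 500 families as a preliminary sample?
b. How large a sample is required if we want to be
at least 95% confident that our estimate of p is within
0.02 of p when no preliminary estimate is available?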
23
Example - Solution
a. Let us treat the 500 families as a preliminary
sample providing an estimate $\hat{p}$ = 0.68. Then
$n = \frac{(1.96)^2 (0.68)(0.32)}{(0.02)^2} \approx 2090$
Therefore, if we base our estimate of p on a
random sample of size 2090, we can be 95%
confident that our sample proportion will not
differ from the true proportion by more than
0.02.
24
Example - Solution
b. We shall now assume that no preliminary
sample has been taken to provide an estimate
of p. Consequently, we can be at least 95%
confident that our sample proportion will not
differ from the true proportion by more than
0.02 if we choose a sample of size
$n = \frac{(1.96)^2}{4 (0.02)^2} = 2401$
25
Example: Estimation of Binomial parameter p
In a random sample of n = 500 families owning
television sets in the city of Hamilton, Canada, it
was found that fS = 340 owned color sets. Estimate
the population proportion of families with color TV
sets and determine a 95% confidence interval for
the actual proportion of families in this city with
color sets.
26
Example: solution
The point estimate of p is $\hat{p}$ = 340/500 = 0.68.
Then, an approximate 95% confidence interval
for p is $(p_L, p_U)$ where
$p_L = \hat{p} - \Delta p$
and
$p_U = \hat{p} + \Delta p$
and
$\Delta p = Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \sqrt{\frac{(0.68)(0.32)}{500}} = 0.04089$
so that
$p_L = 0.68 - 0.04089 = 0.63911$
27
Example: solution
and $p_U = 0.68 + 0.04089 = 0.72089$, so
an approximate 95% confidence interval for p is
(0.63911, 0.72089).
Therefore, our “best” estimate (point estimate) of p is 0.68
and we are about 95% confident that p is between
0.64 and 0.72.
28
Estimation - Binomial Population
• Exact (1 - $\alpha$)·100% Confidence Interval for p:
$(P_L, P_U)$
$P_L = \frac{f_s F_1}{n - f_s + 1 + f_s F_1}$, where $F_1 = F_{1-\alpha/2;\, 2f_s,\, 2(n - f_s + 1)}$,
and
$P_U = \frac{(f_s + 1) F_2}{(n - f_s) + (f_s + 1) F_2}$, where $F_2 = F_{\alpha/2;\, 2(f_s + 1),\, 2(n - f_s)}$,
and $F_{\alpha, df_1, df_2}$ is the value of x for which
$P(X > F_{\alpha, df_1, df_2}) = \alpha$
NOTE: Use the ‘FINV’ function in Excel to get the
values of F1 and F2
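Where no F table or spreadsheet is at hand, the same exact limits can be obtained by solving the binomial tail equations directly: the lower limit is the p for which P(X ≥ fs) = α/2, and the upper limit is the p for which P(X ≤ fs) = α/2. The sketch below does this by bisection using only the Python standard library. It is an alternative route to the interval, not the F-based procedure itself, and the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(func, target, lo=0.0, hi=1.0, iters=60):
    """Solve func(p) = target on [lo, hi] for a func that decreases in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) > target:
            lo = mid      # func still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def exact_proportion_ci(fs, n, conf=0.95):
    """Exact confidence interval for p from fs successes in n trials."""
    alpha = 1 - conf
    # Lower limit: P(X >= fs | p) = alpha/2, i.e. P(X <= fs - 1 | p) = 1 - alpha/2.
    p_low = 0.0 if fs == 0 else _bisect(
        lambda p: binom_cdf(fs - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= fs | p) = alpha/2.
    p_up = 1.0 if fs == n else _bisect(
        lambda p: binom_cdf(fs, n, p), alpha / 2)
    return p_low, p_up

if __name__ == "__main__":
    # 5 errors in 25 audited records, as in the vehicle-records example.
    print(exact_proportion_ci(5, 25))
```

Bisection works here because the binomial cumulative probability decreases steadily as p increases, so each tail equation has a single root in (0, 1).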
29
Example
A random sample of 25 vehicle records is
selected for audit from a large number of county
records. It is found that 5 have errors. Estimate
the population proportion of vehicle records
having errors in terms of a point estimate and
95% confidence interval.
30
Example - solution
$\hat{p} = \bar{x} = \frac{f_s}{n} = \frac{5}{25} = 0.20$
An approximate 95% confidence interval for p is
$(p'_L, p'_U)$
where
$\alpha$ = 0.05
and $Z_{\alpha/2} = Z_{0.025} = 1.96$
31
Example - solution
Then
$p'_L = \hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
$= 0.20 - 1.96 \sqrt{\frac{(0.2)(0.8)}{25}}$
$= 0.20 - 1.96 (0.08)$
$= 0.20 - 0.157$
$= 0.043$
32
Example - solution
$p'_U = \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
$= 0.20 + 0.157$
$= 0.357$
33
Example - solution
An exact 95% confidence interval for p is $(p_L, p_U)$
where
$p_L = \frac{f_s F_{1-\alpha/2,\, 2f_s,\, 2(n - f_s + 1)}}{n - f_s + 1 + f_s F_{1-\alpha/2,\, 2f_s,\, 2(n - f_s + 1)}}$
$= \frac{5 F_{0.975,10,42}}{21 + 5 F_{0.975,10,42}}$
$= \frac{5 (0.308)}{21 + 5 (0.308)}$
$= 0.068$
34
Example - solution
and
$p_U = \frac{(f_s + 1) F_{\alpha/2,\, 2(f_s + 1),\, 2(n - f_s)}}{(n - f_s) + (f_s + 1) F_{\alpha/2,\, 2(f_s + 1),\, 2(n - f_s)}}$
$= \frac{6 F_{0.025,12,40}}{20 + 6 F_{0.025,12,40}}$
$= \frac{6 (2.29)}{20 + 6 (2.29)}$
$= 0.407$
35
Estimation - Binomial Populations
Estimation of the difference between two
proportions
• Let X11, X12, …, X1n1, and X21, X22, …, X2n2
be random samples from B(n1, p1) and
B(n2, p2), respectively
• Point estimation of $\Delta p = p_1 - p_2$:
$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2 = \bar{X}_1 - \bar{X}_2 = \frac{f_1}{n_1} - \frac{f_2}{n_2}$
36
Estimation - Binomial Populations
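• Approximate (1 - $\alpha$)·100% confidence interval
for $\Delta p = p_1 - p_2$:
$(\Delta p_L, \Delta p_U)$
where
$\Delta p_L = \Delta\hat{p} - Z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
and
$\Delta p_U = \Delta\hat{p} + Z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$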
37
Example: Estimation of P1 - P2
A certain change in a manufacturing process for
component parts is being considered. Samples
are taken using both the existing and the new
procedure in order to determine if the new
procedure results in an improvement. If 75 of
1500 items from the existing procedure were
found to be defective and 80 of 2000 items from
the new procedure were found to be defective,
find a 90% confidence interval for the true difference
in the fraction of defectives between the existing
and the new process.
38
Example: solution
Let p1 and p2 be the true proportions of defectives
for the existing and new procedures, respectively.
Hence
$\hat{p}_1 = \frac{75}{1500} = 0.05$
and
$\hat{p}_2 = \frac{80}{2000} = 0.04$
and the point estimate of $\Delta p = p_1 - p_2$ is
$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2$
= 0.05 - 0.04
= 0.01
39
Example: solution
An approximate 90% confidence interval for
$\Delta p = p_1 - p_2$ is $(\Delta p'_L, \Delta p'_U)$
where
40
Example: solution
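$\Delta p'_L = \Delta\hat{p} - \delta$
$\Delta p'_U = \Delta\hat{p} + \delta$
$\delta = Z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
$= 1.645 \sqrt{\frac{(0.05)(0.95)}{1500} + \frac{(0.04)(0.96)}{2000}}$
$= 0.011732$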
41
Example: solution
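Then
$\Delta p'_L = 0.01 - 0.011732 = -0.001732$
$\Delta p'_U = 0.01 + 0.011732 = 0.021732$
Therefore an approximate 90% confidence interval
for $\Delta p = p_1 - p_2$ is (-0.0017, 0.0217).
$r_{N|E} = \frac{L_N}{L} = \frac{L_N}{L_E + L_N} = \frac{0.0217}{0.0234} = 0.927$
where $L_N$ = 0.0217 is the length of the interval above zero and $L_E$ = 0.0017 is the length below zero,
or about 93% of the length of the confidence
interval favors the new procedure.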
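The two-sample calculation can be reproduced with a short helper. This is a sketch of the approximate (normal-theory) interval only; the function name is hypothetical, and the quantile is taken from `statistics.NormalDist` rather than the tabled 1.645, so the last digits may differ slightly from the hand calculation.

```python
from statistics import NormalDist

def diff_proportions_ci(f1, n1, f2, n2, conf=0.90):
    """Approximate confidence interval for p1 - p2 from two binomial samples."""
    p1, p2 = f1 / n1, f2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)     # Z_{alpha/2}
    half_width = z * (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    diff = p1 - p2                                   # point estimate of p1 - p2
    return diff - half_width, diff + half_width

if __name__ == "__main__":
    # 75 defectives in 1500 (existing) versus 80 in 2000 (new).
    low, high = diff_proportions_ci(75, 1500, 80, 2000)
    print(round(low, 4), round(high, 4))
```

Because the resulting interval includes zero, the data do not rule out equal defect rates at the 90% level, even though most of the interval lies above zero.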
42