Central Limit Theorem - University of Massachusetts Amherst

Central Limit Theorem
• Example:
(NOTE THAT THE ANSWER IS CORRECTED COMPARED TO NOTES5.PPT)
– 5 chemists independently synthesize a compound 1 time each.
– Each reaction should produce 10ml of a substance.
– Historically, the amount produced by each reaction has been normally distributed with std dev 0.5ml.
1. What’s the probability that less than 49.8mls of the substance are made in total?
2. What’s the probability that the average amount produced is more than 10.1ml?
3. Suppose the average amount produced is more than 11.0ml. Is that a rare event? Why or why not? If more than 11.0ml are made, what might that suggest?
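Answer:
• Central limit theorem:
If E(Xi) = μ and Var(Xi) = σ^2 for all i (and the Xi are independent), then for large n, approximately:
X1+…+Xn ~ N(nμ, nσ^2)
(X1+…+Xn)/n ~ N(μ, σ^2/n)
(If each Xi is itself normally distributed, these hold exactly for any n.)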
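The statement above can be checked by simulation. The following is a minimal sketch, not part of the original notes: it assumes Uniform(0, 1) draws (mean 1/2, variance 1/12), and the values of n, reps and the seed are arbitrary illustrative choices.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

n = 30         # hypothetical number of terms in each average
reps = 20_000  # hypothetical number of simulated averages

# Uniform(0, 1) has mean 1/2 and variance 1/12, and is clearly not normal.
mu, var = 0.5, 1 / 12
se = sqrt(var / n)  # standard deviation of the average predicted by the CLT

averages = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(reps)
]

sim_mean = statistics.fmean(averages)
sim_var = statistics.pvariance(averages)
# If the averages are roughly normal, about 95% lie within 1.96 std errors.
within = sum(abs(a - mu) < 1.96 * se for a in averages) / reps

print(f"mean of averages:     {sim_mean:.4f} (predicted {mu:.4f})")
print(f"variance of averages: {sim_var:.5f} (predicted {var / n:.5f})")
print(f"fraction within 1.96 std errors: {within:.3f} (normal predicts 0.950)")
```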
Lab:
1. Let Y = total amount made.
Y ~ N(5*10, 5*0.5^2) = N(50, 1.25) (by CLT), so std dev of Y = sqrt(1.25) = 1.12
Pr(Y < 49.8) = Pr[(Y-50)/1.12 < (49.8-50)/1.12]
= Pr(Z < -0.18) = 0.43
2. Let W = average amount made.
W ~ N(10, 0.5^2/5) = N(10, 0.05) (by CLT), so std dev of W = sqrt(0.05) = 0.22
Pr(W > 10.1) = Pr[Z > (10.1 – 10)/0.22]
= Pr(Z > 0.45) = 0.33
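As a check on parts 1 and 2, the two probabilities can be computed without a Z table. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Because it does not round the z-scores to two decimals, the results agree with 0.43 and 0.33 only after rounding.

```python
from math import sqrt
from statistics import NormalDist

n = 5          # number of chemists
mean_ml = 10.0 # expected amount per reaction
sd_ml = 0.5    # std dev per reaction

# Total amount: N(n*mean, n*sd^2), so the std dev is sd*sqrt(n).
total = NormalDist(mu=n * mean_ml, sigma=sd_ml * sqrt(n))
# Average amount: N(mean, sd^2/n), so the std dev is sd/sqrt(n).
average = NormalDist(mu=mean_ml, sigma=sd_ml / sqrt(n))

p_total = total.cdf(49.8)        # Pr(Y < 49.8)
p_avg = 1 - average.cdf(10.1)    # Pr(W > 10.1)

print(f"Pr(Y < 49.8) = {p_total:.2f}")
print(f"Pr(W > 10.1) = {p_avg:.2f}")
```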
Lab (continued)
3. One definition of rare:
It’s a rare event if Pr(W > 11.0) is small
(i.e. if “the probability of seeing 11.0 or something more extreme is small”)
Pr(W > 11) = Pr[Z > (11-10)/0.22]
= Pr(Z > 4.55) = approximately zero.
This suggests that perhaps either the true mean is not 10 or the true std dev is not 0.5 (or the amounts are not normally distributed…)
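To put a number on "approximately zero", the tail probability can be evaluated directly. This is a sketch with `statistics.NormalDist`; it uses the unrounded std dev 0.5/sqrt(5), so its z-score is a little below the 4.55 obtained with 0.22.

```python
from math import sqrt
from statistics import NormalDist

# Average of 5 reactions: N(10, 0.5^2/5).
average = NormalDist(mu=10.0, sigma=0.5 / sqrt(5))

z = (11.0 - 10.0) / average.stdev
p_rare = 1 - average.cdf(11.0)   # Pr(W > 11.0)

print(f"z = {z:.2f}")
print(f"Pr(W > 11.0) = {p_rare:.2e}")  # a few in a million
```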
Sample size: 1006
(source: gallup.com)
• Let Xi = 1 if person i thinks the President
is hiding something and 0 otherwise.
• Suppose E(Xi) = p and Var(Xi) = p(1-p) and
each person’s opinion is independent.
• Let Y = total number of “yesses” = X1+…+X1006
(Note that this definition turns three outcomes into two outcomes.)
• Y ~ Bin(1006,p)
• Suppose p = 0.36 (this is the estimate…)
• What is Pr(Y < 352)?
Normal Approximation to the binomial CDF
Pr(Y<352) = Pr(Y=0)+…+Pr(Y=351),
where Pr(Y=k) = (1006 choose k) 0.36^k 0.64^(1006-k)
– Even with computers, as n gets large, computing things like this can become difficult. (1006 is OK, but how about 1,000,000?)
– Idea: Use the central limit theorem to approximate this probability
– Y is approximately
N[1006*0.36, (0.36)*(0.64)*1006]
= N(362.16, 231.8) (by central limit theorem), so std dev of Y = sqrt(231.8) = 15.2
Pr(Y<352) is approximately Pr[(Y-362.16)/15.2 < (352-362.16)/15.2]
= Pr(Z < -0.67) = 0.25
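For n = 1006 the exact sum is still easy to evaluate, so the approximation can be compared against it. The sketch below uses `math.comb` for the binomial coefficients and `statistics.NormalDist` for the normal curve. Summing terms this way is an illustrative choice that works at this n; for much larger n the integer coefficients become too large to convert to floating point, which is one reason the approximation is useful.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 1006, 0.36, 352

# Exact: Pr(Y < 352) = Pr(Y=0) + ... + Pr(Y=351) for Y ~ Bin(1006, 0.36).
exact = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k))

# Normal approximation: Y is approximately N(np, np(1-p)).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(k)

print(f"exact binomial sum:   {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```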
Normal Approximation to the binomial CDF
Blue line is plot of Normal(362.16,231.8) pdf
Black “step function” is plot of bin(1006,0.36) probabilities (pmf) versus Y (integers)
Normal Approximation to the binomial CDF
Area under the blue curve to the left of 352 is approximately equal to the sum of the areas of the rectangles (black step function) to the left of 352
Comments about normal approximation of the binomial:
Rule of thumb is that it’s OK if np>5 and n(1-p)>5.
“Continuity correction”
Y is binomial, so it takes only integer values.
If we use the normal approximation to the probability that Y ≤ k, we should calculate Pr(Y < k+.5)
If we use the normal approximation to the probability that Y ≥ k, we should calculate Pr(Y > k-.5)
(see picture on board)
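The effect of the half-unit shift can be seen numerically. The following sketch reuses the poll numbers (n = 1006, p = 0.36) and compares the exact binomial value of Pr(Y ≤ 351) with the normal approximation evaluated at 351 and at 351.5. The function names are illustrative, not from the notes.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_le(k, n, p):
    """Exact Pr(Y <= k) for Y ~ Bin(n, p), by direct summation."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_le(k, n, p, correct=True):
    """Normal approximation to Pr(Y <= k), optionally continuity-corrected."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5 if correct else k)


def normal_ge(k, n, p, correct=True):
    """Normal approximation to Pr(Y >= k), optionally continuity-corrected."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return 1 - approx.cdf(k - 0.5 if correct else k)


n, p = 1006, 0.36
exact = binom_le(351, n, p)
plain = normal_le(351, n, p, correct=False)
corrected = normal_le(351, n, p, correct=True)

print(f"exact     Pr(Y <= 351): {exact:.4f}")
print(f"normal at 351:          {plain:.4f}")
print(f"normal at 351.5:        {corrected:.4f}")
```

With the correction, the approximations to Pr(Y ≤ 351) and Pr(Y ≥ 352) use the same cut point 351.5, so they add up to 1, as the exact probabilities do.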
Probability meaning of 6 sigma
• Even if you shift the process mean from the center of the specifications by 1.5 standard deviations toward one of the specifications, you will expect no more than 3.4 defects out of a million outside of the specification toward which you shifted.
• (I know it’s convoluted, but that’s the definition…)
What does 6 sigma mean? (example)
• Suppose a product has a quantitative specification:
ex: “Make the gap between the car door and the car body between 3.4 and 4.6mm.”
• When cars are actually made, the “std dev of car door gap is 0.1mm”. i.e. X1,…,Xn are gap widths, and sqrt(sample variance of X1,…,Xn) = 0.1mm
Statistically, six sigma means that
Upper Spec – Lower Spec ≥ 12 sigma
(i.e. Specs are fixed. Lower the manufacturing process variability.)
Distribution of gap widths
[Figure: normal curve of gap widths. Lower specification = 3.4mm, shifted mean = 3.85mm gap, center of spec = 4mm gap, upper specification = 4.6mm.]
4.6 – 3.4 = 1.2 = 12*0.1 = 12*sigma
Probability of being out in the lower tail is
Pr(gap is less than 3.4) = Pr((gap – 3.85)/0.1 < (3.4-3.85)/0.1)
= Pr(Z < -4.5) = 3.4/1,000,000
(3.4 per million is the arbitrary “magic” number for 6σ.)
Probability meaning of 6 sigma
In general:
Assume the process mean is 1.5 standard deviations toward the lower spec:
i.e. E(X) = 4-1.5σ and assume X has a normal distribution.
When the process is in control enough so that the distance between the center of the specs and the lower spec is at least 6σ, then
Pr(X below lower spec) ≤ Pr(X < 4-6σ)
= Pr[(X-(4-1.5σ))/σ < (4-6σ-(4-1.5σ))/σ]
= Pr(Z < -4.5) = 3.4/1,000,000
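The 3.4-per-million figure can be reproduced from the car door numbers (center of spec 4mm, lower spec 3.4mm, std dev 0.1mm, mean shifted 1.5 standard deviations toward the lower spec). A minimal sketch with `statistics.NormalDist`; variable names are illustrative:

```python
from statistics import NormalDist

center = 4.0      # center of the specification (mm)
lower_spec = 3.4  # lower specification (mm)
sigma = 0.1       # process std dev (mm)

# Process mean shifted 1.5 standard deviations toward the lower spec.
shifted_mean = center - 1.5 * sigma   # 3.85 mm

# Pr(gap < lower spec) for gap ~ N(3.85, 0.1^2), i.e. Pr(Z < -4.5).
p_defect = NormalDist(mu=shifted_mean, sigma=sigma).cdf(lower_spec)
per_million = p_defect * 1_000_000

print(f"defects per million below the lower spec: {per_million:.1f}")
```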
Control Charts
• Let X = an average of n measurements.
• Each measurement has mean μ and variance σ^2.
• Fact:
– By the central limit theorem, almost all observations of X fall in the interval
μ +/- 3σ/sqrt(n)
(i.e. mean +/- 3 standard deviations of X)
– σ/sqrt(n) is also called σ_X or the standard error
Use the “fact” to detect changes in production quality
• Idea: let xi = average door gap from the n cars made by shift i at the car plant
[Figure: control chart of x1, …, x8 plotted against shift, with a center line at μ, the Upper Control Limit at μ + 3σ/sqrt(n), and the Lower Control Limit at μ - 3σ/sqrt(n).]
Points outside the +/- 3 std error bounds are called “out of control”. They are evidence that μ and/or σ are not the true mean and std dev any more, and the process needs to be readjusted. Calculate the “false alarm rate”… (= 26/10,000)
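The false alarm rate is the chance that an in-control average falls outside the 3 standard error limits, that is 2*Pr(Z < -3). With the table value Pr(Z < -3) = 0.0013 this gives 26/10,000; evaluated without rounding it is closer to 27/10,000. The sketch below also flags out-of-control shifts. The number of cars per shift (n = 25) and the list of shift averages are hypothetical values made up for illustration, and the function names are not from the notes.

```python
from math import sqrt
from statistics import NormalDist


def control_limits(mu, sigma, n):
    """Return (lower, upper) control limits: mu -/+ 3*sigma/sqrt(n)."""
    se = sigma / sqrt(n)  # standard error of an average of n measurements
    return mu - 3 * se, mu + 3 * se


def out_of_control(shift_means, mu, sigma, n):
    """Return the 1-based shift numbers whose average is outside the limits."""
    lcl, ucl = control_limits(mu, sigma, n)
    return [i for i, x in enumerate(shift_means, start=1) if x < lcl or x > ucl]


# Chance that an in-control average lands outside the limits.
false_alarm_rate = 2 * NormalDist().cdf(-3)

# Hypothetical example: door gap with mu = 4.0mm, sigma = 0.1mm, 25 cars per shift.
mu, sigma, n = 4.0, 0.1, 25
shift_means = [4.01, 3.98, 4.03, 3.99, 4.00, 4.08, 4.02, 3.97]  # made-up data

lcl, ucl = control_limits(mu, sigma, n)
flagged = out_of_control(shift_means, mu, sigma, n)

print(f"false alarm rate: {false_alarm_rate:.4f}")
print(f"control limits: {lcl:.2f} to {ucl:.2f}")
print(f"out-of-control shifts: {flagged}")
```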
Assume 100 new people are
polled.
Assume true pr( a new
person says yes) = 0.36.
Let P-hat = number who say yes/100
What’s an approximation to the distribution of P-hat?
Use the approximation to determine a number so that Pr(P-hat < that number) = 0.95.
EXAMPLE OF SAMPLING DISTRIBUTION OF P-HAT
Xk = 1 if person k says yes and 0 if not.
Note that E(Xk)=0.36=p and
Var(Xk)=0.36*0.64=p(1-p)
Note that Xk is binomial(1,0.36).
P-hat = (X1+…+X100)/100. By CLT, P-hat is
approximately N(0.36,0.36*0.64/100).
(Rule of thumb is that this approximation is
good if np>5 and n(1-p)>5.)
• Suppose true p is 0.36.
• If survey is conducted again on 100 people, then
P-hat ~ N(.36, (.36)(.64)/100) = N(.36, 0.002304) approximately, so std dev of P-hat = sqrt(0.002304) = 0.048
Want p0 so that Pr(P-hat<p0) = 0.95
Pr(P-hat<p0) = 0.95 means
Pr(Z < (p0-.36)/0.048) = 0.95.
Since Pr(Z<1.645) = 0.95,
(p0-.36)/0.048 = 1.645
(p0-.36) = 0.07896
p0 = 0.43896
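The same p0 can be obtained from the inverse normal CDF instead of looking up 1.645 in a table. A sketch using `statistics.NormalDist.inv_cdf`, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.36, 100

# Sampling distribution of P-hat: approximately N(p, p(1-p)/n).
p_hat = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

z95 = NormalDist().inv_cdf(0.95)   # standard normal 95th percentile
p0 = p_hat.inv_cdf(0.95)           # value with Pr(P-hat < p0) = 0.95

print(f"std dev of P-hat = {p_hat.stdev:.3f}")
print(f"z for 0.95 = {z95:.3f}")
print(f"p0 = {p0:.3f}")
```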
• Suppose true p is 0.40.
• If survey is conducted again on 49 people, what’s the probability of
seeing 38% to 44% favorable responses?
Pr(0.38 < P-hat < 0.44)
= Pr[(0.38-0.40)/sqrt(0.40*0.60/49) < Z < (0.44-0.40)/sqrt(0.40*0.60/49)]
= Pr(-0.29 < Z < 0.57)
= Pr(Z<0.57) – Pr(Z<-0.29)
= 0.7157 - 0.3859 = 0.3298
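This interval probability can also be computed directly as a difference of two normal CDF values. The sketch below uses `statistics.NormalDist` and does not round the z-scores to two decimals, so it matches the table-based 0.3298 only to about two decimal places.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 49

# Sampling distribution of P-hat: approximately N(p, p(1-p)/n).
p_hat = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

z_low = (0.38 - p) / p_hat.stdev
z_high = (0.44 - p) / p_hat.stdev
prob = p_hat.cdf(0.44) - p_hat.cdf(0.38)   # Pr(0.38 < P-hat < 0.44)

print(f"z-scores: {z_low:.2f} and {z_high:.2f}")
print(f"Pr(0.38 < P-hat < 0.44) = {prob:.2f}")
```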