MATH408: PROBABILITY & STATISTICS


MATH408: Probability & Statistics
Summer 1999
WEEK 6
Dr. Srinivas R. Chakravarthy
Professor of Mathematics and Statistics
Kettering University
(GMI Engineering & Management Institute)
Flint, MI 48504-4898
Phone: 810.762.7906
Email: [email protected]
Homepage: www.kettering.edu/~schakrav
Sample Size Determination for a given 
Examples
Confidence Interval
• Recall the point estimate of the parameter
under study.
• For example, suppose that µ = mean tensile
strength of a piece of wire.
• Suppose a random sample of size 36 yielded a
mean of 242.4 psi.
• Can we attach any confidence to this value?
• Answer: No! What do we do?
Confidence Interval (cont’d)
• Given a parameter, say, θ, let θ̂ denote its
UMV estimator.
• Given α, a 100(1 − α)% CI for θ is
constructed using the sampling (probability)
distribution of θ̂ as follows.
• Find L and U such that P(L < θ < U) = 1 − α.
• Note that L and U are functions of θ̂.
Interpretation of CI
• With 100(1 − α)% confidence, we can say
that the true value of θ lies between L
and U; equivalently, if 100 samples of
size n were taken and an interval computed
from each, we would expect about 100(1 − α)
of the 100 intervals to contain θ.
• We will illustrate this in the laboratory.
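The repeated-sampling interpretation can be previewed with a small simulation. This is only a sketch under stated assumptions: the population is normal with known σ, the values μ = 250, σ = 10, n = 16 are illustrative choices, and the function name `coverage` is hypothetical (it is not part of the course material).

```python
import random
from statistics import NormalDist, mean


def coverage(mu, sigma, n, alpha, trials, seed=0):
    """Fraction of known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)              # fixed seed so runs are repeatable
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        # L = xbar - half_width and U = xbar + half_width change with each sample
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials


# Illustrative (assumed) values; the result should be close to 1 - alpha = 0.95
print(coverage(mu=250.0, sigma=10.0, n=16, alpha=0.05, trials=10_000))
```

The interval endpoints vary from sample to sample while μ stays fixed, so the printed fraction estimates how often the procedure captures μ.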
Confidence Interval for the population mean
Horsepower Example (Revisited)
Confidence Intervals
The assumed sigma = 10.0
Variable N Mean StDev SE Mean 95.0 % CI
hp@4500 16 253.25 13.51 2.50 (248.35, 258.15)
hp@5500 16 241.06 23.16 2.50 (236.16, 245.96)
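The intervals in this output can be reproduced by hand as mean ± z·σ/√n. The sketch below assumes σ = 10 is known, as in the output above; the helper name `z_interval` is hypothetical.

```python
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided CI for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.96 for 95%
    half_width = z * sigma / n ** 0.5              # z times the SE of the mean
    return xbar - half_width, xbar + half_width


for name, xbar in [("hp@4500", 253.25), ("hp@5500", 241.06)]:
    lo, hi = z_interval(xbar, sigma=10.0, n=16)
    print(f"{name}: ({lo:.2f}, {hi:.2f})")
```

Both intervals have the same width because the assumed σ, not the sample standard deviation, determines the standard error 10/√16 = 2.50.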
Choice of Sample Size
Examples
Student’s t-distribution
• In the HP example, we assumed that
the population standard deviation σ was
known (to be 10).
• However, in practice, σ is usually unknown,
so we need to estimate it first. If the
sample size is reasonably large (n ≥ 30), we
can still use the normal distribution for
inference (as justified by the CLT).
Student’s t-distribution
• What happens if the sample is small (n < 30)?
• In this case we cannot use the normal distribution: using the
sample standard deviation s to estimate σ brings more
variability into the picture, and the appropriate
distribution to use is Student’s t-distribution.
• In 1908, William S. Gosset, a chemist working for
a brewery, first derived this distribution and
published it under the pseudonym Student.
Student’s t-distribution
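• Suppose that X1, X2, …, Xn is a random sample of size n
from a normal distribution with mean μ and
standard deviation σ. Then the PDF of
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
is given by
$$f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(t^2/k)+1]^{(k+1)/2}}, \quad -\infty < t < \infty,$$
where k = n − 1 is the number of degrees of freedom and
$$\Gamma(k) = \int_0^\infty x^{k-1} e^{-x}\,dx, \quad \text{for any positive number } k.$$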
Student’s t-distribution (cont’d)
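• Student’s t-distribution, like the normal,
– is bell-shaped. Its shape depends on the sample size through the degrees of freedom, n − 1.
– It is more spread out than the normal and approaches the normal as
n approaches infinity.
• So when n is small, σ is unknown, and
the population is assumed to be
approximately normal, a 100(1 − α)% CI for μ is
given by
$$\left(\bar{X} - t_{\alpha/2}\, s/\sqrt{n},\;\; \bar{X} + t_{\alpha/2}\, s/\sqrt{n}\right)$$
where t_{α/2} is the upper α/2 point of the t-distribution with n − 1 degrees of freedom.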
HP Example (cont’d):
Confidence Intervals
Variable N Mean StDev SE Mean 95.0 % CI
hp@4500 16 253.25 13.51 3.38 (246.05, 260.45)
hp@5500 16 241.06 23.16 5.79 (228.72, 253.40)
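To see where these t-based intervals come from, the sketch below evaluates the t density directly, integrates it numerically to get the CDF, and inverts the CDF by bisection to find the critical value. This is an illustrative approach using only the Python standard library; it is not how the statistical package computed the output above, and all function names are hypothetical.

```python
import math


def t_pdf(t, k):
    """Density of Student's t with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(math.pi * k) * math.gamma(k / 2))
    return c * (1 + t * t / k) ** (-(k + 1) / 2)


def t_cdf(t, k, steps=2000):
    """P(T <= t) for t >= 0: 0.5 (by symmetry) plus Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, k) + t_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 + total * h / 3


def t_quantile(p, k):
    """Value t with P(T <= t) = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, conf=0.95):
    """Two-sided CI for the mean using the sample standard deviation s."""
    t_crit = t_quantile(1 - (1 - conf) / 2, n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


for name, xbar, s in [("hp@4500", 253.25, 13.51), ("hp@5500", 241.06, 23.16)]:
    lo, hi = t_interval(xbar, s, n=16)
    print(f"{name}: ({lo:.2f}, {hi:.2f})")
```

With 15 degrees of freedom the critical value is about 2.131 rather than 1.96, and the standard error now uses each sample's own s, which is why these intervals are wider than the known-σ ones.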
VERIFYING THE NORMALITY ASSUMPTION
• Note that in constructing the above confidence
intervals, we assumed that the populations (for
4500 RPM and 5500 RPM) are normal.
• How do we verify that the assumptions are not
grossly violated?
• Recall the normal probability plot.
• If there is reason to believe that this assumption
is not valid in a given problem, then one has to
transform the data or rely on nonparametric
methods.
Examples
One-sided Confidence Intervals
Testing on the mean using T-distribution
H0: μ = μ0 vs H1: μ ≠ μ0
The test statistic is given by:
$$T_0 = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Examples
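As one possible illustration, the sketch below computes the t statistic for a two-sided test from the hp@4500 summary statistics reported earlier (n = 16, mean 253.25, s = 13.51). The hypothesized value μ0 = 250 is a made-up number chosen only for illustration, the two-sided alternative is an assumption, and the critical value 2.131 is the standard table entry for α = 0.05 with 15 degrees of freedom.

```python
import math


def t_statistic(xbar, s, n, mu0):
    """Test statistic (xbar - mu0) / (s / sqrt(n)) for H0: mu = mu0."""
    return (xbar - mu0) / (s / math.sqrt(n))


T_CRIT = 2.131   # table value: upper 0.025 point of t with 15 degrees of freedom
MU0 = 250.0      # hypothetical null value, not taken from the course notes

t0 = t_statistic(xbar=253.25, s=13.51, n=16, mu0=MU0)
reject = abs(t0) > T_CRIT   # two-sided test at alpha = 0.05
print(f"t0 = {t0:.3f}, reject H0: {reject}")
```

Here |t0| is below the critical value, so H0 would not be rejected at the 5% level. This agrees with the 95% t-interval for hp@4500, which contains 250.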
HOMEWORK PROBLEMS
Sections 4.1 through 4.5
1-4, 11-13, 15-18, 21, 22, 24-27, 30, 31, 33-38, 40
INFERENCE ON THE VARIANCE
• Recall that the sample variance is a UMV
estimator for the population variance.
• Here, we will see how to construct a CI and test
hypotheses on σ².
• Assuming that the population is normal, the
statistic
$$\chi^2 = \frac{(n-1)\,s^2}{\sigma^2}$$
follows a chi-square distribution with n − 1 degrees of freedom.
Chi-square Distribution
$$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{(k/2)-1} e^{-x/2}, \quad \text{for } x > 0.$$
Mean and variance:
$$\mu = k, \quad \sigma^2 = 2k.$$
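A quick simulation can make the chi-square result plausible. The sketch below draws many normal samples, computes (n − 1)s²/σ² for each, and compares the empirical mean and variance with k and 2k for k = n − 1. The values n = 16 and σ = 10 and the function name are illustrative assumptions.

```python
import random
from statistics import mean, variance


def chi_square_statistics(n, sigma, trials, seed=0):
    """Simulated values of (n - 1) * s^2 / sigma^2 from normal samples."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = variance(sample)   # sample variance with divisor n - 1
        values.append((n - 1) * s2 / sigma ** 2)
    return values


stats = chi_square_statistics(n=16, sigma=10.0, trials=10_000)
print(f"mean     = {mean(stats):.2f}  (theory: k = 15)")
print(f"variance = {variance(stats):.2f}  (theory: 2k = 30)")
```

The simulated mean and variance should land near 15 and 30, up to sampling noise.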