Bootstrap Method

Bootstrap Method - Introduction
• The bootstrap, developed by Efron in the late 1970s, allows us
to calculate estimates in situations where there is no adequate
statistical theory.
• Example: calculating a CI for the mean when the population is
not normal and the sample size is small.
• Example: calculating a CI for other parameters, such as the
population median or other percentiles.
week 7
Bootstrap Sample
• Suppose we obtained a sample of n observations from some
unknown distribution, F. We call the original sample X.
• Our goal is to know about a parameter, θ, of the original
distribution (for example: mean, median, standard deviation,
upper quartile etc.)
• A bootstrap sample is a sample of size n drawn with replacement
from the original sample. It is denoted by X*.
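Drawing a single bootstrap sample can be sketched in a few lines. The slides use Minitab; Python is substituted here purely for illustration, and the data values and the helper name `bootstrap_sample` are hypothetical.

```python
import random

def bootstrap_sample(x, rng):
    """Return one bootstrap sample X*: len(x) draws with replacement from x."""
    return rng.choices(x, k=len(x))

# Hypothetical original sample X (illustrative values only).
x = [12.1, 9.8, 15.3, 11.0, 10.4, 13.7]
rng = random.Random(0)  # fixed seed so the run is repeatable
x_star = bootstrap_sample(x, rng)
print(x_star)  # same size as x; individual values may repeat or be absent
```

Because the draws are made with replacement, some original observations typically appear more than once in X* and others not at all.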
The Bootstrap Method
• Use Minitab to sample B bootstrap samples of X, X*1, X*2,…, X*B.
• Use Minitab to calculate an estimate of the parameter of interest θ
from each of the B bootstrap samples. Call these estimates θ̂*1, θ̂*2, ..., θ̂*B.
• The collection of estimates θ̂*1, θ̂*2, ..., θ̂*B forms the bootstrap estimate
of the distribution of θ̂.
• Use Minitab to calculate summary statistics and histograms for the
bootstrap estimate of the distribution of θ̂.
• Determine the 100(1 − α)% bootstrap percentile interval for θ by finding
the α/2 and 1 − α/2 quantiles of the bootstrap distribution, i.e. the
(αB/2)-th smallest and the (αB/2)-th largest of the B ordered estimates.
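The whole procedure can be sketched as follows. This is a minimal illustration in Python rather than Minitab; the function name `bootstrap_percentile_ci`, the defaults B = 2000 and α = 0.05, and the data are all assumptions made for the example. The cut-off rule used (drop the αB/2 smallest and largest estimates, less one) is one common convention among several.

```python
import random
import statistics

def bootstrap_percentile_ci(x, stat, B=2000, alpha=0.05, seed=0):
    """Percentile interval for the parameter estimated by stat(x)."""
    rng = random.Random(seed)
    n = len(x)
    # One estimate per bootstrap sample, sorted in increasing order.
    estimates = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    k = max(1, int(alpha * B / 2))  # position counted in from each tail
    lower = estimates[k - 1]        # k-th smallest estimate
    upper = estimates[B - k]        # k-th largest estimate
    return lower, upper, estimates

# Hypothetical small, skewed sample (illustrative values only).
x = [2.1, 3.4, 1.9, 8.7, 2.8, 3.1, 2.5, 12.3, 2.2, 3.0]
lo, hi, est = bootstrap_percentile_ci(x, statistics.median)
print("95% percentile interval for the median:", lo, hi)
print("bootstrap standard error:", statistics.stdev(est))
```

Passing a different function as `stat` (for example `statistics.mean`) gives an interval for a different parameter without changing the rest of the procedure.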
Example
Hypotheses Testing
• A hypothesis test is a formal procedure for comparing observed
data with a hypothesis whose truth we want to assess. The
hypothesis is a statement about the parameters in a population
or model.
• There are two types of hypotheses:
   - The null hypothesis, H0, is the current belief.
   - The alternative hypothesis, Ha, is the claim you are trying to find
     evidence for; it contradicts the null hypothesis.
• The test of significance is designed to assess the strength of the
evidence against the null hypothesis.
Example
Each of the following situations requires a significance test about a
population mean μ. State the appropriate null hypothesis H0 and alternative
hypothesis Ha in each case.
(a) The mean area of the several thousand apartments in a new development is
advertised to be 1250 square feet. A tenant group thinks that the apartments
are smaller than advertised. They hire an engineer to measure a sample of
apartments to test their suspicion.
Answer: H0: μ = 1250 ft²; Ha: μ < 1250 ft²
(b) Larry's car averages 32 miles per gallon on the highway. He
now switches to a new motor oil that is advertised as increasing gas
mileage. After driving 3000 highway miles with the new oil, he wants to
determine if his gas mileage actually has increased.
Answer: H0: μ = 32 mpg; Ha: μ > 32 mpg
(c) The diameter of a spindle in a small motor is supposed to be 5 millimeters.
If the spindle is either too small or too large, the motor will not perform
properly. The manufacturer measures the diameter in a sample of motors to
determine whether the mean diameter has moved away from the target.
Answer: H0: μ = 5 mm; Ha: μ ≠ 5 mm.
Hypothesis Test Procedure
• A hypothesis test works like a proof by contradiction: we assume the
null hypothesis is true and then look for evidence in the data that
contradicts this assumption.
• A hypothesis test is conducted by constructing a test statistic
that incorporates both H0 and the sample data and has a known
distribution (usually Z, t, F, χ²).
• The hypothesis test conclusion is based on how likely the
observed test statistic (or more extreme) is under the
assumption that H0 is true.
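As one concrete instance of a test statistic, the one-sample t statistic t = (x̄ − μ0) / (s / √n) combines the hypothesized mean μ0 from H0 with the sample mean x̄, sample standard deviation s and sample size n. The sketch below is illustrative only: the slides do not say which statistic they use, the function name `one_sample_t` is hypothetical, and the apartment areas are made-up numbers.

```python
import math
import statistics

def one_sample_t(x, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)); t with n - 1 df under H0
    when the population is approximately normal."""
    n = len(x)
    xbar = statistics.mean(x)
    s = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical measured apartment areas in square feet (made-up values).
areas = [1230, 1262, 1218, 1245, 1207, 1251, 1224, 1239]
t = one_sample_t(areas, mu0=1250)
print("t statistic:", t)  # negative here, since the sample mean is below 1250
```

A value of t far from 0 in the direction of Ha is unlikely when H0 is true, which is what the test conclusion is based on.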
Rejection Region approach for Test Conclusion
• There are two ways to determine whether the observed test statistic
is likely under H0.
• The first method uses a significance level, α, and a rejection
region. If the test statistic is in the rejection region, we have
evidence against H0. If the test statistic is not in the rejection
region, we do not have sufficient evidence against H0.
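The rejection-region rule can be sketched for a Z test statistic as below. This assumes the statistic is standard normal under H0; the function name `in_rejection_region` and the labels "less", "greater" and "two-sided" for the form of Ha are hypothetical choices made for the example.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha=0.05, alternative="two-sided"):
    """True if z falls in the level-alpha rejection region of a Z test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":        # Ha: mu < mu0, reject for small z
        return z < std_normal.inv_cdf(alpha)
    if alternative == "greater":     # Ha: mu > mu0, reject for large z
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":   # Ha: mu != mu0, reject in either tail
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

print(in_rejection_region(-2.0, alpha=0.05, alternative="less"))
print(in_rejection_region(1.5, alpha=0.05, alternative="two-sided"))
```

The direction of Ha determines which tail (or tails) of the distribution make up the rejection region, while α fixes how much probability the region holds when H0 is true.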
Example