Chapter 3
Basic Concepts in Statistics and Probability
3.7 Choice of Statistical Distributions
• The distributions are simply models of reality (not
reality itself)
• Always check the assumptions
3.8 Statistical Inference
• Methods of Statistical Inference used to estimate
the parameters of the statistical distributions:
– Central Limit Theorem
– Point Estimation
• Maximum Likelihood Estimation
– Confidence Intervals
– Tolerance Intervals
– Hypothesis Tests
• Probability Plots
• Likelihood Ratio Tests
– Bonferroni Intervals
3.8.1 Central Limit Theorem
Rule of Thumb
For most populations, if the sample size is greater than 30,
the Central Limit Theorem approximation is good.
Normal approximation to the Binomial:
If X ~ Bin(n, p) and if np > 5 and n(1 – p) > 5, then
X ~ N(np, np(1 – p)) approximately.
Normal approximation to the Poisson:
If X ~ Poisson(λ), where λ > 10, then X ~ N(λ, λ) approximately.
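The binomial rule of thumb can be checked numerically. The sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.3 and k = 35 are hypothetical, and the continuity correction (evaluating the normal cdf at k + 0.5) is an added refinement that the slide does not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using N(np, np(1 - p)).

    The + 0.5 is a continuity correction (an assumption added here).
    """
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)

# Hypothetical values: np = 30 and n(1 - p) = 70 both exceed 5.
n, p, k = 100, 0.3, 35
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

With np and n(1 – p) well above 5, the two probabilities should agree closely.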
3.8.2 Point Estimation
• The statistical distributions have 1 or more parameters
(usually represented by Greek letters)
• The values of these parameters are generally unknown
and must be estimated by sample statistics.
Parameters    Sample Statistics
μ             X̄
σ²            S²
p             p̂
• Point estimator: the formula used to compute an estimate from a sample
• Point estimate: the numerical value of an estimator for a particular sample
3.8.2.1 Maximum Likelihood Estimation
3.8.3 Confidence Intervals
• An interval estimator will contain the unknown
parameter with (approximately) a specified probability.
• The desired degree of confidence determines the width
of the interval.
• For a fixed sample size, an increase in the width will
increase the degree of confidence.
• Increasing the sample size will decrease the width of a
confidence interval.
• Narrower confidence intervals are more meaningful
(lower uncertainty)
• However, a low level of uncertainty is contingent upon
the requisite assumptions being met.
Constructing a CI
To see how to construct a confidence interval, let μ
represent the unknown population mean and let σ²
be the unknown population variance. Let
X1,…,X100 be the 100 amperages of the sample
batteries. The observed value of the sample mean
is X̄ = 185.5. Since X̄ is the mean of a large sample,
the Central Limit Theorem specifies that it
comes from an approximately normal distribution with mean μ and
standard deviation σ_X̄ = σ/√100.
Computing a 95% Confidence Interval
The 95% confidence interval (CI) is X̄ ± 1.96 σ_X̄.
So, a 95% CI for the mean is 185.5 ± 1.96 (5/√100), or
185.5 ± 0.98, or (184.52, 186.48). Here 5 is the sample
standard deviation, which we can use as an estimate for the
population standard deviation since the sample size is large.
We can say that we are 95% confident, or confident at
the 95% level, that the population mean amperage lies
between 184.52 and 186.48.
Warning: The methods described here require that the data
be a random sample from a population. When used for
other samples, the results may not be meaningful.
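The arithmetic for the battery example can be reproduced with a short sketch. The function name `z_interval` is hypothetical, and the critical value is taken from the standard normal distribution rather than rounded to 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, s, n, level=0.95):
    """Large-sample confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a 95% level
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

# Battery example from the slides: sample mean 185.5, s = 5, n = 100.
low, high = z_interval(xbar=185.5, s=5.0, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```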
Illustration of Capturing True Mean
Consider the normal curve that represents the distribution of X̄.
The middle 95% of the curve extends a distance of 1.96 σ_X̄
on either side of the population mean μ. If X̄ lies within the
middle 95% of the distribution, it is within 1.96 σ_X̄ of μ, so
the interval X̄ ± 1.96 σ_X̄ covers μ.
Illustration of Not Capturing True Mean
If the sample mean lies outside the middle 95% of the curve:
Only 5% of all the samples that could have been drawn fall
into this category. For those more unusual samples the 95%
confidence interval X̄ ± 1.96 σ_X̄ fails to cover the true
population mean μ.
Question?
Does this 95% confidence interval actually cover
the population mean μ?
• It depends on whether this particular sample happened
to be one whose mean came from the middle 95% of the
distribution or whether it was a sample whose mean was
unusually large or small, in the outer 5% of the
distribution.
• There is no way to know for sure into which category this
particular sample falls.
• In the long run, if we repeated the sampling procedure
over and over, then 95% of the samples would have means
in the middle 95% of the distribution. Then 95% of the
confidence intervals would cover the population mean.
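The long-run interpretation can be illustrated by simulation. The sketch below draws many samples from a normal population whose mean is known to the simulation (μ = 185 and σ = 5 are hypothetical values chosen for illustration) and counts how often the 95% interval covers μ.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, level=0.95, seed=0):
    """Fraction of simulated intervals xbar +/- z * sigma / sqrt(n) that cover mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mean 185, standard deviation 5, samples of size 100.
rate = coverage(mu=185.0, sigma=5.0, n=100, trials=2000)
print(f"fraction of intervals covering the mean: {rate:.3f}")
```

The reported fraction should be close to 0.95, although any single interval either covers μ or does not.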
Pieces of CI
• Recall that the CI was 185.5 ± 0.98.
• 185.5 was the sample mean, which is a point estimate for
the population mean.
• We call the plus-or-minus number 0.98 the margin of error.
• The margin of error is the product of 1.96 and σ_X̄ = 0.5.
• We refer to σ_X̄, which is the standard deviation of X̄, as the
standard error.
• In general, the standard error is the standard deviation of
the point estimator.
• The number 1.96 is called the critical value for the
confidence interval. The reason that 1.96 is the critical
value for a 95% CI is that 95% of the area under the normal
curve lies between –1.96 and 1.96 standard errors of the
population mean.
Other CI Levels
• Suppose we are interested in 68% confidence
intervals. We know that the middle 68% of the
normal distribution is in an interval that extends
1.0 σ_X̄ on either side of the population mean.
• It follows that an interval of the same length around X̄
will cover the population mean for 68%
of the samples that could possibly be drawn.
• For our example, a 68% CI for the mean amperage of
the batteries is 185.5 ± 1.0(0.5), or (185.0, 186.0).
100(1 – α)% CI
Let X1,…,Xn be a large (n > 30) random sample
from a population with mean μ and standard
deviation σ, so that X̄ is approximately normal.
Then a level 100(1 – α)% confidence interval for μ
is
X̄ ± z_{α/2} σ_X̄
where σ_X̄ = σ/√n. When the value of σ is
unknown, it can be replaced with the sample
standard deviation s.
Specific Intervals for μ
• X̄ ± s/√n is a 68% interval for μ.
• X̄ ± 1.645 s/√n is a 90% interval for μ.
• X̄ ± 1.96 s/√n is a 95% interval for μ.
• X̄ ± 2.58 s/√n is a 99% interval for μ.
• X̄ ± 3 s/√n is a 99.7% interval for μ.
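The multipliers in these intervals are quantiles of the standard normal distribution. A minimal sketch, using a hypothetical helper `critical_value`, recovers them from the confidence level; the multipliers 1 and 3 on the slide are rounded versions of the computed values.

```python
from statistics import NormalDist

def critical_value(level):
    """z_{alpha/2} for a two-sided interval with the given confidence level."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.68, 0.90, 0.95, 0.99, 0.997):
    print(f"{level:.1%} interval: z = {critical_value(level):.3f}")
```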
More About CI’s
• The confidence level of an interval measures the
reliability of the method used to compute the
interval.
• A level 100(1 – α)% confidence interval is one
computed by a method that in the long run will
succeed in covering the population mean a
proportion 1 – α of all the times that it is used.
• In practice, there is a decision about what level of
confidence to use.
• This decision involves a trade-off, because intervals
with greater confidence are less precise.
Probability vs. Confidence
• In computing a CI, such as the one for the amperage of
batteries, (184.52, 186.48), it is tempting to say that the
probability that μ lies in this interval is 95%.
• The term probability refers to random events, which can
come out differently when experiments are repeated.
• 184.52 and 186.48 are fixed, not random. The
population mean is also fixed. The mean amperage is
either in the interval or not.
• There is no randomness involved.
• So, we say that we have 95% confidence that the
population mean is in this interval.
[Figure: Probability vs. Confidence, with panels a) 68% CI, b) 95% CI, c) 99.7% CI]
Determining Sample Size
• In Example 5.4, a 95% CI was given by 12.68 ± 1.89.
• This interval specifies the mean to within ± 1.89. Now
assume that the interval is too wide to be useful.
• Assume that it is desirable to produce a 95% confidence
interval that specifies the mean to within ± 0.5.
• To do this, the sample size must be increased.
• The width of a CI is specified by ± z_{α/2} σ/√n.
• If the desired width is ± w, then w = z_{α/2} σ/√n.
• Solving this equation for n yields n = z_{α/2}² σ² / w².
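The sample-size formula can be applied as in the sketch below. The slide does not state the standard deviation behind Example 5.4, so σ = 6.83 is an assumed value used only for illustration; the result is rounded up because n must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, w, level=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= w."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / w) ** 2)

# sigma = 6.83 is an assumption, not a value given on the slide.
n = required_sample_size(sigma=6.83, w=0.5)
print(f"required sample size: {n}")
```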
One-Sided Confidence Intervals
• We are not always interested in CI’s with both an upper
and lower bound.
• For example, we may want a confidence interval on
battery life. We are only interested in a lower bound on
the battery life.
• With the same conditions as with the two-sided CI, the
level 100(1 – α)% lower confidence bound for μ is
X̄ – z_α σ_X̄,
and the level 100(1 – α)% upper confidence bound for μ
is X̄ + z_α σ_X̄.
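As an illustration of one-sided bounds, the sketch below reuses the battery figures from the earlier example (sample mean 185.5, s = 5, n = 100). The function names are hypothetical. Note that a one-sided bound uses z_α, not z_{α/2}.

```python
from math import sqrt
from statistics import NormalDist

def lower_bound(xbar, s, n, level=0.95):
    """Level 100(1 - alpha)% lower confidence bound: xbar - z_alpha * s / sqrt(n)."""
    z = NormalDist().inv_cdf(level)  # z_alpha, about 1.645 for 95%
    return xbar - z * s / sqrt(n)

def upper_bound(xbar, s, n, level=0.95):
    """Level 100(1 - alpha)% upper confidence bound: xbar + z_alpha * s / sqrt(n)."""
    z = NormalDist().inv_cdf(level)
    return xbar + z * s / sqrt(n)

lo = lower_bound(185.5, 5.0, 100)
hi = upper_bound(185.5, 5.0, 100)
print(f"95% lower bound: {lo:.2f}, 95% upper bound: {hi:.2f}")
```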
Small Sample CIs for a Population Mean
• The methods that we have discussed for a
population mean previously require that the
sample size be large.
• When the sample size is small, there are no
general methods for finding CI’s.
• If the population is approximately normal, a
probability distribution called the Student’s t
distribution can be used to compute confidence
intervals for a population mean.
More on CI’s
• What can we do if X̄ is the mean of a small sample?
• If the sample size is small, s may not be close to σ,
and X̄ may not be approximately normal. If we know
nothing about the population from which the small
sample was drawn, there are no easy methods for
computing CI’s.
• However, if the population is approximately normal,
X̄ will be approximately normal even when the
sample size is small. It turns out that we can still use the
quantity (X̄ – μ)/(s/√n), but since s may not be
close to σ, this quantity has a Student’s t distribution
rather than a normal distribution.
Student’s t Distribution
• Let X1,…,Xn be a small (n < 30) random sample from
a normal population with mean μ. Then the quantity
(X̄ – μ)/(s/√n)
has a Student’s t distribution with n – 1 degrees of
freedom (denoted by t_{n–1}).
• When n is large, the distribution of the above quantity
is very close to normal, so the normal curve can be
used, rather than the Student’s t.
More on Student’s t
• The probability density of the Student’s t
distribution is different for different degrees of
freedom.
• The t curves are more spread out than the
normal.
• Table C, the t distribution table, provides
probabilities associated with the Student’s t
distribution.
Student’s t CI
Let X1,…,Xn be a small random sample from a normal
population with mean μ. Then a level 100(1 – α)% CI
for μ is
X̄ ± t_{n–1,α/2} s/√n.
To be able to use the Student’s t distribution for
calculation and confidence intervals, you must have a
sample that comes from a population that is
approximately normal. Samples such as these rarely
contain outliers. So if a sample contains outliers, this
CI should not be used.
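A minimal sketch of the Student’s t interval follows. The five measurements are hypothetical, and the critical value t_{4,0.025} = 2.776 is passed in as a number read from a t table, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Small-sample CI for a mean: xbar +/- t_crit * s / sqrt(n).

    t_crit is the tabulated value t_{n-1, alpha/2}, supplied by the caller.
    """
    n = len(data)
    xbar = mean(data)
    margin = t_crit * stdev(data) / sqrt(n)  # stdev is the sample standard deviation s
    return xbar - margin, xbar + margin

sample = [10.1, 9.8, 10.4, 10.0, 9.7]  # hypothetical measurements, n = 5
T_4_0025 = 2.776                       # t table value for 4 degrees of freedom, 95% level
low, high = t_interval(sample, T_4_0025)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

The interval is wider than the z-based interval with multiplier 1.96 would be for the same data, reflecting the extra uncertainty from estimating σ with s.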
Other CI’s
Let X1,…,Xn be a small random sample from a normal
population with mean μ.
• Then a level 100(1 – α)% upper confidence bound for μ is
X̄ + t_{n–1,α} s/√n.
• A level 100(1 – α)% lower confidence bound for μ is
X̄ – t_{n–1,α} s/√n.
• Occasionally a small sample may be taken from a normal
population whose standard deviation σ is known. In these
cases, we do not use the Student’s t curve, because we are
not approximating σ with s. The CI to use here is the one
using the z table, which we discussed in the first section.
3.8.4 Tolerance Intervals
• A confidence interval for a parameter is an
interval that is likely to contain the true value of
the parameter.
• Prediction and tolerance intervals are concerned
with the population itself and with values that may
be sampled from it in the future.
• These intervals are only useful when the shape of
the population is known; here we assume the
population is known to be normal.
Prediction Interval
• A prediction interval is an interval that is likely to
contain the value of an item that will be sampled
from the population at a future time.
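• We “predict” that a value that is yet to be sampled
from the population will fall within the prediction
interval.
100(1 – α)% Prediction Interval
Let X1,…,Xn be a sample from a normal population. A level
100(1 – α)% prediction interval for a future observation is
X̄ ± t_{n–1,α/2} s √(1 + 1/n).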
Comparing CI and PI
• The formula for the PI is similar to the formula for the CI of
a mean of normal population.
• The prediction interval has a small adjustment to the
standard error with the additional + 1 under the square root.
• This reflects the random variation in the value of the
sampled item that is to be predicted.
• Prediction intervals are sensitive to the assumption that the
population is normal.
• If the shape of the population differs much from the normal
curve, the prediction interval may be misleading.
• Large samples do not help: if the population is not normal,
then the prediction interval is invalid.
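The comparison between the two intervals can be sketched as follows, assuming the standard normal-theory prediction interval X̄ ± t_{n–1,α/2} s √(1 + 1/n), which matches the "+ 1 under the square root" described above. The data are hypothetical and the t critical value is taken from a t table.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(data, t_crit):
    """CI for the mean: xbar +/- t_crit * s * sqrt(1/n)."""
    n = len(data)
    margin = t_crit * stdev(data) * sqrt(1 / n)
    return mean(data) - margin, mean(data) + margin

def prediction_interval(data, t_crit):
    """PI for one future observation: xbar +/- t_crit * s * sqrt(1 + 1/n)."""
    n = len(data)
    margin = t_crit * stdev(data) * sqrt(1 + 1 / n)
    return mean(data) - margin, mean(data) + margin

sample = [10.1, 9.8, 10.4, 10.0, 9.7]  # hypothetical measurements, n = 5
t_crit = 2.776                         # tabulated t_{4, 0.025}
ci = confidence_interval(sample, t_crit)
pi = prediction_interval(sample, t_crit)
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is always the wider of the two, because it must allow for the variation of a single new observation as well as the uncertainty in X̄.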
Tolerance Intervals
• A tolerance interval is an interval that is likely to
contain a specified proportion of the population.
• First assume that we have a normal population
whose mean μ and standard deviation σ are known.
• To find an interval that contains 90% of the
population, we have μ ± 1.645σ.
• In general, the interval μ ± z_{γ/2} σ will contain
100(1 – γ)% of the population.
• In practice, we do not know μ or σ. Instead we use
the sample mean and sample standard deviation.
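For the idealized case in which μ and σ are known, the claim that μ ± z_{γ/2} σ contains 100(1 – γ)% of a normal population can be verified directly. The values μ = 50 and σ = 2 below are hypothetical.

```python
from statistics import NormalDist

def known_parameter_interval(mu, sigma, proportion):
    """Interval mu +/- z_{gamma/2} * sigma containing the given proportion."""
    gamma = 1 - proportion
    z = NormalDist().inv_cdf(1 - gamma / 2)
    return mu - z * sigma, mu + z * sigma

mu, sigma = 50.0, 2.0  # hypothetical known population mean and standard deviation
low, high = known_parameter_interval(mu, sigma, proportion=0.90)

# Check the proportion of the population that falls inside the interval.
population = NormalDist(mu, sigma)
covered = population.cdf(high) - population.cdf(low)
print(f"interval: ({low:.2f}, {high:.2f}), proportion covered: {covered:.3f}")
```

When μ and σ are replaced by sample estimates, this simple check no longer applies, which is why the tabulated factors discussed next are needed.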
Consequences
• Since we are estimating the mean and standard
deviation from the sample,
– We must make the interval wider than it would be if μ
and σ were known.
– We cannot be 100% confident that the interval actually
contains the required proportion of the population.
Construction of Tolerance Interval
• We must specify the proportion 100(1 – γ)% of the
population that we wish the interval to contain.
• We must also specify the confidence 100(1 – α)% that the
interval actually contains the specified proportion.
• It is then possible to find a number k_{n,α,γ} such that the
interval
X̄ ± k_{n,α,γ} s
will contain at least 100(1 – γ)% of the population with
confidence 100(1 – α)%. Values of k_{n,α,γ} are presented in
various statistical books (for example Table A.4 in
Principles of Statistics for Engineers and Scientists).
Tolerance Interval Summary
Let X1,…,Xn be a random sample from a normal
population. A tolerance interval for containing at least
100(1 – γ)% of the population with confidence
100(1 – α)% is
X̄ ± k_{n,α,γ} s
Of all the tolerance intervals that are computed by this
method, 100(1 – α)% will actually contain at least
100(1 – γ)% of the population.
3.8.5 Hypothesis Tests
1. Formulate a hypothesis that is to be tested
2. Use the data to test the hypothesis
3. Determine whether or not the hypothesis should be
rejected
• The hypothesis that is being tested is called the null
hypothesis (H0), which is tested against an alternative
hypothesis (H1).
• Hypothesis tests are used implicitly when control
charts are employed.
Hypothesis Tests
• In general, hypotheses that are tested are hardly ever
true.
• A hypothesis that is being tested is more likely to be
rejected for a very large sample size than for a very
small sample size.
P-Value
• The P-value measures the plausibility of H0.
• The smaller the P-value, the stronger the
evidence is against H0.
• If the P-value is sufficiently small, we may
be willing to abandon our assumption that
H0 is true and believe H1 instead.
• This is referred to as rejecting the null
hypothesis.
Steps in Performing a Hypothesis Test
1. Define H0 and H1.
2. Assume H0 to be true.
3. Compute a test statistic. A test statistic is a statistic
that is used to assess the strength of the evidence
against H0. A test that uses the z-score as a test
statistic is called a z-test.
4. Compute the P-value of the test statistic. The P-value
is the probability, assuming H0 to be true, that the test
statistic would have a value whose disagreement with
H0 is as great as or greater than what was actually
observed. The P-value is also called the observed
significance level.
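The steps above can be sketched for a large-sample z-test. The sample figures reuse the battery example (sample mean 185.5, s = 5, n = 100); the null value μ0 = 185 is hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, s, n, alternative="two-sided"):
    """Return the z statistic and P-value, computed assuming H0 is true."""
    z = (xbar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":      # H1: mu > mu0, area to the right of z
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":       # H1: mu < mu0, area to the left of z
        p = std_normal.cdf(z)
    else:                             # H1: mu != mu0, both tails
        p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p

# mu0 = 185.0 is a hypothetical null value.
z, p = z_test(xbar=185.5, mu0=185.0, s=5.0, n=100)
print(f"z = {z:.2f}, P-value = {p:.4f}")
```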
One and Two-Tailed Tests
• When H0 specifies a single value for μ, both
tails contribute to the P-value, and the test
is said to be a two-sided or two-tailed
test.
• When H0 specifies only that μ is greater
than or equal to, or less than or equal to, a
value, only one tail contributes to the P-value,
and the test is called a one-sided or
one-tailed test.
Drawing Conclusions from the Results of Hypothesis Tests
• There are two conclusions that we can draw when
we are finished with a hypothesis test:
– We reject H0. In other words, we conclude that H0 is
false.
– We do not reject H0. In other words, H0 is plausible.
• One can never conclude that H0 is true. We can
just conclude that H0 might be true.
• We need to know what level of disagreement,
measured with the P-value, is great enough to
render the null hypothesis implausible.
More on the P-value
• The smaller the P-value, the more certain we can
be that H0 is false.
• The larger the P-value, the more plausible H0
becomes but we can never be certain that H0 is
true.
• A rule of thumb suggests rejecting H0 whenever
P ≤ 0.05. While this rule is convenient, it has no
scientific basis.
Comments
• Some people report only that a test result is significant at a certain level,
without giving the P-value; for example, that the result is “statistically
significant at the 5% level.”
• This is poor practice.
• First, it provides no way to tell whether the P-value was just barely
less than 0.05, or whether it was a lot less.
• Second, reporting that a result was statistically significant at the 5%
level implies that there is a big difference between a P-value just
under 0.05 and one just above 0.05, when in fact there is little
difference.
• Third, a report like this does not allow readers to decide for
themselves whether the P-value is small enough to reject the null
hypothesis.
• Reporting the P-value gives more information about the strength of
the evidence against the null hypothesis and allows each reader to
decide for himself or herself whether to reject the null hypothesis.
Comments on P
Let α be any value between 0 and 1. Then, if P ≤ α,
• The result of the test is said to be statistically significant
at the 100α% level.
• The null hypothesis is rejected at the 100α% level.
• When reporting the result of the hypothesis test, report
the P-value, rather than just comparing it to 5% or 1%.
Significance
• When a result has a small P-value, we say that it
is “statistically significant.”
• In common usage, the word significant means
“important.”
• It is therefore tempting to think that statistically
significant results must always be important.
• Sometimes statistically significant results do not
have any scientific or practical importance.
Hypothesis Tests and CI’s
• Both confidence intervals and hypothesis tests are
concerned with determining plausible values for a
quantity such as a population mean μ.
• In a hypothesis test for a population mean μ, we specify
a particular value of μ (the null hypothesis) and
determine if that value is plausible.
• A confidence interval for a population mean μ can be
thought of as a collection of all values for μ that meet a
certain criterion of plausibility, specified by the
confidence level 100(1 – α)%.
• The values contained within a two-sided level
100(1 – α)% confidence interval are precisely those
values for which the P-value of a two-tailed hypothesis
test will be greater than α.
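This correspondence can be checked numerically. The sketch below reuses the battery figures (sample mean 185.5, s = 5, n = 100) and tests several hypothetical null values, confirming that a value lies inside the 95% interval exactly when its two-tailed P-value exceeds 0.05.

```python
from math import sqrt
from statistics import NormalDist

XBAR, S, N, ALPHA = 185.5, 5.0, 100, 0.05
SE = S / sqrt(N)  # standard error of the sample mean

def two_tailed_p(mu0):
    """Two-tailed P-value of the z-test of H0: mu = mu0."""
    z = (XBAR - mu0) / SE
    return 2 * (1 - NormalDist().cdf(abs(z)))

def inside_ci(mu0):
    """True if mu0 lies in the two-sided level 100(1 - alpha)% interval."""
    margin = NormalDist().inv_cdf(1 - ALPHA / 2) * SE
    return XBAR - margin <= mu0 <= XBAR + margin

# Hypothetical null values on both sides of the interval (184.52, 186.48).
results = {mu0: (two_tailed_p(mu0), inside_ci(mu0))
           for mu0 in (184.0, 185.0, 186.0, 187.0)}
for mu0, (p, inside) in results.items():
    print(f"mu0 = {mu0}: P = {p:.4f}, inside 95% CI = {inside}")
```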
Small Sample Test for a Population Mean
• When we had a large sample we used the sample
standard deviation s to approximate the population
deviation σ.
• When the sample size is small, s may not be close
to σ, which invalidates this large-sample method.
• However, when the population is approximately
normal, the Student’s t distribution can be used.
• The only time that we don’t use the Student’s t
distribution for this situation is when the population
standard deviation σ is known. Then we are no
longer approximating σ and we should use the z-test.
Hypothesis Test
• Let X1,…,Xn be a sample from a normal population
with mean μ and standard deviation σ, where σ is
unknown.
• To test a null hypothesis of the form H0: μ ≤ μ0, H0:
μ ≥ μ0, or H0: μ = μ0, compute the test statistic
t = (X̄ – μ0)/(s/√n).
P-value
Compute the P-value. The P-value is an area under
the Student’s t curve with n – 1 degrees of freedom,
which depends on the alternate hypothesis as follows.
• If the alternative hypothesis is H1: μ > μ0, then the
P-value is the area to the right of t.
• If the alternative hypothesis is H1: μ < μ0, then the
P-value is the area to the left of t.
• If the alternative hypothesis is H1: μ ≠ μ0, then the
P-value is the sum of the areas in the tails cut off by t
and –t.
Fixed-Level Testing
• A hypothesis test measures the plausibility of the null
hypothesis by producing a P-value.
• The smaller the P-value, the less plausible the null.
• We have pointed out that there is no scientifically valid
dividing line between plausibility and implausibility, so it is
impossible to specify a “correct” P-value below which we
should reject H0.
• If a decision is going to be made on the basis of a
hypothesis test, there is no choice but to pick a cut-off
point for the P-value.
• When this is done, the test is referred to as a fixed-level
test.
Conducting the Test
To conduct a fixed-level test:
• Choose a number α, where 0 < α < 1. This is
called the significance level, or the level, of the test.
• Compute the P-value in the usual way.
• If P ≤ α, reject H0. If P > α, do not reject H0.
Comments
• In a fixed-level test, a critical point is a value of the test
statistic that produces a P-value exactly equal to α.
• A critical point is a dividing line for the test statistic just
as the significance level is a dividing line for the P-value.
• If the test statistic is on one side of the critical point, the
P-value will be less than α, and H0 will be rejected.
• If the test statistic is on the other side of the critical
point, the P-value will be more than α, and H0 will not be
rejected.
• The region on the side of the critical point that leads to
rejection is called the rejection region.
• The critical point itself is also in the rejection region.
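The relationship between the critical point and the significance level can be sketched for an upper-tailed z-test (H1: μ > μ0). The choice of an upper-tailed z-test and α = 0.05 is an assumption made for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05
STD_NORMAL = NormalDist()

# Critical point for an upper-tailed z-test: the z value whose P-value equals alpha.
critical_point = STD_NORMAL.inv_cdf(1 - ALPHA)

def p_value_upper(z):
    """P-value for H1: mu > mu0, the area to the right of z."""
    return 1 - STD_NORMAL.cdf(z)

def reject(z):
    """Reject H0 when z falls in the rejection region (critical point included)."""
    return z >= critical_point

print(f"critical point: {critical_point:.3f}")
for z in (1.0, 2.0):  # hypothetical observed test statistics
    print(f"z = {z}: P = {p_value_upper(z):.4f}, reject H0 = {reject(z)}")
```

Comparing the test statistic with the critical point gives the same decision as comparing the P-value with α.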
Errors
When conducting a fixed-level test at
significance level α, there are two types of
errors that can be made. These are
Type I error: Reject H0 when it is true.
Type II error: Fail to reject H0 when it is false.
The probability of a Type I error is never
greater than α.
Power of Tests
• A hypothesis test results in Type II error if H0 is not
rejected when it is false.
• The power of the test is the probability of rejecting
H0 when it is false. Therefore,
Power = 1 – P(Type II error).
• To be useful, a test must have reasonably small
probabilities of both type I and type II errors.
More on Power
• The probability of type I error is kept small by choosing a small
value of α as the significance level.
• If the power is large, then the probability of type II
error is small as well, and the test is a useful one.
• The purpose of a power calculation is to
determine whether or not a hypothesis test, when
performed, is likely to reject H0 in the event that
H0 is false.
Computing the Power
This involves two steps:
1. Compute the rejection region.
2. Compute the probability that the test statistic
falls in the rejection region if the alternate
hypothesis is true. This is power.
When power is not large enough, it can be
increased by increasing the sample size.
Example
Find the power of the 5% level test of H0: μ ≤ 80 versus
H1: μ > 80 for the mean yield of the new process under
the alternative μ = 82, assuming n = 50 and σ = 5.
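The two-step power calculation for this example can be sketched as follows, assuming a z-test with known σ. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tailed(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of the level-alpha z-test of H0: mu <= mu0 versus H1: mu > mu0."""
    se = sigma / sqrt(n)
    # Step 1: rejection region is xbar >= cutoff, computed under H0 (mu = mu0).
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Step 2: probability that xbar falls in that region when mu = mu_alt.
    return 1 - NormalDist(mu_alt, se).cdf(cutoff)

power = power_upper_tailed(mu0=80, mu_alt=82, sigma=5, n=50)
print(f"power = {power:.4f}")
```

Under these assumptions the power comes out at about 0.88, so the test would reject H0 in most samples if the true mean were 82.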
3.8.5.1 Probability Plots
• Scientists and engineers often work with data that
can be thought of as a random sample from
some population. In many cases, it is important
to determine the probability distribution that
approximately describes the population.
• More often than not, the only way to determine an
appropriate distribution is to examine the sample
to find a probability distribution that fits.
Finding a Distribution
Probability plots are a good way to determine an
appropriate distribution.
Here is the idea: Suppose we have a random sample
X1,…,Xn. We first arrange the data in ascending order.
Then assign evenly spaced values between 0 and 1 to
each Xi. There are several acceptable ways to do this; the
simplest is to assign the value (i – 0.5)/n to Xi.
The distribution that we are comparing the X’s to should
have a mean and variance that match the sample mean
and variance. We want to plot (Xi, F(Xi)); if this plot
resembles the cdf of the distribution that we are
interested in, then we conclude that it is plausible that
the data came from that distribution.
Probability Plot: Example
i    Xi      (i – 0.5)/n    Qi
1    3.01    0.1            2.4369
2    3.35    0.3            3.9512
3    4.79    0.5            5.0000
4    5.96    0.7            6.0488
5    7.89    0.9            7.5631
[Figure: plot of Qi (vertical axis) against Xi (horizontal axis)]
Software
Many software packages take the value (i – 0.5)/n
assigned to each Xi and calculate the
quantile (Qi) corresponding to that number
from the distribution of interest. They then
plot each (Xi, Qi). If this plot is a
reasonably straight line, then you may
conclude that the sample plausibly came from the
distribution that was used to find the quantiles.
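The quantile calculation described above can be sketched for a normal probability plot, using the five values from the example. The function name is hypothetical, and the fitted normal distribution uses the sample mean and sample standard deviation, as the earlier slide suggests.

```python
from statistics import NormalDist, mean, stdev

def normal_plot_points(data):
    """Return (Xi, Qi) pairs, where Qi is the (i - 0.5)/n quantile of the fitted normal."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # match sample mean and standard deviation
    return [(x, fitted.inv_cdf((i - 0.5) / n)) for i, x in enumerate(xs, start=1)]

sample = [3.01, 3.35, 4.79, 5.96, 7.89]  # the Xi values from the example
points = normal_plot_points(sample)
for x, q in points:
    print(f"Xi = {x:.2f}, Qi = {q:.4f}")
```

Plotting these pairs and checking whether they fall near a straight line is the visual test; the plotting step itself is left to whatever graphics tool is at hand.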
Normal Probability Plots
The sample plotted on the left comes from a population that
is not close to normal. The sample plotted on the right
comes from a population that is close to normal.
3.8.5.2 Likelihood Ratio Tests
• The general idea is to form the ratio of the
likelihood function using the hypothesized value
and the likelihood function using an alternative
value.
3.8.6 Bonferroni Intervals
Bonferroni Method in Hypothesis Tests
• Sometimes a situation occurs in which it is necessary to
perform many hypothesis tests.
• The basic rule governing this situation is that as more
tests are performed, the confidence that we can place
in our results decreases.
• The Bonferroni method provides a way to adjust
P-values upward when several hypothesis tests are
performed.
• If a P-value remains small after the adjustment, the null
hypothesis may be rejected.
• To make the Bonferroni adjustment, simply multiply the
P-value by the number of tests performed.
Example
Four different coating formulations are tested to see if they
reduce the wear on cam gears to a value below 100 μm. The
null hypothesis H0: μ ≥ 100 μm is tested for each formulation
and the results are
Formulation A: P = 0.37
Formulation B: P = 0.41
Formulation C: P = 0.005
Formulation D: P = 0.21
The operator suspects that formulation C may be effective, but
he knows that the P-value of 0.005 is unreliable, because
several tests have been performed. Use the Bonferroni
adjustment to produce a reliable P-value.
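The adjustment for this example can be sketched in a few lines. Capping the adjusted value at 1 is an added convention (a P-value cannot exceed 1) and is not stated on the slide.

```python
def bonferroni_adjust(p_values):
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return {name: min(1.0, m * p) for name, p in p_values.items()}

# P-values from the four coating formulations in the example.
p_values = {"A": 0.37, "B": 0.41, "C": 0.005, "D": 0.21}
adjusted = bonferroni_adjust(p_values)
for name, p in adjusted.items():
    print(f"Formulation {name}: adjusted P = {p:.3f}")
```

For formulation C the adjusted P-value is 4 × 0.005 = 0.02, which is still small, so the evidence that formulation C reduces wear remains reasonably strong.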
3.9 Enumerative Studies vs. Analytic Studies
• An enumerative study is conducted for the purpose of
determining the “current state of affairs” relative to a
fixed frame (population).
• Example of enumerative study: random sampling of
typing errors made by clerical workers.
• Analytic study focuses on determining the cause(s) of
the errors that were made, with an eye toward
reducing the number.
• Enumerative studies make inferential and descriptive
statements regarding a fixed frame; analytic studies
determine how to improve future performance.
