Hypothesis Testing

Lecture 7
0909.400.01 / 0909.400.02
Dr. P.’s Clinic Consultant Module in Probability & Statistics in Engineering
Unless indicated otherwise, all cartoons are from The Cartoon Guide to Statistics by L. Gonick and W. Smith, 1993, Harper Resource
Today in P&S
▪ Review:
  ➢ Confidence intervals for proportion of success, large-sample mean, small-sample mean
▪ Hypothesis testing
  ➢ Null hypothesis vs. alternative hypothesis
  ➢ A statistician’s cherished values: the α-value, the β-value, the p-value and all that jazz…
  ➢ We find the defendant guilty of committing a type II error…, your honor!
    • Type I and type II error in hypothesis testing
▪ After the exam – next week: Tests of Hypotheses
  ➢ Large sample significance tests for proportions
  ➢ Large sample tests for population mean
  ➢ Small sample tests for population mean
© 2003 All Rights Reserved, Robi Polikar, Rowan University, Dept. of Electrical and Computer Engineering

Confidence Intervals
▪ One of the most important uses of statistics is making decisions from incomplete data.
  ➢ We often wish to make an inference about a population, such as the average shelf life of a product, the average blood pressure of a patient after a specific treatment, the percent defect rate of a product, average test scores, or expected election results, by looking only at the data available from a small sample.
  ➢ The problem is that there is uncertainty in our sample, since it is a random selection from a population whose statistical properties are unknown to us.
[Cartoon: Rowan Engineers’ Night Out]

Confidence Intervals
▪ We have seen that we often simply use the sample mean or the observed proportion of successes for our inferences. But how good is our estimate? Confidence intervals allow us to quantify the uncertainty in that estimate.

Confidence Intervals
▪ For the proportion of successes, the 100(1−α)% confidence interval in which the true probability of success p will lie is
\[ p = \hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \approx \hat{p} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
where s ≈ √(p̂(1−p̂)) is the sample standard deviation of the 0/1 outcomes.
▪ Recall that for, say, a 95% confidence level, α = 0.05, and the above CI is computed by finding the critical z-values ±z_{α/2} corresponding to a tail area of α/2, such that the interval has a 95% probability of containing the true (unknown) value.
[Figure: standard normal curve with central area 1−α between −z_{α/2} and +z_{α/2}, and area α/2 in each tail]
▪ For sample means computed from large samples, the 100(1−α)% confidence interval in which the true mean μ lies is
\[ \mu = \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
where s is the sample standard deviation, so that s/√n estimates the standard deviation of the sample mean.
▪ Critical z-values can easily be computed from tables or from Matlab. The most commonly used values are

Confidence Level (%)    Critical Value z_{α/2}
99.73                   3.00
99                      2.58
98                      2.33
96                      2.05
95.45                   2.00
95                      1.96
90                      1.645
80                      1.28
68.27                   1.00
50                      0.675

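As a rough sketch of how the critical values in the table can be reproduced without a lookup table, the snippet below uses Python's standard-library `statistics.NormalDist`. The slides themselves mention only tables and Matlab; the function name `z_critical` is ours, chosen for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level given as a fraction."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in the upper tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Confidence levels listed in the table above.
for level in (0.9973, 0.99, 0.98, 0.96, 0.9545, 0.95, 0.90, 0.80, 0.6827, 0.50):
    print(f"{100 * level:6.2f}%  z = {z_critical(level):.3f}")
```

The printed values should agree with the table to within rounding.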
Example
▪ Here is one sample of size 100 from a group of students’ weights. Unbeknownst to us, the population is normal with a mean weight of 160 lbs and a standard deviation of 20 lbs. These are the parameters we wish to estimate.
From the sample data we compute:
n = 100
Sample mean x̄ = 157.46
Sample std. dev. s = 18.89
We want to find the 90%, 95% and 99% confidence intervals (α = 0.1, 0.05 and 0.01, respectively) for the mean of the students’ weights, using
\[ \mu = \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
Sample (size 100):
136 136 162 176 153 157 169 180 150 191
115 138 173 164 143 158 141 128 167 174
179 189 136 140 169 169 160 158 174 199
149 161 150 186 189 148 147 146 202 170
166 135 154 165 149 157 170 139 132 197
164 159 135 189 143 151 171 135 139 153
160 137 133 182 155 158 155 165 180 137
139 133 178 146 173 190 118 151 152 155
141 154 160 134 142 147 162 161 132 183
152 179 147 158 135 133 191 152 166 137
\[ z_{0.1/2} = z_{0.05} = 1.645 \;\Rightarrow\; \mu = 157.46 \pm 1.645 \cdot \frac{18.89}{10} \;\Rightarrow\; [154.4,\ 160.6] \]
\[ z_{0.05/2} = z_{0.025} = 1.96 \;\Rightarrow\; \mu = 157.46 \pm 1.96 \cdot \frac{18.89}{10} \;\Rightarrow\; [153.8,\ 161.2] \]
\[ z_{0.01/2} = z_{0.005} = 2.58 \;\Rightarrow\; \mu = 157.46 \pm 2.58 \cdot \frac{18.89}{10} \;\Rightarrow\; [152.6,\ 162.3] \]

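The three intervals can be checked with a few lines of Python. This is only a sketch working from the summary statistics quoted on the slide (n = 100, x̄ = 157.46, s = 18.89), not from the raw data; the helper name `mean_confidence_interval` is ours.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, confidence):
    """Large-sample CI for the mean: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Summary statistics quoted on the slide.
for confidence in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(157.46, 18.89, 100, confidence)
    print(f"{confidence:.0%}: [{low:.1f}, {high:.1f}]")
```

Note how the interval widens as the confidence level goes up: being more certain of catching μ costs precision.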
Small Sample Size
▪ So far we have secretly and inconspicuously introduced the phrase “for sufficiently large sample sizes” into our calculations.
  ➢ Exactly what is sufficiently large? It depends on the problem, but usually n > 40.
  ➢ What happens if n is not sufficiently large?
▪ Recall that in calculating the confidence interval we needed to compute
\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
which includes the term σ, which was unknown to us. So we replaced σ with the sample standard deviation s, so that s/√n is the estimated standard error of the sample mean.
  ➢ While the ratio above, with the true σ, is indeed normal,
\[ \frac{\bar{X} - \mu}{s/\sqrt{n}} \]
  is only approximately normal, and only for large n.
  ➢ In fact, this second ratio is said to have a Student’s t-distribution.

T-Distribution
▪ When X̄ is the mean of a random sample of size n from a normal distribution with mean μ, the random variable
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a probability distribution called the (Student’s) t-distribution with n − 1 degrees of freedom (df).
▪ For large n the r.v. S will have a value s close to the true σ; for small n, however, this is not the case. Therefore, the t-distribution resembles the normal distribution for large n but deviates from it for smaller n.
[Figure: density curves for the standard normal, the t-distribution with large n (large ν), and the t-distribution with small n (small ν)]

Properties of T-Distribution
▪ Let t_ν denote the density function curve for ν degrees of freedom.
1. Each t_ν curve is bell-shaped and centered at 0.
2. Each t_ν curve is more spread out than the standard normal (z) curve.
3. As ν increases, the spread of the corresponding t_ν curve decreases.
4. As ν→∞, the sequence of t_ν curves approaches the standard normal curve (the z curve is called a t curve with df = ∞).
5. Let t_{α,ν} be the number on the measurement axis for which the area under the t curve with ν df to the right of t_{α,ν} is α. Then t_{α,ν} is called a t-critical value (the counterpart of the z critical value for the normal distribution). For brevity, when the meaning is obvious, we will drop ν and simply write t_α, just like z_α.
[Figure: t_ν curve with central area 1−α between −t_{α/2,ν} and t_{α/2,ν}, and area α/2 in each tail]

Confidence Intervals Using T-Distribution
▪ Then, for smaller sample sizes (where the original distribution is normal), we can write the confidence interval expression as follows:
  ➢ Let x̄ and s be the sample mean and standard deviation computed from the results of a random sample from a normal population with mean μ. The 100(1−α)% confidence interval is:
\[ \mu \in \left( \bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \right), \quad\text{that is,}\quad \mu = \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \]
Matlab functions:
tcdf(x,v): returns the area under the t_v curve to the left of x
tinv(p,v): returns the t-critical value to the left of which the area under the curve is p
Strictly speaking, the t-distribution applies only if the population being sampled is normally distributed. In practice, however, the t-distribution works well even if the population distribution is only approximately mound-shaped.

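The Matlab calls `tcdf` and `tinv` have no direct counterpart in Python's standard library, so the sketch below approximates them numerically: the t density is integrated with Simpson's rule, and the critical value is found by bisection. The names `t_pdf`, `t_cdf`, `t_inv` and `t_confidence_interval` are ours, the search range of ±100 is an assumption that covers everyday cases, and the sample summary at the bottom is made up for illustration. A statistics package would be the better tool in real work.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x, nu, steps=2000):
    """Area under the t_nu curve to the left of x (composite Simpson's rule)."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv(p, nu, lo=-100.0, hi=100.0):
    """Value t such that t_cdf(t, nu) is approximately p, found by bisection."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(xbar, s, n, confidence):
    """Small-sample CI for the mean: xbar +/- t_{alpha/2, n-1} * s / sqrt(n)."""
    t_crit = t_inv(1 - (1 - confidence) / 2, n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical sample summary (not from the slides): n = 10, mean 157.0, s = 19.0
print(t_confidence_interval(157.0, 19.0, 10, 0.95))
```

With only 10 observations the critical value is noticeably larger than 1.96, so the interval is wider than the z-based one would be.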
Hypothesis Testing
▪ Estimating the value of a parameter, even along with its confidence interval, has little meaning unless we use that information to make a decision.
  ➢ The probability that a randomly selected processor from a specific manufacturer will be flawed is 0.24% ± 0.01% with a confidence level of 95%... So what…?
  ➢ Shall we decide that this is a reliable processor?
▪ Confidence intervals are most useful in making decisions based on statistical tests.
  ➢ Given an observation based on a finite random sample, can this observation be entirely due to chance?
▪ In hypothesis testing (HT), we compare two hypotheses against each other and determine whether we have enough statistical evidence to reject the hypothesis that the observation is entirely due to chance.

Hypothesis Testing: Setting the stage
▪ We will start with an example to familiarize ourselves with the terminology. Note that the same scenario can easily be recast in a number of engineering or non-engineering settings:
  ➢ As the CEO of Owl Superior Chip Co., you hear the announcement of your competitor Lentil’s new chips: Pantsium XIX, and its low-cost version Crapleron.
  ➢ Lentil declares that their chips, even the low-cost versions, are 99.99% defect free (that is, only 0.01% of their chips are flawed).
  ➢ Since you have been in this business for quite some time (2 ½ months), you think this is pretty impressive, if not too good to be true… You are suspicious.
  ➢ You know that Pantsium is pretty reliable, but 99.99% on Crapleron…?
  ➢ You suspect that Lentil is cheating in its figures… that the 99.99% is primarily for the Pantsium chips, not for the Craplerons… How to prove it?
  ➢ You later learn that in estimating the 99.99% figure, they took a sample of 80 chips, of which only 4 were Craplerons… You consider going to court, stating that this is false advertising! …to which they reply with “…well, we randomly picked 80 chips from a production run that manufactures an equal number of chips of each kind. The fact that there were only 4 Craplerons in the sample is purely coincidental. There is no foul play!”

By chance…?
▪ Other versions of the same scenario:
  ➢ A company has a workforce of 80 employees, consisting of 76 males and 4 females. The company claims that it does not favor males, and that having only 4 females is purely by chance: on the days it was hiring, only men happened to apply, although men and women are equally likely to apply and be successful in such a position.
  ➢ A southern state in the 1960s: out of a panel of 80 potential jurors, only 4 were African American (AA), in a district where 50% of all eligible citizens were AAs.
50% of all eligible employees / jurors / chips are women / African American / Crapleron.
On a random sample of 80 employees / jurors / chips, only 4 are women / African American / Crapleron!
Could this be the result of pure chance?

What are the odds?
▪ If the selection is really random, and each group is 50% of the total population, then the number of women / AAs / Craplerons in the sample would be the binomial random variable X with p = 0.5 and n = 80.
▪ Thus the chance of getting at most 4 women / AAs / Craplerons is P(X ≤ 4), which is
\[ P(X \le 4) = \sum_{k=0}^{4} \binom{80}{k}\, 0.5^{k}\,(1-0.5)^{80-k} \approx 0.0000000000000000014 \]
(the k = 4 term, \binom{80}{4}\,0.5^{4}\,(1-0.5)^{76}, accounts for almost all of this sum).
▪ You think you have enough statistical evidence to reject Lentil’s claim that having only 4 Craplerons in their sample was random or pure chance. You go to court!
[Cartoon: “The probability of 4 Craplerons in a sample of 80 is 0.0000000000000000014, your honor…”]

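The tail probability is easy to verify directly. The sketch below sums the binomial terms with Python's `math.comb`; the helper name `binomial_cdf` is ours.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

p_at_most_4 = binomial_cdf(4, 80, 0.5)   # the probability quoted to the judge
p_exactly_4 = comb(80, 4) * 0.5**80      # the single k = 4 term

print(f"P(X <= 4) = {p_at_most_4:.2e}")
print(f"P(X = 4)  = {p_exactly_4:.2e}")
```

The two numbers are close because the k = 4 term dominates: the terms for k = 0 through 3 together add only about 5% to it.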
What are the odds?
▪ To drive the point home, you argue that this probability is less than the chance of getting three consecutive royal flushes in poker, and in the same league as hitting the big jackpot twice in a row!
  ➢ Remember? Picking 6 numbers out of 52 in order: 0.000000000068
  ➢ Getting 4 Craplerons in a sample of 80: 0.0000000000000000014!
▪ So the judge rejects Lentil’s claim (hypothesis) of random selection!

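For the curious, the courtroom comparisons can be put side by side. This is a sketch under stated assumptions: the lottery figure follows the slide (6 of 52 numbers, in order), while the royal-flush figure assumes a single 5-card hand dealt from a standard 52-card deck, which the slides do not spell out.

```python
from math import comb, perm

# P(X <= 4) for n = 80, p = 0.5 (every term shares the factor 0.5**80)
p_sample = sum(comb(80, k) for k in range(5)) / 2**80

# Picking 6 numbers out of 52 in order
p_lottery = 1 / perm(52, 6)

# Assumption: one royal flush per suit among all 5-card hands
p_royal_flush = 4 / comb(52, 5)

print(f"4 or fewer Craplerons in 80:  {p_sample:.2e}")
print(f"one jackpot:                  {p_lottery:.2e}")
print(f"two jackpots in a row:        {p_lottery**2:.2e}")
print(f"three royal flushes in a row: {p_royal_flush**3:.2e}")
```

Under these assumptions the sample probability falls between the two benchmarks: smaller than three royal flushes in a row, larger than two jackpots in a row.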
Formal Definitions
▪ A statistical hypothesis is a claim about the value or values of one or more parameters.
  ➢ Proportion of defective chips is p < 0.01%
  ➢ Average SAT math score in NJ is s > 500
  ➢ Average wattage of a 60 W bulb is w = 60 W
▪ In any hypothesis testing problem, there are two competing hypotheses:
  ➢ H0 – Null hypothesis: the protected hypothesis that is initially assumed to be true, such as “the observations are the result of pure chance”.
  ➢ Ha – Alternative hypothesis: the claim that the null hypothesis is false, such as “the observations are not by chance, but are the result of a real effect, plus variation”.
▪ The test analyzes the observed data to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
▪ The burden of proof is on the alternative hypothesis. If the data do not strongly support the Ha claim, then the test fails to reject H0.

H0 vs. Ha
▪ Often we wish to find out whether a new value / a new theory / a new treatment plan is better than the previous / existing one.
  ➢ H0: The claim that the current value / theory / plan is better.
  ➢ Ha: The alternative claim that the new value / theory / plan is better.
▪ We only replace the current with the new if there is enough convincing and compelling evidence to do so.
  ➢ Ex: In the defective chips example, if we develop a new procedure to fabricate the chips, we would use it if and only if it produces fewer defects. Suppose the current procedure has a proportion of defective chips of p = 0.01.
    • Ha, on which the burden of proof is placed, is the assertion that the new procedure has p < 0.01. H0 is then the initial and prior claim that p = 0.01.
▪ The null hypothesis is always written in the form H0: θ = θ0 (the null value).
▪ The alternative hypothesis can be in any of the following three forms:
    • Ha: θ > θ0 (which implicitly assumes that H0: θ ≤ θ0)
    • Ha: θ < θ0 (which implicitly assumes that H0: θ ≥ θ0)
    • Ha: θ ≠ θ0 (which implicitly assumes that H0: θ = θ0)

Choosing an Appropriate Test
▪ Suppose that a 9 V battery, when fresh, is required to provide 9.1 V. As the quality control engineer, you draw a random sample of size n to determine whether you are in compliance. You design an experiment where H0: μ = 9.1 V and
a. Ha: μ > 9.1 V
b. Ha: μ < 9.1 V
c. Ha: μ ≠ 9.1 V
You would choose (a) because in this formulation H0 indicates non-compliance. As a quality control engineer, you put the burden of proof on asserting that the specs are satisfied. If we were to choose the other options, then H0 would indicate compliance, and Ha would put the burden of proof on asserting that the batteries are in non-compliance. If you were challenged in a legal proceeding, however, the alleger would have to choose test (b).
▪ Suppose 5 pCi/L is the borderline for radioactivity in water. Which test would you choose?
Choose H0: μ = 5 pCi/L vs. Ha: μ < 5 pCi/L. Then the water is believed unsafe unless proven otherwise; that is, the burden of proof is on showing that the water is indeed safe (μ < 5 pCi/L). Choosing Ha: μ > 5 pCi/L would mean that the water is safe unless proven otherwise.
▪ Suppose you manufacture 20 A fuses for home use. If a fuse burns out at < 20 A, users will complain that the fuse burns out prematurely. If a fuse burns out at > 20 A, a fire may occur due to the malfunctioning fuse. What test should you choose?
Choose H0: μ = 20 A vs. Ha: μ ≠ 20 A, because this time a departure from 20 A in either direction is equally costly, so the alternative must cover both.

Testing Procedure
▪ Step 1: Formulate the hypotheses and determine the null value
The null hypothesis asserts that the current / status quo situation is preferred:
  ➢ H0: Lentil’s sample was purely random – there was a 50% chance of picking either chip
  ➢ H0: The new drug will lower cholesterol by no more than 20%
  ➢ H0: The new engine technology will allow gas mileage of no more than 30 mpg
  ➢ H0: The defective component ratio of our product is the same as the competitor’s
The alternative hypothesis claims that the null hypothesis should be rejected in favor of the new procedure:
  ➢ Ha: Lentil’s sample was not purely random, but rather biased: there was a > 50% chance of picking a Pantsium for the sample
  ➢ Ha: The new drug will lower cholesterol by > 20%
  ➢ Ha: The new engine technology will allow gas mileage > 30 mpg
  ➢ Ha: The defective component ratio of our product is < that of the competitor’s

Testing Procedure
▪ Step 2: Choose a test statistic and the formula for computing it
  ➢ A test statistic is a function of the sample data on which the decision (reject H0 or do not reject H0) will be based.
  ➢ This is the statistical value that will assess your evidence against the null hypothesis.
    • For the random sampling of chips example, the test statistic would be the binomial random variable with probability of success p = 0.5 and number of trials n = 80.
      – For applications of the form “proportion of successes”, the test statistic will generally be the observed sample proportion of successes p̂, compared with the presumed probability of success (p0, the null value). Note that for a large enough sample size this random variable is approximately normal:
\[ z = \frac{\hat{p} - p_0}{\sigma/\sqrt{n}} = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \]
    • For the gas mileage problem, the test statistic would be the sample mean of the gas mileages, assumed normally distributed, of cars with the new technology: H0: μ_new_tech = 30 mpg vs. Ha: μ_new_tech > 30 mpg.
      – For all applications of the form “average value”, the test statistic will generally be the mean of the random sample (sample mean) compared to the presumed average (μ0, the null value). Note that from the CLT, for a sufficiently large sample size, this statistic will also be approximately normal:
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]

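Both test statistics are one-liners in code. In the sketch below the function names are ours; the proportion example reuses the chip numbers from the slides (4 of 80 against p0 = 0.5), while the gas mileage inputs (sample mean 32 mpg, σ = 5 mpg, n = 50) are invented purely for illustration.

```python
from math import sqrt

def z_for_proportion(p_hat, p0, n):
    """z statistic for a sample proportion against the null value p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def z_for_mean(xbar, mu0, sigma, n):
    """z statistic for a sample mean against the null value mu0 (sigma known)."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Chip example from the slides: 4 Craplerons out of 80, null value p0 = 0.5
print(z_for_proportion(4 / 80, 0.5, 80))

# Hypothetical gas mileage sample: xbar = 32 mpg, sigma = 5 mpg, n = 50
print(z_for_mean(32.0, 30.0, 5.0, 50))
```

A z value of about −8 for the chip sample says the observed proportion sits roughly eight standard errors below what random selection would predict.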
Testing Procedure
▪ Step 3: State the rejection region for a selected significance level α
  ➢ The rejection region is the set of all test statistic values for which H0 will be rejected.
    • For Lentil’s random sampling, we may want to reject their hypothesis if the probability of selecting only 4 Craplerons at random is less than a specific value. In the previous example, the de facto rejection threshold (for the judge) was the probability of three royal flushes in a row or of hitting the jackpot twice in a row.
    • For the gas mileage example, we may choose the rejection region as the average gas mileage being at least 35 mpg.
      – Note that since H0 is the default hypothesis, we need a convincing and compelling argument to reject it. Therefore, the rejection region is usually picked so as to give H0 plenty of “benefit of the doubt”.
    • The α value is the significance level of the rejection region: the probability that the test statistic lands in the rejection region when H0 is in fact true. The corresponding confidence is 100(1−α)%. For example, 95% confidence (α = 0.05) for the car example would mean that, if the true average were only 30 mpg, just 5% of random samples of cars with the new technology would produce a sample average inside the rejection region.

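For a z statistic, the rejection region follows directly from α and the form of Ha. The sketch below is one way to write that down; the function names and the labels "greater", "less" and "two-sided" are our own convention, not something the slides define.

```python
from statistics import NormalDist

def rejection_region(alpha, alternative):
    """Return (lower, upper): reject H0 when z < lower or z > upper."""
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: theta > theta0
        return float("-inf"), std_normal.inv_cdf(1 - alpha)
    if alternative == "less":           # Ha: theta < theta0
        return std_normal.inv_cdf(alpha), float("inf")
    if alternative == "two-sided":      # Ha: theta != theta0
        z = std_normal.inv_cdf(1 - alpha / 2)
        return -z, z
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

def in_rejection_region(z, alpha, alternative):
    """True when the observed z statistic falls in the rejection region."""
    lower, upper = rejection_region(alpha, alternative)
    return z < lower or z > upper

print(rejection_region(0.05, "greater"))
print(rejection_region(0.05, "two-sided"))
```

Note that the two-sided test splits α between both tails, so its cutoffs sit farther from zero than the one-sided cutoff at the same α.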
Testing Procedure
▪ Step 4: Compute the sample quantities and decide whether H0 should be rejected
  ➢ For the random sampling example, we compute the probability P(X ≤ 4 | p0 = 0.5, n = 80)
  ➢ For the car example, we compute P(x̄ ≥ 35 | μ0 = 30, σ = …, n = …)
    • We then compare these values to the rejection region at the specified confidence level.
▪ A commonly used figure of merit is the p-value, which answers the following question:
  ➢ If the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one we observed?
  ➢ The smaller the p-value, the stronger the evidence against the null hypothesis.
  ➢ Much more about the p-value later…

Testing Procedure
▪ If the p-value is less than a threshold corresponding to the rejection region, then we agree that there is statistically compelling evidence against H0.
  ➢ For the random sampling example, with p = 1.4×10^-18, we have enough evidence to rule out Lentil’s claim that having only 4 Craplerons in their sample was purely coincidental!

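When the test statistic is a z value, the p-value and the resulting decision can be sketched as follows. The function names and the "less" / "greater" / "two-sided" labels are ours; the comparison against α is the usual convention for the threshold.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Probability, under H0, of a z statistic at least as extreme as the one observed."""
    cdf = NormalDist().cdf
    if alternative == "less":           # Ha: theta < theta0
        return cdf(z)
    if alternative == "greater":        # Ha: theta > theta0
        return 1 - cdf(z)
    if alternative == "two-sided":      # Ha: theta != theta0
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

def decide(z, alpha, alternative):
    """Reject H0 only when the p-value drops below the significance level."""
    return "reject H0" if p_value(z, alternative) < alpha else "fail to reject H0"

print(p_value(-2.88, "less"), decide(-2.88, 0.01, "less"))
print(p_value(1.0, "greater"), decide(1.0, 0.05, "greater"))
```

The wording "fail to reject" is deliberate: a large p-value means the evidence is insufficient, not that H0 has been proven.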
Errors in Hypothesis Testing
▪ Can we make errors despite being overcautious and giving H0 plenty of benefit of the doubt…?
  ➢ Of course. In fact, there are two types of errors we can make. To make the point, think of the fire detector in your house, and how often it goes off if you make the toast a little too dark!
  ➢ Well, this is called a Type I error: an alarm without a fire.

Errors in Hypothesis Testing
  ➢ Every cook knows how to avoid a type I error: just remove the batteries!
  ➢ But then a fire can go undetected, and this is called a Type II error: a fire without an alarm!
▪ Similarly, we can reduce the chance of a Type II error by increasing the sensitivity of the sensor, but then again, that increases the probability of a Type I error.

Errors in Hypothesis Testing
▪ We can put these observations in a table, called the decision table.

                        No fire (H0 true)    Fire (H0 false)
No alarm                Correct              Type II error
Alarm (reject H0)       Type I error         Correct

▪ Now consider the null hypothesis H0 that there is no fire, and Ha: FIRE! The alarm then corresponds to rejection of the null hypothesis.
▪ Statistically speaking:
  ➢ A type I error is committed if we reject the null hypothesis when in fact it is true.
  ➢ A type II error is committed if we fail to reject the null hypothesis when in fact it is false.

α: Type I Error
▪ Examples:
  ➢ For the car example, suppose we observed 50 cars and checked their gas mileage. It is possible that the average gas mileage of those 50 cars was, say, 35.7 mpg, when in fact the true average is below 35. Then by rejecting H0, a type I error is made.
  ➢ On the other hand, it is also possible that the average gas mileage of those 50 cars was, say, 34.6 mpg, when in fact the true average is above 35. Then by not rejecting H0, a type II error is made.
▪ Note that the significance level α we mentioned earlier is the probability of committing a type I error, that is, the probability of obtaining an observation in the rejection region if H0 is indeed true:
  ➢ P(rejecting H0 | H0 is true) = P(type I error | H0) = α
  ➢ Then, with 100(1−α)% confidence, we claim that the observation is statistically very unlikely under H0, and hence reject H0. The lower the α, the higher the confidence we have in rejecting H0, and hence the lower the probability of committing a type I error.

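A small simulation makes the meaning of α concrete: if H0 is true and we repeat the experiment many times, about a fraction α of the samples should still land in the rejection region. The sketch below uses the car example's null value of 30 mpg with n = 50 cars; the value σ = 5 mpg, the number of trials and the seed are our assumptions, since the slides leave σ unspecified.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_type_i_rate(mu0, sigma, n, alpha, trials=20000, seed=0):
    """Fraction of samples drawn under H0 that an upper-tailed z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The mean of n normal observations is itself normal with
        # standard deviation sigma / sqrt(n), so it is drawn directly.
        xbar = rng.gauss(mu0, standard_error)
        if (xbar - mu0) / standard_error > z_crit:
            rejections += 1
    return rejections / trials

# sigma = 5 mpg is a made-up value for illustration
print(simulated_type_i_rate(mu0=30.0, sigma=5.0, n=50, alpha=0.05))
```

The printed rate should hover near 0.05: every one of those rejections is a false alarm, since the data were generated with H0 true.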
Type II error
▪ But sometimes we are interested in type II error: is our alarm sensitive enough?
  ➢ In the past, factories discharging chemicals into waterways were required to show that the discharge had no effect on the downstream wildlife. This is H0. The factory could continue as long as H0 was not rejected at the 0.05 significance level.
  ➢ So a polluter, suspecting that he is in violation of EPA standards, could devise an ineffective pollution monitoring program:
▪ Type I error: Reject H0 when it is true (shut down the factory, when in fact its discharge really has no effect on wildlife)
▪ Type II error: Accept (fail to reject) H0 when it is false (factory continues, when in fact it is decimating the wildlife)

Type II error
  ➢ Such a test, say “interviewing the ducks”, is equivalent to removing the batteries from the fire detector. Both are designed to reduce (remove) type I error.
  ➢ Of course, such a test greatly increases the probability of committing a type II error, that is, accepting the H0 that the factory discharge is harmless when in fact it is not.

β: Type II error
▪ Just as we limit our probability of committing a type I error using a significance level of α, we can also limit our probability of making a type II error.
  ➢ We define: β = P(accepting H0 | Ha is true) = P(type II error | Ha)
  ➢ Thus β is the probability of making a type II error. The lower the β, the more confident we are of not committing a type II error.
  ➢ Again, just as our confidence in not making a type I error is 1−α, our confidence in not making a type II error is 1−β, which is called the power of a hypothesis test.
  ➢ Note that the two types of error, type I and type II, are always in competition: for a fixed sample size, reducing one increases the other.
▪ Of course, we’re happy to report that the environmental regulations have changed since then, requiring pollution monitoring programs to show that they have a high probability of detecting serious pollution events, that is, a very small β, which reveals any hidden flaws in the monitoring program.

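The competition between α and β can be seen numerically. The sketch below borrows the braking-distance numbers from the complete example that follows (μ0 = 120 ft, σ = 10 ft, n = 36, true mean 115 ft) and computes β for a lower-tailed z-test at several values of α. The function name `beta_lower_tailed` is ours.

```python
from math import sqrt
from statistics import NormalDist

def beta_lower_tailed(alpha, mu0, mu_true, sigma, n):
    """P(fail to reject H0: mu = mu0 | true mean is mu_true) for Ha: mu < mu0."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    # Reject H0 when the sample mean falls below this cutoff.
    cutoff = mu0 + std_normal.inv_cdf(alpha) * standard_error
    # Type II error: the sample mean stays at or above the cutoff.
    return 1 - std_normal.cdf((cutoff - mu_true) / standard_error)

for alpha in (0.05, 0.01, 0.001):
    beta = beta_lower_tailed(alpha, mu0=120, mu_true=115, sigma=10, n=36)
    print(f"alpha = {alpha:<6} beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Shrinking α pushes β up when n stays fixed. The way to lower both at once is a larger sample: calling the same function with n = 144 instead of 36 gives a much smaller β at the same α.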
A Complete Example
▪ A new design for braking systems is proposed. For the current system, the true average braking distance at 40 mph is 120 ft. The new system is to be implemented if there is substantial evidence that it reduces the braking distance significantly.
a. What is the parameter of interest, and what are the appropriate hypotheses to test the new system?
b. Suppose the new system’s braking distance has σ = 10 ft. Let X̄ be the sample average braking distance of the new system for 36 observations. Which rejection region is most appropriate? R1: x̄ > 124.80, R2: x̄ < 115.20, R3: {x̄ > 125.13 or x̄ < 114.87}
c. What is the significance level for the appropriate region selected above? How would we change the region to obtain a 99.9% confidence level (α = 0.001)?
d. What is the probability that the new design is NOT implemented when its true average braking distance is actually 115 ft and the appropriate region from above is used?
e. Let
\[ Z = \frac{\bar{X} - 120}{\sigma/\sqrt{n}} \]
What is the significance level for the rejection region z < −2.33? How about z < −2.88?

Solution
a. Let μ = true average braking distance for the new design at 40 mph. We want the burden of proof to be on the claim that the new braking distance is lower, so H0: μ = 120 vs. Ha: μ < 120.
b. We want to give the null hypothesis the benefit of the doubt. Therefore, we need significant evidence that the new average distance is substantially less than that of the existing one. Therefore, we should choose R2: reject H0 if x̄ < 115.2 (< 120).
c. Recall that the significance level α is the probability of a type I error, that is, of rejecting H0 when in fact we shouldn’t. We will reject H0 if the observed average is < 115.2. Under the normal curve with mean 120 (the assumed average for the existing system, hence H0), this is the shaded region to the left of 115.2, whose area is α:
\[ \alpha = P(\bar{x} < 115.2 \mid \mu = 120) = P\!\left(z < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right) = P\!\left(z < \frac{115.2 - 120}{10/6}\right) = P(z < -2.88) \approx 0.002 \;\Rightarrow\; 99.8\%\ \text{confidence} \]
[Figure: normal curve centered at 120 with the left tail below 115.2 shaded (area α) and the remaining area 1−α]
Now, if we want α = 0.001 (that is, confidence increased to 99.9%), then we should expect a smaller rejection region. We find the z-value that gives a shaded area of 0.001 as −3.08 from the Gaussian tables. Then the new rejection region threshold c is:
\[ \frac{c - 120}{10/6} = -3.08 \;\Rightarrow\; c = 114.87, \qquad P\!\left(z < \frac{114.87 - 120}{1.667}\right) \approx 0.001 \]
[Figure: the same curve with the smaller left tail below 114.87 shaded (area 0.001), shown next to the earlier threshold 115.2]

Solution – Cont.
d. What is the probability that the new design is NOT implemented when its true average braking distance is actually 115 ft and the appropriate region from above is used?
▪ Now, if we are not implementing the new design, then we must have failed to reject H0 (presumably because we think we do not have enough evidence). But in fact the true average distance for the new design is 115, which is less than 115.20. Clearly, we are committing a type II error (failing to reject H0 when it should have been rejected).
▪ According to our rejection region, we will not reject H0 if the observed average braking distance is ≥ 115.2. Therefore, we are looking at the probability of the observed average braking distance being at least 115.2 when in fact the sample is drawn from a population that has a mean of 115:
\[ \beta(115) = P(\bar{x} \ge 115.20 \mid \text{true } \mu = 115) = P\!\left(z \ge \frac{115.2 - 115}{1.6667}\right) = P(z \ge 0.12) = 0.4522 \]
[Figure: normal curve centered at 115 with the area to the right of 115.2 shaded (β)]
e. For
\[ Z = \frac{\bar{X} - 120}{\sigma/\sqrt{n}} \]
z is normal, therefore
\[ \alpha = P(z < -2.33) \approx 0.01 \qquad\text{and}\qquad \alpha = P(z < -2.88) \approx 0.002 \]

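Parts c through e can be checked numerically with the standard normal distribution from Python's standard library. This is only a sketch; the variable names are ours. One thing to expect: the exact quantile for a tail area of 0.001 is about −3.09 rather than the tabulated −3.08, so the computed cutoff comes out near 114.85 instead of 114.87.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
mu0, sigma, n = 120.0, 10.0, 36
standard_error = sigma / sqrt(n)   # 10 / 6

# Part c: significance level of R2 (reject H0 if xbar < 115.2)
alpha_r2 = std_normal.cdf((115.2 - mu0) / standard_error)

# Part c: cutoff that would give alpha = 0.001
cutoff_001 = mu0 + std_normal.inv_cdf(0.001) * standard_error

# Part d: type II error probability when the true mean is 115 and R2 is used
beta_115 = 1 - std_normal.cdf((115.2 - 115.0) / standard_error)

# Part e: significance levels for the two z rejection regions
alpha_233 = std_normal.cdf(-2.33)
alpha_288 = std_normal.cdf(-2.88)

print(f"alpha for R2           = {alpha_r2:.4f}")
print(f"cutoff for alpha=0.001 = {cutoff_001:.2f}")
print(f"beta(115)              = {beta_115:.4f}")
print(f"P(z < -2.33)           = {alpha_233:.4f}")
print(f"P(z < -2.88)           = {alpha_288:.4f}")
```

A β of roughly 0.45 means that even if the new brakes really do stop in 115 ft on average, this test would miss the improvement almost half the time with 36 observations.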
No new Homework
MIDTERM: Thursday, Oct 23
▪ The midterm exam will be during the regular class meeting time, Thursday, October 23, 10:50 AM to 12:05 PM.
▪ The exam will consist of several fill-in-the-blank / short answer questions that test your comprehension of statistical concepts, and a few problems that test your ability to make use of these concepts in real-world problems.
▪ The conceptual comprehension part will include questions from all material we have discussed so far, including today’s hypothesis testing; however, the numerical problem section will only include material from Lectures 1 through 6.
▪ The problems will be similar in nature to those you have solved for homework.
▪ The questions will assume that you only have 60 minutes to solve them.
▪ You may bring a one-page (two-sided) equation sheet with you. You may put on it any equation that you think you may need; however, you may NOT put any definition / description / explanation etc. on this sheet. Equation sheets will be collected at the end of the exam.
▪ You may not use your books or laptops. Standard calculators are allowed. I will also provide a table of standard Gaussian distribution values.
▪ A complete solution is necessary for full credit; drawing cartoons, however, is not!
▪ Dr. Linda Head will proctor the exam.
