ConfidInterval

Chapter 9:
Basics of Hypothesis Testing
March 16
In Chapter 9:
9.1 Null and Alternative Hypotheses
9.2 Test Statistic
9.3 P-Value
9.4 Significance Level
9.5 One-Sample z Test
9.6 Power and Sample Size
Review: Basics of Inference
• Population ≡ all possible values
• Sample ≡ a portion of the population
• Statistical inference ≡ generalizing from a
sample to a population with a calculated degree
of certainty
• Two forms of statistical inference
– Hypothesis testing
– Estimation
• Parameter ≡ a numerical characteristic of a population,
e.g., population mean µ, population proportion p
• Statistic ≡ a calculated value from data in the sample,
e.g., sample mean (x̄), sample proportion (p̂)
Distinctions Between Parameters
and Statistics (Chapter 8 review)
              Parameters         Statistics
Source        Population         Sample
Notation      Greek (e.g., µ)    Roman (e.g., x̄)
Vary          No                 Yes
Calculated    No                 Yes
Ch 8 Review:
How A Sample Mean Varies
Sampling distributions of means tend to be
Normal with an expected value equal to
population mean µ and standard deviation (SE) =
σ/√n
$$\bar{x} \sim N(\mu, SE_{\bar{x}}) \quad \text{where } SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
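The claim that sample means vary with standard deviation σ/√n can be checked by simulation. The sketch below is illustrative only: the Normal population with µ = 170 and σ = 40, the sample size n = 64, the number of repetitions, and the function name `simulate_sample_means` are all assumptions chosen for the demonstration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

mu, sigma, n = 170, 40, 64            # assumed illustrative values
means = simulate_sample_means(mu, sigma, n, reps=2000)

theoretical_se = sigma / sqrt(n)      # sigma / sqrt(n)
empirical_se = stdev(means)           # spread of the simulated sample means

print(f"mean of sample means: {mean(means):.2f}")
print(f"empirical SE: {empirical_se:.2f}, theoretical SE: {theoretical_se:.2f}")
```

The empirical spread of the simulated means should land close to the theoretical standard error, and the mean of the sample means close to µ.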
Introduction to Hypothesis Testing
• Hypothesis testing is also called significance testing
• The objective of hypothesis testing is to test claims about parameters
• The first step is to state the problem!
• Then, follow these four steps:
Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)
Understand all four steps of testing,
not just the calculations.
§9.1 Null and Alternative
Hypotheses
• Convert the research question to null and
alternative hypotheses
• The null hypothesis (H0) is a claim of “no
difference in the population”
• The alternative hypothesis (Ha) is a
claim of “difference”
• Seek evidence against H0 as a way of
bolstering Ha
Illustrative Example: “Body Weight”
• Statement of the problem: In the 1970s, 20–29
year old men in the U.S. had a mean body
weight μ of 170 pounds. Standard deviation σ =
40 pounds. We want to test whether mean body
weight in the population now differs.
• H0: μ = 170 (“no difference”)
• The alternative hypothesis can be stated in one
of two ways:
Ha: μ > 170 (one-sided alternative)
Ha: μ ≠ 170 (two-sided alternative)
§9.2 Test Statistic
This chapter introduces the z statistic for one-sample problems about means.
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$$
where µ0 ≡ the population mean assuming the null hypothesis is true
and
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Illustrative Example: z statistic
• For the illustrative example, μ0 = 170
• We take an SRS of n = 64. We know that σ = 40, so the standard error of the mean is
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{64}} = 5$$
• If we found a sample mean of 173, then
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{173 - 170}{5} = 0.60$$
• If we found a sample mean of 185, then
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{185 - 170}{5} = 3.00$$
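The two z statistics above can be reproduced with a few lines of code. This is a minimal sketch; the function names `standard_error` and `z_statistic` are hypothetical helpers, not part of any library.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean when sigma is known."""
    return sigma / sqrt(n)

def z_statistic(xbar, mu0, sigma, n):
    """One-sample z statistic: (xbar - mu0) / SE."""
    return (xbar - mu0) / standard_error(sigma, n)

se = standard_error(sigma=40, n=64)                      # 40 / 8 = 5
z_173 = z_statistic(xbar=173, mu0=170, sigma=40, n=64)   # 3 / 5 = 0.6
z_185 = z_statistic(xbar=185, mu0=170, sigma=40, n=64)   # 15 / 5 = 3.0

print(se, z_173, z_185)
```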
Reasoning Behind the zstat
This is the sampling distribution
of the mean when μ = 170, σ =
40, and n = 64.
$$\bar{x} \sim N(170, 5)$$
§9.3 P-value
• The P-value answers the question: What is the
probability of observing a test statistic equal
to or more extreme than the current statistic,
assuming H0 is true?
• This corresponds to the AUC (area under the curve) in the tail of the Z
sampling distribution beyond the zstat. Use Table
B or a software utility to find this AUC. (See next
slide).
• Smaller and smaller P-values provide stronger
and stronger evidence against H0
The one-sided P-value
for zstat = 0.6 is 0.2743
The one-sided P-value
for zstat = 3.0 is 0.0013
Two-Sided P-Value
• For one-sided Ha → use the area in the tail
beyond the z statistic
• For two-sided Ha → consider deviations “up” and
“down” from expected → double the one-sided
P-value
• For example, if the one-sided P-value = 0.0013,
then the two-sided P-value = 2 × 0.0013 =
0.0026.
• If the one-sided P = 0.2743, then the two-sided
P = 2 × 0.2743 = 0.5486.
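As a software alternative to Table B, the tail areas can be computed from the Standard Normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `p_value` and its `alternative` labels are assumptions made for illustration. Software values can differ from table look-ups in the last digit because the table is rounded.

```python
from statistics import NormalDist

def p_value(z, alternative="two-sided"):
    """P-value for a z statistic under a Standard Normal null distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # Ha: mu > mu0, right tail
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: mu < mu0, left tail
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: mu != mu0, both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

for z in (0.6, 3.0):
    one_sided = p_value(z, "greater")
    two_sided = p_value(z, "two-sided")
    print(f"z = {z}: one-sided P = {one_sided:.4f}, two-sided P = {two_sided:.4f}")
```

Doubling the unrounded one-sided area for z = 0.6 gives about 0.5485, which differs in the fourth decimal from doubling the rounded table value 0.2743.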
Two-tailed P-value
§9.4 Significance Level
• Smaller and smaller P-values provide stronger
and stronger evidence against H0
• Although it is unwise to draw firm cutoffs, here are
conventions that are used as a starting point:
P > 0.10 → non-significant evidence against H0
0.05 < P ≤ 0.10 → marginally significant evidence against H0
0.01 < P ≤ 0.05 → significant evidence against H0
P ≤ 0.01 → highly significant evidence against H0
• Examples
P =.27 is non-significant evidence against H0
P =.01 is highly significant evidence against H0
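The conventions above amount to a simple lookup on the P-value. A minimal sketch follows; the function name `evidence_label` is hypothetical, and the cutoffs are only the starting-point conventions listed above, not firm rules.

```python
def evidence_label(p):
    """Map a P-value to the conventional description of evidence against H0."""
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p > 0.10:
        return "non-significant"
    if p > 0.05:
        return "marginally significant"
    if p > 0.01:
        return "significant"
    return "highly significant"

for p in (0.27, 0.07, 0.03, 0.01):
    print(f"P = {p}: {evidence_label(p)} evidence against H0")
```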
Decision Based on α Level
• Let α represent the probability of erroneously
rejecting H0
• Set α threshold of acceptable error (e.g., let α =
.10, let α = .05, or whatever level is acceptable)
• Reject H0 when P ≤ α
• Example: Set α = .10. Find P = 0.27. Since P >
α, retain H0
• Example: Set α = .05. Find P = .01. Since P < α,
reject H0
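The fixed-level decision rule is a single comparison. The sketch below restates it in code using the two examples above; the function name `decide` is hypothetical.

```python
def decide(p, alpha):
    """Fixed-level decision rule: reject H0 when P <= alpha, otherwise retain it."""
    return "reject H0" if p <= alpha else "retain H0"

print(decide(p=0.27, alpha=0.10))  # P > alpha
print(decide(p=0.01, alpha=0.05))  # P < alpha
```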
§9.5 One-Sample z Test
(Summary)
Test procedure
A. H0: µ = µ0 vs.
Ha: µ ≠ µ0 (two-sided) or
Ha: µ < µ0 (left-sided) or
Ha: µ > µ0 (right-sided)
B. Test statistic
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} \quad \text{where } SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
C. P-value: convert zstat to a P-value [Table B or software]
D. Significance level (optional)
Test conditions:
• Quantitative response
• Good data
• SRS (or facsimile)
• σ known (not calculated)
• Population
approximately Normal
or sample large
(central limit theorem)
Example: The “Lake Wobegon Problem”
• Typically, Wechsler Adult Intelligence Scores are Normal with µ = 100 and σ = 15
• Take an SRS of n = 9
• Measure scores: {116, 128, 125, 119, 89, 99, 105, 116, 118}
• Calculate x̄ = 112.8
• Does this sample mean provide statistically reliable evidence that population mean μ is greater than 100?
Illustrative Example: Lake Wobegon
A. Hypotheses:
H0: µ = 100
Ha: µ > 100 (one-sided)
B. Test statistic:
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{9}} = 5$$
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{112.8 - 100}{5} = 2.56$$
C. P-value: Pr(Z ≥ 2.56) = 0.0052 (Table B)
P = .0052 → highly significant evidence against H0
D. Significance level (optional): This level of
evidence is significant at α = 0.01 (reject H0).
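Steps A through D for this example can be run end to end from the raw scores. This is a sketch under the stated conditions (σ = 15 known, SRS, Normal population); the variable names are illustrative. Because the code keeps the unrounded sample mean (about 112.78 rather than 112.8), its z statistic and P-value differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [116, 128, 125, 119, 89, 99, 105, 116, 118]
mu0, sigma = 100, 15                    # null value and known population SD
alpha = 0.01

xbar = mean(scores)                     # sample mean, about 112.78
se = sigma / sqrt(len(scores))          # 15 / sqrt(9) = 5
z_stat = (xbar - mu0) / se              # about 2.56

p_one_sided = 1 - NormalDist().cdf(z_stat)   # Ha: mu > 100
p_two_sided = 2 * p_one_sided                # Ha: mu != 100

print(f"xbar = {xbar:.2f}, SE = {se:.2f}, z = {z_stat:.2f}")
print(f"one-sided P = {p_one_sided:.4f}, two-sided P = {p_two_sided:.4f}")
print("reject H0" if p_one_sided <= alpha else "retain H0")
```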
Two-Sided Alternative
(Lake Wobegon Illustration)
Two-sided alternative
Ha: µ ≠ 100 doubles the one-sided P-value:
P = 2 × 0.0052 = 0.0104
P = .0104 provides
significant evidence against H0
§9.6 Power and Sample Size
Two types of decision errors:
Type I error = erroneous rejection of a true H0
Type II error = erroneous retention of a false H0
                 Truth
Decision         H0 true              H0 false
Retain H0        Correct retention    Type II error
Reject H0        Type I error         Correct rejection
α ≡ probability of a Type I error
β ≡ Probability of a Type II error
Power
• The traditional hypothesis testing
paradigm considers only Type I errors
• However, we should also consider Type II
errors
• β ≡ probability of a Type II error
• 1 – β ≡ “Power” ≡ probability of avoiding a
Type II error
Power of a z test

$$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)$$
• where Φ(z) represents the cumulative
probability of Standard Normal z (e.g.,
Φ(0) = 0.5)
• μ0 represents the population mean under
the null hypothesis
• μa represents the population mean under
an alternative hypothesis
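The power formula translates directly into code. The sketch below uses `statistics.NormalDist` for Φ and for the quantile z(1−α/2); the function name `z_test_power` is hypothetical, and the inputs in the usage line (μ0 = 170, μa = 190, σ = 40, n = 16, α = 0.05) are taken from this chapter's body weight illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mua, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z test.

    Implements 1 - beta = Phi(-z_{1-alpha/2} + |mu0 - mua| * sqrt(n) / sigma).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return std_normal.cdf(-z_crit + abs(mu0 - mua) * sqrt(n) / sigma)

power = z_test_power(mu0=170, mua=190, sigma=40, n=16)
print(f"power = {power:.4f}")
```

Raising n in the call shows how power grows with sample size when everything else is held fixed.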
Calculating Power: Example
A study of n = 16 retains H0: μ = 170 at α = 0.05
(two-sided). What was the power of the test
conditions to identify a significant difference if the
population mean was actually 190?


$$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)$$
$$= \Phi\left(-1.96 + \frac{|170 - 190|\sqrt{16}}{40}\right)$$
$$= \Phi(0.04) = 0.5160 \quad \text{[from Table B]}$$
Reasoning Behind Power
• Consider two competing sampling distribution
models
– One model assumes H0 is true (top curve, next page)
– Another model assumes Ha is true with μa = 190 (bottom
curve, next page)
• When α = 0.05 (two-sided), we will reject H0
when the sample mean exceeds 189.6 (right tail,
top curve)
• The probability of getting a value greater than
189.6 on the bottom curve is 0.5160,
corresponding to the power of the test
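The two-curve reasoning can be traced numerically. This sketch assumes the conditions of the power example (n = 16, σ = 40, so SE = 10) and models each sampling distribution with `statistics.NormalDist`; the variable names are illustrative. It also computes the lower rejection region, which the reasoning above leaves out because it contributes almost nothing.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, alpha = 40, 16, 0.05
se = sigma / sqrt(n)                          # 40 / 4 = 10

null_model = NormalDist(mu=170, sigma=se)     # sampling distribution if H0 is true
alt_model = NormalDist(mu=190, sigma=se)      # sampling distribution if mu_a = 190

# Two-sided rejection cutoffs for the sample mean under H0
upper_cut = null_model.inv_cdf(1 - alpha / 2)   # about 189.6
lower_cut = null_model.inv_cdf(alpha / 2)       # about 150.4

# Probability of landing in each rejection region when mu is really 190
power_upper = 1 - alt_model.cdf(upper_cut)
power_lower = alt_model.cdf(lower_cut)          # negligible here

print(f"reject H0 when xbar > {upper_cut:.1f} or xbar < {lower_cut:.1f}")
print(f"power from upper tail = {power_upper:.4f}, from lower tail = {power_lower:.6f}")
```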
Sample Size Requirements
The required sample size for a two-sided z test to
achieve a given power is
$$n = \frac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2}$$
where
1 – β ≡ desired power of the test
α ≡ desired significance level
σ ≡ population standard deviation
Δ = μ0 – μa ≡ the difference worth detecting
Illustrative Example: Sample Size
Requirement
• How large a sample is needed for a one-sample
z test with 90% power and α = 0.05 (two-tailed)
when σ = 40? The null hypothesis assumes μ =
170 and the alternative assumes μ = 190. We
look for difference Δ = μ0 − μa = 170 – 190 = −20
$$n = \frac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2} = \frac{40^2 (1.28 + 1.96)^2}{(-20)^2} = 41.99$$
• Round this up to 42 to ensure adequate power.
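The sample size calculation can be scripted as follows. This is a sketch with a hypothetical function name, `required_sample_size`. It evaluates the formula twice: once with the table-rounded quantiles 1.28 and 1.96 used in the hand calculation, and once with unrounded quantiles from `statistics.NormalDist`. The unrounded quantiles give a value just above 42, so rounding up then yields 43 rather than 42; the result is sensitive to rounding when it falls this close to a whole number.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, delta, z_power, z_alpha):
    """n = sigma^2 * (z_{1-beta} + z_{1-alpha/2})^2 / delta^2, before rounding up."""
    return sigma**2 * (z_power + z_alpha) ** 2 / delta**2

sigma, delta = 40, 170 - 190          # delta = mu0 - mua = -20

# Table-rounded quantiles, as in the hand calculation
n_table = required_sample_size(sigma, delta, z_power=1.28, z_alpha=1.96)

# Unrounded quantiles for 90% power and two-sided alpha = 0.05
z_power = NormalDist().inv_cdf(0.90)
z_alpha = NormalDist().inv_cdf(1 - 0.05 / 2)
n_exact = required_sample_size(sigma, delta, z_power, z_alpha)

print(f"table quantiles: n = {n_table:.2f}, round up to {ceil(n_table)}")
print(f"unrounded quantiles: n = {n_exact:.2f}, round up to {ceil(n_exact)}")
```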
Example showing the
conditions for 90%
power.