
Chapter 14
Introduction to Inference
What is Statistical Inference?
"For everyone who does habitually attempt the difficult task of making sense of figures is, in fact, essaying a logical process of the kind we call induction, in that he is attempting to draw inferences from the particular to the general; or, as we more usually say in statistics, from the sample to the population."
— R.A. Fisher (1890–1962), father of modern statistics
Statistical Inference
Two forms of statistical inference:
• Estimation (Confidence Intervals)
• Hypothesis Tests (Significance)
Statistical Inference
• Objective → to infer parameters
• Parameter ≡ a numerical characteristic of a population or probability function
• Examples of parameters:
  μ (population mean; expected value)
  σ (population standard deviation)
  p (probability of "success"; population proportion)
• Chs 14 & 15 introduce concepts about inference
• Chs 15–20 introduce inferential techniques
“Simple Conditions” for Chapter 14
• Data acquired by simple random sample (SRS), i.e., all potential observations have the same probability of entering the sample
• No major departures from Normality in the population
• Value of σ is known or assumed before collecting data
Objective: to infer μ!
Example “Female BMI”
• Statement: What is the mean BMI µ in females between ages 20 and 29?
• Body Mass Index ≡ BMI = weight / height²
• Assume "simple conditions":
  1. SRS
  2. Population approx. Normal
  3. σ = 7.5 (assumed before data collected)
• Plan: Estimate µ with 95% confidence
Reasoning behind estimation
• If I took multiple SRSs, the sample means (x-bars) would be different in each one.
• We do not expect x-bar to be exactly equal to µ → any given x-bar is just an estimate of µ.
• The variability of the x-bars is predictable in the form of a sampling distribution of means.
• Fact: Under the "simple conditions" in this chapter, the sampling distribution of means will be Normal with mean µ and standard deviation:

  σx-bar = σ / √n ← the standard deviation of the mean (also referred to as the standard error of the mean)
Example (Female BMI)
In our example, n = 654 and σ = 7.5. Therefore:

  σx-bar = σ / √n = 7.5 / √654 ≈ 0.3 (rounded)
• σx-bar tells us how close x-bar is likely to be to µ
• The 68-95-99.7 rule tells us that x-bar will be within two
σx-bar units (that’s 0.6) of µ in 95% of samples
• → If we say that µ lies in the interval (x-bar − 0.6) to (x-bar + 0.6), we'll be right 95% of the time
• Therefore, we can be 95% confident that an interval "x-bar ± 0.6" will capture µ
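A minimal Python sketch of this arithmetic, using only the values given above (n = 654, σ = 7.5); it is illustrative, not part of the chapter's procedure:

```python
import math

sigma = 7.5          # population SD, assumed known before the data were collected
n = 654              # SRS size

se = sigma / math.sqrt(n)        # standard error of the mean: sigma / sqrt(n)
margin_95 = 2 * se               # 68-95-99.7 rule: ~95% of x-bars fall within 2 SEs of mu

print(round(se, 3))              # ~0.293, the "0.3 (rounded)" above
print(round(margin_95, 1))       # ~0.6, the 95% margin of error used on this slide
```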
Basis of Confidence Intervals (CIs)
Confidence Interval (CI)
• The CI has two parts
point estimate ± margin of error
• Suppose in our particular sample, the mean is
26.8. This is the point estimate for µ.
• Recall from the previous slide that the margin of
error for our data is 0.6 (with 95% confidence)
• Therefore, the 95% confidence interval (for this
particular sample) = 26.8 ± 0.6 = (26.2, 27.4).
Confidence Level C
• CIs can be calculated at different levels of confidence.
• Let C represent the probability the interval will capture the parameter.
• In our example, C = 95%.
• Other common levels of confidence are 90% and 99%.
• In this chapter we adjust the C level by changing the z* critical value.
Confidence Levels & z critical values
In this chapter we adjust the confidence level by altering the critical value z*.

Common levels of confidence & z* critical values:

Confidence level C           90%     95%     99%
Critical value z* (Table C)  1.645   1.960   2.576
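The Table C values above can be reproduced in software. A short Python sketch (assuming scipy is available; the chapter itself only requires the table) that looks up z* for each confidence level:

```python
from scipy.stats import norm

# z* puts probability (1 - C)/2 in each tail of the standard Normal curve
for C in (0.90, 0.95, 0.99):
    z_star = norm.ppf(1 - (1 - C) / 2)       # same as norm.ppf((1 + C) / 2)
    print(f"C = {C:.0%}  ->  z* = {z_star:.3f}")
# Prints 1.645, 1.960, 2.576, matching Table C
```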
C level CI for μ, σ known ("z procedure")
To estimate µ with confidence level C, use

  x-bar ± z* × (σ / √n)

Use Table C to determine the value of z*.
Example (95% CI): Solve & Conclude
Data: n = 654, x-bar = 26.8

  x-bar ± z* × (σ / √n)
  = 26.8 ± (1.960)(7.5 / √654)
  = 26.8 ± 0.6
  = (26.2, 27.4)
Conclude: We are 95% confident that the population mean BMI µ is between 26.2 and 27.4.
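A minimal Python sketch of the same z-procedure calculation (scipy is used here only to look up z*; the slide uses Table C):

```python
import math
from scipy.stats import norm

n, x_bar, sigma = 654, 26.8, 7.5        # values from the Female BMI example
C = 0.95

z_star = norm.ppf(1 - (1 - C) / 2)      # 1.960
margin = z_star * sigma / math.sqrt(n)  # ~0.57, reported as 0.6 on the slide
print(f"95% CI: ({x_bar - margin:.1f}, {x_bar + margin:.1f})")   # (26.2, 27.4)
```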
Now have students calculate a 99% CI with the data.

Data: n = 654, x-bar = 26.8

  x-bar ± z* × (σ / √n)
Hint: The only thing that changes is the z* critical value.
Conclude: We are 99% confident population mean BMI µ
is between “lower confidence limit (LCL) here” and
“upper confidence limit (UCL) here.”
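For checking the exercise, the same sketch as before works with only the confidence level changed (again an illustrative sketch, assuming scipy):

```python
import math
from scipy.stats import norm

n, x_bar, sigma = 654, 26.8, 7.5
z_star = norm.ppf(1 - (1 - 0.99) / 2)      # 2.576 for C = 99%
margin = z_star * sigma / math.sqrt(n)
print(f"99% CI: ({x_bar - margin:.1f}, {x_bar + margin:.1f})")
```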
Interpreting a CI
• Confidence level C is the success rate of the method that produced the interval.
• We know with C level of confidence that the CI will capture µ.
• We don't know with certainty whether any given CI will capture µ or miss it.
Four-Step Procedure for CIs
Stopping Point for Exam 2
Slides from this point forward could be edited after exam 2.
Hypothesis ("Significance") Tests
• Objective → test a claim about a parameter
• Uses an elaborate vocabulary
4-step Process: Hypothesis (Significance) Testing
State and Plan
Example "Population Weight Gain?"
State: Is there good evidence that the population is gaining weight?
Plan
• Parameter is the population mean weight gain µ
• Null hypothesis H0 → statement of "no difference" → population not gaining weight → H0: μ = 0
• Alternative hypothesis Ha → population gaining weight → Ha: μ > 0
• Type of test: z test if the "simple conditions" (see the "Simple Conditions" slide) are met
Notes on Statistical Hypotheses
• H0 is key to understanding
• Ha contradicts H0
• Ha can be stated in one-sided or two-sided ways
– One-sided Ha specifies the direction of the difference → weight GAIN in the population → Ha: μ > 0
– Two-sided Ha does not specify the direction of the difference → weight CHANGE in the population → Ha: μ ≠ 0
Example “Weight Gain”
“Solve” Sub-steps
(a) Check conditions
SRS
No major departures from Normality
σ known before collecting data
(b) Calculate statistics
See “z Statistic” Slide
(c) Find P-value
Reasoning of Significance Testing
• If H0 and the conditions are true, then the sampling distribution of x-bar would be Normal with mean µ = 0 and standard deviation
  σx-bar = σ / √n = 1 / √10 = 0.316, i.e., x-bar ~ N(0, 0.316)
• If a study produced an x-bar of 0.3, this would be poor evidence against H0
• If a different study produced an x-bar of 1.02, this would be good evidence against H0
Test Statistic
Standardize the sample mean
  zstat = (x-bar − µ0) / (σ / √n)

Suppose: x-bar = 1.02, n = 10, and σ = 1

  zstat = (1.02 − 0) / (1 / √10) = 3.23

→ x-bar is 3 standard deviations greater than expected if H0 is true
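A small Python sketch of this standardization, using only the values given on the slide (x-bar = 1.02, n = 10, σ = 1, µ0 = 0):

```python
import math

x_bar, mu_0, sigma, n = 1.02, 0, 1, 10   # "weight gain" example values

se = sigma / math.sqrt(n)          # 1 / sqrt(10) ~= 0.316
z_stat = (x_bar - mu_0) / se       # how many SEs x-bar lies above mu_0
print(f"SE = {se:.3f}, zstat = {z_stat:.2f}")   # zstat ~= 3.23
```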
P-Value from Z Table
• For Ha: μ > μ0, P-value = Pr(Z > zstat) = right tail beyond zstat
• For Ha: μ < μ0, P-value = Pr(Z < zstat) = left tail beyond zstat
• For Ha: μ ≠ μ0, P-value = 2 × one-tailed P-value
P-value from Z Table
• Draw: sketch the standard Normal curve and shade the right tail beyond zstat = 3.23
• One-sided P-value = Pr(Z > 3.23) = 1 − .9994 = .0006
• Two-sided P-value = 2 × one-sided P-value = 2 × .0006 = .0012
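The Z-table lookup can be mirrored in Python (assuming scipy); norm.sf gives the upper-tail probability directly:

```python
from scipy.stats import norm

z_stat = 3.23

p_one_sided = norm.sf(z_stat)      # Pr(Z > 3.23), right-tail area
p_two_sided = 2 * p_one_sided      # double for the two-sided alternative

print(f"one-sided P = {p_one_sided:.4f}")   # ~0.0006
print(f"two-sided P = {p_two_sided:.4f}")   # ~0.0012
```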
P-value: Interpretation
• P-value ≡ the probability the data would take a value as
extreme or more extreme than observed if H0 were true
• Smaller-and-smaller P-values → stronger-and-stronger evidence against H0
• Conventions
  .10 < P < 1.0 → insignificant evidence against H0
  .05 < P ≤ .10 → marginally significant evidence against H0
  .01 < P ≤ .05 → significant evidence against H0
  0 < P ≤ .01 → highly significant evidence against H0
“Significance Level”
• α (alpha) ≡ threshold for “significance”
• If we choose α = 0.05, we require evidence so
strong that it would occur no more than 5% of
the time when H0 is true
• Decision rule
  P-value ≤ α → evidence is significant
  P-value > α → evidence not significant
• For example, let α = 0.01. The two-sided P-value = 0.0012 is less than .01, so the data are significant at the α = .01 level.
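The decision rule itself is a one-line comparison; a tiny sketch with the numbers from this example:

```python
alpha = 0.01
p_value = 0.0012                    # two-sided P-value from the weight-gain example

print("significant" if p_value <= alpha else "not significant")   # significant
```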
Example “Weight Gain”
Conclusion
• The P-value of .0012 provides highly significant
evidence against H0: µ = 0
• → We rule in favor of Ha: µ ≠ 0
• Conclude: the population's mean weight is changing
• Our sample mean weight gain of 1.02 pounds per person is statistically significant at the α = .002 level but not at the α = .001 level
Basics of Significance Testing