Lectures for January 18 and 20 -- review
Experimental Statistics - week 2
Review Continued
• Sampling Distributions
– Chi-square
– F
• Statistical Inference
– Confidence Intervals
– Hypothesis Tests
Chi-Square Distribution
(distribution of the sample variance)
IF:
• Data are Normally Distributed
• Observations are Independent
Then:
$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2}$$
has a Chi-Square distribution
with n - 1 degrees of freedom
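The claim above can be checked roughly by simulation. The sketch below draws many normal samples, computes (n-1)S²/σ² for each, and compares the simulated mean and variance with the chi-square values n-1 and 2(n-1). The sample size, mean, σ, seed, and the helper name `scaled_sample_variance` are all illustrative assumptions, not values from the lecture.

```python
import random
import statistics

def scaled_sample_variance(n, mu, sigma, rng):
    """Draw n independent Normal(mu, sigma) values and return (n-1)*S^2/sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)  # sample variance with divisor n-1
    return (n - 1) * s2 / sigma ** 2

rng = random.Random(0)            # fixed seed so the run is repeatable
n, mu, sigma = 10, 50.0, 4.0      # hypothetical settings
reps = 5000

draws = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(reps)]
sim_mean = statistics.mean(draws)
sim_var = statistics.variance(draws)

# A chi-square variable with k df has mean k and variance 2k.
print(f"simulated mean {sim_mean:.2f} vs. n-1 = {n - 1}")
print(f"simulated variance {sim_var:.2f} vs. 2(n-1) = {2 * (n - 1)}")
```

Agreement here is only a sanity check on the first two moments; it does not prove the full distributional result.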
Chi-square Distribution, Figure 7.10, page 357
F-Distribution
IF:
• $S_1^2$ and $S_2^2$ are sample variances from 2 samples (of sizes n1 and n2)
• samples independent
• populations are both normal
Then:
$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
has an F-distribution with n1 - 1 and n2 - 1 df
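As a rough check of the variance-ratio result, the sketch below simulates the ratio for two independent normal samples. It compares the simulated mean with d2/(d2-2), the mean of an F distribution with d2 = n2 - 1 denominator degrees of freedom. The sample sizes, standard deviations, seed, and the helper name `variance_ratio` are assumptions chosen for illustration.

```python
import random
import statistics

def variance_ratio(n1, n2, sigma1, sigma2, rng):
    """Return (S1^2/sigma1^2) / (S2^2/sigma2^2) for two independent normal samples."""
    x = [rng.gauss(0.0, sigma1) for _ in range(n1)]
    y = [rng.gauss(0.0, sigma2) for _ in range(n2)]
    return (statistics.variance(x) / sigma1 ** 2) / (statistics.variance(y) / sigma2 ** 2)

rng = random.Random(1)            # fixed seed so the run is repeatable
n1, n2 = 10, 15                   # hypothetical sample sizes
sigma1, sigma2 = 2.0, 5.0         # hypothetical population standard deviations

ratios = [variance_ratio(n1, n2, sigma1, sigma2, rng) for _ in range(5000)]
sim_mean = statistics.mean(ratios)

d2 = n2 - 1                       # denominator degrees of freedom
theory_mean = d2 / (d2 - 2)       # mean of an F distribution (requires d2 > 2)
print(f"simulated mean {sim_mean:.3f} vs. theoretical mean {theory_mean:.3f}")
```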
F-distribution, Figure 7.10, page 357
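(1-α)×100% Confidence Intervals for μ
Setting:
• Data are Normally Distributed
• Observations are Independent
Case 1: σ known
$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Case 2: σ unknown
$$\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}} \qquad (df = n - 1)$$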
CI Example
An insurance company is concerned about the number and magnitude of
hail damage claims it received this year. A random sample of 20 of the
thousands of claims it received this year resulted in an average claim
amount of $6,500 and a standard deviation of $1,500.
What is a 95% confidence interval on the mean claim damage amount?
Suppose that company actuaries believe the company does not need
to increase insurance rates for hail damage if the mean claim damage
amount is no greater than $7,000. Use the above information to make
a recommendation regarding whether rates should be raised.
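For the hail-claim example, a minimal sketch of the unknown-σ interval follows. It assumes the claim amounts are approximately normal. The critical value 2.093 is the usual table value of t with 19 df and upper-tail area 0.025, hard-coded here rather than computed. The function name `t_confidence_interval` is illustrative.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

T_025_DF19 = 2.093  # table value: upper 2.5% point of t with 19 df

lower, upper = t_confidence_interval(xbar=6500.0, s=1500.0, n=20, t_crit=T_025_DF19)
print(f"95% CI for the mean claim: ({lower:.0f}, {upper:.0f})")
# prints: 95% CI for the mean claim: (5798, 7202)
```

Because $7,000 lies inside this interval, the sample by itself does not settle whether the mean claim is above or below the actuaries' threshold at the 95% level.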
Interpretation of 95%
Confidence Interval
100 different 95% CI plotted
in the case for which true
mean is 80
i.e. about 95% of these
confidence intervals should
“cover” the true mean
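The coverage interpretation can be imitated with a small simulation: build 100 intervals from 100 independent samples and count how many contain the true mean of 80. The sketch uses the known-σ interval. The values of σ, n, and the seed are assumptions for illustration only, since the slide does not state them.

```python
import math
import random
from statistics import NormalDist, mean

rng = random.Random(42)                  # fixed seed so the run is repeatable
true_mean, sigma, n = 80.0, 10.0, 25     # sigma and n are assumed values
z = NormalDist().inv_cdf(0.975)          # z_{alpha/2} for a 95% interval
half_width = z * sigma / math.sqrt(n)

covered = 0
for _ in range(100):
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    xbar = mean(sample)
    if xbar - half_width <= true_mean <= xbar + half_width:
        covered += 1

print(f"{covered} of 100 intervals cover the true mean")
```

The count varies from run to run with a different seed; over many such experiments it averages about 95.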
Concern has been mounting
that SAT scores are falling.
• 3 years ago -- National AVG = 955
• Random Sample of 200 graduating high school
students this year (sample average = 935)
(in each case the standard deviation is about 100)
Question: Have SAT scores dropped ?
Procedure: Determine how “extreme” or “rare” our
sample AVG of 935 is if population AVG really is 955.
We must decide:
• The sample came from population with population
AVG = 955 and just by chance the sample AVG is
“small.”
OR
• We are not willing to believe that the pop. AVG
this year is really 955. (Conclude SAT scores
have fallen.)
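One way to quantify how "rare" a sample average of 935 would be is sketched below. It treats 100 as the known population standard deviation and uses the normal approximation for the mean of 200 scores; both are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

mu0 = 955.0      # national average three years ago
xbar = 935.0     # this year's sample average
sd = 100.0       # standard deviation, treated as known
n = 200

se = sd / math.sqrt(n)                 # standard error of the sample mean
z = (xbar - mu0) / se                  # standardized distance from 955
p_lower = NormalDist().cdf(z)          # chance of an average this low or lower

print(f"z = {z:.3f}, lower-tail probability = {p_lower:.4f}")
```

A small lower-tail probability indicates that an average of 935 would be unusual if the population average were still 955.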
Hypothesis Testing Terminology
Statistical Hypothesis
- statement about the parameters of
one or more populations
Null Hypothesis ($H_0$)
- hypothesis to be “tested”
(standard, traditional, claimed, etc.)
- hypothesis of no change, effect, or
difference
(usually what the investigator wants to disprove)
Alternative Hypothesis ($H_a$)
- null is not correct
(usually the hypothesis the
investigator suspects or wants to show)
Basic Hypothesis Testing Question:
Do the Data provide sufficient evidence to
refute the Null Hypothesis?
Hypothesis Testing (cont.)
Critical Region (Rejection Region)
- region of test statistic that leads to
rejection of null (i.e. t > c, etc.)
Critical Value
- endpoint of critical region
Significance Level
- probability that the test statistic will
be in the critical region if null is true
- probability of rejecting the null when it is true
Types of Hypotheses
One-Sided Tests
$H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$
$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$
Two-sided Tests
$H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$
Rejection Regions for One- and
Two-Sided Alternatives
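$H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$
Reject $H_0$ if $t > t_\alpha$
$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$
Reject $H_0$ if $t < -t_\alpha$
[Figure: t density with lower-tail area α shaded to the left of the critical value $-t_\alpha$]
$H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$
Reject $H_0$ if $|t| > t_{\alpha/2}$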
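The three rejection rules can be written as one small decision helper. The function name `reject_null` and the labels "greater", "less", and "two-sided" are illustrative choices. The critical value passed in must match the alternative: $t_\alpha$ for one-sided tests and $t_{\alpha/2}$ for the two-sided test.

```python
def reject_null(t, t_crit, alternative):
    """Apply the rejection rule for the given alternative.

    t_crit is the positive table value: t_alpha for one-sided tests,
    t_{alpha/2} for the two-sided test.
    """
    if alternative == "greater":      # H_a: mu > mu_0
        return t > t_crit
    if alternative == "less":         # H_a: mu < mu_0
        return t < -t_crit
    if alternative == "two-sided":    # H_a: mu != mu_0
        return abs(t) > t_crit
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Hypothetical case: 24 df and alpha = .05, so t_.05 = 1.711 and t_.025 = 2.064.
print(reject_null(-2.39, 1.711, "less"))       # lower-tailed test
print(reject_null(-2.39, 1.711, "greater"))    # upper-tailed test
print(reject_null(-2.39, 2.064, "two-sided"))  # two-sided test
```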
A Standard
Hypothesis Test Write-up
1. State the null and alternative
2. Give significance level, test statistic, and the
rejection region
3. Show calculations
4. State the conclusion
- statistical decision
- give conclusion in language of the problem
Hypothesis Testing Example 1
A solar cell requires a special crystal. If properly
manufactured, the mean weight of these crystals is .4g.
Suppose that 25 crystals are selected at random from a
batch of crystals and it is calculated that for these crystals, the
average is .41g with a standard deviation of .02g. At the
α = .01 level of significance, can we conclude that the batch is
bad?
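A sketch of the calculation step for Example 1 follows. It assumes a two-sided alternative (the batch is "bad" if the mean weight differs from .4g in either direction), which is one reading of the problem. The critical value 2.797 is the table value of t with 24 df and upper-tail area .005.

```python
import math

mu0 = 0.40          # target mean weight (g)
xbar = 0.41         # sample average (g)
s = 0.02            # sample standard deviation (g)
n = 25

t_stat = (xbar - mu0) / (s / math.sqrt(n))
T_005_DF24 = 2.797  # table value for alpha/2 = .005 with n - 1 = 24 df

reject = abs(t_stat) > T_005_DF24   # two-sided rule (assumed)
print(f"t = {t_stat:.2f}, reject H0 at alpha = .01: {reject}")
```

Under this two-sided reading the statistic falls short of the critical value, so the data do not show at the .01 level that the batch is bad.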
Hypothesis Testing Example 2
A box of detergent is designed to weigh on the average
3.25 lbs per box. A random sample of 18 boxes taken from
the production line on a single day has a sample average
of 3.238 lbs and a standard deviation of 0.037 lbs.
Test whether the boxes seem to be underfilled.
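A sketch of the calculation for Example 2 follows, using the lower-tailed alternative that the mean weight is below 3.25 lbs. The problem does not state a significance level, so α = .05 is an assumption here. The value 1.740 is the table value of t with 17 df and upper-tail area .05.

```python
import math

mu0 = 3.25          # designed mean weight (lbs)
xbar = 3.238        # sample average (lbs)
s = 0.037           # sample standard deviation (lbs)
n = 18

t_stat = (xbar - mu0) / (s / math.sqrt(n))
T_05_DF17 = 1.740   # table value for alpha = .05 (assumed) with n - 1 = 17 df

reject = t_stat < -T_05_DF17        # lower-tailed rule
print(f"t = {t_stat:.3f}, reject H0 at alpha = .05: {reject}")
```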
Errors in Hypothesis Testing
                      Actual Situation
Conclusion            Null is True               Null is False
Do Not Reject H0      Correct Decision (1-α)     Type II Error (β)
Reject H0             Type I Error (α)           Correct Decision (1-β) (Power)
$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$
Reject $H_0$ if $t < -t_\alpha$
Note: “Large negative values” of t make us believe
alternative is true
p-Value
the probability of an observation as
extreme or more extreme than the
one observed when the null is true
Suppose t = -2.39 is observed from data for test above
[Figure: the p-value is the area to the left of the observed value t = -2.39]
Note:
-- if p-value is less than or equal to α, then
we reject null at the α significance level
-- the p-value is the smallest level of
significance at which the null hypothesis
would be rejected
Find the p-values for Examples 1 and 2
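One way to approximate these p-values without tables is to integrate the t density numerically, as sketched below. The helper names `t_pdf` and `t_upper_tail` are illustrative, and Simpson's rule is a choice made for this sketch. Treating Example 1 as two-sided and Example 2 as lower-tailed are assumptions about the intended alternatives.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("t must be non-negative; use symmetry for negative values")
    if t == 0:
        return 0.5
    h = t / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3          # 0.5 minus the area between 0 and t

# Example 1: t from the crystal data, 24 df, two-sided (assumed)
t1 = (0.41 - 0.40) / (0.02 / math.sqrt(25))
p1 = 2 * t_upper_tail(abs(t1), 24)

# Example 2: t from the detergent data, 17 df, lower-tailed
t2 = (3.238 - 3.25) / (0.037 / math.sqrt(18))
p2 = t_upper_tail(abs(t2), 17)          # by symmetry, P(T < t2) = P(T > |t2|)

print(f"Example 1: t = {t1:.2f}, two-sided p-value = {p1:.4f}")
print(f"Example 2: t = {t2:.3f}, lower-tail p-value = {p2:.4f}")
```

Each p-value can then be compared with α using the rule stated earlier: reject the null when the p-value is less than or equal to α.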
Two Independent Samples
• Assumptions: Measurements from Each
Population are
– Mutually Independent
Independent within Each Sample
Independent Between Samples
– Normally Distributed (or the Central Limit
Theorem can be Invoked)
• Analysis Differs Based on Whether the Two
Populations Have the Same Standard Deviation
Two Types of Independent
Samples
• Population Standard Deviations Equal
– Can Obtain a Better Estimate of the Common
Standard Deviation by Combining or “Pooling”
Individual Estimates
• Population Standard Deviations Different
– Must Estimate Each Standard Deviation
– Very Good Approximate Tests are Available
If Unsure, Do Not Assume
Equal Standard Deviations
Equal Population Standard
Deviations
Test Statistic
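$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad s_p = \sqrt{s_p^2}$$
df = n1 + n2 - 2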
Behrens-Fisher Problem
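If $\sigma_1 \ne \sigma_2$:
$$\frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \not\sim t$$
(i.e. this statistic does not have an exact t distribution)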
Satterthwaite’s Approximate t
Statistic
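If $\sigma_1 \ne \sigma_2$:
$$\frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \approx t \qquad \text{(i.e. approximate t)}$$
$$df = \frac{(a + b)^2}{\dfrac{a^2}{n_1 - 1} + \dfrac{b^2}{n_2 - 1}}, \qquad a = \frac{s_1^2}{n_1}, \quad b = \frac{s_2^2}{n_2}$$
(Approximate t df)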
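A minimal sketch of the approximate t statistic and its degrees of freedom follows. The function name `satterthwaite_t` is illustrative. For concreteness it is applied to the summary statistics of the two plastics in Example 3 of this review, under the null hypothesis of no difference in means. Example 3 itself assumes equal standard deviations, so this run is only an illustration of the arithmetic.

```python
import math

def satterthwaite_t(y1, s1, n1, y2, s2, n2, delta0=0.0):
    """Return (approximate t statistic, approximate df) without pooling."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    t_approx = ((y1 - y2) - delta0) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_approx, df

# Plastic A: n = 35, mean 28.3, sd 3.3; Plastic B: n = 40, mean 26.7, sd 4.9
t_approx, df = satterthwaite_t(28.3, 3.3, 35, 26.7, 4.9, 40)
print(f"approximate t = {t_approx:.3f}, approximate df = {df:.1f}")
```

The approximate df is generally not an integer; a common convention when using tables is to round it down.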
Often-Recommended Strategy
for Tests on Means
Test Whether σ1 = σ2 (F-test)
– If the test is not rejected, use the 2-sample t statistic,
assuming equal standard deviations
– If the test is rejected, use Satterthwaite’s approximate t
statistic
NOTE: This is Not a Wise Strategy
– the F-test is highly susceptible to non-normality
Recommended Strategy:
– If uncertain about whether the standard deviations are
equal, use Satterthwaite’s approximate t statistic
Example 3:
Comparing the Mean Breaking
Strengths of 2 Plastics
Question: Is there a difference between the 2
plastics in terms of mean breaking strength?
Plastic A: $n_A = 35$, $\bar{y}_A = 28.3$, $s_A = 3.3$
Plastic B: $n_B = 40$, $\bar{y}_B = 26.7$, $s_B = 4.9$
Assumptions:
Mutually independent measurements
Normal distributions for measurements from
each type of plastic
Equal population standard deviations
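Under the equal-standard-deviation assumption, the pooled two-sample t statistic for these data can be computed as in the sketch below. The function name `pooled_t` is illustrative, and the null hypothesis is taken to be no difference in mean breaking strength.

```python
import math

def pooled_t(y1, s1, n1, y2, s2, n2, delta0=0.0):
    """Return (t statistic, df) for the equal-variance two-sample t test."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)  # pooled variance
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return ((y1 - y2) - delta0) / se, n1 + n2 - 2

# Plastic A: n = 35, mean 28.3, sd 3.3; Plastic B: n = 40, mean 26.7, sd 4.9
t_stat, df = pooled_t(28.3, 3.3, 35, 26.7, 4.9, 40)
print(f"pooled t = {t_stat:.3f} with {df} df")
```

For the two-sided question posed here, |t| would be compared with $t_{\alpha/2}$ for 73 df at whatever significance level is chosen.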
New diet -- Is it effective?
Design:
50 people: randomly assign 25 to go on diet
and 25 to eat normally for next month.
Assess results by comparing weights at end
of 1 month.
Diet:    $\bar{X}_D$, $S_D$
No Diet: $\bar{X}_{ND}$, $S_{ND}$
Run 2-sample t-test using guidelines we have
discussed.
Is this a good design?
Better Design:
Randomly select subjects and measure
them before and after 1-month on the diet.
Subject   Before   After   Difference
1         150      147     3
2         210      195     15
:         :        :       :
n         187      190     -3
Procedure: Calculate differences, and analyze
differences using a 1-sample test
“Paired t-Test”
Example 4:
International Gymnastics
Judging
Question: Do judges from a contestant’s
country rate their own contestant higher than
do foreign judges?
Data:
Contestant       1    2    3    4    5    6    7    8    9    10   11   12
Native Judge     6.8  4.5  8.0  7.2  8.7  4.5  6.6  5.8  6.0  8.8  8.7  4.4
Foreign Judges   6.7  4.3  8.1  7.2  8.3  4.6  5.4  5.9  6.1  9.1  8.7  4.3
i.e. test $H_0: \mu_N = \mu_F$ vs. $H_a: \mu_N > \mu_F$
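Because each contestant is scored by both kinds of judge, the paired procedure applies: take the native-minus-foreign difference for each contestant and run a one-sample t test on the differences. The sketch below computes the statistic from the twelve pairs in the table. The problem gives no significance level, so α = .05 is an assumption, with table value 1.796 for 11 df.

```python
import math
import statistics

native  = [6.8, 4.5, 8.0, 7.2, 8.7, 4.5, 6.6, 5.8, 6.0, 8.8, 8.7, 4.4]
foreign = [6.7, 4.3, 8.1, 7.2, 8.3, 4.6, 5.4, 5.9, 6.1, 9.1, 8.7, 4.3]

diffs = [nat - frn for nat, frn in zip(native, foreign)]  # native minus foreign
n = len(diffs)
d_bar = statistics.mean(diffs)      # mean difference
s_d = statistics.stdev(diffs)       # standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n))
df = n - 1

T_05_DF11 = 1.796                   # table value for alpha = .05 (assumed), 11 df
reject = t_stat > T_05_DF11         # upper-tailed rule for H_a: mu_N > mu_F
print(f"mean diff = {d_bar:.3f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```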