Lecture 6 - notes - for Dr. Jason P. Turner

Hypothesis Testing II
MARE 250
Dr. Jason Turner
To ASSUME is to make an…
Four assumptions for t-test hypothesis testing:
When do I do the what now?
If all 4 assumptions are met:
If the samples are not independent:
If the variances (std. dev.) are not equal:
If the data is not normal or has small sample size:
When to pool, when to not-pool
Both tests are run by Minitab as “2-sample t-test”
For pooled test check box – “Assume Equal Variances”
For non-pooled, do not check box
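Outside Minitab, the difference between the two options can be sketched by computing both test statistics by hand. This is a minimal illustration using only the Python standard library; the site measurements are made up, and the non-pooled version is assumed to be the usual Welch form with Satterthwaite degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Pooled two-sample t statistic and df (assumes equal variances)."""
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def nonpooled_t(x, y):
    """Non-pooled (Welch) t statistic and approximate df."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    se = sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean(x) - mean(y)) / se, df


# Hypothetical measurements from two sites (not from the lecture)
site_a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9]
site_b = [16.4, 22.8, 9.5, 25.1, 13.0, 19.6]

t_pooled, df_pooled = pooled_t(site_a, site_b)
t_welch, df_welch = nonpooled_t(site_a, site_b)
print(f"pooled:     t = {t_pooled:.3f}, df = {df_pooled}")
print(f"non-pooled: t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With equal sample sizes the two t statistics coincide; the non-pooled test differs only in its smaller degrees of freedom, which makes it more conservative when the variances are unequal.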
Assessing Equal Variances…
Often not recommended:
Although the pooled t-test is moderately robust to
unequal variances, the F-test is extremely non-robust
to departures from normality
The pooled t-test will still give an accurate result
with some degree of unequal variance
The F-test is much stricter about its assumptions than the pooled t-test
Assessing Equal Variances…
In both tests, the null hypothesis (Ho) is that the
population variances under consideration (or
equivalently, the population standard deviations)
are equal, and the alternative hypothesis (Ha) is that
the two variances are not equal.
What the F…?
Use Levene's test when the data come from continuous,
but not necessarily normal, distributions
Levene's test is less sensitive than the F-test, so use the F-test when your
data are normal or nearly normal
When the F…?
Ho: σ1 = σ2
Ha: σ1 ≠ σ2
High p-values (above α-level) Fail to Reject Null
- indicate no statistically significant difference between
the variances (equality or homogeneity of variances)
Low p-values (below α-level) Reject Null
- indicate a difference between the variances (inequality
of variances)
How the F…?
STAT – Basic Statistics – 2-Variances
Enter columns of data as before
Under “Options” can modify α-level of test (but why would you do that)
[Figure: Minitab output, "Test for Equal Variances for Kapoho, Ka Lae", showing 95% Bonferroni confidence intervals for StDevs and a plot of the data for each site]
F-Test: Test Statistic 2.57, P-Value 0.008
Levene's Test: Test Statistic 1.94, P-Value 0.168
Note that by default, MINITAB gives you the results of both the F-test and Levene's test
Must decide a priori which test you plan to utilize
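The decision rule can be applied to this output in a few lines. The sketch below reads the extracted Minitab values as p = 0.008 for the F-test and p = 0.168 for Levene's test, and assumes α = 0.05; the function name `decide` is only illustrative.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0: variances differ"
    return "fail to reject H0: no significant difference in variances"


# P-values as read from the Minitab output for Kapoho vs. Ka Lae
results = {"F-test": 0.008, "Levene's test": 0.168}

for test_name, p in results.items():
    print(f"{test_name}: p = {p} -> {decide(p)}")
```

The two tests lead to opposite conclusions for the same data, which is why the choice of test has to be made before looking at the results.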
Significance Level
The probability of making a TYPE I Error
(rejection of a true null hypothesis) is called the
significance level (α) of a hypothesis test
TYPE II Error Probability (β) – nonrejection of a
false null hypothesis
For a fixed sample size, the smaller we specify the
significance level (α) , the larger will be the
probability (β) , of not rejecting a false hypothesis
I have the POWER!!!
The power of a hypothesis test is the probability of
not making a TYPE II error, that is, of rejecting a false
null hypothesis in favor of the alternative
hypothesis
POWER = 1 - β
Produce a power curve
We need more POWER!!!
For a fixed significance level, increasing the
sample size increases the power
Therefore, you can run a test to determine if your sample
size HAS THE POWER!!!
By using a sufficiently large sample size, we can obtain a
hypothesis test with as much power as we want
Power –
Sample size –
Difference (effect) -
Increasing the power of the test
There are four factors that can increase the power
of a two-sample t-test:
1. Larger effect size (difference) - The greater the real
difference between μ for the two populations, the more likely
it is that the sample means will also be different.
2. Higher α-level (the level of significance) - If you choose a
higher value for α, you increase the probability of rejecting
the null hypothesis, and thus the power of the test. (However,
you also increase your chance of type I error.)
3. Less variability - When the standard deviation is smaller,
smaller differences can be detected.
4. Larger sample sizes - The more observations there are in
your samples, the more confident you can be that the sample
means represent μ for the two populations. Thus, the test will
be more sensitive to smaller differences.
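The four factors can be checked numerically with a rough power calculation. The sketch below uses a normal (z) approximation rather than the exact t-based calculation a statistics package would use, and every number in it is hypothetical; `approx_power` is an illustrative helper, not a Minitab function.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff, sd, n, alpha=0.05):
    """Approximate power of a two-sided two-sample test, n observations per group.

    Normal approximation: slightly optimistic compared with the t-test for small n.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = diff / (sd * sqrt(2 / n))  # true difference in standard-error units
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical baseline: true difference 5, standard deviation 10, 20 per group
baseline = approx_power(diff=5, sd=10, n=20)
scenarios = {
    "baseline": baseline,
    "1. larger difference (10)": approx_power(diff=10, sd=10, n=20),
    "2. higher alpha (0.10)": approx_power(diff=5, sd=10, n=20, alpha=0.10),
    "3. less variability (sd 5)": approx_power(diff=5, sd=5, n=20),
    "4. larger samples (n 40)": approx_power(diff=5, sd=10, n=40),
}
for label, power in scenarios.items():
    print(f"{label:28s} power = {power:.3f}  beta = {1 - power:.3f}")
```

Each scenario changes one factor relative to the baseline, and each one raises the power (lowers β).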
Increasing the power of the test
Sample size
Increasing the size of your samples increases the power of
your test
You want enough observations in your samples to achieve
adequate power, but not so many that you waste time
and money on unnecessary sampling
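One way to find "enough but not too many" observations is to solve for the sample size that reaches a target power. This is a sketch under a normal approximation; the target power of 0.80, the difference, and the standard deviation are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, power=0.80, alpha=0.05):
    """Approximate observations needed per group for a two-sided two-sample test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) * sd / diff) ** 2)


# Hypothetical planning values: detect a difference of 5 when sd is 10
print(n_per_group(diff=5, sd=10))              # target power 0.80
print(n_per_group(diff=5, sd=10, power=0.90))  # higher power needs more data
print(n_per_group(diff=10, sd=10))             # larger effects need less data
```

Because it relies on the normal approximation, the result is a lower bound; an exact t-based calculation typically asks for one or two more observations per group.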
When to pair, when to not-pair
Test is run by Minitab directly as “paired t-test”
Used when there is a natural pairing of the
members of two populations
Each pair consists of a member from one
population and that member's corresponding
member in the other population
Use difference between the two sample means
When to pair, when to not-pair
Paired t-test assumptions:
1. Random Sample
2. Paired differences normally distributed, or large n
3. Outliers can confound results
Tests whether the difference in the pairs is
significantly different from zero
Paired Test - Example
For Example…
If you are testing the effects of some experimental
treatment upon a population
e.g. – effect of new diet upon a single sample of fish
However…
Paired test must have equal sample sizes
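The paired statistic can be sketched as a one-sample t-test on the within-pair differences. The fish weights below are invented for illustration, and `paired_t` is a hypothetical helper; the p-value would still come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired t statistic and degrees of freedom from two matched samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal sizes")
    # Work with the difference within each pair, not the two samples separately
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical weights (g) of the same six fish before and after a new diet
before = [102.0, 98.5, 110.2, 95.0, 105.3, 99.8]
after = [106.1, 101.0, 113.0, 99.2, 107.5, 104.6]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

The length check mirrors the equal-sample-size requirement: a fish with no "after" measurement cannot form a pair.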
When to parametric…
Nonparametric procedures
Statistical procedures that require very few
assumptions about the underlying population.
They are often used when the data are not from a
normal population.
Non-Parametric
Non-parametric t-test (Mann-Whitney):
Tests whether two independent samples come from
populations with the same center (median)
Non-parametric tests are used heavily in some
disciplines, although not typically in the natural
sciences; they are often the “last resort” when data are not
collected correctly, and they have low “power”
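The Mann-Whitney statistic itself is simple to compute: it counts, over every cross-sample pair of observations, how often a value from one sample exceeds a value from the other. A minimal sketch with made-up data follows; the function name is illustrative, and the p-value would still come from tables or a statistics package.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): pairwise 'wins' for each sample, ties counted as half."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    # Every cross-sample pair is a win for x, a win for y, or a shared tie
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical measurements from two independent samples
sample_1 = [3.1, 4.7, 2.8, 5.0, 3.9]
sample_2 = [5.6, 6.2, 4.7, 7.1]

u1, u2 = mann_whitney_u(sample_1, sample_2)
print(f"U1 = {u1}, U2 = {u2}, n1*n2 = {len(sample_1) * len(sample_2)}")
```

Only the ordering of the values matters, not their distances, which is why the test needs no normality assumption and also why it gives up some power.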
Drawbacks of Nonparametric Tests
Nonparametric tests:
Less powerful than parametric tests. Thus, you are less likely
to reject the null hypothesis when it is false.
Often require you to modify the hypotheses. For example, most
nonparametric tests concerning the population center are tests
about the median rather than the mean. The test does not
answer the same question as the corresponding parametric
procedure.
When a choice exists and you are reasonably certain that the
assumptions for the parametric procedure are satisfied, then
use the parametric procedure.