Violation of the normality assumption: How serious is it?


Breaking Statistical Rules:
How bad is it really?
Presented by Sio F. Kong
Joint work with: Janet Locke, Samson Amede
Advisor: Dr. C. K. Chauhan
Background



Make inferences about populations based on information from random samples.
The process is called hypothesis testing.
It is used in many areas such as biology, psychology, business, etc.
Examples

Mean heart rates:
– White newborns vs. African American newborns.
Mean daily intake of saturated fat:
– Among a vegetarian population vs. 15 grams.
Mean SAT score:
– In a particular county vs. the national average.
Notations

Population means: μ1, μ2
– (unknown most of the time)
Population standard deviations: σ1, σ2
– (unknown most of the time)
Sample means: x̄1, x̄2
Sample standard deviations: S1, S2
Pooled standard deviation: Sp
Sample sizes: n1, n2
Two-Sample Hypothesis Testing

Example:
Null Hypothesis: H0: μ1 − µ2 = 0 (two means are equal)
Research Hypothesis: H1: μ1 − µ2 ≠ 0 (two means are not equal)

If x̄1 − x̄2 is significantly away from 0, reject the null hypothesis.
That is, the two means are NOT equal.

The corresponding test statistic has a t-distribution.
Important

This test statistic has a t-distribution under certain conditions:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

– Samples are drawn randomly.
– If samples are small, populations need to be normally distributed.
– The two populations have equal variances, σ1 = σ2.
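
As a minimal sketch of the statistic above, the following Python computes the pooled t statistic from two samples. The function name pooled_t_statistic and the data values are hypothetical, chosen only for illustration. Sp is computed with the usual pooled formula from the two sample variances.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t, d.f.) for the pooled two-sample t test of H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # sample variance, n - 1 in the denominator
    s2_sq = variance(sample2)
    # Pooled standard deviation Sp
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    t = ((mean(sample1) - mean(sample2)) - 0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical data, made up for illustration only.
group_a = [9.1, 10.4, 11.2, 8.7, 10.9]
group_b = [12.3, 13.1, 11.8, 14.0, 12.6]
t, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t:.3f}, d.f. = {df}")
```

The resulting t is then compared with a critical value from the t-distribution with n1 + n2 − 2 degrees of freedom.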
Objective


To investigate the effect of violating the equal-variances condition on the testing procedure.
Our textbook suggests that the effect of the violation is minimal when sample sizes are equal.
Measurement for a GOOD test

Two types of errors:
– Type 1 error: rejecting a true null hypothesis
– Type 2 error: failing to reject a false null hypothesis
α = the probability of a type 1 error
– α is selected in advance, usually 5%.
Power = 1 − Pr(type 2 error)
– Power can be calculated under various alternatives.
A test is good if the power is high under various alternatives while α stays at the selected level.
In this research…


1000 tests are generated by simulation in each situation.
Simulation studies are done to calculate:
– α: probability of rejecting a true null hypothesis
– Power: probability of rejecting a false null hypothesis
Both are calculated under various alternatives when the equal-variances assumption is violated.
Effect when σ1 ≠ σ2:
                   α                      power
                   Pop1       Pop2        Pop1       Pop2
Mean               µ1 = 10    µ2 = 10     µ1 = 10    µ2 = 14
Sample size        n1 = 10    n2 = 10     n1 = 10    n2 = 10
σ1 = 2, σ2 = 3     4.4%                   89.8%
σ1 = 2, σ2 = 4     5.4%                   75.0%
σ1 = 2, σ2 = 5     6.0%                   60.1%
σ1 = 2, σ2 = 10    8.0%                   24.2%

$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Reject if $|t| > t_{\alpha/2,\ d.f.}$
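
The simulation procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not the program used in the study. It assumes normally distributed populations (drawn with random.gauss), a fixed seed, and the tabled two-sided 5% critical value 2.101 for 18 degrees of freedom. All function names are hypothetical. Estimates depend on the seed and will not match the table exactly.

```python
import math
import random
from statistics import mean, variance

# Two-sided 5% critical value of t with 18 d.f. (n1 + n2 - 2 = 18),
# taken from a standard t table. Change it if the sample sizes change.
T_CRIT_18 = 2.101


def rejects(sample1, sample2, t_crit):
    """Apply the pooled t test and report whether H0: mu1 - mu2 = 0 is rejected."""
    n1, n2 = len(sample1), len(sample2)
    sp = math.sqrt(((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2))
                   / (n1 + n2 - 2))
    t = (mean(sample1) - mean(sample2)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return abs(t) > t_crit


def rejection_rate(mu1, mu2, sigma1, sigma2, n1, n2, t_crit, n_tests=1000, seed=0):
    """Fraction of simulated tests that reject H0.

    With mu1 == mu2 this estimates alpha; with mu1 != mu2 it estimates power.
    Normal populations are an assumption of this sketch.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_tests):
        s1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        s2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        if rejects(s1, s2, t_crit):
            rejections += 1
    return rejections / n_tests


# Settings from the table above: sigma1 = 2, sigma2 = 5, n1 = n2 = 10.
alpha_hat = rejection_rate(10, 10, 2, 5, 10, 10, T_CRIT_18)
power_hat = rejection_rate(10, 14, 2, 5, 10, 10, T_CRIT_18)
print(f"estimated alpha = {alpha_hat:.3f}, estimated power = {power_hat:.3f}")
```

Passing the critical value as an argument keeps the sketch within the standard library. Unequal sample sizes with the same total (for example 12 and 8) still give 18 degrees of freedom, so the same value applies.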
Result
Condition not violated: σ1 = σ2
In this example: σ1 = σ2 = 2

             n1, n2     α        power
n1 = n2      10, 10     5.2%     98.5%
n1 ≠ n2      12, 8      5.2%     98.9%
             13, 7      5.0%     98.6%
             14, 6      5.0%     97.7%

Conclusion: When σ1 = σ2, it does not matter if n1 = n2 since it is not a requirement.

Condition violated: σ1 ≠ σ2
In this example: σ1 = 2 and σ2 = 5

             n1, n2     α        power
n1 = n2      10, 10     6.5%     60.1%
n1 ≠ n2      12, 8      9.6%     65.2%
             13, 7      12.2%    66.6%
             14, 6      15.4%    64.2%

Conclusion: When σ1 ≠ σ2, if n1 ≠ n2, the effect on alpha is even more significant.
Conclusion


If the difference between σ1 and σ2 gets larger, α goes up and power goes down.
Other interesting observations:
– If the smaller sample has the larger standard deviation, α goes up.
– If the larger sample has the larger standard deviation, α goes down.
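
One way to see why these observations are plausible: the pooled procedure estimates the variance of x̄1 − x̄2 as Sp²(1/n1 + 1/n2), while the true variance is σ1²/n1 + σ2²/n2. The sketch below compares the two, replacing Sp² with its expected value. The function name variance_ratio is hypothetical, and this calculation is an added illustration rather than part of the simulation study.

```python
def variance_ratio(n1, n2, sigma1, sigma2):
    """Expected pooled estimate of Var(xbar1 - xbar2) divided by the true variance."""
    # Expected value of Sp^2: a weighted average of the two population variances.
    expected_sp_sq = ((n1 - 1) * sigma1 ** 2 + (n2 - 1) * sigma2 ** 2) / (n1 + n2 - 2)
    assumed = expected_sp_sq * (1 / n1 + 1 / n2)
    true = sigma1 ** 2 / n1 + sigma2 ** 2 / n2
    return assumed / true


# sigma1 = 2, sigma2 = 5 as in the Result slide; (6, 14) swaps the sample sizes
# so that the larger sample has the larger standard deviation.
for n1, n2 in [(10, 10), (14, 6), (6, 14)]:
    print(n1, n2, round(variance_ratio(n1, n2, 2, 5), 3))
```

A ratio below 1 means the test understates the variability of x̄1 − x̄2, so |t| exceeds the critical value too often and α goes up. A ratio above 1 has the opposite effect. With equal sample sizes the ratio is essentially 1, which is consistent with the textbook's suggestion that the effect is minimal when n1 = n2.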
Note

This conclusion is based only on what this simulation study has shown.
With different parameters and different alternatives, the results may be different.
Thank You!