PHL - 541
Test of Significance: General Purpose
The idea of significance testing: if we have basic knowledge of the underlying distribution
of a variable, then we can predict how, in repeated samples of equal size, a
particular statistic will "behave," that is, how it is distributed.
Normal distribution (Gaussian distribution) of variables.
 About 68% of values drawn from a normal distribution lie within one SD (σ) of the mean μ.
 About 95% of the values lie within two SD.
 About 99.7% lie within three SD.
 This is known as the 68-95-99.7 (three-sigma) rule.
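The rule can be checked numerically from the normal cumulative distribution function; a minimal sketch using scipy (assuming scipy is available):

```python
from scipy.stats import norm

# Probability mass within k standard deviations of the mean of a normal
# distribution: P(mu - k*sigma < X < mu + k*sigma) = Phi(k) - Phi(-k).
coverage = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} SD: {p:.4%}")
```

The three printed values reproduce the 68-95-99.7 rule to rounding accuracy.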
Are most variables normally distributed?
 Income in the population.
 Incidence rates of rare diseases.
 Heights in different cities.
 Number of car accidents.
Example: If we draw 100 random samples of 100 adults
each from the general population, and compute the mean
height in each sample, then the distribution of the
standardized means across samples will likely approximate
the normal distribution (Student's t distribution with 99 df).
Now imagine that we take an additional sample in a
particular city ("X") where we believe that people are taller
than the average population. If the mean height in that
sample falls in the upper 5% tail (beyond the 95th percentile) of the t
distribution, then we conclude that, indeed, the people of city X
are taller than the average population.
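The sampling-distribution idea above can be simulated directly; this sketch draws 100 samples of 100 "heights" each (the population mean of 170 cm and SD of 10 cm are assumed illustration values, not figures from the lecture) and checks that roughly 95% of the standardized sample means fall within ±1.96:

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 samples of 100 simulated adult heights each.
samples = rng.normal(loc=170, scale=10, size=(100, 100))

# Standardize each sample mean: t = (xbar - mu) / (s / sqrt(n)).
t_stats = (samples.mean(axis=1) - 170) / (samples.std(axis=1, ddof=1) / np.sqrt(100))

# Roughly 95% of the standardized means should fall within +/-1.96.
share = np.mean(np.abs(t_stats) < 1.96)
print(f"{share:.0%} of standardized sample means fall within +/-1.96")
```

A sample mean that lands far outside this range is exactly the kind of "unusually tall city" evidence the lecture describes.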
Test of significance: General Purpose
Sample size: large enough, or very small. If the sample is very small, tests based on the normality
assumption can be used only if we are sure that the variable is normally distributed, and there is
no way to test this assumption when the sample is small (< 5, unpaired).
Measurement Scales:
a) Interval variables allow us not only to rank order the items that are measured, but also to
quantify and compare the sizes of differences between them. For example, temperature, as
measured in degrees Fahrenheit or Celsius, represents an interval scale. We can say that a
temperature of 40 degrees is higher than a temperature of 30 degrees, and that an increase from
20 to 40 degrees is twice as much as an increase from 30 to 40 degrees.
b) Ratio variables are very similar to interval variables. Most statistical data analysis procedures do
not distinguish between the interval and ratio properties of the measurement scales.
c) Nominal variables (qualitative). For example, we can say that two individuals differ in
terms of variable A (e.g., they are of different gender, colour, city, race, etc.), but we cannot say
which one "has more" of the quality represented by the variable.
d) Ordinal variables allow us to rank order the items we measure in terms of which has less and
which has more of the quality represented by the variable, but they still do not allow us to say
"how much more". Examples: good-better-best, upper-middle-lower.
Differences between parametric and nonparametric methods
Parametric Test Procedures | Nonparametric Test Procedures
Involve population parameters, like the mean and SD of the population distribution. | Do not involve population parameters and do not assume the data are normal or t-distributed.
The underlying measurements are at least interval, meaning that equally spaced intervals on the scale can be compared in a meaningful way. For example, temperature, as measured in degrees F or C, constitutes an interval scale: a temperature of 40 degrees is higher than 30 degrees, and an increase from 20 to 40 degrees is twice as much as an increase from 30 to 40 degrees. | The dependent variable may be measured on any scale: interval or ratio where the distribution of the random variable of interest is unspecified; ordinal-scaled data or rankings (good-better-best, upper-middle-lower); or nominal-scaled data (gender, race, colour, city).
Often require large sample sizes to appeal to normality. | Sample sizes can be small.
Based on normality assumptions, i.e. have stringent assumptions (normal distribution). | Have few assumptions about the population distribution (only assume a random sample).
Examples: t test, Z test, F test, ANOVA. | Examples: Mann-Whitney U test, Chi-square test.
Alternatives of Parametric and Nonparametric Methods
Basically, there is at least one nonparametric equivalent for each parametric general type of test. In
general, these tests fall into the following categories:
Tests of differences between groups (independent samples):
1. t test for independent samples  alternative  the Mann-Whitney U test.
2. ANOVA  alternative  Kruskal-Wallis analysis of ranks.
Tests of differences between variables (dependent samples):
1. t test for dependent samples  alternatives  the Sign test and Wilcoxon's matched pairs test,
or McNemar's Chi-square if the variables of interest are dichotomous in nature (i.e., "pass"
vs. "no pass").
2. Repeated measures ANOVA  alternatives  Friedman's two-way analysis of variance,
and the Cochran Q test if the variable is measured in terms of categories (e.g., "passed" vs.
"failed").
Tests of relationships between variables:
Correlation coefficient  equivalents  Spearman R, Kendall Tau, and Coefficient Gamma;
or the Chi-square test, the Phi coefficient, and the Fisher exact test if the two variables of interest
are categorical in nature (e.g., "passed" vs. "failed" by "male" vs. "female").
The Most Frequently Used Nonparametric Tests

Example | Parametric test | Non-parametric test | Purpose of test
To compare girls' heights with boys' heights | Two-sample (unpaired) t test | Mann-Whitney U test | Compares two independent samples drawn from the same population
To compare weight of infants before and after a feed | One-sample (paired) t test | Wilcoxon matched pairs test | Compares two sets of observations on a single sample
To determine whether plasma glucose is higher one, two, or three hours after a meal | One-way analysis of variance (F test) using total sum of squares | Kruskal-Wallis analysis of variance by ranks | Effectively, a generalization of the paired t or Wilcoxon matched pairs test where three or more sets of observations are made on a single sample
In the above example, to determine whether the results differ in male and female subjects | Two-way analysis of variance | Two-way analysis of variance by ranks | As above, but tests the influence (and interaction) of two different covariates
To determine whether acceptance into medical school is more likely if the applicant was born in the same country | χ² test | Fisher's exact test | Tests the null hypothesis that the distribution of a discontinuous variable is the same in two (or more) independent samples
To assess whether and to what extent plasma HbA1 concentration is related to plasma triglyceride concentration in diabetic patients | Product moment correlation coefficient (Pearson's r) | Spearman's rank correlation coefficient (rs) | Assesses the strength of the straight-line association between two continuous variables
When to Use Which Method?
 It is not easy to give simple advice concerning the use of
nonparametric procedures.
 Each nonparametric procedure has its peculiar sensitivities and blind
spots.
 In general, if the result of a study is important (e.g., does a very
expensive and painful drug therapy help people get better?), then it is
always useful to run different nonparametric tests; should discrepancies
in the results occur dependent on which test is used, one should try to
understand why some tests give different results.
 On the other hand, nonparametric statistics are less statistically
powerful (less sensitive) than their parametric counterparts, and if it is
important to detect even small effects one should be very careful in the
choice of a test statistic.
Nonparametric Methods: Mann-Whitney U or
Wilcoxon rank sum test (MWW)
Frank Wilcoxon
(1892-1965)
 Mann-Whitney U test is one of the best-known non-parametric significance tests and is
included in most modern statistical packages. It is also easily calculated by hand for small
samples.
 It was proposed initially by Frank Wilcoxon in 1945 for equal sample sizes, and extended to
arbitrary (random) sample sizes and in other ways by Mann and Whitney (1947).
 The MWW test is practically identical to performing an ordinary parametric two-sample t test on the
data after ranking over the combined samples. It is an excellent alternative to the t test if
your data are significantly skewed.
 The MWW test tests for a difference in medians and for the chance of obtaining greater observations in
one population versus the other.
 The null hypothesis in the MWW test is that both populations have the same probability of
exceeding each other. i.e. no difference in the two population distributions.
 The alternative hypothesis is that the variable in one population is stochastically greater .
 The test involves the calculation of a statistic, usually called U (the sum of ranks), whose
distribution under the null hypothesis is known. In the case of small samples, the distribution is
tabulated, but for sample sizes above ~20 there is a good approximation using the normal distribution.
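In practice the test is usually run from a statistics package; a minimal sketch with scipy (the two samples here are made-up illustration values, not lecture data):

```python
from scipy.stats import mannwhitneyu

# Two small independent samples (illustrative numbers only).
a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

# Two-sided test; with small samples and no ties, scipy uses the exact
# tabulated null distribution of U rather than the normal approximation.
res = mannwhitneyu(a, b, alternative="two-sided")
print(res.statistic, res.pvalue)  # U = 0 here: every value in a is below every value in b
```

A U of zero is the most extreme possible separation, so the p-value is far below 0.05.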
Nonparametric Methods: Mann-Whitney U test
Assumptions for the Mann-Whitney U Test:
 The two samples under investigation in the test are independent of each other and the
observations within each sample are independent.
 The observations are ordinal or continuous measurements (i.e., for any two observations,
one can at least say, whether they are equal or, if not, which one is greater).
Data types that can be analysed with Mann-Whitney U-test:
 Data points should be independent from each other.
 Data do not have to be normal and variances (SD) do not have to be equal.
 All individuals must be selected at random from the population.
 All individuals must have equal chance of being selected.
 Sample sizes should be as equal as possible but some differences are allowed.
Nonparametric Methods: Mann-Whitney U test
Calculations: There are two procedures of doing this
Procedure # 1
 Stage 1: Call one sample A and the other B.
 Stage 2: Place all the values together in rank order (i.e. from lowest to highest). If
two values are tied, the 'A' sample is placed first in the rank.
 Stage 3: Inspect each 'B' sample in turn and count the number of 'A's which
precede (come before) it. Add up the total to get a U value.
 Stage 4: Repeat stage 3, but this time inspect each 'A' in turn and count the number
of 'B's which precede it. Add up the total to get a second U value.
 Stage 5: Take the smaller of the two U values and look up the probability value in
the next table. This gives the percentage probability that the difference between the
two sets of data could have occurred by chance.
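Stages 1-5 can be sketched directly in code; this hypothetical helper (not from the lecture) follows the counting procedure above, placing 'A' first on ties as in Stage 2:

```python
def mann_whitney_counts(sample_a, sample_b):
    """Procedure 1: for each B count the A's preceding it (U_b), and vice versa (U_a)."""
    # Tag observations and rank-order them; on ties, "A" < "B" places A first.
    combined = sorted([(v, "A") for v in sample_a] + [(v, "B") for v in sample_b],
                      key=lambda t: (t[0], t[1]))
    u_a = u_b = a_seen = b_seen = 0
    for _, group in combined:
        if group == "A":
            u_a += b_seen   # number of B's preceding this A
            a_seen += 1
        else:
            u_b += a_seen   # number of A's preceding this B
            b_seen += 1
    return min(u_a, u_b), (u_a, u_b)  # the smaller U is looked up in the table

u, pair = mann_whitney_counts([7, 3, 6, 2, 4, 3, 5, 5], [3, 5, 6, 4, 6, 5, 7, 5])
print(u, pair)  # 17 (17, 47), matching the hand calculation below
```

Running it on the cytogenetic example that follows reproduces Ua = 17 and Ub = 47.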
Example: The results of the cytogenetic analysis of abnormal cells after exposure to
the drug (Y) are shown below together with the concurrent control (X) data. Test to
see if there is a significant difference between the treated and the control groups.
Group (X) = 7; 3; 6; 2; 4; 3; 5; 5
Group (Y) = 3; 5; 6; 4; 6; 5; 7; 5
Nonparametric Methods: Mann-Whitney U test
Solution
 Stage 1: Sample A = 7; 3; 6; 2; 4; 3; 5; 5
          Sample B = 3; 5; 6; 4; 6; 5; 7; 5
 Stage 2: Place all values in rank order, with 'A' first on ties:

Rank | Value | Group
  1  |   2   |  A
  2  |   3   |  A
  3  |   3   |  A
  4  |   3   |  B
  5  |   4   |  A
  6  |   4   |  B
  7  |   5   |  A
  8  |   5   |  A
  9  |   5   |  B
 10  |   5   |  B
 11  |   5   |  B
 12  |   6   |  A
 13  |   6   |  B
 14  |   6   |  B
 15  |   7   |  A
 16  |   7   |  B

 Stage 3: Ub = 3 + 4 + 6 + 6 + 6 + 7 + 7 + 8 = 47
 Stage 4: Ua = 0 + 0 + 0 + 1 + 2 + 2 + 5 + 7 = 17
 Stage 5: U = 17 (the smaller of the two values)
From the next table, U = 17 with n = 8 gives a value of 6.5. The probability that the
difference between group Y and group X occurred just by chance is therefore 6.5 per cent
(p = 0.065).
If you find that there is a significant probability that the differences could have occurred
by chance, this can mean:
1. Either the difference is not significant and there is little point in looking further for
explanations of it, OR
2. Your sample is too small. If you had taken a larger sample, you might well find that
the result of the test of significance changes: the difference between the two groups
becomes more certain.
Percentage probability that a given U value occurs by chance (rows: u; columns: n¹ = 1 to 8; blank cells exceed the tabulated range):

  u | n¹=1 | n¹=2 | n¹=3 | n¹=4 | n¹=5 | n¹=6 | n¹=7 | n¹=8
  0 | 11.1 |  2.2 |  0.6 |  0.2 |  0.1 |  0.0 |  0.0 |  0.0
  1 | 22.2 |  4.4 |  1.2 |  0.4 |  0.2 |  0.1 |  0.0 |  0.0
  2 | 33.3 |  8.9 |  2.4 |  0.8 |  0.3 |  0.1 |  0.1 |  0.0
  3 | 44.4 | 13.3 |  4.2 |  1.4 |  0.5 |  0.2 |  0.1 |  0.1
  4 | 55.6 | 20.0 |  6.7 |  2.4 |  0.9 |  0.4 |  0.2 |  0.1
  5 |      | 26.7 |  9.7 |  3.6 |  1.5 |  0.6 |  0.3 |  0.1
  6 |      | 35.6 | 13.9 |  5.5 |  2.3 |  1.0 |  0.5 |  0.2
  7 |      | 44.4 | 18.8 |  7.7 |  3.3 |  1.5 |  0.7 |  0.3
  8 |      | 55.6 | 24.8 | 10.7 |  4.7 |  2.1 |  1.0 |  0.5
  9 |      |      | 31.5 | 14.1 |  6.4 |  3.0 |  1.4 |  0.7
 10 |      |      | 38.7 | 18.4 |  8.5 |  4.1 |  2.0 |  1.0
 11 |      |      | 46.1 | 23.0 | 11.1 |  5.4 |  2.7 |  1.4
 12 |      |      | 53.9 | 28.5 | 14.2 |  7.1 |  3.6 |  1.9
 13 |      |      |      | 34.1 | 17.7 |  9.1 |  4.7 |  2.5
 14 |      |      |      | 40.4 | 21.7 | 11.4 |  6.0 |  3.2
 15 |      |      |      | 46.7 | 26.2 | 14.1 |  7.6 |  4.1
 16 |      |      |      | 53.3 | 31.1 | 17.2 |  9.5 |  5.2
 17 |      |      |      |      | 36.2 | 20.7 | 11.6 |  6.5
 18 |      |      |      |      | 41.6 | 24.5 | 14.0 |  8.0
 19 |      |      |      |      | 47.2 | 28.6 | 16.8 |  9.7
Nonparametric Methods: Mann-Whitney U test
Example: The results of the cytogenetic analysis of abnormal cells Males (♂) and Females
(♀) are shown below. Test to see if there is a significant difference between these two gender
groups.
Group (♀) = 9; 4; 6; 8; 6 (% of cells)
Group (♂) = 19; 16; 9; 19; 8 (% of cells)
UB = 24
UA = 1
p ≈ 0.02
Nonparametric Methods: Mann-Whitney U test
Procedure # 2
 Choose the sample for which the ranks seem to be smaller (the only reason
to do this is to make computation easier). Call this "sample 1," and call the
other sample "sample 2."
 Taking each observation in sample 1, count the number of observations in
sample 2 that are smaller than it (count a half for any that are equal to it).
 Calculate the sums of ranks R1 and R2, then use the following formulae:
U1 = m × n + m(m + 1)/2 − R1
U2 = m × n + n(n + 1)/2 − R2
U1 + U2 should equal m × n.
NB: If you have ties (equal), a correction should, strictly, be made for ties.
• Rank them anyway, pretending they were slightly different.
• Find the average of the ranks for the identical values, and give them all that rank.
• Carry on as if all the whole-number ranks have been used up.
Nonparametric Methods: Mann-Whitney U test
Example: ranking data with ties
Data: 14; 2; 5; 4; 2; 14; 18; 14

Sorted Data | Rank A (as if all different) | Rank (ties averaged)
     2      |              1               |  1.5  (average of 1 and 2)
     2      |              2               |  1.5
     4      |              3               |  3
     5      |              4               |  4
    14      |              5               |  6    (average of 5, 6 and 7)
    14      |              6               |  6
    14      |              7               |  6
    18      |              8               |  8

Rank the tied values anyway, pretending they were slightly different; find the average of the
ranks for the identical values and give them all that rank; then carry on as if all the
whole-number ranks have been used up.
These can now be used for the Mann-Whitney U test.
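This tie-averaging is exactly what scipy.stats.rankdata does by default; a quick check on the same data:

```python
from scipy.stats import rankdata

data = [14, 2, 5, 4, 2, 14, 18, 14]
ranks = rankdata(data)  # tied values get the average of the ranks they occupy
print(ranks)  # the two 2's share rank 1.5; the three 14's share rank 6
```

The result lists each value's averaged rank in the original data order, ready for the U calculation.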
Nonparametric Methods: Mann-Whitney U test
Solution of our example
Group (X) = 7; 3; 6; 2; 4; 3; 5; 5
Group (Y) = 3; 5; 6; 4; 6; 5; 7; 5

Sample 2 (m = 8): 3, 4, 5, 5, 5, 6, 6, 7
Ranks:            3, 5.5, 9, 9, 9, 13, 13, 15.5 (highest rank)
R1 (sum of the ranks) = 77

Sample 1 (n = 8): 2, 3, 3, 4, 5, 5, 6, 7
Ranks:            1 (lowest rank), 3, 3, 5.5, 9, 9, 13, 15.5
R2 (sum of the ranks) = 59

U1 = 8 × 8 + 8(8 + 1)/2 − 77 = 23
U2 = 8 × 8 + 8(8 + 1)/2 − 59 = 41
(Check: U1 + U2 = 64 = m × n.)

Look at the next table at n = 8 and m = 8: the tabulated value at p = 0.05 (16) is less than
the calculated one (23), so the difference is not significant and we fail to reject H0.
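Procedure 2 can be verified in code; this sketch applies the rank-sum formulae above, using scipy's tie-averaged ranks, to the same X and Y data:

```python
import numpy as np
from scipy.stats import rankdata

def u_statistics(x, y):
    m, n = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))  # average ranks for ties
    r_x, r_y = ranks[:m].sum(), ranks[m:].sum()
    u_x = m * n + m * (m + 1) / 2 - r_x
    u_y = m * n + n * (n + 1) / 2 - r_y
    assert u_x + u_y == m * n  # internal consistency check
    return u_x, u_y

x = [7, 3, 6, 2, 4, 3, 5, 5]
y = [3, 5, 6, 4, 6, 5, 7, 5]
ux, uy = u_statistics(x, y)
print(ux, uy)  # the smaller of the two, 23, matches the solution above
```

The smaller statistic (23) is the U that is compared against the tabulated critical value.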