Using and Understanding the - Association for Biology

Using and Understanding the Chi-squared Test
hypothesis
testable prediction (what you expect to observe)
make observations
Do your observations match what you expected to observe?
No: reject hypothesis
Yes: do not reject hypothesis
this week:
hypothesis: about genotypes of parent corn plants
testable prediction: about expected phenotypic ratios in offspring (corn kernels)
3 purple : 1 yellow (¾ purple, ¼ yellow)
of 868 kernels, expect
868 x ¾ = 651 purple
868 x ¼ = 217 yellow
expected:
3 purple : 1 yellow (¾ purple, ¼ yellow)
of 868 kernels, expect
868 x ¾ = 651 purple
868 x ¼ = 217 yellow
actually observed:
of 868 kernels counted, 656 purple and 212 yellow
phenotype   observed number   expected number
purple      656               651
yellow      212               217
total       868               868
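The expected numbers come straight from the predicted ratio and the kernel total. A minimal Python sketch of that arithmetic follows; the function name `expected_counts` and the dictionary layout are our own illustrative choices, not part of the original slides.

```python
from fractions import Fraction


def expected_counts(ratio, total):
    """Split `total` offspring across phenotype classes according to `ratio`.

    `ratio` maps each phenotype to its part of the predicted ratio,
    e.g. {"purple": 3, "yellow": 1} for 3 purple : 1 yellow.
    """
    parts = sum(ratio.values())
    # Fractions keep 3/4 and 1/4 exact, so no rounding error creeps in.
    return {name: total * Fraction(part, parts) for name, part in ratio.items()}


ratio = {"purple": 3, "yellow": 1}   # hypothetical encoding of 3 purple : 1 yellow
expected = expected_counts(ratio, 868)

for name, count in expected.items():
    print(name, float(count))
```

For the corn data this gives 651 purple and 217 yellow, the same expected numbers used in the comparison with the observed counts.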
Observed and expected don’t match. What to do?
The observed doesn’t match the expected closely enough! Reject the hypothesis!
The observed and the expected are close enough! Don’t reject the hypothesis!
How close is “close enough”?
The chi-squared test to the rescue!
So, your observed and expected numbers are different.
Maybe that difference is because your hypothesis should be rejected…
…but maybe that difference is just due to chance,
and there’s no need to reject your hypothesis.
What is the probability that chance alone would produce a difference this large
between observed and expected, if your hypothesis is correct?
high probability = close enough! The difference is not significant, so don’t reject your hypothesis.
low probability = not close enough! The difference is significant, so reject your hypothesis.
How to perform a chi-squared test
(Recall that your hypothesis generates a predicted phenotypic ratio of 3 purple : 1 yellow)
Phenotype (class)   Observed number (o)   Expected number (e)   (o - e)           (o - e)²      (o - e)² / e
Purple              656                   868 x ¾ = 651         656 - 651 = 5     5² = 25       25/651 = 0.038
Yellow              212                   868 x ¼ = 217         212 - 217 = -5    (-5)² = 25    25/217 = 0.115
TOTAL               868
χ² = 0.038 + 0.115 = 0.153
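The same column-by-column calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name `chi_squared` and the dictionaries are assumptions of ours, not something the slides define.

```python
def chi_squared(observed, expected):
    """Sum (o - e)^2 / e over every phenotype class."""
    return sum(
        (observed[name] - expected[name]) ** 2 / expected[name]
        for name in observed
    )


observed = {"purple": 656, "yellow": 212}
expected = {"purple": 651, "yellow": 217}

chi2 = chi_squared(observed, expected)
print(f"{chi2:.4f}")
```

Without intermediate rounding the sum is about 0.1536. The 0.153 shown above comes from rounding each term to three decimals before adding; the difference does not affect the conclusion.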
so, χ² = 0.153
…but what does this tell us about our hypothesis?
Remember, what we want to find out is:
What is the probability that chance alone would produce a difference this large
between observed and expected, if your hypothesis is correct?
So, we need to use our chi-squared value to look up a p (probability) value.
...how do we look it up?
…in a chi-squared table!
Degrees of   Probability (P)
Freedom      0.95     0.8      0.5      0.2      0.05     0.01     0.005
1            0.004    0.064    0.455    1.642    3.841    6.635    7.879
2            0.103    0.446    1.386    3.219    5.991    9.21     10.597
3            0.352    1.005    2.366    4.642    7.815    11.345   12.838
4            0.711    1.649    3.357    5.989    9.48     13.277   14.86
5            1.145    2.343    4.351    7.289    11.07    15.086   16.75
6            1.635    3.07     5.348    8.558    12.592   16.812   18.548
7            2.167    3.822    6.346    9.803    14.067   18.475   20.278
8            2.733    4.594    7.344    11.03    15.507   20.09    21.955
9            3.325    5.38     8.343    12.242   16.919   21.666   23.589
10           3.94     6.179    9.342    13.442   18.307   23.209   25.188
15           7.261    10.307   14.339   19.311   24.996   30.578   32.801
20           10.851   14.578   19.337   25.038   31.41    37.566   39.997
25           14.611   18.94    24.337   30.675   37.652   44.314   46.928
30           18.493   23.364   29.336   36.25    43.773   50.892   53.672
Non significant: P = 0.95 to 0.2     Significant: P = 0.05 to 0.005
number of degrees of freedom = number of different phenotypes minus 1
2 - 1 = 1 degree of freedom
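Reading the table means picking the row for your degrees of freedom and finding the two neighbouring columns whose values bracket your χ². A rough Python sketch of that lookup is below. The row is copied from the 1 degree of freedom line of the table; the names `DF1_ROW`, `degrees_of_freedom` and `bracket_p` are hypothetical helpers introduced only for illustration.

```python
# (P, critical value) pairs for 1 degree of freedom, copied from the table.
DF1_ROW = [
    (0.95, 0.004), (0.8, 0.064), (0.5, 0.455), (0.2, 1.642),
    (0.05, 3.841), (0.01, 6.635), (0.005, 7.879),
]


def degrees_of_freedom(num_phenotypes):
    """Number of different phenotype classes minus 1."""
    return num_phenotypes - 1


def bracket_p(chi2, row):
    """Return (low, high) such that low < p < high.

    `None` on either side means chi2 lies beyond that end of the table.
    """
    high = None
    for p, critical in row:
        if chi2 < critical:
            # chi2 falls to the left of this column, so p is above this P.
            return (p, high)
        high = p
    return (None, high)


print(degrees_of_freedom(2))       # 1
print(bracket_p(0.153, DF1_ROW))   # (0.5, 0.8)
```

A table can only bracket p between two columns; it cannot give an exact value.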
Degrees of   Probability (P)
Freedom      0.95     0.8      0.5      0.2      0.05     0.01     0.005
1            0.004    0.064    0.455    1.642    3.841    6.635    7.879
non-significant: P = 0.95 to 0.2     significant: P = 0.05 to 0.005
If χ² = 0.153, then 0.5 < p < 0.8
high probability that chance alone would produce a difference this large
between observed and expected:
do not reject hypothesis.
If χ² = 7.5, then 0.005 < p < 0.01
low probability that chance alone would produce a difference this large
between observed and expected:
reject hypothesis.
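If you would rather compute p than bracket it from a table, the case of 1 degree of freedom has a closed form that needs only Python's standard library. The sketch below is an optional cross-check, not part of the original slides. It relies on the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal, so p = erfc(√(χ²/2)). The 0.05 cutoff in `decide` is an assumption based on the usual convention and on where the table marks "significant".

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def decide(chi2, alpha=0.05):
    """Apply the reject / do not reject rule; alpha = 0.05 is an assumed cutoff."""
    return "reject" if p_value_df1(chi2) < alpha else "do not reject"


for chi2 in (0.153, 7.5):
    print(chi2, round(p_value_df1(chi2), 3), decide(chi2))
```

Both computed values fall inside the ranges read from the table: between 0.5 and 0.8 for χ² = 0.153, and between 0.005 and 0.01 for χ² = 7.5. This formula holds only for 1 degree of freedom; experiments with more phenotype classes need the table or a statistics library.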