Module 21: χ2 for Contingency Tables

This module presents the χ2 test for contingency
tables, which can be used for tests of association
and for differences between proportions
Reviewed 06 June 05 /MODULE 21
21- 1
Contingency Tables
Contingency tables are a very common type of table
used to summarize data. The cells of these tables typically
include counts of something and/or percentages. For
our purposes, we can use only those tables that have
integers that are counts. If you are interested in
analyzing data in a contingency table that includes
only percentages in the cells, then you must convert
these percentages into counts in order to proceed.
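As a minimal sketch of that conversion, assuming the size of each group is known, a percentage can be turned back into a count by multiplying by the group total and rounding. The function name `percent_to_count` is hypothetical; the figures are the Control column of the heavy drinkers example used later in this module (407 patients).

```python
def percent_to_count(percent, group_total):
    """Convert a percentage of a group into a whole-number count."""
    return round(percent / 100 * group_total)

# Control group from the heavy drinkers example: 407 patients in total.
control_total = 407
control_percents = {"Decreased": 29.0, "No Change": 54.5, "Increased": 16.5}

control_counts = {
    label: percent_to_count(pct, control_total)
    for label, pct in control_percents.items()
}
print(control_counts)

# Rounded counts do not always add back to the group total, so check them.
print(sum(control_counts.values()) == control_total)
```

Because published percentages are themselves rounded, it is worth confirming that the recovered counts add up to the reported group total before running the test.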
21- 2
Source: AJPH, November 1977;67:1033-1036
21- 3
21- 4
21- 5
Association Hypothesis
H0:
There is no association between the classification
of the numbers of patients who kept their
appointments and the classification of the sex of
the patients,
vs.
H1:
There is association between the classification of
the numbers of patients who kept their
appointments and the classification of the sex of
the patients.
21- 6
Proportions Hypothesis
The association hypothesis can also be stated and tested
as a hypothesis about proportions:
H0: PM = PF
vs.
H1: PM ≠ PF
where
PM = Proportion of males in population who
“kept their appointments,”
PF = Proportion of females in population who
“kept their appointments.”
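The link between the two forms of the hypothesis can be checked numerically. For a 2 × 2 table, the square of the pooled two-proportion z statistic equals the χ2 statistic computed without a continuity correction. The sketch below assumes the counts from the appointment-keeping example (947 of 1,109 males and 1,736 of 2,063 females kept their appointments); the function name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: P1 = P2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Males: 947 of 1,109 kept appointments; females: 1,736 of 2,063.
z = two_proportion_z(947, 1109, 1736, 2063)
print(round(z, 2), round(z ** 2, 2))
```

The squared z value agrees with the χ2 value worked out by hand for these data, apart from rounding of intermediate values in the hand calculation.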
21- 7
Contingency Tables
Contingency tables have:
r = number of rows, not counting any totals
c = number of columns, not counting any totals
r × c = number of cells
The approach is to compare the observed number in each cell
with the number expected under the assumption that
the null hypothesis of no association is true.
21- 8
Calculating Expected Numbers
Calculating the number expected in each contingency
table cell under the assumption that the null hypothesis is
true is, in practice, quite simple. This number is equal to
the total for the row the cell is in times the total for the
column the cell is in divided by the overall total for the
table. You may be interested in working out why this is
true; however, it really isn’t worth the effort.
The algebraic expression is:
$$E = \text{Expected} = \frac{(\text{Row Total})(\text{Column Total})}{\text{Overall Total}}$$
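The rule can be sketched in a few lines of Python. The function name `expected_counts` is hypothetical, and the table is assumed to be a list of rows containing counts only, with no totals. The data are the appointment-keeping counts by sex used in Example 1.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]

# Rows: male, female; columns: kept appointment, didn't keep.
observed = [[947, 162], [1736, 327]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts have the same row and column totals as the observed counts, which is a quick check on the arithmetic.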
21- 9
Calculating χ2
The value of the test statistic χ2 is calculated by comparing
the observed number (O) to the expected number (E) in
each cell according to the following formula:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
The result of this calculation is then compared to the
appropriate value in the tables for the χ2 distribution.
These are in the Table Module 3: The χ2 Distribution.
Note that you need to look up χ20.95(df), where df = (r - 1)(c - 1),
for a test with the α-level at α = 0.05.
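A minimal sketch of the formula, assuming the observed and expected counts are supplied as two tables of the same shape. The function name `chi_square` is hypothetical; the expected values are the rounded ones from the Example 1 worksheet.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells of two equally shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total

# Observed counts and rounded expected counts from the sex example.
observed = [[947, 162], [1736, 327]]
expected = [[938.03, 170.97], [1744.97, 318.03]]
statistic = chi_square(observed, expected)
print(round(statistic, 2))
```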
21- 10
So, for a 2 × 2 table,
df = (2-1)(2-1) = 1 × 1 = 1
Reject H0 if χ2 ≥ χ20.95(1) = 3.84
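The degrees of freedom and the decision rule can be sketched as follows. The names `CRITICAL_95`, `degrees_of_freedom`, and `reject_null` are hypothetical, and the lookup holds only the three critical values quoted in this module, so other df values would need to be added from the χ2 table.

```python
# Upper 5% points of the chi-square distribution quoted in this module.
CRITICAL_95 = {1: 3.84, 2: 5.99, 4: 9.49}

def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1), with totals excluded from r and c."""
    return (rows - 1) * (cols - 1)

def reject_null(statistic, rows, cols):
    """True when the statistic reaches the tabulated 0.95 critical value."""
    return statistic >= CRITICAL_95[degrees_of_freedom(rows, cols)]

print(degrees_of_freedom(2, 2), reject_null(0.86, 2, 2))
```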
21- 11
χ2 Test for Appointment Keeping Example 1
H0:
There is no association between the classification of
the numbers of patients who kept their
appointments and the classification of the sex of the
patients,
vs.
H1:
There is association between the classification of the
numbers of patients who kept their appointments
and the classification of the sex of the patients.
1. The hypothesis:
H0: PM = PF vs. H1: PM ≠ PF
2. The assumption:
Contingency table
3. The α-level:
α = 0.05
21- 12
21- 13
Sex      Kept appt (n)   Kept appt (%)   Didn't keep (n)   Didn't keep (%)   Total
Male               947           85.39               162             14.61   1,109
Female           1,736           84.15               327             15.88   2,063
Total            2,683                               489                     3,172

Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1           947      938.03      8.97      80.46         0.09
2           162      170.97     -8.97      80.46         0.47
3         1,736    1,744.97     -8.97      80.46         0.05
4           327      318.03      8.97      80.46         0.25
Total     3,172    3,172.00         0               χ2 = 0.86
21- 14
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region:
Reject H0 if χ2 ≥ χ20.95(1) = 3.84
6. The result:
χ2 = 0.86
7. The conclusion:
Accept H0, since χ2 = 0.86 < 3.84
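The cell-by-cell worksheet for this example can be reproduced with a short sketch. The function name `worksheet` is hypothetical. Because the code keeps unrounded intermediate values, its total can differ from the hand-calculated 0.86 in the second decimal place; the conclusion is unaffected.

```python
def worksheet(observed):
    """Return one (O, E, O-E, (O-E)^2, (O-E)^2/E) tuple per cell, row by row."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    rows = []
    for i, obs_row in enumerate(observed):
        for j, o in enumerate(obs_row):
            e = row_totals[i] * col_totals[j] / overall
            rows.append((o, e, o - e, (o - e) ** 2, (o - e) ** 2 / e))
    return rows

observed = [[947, 162], [1736, 327]]  # male, female x kept, didn't keep
cells = worksheet(observed)
for cell_number, values in enumerate(cells, start=1):
    print(cell_number, *(round(v, 2) for v in values))

chi2 = sum(values[-1] for values in cells)
# 3.84 is the 0.95 critical value for df = 1 quoted in this module.
print("chi-square:", round(chi2, 2), "reject H0:", chi2 >= 3.84)
```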
21- 15
The published table as shown on slide 13 indicates a
result of
χ2 = 8.54; p > 0.05
Our result is
χ2 = 0.86; p > 0.05
If χ2 = 8.54 with df = 1, then p cannot be > 0.05,
because 8.54 exceeds the critical value χ20.95(1) = 3.84.
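The inconsistency can be confirmed by computing the p-values directly. The sketch below relies on the fact that a χ2 variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function `math.erfc` from the Python standard library. The function name `p_value_df1` is hypothetical, and the formula applies only when df = 1.

```python
import math

def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

for statistic in (8.54, 0.86):
    print(statistic, round(p_value_df1(statistic), 4))
```

A value of 8.54 gives a p-value well below 0.05, while 0.86 gives a p-value well above it, so only the recalculated result is compatible with the reported "p > 0.05".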
21- 16
21- 17
χ2 Test for Appointment Keeping Example 2
H0:
There is no association between the classification of
the numbers of patients who kept their appointments
and the classification of the age of the patients,
vs.
H1:
There is association between the classification of the
numbers of patients who kept their appointments
and the classification of the age of the patients.
1. The hypothesis:
H0: PY = PO vs. H1: PY ≠ PO
2. The assumption:
Contingency table
3. The α-level:
α = 0.05
21- 18
Age      Kept appt (n)   Kept appt (%)   Didn't keep (n)   Didn't keep (%)   Total
<45              2,227           83.00               456             17.00   2,683
45+                442           90.37                47              9.63     489
Total            2,669                               503                     3,172

Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1         2,227    2,257.54    -30.54     932.69         0.41
2           456      425.46     30.54     932.69         2.19
3           442      411.46     30.54     932.69         2.27
4            47       77.54    -30.54     932.69        12.03
Total     3,172       3,172         0              χ2 = 16.90
21- 19
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region:
Reject H0 if χ2 ≥ χ20.95(1) = 3.84
6. The result:
χ2 = 16.90
7. The conclusion:
Reject H0, since χ2 = 16.90 > 3.84
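As an independent check on the worksheet, a 2 × 2 table also admits a well-known shortcut that gives the same statistic without computing expected counts: N(ad - bc)^2 divided by the product of the two row totals and the two column totals. This shortcut is not the method used in the slides; it is offered here only as a cross-check, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut for a 2 x 2 table [[a, b], [c, d]]: N(ad - bc)^2 / (R1 R2 C1 C2)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Age example: <45 (2,227 kept, 456 didn't), 45+ (442 kept, 47 didn't).
chi2 = chi_square_2x2(2227, 456, 442, 47)
print(round(chi2, 2))
```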
21- 20
21- 21
χ2 Test for Appointment Keeping Example 3
H0:
There is no association between the classification of
the numbers of patients who kept their appointments
and the classification of the ethnicity of the patients,
vs.
H1:
There is association between the classification of the
numbers of patients who kept their appointments and
the classification of the ethnicity of the patients.
1. The hypothesis:
H0: PP = PB = PW vs. H1: PP, PB, and PW are not all equal
2. The assumption: Contingency table
3. The α-level:
α = 0.05
21- 22
Ethnicity      Kept appt (n)   Kept appt (%)   Didn't keep (n)   Didn't keep (%)   Total
Puerto Rican           1,797           81.46               409             18.54   2,206
White                    766           91.52                71              8.48     837
Black                    120           93.02                 9              6.98     129
Total                  2,683                               489                     3,172

Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1         1,797    1,865.92    -68.92   4,749.97         2.55
2           409      340.08     68.92   4,749.97        13.97
3           766      707.97     58.03   3,367.48         4.76
4            71      129.03    -58.03   3,367.48        26.10
5           120      109.11     10.89     118.59         1.09
6             9       19.89    -10.89     118.59         5.96
Total     3,172       3,172         0              χ2 = 54.43
21- 23
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region: for a 3 × 2 table; df = (3-1)(2-1) = 2
Reject H0 if χ2 ≥ χ20.95(2) = 5.99
6. The result:
χ2 = 54.43
7. The conclusion:
Reject H0, since χ2 = 54.43 > 5.99
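When a table has more than two rows, it helps to see which cells drive the result. The sketch below, with the hypothetical function name `cell_contributions`, lists the (O - E)^2/E term for each cell of the ethnicity table. It keeps unrounded intermediate values, so its total can differ slightly from the hand-calculated 54.43.

```python
def cell_contributions(observed):
    """(O - E)^2 / E for every cell, in the same layout as the observed table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return [
        [(o - r * c / overall) ** 2 / (r * c / overall)
         for o, c in zip(obs_row, col_totals)]
        for obs_row, r in zip(observed, row_totals)
    ]

labels = ["Puerto Rican", "White", "Black"]
observed = [[1797, 409], [766, 71], [120, 9]]  # columns: kept, didn't keep
contributions = cell_contributions(observed)
for label, row in zip(labels, contributions):
    print(label, [round(x, 2) for x in row])

chi2 = sum(sum(row) for row in contributions)
print("chi-square:", round(chi2, 2))
```

The largest single term comes from White patients who did not keep their appointments, where far fewer were observed than expected under H0.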
21- 24
Source: AJPH, July 1996; 86: 948-955
21- 25
21- 26
21- 27
χ2 Test for Heavy Drinkers Intervention Example 1
1. The Hypothesis
H0: There is no association between the classification of
the number of male patients according to their change in
alcohol consumption and the treatment they received
vs.
H1: There is an association between the classification of
the numbers of male patients according to their change in
alcohol consumption and the treatment they received.
2. The assumption:
Contingency table
3. The α-level:
α = 0.05
21- 28
% of patients
             Control   Simple Advice   Brief Counseling
Decreased       29.0            40.8               40.3
No Change       54.5            53.1               50.3
Increased       16.5             6.1                9.3
Total            100             100               99.9

Number of patients
             Control   Simple Advice   Brief Counseling   Total
Decreased        118             160                186     464
No Change        222             208                232     662
Increased         67              24                 43     134
Total            407             392                461   1,260

Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1           118      149.88    -31.88   1,016.29         6.78
2           160      144.36     15.64     244.75         1.70
3           186      169.77     16.23     263.57         1.55
4           222      213.84      8.16      66.64         0.31
5           208      205.96      2.04       4.18         0.02
6           232      242.21    -10.21     104.20         0.43
7            67       43.28     23.72     562.44        12.99
8            24       41.69    -17.69     312.90         7.51
9            43       49.03     -6.03      36.32         0.74
Total     1,260       1,260      0.00              χ2 = 32.03
21- 29
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region: for a 3 × 3 table; df = (3-1)(3-1) = 4
Reject H0 if χ2 ≥ χ20.95(4) = 9.49
6. The result:
χ2 = 32.03
7. The conclusion:
Reject H0, since χ2 = 32.03 > 9.49
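The df of 4 for this 3 × 3 table can be illustrated numerically. Because the expected counts share the row and column totals of the observed counts, the deviations (O - E) sum to zero in every row and every column, so only (3-1)(3-1) = 4 of the nine deviations are free to vary. The sketch below checks this on the heavy drinkers counts; all variable names are illustrative.

```python
observed = [
    [118, 160, 186],  # Decreased
    [222, 208, 232],  # No change
    [67, 24, 43],     # Increased
]  # columns: Control, Simple Advice, Brief Counseling

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
overall = sum(row_totals)

expected = [[r * c / overall for c in col_totals] for r in row_totals]
deviations = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Every row and every column of (O - E) sums to zero (up to rounding error).
row_sums = [sum(row) for row in deviations]
col_sums = [sum(col) for col in zip(*deviations)]
print(row_sums, col_sums)

chi2 = sum(
    d ** 2 / e
    for dev_row, exp_row in zip(deviations, expected)
    for d, e in zip(dev_row, exp_row)
)
print(round(chi2, 2))
```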
21- 30
21- 31
χ2 Test for Heavy Drinkers Intervention Example 2
1. The Hypothesis
H0: There is no association between the classification of
the number of male patients according to their posttest
hazardous alcohol consumption and the treatment they
received.
vs.
H1: There is an association between the classification of
the numbers of male patients according to their posttest
hazardous alcohol consumption and the treatment they
received.
2. The assumption:
Contingency table
3. The α-level:
α = 0.05
21- 32
% of patients
Posttest     Control   Simple Advice   Brief Counseling
Yes               58              49                 47
No                42              51                 53
Total            100             100                100

Number of patients
Posttest     Control   Simple Advice   Brief Counseling   Total
Yes              234             190                221     645
No               169             197                250     616
Total            403             387                471   1,261

Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1           234      206.13     27.87     776.51         3.77
2           190      197.95     -7.95      63.20         0.32
3           221      240.92    -19.92     396.64         1.65
4           169      196.87    -27.87     776.51         3.94
5           197      189.05      7.95      63.20         0.33
6           250      230.08     19.92     396.64         1.72
Total     1,261    1,261.00      0.00              χ2 = 11.74
21- 33
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region: for a 2 × 3 table; df = (2-1)(3-1) = 2
Reject H0 if χ2 ≥ χ20.95(2) = 5.99
6. The result:
χ2 = 11.74
7. The conclusion:
Reject H0, since χ2 = 11.74 > 5.99
21- 34
Source: AJPH, August 2001; 91: 1258-1263
21- 35
21- 36
21- 37
21- 38
21- 39
χ2 Test for the MATISS Project
1. The Hypothesis
H0: There is no association between the classification of
the number of male patients according to their heart rate
category and the classification of their diabetes status.
vs.
H1: There is an association between the classification of
the numbers of male patients according to their heart
rate category and the classification of their diabetes
status.
2. The assumption:
Contingency table
3. The α-level:
α = 0.05
21- 40
Table 1: Percent with and without diabetes within heart rate groups
           Heart Rate (Beats/Min)
Diabetes     < 60   60 - 69   70 - 79   80 - 89   ≥ 90
Yes           4.3       3.4       7.1       7.8    9.3
No           95.7      96.6      92.9      92.2   90.7
Total         100       100       100       100    100

Table 2: Number with and without diabetes within heart rate groups
           Heart Rate (Beats/Min)
Diabetes     < 60   60 - 69   70 - 79   80 - 89   ≥ 90   Total
Yes            28        29        34        14      9     114
No            614       807       443       167     88   2,119
Total         642       836       477       181     97   2,233
21- 41
Cell          O           E   (O - E)   (O - E)2   (O - E)2/E
1            28       32.78     -4.78     22.848         0.70
2            29       42.68    -13.68     187.14         4.38
3            34       24.35      9.65     93.123         3.82
4            14        9.24      4.76     22.658         2.45
5             9        4.95      4.05     16.403         3.31
6           614      609.22      4.78     22.848         0.04
7           807      793.32     13.68     187.14         0.24
8           443      452.65     -9.65     93.122         0.20
9           167      171.76     -4.76     22.658         0.13
10           88       92.05     -4.05     16.403         0.18
Total     2,233       2,233         0              χ2 = 15.45
21- 42
4. The test statistic:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$
5. The critical region: for a 2 × 5 table; df = (2-1)(5-1) = 4
Reject H0 if χ2 ≥ χ20.95(4) = 9.49
6. The result:
χ2 = 15.45
7. The conclusion:
Reject H0, since χ2 = 15.45 > 9.49
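The full procedure for this 2 × 5 table can be sketched end to end. The function name `chi_square_test` and the lookup `CRITICAL_95` are hypothetical; the lookup holds only the critical values quoted in this module, so a table with a different df would need its value added from the χ2 table.

```python
CRITICAL_95 = {1: 3.84, 2: 5.99, 4: 9.49}  # values quoted in this module

def chi_square_test(observed):
    """Return (statistic, df, reject) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    statistic = 0.0
    for row, r in zip(observed, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / overall  # expected count under H0
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, statistic >= CRITICAL_95[df]

# Rows: diabetes yes, no; columns: heart rate <60, 60-69, 70-79, 80-89, >=90.
observed = [[28, 29, 34, 14, 9],
            [614, 807, 443, 167, 88]]
statistic, df, reject = chi_square_test(observed)
print(round(statistic, 2), df, reject)
```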
21- 43