6 contingency tables


Contingency Table Analysis
• contingency tables show frequencies produced by cross-classifying
observations
• e.g., pottery described simultaneously according to vessel form & surface
decoration:

            bowl   jar   olla
polished      47    30      6
burnished     28    42     45
matte          3     8     25
• most statistical tests for tables are designed for analyzing two dimensions
– they only examine the interaction of two variables at one time…
• most efficient when used with nominal data
– using ratio data means recoding it to a lower scale of measurement (ordinal)
– this means ignoring some of the information originally available…
• still, you might do this, particularly if you
are interested in association between metric
and non-metric variables
• e.g.: variation in pot size vs. surface
decoration…
• may decide to divide pot size into ordinal
classes…
[figure: five pots of increasing size, labeled small, small, medium, large,
large]

                    rim diameter:
                    small   large
slip: specular        4      15
      non-specular   13      18
• other options may let you retain more of the original information content
• could use a “t-test” to test the equality of the means
• makes full use of ratio data… (see the R sketch below)

[figure: rim-diameter distributions for sherds with specular vs.
non-specular slip]
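For instance, a minimal sketch in R, with hypothetical rim-diameter
measurements invented for illustration (only the approach, not the values,
comes from the slide):

  # compare mean rim diameter across the two slip treatments
  specular     <- c(12.1, 14.3, 9.8, 16.2, 11.0, 13.5)   # hypothetical cm values
  non.specular <- c(18.4, 15.9, 21.2, 17.1, 19.8, 16.6)  # hypothetical cm values
  t.test(specular, non.specular)   # Welch two-sample t-test on the raw ratio data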
          polished  burnished  matte
bowl          47        28       3
jar           30        42       8
olla           6        45      25
• why do we work with contingency tables??
• because we think there may be some kind
of interaction between the variables…
• basic question: can the state of one
variable be predicted from the state of
another variable?
• if not, they are independent
expected counts
• a baseline to which observed counts can be
compared
• counts that would occur by random chance
if the variables are independent, over the
long run
• for any cell
E = (col total * row total)/table total
observed:
           M     F
PP         4     1     5   (45%)
Pot        1     5     6   (55%)
Total      5     6    11
          45%   55%

expected:
           M     F
PP        2.3   2.7    5
Pot       2.7   3.3    6
Total      5     6    11
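In R, the expected table can be computed directly (a sketch using the counts
above; chisq.test will warn that the counts are small, which is exactly the
situation Fisher’s test, below, is designed for):

  x <- matrix(c(4, 1, 1, 5), nrow = 2, byrow = TRUE,
              dimnames = list(c("PP", "Pot"), c("M", "F")))
  (sum(x[1, ]) * sum(x[, 1])) / sum(x)   # cell (PP, M): 5*5/11 = 2.3
  chisq.test(x)$expected                 # the whole expected table at once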
significance
• = probability of getting, by chance, a table
as or more deviant than the observed table,
if the variables are independent
– ‘deviant’ defined in terms of expected table
• no causality is necessarily implied by the
outcome
– but, causality may well be the reason for
observed association…
– e.g.: grave goods and sex
Fisher’s Exact Test
• just for 2 x 2 tables
• useful where the chi-square test is inappropriate
• gives the exact probability of all tables with the same marginal totals
that are as or more deviant than the observed table…
a   b         4   1
c   d         1   5

P = (a+b)!(c+d)!(a+c)!(b+d)! / (N! a! b! c! d!)
P = (5! 5! 6! 6!) / (11! 4! 1! 1! 5!)
  = (5 · 6! · 6!) / 11!
  = (5 · 6!) / (11·10·9·8·7)
  = 3600 / 55440
  = .065
use R (or Excel) if the counts aren’t too large…
> fisher.test(x)
all six tables sharing the observed marginal totals (row totals 5 and 6,
column totals 5 and 6, N = 11), each with its exact probability:

0  5         3  2         1  4
5  1         2  4         4  2
P = .013     P = .325     P = .162

4  1         2  3         5  0
1  5         3  3         0  6
P = .065     P = .433     P = .002
(observed)

expected:
2.3  2.7
2.7  3.3
• 1-tailed test: P = 0.065 + 0.002 = 0.067
• 2-tailed test: P = 0.067 + 0.013 = 0.080
in R:
           M     F
PP         4     1     5
Pot        1     5     6
           5     6    11

> fisher.test(x, alt = "two.sided")
> fisher.test(x, alt = "greater")     [i.e.: H1: odds ratio > 1]
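Putting the whole thing together (a sketch; `alternative` is the full name
of the `alt` argument, which R matches partially):

  x <- matrix(c(4, 1, 1, 5), nrow = 2, byrow = TRUE,
              dimnames = list(c("PP", "Pot"), c("M", "F")))
  fisher.test(x, alternative = "two.sided")   # P = 0.080
  fisher.test(x, alternative = "greater")     # H1: odds ratio > 1; P = 0.067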
Chi-square Statistic

X² = Σ (Oᵢ − Eᵢ)² / Eᵢ ,  summed over all k cells of the table

• an aggregate measure (i.e., based on the entire table)
• the greater the deviation from expected values, the larger (exponentially!)
the chi-square statistic…
• one could devise others that would place less emphasis on large
deviations, e.g., Σ |O − E| / E
• X2 is distributed approximately in accord
with the X2 probability distribution
• X2 probabilities are traditionally found in a
table showing threshold values from a CPD
– need degrees of freedom
– df = (r-1)*(c-1)
• just use R…
Status (low / intermed. / high) vs. ritual architecture (altar / no altar):

observed:
            low   intermed.  high
altar         7      20       16     43
no altar     18      22        8     48
             25      42       24     91

expected:  e.g., (43*25)/91 = 11.8
            low   intermed.  high
altar      11.8     19.8     11.3    43
no altar   13.2     22.2     12.7    48
             25       42       24    91

chi-square contributions:  e.g., (7 − 11.8)²/11.8 = 2.0
            low   intermed.  high
altar       2.0      0.0      1.9    3.9
no altar    1.8      0.0      1.7    3.5
            3.7      0.0      3.6    7.3

X² = 7.3, df = 2, P ≈ .025
X2 assumptions & problems
• must be based on counts:
– not percentages, ratios or weighted data
• fails to give reliable results if expected
counts are too low:
obs.               exp.
2   3  | 5         2.27  2.72  | 5
3   3  | 6         2.72  3.27  | 6
5   6              5     6

P(X²) = 0.74;  P(Fisher’s) = 1.0
rules of thumb
1. no expected counts less than 5
– almost certainly too stringent
2. no exp. counts less than 2, and 80% of counts > 5
– more relaxed (but more realistic)
collapsing tables
• can often combine columns/rows to increase
expected counts that are too low
– may increase or reduce interpretability
– may create or destroy structure in the table
• no clear guidelines
– avoid simply trying to identify the combination
of cells that produces a “significant” result
obs. counts                      exp. counts
 8   6   6   3   23              5.3  5.0  5.3   7.3    23
 3   1   4  12   20              4.6  4.4  4.6   6.3    20
 6   6   5   8   25              5.8  5.5  5.8   7.9    25
 2   5   4   3   14              3.2  3.1  3.2   4.4    14
19  18  19  26   82              19   18   19    26     82

after combining rows 1+2 and rows 3+4 (several expected counts were below 5
before collapsing; none are after):

obs. counts                      exp. counts
11   7  10  15   43              10.0  9.4  10.0  13.6   43
 8  11   9  11   39               9.0  8.6   9.0  12.4   39
19  18  19  26   82              19    18    19    26    82
• chi-square is basically a measure of
significance
• it is not a good measure of strength of
association
• can help you decide if a relationship exists,
but not how strong it is
17  13              34  26
13  17    60        26  34    120

X² = 1.07, P = .30        X² = 2.13, P = .14
• also, chi-square is a ‘global statistic’
• says nothing (directly) about which parts of
a table may be ‘driving’ a large chi-square
statistic
• ‘chi-square contributions’ from individual
cells can help:
            low   intermed.  high
altar       2.0      0.0      1.9    3.9
no altar    1.8      0.0      1.7    3.5
            3.7      0.0      3.6    7.3
Monte Carlo test of X2 significance
• based on simulated generation of cell-counts
under imposed conditions of independence
• randomly assign counts to cells:
23  15   38
14   6   20
 8  13   21
45  34   79
• significance is simply the proportion of
outcomes that produced a X2 statistic >=
observed
• not based on any assumptions about the
distribution of the X2 statistic
• overcomes the problems associated with
small expected frequencies
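In R this is built into chisq.test (a sketch using the table above; B is the
number of simulated tables):

  x <- matrix(c(23, 15, 14, 6, 8, 13), nrow = 3, byrow = TRUE)
  chisq.test(x, simulate.p.value = TRUE, B = 10000)   # Monte Carlo P-value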
G Test

G² = 2 Σ Oᵢ · ln(Oᵢ / Eᵢ) ,  summed over all k cells of the table
• a measure of significance for any r x c table
• look up critical values of G2 in an ordinary
chi-square table; figure out degrees of
freedom the same way
• conforms to chi-square distribution better
than the chi-square statistic
an R function for G²
gsq.test <- function(obs) {
  df <- (nrow(obs) - 1) * (ncol(obs) - 1)
  exp <- chisq.test(obs)$expected
  G <- 2 * sum(obs * log(obs / exp))
  pchisq(G, df, lower.tail = FALSE)   # upper-tail P-value
}
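Usage, with the status/ritual-architecture counts from the worked example
above (the result should land close to the X² outcome of 7.3, P ≈ .025):

  obs <- matrix(c(7, 20, 16, 18, 22, 8), nrow = 2, byrow = TRUE)
  gsq.test(obs)   # upper-tail P-value for G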
Measures of Association

Phi-Square (Φ²)
• an attempt to remove the effect of sample size that makes chi-square
inappropriate for measuring association
• divide chi-square by n:  Φ² = X² / n
• limits:
0: variables are independent
1: perfect association in a 2x2 table; no upper limit in larger tables
17  13              34  26
13  17    60        26  34    120
Φ² = 0.018          Φ² = 0.018

(the doubled table gives a larger X² but the same Φ²)
Cramer’s V
• also a measure of strength of association
• an attempt to standardize phi-square
(i.e., control the lack of an upper boundary in tables larger than 2x2 cells)
• V = √(Φ² / m)
where m = min(r−1, c−1); i.e., the smaller of rows−1 or columns−1
• limits: 0–1 for any size table; 1 = highest possible association
Yule’s Q
• for 2x2 tables only
• Q = (ad − bc) / (ad + bc)

a   b
c   d
Yule’s Q
• often used to assess the strength of presence/absence association

                 Male burial
                  +     −
Bone needles  +  12    14
              −  16     3

Q = (12·3 − 14·16) / (12·3 + 14·16) = −.72

• range is –1 (perfect negative association) to 1 (perfect positive
association); values near 0 indicate a lack of association
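As a quick R sketch (yules.q is our own helper name, not a base-R function;
counts from the table above):

  yules.q <- function(t) {
    ad <- t[1, 1] * t[2, 2]
    bc <- t[1, 2] * t[2, 1]
    (ad - bc) / (ad + bc)
  }
  yules.q(matrix(c(12, 14, 16, 3), nrow = 2, byrow = TRUE))   # -0.72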
Yule’s Q
• not sensitive to marginal changes (unlike Φ²)
• multiply a row or column by a constant; it cancels out…

           jars  ollas               jars  ollas
Source A    19    10      Source A    19    20
Source B     6    15      Source B     6    30

(Q = .65 for both tables)
Yule’s Q
• can’t distinguish between different degrees of ‘complete’ association
• can’t distinguish between ‘complete’ and ‘absolute’ association

     RHS  LHS         RHS  LHS         RHS  LHS
M     60    0     M    60    0     M    60    0
F     10   30     F    20   20     F     0   40
           100              100              100

(Q = 1 in all three tables; only the third shows ‘absolute’ association)
“odds” ratio
• easiest with 2 x 2 tables
• what are the ‘odds’ of a man being buried
on his right side, compared to those of a
woman??
• if there is a strong level of association
between sex and burial position, the odds
should be quite different…
a   b
c   d         odds ratio = (a/b) / (c/d)

        RHS   LHS
M        29    11    40        M: 29/11 = 2.64
F        14    33    47        F: 14/33 = 0.42
         43    44    87        odds ratio: 2.64/0.42 = 6.21
if there is no association, the odds ratio=1
departures from 1 range between 0 and infinity
>1 =‘positive association’
<1 =‘negative association’
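A sketch in R (odds.ratio is our own helper; note that fisher.test reports a
conditional maximum-likelihood estimate, so its odds ratio will differ
slightly from this simple cross-product version):

  odds.ratio <- function(t) (t[1, 1] / t[1, 2]) / (t[2, 1] / t[2, 2])
  burials <- matrix(c(29, 11, 14, 33), nrow = 2, byrow = TRUE)
  odds.ratio(burials)   # (29/11) / (14/33) = 6.21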
Goodman and Kruskal’s Tau (τ)
• “proportional reduction of error”
• how are the probabilities of correctly
assigning cases to one set of categories
improved by the knowledge of another set
of categories??
Goodman and Kruskal’s Tau (τ)
• limits are 0-1; 1 = perfect association
• same results as Φ² w/ 2x2 table
• sensitive to margin differences
• asymmetric
– get different results predicting row assignments based on columns than
from column assignments based on rows
• τ = [P(error|rule 1) − P(error|rule 2)] / P(error|rule 1)
• rule 1: random assignments to one variable are
made with no knowledge of 2nd variable
• rule 2: random assignments to one variable are
made with knowledge of 2nd variable
      B1   B2                       B1   B2
A1     ·    ·     6           A1     6    0     6
A2     ·    ·    14           A2     0   14    14
                 20                  6   14    20

(in the right-hand table, knowing the B category lets you assign the A
category without error)
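A sketch of τ in R, predicting rows from columns per the PRE definition
above (gk.tau is our own name, not a base-R function):

  gk.tau <- function(tab) {
    n  <- sum(tab)
    e1 <- 1 - sum((colSums(tab) / n)^2)                  # P(error | rule 1)
    p  <- tab / rowSums(tab)                             # within-row proportions
    e2 <- sum((rowSums(tab) / n) * (1 - rowSums(p^2)))   # P(error | rule 2)
    (e1 - e2) / e1
  }
  gk.tau(matrix(c(6, 0, 0, 14), nrow = 2, byrow = TRUE))   # 1: perfect association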
Table Standardization
• even very large and highly significant X2 (or G2)
statistics don’t necessarily mean that all parts of the
table are equally “deviant” (and therefore interesting)
• usually need to do other things to highlight loci of
association or ‘interaction’
• which cells diverge the most from expected values?
• very difficult to decide when both row and column
totals are unequal…
Percent standardization
• highly intuitive approach, easy to interpret
• often used to control the effects of sample-size variation
• have to decide if it makes better sense to
standardize based on rows, or on columns
• usually, you want to standardize whatever it
is you want to compare
– i.e., if you want to compare columns, base
percents on column totals
• you may decide to make two tables, one
standardized on rows, the other on
columns…
MNIs (observed counts):

Fauna     Site A     B     C
bear           2     1     0      3
moose         15     5    10     30
coyote         2     0     0      2
rabbit        16     8    12     36
dog            2     3     0      5
deer          16     8     7     31
              53    25    29    107

standardized on columns (site percents):

Fauna     Site A      B      C
bear         3.8    4.0    0.0
moose       28.3   20.0   34.5
coyote       3.8    0.0    0.0
rabbit      30.2   32.0   41.4
dog          3.8   12.0    0.0
deer        30.2   32.0   24.1
             100    100    100

standardized on rows (taxon percents):

Fauna          A      B      C
bear        66.7   33.3    0.0   100
moose       50.0   16.7   33.3   100
coyote     100.0    0.0    0.0   100
rabbit      44.4   22.2   33.3   100
dog         40.0   60.0    0.0   100
deer        51.6   25.8   22.6   100
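Both standardizations are one-liners in R with prop.table (a sketch using
the MNI counts above):

  tab <- matrix(c(2, 1, 0,  15, 5, 10,  2, 0, 0,
                  16, 8, 12,  2, 3, 0,  16, 8, 7),
                ncol = 3, byrow = TRUE,
                dimnames = list(c("bear", "moose", "coyote",
                                  "rabbit", "dog", "deer"),
                                c("A", "B", "C")))
  round(100 * prop.table(tab, margin = 2), 1)   # standardize on columns (sites)
  round(100 * prop.table(tab, margin = 1), 1)   # standardize on rows (taxa)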
Binomial Probabilities
• P(n,k,p): “probability of k successes in n trials, with p probability of
success in any one trial”

observed           expected
 5   1              3.7  2.3
 3   4              4.3  2.7
        13                  13

n = 13,  k = 5,  p = 3.7/13
Binomial Probabilities
• in R:
> dbinom(k, n, p)    [exactly k successes]
> pbinom(k, n, p)    [k or fewer successes]
• easy to build into a function…
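A quick check with the numbers above (the upper-tail form is one common way
to ask for “k or more” successes):

  n <- 13; k <- 5; p <- 3.7 / 13
  dbinom(k, n, p)                           # P(exactly 5 successes)
  pbinom(k - 1, n, p, lower.tail = FALSE)   # P(5 or more successes)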
K-S test for cumulative percents
[figures: a percent bar chart and the corresponding cumulative percent
graph for the same data; y-axes 0–100]
Cumulative Percent Graph
• good for comparing data sets
• some useful statistical measures
• can be misleading when used with nominal data
(best suited to ordinal or ratio scale data)
Percentages

            Sites
Types      A      B      C
1          5      5      5
2         45      0     30
3          5     48      5
4          5      5      5
5          5      5      5
6          5      5      5
7         20      5     35
8          5     22      5
9          5      5      5
         100    100    100

Cumulative Percents

            Sites
Types      A      B      C
1          5      5      5
2         50      5     35
3         55     53     40
4         60     58     45
5         65     63     50
6         70     68     55
7         90     73     90
8         95     95     95
9        100    100    100

[figures: cumulative percent curves for sites A, B, and C; with the type
categories reordered (1, 5, 3, 4, 2, 6, 7, 8, 9) the curves change shape,
illustrating the nominal-data problem]
K-S test
• find Dmax:
– maximum difference between 2 cumulative proportion distributions
– compare to critical value for chosen sig. level:
C * sqrt((n1+n2) / (n1*n2))
– alpha = .05, C = 1.36
– alpha = .01, C = 1.63
– alpha = .001, C = 1.95
example 2
• mortuary data (Shennan, p. 56+)
• burials characterized according to two wealth categories (poor vs.
wealthy) and six age categories (infant to old age)
            Rich   Poor
Infans I      6     23
Infans II     8     21
Juvenilis    11     25
Adultus      29     36
Maturus      19     27
Senilis       3      4
Total        76    136
• burials for younger age-classes appear to be
more numerous among the poor
• can this be explained away as an example of
random chance?
or
• do poor burials constitute a different
population, with respect to age-classes, than
rich burials?
• we can get a visual sense of the problem
using a cumulative frequency plot:
[figure: cumulative proportion curves (0–1) for rich and poor burials
across the age classes Infans I – Senilis]
• the K-S test (Kolmogorov-Smirnov test) assesses the significance of the
maximum divergence between two cumulative frequency curves
H0: dist1 = dist2
• an equation based on the theoretical distribution of differences between
cumulative frequency curves provides a critical value for a specific alpha
level
• observed differences beyond this value can be regarded as significant at
that alpha level
• if alpha = .05, the critical value =
1.36 * sqrt((n1+n2) / (n1*n2))
1.36 * sqrt((76+136) / (76*136)) = 0.195
• the observed value = 0.178
• 0.178 < 0.195; don’t reject H0
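The whole computation is a few lines in R (counts from the table above):

  rich <- c(6, 8, 11, 29, 19, 3)
  poor <- c(23, 21, 25, 36, 27, 4)
  max(abs(cumsum(rich) / sum(rich) - cumsum(poor) / sum(poor)))    # Dmax = 0.178
  1.36 * sqrt((sum(rich) + sum(poor)) / (sum(rich) * sum(poor)))   # 0.195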
[figure: the same cumulative curves for rich and poor burials, with the
maximum divergence marked: Dmax = .178]
example 2
statement/question: “Oil exploration should be allowed in coastal
California…”

                     age <= 30   age > 30
strongly disagree        8           7
mildly disagree          5           9
disagree                 6           6
no opinion               0           1
agree                    2           2
mildly agree             1           3
strongly agree           2           3
example 3
• survey data – 100 sites
• broken down by location and time:

           early   late   Total
piedmont     31     19      50
plain        19     31      50
Total        50     50     100
• we can do a chi-square test of independence of the two variables time and
location
• H0: time & location are independent
• alpha = .05

[diagram: under H0, time and location are unconnected; under H1, time and
location are linked]
• X² values reflect accumulated differences between observed and expected
cell-counts
• expected cell counts are based on the assumptions inherent in the null
hypothesis
• if H0 is correct, cell values should reflect an “even” distribution of
the marginal totals:

           early   late   Total
piedmont     25     25      50
plain        25     25      50
Total        50     50     100
• chi-square = ((o-e)2/e)
• observed chi-square = 4.84
• we need to compare it to the “critical value”
in a chi-square table:
• chi-square = ((o-e)2/e)
• observed chi-square = 4.84
• chi-square table:
 critical value (alpha = .05, 1 df) is 3.84
 observed chi-square (4.84) > 3.84
• we can reject H0
• H1: time & location are not independent
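A sketch of the same test in R; note that the slide’s 4.84 matches R’s
default Yates-corrected statistic for 2x2 tables:

  x <- matrix(c(31, 19, 19, 31), nrow = 2, byrow = TRUE,
              dimnames = list(c("piedmont", "plain"), c("early", "late")))
  chisq.test(x)                    # X-squared = 4.84, P = 0.028: reject H0
  chisq.test(x, correct = FALSE)   # uncorrected: X-squared = 5.76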
• what does this mean?
           early   late   Total
piedmont     31     19      50
plain        19     31      50
Total        50     50     100