χ² Tests


How to know when to use them.
Which one should I use?
Chi Square Tests
• How do I know when to use a chi-square test?
• Only use a chi-square test if all of the data in question are
  categorical (remember your assumptions and conditions):
  • Randomness
  • Independence
  • 10% Rule
  • Counted Data – The values in each cell must be counts for the
    categories of a categorical (qualitative) variable.
  • Expected Cell Frequency – Every cell should have an expected
    count of at least 5.
Chi Square Goodness of Fit Test
• This test allows us to compare a collection of categorical
  data with some theoretical expected distribution.
• Sometimes called a One-Sample Test
• Degrees of Freedom = number of categories – 1
• Example: Birth month of baseball players
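For reference, the statistic this test relies on is the standard chi-square sum over the categories. The symbols Obs_i, Exp_i, and k (number of categories) are notation chosen here for illustration, not taken from the slides.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(\mathit{Obs}_i - \mathit{Exp}_i)^2}{\mathit{Exp}_i},
\qquad df = k - 1
```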
Chi Square Goodness of Fit Test
After getting trounced by your little brother in a children’s
game, you suspect the die he gave you to roll may be unfair.
To check, you roll it 60 times, recording the number of times
each face appears. Do these results cast doubt on the die’s
fairness?
• If the die is fair, how many times would you expect each face to show?
• To see if these results are unusual, what type of test will you perform?
• State your hypotheses.

Face     1    2    3    4    5    6
Count   11    7    9   15   12    6
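A minimal sketch of the first question, assuming only the counts in the table: under a fair die each face gets an equal share of the rolls. The variable names are illustrative.

```python
# Observed counts from the slide's table (face -> count).
observed = {1: 11, 2: 7, 3: 9, 4: 15, 5: 12, 6: 6}

total_rolls = sum(observed.values())             # number of rolls recorded
expected_per_face = total_rolls / len(observed)  # fair die: equal share per face

print(total_rolls, expected_per_face)
```

With 60 rolls and 6 faces, a fair die would be expected to show each face 10 times.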
Chi Square Goodness of Fit Test
• How many degrees of freedom are there?
• Find χ² and the P-value.
• Check the conditions.
• State your conclusion.
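The steps above can be sketched with the standard library alone. The helper `chi2_upper_tail_odd_df` is a hypothetical name introduced here; it uses the textbook closed form for the chi-square upper tail when the degrees of freedom are odd, which is an implementation choice and not something the slides specify.

```python
import math

observed = [11, 7, 9, 15, 12, 6]          # counts for faces 1..6 (from the slide)
expected = sum(observed) / len(observed)  # 10 per face if the die is fair

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1                    # categories - 1

def chi2_upper_tail_odd_df(x, df):
    """Upper-tail chi-square probability for odd df (hypothetical helper)."""
    total = math.erfc(math.sqrt(x / 2))                   # df = 1 part
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)  # first series term
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

p_value = chi2_upper_tail_odd_df(chi_sq, df)
print(f"chi-square = {chi_sq:.2f}, df = {df}, P-value = {p_value:.3f}")
```

Every expected count is 10, so the expected-count condition holds. The statistic comes out to 5.6 on 5 degrees of freedom with a P-value of roughly 0.35, so results like these would not be unusual for a fair die.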
Chi Square Test of Homogeneity
• A test comparing the distribution of counts for two or more
  groups on the same categorical variable.
• Finds the expected counts based on the overall frequencies,
  adjusted for the totals in each group, under the null-hypothesis
  assumption that the distributions are the same for each group.
• Degrees of Freedom = (rows – 1)(columns – 1), where rows is the
  number of categories and columns is the number of independent groups.
• Example: Future plans of graduates based on college of study
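Written out, the expected count for the cell in row i and column j under the null hypothesis is the usual product of margins over the grand total. The subscripted notation is chosen here for illustration.

```latex
\mathit{Exp}_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}},
\qquad df = (\text{rows} - 1)(\text{columns} - 1)
```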
Chi Square Test of Homogeneity
Does your doctor know? A survey of articles from the New England
Journal of Medicine (NEJM) classified them according to the principal
statistics methods used. The articles recorded were all non-editorial
articles appearing during the indicated years. Has there been a
change in the use of Statistics?
• What kind of test would be appropriate?
• State the hypotheses.
• How many degrees of freedom are there?
• The smallest expected count will be in the 1989/No cell. What is it?

                    Publication Year
            1978–79    1989    2004–05    Total
No Stats         90      14         40      144
Stats           242     101        271      614
Total           332     115        311      758
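As a rough sketch of the degrees-of-freedom and expected-count questions, the table's margins can be rebuilt from the six cell counts. The dictionary layout and variable names are assumptions made for this example.

```python
# Counts from the slide's table; rows are categories, columns are year groups.
years = ["1978-79", "1989", "2004-05"]
counts = {
    "No Stats": [90, 14, 40],
    "Stats": [242, 101, 271],
}

row_totals = {label: sum(row) for label, row in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(row_totals.values())

df = (len(counts) - 1) * (len(years) - 1)   # (rows - 1)(columns - 1)

# Expected count under the null: row total * column total / grand total.
j = years.index("1989")
expected_1989_no = row_totals["No Stats"] * col_totals[j] / grand_total

print(df, round(expected_1989_no, 2))
```

This gives 2 degrees of freedom and an expected count of about 21.85 for the 1989/No Stats cell.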
Chi Square Test of Homogeneity
• Check the assumptions and conditions for inference.
• Calculate the component of chi-square for the 1989/No cell.
• For this test, χ² = 25.28. What’s the P-value?
• State your conclusion.
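A sketch of the component and P-value questions, assuming the totals from the table (144, 115, 758). With 2 degrees of freedom the chi-square upper tail has the simple closed form exp(-x/2), which is used here in place of a table or statistics library.

```python
import math

observed = 14                    # 1989 / No Stats cell (from the slide)
expected = 144 * 115 / 758       # row total * column total / grand total

# Contribution of this one cell to the chi-square statistic.
component = (observed - expected) ** 2 / expected

# df = (2 - 1)(3 - 1) = 2, so the upper-tail probability is exp(-x / 2).
chi_sq = 25.28                   # value given on the slide
p_value = math.exp(-chi_sq / 2)

print(f"component = {component:.2f}, P-value = {p_value:.1e}")
```

The component is about 2.82, and the P-value is on the order of 3 in a million, far below any common significance level.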
Chi Square Test of Homogeneity
• Show how the residual for the 1989/No cell was calculated.
• What can you conclude from the patterns in the standardized residuals?

                    Publication Year
            1978–79    1989    2004–05
No Stats       3.39   -1.68      -2.48
Stats         -1.64    0.81       1.20
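The residuals in this table can be reproduced from the original counts as (observed − expected) / √expected. This is a minimal sketch; the list layout is an assumption made for the example.

```python
import math

counts = [[90, 14, 40],      # No Stats: 1978-79, 1989, 2004-05
          [242, 101, 271]]   # Stats
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

residuals = []
for i, row in enumerate(counts):
    res_row = []
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        # Standardized residual for this cell, rounded as on the slide.
        res_row.append(round((obs - expected) / math.sqrt(expected), 2))
    residuals.append(res_row)

print(residuals)
```

For the 1989/No Stats cell this is (14 − 21.85) / √21.85, or about −1.68.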
Chi Square Test of Independence
• A test of whether two categorical variables are independent.
  Usually displayed in a contingency table.
• Contingency table – A two-way table that classifies
  individuals according to two categorical variables.
• Examines the distribution of counts for one group of
  individuals classified according to both variables.
• Degrees of Freedom = (rows – 1)(columns – 1), where rows is the
  number of categories in one variable and columns is the number of
  categories in the other.
Chi Square Test of Independence
There is some concern that if a woman has an epidural to reduce pain
during childbirth, the drug can get into the baby’s bloodstream,
making the baby sleepier and less willing to breastfeed. Researchers
followed up on 1178 births, noting whether the mother had an
epidural and whether the baby was still nursing after 6 months.
• What kind of test would be appropriate?
• State the null and alternative hypotheses.
• How many degrees of freedom are there?

                                     Epidural?
                                   Yes     No    Total
Breastfeeding at 6 months?  Yes    206    498      704
                            No     190    284      474
                            Total  396    782     1178
Chi Square Test of Independence
• The smallest expected count will be in the epidural/no breastfeeding cell. What is it?
• Check the assumptions and conditions for inference.
• Calculate the component of chi-square for the epidural/no breastfeeding cell.
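One way to work through these questions is to build the full table of expected counts and read off the smallest one. The layout below (rows for breastfeeding, columns for epidural) is an assumption about how the slide's table is oriented, chosen because it puts the smallest expected count in the epidural/no breastfeeding cell as the slide states.

```python
counts = [[206, 498],   # breastfeeding at 6 months: yes (epidural yes, epidural no)
          [190, 284]]   # breastfeeding at 6 months: no
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]
smallest = min(min(row) for row in expected)
all_at_least_5 = smallest >= 5          # expected-count condition

# Chi-square component for the epidural / no breastfeeding cell.
obs, exp_count = counts[1][0], expected[1][0]
component = (obs - exp_count) ** 2 / exp_count

print(round(smallest, 2), all_at_least_5, round(component, 2))
```

The smallest expected count is about 159.34, comfortably above 5, and that cell contributes about 5.90 to the chi-square statistic.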
Chi Square Test of Independence
• For this test, χ² = 14.87. What’s the P-value?
• State your conclusion.
• Show how the residual for the epidural/no breastfeeding cell was calculated.
• What can you conclude from the standardized residuals?

                                     Epidural?
                                   Yes      No
Breastfeeding at 6 months?  Yes  -1.99    1.42
                            No    2.43   -1.73
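A short sketch of the P-value and residual questions, assuming the totals from the original counts (474 not breastfeeding, 396 with an epidural, 1178 births). With 1 degree of freedom the chi-square upper tail equals erfc(√(x/2)), which the standard library provides.

```python
import math

chi_sq = 14.87                               # value given on the slide
p_value = math.erfc(math.sqrt(chi_sq / 2))   # upper tail of chi-square, df = 1

# Residual for the epidural / no breastfeeding cell.
observed = 190
expected = 474 * 396 / 1178                  # row total * column total / grand total
residual = (observed - expected) / math.sqrt(expected)

print(f"P-value = {p_value:.1e}, residual = {residual:.2f}")
```

The P-value is roughly 0.0001, and the residual of about 2.43 says more epidural babies than expected were no longer nursing at 6 months.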
Chi Square Test of Independence
Suppose a broader study included several additional issues, including
whether a mother drank alcohol, whether this was a first child, and
whether the parents occasionally supplemented breastfeeding with
bottled formula. Why would it not be appropriate to use chi-square
methods on the 2 × 8 table with yes/no columns for each potential
factor?