PPT Lecture Notes
Chi-Square
Analyses
Please refrain from typing, surfing or printing during our conversation!
Outline of Today’s Discussion
1. The Chi-Square Test of Independence – Introduction
2. The Chi-Square Test of Independence – Excel
3. The Chi-Square Test of Independence – SPSS
4. The Chi-Square Test for Goodness of Fit – Introduction
5. The Chi-Square Test for Goodness of Fit – Excel
Part 1
Chi-Square Test of Independence
(Introduction)
Chi-Square: Independence
1. The chi-square is a non-parametric test. It’s NOT
based on a mean, and it does not require that the
data are bell-shaped (i.e., Gaussian distributed).
2. We can use the Chi-square test for analyzing data
from certain between-subjects designs.
Will someone remind us about between-subject
versus within-subject designs?
3. Chi-square tests are appropriate for the analysis of
categorical data (i.e., on a nominal scale).
Chi-Square: Independence
1. Sometimes a behavior can be described only in an
all-or-none manner.
2. Example: Maybe a particular behavior was either
observed or not observed.
3. Example: Maybe a participant either completed an
assigned task, or did not.
4. Example: Maybe a participant either solved a
designated problem, or did not.
Chi-Square: Independence
1. We can use the Chi-square to test data sets
that simply reflect how frequently a particular
category of behavior is observed.
2. The Chi-square test of independence is also
called the two-way chi-square.
3. The Chi-square test of independence requires
that two variables are assessed for each
participant.
Chi-Square: Independence
1. The Chi-square test is based on a comparison
between values that are observed (O), and values that
would be expected (E) if the null hypothesis were
true.
2. The null hypothesis would state that there is no
relationship between the two variables, i.e., that the
two variables are independent of each other.
3. The chi-square test allows us to determine if we
should reject or retain the null hypothesis…
Chi-Square: Independence
1. To calculate the chi-square statistic, we need to
develop a so-called “contingency table”.
2. In the contingency table, the levels of one variable
are displayed across rows, and the levels of the
other variable are displayed across columns.
3. Let’s see a simple 2 x 2 design…
Chi-Square: Independence
Contingency Table: 2 rows by 2 columns
Rows (City): Minneapolis, Atlanta
Columns (Political Party): Democrat, Republican
The “marginal frequencies” are the
row totals and column totals for each
level of a particular variable.
Chi-Square: Independence
1. A “cell” in the table is defined as a
unique combination of the levels of the two
variables (e.g., one city and one political party).
2. For each cell in the contingency table, we
need to calculate the expected frequency.
3. To get the expected frequency for a cell,
we use the following formula…
Chi-Square: Independence
The expected (E) frequency of a cell.
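In standard notation, the expected frequency of the cell in row i and column j is that row's total times that column's total, divided by the grand total. The symbols R_i (row total), C_j (column total), and N (total number of participants) are labels assumed here for illustration, not taken from the slide:

```latex
E_{ij} = \frac{R_i \times C_j}{N}
```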
Example
Chi-Square: Independence
Columns (Political Party): Democrat, Republican
Rows (City): Minneapolis, Atlanta
Does everyone now
understand where this 28
came from?
Chi-Square: Independence
Here’s the Chi-square statistic.
Let’s define the components…
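Written out in standard notation, with O for an observed frequency and E for an expected frequency as defined earlier, the Pearson chi-square statistic sums one term per cell of the contingency table:

```latex
\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}
```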
Chi-Square: Independence
Components of the Chi-Square Statistic
Chi-Square: Independence
We’ll need one of these
for each cell in our contingency table.
Then, we’ll sum those up!
Chi-Square: Independence
Check: Be sure to have one of these
for each cell in your contingency table.
We’ll reduce them, then sum them…
Chi-Square: Independence
Finally, for each cell,
reduce the parenthetical expression
to a single number, and sum those up.
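The cell-by-cell procedure can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's worked example: the observed counts below are hypothetical, and the variable names are made up for this sketch.

```python
# Hypothetical observed counts; these are NOT the data from the lecture example.
observed = [
    [30, 20],  # row 1 (say, Minneapolis): Democrat, Republican
    [15, 35],  # row 2 (say, Atlanta): Democrat, Republican
]

# Marginal frequencies: row totals, column totals, and the grand total
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# One (O - E)^2 / E term per cell, reduced to a single number each
terms = [
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
]

# Sum the per-cell terms to get the chi-square statistic
chi_square = sum(terms)

print("expected:", expected)
print("terms:", [round(t, 3) for t in terms])
print("chi-square:", round(chi_square, 3))
```

Note that the list of terms has one entry per cell (four for a 2 x 2 table), which is the check the slide asks for before summing.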
Chi-Square: Independence
1. After calculating the Chi-square statistic, we
need to compare it to a “critical value” to
determine whether to reject or retain the null
hypothesis.
2. The critical value depends on the alpha level.
What does the alpha level indicate, again?
3. The critical value also depends on the “degrees
of freedom”, which is directly related to the
number of levels being tested…
Chi-Square: Independence
Formula for the “degrees of freedom”
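For the test of independence, the standard formula multiplies one less than the number of rows by one less than the number of columns. Here R and C are assumed labels for the number of rows and columns in the contingency table:

```latex
df = (R - 1)(C - 1)
```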
In our example,
we have 2 rows and 2 columns, so
df = (2-1) (2-1)
df = 1
Chi-Square: Independence
1. We will soon attempt to develop some intuitions about the
“degrees of freedom” (df), and why they are important.
2. For now, we will simply compute the df so that we can
determine the critical value.
3. For df = 1, and an alpha level of 0.05, what is the critical
value? (see the hand-out showing the critical values table).
4. How does the critical value compare to the value of chi-square
that we obtained (i.e., 6.43)?
5. So, what do we decide about the null hypothesis?
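The decision rule in these steps can be sketched as a simple comparison. The critical value 3.841 is the standard table entry for df = 1 at alpha = 0.05; the function name `decide` is hypothetical, chosen only for this sketch.

```python
# Critical value for df = 1 at alpha = 0.05, from a standard chi-square table
critical_value = 3.841

# Chi-square value obtained in the lecture example
chi_square_obtained = 6.43


def decide(chi_square, critical):
    """Reject the null when the obtained value exceeds the critical value."""
    return "reject" if chi_square > critical else "retain"


decision = decide(chi_square_obtained, critical_value)
print(f"{chi_square_obtained} vs. {critical_value}: {decision} the null hypothesis")
```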
Chi-Square: Independence
1. Congratulations! You’ve completed your first try at
hypothesis testing!
2. In a way, the computations are somewhat similar to
the various “r” statistics you’ve previously
calculated.
3. However, we had not previously compared our “r”
statistics to a critical value. So, we had not previously
drawn any conclusions about statistical significance.
4. Questions so far?
Chi-Square: Independence
1. Before we move on, I’d like you to develop some
intuitions about the computations…
2. Let’s look at a portion of the computation that you
just completed, and really understand it…
Chi-Square: Independence
Under what circumstances would
the expression that’s circled
produce a zero?
Chi-Square: Independence
In general, when the observed and expected
values are very similar to each other,
the chi-square statistic will be small
(and we’ll likely retain the null hypothesis).
Chi-Square: Independence
By contrast, when the observed and expected
values are very different from each other,
the chi-square statistic will be large
(and we’ll likely reject the null hypothesis).
Chi-Square: Independence
1. The decision to reject or retain the null hypothesis
depends, of course, not only on the chi-square value
that we obtain, but also on the critical value.
2. Look at the critical values on the Chi-square table
that was handed out. What patterns do you see, and
why do those patterns occur?
3. Questions or comments?
Part 2
Chi-Square
Test of Independence
In Excel
Part 3
Chi-Square Test of Independence
In SPSS
Chi-Square in SPSS
1. Here’s the sequence of steps for Chi-Square in SPSS.
2. Analyze --> Descriptive Statistics --> Crosstabs (yeah, it’s weird).
3. Select the two variables of interest by moving one into the
ROWS box, and the other into the COLUMNS box.
4. Statistics --> check off the chi-square
5. Cell display --> check off observed, expected, row & column
6. In the output, look for a large value of Pearson Chi-Square.
We need “asymp sig (2 sided)” to be < 0.05, our alpha level.
Chi-Square in SPSS
When the “asymp sig (2 sided)” value is < 0.05,
reject the null hypothesis.
In practice, there are 2 alpha levels:
There’s the criterion alpha level (usually 0.05),
and the observed alpha level, i.e., the p value (shown in SPSS output)
Chi-Square in SPSS
Probability by Chance
1. For a given degree-of-freedom level, there is an
inverse relationship between the observed chi-square
statistic and the observed alpha level.
2. The higher the observed chi-square value, the smaller
the observed alpha level, i.e., “sig” value.
Probability by Chance
Chi-Square in SPSS
There is a low probability of large χ² values.
Large chi-square values are unlikely to occur just by chance,
so… large chi-square values correspond to low alpha levels.
(Note: Alpha levels are called “sig” values in SPSS)
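The inverse relationship can be checked numerically. The sketch below is restricted to df = 1, where the upper-tail chi-square probability has a closed form using the complementary error function; the helper name `p_value_df1` and the list of test values are assumptions made for illustration.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability ("sig" value) of a chi-square value when df = 1."""
    return math.erfc(math.sqrt(chi_square / 2))


# Increasing chi-square values, including the df = 1 critical value (3.841)
# and the value obtained in the lecture example (6.43)
chi_square_values = [1.0, 3.841, 6.43, 10.0]
p_values = [p_value_df1(x) for x in chi_square_values]

for x, p in zip(chi_square_values, p_values):
    print(f"chi-square = {x:6.3f}  ->  sig = {p:.4f}")
```

Each larger chi-square value yields a smaller "sig" value, and the value at 3.841 lands very close to 0.05, which is why 3.841 serves as the critical value at that alpha level.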
Part 4
Chi-Square Test:
Goodness-of-Fit
Chi-Square: Goodness-of-Fit
1. Good news! The test for the goodness-of-fit is much simpler
than that for independence! :-)
2. In the test for goodness-of-fit, each participant is
categorized on ONLY ONE VARIABLE.
3. In the test for independence, participants were categorized
on two different variables (i.e., city and political party).
Chi-Square: Goodness-of-Fit
1. The null hypothesis states that the expected frequencies
will provide a “good fit” to the observed frequencies.
2. The expected frequencies depend on what the null
hypothesis specifies about the population…
3. For example, the null hypothesis might state that all levels
of the variable under investigation are equally likely in the
population.
4. Example: Let’s consider the factors that go into choosing a
course for next semester…
Chi-Square: Goodness-of-Fit
1. Perhaps we’ve identified 4 factors affecting course
selection: Time, Instructor, Interest, Ease.
2. The null hypothesis might indicate that, in the population
of Denison students, these four factors are equally likely to
affect course selection.
3. If we sample 80 Denison students, the expected value for
each category would be (80 / 4 categories = 20).
4. Questions so far?
Chi-Square: Goodness-of-Fit
1. Let’s further assume that, after asking students to decide
which of the 4 factors most affects their course selection,
we obtain the following observed frequencies.
2. Time = 30; Instructor = 10; Interest = 22; Ease = 18.
3. We now have the observed and expected values for all
levels being examined…
Chi-Square: Goodness-of-Fit
The chi-square computation is simpler
than before, since we only have one variable
(i.e., only one row).
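Using the observed frequencies given above (30, 10, 22, 18) and the expected frequency of 20 per category, the computation works out as follows, with O and E denoting observed and expected frequencies:

```latex
\chi^2 = \sum \frac{(O - E)^2}{E}
       = \frac{(30-20)^2}{20} + \frac{(10-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(18-20)^2}{20}
       = 5 + 5 + 0.2 + 0.2
       = 10.4
```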
Chi-Square: Goodness-of-Fit
df = C - 1
Calculating the degrees of freedom
is also simpler than before.
(C = # of columns = one for each level of the variable)
Chi-Square: Goodness-of-Fit
1. Let’s now evaluate the null hypothesis using the chi-square
test for goodness-of-fit. Again the observed frequencies
are…
2. Time = 30; Instructor = 10; Interest = 22; Ease = 18.
3. The expected frequencies are 20 for each category (because
the null hypothesis specifies that the four factors are
equally likely to affect course selection in the population).
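The whole goodness-of-fit evaluation can be sketched in Python using the frequencies from this example. The critical value 7.815 is the standard table entry for df = 3 at alpha = 0.05; the variable names are chosen for this sketch only.

```python
# Observed frequencies from the course-selection example
observed = {"Time": 30, "Instructor": 10, "Interest": 22, "Ease": 18}

# Under the null hypothesis all categories are equally likely,
# so each expected frequency is the sample size divided by the category count
n = sum(observed.values())
expected = n / len(observed)

# Sum of (O - E)^2 / E over the single row of categories
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

# Degrees of freedom for goodness-of-fit: number of categories minus one
df = len(observed) - 1

# Critical value for df = 3 at alpha = 0.05, from a standard chi-square table
critical_value = 7.815
decision = "reject" if chi_square > critical_value else "retain"

print(f"N = {n}, expected per category = {expected}")
print(f"chi-square = {chi_square:.2f}, df = {df}, decision: {decision} the null")
```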
Chi-Square: Goodness-of-Fit
1. Note: There are two assumptions that underlie both chi-square
tests.
2. First, each participant can contribute ONLY ONE
response to the observed frequencies.
3. Second, each expected frequency must be at least 10 in the
2x2 case or in the single-variable case;
each expected frequency must be at least 5 for designs that are 3x2
or larger.
Chi-Square: Goodness-of-Fit
1. Lastly, there are standards by which the chi-square statistics
are to be reported, formally.
p = 0.033
2. Statistics like this are to be reported in the Results section of
an APA style report.
3. APA = American Psychological Association
Chi-Square: Goodness-of-Fit
Memorize This!
All APA-style manuscripts
consist of the following sections,
in this order:
• Abstract
• Introduction
• Method (singular, not Methods)
• Results
• Discussion
• References
Part 5
Chi-Square Test for
Goodness-of-Fit
In Excel
Goodness of Fit in Excel
1. We’ve already seen how we can use a Chi-square table to find
the critical value (“the number to beat”).
2. We can also use the following Excel command:
=chiinv(probability, degrees of freedom)
where probability = criterion alpha level,
i.e., 0.05 in most cases.
3. The output of “=chiinv()” is the critical value,
“the number to beat”.
Goodness of Fit in Excel
1. We can also use Excel to find the observed alpha level, given
an observed χ² value.
2. Here’s the Excel command: =chidist(χ², degrees of freedom)
3. The output of “=chidist()” is the observed alpha level (“sig
value in SPSS”).
4. The observed alpha level must be less than 0.05 (or the
criterion alpha level) to reject the null hypothesis.
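To see what these two Excel commands compute, here is a rough Python sketch using only the standard library. The functions `chidist` and `chiinv` below are hypothetical stand-ins named after the Excel commands, not Excel's actual implementation; they assume a whole-number df and use the closed-form upper-tail formulas for the chi-square distribution, with a simple bisection search for the inverse.

```python
import math


def chidist(x, df):
    """Upper-tail chi-square probability for integer df (mirrors =chidist)."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer powers of x/2
    total = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chiinv(probability, df, tol=1e-9):
    """Critical value with the given upper-tail probability (mirrors =chiinv)."""
    low, high = 0.0, 1.0
    # Expand the upper bound until its tail probability drops below the target
    while chidist(high, df) > probability:
        high *= 2
    # Bisection: the tail probability decreases as x increases
    while high - low > tol:
        mid = (low + high) / 2
        if chidist(mid, df) > probability:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Critical value ("the number to beat") for df = 3 at alpha = 0.05
print(round(chiinv(0.05, 3), 3))

# Observed alpha level for the course-selection example (chi-square = 10.4, df = 3)
print(round(chidist(10.4, 3), 4))
```

For the course-selection example, the observed alpha level comes out well below 0.05, which agrees with the decision reached by comparing 10.4 to the critical value.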