Analyzing Statistical Inferences: A Review

Analyzing Statistical Inferences
How to Not Know Null
Agenda
• Inferential stats
– Descriptive vs. Inferential
– Ramifications of hypothesis testing
– Tests of significance
• Action Research
– Dissect Sanna paper
– Outline paper
– Discuss presentations
Teacher Salary Example
• Descriptive Stats
– Range of salary distributions
– Mean
– Percentages of teachers with different levels of
experience and degrees
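As a rough illustration of the descriptive statistics listed above, the sketch below computes a range, a mean, and category percentages. The salaries and degrees are made-up values, and the helper names `describe` and `percentages` are hypothetical; nothing here comes from real teacher data.

```python
from collections import Counter
from statistics import mean

# Hypothetical salaries (in dollars) and highest degrees for six teachers.
salaries = [38000, 42000, 45000, 51000, 56000, 62000]
degrees = ["BA", "BA", "MA", "MA", "MA", "PhD"]


def describe(values):
    """Return the minimum, maximum, range, and mean of a list of numbers."""
    low, high = min(values), max(values)
    return {"min": low, "max": high, "range": high - low, "mean": mean(values)}


def percentages(labels):
    """Return the percentage of cases that fall in each category."""
    counts = Counter(labels)
    return {label: 100 * n / len(labels) for label, n in counts.items()}


salary_summary = describe(salaries)
degree_percentages = percentages(degrees)
print(salary_summary)
print(degree_percentages)
```

These numbers only describe the six teachers in the list; they say nothing yet about teachers in general.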
Teacher Salary Example
• What kinds of questions would we ask if we
wanted to compare these 2 groups of
teachers?
• Inferential Statistics
Understand the need for using
inferential statistics to estimate likely
conclusions
• What Are Inferential Statistics?
– Inferential statistics refer to certain
procedures that allow researchers to make
inferences about a population based on
data obtained from a sample.
– The term “probability,” as used in research,
refers to the predicted relative frequency
with which a given event will occur (e.g., a p-value).
Descriptive vs. Inferential Stats
Descriptive
• Describe the data.
• Means, variances,
frequencies.
• Important precursor
to inferential stats
Inferential
• Infer from a sample
what is true of a
population.
• Rely on descriptive
stats.
• Ultimate goal - to
draw accurate
conclusions about
the population
The Notorious Null
• What is it??
– A statement about a relationship
– No differences between groups
– No relationships between variables
• What assumption should you always
make about the null?
Assume the null is accurate
The Notorious Null
• Null hypothesis differs in most instances
from the research hypothesis
– which states that one method is expected
to be more effective than another
• Rejecting the null hypothesis provides
evidence (but not proof) that the
intervention had an effect
Hypothesis Testing
• Tests of significance ask this question:
Could these observations really
have occurred by chance?
• Example of Jury Selection
p < .0000000000000000014
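One way to make the chance question concrete is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. This is a minimal sketch, not the method used for the jury example; the scores, the function name `permutation_p_value`, and the number of shuffles are all assumptions chosen for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10000, seed=0):
    """Estimate how often random relabeling gives a mean difference
    at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_a, shuffled_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(shuffled_a) / n_a - sum(shuffled_b) / len(shuffled_b))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical scores for two small groups.
method_a = [85, 88, 90, 92, 95]
method_b = [70, 72, 75, 78, 80]
p_estimate = permutation_p_value(method_a, method_b)
print(p_estimate)
```

A small estimate means that shuffled labels rarely reproduce a gap this large, which is evidence against the idea that the difference is only chance.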
Probability
• Level of significance (α) = the risk the researcher
accepts of being wrong to reject the null (to state
there is a true difference, but in reality
the difference is from chance)
• p = the probability of getting a difference at least
this large if the null were true
• In general, a result should have p < .05
to be considered significant.
• The decision a researcher must make
is:
– whether to reject the null hypothesis or
fail to reject it
• There are four possibilities…
Decisions concerning rejecting the null hypothesis…
                              The true status of the null hypothesis…
The researcher’s decision
about the null hypothesis…    True                 False
True                          Correct              Type II Error (β)
False                         Type I Error (α)     Correct
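The four possibilities can also be written as a small lookup. This is only a sketch of the decision table; the function name `classify_decision` and its two arguments are hypothetical, and "reject the null" corresponds to deciding the null is false.

```python
def classify_decision(null_is_true, reject_null):
    """Name the outcome for one cell of the decision table."""
    if null_is_true and reject_null:
        # Claimed a difference that is really due to chance.
        return "Type I error (alpha)"
    if not null_is_true and not reject_null:
        # Missed a difference that really exists.
        return "Type II error (beta)"
    return "Correct"


for null_is_true in (True, False):
    for reject_null in (False, True):
        outcome = classify_decision(null_is_true, reject_null)
        print(f"null true: {null_is_true}, rejected: {reject_null} -> {outcome}")
```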
Consider an example from a
kitchen…
• You probably have a smoke alarm
where you live.
• You have probably made microwave
popcorn or toast that set off your alarm
though you had no fire.
TYPE I error
• If you ever took the batteries out of your
smoke alarm because you got so
annoyed,…
• You ran the risk of having a fire but no
alarm.
TYPE II error
• If you had a fire, and your alarm worked,
you’re ok.
• If you had no fire, and no alarm, you’re
also ok.
• Hence…
Put in cooking terms…
                              The true status of the kitchen…
The status of the
smoke alarm…                  No Fire              Fire
No Alarm                      Correct              Type II Error (β)
Alarm                         Type I Error (α)     Correct
Back to the null hypothesis…
                              The true status of the null hypothesis…
The researcher’s decision
about the null hypothesis…    True                 False
True                          Correct              Type II Error (β)
False                         Type I Error (α)     Correct
Your Turn for Statistical Fun
• Create a null hypothesis regarding the
effectiveness of 2 methods of instruction on
student achievement.
• Using the chart on the previous slide, state
what is occurring with this particular
hypothesis.
• What are ramifications of incorrect decisions?
Probability
• Level of significance (α) = the risk the researcher
accepts of being wrong to reject the null (to state
there is a true difference, but in reality
the difference is from chance)
• p = the probability of getting a difference at least
this large if the null were true
• In general, a result should have p < .05
to be considered significant.
Things that affect p
• Difference between 2 groups
• Sampling and/or measurement error
• Size of the sample
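The three influences above can be seen by changing one input at a time in a simple two-group z test. This is a sketch under strong assumptions: both groups share a known standard deviation `sd` (standing in for sampling and measurement error), the groups are the same size, and the function name `two_sided_p` and all numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means,
    assuming both groups share a known standard deviation."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = abs(mean_diff) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


baseline = two_sided_p(mean_diff=3, sd=10, n_per_group=30)
bigger_difference = two_sided_p(mean_diff=6, sd=10, n_per_group=30)
less_error = two_sided_p(mean_diff=3, sd=5, n_per_group=30)
bigger_sample = two_sided_p(mean_diff=3, sd=10, n_per_group=200)

for name, p in [
    ("baseline", baseline),
    ("bigger difference", bigger_difference),
    ("less error", less_error),
    ("bigger sample", bigger_sample),
]:
    print(f"{name}: p = {p:.4f}")
```

In this sketch the baseline is not significant at .05, while a larger difference, less noise, or a larger sample each push p below .05.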
Steps in using inferential statistics
1. Select the test of significance
2. Determine whether the significance test will
be two-tailed or one-tailed
3. Select α (alpha), the probability level
(usually .05)
4. Compute the test of significance
5. Consult a table to determine the
significance of the results
How to determine p
Tests of Significance
• Statistical formulas that enable the
researcher to determine if there was a
real difference between the sample
means
• Examples
– t test
– ANOVA
– Chi-square
t test
• Used to determine whether two means are
significantly different at a selected probability
level
• Adjusts for the fact that the distribution of
scores for small samples departs further
from the normal distribution as the
sample size decreases
• Sample t-table
t test
• If the computed t value is equal to or greater
than the critical value in the table, then the null
hypothesis is rejected because the difference is
greater than would be expected due to
chance
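The comparison against the table value can be sketched as follows. This assumes an independent-samples t test with pooled variance and a two-tailed test at α = .05; the scores are hypothetical, the function name `independent_t` is made up, and 2.228 is the standard t-table entry for 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical test scores for two methods of instruction.
method_a = [78, 82, 85, 88, 90, 93]
method_b = [70, 74, 77, 79, 83, 85]

t_value, df = independent_t(method_a, method_b)
CRITICAL_T = 2.228  # t-table value: two-tailed, alpha = .05, df = 10
reject_null = abs(t_value) >= CRITICAL_T
print(f"t = {t_value:.3f}, df = {df}, reject null: {reject_null}")
```

The absolute value is used because a two-tailed test counts a large difference in either direction.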
Reminder…
• Don’t forget the purpose. You are going
through this statistical rigmarole
because you want to know
Could these observations really
have occurred by chance?
ANOVA
• A comparison of the means for two or more
groups
• Example - Do the mean scores differ for the
groups using co-operative group, lecture, or
web-based instruction?
• The assumption is that randomly formed
groups of participants are chosen and are
essentially the same at the beginning of a
study on a measure of the dependent
variable
ANOVA
• F value of ANOVA is similar to t-value in
t-test.
• If F value is significant, you know there
is a difference somewhere, but have to
do post hoc tests to figure out where.
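A one-way ANOVA F value can be computed by hand as the ratio of between-group to within-group variance. The sketch below uses made-up scores for the three instruction types from the example; the function name `one_way_anova_f` is hypothetical, and 3.89 is the standard F-table entry for α = .05 with 2 and 12 degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (score - m) ** 2 for group, m in zip(groups, group_means) for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three instruction types.
cooperative = [82, 85, 88, 90, 95]
lecture = [70, 75, 78, 82, 85]
web_based = [77, 80, 83, 86, 89]

f_value, df_between, df_within = one_way_anova_f([cooperative, lecture, web_based])
CRITICAL_F = 3.89  # F-table value: alpha = .05, df = (2, 12)
significant = f_value >= CRITICAL_F
print(f"F({df_between}, {df_within}) = {f_value:.3f}, significant: {significant}")
```

A significant F here says only that the three means are not all equal; it does not say which groups differ, which is why post hoc tests follow.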
Chi-Square
• Tests differences in frequencies across
different categories
– Do mothers and fathers differ in their
support of a year-round school calendar?
– Do the percentages of undergraduate,
graduate, and doctoral students differ in
terms of their support for the new class
attendance policy?
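For the mothers and fathers question, a chi-square statistic compares the observed counts with the counts expected if support did not depend on the group. The counts below are invented, the function name `chi_square` is hypothetical, no continuity correction is applied, and 3.841 is the standard chi-square table entry for α = .05 with 1 degree of freedom.

```python
def chi_square(observed):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts: rows are mothers and fathers,
# columns are "support" and "do not support".
observed_counts = [
    [30, 20],
    [18, 32],
]
chi2_value, chi2_df = chi_square(observed_counts)
CRITICAL_CHI2 = 3.841  # chi-square table value: alpha = .05, df = 1
frequencies_differ = chi2_value >= CRITICAL_CHI2
print(f"chi-square = {chi2_value:.3f}, df = {chi2_df}, differ: {frequencies_differ}")
```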
Some words about significance
• “Statistical significance” is a term that
refers to meeting some statistical criterion,
usually a p-value below the chosen level
of significance.
• “Practical significance” means its utility,
and that is in the eyes of the beholder.
What may be impractical to you or me
may be very practical to someone else.
Practical Significance
• An Example – A new reading program shows
improved comprehension scores that are
statistically significant. However, it takes
many hours and dollars to train teachers to
use the program. Does it warrant buying the
new program?
• Big Question: Is it practical to use the results?
“There is no magical or purely technical
way to decide whether or not a
statistically significant difference means
you should do something different in
your school. There are only tools that
assist your judgment. There is no
escape from using judgment.”
--Gerald W. Bracey
Reading Educational Research: How
To Avoid Getting Statistically Snookered
Practical Significance
Large sample sizes can produce a
statistically significant result even
though there is limited or no practical
importance associated with the finding.
Effect Size
• Take into account variance, not just the
means.
• Refers to the magnitude of a difference.
• Levels you should know
d ≥ .75 = large effect
d ~ .5 = moderate effect
d ~ .3 = small effect
• Good website on Effect Size
http://www.cemcentre.org/renderpage.asp?linkID=30325016
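One common effect size for two groups is Cohen's d: the difference between the means divided by the pooled standard deviation, which is how variance enters the picture. The sketch below is an assumption about which d is meant here; the scores and function names are hypothetical, and turning the approximate levels listed above into hard cutoffs in `label_effect` is a simplification made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect(d):
    """Rough label based on the levels listed above (cutoffs are an assumption)."""
    size = abs(d)
    if size >= 0.75:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.3:
        return "small"
    return "below small"


# Hypothetical test scores for two programs.
new_program = [78, 82, 85, 88, 90, 93]
old_program = [70, 74, 77, 79, 83, 85]

d_value = cohens_d(new_program, old_program)
print(f"d = {d_value:.3f} ({label_effect(d_value)})")
```

Unlike p, d does not shrink or grow just because the sample gets larger, so it helps with judging practical significance.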
Evaluation Criteria
• Basic descriptive statistics are needed
to evaluate the inferential results
• Inferential analyses report statistical
significance, not practical significance
• Inferential analyses do not indicate
internal or external validity
• The results depend on sample sizes
Evaluation Criteria
• The appropriate statistical procedures
are used
• The level of significance is interpreted
correctly