Inferential Statistics (K-19)
research
remembering the great
Lee Cronbach
April 22, 1916 to October 1, 2001
• in 1992, Psychological Bulletin
published a list of its 10 most cited
articles. Lee Cronbach was first or
only author on 4 of the 10
• the trick to doing research is to
begin with the question and then
to figure out the best way to
answer that question.
• the mistake is to begin with the
method and fit the question to
the method.
an introduction to
statistics (cont.)
• research using
– measurement description
– statistical analysis
critical for answering certain kinds of
important questions
strengths of measurement description
• precise descriptions
• often efficient—one can make confident
predictions based on relatively small
samples—if samples good
• increasingly sophisticated ways of
analyzing measurement data
• powerful stat packages now available for
desktop computers, e.g., Systat, SPSS,
SAS, Resampling Statistics
cautions
• measure only what can be measured
– “to replace the unmeasurable with the
unmeaningful is not progress” (Achen,
1977)
• value precision but realize that a precise
description may not be an accurate one
• remember that scientific method (drawing
inferences from observations) comprises
many specific methods and that its
strength does not come from any one
method
my personal recommendations
• whatever your Ph.D. research
specialization, take at least one stat
course
• whatever your methodological
expertise, find people with similar
interests but different methodological
expertise and work with them—the
best research often uses multiple
methods
type I & type II error (revisited)
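• type I: accepting what is really
false, i.e., rejecting a null hypothesis
that is true (alpha error)
• type II: rejecting as false what
should be accepted, i.e., failing to
reject a null hypothesis that is false
(beta error)
• with sample size and effect size
fixed, decreasing the probability of
one increases the probability of the
other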
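The trade-off between alpha and beta can be sketched numerically. This is a minimal illustration, not part of the original slides: it assumes a two-group comparison, a normal approximation, and hypothetical values for the effect size (0.5 standard deviations) and group size (30). The function name `beta_for_alpha` is invented for the example.

```python
import math
from statistics import NormalDist

def beta_for_alpha(alpha, effect_size=0.5, n_per_group=30):
    """Approximate type II error rate (beta) for a two-tailed, two-group test.

    Uses a normal approximation and ignores the far tail.
    effect_size is the mean difference in standard-deviation units
    (a hypothetical value chosen for illustration).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    power = NormalDist().cdf(effect_size * math.sqrt(n_per_group / 2) - z_crit)
    return 1 - power

# Stricter alpha (fewer type I errors) means larger beta (more type II errors).
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  beta = {beta_for_alpha(alpha):.2f}")
```

With the sample and the effect held fixed, each step down in alpha raises beta, which is the point of the last bullet above.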
when theory testing
• be concerned about Type I error
when theory building
• be concerned about Type II error
• pointless to talk about Type I and Type
II error absent discussion of what is at
stake if I am wrong
cost of type I error in theory testing
• dominant theory not challenged
• knowledge production stopped
cost of type II error in theory
building
• possibly important explanations etc
ignored
• knowledge production stopped
this Type I/Type II error discussion
is one of the many contributions that
Lee Cronbach made to doing
research, and one of the many
challenges he made to the accepted
wisdom of the day.
KRATHWOHL
inferential statistics (ch 19)
• inferential statistics allow one to infer
the characteristics of a population from
a representative sample
– from sample one can estimate
characteristics of population within a
determined range with a given
probability
– determine whether an effect beyond
sampling and chance error exists in a
study with a given probability
• parameters: refer to population
• statistics: refer to sample
• sampling distribution: distribution of a
descriptive statistic calculated from
repeated samples
• confidence intervals: range that
includes the population value with a
given probability (based on the standard
error of the statistic)
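A sampling distribution can be simulated directly. The sketch below is an illustration only: the population (normal, mean 100, SD 15), the sample size, and the function name `sample_means` are all assumptions made for the example. It draws many samples, keeps each sample mean, and compares the spread of those means with the textbook standard error, sigma divided by the square root of n.

```python
import math
import random
import statistics

def sample_means(n_samples, sample_size, mu=100.0, sigma=15.0, seed=1):
    """Draw repeated samples from a normal population; return each sample's mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means

means = sample_means(n_samples=2000, sample_size=25)
observed_se = statistics.stdev(means)      # spread of the sampling distribution
theoretical_se = 15.0 / math.sqrt(25)      # sigma / sqrt(n)
print(f"observed SE {observed_se:.2f}, theoretical SE {theoretical_se:.2f}")
```

The two values should be close; the standard deviation of the sampling distribution is what the standard error estimates.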
confidence level:
• the probability that the interval will
contain the population value
(conventionally 68, 95, and 99%, or 2
to 1, 19 to 1, 99 to 1 respectively, or
±1 SE, ±1.96 SE, ±2.58 SE
respectively)
• the wider the interval the more
certain it contains the population
value (and the less valuable the
information becomes)
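The multipliers on the slide translate into a short calculation. This is a sketch under stated assumptions: the scores are hypothetical, the function name `confidence_interval` is invented, and the normal (z) multipliers are used as given on the slide, although with a sample this small a t multiplier would normally be preferred.

```python
import math
import statistics

# z multipliers from the slide: ±1 SE (about 68%), ±1.96 SE (95%), ±2.58 SE (99%)
Z_MULTIPLIERS = {68: 1.0, 95: 1.96, 99: 2.58}

def confidence_interval(sample, level=95):
    """Return (low, high) for the population mean as mean ± z * SE."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    z = Z_MULTIPLIERS[level]
    return mean - z * se, mean + z * se

scores = [72, 85, 90, 66, 78, 81, 94, 70, 88, 76]  # hypothetical test scores
for level in (68, 95, 99):
    low, high = confidence_interval(scores, level)
    print(f"{level}% CI: {low:.1f} to {high:.1f}")
```

Each higher confidence level gives a wider interval around the same sample mean, which is the trade-off described in the last bullet.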
hypothesis testing (traditionally takes
form of rejecting the null hypothesis,
i.e., that there is no effect beyond
sampling and chance error)
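alpha level: the risk of rejecting the
null hypothesis when the result is really
due to chance; set by the researcher in
advance, traditionally .10, .05, .01, .001
p-level: the actual probability level
found (the probability of a result at least
this extreme if the null hypothesis were
true), which is then compared to the
alpha level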
two-tailed test:
• non-directional, splits the alpha level
between both ends. Used when one does
not predict the direction of the result
one-tailed test:
• directional, puts alpha level at one
end (determined by researcher).
Increases probability of finding
statistically significant result when the
effect is in the predicted direction
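The difference between the two kinds of test shows up when the same test statistic is converted to a p-level both ways. A minimal sketch, assuming a z statistic and assuming the researcher predicted a positive effect; the function name `p_values` and the value 1.80 are illustrative only.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one_tailed, two_tailed) p-levels for a z statistic.

    The one-tailed value assumes the predicted direction is positive.
    """
    one_tailed = 1 - NormalDist().cdf(z)               # area in the upper tail only
    two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))    # area in both tails
    return one_tailed, two_tailed

alpha = 0.05
one, two = p_values(1.80)
print(f"one-tailed p = {one:.3f}, significant: {one < alpha}")
print(f"two-tailed p = {two:.3f}, significant: {two < alpha}")
```

A z of 1.80 clears the .05 alpha level one-tailed but not two-tailed, so the choice of test has to be made before looking at the data.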
common statistical tests
t test of difference between means
• common and simple test for
differences between means of two
groups
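A worked version of the t test, computed by hand rather than with a stat package. This sketch assumes two independent groups with roughly equal variances (the pooled-variance form); the data and the function name `independent_t` are hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance t test of two independent groups."""
    na, nb = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance, n - 1 in the denominator
    var_b = statistics.variance(group_b)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))  # standard error of the difference
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, na + nb - 2

treatment = [5, 6, 7, 8, 9]   # hypothetical scores
control = [3, 4, 5, 6, 7]
t, df = independent_t(treatment, control)
print(f"t = {t:.2f}, df = {df}")
```

Here t is 2.00 with 8 degrees of freedom, short of the two-tailed .05 critical value of 2.306 from a t table, so a two-point difference in means is not statistically significant with groups this small.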
chi-square
• common test for categorical data and
frequencies—are cell values different
from what would be expected
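The "expected" cell values come from the row and column totals. The sketch below assumes a simple contingency table of counts; the table and the function name `chi_square` are hypothetical.

```python
def chi_square(observed):
    """Return (statistic, df) for a chi-square test on a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # count expected in this cell if rows and columns were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# hypothetical counts: rows are two groups, columns are yes / no answers
table = [[30, 10],
         [20, 40]]
stat, df = chi_square(table)
print(f"chi-square = {stat:.2f}, df = {df}")
```

For this table the statistic is about 16.67 with 1 degree of freedom, well above the .05 critical value of 3.84, so the cell values differ from what would be expected by chance.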
ANOVA (analysis of variance)
• commonly used in experimental designs
where two or more groups or multiple
conditions are being compared (thus
common in psychology and ed psych, and in
educational research in general)
• powerful: more accurate measure of
error variance, tests significance of each
variable as well as combined effect,
avoids the inflated chance of a false
positive that comes from running many
separate t tests
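The core of a one-way ANOVA is the ratio of between-group to within-group variance. A minimal sketch, assuming one factor and independent groups; the data and the function name `one_way_f` are hypothetical, and the multi-factor designs mentioned above need more than this.

```python
import statistics

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # variation of scores around their own group mean (error variance)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

conditions = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical scores, three conditions
f, df_b, df_w = one_way_f(conditions)
print(f"F({df_b}, {df_w}) = {f:.2f}")
```

These three groups give F(2, 6) = 13.00, above the .05 critical value of 5.14, from a single test rather than three pairwise t tests.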
errors of inference
• type I error: a concern when theory
testing (K, “when validating a finding”)
• type II error: a concern when theory
building (K: “when exploring”)
statistical power: 1-beta
• increasing statistical power:
– increase size of effect (stronger
treatment)
– increase sample size
– reduce variability
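The three ways of increasing power can be checked with a rough calculation. This is a sketch under assumptions not in the slides: a two-group, two-tailed comparison, a normal approximation, and an effect size expressed as the mean difference divided by the standard deviation, so that a stronger treatment or less variability both show up as a larger effect size. The function name `approx_power` is invented.

```python
import math
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-tailed, two-group comparison."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # expected size of the test statistic if the effect is real
    expected_z = effect_size * math.sqrt(n_per_group / 2)
    return NormalDist().cdf(expected_z - z_crit)

print(f"baseline (d = 0.5, n = 30):  {approx_power(0.5, 30):.2f}")
print(f"larger effect (d = 0.8):     {approx_power(0.8, 30):.2f}")
print(f"larger sample (n = 64):      {approx_power(0.5, 64):.2f}")
```

Halving the standard deviation doubles the effect size, so reducing variability works through the same term as a stronger treatment.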
statistical & practical significance
• statistical: confidence at a given
probability that the result is not due
to chance
• practical: is the result important
enough, big enough, feasible,
affordable—all value judgments
– if one apple a day keeps the doctor
away, but it takes three grapefruit,
then…?
• no statistic or statistical test can
make practical decision
• whether one risks being wrong
cautiously (type I) or wrong
incautiously (type II) cannot be
decided absent cost and risk, what’s
at stake, needs etc
• no statistical analysis better than
the numbers (descriptions) fed into
it: garbage in, garbage out
statistical significance refers only to
samples from population
• it does not refer to size of effect:
ceteris paribus larger effects are
more likely to be statistically
significant, but with large samples
very small effects will be statistically
significant too
• if you have the population, then any
effects are real no matter what the
size
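A quick calculation shows how sample size alone can turn a trivial effect into a statistically significant one. The sketch assumes a two-group comparison under a normal approximation and assumes the sample shows exactly the stated effect; the effect size (0.05 standard deviations) and the function name `two_tailed_p` are hypothetical.

```python
import math
from statistics import NormalDist

def two_tailed_p(effect_size, n_per_group):
    """Two-tailed p-level when the observed effect equals effect_size (in SD units)."""
    z = effect_size * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.05  # one twentieth of a standard deviation
for n in (100, 10_000):
    print(f"n per group = {n:>6}: p = {two_tailed_p(tiny_effect, n):.4f}")
```

The effect is the same size in both rows; only the sample changed, which is why statistical significance says nothing about whether an effect is large enough to matter.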
no proof in science:
• a statistically significant result (assuming
appropriate analysis etc) does not prove
that the hypothesis is true, only that it
has escaped disconfirmation
• the more often an hypothesis passes the
test and the more demanding the tests it
passes, then the more certain we can be
that we know something—the more we
have reduced uncertainty
other terms
• parametric: assumes random sampling,
from distribution with known parameters,
often normal distribution
• nonparametric: when data do not come
from known distribution—often with
nominal or ordinal data
• robust test: accurate even when
assumptions violated
• effect size: too often ignored—journals
now requiring estimates of effect size.
terms you should look up in Vogt
• effect size
• emic & etic
• endogenous and exogenous variables
• face validity
• file drawer problem
• gambler’s fallacy
• halo effect or bias
• hold constant
• independence
• interaction effect
ethics
Sieber ch 6: Strategies for Assuring
Confidentiality
6.1 Confidentiality refers to agreements
with people about what can be done with
data
• states steps will be taken to ensure
privacy
• states legal limitations to assurances of
confidentiality
6.2 why an issue (be able to discuss the
cases)
6.3 confidentiality or anonymity
6.4 procedural approaches to assuring
confidentiality
6.4.1 cross-sectional research
– anonymity
– temporarily identified responses
– separately identified responses
6.4.2 longitudinal data (requires links)
– aliases
6.4.3 interfile linkage
6.5 statistical strategies for assuring
confidentiality (coin flip example)
6.6 certificates of confidentiality
– researchers do NOT have testimonial
privilege unless they have certificate
of confidentiality from Dept of
Health and Human Services
6.7 confidentiality and consent:
– consent statement must specify
promises of confidentiality researcher
cannot make—be aware of state
reporting laws, e.g., on child abuse
6.8 data sharing
– when data shared publicly, all
identifiers must be removed and
researcher must ensure no way to
deduce identity
– techniques
simple statistical way to find out what
people may not be willing to admit
• ask people to flip coin
– if head, answer “don’t know”
– if tail and have done X, answer “don’t
know”
– if tail and have not done X, answer
“no”
• thus, the number of “no’s” estimates half
of those who have not done X
• thus, N minus twice the number of “no’s”
gives estimate of those who have done X
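The coin flip procedure can be tried out in a simulation. This is a sketch, not Sieber's own presentation: it assumes a fair coin and truthful respondents, and the names `simulate_answers` and `estimate_done_x` and the 30% true rate are hypothetical.

```python
import random

def simulate_answers(n, true_rate, seed=0):
    """Simulate n respondents following the coin flip rules above."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    answers = []
    for _ in range(n):
        did_x = rng.random() < true_rate
        heads = rng.random() < 0.5
        # heads, or tails and has done X: "don't know"; tails and has not: "no"
        answers.append("don't know" if heads or did_x else "no")
    return answers

def estimate_done_x(answers):
    """Estimate how many did X: N minus twice the number of "no" answers."""
    no_count = sum(1 for a in answers if a == "no")
    return len(answers) - 2 * no_count

answers = simulate_answers(n=20_000, true_rate=0.30)
estimate = estimate_done_x(answers)
print(f"estimated share who did X: {estimate / len(answers):.3f}")
```

No single "don't know" reveals anything about the person who gave it, yet the group estimate lands near the true rate; the price is extra sampling error, since only about half the respondents contribute information.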
case 3
• what did you learn from reading this
case?
• how would you write this case
differently?
• do you think that this case is
realistic?
• what should our hero do?
writing
general style rules
and tips
use active voice
• I interviewed the kids. (good)
• The kids were interviewed. (bad)
use first person to talk about yourself
• I interviewed the kids. (good)
• The researcher interviewed the kids.
(bad)
do not begin sentences with “there is” or
“it is” etc.
• There were three kids who answered…
(bad)
• Three kids answered the questions.
(good)
use who for people, that for things
• I interviewed the kids, who all
agreed….(good)
• I interviewed the teacher that was in….
(bad)
pronouns must refer to nouns
• I entered the room and found the kids
running across the table tops and
throwing erasers at each other. That
made me nervous. (bad—not clear what
made you nervous)
use comma to separate clauses in
compound sentence joined by a
conjunction (e.g., and, but)
• I interviewed the kids, but they did not
appear comfortable. (good)
introductory adjectival phrases must
modify the subject
• Rushing into the room, the class had
already begun. (bad)
• Rushing into the room, I discovered that
the class had already begun. (good)
use “Harvard comma”
• apples, pears, and bananas (good)
• apples, pears and bananas (bad)
find the right word
• Mark Twain observed that the
difference between the right word
and the almost right word is the
difference between lightning and a
lightning bug.
grad life
more bests
best place to prepare for Hallowe’en
• Dallas & Company, 1st & University, C
best used book stores
• Babbitt’s, 608 E Green C
• Jane Addams, 208 N. Neil C
• Old Main Book Shop, 116 N Walnut C
• Priceless Books, 108 W Main U