Inferential statistics
Hypothesis testing
Questions statistics can help us answer
• Is the mean score (or variance) for a given
population different from the one predicted?
• Are two populations different on some
characteristic?
• Are two (or more) variables related in the
population under study?
• For all three above: Is the finding due to pure
chance?
– Requires probability sample or randomization
Statistical significance
• Could my findings be due to chance? (luck)
• Researcher sets acceptable level prior to data
collection
– Less than 5 times in 100 [p < .05]
– Less than 1 time in 100 [p < .01]
• If the researcher chooses p < .01, then the results are
considered ‘statistically significant’ if, by chance alone,
fewer than 1 sample in 100 would generate a result at
least as extreme as the one found in the study (depending
on what you’re looking for)
• Statistical significance is not the same as
theoretical or social significance
– Very sensitive to sample size
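To illustrate the sensitivity to sample size, here is a minimal sketch that tests the same observed proportion against 50% at two sample sizes. The 52% figure, both sample sizes, and the function name are made up for illustration, and the normal approximation is an assumption chosen to keep the example short.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_for_proportion(p_hat, n, p_null=0.5):
    """Normal-approximation z-test of a sample proportion against p_null."""
    se = sqrt(p_null * (1 - p_null) / n)      # standard error under the null
    z = (p_hat - p_null) / se                 # distance from the null in SE units
    return 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

# Same observed proportion (52%), two hypothetical sample sizes.
p_small = two_sided_p_for_proportion(0.52, 100)
p_large = two_sided_p_for_proportion(0.52, 10_000)
print(f"n = 100:    p = {p_small:.3f}")
print(f"n = 10000:  p = {p_large:.6f}")
```

The two-point difference from 50% is identical in both cases. Only the larger sample makes it statistically significant, which says nothing about whether a two-point difference matters in practice.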
Sample difference from ideal number
• A sample statistic may be compared to an ideal
number (often 50% or the population percentage
established by a census) to determine whether the
group it represents has some characteristic.
• For example, is the population of
Telecommunications students more than
half male?
• Take a sample of Tel students and compare the %
who are male to 50%
– Depending on the percentage of males in the sample
and the size of the sample, you may be able to say that
the finding that Tel students are more likely to be male
is “statistically significant”
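The comparison to 50% can be sketched as an exact one-sided binomial test. The counts below (60 males in a sample of 100) are hypothetical, not data from any study, and the function name is an illustrative choice.

```python
from math import comb

def binomial_upper_tail(successes, n, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(n, p_null)."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical sample: 60 of 100 Tel students are male.
males, sample_size = 60, 100
p_value = binomial_upper_tail(males, sample_size)
print(f"P(at least {males} males in {sample_size} if the true share is 50%) = {p_value:.4f}")
```

The result would be compared to the significance level chosen before data collection. The same 60% in a much smaller sample would give a larger p-value.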
Differences between two samples
• You may want to see if two samples are different
on some characteristic
• Compare means of two samples on the outcome
measure
• If the difference between the two group means is large
in comparison to the variance within groups (given the
sample sizes), then the difference between groups will
be statistically significant
For example:
• Are Telecommunications students more
likely to be male than are Geography
students?
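The idea of a between-group difference judged against within-group variance can be sketched with a t statistic. This sketch uses Welch's version (no equal-variance assumption), which is a choice made here rather than something the slides specify, and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t: difference in means divided by its standard error."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Hypothetical scores on some outcome measure for two samples.
group_a = [12, 15, 14, 16, 13, 15]
group_b = [10, 11, 12, 10, 13, 11]
t_stat = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}")
```

Turning t into a p-value requires the t distribution, which Python's standard library does not provide, so that step is left out here.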
Comparisons among groups
• Commonly used in experimental studies
• t-test
– preferred where two groups can be compared
according to a hypothesis
• ANOVA
– comparison among multiple groups
– allows for factorial designs
– good at dealing with interactions
– F ratio
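As a rough sketch of what the F ratio measures, the following computes it for a one-way design: variance between group means divided by variance within groups. The three groups are made-up data and the function name is illustrative.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio for a one-way ANOVA: mean square between / mean square within."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Three hypothetical groups of three scores each.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
f_ratio = one_way_f(groups)
print(f"F = {f_ratio:.1f}")
```

A large F means the groups differ from one another by more than the scores vary inside each group.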
[Chart: Main effect of commercial. Male and Female plotted for Commercial A and Commercial B; vertical axis 0 to 35]
[Chart: Main effect of gender. Male and Female plotted for Commercial A and Commercial B; vertical axis 0 to 35]
[Chart: Interaction. Male and Female plotted for Commercial A and Commercial B; vertical axis 0 to 35]
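The difference between main effects and an interaction can be sketched from the four cell means of a 2 x 2 design. The values below are invented to show a pure crossover interaction. They are not read from the charts, whose underlying numbers are not available here.

```python
from statistics import mean

# Cell means for a gender x commercial design (made-up values).
cell_means = {
    ("Male", "Commercial A"): 30, ("Male", "Commercial B"): 10,
    ("Female", "Commercial A"): 10, ("Female", "Commercial B"): 30,
}

def marginal_mean(level, position):
    """Average the cell means that share one level of one factor."""
    return mean(v for key, v in cell_means.items() if key[position] == level)

# Main effects: difference between marginal means of each factor.
main_gender = marginal_mean("Male", 0) - marginal_mean("Female", 0)
main_commercial = marginal_mean("Commercial A", 1) - marginal_mean("Commercial B", 1)

# Interaction: does the commercial effect differ between genders?
interaction = (
    (cell_means[("Male", "Commercial A")] - cell_means[("Male", "Commercial B")])
    - (cell_means[("Female", "Commercial A")] - cell_means[("Female", "Commercial B")])
)
print(main_gender, main_commercial, interaction)
```

With these numbers neither factor has a main effect, yet the commercials work in opposite directions for the two groups, which is what an interaction captures.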
Relationships among variables
• Eyeballing a scatterplot
Statistics of association
• If we want a more precise test as to what extent
two variables are related, we would look at a
statistic of association
• Different statistics are used depending upon the
kind of scale we are concerned with and the
assumed population distribution of the variables
– Parametric
– Nonparametric
• Two nominal—Chi-square (non-parametric)
• Two ordinal—Spearman’s rho
• Two ratio—Pearson’s r
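For two nominal variables, the chi-square statistic compares observed counts with the counts expected if the variables were unrelated. A minimal sketch, using an invented 2 x 2 table of counts:

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are one nominal variable, columns the other.
table = [[30, 20],
         [20, 30]]
chi2 = chi_square(table)
print(f"chi-square = {chi2:.2f}")
```

For a 2 x 2 table (1 degree of freedom) the critical value at p < .05 is 3.84, so a statistic above that would count as statistically significant at that level.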
Covariance among variables
• Correlation
• How much of the variance in one measure can be
accounted for by variance in another?
• Essentially, when we know where someone stands on
one variable, how much does it help us predict where
they stand on another?
• Stats: Pearson’s r; r² (coefficient of determination)
• Regression
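Pearson's r and the coefficient of determination can be computed directly from their definitions. The data and variable names below are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: co-variation scaled by each variable's own variation."""
    mx, my = mean(xs), mean(ys)
    co_variation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    x_variation = sum((x - mx) ** 2 for x in xs)
    y_variation = sum((y - my) ** 2 for y in ys)
    return co_variation / sqrt(x_variation * y_variation)

# Hypothetical paired measurements on five people.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
r_squared = r ** 2   # share of variance in one measure accounted for by the other
print(f"r = {r:.3f}, r squared = {r_squared:.2f}")
```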
Linear regression
• Minimizes total squared distances between
individual data points and regression line
– y=ax+b
• allows for prediction of behavior of
dependent variable
• preferred to simple correlation
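A minimal least-squares sketch of fitting y = ax + b and using the line for prediction. The data points are invented and the function name is illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Slope a and intercept b minimizing the sum of squared vertical distances."""
    mx, my = mean(xs), mean(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx   # the fitted line passes through the point of means
    return a, b

# Hypothetical data: x is the independent variable, y the dependent one.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
predicted = a * 6 + b   # predicted y for a new x of 6
print(f"y = {a:.1f}x + {b:.1f}; prediction at x = 6: {predicted:.1f}")
```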
Multiple variables
• We can statistically analyze the relations
among multiple variables at the same time
• Attempt to determine unique contribution of
a single independent variable
– Identify, then remove the effect of variables not
currently being tested
– Evaluate relationship between predictor
variable of interest and “corrected” scores on
dependent variable
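One way to sketch the "identify, then remove" idea is a partial correlation computed from residuals: regress both the predictor and the dependent variable on the control variable, then correlate what is left over. The data, variable names, and helper functions below are all hypothetical, and this is only one of several ways to isolate a unique contribution.

```python
from math import sqrt
from statistics import mean

def residuals(ys, xs):
    """What remains of ys after removing its straight-line relation to xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

# Hypothetical data: control variable, predictor of interest, dependent variable.
control = [1, 2, 3, 4, 5, 6]
predictor = [2, 1, 4, 3, 6, 5]
outcome = [3, 4, 6, 5, 9, 8]

corrected_outcome = residuals(outcome, control)      # "corrected" scores
corrected_predictor = residuals(predictor, control)
partial_r = pearson_r(corrected_predictor, corrected_outcome)
print(f"zero-order r = {pearson_r(predictor, outcome):.3f}")
print(f"partial r    = {partial_r:.3f}")
```

In this made-up example the partial correlation is smaller than the zero-order one, because part of the predictor's apparent relationship with the outcome is shared with the control variable.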
Common statistical methods
• Multiple correlation
• Multiple regression
– Variables may be entered according to
theoretical model or their predictive strength
So:
• Inferential statistics do two jobs:
– They provide an estimate of how likely it is that
your findings are due to chance
– They provide an estimate of the ‘size’ of the
relationship between two or more variables
• The appropriate statistics vary by the
method you used to collect data, the
assumed population distribution of your
variables and the scale types you employed
to measure them