Chapter 2: Everything you ever wanted to know about statistics

Imagine an engineer who needs to build a bridge. He or she would collect
data, construct a model to test in a lab, and then build the actual bridge.
In scientific research we use statistics to develop models to test the
outcome or effect of some treatment. We strive to optimize the fit of our
model with the population of interest.
The data set below can be fit with either a linear or non-linear
model. We need some method to evaluate how well each
model fits the data. Variance and the standard deviation (SD)
can be used to quantify the error in the model.
The dotted red lines show the deviance. The deviance is the
vertical distance from an actual data point up to or down to the
mean (blue line).
The deviance can be thought of as the error in the model.
Degrees of Freedom
• Degrees of Freedom: the degrees of freedom
are the number of observations that are free to
vary.
• Example. Choose any 3 numbers whose
sum is 10. The first two numbers can be any
numbers, but once they are chosen the third
number is no longer free; its value is
determined by your choice of the first two
numbers: 5 + 4 + ___ = 10
• In the example above the degrees of
freedom are n - 1 = 3 - 1 = 2.
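The example can be sketched in a few lines of Python. This is only an illustration of the 5 + 4 + ___ = 10 case; the variable names are made up for the sketch.

```python
# Three numbers must sum to 10 (the constraint from the example above).
target_sum = 10
free_choices = [5, 4]            # the first two numbers can be anything

# The last number is fixed by the constraint, so it is not free to vary.
last_number = target_sum - sum(free_choices)

n = len(free_choices) + 1        # total number of observations
degrees_of_freedom = n - 1       # one observation is "used up" by the constraint

print(last_number)               # 1
print(degrees_of_freedom)        # 2
```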
Measures of Variation
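• The sum of squared errors (SS)
is the sum of the squared deviances.
It is a good measure of the accuracy
of the model, but it grows with the
number of data points.
• The variance (s²) is the
average squared error between
the mean and the data
points: SS divided by the degrees
of freedom (n - 1). The variance is in
units squared.
• The standard deviation (s)
is the square root of the
variance. The standard deviation is
in the original units of measurement.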
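A minimal sketch of these three measures, assuming a small made-up data set (the scores below are hypothetical and only there to make the arithmetic easy to follow):

```python
import math
import statistics

scores = [1, 2, 3, 3, 4]                    # hypothetical data, illustration only

mean = sum(scores) / len(scores)
deviances = [x - mean for x in scores]      # signed distance of each point from the mean
ss = sum(d ** 2 for d in deviances)         # sum of squared errors (SS)
variance = ss / (len(scores) - 1)           # s^2: SS over n - 1, in squared units
sd = math.sqrt(variance)                    # s: back in the original units

# Cross-check against the standard library's sample statistics.
assert math.isclose(variance, statistics.variance(scores))
assert math.isclose(sd, statistics.stdev(scores))

print(f"SS = {ss:.2f}, variance = {variance:.2f}, SD = {sd:.2f}")
# SS = 5.20, variance = 1.30, SD = 1.14
```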
Expressing the Mean as a Model
• Everything in statistics essentially boils down to
one equation:
• Outcome_i = (model) + error_i
• The data we observe can be predicted from the
model we choose to fit the data plus some
amount of error.
• Likewise, the variance and standard deviation
quantify the goodness of the fit.
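As a rough illustration, the sketch below uses the mean as the model for a hypothetical set of scores and splits each outcome into model plus error. The data and names are assumptions made for the example.

```python
scores = [1, 2, 3, 3, 4]                  # hypothetical outcomes
model = sum(scores) / len(scores)         # the mean is the model: one predicted value

# error_i = outcome_i - model
errors = [outcome - model for outcome in scores]

# Each outcome is recovered as model + error (up to floating-point rounding).
rebuilt = [model + e for e in errors]

# Raw errors cancel out around the mean, which is why they are squared
# before being summed into SS, the variance, and the SD.
total_error = sum(errors)

print(model)                              # 2.6
print(abs(total_error) < 1e-9)            # True
```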
If you take several
samples from a
population these
samples will differ
slightly.
It is important to know
how well a sample
represents a
population.
The standard error is
the standard deviation
of sample means.
Standard Error
• The standard error (SE) is the standard deviation of sample
means.
• The SE is simply the standard deviation divided by the
square root of n.
• The SE is a measure of how representative a sample is
likely to be of the population.
• A large SE indicates that there is a large variation between
means of different samples and so the sample might not
be representative of the population.
• A small SE indicates that most sample means are likely to
be an accurate reflection of the population.
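The two descriptions of the SE (the standard deviation of sample means, and SD divided by the square root of n) can be compared with a small simulation. This is a sketch under assumed values: the population mean, SD, sample size, and number of samples are all hypothetical.

```python
import random
import statistics

random.seed(42)                      # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100, 15           # hypothetical population
N = 25                               # size of each sample
N_SAMPLES = 2000                     # number of samples drawn

sample_means = []
for _ in range(N_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

# SE as defined: the standard deviation of the sample means.
se_empirical = statistics.stdev(sample_means)

# SE from the shortcut formula: SD divided by the square root of n.
se_formula = POP_SD / N ** 0.5       # 15 / 5 = 3.0

print(f"empirical SE = {se_empirical:.2f}, formula SE = {se_formula:.2f}")
```

The two values should be close but not identical, because the empirical SE is itself estimated from a finite number of samples.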
Difference between SD & SE
• SD is the square root of the average squared
deviation from the mean. It describes how much
individual scores vary within a sample.
• SE is the amount of ERROR in the estimate
of the population mean based on the sample.
In a normal distribution, slightly more than 95% of scores fall within ± 2 SD of the mean.
Exactly 95% of scores fall within z = ± 1.96.
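Both figures can be checked against the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; it assumes the scores are normally distributed.

```python
from statistics import NormalDist

z = NormalDist()                       # standard normal: mean 0, SD 1

within_2_sd = z.cdf(2) - z.cdf(-2)     # proportion of scores within ± 2 SD
z_95 = z.inv_cdf(0.975)                # cutoff leaving 2.5% in each tail

print(f"{within_2_sd:.4f}")            # 0.9545
print(f"{z_95:.2f}")                   # 1.96
```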
Computing Confidence Intervals
Confidence Intervals
• The basic idea of a confidence interval is to
construct a range of values within which we think
the population value falls.
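• 95% of z-scores fall between -1.96 and +1.96, so the 95%
interval is the sample mean ± 1.96 × SE.
• If we built such an interval for many samples, about 95% of
the intervals would contain the true mean.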
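A minimal sketch of the z-based interval, mean ± 1.96 × SE, follows. The sample is hypothetical and the function name is made up for the example. With a sample this small a t critical value is normally used instead of 1.96, so treat the numbers as an illustration of the arithmetic only.

```python
import statistics
from statistics import NormalDist

def confidence_interval_95(scores):
    """Return (lower, upper) for the mean using the z-based formula."""
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / len(scores) ** 0.5   # SD / sqrt(n)
    z = NormalDist().inv_cdf(0.975)                      # about 1.96
    return mean - z * se, mean + z * se

scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]        # hypothetical sample
lower, upper = confidence_interval_95(scores)
print(f"95% CI: {lower:.2f} to {upper:.2f}")             # 95% CI: 12.52 to 14.48
```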
In the two samples below the intervals overlap. The population
mean is probably between 2 and 16 million. These samples
were probably drawn from the same population.
Error bars represent 95% CI.
In the two samples below the intervals DO NOT overlap. These
samples were probably NOT drawn from the same population.
Inferential Statistics
• In inferential statistics, we take a sample and try to make
inferences about the population.
• Fisher describes an experiment in which a woman said
she could determine by tasting a cup of tea, whether the
milk or the tea was added to the cup first.
• In the simplest case if we only use 2 cups of tea, the
woman has a 50% chance of getting it right. How much
confidence would you have in her ability?
• If we used 6 cups of tea (3 with the milk added first), there are 20
possible combinations. If the woman just guesses, she has a 1 in 20
(5%) chance of identifying all the cups correctly. If the woman now
gets all 6 right, you would feel much more
confident about her ability.
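The counts in the tea example can be reproduced with a binomial coefficient. The sketch assumes that half of the cups had the milk added first and that the taster knows this; the function name is hypothetical.

```python
from math import comb

def chance_of_perfect_guess(cups):
    """Return (number of possible selections, probability of a perfect guess).

    Assumes half the cups had milk added first and the taster knows it.
    """
    ways = comb(cups, cups // 2)     # ways to choose which cups were milk-first
    return ways, 1 / ways

print(chance_of_perfect_guess(2))    # (2, 0.5)
print(chance_of_perfect_guess(6))    # (20, 0.05)
```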
Test Statistics
• We can fit statistical models to data that represent the hypotheses we want
to test.
• We can use probability to see whether scores are likely to have happened
by chance.
• If we combine these ideas we can test whether statistical models
significantly fit our data.
• Systematic variation is the variation explained by the model.
• Unsystematic variation is the variation not explained by the model.
• A test statistic (e.g., t, F) is the ratio of systematic to unsystematic variance.
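As one concrete case, the sketch below computes a one-way F ratio by hand: variance explained by the group means divided by the variance left over. The two groups are hypothetical data, and the function is an illustration rather than a replacement for a statistics package.

```python
def f_ratio(groups):
    """One-way F: systematic (model) variance over unsystematic (residual) variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Systematic variation: how far each group mean sits from the grand mean.
    ss_model = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))
    # Unsystematic variation: scatter of scores around their own group mean.
    ss_residual = sum((x - m) ** 2
                      for g, m in zip(groups, group_means) for x in g)

    df_model = len(groups) - 1
    df_residual = len(all_scores) - len(groups)
    return (ss_model / df_model) / (ss_residual / df_residual)

control = [4, 5, 6]                  # hypothetical scores
treated = [7, 8, 9]                  # hypothetical scores
print(f_ratio([control, treated]))   # 13.5
```

A ratio well above 1 means the model explains much more variation than it leaves unexplained.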
Alpha (α) is the probability of making a Type I error.
A Type I error is saying that there is a difference when there is none.
Beta (β) is the probability of making a Type II error.
A Type II error is saying that there is no difference when there is a
difference.
Power (1-β) is the probability of finding differences when they truly exist.
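These three probabilities can be estimated by simulation. The sketch below repeatedly runs a two-tailed z-test of "the mean is 0" on hypothetical normal data, first when that is true and then when the true mean is 0.5. All numbers (n, SD, effect, number of trials) are assumptions chosen for the example.

```python
import random
from statistics import NormalDist, mean

random.seed(1)                             # fixed seed so the sketch is repeatable
Z_CRIT = NormalDist().inv_cdf(0.975)       # about 1.96 for alpha = 0.05, two-tailed

def rejects_null(true_mean, n=25, sigma=1.0):
    """Draw one sample and run a two-tailed z-test of H0: mean = 0."""
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    z = mean(sample) / (sigma / n ** 0.5)
    return abs(z) > Z_CRIT

def rejection_rate(true_mean, trials=2000):
    """Proportion of simulated experiments that declare a difference."""
    return sum(rejects_null(true_mean) for _ in range(trials)) / trials

type_i_rate = rejection_rate(true_mean=0.0)   # H0 true: every rejection is a Type I error
power = rejection_rate(true_mean=0.5)         # H0 false: every rejection is correct
type_ii_rate = 1 - power                      # beta: misses when a difference exists

print(f"alpha ~ {type_i_rate:.3f}, beta ~ {type_ii_rate:.3f}, power ~ {power:.3f}")
```

The estimated Type I rate should land near 0.05, since that is the alpha built into the critical value.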
Effect Size
• An effect size is a standardized (objective) measure of the
magnitude of the observed effect.
• The general form of the effect size equation is:
Ex. 20 subjects were pre-tested for jumping ability. They then
reported to the lab 3 times/week for 4 weeks. In each lab session
they sat on the floor and imagined performing 3 sets of 10 maximum
vertical jumps. After 4 weeks of imagined training all of the subjects
jumped higher.
How strong a training effect would you expect in this experiment?
Will the subjects improve their jumping ability by 20 cm in 4 weeks?
Effect Size
• The effect size quantifies the importance or
meaningfulness of the results.
• A significant finding is not necessarily meaningful or
important.
• It is now common practice to report both confidence
intervals and effect sizes for experiments.
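One widely used effect size is Cohen's d: the difference between two means divided by a pooled standard deviation. The sketch below assumes two independent groups and uses hypothetical jump heights; it is not the data from the imagined-training example, and other forms of d exist for repeated-measures designs.

```python
import statistics

def cohens_d(group_1, group_2):
    """Difference between two means in pooled-standard-deviation units."""
    n1, n2 = len(group_1), len(group_2)
    v1, v2 = statistics.variance(group_1), statistics.variance(group_2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group_2) - statistics.mean(group_1)) / pooled_sd

# Hypothetical jump heights in cm; NOT data from the example above.
control = [40, 42, 44, 46, 48]
trained = [42, 44, 46, 48, 50]
print(round(cohens_d(control, trained), 2))   # 0.63
```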
Original    Effect    Percentile    Percent of
Standard    Size      Standing      Nonoverlap
            2.0       97.7          81.1%
            1.9       97.1          79.4%
            1.8       96.4          77.4%
            1.7       95.5          75.4%
            1.6       94.5          73.1%
            1.5       93.3          70.7%
            1.4       91.9          68.1%
            1.3       90            65.3%
            1.2       88            62.2%
            1.1       86            58.9%
            1.0       84            55.4%
            0.9       82            51.6%
Large       0.8       79            47.4%
            0.7       76            43.0%
            0.6       73            38.2%
Medium      0.5       69            33.0%
            0.4       66            27.4%
            0.3       62            21.3%
Small       0.2       58            14.7%
            0.1       54            7.7%
            0.0       50            0%
Cohen's d for Effect Sizes
Adapted from Cohen, J. (1988). Statistical
power analysis for the behavioral sciences
(2nd ed.). Hillsdale, NJ: Lawrence Erlbaum
Associates.
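The percentile and nonoverlap columns can be approximated from d if both groups are assumed to be normal with equal SDs. In the sketch below, percentile standing is taken as the normal cumulative probability at d, and percent of nonoverlap as Cohen's U1 measure; the function names are made up for the example. The results agree with the table to within about 0.1 of a percentage point.

```python
from statistics import NormalDist

PHI = NormalDist().cdf     # standard normal cumulative distribution

def percentile_standing(d):
    """Percentile of the comparison group at which the other group's mean falls."""
    return 100 * PHI(d)

def percent_nonoverlap(d):
    """Cohen's U1: percent of the two combined distributions that does not overlap."""
    u2 = PHI(d / 2)
    return 100 * (2 * u2 - 1) / u2

for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, d,
          round(percentile_standing(d)),
          round(percent_nonoverlap(d), 1))
```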
Effect Size using r
• r = .10 (small effect) explains 1% of the
total variance
• r = .30 (medium effect) explains 9% of the
total variance
• r = .50 (large effect) explains 25% of the
total variance
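The percentages come from squaring r. A short sketch (the function name is hypothetical):

```python
def percent_variance_explained(r):
    """Square the correlation and express it as a percentage."""
    return round(r ** 2 * 100, 1)

for label, r in [("small", 0.10), ("medium", 0.30), ("large", 0.50)]:
    print(f"r = {r:.2f} ({label}): {percent_variance_explained(r)}% of variance")
```

Because the relationship is quadratic, a large effect (r = .50) explains 25 times as much variance as a small one (r = .10), not 5 times as much.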
Statistical Power
• The power of a test is the probability of finding
differences between the means when they truly
exist.
• To increase power:
– Increase n
– Increase alpha (use 0.1 rather than 0.05)
– Decrease the variance
• We will use G*Power to do power analyses. It is
free; search for gpower3 or go to:
http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
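The three ways of increasing power can be seen in a rough calculation. The sketch below is a normal approximation for a two-tailed one-sample z-test, not the method G*Power uses; the effect, SD, and sample sizes are hypothetical.

```python
from statistics import NormalDist

def power_z_test(effect, sd, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / (sd / n ** 0.5)      # true effect in standard-error units
    # Probability that the test statistic lands beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

baseline     = power_z_test(effect=0.5, sd=1.0, n=25)
more_n       = power_z_test(effect=0.5, sd=1.0, n=50)              # increase n
larger_alpha = power_z_test(effect=0.5, sd=1.0, n=25, alpha=0.10)  # increase alpha
less_spread  = power_z_test(effect=0.5, sd=0.8, n=25)              # decrease the variance

print(f"baseline {baseline:.2f}, more n {more_n:.2f}, "
      f"alpha 0.10 {larger_alpha:.2f}, smaller SD {less_spread:.2f}")
```

Each change raises power above the baseline, at a cost: more subjects, a higher Type I error rate, or tighter control of measurement.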