Statistical significance


Statistics
Statistics may be defined as a body of
methods for making wise decisions in the face
of uncertainty.
W. Allen Wallis
Economist & Statistician
Statistics Terms
• Statistics: Procedures used to summarize and
analyze quantitative data.
• Descriptive statistics: Procedures used to
summarize a set of numbers in terms of central
tendency, variation, or relationships.
• Inferential statistics: Procedures used to
determine the error when estimating a value for
a population based upon the measurement of
the same value for a sample of that population.
Types of Descriptive Statistics
• Central Tendency: The typical score (best bet).
• Variability: How different the scores are.
• Correlation Coefficient: A measure of the
relationship between two variables.
• z-Score: The relationship of one score to the
norm group in terms of standard units.
• Effect Size: A measure of the magnitude and
direction of the difference between the means of two groups.
Descriptive Statistics
• Measures of Central Tendency
- Mean: The arithmetic average, sensitive to outliers
- Median: The middle score, reduces effect of outliers
- Mode: The most frequent score
• Measures of Variability
- Range: The difference between the largest and
smallest.
- Standard Deviation: Roughly the typical distance of scores
from the mean (the square root of the average squared distance).
• Correlation Coefficient
- How related two variables are, predictability.
- Sensitive to outliers (which can move r toward or away from zero).
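The measures listed above can be computed in a few lines. This is a minimal sketch using Python's standard library; the scores and the `pearson_r` helper are made up for illustration and are not part of the original slides.

```python
import math
import statistics

# Hypothetical scores, made up for illustration (note the outlier, 40).
scores = [4, 5, 5, 6, 7, 8, 40]

mean = statistics.mean(scores)           # pulled upward by the outlier
median = statistics.median(scores)       # middle score, barely affected
mode = statistics.mode(scores)           # most frequent score
score_range = max(scores) - min(scores)  # largest minus smallest
std_dev = statistics.pstdev(scores)      # population standard deviation


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ssx * ssy)


# Hypothetical paired data: hours studied and marks earned.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]
r = pearson_r(hours, marks)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={score_range} sd={std_dev:.2f} r={r:.3f}")
```

With the single outlier in the list, the mean lands well above the median, which is the sensitivity the slide describes.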
z-Score
The quantity z represents the distance between a raw
score (an individual’s score, for instance) and the group
mean in units of the standard deviation. z is negative when
the raw score is below the mean and positive when above.
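In symbols, z = (raw score - group mean) / standard deviation. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def z_score(raw, group_mean, group_sd):
    """Distance of a raw score from the group mean, in standard deviation units."""
    return (raw - group_mean) / group_sd


# Hypothetical group: mean of 70 and standard deviation of 10.
above = z_score(85, 70, 10)  # raw score above the mean gives a positive z
below = z_score(60, 70, 10)  # raw score below the mean gives a negative z
print(above, below)
```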
Effect Size
The quantity ES represents the difference between the
mean of the experimental group and the mean of the
control group in units of the standard deviation.
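The slide does not say which standard deviation to divide by. The sketch below assumes the control group's sample standard deviation, which is one common convention; a pooled standard deviation is another. The function name and the data are hypothetical.

```python
import statistics


def effect_size(experimental, control):
    """Difference in group means, in units of the control group's standard deviation.

    Assumption: the control group's sample standard deviation is the unit.
    """
    diff = statistics.mean(experimental) - statistics.mean(control)
    return diff / statistics.stdev(control)


# Hypothetical test scores, made up for illustration.
control = [70, 72, 74, 76, 78]
experimental = [75, 77, 79, 81, 83]

es = effect_size(experimental, control)
print(f"ES = {es:.2f}")
```

A positive ES means the experimental group scored higher on average; swapping the two groups flips the sign.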
Inferential Statistics
• The purpose of inferential statistics is to draw
conclusions about some value of a population on the
basis of that same value measured for a sample.
• Inferential statistics allow us to estimate the
magnitude of our error—the difference between the
sample value and the population value—even
though we don’t know what the population value is.
• One estimate of error is the “confidence interval”: a
range within which the true value is likely to lie, at a stated
confidence level (for example, 95%). For a given sample,
the wider the range, the higher the confidence level.
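As a rough illustration of the last point, the sketch below computes a confidence interval for a sample mean using the normal approximation. That approximation is an assumption made here for simplicity (for small samples a t-distribution is the more careful choice), and the function name and heights are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical sample of ten heights in inches.
heights = [66, 67, 68, 69, 70, 70, 71, 72, 73, 74]

low95, high95 = mean_confidence_interval(heights, 0.95)
low99, high99 = mean_confidence_interval(heights, 0.99)
print(f"95%: {low95:.2f} to {high95:.2f}")
print(f"99%: {low99:.2f} to {high99:.2f}")
```

For the same sample, the 99% interval comes out wider than the 95% interval.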
Sampling Error
• It’s always easier and quicker to measure a sample
drawn from a population than it is to measure every
person in the population (a census).
• Unfortunately, the value for the sample is almost never
exactly equal to the true population value. This is
called sampling error (error due to sampling).
• The larger the percentage of the population that is
sampled, the smaller the sampling error. (Think
about the increase in accuracy by moving from a
sample of 50% of the population to 99%.)
Sampling Fluctuation
• Sampling fluctuation occurs when we measure a
value for samples repeatedly drawn from the same
population. The value for each sample is different
from the others (and different from the true value of
the population).
[Figure: a population with mean x̄ = 70” and three samples drawn from it, with sample means ā = 66”, b̄ = 71”, and c̄ = 73”.]
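Sampling fluctuation is easy to see in a small simulation. The sketch below assumes a hypothetical population of 10,000 heights averaging about 70 inches; the population, the sample sizes, and the helper name are all made up for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 heights in inches, centered near 70.
population = [rng.gauss(70, 3) for _ in range(10_000)]
true_mean = statistics.mean(population)


def sample_mean(n):
    """Mean of one random sample of size n, drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


# Five small samples: each mean differs from the others and from the true mean.
five_means = [sample_mean(25) for _ in range(5)]

# Repeat many times to see how much the sample mean fluctuates at each size.
spread_small = statistics.pstdev([sample_mean(25) for _ in range(200)])
spread_large = statistics.pstdev([sample_mean(400) for _ in range(200)])

print(f"true mean: {true_mean:.2f}")
print("five sample means:", [round(m, 2) for m in five_means])
print(f"spread of means, n=25:  {spread_small:.3f}")
print(f"spread of means, n=400: {spread_large:.3f}")
```

The means of the larger samples cluster more tightly around the population mean, which is the smaller sampling error described under Sampling Error.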
Sampling Fluctuation Example
• Five people each grab a fistful of coins from a bucket.
• You would expect each to grab a different amount.
• When one person’s amount is much different from
another’s, you could say there is a statistically significant
difference between their “grab” and the other’s “grab.”
The difference between the two is larger than you
would expect from sampling fluctuation alone.
Amounts grabbed: $5.42, $8.23, $5.25, $5.58, $5.12
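One informal way to see that the largest grab stands out is to measure its distance from the other four in units of their standard deviation. This is only a rough illustration using the dollar amounts from the example, not a formal significance test; the variable names are hypothetical.

```python
import statistics

grabs = [5.42, 8.23, 5.25, 5.58, 5.12]  # dollar amounts from the example

largest = max(grabs)
others = [g for g in grabs if g != largest]

others_mean = statistics.mean(others)
others_sd = statistics.stdev(others)  # sample standard deviation

# Distance of the largest grab from the others, in standard deviation units.
z_largest = (largest - others_mean) / others_sd

print(f"others: mean ${others_mean:.2f}, sd ${others_sd:.2f}")
print(f"largest grab is {z_largest:.1f} standard deviations above the others")
```

The four similar grabs all sit within about one standard deviation of their own mean, while the largest grab is many standard deviations away.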
Statistical Significance
• Statistical significance is a mathematical test
that gives a yes/no answer to the question: “Are
the differences we see larger than we would
expect from sampling fluctuation alone?”
- It doesn’t tell us which value is larger.
- It doesn’t tell us how big the difference is.
- It doesn’t tell us how important the difference is.
- And because statistical significance depends in part on the
size of the sample, one experiment may have statistically
significant results while another may not, simply
because the sample sizes were different.
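The dependence on sample size can be sketched with a simple two-group z statistic. The sketch assumes both groups share a known standard deviation and have equal sizes, and it uses the conventional 5% two-sided threshold; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def two_group_z(mean_diff, sd, n_per_group):
    """z statistic for a difference in means (equal group sizes, common known sd)."""
    std_err = sd * math.sqrt(2 / n_per_group)
    return mean_diff / std_err


def is_significant(z, alpha=0.05):
    """Two-sided yes/no answer at significance level alpha."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


# Same 2-point difference and same spread, but different sample sizes.
z_small = two_group_z(mean_diff=2, sd=10, n_per_group=20)
z_large = two_group_z(mean_diff=2, sd=10, n_per_group=500)

print(f"n=20:  z={z_small:.2f}, significant={is_significant(z_small)}")
print(f"n=500: z={z_large:.2f}, significant={is_significant(z_large)}")
```

The difference between the groups is identical in both cases; only the sample size changes the yes/no answer.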
Practical Significance
• Practical significance answers the all-important question of “So what?”
• Statistical significance tells us whether the
differences are larger than we would expect to
see from sampling fluctuation alone.
• Effect size tells us the magnitude and the
direction of the differences.
• Practical significance tells us how important the
differences are in terms of what people value.
Restriction in the Range
Do SAT Scores Predict College GPA?
• Based on your experience at Vanderbilt:
– Does it seem like the students with the highest
SAT scores have the highest GPA?
• Think about the kids you knew in high school:
– Did you know smart kids with low SAT scores?
– Did you know kids that weren’t that bright who
were able to achieve high SAT scores?
• Do you think that high school SAT scores predict
college GPA?
Vanderbilt Class of 2015
What it Takes to Get In
What it Takes to Stay In
A Correlation between SAT & GPA?
r = 0.50
Relationship between SAT & GPA
Minimum SAT of 1,000 to Enter
Minimum GPA of 2.0 to Remain
SAT & GPA of Vanderbilt Students
What is the Correlation Coefficient?
Restriction of the Range Conclusion
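Restriction of range can be illustrated with simulated data. The sketch below is built entirely on assumptions: the applicants are simulated, not real students, the scores are not clipped to real SAT or GPA scales, and the data are constructed so that SAT and GPA correlate at about 0.50 before any cutoff. It then keeps only those with an SAT of at least 1,000 and a GPA of at least 2.0 and recomputes the correlation.

```python
import math
import random
import statistics


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ssx * ssy)


rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated applicants (all numbers are assumptions, not real data).
# SAT and GPA are built to correlate at about 0.50 across everyone.
applicants = []
for _ in range(20_000):
    ability = rng.gauss(0, 1)
    other = rng.gauss(0, 1)
    sat = 1000 + 200 * ability
    gpa = 2.5 + 0.5 * (0.5 * ability + math.sqrt(0.75) * other)
    applicants.append((sat, gpa))

# Keep only those who meet both cutoffs: SAT >= 1,000 and GPA >= 2.0.
enrolled = [(sat, gpa) for sat, gpa in applicants if sat >= 1000 and gpa >= 2.0]

r_all = pearson_r(applicants)
r_enrolled = pearson_r(enrolled)

print(f"all applicants:  n={len(applicants)}, r={r_all:.2f}")
print(f"after cutoffs:   n={len(enrolled)}, r={r_enrolled:.2f}")
```

In this simulation the correlation among those who pass both cutoffs is noticeably lower than the correlation across all applicants, even though the underlying relationship has not changed.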