SL3_HypothesisTesting - yale-lccn

Stats Lunch: Day 3
The Basis of Hypothesis Testing w/ Parametric Statistics
Back to the Normal Curve…
The Central Limit Theorem in Nature
Samples vs. Populations
-The Central Limit Theorem says that when scores reflect the sum of
many small, independent influences, the distribution of lots of
people (scores) will approximate a normal curve
-However, it is often impractical or too expensive to run studies
with lots of subjects…
-Population: Entire group of people (animals, etc.) that you are
interested in (Ex: How ALL Americans will vote in election)
-Sample: The group of people that you actually study (Ex: the
100 people that you call on the phone to ask them how they’ll
vote)
-We assume that our sample is REPRESENTATIVE of the
population we’re interested in…
Random Selection: start with a list of the entire population
and randomly select people to study. This makes it likely
(though it cannot guarantee) that the sample represents the
population
Haphazard Selection: Taking whoever you can get...
So, our samples are often not truly RANDOM and therefore
can have a (very) different mean and standard deviation
than the true population:
-Our sample is less likely to resemble a true normal curve
-It is more difficult to estimate what percentage of scores
fall above or below a given score...
The Central Limit Theorem in Practice
Population Parameters vs. Statistics
-Although we want to, we almost never know the TRUE mean
and standard deviation of a population
-Thus, we don’t know the “Population Parameters”: Mean,
Variance and SD of a population
µ = population mean
σ² = population variance, σ = population SD
-Samples ESTIMATE a population
-We call the mean (M), Variance (SD²), and Standard Deviation
(SD) of a sample “Statistics”
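As a small illustration, the sketch below computes M, SD² and SD for one sample of three IQ scores (120, 80, 100), chosen purely for illustration. It assumes Python's standard `statistics` module and divides by N, which describes the sample itself; dividing by N - 1 (`statistics.variance`) is the more usual choice when the goal is to estimate the population variance.

```python
import statistics

# One small sample of IQ scores (illustrative values only)
sample = [120, 80, 100]

m = statistics.mean(sample)          # M: the sample mean
sd2 = statistics.pvariance(sample)   # SD squared: mean squared deviation from M (divides by N)
sd = statistics.pstdev(sample)       # SD: square root of SD squared

print("M =", m)
print("SD^2 =", round(sd2, 2))
print("SD =", round(sd, 2))
```

These three numbers are statistics: they describe the sample and only estimate the corresponding population parameters.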
Introduction to Hypothesis Testing
“All I know is I don’t know nothing”
If you don’t know “nothing”, then you must know
something…
Logic of Hypothesis Testing (w/ N=1)
We manipulate an independent variable (Ex: Let baby
listen to Mozart)
We obtain a sample score (Ex: IQ = 140 ; Z=2.67)
Using what we know about the normal curve for that
variable’s distribution:
-We determine how likely it is that we got that sample score
just by chance (in other words, how likely is it we got a score
this high if the music had no effect)
Z = 2.67: more than 99% of scores fall below Z = 2.67
-It’s VERY unlikely (but possible) that we got a score (Z=2.67)
this BIG just by chance
-With a score this high, there is less than a 1% chance that
this score would occur if the music had no effect
-We can reject the idea that the music has no effect. Therefore,
we accept the idea that it does have an effect.
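The reasoning above can be checked numerically. This is a minimal sketch, assuming the usual IQ scale with a mean of 100 and an SD of 15 (both are assumptions of the example) and using Python's standard `statistics.NormalDist` for the normal curve.

```python
from statistics import NormalDist

POP_MEAN = 100  # assumed IQ mean if the music has no effect
POP_SD = 15     # assumed IQ standard deviation

def z_score(score, mean=POP_MEAN, sd=POP_SD):
    """Convert a raw score to a Z score on the normal curve."""
    return (score - mean) / sd

z = z_score(140)
below = NormalDist().cdf(z)  # proportion of scores falling below this Z
above = 1 - below            # chance of a score this high if there is no effect

print("Z =", round(z, 2))
print("proportion below =", round(below, 4))
print("proportion above =", round(above, 4))
```

The proportion above the score is the number the decision rests on: the smaller it is, the harder it is to believe the score came about by chance alone.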
“All I know is I don’t know nothing”…
We can only show how unlikely it is to get a score if
our manipulation really had no effect (by seeing
where our score falls on the comparison distribution
for that variable)
Steps of Hypothesis Testing
1. Restate your question as a research hypothesis and
a null hypothesis about the populations
2. Determine the characteristics of the comparison
distribution
3. Determine the cutoff sample score on the
comparison distribution at which the null hypothesis
could be rejected
4. Determine your sample’s score on the comparison
distribution
5. Decide whether to reject the null hypothesis
Experimental Group: Baby Mozart
Rest of Population: All the other babies
Step 1. Create a Null Hypothesis and Research (Alternate)
Hypothesis (about the population)
-Null Hypothesis (H0): Your manipulation has no effect
(µ(experimental group) = µ(rest of population))
-Research Hypothesis (H1): Your manipulation has some
effect (µ(experimental group) > µ(rest of population))
The null is the exact opposite of your research hypothesis.
Statistically, we can only test how likely it is we got a score so
big (or small) if the null is true.
Thus, we try to disprove, or reject, the null hypothesis.
Ex:
Null: Listening to Mozart DOES NOT increase babies’ IQ
Research: Listening to Mozart does increase babies’ IQ
Step 2. Determine the characteristics of your Comparison
Distribution (Which, in this case is the normal curve)
Comparison Distribution: What the distribution would look
like if the null hypothesis were true. We compare our sample
scores to this distribution.
-What score would we expect to get if the null were true?
-If null is true, then our experimental group and the rest of the
population would look the same. The distributions would be
equal.
Ex: in this case, if the null is true both groups would have a
mean IQ of 100
Step 3. Determine when to reject null hypothesis
Cutoff Sample Score (Critical Value): How big will your sample
score have to be before you reject the null?
In other words, the risk (2.5%, 5%, etc.) that you’re
willing to take of rejecting the null hypothesis when it is
really TRUE
This is most commonly called the “alpha level” (α); it is the
cutoff that your result’s p value is compared against (Ex: p < .05).
Ex: Cutoff α = .05
-the point (Z score) on the normal curve where 95% of scores
fall below it
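One way to find that cutoff point is to ask the normal curve directly. The sketch below assumes Python's standard `statistics.NormalDist`; the helper name `cutoff_z` and the IQ scale (mean 100, SD 15) are assumptions made for the example.

```python
from statistics import NormalDist

def cutoff_z(alpha):
    """Z score with (1 - alpha) of the normal curve falling below it."""
    return NormalDist().inv_cdf(1 - alpha)

z_cut = cutoff_z(0.05)

# The same cutoff as a raw IQ score, assuming mean 100 and SD 15
iq_cut = 100 + z_cut * 15

print("cutoff Z =", round(z_cut, 3))
print("cutoff IQ =", round(iq_cut, 2))
```

A stricter alpha (say .01) pushes the cutoff further out, so a bigger sample score is needed before the null can be rejected.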
Step 5. Deciding to Reject Null or Not
If your sample score is larger than your cutoff sample score
(critical value), you can reject the null
You have reached statistical significance
However, you can never ACCEPT the null.
If your sample score is SMALLER than your cutoff sample score
(critical value), all you can say is that you “failed to reject the
null”
-So, you never really “prove” anything…
-All you can say is how unlikely it is that what you found would
occur by chance alone (i.e., if your manipulation had no effect)
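The five steps can be strung together for the single-score (N=1) case. This is only a sketch of a one-tailed test that predicts a higher score; the function name, the IQ scale (mean 100, SD 15) and the second example score of 110 are assumptions made for illustration.

```python
from statistics import NormalDist

def z_test_single_score(score, null_mean, null_sd, alpha=0.05):
    """One-tailed (upper) test of a single score against a normal comparison distribution."""
    # Step 1 is stated in words: H0 = no effect, H1 = scores are higher.
    # Step 2: the comparison distribution is what we expect if the null is true.
    comparison = NormalDist(mu=null_mean, sigma=null_sd)
    # Step 3: cutoff sample score with (1 - alpha) of the distribution below it.
    cutoff = comparison.inv_cdf(1 - alpha)
    # Step 4: the sample's score on the comparison distribution, as a Z score.
    z = (score - null_mean) / null_sd
    # Step 5: reject only if the score is beyond the cutoff.
    decision = "reject the null" if score > cutoff else "fail to reject the null"
    return z, cutoff, decision

print(z_test_single_score(140, null_mean=100, null_sd=15))
print(z_test_single_score(110, null_mean=100, null_sd=15))
```

Note that the function never returns "accept the null": a score below the cutoff only means the test failed to reject it.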
Directional (One-Tailed) vs. Nondirectional (Two-Tailed) Hypotheses
Directional Tests: Expect that your sample score will be bigger
(or smaller) than the mean of your comparison distribution
Reject Null if your sample score falls in the top (or
bottom…depending on your hypothesis) 5% of scores in the
comparison distribution.
But: can only reject null if your data are in the direction you
predicted…
Sample falls below cutoff: FAIL TO REJECT NULL
Sample falls above cutoff: REJECT NULL
Nondirectional Tests: Expect that your sample score
will be different than the mean of your comparison
distribution (either bigger or smaller)
Reject Null if your sample score falls in the top 2.5%
OR bottom 2.5% of comparison distribution.
Harder to reject null.
Bottom 2.5%: Reject | Middle 95%: Fail to reject null | Top 2.5%: Reject
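To see why a nondirectional test is harder to pass, the sketch below compares the two decision rules at the same alpha. It assumes Python's standard `statistics.NormalDist`; the function names and the example Z scores (1.8 and -2.5) are hypothetical, and the one-tailed rule assumes the prediction was a HIGHER score.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal curve (mean 0, SD 1)

def reject_one_tailed(z, alpha=0.05):
    """Directional test predicting a higher score: only the top tail counts."""
    return z > STD_NORMAL.inv_cdf(1 - alpha)

def reject_two_tailed(z, alpha=0.05):
    """Nondirectional test: alpha is split between the top and bottom tails."""
    return abs(z) > STD_NORMAL.inv_cdf(1 - alpha / 2)

for z in (1.8, -2.5):
    print(z, "one-tailed:", reject_one_tailed(z), "two-tailed:", reject_two_tailed(z))
```

A Z of 1.8 clears the one-tailed cutoff but not the two-tailed one, while a Z of -2.5 is extreme enough for the two-tailed test yet cannot reject a directional null that predicted a higher score.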
In the real world, we don’t do hypothesis testing w/ an
N=1...
We aren’t interested in comparing one score to a distribution of
individual scores
Instead, we are interested in means of samples, not individual
scores (our N > 1)
Need a new comparison distribution (can’t compare the mean
of a sample to a distribution of individual scores)
Distribution of Means: Comparison distribution composed
of the MEANS of a large number of samples
-get a bunch of samples from the same population, and
plot their means
This is also called a “Sampling Distribution of Means”
(Sampling) Distribution of Means
-We can’t compare the mean of a sample (the group of subjects
we’re studying) with a comparison distribution composed of
individual scores.
-Instead we want to compare it to a distribution made up of a
bunch of means from samples...
Population scores: 120, 110, 85, 100, 80, 100, 95, 105, 101, 99, 95, 107
Sample 1: 120, 80, 100 (M = 100)
Sample 2: 110, 80, 100 (M = 96.67)
Sample 3: 107, 100, 101 (M = 102.67)
...
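The same idea can be simulated: draw many small samples from the twelve scores and collect their means. This is a rough sketch, assuming the twelve scores stand in for a much larger population (so scores are drawn with replacement); the function name, the seed and the number of samples are arbitrary choices for the example.

```python
import random
import statistics

population = [120, 110, 85, 100, 80, 100, 95, 105, 101, 99, 95, 107]

def sample_means(scores, n, num_samples, seed=0):
    """Draw num_samples samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.mean(rng.choices(scores, k=n)) for _ in range(num_samples)]

means = sample_means(population, n=3, num_samples=10_000)

print("population mean:", statistics.mean(population))
print("mean of the sample means:", round(statistics.mean(means), 2))
print("population variance:", round(statistics.pvariance(population), 2))
print("variance of the sample means:", round(statistics.pvariance(means), 2))
```

The list `means` is a distribution of means: it centers on the population mean but is noticeably less spread out than the individual scores.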
Characteristics of a Distribution of Means:
1. Mean will be close to the population mean (µM = µ)
2. Shape of the Distribution will be approx. normal (when each
sample has N > 30)
3. The spread of the distribution (i.e., the variance) is SMALLER
than the population distribution
Calculating the Variance of a Distribution of Means:
-It is unlikely that more than one extreme score will be selected in
each sample…
-Thus, it is unlikely that a sample mean will be extreme
-So, the variance for the Distribution of Means will be smaller
than the population’s
-This is particularly true with a large N (sample size)
-The larger the N, the smaller the variance of a dist of means...
Mathematically it looks like this:
Population Variance / N (or… σ²/N)
Called the “Variance of Distributions of Means”
σ²M = σ²/N
IQ Example (Pop Variance = 225, N = 50)
σ²M = 225/50 = 4.5
Standard Deviation of Dist of Means (Standard Error)
σM = √σ²M = √4.5 = 2.12
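The IQ example can be reproduced in a few lines. This is a minimal sketch with hypothetical function names; the second call (N = 200) is an added illustration of how a larger sample shrinks the standard error.

```python
import math

def variance_of_means(pop_variance, n):
    """Variance of the distribution of means: population variance divided by N."""
    return pop_variance / n

def standard_error(pop_variance, n):
    """Standard deviation of the distribution of means (the standard error)."""
    return math.sqrt(variance_of_means(pop_variance, n))

print(variance_of_means(225, 50))          # 4.5
print(round(standard_error(225, 50), 2))   # 2.12
print(round(standard_error(225, 200), 2))  # smaller, because N is larger
```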
Estimating the Population Mean...
-We can never know the TRUE population mean
-However, we can estimate it from what we know about the
sample mean and the Standard Error
-Point Estimate: when we estimate a particular population
parameter
-The best point estimate of µ is M
-But how accurate is it going to be?
-Better off to estimate using a range of possible
scores (some above and some below M)
-Interval Estimate
Confidence Intervals
We are 95% (or 99%…or 50%, etc.) confident that the
population mean falls between Score A and Score B
Figuring Confidence Intervals:
1. Find Standard Error (2.12)
2. Figure the Raw Scores above and Below M
-For 95%, figure raw scores +/- 1.96 Standard Error from M
-For 99%, figure raw score +/- 2.57 Standard Error from M
Interval = M +/- (Standard Error * Z cutoff)
95% Interval Example
Standard Error for Baby Mozart = 2.12, M = 130
SE * Z cutoff = (2.12 * 1.96) = 4.16
Interval = 130 +/- 4.16
130 - 4.16 = 125.84
130 + 4.16 = 134.16
So, we are 95% confident that the population mean falls
between 125.84 and 134.16
99% Interval Example
Standard Error for Baby Mozart = 2.12, M = 130
SE * Z cutoff = (2.12 * 2.57) = 5.45
Interval = 130 +/- 5.45
130 - 5.45 = 124.55
130 + 5.45 = 135.45
So, we are 99% confident that the population mean falls
between 124.55 and 135.45
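Both intervals can be computed with one small function. This is a sketch, assuming Python's standard `statistics.NormalDist` to look up the Z cutoff; the function name is hypothetical. Because it uses the unrounded cutoff (about 2.576 rather than 2.57), its 99% limits can differ from the hand calculation by about .01.

```python
from statistics import NormalDist

def confidence_interval(sample_mean, standard_error, confidence=0.95):
    """Return (lower, upper) limits: M +/- (Standard Error * Z cutoff)."""
    # Split the leftover probability evenly between the two tails.
    z_cutoff = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = standard_error * z_cutoff
    return sample_mean - margin, sample_mean + margin

low95, high95 = confidence_interval(130, 2.12, confidence=0.95)
low99, high99 = confidence_interval(130, 2.12, confidence=0.99)

print("95%:", round(low95, 2), "to", round(high95, 2))
print("99%:", round(low99, 2), "to", round(high99, 2))
```

The 99% interval is wider than the 95% one: being more confident that the interval captures the population mean costs precision.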