Transcript June 1

Probability
Population:
The set of all individuals of interest
(e.g. all women, all college students)
Sample:
A subset of individuals selected from
the population from whom data is
collected
What we learned from Probability
1) The mean of a sample can be treated as a random variable.
2) By the central limit theorem, sample means will have a normal
distribution (for n > 30) with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
3) Because of this, we can find the probability that a given
population might randomly produce a particular range of
sample means.
$P(\bar{X} > \text{something}) = P(Z > \text{something})$ → Use table E.10
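The three points above can be sketched in a few lines of Python. This is only an illustration: the function name and the numbers (µ = 100, σ = 15, n = 36, a sample mean of 103) are hypothetical, and the standard library's NormalDist stands in for the printed table.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_above(x_bar, mu, sigma, n):
    """P(X-bar > x_bar) for samples of size n, using the normal
    approximation from the central limit theorem."""
    standard_error = sigma / sqrt(n)       # sigma_X-bar = sigma / sqrt(n)
    z = (x_bar - mu) / standard_error      # convert the sample mean to a z-score
    return 1.0 - NormalDist().cdf(z)       # upper-tail area, as read from a Z table

# Hypothetical numbers, chosen only for illustration
p = prob_sample_mean_above(x_bar=103, mu=100, sigma=15, n=36)
print(round(p, 4))
```

Here the standard error is 15/6 = 2.5, so a sample mean of 103 is z = 1.2 standard errors above µ.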
Inferential statistics
Population:
The set of all individuals of interest
(e.g. all women, all college students)
Sample:
A subset of individuals selected from
the population from whom data is
collected
Once we’ve got our sample
The key question in statistical inference:
Could random chance alone have produced
a sample like ours?
Once we’ve got our sample
Inferential statistics separates 2 interpretations of
patterns in the data:
Random Causes:
Fluctuations of chance
Systematic Causes Plus Random Causes:
True differences in the population
Bias in the design of the study
Reasoning of hypothesis testing
1. Make a statement (the null hypothesis) about some
unknown population parameter.
2. Collect some data.
3. Assuming the null hypothesis is true, what is the
probability of obtaining data such as ours? (this is the
“p-value”).
4. If this probability is small, then reject the null
hypothesis.
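One way to make step 3 concrete is to simulate it: draw many samples from a population where the null hypothesis is true, and count how often chance alone gives a sample mean at least as large as ours. The sketch below is a simulation, not the table-based method used in these slides, and every number in it (µ0 = 100, σ = 15, n = 25, observed mean 106, alpha = .05) is hypothetical.

```python
import random

def simulated_p_value(observed_mean, mu0, sigma, n, reps=20000, seed=0):
    """Estimate P(sample mean >= observed_mean) when H0 is true,
    by repeatedly sampling from a normal population with mean mu0."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        if sample_mean >= observed_mean:   # chance alone did at least as well
            hits += 1
    return hits / reps

# Hypothetical study: H0 says mu = 100, our sample of 25 averaged 106
p_value = simulated_p_value(observed_mean=106, mu0=100, sigma=15, n=25)
alpha = 0.05                               # assumed threshold for "small"
reject_null = p_value < alpha
print(p_value, reject_null)
```

With these numbers the sample mean sits 2 standard errors above µ0, so the simulated proportion should land near the table value for P(Z > 2), roughly .02.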
Step 1: Stating hypotheses
Null hypothesis
– H0
– Straw man: “Nothing interesting is happening”
Alternative hypothesis
– Ha
– What a researcher thinks is happening
– May be one- or two-sided
Step 1: Stating hypotheses
Hypotheses are in terms of population parameters
One-sided:
H0: µ = 110
H1: µ < 110
Two-sided:
H0: µ = 110
H1: µ ≠ 110
Step 2: Set decision criterion
• Decide what p-value would be “too unlikely”
• This threshold is called the alpha level.
• When the p-value of a sample statistic falls below this level, the
result is said to be significant.
• Typical alpha levels are .05 and .01.
More on setting a criterion
• The retention region.
The range of sample mean values that are “likely” if H0 is true.
If your sample mean is in this region, retain the null hypothesis.
• The rejection region.
The range of sample mean values that are “unlikely” if H0 is true.
If your sample mean is in this region, reject the null hypothesis
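The two regions can be expressed as cutoffs on the sample mean itself. The sketch below assumes a two-sided test and uses µ0 = 110 from the earlier hypotheses slide; σ = 15, n = 36, and the function name are hypothetical values added only to make the example runnable.

```python
from math import sqrt
from statistics import NormalDist

def retention_region(mu0, sigma, n, alpha=0.05):
    """Return (low, high): sample means between these bounds are 'likely'
    under H0 for a two-sided test; means outside fall in the rejection region."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # alpha split across two tails
    standard_error = sigma / sqrt(n)
    return mu0 - z_crit * standard_error, mu0 + z_crit * standard_error

# mu0 from the hypotheses slide; sigma and n are assumed for illustration
low, high = retention_region(mu0=110, sigma=15, n=36, alpha=0.05)
print(round(low, 2), round(high, 2))

def in_rejection_region(x_bar):
    """True if the sample mean lies outside the retention region."""
    return x_bar < low or x_bar > high
```

A smaller alpha widens the retention region, so a sample mean has to be further from µ0 before the null hypothesis is rejected.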
Setting a criterion
[Figure: null distribution centered on µH0, with an "Accept H0" region between the two Zcrit values and a "Reject H0" region in each tail.]
Step 3: Compute sample statistics
A test statistic (e.g. Ztest, Ttest, or Ftest) is information we get from
the sample that we use to make the decision to reject or keep
the null hypothesis.
A test statistic converts the original measurement (e.g. a sample
mean) into units of the null distribution (e.g. a z-score), so that
we can look up probabilities in a table.
Test Statistics
[Figure: null distribution centered on µhyp, with an "Accept H0" region between the two Zcrit values and a "Reject H0" region in each tail. Where does Ztest fall?]
• If we want to know where our sample mean lies in the null
distribution, we convert X-bar to our test statistic Ztest
• If an observed sample mean had a z-score lower than z = -1.65, it would
be in a critical region where it was more extreme than 95% of all
sample means that might be drawn from that population
Step 4: Make a decision
If our sample mean turns out to be extremely unlikely under
the null distribution, maybe we should revise our notion of µH0
We never really “accept” the null. We either reject it, or fail to
reject it.
Steps of hypothesis testing
1. State hypothesis (H0, HA)
2. Select a criterion (alpha, Zcrit)
3. Compute test statistic (Ztest) and get a p-value
4. Make a decision
Z as test statistic
• Z test-statistic converts a sample mean into a z-score from the
null distribution.
• Zcrit is the criterion value of Z that defines the rejection region
• Ztest is the value of Z that represents the sample mean you
calculated from your data
$$Z_{\text{test}} = \frac{\bar{X} - \mu_{H_0}}{\sigma_{\bar{X}}}$$
The denominator $\sigma_{\bar{X}}$ is the standard error!
• p-value is the probability of getting a Ztest at least as extreme as
yours under the null distribution
Z as test statistic
• All test statistics are fundamentally a comparison between what
you got and what you’d expect to get from chance alone
$$Z_{\text{calc}} = \frac{\bar{X} - \mu_{\text{hyp}}}{\sigma_{\bar{X}}} = \frac{\text{Deviation you got}}{\text{Deviation from chance alone}}$$
• If the numerator is considerably bigger than the denominator,
you have evidence for a systematic factor on top of random
chance
Example I
Tim believes that his “true weight” is 187 lbs with a
standard deviation of 3 lbs.
Tim weighs himself once a week for four weeks. The
average of these four measurements is 190.5.
Are the data consistent with Tim’s belief?
Example I
1. H0: µ = 187
HA: µ > 187
2. Criterion? Let’s say alpha=.05. That would be Zcrit = 1.65
3. An X-bar of 190.5 is what Ztest? What is the probability of
getting a Ztest as high as ours?
$$Z_{\text{test}} = \frac{\bar{X} - \mu_{H_0}}{\sigma_{\bar{X}}} = \frac{190.5 - 187}{3 / \sqrt{4}} = 2.33$$
$$P(Z > 2.33) = .0099$$
4. If H0 were true, there would be only about a 1% chance of
randomly obtaining the data we have. Reject H0.
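The same four steps can be checked numerically. This is a minimal sketch: the function name is hypothetical, and NormalDist from the Python standard library stands in for the Z table, so the unrounded z gives a p-value slightly below the table's .0099.

```python
from math import sqrt
from statistics import NormalDist

def z_test_upper(x_bar, mu0, sigma, n):
    """Right-tailed z-test: return (Ztest, p-value) for HA: mu > mu0."""
    standard_error = sigma / sqrt(n)        # 3 / sqrt(4) = 1.5 for Tim
    z = (x_bar - mu0) / standard_error
    p = 1 - NormalDist().cdf(z)             # area to the right of Ztest
    return z, p

# Tim's numbers from Example I
z_test, p_value = z_test_upper(x_bar=190.5, mu0=187, sigma=3, n=4)
z_crit = NormalDist().inv_cdf(1 - 0.05)     # about 1.645 for alpha = .05
reject_null = z_test > z_crit
print(round(z_test, 2), round(p_value, 4), reject_null)
```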
Example I illustrated
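[Figure: null distribution of sample means with µX̄ = 187 and σX̄ = 1.5. The sample mean 190.5 corresponds to Ztest = (190.5 - 187) / (3 / √4) = 2.33, which lies beyond Zcrit = 1.65 in the "Reject H0" region; the area beyond Ztest is 0.01.]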
Exercise
We have a sample of 500 students whose average score on some
standardized test is 461. We think they are a particularly gifted bunch.
Assume the general student population has a distribution of scores
that is approximately normal with µ = 450 and σ = 100.
Does our sample come from a population with a mean of 450? Or are
they a better test-taking species?
H0: µ = 450
H1: µ > 450
Exercise
How to proceed?
Let’s:
- Select a criterion
- Calculate a z-score
- Compare our sample z with our criterion
- Make a decision
Exercise illustrated
We reject the null hypothesis because sample means
of 461 or larger have a very small probability. (We
expect such large means less than 1% of the time.)
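The numbers behind this conclusion can be worked through in the order listed under "How to proceed". The sketch assumes alpha = .05 for the criterion, which the exercise does not state; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Given in the exercise
mu0, sigma = 450, 100
n, x_bar = 500, 461

# Select a criterion (alpha = .05 is an assumption, right-tailed test)
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)

# Calculate a z-score for the sample mean
standard_error = sigma / sqrt(n)            # 100 / sqrt(500), about 4.47
z_test = (x_bar - mu0) / standard_error     # about 2.46

# Compare our sample z with our criterion, and get the p-value
p_value = 1 - NormalDist().cdf(z_test)      # about .007, under 1%

# Make a decision
reject_null = z_test > z_crit
print(round(z_test, 2), round(p_value, 4), reject_null)
```

Because the sample is large, the standard error is small, so even an 11-point difference is about 2.46 standard errors above 450.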
When we reject a null hypothesis,
it is because
(a) if we believe the null hypothesis, there is only a
small probability of getting data like ours by
chance alone.
(b) if we believe our data, and don’t think it came
from an unlikely chance event, the null
hypothesis is probably not true.
One-tailed tests
• If HA states µ is < some value, critical region occupies left
tail
• If HA states µ is > some value, critical region occupies
right tail
Right-tailed tests
H0: µ = 100
H1: µ > 100
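The > in H1 points right: the critical region is the right tail.
[Figure: null distribution centered on 100, with "Fail to reject H0" to the left of Zcrit and "Reject H0" (area alpha) to the right of Zcrit: values that differ “significantly” from 100.]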
Left-tailed tests
H0: µ = 100
H1: µ < 100
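The < in H1 points left: the critical region is the left tail.
[Figure: null distribution centered on 100, with "Reject H0" (area alpha) to the left of Zcrit and "Fail to reject H0" to the right of Zcrit: values that differ “significantly” from 100.]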
One- vs. two-tailed tests
• In theory, should use one-tailed when
1. Change in opposite direction would be meaningless
2. Change in opposite direction would be uninteresting
3. No rival theory predicts change in opposite direction
• By convention/default in the social sciences, two-tailed is
standard
• Why? Because it is a more stringent criterion (as we will
see). A more conservative test.
Two-tailed hypothesis testing
• HA is that µ is either greater or less than µH0
HA: µ ≠ µH0
• α is divided equally between the two tails of the
critical region
Two-tailed hypothesis testing
H0: µ = 100
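H1: µ ≠ 100
(≠ means less than or greater than)
[Figure: null distribution centered on 100, with "Reject H0" in both tails beyond the two Zcrit values (alpha split between them) and "Fail to reject H0" in between: values that differ significantly from 100.]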
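One tail
[Figure: null distribution centered on 100, with "Reject H0" (area .05) in one tail beyond Zcrit and "Fail to reject H0" elsewhere: values that differ “significantly” from 100.]
Two tail
[Figure: null distribution centered on 100, with "Reject H0" (area .025) in each tail beyond Zcrit and "Fail to reject H0" in between: values that differ significantly from 100.]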
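The effect of splitting alpha across two tails shows up directly in the critical values. A small sketch (the function name is hypothetical; NormalDist.inv_cdf plays the role of reading the Z table backwards):

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed):
    """Positive critical value of Z for a given alpha.
    A two-tailed test puts alpha / 2 in each tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    one = critical_z(alpha, two_tailed=False)
    two = critical_z(alpha, two_tailed=True)
    print(f"alpha={alpha}: one-tailed Zcrit={one:.3f}, two-tailed Zcrit={two:.3f}")
```

For the same alpha, the two-tailed Zcrit is always further from zero, which is why the two-tailed test is the more conservative one.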
Example
We have a sample of 36 children of geniuses. They have an
average IQ of 110. We want to know whether they are
significantly different from the general population of children,
who have µ=100 and σ=25.
1. Test the hypothesis that the mean of this group is higher than
that of the population.
2. What is Ztest?
3. What is Zcrit for alpha = .05? For alpha = .01? Do we reject
the null for either case?
4. What is the exact p-value for this test?
Example
• Ztest = (110 - 100) / (25 / √36) = 10/4.17 = 2.4
• alpha .05, Zcrit = 1.65
• alpha .01, Zcrit = 2.33
• P(Z > 2.4) = .008
Reject H0 at both alpha levels (Ztest is beyond both values of Zcrit).
Example
We have a sample of 36 children of geniuses. They have an
average IQ of 110. We want to know whether they are
significantly different from the general population of children,
who have µ=100 and σ=25.
1. Test the hypothesis that the mean of this group is not equal to
that of the population
2. What is Ztest?
3. What is Zcrit for alpha = .05? For alpha = .01? Do we reject
the null for either case?
4. What is the exact p-value for this test?
Example
• Ztest = (110 - 100) / (25 / √36) = 10/4.17 = 2.4
• alpha .05, Zcrit = 1.96
• alpha .01, Zcrit = 2.58
• P(|Z| > 2.4) = .016
Reject H0 at alpha = .05. Fail to reject H0 at alpha = .01 (2.4 < 2.58).
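The one-tailed and two-tailed versions of this example can be compared side by side. A minimal sketch, with hypothetical names and the standard library's NormalDist in place of the table:

```python
from math import sqrt
from statistics import NormalDist

def z_and_p_values(x_bar, mu0, sigma, n):
    """Return (Ztest, one-tailed p, two-tailed p) for a z-test of the mean."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    upper_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |Ztest| in one tail
    return z, upper_tail, 2 * upper_tail        # two-tailed p doubles the area

# Children-of-geniuses example: n = 36, mean IQ 110, population mu = 100, sigma = 25
z_test, p_one, p_two = z_and_p_values(x_bar=110, mu0=100, sigma=25, n=36)

# (one-tailed decision, two-tailed decision) at each alpha; True means reject H0
decisions = {alpha: (p_one < alpha, p_two < alpha) for alpha in (0.05, 0.01)}
print(round(z_test, 2), round(p_one, 4), round(p_two, 4), decisions)
```

The same Ztest of 2.4 is significant at alpha = .01 in the one-tailed test but not in the two-tailed test, since the two-tailed p-value of about .016 is above .01.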