Hypothesis Testing

Central Limit Theorem
Hypothesis testing and inferential statistics
depend on this theorem.
To understand the Central Limit Theorem,
we must understand the difference between
three types of distributions.
A distribution is a type of graph showing the frequency
of outcomes.
Of particular interest is the “normal distribution”.
Different populations will create differing frequency
distributions, even for the same variable.
There are three fundamental types of distributions:
1. Population distributions
2. Sample distributions
3. Sampling distributions
A sampling distribution is a
distribution of a statistic (such as the mean)
computed from many samples, not a distribution
of the raw scores in any one sample.
The three distributions are related:
1. Population distributions
The frequency distributions of a population.
2. Sample distributions
The frequency distributions of samples.
The sample distribution should look like
the population distribution.
Why?
3. Sampling distributions
The frequency distributions of statistics.
The sampling distribution should NOT look like
the population distribution.
Why?
Some questions about this sampling distribution:
1. If the population mean was 40, how many
of the sample means would be larger than 40,
and how many would be less than 40?
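Regardless of the shape of the population
distribution, the sampling distribution of the mean
would be approximately symmetrical around the
population mean of 40, so about half of the sample
means would be larger than 40 and half would be less.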
2. What will be the variance of the
sampling distribution?
The means of all the samples will be closer
together (have less variance) if the variance of
the population is smaller.
The means of all the samples will be closer
together (have less variance) if the size of
each sample (n) gets larger.
So the sampling distribution will have a mean
very close to the population mean, and a variance
inversely proportional to the size of the sample (n),
and proportional to the variance of the population.
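These two claims can be checked with a small simulation. The sketch below is illustrative only: the skewed (exponential) population with mean 40, the sample sizes, the number of samples, and the function name are all assumptions chosen for the demonstration, not part of the original slides.

```python
import random
from statistics import mean, pvariance

def simulate_sample_means(n, num_samples=2000, population_mean=40.0, seed=1):
    """Draw num_samples samples of size n from a skewed (exponential)
    population and return the mean of each sample."""
    rng = random.Random(seed)
    rate = 1.0 / population_mean  # exponential: mean = 1/rate, variance = 1/rate**2
    return [
        sum(rng.expovariate(rate) for _ in range(n)) / n
        for _ in range(num_samples)
    ]

population_variance = 40.0 ** 2  # variance of the assumed exponential population

for n in (50, 200):
    sample_means = simulate_sample_means(n)
    print(
        f"n={n:3d}  mean of sample means={mean(sample_means):6.2f}  "
        f"variance of sample means={pvariance(sample_means):6.2f}  "
        f"population variance / n={population_variance / n:6.2f}"
    )
```

Under these assumptions the mean of the sample means should land near 40 for both sample sizes, while the variance of the sample means should shrink roughly fourfold when n goes from 50 to 200.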
http://www.youtube.com/watch?v=zr-97MVZYb0
Central Limit Theorem
If samples are large, then
the sampling distribution of the mean created by
those samples will be approximately normal, with a
mean equal to the population mean and a standard
deviation equal to the standard error.
The standard error measures the typical size of the sampling error.
The sampling distribution will be a normal curve with:
$\mu_{\bar{x}} = \mu$
and
$S_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
This makes inferential statistics possible
because all the characteristics of a normal curve
are known.
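As a minimal sketch of what "known characteristics" buys us, the example below treats the sampling distribution as a normal curve and reads probabilities straight off it. The population mean of 40 comes from the earlier question; the population standard deviation of 10, the sample size of 25, and the cutoff of 44 are assumed values picked only for illustration.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 40.0  # from the earlier example
population_sd = 10.0    # assumed for illustration
n = 25                  # assumed sample size

# Standard error: the standard deviation of the sampling distribution.
standard_error = population_sd / sqrt(n)

# The sampling distribution of the mean, modelled as a normal curve.
sampling_dist = NormalDist(mu=population_mean, sigma=standard_error)

# Probability that a sample mean exceeds 44 (two standard errors above 40).
p_above_44 = 1 - sampling_dist.cdf(44)

# Range that contains the middle 95% of sample means.
low = sampling_dist.inv_cdf(0.025)
high = sampling_dist.inv_cdf(0.975)

print(f"standard error = {standard_error:.2f}")
print(f"P(sample mean > 44) = {p_above_44:.4f}")
print(f"middle 95% of sample means: {low:.2f} to {high:.2f}")
```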
Hong Kong Temperatures
http://www.statisticalengineering.com/central_limit_theorem.htm
A great example of the theorem in action.
Hypothesis Testing:
A statistic tests a hypothesis, the “null” hypothesis: $H_0$
$H_0: \mu = 100$
The alternative hypothesis is: $H_A$
$H_A: \mu \neq 100$
A probability is established to test the
“null” hypothesis.
Hypothesis Testing:
95% confidence would mean that, if the null
hypothesis were true, there would need to be a
5% or less probability of getting a result at least
as extreme as the one observed; the null
hypothesis would then be dropped in
favor of the “alternative” hypothesis.
alpha = 1 - confidence level (.95) = .05
Errors:
Type I Error: saying something is
happening when nothing is
(rejecting a null hypothesis that is true):
p = alpha
Type II Error: saying nothing is
happening when something is
(failing to reject a null hypothesis that is false):
p = beta
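The two error rates can be estimated by simulating many studies. This is a rough sketch, not a definitive procedure: the null value of 100 echoes the hypothesis example, while the population standard deviation of 15, the sample size of 25, the "true" mean of 105 used for the Type II case, and the function name are all assumptions made for the demonstration.

```python
import random
from math import sqrt

def rejection_rate(true_mean, null_mean=100.0, sd=15.0, n=25,
                   z_crit=1.96, trials=4000, seed=7):
    """Fraction of simulated studies in which a two-tailed z-test
    rejects H0: mu = null_mean at alpha = .05."""
    rng = random.Random(seed)
    standard_error = sd / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        z = (sample_mean - null_mean) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=100.0)

# H0 is false here (assumed true mean of 105), so every
# failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=105.0)

print(f"Type I error rate (should be near alpha = .05): {type_1_rate:.3f}")
print(f"Type II error rate (beta) for a true mean of 105: {type_2_rate:.3f}")
```

Alpha is chosen by the researcher, whereas beta depends on how far the true mean is from the null value, on the population variance, and on the sample size.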
http://www.youtube.com/watch?v=taEmnrTxuzo
An example from court cases:
http://www.intuitor.com/statistics/T1T2Errors.html
Care must be taken when using hypothesis testing…
PROBLEMS
I hypothesize that a barking dog is hungry.
The dog barks; is the dog therefore hungry?
To answer that question, I would need some
prior information.
For example, how often does the dog bark when it is not hungry?
Suppose we flipped a coin
a hundred times….
It came up heads 60 times.
Is it a fair coin?
No.
Because a one-tailed Z-test (z = 2.0) finds that the probability
of a fair coin giving 60 or more heads is equal to 0.0228.
We would reject the Null Hypothesis!
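The 0.0228 figure can be reproduced with the normal approximation to the binomial. The sketch below is one way to do it; the function name is hypothetical, and no continuity correction is applied, which is an assumption made to match the slide's value.

```python
from math import sqrt
from statistics import NormalDist

def coin_z_test(heads, flips, p_null=0.5):
    """Z-test of H0: P(heads) = p_null, using the normal approximation
    to the binomial (no continuity correction)."""
    expected = flips * p_null
    sd = sqrt(flips * p_null * (1 - p_null))
    z = (heads - expected) / sd
    one_tailed = 1 - NormalDist().cdf(abs(z))
    return z, one_tailed, 2 * one_tailed

z, p_one_tailed, p_two_tailed = coin_z_test(heads=60, flips=100)
print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

The one-tailed value matches the slide; a two-tailed test of the same data gives a p-value twice as large, which is still below .05.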
Suppose we flipped the same
coin a hundred times again…
It came up tails 60 times.
Is it a fair coin?
But we have now thrown the
coin two hundred times, and…
It came up tails 100 times.
Is it a fair coin?
Perfectly fair
With exactly 100 tails in 200 flips, z = 0, so there are
no grounds at all for rejecting the null hypothesis!
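The contrast between the two separate runs and the pooled data can be made concrete. This is a minimal sketch using the same normal approximation as the Z-test above; the function name is hypothetical, and the two-tailed p-value is used here by assumption.

```python
from math import sqrt
from statistics import NormalDist

def z_and_two_tailed_p(heads, flips):
    """z statistic and two-tailed p-value for H0: the coin is fair."""
    z = (heads - flips * 0.5) / sqrt(flips * 0.25)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

first_run = z_and_two_tailed_p(60, 100)   # 60 heads, 40 tails
second_run = z_and_two_tailed_p(40, 100)  # 40 heads, 60 tails
pooled = z_and_two_tailed_p(100, 200)     # both runs combined

for label, (z, p) in [("first 100 flips", first_run),
                      ("second 100 flips", second_run),
                      ("all 200 flips", pooled)]:
    print(f"{label:17s} z = {z:+.2f}  two-tailed p = {p:.4f}")
```

Each run on its own looks "significant", in opposite directions, yet the combined data show no departure from fairness at all.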
Suppose we project a
Poggendorff figure to one side
of the brain or to the other,
and measure error.
Paired Samples Statistics
Pair 1    Mean     N    Std. Error Mean
Right     5.4167   12   .70128
Left      4.9167   12   .62107
t(11) = 2.17, p = 0.053
What do you conclude?
Now suppose you did this again
with another sample of 12 people.
t(11) = 2.10, p = 0.057
But the probability of independent events is:
p(A) × p(B), so that:
The Null hypothesis probability for both studies was:
0.053 × 0.057 = 0.003
What do you conclude now?
But if the brain
hemispheres are truly
independent, then the two sets of scores
should be treated as independent samples:
Paired Samples Statistics
Pair 1    Mean     N    Std. Error Mean
Right     5.4167   12   .70128
Left      4.9167   12   .62107
t(22) = 0.53, p = 0.60
What do you conclude now?
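The t(22) result can be reproduced from the summary table alone, because an independent-samples test needs only the two means and their standard errors. The sketch below is a hand-rolled illustration using the standard library: the function names are hypothetical, and the p-value is obtained by numerically integrating the Student t density with Simpson's rule rather than by calling a statistics package. With equal group sizes, the pooled-variance and unpooled standard errors of the difference coincide, so no choice between them is needed here.

```python
from math import sqrt, gamma, pi

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 1 - 2 * area

# Summary statistics from the table.
mean_right, se_right, n_right = 5.4167, 0.70128, 12
mean_left, se_left, n_left = 4.9167, 0.62107, 12

# Standard error of the difference between two independent means.
se_diff = sqrt(se_right ** 2 + se_left ** 2)
t_stat = (mean_right - mean_left) / se_diff
df = n_right + n_left - 2
p_value = two_tailed_p(t_stat, df)

print(f"t({df}) = {t_stat:.2f}, p = {p_value:.2f}")
```

The same mean difference of 0.5 that was nearly significant as a paired comparison is nowhere near significant once the two sets of scores are treated as independent, because the pairing no longer removes between-person variability.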