Hypothesis Testing

Central Limit Theorem
Hypothesis testing and inferential statistics
depend upon this theorem
To understand the Central Limit Theorem
we must understand the difference between
three types of distributions.
A distribution is a type of graph showing the frequency
of outcomes:
Of particular interest is the “normal distribution”
Different populations will create differing frequency
distributions, even for the same variable…
There are three fundamental types of distributions:
1. Population distributions
2. Sample distributions
3. Sampling distributions
1. Population distributions
The frequency distributions of a population.
2. Sample distributions
The frequency distributions of samples.
The sample distribution should look like
the population distribution...
Why?
3. Sampling distributions
The frequency distributions of statistics.
The sampling distribution should NOT look like
the population distribution...
Why?
Suppose we had population distributions that
looked like these:
Say the population mean was equal to 40. Suppose we took
a random sample of a certain size n from this population,
over and over again, and calculated the
mean each time...
We could make a distribution of nothing but
those means. This would be a sampling
distribution of means.
Some questions about this sampling distribution:
1. What would be the mean of all those means?
2. If the population mean was 40, how many
of the sample means would be larger than 40,
and how many would be less than 40?
Regardless of the shape of the population distribution,
the sampling distribution would be centered on, and
(for reasonably large n) approximately symmetrical around,
the population mean of 40.
3. What will be the variance of the sampling distribution?
The means of all the samples will be closer
together (have less variance) if the variance of
the population is smaller.
The means of all the samples will be closer
together (have less variance) if the size of
each sample (n) gets larger.
n = number of observations in each sample (the sample size),
not the number of samples
So the sampling distribution will have a mean
equal to the population mean, and a variance
inversely proportional to the size of the sample (n),
and proportional to the variance of the population.
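The three answers above can be checked with a small simulation. This is only a sketch: the skewed (exponential) population shape, the sample size of 25, and the 5000 repetitions are assumptions chosen for illustration; only the population mean of 40 comes from the example.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable

POP_MEAN = 40.0          # population mean from the example above
POP_VAR = POP_MEAN ** 2  # an exponential population has variance = mean^2 (assumed shape)
N = 25                   # size of each sample (hypothetical)
REPS = 5000              # how many samples are drawn (hypothetical)

def sample_mean(n):
    """Draw one random sample of size n from the skewed population; return its mean."""
    sample = [rng.expovariate(1.0 / POP_MEAN) for _ in range(n)]
    return statistics.fmean(sample)

# The sampling distribution of means: nothing but sample means.
means = [sample_mean(N) for _ in range(REPS)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)
share_above = sum(m > POP_MEAN for m in means) / REPS

print("mean of the sample means:", round(mean_of_means, 2))
print("variance of the sample means:", round(var_of_means, 2))
print("population variance / n:", POP_VAR / N)
print("share of sample means above 40:", round(share_above, 3))
```

The mean of the means should land near 40 and their variance near 1600 / 25 = 64, even though the population itself is strongly skewed. Raising N should shrink the variance further.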
http://www.khanacademy.org/math/statistics/v/central-limit-theorem
http://www.khanacademy.org/math/statistics/v/sampling-distributionof-the-sample-mean
Central Limit Theorem
If samples are large, then
the sampling distribution created by those
samples will have a mean equal to the
population mean and a standard deviation
equal to the standard error.
Standard Error = the standard deviation of the sampling distribution
(the typical size of the sampling error)
The sampling distribution will be a normal curve with:
$\mu_{\bar{x}} = \mu$
and
$S_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
This makes inferential statistics possible
because all the characteristics of a normal curve
are known.
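As a rough illustration of why a known normal curve is so useful, the sketch below asks how likely a sample mean of 44 or more would be. The population standard deviation of 10 and the sample size of 25 are hypothetical values; only the mean of 40 is taken from the earlier example.

```python
from math import sqrt
from statistics import NormalDist

mu = 40.0      # population mean from the earlier example
sigma = 10.0   # population standard deviation (hypothetical)
n = 25         # sample size (hypothetical)

# Standard error: the standard deviation of the sampling distribution.
standard_error = sigma / sqrt(n)   # 10 / 5 = 2

# The sampling distribution of the mean, treated as a normal curve.
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# 44 sits two standard errors above the population mean.
p_above_44 = 1.0 - sampling_dist.cdf(44.0)

print("standard error:", standard_error)
print("P(sample mean >= 44):", round(p_above_44, 4))
```

Under these assumptions the standard error is 2, so a sample mean of 44 is two standard errors out and has a probability of about 0.0228.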
A great example of the theorem in action:
http://www.statisticalengineering.com/central_limit_theorem.htm
Another great example of the theorem in action:
https://www.khanacademy.org/math/probability/statistics-inferential/sampling_distribution/v/sampling-distribution-example-problem
Hypothesis Testing:
A statistic tests a hypothesis: $H_0$
$H_0: \mu = 100$
The alternative hypothesis is: $H_A$
$H_A: \mu \neq 100$
A probability is established to test the
“null” hypothesis.
Hypothesis Testing:
95% confidence would mean that there
would need to be a 5% or smaller probability of
getting a result at least this extreme if the null
hypothesis were true; the null hypothesis would
then be rejected in favor of the “alternative” hypothesis.
alpha = 1 - confidence level = 1 - .95 = .05
Errors:
Type I Error: saying something is
happening when nothing is:
p = alpha
Type II Error: saying nothing is
happening when something is:
p = beta
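The two error rates can be estimated by simulation. The sketch below assumes a two-tailed z-test of a null mean of 100 with a known population standard deviation; the standard deviation of 15, the sample size of 25, the "true" mean of 105, and the number of trials are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

rng = random.Random(1)   # fixed seed so the run is repeatable

MU_0 = 100.0    # mean under the null hypothesis
SIGMA = 15.0    # population standard deviation, assumed known (hypothetical)
N = 25          # sample size (hypothetical)
ALPHA = 0.05    # 1 - confidence level
TRIALS = 4000   # simulated experiments per scenario (hypothetical)

# Two-tailed critical value, about 1.96 for alpha = 0.05.
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_null(true_mean):
    """Run one simulated experiment; return True if the z-test rejects the null."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (fmean(sample) - MU_0) / (SIGMA / sqrt(N))
    return abs(z) > Z_CRIT

def rejection_rate(true_mean):
    """Fraction of simulated experiments in which the null is rejected."""
    return sum(rejects_null(true_mean) for _ in range(TRIALS)) / TRIALS

# Nothing is happening (null true): every rejection is a Type I error.
type1_rate = rejection_rate(MU_0)

# Something is happening (true mean 105): every non-rejection is a Type II error.
type2_rate = 1 - rejection_rate(105.0)

print("estimated Type I error rate:", round(type1_rate, 3))
print("estimated Type II error rate:", round(type2_rate, 3))
```

The Type I rate should settle near alpha, while the Type II rate (beta) depends on how far the true mean is from 100 and on the sample size.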
http://www.youtube.com/watch?v=taEmnrTxuzo
An example from court cases:
http://www.intuitor.com/statistics/T1T2Errors.html
Care must be taken when using hypothesis testing…
PROBLEMS
I hypothesize that a barking dog is hungry.
The dog barks. Is the dog therefore hungry?
To answer that question, I would have to have some
prior information.
For example, how often does the dog bark when it is not hungry?
Suppose we flipped a coin
a hundred times….
It came up heads 60 times.
Is it a fair coin?
No...
Because the Z-test finds that the probability of getting 60 or more
heads from a fair coin is equal to 0.0228 (one-tailed).
We would reject the Null Hypothesis!
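The 0.0228 figure can be reproduced with a short calculation. This sketch assumes the normal approximation to the binomial, a one-tailed test, and no continuity correction, which is what matches the value quoted on the slide.

```python
from math import sqrt
from statistics import NormalDist

flips = 100
heads = 60
p_fair = 0.5   # probability of heads under the null hypothesis (fair coin)

expected_heads = flips * p_fair                 # 50
sd_heads = sqrt(flips * p_fair * (1 - p_fair))  # sqrt(25) = 5

z = (heads - expected_heads) / sd_heads         # (60 - 50) / 5 = 2.0

# One-tailed: probability of a result this far above 50, or further.
p_one_tailed = 1 - NormalDist().cdf(z)
# Two-tailed: this far from 50 in either direction.
p_two_tailed = 2 * p_one_tailed

print("z =", z)
print("one-tailed p =", round(p_one_tailed, 4))
print("two-tailed p =", round(p_two_tailed, 4))
```

With these numbers z is 2.0 and the one-tailed probability is about 0.0228; a two-tailed test would give roughly 0.0455.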
Suppose we flipped the same
coin a hundred times again…
It came up tails 60 times.
Is it a fair coin?
But we have now thrown the
coin two hundred times, and…
It came up tails 100 times.
Is it a fair coin?
Perfectly fair
With Z = 0 there is now no evidence at all for rejecting the null hypothesis!!
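The reversal can be seen by computing Z for each run and for the pooled data. This is a sketch using the same normal approximation as before; the helper name z_for_heads is made up for illustration, and 60 tails in the second run is counted as 40 heads.

```python
from math import sqrt

def z_for_heads(heads, flips, p_fair=0.5):
    """Z statistic for a count of heads under the fair-coin null (normal approximation)."""
    expected = flips * p_fair
    sd = sqrt(flips * p_fair * (1 - p_fair))
    return (heads - expected) / sd

z_first = z_for_heads(60, 100)    # first run: 60 heads
z_second = z_for_heads(40, 100)   # second run: 60 tails, so 40 heads
z_pooled = z_for_heads(100, 200)  # both runs together: 100 heads in 200 flips

print("first run z =", z_first)
print("second run z =", z_second)
print("pooled z =", z_pooled)
```

Each run on its own sits two standard deviations from what a fair coin predicts, in opposite directions, while the pooled data sit exactly on the prediction.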
Suppose we project a
Poggendorff figure to one side
of the brain or to the other...
and measure error.
Paired Samples Statistics
Pair 1    Mean      N     Std. Error Mean
Right     5.4167    12    .70128
Left      4.9167    12    .62107
t(11) = 2.17, p = 0.053
What do you conclude?
Now suppose you did this again
with another sample of 12 people.
t(11) = 2.10, p = 0.057
But the probability of independent events is
p(A) × p(B), so that
the null hypothesis probability for both studies was:
0.053 × 0.057 = 0.003
What do you conclude now?
But if the brain
hemispheres are truly
independent...
Then the right and left scores are independent samples, not pairs:
Paired Samples Statistics
Pair 1    Mean      N     Std. Error Mean
Right     5.4167    12    .70128
Left      4.9167    12    .62107
t(22) = 0.53, p = 0.60
What do you conclude now?
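The t(22) value can be recovered from the summary table alone. This sketch assumes an independent-samples t-test with equal group sizes, in which case the standard error of the difference is the square root of the sum of the two squared standard errors; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics copied from the table above.
mean_right, se_right, n_right = 5.4167, 0.70128, 12
mean_left, se_left, n_left = 4.9167, 0.62107, 12

# Standard error of the difference between two independent means.
se_difference = sqrt(se_right ** 2 + se_left ** 2)

t = (mean_right - mean_left) / se_difference
df = n_right + n_left - 2   # 12 + 12 - 2 = 22

print(f"t({df}) = {t:.2f}")   # prints: t(22) = 0.53
```

The same half-point difference in means gives t(11) = 2.17 when the scores are treated as pairs but only t(22) = 0.53 when they are treated as independent, because the paired test removes the variation between people.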
Read the following article:
http://commonsenseatheism.com/wp-content/uploads/2011/01/Siegfried-Odds-Are-Its-Wrong.pdf