(Comic: http://xkcd.com/539/)
Hypothesis Testing and Statistical Significance

 Estimators and Correlation
 Hypothesis Testing and Statistical Significance
 Labels and other questions
2
General definition, continuous and discrete variables:

E[X] = \int x \, f(x) \, dx   (where f(x) is the probability density function for X)

For discrete variables:

E[X] = \sum_i x_i \, f(x_i) = \sum_i x_i \, P(X = x_i) = \frac{1}{n} \sum_i x_i = \bar{X}   (when P(x_1) = P(x_2) = \dots = P(x_n) = 1/n)
3
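A quick sketch of the discrete case (Python, not from the slides; the die values are purely illustrative): the expectation is the probability-weighted sum of the outcomes, and with equal probabilities 1/n it reduces to the ordinary mean.

# Expected value of a discrete random variable: E[X] = sum_i x_i * P(X = x_i)
values = [1, 2, 3, 4, 5, 6]       # possible outcomes (a fair die, for illustration)
probs = [1/6] * 6                 # P(X = x_i); equal probabilities here

e_x = sum(x * p for x, p in zip(values, probs))
print(e_x)                        # 3.5

# With P(x_i) = 1/n for every outcome, E[X] is just the ordinary mean:
print(sum(values) / len(values))  # 3.5 again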

What is an estimator?

Often a trade-off between bias and variance

bias(\hat{\theta}) = E[\hat{\theta}] - \theta
4
Variance defined:

Population variance (have all obs 1…N):

\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

Two estimators of population variance:

s_n^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

(Typeset equations courtesy http://en.wikipedia.org/wiki/Variance)
5
The biased estimator (divide by n) vs. the unbiased estimator (divide by n − 1)

(Typeset equations courtesy http://en.wikipedia.org/wiki/Variance)
6
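A small simulation (not from the slides; numpy assumed, parameters invented) contrasting the two estimators: dividing by n underestimates the population variance on average, while dividing by n − 1 is unbiased.

import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0                  # population variance (sigma = 2)
n, reps = 10, 100_000           # small samples make the bias visible

biased, unbiased = [], []
for _ in range(reps):
    x = rng.normal(loc=0.0, scale=true_var ** 0.5, size=n)
    biased.append(np.var(x, ddof=0))    # divide by n
    unbiased.append(np.var(x, ddof=1))  # divide by n - 1

print(np.mean(biased))    # about 3.6: systematically below 4 (bias is -sigma^2/n)
print(np.mean(unbiased))  # about 4.0 on average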
Population covariance and correlation:

\sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)

\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \, \sigma_y}

Sample (Pearson) correlation, estimating the means and SDs from the data:

r_{xy} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu}_x)(y_i - \hat{\mu}_y)}{\hat{\sigma}_x \, \hat{\sigma}_y}
       = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)
       = \frac{1}{n-1}\sum_{i=1}^{n} z_{x_i} z_{y_i}
       = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}
7
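A sketch (not from the slides; numpy assumed, data invented) computing the sample correlation two equivalent ways: averaging products of z-scores with n − 1 in the denominator, and numpy's built-in corrcoef.

import numpy as np

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.0, 3.0, 4.0, 8.0, 9.0])
n = len(x)

# z-score form: r = (1/(n-1)) * sum(z_x * z_y), with sample SDs (ddof=1)
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
r = np.sum(zx * zy) / (n - 1)

print(r)                         # about 0.98 for these made-up points
print(np.corrcoef(x, y)[0, 1])   # the same value from numpy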

Standard Deviation
 Spread of a list
 Single variables have an SD

Standard Error
 Spread of a chance process
 Sampling distributions have an SE

(Graphics: Wikipedia)
8
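To make the distinction concrete, a small simulation (not from the slides; numpy assumed): the SD describes the spread of individual values in a list, while the SE describes the spread of a sample mean across repeated samples, roughly SD / sqrt(n).

import numpy as np

rng = np.random.default_rng(1)
population = rng.normal(loc=100, scale=15, size=1_000_000)   # individuals have SD = 15

n = 25
sample_means = [rng.choice(population, size=n).mean() for _ in range(10_000)]

print(population.std())       # about 15  (SD: spread of the list)
print(np.std(sample_means))   # about 3   (SE: spread of the sampling distribution)
print(15 / np.sqrt(n))        # 3.0       (shortcut formula: SE = SD / sqrt(n))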
9

Remember that a z-score tells us where a score is located within a distribution: specifically, how many standard deviation units the score is above or below the mean.

z = \frac{Y - \mu}{\sigma}
10

For example, if we find a particular difference
that is x standard errors wide, how confident
are we that the difference is not just due to
chance?

So… we can use z-scores on a normal curve to interpret how likely a given outcome is (how likely is it to be due to chance?)
11

Example: you have a variable x with a mean of 500 and S.D. of 15. How common is a score of 525?
 z = (525 − 500) / 15 ≈ 1.67
 If we look up a z-statistic of 1.67 in a z-score table, we find that the proportion of scores less than our value is .9525.

z = \frac{Y - \mu}{\sigma}

 Or, a score of 525 exceeds .9525 of the population. (p < .05)
12
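The same arithmetic as a sketch (scipy assumed for the normal curve; the z-table on the slide and norm.cdf agree):

from scipy.stats import norm

mean, sd, score = 500, 15, 525
z = (score - mean) / sd
print(z)             # 1.666..., rounded to 1.67 on the slide

# Proportion of the (normal) population below a score of 525:
print(norm.cdf(z))   # about 0.952, the area a z-table gives for z = 1.67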

z is a test statistic

More generally:

z = \frac{Y - \mu}{\sigma}

z = \frac{\text{observed} - \text{expected}}{SE}

z tells us how many standard errors an observed value is from its expected value.
13

A confidence interval is a range of scores above and below the mean.
 The interval is measured in standard errors
 It is the interval where we expect our value to be

 A confidence coefficient is the likelihood that a given interval contains the true value of the parameter

 Sample value = true population value + error
14
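A sketch of a 95% confidence interval for a sample mean, built the way the slide describes: the sample value plus or minus a number of standard errors (1.96 for a 95% confidence coefficient). The data are invented; numpy is assumed.

import numpy as np

x = np.array([7.1, 8.4, 9.0, 6.8, 8.9, 7.7, 8.2, 9.5, 7.9, 8.6])  # made-up sample
n = len(x)

mean = x.mean()
se = x.std(ddof=1) / np.sqrt(n)               # standard error of the mean

lo, hi = mean - 1.96 * se, mean + 1.96 * se   # the interval is measured in SEs
print(f"95% CI: ({lo:.2f}, {hi:.2f}) around the sample mean {mean:.2f}")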

One-tailed
 Directional Hypothesis
 Probability at one end of the
curve

Two-tailed
 Non-directional Hypothesis
 Probability is at both ends of the curve
15

Null Hypothesis:
 H0: μ1 = μc
▪ μ1 is the intervention
population mean
▪ μc is the control population
mean
 Alternative Hypotheses:
 H1: μ1 < μc
 H1: μ1 > μc
16

Null Hypothesis:
 H0: μ1 = μc
▪ μ1 is the intervention
population mean
▪ μc is the control population
mean
 Alternative Hypothesis:
 H1: μ1 ≠ μc
17

Do Berkeley students read more or less than 8
hours a week?


H0: μ = 8 (the mean for Berkeley students is equal to 8)
H1: μ ≠ 8 (the mean for Berkeley students is not equal to 8)
18

Do Berkeley students read more than 8 hours a
week (the average for students across the
country)?


H0: μ = 8 (there is no difference between Berkeley students and other students)
H1: μ > 8 (the mean for Berkeley students is higher than the mean for all students)
19
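A sketch of how the Berkeley reading question could be tested with invented data (scipy assumed). ttest_1samp tests H0: μ = 8 against the two-sided alternative; halving its p-value gives the one-sided test for H1: μ > 8 when the sample mean is above 8.

import numpy as np
from scipy.stats import ttest_1samp

hours = np.array([9.5, 7.0, 10.0, 8.5, 11.0, 6.5, 9.0, 10.5, 8.0, 9.5])  # invented sample

t_stat, p_two_sided = ttest_1samp(hours, popmean=8)   # H0: mu = 8 vs H1: mu != 8
print(t_stat, p_two_sided)

# One-sided p-value for H1: mu > 8 (the sample mean here is 8.95, above 8):
print(p_two_sided / 2)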

A p-value is the observed significance level (more on this in a moment)

 A test statistic depends on the data, as does p.

 This chance assumes that the null hypothesis is correct. Thus, the smaller the chance (p-value), the more likely it is that the null can be rejected.

 The choice of a test statistic (e.g., z, t, F, χ²) depends on the model and the hypothesis being considered

 The basic process is exactly the same, however.
20
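A sketch (scipy assumed; the test statistic is made up) of turning a test statistic into a p-value under the null, for both the one-tailed and two-tailed alternatives described earlier:

from scipy.stats import norm

z = 2.1                              # an observed test statistic (illustrative)

p_one_tailed = norm.sf(z)            # P(Z >= z) under H0, for a directional H1
p_two_tailed = 2 * norm.sf(abs(z))   # probability in both tails, for a non-directional H1

print(p_one_tailed)   # about 0.018
print(p_two_tailed)   # about 0.036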

When p-value > .10 → the observed difference is “not significant”

 When p-value ≤ .10 → the observed difference is “marginally significant” or “borderline significant”

 When p-value ≤ .05 → the observed difference is “significant”

 When p-value ≤ .01 → the observed difference is “highly significant”
21

We cannot hypothesize the null
 As odd as it may seem at first, we reject or do not
reject the null; a traditional hypothesis test tests
against the null.

We never use the word “prove” with hypothesis testing and statistics; we reject or accept.
 “Prove” has a specific meaning in mathematics and philosophy, but the term is misleading in statistics.
22

Type I Error: falsely rejecting a true null hypothesis (false positive)

 Type II Error: failing to reject the null hypothesis when it is false (false negative)
23
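A small simulation (not from the slides; numpy and scipy assumed) of what the Type I error rate means: when the null is actually true and we reject whenever p ≤ .05, we falsely reject in roughly 5% of repeated experiments.

import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(42)
reps, n, alpha = 10_000, 30, 0.05

false_positives = 0
for _ in range(reps):
    sample = rng.normal(loc=8, scale=2, size=n)   # H0 is true: the population mean really is 8
    _, p = ttest_1samp(sample, popmean=8)
    if p <= alpha:
        false_positives += 1                      # rejecting a true null = Type I error

print(false_positives / reps)   # close to 0.05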
(The auto data)
24