SEEDSM12_8ff

Scientific Methods 1
‘Scientific evaluation, experimental design
& statistical methods’
COMP80131
Lecture 8: Statistical Methods-Significance tests
& confidence limits
Barry & Goran
www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
10 Dec 2012
COMP80131-SEEDSM8
Introduction
• Statistical significance testing has so far been applied on the
assumption of a
1) discrete population with binomial distribution
2) continuous population with known normal pdf & stdev.
• Before proceeding further, take a quick look at a few more
prob distributions & pdfs.
• Significance testing can be adapted to any of these.
Exponential pdf
• Lifetimes, e.g. of light bulbs, follow an exponential distribution:
$$\mathrm{pdf}(x) = \begin{cases} 0 & : x < 0 \\ (1/\mu)\, e^{-x/\mu} & : x \ge 0 \end{cases}$$
mean = 2;
x = 0:0.1:10;
y = exppdf(x,mean);
plot(x,y);
[Figure: exponential pdf for mean = 2, plotted against x from 0 to 10]
Mean = μ
Stdev = μ also
Poisson Distribution
• For applications that involve counting number of times a
random event occurs in a given amount of time,
e.g. number of people walking into a store in an hour.
$$\mathrm{prob}(x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{where } x \text{ is an integer}$$
• λ is both mean & variance of the distribution.
• Poisson & exponential distributions are related.
• If number of counts follows a Poisson distribution, then interval between individual counts follows exponential distribution.
• As λ gets larger, Poisson pdf → normal with µ = λ, σ² = λ.
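The last bullet can be checked numerically. The sketch below is an illustration only: the values λ = 5 and λ = 50 and the variable names are chosen here, and it assumes the Statistics Toolbox functions poisspdf and normpdf are available, as elsewhere in these slides.

```matlab
% Sketch: compare the Poisson probabilities with a normal pdf that has
% mu = lambda and sigma^2 = lambda, for a small and a larger lambda.
lambdas = [5 50];                       % illustrative values, not from the slides
maxErr = zeros(size(lambdas));
for k = 1:numel(lambdas)
    lam = lambdas(k);
    x = 0:(4*lam);                      % covers nearly all of the probability
    pPois = poisspdf(x, lam);
    pNorm = normpdf(x, lam, sqrt(lam));
    maxErr(k) = max(abs(pPois - pNorm));
    m = sum(x .* pPois);                % mean computed from the probabilities
    v = sum((x - m).^2 .* pPois);       % variance computed from the probabilities
    fprintf('lambda = %g: mean = %.3f, var = %.3f, max diff = %.4f\n', ...
            lam, m, v, maxErr(k));
end
```

The largest difference between the two curves should be smaller for the larger λ, and the computed mean & variance should both be close to λ.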
Poisson distributions in MATLAB
x = 0:60;
y = poisspdf(x,20);   % Poisson probabilities for lambda = 20
figure; stem(x,y);
x = 0:16;
y = poisspdf(x,5);    % Poisson probabilities for lambda = 5
figure; stem(x,y);
[Figures: stem plots of prob(x) against x, for λ = 5 (x = 0 to 16) and for λ = 20 (x = 0 to 60)]
Chi-squared distribution
• Given a population of normally distrib random variables with mean = 0 & stdev = 1.
• Randomly choose a sample of V observations of them.
• Let x be the sum of their squares.
• Then pdf of x has the χ² distribution:
$$\chi^2_V(x) = \begin{cases} 0 & : x < 0 \\ \dfrac{1}{2^{V/2}\,\Gamma(V/2)}\; x^{V/2-1} e^{-x/2} & : x \ge 0 \end{cases}$$
(‘Gamma function’ Γ(x) is generalisation of the factorial to non-integers, with Γ(n) = (n−1)! for integer n.)
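If s² = x/V is the mean of the V squares, then V·s² has the χ²_V distribution, so pdf(s²) = V·χ²_V(V·s²).
If pop mean = μ & stdev = σ, standardise first: the sum of the V values of ((x_i − μ)/σ)² has the χ²_V distribution.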
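As a check on the formula, the sketch below evaluates the χ² pdf directly from the expression with the Gamma function and compares it with chi2pdf. The choice V = 4 and the variable names are illustrative; chi2pdf is assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: chi-squared pdf written out from its formula, compared with chi2pdf.
V = 4;                                   % degrees of freedom (illustrative)
x = 0:0.2:15;
yFormula = x.^(V/2 - 1) .* exp(-x/2) ./ (2^(V/2) * gamma(V/2));
yToolbox = chi2pdf(x, V);
maxDiff = max(abs(yFormula - yToolbox)); % should be at rounding-error level
g5 = gamma(5);                           % Gamma(5) = 4! = 24
fprintf('max difference = %.2e, gamma(5) = %g\n', maxDiff, g5);
```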
Plot chi2 pdf with V = 4
x = 0:0.2:15;
y = chi2pdf(x,4);
plot(x,y)
[Figure: chi-squared pdf with V = 4, plotted against x from 0 to 15]
Student’s t-distribution pdf
Depends on a single parameter V (degrees of freedom).
As V → ∞, t-pdf approaches standard normal distribution
$$\mathrm{pdf}(t) = \frac{\Gamma((V+1)/2)}{\sqrt{V\pi}\;\Gamma(V/2)} \left(1 + t^2/V\right)^{-(V+1)/2} \quad : -\infty < t < \infty$$
If x is random sample of size n from a normal distribution with mean μ, then the t-statistic
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
(with x̄ = sample mean & s = sample stdev)
has Student's t-pdf with V = n – 1 degrees of freedom.
Compare t-pdf(V=5) with normal
x = -5:0.1:5;
y = tpdf(x,5);
z = normpdf(x,0,1);
plot(x,y,'b',x,z,'r');
[Figure: T-pdf with V = 5 (blue) & Norm-pdf (red), plotted for x from -5 to 5]
MATLAB functions for t-dist
• pdf for t-distribution with V degrees of freedom:
y = tpdf(t,V);
(With samples with n values, V = n-1)
• Cumulative df with V degrees of freedom:
p = tcdf(t,V)   Prob of rand var being ≤ t
• Complementary df (area under ‘tail’ from t to ∞):
p = 1 - tcdf(t,V)   Prob of rand var being > t
Inverse-cdf in MATLAB
• Inverse of cumulative distrib function
• If p = tcdf(t,V) then t = tinv(p,V)
Value of t such that prob of rand var being ≤ t is p
• If p = normcdf(z,m,σ) then z = norminv(p,m,σ)
Value of z such that prob of rand var being ≤ z is p
• Complementary version:
t = tinv(1-p,V)
Value of t such that prob of rand var being > t is p.
• Similarly for complementary version of norminv
[Figures: t-pdf plotted against x, with the probability p and the value t marked]
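A short round-trip check of these functions, as a sketch: the values V = 9, t = 2 and α = 0.05 and the variable names are made up for illustration, and the Statistics Toolbox is assumed.

```matlab
% Sketch: tcdf and tinv are inverses of each other, and tinv(1-p,V)
% gives the point with upper-tail probability p.
V = 9;                               % degrees of freedom, e.g. a sample of n = 10
t = 2.0;
p = tcdf(t, V);                      % prob of rand var being <= t
tBack = tinv(p, V);                  % recovers t
alpha = 0.05;
tUpper = tinv(1 - alpha, V);         % prob of rand var being > tUpper is alpha
tailProb = 1 - tcdf(tUpper, V);      % should return alpha
zUpper = norminv(1 - alpha, 0, 1);   % same idea for the standard normal
fprintf('tBack = %.4f, tUpper = %.4f, zUpper = %.4f\n', tBack, tUpper, zUpper);
```

The t value is larger than the corresponding z value because the t-pdf has heavier tails for small V.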
Significance testing: z-test
• Assume Normal population with known stdev = σ.
• Null-hypothesis: pop-mean = μ0
• Alternative hyp: pop-mean > μ0
• Take one sample of n values & calculate the z-statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
(with x̄ = sample mean & σ = pop stdev)
If pop-mean = μ0, dist of z will be standard Normal (mean=0, std=1)
If mean of z is 0, how likely is a value ≥ z as just calculated?
p-value = prob(x ≥ z) = 1-normcdf(z,0,1)
If p-value < significance level alpha (α), reject null-hyp.
[Figure: Std Normal pdf plotted against z]
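A worked sketch of the 1-tailed z-test with the upper-tail p-value 1-normcdf(z,0,1). The sample values, μ0 = 5 and σ = 0.4 are invented for illustration; normcdf is assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: 1-tailed z-test on a made-up sample with known population stdev.
x = [5.2 4.9 5.6 5.8 5.1 5.4 5.7 5.3 5.5 5.0];   % hypothetical sample
mu0 = 5;                      % pop-mean under the null-hyp (assumed)
sigma = 0.4;                  % known pop stdev (assumed)
alpha = 0.05;                 % significance level
n = numel(x);
xbar = mean(x);
z = (xbar - mu0) / (sigma / sqrt(n));
pValue = 1 - normcdf(z, 0, 1);        % prob of a value >= z if null-hyp is true
rejectNull = pValue < alpha;
fprintf('xbar = %.3f, z = %.3f, p-value = %.4f, reject = %d\n', ...
        xbar, z, pValue, rejectNull);
```

With these numbers the sample mean is 5.35 and z is about 2.77, so the p-value falls below 0.05 and the null-hyp is rejected.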
Alternative formulation
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
(with x̄ = sample mean & σ = pop stdev)
Assuming we need 95% confidence, α = 0.05
Let z(α) = norminv(1-α, 0, 1) = 1.65
Prob of getting rand var ≥ 1.65 is less than 0.05
If z ≥ 1.65, it is outside our 95% ‘confidence limit’ that the null-hyp may be true.
So reject null-hyp.
Confidence limit for z is -∞ to 1.65
Neglect possibility that z may be negative (1-tailed test).
Confidence limit for sample-mean is -∞ to 1.65σ/√n + μ0
2-tailed test
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
(with x̄ = sample mean & σ = pop stdev)
Assuming we need 95% confidence, α = 0.05.
Allowing possibility that z < 0, extreme portions of tails are for z > z(α/2) and for z < -z(α/2).
prob(z ≥ z(α/2)) + prob(z ≤ -z(α/2)) = 2 prob(z ≥ z(α/2))
Now, z(α/2) = norminv(1-α/2,0,1) = 1.96
Prob of getting rand var ≥ 1.96 or ≤ -1.96 is 0.05
If z > 1.96 or z < -1.96, it is outside our 95% ‘confidence limit’ that the null hyp may be true. So reject null-hyp.
Confidence limits for z are -1.96 to 1.96
Confidence limits for sample-mean are:
μ0 - 1.96σ/√n to μ0 + 1.96σ/√n
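The 2-tailed z-test limits can be sketched as follows. The numbers (μ0 = 5, σ = 0.4, n = 10, sample mean 5.35) are invented for illustration; norminv and normcdf are assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: 2-tailed z-test via confidence limits for the sample mean.
mu0 = 5; sigma = 0.4; n = 10;          % assumed null-hyp mean, pop stdev, sample size
xbar = 5.35;                           % hypothetical observed sample mean
alpha = 0.05;
zCrit = norminv(1 - alpha/2, 0, 1);    % about 1.96
lo = mu0 - zCrit * sigma / sqrt(n);    % lower confidence limit for sample-mean
hi = mu0 + zCrit * sigma / sqrt(n);    % upper confidence limit for sample-mean
z = (xbar - mu0) / (sigma / sqrt(n));
rejectNull = abs(z) > zCrit;           % same decision as xbar outside [lo, hi]
pTwoTailed = 2 * (1 - normcdf(abs(z), 0, 1));
fprintf('limits %.4f to %.4f, z = %.3f, p = %.4f, reject = %d\n', ...
        lo, hi, z, pTwoTailed, rejectNull);
```

Testing abs(z) against zCrit and testing whether the sample mean lies outside the limits are the same decision, written two ways.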
Significance testing: t-test
• Assume Normal population with unknown stdev.
• Null-hypothesis: pop-mean = μ0
• Alternative hyp: pop-mean > μ0
• Take one sample of n values & calculate the t-statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
(with x̄ = sample mean & s = sample stdev)
If pop-mean = μ0, dist of t will be standard t-pdf (blue) with V=n-1.
How likely is calculated value of t?
‘1-tailed’ p-value = prob(x ≥ t) = 1 - tcdf(t, n-1)
If p-value < significance level alpha (α), reject null hyp.
[Figure: T-pdf (blue) & Norm-pdf (red), plotted from -5 to 5]
Alternative formulation (2-tailed)
• Null-Hyp is that pop-mean is μ0
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
(with x̄ = sample mean & s = sample stdev)
• Assuming we need 95% confidence, α = 0.05
• Confidence limits for μ0 are:
x̄ - tinv(1-α/2, n-1)·s/√n
to
x̄ + tinv(1-α/2, n-1)·s/√n
If value of μ0 is outside these limits, reject the null-hyp that population mean is μ0.
If μ0 is within these confidence limits, cannot reject null-hyp.
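A worked sketch of the t-test and its confidence limits. The sample and μ0 = 5 are invented for illustration; tcdf and tinv are assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: t-test on a made-up sample, with 2-tailed confidence limits.
x = [5.2 4.9 5.6 5.8 5.1 5.4 5.7 5.3 5.5 5.0];   % hypothetical sample
mu0 = 5;                               % pop-mean under the null-hyp (assumed)
alpha = 0.05;
n = numel(x);
xbar = mean(x);
s = std(x);                            % sample stdev (divides by n-1)
t = (xbar - mu0) / (s / sqrt(n));
pOneTailed = 1 - tcdf(t, n - 1);       % '1-tailed' p-value
tCrit = tinv(1 - alpha/2, n - 1);
lo = xbar - tCrit * s / sqrt(n);       % lower confidence limit
hi = xbar + tCrit * s / sqrt(n);       % upper confidence limit
rejectNull = (mu0 < lo) || (mu0 > hi); % reject if mu0 is outside the limits
fprintf('t = %.3f, p = %.4f, limits %.4f to %.4f, reject = %d\n', ...
        t, pOneTailed, lo, hi, rejectNull);
```

Here μ0 = 5 lies below the lower limit, so the null-hyp is rejected at the 5% level.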
Difference betw z-test & t-test(2-tailed)
• With z-test pop-std (σ) is known; with t-test σ is unknown.
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
(with x̄ = sample mean & σ = pop stdev)
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
(with x̄ = sample mean & s = sample stdev)
For z-test, p-value = prob(x ≥ z) = 1 - normcdf(z,0,1)
For t-test, p-value = prob(x ≥ t) = 1 - tcdf(t,n-1)
(These are 1-tailed p-values; for a 2-tailed test use 2*(1 - normcdf(abs(z),0,1)) or 2*(1 - tcdf(abs(t),n-1)).)
Same Null-hyp: pop-mean = μ0 : reject if μ0 outside conf limits
Confidence limits for z-test:
x̄ - norminv(1-α/2, 0, 1)·σ/√n
to
x̄ + norminv(1-α/2, 0, 1)·σ/√n
Confidence limits for t-test:
x̄ - tinv(1-α/2, n-1)·s/√n to x̄ + tinv(1-α/2, n-1)·s/√n
Non-Gaussian populations
• If samples of size n are ‘randomly’ chosen from a pop with mean μ & std σ, the pdf of their sample-means approaches a Normal (Gaussian) pdf with mean μ & stdev σ/√n as n → ∞.
• Regardless of whether the population is Gaussian or not!
• This is Central Limit Theorem
• Tests can be made to work for non-Gaussian populations provided n is ‘large enough’.
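The theorem can be illustrated with a small simulation sketch. The exponential population with mean 2, the sample size n = 50 and the number of samples are chosen here for illustration; only base MATLAB functions are used, with exponential values generated by the inverse-cdf method.

```matlab
% Sketch: sample-means from a (non-Gaussian) exponential population.
rng(0);                                  % fixed seed so the run is repeatable
mu = 2;                                  % pop mean; stdev of an exponential is also mu
n = 50;                                  % size of each sample
nSamples = 10000;                        % number of samples drawn
X = -mu * log(rand(n, nSamples));        % each column is one sample
xbar = mean(X, 1);                       % one sample-mean per column
meanOfMeans = mean(xbar);
stdOfMeans = std(xbar);
predictedStd = mu / sqrt(n);             % sigma/sqrt(n) from the theorem
fprintf('mean of sample-means = %.3f, stdev = %.3f (predicted %.3f)\n', ...
        meanOfMeans, stdOfMeans, predictedStd);
% hist(xbar, 40) would show a roughly bell-shaped histogram
```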
Meaning of confidence limits
If α = 0.05, there is 95% probability that the confidence limits for a given sample will contain the true population statistic, μ say.
A really subtle point
• Does this mean that there is a 95% probability that μ lies within the 95% confidence limits for the given sample?
• No! A common mistake!
• We have just one sample – we have no idea whether it is one whose confidence limits contain μ or not.
• Only 95% of possible samples will have conf limits which contain μ.
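The frequentist reading can be illustrated by simulation: draw many samples from a population whose mean is known, and count how often the limits contain it. This is a sketch only; the population (μ = 10, σ = 2), n = 20 and the number of samples are invented, and tinv is assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: fraction of samples whose 95% confidence limits contain mu.
rng(0);                                   % fixed seed so the run is repeatable
mu = 10; sigma = 2;                       % true population values (assumed)
n = 20; nSamples = 5000; alpha = 0.05;
tCrit = tinv(1 - alpha/2, n - 1);
X = mu + sigma * randn(n, nSamples);      % each column is one sample
xbar = mean(X, 1);
s = std(X, 0, 1);                         % sample stdev of each column
halfWidth = tCrit * s / sqrt(n);
covered = (xbar - halfWidth <= mu) & (mu <= xbar + halfWidth);
coverage = mean(covered);                 % fraction of samples containing mu
fprintf('fraction of samples whose limits contain mu: %.3f\n', coverage);
```

The fraction should come out close to 0.95, but each individual sample's limits either contain μ or they do not.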
P-values & confidence limits in MATLAB
• Come for free with most measurements. For example:
x = [1;2;3;4;5;6]; y = [1.1;3;2;4;6;4];
[R, p_value, Rlo, Rup] = corrcoef(x,y)
• Returns Pearson corr coeff R = 0.79 (the off-diagonal element of the 2×2 matrix R),
• p_value = 0.061,
• Also 95% confidence limits: Rlo = -0.06, Rup = 0.98
• 95% confidence that the true corr lies between -0.06 & 0.98 (in the frequentist sense: 95% of possible samples give limits that contain it)
• “Returns p-values for testing the hypothesis of no correlation. Each p-value is probability of getting a correlation as large as the observed value by random chance, when the true correlation is zero. If p_value is small, say < 0.05, then the correlation is significant”.
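The p-value for a correlation can be related to the t-test by the standard transformation t = R√(n−2)/√(1−R²) with n−2 degrees of freedom. The sketch below recomputes it by hand for the data above; the variable names are chosen here, and tcdf is assumed to be available from the Statistics Toolbox.

```matlab
% Sketch: p-value for the correlation, recomputed from a t-statistic.
x = [1;2;3;4;5;6]; y = [1.1;3;2;4;6;4];
[R, P] = corrcoef(x, y);               % R and P are 2x2 matrices
r = R(1,2);                            % corr coeff between x and y
n = numel(x);
tStat = r * sqrt((n - 2) / (1 - r^2)); % t-statistic with n-2 degrees of freedom
pFromT = 2 * (1 - tcdf(abs(tStat), n - 2));   % 2-tailed p-value
fprintf('r = %.3f, t = %.3f, p (corrcoef) = %.4f, p (by hand) = %.4f\n', ...
        r, tStat, P(1,2), pFromT);
```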
Credibility limits
• Bayesian equivalent of ‘confidence limits’
• If limits are C1 to C2, & α = 0.05
• Now there is 95% probability of the statistic, μ say, lying between C1 & C2.
• ‘Confidence limits’ are ‘frequentist’
• Jonas explained why many people distrust the
frequentist approach and consider the Bayesian
approach to be much more reliable.
Reminder: Binomial distribution
• If p = prob(Heads), prob of getting Heads exactly r times in n independent coin-tosses is:
$$ {}^{n}C_{r}\; p^{r} (1-p)^{n-r} $$
• For a fair coin, p = 0.5, this becomes nCr / 2^n
[Figure: true probability of getting that no of heads, against no of heads obtainable with n coin-tosses (0 to 20)]
Binomial dist with n=6
[Figure: true probability of getting that no of heads, against no of heads obtainable with n coin-tosses (0 to 6); the value 0.0156 is marked]
MATLAB Script
p = 0.5; % for coin tossing
n=6;
for r=0:n
nCr = prod(n:-1:(n-r+1))/prod(1:r);
Prob(1+r) = nCr * (p^r) * (1-p)^(n-r);
end;
Prob
figure(2); stem(0:n,Prob);
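A vectorised variant of the same calculation, as a sketch: it uses the built-in nchoosek for nCr instead of the ratio of products, and the check that the probabilities add up to 1 is an addition made here.

```matlab
% Sketch: binomial probabilities for n = 6 fair coin-tosses using nchoosek.
p = 0.5;                                      % for coin tossing
n = 6;
r = 0:n;                                      % possible numbers of heads
nCr = arrayfun(@(k) nchoosek(n, k), r);       % binomial coefficients
Prob = nCr .* p.^r .* (1 - p).^(n - r);
total = sum(Prob);                            % probabilities should sum to 1
fprintf('prob of 0 heads = %.4f, sum of all probs = %.4f\n', Prob(1), total);
```

For p = 0.5 the probability of 0 heads (or of 6 heads) is 1/64, which is the 0.0156 that appears on the n = 6 plot.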
Geometric distribution
(p = prob of success).
• p(x) = (1-p) p^(x-1)
• x = number of trials (coin tosses) up to & including that in which first failure occurs
p = 0.5;
x = 1:10;
prob = (1-p)*p.^(x-1);
stem(x,prob);
[Figure: prob of first failure at x, against x: number of trials (1 to 10)]
Geometric distribution (again)
[Figure: prob of first failure at x, against x: number of trials (1 to 10), with prob(5) = 0.0313 and prob(6) = 0.0156 marked]
Barry’s Assignment
• Deadline 20 Dec 2012
• Email to [email protected] with ‘SEEDSM12’ in title, or hand in paper copy to SSO
• Exam statistics are in examdata.dat and examdata.xls in
www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
(or navigate from www.cs.man.ac.uk/~barry)