power point - Turning Wheel
Revision of basic statistics
Hypothesis testing
Principles
Testing a proportion
Testing a mean
Testing the difference between two means
Estimation
Principles of hypothesis testing
Null vs alternative hypothesis
The null is assumed true until proven otherwise
If the evidence is inconsistent with the null, reject it in favour of the
alternative. E.g.
H0: a coin is fair vs H1: a coin is biased towards heads
Evidence (data): 20 heads in 25 tosses
Evidence seems unlikely if H0 were true, hence reject H0
The probability of evidence this extreme (20 or more heads) is actually
about 0.2%. We usually reject H0 if this probability is < 5% (the
significance level of the test).
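The 0.2% figure for the coin example can be checked with a short calculation. This is a minimal sketch, assuming "such extreme evidence" means 20 or more heads in 25 tosses of a fair coin; the function name `binomial_upper_tail` is illustrative and does not come from the slides.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Evidence from the slide: 20 heads in 25 tosses, H0: the coin is fair (p = 0.5)
prob = binomial_upper_tail(20, 25)
significance_level = 0.05

# Reject H0 when the evidence is less likely than the significance level
reject_h0 = prob < significance_level
print(f"P(20 or more heads | fair coin) = {prob:.4f}, reject H0: {reject_h0}")
```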
Testing a proportion
H0: 10% of people are left handed
H0: p = 0.1
H1: the proportion is not 10%
H1: p ≠ 0.1
The sample proportion $\hat{p}$ is a random variable and should be
somewhere near to the true value.
Its probability distribution is $\hat{p} \sim N(p,\ p(1-p)/n)$ under H0
Hence the test statistic is
$$ z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} $$
This is, in general,
$$ z = \frac{\text{sample statistic} - \text{hypothesised value}}{\text{standard error of the sample statistic}} $$
Using data to calculate the test statistic
If 7 out of a group of 50 are left handed, the test statistic is
$$ z = \frac{0.14 - 0.10}{\sqrt{\dfrac{0.1(1 - 0.1)}{50}}} = 0.94 $$
This is less than z* = 1.96, the critical value which cuts off a total
of 5% in the two tails of the Normal distribution (2.5% in each tail).
Hence we cannot reject H0.
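The left-handedness calculation can be reproduced as follows. This is a sketch only: the helper `proportion_z` is a hypothetical name, and the critical value is computed with `statistics.NormalDist` from the Python standard library rather than read from tables.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the standard error under H0."""
    p_hat = successes / n                 # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)   # standard error if H0 is true
    return (p_hat - p0) / std_error

z = proportion_z(7, 50, 0.10)

# Two-tail test at 5%: 2.5% in each tail, so use the 97.5th percentile
z_star = NormalDist().inv_cdf(0.975)

reject_h0 = abs(z) > z_star
print(f"z = {z:.2f}, z* = {z_star:.2f}, reject H0: {reject_h0}")
```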
Testing a mean
A firm selling franchises claims that the average weekly income of a
franchise is at least £2000. A sample of 40 such franchises finds an
average weekly income of £1770 with s.d. £450. Is the claim
justified?
H0: μ = 2000 vs H1: μ < 2000
Significance level for test: 1% (we want to avoid a false accusation)
Critical value: z*= 2.33
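$$ \bar{x} \sim N\left(\mu,\ \frac{\sigma^2}{n}\right) \quad\text{so}\quad \bar{x} \sim N\left(2000,\ \frac{450^2}{40}\right) $$
$$ z = \frac{1770 - 2000}{\sqrt{450^2/40}} = -3.23 $$
Since z < -z* we reject H0.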
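The franchise test can be sketched in a few lines. The function name `mean_z` is an assumption made for illustration, and, as on the slide, the sample s.d. is used in place of the unknown population s.d.

```python
from math import sqrt
from statistics import NormalDist

def mean_z(sample_mean, mu0, sd, n):
    """z statistic for H0: mu = mu0, with standard error sd / sqrt(n)."""
    return (sample_mean - mu0) / (sd / sqrt(n))

z = mean_z(1770, 2000, 450, 40)

# One-tail test at the 1% level: critical value cuts off 1% in one tail
z_star = NormalDist().inv_cdf(0.99)

# H1 is mu < 2000, so we reject when z lies in the left-hand tail
reject_h0 = z < -z_star
print(f"z = {z:.2f}, z* = {z_star:.2f}, reject H0: {reject_h0}")
```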
The Prob-value approach
Instead of comparing the test statistic to the critical value, we could
compare the prob-value to the significance level (1% in this case)
The prob-value is the area in the tail of the distribution beyond the
value of the test statistic.
In this case (z = -3.23) the prob-value is 0.0006 (0.06%, found from
the standard Normal table)
Since 0.06% < 1% we reject H0
[Figure: left-hand tail of the Normal distribution, showing 1% in the tail beyond z = -2.33 and 0.06% in the tail beyond z = -3.23]
How to reject the null hypothesis
Method 1
Test statistic > critical value (in absolute value)
3.23 > 2.33
Method 2 (prob-value)
Prob-value < significance level
0.06% < 1%
Note the different direction of the inequality! Both methods reject the null.
If in doubt, draw the diagram!
Watch out for:
Choice of significance level (5% or 1%)
One vs two tail test. If we had a two tail test, the prob-value
would be 0.12% (and compare this to 1%).
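The prob-value method, and the effect of switching from a one-tail to a two-tail test, can be illustrated as below. This is a sketch that takes z = -3.23 and the 1% significance level from the franchise example; the variable names are illustrative, and the tail areas are computed with `statistics.NormalDist` rather than looked up in a table.

```python
from statistics import NormalDist

z = -3.23                   # test statistic from the franchise example
significance_level = 0.01   # 1% level chosen for that test

std_normal = NormalDist()   # standard Normal: mean 0, s.d. 1

# One-tail prob-value: area in the left-hand tail beyond z
p_one_tail = std_normal.cdf(z)

# Two-tail prob-value: area beyond |z| in both tails, i.e. double the one-tail area
p_two_tail = 2 * std_normal.cdf(-abs(z))

# Method 2: reject H0 when the prob-value is below the significance level
print(f"one tail: {p_one_tail:.4%}, reject H0: {p_one_tail < significance_level}")
print(f"two tail: {p_two_tail:.4%}, reject H0: {p_two_tail < significance_level}")
```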
Testing the difference of two means
A sample of 40 students five years ago found an average
expenditure on text books per annum of £87 (at today's prices) with
s.d. £21. A current survey of 50 students found average
expenditure of £77 with s.d. £30. Has expenditure declined?
H0: μ1 - μ2 = 0 vs H1: μ1 - μ2 > 0
Random variable:
$$ \bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) $$
Significance level: 5%. Critical value z* = 1.64.
Test statistic:
$$ z = \frac{(87 - 77) - 0}{\sqrt{\dfrac{21^2}{40} + \dfrac{30^2}{50}}} = 1.86 $$
Decision: z > z* hence reject H0.
Or, prob-value associated with 1.86 is 3.14% < 5% hence reject.
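The text book example can be worked through in code as a check. This is a minimal sketch with a hypothetical helper `two_sample_z`; it uses the unrounded test statistic, so the prob-value may differ slightly in the last digit from one read off a table at z = 1.86.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, diff0=0.0):
    """z statistic for H0: mu1 - mu2 = diff0, using independent samples."""
    std_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - diff0) / std_error

# Sample 1: five years ago (n = 40); sample 2: current survey (n = 50)
z = two_sample_z(87, 21, 40, 77, 30, 50)

# One-tail test at 5% (H1: mu1 - mu2 > 0)
z_star = NormalDist().inv_cdf(0.95)
p_value = 1 - NormalDist().cdf(z)   # area in the right-hand tail beyond z

print(f"z = {z:.2f}, z* = {z_star:.2f}, prob-value = {p_value:.2%}")
print("reject H0" if z > z_star else "do not reject H0")
```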
The t distribution
When testing a mean with small samples, we use the t distribution
instead of the Normal.
(But note that regression coefficients follow the t distribution
whatever the sample size.)
A sample of 12 National Lottery outlets finds an average sale of 800
tickets per week, with s.d. 140. Does this suggest the original target
of 700 has been exceeded?
H0: μ = 700; H1: μ > 700
Significance level: 5%. Critical value t* = 1.796 (d.f. = 11)
Test statistic:
$$ t = \frac{800 - 700}{\sqrt{140^2/12}} = 2.47 $$
2.47 > 1.796 hence reject H0.
Alternatively, prob-value associated with 2.47 is 1.6%.
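The lottery example can be checked numerically. The Python standard library has no t distribution, so this sketch integrates the t density with Simpson's rule to approximate the tail area; the helpers `t_pdf` and `t_upper_tail` are illustrative names, and a statistics package would normally be used instead. The critical value 1.796 is the tabulated figure quoted above.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule on [0, t] (steps even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the distribution lies above 0; subtract the area between 0 and t
    return 0.5 - total * h / 3

n = 12
t_stat = (800 - 700) / (140 / sqrt(n))
t_star = 1.796                          # from tables: 5% one tail, d.f. = 11
p_value = t_upper_tail(t_stat, df=n - 1)

print(f"t = {t_stat:.2f}, prob-value = {p_value:.2%}, reject H0: {t_stat > t_star}")
```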
Estimation
An alternative approach to hypothesis testing
The sample mean or proportion is a point estimate
Around this we build a confidence interval
For the Normal distribution, the 95% CI is given by
Point estimate ± 1.96 standard errors
For the franchising example above, we have
$$ \bar{x} \pm 1.96\sqrt{\frac{s^2}{n}} = 1770 \pm 1.96\sqrt{\frac{450^2}{40}} = [1630.5,\ 1909.5] $$
The interval has a width of about 280 (roughly ±140 around the point
estimate), expressing our uncertainty.
For the t distribution, the interval is given by
Point estimate ± t* standard errors
where t* is obtained from tables, using the appropriate degrees
of freedom (d.f. = n – 1 for the mean).
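Both kinds of interval follow the same pattern, which a short sketch makes explicit. The function `confidence_interval` is a hypothetical helper. The second call applies it to the lottery sample as an illustration that is not worked in the slides, with t* = 2.201 assumed from standard t tables (95% interval, d.f. = 11).

```python
from math import sqrt

def confidence_interval(estimate, sd, n, multiplier):
    """Return (lower, upper) = estimate -/+ multiplier * standard error."""
    half_width = multiplier * sd / sqrt(n)
    return estimate - half_width, estimate + half_width

# Franchise example: large sample, so use the Normal multiplier 1.96
normal_ci = confidence_interval(1770, 450, 40, 1.96)

# Lottery example: small sample (n = 12), so use t* with d.f. = 11
# (t* = 2.201 is an assumed table value, not given in the slides)
t_ci = confidence_interval(800, 140, 12, 2.201)

print(f"franchise 95% CI: [{normal_ci[0]:.1f}, {normal_ci[1]:.1f}]")
print(f"lottery 95% CI:   [{t_ci[0]:.1f}, {t_ci[1]:.1f}]")
```

The t multiplier is larger than 1.96, so for the same standard error a small-sample interval is wider.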