One-sample t-test - Florida International University

Module 16: One-sample t-tests
and Confidence Intervals
This module presents a useful statistical tool, the
one-sample t-test and the confidence interval for the
population mean.
Reviewed 05 May 05 / Module 16
The t-test and t Distribution
What happens when we don't know the true value
for the population standard deviation σ? Suppose
we have only the information from a random
sample, n = 5, from the population of body
weights, with
x̄ = 153.0 lbs, and
s = 12.9 lbs.
When we had a value for the population parameter
σ, we used the following formula:
\[ C\left[\bar{x} - z_{0.975}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.975}\frac{\sigma}{\sqrt{n}}\right] = 0.95 \]
t-test and t distribution (contd.)
We do have an estimate of the population standard deviation,
σ, namely the sample standard deviation s = 12.9 lbs. Hence, it
seems reasonable to think that we should be able to use this
estimate in some way. It is also reasonable to think that, if we
substitute s for σ, we are substituting a guess at the truth for the
truth itself and we will probably have to pay a price for doing
so. So, what is the price?
The essence of the situation is that, when we substitute a guess
for the truth, we add noise to the system. The question then
becomes one of characterizing this noise and taking it into
account. Noise in this situation is equivalent to variability, so
we are adding variability to the system. How much and
exactly where?
t-test and t distribution (contd.)
If we use this estimate, then we must make appropriate
adjustments to the formula to account for the
variability of this estimate.
To properly account for this situation, we need to use a
distribution different from the normal distribution. The
appropriate distribution is the t distribution, which is
very similar to the normal distribution for large sample
sizes, but differs importantly for smaller samples,
especially those with n < 30.
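To make the comparison concrete, the sketch below computes the 0.975 point of the standard normal distribution and of the t distribution for a few degrees of freedom. It uses only the Python standard library. The helper names (t_pdf, t_cdf, t_quantile) are illustrative, and the t quantile is found by numerical integration and bisection rather than read from a table, so treat the digits as approximate.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=1000):
    """Area below x, using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Point below which a fraction p of the area lies (bisection search)."""
    if p < 0.5:
        return -t_quantile(1 - p, df)
    lo, hi = 0.0, 100.0  # assumed search range, wide enough for these cases
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print("normal:", round(NormalDist().inv_cdf(0.975), 3))
for df in (4, 19, 49):
    print("t, df =", df, ":", round(t_quantile(0.975, df), 3))
```

The df = 4 value should come out close to the tabled 2.776, well above the normal value of about 1.96, while the values for larger df move toward the normal one.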
Confidence Interval for µ using s
The appropriate formula, when α = 0.05, is
\[ C\left[\bar{x} - t_{0.975}(n-1)\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.975}(n-1)\frac{s}{\sqrt{n}}\right] = 0.95 \]
where t0.975(n-1) refers to the t distribution with n-1
degrees of freedom (df), specifically the point on that
distribution below which lies 0.975 of the total area.
For this situation, the correct number of degrees of
freedom is one less than the sample size, i.e. df = n - 1.
Tables for the t Distribution
To obtain the values for the t distribution, see Table
Module 2: The t distribution.
Example
For the above situation, with n = 5, x̄ = 153.0 lbs, and
s = 12.9 lbs, we have t0.975(n-1) = t0.975(4) = 2.776 so
that the interval becomes:
\[ C\left[\bar{x} - t_{0.975}(4)\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.975}(4)\frac{s}{\sqrt{n}}\right] = 0.95 \]
\[ C\left[153.0 - 2.776\,\frac{12.9}{\sqrt{5}} \le \mu \le 153.0 + 2.776\,\frac{12.9}{\sqrt{5}}\right] = 0.95 \]
\[ C[137.0 \le \mu \le 169.0] = 0.95 \]
Given this confidence interval, would you believe that
the population mean for the population from which
this sample was selected had the value µ = 170.0 lbs?
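The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function name t_interval is illustrative, and the critical value 2.776 is typed in from the slide rather than computed.

```python
import math


def t_interval(mean, s, n, t_crit):
    """Two-sided interval: mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Values from the example: n = 5, sample mean 153.0 lbs, s = 12.9 lbs
lower, upper = t_interval(153.0, 12.9, 5, 2.776)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
print("170.0 inside the interval:", lower <= 170.0 <= upper)
```

The interval rounds to 137.0 to 169.0, and 170.0 lies just outside it.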
Ten Samples from N(150,10), n = 5
Confidence Intervals
                                    Normal Distribution    t Distribution (t0.975,4 = 2.776)
Sample   n    Mean      s2      s       LL     UL Length       LL     UL Length
     1   5  147.43   88.14   9.39   138.66 156.19  17.53   135.77 159.08  23.31
     2   5  153.98  117.91  10.86   145.21 162.75  17.53   140.50 167.46  26.96
     3   5  146.50  103.66  10.18   137.74 155.27  17.53   133.86 159.14  25.28
     4   5  155.53   91.99   9.59   146.76 164.29  17.53   143.62 167.43  23.81
     5   5  147.87  149.65  12.23   139.10 156.63  17.53   132.68 163.05  30.37
     6   5  143.60   66.76   8.17   134.84 152.37  17.53   133.46 153.75  20.29
     7   5  146.87   64.23   8.01   138.11 155.64  17.53   136.92 156.82  19.90
     8   5  149.19  280.88  16.76   140.43 157.96  17.53   128.39 170.00  41.61
     9   5  150.05  200.28  14.15   141.28 158.82  17.53   132.48 167.62  35.14
    10   5  146.92  173.36  13.17   138.16 155.69  17.53   130.58 163.27  32.69
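A table of this kind can be regenerated by simulation. The sketch below draws ten samples of size 5 from N(150, 10) and computes both intervals for each. The seed is arbitrary, so the numbers will not match the table above. It also assumes, as the constant length of 17.53 suggests, that the normal-distribution intervals use the known σ = 10 while the t intervals use each sample's s.

```python
import math
import random
import statistics

MU, SIGMA, N = 150.0, 10.0, 5   # population N(150, 10), sample size 5
Z_CRIT, T_CRIT = 1.96, 2.776    # z0.975 and t0.975(4)


def summarize(sample):
    """Mean, variance, SD and both 95% intervals for one sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)               # sample SD, divisor n - 1
    z_half = Z_CRIT * SIGMA / math.sqrt(n)     # uses the known sigma
    t_half = T_CRIT * s / math.sqrt(n)         # uses the estimate s
    return {
        "mean": mean, "s2": s * s, "s": s,
        "z_ll": mean - z_half, "z_ul": mean + z_half, "z_len": 2 * z_half,
        "t_ll": mean - t_half, "t_ul": mean + t_half, "t_len": 2 * t_half,
    }


rng = random.Random(16)  # arbitrary seed, chosen only for repeatability
rows = [summarize([rng.gauss(MU, SIGMA) for _ in range(N)]) for _ in range(10)]

for i, r in enumerate(rows, start=1):
    print(f"{i:>2} {r['mean']:7.2f} {r['s']:6.2f} "
          f"{r['z_ll']:7.2f} {r['z_ul']:7.2f} {r['z_len']:6.2f} "
          f"{r['t_ll']:7.2f} {r['t_ul']:7.2f} {r['t_len']:6.2f}")
```

The normal-based length is the same in every row because it depends only on σ and n, while the t-based length changes from sample to sample with s.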
95% Confidence Intervals for samples n = 5
[Figure: the ten confidence intervals plotted by sample number (1 to 10) on a scale from 120 to 180.]
\[ C\left[\bar{x} - t_{0.975}(4)\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.975}(4)\frac{s}{\sqrt{n}}\right] = 0.95, \quad n = 5 \]
Ten Samples from N(150,10), n = 20
Confidence Intervals
                                    Normal Distribution    t Distribution (t0.975,19 = 2.09)
Sample   n    Mean      s2      s       LL     UL Length       LL     UL Length
     1  20  150.86  100.96  10.05   146.48 155.24   8.77   146.16 155.56   9.39
     2  20  146.88  122.70  11.08   142.50 151.27   8.77   141.71 152.06  10.35
     3  20  147.65  119.51  10.93   143.27 152.03   8.77   142.54 152.76  10.22
     4  20  149.37   51.07   7.15   144.99 153.75   8.77   146.03 152.71   6.68
     5  20  153.30  109.54  10.47   148.92 157.69   8.77   148.41 158.19   9.78
     6  20  152.83  111.96  10.58   148.45 157.21   8.77   147.89 157.77   9.89
     7  20  148.62   91.94   9.59   144.24 153.01   8.77   144.14 153.10   8.96
     8  20  152.16  140.83  11.87   147.77 156.54   8.77   146.61 157.70  11.09
     9  20  154.40  179.56  13.40   150.02 158.79   8.77   148.14 160.67  12.52
    10  20  151.43  115.85  10.76   147.04 155.81   8.77   146.40 156.46  10.06
95% Confidence Intervals for samples n = 20
[Figure: the ten confidence intervals plotted by sample number (1 to 10) on a scale from 120 to 180.]
\[ C\left[\bar{x} - t_{0.975}(19)\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.975}(19)\frac{s}{\sqrt{n}}\right] = 0.95, \quad n = 20 \]
Ten Samples from N(150,10), n = 50
Confidence Intervals
                                    Normal Distribution    t Distribution (t0.975,49 = 2.01)
Sample   n    Mean      s2      s       LL     UL Length       LL     UL Length
     1  50  148.79  120.98  11.00   146.02 151.57   5.54   145.67 151.92   6.25
     2  50  150.43   83.83   9.16   147.66 153.20   5.54   147.83 153.03   5.21
     3  50  150.86  108.58  10.42   148.09 153.63   5.54   147.90 153.82   5.92
     4  50  152.92  144.88  12.04   150.15 155.69   5.54   149.50 156.34   6.84
     5  50  149.68  104.21  10.21   146.91 152.45   5.54   146.78 152.58   5.80
     6  50  150.62  100.25  10.01   147.85 153.39   5.54   147.78 153.47   5.69
     7  50  150.57   74.04   8.60   147.79 153.34   5.54   148.12 153.01   4.89
     8  50  149.75   97.23   9.86   146.98 152.53   5.54   146.95 152.56   5.61
     9  50  150.68   61.77   7.86   147.91 153.45   5.54   148.44 152.91   4.47
    10  50  150.99  110.58  10.52   148.22 153.77   5.54   148.00 153.98   5.98
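The three tables show the intervals getting shorter as n grows. The sketch below recomputes the interval lengths for n = 5, 20 and 50. It takes σ = 10 for the normal-based length, and for the t-based length it uses the critical values quoted on the slides together with the s reported for sample 1 in each table. The names z_length and t_length are illustrative.

```python
import math

SIGMA = 10.0                                # population SD in N(150, 10)
Z_CRIT = 1.96                               # z0.975
T_CRIT = {5: 2.776, 20: 2.09, 50: 2.01}     # t0.975(n - 1) as quoted on the slides
S_SAMPLE_1 = {5: 9.39, 20: 10.05, 50: 11.00}  # s for sample 1 in each table


def z_length(n):
    """Length of the normal-based 95% interval with known sigma."""
    return 2 * Z_CRIT * SIGMA / math.sqrt(n)


def t_length(n, s):
    """Length of the t-based 95% interval for a sample with SD s."""
    return 2 * T_CRIT[n] * s / math.sqrt(n)


for n in (5, 20, 50):
    print(n, round(z_length(n), 2), round(t_length(n, S_SAMPLE_1[n]), 2))
```

The normal-based lengths reproduce the constant Length column of each table (17.53, 8.77, 5.54), and the t-based lengths reproduce the sample 1 entries (23.31, 9.39, 6.25).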
95% Confidence Intervals for samples n = 50
[Figure: the ten confidence intervals plotted by sample number (1 to 10) on a scale from 120 to 180.]
\[ C\left[\bar{x} - t_{0.975}(49)\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.975}(49)\frac{s}{\sqrt{n}}\right] = 0.95, \quad n = 50 \]
Hypothesis Testing: x̄ = 153.0 lbs, s = 12.9 lbs
A random sample of n = 5 measurements of weights from
a population provides a sample mean of x̄ = 153.0 lbs and
a sample standard deviation of s = 12.9 lbs. Is it likely
that the population mean has the value µ = 170 lbs?
1. The hypothesis:
H0: µ = 170 versus H1: µ ≠ 170
2. The assumptions:
Random sample from a normal
distribution
3. The α-level:
α = 0.05
4. The test statistic:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
5. The critical region:
Reject H0: µ = 170 if the value
calculated for t is not between
±t0.975(4) = ±2.776
6. The result:
\[ t = \frac{153.0 - 170.0}{12.9/\sqrt{5}} = \frac{-17.0}{5.77} = -2.95 \]
7. The conclusion:
Reject H0: µ = 170 since the
value calculated for t is not
between ±2.776.
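Steps 4 through 7 can be sketched in Python as follows. The function name one_sample_t is illustrative, and the critical value 2.776 is taken from the slide rather than computed.

```python
import math


def one_sample_t(mean, mu0, s, n):
    """Test statistic t = (mean - mu0) / (s / sqrt(n))."""
    return (mean - mu0) / (s / math.sqrt(n))


T_CRIT = 2.776  # t0.975(4), two-sided test at alpha = 0.05

t_stat = one_sample_t(mean=153.0, mu0=170.0, s=12.9, n=5)
reject = abs(t_stat) > T_CRIT  # outside +/- 2.776 means reject H0

print(f"t = {t_stat:.2f}")
print("reject H0: mu = 170:", reject)
```

The statistic rounds to -2.95, which falls outside ±2.776, so the sketch reaches the same conclusion as step 7.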
This test was performed under the assumption that
µ = 170. Our conclusion is that our sample mean x̄ =
153.0 is so far away from µ = 170 that we find it hard
to believe that µ = 170. That is, our observed value for
the sample mean of x̄ = 153.0 is too rare for us to
believe that µ = 170.
Question:
How rare is x̄ = 153.0 under the assumption that
µ = 170?
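One common way to put a number on "how rare" is a two-sided p-value: the probability, if µ = 170 were true, of a t statistic at least as far from zero as the one observed. The transcript does not include the module's own answer, so the sketch below is only one possible reading of the question. It integrates the t density numerically using the standard library, and the helper names t_pdf and upper_tail are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T >= |t|): 0.5 minus the Simpson's-rule area on [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Observed statistic from the example: n = 5, so df = 4
t_stat = (153.0 - 170.0) / (12.9 / math.sqrt(5))
p_two_sided = 2 * upper_tail(t_stat, df=4)

print(f"t = {t_stat:.3f}, two-sided p-value = {p_two_sided:.3f}")
```

The p-value comes out a little below 0.05, consistent with rejecting H0 at α = 0.05.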