The t distribution
Daniel S. Yates
The Practice of Statistics
Third Edition
Chapter 11:
Inference for Distributions
Copyright © 2008 by W. H. Freeman & Company
• Same as Chapter 10 (inference for a population mean μ), except more realistic.
• Both population parameters μ and σ are unknown.
• σ is estimated by the sample standard deviation s.
• The standard deviation of the sample mean is estimated by s/√n.
• s/√n is called the standard error of the sample mean x-bar.
The t distribution
• When s/√n is used in place of σ/√n, the statistic that results is called the t statistic: t = (x-bar − μ)/(s/√n).
t distribution
• Similar in shape to standard normal
curve.
• Symmetric about zero.
• Spread of the t distribution is greater than that of the normal distribution.
• As the degrees of freedom k increase, the t(k) distribution approaches N(0,1).
• s estimates σ more accurately as the sample size increases.
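A minimal sketch of the last few bullets, assuming only the Python standard library. The helper names t_pdf and t_upper_tail are illustrative (not from the slides); the tail area is approximated with Simpson's rule rather than read from a table. It compares the upper-tail probability beyond 2 for several degrees of freedom with the standard normal value.

```python
import math
from statistics import NormalDist


def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))


def t_upper_tail(t, k, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    # By symmetry about zero, half the area lies to the right of 0.
    return 0.5 - total * h / 3


normal_tail = 1 - NormalDist().cdf(2.0)
for k in (2, 5, 30, 1000):
    print(f"t({k}): P(T > 2) = {t_upper_tail(2.0, k):.4f}")
print(f"N(0,1): P(Z > 2) = {normal_tail:.4f}")
```

The tail areas shrink toward the normal value as k grows, which is the "greater spread" and "approaches N(0,1)" behavior described above.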
[Figure: t distribution; labels: upper-tail probability, t statistic]
Problem 11.10
Level of phosphate in the blood tends to vary normally over time. The following are data for one patient on six visits: {5.6, 5.1, 4.6, 4.8, 5.7, 6.4}. Construct a 90% CI.
x-bar = 5.37, s = 0.67, n = 6, df = 5, t* = 2.015
5.37 ± 2.015(0.67/√6)
(4.82, 5.91)
We are 90% confident that the mean level of phosphate in the patient's blood is between 4.82 and 5.91 mg/dl.
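The arithmetic in Problem 11.10 can be reproduced with a short sketch, assuming only the Python standard library. The critical value t* = 2.015 is taken from the slide (df = 5, 90% confidence) rather than computed; the variable names are illustrative.

```python
import math
import statistics

# Phosphate levels (mg/dl) from the six visits in Problem 11.10.
phosphate = [5.6, 5.1, 4.6, 4.8, 5.7, 6.4]

n = len(phosphate)
x_bar = statistics.mean(phosphate)
s = statistics.stdev(phosphate)      # sample standard deviation (divides by n - 1)
std_error = s / math.sqrt(n)         # standard error of the sample mean

t_star = 2.015                       # table value for df = 5, 90% confidence
margin = t_star * std_error
ci = (x_bar - margin, x_bar + margin)

print(f"x-bar = {x_bar:.2f}, s = {s:.2f}, SE = {std_error:.3f}")
print(f"90% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Working from the unrounded x-bar and s gives the same interval as the slide to two decimal places.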
Problem 11.12
The yields in pounds of two varieties of tomatoes are compared. Each variety is grown on one half of each of 10 plots of land. The 10 differences (variety A − variety B) give x-bar = 0.34 and s = 0.83. Is there convincing evidence that variety A has the higher yield?
μ = μA − μB (the mean difference in yield)
Ho: μ = 0, no difference in yield
Ha: μ > 0, variety A has the larger yield
df = 9
t = (0.34 − 0)/(0.83/√10) = 1.295
P(t > 1.295) = 0.114
There is insufficient evidence at the α = 0.05 level to reject the null hypothesis that the yields of the two varieties of tomatoes are the same. If the true mean difference were zero, 11.4% of all the samples of size 10 that could have been taken would give a result at least as extreme as this one.
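A rough check of the Problem 11.12 calculation, assuming only the Python standard library. The P-value is approximated by numerically integrating the t density with Simpson's rule; a table or calculator would normally be used instead. The helper names t_pdf and t_upper_tail are illustrative.

```python
import math


def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))


def t_upper_tail(t, k, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 - total * h / 3


# Summary statistics of the 10 paired differences (A - B) from the problem.
n, mean_diff, sd_diff = 10, 0.34, 0.83
df = n - 1

# One-sample t statistic on the differences, testing Ho: mu = 0.
t_stat = (mean_diff - 0) / (sd_diff / math.sqrt(n))
p_value = t_upper_tail(t_stat, df)   # one-sided, Ha: mu > 0

print(f"t = {t_stat:.3f}, df = {df}, P-value = {p_value:.3f}")
print("reject Ho at alpha = 0.05" if p_value < 0.05 else "fail to reject Ho at alpha = 0.05")
```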
Comparing two Means
Two-Sample Problems
• The goal is to compare the responses to two treatments or to compare the characteristics of two populations.
• There is a separate sample from each treatment or population. There is no matching of units in the two samples. The two samples can be of different sizes.
Assumptions for comparing two means
• We have two SRSs from two distinct populations.
• The samples are independent.
• Both populations are normally distributed. The means and standard deviations of the populations are unknown.
• There are four unknown parameters, each estimated by a sample statistic:
Parameter    Statistic
μ1           x-bar1
μ2           x-bar2
σ1           s1
σ2           s2
We may want to compare the two population means.
1) Confidence interval for μ1 − μ2
2) Hypothesis test of Ho: μ1 = μ2
Two sample t procedures
Confidence interval
• Draw an SRS of size n1 from a normal population with unknown mean μ1.
• Draw an SRS of size n2 from a normal population with unknown mean μ2.
(x-bar1 − x-bar2) ± t*·√(s1²/n1 + s2²/n2)
Use df = the smaller of (n1 − 1) and (n2 − 1); the TI-83/84 will calculate more precise degrees of freedom.
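A minimal sketch of the two-sample interval, assuming only the Python standard library. The summary statistics and the function names are hypothetical, chosen for illustration only. The welch_df function uses the Welch-Satterthwaite approximation, on the assumption that this is the "more precise" value the calculator reports.

```python
import math


def conservative_df(n1, n2):
    """Conservative degrees of freedom: smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed to match the calculator)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def two_sample_ci(x_bar1, s1, n1, x_bar2, s2, n2, t_star):
    """(x-bar1 - x-bar2) +/- t* * sqrt(s1^2/n1 + s2^2/n2)."""
    std_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x_bar1 - x_bar2
    return diff - t_star * std_error, diff + t_star * std_error


# Hypothetical summary statistics, not taken from the slides.
x_bar1, s1, n1 = 12.0, 3.0, 10
x_bar2, s2, n2 = 10.0, 4.0, 15

df = conservative_df(n1, n2)   # 9
t_star = 2.262                 # table value for df = 9, 95% confidence
low, high = two_sample_ci(x_bar1, s1, n1, x_bar2, s2, n2, t_star)

print(f"conservative df = {df}, Welch df = {welch_df(s1, n1, s2, n2):.1f}")
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

The conservative df gives a slightly wider interval than the Welch value would, which is why it is safe to use by hand.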
Hypothesis test
Test Ho: μ1 = μ2
t = [(x-bar1 − x-bar2) − (μ1 − μ2)] / √(s1²/n1 + s2²/n2)
which, since μ1 − μ2 = 0 under Ho, becomes
t = (x-bar1 − x-bar2) / √(s1²/n1 + s2²/n2)
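The two-sample t statistic can be sketched as a small function, assuming only the Python standard library. The function name, the null_diff parameter, and the summary statistics are hypothetical, for illustration only.

```python
import math


def two_sample_t(x_bar1, s1, n1, x_bar2, s2, n2, null_diff=0.0):
    """Two-sample t statistic; null_diff is the value of mu1 - mu2 under Ho."""
    std_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((x_bar1 - x_bar2) - null_diff) / std_error


# Hypothetical summary statistics, not taken from the slides.
t_stat = two_sample_t(12.0, 3.0, 10, 10.0, 4.0, 15)
print(f"t = {t_stat:.3f}")
```

With null_diff left at 0, the function computes the simplified form of the statistic; the result is then compared against the t distribution with df equal to the smaller of (n1 − 1) and (n2 − 1).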
General considerations when answering inference questions
1) Four important questions.
Is the question a confidence interval or a hypothesis test?
Is the question regarding one sample or two? Matched pairs?
Does the question involve means (x-bar) or proportions (p-hat)?
Should you use the z statistic or the t statistic?
2) Follow the four-step procedure (HAMC) for a hypothesis test or confidence interval.
• H: State the hypotheses, null and alternative.
• A: Identify the type of test and check the assumptions (IRONS):
    I: Independence of samples
    R: Random selection; SRS
    O: Check for outliers
    N: Is the data normally distributed?
    S: Sample size: np > 10 and n(1 − p) > 10; 10n < population size; large sample
• M: Do the math; show the formula and the calculation.
• C: State the conclusion in context.
For a confidence interval, the last three steps (A, M, C) must be done.
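The sample-size checks for proportions listed under S can be written as a small helper. This is a sketch only; the function name and the example numbers are hypothetical.

```python
def proportion_conditions(n, p, population_size):
    """Check the sample-size conditions: np > 10, n(1 - p) > 10, 10n < population size."""
    return {
        "np > 10": n * p > 10,
        "n(1 - p) > 10": n * (1 - p) > 10,
        "10n < population size": 10 * n < population_size,
    }


# Hypothetical example: a sample of 100 from a population of 5000 with p = 0.3.
for condition, ok in proportion_conditions(100, 0.3, 5000).items():
    print(f"{condition}: {'met' if ok else 'NOT met'}")
```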