next lectures (posted 3/22/04)
Download
Report
Transcript next lectures (posted 3/22/04)
n >30 or so for means, np and n(1-p) both > 5 for proportions
Review:
Large Sample Confidence Intervals
• 1-a confidence interval for a mean:
x +/- za/2 s/sqrt(n)
• 1-a confidence interval for a proportion:
p +/- za/2 p(1-p)/sqrt(n)
• 1-a confidence interval for the difference
between two means:
x1 – x2 +/- za/2 sqrt(s21/n1+s22/n2)
In General:
Estimate (that is normally
distributed via the Central +/Limit Theorem)
(
)
standard deviation
Za/2
of estimate
This gives an interval: (Lower Bound , Upper Bound)
Interpretation: This is a plausible range for the true value of
the number that we’re estimating. a is a tuning parameter
for level of plausibility: smaller a = more conservative estimate.
np and n(1-p) > 5 for all p’s…
Large Sample Confidence Intervals
• 1-a confidence interval for difference
between two proportions:
p1-p2 +/- za/2 sqrt[(p1(1-p1)/n1)+(p2(1-p2)/n2)]
Designing an Experiment and
Choosing a Sample Size
• Example: Compare the shrinkage in a tumor due to a
“new” cancer treatment relative to standard treatment
• 100 patients randomly assigned to “new” treatment or
standard treatment
xinew = reduction in tumor size for person i under new
treatment
xjstd = reduction in tumor size for person j under std
treatment
xnew and s2new
xstd and s2std
Mean and sample variance of the changes
in size for the new and standard treatments
Suppose the data are:
xnew = 25.3
snew = 2.0
xstd = 24.8
sstd = 2.3
95% Confidence Interval for difference:
x1 – x2 +/- za/2 sqrt(s21/n1+s22/n2)
= 0.5 +/- 0.84
What can we conclude?
• There’s no difference?
• Can’t see a difference?
• There’s a difference, but it’s too
small to care about?
There is a difference between:
• Can’t see a difference
Situation for
Cancer example
• There’s no difference
(In cancer experiment, we can assume we care about small differences.)
• Can’t see a difference (that is big enough to care
about) = wasted experiment
• AVOID / PREVENT THE WASTE AND
ASSOCIATED TEARS
USE SAMPLE SIZE PLANNING
Sample Size Planning
• Length of a 1-a level confidence interval
is:
“2 za/2 std deviation of estimate”
2za/2s/sqrt(n)
2za/2p(1-p)/sqrt(n)
2za/2sqrt((s21/n1)+(s22/n2))
2za/2sqrt[(p1(1-p1)/n1)+(p2(1-p2)/n2)]
1. Suppose we want a 95% confidence
interval no wider than W units.
2. a is fixed. Assume a value for the
standard deviation (or variance) of the
estimator.
3. Solve for an n (or n1 and n2) so that the
width is less than W units.
4. When there are two sample sizes (n1 and
n2), we often assume that n1 = n2.
Cancer example
• Let W = 0.1. Want 95% CI for difference
between means with width less than W.
• Suppose s2new = s2std = 6 (conservative
guess)
W > 2za/2sqrt((s2new/n1)+(s2std/n2))
0.1 > 2(1.96)sqrt(6/n + 6/n)
Book’s B = our W/2
0.1 > 3.92sqrt(12/n)
0.01 > (3.922)12/n
n > 18439.68 (each group…)
Hypothesis testing and p-values
(Chapter 9)
We used confidence intervals in two ways:
1. To determine an interval of plausible values for
the quantity that we estimate.
Level of plausibility is determined by 1-a. 90%
(a=0.1) is less conservative than 95% (a=0.05) is
less conservative than 99% (a=0.01)...
2. To see if a certain value is plausible in light of
the data:
If that value was not in the interval, it is not plausible
(at certain level of confidence). Zero is a common
certain value to test, but not the only one.
Hypothesis tests address the second use directly
Example: Dietary Folate
100
• Data from the Framingham Heart Study
80
n = 333 Elderly Men
60
Mean = x = 336.4
Count
Std Dev = s = 193.4
0
20
40
Can we conclude that
the mean is greater than 300 at
5% significance?
(same as 95% confidence)
0
200
400
600
800
1000
1200
Dietary Folate (micrograms / day, calorie adjusted to 2000 calorie diet)
Five Components of the Hypothesis test:
1. Null Hypothesis = “What we want to disprove”
= “H0” = “H not”
= Mean dietary folate in the population
represented by these data is <= 300.
= m <= 300
2. Alternative Hypothesis
= “What we want to prove”
= “HA”
= Mean dietary folate in the population
represented by these data is > 300.
= m > 300
3. Test Statistic
To test about a mean with a large sample test, the
statistic is z = (x – m)/(s/sqrt(n))
(i.e. How many standard deviations (of X) away from
the hypothesized mean is the observed x?)
4. Significance Level of Test, Rejection Region, and P-value
Next page
5. Conclusion
Reject H0 and conclude HA if test stat is in rejection region.
Otherwise, “fail to reject” (not same as concluding H0 – can only cite a
“lack of evidence”
(think “innocent until proven guilty”)
(Equivalently, reject H0 if p-value is less than a.)
• Significance Level: a=1% or 5% or 10%... (smaller is more
conservative) (Significance = 1-Confidence)
• Rejection Region:
– Reject if test statistic in rejection region.
– Rejection region is set by:
• Assume H0 is true “at the boundary”.
• Rejection region is set so that the probability of seeing the observed test
statistic or something further from the null hypothesis is less than or equal to
a
• P-value
– Assume H0 is true “at the boundary”.
– P-value is the probability of seeing the observed test statistic or
something further from the null hypothesis.
– = “observed level of significance”
Note that you reject if the p-value is less than a.
(Small p-values mean “more observed significance”)
Example:
• H0: m<=300, HA: m>300
• z (x-m)/(s/sqrt(n))
= (336.4 – 300)/(193.4/sqrt(333))
= 3.43
• Significance level = 0.05
• When H0 is true, Z~N(0,1). As a result, the cutoff
is z0.05=1.645. (Pr(Z>1.645) = 0.05.)
• P-value = Pr(Z>3.43 when true mean is 300) =
0.0003
• Reject. Mean is greater than 300.
• Would you reject at significance level 0.0001?
Picture
Distribution of
Z = (X – 300)/(193.4/sqrt(333))
when true mean is 300.
0.2
Rejection region
0.1
Observed
Test Statistic
0.0
Density
0.3
0.4
Test statistic
-4
-2
0
2
4
3.43
1.645
Area to right of 3.43
Area to right of 1.645
=0.0003 = p-value
=0.05 = sig level
Test Statisistic
One Sided versus Two Sided Tests
• Previous test was “one sided” since we’d
only reject if the test statistic is far enough
to “one side” (ie. If z > z0.05)
• Two sided tests are more common (my
opinion):
H0: m=0, HA: m does not equal 0
Two Sided Tests (cntd)
Test Statistic (large sample test of mean)
z = (x – m)/(s/sqrt(n))
Rejection Region:
reject H0 at signficance level a if |z|>za/2
i.e. if z>za/2 or z<-za/2
Note that this “doubles” p-values. See next example.
Example:
• H0: m=300, HA: m doesn’t equal 300
• z=(x-m)/(s/sqrt(n))
= (336.4 – 300)/(193.4/sqrt(333))
= 3.43
• Significance level = 0.05
• When H0 is true, Z~N(0,1). As a result, the cutoff
is z0.025=1.96. (Pr(|Z|>1.96)=2*Pr(Z>1.96)=0.05
• P-value = Pr(|Z|>3.43 when true mean is 300) =
Pr(Z>3.43) + Pr(Z<-3.43) = 2(0.0003)=0.0006
• Reject. Mean is not equal to 300.
• Would you reject at significance level 0.0005?
Picture
Distribution of
Z = (X – 300)/(193.4/sqrt(333))
when true mean is 300.
0.4
Test statistic
Rejection region
0.2
Rejection region
0.0
0.1
Density
0.3
Sig level = area to right of 1.96
+ area to the left of -1.96 = 0.05=a
-4
-2
-3.43
Area to left of -3.43
=0.0003
1.96
0
Test Statisistic
2
1.96
Pvalue=0.0006=Pr(|Z|>3.43)
4
3.43
Area to right of 3.43
=0.0003
Power and Type 1
and Type 2 Errors
Action
H0 True
Fail to Reject H0
Reject H0
correct
Type 1
error
Significance level = a
=Pr( Making type 1 error )
Truth
HA True
Type 2
error
correct
Power =
1–Pr( Making type 2 error )
• Assuming H0 is true, what’s the probability
of making a type I error?
• H0 is true means true mean is m0.
• This means that the test statistic has a
N(0,1) distribution.
• Type I error means reject which means
|test statistic| is greater than za/2.
• This has probability a.