Transcript Lecture 6

Comparison of Two Conditions
for Continuous Measurements
Lecture 6
Description of Problem
Suppose that you are interested in the
effect of a binary variable on a
continuous variable
• Diesel exhaust vs. clean air
→ White Blood Cell count
• Special instruction vs. a control condition
→ test scores that measure students’
understanding of abstract ideas about risk
Two Variables of Interest
The Predictor is the variable that is being
used to “predict” the value of another
variable.
• Also commonly referred to as the independent
variable or the explanatory variable.
• For now, we assume this variable is binary or
dichotomous, i.e., a variable with two levels.
The Response is the outcome variable in
whose value we are interested.
• Also commonly referred to as the dependent
variable.
• For now, we assume this variable is continuous.
WARNING! The use of this terminology does
not imply actual causation!
Two Study Designs
Independent Groups
• Subjects at each level of the predictor
are sampled independently of one
another
• The levels of the predictor may be automatic
(such as gender) or subjects may be randomly
assigned to a “group” (type of instruction)
Paired Samples
Individual subjects are measured at both levels of
the predictor.
OR
Subjects are paired such that one member of the
pair is at one level of the predictor and the other
member of the pair is at the other.
• The levels of the predictor may be automatic (such as
before/after an exam or mother/daughter) or may be
randomly assigned to each unit of the pair (special
instruction/control)
• Pairs are often put together based on demographic
information, such as age, gender, ethnicity, etc. or are
natural pairings such as in the case of family linkages.
Hypotheses in Words
H0: Predictor has no effect
or
H0: Group means are equal
versus
HA: Predictor has an effect
(may specify which level is associated with
a larger mean)
or
HA: Group means are unequal
Hypotheses in Statistical Notation
H0: Predictor has no effect/Group means are equal
translates to
H0: μa = μb
or equivalently
H0: μa - μb = 0
VERSUS
HA: Predictor has an effect/Group means are unequal
translates to
HA: μa ≠ μb
OR μa > μb OR μa < μb
or equivalently
HA: μa - μb ≠ 0
OR μa - μb > 0 OR μa - μb < 0
Example – Two Independent
Groups
“Risperidone improves disruptive behavior in
children with low IQ. … Dr. Michael Aman
from Ohio State University, Columbus, and
colleagues randomly assigned 118 children
… with the atypical antipsychotic at mean
dose of 0.98 mg/kg daily for 6 weeks or
placebo. Intention-to-treat analysis found
that children receiving risperidone had a
significant 47.3% reduction on the Conduct
Problem subscale of the NCBRF compared
with children receiving placebo (20.9%).”
Indep.Groups –
Risperidone Example
Let’s start the analysis by comparing the information
presented at baseline to demonstrate that the two
groups, into which children were randomized, did not
have significantly different conduct problem scores at
baseline. (See Table 1, Am J Psychiatry 2002;159:1337)
To study the differences between groups, we are
interested in the quantity μa - μb, where μa is the
mean of conduct problem scores for the placebo
group and μb is the mean for the risperidone group.
The sample means are both approximately normally
distributed and are independent of one another.
• Normality of the sampling distribution of the sample means
follows by the CLT (na=63, nb=55)
• Independence of the sample means follows from the fact that
the groups contain subjects that are independent of one
another.
Example – Paired Sample
Description
• Each member of 11 heterosexual double-income
couples (in which both members
were between the ages of 30 and 40
years) was questioned about the average
amount of time that they spent reading the
newspaper or watching the news on TV.
Do women and men spend different
amounts of time catching up on the news
via the newspaper and TV?
Paired News Example
To study the differences between groups
(genders), we are interested in the quantity
μa - μb, where μa is the mean time for females and
μb is the mean for males.
Are the sample means both approximately
normally distributed?
• Only if the original population distributions are assumed
normal. That is, the times of all females in double-income
couples must approximately follow a normal distribution.
The same is true for the times of all males.
The sample means are approximately normally
distributed, but are NOT independent of one
another.
• The times spent catching up on the news by a man and a
woman in the same couple would be expected to be associated
with each other.
Components needed for
inference
When the sampling distribution of the
estimate is approximately normal,
Confidence Interval is of the form
Estimate ± (Critical value x Std.Error)
Test Statistic is of the form
(Estimate - Null value) / Std.Error
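Hence, for each study design (independent
groups or paired), we need to
1) Find the estimate of μa - μb
2) Find the standard error of the estimate
3) Determine the sampling distribution of the estimate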
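The two general forms above can be sketched in a few lines of Python. The function names and the numbers in the usage lines are illustrative assumptions, not part of the lecture.

```python
def confidence_interval(estimate, critical_value, std_error):
    """Return (lower, upper) for Estimate ± (Critical value x Std.Error)."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin


def standardized_statistic(estimate, null_value, std_error):
    """Return (Estimate - Null value) / Std.Error."""
    return (estimate - null_value) / std_error


# Made-up numbers, for illustration only.
lower, upper = confidence_interval(estimate=2.0, critical_value=1.96, std_error=0.5)
stat = standardized_statistic(estimate=2.0, null_value=0.0, std_error=0.5)
print(lower, upper, stat)
```

Only the estimate, the standard error and the critical value change from one study design to the next; the two formulas stay the same.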
Paired Sample
#1: Estimate of μa - μb
The Estimate of μa - μb is
(Sample mean a – Sample mean b)
For paired samples, this is equivalent to
the sample mean of the pair differences, i.e.,
D = Ya – Yb
Ya = measurement from member a of each pair
Yb = measurement from member b of each pair
If there are n pairs in the study, there are a total of
n observed D’s.
Paired Sample
#2 cont.: Standard Error
Since we can calculate the estimate of the
difference of population means as the average
of the differences (the D’s),
we can calculate the standard error of the
estimate as the standard error of an average.
Hence, the standard error is given by
sD / sqrt(npairs)
where
$$s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{n_{pairs} - 1}}$$
Paired Differences Example –
Statistics
Couple         1     2     3     4     5     6     7     8     9    10    11
Wife         0.4   0.5   1.0   0.2   0.9   1.0   1.2   0.1   0.6   0.4   0.2
Husband      0.5   0.4   0.7   0.0   0.6   1.2   0.7   0.1   0.5   0.1   0.1
Difference  -0.1   0.1   0.3   0.2   0.3  -0.2   0.5   0.0   0.1   0.3   0.1
• Group averages are 0.591 and 0.446
• The difference of these averages is 0.146
• The average of the differences is also 0.146
• Standard Deviation of the 11 differences is 0.202
• The standard error of the average of the differences is
0.202/sqrt(11) = 0.061
• (The standard deviations for each of the groups are
0.378 and 0.359.)
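These summary statistics can be recomputed from the eleven wife and husband measurements on the slide. The sketch below uses only the Python standard library; the variable names are my own.

```python
import math
import statistics

# Time spent on the news (hours), one entry per couple, as listed on the slide.
wife = [0.4, 0.5, 1.0, 0.2, 0.9, 1.0, 1.2, 0.1, 0.6, 0.4, 0.2]
husband = [0.5, 0.4, 0.7, 0.0, 0.6, 1.2, 0.7, 0.1, 0.5, 0.1, 0.1]

# D = Ya - Yb for each pair.
diffs = [w - h for w, h in zip(wife, husband)]
n_pairs = len(diffs)

mean_diff = statistics.mean(diffs)        # average of the differences
sd_diff = statistics.stdev(diffs)         # sample SD (divides by n_pairs - 1)
se_diff = sd_diff / math.sqrt(n_pairs)    # standard error of the average difference

# The difference of the group averages equals the average of the differences.
diff_of_means = statistics.mean(wife) - statistics.mean(husband)

print(f"mean diff = {mean_diff:.4f}, sD = {sd_diff:.4f}, SE = {se_diff:.4f}")
```

The results agree with the slide values up to rounding in the third decimal place.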
Paired Differences Example – #3
What’s the Sampling Distribution?
• Estimate and SE are calculated the same as in the
one-sample case. Therefore, to determine the
sampling distribution, we need to consider the
sample size or the population distribution.
Histograms
[Figure: histogram of the differences; x-axis: diff, y-axis: Count]
#3 Sampling Distribution for
Paired Samples
• If the sample size is large, then the sampling
distribution of the averaged differences is
approximately normal (CLT).
• If the sample size is small and the distribution
of differences is approximately normal, then the
sampling distribution is the t-distribution with
npairs-1 degrees of freedom.
• Confidence intervals and p-values for
hypothesis tests for paired samples (when
the outcome is continuous) are calculated the
same way as for one-group samples.
Independent Groups Sample
#1: Estimate of μa - μb
Just take the difference of the two
averages!
Average of Conduct problem scores at
Baseline:
Placebo Group: 34.5
Risperidone Group: 32.9
Difference: 34.5 - 32.9 = 1.6
#2: Standard Error for
Two Independent Groups
FACT: The variance of the difference
(or sum) of two independent normal
random variables is the sum of the two
variances.
#2, cont.: Std.Error, 2 Groups
Recall: The Standard Error is the estimate of
the standard deviation of the sampling
distribution of the sample mean.
The true standard error of the difference of
the two sample means is
$$\text{Std.Error of Difference} = \sqrt{\frac{\sigma_a^2}{n_a} + \frac{\sigma_b^2}{n_b}}$$
• How this is estimated in practice
depends on whether σa and σb are
assumed to be equal.
#2 cont.: Special Cases of
Std.Error, 2 Groups
Easiest case: σa ≠ σb
The standard error is estimated by replacing σa
and σb by their respective sample standard deviations,
sa and sb, in the equation given on the previous slide.
Harder case: Assume σa = σb = σ
The common value of the standard deviation, σ, is
estimated by
$$s_{pooled} = \sqrt{\frac{(n_a - 1) s_a^2 + (n_b - 1) s_b^2}{n_a + n_b - 2}}$$
and replaces both σa and σb on the previous slide.
Two Group Example –
Standard Deviations
Are standard deviations for the two
groups equal?
• The paper gives these values to be 6.9 and 7.6 for
the placebo and risperidone children, respectively.
• Just eyeing them, I can’t tell if the differences are
due to random chance or not.
• We will try both tests (one for equal variances and
one for unequal variances)
Cont. (unequal variances)
If we assume unequal variances, then
SE = sqrt(variancea/na + varianceb/nb)
= sqrt(6.9^2/63 + 7.6^2/55)
= 1.344
Cont. (equal variances)
If we assume equal variances, then first
calculate the pooled estimate of the std. dev.
$$s_{pooled} = \sqrt{\frac{62 s_a^2 + 54 s_b^2}{63 + 55 - 2}} = \sqrt{\frac{62(47.61) + 54(57.76)}{63 + 55 - 2}} = 7.234$$
And, SE = sqrt(SDpooled^2/na + SDpooled^2/nb)
= sqrt(7.234^2/63 + 7.234^2/55)
= 1.335
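Both standard errors for the risperidone baseline comparison can be reproduced from the group sizes and standard deviations quoted in the lecture. This is a minimal sketch with variable names of my own choosing.

```python
import math

# Baseline summary statistics quoted in the lecture.
n_a, s_a = 63, 6.9   # placebo group: size, sample SD
n_b, s_b = 55, 7.6   # risperidone group: size, sample SD

# Unequal variances: plug each sample variance into the SE formula.
se_unequal = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)

# Equal variances: pool the two sample variances first.
s_pooled = math.sqrt(
    ((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / (n_a + n_b - 2)
)
se_equal = s_pooled * math.sqrt(1 / n_a + 1 / n_b)

print(f"SE (unequal) = {se_unequal:.3f}")
print(f"s_pooled = {s_pooled:.3f}, SE (equal) = {se_equal:.3f}")
```

The two standard errors are close here because the sample standard deviations and the group sizes are similar.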
#3: Sampling Distribution of the
Estimate – 2 Independent Groups
The estimate of the difference of the means
divided by its standard error
• Follows a normal distribution if the sample size is
large enough
• Follows a t-distribution with na+nb-2 degrees of
freedom if the variances are equal and the samples come
from normal distributions
• Follows a mixture of t-distributions if the variances
are unequal and the samples come from normal
distributions
Background: Sampling
Distributions of the Estimates
FACT: “Linear combinations” of normally distributed
variables are also normally distributed.
CONSEQUENCE: If the original measurements follow a
normal distribution, then any estimate composed of
these measurements that is a sum, a difference, an
average, a difference of averages or an average of
differences will also follow a normal distribution.
Examples:
• Average for Single Group (if sample size is small and original
measurements come from a normal distribution or, by CLT, if
sample size is large)
• Independent Groups: Difference of averages
• Paired Samples: Average of differences
Choosing Critical Values
for CI’s and HT’s
Paired Sample
• Large Sample: Gaussian Table
• Small Sample: t-distribution with (npairs-1) d.f.
Two Independent Groups
• Large Sample: Gaussian Table
• Small Sample:
  • Population variances unequal – related to t-distribution
  • Pop’n variances equal – t-dist’n with (na+nb-2) d.f.
Paired Differences Example
95% Confidence Interval
Assumptions for CI
Independent, random sample of pairs,
Differences have approximately normal pop’n
distribution
Confidence Interval (using t-distribution 10 d.f.)
Avg Diff +/- 2.228 SE
0.146 +/- 2.228 (0.061)
(0.010, 0.282)
Interpretation
We are 95% confident that the difference in the time
spent reading or watching the news between women and
men is between 0.010 and 0.282 hours (with women
spending more time). (between 0.6 and 16.9 minutes)
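The interval can also be recomputed from the raw paired data rather than from the rounded summary values. The sketch below takes the critical value 2.228 from the t-table as given; the variable names are illustrative.

```python
import math
import statistics

wife = [0.4, 0.5, 1.0, 0.2, 0.9, 1.0, 1.2, 0.1, 0.6, 0.4, 0.2]
husband = [0.5, 0.4, 0.7, 0.0, 0.6, 1.2, 0.7, 0.1, 0.5, 0.1, 0.1]
diffs = [w - h for w, h in zip(wife, husband)]

t_critical = 2.228  # two-sided 95% critical value, t-distribution with 10 d.f.

mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))

ci_lower = mean_diff - t_critical * se_diff
ci_upper = mean_diff + t_critical * se_diff

# Same interval expressed in minutes instead of hours.
ci_minutes = (60 * ci_lower, 60 * ci_upper)

print(f"95% CI: ({ci_lower:.4f}, {ci_upper:.4f}) hours")
print(f"        ({ci_minutes[0]:.1f}, {ci_minutes[1]:.1f}) minutes")
```

With unrounded inputs the interval is about (0.0099, 0.2810), matching the computer output shown later. The hand calculation differs slightly in the third decimal because it starts from rounded values.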
Paired Differences Example
Hypothesis Test
Study Hypothesis: Men and women spend
different amounts of time reading/watching the news.
1. Assumptions:
Independent, random sample of pairs,
Differences have approximately normal pop’n
distribution
2. Hypotheses:
H0: μa – μb = 0 vs. HA: μa – μb ≠ 0
3. Test Statistic:
t = (Avg.Diff-0)/SE = 0.146/0.061 = 2.393
Interpretation:
Paired Differences HT, cont.
4. Critical Value: t-table, 2-sided, 10 d.f.
→ 2.228
P-value: From computer → 0.038
5. Conclusion: With a p-value of 0.038,
there is some evidence to suggest
that, on average, men and women do
spend different amounts of time either
watching or reading the daily news.
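The p-value is quoted "from computer". As a rough illustration of what the computer does, the sketch below integrates the t density numerically with Simpson's rule, using only the standard library. The helper names are hypothetical, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=10_000):
    """Return 2 * P(T > |t|); steps must be even for Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3      # P(0 < T < |t|)
    return 1.0 - 2.0 * area_zero_to_t


# t statistic from the computer output (2.390), 10 degrees of freedom.
p_two_sided = two_sided_p_value(2.390, df=10)
p_one_sided = p_two_sided / 2   # valid when the estimate is in the direction of HA

print(f"two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.3f}")
```

Feeding in the critical value 2.228 instead returns a two-sided p-value of about 0.05, which is a useful sanity check on the table lookup.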
Computer Output - Paired
Paired t-test
Paired Samples Statistics
                   Mean     N     Std. Deviation   Std. Error Mean
Pair 1  WIFE       .5909    11    .37803           .11398
        HUSBAND    .4455    11    .35879           .10818

Paired Samples Correlations
                           N     Correlation   Sig.
Pair 1  WIFE & HUSBAND     11    .851          .001
Data for wives and husbands are in two separate columns,
with matched observations in the same row.
Menu path: Analyze → Compare Means → Paired Samples T-test
Paired Samples Test
Paired Differences (Pair 1: WIFE - HUSBAND)
Mean                                         .1455
Std. Deviation                               .20181
Std. Error Mean                              .06085
95% Confidence Interval of the Difference    Lower .0099, Upper .2810
t                                            2.390
df                                           10
Sig. (2-tailed)                              .038
Computer Output - Independent
Data for wives & husbands are in the same column; a second
column indicates whether each observation is for the wife
or husband.
Menu path: Analyze → Compare Means → Independent Samples T-test
Independent Samples t-test
Group Statistics
TIME   SPOUSE   N     Mean     Std. Deviation   Std. Error Mean
       wife     11    .5909    .37803           .11398
       husb     11    .4455    .35879           .10818
Independent Samples Test (TIME)
Levene's Test for Equality of Variances: F = .228, Sig. = .638
t-test for Equality of Means
                           Equal variances   Equal variances
                           assumed           not assumed
t                          .926              .926
df                         20                19.946
Sig. (2-tailed)            .366              .366
Mean Difference            .1455             .1455
Std. Error Difference      .15714            .15714
95% Confidence Interval of the Difference
  Lower                    -.18234           -.18240
  Upper                    .47325            .47331
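The output does not say how the fractional degrees of freedom in the "equal variances not assumed" column are obtained. Assuming it is the usual Welch-Satterthwaite approximation, the sketch below reproduces that column from the Group Statistics values; the variable names are illustrative.

```python
import math

# Values from the Group Statistics table.
n_w, sd_w = 11, 0.37803   # wives
n_h, sd_h = 11, 0.35879   # husbands
mean_difference = 0.1455

# Per-group variance of the sample mean.
v_w = sd_w**2 / n_w
v_h = sd_h**2 / n_h

se_diff = math.sqrt(v_w + v_h)

# Welch-Satterthwaite approximate degrees of freedom (assumed method).
welch_df = (v_w + v_h) ** 2 / (v_w**2 / (n_w - 1) + v_h**2 / (n_h - 1))

t_stat = mean_difference / se_diff

print(f"SE = {se_diff:.5f}, df = {welch_df:.3f}, t = {t_stat:.3f}")
```

Treating these paired data as independent groups gives a standard error of about 0.157, compared with 0.061 from the paired analysis. That is why the same mean difference is not significant here.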
Paired Differences Example
Hypothesis Test – Take 2
Study Hypothesis: Women spend more
time than men.
1. Assumptions:
Independent, random sample of pairs,
Differences have approximately normal pop’n
distribution
2. Hypotheses:
H0: μa – μb = 0 vs. HA: μa – μb > 0
3. Test Statistic:
t = (Avg.Diff-0)/SE = 0.146/0.061 = 2.393
Interpretation:
Paired Differences HT- Take 2,
cont.
4. P-value: ½ of computer’s two-sided p-value → 0.038/2 = 0.019
5. Conclusion: There is a moderate
amount of evidence to suggest that
women do spend more time on
average than men reading or watching
the daily news.
Two Group Example, CI
Assumptions for Confidence Interval
Large Sample
Each group is an independent, random
sample and the groups are chosen independently
of one another
Confidence Interval
Equal SD: 1.6 +/- 1.96 (1.335) = (-1.017, 4.217)
Unequal SD: 1.6 +/- 1.96 ( 1.344) = (-1.034, 4.234)
Interpretation (Equal)
We are 95% confident that at baseline the mean Conduct
Problem score of the placebo group minus that of the
risperidone group is between –1.017 and 4.217 points.
Because this interval contains 0, at the 0.05 level the two
populations are not significantly different.
Two Group Example, HT
Study Hypothesis: The two groups are
different.
1. Assumptions:
Large Sample
Each group is an independent, random
sample and the groups are chosen
independently of one another
2. Hypotheses
H0: μa – μb = 0 vs. HA: μa – μb ≠ 0
3. Test Statistic
Equal SD: z = 1.6/1.335 = 1.199
Unequal SD: z = 1.6/1.344 = 1.190
Two Group Example, HT, cont.
4. P-value
Equal SD: t, with 63+55-2=116 d.f.,
is approximately z → p-value=0.2301
Unequal SD: Due to the large sample size, the test
statistic will follow an approximately normal distribution
(by the CLT) → p-value slightly larger than 0.2301
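Under the large-sample normal approximation, both two-sided p-values can be checked with the standard normal CDF, written here in terms of `math.erf`. This is a sketch with helper names of my own.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))


estimate = 1.6                        # difference in baseline means
p_equal = two_sided_p(estimate / 1.335)    # equal-SD standard error
p_unequal = two_sided_p(estimate / 1.344)  # unequal-SD standard error

print(f"equal SD: p = {p_equal:.4f}; unequal SD: p = {p_unequal:.4f}")
```

Both values are near 0.23, and the unequal-SD p-value is slightly larger because its standard error is slightly larger.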
5. Conclusion
At any reasonable significance level (say 0.05), we
fail to reject the null hypothesis. Thus we would
conclude that there is not a significant difference
between the two groups with respect to their
conduct problem score. There does not seem to
be any bias with respect to their conduct
problem score in the way that children were
randomized to the two treatments.
Choosing a Study Design
When I’m planning a study design, should I plan
for paired (matched) samples or a study
with two independent groups?
Considerations:
1. Standard error for difference of means is
generally smaller for paired samples
resulting in
a. Shorter confidence intervals
b. More powerful tests
2. Cost
3. Feasibility
Why is Std error smaller for paired samples?
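One way to answer this uses the standard result Var(Ya - Yb) = Var(Ya) + Var(Yb) - 2 Cov(Ya, Yb). When the members of a pair are positively correlated, the covariance term reduces the variance of the differences. The sketch below plugs in the standard deviations and correlation reported in the paired computer output; the variable names are illustrative.

```python
import math

# From the Paired Samples Statistics and Correlations tables.
sd_w, sd_h = 0.37803, 0.35879   # SDs for wives and husbands
r = 0.851                       # correlation between wife and husband times
n = 11                          # number of pairs (and size of each group)

# Variance of a single difference D = Ya - Yb, allowing for correlation.
var_diff = sd_w**2 + sd_h**2 - 2 * r * sd_w * sd_h
se_paired = math.sqrt(var_diff / n)

# If the two groups were independent, the covariance term would be zero.
se_if_independent = math.sqrt((sd_w**2 + sd_h**2) / n)

print(f"paired SE = {se_paired:.4f}, independent-groups SE = {se_if_independent:.4f}")
```

With a correlation of 0.851 the paired standard error is well under half of the independent-groups value. If the correlation were zero, the two would be equal.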
Non-parametric statistics
for small, non-normal samples
Paired Data
The same as for univariate data, except perform the
test using the differences rather than the raw data.
Two Independent Groups
Mann-Whitney Rank Sum Test (Ch. 24)
Procedure is similar to the signed rank test, except
that instead of dividing observations according to
whether they are positive or negative, we divide
observations according to group membership.
Assumptions include (1) independent, random
samples, (2) independently selected groups, and
(3) the shape and spread of the two distributions
are the same
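To make the ranking step concrete, here is a minimal sketch that pools two groups, ranks the pooled values (averaging the ranks of ties), and sums the ranks by group membership. The data are hypothetical, and the function is an illustration rather than a full test: it computes no p-value.

```python
def rank_sum(group_a, group_b):
    """Return (W_a, U_a): rank sum of group_a in the pooled sample and its Mann-Whitney U."""
    pooled = [(v, "a") for v in group_a] + [(v, "b") for v in group_b]
    pooled.sort(key=lambda pair: pair[0])

    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1   # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    w_a = sum(rank for rank, (_, label) in zip(ranks, pooled) if label == "a")
    n_a = len(group_a)
    u_a = w_a - n_a * (n_a + 1) / 2
    return w_a, u_a


# Hypothetical measurements, for illustration only.
group_a = [1.1, 2.3, 2.9, 4.0]
group_b = [2.3, 3.5, 4.8, 5.2, 6.1]
w_a, u_a = rank_sum(group_a, group_b)
print(w_a, u_a)
```

A rank sum for one group that is far from what equal distributions would produce is evidence that the two distributions differ.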
Comparison of Two Proportions
with Large Samples
Lecture 6b
Confidence Interval
• ASSUMPTIONS
  Large sample size
  Independent, randomly selected sample
  Two groups are independent
• Estimate π1-π2
• 95% CI: Estimate +/- 1.96 SE
  o Estimate = p1-p2
  o SE^2 = p1(1-p1)/n1 + p2(1-p2)/n2
Cardiovascular Disease Risk
Factors Example
• The Journal of Women’s Health (Vol.7,
pp1125-1133) reports on the prevalence of
risk factors for cardiovascular disease among
women. Two independent samples of women
were examined in 1992 and 1995. In 1992, it
was found that 27.7 percent of the 36,836
women sampled had high cholesterol. In
1995, 29.2% of the 44,745 women sampled
had high cholesterol. Is there a significant
change in the percent of women with high
cholesterol?
CVD Risk Factors: CI
• Estimate = 0.292-0.277 = 0.015
• SE^2 = .292*.708/44,745 + .277*.723/36,836
= .000010
• SE = .003171
• 0.015 +/- 1.96*.003171 = (0.0088, 0.0212)
• I am 95% confident that the risk of high
cholesterol is between 0.88 and 2.12
percentage points higher in 1995 than it was
in 1992. (This is not a relative change!)
• It looks like this change is statistically
significant!
• Is it practically significant?
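The large-sample interval for the difference of two proportions can be recomputed directly from the reported percentages and sample sizes. This is a minimal sketch with variable names of my own.

```python
import math

# Reported prevalence of high cholesterol and sample sizes.
p_1995, n_1995 = 0.292, 44_745
p_1992, n_1992 = 0.277, 36_836

estimate = p_1995 - p_1992

# SE^2 = p1(1-p1)/n1 + p2(1-p2)/n2
se = math.sqrt(
    p_1995 * (1 - p_1995) / n_1995 + p_1992 * (1 - p_1992) / n_1992
)

z_critical = 1.96  # 95% confidence, large samples
lower = estimate - z_critical * se
upper = estimate + z_critical * se

print(f"estimate = {estimate:.3f}, SE = {se:.6f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The interval runs from roughly 0.9 to 2.1 percentage points and excludes 0, so the change is statistically significant. With samples this large, even a small change gives a narrow interval, which is why practical significance is a separate question.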
Homework
• Textbook Reading
  Chapters 7, 23, 24, 25
• Homework Problems
  Available at http://yhkuo.tripod.com/PHCO0504/index.htm