Tue, Oct 7 - Wharton Statistics Department


Lecture 10 Outline: Tue, Oct 7
• Resistance of two sample t-tools (Chapter 3.3)
• Practical strategies for two-sample problem
(Chapter 3.4)
• Review
• Office hours:
– Today after class
– Tomorrow morning (9 a.m. – 11:30 a.m.)
– Note: I will be out of town Thursday. I will try to check
e-mail Wednesday night around 10 but will not be able
to check after that.
Matched Pairs Studies
• Studies in which units in the two groups can be blocked
into pairs that are more “similar” to each other than to the
other units are called matched pairs studies (e.g.,
schizophrenia twins study)
• In a matched pairs study, the samples are not independent –
knowledge of the outcome of one in a pair helps to predict
the outcome of the other.
• The proper tool for analyzing a matched pairs study is the
paired (one-sample) t-test.
• Motivation for doing matched pairs studies: Blocking.
By controlling the influence of outside variables, we
reduce variability in the responses (σ), decreasing the
margin of error of a CI. More on this later in the course.
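The benefit of blocking can be seen numerically. Below is a minimal sketch in Python (not course material; the data are made up): when pairs are positively correlated, the paired standard error, computed from the within-pair differences, is smaller than the standard error from treating the same numbers as two independent samples.

```python
import statistics as st

# Hypothetical matched-pairs data: each position is one pair.
before = [12.0, 15.1, 9.8, 11.2, 13.5, 10.0]
after  = [13.2, 15.8, 10.5, 11.9, 14.1, 11.2]
n = len(before)

# Paired analysis: SE based on the spread of within-pair differences.
diffs = [a - b for a, b in zip(after, before)]
se_paired = st.stdev(diffs) / n ** 0.5

# SE if the same data were (wrongly) treated as two independent samples.
sp2 = (st.variance(before) + st.variance(after)) / 2  # pooled variance
se_unpaired = (sp2 * (2 / n)) ** 0.5

print(se_paired < se_unpaired)  # True: pairing shrinks the SE here
```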
Recognizing Matched Pairs
Studies
• Ask: does some natural relationship exist within each
pair of observations that makes it more appropriate to
compare the two members of a pair than, say, the first
observation in group 1 and the second observation
in group 2?
• Before and after designs
• Example: A researcher for OSHA wants to see
whether cutbacks in enforcement of safety
regulations coincided with an increase in work
related accidents. For 20 industrial plants, she has
the number of accidents in 1980 and in 1995.
Conceptual Question #16
• A researcher has taken tissue cultures from 25 subjects.
Each culture is divided in half, and a treatment is applied
to one of the halves chosen at random. The other half is
used as a control. After determining the percent change in
the sizes of all culture sections, the researcher calculates
the standard error for the treatment-minus-control
differences using both the paired t-analysis and the two
independent sample t-analysis. Finding that the paired t-analysis gives a slightly larger standard error (and gives
only half the degrees of freedom), the researcher decides to
use the results from the unpaired analysis. Is this
legitimate?
Outliers and resistance
• Outliers are observations relatively far from their
estimated means.
• Outliers may arise either
– (a) if the population distribution is long-tailed, or
– (b) if the observations don’t belong to the population of
interest (they come from a contaminating population).
• A statistical procedure is resistant if one or a few
outliers cannot have an undue influence on the result.
Resistance
• Illustration for understanding resistance: the
sample mean is not resistant; the sample
median is.
– Sample: 9, 3, 5, 8, 100
– Mean with outlier: 25, without: 6.25
– Median with outlier: 8, without: 6.5
• t-tools are not resistant to outliers because
they are based on sample means.
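The slide's illustration can be checked numerically; a quick sketch in Python (not course material): the mean chases the outlier, while the median barely moves.

```python
import statistics as st

sample  = [9, 3, 5, 8, 100]   # 100 is the outlier
without = [9, 3, 5, 8]

print(st.mean(sample), st.mean(without))      # 25 vs. 6.25
print(st.median(sample), st.median(without))  # 8 vs. 6.5
```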
Practical two-sample strategy
• Think about independence – use tools from later in
course (or matched pairs) if there’s a potential
problem.
• Use graphical displays to assess: normality
(particularly skewness, multimodality and heavy
tails), equal spread, outliers
• If there are outliers, investigate them and see
whether they (i) change conclusions; (ii) warrant
removal. Follow the outlier examination strategy
in Display 3.6.
Excluding Observations from Analysis in
JMP for Investigating Outliers
• Click on row you want to exclude.
• Click on rows menu and then click
exclude/unexclude. A red circle with a line
through it will appear next to the excluded
observation.
• Multiple observations can be excluded.
• To include an observation that was excluded back
into the analysis, click on excluded row, click on
rows menu and then click exclude/unexclude. The
red circle next to observation should disappear.
Notes on Outliers
• In the examination strategy of Display 3.6, in
order to warrant the removal of an outlier, an
explanation for why it is different must be
established.
• It is not surprising that the outliers in the Agent
Orange example have little effect, since the
sample sizes are so large.
• The apparent differences in the box plots may be
due to differences in sample sizes. If the
population distributions are identical, more
observations will appear in the extreme tails from
a sample of size 646 than a sample of size 97.
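The arithmetic behind that last point: the expected number of observations in any fixed extreme tail is n times the tail probability, so it grows with sample size even when the populations are identical. For example, beyond a population's 97.5th percentile (tail probability 0.025):

```python
# Expected count of observations beyond a fixed tail cutoff is n * p.
p_tail = 0.025
for n in (97, 646):
    print(n, n * p_tail)  # ~2.4 for n = 97, ~16.2 for n = 646
```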
Conceptual Question #6
• (a) What course of action would you propose for
the statistical analysis if it was learned that
Vietnam veteran #646 (the largest observation in
Display 3.6) worked for several years, after
Vietnam, handling herbicides with dioxin?
• (b) What would you propose if this was learned
instead for Vietnam veteran #645 (second largest
observation)?
Review
• Material: 1.1-3.4, 4.5.1-4.5.3. Class notes.
• Review class notes, homework, textbook.
• Themes:
– Study design (randomized experiments vs.
observational studies, random sampling vs.
non-random sampling) and what inferences
they permit
– Hypothesis tests and confidence intervals for
two group problems.
Inference
• A statistical inference is an inference justified by a
probability model linking the data to a broader
context. Statistical inferences involve measures of
uncertainty about the conclusions (e.g., p-values
and confidence intervals).
• Population inference: an inference about
population characteristics, like the difference
between two population means
• Causal inference: an inference that a subject
would have received a different numerical
outcome had the subject belonged to a different
group
Statistical inferences permitted
by study designs
• Display 1.5
Confounding Variables
• A confounding variable is a variable that is related to both
group membership and the outcome. Its presence makes it
hard to establish the outcome as being a direct
consequence of group membership. Example: experience
in sex discrimination study.
• Observational studies: Always have to worry about
confounding variables even in very large studies.
• Randomized experiments: Because group membership is
randomly assigned, there are no confounding variables.
Differences between groups are due to play of chance and
can be made almost surely small in large studies.
Measuring Uncertainty
• Probability model for two treatment
randomized experiment: Randomly shuffle
and deal red and black cards to assign group
membership.
• Additive treatment effect model: δ = the
effect of being assigned to group II rather
than group I.
Randomization Test
• Two-sided test: H0: δ = 0 vs. H1: δ ≠ 0
• Test statistic: T = |Ȳ2 − Ȳ1|
• Distribution of test statistic under H0:
distribution of T under all possible regroupings.
• p-value: probability that T will be at least as large
as the observed T, T_obs, under H0.
• p-value: Measure of evidence against H 0 . See
Display 2.12 for interpreting.
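The steps above can be sketched directly in code. A minimal Python version (not course material; the data are made up and the groups are small enough to enumerate every regrouping):

```python
import itertools
import statistics as st

# Randomization test for a two-group randomized experiment.
group1 = [5.0, 6.1, 4.8]
group2 = [7.2, 8.0, 6.9]
pooled = group1 + group2
n1 = len(group1)

def T(g1, g2):
    # Test statistic: absolute difference of group means.
    return abs(st.mean(g2) - st.mean(g1))

t_obs = T(group1, group2)

# Null distribution: recompute T over all possible regroupings.
hits = total = 0
for idx in itertools.combinations(range(len(pooled)), n1):
    g1 = [pooled[i] for i in idx]
    g2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if T(g1, g2) >= t_obs:
        hits += 1

p_value = hits / total  # proportion of regroupings with T >= observed
print(total, p_value)   # 20 possible regroupings here
```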
Graphical Methods
• Box plots
• Histograms
• Stem-and-leaf diagrams
• Note on box plots: To produce two plots
with the same scale using Analyze,
Distribution, stack them and click uniform
scaling under both groups.
Sampling
• Simple random sample (of size n): each subset of
population of size n has same probability of being
chosen.
• Need a frame: a numbered list of all subjects
• Sampling units: In conducting a random sample, it
is important that we are randomly sampling the
units of interest. Otherwise, we may create a
selection bias, e.g., sampling families instead of
individuals, homework 2 Problem #3.
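A simple random sample is easy to draw once a frame exists. A quick sketch in Python (hypothetical frame of 100 numbered subjects): `random.sample` chooses without replacement, giving every subset of size n the same probability.

```python
import random

frame = list(range(1, 101))  # the frame: a numbered list of all subjects
random.seed(7)               # fixed seed so the sketch is reproducible
chosen = random.sample(frame, 10)  # simple random sample of size 10
print(sorted(chosen))
```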
Inferences Under Random
Sampling Model
• Two types of samples
– One sample/matched pairs – random sample
from one population (paired t tools)
– Two independent samples (two sample t tools)
• Tools can also be used to analyze
randomized experiments if group sizes are
reasonably large.
Testing a hypothesis about μ
(one sample/matched pairs)
• H0: μ = 0, H1: μ ≠ 0
• Could the difference of Ȳ from μ* (the
hypothesized value for μ, = 0 here) be due
to chance (in random sampling)?
• Test statistic: |t| = |Ȳ − μ*| / SE(Ȳ)
• If H0 is true, then t equals the t-ratio and has
the Student’s t-distribution with n − 1 degrees
of freedom
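The t-ratio above can be computed by hand. A minimal sketch in Python (made-up data): t = (Ȳ − μ*) / SE(Ȳ), with SE(Ȳ) = s/√n and n − 1 degrees of freedom.

```python
import statistics as st

y = [0.1, 0.3, -0.2, 0.4, 0.2, 0.5, 0.1]  # hypothetical sample
mu_star = 0.0                  # hypothesized mean under H0
n = len(y)
se = st.stdev(y) / n ** 0.5    # standard error of the sample mean
t = (st.mean(y) - mu_star) / se  # compare to t distribution, n - 1 df
print(round(t, 3))
```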
P-value
• The (2-sided) p-value is the proportion of
random samples with absolute value of t
ratios >= observed test statistic (|t|)
• Schizophrenia example (JMP output):
– Estim Mean 0.1987, Hypoth Mean 0
– T Ratio 3.23, P Value 0.0061
– Sample Size = 15
[Histogram of the paired differences omitted.]
Confidence Intervals
• A confidence interval is a range of “plausible
values” for a statistical parameter (e.g., the
population mean) based on the data.
• If the population distribution of Y is normal, 95%
CI for mean of single population:
Ȳ ± t_{n−1}(.975) × SE(Ȳ) = Ȳ ± t_{n−1}(.975) × s/√n
• A 95% confidence interval will contain the true
parameter (e.g., the population mean) 95% of the
time if repeated random samples are taken.
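The interval above is straightforward to compute. A sketch in Python (made-up data of size 15); the multiplier t_14(.975) is hard-coded from a t table rather than computed:

```python
import math
import statistics as st

y = [0.2, -0.1, 0.4, 0.3, 0.1, 0.5, 0.0, 0.2,
     0.3, -0.2, 0.4, 0.1, 0.6, 0.2, 0.0]   # hypothetical sample
n = len(y)                                 # 15, so df = 14
t_mult = 2.145                             # t_14(.975), from a t table
half = t_mult * st.stdev(y) / math.sqrt(n) # margin of error
ci = (st.mean(y) - half, st.mean(y) + half)
print(ci)
```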
Two sample t-test
• H0: μ2 − μ1 = δ*, H1: μ2 − μ1 ≠ δ*
• Test statistic: T = |t| = |(Ȳ2 − Ȳ1) − δ*| / SE(Ȳ2 − Ȳ1)
• If population distributions are normal with equal
σ, then if H0 is true, the test statistic t has a
Student’s t distribution with n1  n2  2 degrees of
freedom.
• p-value equals probability that T would be greater
than observed |t| under random sampling model if
H0 is true; calculated from Student’s t distribution.
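The pooled two-sample statistic can likewise be computed by hand. A sketch in Python (made-up data, taking δ* = 0), using the pooled variance and n1 + n2 − 2 degrees of freedom:

```python
import math
import statistics as st

y1 = [5.0, 6.1, 4.8, 5.5, 5.9]  # hypothetical group 1
y2 = [7.2, 8.0, 6.9, 7.5, 6.6]  # hypothetical group 2
n1, n2 = len(y1), len(y2)

# Pooled variance: df-weighted average of the two sample variances.
sp2 = ((n1 - 1) * st.variance(y1) + (n2 - 1) * st.variance(y2)) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))   # SE of the difference in means
t = abs(st.mean(y2) - st.mean(y1)) / se   # delta* = 0; df = n1 + n2 - 2
print(round(t, 3))
```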
Practical vs. Statistical Significance
• The p-value of a test depends on the sample size.
With a large sample, even a small difference can
be “statistically significant,” that is, hard to explain
by the luck of the draw. This doesn’t necessarily
make it important. Conversely, an important
difference may not be statistically significant if the
sample is too small.
• Always accompany p-values for tests of
hypotheses with confidence intervals. Confidence
intervals provide information about the likely
magnitude of the difference and thus provide
information about its practical importance.
Designing a Study
• Types of confidence interval for key parameter in
a study – Display 23.1
• Role of research design is to avoid outcome D.
• Margin of error of a 95% confidence interval:
approximately 2 × SE(estimate); for the one-sample
problem, 2s/√n
• For a one-sample study: choose the sample size to be
greater than 4s²/PSD², where PSD denotes the least
practically significant difference.
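The sample-size rule is a one-line calculation. A sketch in Python (the pilot SD and PSD values below are hypothetical):

```python
import math

# Rule: choose n greater than 4 s^2 / PSD^2.
s = 2.5     # pilot estimate of the response SD (hypothetical)
psd = 1.0   # least practically significant difference (hypothetical)
n_min = 4 * s ** 2 / psd ** 2
n = math.ceil(n_min)   # smallest whole sample size meeting the rule
print(n)               # 25
```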
Robustness of t-tools
• A statistical procedure is robust to departures from a
particular assumption if it is valid even when the
assumption is not met exactly
• Valid means that the uncertainty measures – the confidence
levels and p-values – are nearly equal to their stated values
• If the sample sizes are large, the t-tests will be valid no
matter how nonnormal the populations are.
• If the two populations have the same S.D. and approximately
the same shape, and if n1 ≈ n2, the validity of the t-tools is affected
moderately by long-tailedness and very little by skewness.