Lecture 3: Thurs, Sept 11 - University of Pennsylvania

Lecture 3 Outline: Thurs, Sept 11
• Chapters 1.3-1.4
• Probability model for 2-group randomized
experiment
• Randomization test p-value
• Probability model for random sampling
Vocabulary of Experiments
• A study is an experiment when we actually do something
to people, animals or objects to observe the response.
• Experimental units are the things to which treatments are
applied, e.g., people, rats, samples of materials or pieces of
land.
• When units are human beings, they are called subjects.
• A specific experimental condition applied to the units is
called a treatment.
• The “control” refers to a treatment that is considered a
baseline for comparing all other treatments.
• Creativity study: Experimental units? Treatments?
Probability Model for 2-treatment
Randomized Experiment
• Creativity Study
– The chance mechanism for randomizing units to treatment
groups ensures that every subset of 24 subjects has the
same chance of becoming the intrinsic group.
– For example, 23 red and 24 black cards could be
shuffled and dealt, one to each subject and the subjects
with black cards would be the intrinsic group.
– Tables of random numbers can be used to assign units
to groups (assign the units with the 24 highest numbers
to group 1).
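The card-shuffling scheme above can be sketched in Python. This is a minimal illustration, not the study's actual procedure: the subject labels 1 to 47, the function name `randomize`, and the fixed seed are assumptions made for the example.

```python
import random

def randomize(units, n_treatment, seed=None):
    """Split units at random so every subset of size n_treatment is equally likely."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                        # like shuffling the deck of cards
    treatment = sorted(shuffled[:n_treatment])   # subjects dealt "black cards"
    control = sorted(shuffled[n_treatment:])     # subjects dealt "red cards"
    return treatment, control

subjects = range(1, 48)   # hypothetical labels for the 47 subjects
intrinsic, extrinsic = randomize(subjects, 24, seed=0)
print(len(intrinsic), len(extrinsic))   # 24 23
```

Shuffling the whole list and cutting it at position 24 plays the same role as dealing 24 black and 23 red cards: no subset of 24 subjects is favored over any other.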
Additive Treatment Effect Model
• Potential Outcomes: For each subject, we can
imagine what the subject’s outcome would be if
placed in the extrinsic group (Y) and what the
subject’s outcome would be if placed in the
intrinsic group (Y*). We only see the outcome for
the group to which the subject was assigned.
• Additive Treatment Effect Model: For every
subject, Y* = Y + δ
• δ is a parameter – an unknown constant that
describes a key feature of the model for answering
questions of interest.
Test for Treatment Effect
• Meaning of δ:
– δ > 0: Intrinsic questionnaire improves creativity.
– δ = 0: Intrinsic questionnaire (treatment) makes no
difference.
– δ < 0: Intrinsic questionnaire makes creativity worse.
• Hypothesis testing: Questions of interest are
translated into questions about parameters in
probability models.
• Null hypothesis (H0): δ = 0 (group status has no
effect on outcome)
• Alternative hypothesis (Ha): δ ≠ 0 (group status has
an effect on outcome)
Test Statistic
• A test statistic is a numerical data summary for
testing a hypothesis. We try to find a test statistic
that tends to be small when the null hypothesis is
true and tends to be large when the alternative
hypothesis is true.
• Test statistic for 2-group randomized exp.:
– Let Ȳ1 be the sample mean of the outcome for units
assigned to group 1.
– Let Ȳ2 be the sample mean of the outcome for units
assigned to group 2.
– Test statistic: T = |Ȳ2 − Ȳ1|
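As a small illustration, the test statistic can be computed as below. The function name `abs_mean_diff` is an assumption for this sketch, and the scores are the six-student numbers from the Example later in this lecture, not the real creativity data.

```python
from statistics import mean

def abs_mean_diff(group1, group2):
    """Test statistic T: absolute difference between the two group sample means."""
    return abs(mean(group2) - mean(group1))

extrinsic = [10, 18, 26]   # group 1 (six-student example)
intrinsic = [12, 20, 28]   # group 2 (six-student example)
T = abs_mean_diff(extrinsic, intrinsic)
print(T)
```

Because of the absolute value, swapping the group labels does not change T, which is what a two-sided alternative calls for.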
Testing the Null Hypothesis
• If there is no treatment effect (H0), subjects would
receive the same outcome regardless of their assigned
group.
• If there is a treatment effect, subjects will receive
a higher (or lower) outcome if they are assigned to
one group.
• Therefore a large value of T = |Ȳ1 − Ȳ2| argues
against the null hypothesis.
• But how large is large? Even if there is no
treatment effect, T will not necessarily equal 0
because the random assignment can result in an
uneven mix of abilities.
Randomization Test p-value
• The observed value of the test statistic can be large
because
– (a) there is an effect of the treatment
– (b) the random assignment resulted in an uneven mix
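• A randomization test p-value is the probability
associated with explanation (b): the chance, if there is
no treatment effect, that random assignment alone
produces a test statistic at least as large as the one
observed.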
• The smaller the p-value, the less believable (b) is
as an explanation.
Exact Calculation of the p-value
• The p-value is the probability that T >= 4.14 (the
observed value) if, in fact, there is no treatment effect.
The probability comes from the random assignment of
units to groups.
• Important starting point: If there is no treatment
effect, then the creativity score for an individual
would have been the same had they been assigned
to the other group.
• Exact Calculation of p-value
– Calculate T for every possible grouping of the 47
numbers into groups of size 23 and 24
– The p-value is the proportion of regroupings with
T>=4.14.
Exact calculation of p-value
• If there is no treatment effect, subjects would
receive the same outcome regardless of their assigned
group.
• Distribution of test statistic T if there is no
treatment effect:
– For every possible random assignment of units into two
groups, calculate T using the observed outcomes.
– The T’s associated with each possible random
assignment have the same probability.
• The p-value is the probability, if the null
hypothesis were true (no treatment effect), that T
would be greater than or equal to the observed value T0.
Example
• Suppose the creativity study had just six
students. Suppose the three students
assigned to the intrinsic group had scores of
12, 20 and 28. The three students assigned
to the extrinsic group had scores of 10, 18
and 26.
• Calculate the p-value for testing if there is a
treatment effect.
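One way to work this example is to enumerate every regrouping by computer. The sketch below assumes the two-sided statistic T = |difference in group means| defined earlier; the function name `exact_p_value` is made up for illustration. With six scores there are 20 ways to choose which three go in the first group. The observed statistic is |20 − 18| = 2, and 14 of the 20 regroupings give T >= 2.

```python
from fractions import Fraction
from itertools import combinations

def exact_p_value(group1, group2):
    """Two-sided randomization p-value, enumerating every possible regrouping."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    total = sum(pooled)

    def stat(idx):
        # T for the regrouping that puts the pooled[i], i in idx, in group 1.
        s1 = sum(pooled[i] for i in idx)
        return abs(Fraction(total - s1, n2) - Fraction(s1, n1))

    observed = stat(range(n1))   # the grouping that actually occurred
    stats = [stat(idx) for idx in combinations(range(len(pooled)), n1)]
    extreme = sum(1 for t in stats if t >= observed)
    return Fraction(extreme, len(stats))

intrinsic = [12, 20, 28]
extrinsic = [10, 18, 26]
p = exact_p_value(intrinsic, extrinsic)
print(p, float(p))   # 7/10 0.7
```

Exact fractions are used so that the comparison `t >= observed` is not affected by floating-point rounding. A p-value of 0.7 gives no evidence of a treatment effect in this toy data set.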
P-value for Creativity Study
• For the actual creativity study, using a computer
program, the p-value is .011.
• Conclusion: either
– (i) there is no treatment effect and we happened to get
an uneven randomization
– (ii) there is a treatment effect.
• If there is no treatment effect, the probability of a
randomization at least this uneven is .011. So
either there is a treatment effect or we obtained an
unusual (about one-in-a-hundred) randomization.
• A p-value of around .01 is considered strong
evidence against the null hypothesis, see pg. 47
One-sided vs. Two-sided Tests
• For some problems, we might know that the
treatment effect is >=0 or <=0 and want to use a
one-sided alternative hypothesis
– (i) Ha: δ > 0 or
– (ii) Ha: δ < 0
• The appropriate test statistics for the one-sided
alternative hypotheses are T = Ȳ2 − Ȳ1 for testing (i)
and T = Ȳ1 − Ȳ2 for testing (ii), where it is assumed
that group 2 is the “treatment” group and group 1
is the “control” group.
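A one-sided version of the randomization test for alternative (i) might look like the sketch below. The function name `one_sided_p_value` is an assumption, and the data are the six-student example scores with the extrinsic group treated as control (group 1) and the intrinsic group as treatment (group 2).

```python
from fractions import Fraction
from itertools import combinations

def one_sided_p_value(control, treatment):
    """Randomization p-value for Ha: delta > 0, using T = mean(treatment) - mean(control)."""
    pooled = list(control) + list(treatment)
    n_c, n_t = len(control), len(treatment)
    total = sum(pooled)

    def stat(control_idx):
        # Signed difference in means when pooled[i], i in control_idx, form the control group.
        s_c = sum(pooled[i] for i in control_idx)
        return Fraction(total - s_c, n_t) - Fraction(s_c, n_c)

    observed = stat(range(n_c))
    stats = [stat(idx) for idx in combinations(range(len(pooled)), n_c)]
    return Fraction(sum(1 for t in stats if t >= observed), len(stats))

extrinsic = [10, 18, 26]   # control, group 1
intrinsic = [12, 20, 28]   # treatment, group 2
p_one_sided = one_sided_p_value(extrinsic, intrinsic)
print(p_one_sided)   # 7/20
```

Dropping the absolute value means only regroupings that favor the treatment group count as extreme. Here that halves the two-sided p-value of 14/20.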
Randomization Distribution and
p-value
• Defn.: The randomization distribution of a
test statistic describes its possible values
over all the ways the randomization could
have turned out.
• The p-value of the randomization test is the
proportion of the randomization distribution
that is at least as large as the observed test
statistic.
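For a small data set, the randomization distribution can be tabulated directly. The sketch below assumes the statistic T = |difference in group means| and uses the six-student example scores. The function name `randomization_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def randomization_distribution(group1, group2):
    """Map each possible value of T to the proportion of regroupings that produce it."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    total = sum(pooled)
    counts = Counter()
    for idx in combinations(range(len(pooled)), n1):
        s1 = sum(pooled[i] for i in idx)
        counts[abs(Fraction(total - s1, n2) - Fraction(s1, n1))] += 1
    n_groupings = sum(counts.values())
    return {t: Fraction(c, n_groupings) for t, c in sorted(counts.items())}

dist = randomization_distribution([12, 20, 28], [10, 18, 26])
for t, prob in dist.items():
    print(f"T = {float(t):6.3f}  proportion = {float(prob):.2f}")

# p-value: proportion of the distribution at or above the observed T = 2
p = sum(prob for t, prob in dist.items() if t >= 2)
```

Summing the proportions at or above the observed statistic gives the same p-value as counting extreme regroupings one at a time.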
Approximating the p-value
• For the creativity study, there are about 1.6 × 10^13
different groupings.
• Approximating the randomization test p-value.
– (i) Monte Carlo simulation: Randomly choose many
groupings. Approximate the randomization distribution
by the histogram of the test statistic for the randomly
chosen groupings
– (ii) (Chapter 2). The randomization distribution of the
“t-statistic” is approximated by the “t”-distribution.
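Approach (i) can be sketched as follows. This is an illustrative Monte Carlo version, not the program used for the lecture: the function name `monte_carlo_p_value`, the number of draws, and the seed are assumptions, and it is run on the six-student example scores because the full creativity data are not listed here.

```python
import math
import random

def monte_carlo_p_value(group1, group2, n_draws=10_000, seed=None):
    """Approximate the two-sided randomization p-value from random regroupings."""
    rng = random.Random(seed)
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    observed = abs(sum(group2) / n2 - sum(group1) / n1)
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)   # one random regrouping
        t = abs(sum(pooled[n1:]) / n2 - sum(pooled[:n1]) / n1)
        if t >= observed - 1e-12:   # small tolerance for floating-point ties
            hits += 1
    return hits / n_draws

# Number of ways to split 47 subjects into groups of 23 and 24.
n_groupings = math.comb(47, 23)
print(f"{n_groupings:.2e}")

p_hat = monte_carlo_p_value([12, 20, 28], [10, 18, 26], n_draws=20_000, seed=1)
print(p_hat)   # should be close to the exact value 0.7
```

More draws give a closer approximation. The simulation error shrinks roughly like one over the square root of the number of draws.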
Probability Model for Random
Sampling
• Consider taking random samples from two
populations with respective means μ1 and μ2.
• Are the means different?
– H0: μ1 − μ2 = 0
– H1: μ1 − μ2 ≠ 0
• Test statistic: T = |Ȳ1 − Ȳ2|
Probability Model for Random
Sampling
• The sampling distribution of T = |Ȳ1 − Ȳ2| is
represented by a histogram of all values for the
statistic from all possible samples that can be
drawn from the two populations.
• The p-value for testing a hypothesis (and
confidence intervals) follows from an
understanding of the sampling distribution. We
will discuss sampling distributions in Ch. 2.
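Enumerating all possible samples is rarely feasible, but the idea of a sampling distribution can be illustrated by drawing many samples from two hypothetical populations. Everything specific below is an assumption for the sketch: normal populations, a common standard deviation of 1, equal means (so H0 holds), and samples of size 20.

```python
import random

def simulate_sampling_distribution(mu1, mu2, sigma, n1, n2, n_samples=5000, seed=None):
    """Draw repeated sample pairs and record T = |mean(sample 1) - mean(sample 2)|."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        y1 = [rng.gauss(mu1, sigma) for _ in range(n1)]
        y2 = [rng.gauss(mu2, sigma) for _ in range(n2)]
        values.append(abs(sum(y1) / n1 - sum(y2) / n2))
    return values

# Hypothetical populations with equal means (H0 true).
values = simulate_sampling_distribution(0.0, 0.0, 1.0, 20, 20, seed=2)
average_T = sum(values) / len(values)
print(round(average_T, 3))
```

A histogram of `values` approximates the sampling distribution of T under these assumed populations. Unlike the randomization distribution, the variation here comes from which units are sampled, not from how fixed units are assigned to groups.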