11.3: Uses and Abuses of tests


Inference Toolbox!
•To test a claim about an unknown population parameter:
•Step 1: State
•Identify the parameter (in context) and state your hypotheses
•Step 2: Plan
•Identify the appropriate inference procedure and verify the
conditions for using it (SRS, Normality, Independence)
•Step 3: Calculations
•Calculate the test statistic
•Find the p-value
•Step 4: Interpretation
•Interpret your results in CONTEXT
•Interpret P-value or make a decision about H0 using statistical
significance
Example =)
•Mel N. Colly is interested in whether or not his new treatment
for depressed patients is having any effect on his patients’
rating of depression. Suppose all of his depressed patients
have a mean depression score of 8 with a standard deviation
of 4. Mel chooses a random sample of 30 depressed patients
treated with his innovative approach and determines that the
mean depression score for these individuals is 7.5. Does the
treatment have any effect?
•Step 2: PLAN
•We will conduct a ______________________________.
•(1) SRS:
•The data was collected “at random.” The study does not state that
a simple random sample was used, but we will proceed assuming
proper sampling methods were used.
•(2) Normality:
•We do not know whether the population distribution of depression
scores is Normal, but the sample size is large enough (n = 30)
that the sampling distribution of x-bar will be approximately
Normal (by the central limit theorem).
•(3) Independence:
•Mel N. Colly selected the patients without replacement, but we will
assume that there are more than 10(30) = 300 depressed patients
seen in his practice. Also assume that the depression score for
each patient is independent of the other patients in the sample.
•Step 3: Calculations
•(1) Test Statistic
z = (x-bar - μ0) / (σ/√n)
•(2) P-value: Draw a picture using the standardized value, then calculate
the P-value
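The calculation for this step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `one_sample_z_test` is made up for the example, and a two-sided alternative (Ha: μ ≠ 8) is assumed because the question asks whether the treatment has "any effect."

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a one-sample z test with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))   # standardized test statistic
    cdf = NormalDist().cdf                  # standard Normal CDF
    if alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))
    elif alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "less":
        p = cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p

# Values from the depression-score example: mu0 = 8, sigma = 4, n = 30, x-bar = 7.5.
z, p_value = one_sample_z_test(x_bar=7.5, mu0=8, sigma=4, n=30)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The resulting P-value of roughly 0.49 agrees with the 49.36% quoted in the interpretation step.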
•Step 4: Interpretation
•P-value (the problem did not give us an alpha level)
•If the true population mean depression score were 8, a sample
mean at least as far from 8 as 7.5 would occur about 49.36%
of the time by chance alone. Because the probability of
obtaining results this extreme is so high, we fail to reject
our null hypothesis. We do not have good evidence that the
true mean depression score differs from 8.
•Di Perrs is the quality control manager for Pampers. A recent ad
claimed that the new improved Pampers is more absorbent than the
old Pampers. The average absorbency of old Pampers was 195
milliliters with a standard deviation of 80 milliliters. A total of 100 new
Pampers were selected at random and tested. The average amount
of fluid absorbed was x-bar = 210 milliliters. Di Perrs wants to use an
α = 0.05 significance level.
•Step 2: PLAN
We will perform a 1-sample z-test for means (sigma known)
•(1) SRS: The data was collected “at random.” The study does not state
that a simple random sample was used, but we will proceed assuming
proper sampling methods were used.
•(2) Normality: We do not know if the population distribution of
Pampers absorbency is Normal, but the sample size is large enough
(n=100) so that the sampling distribution will be approximately normal (by
the central limit theorem)
•(3) Independence: Di Perrs selected the diapers without
replacement, but we can assume that there are more than 10(100) = 1000
diapers produced at the factory. Also assume that the absorbency of each
diaper in the sample is independent of the other diapers.
•Step 3: Calculations
•(1) Test Statistic
•(2) P-value: Draw a picture using the standardized value, then
calculate the P-value
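As a rough sketch of this step, the test statistic and P-value can be computed directly from the numbers in the problem. A one-sided alternative (Ha: μ > 195) is assumed here because the ad claims the new diapers are more absorbent; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 195, 80      # old Pampers: mean and standard deviation (ml)
n, x_bar = 100, 210       # sample of new Pampers
alpha = 0.05              # significance level chosen by Di Perrs

# Test statistic: how many standard errors x-bar lies above mu0.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Upper-tail P-value for Ha: mu > 195.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Under these assumptions z works out to 15/8 = 1.875, and the P-value falls below α = 0.05.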
•Step 4: Interpretation
•Using significance Level
•Since our P-value,
YOU TRY: Prom

Choosing a Level of Significance: Things to think about
(1) How plausible is H0?
•A study finds that smoking increases the risk of
Alzheimer's.
•You read a study that claims to have evidence that smoking is
really good for you.
(2) What are the consequences of rejecting H0?
•You find evidence that cats sleep more than dogs.
•You find evidence that a new drug may have harmful side effects… but
your company has invested millions of dollars in an ad campaign for the drug.
Statistical Significance vs. Practical Importance
•You decide to run a significance test to see if a particular SAT prep program
increases scores on the Math portion. You know from previous research that
the average score on the Math section is 510 with a standard deviation of 50.
You take a sample of 200 students and find that they have an average score
of 515. Use a 5% level of significance.
•H0: μ = 510
•Ha: μ > 510
•P-value: 0.02167
•We can reject the null hypothesis that the prep program does not improve
scores… but is a 5-point increase worth anything?
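One way to see why statistical significance and practical importance differ is to hold the 5-point difference fixed and vary the sample size. The sketch below does that with a one-sided z test; the sample sizes other than 200 are hypothetical choices for illustration. With the inputs exactly as stated on the slide (σ = 50, n = 200), this calculation gives a P-value of about 0.079 rather than 0.02167, so the reported value presumably came from slightly different inputs. The point about practical importance is unchanged.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, x_bar = 510, 50, 515   # values stated in the SAT example

p_values = {}
for n in (50, 200, 500, 2000):     # sample sizes are illustrative assumptions
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_values[n] = 1 - NormalDist().cdf(z)   # upper-tail P-value for Ha: mu > 510
    print(f"n = {n:4d}: z = {z:.2f}, P-value = {p_values[n]:.4f}")
```

The same 5-point gain moves from "not significant" to "highly significant" purely because n grows, which says nothing about whether 5 points matters to a student.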
Beware Outliers!!!
•Pesky little outliers can destroy the
significance of otherwise significant
data.
•They can also make data appear
significant when it actually is not.
•Always do a graphical analysis of
your data
•The effect you are searching for should
be evident in your plots
•Confidence intervals can help you get a
better idea
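A small numerical sketch makes the outlier warning concrete. The data below are entirely hypothetical (invented for this illustration), as are the null mean of 50 and the known standard deviation of 10; the only point is to compare the same two-sided z test with and without one extreme value.

```python
from math import sqrt
from statistics import NormalDist, mean

mu0, sigma = 50, 10   # hypothetical null mean and known population sd

def two_sided_p(data):
    """Two-sided P-value of a one-sample z test for the given data."""
    z = (mean(data) - mu0) / (sigma / sqrt(len(data)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

scores = [52, 48, 51, 49, 53, 47, 50, 52, 49, 51]   # hypothetical sample
scores_with_outlier = scores + [120]                # same sample plus one outlier

p_without = two_sided_p(scores)
p_with = two_sided_p(scores_with_outlier)

print(f"Without outlier: mean = {mean(scores):.1f}, P-value = {p_without:.3f}")
print(f"With outlier:    mean = {mean(scores_with_outlier):.1f}, P-value = {p_with:.3f}")
```

A single value is enough to pull the P-value below 0.05 here, which is why a plot of the data should come before any test.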
Beware Outliers!!!
•Be aware of “dropouts” from
statistical analysis.
•Make sure that all the data is
represented in the analysis.
Lack of Significance
•Example 11.14
•In an experiment to compare methods for reducing transmission of
HIV, subjects were randomly assigned to a treatment group and a
control group. Result: the treatment group and the control group had
the same rate of HIV infection. Researchers described this as an
“incident rate ratio” of 1.00. (>1.00 means greater rate of infection
among treatment group, <1.00 means greater rate among control).
•The 95% confidence interval for the incident rate ratio was reported at
0.63 to 1.58.
•Can you really say that the treatment has no effect?
Lack of Significance
•Design a study so that it has a high probability of
finding a real effect.
•What could you do to increase the chances of finding
an effect?
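One answer is a larger sample. The sketch below computes the probability that an upper-tailed one-sample z test rejects H0 when a real effect exists (the power of the test). All of the numbers (null mean 100, true mean 105, σ = 15, and the sample sizes) are hypothetical values chosen for illustration, and `power_one_sided_z` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of an upper-tailed one-sample z test when the true mean is mu_alt."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)           # rejection cutoff on the z scale
    shift = (mu_alt - mu0) / (sigma / sqrt(n))       # true effect in standard errors
    return 1 - std_normal.cdf(z_crit - shift)        # P(reject H0 | mu = mu_alt)

# Hypothetical scenario: H0 mean 100, true mean 105, sigma 15.
powers = {n: power_one_sided_z(100, 105, 15, n) for n in (10, 30, 100)}
for n, power in powers.items():
    print(f"n = {n:3d}: power = {power:.2f}")
```

In this scenario the chance of detecting the same real effect climbs from under one half to over 90% as n goes from 10 to 100.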
Invalid Statistical Inference
•Hawthorne effect
•What is the term for a study where neither the subject
nor the administrator knows who is getting what
treatment?
Invalid Statistical Inference
•The importance of an SRS from the population of INTEREST.
Multiple Analyses
•A study using an alpha level of 0.05 is run for 20
different types of soda to see if there is an association
between drinking soda and scoring well on a math
test.
•It is found that one soda, Mountain Dew, did increase
scores.
•Why is this not good evidence of an effect?
HOMEWORK!!!
 11.43, 11.46, 11.48
 Friday: Chapter 11 TEST