PPT Lecture Notes

Confidence Intervals & Effect Size
Outline of Today’s Discussion
1. Confidence Intervals
2. Effect Size
3. Thoughts on Independent Groups Designs
The Research Cycle
[Diagram: a cycle linking Real World, Research Representation, Research Results, and Research Conclusions. The connecting steps are labeled Abstraction, Methodology (1. Observational, 2. Survey, 3. Experimental), Data Analysis, and Generalization.]
Part 1
Confidence Intervals
A.K.A.
How Big is Your Error Bar?
Confidence Intervals
“A picture is worth a thousand…p-values!”
(say it with me)
[Figure: two bar graphs, each titled “The Effectiveness of Drug X”. Y-axis: Mean Effectiveness (0 to 12). X-axis: Treatment (Drug X, Placebo).]
Confidence Intervals
Which graph makes a more convincing
case for Drug X, and why?
Confidence Intervals
In some graphs, the error bars
reflect the range (min to max).
Confidence Intervals
In some graphs, the error bars
reflect the inter-quartile range.
Confidence Intervals
In some graphs, the error bars
reflect one standard deviation.
Confidence Intervals
In some graphs, the error bars
reflect one standard error (of the mean).*
Confidence Intervals
In still other graphs, the error bars
reflect a confidence interval. *
Confidence Intervals
1. Standard Error (of the Mean) – The standard deviation of the “distribution of means” (D.O.M.).
• The standard deviation describes the average extent to which a RAW SCORE (that’s one raw score) deviates from the mean of the distribution of raw scores.
• The standard error describes the average extent to which a SAMPLE MEAN (that’s the mean of one sample) deviates from the mean of the distribution of means (D.O.M.).
Confidence Intervals
Three Kinds of Distributions
There are three kinds of distributions
A. The distribution of the population of individuals
B. The distribution of a sample
C. The distribution of means (of samples)
Critical Thinking Question: Why is the D.O.M. so ‘skinny’?
Confidence Intervals
Main Points on the D.O.M.
• Q: Why would we want to use the standard
deviation of the D.O.M.?
• A: So we can put a mean in context!
• This is similar to the rationale for knowing the SD
of a distribution of raw scores…whether we have
a raw score or a mean we want some CONTEXT.
Confidence Intervals
Main Points on the D.O.M.
• Example: Your new drug is given to a sample of depressed patients.
Subsequently, the sample’s mean mood score is 25, whereas the mean
for the population of all depressed people is 20.
• Did our drug have a significant effect?
• IT DEPENDS!!!!
• If the D.O.M. has a standard deviation of 10 units, then our sample mean is
not so different from the D.O.M. mean. Our drug isn’t so special.
• If the D.O.M. has a standard deviation of 1 unit, then our sample mean
is very different from the D.O.M. mean. Our drug is hot stuff!!!
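The “it depends” reasoning above can be made concrete by expressing the 5-point difference in standard-error units. The sketch below is illustrative only: the function name is made up for this example, and the numbers are the ones used on the slide.

```python
def distance_in_standard_errors(sample_mean, dom_mean, standard_error):
    """How many standard errors the sample mean lies from the D.O.M. mean."""
    return (sample_mean - dom_mean) / standard_error

# Values from the example: sample mean 25, population (D.O.M.) mean 20.
wide_dom = distance_in_standard_errors(25, 20, 10)   # D.O.M. SD = 10 units
narrow_dom = distance_in_standard_errors(25, 20, 1)  # D.O.M. SD = 1 unit

print(wide_dom)    # 0.5 standard errors: not so different
print(narrow_dom)  # 5.0 standard errors: very different
```

The same 5-point difference is half a standard error in one case and five standard errors in the other, which is why the standard error is needed for context.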
Confidence Intervals
Main Points on the D.O.M.
• The standard error IS the standard deviation of the
distribution of means (D.O.M.).
• We can estimate the standard deviation of the D.O.M. from a
sample. To do so, we use the equation
S.E. = SD_sample / sqrt(n).
Please memorize this formula!
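One way to see that the formula is plausible is to simulate a distribution of means and compare its standard deviation with the estimate from a single sample. This is a rough sketch under assumed values: the population (mean 100, SD 15), the sample size, and the number of samples are all hypothetical choices, not part of the lecture.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100, 15   # hypothetical population
N = 25                       # size of each sample
NUM_SAMPLES = 2000           # how many sample means to collect

# Build an approximate distribution of means (D.O.M.).
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

sd_of_dom = statistics.stdev(sample_means)

# Estimate the same quantity from ONE sample: S.E. = SD_sample / sqrt(n).
one_sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
standard_error = statistics.stdev(one_sample) / math.sqrt(N)

print(f"SD of simulated D.O.M.: {sd_of_dom:.2f}")
print(f"S.E. from one sample:   {standard_error:.2f}")
```

Both numbers should land near 15 / sqrt(25) = 3, although the single-sample estimate will vary more from run to run.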
Confidence Intervals
1. Confidence Interval – A range of values assumed, with a specified degree of confidence (i.e., probability), to include a population parameter (usually the mean).
2. Example 1: We might be, say, 95% confident that the mean height in our room is in the range between 5’ 7’’ and 5’ 9’’.
3. Example 2: We might be, say, 99% confident that the mean height in our room is in the range between 5’ 6’’ and 5’ 10’’.
4. Critical Thinking Question: Why is the 99% confidence interval wider than the 95% confidence interval?
Confidence Intervals
1. Each confidence interval has an upper bound and a lower bound.
2. The upper & lower bounds depend on
- The mean
- The standard error [ S.D. / sqrt(n) ]
- The confidence level (95% versus 99%)
3. The confidence level sets the critical value of ‘t’ (the number to beat)…
Confidence Intervals
1. If we want a 95% confidence interval, we’ll need to find the critical value of ‘t’ at α = 0.05 (two-tailed).
2. If we want a 99% confidence interval, we’ll need to find the critical value of ‘t’ at α = 0.01 (two-tailed).
3. Upper Bound = Mean + (t_crit * S.E.)
4. Lower Bound = Mean - (t_crit * S.E.)
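The two bound formulas can be sketched as a small function. The sample values below are hypothetical (they are not the practice items), and the critical value 2.064 is the two-tailed t for 24 degrees of freedom at α = 0.05 as listed in a standard t table.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Return (lower bound, upper bound) as Mean -/+ (t_crit * S.E.)."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin

# Hypothetical sample: mean = 50, S.D. = 10, n = 25, so D.F. = 24.
# 2.064 is the two-tailed critical t for D.F. = 24 at alpha = 0.05,
# taken from a standard t table.
lower, upper = confidence_interval(mean=50, sd=10, n=25, t_crit=2.064)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # 95% CI: 45.87 to 54.13
```

Here S.E. = 10 / sqrt(25) = 2, so the margin is 2.064 * 2 = 4.128 on either side of the mean.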
Confidence Intervals
1. Practice Item 1: Assume that a sample in your experiment has the following features:
Mean = 10
S.D. = 8
n = 16
D.F. = 15
t_crit(15) = 2.13 at 0.05 alpha level
2. Compute the 95% confidence interval.
Confidence Intervals
1. Practice Item 2: Assume that a sample in your experiment has the following features:
Mean = 10
S.D. = 8
n = 16
D.F. = 15
t_crit(15) = 2.95 at 0.01 alpha level
2. Compute the 99% confidence interval.
Confidence Intervals
1. To summarize, researchers can make their error bars equal to confidence intervals, instead of the standard deviation.
2. The researchers might then say: “We are 95% confident that the population mean falls between (lower bound) and (upper bound).”
3. Higher confidence levels produce wider confidence intervals.
Part 2
Effect Size
Effect Size & Meta-Analysis
There is Trouble in Paradise
(Say it with me)
Effect Size & Meta-Analysis
1. One major problem with Null Hypothesis Testing (i.e., inferential statistics) is that the outcome depends on sample size.
2. For example, a particular set of scores might generate a non-significant t-test with n=10. But if the exact same numbers were duplicated (n=20), the t-test could suddenly become “significant”.
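The duplication example can be tried directly. The sketch below uses a one-sample t-test against a null mean of 0 as a simple stand-in (the slide does not say which t-test it has in mind), and the scores are invented for illustration. The critical values are the two-tailed values at α = 0.05 from a standard t table.

```python
import math
import statistics

def one_sample_t(scores, null_mean=0.0):
    """t = (sample mean - null mean) / (SD_sample / sqrt(n))."""
    standard_error = statistics.stdev(scores) / math.sqrt(len(scores))
    return (statistics.mean(scores) - null_mean) / standard_error

# Hypothetical scores, invented for this illustration.
scores = [2, -1, 3, 0, 4, -2, 5, 1, -3, 6]
doubled = scores * 2  # the exact same numbers, duplicated

# Two-tailed critical t values at alpha = 0.05 from a standard t table.
T_CRIT_DF9 = 2.262   # n = 10
T_CRIT_DF19 = 2.093  # n = 20

t_10 = one_sample_t(scores)
t_20 = one_sample_t(doubled)
print(f"n=10: t = {t_10:.2f}, significant: {abs(t_10) > T_CRIT_DF9}")
print(f"n=20: t = {t_20:.2f}, significant: {abs(t_20) > T_CRIT_DF19}")
```

The mean is identical in both runs; only the standard error shrinks (and the critical value drops slightly), which is enough to push t past the number to beat.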
Effect Size & Meta-Analysis
1. Effect Size – The magnitude of the influence that the IV has on the DV.
2. Effect size does NOT depend on sample size!
(“And there was much rejoicing!”)
Effect Size & Meta-Analysis
1. A commonly used measure of effect size is Cohen’s d.
2. Conventions for Cohen’s d:
d = 0.2 small effect
d = 0.5 medium effect
d = 0.8 large effect
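The slides do not give a formula for Cohen’s d. A common form for two independent groups divides the difference between the group means by the pooled standard deviation; the sketch below assumes that form, and the scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference between group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical effectiveness scores for two independent groups.
drug = [2, 4, 6, 8, 10]
placebo = [1, 3, 5, 7, 9]

d = cohens_d(drug, placebo)
print(f"d = {d:.2f}")  # d = 0.32
```

By the conventions above, a d of about 0.32 falls between a small and a medium effect.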
Effect Size & Meta-Analysis
1. A statistically significant effect is said to be a ‘reliable effect’… it would be found repeatedly if the sample size were sufficient.
2. Statistically significant effects are NOT LIKELY due to chance.
3. An effect can be statistically significant, yet ‘puny’.
4. There is an important distinction between statistical significance and practical significance…
Effect Size
Examples that distinguish effect size and statistical significance…
1. Analogy to a Roulette Wheel – An effect can be small, but reliable.
2. Anecdote about the discovery of the planet Pluto – An effect can be small, but reliable.
3. Anecdote about buddy’s doctoral thesis, “Systematic non-linearities in the production of time intervals”.
4. Denison versus “Other” in S.A.T. scores.
Effect Size & Meta-Analysis
1. Potential Pop Quiz Question – Using two sentences, generate your own novel example of a meta-analysis.
http://en.wikipedia.org/wiki/Meta-analysis
2. Potential Pop Quiz Question – In your own words, explain how Cohen’s d can be helpful in a meta-analysis.
Part 3
Thoughts on Independent Groups Designs
From Shaughnessy, Zechmeister, Zechmeister (2012)
Independent Groups Designs
1. Potential Pop Quiz Question – As we’ve seen,
inferential statistics can address the issue of
reliability. Statistically significant effects are
‘reliable effects’. What is the ultimate test of
an experiment’s reliability? (One word will
do.)
2. Potential Pop Quiz Question – In your own
words, explain what a conceptual replication
is. Use an example of your own, or from the
readings.
Independent Groups Designs
1. Potential Pop Quiz Question – In your own
words, explain what a matched groups design
is, and when it can be advantageously used.
2. Potential Pop Quiz Question – As we’ve noted
many times, the scientific method has 4 goals.
Which goal or goals can be met by a natural
groups design, and which cannot? Explain
your reasoning.