The New Statistics
The New Statistics:
Why & How
Corey Mackenzie, Ph.D., C. Psych
http://www.latrobe.edu.au/scitecheng/about/staff/profile?uname=GDCumming
http://www.latrobe.edu.au/psy/research/cognitive-and-developmental-psychology/esci
Outline
• Need for changes to how we conduct research
– Three threats to research integrity
– Shift from Null Hypothesis Significance Testing (NHST)
• 3 “new” solutions
– Estimation
– Effect sizes
– Meta-analysis
1st change to how we do research: Enhance
research integrity by addressing three threats
Threat to Integrity #1
• We must have complete reporting of findings
– Small or large effects, important or not
• Challenging because journals have limited
space and are looking for novel, “significant”
findings
• Potential solutions
– Online data repositories
– New online journals
– Open-access journals
Threat to Integrity #2
• We need to avoid selection and bias in data
analysis (e.g., cherry picking)
• How?
– Prespecified research in which critical aspects of
studies are registered beforehand
– Distinguishing exploratory from prespecified
studies
Threat to Integrity #3
• We need published replications (ideally with
more precise estimates than original study)
– Key for meta-analysis
– Need greater opportunities to report them
2nd change to how we do research: stop evaluating
research outcomes by testing the null hypothesis
Problems with p-values
In April 2009, people rushed to Boots pharmacies in
Britain to buy No. 7 Protect & Perfect Intense Beauty
Serum. They were prompted by media reports of an
article in the British Journal of Dermatology stating that
the anti-ageing cream “produced statistically significant
improvement in facial wrinkles as compared to baseline
assessment (p = .013), whereas [placebo-treated] skin
was not significantly improved (p = .11)”. The article
claimed a statistically significant effect of the cream
because p < .05, but no significant effect of the control
placebo cream because p > .05. In other words, the
cream had an effect, but the control material didn’t.
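The reasoning in that last sentence is the trap: "significant" versus "not significant" is not itself evidence of a difference between the two treatments. A minimal sketch below, using made-up numbers rather than the trial's actual data, shows two change-from-baseline effects that land on opposite sides of p = .05 even though the direct comparison between them is nowhere near significant.

```python
# Sketch with hypothetical numbers, not the trial's actual data.
import numpy as np
from scipy import stats

def change_from_baseline(mean_change, sd, n, label):
    """p-value and 95% CI for a mean change from baseline."""
    se = sd / np.sqrt(n)
    t = mean_change / se
    p = 2 * stats.t.sf(abs(t), df=n - 1)
    hw = stats.t.ppf(0.975, df=n - 1) * se           # CI half-width
    print(f"{label}: change = {mean_change:.2f}, "
          f"95% CI [{mean_change - hw:.2f}, {mean_change + hw:.2f}], p = {p:.3f}")
    return mean_change, se

cream = change_from_baseline(0.30, 0.80, 45, "Cream  ")    # p just under .05
placebo = change_from_baseline(0.20, 0.85, 45, "Placebo")  # p just over .05

# The question that matters is whether the cream beat the placebo:
diff = cream[0] - placebo[0]
se_diff = np.sqrt(cream[1]**2 + placebo[1]**2)
p_diff = 2 * stats.norm.sf(abs(diff / se_diff))
print(f"Cream vs placebo: difference = {diff:.2f}, p = {p_diff:.3f}")  # ~ .57
```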
Problems with NHST
• Kline (2004) What’s Wrong with Stats Tests
– 8 Fallacies about null hypothesis testing
• Encourages dichotomous thinking, but effects
come in shades of grey
– p = .001, .04, .06, .92
• NHST is strongly affected by sample size
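To make the sample-size point concrete, here is a rough sketch (my own illustration, not from the talk): the standardized effect is held fixed at d = 0.2, yet the p-value moves from clearly "non-significant" to highly "significant" purely because n grows.

```python
# Same modest effect, very different p-values: only n changes.
import numpy as np
from scipy import stats

d = 0.2                                   # fixed standardized mean difference
for n in (20, 100, 500, 2000):            # per-group sample sizes
    se = np.sqrt(2 / n)                   # approximate SE of d with equal groups
    p = 2 * stats.norm.sf(d / se)         # two-sided p-value
    print(f"n per group = {n:5d}  ->  p = {p:.4f}")
# roughly: n=20 -> .53, n=100 -> .16, n=500 -> .002, n=2000 -> <.0001
```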
Solution #1: Estimation
• Support for Bill 32 is 53% in a poll with an
error margin of 2%
– i.e., 53% (95% CI: 51% to 55%)
vs
• Support is statistically significantly greater
than 50%, p < .01
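A minimal sketch of the estimation framing of that poll, in Python. The sample size of 2,400 is an assumption chosen only because it reproduces the stated 2% margin of error.

```python
import numpy as np
from scipy import stats

p_hat, n = 0.53, 2400                            # observed support; assumed n
se = np.sqrt(p_hat * (1 - p_hat) / n)            # SE of a sample proportion
moe = stats.norm.ppf(0.975) * se                 # 95% margin of error (~2%)
print(f"Support: {p_hat:.0%}, 95% CI [{p_hat - moe:.1%}, {p_hat + moe:.1%}]")

# The NHST-only framing hides the estimate and its precision:
z = (p_hat - 0.50) / np.sqrt(0.25 / n)
print(f"Test against 50%: p = {stats.norm.sf(z):.4f}")   # one-sided p < .01
```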
Solution #2: Effect sizes
• http://en.wikipedia.org/wiki/Effect_size
• http://lsr-wiki-01.mrccbu.cam.ac.uk/statswiki/FAQ/effectSize
• G*Power
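For those who prefer a few lines of code to the references above, here is a minimal sketch of one widely used effect size, Cohen's d with a pooled standard deviation, computed on hypothetical data (the slide itself only gives pointers).

```python
import numpy as np

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * np.var(group1, ddof=1) +
                  (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var)

rng = np.random.default_rng(1)
treatment = rng.normal(10.5, 2.0, size=40)   # hypothetical scores
control = rng.normal(10.0, 2.0, size=40)
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```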
Solution #3: Meta-analysis
• p-values play little or no role, apart from their
negative contribution via the file-drawer effect
(non-significant results going unpublished)
• Overcomes the wide confidence intervals often
given by individual studies
• Can make sense of messy and disputed research
literatures
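As a minimal sketch of the core calculation, here is a fixed-effect (inverse-variance) meta-analysis of five made-up study results; the pooled confidence interval comes out much narrower than any individual study's interval.

```python
import numpy as np
from scipy import stats

effects = np.array([0.40, 0.15, 0.55, 0.25, 0.35])   # per-study effect sizes (made up)
ses = np.array([0.25, 0.30, 0.28, 0.22, 0.26])       # per-study standard errors (made up)

weights = 1 / ses**2                                  # inverse-variance weights
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1 / np.sum(weights))
hw = stats.norm.ppf(0.975) * pooled_se                # 95% CI half-width
print(f"Pooled effect = {pooled:.2f}, 95% CI [{pooled - hw:.2f}, {pooled + hw:.2f}]")
```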
Why do we love P?
• Suggests importance
• We’re reluctant to change
• Confidence intervals are sometimes
embarrassingly wide
– 9 ± 12
– But this accurately indicates unreliability of data
Why might we change?
• 30 years of damning critiques of NHST
• 6th edition of APA publication manual
– Used by more than 1000 journals across disciplines
– Researchers should “wherever possible, base
discussion and interpretation of results on point and
interval estimates”
• http://www.sagepub.com/journals/Journal200808/manuscriptSubmission
Epi Example