A Look at Why So Many People Find Statistics Frustrating


Understanding Lies, Damn Lies,
and Statistics: A Look At Why So
Many People Find Statistics
Frustrating
John P. Holcomb, Jr.
Cleveland State University
Ohio MAA Section Meeting
April 1, 2005
Outline
• Why do statisticians find public reporting of
statistics frustrating?
• Why does the public find statistics frustrating?
• Why do students find statistics frustrating?
• What are some major differences between
statisticians and mathematicians?
• Emphasize our similarities
"There are Three Kinds of Lies:
Lies, Damn Lies and Statistics."
• Attributed to Benjamin
Disraeli (1804 - 1880)
• Prime Minister (1868,
1874-1880)
• Said to be
popularized by Mark
Twain in the United
States
Statistics-Affirming Quotations
• Frederick Mosteller (Harvard
University)
• “It is easy to lie with statistics, but it is
easier to lie without them.”
What Drives Statisticians Nuts?
Yahoo! News, (September 7, 2004)
Study Links TV to Teen Sexual
Activity
• “Teenagers who watch a lot of
television with sexual content are twice
as likely to engage in intercourse than
those who watch few such programs.”
(Reuters)
• Rebecca Collins, “This is the strongest
evidence yet that the sexual content of
television programs encourages
adolescents to initiate sexual
intercourse and other sexual activities.”
• The problem is this is an Observational
Study
• Did not sit 1,792 adolescents down and
force them to watch television
• Adolescents chose their own “treatment”
Confounding
• Occurs when some other variable(s)
affects both the independent variable (TV
watching) and the dependent variable
(Sexual Activity)
• Can be obvious and not-so-obvious
• This is hard for statistics students when it
is covered in class, but for the public …
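A minimal simulation sketch of confounding. Every number in it is invented for illustration (nothing comes from the study): a hypothetical "supervised" variable lowers both the chance of heavy TV viewing and the chance of the outcome, while TV itself has no effect at all.

```python
import random

def simulate_observational(n=100_000, seed=1):
    """Simulate teens who choose their own TV "treatment".

    All probabilities are hypothetical. TV has NO causal effect here.
    Returns the outcome rate for heavy viewers (True) and light viewers (False).
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # heavy_tv -> [outcomes, total]
    for _ in range(n):
        supervised = rng.random() < 0.5
        # The confounder affects the "treatment": supervised teens watch less.
        heavy_tv = rng.random() < (0.2 if supervised else 0.7)
        # The confounder affects the outcome; heavy_tv is deliberately not used.
        outcome = rng.random() < (0.1 if supervised else 0.3)
        counts[heavy_tv][0] += outcome
        counts[heavy_tv][1] += 1
    return {tv: hits / total for tv, (hits, total) in counts.items()}

rates = simulate_observational()
print(f"outcome rate, heavy TV: {rates[True]:.3f}")
print(f"outcome rate, light TV: {rates[False]:.3f}")
```

Under these made-up settings the heavy viewers show a clearly higher outcome rate (roughly 0.26 versus 0.15 in theory), even though TV does nothing in the simulated world.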
Problem with All Observational
Studies
• Cannot assume there is no confounding
• So critics always have opportunity to
criticize observational studies
• This is the tobacco industry's
defense of smoking
So why am I concerned?
• There is no mention of the role of
parental supervision
• What is the consequence?
• The public is misled about the meaning of
the result
Experiments
• Allow researchers to make “causal”
conclusions
• Randomly assign subjects to “treatments”
and “control” to ensure balance
– Control does not necessarily mean “sugar pill”
• Both groups are alike, on average, on every
known variable as well as every unknown variable
EXCEPT the treatment variable
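A minimal sketch of why random assignment balances the groups, again with purely hypothetical numbers. A coin flip decides the arm, so a background variable the researcher never measured (here called "supervised") ends up split about evenly between treatment and control.

```python
import random

def simulate_experiment(n=100_000, seed=2):
    """Randomly assign subjects to arms; the treatment has no effect here.

    All probabilities are invented for illustration.
    """
    rng = random.Random(seed)
    arms = {name: {"n": 0, "supervised": 0, "outcome": 0}
            for name in ("treatment", "control")}
    for _ in range(n):
        supervised = rng.random() < 0.5          # unmeasured background variable
        arm = "treatment" if rng.random() < 0.5 else "control"  # coin flip
        outcome = rng.random() < (0.1 if supervised else 0.3)
        arms[arm]["n"] += 1
        arms[arm]["supervised"] += supervised
        arms[arm]["outcome"] += outcome
    return {
        name: {
            "supervised_share": a["supervised"] / a["n"],
            "outcome_rate": a["outcome"] / a["n"],
        }
        for name, a in arms.items()
    }

result = simulate_experiment()
for name, stats in result.items():
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

Because the coin flip ignores everything about the subject, both arms have nearly the same share of supervised subjects and, with no real treatment effect, nearly the same outcome rate.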
Example II
• July 9, 2002, The Journal of the American
Medical Association releases the results of
the “Women’s Health Initiative (WHI)”
• Headlines Across America warned women
about the risks from Hormone
Replacement Therapy (HRT)
• New York Times: Study Is Halted Over
Rise Seen In Cancer Risk
Belief: Estrogen and Progesterone would
help women live healthier lives
Findings:
• Increased risk for breast cancer (26%)
• Increased risk of heart disease (29%)
• Increased risk of Stroke (41%)
Previous Good News
• 1962 – Observational study suggests estrogen
therapy reduces risk of breast and genital
cancers
• 1980 – A study shows that estrogen and
progesterone together reduce risk for
endometrial cancer
• 1985 – The Nurses’ Health Study, with 121,964
subjects finds lower rate of heart disease in
those taking progesterone
• 1995 – Same study finds that estrogen and
progesterone reduce heart attack risk by 39%
Ethical Question
• For the WHI, can we deprive the control
group of this great treatment?
What Went Wrong?
• One major issue – Nurses’ Health Study is
observational
• WHI is a clinical Trial
• One theory is the confounder is health –
healthier nurses took the HRT and stayed
on the HRT
• Another theory is the nature of the study –
those who had some kind of heart ailment
stopped taking medicine
• Even though WHI was a clinical trial
(experiment), informed consent can add
bias
• Also, women in the WHI were older (most
were 60 or older rather than just going through
menopause)
Caution
• Observational Studies are not useless
• Often point to issues needing further
investigation
– Experiments
– Animal Studies
What Did Not Make the Headlines
(or Even the Article)
• Recall the earlier increase:
– Breast cancer (26%)
– 8 more cases for every 10,000 women
– For 8 more cases to equal a 26% increase, the placebo rate X per 10,000 must satisfy:
$\frac{X + 8}{X} = 1.26 \;\Rightarrow\; 8 = 0.26X \;\Rightarrow\; X \approx 30.77$
P(Breast Cancer in Placebo Group) = 31/10,000 = .0031
P(Breast Cancer in the HRT Group) = 39/10,000 = .0039
THESE ARE STILL VERY
SMALL PROBABILITIES!
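The arithmetic above can be sketched in a few lines. The helper name is hypothetical; the inputs (8 extra cases per 10,000 women and a 26% relative increase) are the figures quoted on the slide.

```python
def baseline_from_increase(absolute_increase, relative_increase):
    """Solve (X + absolute_increase) / X = 1 + relative_increase for X."""
    return absolute_increase / relative_increase

# Cases per 10,000 women, using the slide's figures.
placebo_per_10k = baseline_from_increase(8, 0.26)
hrt_per_10k = placebo_per_10k + 8

print(f"placebo: {placebo_per_10k:.2f} per 10,000 (p = {placebo_per_10k / 10_000:.4f})")
print(f"HRT:     {hrt_per_10k:.2f} per 10,000 (p = {hrt_per_10k / 10_000:.4f})")
print(f"relative risk: {hrt_per_10k / placebo_per_10k:.2f}")
print(f"absolute risk difference: {(hrt_per_10k - placebo_per_10k) / 10_000:.4f}")
```

The same data give a relative increase of 26% and an absolute increase of 0.0008, which is why reporting only the relative figure can alarm readers.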
Frustrations:
1. Difference between observational studies
and experiments is subtle
2. For statisticians, there is no
contradiction, but for the public and even
scientists, there is a glaring contradiction
3. Confirms the culture of disbelief – and
who is blamed?
4. There is inherent uncertainty in the
process
Statistics is Perfect for the Law
• Since all conclusions are based on
probability – we can never say anything
definitively
• Probabilities of exactly 0 or 1 are rarely,
if ever, achieved in practice
Implications for Teaching
• These are the topics we need to discuss
– Study Design
– Confounding and Causation
– Treatment vs. Placebo
– Absolute and Relative Risk
– Uncertainty
• “All models are wrong, but some are
useful”
– George Box (University of Wisconsin)
Further Implications
• In the courses:
– Introductory statistics
– Statistical literacy
– Mathematics for liberal arts
• Statistical thinking will one day be as
necessary a qualification for efficient
citizenship as the ability to read and write.
– H.G. Wells
Rational vs Emotional
• Statistics and Mathematics have the
perception of being rule enforcers
• People do not like being told what to do or
what not to do
• We are constantly saying do not play the
Lottery
– My life is a personal failure
Mega Millions
• July 2, 2004
• Mega-Millions jackpot reaches
$290,000,000
• Probability of winning is
.000000007399 = $7.399 \times 10^{-9}$
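A short sketch of where a probability of this size can come from. The game format is an assumption not stated on the slide: five numbers chosen from 52, plus one Mega Ball chosen from 52, which is the format understood to be in use in 2004.

```python
from math import comb

# Assumed 2004 format (not given on the slide): 5 of 52 white balls + 1 of 52 Mega Balls.
white_ball_combos = comb(52, 5)
mega_ball_choices = 52

total_tickets = white_ball_combos * mega_ball_choices
p_jackpot = 1 / total_tickets

print(f"possible tickets: {total_tickets:,}")
print(f"P(jackpot with one ticket) = {p_jackpot:.3e}")
```

Under that assumption there are 135,145,920 equally likely tickets, which reproduces the 7.399 × 10^-9 figure.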
Fox News Cleveland
Dr. Killjoy
• 57 times more likely to die in a motor
vehicle accident that day than to win
MegaMillions
• 21 times more likely to die from a lightning
strike in a year than to win MegaMillions
Why Do Students Find Statistics
Frustrating?
1. Stilted Language
– Recall an earlier phrase
• “Cannot assume there is no confounding”
– We are the masters of the double negative
Confidence Intervals
• Students want to say
– The probability the mean is in the interval is
95%
• What we require them to say
– “We are 95% confident the interval (a,b)
captures the unknown population mean”
– When drawing random samples from a
population, calculating the intervals in this
manner captures the unknown mean 95% of
the time.
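The "95% of the time" wording describes the procedure, not any single interval. A minimal simulation sketch, assuming a normal population with an arbitrary, made-up mean and standard deviation, counts how often the interval captures the true mean over many repeated samples.

```python
import random
import statistics

def coverage(true_mean=10.0, sd=2.0, n=50, reps=2000, seed=3):
    """Fraction of repeated samples whose interval captures true_mean.

    Each interval is xbar +/- 1.96 * s / sqrt(n); the population values
    are hypothetical and chosen only for the demonstration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        half_width = 1.96 * s / n ** 0.5
        hits += (xbar - half_width) <= true_mean <= (xbar + half_width)
    return hits / reps

rate = coverage()
print(f"share of intervals capturing the true mean: {rate:.3f}")
```

The share should land near 0.95. Any one interval either contains the mean or it does not, and in practice we never find out which.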
Hypothesis Testing
• Want to say “Accept Null”
• Have to say “Fail to Reject Null”
– (AND we make them put in context)
• Again we statisticians can’t be certain (or
accepting) of anything
2. Look At What We Make Them Do
3. Statistics Taught By Folks Who
Are Not Trained Statisticians
• Statistics was added “on the side” to their
training
• Not sure of the “why”, so it is difficult to
motivate
• Teaching statistics is “scraping the bottom
of the barrel” in classroom assignments
• “In God We Trust, All Others Bring Data”
• W. Edwards Deming (TQM Guru)
• At CSU, there are at least 7 different
departments teaching some kind of
introductory statistics comprising over 100
faculty
• Only 4 faculty on campus have a Ph.D. in
Statistics
• At many schools that may be even lower
Differences Between Mathematics
and Statistics
• Statistics is too dirty
• Mathematics is pure and pristine
• Mathematics is built on axioms, definitions,
and theorems
• Statistics is built on “flawed” processes
right from the very beginning
Inferential Statistics
Giant Leaps of Faith
• Assume the population is definable
• Assume the population is stable
• Assume the sample is representative (bias
free)
• If all this is true, then we can rely on
Mathematics for our confidence interval to
capture the mean 95% of the time.
• Often mathematicians want “perfect”
studies or nothing
• “If you do not know what to measure,
measure anyway, you’ll learn what to
measure next time.”
– David Moore (Purdue University)
• Assessment
No Quod Erat Demonstrandum
• I get a representative sample
• The sample size is large enough to invoke
the Central Limit Theorem
• I calculate $\bar{X} \pm 1.96 \, \frac{s}{\sqrt{n}}$
• I still do not know if my interval contains the
unknown mean
ERGO
• I have to wonder . . .
• Mathematicians do not like uncertainty
Difference #2
• Applied Statisticians have to communicate with
other researchers
• These researchers often have limited statistical
training
• (Present company excluded), mathematicians
are not exactly known for their patience with
those deemed less worthy
• The main challenge is to take a scientific
hypothesis and turn it into a testable statistical
hypothesis
• Have to convince researchers that input prior to
collecting data is critical
– Cleveland Cavaliers
• Have to educate them not to “Stone the
Messenger”
Difference #3
• Statisticians make more money
• Statisticians have more job options
• Go to icrunchdata.com
Job No. | Job Title | Company Name | Date Posted | State | Exp. | Salary
825 | Senior Marketing Analyst | Advanced Financial Services, Inc. | 3/28/2005 | RI | 5-8 | 80-89K
824 | Employment Systems Analyst & Researcher | University of Connecticut | 3/28/2005 | CT | 0-2 | --
823 | Senior Research Analyst Fortune 100 Company | UnitedHealth Group | 3/25/2005 | MN | 3-4 | --
822 | Manager, Statistical Analysis | The Brixton Group, Inc. | 3/24/2005 | VA | 5-8 | 100-109K, 110-119K, 120-129K, 130-139K, 140-149K
821 | Sr. Statistician | The Brixton Group, Inc. | 3/24/2005 | VA | 5-8 | 90-99K, 100-109K
820 | Business Analyst | The Brixton Group, Inc. | 3/24/2005 | VA | 0-2 | 50-59K, 60-69K
819 | Informatics Statistics Manager, Senior/Lead Informatics Analyst and Informatics Analyst | BSA Advertising for Aetna | 3/22/2005 | PA | 0-2 | --
818 | DATABASE MARKETING SPECIALIST | Home Shopping Network | 3/21/2005 | FL | 0-2 | --
810 | Statistician (Marketing) | Vistrio | 3/21/2005 | OH | 3-4 | open
• Try going to www.idoproofs.com
• Great Opportunities in Math
– 101 Careers in Mathematics
– http://www.maa.org
My Own History
• BS in Mathematics
• MS in Mathematics
• Took Prelims in Real Analysis, Topology,
Complex Analysis, and Math Stat
• Would have gotten a Ph.D. in mathematics
…
• I do love Mathematics and Mathematicians
• HONEST!
Why Can’t We Be Friends???
• Undergraduate Math Departments Need
Math Majors
• Graduate Statistics Departments Need
applicants
• We need to offer mathematically talented
students as many options as possible
Easier Said Than Done
• We need to let undergraduates know what
statistics is
• Traditional Probability and Statistics
sequence is NOT statistics
• Students need authentic experience
working with data
Enrollment
• 264,000 students took Elementary
Statistics according to the 2000 CBMS survey
• www.ams.org/cbms
• 77,000 to take AP STATS in 2005
• These people are NOT welcome in
Mathematics Departments
If I were King of the World …
• Calculus I, II, III
• Linear Algebra
• Intro Proof/Discrete
• Differential Equations
• Real Analysis
• Probability
• Math Stat
• Applied Stats
Do Not Reinvent The Wheel
• The American Statistical Association has
guidelines:
– Majors
– Concentrations
– Minors
– Google Search USEI Guidelines
– Journal of Statistics Education
• www.amstat.org/jse
Shameless Plug …
• Check out an innovative statistics course
for majors at www.rossmanchance.com
(click ISCAT link)
• Beth Chance and Allan Rossman
• Investigating Statistical Concepts,
Applications, and Methods (Duxbury)
• MAA PREP Workshop July 18-22
• www.maa.org/prep/2005
Goals
• Show specific examples of frustrating news
stories involving statistics
• Discuss the importance of these “soft” ideas in
low-level courses
• “Feel the Pain” of my own tortured statistics
students
• Discuss the differences between statistics and
mathematics
• Talk about how we need each other –
desperately!!!
Last Quote
“To Understand God’s Thoughts We Must
Study Statistics, for These Are the
Measure of his Purpose”