Overview of How To Lie With Statistics by Darrell Huff
With additional insights
Chapter 1 - Sampling Biases
• Response Bias: Tendency for people to over- or
under-state the truth
• Non-response: People who complete surveys are
systematically different from those who fail to
respond. Accessibility/Pride.
• Representative Sample: One where all sources of
bias have been removed. (Counterexample: the
Literary Digest poll)
• Questionnaire wording/Interviewer effects
• Recall Bias: Tendency for one group to remember
prior exposure better than another group in
retrospective studies
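As a rough sketch of the non-response point above, the following computes the share of a trait among survey respondents when people with the trait answer at a different rate. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def observed_share(true_share: float, p_resp_with: float, p_resp_without: float) -> float:
    """Expected share of respondents who have a trait, given unequal response rates."""
    responders_with = true_share * p_resp_with
    responders_without = (1 - true_share) * p_resp_without
    return responders_with / (responders_with + responders_without)

# Hypothetical numbers: 30% of the population has the trait, and people
# with the trait are three times as likely to return the survey.
survey_share = observed_share(true_share=0.30, p_resp_with=0.60, p_resp_without=0.20)
print(f"true share: 30%, share among respondents: {survey_share:.1%}")
```

Under these assumed rates the respondents overstate the trait by a wide margin, even though every individual answer is truthful.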
Chapter 2 - Well-Chosen Average
• Arithmetic Mean: Evenly distributes the total
among individuals. Can be unrepresentative
when measurements are highly skewed right.
(e.g. per capita income)
• Median: Value dividing distribution into two
equal parts. 50th percentile. (e.g. median
household income)
• Mode: Most frequently observed outcome (rarely
reported with numeric data)
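To illustrate how the three averages can diverge on right-skewed data, here is a small sketch using Python's standard library. The income figures are invented for the example.

```python
from statistics import mean, median, mode

# Hypothetical annual incomes (in thousands); one very large value skews the data right.
incomes = [20, 20, 20, 25, 30, 35, 40, 45, 450]

avg = mean(incomes)          # pulled upward by the single large income
mid = median(incomes)        # middle value of the sorted list
most_common = mode(incomes)  # most frequently observed value

print(f"mean={avg:.1f}, median={mid}, mode={most_common}")
```

All three numbers can honestly be called "the average", which is why a report should say which one it uses.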
Chapter 3 - Little Figures Not There
• Small samples: Estimators have large standard
errors and can show seemingly very strong
effects by chance alone
• Low incidence rates: Need very large samples for
meaningful estimates of low frequency events
• Significance levels/margins of error: Measures of
the strength and precision of inference
• Ranges: Report ranges or standard deviations
along with means (e.g. “normal” ranges)
• Inferring among individuals versus populations
• Clearly label chart axes
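The sample-size points above can be sketched numerically. The first function below uses the normal (Wald) approximation for a proportion and asks how many observations are needed for the margin of error to be a given fraction of the incidence rate itself; the function name, the 20% relative margin, and the example rates are assumptions made for illustration. The last line shows how easily a small sample produces an extreme-looking result.

```python
from math import ceil, comb
from statistics import NormalDist

def sample_size_for_relative_margin(p: float, rel_margin: float,
                                    confidence: float = 0.95) -> int:
    """Sample size so that z * sqrt(p(1-p)/n) equals rel_margin * p (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z**2 * (1 - p) / (p * rel_margin**2))

# Hypothetical incidence rates, each estimated to within 20% of its own value.
n_common = sample_size_for_relative_margin(p=0.10, rel_margin=0.20)
n_rare = sample_size_for_relative_margin(p=0.001, rel_margin=0.20)
print(f"n for p=0.10: {n_common}, n for p=0.001: {n_rare}")

# Small samples: chance of 8 or more heads in 10 tosses of a fair coin.
p_extreme = sum(comb(10, k) for k in range(8, 11)) / 2**10
print(f"P(at least 8 heads in 10 fair tosses) = {p_extreme:.3f}")
```

The rare event needs roughly a hundred times as many observations for the same relative precision, and a fair coin gives a lopsided 8-of-10 result about one time in twenty.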
Chapter 4 - Much Ado About Nothing
• Probable Error: Estimation error with probability
0.5. If estimator is approximately normal, PE is
approximately 0.675 standard errors. (Old school)
• Margin of Error: Estimation error with probability
0.95. If estimator is approximately normal, ME is
approximately 2 standard errors
• Clinical (practical) significance: In very large
samples an effect may be significant statistically,
but not in a practical sense. Report confidence
intervals as well as P-values.
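The multipliers quoted above can be checked against the standard normal distribution, and the practical-significance point can be illustrated with a large-sample comparison. This is a sketch only: the effect size, standard deviation, and group size are hypothetical, and the test is a simple two-sample z-test assuming known, equal standard deviations.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def error_multiplier(coverage: float) -> float:
    """Number of standard errors z with P(|Z| <= z) = coverage for standard normal Z."""
    return std_normal.inv_cdf(0.5 + coverage / 2)

pe_mult = error_multiplier(0.50)  # probable error multiplier, about 0.67
me_mult = error_multiplier(0.95)  # margin of error multiplier, about 1.96
print(f"PE = {pe_mult:.4f} SE, ME = {me_mult:.4f} SE")

# Hypothetical study: mean difference of 0.2 points on a scale with SD 15,
# with 100,000 subjects in each of two groups.
diff, sd, n = 0.2, 15.0, 100_000
se = sd * sqrt(2 / n)                       # standard error of the difference
z = diff / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided P-value
ci = (diff - me_mult * se, diff + me_mult * se)
print(f"P-value = {p_value:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The P-value falls well below 0.05, yet the whole confidence interval sits under half a point on a scale whose standard deviation is 15, which is why the interval is worth reporting alongside the P-value.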
Chapter 5 - Eye-Catching Graphs
• Choice of ranges on graphs can have huge
impact on interpretation (e.g. percent change)
• Choice of proportion of y-axis to x-axis can
distort as well (very easy to do with modern
software)
• Bar charts can also be distorted by starting the
axis at a positive value, i.e. trimming the bars
below an artificial baseline instead of drawing
them from 0
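One way to quantify the baseline effect is to compare the ratio of drawn bar heights with and without truncation. The helper name and the values below are hypothetical.

```python
def apparent_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """Ratio of drawn bar heights when the axis starts at `baseline` instead of 0."""
    return (larger - baseline) / (smaller - baseline)

# Hypothetical values: 50 versus 55, a 10% difference.
honest = apparent_ratio(50, 55)                  # bars drawn from 0
truncated = apparent_ratio(50, 55, baseline=45)  # axis starts at 45
print(f"from zero: {honest:.2f}x, truncated at 45: {truncated:.2f}x")
```

With the axis starting at 45, a 10% difference is drawn as one bar twice the height of the other.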
Chapter 6 - 1-D Pictures
• Bar Charts and Pictorial Graphs should have
areas proportional to values (only make
comparisons in one dimension)
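The arithmetic behind this rule is short enough to sketch. If a picture is scaled in both height and width by the ratio of the values, its area grows with the square of that ratio; to keep area proportional to the value, each side should scale by the square root instead. The function names are illustrative only.

```python
from math import sqrt

def area_ratio(linear_scale: float) -> float:
    """Area multiplier when both height and width are multiplied by linear_scale."""
    return linear_scale ** 2

def linear_scale_for_value_ratio(value_ratio: float) -> float:
    """Side-length multiplier that makes the area grow exactly by value_ratio."""
    return sqrt(value_ratio)

# A value that doubles, drawn as a picture twice as tall and twice as wide:
print(f"area grows by {area_ratio(2.0):.0f}x")
# Side scale that would show an honest doubling of area:
print(f"honest side scale: {linear_scale_for_value_ratio(2.0):.3f}")
```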
Chapter 7 - Semiattached Figure
• Target Population: Group about which we want to
make inferences
• Study Population: Group or items on which the
experiment or survey is conducted
• When comparative studies are conducted among
products, treatments, or groups, ask: what is the
comparison product, treatment, or group?
• Control for all other potential risk factors when
studying effects of factors
Chapter 8 - Causal Relationships
• Correlation does not imply causation
• Elements of causal relationships
– Association between Y and X
– Clear time ordering (X precedes Y)
– Removal of alternative explanations
(controlling for other factors)
– Dose-Response (when possible)
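A small simulation can illustrate why association alone is not enough. In this sketch a third variable Z drives both X and Y, and X has no effect on Y at all; the variable names, the seed, and the data-generating model are assumptions made for the example. Because the simulated effect of Z is known to be exactly 1, subtracting Z stands in for "controlling for other factors"; with real data one would estimate that adjustment, for instance by regression.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 5000

# Z is a common cause; X and Y each equal Z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_raw = pearson(x, y)
# Remove Z's (known) contribution from both variables, then correlate what is left.
r_adjusted = pearson([xi - zi for xi, zi in zip(x, z)],
                     [yi - zi for yi, zi in zip(y, z)])
print(f"raw correlation: {r_raw:.2f}, after removing Z: {r_adjusted:.2f}")
```

The raw correlation is substantial while the adjusted one is near zero, so the association between X and Y here is explained entirely by the alternative factor Z.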