Transcript Slide 1
Learning and Applying Biostatistics:
How the Guinness Brewery Changed History
Katheryne Downes, M.P.H.
Statistical Data Analyst
Tampa General/USF College of Medicine
Lecture Outline
Part I: The Literature Review
Part II: Statistics
Part III: Sample Size Calculations
Part I:
The Literature Review
Who’s done what?
Literature Review
– Don’t want to duplicate efforts (or maybe you
should?)
– Can give ideas about how to (or how not to)
conduct the study
– Required for sample size calculations
Critical Review of Literature
How were patients selected/recruited?
What population are they attempting to generalize to?
Definition of intervention?
Definition of outcomes?
What was the sample size?
Sample size calculations vs. power analysis
What are the possible confounding variables? What was done
to control for these variables?
Statistics?
Interpretation of findings and conclusions?
Kat’s Notes:
The Lit Review
How big is the sample size?
– Sample size or power calculations?
Randomization? (if applicable)
– If you’re dealing with a clinical trial, randomization helps you get
rid of many potential sources of bias
Description of Design, Groups, Treatments?
– You need DETAILED descriptions of the design of the study, how
the study groups were defined and details of the treatment
(dosage, machines, devices, etc)
Kat’s Notes:
The Lit Review
Confounding variables?
– Does the author discuss/address possible confounding variables?
(i.e. variables that might be distorting the relationship between
the two variables of interest) Does the author control for
(statistically) the possible confounding variables?
Statistical significance ≠ Clinical Significance
– Read carefully and critically!
Be Careful…
REMEMBER: Just because it’s published does not
necessarily mean that it’s a good study or that it’s
without flaw. Also- remember publication bias:
Studies that show non-significant findings are often
NOT published (Despite the fact that they are
equally important)
BREAK
Part II: Statistics
Statistics in Literature: The Basics
Statistic
Confidence Intervals
– (mean +/- 2 SD covers about 95% of values; a CI for the mean itself uses mean +/- 2 SE)
Significance Values
– (P-values)
Statistics in Literature:
The Confidence Interval
Confidence Intervals
– Estimation (Avg IQ = 100; 95% CI= 70-130)
– Hypothesis Testing
(Sample Avg IQ = 136, normal 95% CI = 70-130)
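A minimal Python sketch of the two uses above, built on the IQ numbers from the slide. The SD of 15 is an assumption implied by the 70-130 range, and the helper names and the n = 25 are hypothetical. The slide's interval is mean +/- 2 SD, which describes where about 95% of individual values fall; the second helper shows the narrower interval for the mean itself, which uses the standard error SD / sqrt(n).

```python
import math

def two_sd_interval(mean, sd):
    """Approximate 95% range for individual values: mean +/- 2 SD."""
    return (mean - 2 * sd, mean + 2 * sd)

def ci_for_mean(mean, sd, n, z=1.96):
    """Approximate 95% CI for the mean, using the standard error SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return (mean - z * se, mean + z * se)

# Estimation: IQ with mean 100 and an assumed SD of 15
low, high = two_sd_interval(100, 15)

# Hypothesis testing: does a sample average of 136 fall outside the interval?
sample_avg = 136
is_unusual = not (low <= sample_avg <= high)

# For comparison: the CI for the mean with a hypothetical sample of n = 25
mean_ci = ci_for_mean(100, 15, 25)

print(low, high, is_unusual, mean_ci)
```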
One Tail or Two?
One-tail:
– We hypothesize Drug A is worse than Drug B
– We Hypothesize Drug A is better than Drug B
Two-Tailed:
– We hypothesize Drug A performs differently than Drug B
(direction isn’t specified, more conservative test)
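To see why the two-tailed test is the more conservative one, here is a small sketch that converts a z-score into one-tailed and two-tailed p-values using the normal distribution. The z-score of 1.8 is a made-up value, and the function name is hypothetical.

```python
import math

def tail_p_values(z):
    """Return one-tailed and two-tailed p-values for a z-score (normal distribution)."""
    upper = 0.5 * math.erfc(z / math.sqrt(2))    # P(Z >= z): "Drug A is better"
    lower = 0.5 * math.erfc(-z / math.sqrt(2))   # P(Z <= z): "Drug A is worse"
    two_tailed = 2 * min(upper, lower)           # "Drug A is different"
    return {"upper": upper, "lower": lower, "two_tailed": two_tailed}

# Hypothetical result: Drug A scores 1.8 standard errors above Drug B
result = tail_p_values(1.8)
print(result)
```

With this made-up z-score the one-tailed p-value falls below .05 while the two-tailed p-value does not, so the same data can be "significant" or not depending on which hypothesis was stated in advance.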
Confidence Intervals:
FAQs
Q: My Standard Deviation is larger than my mean- what did I do
wrong?!?!
A: Most likely, you didn’t do anything wrong. An SD that’s larger
than the mean indicates one of two things: 1) a lot of
variation in the dataset 2) a non-normal distribution
Q: Why is my confidence interval SO wide (or narrow)?
A: The width of the confidence interval is a reflection of its
precision. If there’s a lot of variation in the dataset or if
there’s a great deal of uncertainty in the estimate, your
interval will be quite wide. The opposite is also true.
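Both answers can be checked with a few lines of Python. This is only a sketch: the length-of-stay numbers are invented, and the width formula assumes a 95% CI for a mean (2 x 1.96 x SD / sqrt(n)).

```python
import math
import statistics

# Hypothetical right-skewed data: most values are small, one is very large
length_of_stay_days = [1, 1, 2, 2, 3, 50]
mean = statistics.mean(length_of_stay_days)
sd = statistics.stdev(length_of_stay_days)
median = statistics.median(length_of_stay_days)
print(f"mean={mean:.2f}  SD={sd:.2f}  median={median}")  # SD exceeds the mean

def ci_width(sd, n, z=1.96):
    """Width of an approximate 95% CI for a mean."""
    return 2 * z * sd / math.sqrt(n)

# Same variation, different sample sizes: more data gives a narrower interval
width_small_sample = ci_width(sd=10, n=25)
width_large_sample = ci_width(sd=10, n=400)
print(width_small_sample, width_large_sample)
```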
Statistics in Literature:
Significance Values
P-value: the probability of observing a finding at least as extreme as yours by chance
alone (that is, if no real effect exists).
A p-value = .001 means that the probability of observing that
particular event by chance would only be about 1/1000.
Translation? You can be fairly certain that your observation did
NOT occur by chance alone- something intervened.
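A small worked example of "by chance alone", as a sketch: suppose 9 of 10 patients improve, and chance alone would give each patient a 50% probability of improving. The scenario and the function name are hypothetical; the calculation is an exact one-tailed binomial probability.

```python
from math import comb

def binomial_upper_tail(successes, trials, p_chance=0.5):
    """Probability of seeing `successes` or more out of `trials` by chance alone."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 9 or more improvements out of 10, if each were a coin flip
p_value = binomial_upper_tail(9, 10)
print(p_value)  # 11/1024, roughly .011
```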
Quiz Time!
Q: What is the 95% CI for the following data: mean=30, SD=5 ?
A: 95% CI = 20 – 40
Q: For the previous question, if you obtained a sample mean
=10, what would you conclude?
A: Since 10 lies outside of the 95% CI, this event is unlikely to
have occurred by chance alone. In fact, the chances of
observing this event by chance would most likely be less than
5%
Q: How do you interpret a p-value = .05?
A: If chance alone were at work, a result at least this extreme would be
observed approximately 5% of the time.
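The first two quiz answers can be reproduced with a short sketch. It follows the slides in treating mean +/- 2 SD as the 95% interval and assumes a normal distribution; the variable names are illustrative.

```python
import math

mean, sd = 30, 5

# 95% interval as used in the slides: mean +/- 2 SD
interval = (mean - 2 * sd, mean + 2 * sd)

# How far from the mean is an observed value of 10, in SD units?
observed = 10
z = (observed - mean) / sd

# Two-tailed probability of a value at least this far from the mean by chance
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))

print(interval, z, p_two_tailed)
```

The observed value sits 4 SDs below the mean, so the probability comes out far below 5%, in line with the quiz answer.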
How the Guinness Brewery Changed History…
“Student’s” t-test
William Sealy Gosset (left)
R.A. Fisher (right)
Understanding Statistics in Literature
Are the statistics appropriate?
What, exactly, does this really mean?
– What does an odds ratio of 1.5 really mean?
– Why am I looking for a “1” or a “0” in this confidence
interval?
– What does a significant ANOVA tell you? (for that matter,
what’s an ANOVA!?!?!)
T-test/Z-test
What type of data?
(2) Group Means (continuous)
Reported as?
t-statistic/z-score & p-value
What does it REALLY test?
The difference in group distributions- in particular- the difference in
group means.
T-test/Z-test Continued…
T-tests are used when the population SD is unknown and must be estimated
from the sample, which matters most when the sample size for each group is small
Z-tests utilize the normal distribution and can be used when
the sample size is adequately large
Not Appropriate for categorical data
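As a rough illustration of what the t-statistic measures, the sketch below computes a pooled two-sample t-statistic by hand. The data are invented and the function name is hypothetical; turning the statistic into a p-value needs the t-distribution (a t-table or a statistics package), which is not shown here.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Two-sample t-statistic assuming equal variances; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical continuous outcome measured in two small groups
drug_a = [5.1, 4.9, 5.6, 5.8, 6.0]
drug_b = [4.2, 4.5, 4.1, 4.8, 4.4]
t_stat, df = pooled_t_statistic(drug_a, drug_b)
print(round(t_stat, 2), df)
```

The t-statistic is the difference in group means divided by its standard error, so a large absolute value means the means are far apart relative to the variation in the data.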
ANOVA: Analysis of Variance
What type of Data?
(3+) Means (continuous)
Reported as?
F-Statistic, p-value
What does it REALLY test?
It compares the distributions of several groups simultaneously- it
examines whether the amount of variation between groups is greater
than that of within groups. A significant F-statistic tells you that the
groups are not all equal, but it does NOT tell you which groups are
different.
ANOVA
Once a significant F-statistic is obtained, your next step would
be to conduct a post-hoc test to determine which groups are
different (Tukey).
Again, cannot be used for categorical data.
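The "between groups vs. within groups" comparison can be written out directly. This is a sketch with invented surgery times for three tiny groups and a hypothetical function name; it computes the F-statistic only, not the p-value or the post-hoc test.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical surgery durations in hours
interns = [5, 6, 7]
residents = [2, 3, 4]
fellows = [1, 2, 3]
f_stat, df_between, df_within = one_way_anova_f([interns, residents, fellows])
print(f_stat, df_between, df_within)
```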
Chi-Square
What type of data?
Categorical/dichotomous
Reported as?
Χ2, p-value
What does it REALLY test?
A chi-square tests whether the observed frequency of an event is different
than the expected frequency of the event (that which would occur by chance).
***Chi-Square tests can ONLY be used when each expected cell count is greater than
or equal to “5”
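A minimal sketch of the observed-versus-expected calculation for a 2x2 table, with invented counts and a hypothetical function name. It uses no continuity correction, and the p-value shortcut shown is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_square_2x2(table):
    """Return (chi-square, p-value, smallest expected count) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count in each cell if rows and columns were unrelated
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail for 1 df only
    min_expected = min(min(row) for row in expected)
    return chi2, p_value, min_expected

# Hypothetical counts: rows = group, columns = event yes / no
chi2, p_value, min_expected = chi_square_2x2([[20, 30], [30, 20]])
print(chi2, p_value, min_expected)
```

Checking `min_expected` against 5 is the rule of thumb from the slide; when it fails, an exact test is the usual fallback.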
Fisher Exact Test
Works in basically the same manner as a chi-square, but it’s
used when you have cell counts below “5”
An “exact” test CAN be used when cell counts are “5” or
higher, but it becomes difficult to calculate with large sample
sizes
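For small tables the exact probability can be computed directly from the hypergeometric distribution. The sketch below uses one common two-sided definition (sum the probabilities of all tables no more likely than the observed one); the counts and the function name are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell with all margins held fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as unlikely as, or less likely than, the observed one
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small study: every cell count is below 5
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)
```

The number of possible tables grows with the sample size, which is why exact tests become heavy to calculate for large studies.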
OR, RR, HR
OR: Odds Ratio
RR: Relative Risk or Risk Ratio
HR: Hazard Ratio
All three are ratios of risk- one test group is reflected in the
numerator, the other in the denominator- therefore, if you get
a ratio = “1” that means there’s NO DIFFERENCE between
groups. Keep this in mind while we look at them individually.
Odds Ratios
What type of data?
Case/Control Studies
Reported as?
OR, CI, p-value
What does it REALLY test?
The amount of risk associated with a particular exposure.
***An Odds Ratio must be used in case-control studies as the measure of risk
because we have incomplete information about the prevalence/incidence of
the disease in the calculations
OR: Interpretation
OR* <1: Exposure is Protective
OR*=1: No Difference
OR*>1: Exposure is Risk Factor
Example
OR, CI, and p-value
– OR = 1 = NO DIFFERENCE
– What would a CI containing “1” mean?
(OR*: The same thing applies to RR and HR)
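The question about a CI containing "1" can be explored with a short sketch. It computes an odds ratio and an approximate 95% CI using the log (Woolf) method, which is one common approach and not necessarily the one behind any given paper. The counts and the function name are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """
    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Returns (OR, lower, upper) with an approximate 95% CI.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical case-control counts
odds_ratio, lower, upper = odds_ratio_ci(40, 20, 20, 40)
contains_one = lower <= 1 <= upper
print(odds_ratio, lower, upper, contains_one)
```

If `contains_one` is true, "no difference" is among the plausible values and the result would not be significant at the .05 level.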
Relative Risk
What type of data?
Cohort Studies
Reported as?
RR, CI, p-value
What does it REALLY test?
The amount of risk associated with a particular exposure.
***Relative Risk can be safely used in cohort studies because we have
incidence rates available.
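A comparable sketch for relative risk in a cohort study, again with invented counts, a hypothetical function name, and an approximate 95% CI computed on the log scale.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """
    a = exposed with disease,   b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    Returns (RR, lower, upper) with an approximate 95% CI.
    """
    risk_exposed = a / (a + b)      # incidence in the exposed group
    risk_unexposed = c / (c + d)    # incidence in the unexposed group
    rr = risk_exposed / risk_unexposed
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 100 exposed and 100 unexposed people followed over time
rr, lower, upper = relative_risk_ci(30, 70, 15, 85)
print(rr, lower, upper)
```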
Quiz Time!
Q: You’re conducting a study examining the
complication rate (yes/no) in relationship to type of
plate utilized in surgery (titanium/stainless steel).
– What type of data is this? Categorical or Continuous?
– Let’s say that there are 4 people with titanium plates that
didn’t have complications- which test would you have to
use?
A: Categorical data, Fisher Exact Test
Quiz Time!
Q: You’re conducting a study on the average number of hours a
surgery takes to complete. You have 3 groups (70 people in
each): interns, residents, and fellows. What’s the appropriate
statistic to use to determine whether a difference exists
between these groups?
A. Chi-Square
B. Fisher Exact Test
C. T-test/Z-test
D. ANOVA
E. Odds Ratio
Quiz Time!
Q: The t-distribution/test was created to test the brew
quality of which of the following beers:
A. Budweiser
B. Coors
C. Presidente
D. Guinness
E. Samuel Adams
F. Miller
*Bonus Point: Name the country of origin of Presidente
Kat’s Notes: Statistics
Confidence Intervals
– Mean +/- 2 SD (about 95% of values)
– Estimation
– Hypothesis Testing
P-value
– Probability of observing a phenomenon by chance alone
Kat’s Notes: Statistics
T-Test/Z-Test
– Used for testing 2 group means.
ANOVA
– Used for testing 3+ group means. Tells you that a
difference exists, but doesn’t tell you which groups are
different.
Chi-Square
– Used for categorical data (yes/no; male/female). Tells you
whether observed matches expected outcomes. Every expected cell
count MUST be “5” or greater.
Kat’s Notes: Statistics
Fisher Exact
– Also used for categorical data. Necessary when any cell count is
below “5”
Odds Ratio
– Used for categorical data again: compares the odds of exposure in
cases vs. controls. Needed to approximate RR in case-control studies
Relative Risk
– Used for comparing risk in two groups with categorical data
(sick/not sick; male/female). Can be used in cohort studies
where incidence/prevalence data are available.
BREAK
Part III: Sample Size
Why does it matter?
Why are sample size calculations so
important?
*A sample size calculation allows us to determine how
many people we need to detect a difference if one
exists…
Why does it matter?
So, what happens if you don’t do sample size calculations?
1) Significant difference.
- You might have been able to use a smaller sample size…
2) Not significant.
- You don’t know whether your lack of significance was due
to low power or the fact that no difference really exists…
Sample Size Calculations vs.
Power Analysis
Sample Size Calculations:
– Completed prior to gathering data
– Tells you how many people you need to investigate your
phenomenon of interest
Power Analysis
– Completed after all data has been collected and analyzed
– Determines whether you had adequate power to find a
significant difference
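As a rough sketch of what a power calculation returns, the function below estimates the power of a two-group comparison of means using a normal approximation. The difference, SD, and group sizes are made-up inputs, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def approximate_power(difference, sd, n_per_group, alpha=0.05):
    """Approximate power to detect `difference` between two group means (two-tailed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)
    return NormalDist().cdf(abs(difference) / standard_error - z_alpha)

# Hypothetical study: true difference of 5 units, SD of 10
power_small = approximate_power(difference=5, sd=10, n_per_group=20)
power_large = approximate_power(difference=5, sd=10, n_per_group=63)
print(round(power_small, 2), round(power_large, 2))
```

With these inputs the smaller study has well under a 50% chance of finding the difference, which is the "low power" situation described above.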
Sample Size Calculations
Depends on what test you’re planning on
conducting, but, in general…
– Expected value in your control (mean, proportion, etc)
– Expected differences Large or small?
– Amount of variation known to exist (SDs, etc)
Heterogeneous vs. homogeneous
Sample Size Calculations:
t-tests
From Literature/pilot study
Standard deviation
Expected difference (based off experience,
previous research or other evidence)
Remember: select your numbers from a well-designed study. Be Careful!!
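With a standard deviation and an expected difference in hand, a per-group sample size can be sketched using a common normal-approximation formula, n = 2 (z_alpha + z_power)^2 SD^2 / difference^2. This is not necessarily the formula behind any particular calculator, and the inputs and function name are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_two_means(sd, difference, alpha=0.05, power=0.80):
    """Approximate number of people needed per group to compare two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed significance level
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sd**2 / difference**2
    return math.ceil(n)

# Hypothetical inputs taken from a well-designed prior study
n_per_group = sample_size_two_means(sd=10, difference=5)
n_if_smaller_difference = sample_size_two_means(sd=10, difference=2.5)
print(n_per_group, n_if_smaller_difference)
```

Halving the expected difference roughly quadruples the required sample size, which is why the choice of input numbers matters so much.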
Sample Size Calculations:
Proportions Test
From the literature/pilot study:
Proportion of observed events in the control group
Anticipated proportion of observed events in the active group
(based off previous trends)
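The same idea for two proportions, as a sketch under a simple normal approximation (no continuity correction). The control and active proportions and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p_control, p_active, alpha=0.05, power=0.80):
    """Approximate number of people needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_active * (1 - p_active)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p_control - p_active) ** 2
    return math.ceil(n)

# Hypothetical: 30% event rate in controls, 15% anticipated in the active group
n_per_group = sample_size_two_proportions(0.30, 0.15)
print(n_per_group)
```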
Kat’s Notes:
Sample Size Calculations
Sample Size Calculations are much more desirable
than power analysis
Obtain information from well-conducted studies
Remember: GIGO (garbage in, garbage out). Don’t
pick out your numbers from a bad study!
You generally need the 1) average value and 2)
amount of variation in your control (comparison
group)
REMEMBER!
No matter what- if you find a significant result, there’s still a
small possibility that you’re WRONG. This is inherent in
probability- we don’t have 100% certainty. We can only
attempt to minimize the possible problems.
If you fail to find a significant result- it doesn’t necessarily
mean that there isn’t a relationship there. The study might
have been structured incorrectly, used the wrong statistics,
the wrong model, the relationship might not be the form that
you think it is (linear regression on curvilinear data), or there
might be another variable interfering that you don’t know
about…
QUESTIONS?
On-Site Biostatistics:
The Take-Home Menu
Clinical Trial Design
Database Design
Sample Size Calculations
Randomization Schemes
Data Analysis
Instruction
IRB Statistical Review
Publication consultation
Thank you!