Statistics & graphics for the laboratory
Diagnostic measures, with Bayesian statistics
Dietmar Stöckl
[email protected]
Linda Thienpont
[email protected]
In cooperation with AQML: D Stöckl, L Thienpont &
• Kristian Linnet, MD, PhD
[email protected]
• Per Hyltoft Petersen, MSc
[email protected]
• Sverre Sandberg, MD, PhD
[email protected]
Prof Dr Linda M Thienpont
University of Gent
Institute for Pharmaceutical Sciences
Laboratory for Analytical Chemistry
Harelbekestraat 72, B-9000 Gent, Belgium
e-mail: [email protected]
STT Consulting
Dietmar Stöckl, PhD
Abraham Hansstraat 11
B-9667 Horebeke, Belgium
e-mail: [email protected]
Tel + FAX: +32/5549 8671
Copyright: STT Consulting 2007
Content
Content overview
Sensitivity and specificity
ROC curves
• Influence of analytical quality on sensitivity and specificity and ROC curves
Predictive values
• Independent tests
Bayesian statistics
• "Double Bayes"
Odds/Likelihood and likelihood ratios
• Influence of analytical quality on predictive values, Likelihood ratios
The optimal study design
Glossary
EXCEL-files
• DiagnosticMeasures
• DiagnosticMeasuresCalculator
Introduction
What is a disease?
A disease is what the patient has.
A diagnosis is what the physician thinks the patient has.
The diagnosis can vary between physicians and over time.
There are often different valid methods to establish the diagnosis.
Diagnosis – The Bimodal distribution
Prerequisites
The “gold-standard”
The true status of each population has to be established by means other than the test under evaluation, namely by a so-called “gold standard”, or reference standard.
Defining a decision point (“cut-off” value)
A decision point (sick/healthy) must be defined. Note that this point need not lie at the crossing of the two distributions. Depending on the relative importance of false negatives or false positives, it can be moved towards increased sensitivity or increased specificity.
Note: For monitoring, a medically significant change has to be defined.
Classification of results
With respect to the gold standard, test outcome is classified as “true
positive”, “false positive”, “true negative”, or “false negative”.
“Gold standard”: The best test available.
Problems
(a) new tests being “better” than the reference standard
(b) the test is part of the reference standard
The reference standard must be performed without knowledge of the results of the test under evaluation.
The test under evaluation must be performed without knowledge of the result from the reference standard.
Bimodal situation
Classification of results
True positive (TP):
The number of diseased patients correctly classified by the test.
True negative (TN):
The number of non-diseased patients correctly classified by the test.
False positive (FP):
The number of non-diseased patients misclassified by the test.
False negative (FN):
The number of diseased patients misclassified by the test.
The 2 x 2 Table
                 Disease +               Disease –                Total
Test +           True Positive (TP)      False Positive (FP)      All with Positive Test: TP+FP
Test –           False Negative (FN)     True Negative (TN)       All with Negative Test: FN+TN
Total            All with Disease:       All without Disease:     Everyone:
                 TP+FN                   FP+TN                    N = TP+FP+FN+TN
Sensitivity and specificity
Sensitivity
= TP/(TP + FN)
Specificity
= TN/(TN + FP)
A sensitivity of 80% means that 80 percent of the diseased people will have a
positive test. For a quantitative test this is dependent on the cut-off point.
The 2 x 2 table expanded
                 Disease +               Disease –                Total
Test +           True Positive (TP)      False Positive (FP)      All with Positive Test: TP+FP
Test –           False Negative (FN)     True Negative (TN)       All with Negative Test: FN+TN
Total            All with Disease:       All without Disease:     Everyone:
                 TP+FN                   FP+TN                    N = TP+FP+FN+TN
Test measure     Sensitivity:            Specificity:             Likelihood ratios
                 TP/(TP+FN)              TN/(FP+TN)
Changing the cut-off point
A change of the cut-off will change sensitivity and specificity!
Cut-off 2.5      D+     D–     Tot
T+                5      0      5
T–                5     10     15
Tot              10     10     20
Sens = 0.50      Spec = 1.00

Cut-off 1.5      D+     D–     Tot
T+                8      2     10
T–                2      8     10
Tot              10     10     20
Sens = 0.80      Spec = 0.80
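For readers who prefer code to the EXCEL template, a minimal Python sketch (illustrative only, not part of the original material) that reproduces the two results above:

```python
def sens_spec(tp, fn, fp, tn):
    """Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP) from 2x2 counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Cut-off 2.5: T+ row = (5, 0), T- row = (5, 10)
print(sens_spec(tp=5, fn=5, fp=0, tn=10))   # (0.5, 1.0)

# Cut-off 1.5: T+ row = (8, 2), T- row = (2, 8)
print(sens_spec(tp=8, fn=2, fp=2, tn=8))    # (0.8, 0.8)
```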
Changing the cut-off point – An EXCEL-template: DiagnosticMeasures
Sensitivity/Specificity & Prevalence
In principle, both are independent of the prevalence. BUT: in a population with low prevalence, e.g. primary health care, the disease (D+) is often at an earlier stage, shifting the mean of the D+ distribution to the left (and thereby lowering sensitivity at a fixed cut-off).
In a population where many differential diagnoses give high values, the mean of the “Non-Diseased” population is shifted to the right (lowering specificity).
Sensitivity/Specificity & analytical bias
Bias changes sensitivity & specificity (same effect as moving the cut-off point!)
Sensitivity/Specificity & imprecision
Increasing imprecision decreases both sensitivity and specificity.
Sensitivity/Specificity & analytical error
Summary
Systematic error
• The introduction of a systematic error in the direction of the diseased
population increases the false positive results. The introduction of a
systematic error in the direction of the healthy population would increase the
false negative results.
Random error
• The introduction of random analytical error generally deteriorates test accuracy.
Sensitivity & specificity – Standard error
Same as in binomial samples.
Sensitivity: SE = SQRT[Sens • (1 – Sens)/n(diseased)]
Specificity: SE = SQRT[Spec • (1 – Spec)/n(non-diseased)]
Assumption: n • Sens • (1 – Sens) > 5
Otherwise, a more complicated formula should be used.
Confidence intervals: e.g., Sens ± 1.96 SE
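A minimal Python sketch of this standard-error and confidence-interval calculation (the sensitivity of 0.80 and n = 100 diseased patients are hypothetical illustration values):

```python
from math import sqrt

def se_proportion(p, n):
    """Binomial standard error of a proportion; valid when n * p * (1 - p) > 5."""
    return sqrt(p * (1 - p) / n)

sens, n_diseased = 0.80, 100                    # hypothetical example values
se = se_proportion(sens, n_diseased)            # 0.04
low, high = sens - 1.96 * se, sens + 1.96 * se
print(f"SE = {se:.3f}, 95% CI = {low:.3f} to {high:.3f}")   # 0.722 to 0.878
```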
ROC – Receiver Operating Characteristic curve
ROC analysis
ROC was developed during World War II for the analysis of radar images. Radar
operators had to decide whether a blip on the screen represented an enemy target,
a friendly ship, or just noise. Their ability to do so was called the Receiver Operating
Characteristics. It was not until the 1970's that ROC was recognized as useful for
interpreting medical test results.
A ROC plot is a plot of sensitivity (true positive rate) versus 1 – specificity (false positive rate) as the cut-off (decision threshold) traverses the entire range of results.
ROC depends on the distance between the distributions.
[Figures: ROC plots of sensitivity versus 1 – specificity at each cut-off, for a smaller and a bigger distance between the healthy and diseased distributions.]
• ROC does NOT depend on analytical bias
• ROC DETERIORATES with increase of analytical imprecision
Perfect and worthless test: a perfect test reaches the upper left corner of the plot (area under the curve = 1.0); a worthless test runs along the diagonal (area under the curve = 0.5).
Tests compared by ROC
Test A is superior to Test B
-Test A, at all cutoffs, is closer to the upper left corner of the plot.
-The area under the curve is greater for Test A than Test B.
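These properties can be illustrated with a small simulation. The Python sketch below (illustrative only; it assumes Gaussian test values with unit SD, and the separations of 2.0 and 1.0 are arbitrary choices) constructs ROC curves for a larger and a smaller distance between the healthy and diseased distributions and compares their areas under the curve:

```python
import numpy as np

rng = np.random.default_rng(0)

def roc_curve(healthy, diseased):
    """Sensitivity and 1 - specificity as the cut-off traverses all observed values."""
    cuts = np.sort(np.concatenate([healthy, diseased]))
    tpr = np.array([(diseased >= c).mean() for c in cuts])   # sensitivity
    fpr = np.array([(healthy >= c).mean() for c in cuts])    # 1 - specificity
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    order = np.argsort(fpr)
    x, y = fpr[order], tpr[order]
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2))

healthy = rng.normal(0.0, 1.0, 2000)     # hypothetical analyte values, non-diseased
test_a  = rng.normal(2.0, 1.0, 2000)     # diseased, larger distance from healthy
test_b  = rng.normal(1.0, 1.0, 2000)     # diseased, smaller distance from healthy

print("AUC, larger distance :", round(auc(*roc_curve(healthy, test_a)), 2))  # ~0.92
print("AUC, smaller distance:", round(auc(*roc_curve(healthy, test_b)), 2))  # ~0.76
```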
Bayesian statistics
The Bayes' theorem of conditional probability in general terms:
"Post-test" = "Pre-test" x Likelihood
Bayes' theorem comes in two equivalent forms:
• One uses the probability of disease
• Another uses the odds of disease
This leads us to:
• Predictive values
• Post-test probabilities
• Odds & Likelihood ratios
Predictive values
Predictive values & 2 x 2 table
             Disease +              Disease –               Total
Test +       True Positive (TP)     False Positive (FP)     All with Positive Test: TP+FP
Test –       False Negative (FN)    True Negative (TN)      All with Negative Test: FN+TN
Total        All with Disease:      All without Disease:    Everyone:
             TP+FN                  FP+TN                   N = TP+FP+FN+TN

Use of the result
• Test + row: Positive Predictive Value (PPV) = TP/(TP+FP); Post-Test Probability, Positive Test = PPV
• Test – row: Negative Predictive Value (NPV) = TN/(FN+TN); Post-Test Probability, Negative Test = 1–NPV
• Total row: Pre-Test Probability = (TP+FN)/(TP+FP+FN+TN)
Positive predictive value (PPV)
= TP/(TP + FP)
Negative predictive value (NPV)
= TN/(TN + FN)
A positive predictive value of 80% means that 80% of persons with Test+ have the
disease.
Depends on prevalence!
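A minimal Python sketch of this prevalence dependence (the sensitivity of 0.82, the specificity of 0.88 and the two prevalences are illustration values only):

```python
def predictive_values(sens, spec, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    tp = sens * prevalence                  # expected fractions of the population
    fp = (1 - spec) * (1 - prevalence)
    fn = (1 - sens) * prevalence
    tn = spec * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

for prev in (0.05, 0.50):                   # low vs. high prevalence
    ppv, npv = predictive_values(sens=0.82, spec=0.88, prevalence=prev)
    print(f"prevalence {prev:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
# prevalence 0.05: PPV = 0.26, NPV = 0.99
# prevalence 0.50: PPV = 0.87, NPV = 0.83
```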
Predictive values & Prevalence
Example: Urinary tract infection; WBC ≥ ++
(EXCEL file: DiagnosticMeasuresCalculator)
Under the given SENS & SPEC, the PPV increases with the prevalence!
Predictive values & Post-test Probabilities
The post-test probability of disease present (D+) when the test is positive (T+) =
PPV.
The post-test probability of D-/T- = NPV
The post-test probability of D+/T- = 1 - NPV
The post-test probability of D-/T+ = 1 – PPV
Discriminatory power of a test for disease (D+)
Compare PPV with 1 – NPV of a test for D+
Post-test probabilities
An EXCEL file shows the connection between the post-test probability of D+/T+ (= PPV) and the post-test probability of D+/T– (= 1 – NPV) with varying distances between the healthy and diseased distributions.
Influence of bias
Under the given circumstances, bias (here in the direction of the diseased)
decreases the post-test probability of D+/T+ and D+/T-. The net effect is that the
discriminatory power decreases.
Influence of imprecision
Under the given circumstances, imprecision decreases the post-test probability of D+/T+ and increases that of D+/T–. The net effect is that the discriminatory power of the test deteriorates.
To test or not to test?
Bandolier Extra, February 2002
Test/treat thresholds along the prevalence (pre-test probability of the target disorder) axis, from 0.0 to 1.0:
• below 0.20: do not test, do not treat
• 0.20 to 0.70: test, and treat on the basis of the test's result
• above 0.70: do not test, get on with treatment
Case study: Urinary tract infection
A 32-year-old woman has, over the last few days, experienced slightly increased urgency and some pain when she goes to the toilet.
What is the probability that she has a urinary tract infection? With a probability above 75% you will treat and not test. With a probability of less than 15% you will neither test nor treat.
• WBC ≥ +2: sens = 0.82 and spec = 0.88
• Nitrite pos: sens = 0.5 and spec = 0.90
In this case you estimate the pre-test probability at 20%, and you test.
Two independent tests
• WBC and
• Nitrite
In this case, we use the 2 tests consecutively: we take the post-test probability of a positive WBC test (63%) as the pre-test probability (= prevalence) for the nitrite test. In that way, we arrive at a total post-test probability of 89% (see the sketch below).
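A minimal Python sketch of this consecutive application of Bayes' theorem, using the sensitivities and specificities given above (illustrative only):

```python
def post_test_probability(pre_test_prob, sens, spec):
    """Probability of disease after a POSITIVE test: post-test odds = pre-test odds x LR+."""
    lr_pos = sens / (1 - spec)
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr_pos
    return post_odds / (1 + post_odds)

p_after_wbc     = post_test_probability(0.20, sens=0.82, spec=0.88)
p_after_nitrite = post_test_probability(p_after_wbc, sens=0.50, spec=0.90)
print(f"{p_after_wbc:.2f} {p_after_nitrite:.2f}")   # 0.63 and 0.90 (89.5%, the 89% in the text)
```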
"Double Bayes"
A 30 year old woman has pain in her stomach and you wonder whether this can be due to an ulcus duodeni (duodenal ulcer). 95% of these ulcers are caused by Helicobacter pylori. However, 15% of the population (of this age) carries Helicobacter pylori without having any symptoms.
You have a rapid test with a sensitivity of 85% and a specificity of 80% to detect Helicobacter pylori.
What is the probability that she has an ulcus if she has a positive rapid test?
What is the probability that the woman has an ulcus if the test is negative?
Probability: combination ulcus/bacteria
With the epidemiologic information about ulcers and Helicobacter pylori, we are able to set up a 2x2 table before doing the test. However, again, we introduce a subjective statement by the doctor about the pre-test probability: chosen at 30% according to the person and the symptoms.
• With the symptoms given by the patient, the
doctor estimates the pre-test probability of
ulcus to be 30%.
• 95% of patients with ulcus have bacteria:
sens=95%
• 15% of healthy have bacteria:
spec = 85%
Calculations with a total population of 1000
Pre-test probability of ulcus to be 30%: 0.3*1000 = 300
Sens=95%: 0.95*300 = 285
Spec=85%: 0.85*700 = 595
Then, we split the table into two 2x2 tables, one for bacteria+ and one for bacteria–. These counts are used as the new totals of the tables (bottom).
Now we apply the rapid test and calculate the respective fields with the sensitivity and specificity data of the test.
Post-test probabilities of ulcus
Rapid test - sensitivity 85%, specificity 80% to detect Helicobacter pylori bacteria.
Bacteria +: calculate with the sensitivity
Calculate "New" TP = 285 * Sens = 242
Calculate "New" FP = 105 * Sens = 89
Calculate "New" FN = 285 – 242 = 43 (TN similarly: 105 – 89 = 16)
Bacteria –: calculate with the specificity
Calculate "New" FN = 15 * Spec = 12
Calculate "New" TN = 595 * Spec = 476
Calculate "New" TP = 15 – 12 = 3 (FP similarly: 595 – 476 = 119)
Pos. pred. value for ulcus = (242+3)/(242+89+3+119) = 0.54
Neg. pred. value for ulcus = (16+476)/(43+16+12+476) = 0.90
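The same "double Bayes" calculation as a short Python sketch (illustrative only; it keeps the fractional counts instead of rounding, which is why it reproduces 0.54 and 0.90):

```python
N = 1000
pre_ulcus = 0.30                                       # doctor's pre-test probability
ulcus, no_ulcus = pre_ulcus * N, (1 - pre_ulcus) * N   # 300 / 700

# Step 1: bacteria status (sens 95%, spec 85% for H. pylori carriage)
ulcus_bact,   ulcus_nobact   = 0.95 * ulcus,    0.05 * ulcus      # 285 / 15
noulcus_bact, noulcus_nobact = 0.15 * no_ulcus, 0.85 * no_ulcus   # 105 / 595

# Step 2: rapid test (sens 85%, spec 80%); bacteria-negative people test positive in 20%
tp = 0.85 * ulcus_bact   + 0.20 * ulcus_nobact     # ulcus patients with a positive test
fp = 0.85 * noulcus_bact + 0.20 * noulcus_nobact   # non-ulcus patients with a positive test
fn, tn = ulcus - tp, no_ulcus - fp

print("PPV for ulcus:", round(tp / (tp + fp), 2))   # 0.54
print("NPV for ulcus:", round(tn / (tn + fn), 2))   # 0.9
```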
Odds
Pre-test Odds
= Diseased/Non-diseased = Prevalence/(1-Prevalence)
= (TP + FN)/(FP +TN)
Post-test Odds
= Diseased with T+/Non-diseased with T+
= TP/FP
Odds & Probability
If you have 10 people, 7 are healthy and 3 have the disease.
• Odds for disease = 3/7
• Probability for disease = 3/10
Probability
• The total number of possibilities is always in the denominator.
Odds
• The total number of possibilities is the sum of numerator and denominator.
Probability and Odds
Sick     Healthy    Probability    Odds
1        9999       0.0001         0.0001
10       9990       0.001          0.0010
100      9900       0.01           0.0101
1000     9000       0.1            0.1111
2500     7500       0.25           0.3333
5000     5000       0.5            1
7500     2500       0.75           3
9000     1000       0.9            9
9900     100        0.99           99
9990     10         0.999          999
9999     1          0.9999         9999
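A minimal Python sketch of the probability/odds conversion shown in the table (illustrative only):

```python
def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(odds):
    return odds / (1 + odds)

# A few rows of the table above (total population 10 000)
for sick, healthy in [(1, 9999), (1000, 9000), (5000, 5000), (9000, 1000)]:
    p = sick / (sick + healthy)
    print(sick, healthy, round(p, 4), round(prob_to_odds(p), 4))
# 1 9999 0.0001 0.0001
# 1000 9000 0.1 0.1111
# 5000 5000 0.5 1.0
# 9000 1000 0.9 9.0
```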
Likelihood ratio
Likelihood ratio for a positive test (LR+): the likelihood (probability) of a positive test in the diseased population divided by the likelihood of a positive test in the non-diseased population.
LR+: The ratio of the true positive rate to the false positive rate.
= Sensitivity/(1 – Specificity)
= [TP/(FN + TP)]/[FP/(TN + FP)]
Likelihood ratio for a negative test (LR–):
• (1–sensitivity)/specificity or
• [FN/(TP+FN)]/[TN/(FP+TN)] or
• The ratio of the false negative to the true negative rate.
It can be shown that the Likelihood ratio (LR) can be expressed as
LR = Post-test Odds/Pre-test Odds
and therefore: Post-test Odds = LR x Pre-test Odds
Likelihood ratios for intervals of test results
(Goldstein and Mushlin. J Gen Intern Med 1987;2:20-24.
http://gim.unmc.edu/dxtests)
T4 value     Hypothyroid     Euthyroid     LR
<5           18              1             52
5.1 – 7      7               17            1.2
7.1 – 9      4               36            0.3
>9           3               39            0.2
Sum          32              93
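A short Python sketch that recomputes these interval likelihood ratios (illustrative only):

```python
# Interval LR = (proportion of hypothyroid patients in the interval) /
#               (proportion of euthyroid patients in the interval)
hypo = {"<5": 18, "5.1-7": 7,  "7.1-9": 4,  ">9": 3}    # n = 32
eu   = {"<5": 1,  "5.1-7": 17, "7.1-9": 36, ">9": 39}   # n = 93

n_hypo, n_eu = sum(hypo.values()), sum(eu.values())
for interval in hypo:
    lr = (hypo[interval] / n_hypo) / (eu[interval] / n_eu)
    print(f"T4 {interval}: LR = {lr:.1f}")
# -> 52.3, 1.2, 0.3, 0.2 (matching the table)
```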
The best test to use for ruling in a disease is the one with the largest likelihood
ratio of a positive test.
The better test to use to rule out disease is the one with the smaller likelihood ratio
of a negative test.
The Positive [Negative] Likelihood ratio measures the diagnostic power of a test
to change the pre-test into the post-test probability of a disease being present
[absent]. The Table below shows how much LRs change disease likelihood.
Note: LRs > 1 argue for including the condition, LRs < 1 for excluding it.
Likelihood ratios – Advantages over Sensitivity/Specificity
1. They give direct information about the power of a test to discriminate between
sick/healthy.
2. They allow the direct calculation of Post-test probabilities from Pre-test Probabilities with the Bayes Theorem. For this purpose, however, Pre- and Post-test Probabilities have to be transformed into "Odds".
3. LRs can be used to directly calculate the discriminatory power of test cascades.
4. For continuous data, LRs can be calculated at different cut-off values and thus
allow a more precise estimation of the discriminatory power of a specific test
result.
For completeness: Diagnostic Odds ratio
= LR+/LR- = (TP/FP)/(FN/TN)
Importance of prevalence
Odds and probabilities for a disease
• are PREVALENCE dependent.
Likelihood ratios
• are NOT prevalence dependent
All diagnostic measures
              Disease +              Disease –               Total
Test +        True Positive (TP)     False Positive (FP)     All with Positive Test: TP+FP
Test –        False Negative (FN)    True Negative (TN)      All with Negative Test: FN+TN
Total         All with Disease:      All without Disease:    Everyone:
              TP+FN                  FP+TN                   N = TP+FP+FN+TN
Test measure  Sensitivity:           Specificity:            Likelihood ratios
              TP/(TP+FN)             TN/(FP+TN)

Use of the result
• Test + row: Positive Predictive Value (PPV) = TP/(TP+FP); Post-Test Probability, Positive Test = PPV
• Test – row: Negative Predictive Value (NPV) = TN/(FN+TN); Post-Test Probability, Negative Test = 1–NPV
• Total row: Pre-Test Probability = (TP+FN)/(TP+FP+FN+TN)
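For completeness, a Python sketch that derives all of these measures from the four cell counts (illustrative only; the counts in the example call are hypothetical):

```python
def diagnostic_measures(tp, fp, fn, tn):
    """All measures of the table above from 2x2 counts (sketch, no zero-division guards)."""
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "Sensitivity": sens,
        "Specificity": spec,
        "PPV (post-test probability, positive test)": tp / (tp + fp),
        "NPV": tn / (fn + tn),
        "Pre-test probability (prevalence)": (tp + fn) / n,
        "LR+": sens / (1 - spec),
        "LR-": (1 - sens) / spec,
        "Diagnostic odds ratio": (tp / fp) / (fn / tn),
        "Accuracy": (tp + tn) / n,
    }

for name, value in diagnostic_measures(tp=80, fp=30, fn=20, tn=170).items():
    print(f"{name}: {value:.2f}")      # hypothetical counts, for illustration only
```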
Fagan Nomogram
Optimal study design
The STARD Initiative – Towards Complete and Accurate Reporting of Studies on Diagnostic Accuracy
(http://www.consort-statement.org/stardstatement.htm)
See also
www.bazian.com
More EBM/Test utility
• http://www.mclibrary.duke.edu/respub/guides/ebm/ratios.html
• http://www.cebm.utoronto.ca/practise/ca/statscal/
• http://www.math.bcit.ca/faculty/david_sabo/apples/math2441/section8/oddsratio/oddsratio.htm - Prosp
• http://painconsortium.nih.gov/symptomresearch/chapter_14/Part_1/sec6/chspt1s6pg1.htm
Hierarchy of major study designs (in order of decreasing validity)
• Review of Randomised control clinical trial (RCT)
• RCT (interventional)
• Cohort (observational)
• Case control
Best design = prospective, blind comparison of the test and reference standard in
a consecutive (or randomly selected) series of patients from a relevant clinical
population.
Internal validity
Selection bias - for example, we select sicker patients to receive active treatment
and fitter patients to receive inactive treatment.
Observer bias - for example, we know that a patient had active treatment so we
subconsciously encourage her to rate her quality of life as higher than it really is.
Participant bias - for example, in a study of aspirin versus no treatment, people
allocated to no treatment take aspirin anyway.
Withdrawal bias or drop out bias - when we lose people to follow up, those that
remain for analysis at the end of the study may not be representative of the group
originally included at the start of the study.
Recall bias - for example, mothers of children with leukemia may remember living
near high voltage power cables because they fear a link between power lines and
cancer, while mothers of children without leukemia are likely to forget whether they
lived near a power line, because they regard it as a trivial fact.
Instrument or measurement bias
Publication bias - results from researchers and journals being biased towards
publishing only positive results.
External validity ("Bazian")
Was the question relevant to me?
Schematic representation of a randomized controlled trial.
[Figure: study population → random sample → random allocation to experimental (Ee) and control (Ec) exposure → outcome / no outcome (rates r1, r2). Annotations at each step:]
• Study population – Bias: was the study population biased towards some atypical group? Was a power calculation done?
• Random sample from the study population – Bias: was the sample really selected randomly? Power: are there enough people?
• Random allocation to experimental/control exposure – Bias: was allocation properly randomised? Was it double- or single-blinded? Confounding: did, by chance, people with a confounding factor all end up in one group (Ee or Ec)?
• Measurement of outcomes – Internal validity: were outcomes and exposures measured meaningfully? Bias: did the measuring instruments skew results in any particular direction?
• External validity: was the question, and the PETO chosen, relevant to me?
Assessment of study quality
Critical appraisal checklist
• Study design
• Internal validity
• External validity
• What are the results?
- sensitivity, specificity, LR, ROC, CI, prevalence, thresholds, etc.
Best design
= prospective, blind comparison of the test and reference standard in a
consecutive (or randomly selected) series of patients from a relevant clinical
population.
Critical appraisal checklist
Internal validity – spectrum of diseases
Normal persons compared to persons with the disease.
> Overestimation of TP and TN
Verification bias
Did the result of the test being evaluated influence the decision to perform the
reference standard?
Use of different reference standards in test-positive and test-negative cases often leads to misclassification of FN as TN, and overestimates both sensitivity and specificity.
For example:
It is difficult to estimate the value of CRP in diagnosing appendicitis if only patients with elevated CRP are operated on, histological examination of the appendix is used as the reference standard for these patients, and follow-up is used as the reference standard for the others.
Review bias
If the interpretation of the results of the reference test and the experimental test is not blinded, this may lead to overestimation of both sensitivity and specificity.
External validity
Can I use the results in my practice?
Specificity falls when there are more similar diseases (the FP rate increases), and sensitivity increases when disease is at a more advanced stage (the FN rate decreases).
What are the results?
The results must be clearly stated with e.g. sensitivity, specificity, LR, ROC, CI,
prevalence, thresholds, etc.
See also
Critical appraisal worksheets
http://www.cebm.utoronto.ca/teach/materials/dx.htm
Glossary
The 2 X 2 table, including some measures for test accuracy
                 Disease +               Disease –                Total
Test +           True Positive (TP)      False Positive (FP)      All with Positive Test: TP+FP
Test –           False Negative (FN)     True Negative (TN)       All with Negative Test: FN+TN
Total            All with Disease:       All without Disease:     Everyone:
                 TP+FN                   FP+TN                    N = TP+FP+FN+TN
True Positive (TP) = Positive with Test & Positive with Gold Standard
False Positive (FP) = Positive with Test & Negative with Gold Standard
False Negative (FN) = Negative with Test & Positive with Gold Standard
True Negative (TN) = Negative with Test & Negative with Gold Standard
Specific, unconditional measures (independent of prevalence)
Sensitivity & Specificity (see also Figures below)
Sensitivity: TP/(TP+FN); true positive rate = The proportion of people with the
target disorder who have a positive test (used to assist in assessing and selecting
a diagnostic test/sign/symptom).
• Strongly depends on a cutoff-point; meaningless when healthy and diseased
populations are the same!
• For a test to be useful in ruling out a disease, it must have a high sensitivity
(>SnNOut = Sensitivity, negative [=test result], out).
Specificity: TN/(FP+TN); true negative rate = Proportion of people without the
target disorder who have a negative test (used to assist in assessing and selecting
a diagnostic test/sign/symptom).
• Strongly depends on a cutoff-point; meaningless when healthy and diseased
populations are the same!
• For a test to be useful at confirming a disease (ruling in), it must have a high
specificity (>SpPIn = Specificity, positive [=test result], in).
- Sensitivity & Specificity are interrelated (when Sn increases, Sp decreases, and vice versa) and should be interpreted together. This means, for example, that it is not possible to produce SnNOuts or SpPIns by simply adjusting the threshold (cut-off).
-In theory, Sensitivity & Specificity are independent of prevalence. In practice
however, the 2 populations (healthy, sick) may contain different grades of disease
or health in different situations. In a low prevalence situation, for example, the
diseased population may be shifted to the left because it contains more patients
with less severe disease than in a high prevalence situation ("error of spectrum").
Likelihood ratios
Likelihood ratio (LR) = The likelihood that a given test result would be expected in a
patient with the target disorder compared with the likelihood that this same result
would be expected in a patient without the target disorder.
Likelihood ratio for a positive test (LR+): sensitivity/(1–specificity) or [TP/(TP+FN)]/[FP/(FP+TN)] = The ratio of the true positive rate to the false positive rate (calculation: in 1–specificity, substitute the "1" by (FP+TN)/(FP+TN)).
• The best test to use for ruling in a disease is the one with the largest
likelihood ratio of a positive test.
Likelihood ratio for a negative test (LR-): (1–sensitivity)/specificity or
[FN/(TP+FN)]/[TN/(FP+TN)] = The ratio of the false negative to the true negative
rate.
• The better test to use to rule out disease is the one with the smaller
likelihood ratio of a negative test.
The Positive [Negative] Likelihood ratio measures the diagnostic power of a test to
change the pre-test into the post-test probability of a disease being present [absent].
The Table below shows how much LRs change disease likelihood.
Note: LRs > 1 argue for including the condition, LRs < 1 for excluding it.
Likelihood ratios – Advantages over Sensitivity/Specificity
1. They give direct information about the power of a test to discriminate between
sick/healthy.
2. They allow the direct calculation of Post-test probabilities from Pre-test Probabilities with the Bayes Theorem (see below). For this purpose, however, Pre- and Post-test Probabilities have to be transformed into "Odds" (see below).
3. LRs can be used to directly calculate the discriminatory power of test cascades.
4. For continuous data, LRs can be calculated at different cut-off values and thus
allow a more precise estimation of the discriminatory power of a specific test result.
Global, unconditional measures (independent of prevalence)
• Diagnostic odds ratio (DOR): LR+/LR- (Odds: see below: Bayes' Theorem)
• Area under the Receiver Operating Characteristic (ROC) curve (see also
Figure below)
ROC shows the relationship between specificity (or better 1-specificity) and
sensitivity when the cut-off value is moved over the whole range of values (from
sick to healthy).
Specific, conditional measures (dependent on prevalence)
Prevalence or Pre-Test Probability: (TP+FN)/(TP+FP+FN+TN) = The proportion of
people with the target disorder in the population at risk at a specific time (point
prevalence) or time interval (period prevalence).
• Predictive Values
Positive [Test] Predictive Value (PPV) [for Disease, D+]: TP/(TP+FP) = Proportion
of people with a positive test who have the target disorder.
Negative [Test] Predictive Value (NPV) [for Health, D-]: TN/(FN+TN) = Proportion of
people with a negative test result who are free of the target disorder.
• Post-Test Probabilities
Post-Test Probability [for D+] Positive Test = PPV (see also below: Bayes'
Theorem; can also be calculated with LR+)
Post-Test Probability [for D+] Negative Test = 1–NPV (also calculated with LR-)
Post-Test Probability [for D-] Negative Test = NPV (also calculated with [1/LR-])
Post-Test Probability [for D-] Positive Test = 1–PPV (also calculated with [1/LR+])
Global, conditional measure (dependent on prevalence)
Accuracy: (TP+TN)/(TP+FP+FN+TN) = The proportion of patients for whom a correct
diagnosis has been made.
The Bayes' theorem, generally: "Post-test" = "Pre-test" x LR
Bayes' theorem comes in two equivalent forms:
• One uses the probability of disease
• Another uses the odds of disease
Odds = A ratio of the number of people incurring an event (e.g., disease) to the
number of people who have non-events (e.g., no disease).
Pre-test Odds for disease = (TP+FN)/(FP+TN)
Bayes' theorem, generally: "Post-test" = "Pre-test" x LR
• with Probability
Post-test Probability = [Pre-test Probability/(1 – Pre-test Probability)] * LR / {[Pre-test Probability/(1 – Pre-test Probability)] * LR + 1}
Note: Pre-test Probability/(1 – Pre-test Probability) = Prevalence/(1 – Prevalence) = Pre-test Odds
• with Odds
Post-test Odds (T+ or T–) = Pre-test Odds x Likelihood ratio (LR+ or LR–)
The likelihood ratio, which combines information from sensitivity and specificity, gives an indication of how much the odds of disease change based on a positive or a negative result. You need to know the pre-test odds, which incorporate information about the prevalence of the disease, the characteristics of your patient pool, and specific information about this patient. You then multiply the pre-test odds by the likelihood ratio to get the post-test odds.
Pre-test Odds: Prevalence/(1–Prevalence) = The odds that the patient has the
target disorder before the test is carried out.
Post-test odds: Pre-test odds x Likelihood ratio = The odds that the patient
has the target disorder after the test is carried out.
Calculation example
Sensitivity: 0.9;
Specificity: 0.83;
Prevalence for disease (D+): 0.1
> Prevalence for D-: 0.9; Pre-test Odds for D+: 0.111; Pre-test Odds for D-: 9
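Continuing this calculation example as a short Python sketch (illustrative only), using the odds form of Bayes' theorem:

```python
sens, spec, prev = 0.9, 0.83, 0.1

lr_pos = sens / (1 - spec)        # ~5.29
lr_neg = (1 - sens) / spec        # ~0.12
pre_odds = prev / (1 - prev)      # ~0.111, as stated above

post_odds_pos = pre_odds * lr_pos
post_odds_neg = pre_odds * lr_neg
print("Post-test probability D+/T+ (= PPV):", round(post_odds_pos / (1 + post_odds_pos), 2))   # ~0.37
print("Post-test probability D+/T-:        ", round(post_odds_neg / (1 + post_odds_neg), 3))   # ~0.013
```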
Measures of diagnostic test accuracy – Use and relationship
• The test
Sensitivity, Specificity, Likelihood ratio, and ROC characterize a test and are
independent of the prevalence of a disease. However, except ROC, they are
dependent on the chosen cut-off. ROC, in fact, shows the relationship between
sensitivity and specificity when the cut-off value is moved over the whole range of
values (from sick to healthy).
•The patient (tested/not tested)
Predictive Values, Pre- and Post-test Probabilities, and Pre- and Post-test Odds
are used to decide whether a test should be done for a particular patient and
if a test is done, they give information about the probability of the
presence/absence of a disease. They are dependent on the prevalence of a
disease.
Post-test Probabilities and Predictive Values are identical (the Post-test Probability of a positive test = Positive Predictive Value) or closely related (the Post-test Probability of a negative test = 1 – Negative Predictive Value). The Post-test Odds give similar information as the two before; however, they carry it in different numbers. Odds are preferred by many because Post-test Odds can easily be calculated from Pre-test Odds and the Likelihood ratio: Post-test Odds = Pre-test Odds x Likelihood ratio (Bayes' Theorem). Usually, Post-test Probabilities and Post-test Odds are calculated for Disease present (D+); however, they can also be calculated for Disease absent (D-) (see above).
While a nomogram is available for obtaining Post-test Probabilities from Pre-test Probabilities and the Likelihood ratio ("Fagan's Nomogram"), it is more accurate to calculate them with so-called "Bayesian Calculators" that are available for free on the net.
Diagnostic measures and their main use
Diagnostic Test - Sensitivity, Specificity, PPV, NPV, LR+, and LR–
Prospective Study - Relative Risk (RR), Absolute Risk Reduction (ARR), and Number Needed to Treat (NNT)
Case-control Study - Odds Ratio (OR)
Randomized Control Trial (RCT) - Relative Risk Reduction (RRR), ARR, and NNT
Glossary of terms
Case-control study
A study which involves identifying patients who have the outcome of interest
(cases) and patients without the same outcome (controls), and looking back to see
if they had the exposure of interest. Retrospective
Cohort Study
Involves identification of 2 groups (cohorts) of patients, one which received the
exposure of interest, and one which did not, and following these cohorts forward
for the outcome of interest. Prospective: Present > Future; "Past assembled" >
Present
Control Event Rate (CER)
The frequency with which the outcome of interest occurs in the study group not
receiving the experimental therapy.
Event rate
The proportion of patients in a group in whom the event is observed. Thus if out of
100 patients, the event is observed in 27, the event rate is 0.27. Control event rate
(CER) refers to the proportion of patients in the control group who experience the
event and the experimental event rate (EER) is the proportion of patients in the
experimental group who experience the event of interest. The patient expected
event rate (PEER) refers to the rate of events we'd expect in a patient who
received conventional therapy or no treatment.
Experimental event rate (EER)
The proportion of patients in the experimental treatment group who are observed
to experience the outcome of interest.
Likelihood ratio
The likelihood that a given test result would be expected in a patient with the
target disorder compared with the likelihood that this same result would be
expected in a patient without the target disorder.
Negative predictive value
Proportion of people with a negative test result who are free of the target disorder.
Glossary of terms (ctd.)
Number needed to treat (NNT)
The number of patients that we need to treat with a specified therapy in order to
prevent one additional bad outcome. Calculated as the inverse of the absolute risk
reduction (1/ARR).
Odds
A ratio of the number of people incurring an event to the number of people who
have non-events.
Odds ratio (OR)
The ratio of the odds of having the target disorder in the experimental group
relative to the odds in favour of having the target disorder in the control group (in
cohort studies or systematic reviews) or the odds in favour of being exposed in
subjects with the target disorder divided by the odds in favour of being exposed in
control subjects (without the target disorder).
Positive predictive value
Proportion of people with a positive test who have the target disorder.
Post-test odds
The odds that the patient has the target disorder after the test is carried out
(calculated as the pre-test odds x likelihood ratio).
Pre-test probability (prevalence)
The proportion of people with the target disorder in the population at risk at a
specific time (point prevalence) or time interval (period prevalence).
Randomised control clinical trial (RCT)
A group of patients is randomised into an experimental group and a control group.
These groups are followed up for the variables/outcomes of interest.
Relative risk reduction (RRR)
This is a measure of treatment effect and is calculated as (CER-EER)/CER.
Risk Ratio
The ratio of risk in the treated group (EER) to the risk in the control group (CER).
This is used in randomised trials and cohort studies and is calculated as
EER/CER.
Sensitivity
The proportion of people with the target disorder who have a positive test. It is
used to assist in assessing and selecting a diagnostic test/sign/symptom.
Specificity
Proportion of people without the target disorder who have a negative test. It is
used to assist in assessing and selecting a diagnostic test/sign/symptom.