Transcript Prob(E+)

Introduction
9th International Workshop on
Plant Disease Epidemiology.
Landerneau, France,
April 10-15, 2005
International Workshop on Plant Disease Epidemiology
1. Pau, France, 1963
2. The Netherlands, 1971
3. Penn State Univ., 1979
4. North Carolina State Univ., 1983
5. Jerusalem, Israel, 1986
6. Giessen, Germany, 1990
7. Papendal, The Netherlands, 1994
8. Ouro Preto, Brazil, 2001
9. Landerneau, France, 2005
Beijing, China, 2009
L. V. Madden
IEW9
• Time: April 10-15, 2005
• Address: Landerneau, France
• Subject: Facing challenges of the 21st century
• Country Number:
• Research scientist: 100
• Keynote Number: 13
• Chair: Laurence V. Madden (OSU)
Botanical epidemiology:
Some key advances, and
its continuing role in
disease management
Historical Background
• 1963—a most important year:
– Plant Disease: Epidemics and Control
by J. E. Vanderplank (“van der Plank”)
– NATO Advanced Study Institute international meeting on
plant disease epidemiology
• Now ‘known’ as the “1st International Workshop on Plant
Disease Epidemiology”
• Vanderplank made the compelling argument that:
– “Chemical industry and plant breeders have forged fine tactical
weapons; but only epidemiology sets the strategy.”
• Although this audience would certainly accept this statement,
unfortunately not all plant pathologists consider epidemiology,
especially some of the ‘sophisticated’ aspects, in developing
and testing controls
Historical background
• Tremendous growth in the discipline in the 1960s, 1970s, and
1980s.
– Eclipsed by the even larger growth in molecular biology across all of
the biological sciences
• Nevertheless, epidemiology has been, and will continue to be, of
critical importance until
– Broad-acting durable resistance to all major diseases is achieved, or
– Until there are highly effective and inexpensive fungicides, with no
environmental concerns, and no pathogen resistance
• The new concerns about invasive pathogen (and pest)
species around the world, as well as emerging (re-emerging)
diseases, and biosecurity and risk assessment, only serve to
increase the importance of epidemiology
Botanical epidemiology
• Numerous advances have been made over the last 40+
years, and many of the speakers at this workshop will
highlight some of them
– Multiple pathogen and host taxa, multiple pathosystems, crop
loss, genetics (including molecular), spatial analysis,
evolution, and many more topics….
• I will outline two major areas of research, both of which
have implications for control strategies and tactics
– Temporal and spatio-temporal disease dynamics
• Importance of models, mathematics, and statistics – general
– Prediction of epidemics (or the need for a control intervention)
on a real-time basis, using concepts of decision theory
• Importance of prior knowledge of the prevalence of epidemics,
accuracy of predictors, and costs of decisions – specific example
Temporal disease dynamics
The fundamental importance of disease
progress curves for characterizing,
comparing, understanding, and
predicting epidemics has been
understood for 40+ years.
For polycyclic diseases, the logistic
model has been the first choice for
quantifying epidemics, for practical and
theoretical reasons.
\[
\frac{dy}{dt} = r_L\, y\,(1 - y)
\]
\[
y = \frac{1}{1 + e^{-(a + r_L t)}}
\]
\[
\ln\!\left(\frac{y}{1 - y}\right) = a + r_L t
\]
From Vanderplank (1963, page 29)
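A minimal sketch of the logistic model and its logit linearization may help here. The parameter values (a = -4, r_L = 0.2 per day) and the function names are illustrative assumptions, not values from the talk.

```python
import math

def logistic_y(t, a, r_l):
    """Disease intensity y(t) from the integrated logistic model."""
    return 1.0 / (1.0 + math.exp(-(a + r_l * t)))

def logit(y):
    """Logit transform ln(y / (1 - y)), which linearizes the logistic curve."""
    return math.log(y / (1.0 - y))

# Hypothetical parameters, chosen only for illustration.
a, r_l = -4.0, 0.2
times = [0, 10, 20, 30, 40]

for t in times:
    y = logistic_y(t, a, r_l)
    # On the logit scale the curve is a straight line: a + r_L * t.
    print(f"t={t:2d}  y={y:.3f}  logit(y)={logit(y):.2f}")
```

Because the logit of y is linear in t, r_L can be estimated as the slope of a straight-line fit to logit-transformed disease intensity.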
Temporal disease dynamics
All models are simplifications of reality, and as one
would expect, not all polycyclic epidemics are
adequately described by the logistic model.
\[
\frac{dy}{dt} = r_L\, y\,(1 - y)
\]
Many alternatives are possible, some of which
have some theoretical justification, and some of
which are very flexible.
\[
\frac{dy}{dt} = f(y;\, r_*, K, \ldots)
\]
As easily shown (but not here), a good fit to a
particular model is not proof of a particular
mechanism. However, a consistent fit of a given
model could lead one to hypothesize about a
mechanism (that could be further tested).
Using an appropriate model is important for
comparing epidemics, forecasting magnitude of
disease increase, and developing controls.
Nevertheless, the logistic (or monomolecular)
model is remarkably useful for summarizing
epidemics.
Temporal disease dynamics
Vanderplank understood that to
increase understanding of polycyclic
epidemics, more complex models
were required.
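\[
\frac{dy_t}{dt} = r_L\, y_t\,(1 - y_t)
\]
\[
\frac{dy_t}{dt} = R_c\,(y_{t-p} - y_{t-i-p})\,(1 - y_t)
\]
(in Vanderplank's notation, p is the latent period and i the infectious period)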
Differential-delay equation to
relate rate of disease increase to
infectious disease intensity
A cumbersome approach for developing
principles of epidemics (and control),
expansions for additional epidemic
features, and model fitting
The
“Contemporary”
Approach
Coupled differential equations
(SEIR model)
Susceptible-Exposed-Infectious-Removed (Recovered) or
Healthy-Latent-Infectious-Removed
Four host states in the population (Figure 5.6): H → L → I → R
• H → L: infection
• L → I: reaching the end of the latent period and becoming infectious
• I → R: reaching the end of the infectious period and becoming removed
1/ω: mean latent period
1/μ: mean infectious period
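Coupled differential equations
\[
\begin{aligned}
\frac{dH}{dt} &= -\beta H I \\
\frac{dL}{dt} &= \beta H I - \omega L \\
\frac{dI}{dt} &= \omega L - \mu I \\
\frac{dR}{dt} &= \mu I
\end{aligned}
\]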
One basis for defining the
basic reproduction
number (ratio), R0
[Figure 5.6, repeated: H → L (infection) → I (end of the latent period, becoming infectious) → R (end of the infectious period, becoming removed).]
Y = L + I + R
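The H-L-I-R system can be explored numerically. The sketch below integrates the four coupled equations with a simple forward-Euler step. It assumes the transmission parameter is written as beta, that host states are expressed as proportions, and that the rate values are hypothetical. For this model form the usual threshold quantity is beta*H(0)/mu; that expression is standard for the model but is not given on the slide.

```python
def simulate_hlir(beta, omega, mu, h0, l0, i0, r_init, dt=0.01, days=100.0):
    """Forward-Euler integration of the H-L-I-R model.

    dH/dt = -beta*H*I, dL/dt = beta*H*I - omega*L,
    dI/dt = omega*L - mu*I, dR/dt = mu*I.
    Returns the final (H, L, I, R).
    """
    h, l, i, r = h0, l0, i0, r_init
    for _ in range(int(round(days / dt))):
        new_infections = beta * h * i      # flow H -> L
        becoming_infectious = omega * l    # flow L -> I
        removals = mu * i                  # flow I -> R
        h -= new_infections * dt
        l += (new_infections - becoming_infectious) * dt
        i += (becoming_infectious - removals) * dt
        r += removals * dt
    return h, l, i, r

# Hypothetical rates (per day) and initial proportions, for illustration only.
h, l, i, r = simulate_hlir(beta=0.5, omega=0.2, mu=0.1,
                           h0=0.999, l0=0.0, i0=0.001, r_init=0.0, days=200.0)
print(f"Total disease Y = L + I + R = {l + i + r:.3f}")
```

Lowering beta so that beta*H(0)/mu falls below 1 keeps Y close to its initial value, which is the threshold behavior that R0 describes.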
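Coupled differential equations
Expansion for: host dynamics, simple-interest component,
vector transmission, spatial heterogeneity, etc.
\[
\begin{aligned}
\frac{dH}{dt} &= -\beta H I - \varepsilon x H + \gamma\,(H_{\max} - H) \\
\frac{dL}{dt} &= \beta H I + \varepsilon x H - \omega L - \eta L \\
\frac{dI}{dt} &= \omega L - \mu I - \eta I \\
\frac{dR}{dt} &= \mu I - \eta R \\
\frac{dx}{dt} &= -\delta x
\end{aligned}
\]
Added terms: primary infections (the εxH terms, from inoculum x), host ‘growth’ and mortality (γ and η), and inoculum mortality (δ).
[Figure: flow diagram of host states H, L, I, R with inoculum x.]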
R0
Number of new infected individuals (e.g., diseased plants, sites
on leaves, etc.) resulting from a single infected individual
placed in a disease-free host population
• Threshold for an epidemic of a polycyclic disease
• R0 = 1
• Final intensity of disease (“epidemic size”)
– Or steady-state intensity of disease
• Prediction of exponential rate of increase early in
epidemic (rE) – the link to more descriptive approaches
• Other threshold formulae when there is a simple-interest component
Coupled differential equations:
Control strategies (vector-virus example)
Temporal dynamics:
Selected References
• C. A. Gilligan (2002). Advances in Botanical Research 38: 1-64.
• Excellent review and synthesis of many modeling approaches for
understanding epidemics and developing control strategies (with a
special emphasis on root diseases)
– Lots of references to important work (great place to start)
• M. J. Jeger, J. Holt, F. van den Bosch, and L. V. Madden (2004).
Physiological Entomology 29: 291-304.
• Synthesis of approaches for modeling plant viruses and phytoplasma
(i.e., pathogens with arthropod vectors), with an emphasis on control
strategies
– Lots of references to other work with plant viruses
• J. Segarra, M. J. Jeger, and F. van den Bosch (2001).
Phytopathology 91: 1001-1010.
• Linkages (and commonality) of various modeling approaches
Temporal disease dynamics:
Contemporary statistical models for repeated measures
• Disease progress curves are comprised of longitudinal data (repeated
measures)
• Considerable advances have been made in statistics over the last 20
years for analyzing longitudinal data that have not been adequately
incorporated into botanical epidemiology (or plant pathology), including:
– Linear Mixed Models
• Simultaneous modeling of disease progress and treatment effects
• Complex covariances (correlations)
– Fixed or heterogeneous variances
– Generalized linear mixed models (non-normal)
– Nonlinear Mixed Models
• Wider class of models (and, thus, biological realism)
• Narrower class of experimental designs (but growing)
• Not for the timid (!)
– Nonparametric models (“relative marginal effects”)
• Ideal for common ordinal disease ratings
• Considerable advances in statistical computation
It could be argued that we are not retrieving the maximum amount of
information from the experiments and surveys we conduct (or are using
results with excessive type I and II error rates)
Temporal disease dynamics:
Contemporary models for repeated measures
Instead of a regression line through the replicated data at each time,
think of the profiles of Y over time for each experimental unit.
There is variation within each unit and between units.
F(y) = f(t; parameters) + between-plot error + within-plot error
Nonzero correlations and
unequal variances
Temporal disease dynamics:
References
• Schabenberger & Pierce (2002). Contemporary Statistical Models
for the Plant and Soil Sciences. CRC Press.
– Outstanding new general textbook.
• Diggle, Liang, & Zeger. (1994). Analysis of Longitudinal Data.
Clarendon Press.
• Brunner, Domhof, and Langer. (2002). Nonparametric Analysis of
Longitudinal Data in Factorial Experiments. Wiley.
– Great for ordinal rating data.
• Garrett, Madden, Hughes, and Pfender. (2004). Phytopathology 94:
999-1003.
– Discussion of several of the developments in statistics that are relevant
in plant pathology.
– Gives the reader key references for learning methods.
Disease Dynamics—some considerations
• “As a matter of fact all epidemiology, concerned as it is with
variation of disease from time to time or from place to place, must
be considered mathematically…if it is to be considered scientifically
at all.”
– Sir Ronald Ross (1911)
• However, Anderson & May (1991) and Jeger (2004) lament that
insights gained from advanced theoretical (i.e., mathematical) work
have had inadequate impact on empirical studies and on practical
disease management
– In plant pathology, most characterizations of epidemics, and evaluations of
controls, rely on the simpler (one-variable) population growth models (e.g.,
logistic), and the related AUDPC
– It is our challenge to continually ‘bridge the gap’ between
mathematical/statistical and empirical disciplines
• One area where the gap is bridged involves disease prediction,
because it involves many empirical observations of disease
intensity, and (often) descriptive and/or more mechanistic models
Disease prediction
• “Of the potential benefits of mathematical modeling to
improving the efficiency of control of crop disease, prediction
stands foremost.”
• C. A. Gilligan (1985)
• Concept:
– Prediction of an outbreak of disease, or an increase in disease
intensity, based on weather, crop, pathogen, and/or vector
variables, or prediction of the need for a control intervention
– Also called disease forecasting or disease warning
– Generally based on completed infection events
• Long history in botanical epidemiology:
– Early predictors of late blight, apple scab, etc.
– 1960 book chapter by Paul Waggoner
– Many reviews
Prediction or forecasting
• A humbling experience…
– “Forecasting is difficult, especially forecasting the future”
– Victor Borge (also attributed to Niels Bohr)
– “The trouble with … forecasting is that it's right too often for us to
ignore it and wrong too often for us to rely on it”
– Patrick Young (regarding the weather)
• As stated by R. D. Shrum (1978):
– “…forecasting means ‘to foresee or to calculate beforehand’. Thus,
the calculation of probabilities is implicit in the meaning of the word”
• With probabilities (and right and wrong decisions) there are
statistics, so one must keep in mind:
– "An unsophisticated forecaster uses statistics as a drunken man uses
lamp-posts -- for support rather than for illumination”
– Andrew Lang
– But, hopefully, in botanical epidemiology, we use statistics (correctly)
for illumination (i.e., understanding)
Prediction
• Although epidemiologists have been developing predictive
(warning, forecasting) systems for decades, major conceptual
advances have been made in the last 5-10 years
• The advances center on the use of formal decision theory, with
explicit application of Bayesian principles, to develop and assess
disease predictors
– Many of the concepts have been explored in medical diagnosis research
– Jonathan Yuen and Gareth Hughes have pioneered these methods
– Gives a quantitative basis for why real-time predictions are accepted
and utilized (or not)
• Some key references include:
– J. Yuen & G. Hughes (2002). Plant Pathol. 51: 407-412.
– J. Yuen et al. (1996). European J. Plant Pathol. 102: 847-854.
– G. Hughes, N. McRoberts, & F. J. Burnett (1999). Plant Pathol. 48: 147-153.
– G. Hughes & L. Madden (2003). Agric. Sys. 76: 755-774.
– W. Turechek & W. Wilcox (2005). Phytopathology (in press).
• Best explained with a (thorough) example…
Example:
Fusarium head blight of wheat (scab)
• Also known as ear blight
• Economically important in U.S., Europe, and elsewhere
• Disease intensity (severity and incidence) and mycotoxin (e.g., DON)
varies considerably from location to location and from year to year
• In particular, the disease is not rare, and is not so common that a
major epidemic (or the need to apply fungicide) occurs virtually every
year
• There is considerable evidence from controlled experiments and
empirical observations that epidemics depend on the environment
• Thus, the disease is a good candidate for real-time forecasting or
prediction for the risk of an epidemic
• Note that risk is a term for the probability of an unfavorable event
(e.g., epidemic). So, risk prediction can be another synonym for
disease forecasting
Prediction
• The success of a forecasting system depends, among other
things, on
– The commonness of epidemics (or need to intervene)
– The accuracy of predictions of epidemic risk (based on
weather in this example)
– The ability to deliver predictions in a timely fashion
– The ability to implement a control tactic (fungicide application,
for example)
– The economic impact of using a predictive system
• I address some of these issues
• Note:
– Although there is more than one risk model for FHB, I emphasize
the model developed at Ohio State and Penn State (with many
collaborators), that is currently being used in 23 U.S. states
Fusarium head blight predictor
Flowering date
is key
How common are scab epidemics?
• There is no simple answer.
– Depends on definition of “epidemic” (or the need to use
a fungicide, i.e., intervene)
• High disease severity or high toxin, or both
• We considered an epidemic to be >10% final severity
• In our efforts to develop a prediction system, we
considered N = 124 location-years (for 7 U.S. states)
• 40% were classified as epidemics
• Prob(E+) = 0.40 (estimated probability of an epidemic)
• This is a “working concept” for the so-called prior
probability of a scab epidemic.
• It is very reasonable to use other information to
estimate this prior probability.
Probability and Odds
• Prior probability of epidemic:
• Prob(E+) = 0.4
• Prior odds of an epidemic
• Odds(E+) = Prob(E+) / [1 – Prob(E+)] = 0.67
» Note: with Prob(E+) = 1/2, Odds(E+) = 1
• Prior probability of no epidemic:
• Prob(E-) = 0.6
• Prior odds of no epidemic:
• Odds(E-) = Prob(E-)/[1 – Prob(E-)] = 1.5
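The probability-to-odds conversion used above is a one-line calculation. A minimal sketch, with function names that are illustrative rather than part of the talk:

```python
def prob_to_odds(p):
    """Odds = Prob / (1 - Prob)."""
    return p / (1.0 - p)

def odds_to_prob(odds):
    """Inverse conversion: Prob = Odds / (1 + Odds)."""
    return odds / (1.0 + odds)

prior_epidemic = 0.4                    # Prob(E+) from the 124 location-years
prior_no_epidemic = 1.0 - prior_epidemic

print(f"Odds(E+) = {prob_to_odds(prior_epidemic):.2f}")
print(f"Odds(E-) = {prob_to_odds(prior_no_epidemic):.2f}")
```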
Prediction Model
• Initial model described in:
– De Wolf, Madden, and Lipps (2003) Phytopathology 93:428-435.
• Based on 50 location-years (compared with current 124).
• Slightly different Prob(E+)
• My Proceedings article uses results reported in 2003 article
• Current prediction system on the web uses different
variables and risk model, and is based on N = 124
location-years
• Numbers in this talk reflect the more recent results
• Current system exists because of the collaboration of
many individuals, for either data collection, analysis, or
prediction delivery, especially:
• Pat Lipps (OSU)
• Erick De Wolf (Penn State)
Prediction Model
• Prediction model: Z = f(environment, crop factors)
– Derived with logistic regression
• Increasing favorability for an epidemic is associated with increasing Z (which
is on a logit scale)
– Predict an epidemic when Z > threshold (label this P+)
– Predict a nonepidemic when Z < threshold (label this P-)
• Note: predictor could be derived with
– many different statistical modeling approaches
• Logistic regression (including Bayesian logistic (or other) analysis)
• Discriminant analysis
• Neural networks, etc.
– or with ad hoc “pencil and paper” methods
• Late blight warning systems (severity values)
• Mills’ tables for apple scab, etc., …
– or formally using parameters from (mechanistic) population dynamic
models (exponential or logistic rate parameter)
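The decision rule for a logit-scale predictor can be sketched as follows. The intercept, coefficients, and covariate values are hypothetical placeholders and are not the fitted Fusarium head blight model. A score exactly equal to the threshold is treated here as P-, which is an arbitrary choice the slide does not specify.

```python
import math

def risk_score(intercept, coefficients, covariates):
    """Linear predictor Z on the logit scale."""
    return intercept + sum(b * x for b, x in zip(coefficients, covariates))

def predicted_probability(z):
    """Back-transform Z to a predicted probability of an epidemic."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(z, threshold):
    """Predict an epidemic (P+) when Z exceeds the threshold, otherwise P-."""
    return "P+" if z > threshold else "P-"

# Hypothetical model and inputs, for illustration only.
z = risk_score(intercept=-1.0, coefficients=[0.5, 2.0], covariates=[1.0, 0.75])
print(z, round(predicted_probability(z), 3), predict(z, threshold=0.0))
```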
Fusarium head blight risk model (124 location-years)
[Figure: predicted probability of an epidemic (0.0 to 1.0) versus predictor Z (increasing favorableness of environment), with observed epidemics (y = 1) and non-epidemics marked, and possible thresholds for epidemic prediction indicated.]
Four possible decisions
TPP:
• True positive proportion (“sensitivity”)
• Proportion of known epidemics correctly predicted
• Estimate of Prob(P+|E+)
FPP:
• False positive proportion
• (= 1 - TNP)
• Proportion of non-epidemics incorrectly predicted to be epidemics
• Estimate of Prob(P+|E-)
FNP:
• False negative proportion
• (= 1 - TPP)
• Proportion of epidemics incorrectly predicted to be non-epidemics
• Estimate of Prob(P-|E+)
TNP:
• True negative proportion (“specificity”)
• Proportion of known non-epidemics correctly predicted
• Estimate of Prob(P-|E-)
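The four proportions follow directly from a two-by-two table of outcomes. In the sketch below the counts are hypothetical: they were chosen only so that the proportions are close to those reported for the model (TPP = 0.820, TNP = 0.824) with 124 location-years, and they are not the actual data.

```python
def decision_proportions(tp, fn, fp, tn):
    """Proportions for the four possible decisions from outcome counts.

    tp, fn: epidemics predicted as epidemic / non-epidemic
    fp, tn: non-epidemics predicted as epidemic / non-epidemic
    """
    epidemics = tp + fn
    non_epidemics = fp + tn
    return {
        "TPP": tp / epidemics,        # estimate of Prob(P+|E+)
        "FNP": fn / epidemics,        # estimate of Prob(P-|E+) = 1 - TPP
        "FPP": fp / non_epidemics,    # estimate of Prob(P+|E-) = 1 - TNP
        "TNP": tn / non_epidemics,    # estimate of Prob(P-|E-)
    }

# Hypothetical counts (50 epidemics, 74 non-epidemics), for illustration only.
props = decision_proportions(tp=41, fn=9, fp=13, tn=61)
for name, value in props.items():
    print(f"{name} = {value:.3f}")
```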
Fusarium head blight risk model:
Different thresholds
[Figure: left panel, predicted probability of an epidemic versus predictor Z, with epidemics (y = 1) and non-epidemics marked; right panel, proportion of correct decisions (TPP and TNP, 0.00 to 1.00) versus predictor threshold (-8 to 8). FNP = 1 - TPP and FPP = 1 - TNP.]
Predictor Accuracy
Measure                                                      Value
True positive proportion (TPP)                               0.820
  (proportion of epidemics correctly predicted)
True negative proportion (TNP)                               0.824
  (proportion of non-epidemics correctly predicted)
Overall accuracy                                             0.820
Positive prediction likelihood ratio, LR(+) = TPP/(1-TNP)    4.70
Negative prediction likelihood ratio, LR(-) = (1-TPP)/TNP    0.22
LR(+) and LR(-) are measures of the
effectiveness of a predictor
Large LR(+) and small LR(-) are ideal.
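Both likelihood ratios can be computed from TPP and TNP alone. A minimal sketch using the reported proportions (the function name is illustrative); with these rounded inputs the results come out at roughly 4.7 and 0.22.

```python
def likelihood_ratios(tpp, tnp):
    """Return (LR(+), LR(-)) for a binary predictor."""
    lr_pos = tpp / (1.0 - tnp)      # how much a positive prediction raises the odds
    lr_neg = (1.0 - tpp) / tnp      # how much a negative prediction lowers the odds
    return lr_pos, lr_neg

lr_pos, lr_neg = likelihood_ratios(tpp=0.820, tnp=0.824)
print(f"LR(+) = {lr_pos:.1f}, LR(-) = {lr_neg:.2f}")
```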
Receiver Operating Characteristic (ROC) curve:
An overall measure of predictor accuracy
[Figure: ROC curve, true positive proportion (TPP) versus false positive proportion (FPP = 1 - TNP), shown alongside the plots of predicted probability versus predictor Z and of correct decisions (TPP, TNP) versus predictor threshold.]
Receiver Operating Characteristic (ROC) curve
[Figure: ROC curves, true positive proportion (TPP) versus false positive proportion (FPP = 1 - TNP), ranging from a worthless predictor to a very accurate predictor, with an arrow indicating increasing accuracy.]
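An ROC curve is traced by sweeping the threshold over the predictor and recording (FPP, TPP) at each value. The sketch below does this for a small, entirely hypothetical set of scores and outcomes (1 = epidemic, 0 = non-epidemic).

```python
def roc_points(scores, outcomes, thresholds):
    """Return a list of (FPP, TPP) pairs, one per threshold."""
    n_epidemics = sum(outcomes)
    n_non_epidemics = len(outcomes) - n_epidemics
    points = []
    for c in thresholds:
        # A case is predicted to be an epidemic (P+) when its score exceeds c.
        true_pos = sum(1 for z, e in zip(scores, outcomes) if e == 1 and z > c)
        false_pos = sum(1 for z, e in zip(scores, outcomes) if e == 0 and z > c)
        points.append((false_pos / n_non_epidemics, true_pos / n_epidemics))
    return points

# Hypothetical predictor values and known outcomes, for illustration only.
scores = [-3.0, -2.0, -1.5, -0.5, 0.2, 0.8, 1.5, 2.5]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

for c, (fpp, tpp) in zip([-8, 0, 8], roc_points(scores, outcomes, [-8, 0, 8])):
    print(f"threshold={c:3d}  FPP={fpp:.2f}  TPP={tpp:.2f}")
```

A very low threshold predicts every case to be an epidemic (FPP = TPP = 1), and a very high threshold predicts none (FPP = TPP = 0); useful thresholds lie in between.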
Predictors in practice
Statistics such as TPP and TNP indicate how well one can predict known
epidemics and non-epidemics. In application, one wants to know how
well the model predicts unknown cases (location-years) based on
calculated risk values. This can be estimated easily by invoking Bayes'
theorem, using the so-called posterior odds.
Odds(E+|P+) = Odds(E+) × LR(+)
Odds(E-|P-) = Odds(E-) / LR(-)
(posterior odds = prior odds combined with the likelihood ratio)
Odds(E+|P+): posterior odds, the post-prediction odds that there is an
epidemic, given that one is predicted
Odds(E-|P-): posterior odds that there is not an epidemic when one
is not predicted
Fusarium Model Accuracy
Odds(E+|P+) = Odds(E+)LR(+)
Posterior odds–
Post-prediction odds that
there is an epidemic, given
that one is predicted
Prior odds
Likelihood
ratio of a
positive
prediction
Fusarium prediction model: 3.1 = 0.674.7
A little algebra shows that the posterior probability of an
epidemic, given that one is predicted, Prob(E+|P+), is: 0.76
(compared with prior probability of 0.40).
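The algebra in question is the conversion of the posterior odds back to a probability. A worked sketch using the rounded values from the slide (prior odds 0.67, LR(+) = 4.7, posterior odds 3.1):

\[
\mathrm{Prob}(E{+}\mid P{+}) = \frac{\mathrm{Odds}(E{+}\mid P{+})}{1 + \mathrm{Odds}(E{+}\mid P{+})} = \frac{3.1}{1 + 3.1} \approx 0.76
\]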
Fusarium Prediction Model
(Prior and posterior odds, and LR)
Probability before using predictor:
• Prob(E+) = 0.4
• Prob(E-) = 0.6
Probability after using the predictor (algebraic steps not shown):
• Prob(E+|P+) = 0.76
• Prob(E+|P-) = 0.13
• Prob(E-|P+) = 0.24
• Prob(E-|P-) = 0.87
Fusarium Prediction Model
(Value of predictor depends on LR and prior probability)
Prob(E+) = 0.4, Prob(E-) = 0.6
• Prob(E+|P+) = 0.76
• Prob(E+|P-) = 0.13
• Prob(E-|P+) = 0.24
• Prob(E-|P-) = 0.87
New calculations can be done with any new
information on prior odds.
Prior probability versus model accuracy
TPP = 0.820, TNP = 0.824, LR(+) = 4.7, LR(-)=0.22
Prob(E+)   Prob(E+|P+)   Prob(E-)   Prob(E-|P-)
0.01       0.045         0.99       0.998         (rare epidemics)
0.05       0.198         0.95       0.988
0.10       0.343         0.90       0.976
0.20       0.540         0.80       0.948
0.40       0.760         0.60       0.872         (nominal results)
0.60       0.876         0.40       0.752
0.80       0.949         0.20       0.532
0.90       0.977         0.10       0.336
0.95       0.989         0.05       0.193         (very common epidemics)
Forecasters need to be EXTREMELY accurate to be of value
for very rare or very common diseases!
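The dependence of the posterior probabilities on the prior can be recomputed for any set of priors. The sketch below applies the posterior-odds relationships to the reported TPP and TNP; the function name is illustrative, and small differences from the tabulated values can arise from rounding of the likelihood ratios.

```python
def posterior_table(tpp, tnp, priors):
    """Return rows of (Prob(E+), Prob(E+|P+), Prob(E-|P-)) for each prior."""
    lr_pos = tpp / (1.0 - tnp)
    lr_neg = (1.0 - tpp) / tnp
    rows = []
    for p in priors:
        odds_epidemic = p / (1.0 - p) * lr_pos        # Odds(E+|P+)
        odds_no_epidemic = (1.0 - p) / p / lr_neg     # Odds(E-|P-)
        rows.append((p,
                     odds_epidemic / (1.0 + odds_epidemic),
                     odds_no_epidemic / (1.0 + odds_no_epidemic)))
    return rows

priors = [0.01, 0.05, 0.10, 0.20, 0.40, 0.60, 0.80, 0.90, 0.95]
for prior, post_pos, post_neg in posterior_table(0.820, 0.824, priors):
    print(f"Prob(E+)={prior:.2f}  Prob(E+|P+)={post_pos:.3f}  Prob(E-|P-)={post_neg:.3f}")
```

With a prior of 0.01, even a positive prediction leaves the probability of an epidemic below 0.05, which is the point made about rare diseases.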
Prior probability versus model accuracy
TPP = 0.9, TNP = 0.91, LR(+) = 10.0, LR(-)=0.11
Example results, more accurate model (overall)
Prob(E+)   Prob(E+|P+)   Prob(E-)   Prob(E-|P-)
0.01       0.092         0.99       0.999         (rare epidemics)
0.05       0.344         0.95       0.994
0.10       0.526         0.90       0.988
0.20       0.714         0.80       0.973
0.40       0.870         0.60       0.932
0.60       0.938         0.40       0.858
0.80       0.976         0.20       0.694
0.90       0.989         0.10       0.503
0.95       0.995         0.05       0.324         (very common epidemics)
Other thresholds of predictor:
[Figure: ROC curve (TPP versus FPP = 1 - TNP) with a lower-threshold and a higher-threshold operating point marked, shown alongside the plots of predicted probability versus predictor Z and of correct decisions (TPP, TNP) versus predictor threshold. FNP = 1 - TPP and FPP = 1 - TNP.]
Prior probability versus model accuracy
[use higher threshold for lower Prob(E+)]
TPP, TNP, and hence LR(+) and LR(-) are changed
[Figure: ROC curve (TPP versus FPP = 1 - TNP) with lower-threshold and higher-threshold operating points marked.]
Prob(E+|P+):
Prob(E+)   Nominal   High Thresh.   Low Thresh.
0.01       0.045     0.095          0.021
0.05       0.198     0.353          0.100
0.10       0.343     0.535          0.190
0.20       0.540     0.722          0.345
0.40       0.760     0.874          0.584
0.60       0.876     0.940          0.760
0.80       0.949     0.976          0.894
0.90       0.977     0.989          0.950
0.95       0.989     0.995          0.976
Prior probability versus model accuracy
[use lower threshold for higher Prob(E+)]
TPP, TNP, and hence LR(+) and LR(-) are changed
[Figure: ROC curve (TPP versus FPP = 1 - TNP) with lower-threshold and higher-threshold operating points marked.]
Prob(E-|P-):
Prob(E+)   Nominal   High Thresh.   Low Thresh.
0.01       0.998     0.993          0.999
0.05       0.988     0.963          0.994
0.10       0.976     0.924          0.988
0.20       0.948     0.844          0.974
0.40       0.872     0.670          0.933
0.60       0.752     0.474          0.861
0.80       0.532     0.253          0.698
0.90       0.336     0.131          0.507
0.95       0.193     0.066          0.328
Optimum threshold
Previous result:
Odds(E+|P+) = Odds(E+)LR(+)
Cost ratio (for a predictor):
CR = (CFP – CTN)/(CFN-CTP)
≈ CFP/CFN
It can be shown that the optimum
threshold to minimize costs (on average)
is found from:
CR = Odds(E+)f′(FPP)
Where f′(FPP) is the first derivative of the
ROC curve, the instantaneous likelihood
ratio at point (FPP,TPP).
[Figure: ROC curve, TPP = f(FPP), with the tangent slope f′(FPP) at an operating point.]
Optimum threshold
CFP/CFN ≈ CR = Odds(E+)f′ (FPP)
In practice, for a given prevalence and
CR, get: f′ (FPP) = CR/Odds(E+)
Solve for corresponding FPP and TPP
(from a ROC model)
Find the corresponding threshold of Z for
operating the predictor. Determine
predictor results from Odds(E+|P+)
For high CR, move down
the ROC curve, resulting in
higher threshold
For low CR, move up the
ROC curve, resulting in
lower threshold
Optimum threshold example
Three examples for CR:
CR (CFP/CFN)   TPP    TNP (1-FPP)   Resulting threshold for predictor
0.25           0.90   0.65          -1.70
1.0            0.74   0.86          -0.08
2.0            0.61   0.92          +0.08
[Figure: proportion of correct decisions (TPP and TNP, 0.00 to 1.00) versus predictor threshold (-8 to 8), with the thresholds for the three CR values marked. FNP = 1 - TPP and FPP = 1 - TNP.]
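The link between cost ratio and threshold can be checked by direct cost comparison. The sketch below takes the three operating points from the example, assumes a prior Prob(E+) of 0.40 and zero cost for correct decisions (so CR = CFP/CFN), and picks the threshold with the lowest average cost. The slide does not state which prior was used for its example, so the prior here is an assumption.

```python
def expected_cost(tpp, tnp, prior, cost_ratio):
    """Average cost per decision, in units of C_FN, with correct decisions free."""
    false_negatives = prior * (1.0 - tpp)            # missed epidemics, cost 1 each
    false_positives = (1.0 - prior) * (1.0 - tnp)    # needless interventions, cost CR each
    return false_negatives + cost_ratio * false_positives

def target_roc_slope(prior, cost_ratio):
    """ROC slope at the optimum: f'(FPP) = CR / Odds(E+)."""
    return cost_ratio / (prior / (1.0 - prior))

# (threshold, TPP, TNP) taken from the three example rows.
operating_points = [(-1.70, 0.90, 0.65), (-0.08, 0.74, 0.86), (0.08, 0.61, 0.92)]

def best_threshold(prior, cost_ratio, points=operating_points):
    """Threshold among the candidate points with the lowest expected cost."""
    return min(points, key=lambda pt: expected_cost(pt[1], pt[2], prior, cost_ratio))[0]

for cr in (0.25, 1.0, 2.0):
    print(f"CR={cr:4.2f}  slope={target_roc_slope(0.40, cr):.2f}  "
          f"threshold={best_threshold(0.40, cr):+.2f}")
```

A higher cost ratio penalizes false positives more heavily, so the preferred operating point moves to a higher threshold.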
Predictor
(Conclusions)
• Although the terminology and methodology of Bayesian
decision theory, etc., are foreign to most plant
pathologists, the concepts are intuitive (once an initial
hurdle is overcome)
• Work to date in plant pathology has dealt with relatively
simple scenarios
– Binary reality (epidemic or not; need to spray or not)
– Binary decisions (epidemic or not; spray or not)
• The next phase of work will deal with more complex
scenarios
– (quantitative reality, decisions, results)
• More work is also needed on predictor validation and
costs of using a predictor
Overall conclusions
• “Epidemiological analysis has come to stay”
– Vanderplank (1963)
• The prophetic words of Vanderplank certainly have
come true
– The discipline of botanical epidemiology has evolved in many
ways, and will continue to evolve
• There is a continuing need for epidemiology (in the
broad sense), since many of the easy problems have
been solved.
– The hard problems may require novel research, which will likely
involve complex research and analysis
Coupled differential equations
(expansions)
Expand for:
• Host response to infection
• Multiple pathogens
• Multiple hosts
• Biocontrol agents
• Heterogeneity
• Transmission by vectors
• Vector dynamics (density, …)
• Etc.
[Figure: flow diagram of host states H, L, I, R with inoculum x.]
Also expand models
(stochastic, discrete time, …)
Analytical solutions may be
challenging, without approximations,
but thresholds may still be found
Spatio-temporal dynamics
• A highly desirable property of coupled differential equations is that
they can be expanded for spatial processes
• Generally requires the use of partial (rather than ordinary)
differential equations
• Requires specification of a contact distribution, D(s)
– A statistical distribution: scaled version of a dispersal gradient (so
integration over all distances is 1)
• Can lead to thresholds for spread
– Velocity of disease-focus expansion proportional to ln(R0)
Mathematical solutions may depend
on simple initial conditions (e.g., a
single original focus), and/or other
approximations, although numerical
solutions can be found for ‘any’ set of
initial conditions
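Power-law (Pareto) type of contact distribution in example
\[
\begin{aligned}
\frac{\partial H(t,s)}{\partial t} &= -\beta H(t,s) \int_{-\infty}^{\infty} I(t,\xi)\, D(s-\xi)\, d\xi \\
\frac{\partial L(t,s)}{\partial t} &= \beta H(t,s) \int_{-\infty}^{\infty} I(t,\xi)\, D(s-\xi)\, d\xi - \omega L(t,s) \\
\frac{\partial I(t,s)}{\partial t} &= \omega L(t,s) - \mu I(t,s) \\
\frac{\partial R(t,s)}{\partial t} &= \mu I(t,s)
\end{aligned}
\]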
“Dispersive wave” in example: increasing velocity, s/t
[Figure: solutions, disease intensity versus distance (s), on a linear scale (0 to 1000) and on a log scale (0.0001 to 1000), for distances from -40 to 40. Need to identify time and location.]
s/t is proportional to ln(R0)
Spatio-temporal dynamics
• A mathematical approach to epidemic
characterization may become too cumbersome
with complex starting conditions
– Multiple foci, of varying sizes, locations, and
inoculum densities, as well as numbers of
diseased individuals
• Numerical methods (simulations) are useful here
– See Xu & Ridout (1998; Phytopathology)
• A different approach typically is taken, however
– Based on statistical models
• Utilizes properties of the measured variable, and
measures of covariance (correlation) and heterogeneity
– Borrowed from biometry, statistical ecology, etc.
• Recent work shows that many of the original
approaches need modification when analyzing disease
(since disease intensity is a proportion, not a count)
– Madden & Hughes (1995; Annu. Rev. Phytopathol.)
review the topic
– Leads naturally to sampling protocols, and
appropriate analyses for treatment effects
Spatio-temporal dynamics
• Coupled differential equations can be expanded for spatio-temporal dynamics
– Rates of disease expansion
– Covered by Scherm
• Analytical results may depend on fairly simple starting conditions (single focus)
• Simulations may be needed for general understanding
• With complex scenarios (multiple foci, different inoculum levels, etc.), more statistical approaches are required
– “Spatial analysis”
[Figure: disease intensity versus distance (s), on a linear scale (0 to 1000) and on a log scale (0.0001 to 1000), for distances from -40 to 40.]