Lecture_Statistics_Revisited - Sortie-ND

Download Report

Transcript Lecture_Statistics_Revisited - Sortie-ND

Lecture 7
All of statistics……revisited
Likelihood Methods in Forest Ecology
October 9th – 20th , 2006
Standard statistics revisited
Bolker
Standard statistics revisited:
Simple Variance Structures
Standard statistics revisited
General linear models
• Predictions are a linear function of a set of parameters.
• Includes:
– Linear models
– ANOVA
– ANCOVA
• Assumptions:
– Normally distributed, independent errors
– Constant variance
• Not to be confused with generalized linear models!
• Distinction between factors and covariates.
Linear regression
Y ~ a  bX  N ( 0, 2 )
Standard R code:
>lm.reg<-lm(Y~X)
>summary(lm)
>anova(lm.reg)
Likelihood R code:
>lmfun<-function(a, b, sigma)
{
Y.pred<-a+b*x
-sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
Analysis of variance (ANOVA)
Yij ~  i   j  ij  N ( 0, 2 )
Standard R code:
>lm.onewayaov<-lm(Y~f1)
>summary(lm.aov)
>anova(lm.aov) # will give you an ANOVA table
Likelihood R code:
>aovfun<-function(a11, a12, sigma)
{
Y.pred<-c(a11,a12)
-sum(dnorm(DBH, mean=Y.pred, sd=sigma, log=TRUE))
}
Analysis of variance (ANOVA): H &M (p177)
Individual and cage effects on fly wing length
Yij ~  i   j  ij  N ( 0, 2 )
Cage
Female
Left wing
Right wing
Compare:
1
1
1
2
1
3
1
2
2
2
2
3
3
3
12
Table 7.5
58.5
59.5
Likeli of mean model
Likeli of cage model
Likeli of indiv fly model
Analysis of covariance (ANCOVA)
Yi ~  i  i X  N ( 0, 2 )
Standard R code:
>lm.anc<-lm(Y~f*X)
>summary(lm.anc)
>str(summary(lm.anc))
Likelihood R code:
>ancfun<-function(a11, a12, slope1, slope2, sigma)
{
Y.pred<-c(a11,a12)[f] + c(slope1, slope2)[f]*X
-sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
Standard statistics revisited
Nonlinearlity: Non-linear least squares
Yi ~ aX b  N ( 0, 2 )
Uses numerical methods similar to those use in likelihood
Standard R code:
>nls(y~a*x^b, start=list(a=1,b=1)
>summary(nls)
>str(summary(lns))
Likelihood R code:
>nlsfun<-function(a, b, sigma)
{
Y.pred<-a*x^b
-sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
Standard statistics revisited
Generalized linear models
•
•
Assumptions:
– Non-normal distributed errors ( but still independent and only certain kinds
of non-normality)
– Non-linear relationships are allowed but only if they have a linearizing
transformation (the link function).
Linearizing transformations:
y
 y 




x

log
x
1

y
1 e


ex
y  e x   x  log( y )
x  y   y  x2
•
Non-normal distributed errors ( but still independent and only certain kinds of
non-normality). These include the exponential family and are typically used
with a specific linearizing function.
• Poisson: loglink
• Binomial: logit transfomation
• Gamma: inverse Gaussian
•
Fit by iteratively reweighed least square methods: estimate variance
associated with each point for each estimate of parameter(s).
Not to be confused with general linear models!
•
GML: Poisson regression
Standard R code:
y  ae
>glm.pois<-glm(Y~X, family=poisson)
>summary(gml.pois)
Likelihood R code:
>poisregfun=function(a,b)
{Y.pred<-exp(a+b*X)
-sum(dpois(Y, lambda=Y.pred, log=TRUE))}
bx
GML: Logistic regression
Standard R code:
y
ex
1 ex
>glm2<-glm(y,x, family=“binomial”)
 y 

link function  log it ( x )  log 
>summary(gml2)
1 y 
Likelihood R code:
>logregfun=function(a,b,N)
{p.pred<-exp(a + b*X))/(1+exp(a + b*X))
-sum(dbinom(Y, size=N, prob=p.pred, log=TRUE))}
Standard statistics revisited
Generalized (non)linear least-squares models:
Variance changes with a covariate or among groups
Standard R code:
yi ~ c  N ( 0, i2 )
>gls<-gls(y~1,weights=varIdent(form=~1|f)
>summary(gls)
Likelihood R code:
>vardifffun=function(a, sd1,sd2)
{sdval<-c(sd1,sd2)[f]
-sum(dbinom(Y, mean=a, sd=sdval, log=TRUE)}
Standard statistics revisited:
Complex Variance Structures
Complex error structures
• Error structures are not independent
• Complex likelihood functions
• Includes:
– Time series analysis
– Spatial correlation
– Repeated measures analysis
Variance-covariance matrix
x
Vector of
Vector of Means (pred)
data
x
Complex error structures
x
exp( xi  i )2 / 2 2
exp( xi  i ) / 2
2
exp( xi  i )2 / 2 2
Increasing variance
Independent
2
(x
exp( xi  i )2 / 2 i2
General case
Complex error structures
• Variance/covariance matrix is symetric so we need to
specify at most n(n-1)/2 parameters.
• V/C matrix must also be positive definite (logical), this
translates to having a positive eigenvalue or positive
diagonal values/
• Select elements of matrix that define the error structure
and ensure positive definite.
|  | 1
• In this example, correlation drops off with the number ofd
steps between sites.c
Complex error structures: An example
Spatially-correlated errors
R code:
>rho=0.5
>m=matrix(nrow=5, ncol=5)
>m<-rho^(abs(row(m)-col(m)) #OR#
>m[abs(row(m)-col(m))==1]=rho
>mvlik<-function(a,b,rho)
{
mu=a+b*x
n=length(x)
m=diag(n) generates diag matrix of n rows, n columns
m[abs(row(m)-col(m))==1]=rho
-dmvnorm(y, mu, Sigma=m, log=TRUE)
}
Mixed models & Generalized
linear mixed models (GLMM)
• Samples within a group (block, site) are equally correlated
with each other.
• Fixed effects: effects of covariates
• Random effects: block, site etc.
• GLMM’s are generalized linear models with random
effects
Complex variance structures
• So how do you incorporate all potential
sources of variance?
– Block effects
– Individual effects (repeated measures
includes both individual and temporal
correlation)
– Measurement vs. process error
– …..
Bolker
Analyses
of
Experimental data
Threshold
Ambien density
Natural variation
Variance
DD detected
DD undetected
Osenberg et al. 2002
Why variation in experimental
conclusions?
•
•
•
•
•
Inference derived from p-value
No effect size (strength of the process)
No per-capita effects
Time difference between studies
Spatial extent difference between studies
Approach:
Analyze data using one equation
di
dd
Results
No difference in per-capita effects
Difference due to initial density
So….beware of experiments!
Evidence of a subterranean
trophic cascade (HSS)
Strong et al. 1999 Ecology
Lupine
Response=Survival (1/0)
15 reps each
treatment
Root feeding caterpillar
(x= 0,8, 16,32)
Caterpillar effect
Base mort
Nematode
Present/absent
Hierarchical structure
Analyses
i=1 (nem pres), 2 (nem absent)
j = caterpillar treatment j
No. of sdlgs dead = Binomial random variable
Model selection
Results
• 47% died in absence of nematodes
• 11% died in presence of nematodes
Nematodes present
Nematodes absent
Model selection
Four models
beta 2 positive (neg eff of nematodes and
diff from beta 1 (which = 0 or < beta 2)
Traditional Approach:
Logistic regression
Logistic regression
Beware of canned packages! Need to determine hierarchical error
structure when testing complex hypotheses
Take-home points
• Non-linear effects and non-normal
response variables will often cause
problems with canned packages.
• Focus on model construction, parameter
estimation and model evaluation.
• Represent variability in your data using the
appropriate probability function
Predator-induced hatching
plasticity
Vonesh & Bolker 2004
Trait-mediated predator effects
• Density effects: consumptive effects
resulting from predators killing prey
(affects density).
• Trait effects: non-consumptive effects
resulting from changes in prey behavior
or morphology in response to predation
risk (e.g., growth rates)
Lutberg & Kirby 2005
Trait-mediated effects
Preisser et al. 2005
Predator-mediated plasticity in anurans
• Prey respond to predators by changing
their behavior, morphology and life history.
• Timing of habitat shifts, metamorphosis,
and hatching involve change of habitat
and often, suite of predators.
• Timing of transition between two life
stages should evolve in response to
variation in growth and mortality among
life history stages.
Postponement of hatching in response to
predators may
• Allow hatchlings to reach a greater body
size before encountering predators, thus
increasing their survival
• But…there may different predation risk at
different life stages (i.e., terrestrial vs
aquatic predators) so it may be best to
hatch early.
• How are these tradeoffs determined?
The study system
Predator effects on terrestrial stage
• Both frogs and flies cause embryos to
hatch approximately 30% earlier.
• Early hatchlings have lower weights and
are at earlier developmental stages
• Frogs can reduce density of tadpoles
entering the pond by 60%
• Flies have a much smaller effect.
• So both size and density change over
time!
Experiment I: Quantifying the functional
response
• Vary larval density in the presence and
absence of the dragonfly (aquatic
predator).
Scientific Model: Functional responseMortality as a function of density
Keep size fixed
Number of predators
Number of prey
eaten in t days
Attack rate
Handling time
•Assume that actual attacked number follows a binomial distribution
with p =probability of an individual being killed over the course
of the experiment
•Obtain estimates of α and HD that maximize the likelihood.
Experiment II: Effect of larval size on
predation risk
• Expose five larval age/size classes to
aquatic predators.
• Dragonflies were replaced daily to keep
predator densities constant
The Scientific Model Part II:
Size-specific mortality
(Keep density fixed)
Prey size
Phenomenological scientific model
Size-specific
predation prob.
Assume that probability of predation
follows a binomial distribution with
this probability. This function peaks at
intermediate prey sizes
Combining size & density-dependent mortality
to predict attack rate
• Two tricks….
Number of prey eaten
per predator per day=
It becomes the risk of predation
at density N
Density of predation in
the size experiment
Monte-Carlo methods
• Population of interest is simulated.
• Draw repeated samples from pseudopopulation.
• Statistic (parameter) computed in each
pseudo-sample.
• Sampling distribution of statistic
examined.
• Where do true parameters fall within this
distribution?
Basic procedure
1. Calculate predicted values with known
parameter values (these may also be
calculated from data).
2. Add random error to predicted values to
create observed.
3. Estimate parameter values given observed
and predicted.
4. Go back to step 2 and loop through 100-1000
times.
5. Examine frequency distribution of estimated
parameters of interest.
Describe the distribution of the
predicted variable
• Vonesh & Bolker obtain parameter estimates of their
model with CI and variance-covariance matrix.
• Draw repeatedly from these distributions.
• Simulate larval growth and survival from estimated
parameters and error around estimates (var-cov
matrix).
• Generate expected distribution of the variable of
interest (number killed).
• Can do these with just one set of data analyzed in a
traditional framework.
Measurement & Observation Error
Schnute
No process uncertainty
(measurement error)
X measured
Perfectly
(process error)
Why should we care?
• Measurement error only affects the current
measurement.
• Process error propagates through time.
• This is big deal in dynamic models.
Bolker
A famous example
Pascual & Kareiva 1996
Fit L-V models to
Gause’s data
Traditional conclusion
Observation & Process Error
• Process uncertainty: random events cause the
response variable to change in ways that are not
predicted by the model. These may be errors in
the process itself or in the observer of y.
Propagates through time.
• Observer uncertainty: Error in sampling due to
measurement. Error in the predictive variable.
Does not propagate through time.
Observation error
• We only need initial conditions.
• We take observation from each time step
It, and predict just the next step, I(t+1)
• Contrast trajectory and actual data.
• Minimize the difference between observed
and predicted data
• Often involves non-linear minimization
Process error
• We need complete series of observations.
• We take observation from each time step
It, and predict just the next step, I(t+1)
• For estimation we fit a regression between
N(t+1) and N(t)
• Minimize the difference b/observed and
predicted N(t+1)
• One-step ahead fitting
• Linear regression approach
Estimation
• To estimate both observation and process error we need
either independent estimates of:
– the magnitude of the errors OR
– their relative size
• Otherwise we have to choose between the two
• Fitting assuming observation errors provides unbiased
and more precise estimates even when data contained
only process error. However, it produces downwardbiased estimates of variance.
• If two kinds of errors uncorrelated, it gives the extreme
values of possible parameter estimates.
Statistical flip-flopping
• We often use MLE’s to estimate parameters
although IT does not have this requirement.
• An AIC does not say anything about our
confidence (or error) in the parameter estimate.
• Therefore, we resort to frequentist stats to
generate some 95% CI.
• An alternative is to generate a true likelihood
profile and chi-square but ultimately this also
produces a p value.
• The only consistent statistical logic is Bayesian
Philosophy vs pragmatism
• It is useful to have a broader more
encompassing philosophy but…
• Greater generality often implies greater
complexity –often computational and
mathematical