Biological proxies in sedimentary archives
Download
Report
Transcript Biological proxies in sedimentary archives
BIOLOGICAL PROXIES IN
SEDIMENTARY ARCHIVES –
PROGRESS, PROBLEMS, POTENTIALITIES
John Birks
University of Bergen and University
College London
NICE Autumn School on Methods of Quantitative
Palaeoenvironmental Reconstructions
6-10 October 2008
CONTENTS
Introduction
Indicator-species approach
Assemblage approach
Transfer-function approach
Linear-based transfer-function methods
Unimodal-based transfer-function methods
Pollen-climate response surfaces
Modern analogue-based approaches
Consensus reconstructions and smoothers
Problems of spatial autocorrelation in transfer functions
Use of artificial simulated data-sets
Multi-proxy approaches
Use of methods
Implications for current projects
Potential palaeoecological applications of reconstructions
Acknowledgements
INTRODUCTION
Emphasis on biological climatic proxies in continental
sedimentary archives
Some mention of biological proxies in marine archives in order to
put the available methods for the quantitative reconstruction of
past climates into a broader context
Three basic approaches
• indicator-species approach
• assemblage approach
• transfer-function approach (=calibration in statistics,
bioindication in applied ecology)
All can provide quantitative reconstructions of past climate
Some of the methods were originally developed for the quantitative
reconstruction of other environmental variables (e.g. lake-water
pH) but are directly applicable to climate reconstructions
Gradient Analysis and Bioindication
Relation of species to environmental variables or gradients
Ecology
Gradient analysis:
Environment gradient
Community or Assemblage
In gradient analysis, use environmental conditions or gradient
values to explain community composition
Palaeoecology
Community or Assemblage
Environment gradient
In bioindication, use species optima or indicator values to obtain an
estimate of environmental conditions or gradient values.
Calibration, bioindication, reconstruction.
Relevant Reviews of Continental
Reconstruction Approaches
Birks HJB (1995) Quantitative palaeoenvironmental reconstructions. In: Maddy &
Brew (eds) Statistical Modelling of Quaternary Science Data. QRA pp. 161-254.
Birks HJB (1998) Numerical tools in palaeolimnology – progress, potentialities, and
problems. J. Paleolimnol. 20: 307-332.
Birks HJB (2003) Quantitative palaeoenvironmental reconstructions from Holocene
biological data. In: Mackay et al. (eds) Global Change in the Holocene. Arnold, pp.
107-123.
Birks HJB & Seppä H (2004) Pollen-based reconstructions of late-Quaternary climate
in Europe – progress, problems, and pitfalls. Acta Palaeobotanica 44: 317-324.
ter Braak CJF (1995) Non-linear methods for multivariate statistical calibration in
palaeoecology. Chemo. Intell. Lab. Sys. 28: 165-180.
ter Braak CJF (1996) Unimodal Models to Relate Species to Environment (2nd edn).
DLO-Agricultural Maths Group, Wageningen.
ter Braak CJF & Prentice IC (1988) A theory of gradient analysis. Advances Ecol. Res.
18: 271-317.
Also for marine environments: Guiot J & de Vernal A (2007) Transfer functions: methods
for quantitative paleoceanography based on microfossils. In: Hillaire-Marcel & de Vernal
(eds) Proxies in Late Cenozoic Paleoceanography. Elsevier, pp. 523-563.
Notation
Xf
Past climatic variable to be reconstructed
Yf
Fossil biological proxies used in the climatic
reconstruction
Xm
The same climatic variable as Xf but modern
values
Ym
Modern biological data from the same localities
as Xm
Ûm
Modern transfer functions estimated
mathematically from Ym and Xm
Basic Idea of Quantitative Environmental
Reconstruction
Fossil biological data
(e.g., pollen, chironomids)
'Proxy data'
Environmental variable
(e.g., temperature)
1, ........... m species
1
Yf
t
samples
Xf Unknown.
To be estimated
or reconstructed
t
samples
To solve for Xf, need modern data about species and climate
1, ........... m species
1
Ym
Xm
n
samples
n
samples
Modern biology
(e.g., pollen, chironomids)
Modern environment
(e.g., temperature)
INDICATOR SPECIES APPROACH
Viscum album (mistletoe), Hedera helix (ivy), & Ilex aquifolium
(holly) at or near their northern limits in Denmark (Iversen
1944)
Considered occurrence, growth, and fruit production in relation
to mean temperature of the coldest month and of the warmest
month to delimit 'thermal
limits'.
Hedera helix
Iversen 1944
Mid-Holocene changes:
ca. 2-2.5ºC warmer
summers
ca. 1-1.5ºC warmer
winters
Birks 1981
The thermal-limit curves for
Ilex aquifolium, Hedera helix,
and Viscum album in relation to
the mean temperatures of the
warmest and coldest months.
1, 2, and 3 represent samples
with pollen of Ilex, Hedera, &
Viscum, Hedera & Viscum, and
Ilex & Hedera, respectively.
Walther et al. (2005)
show shifts in the
northern limit of Ilex
(holly) in last 50 years
Ilex aquifolium
in response to climatic
changes, confirming its sensitivity to climatic shifts.
Viscum album
ASSEMBLAGE APPROACH
Compare fossil assemblage with modern assemblages
from known environments.
Identify the modern assemblages that are most similar
to the fossil assemblage and infer the past environment
to be similar to the modern environment of the relevant
most similar modern assemblages.
If done qualitatively, standard approach in Quaternary
pollen analysis, etc., since 1950s.
If done quantitatively, modern analogue technique or
analogue matching.
Modern analogue technique (MAT) = k-nearest neighbours (k-NN).
e.g. Hutson (1980), Prell (1985), Guiot (1985, 1987, 1990), ter Braak (1995)
Compare fossil sample (Yf)
t with modern sample i
Repeat for all
modern samples (Ym)
Repeat for
all fossil
samples
Calculate DC
between t and i
DC=dissimilarity coefficient
= proximity measure
Select k-closest
analogues for fossil
sample t
Estimate past environment (Xf)
for sample t as (weighted) mean
of the modern environment (Xm)
of the k analogues
Value of k estimated
by visual inspection,
arbitrary rules (e.g.,
10, 20, etc.), or
cross-validation
Mutual Climatic Range Method
Grichuk et al.
Atkinson et al.
Coleoptera
USSR 1950s–1960s
UK
1986, 1987
TMAX - mean temperature of warmest month
TMIN - mean temperature of coldest month
TRANGE - TMAX–TMIN
Quote median values of mutual overlap and ‘limits
given by the extremes of overlap'.
Thermal envelopes for hypothetical species A, B, and C
Schematic representation of the Mutual Climate Range method
of quantitative temperature reconstructions (courtesy of Adrian
Walking).
ASSUMPTIONS
1. Species distribution is in equilibrium with climate.
2. Distribution data and climatic data are same age.
3. Species distributions are well known, no problems with
species introductions, taxonomy or nomenclature.
4. All the suitable climate space is available for species to
occur. ? Arctic ocean, ? Truncation of climate space.
5. Climate values used in MCR are the actual values where
the beetle species lives in all its known localities. Climate
stations tend to be at low altitudes; cold-tolerant beetles
tend to be at high altitudes. ? Bias towards warm
temperatures. Problems of altitude, lapse rates.
495 climate stations across Palaearctic region from
Greenland to Japan. Deriving reliable climate data is a
major problem.
Climate reconstructions from (a) British Isles, (b) western Norway, (c) southern
Sweden and (d) central Poland. TMAX refers to the mean temperature of the warmest
month (July). The chronology is expressed in radiocarbon years BPx1000 (ka). Each
vertical bar represents the mutual climatic range (MCR) of a single dated fauna. The
bold lines show the most probable value or best estimate of the palaeotemperature
derived from the median values of the MCR estimates and adjusted with the
consideration of the ecological preferences of the recorded insect assemblages.
Coope & Lemdahl 1995
Probability Density Functions
Kühl et al. (2002) Quaternary Research 58; 381-392
Kühl (2003) Dissertations Botanicae 375; 149 pp.
Kühl & Litt (2003) Vegetation History & Archeobotany 12; 205-214
Basic idea is the quantify the present-day distribution of plants that
occur as Quaternary fossils (pollen and/or macrofossils) in terms of July
and January temperature and probability density functions (pdf).
Assuming statistical independence, a joint pdf can be calculated for a
fossil assemblage as the product of the pdfs of the individual taxa. Each
taxon is weighted by the extent of its climatic response range, so
'narrow' indicators receive 'high' weight.
The maximum pdf is the most likely past climate and its confidence
interval is the range of uncertainty.
Can be used with pollen (+/-) and/or macrofossils (+/-).
Distribution of Ilex aquilifolium
in combination with January
temperature.
Kühl et al. 2002
Estimated probability density function
of Ilex aquilifolium as an example for
which the parametric normal
distribution (solid line) fits well the
non-parametric distribution (e.g.,
Kernel function (dashed line)
histogram).
Kühl et al. 2002
Estimated one- and two-dimensional
pdfs of four selected species. The
histograms (non-parametric pdf) and
normal distributions (parametric pdf) on
the left represent the one-dimensional
pdfs. Crosses in the right-hand plots
display the temperature values provided
by the 0.5º x 0.5º gridded climatology
(New et el., 1999). Black crosses
indicate presence, grey crosses absence
of the specific taxon. A small red circle
marks the mean of the corresponding
normal distribution and the ellipses
represent 90% of the integral of the
normal distribution centred on . Most
sample points lie within this range. The
interval, however, may not necessarily
include 90% of the data points. Carex
secalina as an example of an azonally
distributed species is an exception. A
normal distribution does not appear to
be an appropriate estimating function
for this species, and therefore no
normal distribution is indicated.
Reconstruction for the fossil assemblage of Gröbern. The thin
ellipses indicate the pdfs of the individual taxa included in the
reconstruction, and the thick ellipse the 90% uncertainty range
of the reconstruction result.
Kühl et al. 2002
Simplified pollen diagram for last interglacial from Gröbern (Litt 1994),
reconstructed January and July temperature, and 18O (after Boettger et al.
2000).
Kühl & Litt 2003
Reconstructed most probable mean January (blue) and July (red)
temperature and 90% uncertainty range (dotted lines) for three last
interglacial sequences
Kühl & Litt 2003
Comparison of the
reconstructed mean
January temperature
using the pdfmethod (green) and
the modern
analogue technique
(blue).
Bispingen uncertainty
range – 90%; La
Grande Pile – 70%.
Kühl & Litt 2003
TRANSFER-FUNCTION APPROACH
General Theory
Y - biological responses ("proxy data")
X - set of environmental variables that are assumed to be causally related
to Y (e.g. sea-surface temperatures)
B - set of other environmental variables that together with X completely
determine Y (e.g. trace nutrients)
If Y is totally explicable as responses to variables represented by X and B, we have a
deterministic model (no allowance for random factors, historical influences)
Y = XB
If B = 0 or is constant, we can model Y in terms of X and Re, a set of ecological
response functions
Y = X (Re)
In palaeoecology we need to know Re. We cannot derive Re deductively from
ecological studies. We cannot build an explanatory model from our currently poor
ecological knowledge.
Instead we have to use direct empirical models based on observed patterns of Y
in modern surface-samples in relation to X, to derive U, our empirical
calibration or transfer functions.
Y = XU
Imbrie & Kipp 1971
Basic Idea of Quantitative Environmental
Reconstructions using Transfer Functions
BIOLOGICAL DATA
ENVIRONMENTAL DATA
(e.g. Diatoms, pollen,
chironomids)
(e.g. Mean July temperature)
Modern data
”training set”
1,
1 variable
,m
taxa
Ym
n
samples
Fossil data
Ûm
n
samples
1,
,m
taxa
Yf
t
samples
Xm
+
1 variable
+ Ûm
Xf
t
samples
Unknown
To be reconstructed
from Ûm
In practice, this is a two-step process
Regression in which we estimate Ûm, modern calibration functions
or regression coefficients
or
Ym = Ûm(Xm)
Training set
Ym modern surface-sample data
Xm associated environmental data
(classical regression)
Xm = Ûm(Ym)
(inverse regression)
^
Calibration, in which we reconstruct Xf, past environment, from
fossil core data using transfer functions Ûm or their inverse Ûm-1
or
^
Xf = Ûm-1(Yf)
Yf fossil core data fossil set
(classical calibration)
^
(inverse calibration)
Xf = Ûm(Yf)
Xm
Ym
Ûm TRANSFER
Yf
FUNCTION
Xf
Based on an
unpublished
diagram by
Steve Juggins
Surface-Sediment Sampling
Renberg HON corer
Renberg HON corer + sediment
core & mud-water interface
304 modern pollen
samples Norway,
northern Sweden,
Finland (Sylvia Peglar,
Heikki Seppä, John
Birks, Arvid Odland)
All from lakes of
comparable size and
morphometry, all
collected in same way,
and all with consistent
pollen identifications and
taxonomy
Now extended to
Estonia, Latvia,
Lithuania, Sweden and
western Russia – c. 950
comparable samples
(Seppä et al. unpublished)
Seppä & Birks 2001
PROXIES
Pollen - good indicators of vegetation and hence indirect indicators of climate.
Betula (birch)
Alnus (alder)
Quercus (oak)
Pinus (pine)
Empetrum nigrum
Agropyron repens
(Gramineae) (grass)
(crowberry)
Modern pollen, identical treatment, all at same magnification, all stained with safranin
Chironomids - good indicators of past lake-water
temperatures and hence indirectly of past climate
Common late-glacial chironomid taxa. A: Tanytarsina; b: Sergentia; c:
Heterotrissocladius; d: Hydrobaenus/Oliveridia; e: Chironomus; f: Dicrotendipes; g:
Microtendipes; h: Polypedilum; i: Cladopelma. Scale bar represents 50 m.
Freshwater diatoms - excellent indicators of lake-water chemistry
(e.g. pH, total P). Not reliable direct climate indicators.
Basic Biological Assumptions
Marine planktonic foraminifera - Imbrie & Kipp 1971
Foraminifera are a function of sea-surface temperature (SST)
Foraminifera can be used to reconstruct past SST
Pollen
Pollen is a function of vegetation
Vegetation is a function of climate
Pollen is an indirect function of climate and can be used to reconstruct
past climate
Chironomids (aquatic non-biting midges)
Chironomids are a function of lake-water temperature
Lake-water temperature is a function of climate
Chironomids are an indirect function of climate and can be used to
reconstruct past climate
Freshwater diatoms (microscopic algae)
Diatoms are a function of lake-water chemistry
Diatoms can be used to reconstruct past lake-water chemistry
Lake-water chemistry may be a very weak function of climate
Diatoms may be a very weak function of climate
Biological Proxy Data Properties
• May have 200-300 species, expressed as proportions
or percentages in 200-500 samples
• “Closed” compositional data – difficult statistical
properties
• Multicollinearity
• Biological data contain many zero values (absences)
• Species generally show non-linear unimodal responses
to their environment, not simple linear responses
Environmental Data Properties
• Generally few variables, often show a skewed distribution
• Strong multicollinearity (e.g. July mean temperature,
growing season duration, annual mean temperature)
• Often difficult to obtain (few modern climate stations,
corrections for altitude of sampling sites, etc.)
• Strong spatial autocorrelation (tendency of values at sites
close to each other to resemble one another more than
randomly selected sites. Values at one site can be partially
predicted from its values at neighbouring sites)
Species Response Models
LINEAR
A unimodal relation
between the abundance
value (y) of a species
and an environmental
variable (x).
(u=optimum or mode;
t=tolerance;
c=maximum).
A straight line displays the linear
relation between the abundance
value (y) of a species and an
environmental variable (x), fitted
to artificial data (•). (a=intercept;
b=slope or regression coefficient).
UNIMODAL
Species Responses
Species nearly always have non-linear unimodal
responses along gradients
trees
(m)
J. Oksanen 2002
Assumptions in Quantitative Palaeoclimatic
Reconstructions
1. Taxa in training set (Ym) are systematically related to the climate (Xm) in
which they live.
2. Environmental variable (Xf , e.g. summer temperature) to be
reconstructed is, or is linearily related to, an ecologically important
variable in the system.
3. Taxa in the training set (Ym) are the same as in the fossil data (Yf) and
their ecological responses (Ûm) have not changed significantly over the
timespan represented by the fossil assemblage.
4. Mathematical methods used in regression and calibration adequately
model the biological responses (Um) to the environmental variable (Xm).
5. Other environmental variables than, say, summer temperature have
negligible influence, or their joint distribution with summer temperature
in the fossil set is the same as in the training set.
6. In model evaluation by cross-validation, the test data are independent
of the training data.
Birks et al. 1990, Telford & Birks 2005
Approaches to Estimating Transfer Functions
1. Basic Numerical Models
• Classical Approach
(1)
Y = f(X) + error
Biology
Environment
(2) Estimate f by some mathematical procedure and 'invert' our estimated
(f) to find unknown past environment Xf from fossil data Yf
Xf f-1(Yf)
• Inverse Approach
In practice, for various mathematical reasons, do an inverse regression or
calibration
(3)
X = g(Y) + error
(4)
Xf = g(Yf)
Obtain 'plug-in' estimate of past environment Xf from fossil data Yf
f or g are 'transfer functions'
2. Assumed Species Response Model
• Linear or unimodal
• No response model assumed (linear or non-linear)
3. Dimensionality
• Full (all species considered)
• Reduced (species components used)
4. Estimation Procedure
• Global (estimate parametric functions, extrapolation
possible)
• Local (estimate non-parametric functions, extrapolation
not possible)
Commonly Used Methods
Principal components regression (PCR)
Segmented linear inverse regression
I
I
L (U)
L
R
F
G
Ln
Partial least squares (PLS)
Guassian logit regression (GLR)
Two-way weighted averaging (WA)
I
C
C
L
U
U
R
F
F
G
G
G
WA-PLS
Modern analogue technique (MAT)
Artificial neural networks (ANN)
Smooth response surfaces
I
I
I
C
U
NA
NA
NA
R
F
F
F
G
Ln
Ln
Ln
I = inverse; C = classical
L = linear; U = unimodal; NA = not assumed;
R = reduced dimensionality; F = full dimensionality;
G = global parametric estimation; Ln = local non-parametric estimation
Reasons for preferring methods with assumed species
response model, full dimensionality, and global parametric
estimation
1. Can test statistically if species A has a statistically
significant relation to particular climate variables
2. Can develop ‘artificial’ simulated data with realistic
assumptions for numerical ‘experiments’
3. Such methods have clear and testable assumptions –
less of a ‘black box’ than, for example, artificial neural
networks
4. Can develop model evaluation or diagnostic procedures
analogous to regression diagnostics in statistical modelling
5. Having a statistical basis, can adopt well-established
principles of statistical model selection and testing.
Minimises ‘ad hoc’ aspects
LINEAR-BASED TRANSFER FUNCTIONS
Inverse Multiple Regression Approach
= principal components regression
Multiple regression of temperature (Xm) on abundance of taxa in core
tops (Ym) (inverse regression)
Xm = ÛmYm = b0 + b1y1 + b2y2 + b3y3…+ bmym
^
Xf = ÛmYf = b0 + b1y1 + b2y2 + b3y3…+ bmym
i.e.
^
Xf = b0 +
m
bkyik + e
S
k=1
Approach most efficient if:
1. relation between each taxon and environment is linear with normal
error distribution
2. environmental variable has normal distribution
Usually not usable because:
1. taxon abundances show multicollinearity
2. very many taxa
3. many zero values, hence regression coefficients unstable
4. basically linear model
Consider non-linear model and introduce extra terms:
Xm = b0 + b1c1 + b2c12 + b3c2 + b4c22 + b5c3 + b6c32 + …
Can end up with more terms than samples. Cannot be solved.
Hence dimension reduction approach of Imbrie & Kipp (1971)
Location of 61 core top samples (Imbrie & Kipp 1971)
61 core-top samples x 27 taxa
Principal components analysis + varimax rotation (‘factor analysis’)
61 samples x 4 varimax assemblages (79%)
Will use as illustrative data-set in this lecture
Abundance of the subtropical assemblage
versus winter surface temperature for 61
core top samples. Data from Tables 4 and
13. Curve fitted by eye
Abundance of the subpolar assemblage
versus winter surface temperature for 61
core top samples. Data from Tables 4 and
13. Curve fitted by eye
Abundance of the polar assemblage
versus winter surface temperature for
61 core top samples. Data from Tables
4 and 13. Curve fitted by eye
Imbrie & Kipp 1971
Abundance of the tropical assemblage
versus winter surface temperature for 61
core top samples. Data from Tables 4 and
13. Curve fitted by eye
Now did inverse regression using 4 varimax assemblages rather than
the 27 original taxa.
Xm = b0 + b1A + b2B + b3C + b4D
Linear
where A, B, C and D are varimax assemblages.
Xm = b0 + b1A + b2B + b3C + b4D + b5AB + b6AC + b7AD + b8BC +
b9BD + b10CD + b11A2 + b12B 2 + b13C 2 + b14D 2
Non-linear, quadratic
CALIBRATION STAGE using the fossil assemblages described as the 4
modern varimax assemblages
^
^
^
^
^
^
Xf =b0 + b1Af + b2Bf + b3Cf + b4Df
General abundance
trends for four of the
varimax assemblages
related to winter
surface temperatures.
Winter surface
temperatures
"measured" by
Defant (1961) versus
those estimated from
the fauna in 61 core
top samples by
means of the transfer
function.
Imbrie & Kipp 1971
Average surface
salinities ”measured”
by Defant (1961)
versus those estimated
from the fauna in 61
core top samples.
Summer surface
temperatures
”measured” by Defant
(1961) versus those
estimated from the
fauna in 61 core top
samples.
Imbrie & Kipp 1971
Salinity
Palaeoclimatic estimates for 110 samples of Caribbean core V12-133,
based on palaeoecological equations derived from 61 core tops. Tw =
winter surface temperature; Ts = summer surface temperature; ‰
= average surface salinity.
Imbrie & Kipp 1971
Principal components regression
(= Imbrie & Kipp (1971) approach)
PC1
Y
PC2
X
Multiple linear regression or
quadratic regression of X on
PC1, PC2, PC3, etc.
PC3
PCA components maximise
variance within Y
Selection of components done visually until very recently. Now
cross-validation is used to select model with fewest components,
lowest root mean square error of prediction (RMSEP), and lowest
maximum bias. ‘Minimal adequate model’ in statistical modelling.
Approach unsatisfactory because:
1. Why 4 assemblages? Why not 3, 5, 6? No crossvalidation
2. Assemblages inevitably unstable, because of many
transformation, standardization, and scaling options in
PCA
3. Assumes linear relationships between taxa and their
environment although non-linear models are possible
(quadratic terms) but curiously rarely used
4. Expressing fossil assemblages in terms of modern
varimax assemblages. Problems of fossil assemblages
not resembling modern assemblages (no-analogue
problem)
5. No sound theoretical basis, ‘ad hoc’
Segmented Linear Inverse Regression
Scatter diagrams of:
(A) % birch
(Betula); and (B) %
oak (Quercus)
pollen versus
latitude.
The thirteen
regions for which
regression
equations were
obtained.
Local estimation
of parametric
functions
Bartlein & Webb 1985
Regression equations for mean July temperature from the thirteen calibration
regions in eastern North America
Region A: 54-71 N; 90-110 W
Pollen sum: Alnus + Betula + Cyperaceae + Forb sum + Gramineae + Picea + Pinus
July T (oC) = 12.39 + 0.50*Pinus.5 + 0.26*Forb sum + 0.15*Picea.5
(1.61) (.14)
(.05)
(.10)
- 0.89*Cyperaceae.5 – 0.37*Gramineae – 0.03*Alnus
(.13)
(.08)
(.01)
R2 = 0.80; adj. R2 = 0.78; Se = 0.96oC
n = 114; F = 69.86; Pr = 0.0000
Region B: 53-71 N; 50-80 W
Pollen sum: Abies + Alnus + Betula + Herb sum + Picea + Pinus
July T (oC) = 8.17 + 0.54*Picea.5 + 0.17*Betula.5 - 0.04*Herb sum – 0.01*Alnus
(2.27) (.19)
(.14)
(.01)
(.01)
R2 = 0.70; adj. R2 = 0.70; Se = 1.52oC
n = 165; F = 95.48; Pr = 0.0000
"We selected the
appropriate equation
for each sample by
identifying the
calibration region that;
(1) contains modern
pollen data that are
analogous to the fossil
sample; and (2) has an
equation that does not
produce an
unwarranted
extrapolation when
applied to the fossil
sample.”
‘Ad hoc’
Regression equations used to reconstruct mean July
temperature at 6000 yr BP.
Bartlein & Webb 1985
Isotherms for estimated mean July temperatures (ºC) at 6000 yr BP.
Bartlein & Webb 1985
Elk Lake,
Minnesota
Bartlein &
Whitlock 1993
Reconstructions produced by the regression approach using
regression equations from five different calibration areas
Approaches to Multivariate Calibration
Responses
Chemometrics –
predicting chemical
concentrations from
near Infra-red spectra
Linear-based
methods
Predictors
Partial Least Squares Regression – PLS
Inverse approach
Form of PC regression developed in chemometrics
PCR
PLS
- components are selected to capture maximum variance within
the predictor variables (species in our case) irrespective of
their predictive value for the environmental response variable
- components are selected to maximise the covariance between
linear combinations of predictor variables with the
environmental response variables
PLS usually requires fewer components and gives a lower prediction
error than PCR.
Both are ‘biased’ inverse regression methods that guard against multicollinearity among predictors by selecting a limited number of
uncorrelated orthogonal components (reduced rank or dimensionality).
(Biased because some data are discarded; reduced dimensionality).
CONTINUUM REGRESSION
= 0 = normal inverse least squares regression
= 0.5 = PLS
= 1.0 = PCR
PLS is thus a compromise and performs well by combining
desirable properties of inverse regression (high correlation) and
PCR (stable predictors of high variance) into one technique.
PLS will always give a better fit (r 2) than PCR with same number
of components.
Stone & Brooks 1990
Partial least squares regression (PLS)
PLS1
Y
PLS2
PLS3
X
Components selected to
maximise covariance
between linear combinations
of species and environmental
variables X
Selection of number of PLS components to include based on
cross-validation. Model selected should have fewest components
possible and low RMSEP and low maximum bias.
Y = species
X = environmental variables
UNIMODAL-BASED TRANSFER FUNCTIONS
In late 1980s major environmental issue was so-called ‘acid-rain’
research. Involved establishing causes of recent surface-water
acidification in lakes and rivers in NW Europe and parts of N
America.
Very politically charged with large power-generating companies
employing scientists who were well paid to find faults in whatever
one did.
Led to major developments in palaeolimnology (and
later to palaeoclimatology) thanks to the work and
doctoral thesis (1987) of Cajo ter Braak, a Dutch
statistician, on numerical and statistical methods
based on the biologically realistic unimodal
response model.
Cajo ter Braak, April 1992
Basic Requirements in Quantitative
Palaeoenvironmental Reconstructions
1. Need biological system with abundant fossils that is responsive and sensitive
to environmental variables of interest.
2. Need a large, high-quality training set of modern samples. Should be
representative of the likely range of variables, be of consistent taxonomy
and nomenclature, be of highest possible taxonomic detail, be of
comparable quality (methodology, count size, etc.), and be from the same
sedimentary environment.
3. Need fossil set of comparable taxonomy, nomenclature, quality, and
sedimentary environment.
4. Need good independent chronological control for fossil set.
5. Need robust statistical methods for regression and calibration that can
adequately model taxa and their environment with the lowest possible error
of prediction and the lowest bias possible.
6. Need statistical estimation of standard errors of prediction for each
constructed value.
7. Need statistical and ecological evaluation and validation of the
reconstructions.
Birks et al. 1990
Linear or Unimodal Methods?
Estimate the compositional turnover or gradient length for the
environmental variable(s) of interest.
Detrended canonical correspondence analysis with x as the only
external or environmental predictor. Detrend by segments, non linear
rescaling.
Estimate of gradient length in relation to x in standard deviation (SD)
units of compositional turnover. Length may be different for different
climatic variables and the same biological data.
July mean T
2.62 SD
Jan mean T
2.76 SD
annual mean T
1.52 SD
If gradient length < 2 SD, taxa are generally behaving monotonically
along gradient and linear-based methods are appropriate (e.g. PLS).
If gradient length > 2 SD, several taxa have their optima located within
the gradient and unimodal-based methods are appropriate.
Birks 1995
Unimodal Classical Methods
Maximum Likelihood Prediction of Gradient Values
• Gaussian response model - regression
+ We know observed abundances Ym
+ We know gradient values Xm
Estimate or model the species response curves for all species (Ûm)
• Bioindication - calibration
+ We know observed abundances Yf
+ We know the modelled species response curves for all species (Ûm)
= Estimate the gradient value of Xf
• The most likely value of the gradient is the one that
maximises the likelihood function given observed and
expected abundances of species
Gaussian logit regression can be parameterised as a
generalised linear model (binomial or quasi-binomial error
structure, logarithmic link function). Solved by maximum
likelihood estimation. Can estimate optimum (u),
tolerance (t), and maximum (c) for each species
b1
u
2b 2
2
b1
c exp b 0
4b 2
t
1
2b 2
exp b 0 b1u b 2u 2
Guassian logit regression
Imbrie and Kipp 1971
61 core tops
Summer SST
27 taxa
Winter SST
Salinity
19
21
21
Significant increasing
linear logit model
6
3
4
Significant decreasing
linear logit model
1
1
0
No significant relationship
1
2
2
Significant Gaussian logit model
Maximum likelihood approach to reconstruction
• Likelihood is the probability of a given observed value
with a certain expected value
• Maximum likelihood estimation: expected or
reconstructed values that give the highest likelihood
for the observed fossil assemblages
- ML estimates are close to observed values, and the
proximity is measured with the likelihood function
- commonly use the negative logarithm for the
likelihood, since combined probabilities may be very
small
J. Oksanen 2002
Inferring past temperature from multivariate species
composition
Observed
Modern
responses
Inferred
9
10
11
12
13
14
Temperature (ºC)
Modified from J. Oksanen 2002
Guassian logit regression (GLR) and maximum likelihood
(ML) calibration
b0, b 1, b2
Ym + Xm
modern
data
b0, b 1, b 2
Yf
b0, b 1, b2
fossil
data
species GLR regression
coefficients for all species
^
ML
Xf
environmental
reconstruction
Root mean squared error for winter SST, summer SST, and
salinity using different procedures
Winter Summer
SST ºC SST ºC
Salinity
‰
Imbrie & Kipp 1971 linear
2.57
2.55
0.573
Imbrie & Kipp 1971 non-linear
1.54
2.15
0.571
Maximum likelihood regression
1.20
and ML calibration
1.63
0.54
Problems with Gaussian logit regression and maximum
likelihood calibration
1. Computationally difficult – a bug existed in our
software for nearly 10 years!
2. Computationally demanding and time consuming (and
complex) to derive sample-specific prediction errors.
Less of a problem as computing power increases and
reliable procedures for computer-intensive procedures
are becoming more available in, for example, R
ter Braak (1987) showed that simple two-way weighted
averaging (WA) closely approximates GLR with artificial
data. ter Braak and van Dam (1989) and Birks et al. (1990)
showed that WA performed nearly as well (or even slightly
better) than GLR with real data.
Two-Way Weighted Averaging
The basic idea is very simple.
In a lake with a certain temperature range, chironomids with their
temperature optima close to the lake’s temperature will tend to be the
most abundant species present.
A simple estimate of the species’ temperature optimum is thus an
average of all the temperature
values for lakes in which that
species occurs, weighted by the
species’ relative abundance.
(WA regression)
Conversely, an estimate of a
lake's temperature is the
weighted average of the
temperature optima of all the
species present.
(WA calibration)
Estimating species optima and tolerances from
modern data
n
Weighted
averaging
regression
Uˆk
Y ik X i
i 1
n
Optimum
Y ik
i 1
X ˆ 2
i 1Y ik i U k
tˆ
n
Y ik
i 1
n
1
2
Tolerance
k
Ûk is the WA optimum of taxon k
^
tk is WA standard deviation or tolerance of k
Yik is percentage of taxon k in sample i
Xi is environmental variable of interest in sample i
And there are i=1,....,n samples
and k=1, ....,m taxa
where
Reconstructing an environmental variable from a
fossil assemblage
Weighted averaging calibration
A lake will tend to be dominated by taxa with temperature optima close
to the lake's temperature
Estimate of this temperature is given by averaging the optima of all taxa
present in the lake.
If a species' abundance data are available these can be used as
weights:
WA Calibration
m
Xˆi
Y
ik
Uˆk
k 1
m
Y
ik
k 1
^
where Xi = estimate of environmental variable for fossil sample i
Yik= abundance of species k in fossil sample i
Ûk= optima of species k
Weighted averaging calibration or reconstruction
m
Y ik Uˆk
Xˆi
k 1
m
Y ik
WA
Simple WA
WAtol
WA with tolerance
downweighting
k 1
m
Xˆ
i
2
Y Uˆ tˆ
k 1
m
ik
k
k
2
Y tˆ
k 1
ik
k
Two-way weighted averaging
U1
Ym + Xm
modern
data
U2
Yf
Ut
fossil
data
species optima
‘transfer function’
^
Xf
environmental
reconstruction
Two-way weighted averaging. ter Braak & van Dam (1989) and
Birks et al. (1990)
(i) Estimate species optima (u ) by weighted averaging of the
environmental variable (x) of the sites. Species abundant at a site will
tend to have their ecological optima close to the environmental
variable at that site. (WA regression).
^
(ii)
Estimate the environmental values (x^ ) at the sites by weighted
^
averaging of the species optima (u ). (WA calibration.)
(iii) Because averages are taken twice, the range of estimated x-values is
shrunken, and a simple 'inverse' or 'classical' deshrinking is required.
^
Usually regress x on the preliminary estimates (x ) and take the fitted
values as final estimates of x. (Deshrinking regression)
Can downweight species in step (ii) by their estimated WA tolerances
(niche breadths) so that species with wide tolerances have less
weight than species with narrow tolerances (WAtol)
[Two-way WA with inverse regression = canonical correspondence
analysis regression]
Estimation of Sample-Specific Errors
- basic idea of computer re-sampling procedures
TRAINING SET - 178 modern samples and lake-water temperature (T)
Jack-knifing
Do reconstruction 177 times. Leave out sample 1 and reconstruct T; add sample 1
but leave out sample 2 and reconstruct T. Repeat for all 177 reconstructions using
a training set of size 177 leaving out one sample every time. Can derive jackknifing estimate of T and its variance and hence its standard error.
Bootstrap
Draw at random a training set of 178 samples using sampling with replacement so
that same sample can, in theory, be selected more than once. Any samples not
selected form an independent test set. Reconstruct T for both modern testset samples and for fossil samples. Repeat for 1000 bootstrap cycles.
Mean square error of prediction =
1. error due to variability in estimating species parameters in training set (i.e. s.e.
of bootstrap estimates)
+
2. error due to variation in species abundances at a given T (i.e. actual
prediction error differences between observed T and the mean bootstrap
estimate of T for modern samples when in the independent test). This
component commonly ignored in simple bootstrapping.
Birks et al. 1990
Use of data and the
bootstrap distribution to
infer a sampling
distribution. The bootstrap
procedure estimates the
sampling distribution of a
statistic in two steps. The
unknown distribution of
population values is
estimated from the sample
data, then the estimated
population is repeatedly
sampled to estimate the
sampling distribution of
the statistic.
Error estimation by bootstrapping WACALIB 3.1+ & C2
61 sample training set, draw 61 samples at random with replacement
to give a bootstrap training set of size 61. Any samples not selected form a
test set.
Mean square error of prediction =
error due to variability in estimates of
optima and/or tolerances in training set
boot
(xˆi, boot xi, boot )2
error due to variation in abundances
at a given temperature
+
+
n
(xi, boot xi, boot )2
n
boot
(s.e. of bootstrap estimates)
(actual prediction error differences between
observed xi and mean bootstrap estimate
S2
S1
(xi,boot is mean of xi,boot for all cycles when sample i is in test set).
For a fossil sample
MSEP
boot
RMSE = (S1 + S2)½
xˆ
i,boot
x i,boot
n
S
2
2
2
Root mean square errors of prediction estimated by
bootstrapping
WA
W Atol
Summer sea-surface temperature C Training set
RMSE total
2.31
2.37
RMSE S1
0.63
0.70
RMSE S2
2.22
2.27
Fossil samples
2.225-2.251
2.283-2.296
Winter sea-surface temperature C Training set
RMSE total
2.23
RMSE S1
0.62
RMSE S2
2.14
2.19
0.7
2.07
Fossil samples
2.156-2.201
2.106-2.249
Salinity ‰ Training set
RMSE total
RMSE S1
RMSE S2
0.61
0.11
0.60
0.60
0.13
0.59
Fossil samples
0.603-0.607
0.599-0.606
Transfer Function Assessment
ROOT MEAN SQUARED ERROR (RMSE) of
x
i
xˆi
CORRELATION BETWEEN xi and x^i
r
COEFFICIENT OF DETERMINATION
r2
xˆ
i
xi
2
n
r or r2 measures strength between observed and inferred values and
allows comparison between transfer functions for different variables.
RMSE2 = error2 + bias2
Error
= SE (xi - x^i)
Bias
= Mean (xi - x^i) SYSTEMATIC PREDICTION ERROR
(Mean of prediction errors)
RANDOM PREDICTION ERROR ABOUT BIAS
Also Maximum bias – divide sampling interval of xi into equal intervals
(usually 10), calculate mean bias for each interval, and the largest absolute
value of mean bias for an interval is used as a measure of maximum bias.
Note in RMSE the divisor is n, not (n - 1) as in standard deviation.
This is because we are using the known gradient values only.
Bias and Error
Reliable : Root mean squared error of prediction (RMSEP)
Unreliable: Correlation as it depends on the range of observations
•
Root mean squared error
RMSE
N
i 1
(xˆi x i ) 2 / n
•
Bias b: systematic difference
•
Error e: random error about
bias.
•
RMSE2 = b 2 + e2
Must be cross-validated or
will be badly biased
J. Oksanen 2002
Cross-validation
Leave-one-out ('jack-knife'), each in turn, or divide data into training
and test data sets. Leave-one-out changes the data too little, and hence
exaggerates the goodness of prediction. K-fold cross-validation leaves
out a certain proportion (e.g. 1/10) and evaluates the model for each of
the data sets left out.
Badly biased unless one does cross-validation
J. Oksanen 2002
Cross-validation statistics
RMSEP
r cv
r 2 cv
PREDICTED VALUES
mean bias
maximum bias
cf. RMSE
r
r2
APPARENT VALUES or ESTIMATED VALUES
mean bias
maximum bias
TRAINING SET ASSESSMENT AND SELECTION
Lowest RMSEP, highest r or r 2 cv, lowest mean bias, lowest maximum bias.
Often a compromise between RMSEP and bias.
PARTITIONING RMSEP
RMSEP2 = ERROR2 + BIAS2
S12 + S22
Error due to estimating optima and tolerances
Error due to variations in
abundance of taxa at given
environmental value
Statistical and Ecological Evaluation of
Reconstructions
INITIAL ASSUMPTIONS
1. Taxa related to physical environment.
2. Modern and fossil taxa have same ecological
responses.
3. Mathematical methods adequately model the
biological responses.
4. Reconstructions have low errors.
5. Training set is representative of the range of
variation in the fossil set.
Reconstruction evaluation or diagnostics
1. RMSEP for individual fossil samples
Monte Carlo simulation using leave-one-out initially to estimate standard errors of
taxon coefficient and then to derive specific sample standard errors, or
bootstrapping.
2. Goodness-of-fit statistics
Canonical correspondence analysis (= unimodal regression) of calibration set, fit
fossil sample passively on axis (environmental variable of interest), examine
squared residual distance to axis, see if any fossil samples poorly fitted.
3. Analogue statistics
Good and close analogues. Extreme 5% and 2.5% of modern DCs.
4. Percentages of total fossil assemblage that consist of taxa not represented in
all calibration data set and percentages of total assemblage that consist of taxa
poorly represented in training set (e.g. < 10% occurrences) and have coefficients
poorly estimated in training set (high variance) of beta values in cross-validation).
< 5% not present
reliable
< 10% not present
okay
< 25% not present
possibly okay
> 25% not present
not reliable
Canonical correspondence analysis
Goodness-of-fit statistics
Statistically evaluated reconstructions
Transfer-function performance statistics
Imbrie & Kipp 1971 data
RMSEP
Summer SST
Winter SST
Salinity
PC regression
2.55ºC
2.57ºC
0.57 ‰
PC regression with
quadratic terms
GLR (ML)
2.15ºC
1.54ºC
0.57 ‰
1.63ºC
1.20ºC
0.54 ‰
WA
2.02ºC
1.97ºC
0.57 ‰
ECOLOGICAL VALIDATION
Diatoms and pH
Reconstructions of the pH history of Lysevatten based on historical data and inference from the
subfossil diatoms in the sediment. Historical data are pH measurements (thin solid line) and
indirect data from fish reports and data from other similar lakes (thin broken line). The insert,
showing pH variations from April 1961 to March 1962, is based on real measurements. Diatominferred values (thick solid line) were obtained by weighted averaging. Renberg & Hultberg 1992
Weighted Averaging – An Assessment
1. Ecologically plausible – based on unimodal species response model.
2. Mathematically simple but has a rigorous mathematical theory.
Properties fairly well known now.
3. Empirically powerful:
a. does not assume linear responses
b. not hindered by too many species, in fact helped by many species!
Full dimensionality
c. relatively insensitive to outliers
4. Tests with simulated and real data – at its best with noisy, species-rich
compositional percentage data with many zero values over long
environmental gradients (> 3 standard deviations).
5. Because of its computational simplicity, can derive error estimates for
predicted inferred values.
6. Does well in ‘non-analogue’ situations as it is not based on the
assemblage as a whole but on INDIVIDUAL species optima and/or
tolerances. Global parametric estimation.
7. Ignores absences.
8. Weaknesses.
Species packing model: Gaussian logit curves of the probability (p) that a
species occurs at a site, against environmental variable x. The curves
shown have equispaced optima (spacing = 1), equal tolerances (t = 1)
and equal maximum probabilities occurrence (pmax = 0.5). xo is the value
of x at a particular site.
ter Braak 1987
Diatoms
and pH
Birks 1994
Weaknesses of Weighted Averaging
1. Sensitive to distribution of environmental variable in
training set.
2. Considers each environmental variable separately.
3. Disregards residual correlations in species data.
Can extend WA to WA-partial least squares to include
residual correlations in species data in an attempt to
improve our estimates of species optima.
Weighted averages
•
1.
2.
WA estimate of species optimum (ũ) is good if:
Sites are uniformly distributed over species range
Sites are close to each other
•
1.
2.
3.
4.
WA estimates of gradient values (x) are good if:
Species optima are dispersed uniformly around x
All species have equal tolerances
All species have equal modal abundances
Optima are close together
u~i
y x
y
ij
i
j
y
u
x
=
=
=
x~ j
j
ij
abundance
optimum
gradient value
y u
y
ij
i
j
i
j
~
=
=
=
i
ij
species
site
WA estimate
These conditions are only true for infinite species packing conditions!
Biases and truncation in weighted averages
Weighted averages are good estimates of Gaussian optima, unless
the response is truncated.
Bias towards the gradient centre: shrinkage, hence the need for
deshrinking regression
WA
WA GLR
GLR
pH
J. Oksanen 2002
Inverse Approaches to Multivariate Calibration
Chemometrics – predicting chemical concentrations from near infra-red spectra
Responses
LINEAR
Predictors
PCR
PLS
CAR
UNIMODAL
WA-PLS
Correspondence Analysis Regression
Roux 1979
Reduced Imbrie & Kipp (1971) modern foraminifera data to 3 CA axes.
Then used these in inverse regression. Reduced dimensionality
RMSEP
Summer temp
Winter temp
PC regression
2.55°C
2.57°C
PC regression with
quadratic terms
2.15°C
1.54°C
CA regression
1.72°C
1.37°C
GLR (ML)
1.63°C
1.20°C
WA
2.02°C
1.97°C
WA-PLS
1.53°C
1.17°C
Shows importance of using a unimodal-based method
Weighted Averaging Partial Least Squares
(WA-PLS)
Extend simple WA to WA-PLS to include residual correlations in species
data in an attempt to improve our estimates of species optima.
Partial least squares (PLS)
Form of PCA regression of x on y
PLS components selected to show maximum covariance with x, whereas
in PCA regression components of y are calculated irrespective of their
predictive value for x.
Weighted averaging PLS
WA = WA-PLS if only first WA-PLS component is used
WA-PLS uses further components, namely as many as are useful in terms
of predictive power. Uses residual structure in species data to improve
our estimates of species parameters (optima) in final WA predictor.
Optima of species that are abundant in sites with large residuals are likely
to be updated most in WA-PLS.
Weighted averaging partial least squares regression (WA-PLS).
ter Braak & Juggins (1993) and ter Braak et al. (1993)
PLS1
Y
PLS2
PLS3
X
Components selected to
maximise covariance between
species weighted averages and
environmental variable x
Selection of number of PLS components to include based on crossvalidation. Model selected should have fewest components possible
and low RMSEP and maximum bias – minimal adequate model.
Reduced dimensionality. Global parametric estimation.
Weighted averaging partial least squares – WA-PLS
1. Centre the environmental variable by subtracting weighted mean.
2. Take the centred environmental variable (xi) as initial site scores –
(cf. WA)
3. Calculate new species scores by WA of site scores.
4. Calculate new site scores by WA of species scores.
5. For component 1, go to 6. For components 2 and more, make site
scores uncorrelated with previous axes.
6. Standardise new site scores and (cf. WA) use as new component.
7. Regress environmental variable on the components obtained so far
using a weighted regression (inverse) and take fitted values as
current estimate of estimated environmental variable. Go to step 2
and use the residuals of the regression as new site scores
(hence name ‘partial’) (cf WA).
Optima of species that are abundant in sites with large residuals
likely to be most updated.
Calculating inferred environment variable
Using the WA-PLS model, inferred environment is calculated as a
weighted sum:
xˆ
i
m
y ik b
ˆk
k 1
m
y
k 1
ik
where x^i is the inferred environment for sample i,
yik is the abundance of fossils of taxon k in sample i, and
b^k is the Beta of species k
Leave-one-out and test set cross-validation
Performance of WA-PLS in relation to the number of components (s):
apparent error (RMSE) and prediction error (RMSEP). The estimated
optimum number of components is 3 because three components give
the lowest RMSEP in the training set. The last column is not generally
available for real data as it is based on independent test data. These
suggest a 2-component model. Problem of ‘overfitting’ the model.
s
1
2
3
4
5
6
Apparent
RMSE
6.14
3.37
2.87
2.22
2.01
1.82
Training set
Leave-one-out
RMSEP
6.22
4.24
4.16*
4.65
4.65
4.50
Test set
RMSEP
6.61
4.40*
4.57
4.94
5.11
5.62
ter Braak & Juggins 1993
The performance of WA-PLS applied to three diatom data sets
for different number of components (s) in terms of apparent RMSE
and leave-one-out (RMSEP) (* = selected model).
Dataset
s
1
2
3
4
5
6
SWAP
RMSE RMSEP
Bergen
RMSE RMSEP
Thames
RMSE RMSEP
0.276
0.232
0.194
0.173
0.153
0.134
0.353
0.256
0.213
0.192
0.174
0.164
0.341
0.238
0.196
0.166
0.153
0.140
0.310*
0.302
0.315
0.327
0.344
0.369
Reduction in prediction
error (%)
0
0.394
0.318*
0.330
0.335
0.359
0.374
19
0.354
0.279
0.239*
0.224
0.219
0.219
32
ter Braak & Juggins 1993
Norwegian chironomid–climate training set
Leave-one-out cross validation
Predicted air temperature.
RMSEP = 0.89ºC
Bias = 0.61ºC
Predicted – observed air
temperature
1:1
Brooks & Birks 2000
Inferred mean
July air
temperature
Oxygen isotope
ratios
Brooks & Birks
2000
Norwegian pollen and
climate
Precipitation
300 - 3537mm
Mean July
7.7 - 16.4ºC
Mean January
-17.8 - 1.1ºC
Birks et al. (unpublished)
Generalised additive
models
Root mean squared errors of prediction (RMSEP) based on
leave-one-out jack-knifing cross-validation for annual
precipitation, mean July temperature, and mean January temperature
using five different statistical models.
Pptn
(mm)
July
(C)
January
(C)
Weighted averaging (WA) (inverse)
427.2
1.07
2.61
WA-PLS
417.5
1.03
2.57
Modern analogue technique (MAT)
385.3
0.91
2.42
WA-PLS
Vuoskojaurasj, Abisko, Sweden
Vuoskojaurasj consensus reconstructions
Reconstruction validation
Tibetanus, Abisko Valley
Oxygen
isotopes
Inferred from
pollen
Inferred from
pollen
Hammarlund et al. 2002
Björnfjelltjörn, N. Norway
Björnfjelltjörn consensus reconstructions
Reconstruction validation
Broad-Scale Studies
304 modern pollen
samples Norway,
northern Sweden,
Finland (Sylvia Peglar,
Heikki Seppä, John
Birks, Arvid Odland)
Seppä & Birks (2001)
Performance statistics - WA-PLS - leave-one-out cross-validation
July temperature (7.7 - 17.1ºC)
RMSEP
R2
Max. bias
1.0ºC
0.73
3.64ºC
0.71
960 mm
Annual precipitation (300 - 3234 mm) 341 mm
Predicted versus observed
Residuals
Seppä & Birks (2001)
Broad-scale patterns in western Norway using pollen
data
Changes in July summer temperature relative to present-day reconstructed
temperature on a S-N transect west of the Scandes mountains. 16 sites
covering all or much of the Holocene.
South
North
Nesje et al. (2005)
Some Warnings!
1. Different methods, although they have similar modern model
performances, can give very different reconstruction results.
Birks (2003)
2. Use of different proxies - different proxies may give different
reconstruction, e.g. mean July temperature at Björnfjell, northern
Norway.
Validate using another proxy - macrofossils of tree birch
Importance of independent validation but which proxies to use?
3. Covarying environmental variables e.g. temperature and lake
trophic status (e.g. total N or P) or temperature and lake depth
Brodersen &
Anderson (2002)
pH and climate
Environmental
variables often
co-vary
Is the inferred
temperature
change real or
a result of pH
changes that
weakly co-vary
with
temperature?
Anderson
2000
MODERN ANALOGUE-BASED APPROACHES
Do an analogue-matching between fossil sample i and all available
modern samples with associated environmental data. Find modern
sample(s) most similar to i, infer the past environment for sample i to be
the modern environment for those modern samples. Repeat for all fossil
samples.
PROBLEMS
1. Assessment of ‘most similar’?
2. 1, 2, 9, 10 most similar?
3. No-analogues for past assemblages.
4. Choice of similarity measure.
5. Require huge set of modern samples of comparable site type, pollen
morphological quality, etc, as fossil samples. Must cover vast
geographical area.
6. Human impact, especially in Eurasia and North America.
7. Multiple analogues (the Pinus problem).
Elk Lake, Minnesota
Bartlein &
Whitlock
1993
Reconstructions produced using the analogue approach
Modified modern analogue approaches
Joel Guiot
1. Taxon weighting
Palaeobioclimatic operators (PBO) computed from either a timeseries of fossil sequence or from a PCA of fossil pollen data from
large spatial array of sites.
Weights are selected to 'emphasis the climate signal within the fossil
data‘ and to 'highlight those taxa that show the most coherent
behaviour in the vegetational dynamics', 'to minimise the human
action which has significantly disturbed the pollen spectra', and 'to
reduce noise'.
2. Environmental estimates are weighted means of estimates based on
20, 40 or 50 or so most similar assemblages.
3. Standard deviations of these estimates used as an approximate
standard error.
Reconstruction of
variations in annual total
precipitation and mean
temperature expressed as
deviations from the
modern values (1080 mm
and 9.5oC for La Grande
Pile. 800 mm and 11oC for
Les Echets). The error
bars are computed by
simulation. The vertical
axis is obtained by linear
interpretation from the
dates indicated in Fig.2
Guiot et al. 1989
Cor is the correlation between estimated
and actual data. +ME is the mean upper
standard deviation associated to the
estimates, -ME is the lower standard
deviation. These statistics are calculated
on the fossil data and on the modern data.
In this case, R must be replaced by C.
Modern Analogue Techniques for environmental
reconstruction = k-nearest neighbours (k-NN)
MAT, ANALOG, C2
1. Modern data and environmental variable(s) of interest.
Do analog matches and environmental prediction for all samples but
with cross-validation jack-knifing.
Find number of analogues to give lowest RMSEP for environmental
variable based on mean or weighted mean of estimates of
environmental variable. Can calculate bias statistics as well.
2. Reconstruct using fossil data using the ‘optimal’ number of
analogues (lowest RMSEP, lowest bias).
Use chord distance or chi-squared distance as dissimilarity measure.
Optimises signal to noise ratio with ‘closed’ percentage data.
POLLEN-CLIMATE RESPONSE
SURFACES
Pollen percentages in modern samples plotted in ‘climate space’ (cf
Iversen’s thermal limit species +/– plotted in climate space).
Contoured
Trend-surface analysis R2
Bartlein et al. 1986
Contoured only by locally weighted regression Webb et al. 1987
Reconstruction purposes – grid, analogue matching
Simulation purposes
PROBLEMS
1. Need large high-quality modern data for large geographical areas.
2. No robust error estimation for reconstruction purposes.
3. Reconstruction procedure ‘ad hoc’ – grid size, etc.
Response surfaces for
individual pollen types. Each
point is labelled by the
abundance of the type. Many
points are hidden – only the
observation with the highest
abundance was plotted at each
position. For (a) to (e) ’+’
denotes 0%, ’0’ denotes 010%, 1 denotes 10-20%, ’2’
denotes 20-30% etc. For (f) to
(h), ’+’ denotes 0%, ’0’
denotes 0-1%, 1 denotes 12%, ’2’ denotes 2-3%, etc. ’H’
denotes greater than 10%.
Bartlein et al. 1986
Bartlein et
al. 1986
Percentage of spruce
(Picea) pollen at
individual sites plotted
in climate space along
axes for mean July
temperature and
annual precipitation.
(B) Grid laid over the
climate data to which
the pollen percentage
are fitted by local-area
regression. The box
with the plus sign is the
window used for localarea regression. (C)
Spruce pollen
percentages fitted onto
the grid. (D) Contours
representing the
response surface and
pollen percentages
shown in part C.
Webb et al. 1987
Elk Lake,
Minnesota
Reconstruction
produced by
the response
surface
approach
Bartlein & Whitlock 1993
Simulation purposes
Fossil and simulated isopoll map sequences for Betula. Isopolls are drawn
at 5, 10, 25, 50 and 75% using an automatic contouring program.
Huntley 1992
Fossil and simulated isopoll map sequences from Quercus (deciduous). Maps are drawn at 3000year intervals between 12000 yr BP and the present. The upper map sequence presents the
observed fossil and contemporary pollen values. The lower map sequence presents the pollen
values simulated, by means of the pollen-climate response surface from the climate conditions
obtained by applying to the measured contemporary climate the palaeoclimate anomalies that
Kutzbach & Guetter (1986) simulated using the NCAR CCM, for 12000 to 3000 yr BP. The map for
the present is simulated from the measured contemporary climate. Isopolls are drawn at 2, 5, 10,
25, and 50% using an automatic contouring program.
Huntley 1992
Response surfaces - Critique
1. Choice of how much or how little smoothing.
2. Choice of scale of grid for reconstructions.
3. No statistical measure of ‘goodness-of-fit’.
4. No reliable error estimation for predicted values.
5. Reconstruction procedure is close to modern
analogue technique but with smoothed modern
data.
CONSENSUS RECONSTRUCTIONS
AND SMOOTHERS
Plotting of Reconstructed Values
1. Plot against depth or age the reconstructed values, indicate the
observed modern value if known.
2. Plot deviations from the observed modern value or the inferred
modern value against depth or age.
3. Plot centred values (subtract the mean of the reconstructed
values) against depth or age to give relative deviations.
4. Plot standardised values (subtract the mean of the reconstructed
values and divide by the standard deviation of the reconstructed
values) against depth or age to give standardised deviations.
5. Add LOESS smoother to help highlight major trends, and to
identify signal from noise in reconstructions.
LOESS
smoother
LOESS = locally
weighted scatter-plot
smoother
Pollen-climate reconstructions
Seppä & Birks 2002
Chironomids
Trends or RMSEP?
Brooks & Birks 2001
Consensus Reconstructions
Elk Lake climate reconstruction
summary. The three series
plotted with red, green and
blue lines show the
reconstructions produced by
the individual approaches, the
series plotted with the thin
black line show the envelope of
the prediction intervals, and
the series plotted with a thick
purple line represents the
stacked and smoothed
reconstruction of each variable
(constructed by simple
averaging of the individual
reconstructions for each level,
followed by smoothing
[Velleman, 1980]). The modern
observed values (1978-1984)
for Itasca Park are also shown.
Bartlein & Whitlock 1993
PROBLEMS OF SPATIAL
AUTOCORRELATION IN TRANSFER
FUNCTIONS
Telford & Birks (2005) Quaternary Science Reviews 24:
2173-2179
Estimating the predictive power and performance of a
training set as RMSEP, maximum bias, r2, etc., by crossvalidation ASSUMES that the test set (one or many
samples) is STATISTICALLY INDEPENDENT of the
training set.
Cross-validation in the presence of spatial autocorrelation seriously violates this assumption as the
samples are not spatially and statistically independent.
Positive spatial autocorrelation is the tendency of sites close
to each other to resemble one another more than randomly
selected sites. Value of an autocorrelated variable at one
site can be predicted from its values at neighbouring sites.
Property of almost all climate data and much ecological and
biological data.
Can cause spurious associations (inflated Type I errors)
between species assemblages and environmental variables,
particularly when the latter are spatially smooth like seasurface temperature or air temperature.
“The elimination of spurious correlation due to position in
time or space”
Student 1914
“Spatial autocorrelation: trouble or new paradigm?”
Pierre Legendre 1993
Atlantic foraminiferal data-set
- 947 samples
Gaussian logit regression
WA
WA-PLS
Artificial neural networks (ANN)
MAT
Ten-fold cross-validation repeated 10 times with
80% training set,
10% optimisation set (to find number of WA-PLS
components, number of analogues in MAT, to
optimise ANN),
and 10% independent set.
Cross-validation within the N Atlantic set (80% training,
10% optimisation, 10% independent test) and both N
Atlantic (80% training, 20% optimisation) and S Atlantic
(independent test set)
Cross-validation
RMSEP ºC
N Atlantic
Independent test
RMSEP ºC
S Atlantic
GLR
1.53
1.61 *
WA
1.93
1.73 *
WA-PLS (5)
1.37 *
1.95
ANN
1.01 *
2.01
MAT
0.89 *
1.87
Cross-validation of N Atlantic data suggests ANN,
MAT, and WA-PLS have the best performance.
Using the S Atlantic data as a test set where there
can be no spatial autocorrelation, ANN, WA-PLS, and
MAT are the worst!
Similar results if N Atlantic split at 40ºW into western
(test set) and eastern (training set) halves, namely
ANN, MAT, and WA-PLS are the worst.
Why?
Why should MAT, WA-PLS, and ANN perform worse when applied to a
spatially independent test set, but appear to perform best when
applied to spatially dependent or spatially autocorrelated test set?
MAT – selects the k-nearest neighbours in the training set using an
appropriate dissimilarity coefficient, and calculates a (weighted) mean
of the environmental parameter (e.g. SST) of interest.
Dissimilarity measure between sites is a holistic measure of
compositional similarities in the assemblages. Best analogues are not
only similar in SST but similar in other environmental variables. As
most environmental variables are autocorrelated, sites that are similar
in ALL environmental variables are likely to be close to the site being
tested. As such sites will also have a similar SST, this will strengthen
the apparent relationship between assemblages and SST.
Local non-parametric estimation in MAT.
ANN – extract unknown structure from data. Cannot distinguish
between structure imposed by SST and structure generated by
spatial autocorrelation. Uses both to minimise RMSEP.
Need careful training to prevent model overfitting. Use training
and optimisation sets and then an independent test set. If training
and test data are not spatially independent, the ANN model learns
features of the training set area. Must have a spatially
independent test set to force ANN to use the general relationships
only. Local non-parametric estimation.
WA-PLS – uses additional components to account for residual
correlation in the biological data after fitting a WA model to SST.
Improves RMSEP by reducing 'edge effects' implicit in WA and by
picking up residual variation in certain areas of the gradient, and
this may be spatially autocorrelated to SST.
WA and GLR – only model the variation in the assemblages that
is correlated with SST. Robust to spatial structure in the data.
Global parametric estimation.
Models with tuneable parameters (ANN, MAT, WAPLS) perform
worse when tested with a spatially independent test set.
Models without tuneable parameters (WA, GLR) are easier to fit
and perform better when tested with a spatially independent
test set.
These arguments also apply to pollen-climate data-sets using MAT,
ANN, WA-PLS, and response-surfaces (= smoothed MAT).
How to define a spatially independent test set for pollen?
Individualistic nature of lakes may mean that diatom-pH, diatomtotal P, diatom-salinity, and possibly chironomid-temperature
transfer functions will be less prone to autocorrelation problems.
Lowest autocorrelation in MAT
and ANN residuals
Highest autocorrelation in WA
and GLR (= ML)
residuals
Highlights 'secret
assumption' of
transfer functions
How widespread is the spatial autocorrelation
problem?
Telford & Birks 2009 Quat Sci Rev doi:
10.1016/j.quascirev.2008.12.020
Several data-sets
• diatoms and lake-water pH
• benthic foraminifera and salinity
• planktonic foraminifera and summer SST
• arctic pollen and July sunshine
Simple tool to detect autocorrelation is to delete samples at
random and derive transfer function and its performance
statistics or to delete sites geographically close to the test
sample and derive transfer function and its performance.
If strong autocorrelation is present, deleting geographically
close sites will preferentially delete the closest samples.
Because of autocorrelation these will bias the apparent
‘good’ performance of the transfer function in crossvalidation. Thus their deletion should drastically decrease
the performance of the transfer function. In contrast,
random deletion should have much less effect on model
performance.
neighbour
deletion
random deletion
Note different y-axis scales
Telford & Birks 2008 submitted
How to cross-validate a transfer function in the
presence of spatial autocorrelation?
Telford & Birks 2008 submitted
h-block cross-validation. In time-series analysis, an observation is deleted
from the training-set along with h observations on either side to prevent
temporal autocorrelation.
Extend this to spatial autocorrelation by defining radius h in km. What value
to use?
Use the range of a variogram of the residuals (e) from the basic transfer
function model
Y = f(X) + e
(Variogram is a standard tool in spatial analysis – a plot of half the squared
difference between two observations against their distance in space,
averaged for a series of distance classes – Legendre & Fortin 1989).
What are the MAT and WA-PLS RMSEP for leave-one-out and h-block crossvalidation?
random deletion
neighbour
deletion
RMSEP
L-O-O 0.37
h-b
0.37
RMSEP
L-O-O 0.14
h-b
0.36
MAT
RMSEP
L-O-O 1.2ºC
h-b
3.0ºC
RMSEP
L-O-O 2.3%
h-b
7.6%
Telford & Birks 2008 submitted
RMSEP
MAT and WA-PLS
Diatoms - pH
Benthic forams - salinity
Planktonic forams - SST
Pollen – July sunshine
L-O-O
h-block
% change
L-O-O
h-block
% change
L-O-O
h-block
% change
L-O-O
h-block
% change
MAT
0.37
0.37
0%
0.14
0.36
171%
1.2ºC
3.0ºC
150%
2.3%
7.6%
230%
WA-PLS
0.36
0.36
0%
0.17
0.34
100%
1.7ºC
2.7ºC
58%
3.6%
9.0%
150%
Increase of RMSEP in h-block cross-validation for all data except diatoms
and pH
Greatest increase in RMSEP always in MAT and in training-sets with the
highest spatial autocorrelation in the X variable
USE OF ARTIFICIAL, SIMULATED
DATA-SETS
Simulated data-sets
Generate many training sets (different numbers of samples
and taxa, different gradient lengths, vary extent of noise,
absences, etc) and evaluation test sets, all under different
species response models to investigate specific problems,
e.g. no-analogue problem
No-analogue Problem
1. Probably widespread.
2. Does it matter?
3. Analogue-based techniques for reconstruction - YES!
Modern analogue technique
Response surfaces
Artificial neural networks
4. GLR, WA, and WAPLS regression methods
What we need are ‘good’ (i.e. reliable) estimates of ûk. Apply them to
same taxa but in no-analogue conditions in the past.
Assume that the realised niche parameter ûk is close to the potential
or theoretical niche parameter uk*.
GLR, WA, and WA-PLS are, in reality, additive indicator species
approaches rather than strict multivariate analogue-based methods.
5. Simulated data
ter Braak, 1995. Chemometrics & Intelligent Laboratory Systems 28,
165–180
100
Set C
No analogue test set
Z
Training set
Set A
0
X
100
L-shaped climate configuration of samples (circles) in
the training set (Table 3), with x the climate variable
to be calibrated and z another climate variable. Also
indicated are the regions of the samples in evaluation
set A and set C
6. General conclusions from simulated data experiments
WA, WA-PLS, Maximum likelihood, and MAT can all perform
poorly under no-analogue conditions and no one method
performs consistently better than other methods.
For strong extrapolation, WA performed best. Appears WAPLS deteriorates quicker than WA with increasing extrapolation.
Hutson (1977) – in no-analogue conditions WA outperformed
inverse linear regression and PCR.
Important therefore to assess analogue status of fossil samples
as well as ‘best’ training set in terms of RMSEP, bias, etc.
7. Multiple-analogue problem – try to avoid!
MULTI-PROXY APPROACHES
Swiss surface
pollen samples
– lake
sediments
Selected trees
and shrubs
Swiss surface lake sediments
Selected herbs and pteridophytes
WA-PLS Modern Swiss pollen-climate RMSEP
1.25ºC
1.03ºC
194 mm
Gerzensee, Bernese Oberland, Switzerland
Gerzensee
PB-O
YD-PB Tr
YD
AL-YD Tr
G-O
Lotter et al. 2000
Gerzensee
O16/O18
Pollen
Chydorids
Lotter et al. 2000
Birks & Ammann 2000
USE OF METHODS
Marine studies
- PCR, much MAT plus variants, some
ANN and response surface uses, very
(foraminifera, diatoms, coccoliths,
few WA or WA-PLS uses
radiolaria, dinoflagellate cysts)
Freshwater studies
- WA or WA-PLS, very few MAT or ANN
uses, no PCR uses
(diatoms, chironomids, ostracods,
cladocera)
Terrestrial studies
(pollen)
- MAT plus variants, response surfaces,
a few ANN uses, increasingly more
WA-PLS uses, no PCR uses
In comparisons using simulated and real data, WA and WA-PLS usually
outperform PCR and MAT but not always.
Classical methods of Gaussian logit regression and calibration rarely used
(freshwater, terrestrial) but this is changing. Some applications of artificial
neural networks and few studies within a Bayesian framework.
Bayesian framework may be an important future research direction but it
presents very difficult and time-consuming computational problems.
Use weighted-averaging (WA) or weighted-averaging partial least
squares regression. (WA-PLS)
Takes account of % data, ignores zero values, assumes unimodal
responses, can handle several hundred species, and gives transfer
functions of high precision (0.8ºC), low bias, and high robustness.
X = g(Y1, Y2, Y3, ... ... ..., Ym)
Modern data
^
Fossil data
Xf = g (Yf1, Yf2, Yf3, ... ... ..., Yfm)
g is our transfer function for X and Y
Simple, ecologically realistic, and robust
WA is robust to spatial autocorrelation
IMPLICATIONS FOR CURRENT
PROJECTS
1. We have the statistical tools for transfer function work
2. Need high-quality training sets and associated environmental data
i.
consistent detailed taxonomy
ii. comparable methods and quality
iii. comparable sedimentary environment
iv. good environmental data
3. Need to minimise ‘noise’ as much as possible (marine vs
freshwater systems).
4. Need care in design of sampling along primary environmental
gradients. Problems of secondary gradients.
5. Unless one can use linear-based methods (problems of
getting enough samples along short gradients), the gradient
in the training set should lie about 1–1.5 SD either side of
environmental variable’s range represented in the fossil data
(to avoid end of gradient problems).
6. Major problems of spatial autocorrelation in evaluating
models unless the test data are spatially independent of the
training set. Usually not possible. In which case RMSEP based
on cross-validation are over-optimistic and misleading.
Same spatial autocorrelation problems arise in deriving
sample-specific prediction errors by boot-strapping as it
assumes all samples are independent.
Much work still needed on the problems of spatial
autocorrelation not only in transfer functions but also in much
climate research.
POTENTIAL PALAEOECOLOGICAL
APPLICATIONS OF RECONSTRUCTIONS
Palaeoecological data (e.g. pollen, diatom, plant macrofossil,
chironomid) can be interpreted in at least two ways
1. Descriptive – descriptive and narrative approaches in which the
emphasis is on reconstruction of past biota, past populations,
past communities, past landscapes, and past environment.
Primarily a hypothesis-generation approach.
2. Causal – causes or forcing functions of observed changes or
observed stability. Analytical approach.
Primarily a hypothesis-testing approach.
Analogous to exploratory data analysis and confirmatory data
analysis in statistics.
Causal interpretations in palaeoecology not easy, as
it is only too easy to fall into circular arguments.
e.g. pollen is used to reconstruct past climate,
climate reconstruction is used to explain changes in
pollen stratigraphy.
Great need for rigorous approaches to both
reconstructions and causal interpretation.
Quantitative palaeoenvironmental reconstructions in the
context of palaeoecology are not really an end in
themselves (in contrast to palaeoclimatology) but they are
a means to an end.
Use the reconstructions based on one proxy to provide an
environmental history against which observed biological
changes in another, independent proxy can be viewed and
interpreted as biological responses to environmental
change.
Minden Bog, Michigan.
Booth & Jackson 2003
Black portions = wet
periods,
grey = dry periods
Major change 1000 years ago towards drier conditions,
decline in Fagus and rise in Pinus in charcoal
Climate vegetation fire frequency
Central New England, eastern USA
Environmental proxies
– hydrogen isotope ratios as
temperature proxy (low values
indicate colder temperatures)
- lake levels indicate moisture balance
See major pollen changes coincide
with climatic transitions
Climate control of vegetational
composition at millennial scales
Shuman et al. 2004
These approaches involving environmental
reconstructions independent of the main fossil record
can be used as a long-term ecological observatory or
laboratory to study long-term ecological dynamics under
a range of environmental conditions, not all of which
exist on Earth today (e.g. lowered CO2 concentrations,
low human impact).
Can begin to study the Ecology of the Past.
Exciting prospect, many potentialities in future research.
ACKNOWLEDGEMENTS
Key Figures in Transfer-Function Statistical Research
John Imbrie
Cajo ter Braak
Tom Webb
Svante Wold
Steve Juggins
Richard Telford
One cannot do transfer-function research without high quality data and
these need skilled palaeoecologists. Many colleagues have contributed
to the development of transfer functions by creating superb modernenvironmental data sets
Heikki Seppä
Andy Lotter
Steve Brooks
Viv Jones
Oliver Heiri
Ulrike Herzschuh
Many other people have contributed to the work on which
this talk is based or to the preparation of this presentation
Hilary Birks
Sylvia Peglar
Rick Battarbee
John Boyle
Arvid Odland
Gaute Velle
Kathy Willis
Cathy Jenks