NUMERICAL ANALYSIS OF BIOLOGICAL AND ENVIRONMENTAL …

QUANTITATIVE METHODS IN
PALAEOECOLOGY AND
PALAEOCLIMATOLOGY
Class 5
Hypothesis Testing
Espegrend August 2008
CONTENTS
Randomisation tests – an introduction
pH changes at Round Loch of Glenhead
Assessing impacts of volcanic ash deposition on terrestrial and aquatic systems
Multi-proxy studies
Assessing potential external 'drivers' on an aquatic ecosystem
Conclusions
RANDOMISATION TESTS
Simple introductory example
Mandible lengths (mm) of male and female jackals in the Natural History Museum
Male     120  107  110  116  114  111  113  117  114  112
Female   110  111  107  108  110  105  107  106  111  111
Is there any evidence of a difference in mean lengths between the two sexes?
The male mean is larger than the female mean.
Null hypothesis (H0) – no difference in mean lengths between the two sexes;
any observed difference is purely due to chance. If H0 is consistent with the
data, there is no reason to reject it in favour of the alternative hypothesis
that males have a larger mean than females.
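Classical hypothesis testing – t-test for comparison of 2 means
                     Group 1       Group 2
Number of objects    $n_1$         $n_2$
Mean                 $\bar{x}_1$   $\bar{x}_2$
Standard deviation   $s_1$         $s_2$
Assume that the values for group 1 are a random sample from a normal
distribution with mean $\mu_1$ and standard deviation $\sigma$, and that the
values for group 2 are a random sample from a normal distribution with mean
$\mu_2$ and standard deviation $\sigma$.
$H_0$: $\mu_1 = \mu_2$
$H_1$: $\mu_1 > \mu_2$
Test the null hypothesis with an estimate of the common within-group s.d.
$$S = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
$$T = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{1/n_1 + 1/n_2}}$$
If $H_0$ is true, T will be a random value from the t-distribution with $n_1 + n_2 - 2$ d.f.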
Jackal data
$\bar{x}_1$ = 113.4 mm   $s_1$ = 3.72 mm
$\bar{x}_2$ = 108.6 mm   $s_2$ = 2.27 mm
S = 3.08   T = 3.484 with 18 d.f.
Probability of a value this large is 0.0013 if the null hypothesis is true.
Therefore the sample result is nearly significant at the 0.1% level. Strong
evidence against the null hypothesis. Support for the alternative hypothesis.
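The arithmetic of the pooled t-test can be checked with a short Python sketch. It uses only the standard library and the jackal measurements given in these notes; the variable names (`male`, `female`, `pooled_sd`, `t_stat`) are illustrative choices, and the tail probability is not computed here because that needs a t-distribution routine outside the standard library.

```python
from math import sqrt
from statistics import mean, stdev

# Mandible lengths (mm) from the jackal example
male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

n1, n2 = len(male), len(female)
s1, s2 = stdev(male), stdev(female)  # sample standard deviations (n - 1 divisor)

# Pooled estimate of the common within-group standard deviation
pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Two-sample t statistic and its degrees of freedom
t_stat = (mean(male) - mean(female)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"S = {pooled_sd:.2f}, T = {t_stat:.3f}, d.f. = {df}")
```

Run as written, this should reproduce S = 3.08 and T = 3.484 with 18 d.f.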
Assumptions of T-test
1. Random sampling of individuals from the
populations of interest
2. Equal population standard deviations for males and
females
3. Normal distributions within groups
Alternative Approach
If there is no difference between the two sexes, then the length distribution in the
two groups will just be a typical result of allocating 20 lengths at random into 2
groups each of size 10. Compare observed difference with distribution of differences
found with random allocation.
TEST:
1. Find the mean scores for males and females and the difference D0 for the observed data.
2. Randomly allocate 10 lengths to the male group and the remaining 10 to the female group.
Calculate the difference D1.
3. Repeat many times (e.g. n = 999) to find the empirical distribution of D that
occurs by random allocation. RANDOMISATION DISTRIBUTION.
4. If D0 looks like a 'typical' value from this randomisation distribution, conclude
that the allocation of lengths to males and females is essentially random and thus
there is no difference in length values. If D0 is unusually large, say in the top 5%
tail of the randomisation distribution, the observed data are unlikely to have arisen
if the null hypothesis is true. Conclude that the alternative model is more plausible.
If D0 in top 1% tail, significant at 1% level
If D0 in top 0.1% tail, significant at 0.1% level
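The four steps of the test can be sketched in Python as follows. This is a minimal illustration using the standard library only; the function name `randomisation_test`, the seed, and the choice of 4999 randomisations are assumptions made for the example, and the observed allocation is counted as one member of the randomisation distribution.

```python
import random
from statistics import mean

# Mandible lengths (mm) from the jackal example
male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

def randomisation_test(group1, group2, n_random=4999, seed=1):
    """One-sided randomisation test of mean(group1) - mean(group2)."""
    rng = random.Random(seed)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    d_obs = mean(group1) - mean(group2)
    count = 1  # the observed allocation is part of the distribution
    for _ in range(n_random):
        rng.shuffle(pooled)  # random allocation of lengths to the two groups
        d = mean(pooled[:n1]) - mean(pooled[n1:])
        if d >= d_obs - 1e-9:  # small tolerance for floating-point ties
            count += 1
    return d_obs, count / (n_random + 1)

d0, p = randomisation_test(male, female)
print(f"D0 = {d0:.1f} mm, significance level = {p:.4f}")
```

The significance level depends on the random seed, so it will be of the same order as, but not identical to, a value obtained from any other set of randomisations.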
The distribution of the differences observed between the mean for males
and the mean for females when 20 measurements of mandible lengths are
randomly allocated, 10 to each sex. 4999 randomisations.
$\bar{x}_1$ = 113.4 mm   $\bar{x}_2$ = 108.6 mm   D0 = 4.8 mm
Only nine were 4.8 or more, including D0. Six were 4.8 and two were > 4.8.
Significance level = 9/5000 = 0.0018 = 0.18%   (cf. t-test 0.0013 = 0.13%)
Number of possible allocations: 20C10 = 184,756.
5000 is only 2.7% of all possibilities.
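Because there are only 184,756 possible allocations, the full randomisation distribution can also be enumerated rather than sampled. The sketch below is an illustration with the standard library, not part of the original notes. Since the total of all 20 lengths is fixed, comparing the sum of the 'male' group with the observed male sum is equivalent to comparing the differences in means, and it avoids floating-point ties.

```python
from itertools import combinations
from math import comb

# Mandible lengths (mm) from the jackal example
male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

pooled = male + female
observed_sum = sum(male)

# Number of ways of choosing which 10 of the 20 lengths are 'male'
n_total = comb(len(pooled), len(male))

# Allocations whose difference in means is at least as large as observed
n_extreme = sum(
    1 for group in combinations(pooled, len(male)) if sum(group) >= observed_sum
)

print(n_total)              # total number of allocations
print(n_extreme / n_total)  # exact one-sided significance level
print(5000 / n_total)       # fraction covered by 5000 randomisations
```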
Three main advantages
1. Valid even without random samples.
2. Easy to take account of particular features of data.
3. Can use 'non-standard' test statistics.
They tell us whether a certain pattern could or could not have arisen
by chance. Completely specific to the data set.
Randomisation tests and Monte Carlo permutation tests
If all data arrangements are equally likely, RANDOMISATION TEST with
random sampling of randomisation distribution. Otherwise, MONTE CARLO
PERMUTATION TEST.
Validity depends on validity of permutation types for particular data-type –
time-series stratigraphical data, spatial grids, repeated measurements
(BACI). All require particular types of permutations.
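As a minimal illustration of a restricted permutation, the sketch below generates cyclic shifts of an ordered series, one scheme often used for time-series or stratigraphical data because it keeps neighbouring samples next to each other. That the analyses in these notes used exactly this scheme is an assumption, and the function name `cyclic_shifts` is hypothetical.

```python
def cyclic_shifts(values):
    """Return every non-trivial cyclic shift of an ordered sequence."""
    n = len(values)
    return [values[k:] + values[:k] for k in range(1, n)]

series = [1, 2, 3, 4, 5]  # sample order down a core (illustrative only)
for shifted in cyclic_shifts(series):
    print(shifted)
```

Only n - 1 distinct shifts exist for n samples, so with this scheme the smallest attainable significance level is limited by the length of the series.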
pH CHANGES IN ROUND LOCH OF
GLENHEAD
Monte Carlo permutation tests
ROUND LOCH OF GLENHEAD
pH change 1874-1931 (17.3-7.3cm) very marked.
Is it any different from other pH fluctuations over last 10,000 years?
Null hypothesis – no different from rates of pH change in pre-acidification times.
Randomly resample with replacement 1000 times to create temporally
ordered data of the same thickness as the interval of interest – time-duration
or elapsed-time test. As the time series contains unequal depth
intervals between pH estimates, it is not possible for each bootstrapped
time series to contain exactly 10 cm. Instead, samples are added to the time
series until the depth interval equals or exceeds 10 cm.
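The stopping rule described above (keep adding resampled samples until at least 10 cm is reached) can be sketched as follows. This is a simplified illustration, not the procedure used in the original study: it resamples independent steps between adjacent samples, the function name `elapsed_time_test` is hypothetical, and the depth and pH values are invented purely to make the example run.

```python
import random

def elapsed_time_test(depth_steps, ph_steps, observed_rate,
                      target_cm=10.0, n_boot=1000, seed=0):
    """Bootstrap significance level for an observed rate (pH units per cm)."""
    rng = random.Random(seed)
    steps = list(zip(depth_steps, ph_steps))
    n_extreme = 1  # count the observed interval itself
    for _ in range(n_boot):
        depth = 0.0
        change = 0.0
        # Add resampled steps until the series spans at least target_cm
        while depth < target_cm:
            d, dph = rng.choice(steps)
            depth += d
            change += dph
        if abs(change) / depth >= observed_rate:
            n_extreme += 1
    return n_extreme / (n_boot + 1)

# Hypothetical pre-acidification record (NOT real data):
# depth increments (cm) and pH changes between adjacent samples
depth_steps = [0.5, 1.0, 2.0, 1.0, 0.5, 1.5]
ph_steps = [0.05, -0.03, 0.02, -0.04, 0.01, -0.02]

p = elapsed_time_test(depth_steps, ph_steps, observed_rate=0.2)
print(p)
```

Each bootstrapped series ends up slightly longer than 10 cm in most cases, which is why the rate is divided by the depth actually accumulated rather than by 10.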
Rate (pH change per cm)
Statistical methods for testing competing causal hypotheses
Response variable(s) Y    e.g. lake-water pH, sediment LOI, tree pollen stratigraphy
Predictor variable(s) X   e.g. charcoal, age, land-use indicators, climate
Also covariables
Basic statistical model: Y = BX
Y     X     Method
1     1     Simple linear regression
1     >1    Multiple linear regression, principal components regression, partial least squares (PLS)
>1    ≥1    Redundancy analysis (= constrained PCA, reduced-rank regression, PCA of y with respect to x, etc.)
Statistical testing by Monte Carlo permutation tests to derive empirical
statistical distributions
Variance partitioning or decomposition to evaluate different hypotheses.
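For the simplest case in the table (one response, one predictor), a Monte Carlo permutation test can be sketched as below. The test statistic is the proportion of variance explained (r squared), the response is permuted without restriction, and the data are invented for illustration. Stratigraphical data would normally need a restricted permutation scheme instead, so this is a sketch of the principle only; the names `r_squared` and `permutation_test` are hypothetical.

```python
import random

def r_squared(x, y):
    """Proportion of variance in y explained by a straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def permutation_test(x, y, n_perm=999, seed=0):
    """Empirical significance level of r squared under random permutation of y."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    y_perm = list(y)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if r_squared(x, y_perm) >= observed - 1e-12:
            count += 1
    return observed, count / (n_perm + 1)

# Hypothetical predictor and response values (NOT real data)
x = list(range(1, 11))
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]

observed, p = permutation_test(x, y)
print(f"r squared = {observed:.3f}, p = {p:.3f}")
```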
ASSESSING IMPACTS OF LAACHER
SEE VOLCANIC ASH ON TERRESTRIAL
AND AQUATIC ECOSYSTEMS
A.F. Lotter & H.J.B. Birks (1993) J. Quat. Sci. 8, 263-276
11000 BP
? Any impact on terrestrial and aquatic systems
Also:
H.J.B. Birks & A.F. Lotter (1994) J. Paleolimnology 11, 313-322
A.F. Lotter et al. (1995) J. Paleolimnology 14, 23-47
Map showing the location of
Laacher See (red star), as
well as the location of the
sites investigated (blue
circle). Numbers indicate the
amount of Laacher See
Tephra deposition in
millimetres (modified from
van den Bogaard, 1983).
Loss-on-ignition of cores Hirschenmoor HI-1 and Rotmeer RO-6. The line
marks the transition from the Allerød (II) to the Younger Dryas (III)
biozone. LST = Laacher See Tephra.
Diatoms in cores HI-1 and RO-6 grouped according to lifeforms. LST = Laacher See Tephra.
Diatom-inferred pH values for cores HI-1 and RO-6. The interpolation is
based on distance-weighted least-squares (tension = 0.01). The line marks
the transition from the Allerød (II) to the Younger Dryas (III) biozone. LST
= Laacher See Tephra.
Data
RESPONSE VARIABLES (% data)
Terrestrial pollen and spores (9, 31 taxa)
Aquatic pollen and spores (6, 8 taxa)
Diatoms (42, 54 taxa)
EXPLANATORY VARIABLES
Biozone (Allerød, Allerød/Younger Dryas, Younger Dryas)   +/-
Lithology (gyttja, clay/gyttja)                            +/-
Depth ("age")                                              Continuous
Ash                                                        Continuous; exponential decay process,
                                                           Exp x^(-αt), with α = 0.5, x = 100, t = time
[Figure: ash variable against time, Allerød (AL) to Younger Dryas (YD); 211 years]
NUMERICAL ANALYSIS
(Partial) redundancy analysis
Restricted (stratigraphical) Monte Carlo permutation tests
Variance partitioning
Log-ratio centring because of % data
The biostratigraphical data sets used in the (partial) redundancy analyses
(SD = standard deviation units)
HIRSCHENMOOR CORE HI-1
                       Terrestrial pollen   Aquatic pollen and spores   Diatoms
Number of samples      16                   16                          16
Number of taxa         9                    6                           42
Gradient length (SD)   0.48                 0.84                        1.44
ROTMEER CORE RO-6
                       Terrestrial pollen   Aquatic pollen and spores   Diatoms
Number of samples      21                   21                          21
Number of taxa         31                   8                           54
Gradient length (SD)   0.74                 0.71                        1.68
RESULTS OF (PARTIAL) REDUNDANCY ANALYSIS OF THE BIOSTRATIGRAPHICAL
DATA SETS AT ROTMEER (RO-6) AND HIRSCHENMOOR (HI-1) UNDER DIFFERENT
MODELS OF EXPLANATORY VARIABLES AND COVARIABLES.
Entries are significance levels as assessed by restricted Monte Carlo permutation tests (n = 99)
Site   Explanatory variables               Covariables                    Terrestrial pollen   Aquatic pollen & spores   Diatoms
RO-6   Depth + biozone + ash + lithology   -                              0.01a                0.01a                     0.01a
HI-1   Depth + biozone + ash + lithology   -                              0.01a                0.10                      0.01a
Unique ash effect (no lithology)
RO-6   Ash                                 Depth + biozone                0.09ns               0.48ns                    0.16ns
HI-1   Ash                                 Depth + biozone                0.28ns               0.13ns                    0.01a
Unique ash + lithology effect
RO-6   Ash + lithology                     Depth + biozone                -                    0.88ns                    0.17ns
HI-1   Ash + lithology                     Depth + biozone                -                    0.10ns                    0.01a
Unique ash effect (lithology considered)
RO-6   Ash                                 Depth + biozone + lithology    -                    0.53ns                    0.08ns
HI-1   Ash                                 Depth + biozone + lithology    -                    0.10ns                    0.19ns
Unique ash + lithology + (ash*lithology) interaction effect
RO-6   Ash + lithology + ash*lithology     Depth + biozone                -                    0.25ns                    0.03b
HI-1   Ash + lithology + ash*lithology     Depth + biozone                -                    0.12ns                    0.05b
a p ≤ 0.01   b 0.01 < p ≤ 0.05   ns not significant
The Laacher See eruption is reflected
in the tree-rings of the Scots pines
from Dättnau, near Winterthur,
Switzerland, by a growth disturbance
lasting at least 5 yr, and persisting in
most of the trees for a further 3 yr. The
X-ray photograph shows normal growth
rings in sector (a); a very narrow tree-ring
sequence in sector (b); three more
rings of smaller width in sector (c); and
in sector (d), after recovery, normally
grown rings. The graph of the density
curve shows on the vertical axis the
maximum latewood densities; on the
horizontal axis the tree-ring width. The
latewood densities reflect a reduction
in summer temperature lasting for 4 yr.
Hämelsee - annually laminated sediments
Effects on lake sediments lasted no more than 20 years.
8 winter layers contain clay and silt.
MULTI-PROXY STUDIES
Major development in Quaternary palaeoecology in
the last 10-15 years has been multi-proxy studies
where several stratigraphical variables (e.g. pollen,
plant macrofossils, diatoms, sediment magnetics,
geochemistry, grain-size distribution) are studied on
the same core.
If the available data are 'split' into 'reconstruction data' and
'response data', hypotheses about potential causes of change in
the 'response data' can be tested.
Sägistalsee, Bernese
Oberland, Swiss Alps
A.F. Lotter et al. 2003
Sägistalsee, Switzerland
Ideal study:
1. Critical ecological situation at tree-line today; sensitive
2. One core. Many proxies (pollen, macros, chironomids, cladocera, grain
size, sediment magnetics, sediment geochemistry)
3. Well dated; 18 AMS 14C dates on terrestrial plant material
4. Well co-ordinated by A. Lotter
5. High quality data:
Data-set       No. of samples   No. of taxa/variables
Pollen         212              203
Plant macros   372              53
Chironomids    82               30
Cladocera      112              7
Geochemistry   176              14
Grain-size     294              6
Magnetics      504              5
6. Consistent numerical methodology on all proxies
7. New approach: numerical methods used to test hypotheses
about the influence of climate and catchment processes on the
aquatic ecosystem in the perspective of the Holocene time-scale.
(Partial redundancy analysis with restricted Monte Carlo
permutation tests)
Of the catchment changes, the main ones appear to be the spread
of Picea abies at about 6300 cal BP and Bronze Age and
subsequent forest clearances and conversion to grazing pastures.
Hypotheses tested:
1. Climate has had a significant control on lake ecosystem changes
2. Catchment vegetation has played a significant role in lake changes
"Responses"
(proxies)
Terrestrial
Scale
Climate a significant
predictor?
Catchment vegetation a
significant predictor?
Pollen
Catchment &
regional
Catchment
Y
Y
-
-
Lake
Lake
N
N
Y
Y
Lake
Lake
Lake
*
Y
Y
(Y)
#
Macrofossils
Lake biotic
Chironomids
Cladocera
Lake abiotic
Grain size
Magnetics
Geochemistry
* Tested against insolation, central
European cold phases, & Atlantic IRD record
# Veg phases: Betula-Pinus cembra; Alnus-Pinus cembra; Picea abies ~ 6300 cal BP;
Pasture phases from Bronze Age to present
ASSESSING POTENTIAL EXTERNAL
'DRIVERS' ON AN AQUATIC
ECOSYSTEM
Bradshaw et al. 2005 The Holocene 15: 1152-1162
Dallund Sø, a small (15 ha), shallow (2.6 m) lowland
eutrophic lake on the island of Funen, Denmark.
Catchment (153 ha) today
agriculture      77 ha
built-up areas   41 ha
woodland         32 ha
wetlands         3 ha
Nutrient rich – total P 65-120 µg l-1
Map of Dallund Sø
Multi-proxy study to assess role of potential external 'drivers' or
forcing functions on changes in the lake ecosystem in last 7000 yrs.
Data:
                                                No. of samples   Transformation
Sediment loss-on-ignition %                     560              None
Sediment dry mass accumulation rate             560              Log (x + 1)
Sediment minerogenic matter accumulation rate   560              Log (x + 1)
Plant macrofossil concentrations                280              Log (x + 1)
Pollen %                                        90               None
Diatoms %                                       118              None
Diatom inferred total P                         118              None
Biogenic silica                                 84               Not used
Pediastrum %                                    90               None
Zooplankton                                     31               Not used
Terrestrial landscape or catchment development (Bradshaw et al. 2005)
Aquatic ecosystem development (Bradshaw et al. 2005)
DCA of pollen and diatom data separately to summarise major
underlying trends in both data sets
Pollen – high scores for trees, low scores for light-demanding herbs and crops
Diatom – high scores mainly planktonic and large benthic types, low scores
for Fragilaria spp. and eutrophic spp. (e.g. Cyclostephanos dubius)
Bradshaw et al. 2005
Major contrast between samples before and after Late Bronze Age forest clearances
Prior to clearance, lake experienced few impacts.
After the clearance, lake heavily impacted.
[Figure axes: 'Lake', 'Catchment'] Bradshaw et al. 2005
Canonical correspondence analysis
Response variables: Diatom taxa
Predictor variables: Pollen taxa, LOI, dry mass and minerogenic accumulation rates, plant
macrofossils, Pediastrum
Covariable: Age
69 matching samples
Partial CCA with age partialled out as a covariable. Makes
interpretation of effects of predictors easier by removing temporal
trends and temporal autocorrelation
Partial CCA all variables
18.4% of variation in diatom data explained by Poaceae pollen,
Cannabis-type pollen, and Daphnia ephippia.
As different external factors may be important at different times, divided
data into 50 overlapping data sets – sample 1-20, 2-21, 3-22, etc.
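The arithmetic behind the 50 overlapping subsets can be checked with a short sketch: 69 matching samples taken 20 at a time give 69 - 20 + 1 = 50 windows. The function name `moving_windows` is illustrative, and the window width of 20 is inferred from the 'sample 1-20, 2-21, 3-22' pattern above.

```python
def moving_windows(n_samples, width):
    """Return (first, last) 1-based sample numbers for each overlapping window."""
    return [(start, start + width - 1) for start in range(1, n_samples - width + 2)]

windows = moving_windows(69, 20)
print(len(windows), windows[0], windows[-1])  # 50 (1, 20) (50, 69)
```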
Bradshaw et al. 2005
CCA of 50 subsets from bottom to top and % variance explained
1. 4520-1840 BC Poaceae is sole predictor variable (20-22% of diatom
variance)
2. 3760-1310 BC LOI and Populus pollen (16-33%)
3. 3050-600 BC Betula, Ulmus, Populus, Fagus, Plantago, etc. (17-40%)
i.e. in these early periods, diatom change influenced to some degree
by external catchment processes and terrestrial vegetation change.
4. 2570 BC – 1260 AD Erosion indicators (charcoal, dry mass
accumulation), retting indicator Linum capsules, Daphnia ephippia,
Secale and Hordeum pollen (11-52%)
i.e. changing water depth and external factors
5. 160 BC – 1900 AD Hordeum, Fagus, Cannabis pollen, Pediastrum
boryanum, Nymphaea seeds (22-47%)
i.e. nutrient enrichment as a result of retting hemp, also changes in
water depth and water clarity
Bradshaw et al. 2005
Strong link between inferred catchment change and within-lake development. Timing
and magnitude are not always perfectly matched, e.g. transition to Mediæval Period
CONCLUSIONS
Phases in palaeoecology
Descriptive phase - patterns are detected, described and classified
Narrative phase - plausible, inductively-based explanations, generalisations, or
reconstructions are proposed for observed patterns
Analytical phase - falsifiable or testable hypotheses are proposed, evaluated,
tested and rejected
Why is there so little analytical hypothesis-testing in palaeoecology?
MONTE CARLO PERMUTATION TESTS are valid without random samples,
can be developed to take account of the properties of the data of
interest, can use "non-standard" test statistics, and are completely
specific to the data-set at hand. Ideal for palaeoecology.