Transcript Slide 1

Adapted from Ecological Statistical Workshop, FLC, Daniel Laughlin
Properties of Community Data in Ecology
Community Data Summary
• Community data matrices
• Species on gradients
• Problems with community data
• Normality assumptions
Key questions to keep in the back of your mind:
1. How do species abundances relate to each other?
2. How do species relate to environmental gradients?
Community data matrices
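[Figure: community data matrix. Rows are independent sample units; columns are species (or molecular markers) or traits. Cell values are abundance or presence/absence, used as a measure of species performance. The matrix is SPARSE.]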
Full Community Dataset
n = # of sample units (plots)
p = # of species
t = # of traits
e = # of environmental variables or factors
d = # of dimensions

Matrix   Contents
n x p    plots in species space
n x e    plots in environmental space
n x t    plots in trait space
t x p    traits in species space
e x p    used for species in environmental space (A’E)
t x e    traits in environmental space
n x d    plots in reduced species space
d x p    species in reduced plot space
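To make the matrix dimensions concrete, here is a minimal sketch with made-up numbers (n = 3 plots, p = 4 species, e = 2 environmental variables). The matrix names A and E follow the A’E notation on the slide; the helper functions and all values are assumptions for illustration only. Note that the product A’E as written has p rows and e columns, the transpose of the e x p layout listed on the slide.

```python
# Toy community dataset; all values are invented for illustration.
A = [  # n x p: plots in species space (abundances)
    [5, 0, 0, 2],
    [0, 3, 0, 1],
    [0, 0, 4, 0],
]
E = [  # n x e: plots in environmental space
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]


def shape(m):
    """Return (rows, columns)."""
    return (len(m), len(m[0]))


# A'E: species in environmental space (p rows, e columns)
AtE = matmul(transpose(A), E)

# Fraction of zero cells in A: community matrices are typically sparse
cells = [v for row in A for v in row]
sparsity = sum(1 for v in cells if v == 0) / len(cells)

print(shape(A), shape(E), shape(AtE))
print(f"proportion of zeros in A: {sparsity:.2f}")
```

Each row of A’E weights the environmental values by one species' abundances across plots, which is why it places species, rather than plots, in environmental space.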
Ordination can address more questions than how plots differ in composition…
Species on environmental gradients
Gaussian ideal
- peak abundances, nonlinear
- this is challenging to analyze
Linear responses to gradients
- okay for short gradients
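The contrast between the Gaussian ideal and a linear response can be sketched numerically. The function below is one common way to write a unimodal response; the parameter names (optimum, tolerance, peak) and all values are assumptions chosen for illustration, not taken from the slides. The sketch compares the linear correlation between abundance and gradient position over a long gradient and over a short segment on one side of the optimum.

```python
import math


def gaussian_response(x, optimum, tolerance, peak):
    """Abundance at gradient position x for a unimodal (Gaussian) response."""
    return peak * math.exp(-((x - optimum) ** 2) / (2 * tolerance ** 2))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical species: optimum at 50, tolerance 10, peak abundance 1
long_gradient = [float(x) for x in range(0, 101)]    # spans both sides of the optimum
short_gradient = [float(x) for x in range(55, 66)]   # a short segment on one side

r_long = pearson_r(
    long_gradient, [gaussian_response(x, 50.0, 10.0, 1.0) for x in long_gradient]
)
r_short = pearson_r(
    short_gradient, [gaussian_response(x, 50.0, 10.0, 1.0) for x in short_gradient]
)

print(f"long gradient:  r = {r_long:.3f}")
print(f"short gradient: r = {r_short:.3f}")
```

Over the long gradient the hump-shaped response gives essentially no linear correlation, while over the short segment a straight line is a close approximation.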
Major Problems with Community Data
1. Species responses have the “zero truncation problem”
2. Curves are “solid” due to the action of many other factors
3. Response curves can be complex
4. High beta diversity
5. Nonnormal species distributions
Major Problems with Community Data
1. Zero truncation
• species responses are truncated at zero
• beyond a species' tolerance limits, only zeros are possible
• zeros carry no info on how unfavorable the environment is for a species
2. “Solid” curves
• “curves” are typically solid envelopes rather than curves
• a species is usually less abundant than its potential (even zeros are possible)
Major Problems with Community Data
3. Complex curves
• polymodal, asymmetric, discontinuous
[Figure: average lichen cover on twigs in shore pine bogs in SE Alaska.]
High beta diversity
• Beta diversity = the difference in community
composition between communities along an
environmental gradient or among communities
within a landscape
Whittaker’s (1972) Beta Diversity
βw = (γ / α) − 1
γ = number of species in the composite sample (total number of species)
α = average species richness in the sample units
No formal units, but can be thought of as the “number of distinct communities”.
The one is subtracted so that zero beta diversity corresponds to zero species turnover.
Rule of thumb: βw < 1 is low, βw > 5 is high.
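A minimal sketch of the calculation, assuming the standard form βw = γ / α − 1 implied by the definitions above (total richness divided by mean sample-unit richness, minus one). The function name and the two toy matrices are illustrative assumptions, not data from the workshop.

```python
def whittaker_beta(matrix):
    """Whittaker's beta diversity: gamma / mean alpha - 1.

    matrix: list of sample units (rows), each a list of species abundances.
    A species counts as present in a sample unit when its abundance is > 0.
    """
    n_species = len(matrix[0])
    # gamma: species present in at least one sample unit
    gamma = sum(1 for j in range(n_species) if any(row[j] > 0 for row in matrix))
    # alpha: mean number of species present per sample unit
    alpha = sum(sum(1 for v in row if v > 0) for row in matrix) / len(matrix)
    return gamma / alpha - 1


# Two plots sharing exactly the same species: no turnover
identical = [
    [1, 2, 0],
    [3, 1, 0],
]

# Two plots with no species in common: complete turnover
disjoint = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

print(whittaker_beta(identical))  # 0.0
print(whittaker_beta(disjoint))   # 1.0
```

With two sample units, the value runs from 0 (same species list) to 1 (no shared species), which matches the reading of βw as roughly the number of distinct communities beyond the first.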
Are species distributions normal?
• Univariate normality (it’s what we’re used to)
• Bivariate normality (it’s easy to visualize)
– Idealized community data
– Real community data
• Multivariate normality (straightforward
extension of bivariate normality to multiple
dimensions)
Univariate normality
For a normal distribution: skew = 0 and kurtosis = 0
Normality can be assessed by:
skewness (asymmetry), and
kurtosis (peakiness)
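As a rough illustration of these two statistics, the sketch below computes moment-based skewness and excess kurtosis (so that a normal distribution scores 0 on both). These are the simple population-moment formulas; statistical packages often apply small-sample corrections, so their output may differ slightly. The two data vectors are invented: one symmetric, one with many zeros as is typical for a species' abundances across plots.

```python
def central_moment(values, k):
    """k-th central moment (population form, divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n


def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3: 0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]                    # evenly spread, no asymmetry
zero_heavy = [0, 0, 0, 0, 0, 0, 0, 1, 2, 10]   # hypothetical abundances, mostly zeros

print(f"symmetric:  skew = {skewness(symmetric):.2f}")
print(f"zero-heavy: skew = {skewness(zero_heavy):.2f}")
print(f"zero-heavy: excess kurtosis = {excess_kurtosis(zero_heavy):.2f}")
```

The zero-heavy vector has a long right tail, so its skewness is strongly positive, while the symmetric vector scores zero.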
Skewness
• Community data will nearly always be positively skewed due to lots of zeroes
• As a rule of thumb, linear models require |skew| < 1
• Assess skewness of data in PC-ORD (Row and Column Summary)
[Figure: positively skewed distribution typical of community data; panels labeled HYVI, PLHE, and HYIN.]
Bivariate Normality
[Figure: bivariate normal distributions, views from above.]
Bivariate Species Distributions
positive association
bivariate distribution is non-linear
negative association
dust bunny distribution: plotting one species against another gives lots of points near the origin and along the axes
Idealized Gaussian species response curves
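The dust bunny pattern can be reproduced from idealized Gaussian response curves alone. In this minimal sketch, two hypothetical species have optima at different positions on the same gradient; every parameter value, including the "near zero" threshold, is an arbitrary assumption for illustration. Counting the plots where at least one species is nearly absent shows how many points would sit on or near an axis in a species-by-species scatterplot.

```python
import math


def gaussian_response(x, optimum, tolerance=5.0, peak=100.0):
    """Abundance at gradient position x for a unimodal (Gaussian) response."""
    return peak * math.exp(-((x - optimum) ** 2) / (2 * tolerance ** 2))


# 101 plots evenly spaced along a hypothetical gradient from 0 to 100
gradient = [float(x) for x in range(0, 101)]
species_a = [gaussian_response(x, optimum=40.0) for x in gradient]
species_b = [gaussian_response(x, optimum=60.0) for x in gradient]

threshold = 1.0  # abundance below this counts as "near zero" (arbitrary cutoff)

# Plots where at least one species is near zero fall along an axis or at the origin
near_axis = sum(
    1 for a, b in zip(species_a, species_b) if a < threshold or b < threshold
)
both_present = len(gradient) - near_axis

print(f"plots near an axis or the origin: {near_axis} of {len(gradient)}")
print(f"plots with both species present:  {both_present}")
```

Only the narrow zone where the two curves overlap contributes points away from the axes, so the scatterplot is far from the elliptical cloud that bivariate normality would produce.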
Bivariate Species Distributions
positive association
dust bunny distribution
negative association
dust bunny distribution
Realistic data with “solid” response curves
Bi- and Tri-variate Distributions
Bivariate normal distribution
forms elliptical cloud
Multivariate normal distribution
(hyperellipsoid)
Bivariate distribution with most
points lying near one or two axes
Multivariate dust bunny distribution
Dust bunny in 3-D species space
Environmental gradients form
strong non-linear shape in species
space
A: cluster within the cloud of points (stands) occupying vegetation space.
B: three-dimensional abstract vegetation space; each dimension represents an element (e.g., the proportion of a certain species) in the analysis (X, Y, Z axes).
A, the results of a classification approach (here attempted after ordination) in which similar individuals are grouped and considered as a single cell or unit.
B, the results of an ordination approach in which similar stands nevertheless retain their unique properties, so no information is lost (X1, Y1, Z1 axes).
Key point: abstract space has no connection with the real space from which the records were initially collected.
Multivariate Normality
• Linear algebra easily extends these
concepts into multiple dimensions
• Most multivariate methods (e.g., linear ordination methods) assume multivariate normality
• Ecological data are seriously non-normal
• Thus, we will often require different methods