Correspondence Analysis: Simple (CA) and Detrended (DCA)
Vamsi
Sundus
Shawnalee
What is Correspondence Analysis?
AKA Reciprocal Averaging (RA).
Basically: An ordination technique that involves repeatedly
calculating weighted averages.
Popular only in France (due to Benzécri).
What is Detrended Correspondence Analysis?
Designed specifically to solve certain problems found when
using CA on ecological data, based on an “empirical desire to
reshape data closer to the models visualized by ecologists.”
Popular mainly in the ecological community.
Weighted Means
A weighted mean results when some of the numbers in the
data are repeated.
Consider:
Arithmetic Mean:
\[ \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3 \]
Weighted Mean; value of 1 found 10 times:
\[ \frac{(10 \times 1) + 2 + 3 + 4 + 5}{10 + 1 + 1 + 1 + 1} = \frac{24}{14} = 1.71\ldots \]
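The two means above can be checked with a few lines of Python. This is only a minimal sketch; the helper name weighted_mean is made up for illustration and is not from the slides.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [1, 2, 3, 4, 5]

# Every value counted once: the ordinary arithmetic mean.
arithmetic = weighted_mean(values, [1, 1, 1, 1, 1])

# The value 1 counted 10 times, the others once each.
weighted = weighted_mean(values, [10, 1, 1, 1, 1])

print(arithmetic, round(weighted, 2))  # 3.0 1.71
```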
Application of Weighted Means
Let’s say we had some hypothetical data as follows:
Year   Counts
1      100
2      90
3      80
4      60
5      50
6      40
7      20
8      5
9      0
10     0
Application of Weighted Means
To find the average lifetime of the species, weight each
year by its count and compute the weighted mean (below):
\[ \frac{(1 \times 100) + (2 \times 90) + (3 \times 80) + (4 \times 60) + (5 \times 50) + (6 \times 40) + (7 \times 20) + (8 \times 5)}{100 + 90 + 80 + 60 + 50 + 40 + 20 + 5} = 3.21 \]
[Figure: the Year/Counts table repeated, annotated "Mean Year".]
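As a quick check of the mean year, here is a minimal Python sketch using the hypothetical counts from the table. The variable names are illustrative only.

```python
years = list(range(1, 11))
counts = [100, 90, 80, 60, 50, 40, 20, 5, 0, 0]

# Each year is weighted by the number of individuals counted in it.
# Years 9 and 10 have zero counts, so they contribute nothing.
mean_year = sum(y * c for y, c in zip(years, counts)) / sum(counts)

print(round(mean_year, 2))  # 3.21
```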
Applying the CA Algorithm to find the “mean
species” in a 3-species case.
In practice, though, most ecologists would be observing
multiple species at the same time, and hence would have
count data for each species, such as the following:
Year   Species A   Species B   Species C
1      100         0           0
2      90          10          0
3      80          20          5
4      60          35          10
5      50          50          20
6      40          60          30
7      20          30          40
8      5           20          60
9      0           10          75
10     0           0           90
[Each species' table is annotated "Mean Year" on the slide.]
Step 1
Start with an arbitrary weighting. It’s pretty kosher to start
from 0.0 to 100.0 in whatever increments are needed.
In our case, we’ll do (0, 50, 100) for (A, B, C).
Use this formula for the species of rank n, where S is the
number of species:
\[ 100 \times \frac{n - 1}{S - 1} \]
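The starting formula can be sketched in Python as follows. The function name initial_weights is an assumption made for this example.

```python
def initial_weights(num_species):
    """Evenly spaced starting weights from 0 to 100: 100 * (n - 1) / (S - 1)."""
    if num_species < 2:
        raise ValueError("need at least two species to span 0 to 100")
    return [100.0 * (n - 1) / (num_species - 1)
            for n in range(1, num_species + 1)]


print(initial_weights(3))  # [0.0, 50.0, 100.0]
```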
Step 2
Use the starter weights (which are essentially arbitrary) to
compute a weighting for each of the years.

Year   Counts (A)   Counts (B)   Counts (C)   -->   Y1
1      100          0            0            -->   0.0
2      90           10           0            -->   5.0
3      80           20           5            -->   14.3
4      60           35           10           -->   26.2
5      50           50           20           -->   37.5
6      40           60           30           -->   46.2
7      20           30           40           -->   61.1
8      5            20           60           -->   82.4
9      0            10           75           -->   94.1
10     0            0            90           -->   100.0

For Year 1:
\[ \frac{(0 \times 100) + (50 \times 0) + (100 \times 0)}{100 + 0 + 0} = 0.0 \]
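The Y1 column can be reproduced with a short Python sketch. The dictionary layout and the function name year_weights are choices made for this example, and the code assumes every year has at least one individual counted.

```python
counts = {
    "A": [100, 90, 80, 60, 50, 40, 20, 5, 0, 0],
    "B": [0, 10, 20, 35, 50, 60, 30, 20, 10, 0],
    "C": [0, 0, 5, 10, 20, 30, 40, 60, 75, 90],
}
species_weights = {"A": 0.0, "B": 50.0, "C": 100.0}


def year_weights(counts, species_weights):
    """Weighted mean of the species weights in each year, weighted by count."""
    n_years = len(next(iter(counts.values())))
    result = []
    for i in range(n_years):
        total = sum(counts[s][i] for s in counts)
        weighted = sum(species_weights[s] * counts[s][i] for s in counts)
        result.append(weighted / total)
    return result


y1 = year_weights(counts, species_weights)
print([round(w, 1) for w in y1])
# [0.0, 5.0, 14.3, 26.2, 37.5, 46.2, 61.1, 82.4, 94.1, 100.0]
```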
Step 3
We can now calculate a new weighting for each species using
these new year weightings.
For species A:
\[ \frac{(0 \times 100) + (5 \times 90) + (14.3 \times 80) + \ldots + (94.1 \times 0) + (100 \times 0)}{100 + 90 + \ldots + 20 + 5} = 19.1 \]
Calculate similarly for B, C.

Species   Old weightings (S1_0)   New calculated weightings (S1_a)
A         0                       19.1
B         50                      43.9
C         100                     78.5
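A minimal Python sketch of this step, under the same assumed data layout as the tables. The year weightings are kept at full precision here; rounding them to one decimal first can shift the last digit of the result.

```python
counts = {
    "A": [100, 90, 80, 60, 50, 40, 20, 5, 0, 0],
    "B": [0, 10, 20, 35, 50, 60, 30, 20, 10, 0],
    "C": [0, 0, 5, 10, 20, 30, 40, 60, 75, 90],
}
old_weights = {"A": 0.0, "B": 50.0, "C": 100.0}
n_years = 10

# Step 2: year weightings from the old species weightings.
year_w = [
    sum(old_weights[s] * counts[s][i] for s in counts)
    / sum(counts[s][i] for s in counts)
    for i in range(n_years)
]

# Step 3: each species' new weighting is the mean of the year
# weightings, weighted by that species' count in each year.
new_weights = {
    s: sum(w * c for w, c in zip(year_w, counts[s])) / sum(counts[s])
    for s in counts
}

print({s: round(v, 1) for s, v in new_weights.items()})
# {'A': 19.1, 'B': 43.9, 'C': 78.5}
```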
Step 4
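These new weightings for each species aren’t that useful as
they stand, though, so we need to rescale them back to the
range 0 to 100, instead of the current 19.1 to 78.5.
To do this, simply use a linear rescaling, where MIN and MAX
are the smallest and largest values of S1_a (19.1, 43.9, 78.5):
\[ S1_b = \frac{100 \times (S1_a - \mathrm{MIN})}{\mathrm{MAX} - \mathrm{MIN}} \]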
Step 4 cont.
So, after computing the rescaled values, we find the
following:

Species   A       B       C
S1_0      0       50      100
S1_a      19.1    43.9    78.5
S1_b      0.00    41.75   100.00
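The rescaling can be sketched in Python like this. The function name rescale_0_100 is made up for the example, and the inputs are the rounded S1_a values from the slide.

```python
def rescale_0_100(weights):
    """Linearly map the smallest weight to 0 and the largest to 100."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        raise ValueError("all weights are equal; cannot rescale")
    return [100.0 * (w - lo) / (hi - lo) for w in weights]


s1a = [19.1, 43.9, 78.5]
s1b = rescale_0_100(s1a)
print([round(w, 2) for w in s1b])  # [0.0, 41.75, 100.0]
```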
Step 5
This completes one cycle of CA.
“Weightings for each year are recalculated using the new,
rescaled weightings for the species.”
Eventually a stable pattern will emerge, usually within
10-20 iterations.
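Steps 1 to 5 can be put together in one loop. The following is only a sketch under stated assumptions: the function name reciprocal_averaging, the tolerance, and the iteration cap are all choices made for this example, not values from Shaw or the slides.

```python
def reciprocal_averaging(counts, tol=1e-6, max_iter=1000):
    """Repeat Steps 2-4 until the species weightings stop changing."""
    species = list(counts)
    n_years = len(counts[species[0]])

    # Step 1: evenly spaced starting weights from 0 to 100.
    weights = {s: 100.0 * i / (len(species) - 1) for i, s in enumerate(species)}

    for iteration in range(1, max_iter + 1):
        # Step 2: weighting for each year (assumes no year is all zeros).
        year_w = [
            sum(weights[s] * counts[s][y] for s in species)
            / sum(counts[s][y] for s in species)
            for y in range(n_years)
        ]
        # Step 3: new weighting for each species.
        new = {
            s: sum(w * c for w, c in zip(year_w, counts[s])) / sum(counts[s])
            for s in species
        }
        # Step 4: rescale back to the range 0 to 100.
        lo, hi = min(new.values()), max(new.values())
        new = {s: 100.0 * (v - lo) / (hi - lo) for s, v in new.items()}

        # Step 5: stop once a stable pattern has emerged.
        change = max(abs(new[s] - weights[s]) for s in species)
        weights = new
        if change < tol:
            break
    return weights, iteration


counts = {
    "A": [100, 90, 80, 60, 50, 40, 20, 5, 0, 0],
    "B": [0, 10, 20, 35, 50, 60, 30, 20, 10, 0],
    "C": [0, 0, 5, 10, 20, 30, 40, 60, 75, 90],
}
final_weights, n_iterations = reciprocal_averaging(counts)
print(final_weights, n_iterations)
```

With only three species the two extremes are pinned to 0 and 100 by the rescaling, so only the middle species' weighting actually moves between cycles.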
Correspondence Analysis
That was CA applied to a simple example.
Detrended Correspondence Analysis
• This technique is not purely mathematical
• It’s a series of rules that are used to reshape data to make it
friendlier for analysis.
• Once again, primarily used for ecological data, but can be
extended to anything (the data simply can’t contain negative
values).
• This technique is used to overcome the
arch effect (the horseshoe effect).
Arch Effect (Horseshoe Effect)
• Found in data whenever “PCA or other distance conserving
ordination techniques are applied to data which follow a
continuous gradient, along which there is a progressive
turnover of dominant variables.”
– Such as in ecological succession
• When the first two axes are plotted against each other after
ordination by a distance-conserving technique, one finds
an arch shape.
Steps of DCA
Two major stages:
• Ordination by CA (as previous).
• Then get rid of the arch effect by brute force.
Goal (the bold one)
Notice
There’s a loss of information, specifically the second CA axis,
the Y-axis in this case.
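The slides do not say how the brute-force step works. One commonly described approach is "detrending by segments": cut the first axis into segments and, within each segment, subtract the mean of the second-axis scores. The sketch below assumes that approach and is a simplified illustration, not DECORANA's actual procedure. The function name, the toy arch data, and the segment count are all made up for the example.

```python
def detrend_by_segments(axis1, axis2, n_segments):
    """Subtract the mean axis-2 score within equal-width segments of axis 1."""
    lo, hi = min(axis1), max(axis1)
    if hi == lo:
        raise ValueError("axis 1 has no spread; nothing to segment")
    width = (hi - lo) / n_segments

    def segment_of(x):
        # The maximum value is placed in the last segment.
        return min(int((x - lo) / width), n_segments - 1)

    groups = {}
    for x, y in zip(axis1, axis2):
        groups.setdefault(segment_of(x), []).append(y)
    means = {seg: sum(ys) / len(ys) for seg, ys in groups.items()}

    return [y - means[segment_of(x)] for x, y in zip(axis1, axis2)]


# Toy arch: axis 2 rises and then falls along axis 1.
axis1 = [0, 1, 2, 3, 4, 5, 6, 7, 8]
axis2 = [16 - (x - 4) ** 2 for x in axis1]
flattened = detrend_by_segments(axis1, axis2, n_segments=3)
print([round(v, 2) for v in flattened])
```

The arch is flattened but not removed in this toy case, and whatever structure the second axis carried is partly discarded, which is the loss of information noted above.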
Software
According to Shaw, the standard software is all based on the
same source code, DECORANA, and is accessed through some
front end to it.
However, there is a package to do this in R (vegan).
Basics in R.
decorana(veg, iweigh=0, iresc=4, ira=0, mk=26, short=0,
before=NULL, after=NULL)
veg = data matrix
iweigh = downweighting of rare species (0 = off). Both CA and DCA
are extremely sensitive to rare species, so turning this on decreases
the importance of rare species.
iresc = number of rescaling cycles.
ira = type of analysis (0 = detrended, 1 = simple), so the default
of 0 gives DCA rather than simple CA.
There’s no information to extend this in Shaw, so the remaining
arguments are left until a later time.
FIN