Vector geometry


Vector geometry: A visual tool
for statistics
Sylvain Chartier
Laboratory for Computational
Neurodynamics and Cognition
Centre for Neural Dynamics
Vector geometry
• How a single vector (arrow) can represent the concepts of
– Mean, variance (standard deviation), normalization and
standardization.
• How two vectors can represent the concepts of
– Correlation and regression.
A datum
(0)
(16)
Two data
(8)
(0)
Principle of independence of
observations: perpendicular
(orthogonal) directions
(16)
Two data
(8)
(16,8)
(0, 0)
(16)
Two data
(16,8)
(0, 0)
Starting point: Zero
Finish point
(16,8)
Starting point
(0,0)
Starting point: Mean
Finish point
x = (x1, x2)
Starting point
(x̄, x̄)
Starting point: Mean
Starting point
(12, 12)
Finish point
x = (16, 8)
One group
Many groups
Degrees of freedom
We remove the effect of the mean
We center the data
Starting point (mean)
(12, 12)
(0, 0)
Finish point
x = (16, 8)
x − x̄ = (4, −4)
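The centering step on this slide can be sketched in a few lines of Python. This is only an illustration using the slide's data (16, 8); the variable names are assumptions, not part of the original deck.

```python
# Data from the slide: one group with two observations.
x = [16.0, 8.0]

# The mean is the starting point of the arrow: (12, 12).
mean = sum(x) / len(x)

# Subtracting the mean moves the starting point to the origin (0, 0).
centered = [xi - mean for xi in x]

print(mean)      # 12.0
print(centered)  # [4.0, -4.0]
```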
We remove the effect of the mean
(many groups)
What is the real
dimensionality?
We remove the effect of the mean
• If we have two data, we will get one dimension.
• If we have three data, we will get two dimensions
.
.
.
• If we have n data, we will get n-1 dimensions.
=> In other words, degrees of freedom represent the true dimensionality of
the data.
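One way to see why n data leave only n-1 dimensions: once the mean is removed, the deviations must sum to zero, so the last deviation is fully determined by the others. The sketch below uses a hypothetical three-datum group (2, 4, 9) chosen only for illustration.

```python
def center(data):
    """Return the deviation of each datum from the group mean."""
    mean = sum(data) / len(data)
    return [xi - mean for xi in data]

# Hypothetical group of three data (not from the slides).
deviations = center([2.0, 4.0, 9.0])

# Only n - 1 deviations are free; the last one is implied by the rest.
free_part = deviations[:-1]
implied_last = -sum(free_part)

print(deviations)    # [-3.0, -1.0, 4.0]
print(implied_last)  # 4.0
```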
Variance
What is the difference between these three groups
(composed of two data each)?
=> Length (distance)
=> The higher the variability, the longer the length
will be.
(-0.5, 0.5)
(1.5, -1.5) (2.5, -2.5)
What is the difference between these three
groups?
How do we measure the length (distance)?
Pythagoras
Hypotenuse of a triangle
? = (4^2+3^2) = 25 = 5
(4,3)
5?
3
4
What is the difference between these three
groups?
Therefore, the point (4,3) is at a distance of 5 from
its starting point.
$5^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$ = sum of squares = variance × (n − 1)
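As a rough numerical check of the Pythagorean length, the sketch below computes the distance of (4,3) from its starting point and the lengths of the three two-datum groups shown earlier. The helper name `length` is an assumption made for this example.

```python
import math

def length(vector):
    """Euclidean length: square root of the sum of squared coordinates."""
    return math.sqrt(sum(c ** 2 for c in vector))

print(length((4, 3)))  # 5.0

# The three centered groups from the slides: more variability, longer arrow.
groups = [(-0.5, 0.5), (1.5, -1.5), (2.5, -2.5)]
lengths = [length(g) for g in groups]
print(lengths)
```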
What is the difference between these three
groups?
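What is the length of these three lines?
A) 1 dimension, side 1: length $\sqrt{1}$?
B) 2 dimensions, sides 1 and 1: length $\sqrt{2}$?
C) 3 dimensions, sides 1, 1 and 1: length $\sqrt{3}$?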
=> Dimensionality inflates the variability.
To obtain a measure that takes the dimensionality
into account, what do we
need to do?
What is the difference between these three
groups?
• We divide the squared length of the data set by its true dimensionality
$$\text{Variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
= (quadratic) distance (from the mean)
corrected by the (true) dimensionality of the
data.
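The idea of variance as a squared length corrected for dimensionality can be sketched as follows. The function name `variance_from_length` is hypothetical; the result is compared with `statistics.variance` from the Python standard library. The second part treats the number of coordinates of the unit lines (1), (1,1), (1,1,1) as their dimensionality, which is an assumption made only to illustrate the correction.

```python
import statistics

def variance_from_length(data):
    """Squared length of the centered vector divided by n - 1."""
    mean = sum(data) / len(data)
    sum_of_squares = sum((xi - mean) ** 2 for xi in data)
    return sum_of_squares / (len(data) - 1)

x = [16.0, 8.0]                      # data from the slides
var_geom = variance_from_length(x)   # 32 / 1
var_lib = statistics.variance(x)     # standard-library sample variance
print(var_geom, var_lib)             # 32.0 32.0

# Unit lines in 1, 2 and 3 dimensions: the squared length grows with the
# dimensionality, but dividing by the dimensionality gives the same value.
lines = [(1,), (1, 1), (1, 1, 1)]
corrected = [sum(c ** 2 for c in line) / len(line) for line in lines]
print(corrected)                     # [1.0, 1.0, 1.0]
```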
Normalization and standardization
Normalization vs Standardization
• To normalize is to rescale a given centered vector x (arrow, mean = 0) to a
length of 1.
• Normalization: $z = x / \lVert x \rVert$ (x divided by its length)
$z^T z = 1$
• Standardization: $z_x = x / SD$
$z_x^T z_x = n-1$
=> $z_x = z\sqrt{n-1}$
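A minimal sketch of the two rescalings, assuming a hypothetical three-datum group (16, 8, 12) that is centered first. It checks that the normalized vector has squared length 1, that the standardized vector has squared length n − 1, and that the two differ by a factor of the square root of n − 1.

```python
import math

x = [16.0, 8.0, 12.0]                 # hypothetical group, mean = 12
n = len(x)
mean = sum(x) / n
centered = [xi - mean for xi in x]    # [4.0, -4.0, 0.0]

length = math.sqrt(sum(c ** 2 for c in centered))
sd = length / math.sqrt(n - 1)        # sample standard deviation

z = [c / length for c in centered]    # normalization: divide by the length
zx = [c / sd for c in centered]       # standardization: divide by the SD

zz = sum(c ** 2 for c in z)           # squared length of z
zxzx = sum(c ** 2 for c in zx)        # squared length of zx
print(round(zz, 10), round(zxzx, 10))  # 1.0 2.0
```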
Two groups
One group of three participants
Two groups of three participants
Two groups of three participants
• They can be
represented by a
plane
• This is true
whatever the
number of
participants
Correlation and regression
Relation between two vectors
• If two groups (u and v) have the same data, then the two vectors are superimposed on
each other.
• As the two vectors diverge from each other, the angle between them increases.
Relation between two vectors
• If the angle reaches 90 degrees, then they share nothing in
common.
Relation between two vectors
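• The cosine of the angle is the coefficient of correlation
$$\cos\theta = \frac{u^T v}{\lVert u \rVert \, \lVert v \rVert} = \frac{\sum_{i=1}^{n} u_i v_i}{\lVert u \rVert \, \lVert v \rVert} = \frac{\mathrm{cov}_{uv}}{s_u s_v} = r_{uv}$$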
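The link between the cosine and the correlation can be checked numerically. The sketch below uses two hypothetical groups of four data, centers them, and computes the cosine of the angle and the usual covariance-based correlation; the helper names `center` and `dot` are assumptions made for this example.

```python
import math

def center(data):
    """Remove the mean so the vector starts at the origin."""
    mean = sum(data) / len(data)
    return [xi - mean for xi in data]

def dot(a, b):
    """Inner product of two vectors of equal length."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical data for two groups (not from the slides).
u = center([1.0, 2.0, 3.0, 4.0])
v = center([2.0, 1.0, 4.0, 3.0])
n = len(u)

# Geometric route: cosine of the angle between the centered vectors.
cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Statistical route: covariance divided by the product of the SDs.
cov_uv = dot(u, v) / (n - 1)
s_u = math.sqrt(dot(u, u) / (n - 1))
s_v = math.sqrt(dot(v, v) / (n - 1))
r_uv = cov_uv / (s_u * s_v)

print(round(cos_theta, 10), round(r_uv, 10))  # 0.6 0.6
```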
Relation between two vectors
• Regression: $\hat{v} = b_0 + b_1 u$
– The shortest distance (the error vector e) is
the one that meets
the vector u at 90°
[Figure: vectors u and v with labels e and b]
Relation between two vectors
• Regression: The formula to obtain the regression coefficients can be obtained
directly from the geometry
$u^T e = 0$
$u^T (v - u \, b_1) = 0$
$u^T v - u^T u \, b_1 = 0$
– By substitution, we can isolate the $b_1$ coefficient.
$u^T v = u^T u \, b_1$
$(u^T u)^{-1} (u^T v) = (u^T u)^{-1} (u^T u) \, b_1$
$(u^T u)^{-1} (u^T v) = 1 \cdot b_1 = b_1$
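The derivation can be verified on a small example. This sketch assumes two hypothetical centered vectors u and v, computes the slope as the inner product of u and v divided by the inner product of u with itself, and checks that the resulting error vector e is at 90° to u (their inner product is zero).

```python
def center(data):
    """Remove the mean so the vector starts at the origin."""
    mean = sum(data) / len(data)
    return [xi - mean for xi in data]

def dot(a, b):
    """Inner product of two vectors of equal length."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical data, centered so that the intercept drops out.
u = center([1.0, 2.0, 3.0, 4.0])
v = center([2.0, 1.0, 4.0, 3.0])

# With one predictor, (u'u)^-1 (u'v) is a ratio of two scalars.
b1 = dot(u, v) / dot(u, u)

# Error vector: what remains of v after removing its projection on u.
e = [vi - b1 * ui for ui, vi in zip(u, v)]

print(b1)  # 0.6
print(abs(dot(u, e)) < 1e-12)  # True
```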
If we generalize to any
situation (multiple, multivariate)
$B = (X^T X)^{-1} X^T Y$
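As an illustration of the general formula, the sketch below solves the normal equations for the smallest non-trivial case: a design matrix X with a column of ones (intercept) and one predictor. The function name `ols_two_columns` and the data are hypothetical, and the explicit 2×2 inverse is used only to keep the example free of external libraries; it is not meant as a general-purpose solver.

```python
def ols_two_columns(X, y):
    """Solve B = (X'X)^-1 X'y when X has exactly two columns."""
    # Entries of the symmetric 2x2 matrix X'X = [[a, b], [b, d]].
    a = sum(row[0] * row[0] for row in X)
    b = sum(row[0] * row[1] for row in X)
    d = sum(row[1] * row[1] for row in X)
    # Entries of the vector X'y.
    g0 = sum(row[0] * yi for row, yi in zip(X, y))
    g1 = sum(row[1] * yi for row, yi in zip(X, y))
    # Inverse of [[a, b], [b, d]] is (1 / det) * [[d, -b], [-b, a]].
    det = a * d - b * b
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]

# Hypothetical raw (uncentered) data.
u = [1.0, 2.0, 3.0, 4.0]
v = [2.0, 1.0, 4.0, 3.0]
X = [[1.0, ui] for ui in u]   # first column of ones carries the intercept

B = ols_two_columns(X, v)
print(B)  # [1.0, 0.6]
```

With the intercept column included, the slope matches the one obtained from the centered vectors, and the intercept equals the mean of v minus the slope times the mean of u.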