D 1 - Laboratorio di Geomatica

Download Report

Transcript D 1 - Laboratorio di Geomatica

Statistics and Probability
in Geosciences:
from Least Squares
to Random Fields
Fernando Sansó
DIIAR - Politecnico di Milano – Polo Regionale di Como
Athens - 11 May 2004
Statistics of Geosciences
Laplace
Legendre
Gauss
Poisson
Markov
From the availability of new
electronic HW, new impulses to
model theory and field theory
Statistics drifts away from its
origins, so much entangled with
geo-sciences (and astronomy)
Athens - 11 May 2004
Jeffreys
Baarda
Moritz
Krarup
Tarantola-Vallette
Geman-Geman
recognizing that Statistics is the art and science of
ambiguous knowledge,
I claim that the “whole probabilistic”, Bayesian,
point of view can be taken as a unifying foundation
for all spatial information sciences.
Athens - 11 May 2004
Scientific concepts are born of an abstraction process,
namely
when we observe natural phenomena, after eliminating
a multitude of tiny details, we can grasp the common
and regular elements on which axioms rules and laws
can be built.
(From Plato’s ideas () to Euclede’s ()
elements).
Athens - 11 May 2004
Examples
• a straight line; who has ever seen one?
• a Eucledean triangle: measuring its angles at the
astronomical level has become one of the means to
decide about the curvature of the universe;
• the Galilean equivalence principle, which emerges
from physical experiments only by abstracting from
friction, by assuming a constant gravity etc.
Athens - 11 May 2004
Psychologically one can understand why at the
beginning of modern science the incertitude was
considered as the enemy, classified as “measurement
error”.
This is how modern statistics was born from the very
beginning as “error theory”, based on a probabilistic
interpretation via the central theorem and used in an
inferential approach, to produce “best” estimates of the
parameters of interest, according to a proto-maximum
likelihood criterion.
Athens - 11 May 2004
Historical examples
• The astronomical measurement of Jupiter diameters
to test the hypothesis that its figure was ellipsoidal
and rotating around the minor axis.
• The geodetic measurements of arcs of meridians,
performed by the French Academy in France, in
Lapland and on the Andes, to measure the eccentricity
of the Earth.
Athens - 11 May 2004
One fundamental step in the development of the
understanding of statistics has been the clear
establishment of the so called Gauss-Markov linear
standard model, with all its developments in least
squares theory:
this is understood by explaining what are
• the deterministic model,
• the stochastic model.
Athens - 11 May 2004
The deterministic model in Gauss-Markov theory
(discrete and finite-dimensional):
every experiment can be described by n+m variables
organized in two vectors
• y (dim m ) measurable quantities
• x (dim m ) parameters;
these variables are deterministic, i.e. in principle they
can be given a fixed numerical value in the experiment
analysed, and they are related by geometric and
physical laws.
Athens - 11 May 2004
General mathematical form of the physics of the experiment
g (x , y )  0 .
From the observations themselves or from prior knowledge we
have approximate values x~, y~ and we put
x  x~   ,
y  y~  
after linearization we have
g (x~, y~) 
g
x
x~ ,y~

g
y
x~ ,y~
 0
and we assume to be able to solve for  .
In the end we have a linear model of observation equations
y  Ax  a .
Athens - 11 May 2004
Once linearized, the deterministic model has the
meaning that y cannot wander over the whole m, but
it is constrained to
V  y  A x  a , x  n   a  Range[A ]
a
V
Range [A]
Athens - 11 May 2004
The stochastic Model
(reduced to pure 2nd order information)
We assume now we observe the vector y , so that we
draw a vector Y0 from an m-dimensional variate:
Y0 ~Y  y  
( " errors" )
E { }  0  E {Y }  y
C    02Q (Q known)  CYY  C 
L.S. problem find yˆ on V, somehow related to Y0
Y0
yˆ
Athens - 11 May 2004
V
L.S. Principle
2
Let yˆ be given by:
yˆ  arg MinY 0  yˆ Q 
1
yˆV
 arg Min Y 0  yˆ  Q 1 Y 0  yˆ 

yˆV
Justification (Markov theorem)
Among all linear unbiased estimators
y~  HY  h
(y~ V , i.e. y~  A x~  a , x~  K Y  k
AK  I , Ak  a  0 )
putting
e~  y~  y , eˆ  yˆ  y
we have C eˆeˆ  C e~e~ .
Athens - 11 May 2004
By L.S. theory, complemented by suitable numerical
techniques, several very large geodetic problems have
been solved:
 adjustment of large geodetic networks
(N.A. datum ~ 40.000 parameters ~ 1980)
• satellite orbit control (from 1970)
• analytic photogrammetry
• discrete finite models of the gravity field (e.g. by buried
masses or by truncated spherical harmonics expansion:

T 
R
m
,
 1
R 
  T m Y m ( ,  )
r 
Athens - 11 May 2004
From L.S. theory new problems have evolved:
• testing theory as applied to
 Correctness of the model
(2 test on  02)
 Values of the parameters
(significance of input factors in linear regression analysis)
 Outliers identification and rejection
(Baarda’s data snooping) and the natural evolution towards
robust estimators (L1-estimators etc.)
Athens - 11 May 2004
Mixed models
with two types of parameters x (continuous), b (integers)
like with GPS observations where b are initial phase
ambiguities.
Note:
the numerical complexity if we adopt a simple trial and
error strategy for b; if we have a base line with 10 visible
satellites and for each double difference we want to try
3 values, we have to perform
39 ~ 20.000
adjustments.
Athens - 11 May 2004
Variance elements estimation or random effects model
When we don’t know C but we have a model
q
C     i2Q i
,
i 1
 i2 ? ;
this corresponds to the following stochastic model
q
Y  A x   Ai  i
i 1
E { i }  0 , E { i  j }   i2 ij
Ai Ai  Q i ,
when i are basically non observable (or hidden) random
parameters.
Athens - 11 May 2004
Examples (ITRF 2005)
We estimate 3N coordinated of Earth stations x1, x2, …, xN by
different spatial techniques (e.g. GPS, LR, etc.).
Each technique has a vector of adjusted coordinates in its own
reference frame
x 1k  1k
Y k      x k k
x Nk  Nk
where, with respect to a unified reference system,
x1
x k  T (k )x  T (k ) 
xN
and
C  k  k  C k   02k N k
Note: due to imperfect modelling, one can assume that the
estimate of  02k is unrealistic.
Athens - 11 May 2004
If we called all the equations we get
Y1
x1
Y    T (1 ,...,t )   
Yt
xN
we get
C  
0
t

k
 02k
1

0
0
Nk
0
0

0
and in the next ITRF, the IERS is going to estimate  02k
together with
x1
x  
xN
.
Athens - 11 May 2004
The Bayesian revolution
• Probability is an axiomatic index measuring the
subjective state of knowledge of a certain system,
• every system is thus described by a number of
random variables through their joint distribution,
• every observation modifies the state of knowledge of
the system, namely the distribution of the relevant
variables, through the operation of probabilistic
conditioning.
Athens - 11 May 2004
According to this vision, the physical laws are only
verified in the mean, when we average on a population
of effects that cannot be controlled, and which can be
described only by a probability distribution, expressing
our prior knowledge of the phenomenon.
(De Finetti)
Athens - 11 May 2004
Linear Bayesian Models
We start from observation equations
Y  AX  N
where all variables are random; X, N are the primitive variables
of the observation process and we assume them to be
independent and described by some prior distribution
p 0 X N (x , )  p 0 X (x ) p 0 N ( ) ;
Y, the variable we sample by observations, is a derived variable
with joint prior
p 0 X Y (x , y )  p 0 Y |X (y | x ) p 0 X (x )  p 0 N (y  A x ) p 0 X (x ) .
Athens - 11 May 2004
According to Bayes theorem the observation Y0 enters to
condition the X distribution, namely
p X |Y (x |Y 0 ) 
p 0 N (Y 0  A x ) p 0 X (x )
 p N (Y
0
0
 A ) p 0 X ( ) d 
(Posterior)
Example (random networks): we measure two distances of P
from known P1 and P2
P1
D1
P
D2
P2
Athens - 11 May 2004
Prior of P
We measure D1
P1
Effect of D1
D1
D1
P1
P2
P2
We measure D2
Effect of D1 and D2
P1
P1
D2
D1
Posterior
of P
D2
P2
P2
Athens - 11 May 2004
A general Bayesian network is a network where points
are random and measurements change their distribution;
in a sense (apart from Heisenberg’s principle) there is a
striking similarity with quantum theory.
Let us now restrict the Bayes concept to the linear
regression case.
Athens - 11 May 2004
Y  AX  N
Xˆ  LY  LA X  LN
,
eˆ  Xˆ  X  (LA  I ) X  LN
E 2  Tr C eˆeˆ  TrLA  I C X0 A L  I   LC N0 L 
Min E  L  C X A AC X A  C N 
2

0
0
0 1

The solution is then written as

Xˆ  C X A AC X A  C N  Y  A C N A  C X
0

0

0 1

0
1
0
1

1
1
A CN Y

0
Where we see that it is a combination of the observations
with the prior knowledge

E 0 {X }  0 , E 0 {X X }  C X0 .
Athens - 11 May 2004
Note: now we don’t have any more a rank deficiency
because C N0 > 0 for sure, so that we can have n > m and
and even
n = + ,
i.e. X is in reality a random field:
u (P )  n X n n (P )
Y i  Li {u (P )}
functionals of u

  X n Li { n (P )}
n 0
Athens - 11 May 2004
Examples
• Cartography: u(P) is DEM
L[u (P )]  u (P )
(point height)
• Image analysis: u(P) is density of flux through a sensor
element
L[u (P )]   u (P ) dS
Pixel
• Physical Geodesy: u(P) is the anomalous Earth potential
u 2
L[u (P )]  
 u
r r
Athens - 11 May 2004
(gravity anomaly)
Important remark: it is easy to prove that C X0 controls the
prior regularity of the field.
elevations profile
image profile
gravity profile
C X0 can be considered as a hyperparameter and estimated
through an infinite dimensional calculus (Malliavin Calculus).
Here statistics is fused with functional analysis to properly
define the space of estimators.
Athens - 11 May 2004
Conclusions
If modern statistics, which was born at the beginning
together
with
geodesy
and
astronomy
to
treat
measurement errors, has slowly drifted away to become
mate of mechanics, then of radio-signals analysis and
finally of economic sciences, nowadays we are entitled to
say that Earth sciences with their need of estimating
spatial fields, are giving to statistics a serious scientific
contribution, pushing it along the road of modern
probability theory and functional analysis.
Athens - 11 May 2004