Ensemble Empirical Mode Decomposition

Noise Assisted Signal Analysis (NASA)
Part I: Preliminary
Zhaohua Wu and N. E. Huang, 2009: Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Advances in Adaptive Data Analysis, 1, 1-41.
Theoretical Foundations
• The intermittency test, though it ameliorates mode mixing, destroys the adaptive nature of EMD.
• The EMD study of white noise guarantees a uniform frame of scales.
• Added white noise cancels out given a sufficiently large ensemble (illustrated in the sketch below).
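
The cancellation claim is easy to demonstrate numerically: averaging M independent white-noise realizations shrinks the residual noise amplitude by a factor of $1/\sqrt{M}$. A minimal sketch, assuming NumPy; the array sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 4000                      # length of each noise realization

for n_ensemble in (1, 10, 100, 1000):
    # Average n_ensemble independent white-noise series.
    noise = rng.standard_normal((n_ensemble, n_points))
    residual = noise.mean(axis=0)
    # The residual standard deviation shrinks like 1/sqrt(n_ensemble).
    print(n_ensemble, residual.std(), 1 / np.sqrt(n_ensemble))
```

This $1/\sqrt{M}$ decay is what allows EEMD to add noise to break up mode mixing and then remove it again by ensemble averaging.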
Theoretical Background I
Intermittency
Sifting with Intermittency Test
• To avoid mode mixing, we have to institute a special criterion to separate oscillations of different time scales into different IMF components.
• The criterion is to select a time scale so that oscillations with time scales longer than this pre-selected value are not included in the IMF.
Observations
• The intermittency test ameliorates mode mixing considerably.
• The intermittency test requires a set of subjective criteria.
• EMD with an intermittency test is no longer totally adaptive.
• For complicated data, the subjective criteria are hard, or impossible, to determine.
Effects of EMD (Sifting)
• To separate data into components of similar scale.
• To eliminate riding waves.
• To make the results symmetric with respect to the x-axis and the amplitude more even.
– Note: The first two effects are necessary for a valid IMF; the last actually causes the IMF to lose its intrinsic properties.
Theoretical Background II
A Study of White Noise
Wu, Zhaohua and N. E. Huang, 2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London, A460, 1597-1611.
Methodology
• Based on observations from Monte Carlo numerical experiments on 1 million white-noise data points.
• All IMFs generated by 10 siftings.
• Fourier spectra based on 200 realizations of 4,000-data-point sections.
• Probability density based on 50,000-data-point sections.
A sketch of this setup follows.
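
One step of this Monte Carlo setup can be sketched with the third-party PyEMD package (`pip install EMD-signal`); the 10-sifting rule and the section length follow the slide, everything else is illustrative:

```python
import numpy as np
from PyEMD import EMD

rng = np.random.default_rng(42)
emd = EMD()
emd.FIXE = 10                        # stop each IMF after exactly 10 siftings

white_noise = rng.standard_normal(4000)   # one 4,000-point section
imfs = emd.emd(white_noise)               # rows are the extracted components
print(imfs.shape)
```

Repeating this over 200 independent sections and averaging the per-IMF Fourier spectra reproduces the spectra shown later.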
IMF Period Statistics

IMF   Number of peaks   Mean period (months)   Period (years)
 1         347042              2.881               0.240
 2         168176              5.946               0.496
 3          83456             11.98                0.998
 4          41632             24.02                2.000
 5          20877             47.90                3.992
 6          10471             95.50                7.958
 7           5290            189.0                15.75
 8           2658            376.2                31.35
 9           1348            741.8                61.75
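
The mean period in the table is simply the number of data points divided by the number of peaks in each IMF (e.g., 1,000,000 / 347,042 ≈ 2.881 for IMF 1). A small helper, written here for illustration with a peak defined as a local maximum:

```python
import numpy as np

def mean_period(imf: np.ndarray) -> float:
    """Data length divided by the number of local maxima (peaks)."""
    interior = imf[1:-1]
    n_peaks = int(np.sum((interior > imf[:-2]) & (interior > imf[2:])))
    return len(imf) / n_peaks
```

Applied to the rows of `imfs` from the sketch above, the mean periods should roughly double from one IMF to the next, which is the dyadic-filter-bank behavior visible in the table.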
Fourier Spectra of IMFs
[Figure: Fourier spectra of the nine IMFs; vertical axis: spectrum (×10⁻³).]
Shifted Fourier Spectra of IMFs
[Figure: The shifted Fourier spectra of the IMFs plotted against ln T; vertical axis: spectrum (×10⁻³).]
Empirical Observations: I
Mean energy of the n-th IMF:

$$\bar{E}_n = \frac{1}{N}\sum_{j=1}^{N} c_n^2(j)$$
Empirical Observations: II
Normalized spectral area is constant:

$$\int S_{\ln T,\,n}\; d\ln T = \text{const}$$
Empirical Observations: III
Total energy density:

$$\bar{E}_n = \int S_{\omega,\,n}\; d\omega$$

is the total energy density of the n-th IMF component.
Empirical Observations: IV
Computation of the mean period: with $\omega = 2\pi/T$, the energy can be written equivalently in terms of frequency, period, or log-period,

$$\bar{E}_n = \int S_{\omega,\,n}\, d\omega = \int S_{T,\,n}\, \frac{2\pi\,dT}{T^2} = 2\pi\int \frac{S_{\ln T,\,n}}{T}\, d\ln T,$$

and the mean period is the ratio of the (constant) spectral area to the weighted area:

$$\bar{T}_n = \frac{\int S_{\ln T,\,n}\; d\ln T}{\int T^{-1}\, S_{\ln T,\,n}\; d\ln T}$$
Empirical Observations: V
The product of the mean energy and the mean period is constant:

$$\bar{E}_n \bar{T}_n = \text{const}, \qquad \ln \bar{E}_n + \ln \bar{T}_n = \text{const}$$
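
This constancy can be checked numerically by combining the two quantities defined above; a sketch reusing `mean_period` from earlier (illustrative, not the paper's exact procedure):

```python
import numpy as np

def mean_energy(imf: np.ndarray) -> float:
    """Mean energy: (1/N) * sum of the squared IMF values."""
    return float(np.mean(imf ** 2))

# For white-noise IMFs, mean_energy(c) * mean_period(c) should come out
# roughly the same for every component c, i.e., ln E + ln T ~ const:
# products = [mean_energy(c) * mean_period(c) for c in imfs]
```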
Monte Carlo Result: IMF Energy vs. Period
Empirical Observation: Histograms of IMFs
By the central limit theorem, the IMFs should be normally distributed.
[Figure: Histograms of IMF modes 2 through 9.]
Fundamental Theorem of Probability
• If we know the density function of a random variable x, then we can express the density function of any random variable y given y = g(x). The procedure is as follows: solve for the roots $x_1, \dots, x_n, \dots$ of $y = g(x)$; then

$$f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} + \cdots + \frac{f_x(x_n)}{|g'(x_n)|} + \cdots$$

because $dy = g'(x_j)\,dx_j$; therefore, $dx_j = \dfrac{dy}{|g'(x_j)|}$.
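
As a concrete instance (the case needed on the next slide), take $y = g(x) = x^2$: the roots are $x_{1,2} = \pm\sqrt{y}$ and $g'(x) = 2x$, so

$$f_y(y) = \frac{f_x(\sqrt{y}) + f_x(-\sqrt{y})}{2\sqrt{y}}, \qquad y > 0,$$

which for a zero-mean normal $x$ yields exactly the density below.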
Fundamental Theorem of Probability
• If we know that the density function of a random variable x is normal, then x-squared should be distributed as

$$f(y) = \frac{1}{\sigma\sqrt{2\pi y}}\,\exp\left(-\frac{y}{2\sigma^2}\right) U(y),$$

where U(y) is the unit step function.
See: A. Papoulis: Probability, Random Variables, and Stochastic Processes, 1984, pp. 97-98.
Chi and Chi-Square Statistics
Given n identical, independent normal random variables with density

$$f(x_1, \dots, x_n) = \frac{1}{\left(\sigma\sqrt{2\pi}\right)^n}\,\exp\left[-\left(x_1^2 + \cdots + x_n^2\right)/2\sigma^2\right],$$

we have the random variables

$$\chi = \left(x_1^2 + \cdots + x_n^2\right)^{1/2}, \qquad y = \chi^2;$$

then the density for y with n degrees of freedom is

$$f(y) = a\, y^{-1 + n/2}\, \exp\left(-\frac{y}{2\sigma^2}\right) U(y)$$

with

$$a = \frac{1}{2^{n/2}\,\sigma^n\,\Gamma(n/2)}.$$

See: A. Papoulis: Probability, Random Variables, and Stochastic Processes, 1984, pp. 187-188.
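
A quick numerical sanity check of this density (with $\sigma = 1$, so $y = x_1^2 + \cdots + x_n^2$ is chi-square with n degrees of freedom), assuming SciPy is available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5                                             # degrees of freedom
samples = (rng.standard_normal((100_000, n)) ** 2).sum(axis=1)

# Chi-square theory: mean = n, variance = 2n.
print(samples.mean(), samples.var())              # ~5, ~10
print(stats.chi2(df=n).mean(), stats.chi2(df=n).var())
```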
CHI SQUARE-DISTRIBUTION OF ENERGY
[Figure: Energy histograms for IMF modes 2 through 9.]
Chi-Squared Energy Density Distributions
The probability density for $NE_n$, with $N\bar{E}_n$ degrees of freedom, should be

$$\rho(NE_n) \propto (NE_n)^{N\bar{E}_n/2 - 1}\, e^{-NE_n/2}.$$

Then, by the fundamental theorem of probability, we have

$$\rho(E_n) \propto N\,(NE_n)^{N\bar{E}_n/2 - 1}\, e^{-NE_n/2}.$$

Let us make a change of variable, $E = e^y$; then

$$\rho(y) \propto N\,(Ne^y)^{N\bar{E}_n/2 - 1}\, e^{-Ne^y/2}\, e^y = C \exp\left[\frac{N\bar{E}}{2}\left(y - \frac{e^y}{\bar{E}}\right)\right].$$
DEGREE OF FREEDOM
• A random sample of length N contains N degrees of freedom.
• Each Fourier component contains one degree of freedom.
• For EMD, each IMF's share of the degrees of freedom is proportional to its share of the energy; therefore, the degrees of freedom for each IMF are given as

$$f_n = N\bar{E}_n.$$
CHI SQUARE-DISTRIBUTION OF ENERGY
[Figure: Energy histograms for IMF modes 2 through 9, each overlaid with the fitted chi-square distribution.]

$$\rho(w_i) \propto w_i^{r_i/2 - 1}\, e^{-w_i/2}, \qquad w_i = NE_i, \quad r_i = N\bar{E}_i$$
Formula of Confidence Limit for IMF Distributions I
Introducing the new variable $y = \ln E$ (so $E = e^y$), it follows that

$$\rho(y) = N\,(Ne^y)^{N\bar{E}/2 - 1}\, e^{-Ne^y/2}\, e^y = C \exp\left[\frac{N\bar{E}}{2}\left(y - \frac{e^y}{\bar{E}}\right)\right],$$

where, writing $\bar{y} = \ln\bar{E}$ so that $e^y/\bar{E} = e^{y-\bar{y}}$, we can expand

$$e^{y-\bar{y}} = 1 + (y - \bar{y}) + \frac{(y-\bar{y})^2}{2!} + \frac{(y-\bar{y})^3}{3!} + \cdots$$
Formula of Confidence Limit for IMF Distributions II
With the new variable $y = \ln E$, it follows that

$$\rho(y) = C' \exp\left\{-\frac{N\bar{E}}{2}\left[\frac{(y-\bar{y})^2}{2!} + \frac{(y-\bar{y})^3}{3!} + \cdots\right]\right\}$$

with

$$C' = N^{N\bar{E}/2} \exp\left[-\tfrac{1}{2}\, N\bar{E}\,(1 - \bar{y})\right].$$
Formula of Confidence Limit for IMF Distributions III
When $|y - \bar{y}| \ll 1$, we can neglect the higher-power terms:

$$\rho(y) \approx C' \exp\left[-\frac{N\bar{E}}{2}\cdot\frac{(y-\bar{y})^2}{2!}\right],$$

with $C' = N^{N\bar{E}/2} \exp\left[-\tfrac{1}{2}\, N\bar{E}\,(1 - \bar{y})\right]$ as before.
Formula of Confidence Limit for IMF Distributions IV
For a given confidence limit $\alpha$, the corresponding variable $y_\alpha$ should satisfy

$$\int_{-\infty}^{y_\alpha} \rho(y)\, dy \Big/ \int_{-\infty}^{\infty} \rho(y)\, dy = \alpha.$$

For a Gaussian distribution, it is customary to relate $\alpha$ to the standard deviation $\sigma$; i.e., the $\alpha$ confidence level corresponds to $k\sigma$, where k varies with $\alpha$. For example, k takes the values -2.326, -0.675, 0, 0.675, and 2.326 for the 1st, 25th, 50th, 75th, and 99th percentiles ($\alpha$ = 0.01, 0.25, 0.5, 0.75, 0.99), respectively.
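
These k values are just standard normal quantiles and can be reproduced directly, e.g., with SciPy:

```python
from scipy import stats

# Standard normal quantiles for alpha = 0.01, 0.25, 0.5, 0.75, 0.99:
print(stats.norm.ppf([0.01, 0.25, 0.5, 0.75, 0.99]))
# -> approximately [-2.326, -0.674, 0.0, 0.674, 2.326]
```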
Formula of Confidence Limit for IMF Distributions V
When $|y - \bar{y}| \ll 1$, the distribution of $E_n$ is approximately Gaussian, with

$$\sigma^2 = \frac{1}{N\bar{E}_n/2} = \frac{2\bar{T}_n}{N}.$$

Therefore, for any given $\alpha$, in terms of k, we have

$$y_\alpha - \bar{y} = k\sigma = k\sqrt{\frac{2\bar{T}}{N}}.$$
Formula of Confidence Limit for IMF Distributions VI
Given $y_\alpha - \bar{y} = k\sigma = k\sqrt{2\bar{T}/N}$ and $\ln\bar{E} + \ln\bar{T} = 0$:
if we write $x = \ln\bar{T}$ and $y = \ln\bar{E}$ as defined before, then $\bar{y} = -x$; therefore

$$y_\alpha = -x + k\sqrt{\frac{2}{N}}\, e^{x/2}.$$

A pair of upper and lower bounds will be

$$y_\alpha = -x \pm k\sqrt{\frac{2}{N}}\, e^{x/2}.$$
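
In code, the significance test amounts to comparing each IMF's point $(\ln\bar{T}_n, \ln\bar{E}_n)$ against these bounds. A sketch under the formula above (the names `T_n`, `E_n`, and `N` are mine, standing for a candidate IMF's mean period, mean energy, and the data length):

```python
import numpy as np

def confidence_bounds(x: np.ndarray, N: int, k: float):
    """Upper/lower bounds y = -x +/- k * sqrt(2/N) * exp(x/2), x = ln(T)."""
    spread = k * np.sqrt(2.0 / N) * np.exp(x / 2.0)
    return -x + spread, -x - spread

# An IMF is significant at the level associated with k (k = 2.326 for the
# 99th percentile) if its energy lies above the upper bound:
# upper, lower = confidence_bounds(np.log(T_n), N, k=2.326)
# significant = np.log(E_n) > upper
```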
Confidence Limit for IMF Distributions
Data and IMFs: SOI
[Figure: Raw SOI record and its IMF components C1 through C9 plus the residual R, 1930 to 2000.]
Statistical Significance for SOI IMFs
IMFs 4, 5, 6, and 7 are statistically significant signals at the 99% level.
[Figure: SOI IMF energy vs. mean period against the 99% confidence bound; the period axis spans 1 month to 100 years.]
Summary
• Not all IMFs have the same statistical significance.
• Based on the white noise study, we have established a method to determine the statistically significant components.
• References:
Wu, Zhaohua and N. E. Huang, 2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London, A460, 1597-1611.
Flandrin, P., G. Rilling, and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11(2), 112-114.
Observations
• White noise consists of signals of all scales.
• EMD separates the scales dyadically.
• White noise provides a uniformly distributed frame of scales through EMD.
Different Approaches That Reach the Same End
Flandrin, P., G. Rilling, and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11, 112-114.
Flandrin, P., P. Gonçalvès, and G. Rilling, 2005: EMD Equivalent Filter Banks, from Interpretation to Applications. In: Introduction to Hilbert-Huang Transform and Its Applications, Ed. N. E. Huang and S. S. P. Shen, pp. 57-74. World Scientific, New Jersey.
Fractional Gaussian Noise
aka Fractional Brownian Motion
A continuous-time Gaussian process $x_H(t)$ is a fractional noise if it starts at zero, has zero mean, and has the correlation function

$$R(t,s) = E\left[x_H(t)\, x_H(s)\right] = \frac{\sigma^2}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right),$$

where H is a parameter known as the Hurst index with value in $(0, 1)$, and $\sigma$ is the rms value of $x_H(t)$.
If H = 1/2, the process is regular Brownian motion.
If H > 1/2, the process is positively correlated, or more red.
If H < 1/2, the process is negatively correlated, or more blue.
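
Given this correlation function, one direct way to synthesize such a process is a Cholesky factorization of the covariance matrix, sketched below (illustrative and O(n³); circulant-embedding methods are the standard fast alternative for long series):

```python
import numpy as np

def fbm_cholesky(n: int, H: float, sigma: float = 1.0, seed: int = 0):
    """Sample x_H(t) at t = 1..n from the covariance
    R(t, s) = (sigma^2 / 2) * (t^{2H} + s^{2H} - |t - s|^{2H})."""
    t = np.arange(1, n + 1, dtype=float)
    R = 0.5 * sigma**2 * (
        t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
        - np.abs(t[:, None] - t[None, :]) ** (2 * H)
    )
    L = np.linalg.cholesky(R + 1e-10 * np.eye(n))  # tiny jitter for stability
    rng = np.random.default_rng(seed)
    return L @ rng.standard_normal(n)

# H = 0.5 gives regular Brownian motion; H > 0.5 looks redder (smoother),
# H < 0.5 bluer (rougher).
x = fbm_cholesky(512, H=0.7)
```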
Examples
[Figures: Flandrin's results (six slides), and Flandrin's results for a delta function (two slides).]
Theoretical Background III
Effects of Adding White Noise
Some Preliminaries
• Robert John Gledhill, 2003: Methods for Investigating Conformational Change in Biomolecular Simulations. Ph.D. thesis, Department of Chemistry, University of Southampton.
• He investigated the effect of added noise as a tool for checking the stability of EMD.
Some Preliminaries
• His basic assumption is that the correct result is the one without noise:

$$\text{Discrepancy} = \frac{1}{M}\sum_{j=1}^{M}\left[\frac{1}{N}\sum_{t=1}^{N}\left(c_j^{\,p}(t) - c_j^{\,r}(t)\right)^2\right]^{1/2},$$

where $c_j^{\,p}(t)$ is the j-th IMF from the perturbed signal (signal + noise) and $c_j^{\,r}(t)$ is the j-th IMF from the original signal without noise.
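
The discrepancy measure translates directly into code; a minimal sketch assuming the two decompositions are stored as arrays of matching shape (M IMFs by N samples):

```python
import numpy as np

def discrepancy(imfs_perturbed: np.ndarray, imfs_reference: np.ndarray) -> float:
    """(1/M) * sum over IMFs of the RMS difference between the perturbed
    IMF c_j^p and the reference (noise-free) IMF c_j^r."""
    rms_per_imf = np.sqrt(
        np.mean((imfs_perturbed - imfs_reference) ** 2, axis=1)
    )
    return float(rms_per_imf.mean())
```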
Test Results
[Figure: Discrepancy distributions. Top: the whole data record perturbed; bottom: only 10% of the record perturbed.]
Observations
• They made the critical assumption that the unperturbed signal gives the correct results.
• When the amplitude of the added perturbing noise is small, the discrepancy is small.
• When the amplitude of the added perturbing noise is large, the discrepancy becomes bimodal.