Time Series Analysis…

Biomedical Imaging 2
Class 8 – Time Series Analysis (Pt. 2); Image
Post-processing (Pt. 2)
03/20/07
BMI2 SS07 – Class 8 “Image Processing 2” Slide 1
Flowchart for Imaging Data Analysis
Measurement → Raw Data
→ Preprocessing, or preconditioning: filter, normalize, SNR threshold
→ Image Reconstruction
→ Postprocessing: time-series analysis (TSA) (FT, corr., SSS, GLM)
→ “Post-postprocessing”: integrate in space and/or time, define metrics
→ “Post-postpostprocessing”: develop metrics into diagnostic indicators
BMI2 SS07 – Class 8 “Image Processing 2” Slide 2
Time Series Analysis…
Definitions
• The branch of quantitative forecasting in which data for one variable are
examined for patterns of trend, seasonality, and cycle.
nces.ed.gov/programs/projections/appendix_D.asp
• Analysis of any variable classified by time, in which the values of the variable
are functions of the time periods. www.indiainfoline.com/bisc/matt.html
• An analysis conducted on people observed over multiple time periods.
www.rwjf.org/reports/npreports/hcrig.html
• A type of forecast in which data relating to past demand are used to predict
future demand. highered.mcgrawhill.com/sites/0072506369/student_view0/chapter12/glossary.html
• In statistics and signal processing, a time series is a sequence of data points,
measured typically at successive times, spaced apart at uniform time
intervals. Time series analysis comprises methods that attempt to understand
such time series, often either to understand the underlying theory of the data
points (where did they come from? what generated them?), or to make
forecasts (predictions). en.wikipedia.org/wiki/Time_series_analysis
BMI2 SS07 – Class 8 “Image Processing 2” Slide 3
Time Series Analysis…
Varieties
• Frequency (spectral) analysis
  – Fourier transform: amplitude and phase
  – Power spectrum; power spectral density
    • Auto-spectral density
  – Cross-spectral density
  – Coherence
• Correlation Analysis
  – Cross-correlation function
    • Cross-covariance
    • Correlation coefficient function
  – Autocorrelation function
  – Cross-spectral density
    • Auto-spectral density
BMI2 SS07 – Class 8 “Image Processing 2” Slide 4
Time Series Analysis…
Varieties
• Time-frequency analysis
  – Short-time Fourier transform
  – Wavelet analysis
• Descriptive Statistics
  – Mean / median; standard deviation / variance / range
  – Short-time mean, standard deviation, etc.
• Forecasting / Prediction
  – Autoregressive (AR)
  – Moving Average (MA)
  – Autoregressive moving average (ARMA)
  – Autoregressive integrated moving average (ARIMA)
    • Random walk, random trend
    • Exponential weighted moving average
BMI2 SS07 – Class 8 “Image Processing 2” Slide 5
Time Series Analysis…
Varieties
• Signal separation
  – Data-driven [blind source separation (BSS), signal source separation (SSS)]
    • Principal component analysis (PCA)
    • Independent component analysis (ICA)
    • Extended spatial decomposition, extended temporal decomposition
    • Canonical correlation analysis (CCA)
    • Singular-value decomposition (SVD): an essential ingredient of all
  – Model-based
    • General linear model (GLM)
    • Analysis of variance (ANOVA, ANCOVA, MANOVA, MANCOVA)
      – e.g., Statistical Parametric Mapping, BrainVoyager, AFNI
BMI2 SS07 – Class 8 “Image Processing 2” Slide 6
A “Family Secret” of Time Series Analysis…
• Scary-looking formulas, such as

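$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt,$
$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega,$
$F(\omega_x, \omega_y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{-i(\omega_x x + \omega_y y)}\, dx\, dy,$
$\mathcal{F}\{ f_1(x, y) * f_2(x, y) \} = F_1(\omega_x, \omega_y) \cdot F_2(\omega_x, \omega_y)$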
– Are useful and important to learn at some stage, but not really
essential for understanding how all these methods work
• All the math you really need to know, for understanding, is
– How to add: 3 + 5 = 8, 2 - 7 = 2 + (-7) = -5
– How to multiply: 3 × 5 = 15, 2 × (-7) = -14
• Multiplication distributes over addition
u × (v1 + v2 + v3 + …) = u×v1 + u×v2 + u×v3 + …
– Pythagorean theorem: a² + b² = c² (right triangle with legs a and b, hypotenuse c)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 7
A “Family Secret” of Time Series Analysis…
A most fundamental mathematical operation for time series analysis:
x_1, x_2, x_3, ..., x_N
× y_1, y_2, y_3, ..., y_N (element by element)
= z_1, z_2, z_3, ..., z_N, with z_i = x_i × y_i
z_1 + z_2 + z_3 + ... + z_N = Z
The x_i time series is measurement or image data. The y_i time series depends on what type of analysis we’re doing:
Fourier analysis: y_i is a sinusoidal function
Correlation analysis: y_i is a second data or image time series
Wavelet or short-time FT: non-zero y_i values are concentrated in a small range of i, while most of the y_i are 0.
GLM: y_i is an ideal, or model, time series that we expect some of the x_i time series to resemble
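The multiply-then-add operation described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sum_of_products`, the 8-sample cosine used as "data", and the two probe series are all hypothetical and not taken from the lecture.

```python
import math

def sum_of_products(x, y):
    """Multiply two equal-length series element by element, then add the products."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))

# Hypothetical data: one cycle of a cosine sampled at N = 8 points.
N = 8
x = [math.cos(2 * math.pi * i / N) for i in range(N)]

# Fourier-style probes: y is a sinusoid at the same or at a different frequency.
y_same = [math.cos(2 * math.pi * i / N) for i in range(N)]       # 1 cycle per record
y_other = [math.cos(2 * math.pi * 2 * i / N) for i in range(N)]  # 2 cycles per record

Z_same = sum_of_products(x, y_same)    # large when x resembles y
Z_other = sum_of_products(x, y_other)  # close to zero when it does not
```

Swapping the probe series for a second measured series, a localized wavelet, or a model response turns the same operation into correlation analysis, wavelet analysis, or a GLM-style comparison, respectively.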
BMI2 SS07 – Class 8 “Image Processing 2” Slide 8
Correlation Analysis
BMI2 SS07 – Class 8 “Image Processing 2” Slide 9
[Figure: Hb-oxy (top) and Hb-deoxy (bottom) time series; vertical scale ×10^-8, horizontal axis 0 to 3500]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 10
[Figure: Hb-oxy (top) and Hb-deoxy (bottom) time series, with the mean value and standard deviation marked on the Hb-oxy trace; vertical scale ×10^-8, horizontal axis 0 to 3500]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 11
[Figure: Hb-oxy (top) and Hb-deoxy (bottom) time series, with the mean value and standard deviation marked and a shift of +k indicated; vertical scale ×10^-8, horizontal axis 0 to 3500]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 12
[Figure: Hb-oxy (top) and Hb-deoxy (bottom) time series; vertical scale ×10^-8, horizontal axis 0 to 3500]
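The annotations on the preceding plots (mean value, standard deviation, shift of +k) suggest the usual recipe for a correlation coefficient function: remove the mean, shift one series by a lag k, multiply, add, and scale by the standard deviations. A minimal sketch follows, assuming the mean and SD are computed over the overlapping samples only (other conventions exist); the function name and the synthetic sine-wave data are hypothetical.

```python
import math

def cross_correlation(x, y, k):
    """Correlation coefficient between x[i] and y[i + k], using only the overlap."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if abs(k) > n - 2:
        raise ValueError("lag leaves fewer than two overlapping samples")
    # Pair x[i] with y[i + k]; samples without a partner are dropped.
    if k >= 0:
        xs, ys = x[:n - k], y[k:]
    else:
        xs, ys = x[-k:], y[:n + k]
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    dx = [a - mean_x for a in xs]                   # subtract the mean value
    dy = [b - mean_y for b in ys]
    covariance = sum(a * b for a, b in zip(dx, dy)) / m
    sd_x = math.sqrt(sum(a * a for a in dx) / m)    # standard deviation
    sd_y = math.sqrt(sum(b * b for b in dy) / m)
    return covariance / (sd_x * sd_y)               # undefined for a constant series

# Hypothetical example: y is x delayed by 3 samples.
n = 200
x = [math.sin(2 * math.pi * i / 50) for i in range(n)]
y = [math.sin(2 * math.pi * (i - 3) / 50) for i in range(n)]
ccf = {k: cross_correlation(x, y, k) for k in range(-10, 11)}
best_lag = max(ccf, key=ccf.get)
```

Plotting `ccf` against the lag gives a curve of the same kind as a cross-correlation vs. time delay plot; the lag at which it peaks estimates the delay between the two series.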
BMI2 SS07 – Class 8 “Image Processing 2” Slide 13
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 14
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 15
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 16
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 17
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 18
[Figure: Detector Reading (×10^-5) vs. Time (Sec), 0 to 800 (top); Cross-Correlation, -1 to 1, vs. Time Delay (Sec), -1500 to 1500 (bottom)]
BMI2 SS07 – Class 8 “Image Processing 2” Slide 19
Time-Frequency Analysis
BMI2 SS07 – Class 8 “Image Processing 2” Slide 20
Figure 9. Illustration of Morlet wavelet analysis concept. The complex wavelet (solid and dashed sinusoidal
curves denote real and imaginary part, respectively) shown in 9(a) is superimposed on the time-varying
measurement depicted in 9(b). A new function, equivalent to the covariance between the wavelet and measured
signal, as a function of the time point about which the wavelet is centered, is generated. (See Figure 10 for an
example of such a computation.) Varying the width of the wavelet, as shown in 9(c), changes the frequency
whose time-varying amplitude is computed.
BMI2 SS07 – Class 8 “Image Processing 2” Slide 21
Figure 10. Result of wavelet analysis (see Fig. 9) applied to (a) an unmodulated 0.1-Hz sine wave
and (b) a frequency-modulated 0.1-Hz sine wave. In 10(a) it is seen that the amplitude and frequency
both are constant over time, while in 10(b) it is seen that the amplitude is fixed but the frequency
varies.
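The computation described in Figures 9 and 10 can be sketched as a sliding sum of products between the signal and a complex Morlet wavelet. The following is a rough illustration, not the code used to produce the figures: the function name, the `n_cycles` width parameter, the truncation at three Gaussian widths, and the normalization (chosen so that a unit-amplitude sinusoid yields an amplitude near 1) are all assumptions.

```python
import cmath
import math

def morlet_amplitude(signal, dt, freq, n_cycles=5.0):
    """Time-varying amplitude of `signal` near `freq` (Hz); edges are left as NaN."""
    sigma = n_cycles / (2.0 * math.pi * freq)   # Gaussian width (s); wider = lower frequency
    half = int(round(3.0 * sigma / dt))         # truncate the wavelet at +/- 3 sigma
    wavelet = []
    for j in range(-half, half + 1):
        tau = j * dt
        envelope = math.exp(-tau * tau / (2.0 * sigma * sigma))
        wavelet.append(envelope * cmath.exp(2j * math.pi * freq * tau))
    norm = sum(abs(w) for w in wavelet)
    amplitude = [math.nan] * len(signal)
    # Center the wavelet on each time point in turn and take the sum of products.
    for center in range(half, len(signal) - half):
        segment = signal[center - half:center + half + 1]
        inner = sum(s * w.conjugate() for s, w in zip(segment, wavelet))
        amplitude[center] = 2.0 * abs(inner) / norm
    return amplitude

# Hypothetical test signal: unmodulated 0.1-Hz sine wave sampled every 0.5 s.
dt = 0.5
sig = [math.sin(2 * math.pi * 0.1 * i * dt) for i in range(400)]
amp_at_signal_freq = morlet_amplitude(sig, dt, 0.1)
amp_off_freq = morlet_amplitude(sig, dt, 0.3)
```

For the unmodulated sine wave the amplitude at 0.1 Hz stays close to 1 over time, while the amplitude at 0.3 Hz stays close to 0, in line with the constant-amplitude, constant-frequency case described for Figure 10(a).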
BMI2 SS07 – Class 8 “Image Processing 2” Slide 22
Data “Post-Post-Processing” and “Post-Post-Post-Processing”
BMI2 SS07 – Class 8 “Image Processing 2” Slide 23
Starting point: Time Series of Reconstructed Images
Image axes: Position × Time
Physiological parameters: 1) Hboxy, 2) Hbdeoxy, 3) Blood volume, 4) HbO2Sat
Analysis methods:
1. Temporal Averaging → Spatial Averaging
2. Spatial Averaging → Temporal Averaging
3. Wavelet Analysis
BMI2 SS07 – Class 8 “Image Processing 2” Slide 24
Method 1: Temporal Spatial Averaging
(IV)
Position
temporal
integration
Time
Spatial map of
temporal standard
deviation (SD)
drop position
information
Baseline temporal
mean is 0, by
definition
(III)
spatial
integration
100
mean
Hboxy
(II)
Hbdeoxy
0
sorted parameter value
SD
scalar
quantities
(I)
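Method 1 can be sketched as two reductions applied in sequence: first over time at each position, then over position. The layout `images[t][p]` (time point t, position p), the function name, and the use of the population SD are assumptions made for illustration; the slide does not specify them.

```python
from statistics import mean, pstdev

def temporal_then_spatial(images):
    """images[t][p]: parameter value at time point t and position p (Class IV)."""
    n_pos = len(images[0])
    # Temporal integration: one SD per position gives a spatial map (Class III).
    sd_map = [pstdev([frame[p] for frame in images]) for p in range(n_pos)]
    # Dropping position information: sorted parameter values (Class II).
    sorted_values = sorted(sd_map)
    # Spatial integration: scalar quantities (Class I).
    return sd_map, sorted_values, mean(sd_map), pstdev(sd_map)

# Hypothetical 4-time-point, 2-position image series with zero temporal mean.
images = [[0.0, 1.0], [0.0, -1.0], [0.0, 1.0], [0.0, -1.0]]
sd_map, sorted_values, spatial_mean, spatial_sd = temporal_then_spatial(images)
```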
BMI2 SS07 – Class 8 “Image Processing 2” Slide 25
Method 2: Spatial → Temporal Averaging
(IV) Image time series (Position × Time)
  → spatial integration →
(II) Time series of spatial mean → O2 demand / metabolic responsiveness
(II) Time series of spatial SD → Spatial heterogeneity
  → temporal integration →
(I) Scalar quantities:
  • Temporal mean of spatial mean time series: 0, by definition
  • Temporal SD of spatial mean time series
  • Temporal mean of spatial SD time series
  • Temporal SD of spatial SD time series
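Method 2 reverses the order of the reductions: each image is first collapsed over position, and the resulting time series are then summarized over time. As a sketch under the same kind of assumptions (layout `images[t][p]`, population SD, hypothetical function and key names):

```python
from statistics import mean, pstdev

def spatial_then_temporal(images):
    """images[t][p]: parameter value at time point t and position p (Class IV)."""
    # Spatial integration: one value per time point (Class II time series).
    spatial_mean_series = [mean(frame) for frame in images]
    spatial_sd_series = [pstdev(frame) for frame in images]
    # Temporal integration: scalar quantities (Class I).
    return {
        "temporal_mean_of_spatial_mean": mean(spatial_mean_series),
        "temporal_sd_of_spatial_mean": pstdev(spatial_mean_series),
        "temporal_mean_of_spatial_sd": mean(spatial_sd_series),
        "temporal_sd_of_spatial_sd": pstdev(spatial_sd_series),
    }

# Hypothetical 2-time-point, 2-position image series.
scalars = spatial_then_temporal([[1.0, 3.0], [-1.0, -3.0]])
```

In this toy example the spatial mean swings between +2 and -2 while the spatial SD stays at 1, so the temporal SD of the spatial mean is 2 and the temporal SD of the spatial SD is 0.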
BMI2 SS07 – Class 8 “Image Processing 2” Slide 26
Method 3: Time-frequency (wavelet) analysis
1. Starting point is reconstructed image time series (IV)
2. Use (complex Morlet) wavelet transform as a time-domain
bandpass filter operation
A. Output is an image time series (IV) of amplitude vs. time vs. spatial
position, for the frequency band of interest
B. Filtered time series can be obtained for more than one frequency
band
[Figure: filtered time series for frequency bands f1 and f2 vs. time]
3. Recompute previously considered Class-II and Class-I results,
using Methods 1 and 2, but starting with the wavelet amplitude
time series
BMI2 SS07 – Class 8 “Image Processing 2” Slide 27
Baseline GTC: Healthy Volunteer
Class IV results: normalized wavelet amplitude, right breast
[Figure: normalized wavelet amplitude (color scale 0.5 to 3.5) vs. Time Point (200 to 1200) and FEM mesh node (200 to 2200)]
Temporal coherence index = 25.7% (26.3% for left breast (not shown))
BMI2 SS07 – Class 8 “Image Processing 2” Slide 28
Baseline GTC: Ductal Carcinoma in Right Breast
Class IV results: normalized wavelet amplitude, left (-CA) breast
[Figure: normalized wavelet amplitude (color scale 0.4 to 2.2) vs. Time Point (200 to 1200) and FEM mesh node (200 to 2200)]
Temporal coherence index = 18.4%
Sharp, deep troughs are indicative of strong spatial coordination
BMI2 SS07 – Class 8 “Image Processing 2” Slide 29
Example 2: Ductal Carcinoma in Right Breast
Class IV results: normalized wavelet amplitude, right (+CA) breast
[Figure: normalized wavelet amplitude (color scale 0.4 to 2.2) vs. Time Point (200 to 1200) and FEM mesh node (200 to 2200)]
Temporal coherence index = 13.5%
Troughs (and peaks) appreciably reduced, or absent
BMI2 SS07 – Class 8 “Image Processing 2” Slide 30
Specificity and Sensitivity
                     Presence of Disease
Test Result     Disease (+)       Disease (–)
Test (+)        True Positive     False Positive
Test (–)        False Negative    True Negative

$\text{Sensitivity} = \frac{TP}{TP + FN}$: Given disease, what is the probability of a positive test result?
$\text{Specificity} = \frac{TN}{TN + FP}$: Given no disease, what is the probability of a negative test?
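As a small worked example of these two definitions, the sketch below tallies the four cells of the table from paired disease and test labels and then applies the formulas. The ten labelled cases are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(disease, test):
    """Count TP, FP, FN, TN from paired booleans (disease present, test positive)."""
    tp = fp = fn = tn = 0
    for has_disease, positive in zip(disease, test):
        if has_disease and positive:
            tp += 1
        elif has_disease:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    """Fraction of diseased cases with a positive test."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of disease-free cases with a negative test."""
    return tn / (tn + fp)

# Made-up example: 4 subjects with disease, 6 without.
disease = [True, True, True, True, False, False, False, False, False, False]
test = [True, True, True, False, True, False, False, False, False, False]
tp, fp, fn, tn = confusion_counts(disease, test)
sens = sensitivity(tp, fn)   # 3 / 4
spec = specificity(tn, fp)   # 5 / 6
```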
BMI2 SS07 – Class 8 “Image Processing 2” Slide 31
Predictive Values
                     Presence of Disease
Test Result     Disease (+)       Disease (–)
Test (+)        True Positive     False Positive
Test (–)        False Negative    True Negative

$\text{Positive PV} = \frac{TP}{TP + FP}$: Given positive test result, what is the probability of disease?
$\text{Negative PV} = \frac{TN}{TN + FN}$: Given negative test result, what is the probability of not having disease?
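The predictive values use the same four counts but condition on the test result instead of on disease status. A short sketch with made-up counts (the numbers and function names are illustrative only):

```python
def positive_predictive_value(tp, fp):
    """Fraction of positive tests that come from diseased cases."""
    return tp / (tp + fp)

def negative_predictive_value(tn, fn):
    """Fraction of negative tests that come from disease-free cases."""
    return tn / (tn + fn)

# Made-up counts for illustration.
tp, fp, fn, tn = 40, 5, 10, 45
ppv = positive_predictive_value(tp, fp)   # 40 / 45
npv = negative_predictive_value(tn, fn)   # 45 / 55
```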
BMI2 SS07 – Class 8 “Image Processing 2” Slide 32
ROC (Receiver Operating Characteristic) Analysis
[Figure: distributions of Metric Value for CA Subjects and Non-CA Subjects (Fraction of Population, ×10^-3, vs. Metric Value), with the Diagnostic Threshold marked; ROC curve of Sensitivity (%) vs. 100 - Specificity (%)]
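An ROC curve of this kind can be traced by sweeping the diagnostic threshold across the metric values and recording sensitivity against 1 - specificity at each setting. The sketch below assumes that larger metric values indicate cancer and uses the trapezoidal rule for the area; the function names and the eight metric values are hypothetical.

```python
def roc_points(ca_values, non_ca_values):
    """Sweep the threshold from high to low; a case tests positive if value >= threshold."""
    thresholds = sorted(set(ca_values) | set(non_ca_values), reverse=True)
    points = [(0.0, 0.0)]   # threshold above every value: nobody tests positive
    for thr in thresholds:
        tpr = sum(v >= thr for v in ca_values) / len(ca_values)           # sensitivity
        fpr = sum(v >= thr for v in non_ca_values) / len(non_ca_values)   # 1 - specificity
        points.append((fpr, tpr))
    return points

def area_under_curve(points):
    """Trapezoidal area under the (1 - specificity, sensitivity) curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical metric values for four CA and four non-CA subjects.
ca = [3, 4, 5, 6]
non_ca = [1, 2, 3, 4]
points = roc_points(ca, non_ca)
area = area_under_curve(points)
```

An area of 0.5 corresponds to a metric that does not separate the groups; an area well below 0.5 arises when the metric separates them in the opposite direction to the one assumed. The parenthesized numbers next to each Area on the ROC slides that follow equal |2 × Area − 1| in every case, although the slides do not name that quantity.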
BMI2 SS07 – Class 8 “Image Processing 2” Slide 33
ROC Curves for Metrics – 1
Area: 0.854 (0.708)
Area: 0.780 (0.560)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 34
ROC Curves – 2
Area: 0.786 (0.572)
Area: 0.665 (0.330)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 35
ROC Curves – 3
Area: 0.826 (0.652)
Area: 0.809 (0.618)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 36
ROC Curves – 4
Area: 0.818 (0.636)
Area: 0.800 (0.600)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 37
ROC Curves – 5
Area: 0.227 (0.546)
Area: 0.205 (0.590)
BMI2 SS07 – Class 8 “Image Processing 2” Slide 38
Summary of Calculated Metrics
Baseline Measurements
TMSSD
HbOXY
HbRED
TSDSM
XX
X
Valsalva
TSDSSD
SMTSD
Area
Height
Wavelet
X
XX
X
X
X
X
XX
X
X: 0.01 ≤ p < 0.05, for difference between Cancer and Non-Cancer Subjects
XX: p < 0.01
• Data reduction yielded 16 “metrics”
• Paired t-tests and ROC curves were used to select metrics that can distinguish between cancer and non-cancer subjects
• Selected metrics used in Logistic Regression
BMI2 SS07 – Class 8 “Image Processing 2” Slide 39
Logistic Regression
• Binary Distributions (Cancer vs. Non-Cancer) are non-linear
• Logistic regression expresses the log-odds of the event probability as a linear combination of “metrics” X_i and coefficients β_i:
$\ln\left( \frac{P(\text{cancer})}{1 - P(\text{cancer})} \right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots$
BMI2 SS07 – Class 8 “Image Processing 2” Slide 40
Logistic Regression Applied
1. Metrics calculated and selected based on t-tests & ROC curves
2. Metrics used as inputs into logistic regression model
3. Logistic regression model calculates β_i for each metric (X_i)
4. Using β_i, a predicted probability distribution can be created
5. New patient’s X_i used to generate probability of cancer in patient
[Figure: Probability vs. Metrics]
New Patient’s Values: X1 = 0.43; X2 = -0.05
Linear Model: P(cancer) = 0.75
Logistic Regression: P(cancer) = 0.90
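Once the coefficients have been fitted, turning a new patient's metrics into a probability only requires inverting the log-odds relation. The sketch below shows that step; the intercept, coefficients, and metric values are invented for illustration and are not the fitted values behind the numbers on this slide.

```python
import math

def predicted_probability(metrics, alpha, betas):
    """Invert ln(P / (1 - P)) = alpha + sum(beta_i * X_i) to obtain P."""
    if len(metrics) != len(betas):
        raise ValueError("need one coefficient per metric")
    log_odds = alpha + sum(b * x for b, x in zip(betas, metrics))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical intercept, coefficients, and metric values (not from the lecture).
alpha = -1.0
betas = [2.0, 0.5]
new_patient = [1.0, 2.0]
p_cancer = predicted_probability(new_patient, alpha, betas)   # log-odds = 2
```

Because the logistic function maps any log-odds value into the interval (0, 1), the result can always be read as a probability, which a straight-line fit of probability against the metrics does not guarantee.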
BMI2 SS07 – Class 8 “Image Processing 2” Slide 41
Limitations of Logistic Regression
• Metrics X_i must be independent of each other → orthogonalization may be needed
• Consequently, biologically relevant phenomenology may be ignored
by model
• Model may be mathematically unstable if the number of cases is low
BMI2 SS07 – Class 8 “Image Processing 2” Slide 42
Orthogonalization
• The logistic regression model excluded several metrics due to
inherent co-linearity (not all are linearly independent)
• Transforming excluded metrics to be orthogonal to each other
caused a loss of magnitude and of significance
• Result using orthogonalized metrics was very similar to original
result
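The slides do not say which orthogonalization procedure was used. One common choice is Gram-Schmidt, in which each metric has its projections onto the previously processed metrics removed. The sketch below is offered only as an illustration of that idea; the function names and the two nearly collinear metric vectors are hypothetical.

```python
def dot(u, v):
    """Sum of element-by-element products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(metrics):
    """Gram-Schmidt: remove from each metric its projections onto the earlier ones."""
    ortho = []
    for metric in metrics:
        v = list(metric)
        for u in ortho:
            uu = dot(u, u)
            if uu == 0:
                continue   # an all-zero vector contributes no projection
            coef = dot(v, u) / uu
            v = [a - coef * b for a, b in zip(v, u)]
        ortho.append(v)
    return ortho

# Hypothetical, nearly collinear metrics measured on four subjects.
m1 = [1.0, 2.0, 3.0, 4.0]
m2 = [2.0, 4.0, 6.0, 9.0]
o1, o2 = orthogonalize([m1, m2])
size_before = dot(m2, m2)   # squared magnitude of the original second metric
size_after = dot(o2, o2)    # much smaller once the shared component is removed
```

The second metric keeps only the part that is not already explained by the first, so its magnitude shrinks sharply when the two are nearly collinear, which matches the loss of magnitude noted on this slide.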
BMI2 SS07 – Class 8 “Image Processing 2” Slide 43
Final Result
• The final predicted probabilities were established by averaging the predicted probabilities for the N = 21 and N = 37 results
• Predicted probabilities for patients within the N = 37 group and not in the N = 21 group were unchanged

Combined Metrics (N = 21 & N = 37)
Sensitivity   0.93
Specificity   0.96
PPV           0.93
NPV           0.96
BMI2 SS07 – Class 8 “Image Processing 2” Slide 44