Applications of change point
detection in Gravitational Wave
Data Analysis
Soumya D. Mohanty
AEI
Plan of the talk
• Brief introduction to change point detection
and its relevance to GW data analysis
• Contrast with prevalent methods
• Three applications in different areas
26/3/03
UT Brownsville
What is a change point?
Signals and Change points
• The most elementary signature of a signal is to
introduce a change in the distribution of data
• Isolating a subset of given data that is
significantly different from the rest is the most
general signal detection method
• This division is subject to statistical
uncertainty
Mathematical Statement
• Data described by a joint probability density p(x).
• CP detection: Can the data be divided into disjoint
sets y, z (x = y ∪ z), such that p(y) is different from
p(z)? It is not required to know p(y) or p(z)
themselves.
• Adaptive detection: Somehow deduce or estimate
a noise model p(x). Then, given new data y, test if it
could have come from p(x).
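The two statements above can be contrasted in a minimal Python sketch. The function names, the choice of a permutation test on the difference of means, and the Gaussian noise model are all assumptions made for illustration; they are not taken from the talk.

```python
import random
import statistics

def permutation_pvalue(y, z, n_perm=2000, seed=0):
    """Change point style test: is y distributed differently from z?

    Needs no noise model: the observed difference of means is compared
    with its distribution under random relabelling of the pooled data.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(y) - statistics.fmean(z))
    pooled = list(y) + list(z)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:len(y)])
                   - statistics.fmean(pooled[len(y):]))
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def adaptive_pvalue(y, noise_mean, noise_sigma):
    """Adaptive style test: could y have come from the assumed noise p(x)?

    Requires prior information (here: Gaussian noise, known mean and sigma).
    """
    z_score = (statistics.fmean(y) - noise_mean) / (noise_sigma / len(y) ** 0.5)
    # Two-sided Gaussian tail probability.
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z_score)))

# Toy data (hypothetical): pure noise versus noise with a shifted mean.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(200)]

p_cp = permutation_pvalue(noise, shifted)                 # uses both samples only
p_adaptive = adaptive_pvalue(shifted, noise_mean=0.0, noise_sigma=1.0)
```

The first test only compares the two subsets with each other; the second is only as good as the assumed noise mean and sigma.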
Pros and Cons
Change point detection
• Can go from full prior information to no prior
information
• Less sensitive
• Possible to tune away response to different types
of inhomogeneity
• Post-analysis definition required of what is a
signal and what is noise
Adaptive detection
• Needs prior information and assumption of
stationarity
• More sensitive provided prior information is correct
• Tuning is a complicated process, if at all possible
• Signal & noise pre-defined
Applications
• Change point detection in the time-frequency plane – burst detection
• Change point detection in a multivariate
time series – Data/Detector Characterization
Robot
• Two sample comparison – GRB-GW
association
Bursts in time-frequency plane
• Time frequency plane – arena for burst
detection
• Example: split time series into segments and FFT
each one.
• Basic signature of a burst: changes the
distribution of samples in some region of the
time-frequency plane.
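The segment-and-FFT example above can be sketched as follows. This is a toy illustration only: the function name, segment length, and injected sinusoid are hypothetical, and a naive DFT from the standard library stands in for a real FFT routine.

```python
import cmath
import math
import random

def tf_power_map(series, seg_len):
    """Split `series` into non-overlapping segments and return, for each
    segment, the power in every non-negative frequency bin (naive DFT)."""
    n_seg = len(series) // seg_len
    tf_map = []
    for s in range(n_seg):
        seg = series[s * seg_len:(s + 1) * seg_len]
        row = []
        for k in range(seg_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n, x in enumerate(seg))
            row.append(abs(coeff) ** 2 / seg_len)
        tf_map.append(row)
    return tf_map

rng = random.Random(0)
seg_len = 32
data = [rng.gauss(0.0, 1.0) for _ in range(16 * seg_len)]

# Hypothetical burst: a short sinusoid in frequency bin 8, segment 5 only.
for n in range(seg_len):
    data[5 * seg_len + n] += 3.0 * math.sin(2 * math.pi * 8 * n / seg_len)

tf_map = tf_power_map(data, seg_len)      # rows: time, columns: frequency
burst_power = tf_map[5][8]
typical_power = sorted(row[8] for row in tf_map)[len(tf_map) // 2]
```

The burst changes the sample distribution only in one small region of the map (segment 5, bin 8); everywhere else the samples look like noise.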
• Most burst detection algorithms look for this
effect in different ways
• Excess power: thresholds the average (=band
limited rms)
• Tfclusters: thresholds cluster size
• PSDCD (Mohanty, PRD,’99): tests for
difference in sample distributions of blocks in
TF plane.
• PSDCD is a change point detector, others are
adaptive detectors.
Non-parametric CP detection
• Non-parametric detection: the false alarm rate is
independent of noise distribution by construction.
Sets it apart from other burst detectors.
• A non-stationary time series can be thought of as a
sequence of transitions from one noise model to
another (e.g. 1 10...). A non-parametric
detector should maintain a constant false alarm
rate even for non-stationary noise.
• CP detection can be tuned to prevent triggering on
known technical features.
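One way to see why a rank-based two-sample test is distribution free is sketched below, using the two-sample Kolmogorov-Smirnov statistic. The helper names are hypothetical, and the threshold is the standard asymptotic approximation, not a value quoted in the talk.

```python
import math
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical distribution functions of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical distribution functions past the value x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_threshold(n, m, alpha=0.05):
    """Asymptotic threshold at false alarm probability alpha.
    It depends only on the sample sizes, not on the noise distribution."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

rng = random.Random(0)
block_a = [rng.gauss(0.0, 1.0) for _ in range(300)]
block_b = [rng.gauss(1.5, 1.0) for _ in range(300)]

d_raw = ks_statistic(block_a, block_b)
# A monotone map (Gaussian -> log-normal) changes the distribution of the
# samples but not their ordering, so the statistic is unchanged.
d_warped = ks_statistic([math.exp(x) for x in block_a],
                        [math.exp(x) for x in block_b])
```

Because the statistic depends only on the ordering of the pooled samples, the same threshold applies whatever the underlying noise distribution is.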
KSCD
• Power Spectral Density Change Detector (PSDCD)
[DMT Monitor]
• Kolmogorov-Smirnov test based Change
Detector (KSCD)
• KSCD: improvement in detection efficiency
and implementation
Trial run on GEO S1 data
• Uncalibrated h(t). 3.47 days (some breaks).
• Plagued by fast non-stationarity in the <1.5kHz
band.
• 90% - 95% of MTFC triggers could be attributed
to this fast non-stationarity.
• These false triggers skew the interpretation of
histograms such as the time interval between
triggers.
• KSCD can be tuned to be insensitive to these
features but still catch “genuine” glitches.
Rejection of features
Analysis goals
• Disentangle fast low frequency non-stationarity
from “genuine” triggers.
• Study time dependent behavior of the triggers.
Study trigger rate vis-à-vis band limited rms trend.
• Does KSCD trigger rate track band limited rms?
Tune KSCD to reject triggers but catch fast non-stationarity
• Analyze the dependence of “genuine” trigger channel
on fast non-stationarity channel.
Trigger rate
Future of KSCD
• Test various aspects of non-parametric change
point detection using real data (S1
GEO/LIGO, S2 LIGO)
• Understand efficiency (very preliminary:
40% of matched filtering)
• Build LDAS DSO
• KSCD: Main engine of DCR
Data/Detector Characterization Robot (DCR)
• All channels: view data as a single multivariate
time series
• Transform the multivariate data. Example: construct
cross-correlation of two channels
• Detect change points
• Database design
• Data mining
Data Characterization
What is the best analysis strategy given some data?
• Quantify
• non-stationarity of noise floor
• Types and rates of transients
• Drifting carrier frequencies
• Simulate real data and do Monte Carlo
studies
• Hopefully, lead to more believable detection
of GW signals.
Detector Characterization
• Hunt down sources of deviations from
expected ideal behavior and fix them
• To help, interferometers blindly record data
from several other sensors
• control system
• environment monitors (e.g., temperature)
• Seismometers, magnetometers
Change Points
Mathematical abstraction of the problem
• Main interest in both data and detector
characterization: change points
• Example: transients, change in rate of transients,
non-stationarity, change in coupling between two
channels
• Natural conclusion: build database of change
points using automated algorithms and analyse
the database
Analysis of databases
• Exploratory
• Limited to small databases of high confidence
detections
• Data mining
• Emerging field of synthesis between statistics
and computing – aim is to detect new,
informative patterns in huge databases
• Requires reliable database quality
DCR project
• Overall Aim: enable data mining of multichannel interferometric data
• Elements:
• Algorithms – few, well understood and
complementary (not an arbitrary set of
independent simple monitors)
• Software/Hardware
• Data mining
Algorithms in DCR
• Change point detector – KSCD
• generalized to the case of cross-spectral density
of two channels
• Line removal – MBLT
• no modeling required of line behavior
• transient resistant
• Robust noise floor tracking – MNFT
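Why a robust estimator matters for noise floor tracking can be shown with a toy example. The sketch assumes a running median as the robust estimator; the transcript does not describe the internals of MNFT or MBLT, so this may differ from the actual algorithms, and all names below are hypothetical.

```python
import statistics

def running_median(values, window):
    """Median over a window centred on each sample
    (near the edges, only the available samples are used)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(statistics.median(values[lo:hi]))
    return out

def running_mean(values, window):
    """Same windowing as running_median, but with the (non-robust) mean."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(statistics.fmean(values[lo:hi]))
    return out

# Toy band-limited power series: flat floor of 1.0 with one loud transient.
power = [1.0] * 50
power[25] = 100.0

median_floor = running_median(power, 11)   # ignores the transient
mean_floor = running_mean(power, 11)       # pulled up by the transient
```

The median estimate stays on the floor through the transient, while the mean estimate is biased by it for the whole window length.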
Sample Power Spectral Density
DCR implementation
• Core Digital Signal Processing library in
C++
• Template based Statistics and Signal Processing
library (TSSP). Uses STL.
• FFT, Filtering, Filter Design, Windows, PSD,
Modulation, Demodulation, ...
• Stand alone C++ main function for a given
pipeline
Stand alone code
• Frame reading class
• Multiple ADC channels
• Database IO class (uses MySQL)
• Database to be used for both job description
and storing job outputs
• Multiple jobs launched using Condor
• At present: dedicated 10 node cluster
(Linux-alpha)
GRB-GW association
• Finn, Mohanty, Romano, PRD, 1999
• Based on two sample comparison
• on-source sample
• off-source sample
• Two sample tests also used in CP detection
Introduction to Gamma-Ray
Bursts
• High-energy, short-duration
electromagnetic radiation from
extra-galactic sources
• Favored models point to
exploding fireball
• Involve large amounts of matter,
• ejected at relativistic speeds,
• producing a series of high-energy E/M shockwaves:
• initially gamma-rays (some
redshift to lower-energy
gamma-rays or X-rays, others
are absorbed),
• then X-rays (red-shifted to
optical wavelengths),
• then visible light (red-shifted to
radio wavelengths)
http://online.itp.ucsb.edu/online/gamma_c99/piran/oh/06.html
GRBs and Gravitational Waves
• GRB progenitors thought to be newly formed
Black Holes
• Black Hole formed as a result of massive
stellar collapse or binary NS mergers
• BH accretes debris rapidly
• Leads to beams of ultra-relativistic ejecta
• This violent scenario is a natural candidate
for strong GW emission also
Motivation for an FMR type
search
• GRBs occur at cosmological distances. Hence
chance of detecting GWs from an individual GRB
is small
• However, GRB astronomy is very active
• Relatively large number of events were detected
(~O(1/day)) by BATSE
• Several more missions coming up soon (e.g., SWIFT
and GLAST)
• FMR: Combine information from several triggers
to build up signal to noise ratio
Algorithm
• Cross-correlate time series between two
interferometers for each GRB trigger
• time shift segments to align GW signal
• Compare cross-correlation to times not
associated with GRBs
• Build an on-source and an off-source sample of
cross-correlations
• Test if the mean values of the two samples are
significantly different
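The steps above can be sketched on simulated data. Everything here is a toy under stated assumptions: white Gaussian detector noise, a known time delay between the two interferometers, and Welch's t statistic as one possible two-sample test of the means (the transcript does not specify the test). All function names and parameters are hypothetical.

```python
import math
import random
import statistics

def cross_correlation(h1, h2, shift=0):
    """Sum of h1[i] * h2[i + shift] (shift >= 0), i.e. the cross-correlation
    after advancing h2 by `shift` samples to align a common signal."""
    return sum(a * b for a, b in zip(h1, h2[shift:]))

def welch_t(on, off):
    """Welch's two-sample t statistic for a difference of means."""
    v_on, v_off = statistics.variance(on), statistics.variance(off)
    return ((statistics.fmean(on) - statistics.fmean(off))
            / math.sqrt(v_on / len(on) + v_off / len(off)))

def simulate_pair(rng, n, delay, amplitude):
    """Two detector segments with independent noise and, if amplitude > 0,
    a common random signal that reaches detector 2 `delay` samples later."""
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    h1 = [rng.gauss(0.0, 1.0) + amplitude * signal[i] for i in range(n)]
    h2 = [rng.gauss(0.0, 1.0)
          + (amplitude * signal[i - delay] if i >= delay else 0.0)
          for i in range(n)]
    return h1, h2

rng = random.Random(0)
n, delay = 256, 3

# On-source: one segment pair per (hypothetical) GRB trigger, weak signal.
on_source = [cross_correlation(*simulate_pair(rng, n, delay, 0.3), shift=delay)
             for _ in range(50)]
# Off-source: segment pairs at times not associated with any GRB.
off_source = [cross_correlation(*simulate_pair(rng, n, delay, 0.0), shift=delay)
              for _ in range(200)]

t_stat = welch_t(on_source, off_source)
```

In this toy, the signal in any single on-source segment is comparable to the noise in the cross-correlation, but the shift in the sample mean becomes significant once many triggers are combined.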
Implementation
• External Triggers subgroup of Bursts Upper
Limit group
• S. Marka, R. Rahkola, S. Mohanty, S.
Mukherjee, R. Frey
• Could not apply FMR in toto for S1 because
only one trigger received during double
lock (LIGO tech note)
• Already have 15 triggers for S2!
Issues
• Non-stationarity of data
• Data conditioning – line removal
• Noise floor tracking -- MNFT
• Lack of directional accuracy
• Use H1+H2 – but strong (non-stationary?) correlations
• How to best use multiple interferometers
• Systematic uncertainties
• Rely on signal injection and Monte Carlo simulations
• DCR – simulate real data?
Summary
• Applications of change point detection in
GW data analysis
• Exploration of such techniques has only just
started
• Offers better control on data analysis with
real, complicated data
• Improvements in efficiency possible. Can
be combined with adaptive methods.