RIDA: A Robust Information-Driven
Data Compression Architecture for
Irregular Wireless Sensor Networks
Nirupama Bulusu
(joint work with Thanh Dang, Wu-chi Feng)
Portland State University
http://www.cs.pdx.edu/~nbulusu
Data Compression
• Communication is expensive in dense WSNs
– Dominant energy consumer; limited network capacity
• Useful to compress sampled data before transmission
Objective
Design a robust architecture for
data compression and analysis in
sensor networks
Idea
• A sensor data map
can be viewed as
an image
• Can we apply
image compression
techniques to
sensor networks?
[Figure: sensor data map with X and Y axes]
Compression Challenges
Multiple data sources, limited memory
and computation
Irregular network topology
Unavailability of meta-information such
as sensor location
Frequent failures/missing sensor data
Network and environmental dynamics
Contributions
• Multiple data sources, limited power, memory and computation → Distributed data transformation
• Irregular network topology; unavailability of meta-information such as sensor location → Logical mapping
• Frequent failures/missing sensor data → Resiliency mechanism
• Network and environmental dynamics
RIDA: A Robust Information-Driven Data Compression Architecture
Outline
• Related Work
• Understanding Data Correlation
• RIDA: Robust Information-Driven Architecture
• Evaluation
• Conclusion and Future Work
(Partial) Related Work
• Source Coding
– Lempel-Ziv-Welch (S-LZW) (Sadler et al, Sensys 06)
– Individual node codes the source using LZW algorithm
– Incurs delay, is not robust to failure, and does not exploit spatial
correlation
• Channel Coding
– DISCUS (Pradhan et al, ISIT 03)
– Code the channel with side-correlated information to reduce
number of information bits
– No guarantees for optimal performance
(Partial) Related Work
• Transform Coding
– Fourier-Based Transform, Wavelet-Based Transform (Ciancio
and Ortega, ICASSP 2004; Raymond et al; Ganesan et al
2003), Random Projection (Candes and Tao, 2004)
– Rely on communication path, location, regularity of the
networks
– Most of these only focus on adapting the transformation to the
network, no guarantee of optimal performance, not robust to
failure
Understanding Data
Correlation
Physical Map
Light Reading Map
Sensors that are not spatial neighbors can report
correlated data
Understanding Data
Correlation
Physical Map
Voltage Reading Map
Sensors with similar voltage level tend to degrade together
regardless of changes in environmental condition
Correlation of data may be independent from external factors
such as location
Thesis
• To explore the correlation of sensor data,
examine the value of the data itself
(information)
– Correlation amongst sensor data can be obtained by
statistically observing the data values over a short
period of time
RIDA: A Robust Information-Driven Architecture
[Figure: RIDA pipeline]
Node id = 16 → Logical node id = (7, 7)
Compression: information-based logical mapping → resiliency mechanism → data transformation → quantization
Transmit only non-zero coefficients
Decompression: quantization⁻¹ → data transformation⁻¹ → remapping
RIDA: Logical Mapping
• Logical Mapping
– M is the mapping from sensor s to logical index (x,y) based on
• d(s), the data value of sensor s
• D, the set of data values of all sensors in the cluster
– Only a single-hop cluster is considered in this work
– The mapping is intended to be performed periodically
• Choosing M
– depends on specific applications and underlying basis
functions for data transformation
– Gradients with DCT
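A minimal sketch of one possible choice of M, assuming the sorted mapping mentioned later in the evaluation: sensors in a single-hop cluster are ranked by data value and the ranks are laid out row by row on a roughly square grid. The function name, the tie-breaking rule, and the sample readings are hypothetical, not taken from the talk.

```python
import math

def sorted_logical_mapping(readings):
    """Map sensor id -> logical index (x, y) by sorting on the data value d(s).

    readings: dict of sensor id -> data value for one single-hop cluster (the set D).
    The grid side is the smallest integer whose square holds all sensors (assumption).
    """
    if not readings:
        return {}
    side = math.isqrt(len(readings) - 1) + 1
    # Sort by value; ties are broken by sensor id so every node derives the same order.
    ordered = sorted(readings, key=lambda s: (readings[s], s))
    return {s: divmod(rank, side) for rank, s in enumerate(ordered)}

# Hypothetical readings for four sensors.
cluster = {16: 21.4, 3: 19.8, 7: 25.1, 9: 20.3}
mapping = sorted_logical_mapping(cluster)
```

Sorting places similar values next to each other, which produces the smooth gradient that a DCT represents with few coefficients.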
RIDA: Distributed Data
Transformation
• A node calculates only the coefficient
corresponding to its index
– E.g.: With 2-D mapping, for node (i, j) perform DCT
operations only on corresponding row i and column j
• Only non-zero coefficients are transmitted
• Flexible to work on logical indices
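A sketch of the per-node computation, assuming an orthonormal type-II DCT and assuming each node has overheard every reading in its single-hop cluster. Because the 2-D DCT is separable, coefficient (i, j) needs only the i-th basis function along one axis and the j-th along the other. The function names and the sample grid are illustrative.

```python
import math

def dct_basis(k, n):
    """k-th orthonormal DCT-II basis vector of length n."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n)]

def node_coefficient(grid, i, j):
    """2-D DCT coefficient (i, j), as the node with logical index (i, j) would compute it."""
    rows, cols = len(grid), len(grid[0])
    basis_i = dct_basis(i, rows)  # only basis function i is needed along the rows
    basis_j = dct_basis(j, cols)  # only basis function j is needed along the columns
    return sum(basis_i[x] * grid[x][y] * basis_j[y]
               for x in range(rows) for y in range(cols))

# Hypothetical 2 x 3 logical grid of readings.
grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
coeffs = [[node_coefficient(grid, i, j) for j in range(3)] for i in range(2)]
```

Each node evaluates one entry of `coeffs`; a node whose coefficient quantizes to zero has nothing to transmit.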
RIDA: Resiliency Mechanism
• Compression: project to [128,255]
• De-compression: classify values below a threshold as faulty
[Figure: original data and decompressed data on a 0 to 255 scale with marks at 64 and 128; legend: missing readings, normal readings]
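A sketch of how the projection and threshold could work together. Three points are assumptions rather than statements from the slide: the projection is linear, missing readings are encoded as 0 before compression, and the threshold is 64 (the mark on the figure axis). The small offsets added below stand in for lossy compression error.

```python
MISSING = None
LOW, HIGH = 128.0, 255.0   # projection range from the slide
THRESHOLD = 64.0           # assumed cut-off, taken from the 64 mark on the figure axis

def project(readings, lo, hi):
    """Scale normal readings from [lo, hi] into [128, 255]; missing ones become 0 (assumption)."""
    span = (hi - lo) or 1.0
    return [0.0 if r is MISSING else LOW + (r - lo) / span * (HIGH - LOW)
            for r in readings]

def recover(values, lo, hi):
    """Undo the projection; values below THRESHOLD are reported as faulty (None)."""
    return [MISSING if v < THRESHOLD else lo + (v - LOW) / (HIGH - LOW) * (hi - lo)
            for v in values]

readings = [20.0, MISSING, 30.0, 25.0]           # hypothetical sensor values
projected = project(readings, 20.0, 30.0)
# Stand-in for compression/decompression error on each value.
lossy = [v + err for v, err in zip(projected, [3.0, 5.0, -2.0, 1.0])]
restored = recover(lossy, 20.0, 30.0)
```

The gap between 0 and 128 leaves room for reconstruction error: a missing reading would have to drift by more than 64 before being mistaken for a normal one.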
Evaluation of RIDA
• Compression Performance
– Logical mapping (1D, 2D)
– Data transformation (DCT, Wavelets)
• Robustness
– Accuracy vs. number of faulty sensors
• Energy and Bandwidth Savings
Methodology
• Experiments on real world data
– Source: Intel Research, Berkeley
– 54 sensors, February 28th to April 5th, 2004
– Modified data set: sensor data is interpolated in time
– Real data set: sensor data is kept as original
Metrics: Compression Performance
• Compression Ratio
$$ r = \frac{n}{n'} $$
– n: number of nodes
– n': number of non-zero coefficients
• Normalized MSE
$$ e = \left( \frac{1}{n} \sum_{i=1}^{n} \left( \frac{d_i - o_i}{o_i} \right)^{2} \right)^{1/2} $$
– d_i: reconstructed value of sensor data
– o_i: original value of sensor data
– n: number of nodes
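A small sketch of both metrics, assuming the normalized MSE is the square root of the mean squared relative error over the n nodes (one interpretation of the slide's formula). The function names and sample numbers are hypothetical.

```python
import math

def compression_ratio(n, n_nonzero):
    """r = n / n', with n nodes and n' non-zero coefficients."""
    return n / n_nonzero

def normalized_error(reconstructed, original):
    """Root of the mean squared relative error between d_i and o_i (assumed form)."""
    n = len(original)
    total = sum(((d - o) / o) ** 2 for d, o in zip(reconstructed, original))
    return math.sqrt(total / n)

ratio = compression_ratio(54, 6)                       # 54 nodes, 6 coefficients kept
error = normalized_error([11.0, 18.0], [10.0, 20.0])   # each reading is off by 10%
```

The relative form requires every original value o_i to be non-zero.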
Compression Performance (Ideal Data)
[Figure: Normalized MSE (%) vs. Compression Ratio, with mapping vs. without mapping]
Logical (sorted) mapping gives lower error than no mapping for the same
data transformation scheme
Compression Performance on Real Data (Humidity)
[Figures: Compression Ratio and Normalized MSE (%) vs. Quantization Scale]
At a compression ratio of around 4:1, error is less than 5%
DCT is slightly better than Wavelet
Metrics: Error Detection
$$ \text{recall} = \frac{TP}{TP + FN} $$
$$ \text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$
TP-True Positive: # correctly classified healthy nodes
TN-True Negative: # correctly classified faulty nodes
FP-False Positive: # incorrectly classified healthy nodes
FN-False Negative: # incorrectly classified faulty nodes
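For concreteness, the two metrics can be computed directly from the four counts. The counts below are hypothetical and are not results from the talk.

```python
def recall(tp, fn):
    """Fraction of healthy nodes that were classified as healthy."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Fraction of all nodes, healthy or faulty, that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical outcome for a 54-node network.
tp, tn, fp, fn = 26, 24, 3, 1
node_recall = recall(tp, fn)
node_accuracy = accuracy(tp, tn, fp, fn)
```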
Error Detection
[Figures: Detection Accuracy (%) and Classification Recall (%) vs. Number of faulty nodes]
Even when half the nodes are missing, accuracy > 90%, recall > 97%
Conclusions
• RIDA: A Data Compression Architecture
– Time-slicing across multiple sensor data streams
– Information-driven approach maximally leverages
correlation
– Logical mapping decouples compression from physical
topology
– Resiliency mechanism provides robustness to data loss
– Adapted to DCT and Wavelet Transforms
• Results
– Compression ratios of 10:1 (ideal) and 4:1 (real) with less
than 5% error
– 90% accuracy, 97% recall even when half the network
data is missing.
Future Directions
• Appropriate system parameters for sensor
data
– projection range, quantization
• Energy Balancing
• System Deployments
• Non-scalar or high rate data
– vibration, audio and video
Thank You
Questions?
Backup Slides
RIDA: Deployment View
Performance in a Faulty Environment (Temperature)
[Figures: Compression Ratio and Normalized MSE (%) vs. Number of faulty nodes]
Even when half the nodes are missing, a compression ratio of 4:1 can be
achieved with less than 5% error
DCT results in much lower error at the same compression ratio
Metrics: Energy Savings
Benchmark:
$$ c_b = n (t_x + t_r) h $$
– n is the network size
– h is average hop count
– t_x, t_r are transmitting and receiving power
Compression using DCT transform:
$$ c_c = n (t_x + t_r + d) + n' h (t_x + t_r) $$
– d: cost to compress the data
– n': number of non-zero coefficients, n/n' ~ 20 for JPEG
Energy Saving:
$$ r_h = \frac{c_b - c_c}{c_b} = \frac{h (t_x + t_r)(n - n') - n (t_x + t_r + d)}{n (t_x + t_r) h} $$
Energy Savings
[Figure: RF Transmission Power vs. CPU Power Ratio; Ratio vs. Simulation Time (Virtual)]
[Figure: Energy Consumption by RF and CPU; Consumed Energy (mJ) vs. Simulation Time (Virtual)]
Power consumption for transmitting one packet is still 2.5 times that
of sensing and data compression
Energy Savings
Using distributed DCT compression
Bandwidth Savings: 80-90%
Energy Savings: 36% for 4-hop network
Energy Saving Using Compression (%)
Number of Hops        1       2       3       4       5
Interpolated data   -50.0    20.0    43.3    55.0    62.0
Real data           -68.8     1.4    24.7    36.4    43.4
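As a consistency check, the energy-saving formula from the backup metrics slide can be evaluated per hop count. The parameters below are assumptions: n/n' = 10 (the ideal-data compression ratio from the conclusions) and d = (t_x + t_r) / 2.5 (reading "transmitting one packet" as t_x + t_r). The absolute values of n, t_x and t_r are arbitrary because only their ratios matter.

```python
def energy_saving(h, n, n_nonzero, tx, tr, d):
    """Relative saving (c_b - c_c) / c_b for an average hop count h."""
    radio = tx + tr
    baseline = n * radio * h                               # c_b: every reading travels h hops
    compressed = n * (radio + d) + n_nonzero * h * radio   # c_c: one local exchange, then n' coefficients
    return (baseline - compressed) / baseline

# Assumed parameters: n / n' = 10 and d = (tx + tr) / 2.5.
savings = [round(100 * energy_saving(h, n=50, n_nonzero=5, tx=1.0, tr=1.0, d=0.8), 1)
           for h in range(1, 6)]
# savings -> [-50.0, 20.0, 43.3, 55.0, 62.0]
```

Under these assumptions the values match the "Interpolated data" row. The negative saving at one hop reflects that the in-cluster exchange and the compression cost are paid even when there is no multi-hop traffic to save.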