Comparing Reputation Schemes for Detecting
Malicious Nodes in Sensor Networks
Partha Mukherjee & Sandip Sen
Department of Math & CS
University of Tulsa
Motivation
ASSUMPTION : A network of sensors deployed for
sensing data over a region
Correlation between data sensed at different nodes
Correlation pattern may change over time
Colluding malicious nodes may attempt to subvert the data
reported by the sensor network
GOAL : Compare the performance of the reputation
mechanisms used to detect malicious / erroneous nodes in
the network
Sensor Networks
Monitor physical / environmental conditions
Resource constraints
Sensed/aggregated data reported back to Base station
Susceptible to security breaches/compromise
Sensor Network Organization
Sensor field consists of nodes laid out on a grid
Nodes organized in a hierarchy
Assumption: time-varying data sensed by different nodes
are correlated
Example: Temperatures at different grid points over the
day
Schemes used to detect malicious nodes
Reinforcement learning
Q-learning approach
Statistically grounded scheme:
β-reputation approach
Discount factors: weights on past / present experiences
• Un-weighted
• Linear
• Exponential
Varying parameters:
Patterns in the sensed data
Delay of onset of malicious data
Detecting Malicious Nodes
Collect sufficient data when sensor network is operating
normally for mining correlation patterns
Use neural networks to model correlation between data sensed by
siblings in the sensor node hierarchy
The value sensed at any node is predicted from the values sensed by
its siblings
Offline training of the nets using back-propagation
Use learning techniques to discover patterns
Each malicious node adds a random offset in the range [0,]
to the reported value
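As a rough illustration of the sibling-based prediction described above, the sketch below trains a tiny one-hidden-layer network with plain back-propagation to predict one node's reading from the readings of two siblings. The slides do not specify the architecture, so the network size, learning rate, synthetic training data, and all names (`train_sibling_predictor`, `predict`, `mean_squared_error`) are assumptions made for illustration only.

```python
import math
import random

def train_sibling_predictor(samples, hidden=4, lr=0.05, epochs=300, seed=0):
    """Train a one-hidden-layer net: sibling readings -> target node reading.

    samples: list of (sibling_values, target_value) pairs.
    Returns the weights as a tuple (w1, b1, w2, b2).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in samples:
            # Forward pass: tanh hidden layer, linear output.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[k], x)) + b1[k])
                 for k in range(hidden)]
            out = sum(w2[k] * h[k] for k in range(hidden)) + b2
            err = out - y  # derivative of 0.5 * (out - y) ** 2 w.r.t. out
            # Backward pass: hidden gradient uses w2[k] before it is updated.
            for k in range(hidden):
                grad_h = err * w2[k] * (1.0 - h[k] ** 2)
                w2[k] -= lr * err * h[k]
                b1[k] -= lr * grad_h
                for i in range(n_in):
                    w1[k][i] -= lr * grad_h * x[i]
            b2 -= lr * err
    return w1, b1, w2, b2

def predict(model, x):
    """Predict the target node's reading from its siblings' readings."""
    w1, b1, w2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    return sum(w * hk for w, hk in zip(w2, h)) + b2

def mean_squared_error(model, samples):
    return sum((predict(model, x) - y) ** 2 for x, y in samples) / len(samples)

# Synthetic "normal operation" data: the target is the mean of two siblings.
data_rng = random.Random(1)
data = []
for _ in range(50):
    a, b = data_rng.uniform(0, 1), data_rng.uniform(0, 1)
    data.append(([a, b], (a + b) / 2))

untrained = train_sibling_predictor(data, epochs=0)  # same seed, no updates
trained = train_sibling_predictor(data, epochs=300)
```

Comparing `mean_squared_error(trained, data)` against the untrained baseline gives a quick check that training reduced the prediction error on the normal-operation data.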
Detecting Malicious Nodes
At each reporting time step error between actual and
predicted data sensed by a node is calculated
This sequence of “errors” is used to incrementally update
the reputation of the node
Node labeled malicious if reputation falls below threshold
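The per-step procedure above can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: the relative-error definition (absolute difference divided by the predicted value), the stand-in update rule passed as `update`, the initial reputation of 1.0, and the threshold value are all assumptions.

```python
def relative_error(actual, predicted, eps=1e-9):
    """Relative error between a reported value and its predicted value."""
    return abs(actual - predicted) / max(abs(predicted), eps)

def first_detection_step(errors, update, threshold, initial_reputation=1.0):
    """Return the first time step at which reputation drops below threshold.

    errors: sequence of relative errors, one per reporting time step.
    update: function (reputation, error) -> new reputation.
    Returns None if the node is never labeled malicious.
    """
    reputation = initial_reputation
    for step, err in enumerate(errors, start=1):
        reputation = update(reputation, err)
        if reputation < threshold:
            return step
    return None

# Hypothetical stand-in update: move reputation toward 1 - error (capped at 1).
def smooth_update(reputation, err):
    return 0.8 * reputation + 0.2 * (1.0 - min(err, 1.0))

predicted = [20.0, 20.0, 20.0, 20.0, 20.0]
reported = [20.0, 20.5, 30.0, 31.0, 29.5]   # offset appears at step 3
errors = [relative_error(a, p) for a, p in zip(reported, predicted)]
detected_at = first_detection_step(errors, smooth_update, threshold=0.85)
```

With these made-up readings the node is flagged one step after the offset begins, because a single bad report is not enough to pull the smoothed reputation under the threshold.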
Detecting Malicious Nodes
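Choose reputation threshold, $\theta$
For each node:
Compute relative error at time t : $\epsilon_t$
Compute error statistic : $\phi(\epsilon_t)$
Update reputations :
Q-Learning : $R_t^{QL} = (1 - \alpha) \cdot R_{t-1}^{QL} + \alpha \cdot \phi(\epsilon_t)$
• Balance factor : $\alpha$
β-Reputation : $R_t^{\beta} = (r_t + 1) / (r_t + s_t + 1)$
• Cooperative response : $r_t$, Non-cooperative response : $s_t$
– Un-weighted : $r_t = \sum_{j=1}^{t} (1 - f(\epsilon_j))$, $s_t = \sum_{j=1}^{t} f(\epsilon_j)$
– Linear : $r_t = \sum_{j=1}^{t} w_j \, (1 - f(\epsilon_j))$, $s_t = \sum_{j=1}^{t} w_j \, f(\epsilon_j)$, with weights $w_j$ linear in $j$
– Exponential : $r_t = \sum_{j=1}^{t} \lambda^{t-j} (1 - f(\epsilon_j))$, $s_t = \sum_{j=1}^{t} \lambda^{t-j} f(\epsilon_j)$
• Exponential discount factor : $\lambda$
Node is malicious : if $R_t^{QL} < \theta$ or if $R_t^{\beta} < \theta$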
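The two reputation updates can be sketched in a few lines. This is an illustration under stated assumptions: each time step is summarized by a hypothetical "badness" value in [0, 1] (0 when the report matches the prediction), the Q-learning reward is taken to be 1 minus that badness, and the β-reputation ratio uses the "+ 1" denominator shown on the slide. The function names and parameter values are made up for the example.

```python
def q_learning_update(reputation, reward, alpha=0.2):
    """One Q-learning-style step: blend the old reputation with a new reward.

    alpha is the balance factor between past reputation and the new reward.
    """
    return (1.0 - alpha) * reputation + alpha * reward

def discounted_evidence(badness, lam=1.0):
    """Accumulate cooperative (r) and non-cooperative (s) evidence.

    badness: per-step values in [0, 1]; 0 means the report looked normal.
    lam: exponential discount factor; lam = 1.0 gives the un-weighted sums.
    """
    r = s = 0.0
    for b in badness:
        # Multiplying by lam at every step weights step j by lam ** (t - j).
        r = lam * r + (1.0 - b)
        s = lam * s + b
    return r, s

def beta_reputation(badness, lam=1.0):
    """Reputation ratio (r + 1) / (r + s + 1) from discounted evidence."""
    r, s = discounted_evidence(badness, lam)
    return (r + 1.0) / (r + s + 1.0)

# Four normal reports followed by two clearly bad ones.
history = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
unweighted = beta_reputation(history, lam=1.0)   # r = 4, s = 2
discounted = beta_reputation(history, lam=0.5)   # old good reports fade

# Q-learning reputation after the same history, starting from 1.0.
q_rep = 1.0
for b in history:
    q_rep = q_learning_update(q_rep, reward=1.0 - b)
```

In this toy history the discounted reputation ends up lower than the un-weighted one, since earlier good behaviour counts for less once bad reports arrive.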
Experiment
Computation of sensed data
Based on generation function : g
Model fluctuation
Add Gaussian Noise : N
Variation of the sensed parameter is represented by the
stochastic function ƒ
ƒ(x,y,t) = g(x,y) + h(t) + N(0,σ)
h : T → [l, u]
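A minimal sketch of this data model follows. The slides only state that h maps time into an interval [l, u], so the sinusoidal form of h, its period, the reading of the noise parameter as a standard deviation, and all function names here are assumptions for illustration.

```python
import math
import random

def make_periodic_h(lower, upper, period):
    """Hypothetical h : T -> [lower, upper], a sinusoid with the given period."""
    def h(t):
        phase = 0.5 + 0.5 * math.sin(2.0 * math.pi * t / period)
        return lower + (upper - lower) * phase
    return h

def sensed_value(x, y, t, g, h, sigma, rng):
    """Sample g(x, y) + h(t) + Gaussian noise (sigma used as a std. deviation)."""
    return g(x, y) + h(t) + rng.gauss(0.0, sigma)

rng = random.Random(42)
h = make_periodic_h(lower=0.0, upper=1.0, period=24)

def plane(x, y):
    """Example spatial pattern g(x, y)."""
    return (x + y) / 2

# With sigma = 0 the sample is exactly g(x, y) + h(t).
noiseless = sensed_value(1.0, 2.0, 6, plane, h, sigma=0.0, rng=rng)
# One simulated day of noisy readings at a single grid point.
noisy_day = [sensed_value(1.0, 2.0, t, plane, h, sigma=0.05, rng=rng)
             for t in range(24)]
```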
Experiment
Considered two generation functions g to generate data
patterns over the 85-node sensor network
g1 : exp(-(x^2 + y^2))
g2 : (x + y) / 2
Considered error-free time interval set
D = {0,10,20,30,40,50}
Considered exponential discount factor set
λ ∈ {0.2, 0.4, 0.6, 0.8}
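The experimental settings listed above can be written down as a small parameter grid. This is only a sketch: the slides do not say that every combination was run, so the full cross product, the dictionary layout, and the names `DELAYS`, `DISCOUNTS`, and `configs` are assumptions.

```python
import itertools
import math

def g1(x, y):
    """First generation function: exp(-(x^2 + y^2))."""
    return math.exp(-(x ** 2 + y ** 2))

def g2(x, y):
    """Second generation function: (x + y) / 2."""
    return (x + y) / 2

PATTERNS = {"g1": g1, "g2": g2}
DELAYS = [0, 10, 20, 30, 40, 50]      # error-free time steps before the attack
DISCOUNTS = [0.2, 0.4, 0.6, 0.8]      # exponential discount factors

# Assumed full cross product of pattern, onset delay, and discount factor.
configs = [
    {"pattern": name, "delay": delay, "discount": lam}
    for name, delay, lam in itertools.product(PATTERNS, DELAYS, DISCOUNTS)
]
```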
Q-learning and β-reputation Schemes with
Linear and Two Extreme Discount Factors
Q-learning scheme detects the erroneous nodes earlier than
β-reputation for distribution exp(-(x^2 + y^2))
Q-learning and β-reputation Schemes with
Linear and Two Extreme Discount Factors
Q-learning scheme detects the erroneous nodes earlier than
β-reputation for distribution (x + y)/2
Comparison Between β-Reputation
Schemes with Different Discount Factors
β-reputation schemes with lower discount factors detect the
erroneous nodes earlier for distribution exp(-(x^2 + y^2))
Comparison Between β-Reputation
Schemes with Different Discount Factors
β-reputation schemes with lower discount factors detect the
erroneous nodes earlier for distribution (x + y)/2
Conclusions
Q-Learning is more efficient than β-Reputation for larger
numbers of initial error-free time steps
β-Reputation is more efficient than Q-Learning at detecting the first
malicious node when the initial delay of the attack is between
0 and 4 iterations
Among β-Reputation schemes with discount factors,
schemes with lower discount values exhibit higher
efficiency. The un-weighted one (λ = 1) is least efficient
The combination of learning and reputation management
makes the scheme effective, with the following observations:
All faulty nodes are detected (no false negatives)
No normal node is labeled faulty (no false positives)
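The two observations correspond to empty false-negative and false-positive sets. A small sketch of how these could be checked is shown below; the node identifiers and the function name `detection_errors` are hypothetical.

```python
def detection_errors(true_malicious, labeled_malicious):
    """Return (false_positives, false_negatives) as sets of node ids."""
    truth = set(true_malicious)
    labeled = set(labeled_malicious)
    false_positives = labeled - truth   # normal nodes labeled faulty
    false_negatives = truth - labeled   # faulty nodes that were missed
    return false_positives, false_negatives

# Perfect detection: both error sets are empty.
fp, fn = detection_errors(true_malicious={3, 7}, labeled_malicious={3, 7})

# One normal node flagged (9) and one faulty node missed (7).
fp_bad, fn_bad = detection_errors(true_malicious={3, 7}, labeled_malicious={3, 9})
```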
Future Work
Testing with different, more complex data patterns.
Testing with different topologies.
Exploring the possibility of developing a more robust scheme.
Handling sophisticated collusion.
Hierarchical structure : handling collusion among nodes at higher levels.
THANK YOU