Comparing Reputation Schemes for Detecting
Malicious Nodes in Sensor Networks
Partha Mukherjee & Sandip Sen
Department of Math & CS
University of Tulsa
Motivation
 ASSUMPTION : A network of sensors deployed for
sensing data over a region
 Correlation between data sensed at different nodes
 Correlation pattern may change over time
Colluding malicious nodes may attempt to subvert the data
reported by the sensor network
GOAL : Comparing the performances of the reputation
mechanisms used to detect malicious / erroneous nodes in
the network
Sensor Networks
 Monitor physical / environmental conditions
 Resource constraints
 Sensed/aggregated data reported back to Base station
 Susceptible to security breaches/compromise
Sensor Network Organization
Sensor field consists of nodes laid out on a grid
Nodes organized in a hierarchy
Assumption: time-varying data sensed by different nodes
are correlated
Example: Temperatures at different grid points over the
day
Schemes used to detect malicious nodes
Reinforcement learning
 Q-learning approach
Statistically grounded scheme:
 β-reputation approach
Discount factors: weights on past / present experiences
• Un-weighted
• Linear
• Exponential
Varying parameters:
 Patterns in the sensed data
 Delay of onset of malicious data
Detecting Malicious Nodes
Collect sufficient data when sensor network is operating
normally for mining correlation patterns



Use neural networks to model correlation between data sensed by
siblings in the sensor node hierarchy
The value sensed at any node is predicted from the values sensed by
its siblings
Offline training of the nets using back-propagation
Use learning techniques to discover patterns
Each malicious node adds a random offset in the range [0,]
to the reported value
Detecting Malicious Nodes
At each reporting time step, the error between the actual and
predicted data sensed by a node is calculated
This sequence of “errors” is used to incrementally update
the reputation of the node
Node labeled malicious if reputation falls below threshold
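A minimal sketch of the per-step check described on this slide. The slides do not define the relative error, so the definition below (absolute difference scaled by the predicted value) is an assumption, and the names relative_error and is_malicious are illustrative only.

```python
def relative_error(actual, predicted, eps=1e-9):
    """Relative error between a reported and a predicted reading.

    Assumed definition: |actual - predicted| / |predicted|.
    eps guards against division by zero when the prediction is 0.
    """
    return abs(actual - predicted) / max(abs(predicted), eps)


def is_malicious(reputation, threshold):
    """A node is labeled malicious once its reputation falls below the threshold."""
    return reputation < threshold
```

The predicted reading would come from the model trained on the sibling nodes; here it is simply passed in as a number.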
Detecting Malicious Nodes
Choose reputation threshold θ
For each node:
 Compute relative error at time t : ε_t
 Compute error statistic : φ(ε_t)
 Update reputations :
Q-learning : R_t^{QL} = (1 - α) · R_{t-1}^{QL} + α · φ(ε_t)
• Balance factor : α
β-reputation : R_t^{β} = (r_t + 1) / (r_t + s_t + 1)
• Cooperative response : r_t, non-cooperative response : s_t
– Un-weighted : r_t = Σ_{j=1}^{t} (1 - f(ε_j)), s_t = Σ_{j=1}^{t} f(ε_j)
– Linear : r_t = Σ_{j=1}^{t} (1 - f(ε_j)) · 1/(t - j + 1), s_t = Σ_{j=1}^{t} f(ε_j) · 1/(t - j + 1)
– Exponential : r_t = Σ_{j=1}^{t} (1 - f(ε_j)) · λ^{t-j+1}, s_t = Σ_{j=1}^{t} f(ε_j) · λ^{t-j+1}
Exponential discount factor : λ
Node is malicious if R_t^{QL} < θ or if R_t^{β} < θ
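A minimal sketch of the two reputation updates, not the authors' implementation. It assumes the error statistic is already scaled to [0, 1], that observation j at time t is weighted by 1 (un-weighted), 1/(t - j + 1) (linear) or lam**(t - j + 1) (exponential), and it uses the β-reputation ratio (r + 1) / (r + s + 1) as it appears on the slide. All function and parameter names are illustrative.

```python
def q_update(prev_rep, stat, alpha):
    """One Q-learning style step: (1 - alpha) * previous reputation + alpha * statistic."""
    return (1.0 - alpha) * prev_rep + alpha * stat


def discount_weights(t, scheme="unweighted", lam=1.0):
    """Weights for observations j = 1..t; the list is ordered oldest first."""
    ages = [t - j + 1 for j in range(1, t + 1)]  # newest observation has age 1
    if scheme == "unweighted":
        return [1.0] * t
    if scheme == "linear":
        return [1.0 / a for a in ages]
    if scheme == "exponential":
        return [lam ** a for a in ages]
    raise ValueError("unknown scheme: " + scheme)


def beta_reputation(f_values, scheme="unweighted", lam=1.0):
    """Beta-reputation from a history of error statistics f in [0, 1].

    r accumulates the cooperative share (1 - f), s the non-cooperative share f.
    """
    weights = discount_weights(len(f_values), scheme, lam)
    r = sum(w * (1.0 - f) for w, f in zip(weights, f_values))
    s = sum(w * f for w, f in zip(weights, f_values))
    return (r + 1.0) / (r + s + 1.0)
```

With lam = 1 the exponential weights all equal 1, so the exponential scheme reduces to the un-weighted one; smaller lam values make old observations fade faster.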
Experiment
Computation of sensed data
 Based on generation function : g
Model fluctuation
 Add Gaussian Noise : N
Variation of the sensed parameter is represented by the
stochastic function ƒ
 ƒ(x,y,t) = g(x,y) + h(t) + N(0, σ)
 h : T → [l, u]
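A minimal sketch of the data model above, assuming the last argument of N is a standard deviation. The slides only say that h maps time into [l, u]; the sinusoidal h built by make_h, along with its period parameter, is a hypothetical choice for illustration.

```python
import math
import random


def make_h(l, u, period):
    """Hypothetical time component: a sinusoid that stays within [l, u]."""
    def h(t):
        return l + (u - l) * (0.5 + 0.5 * math.sin(2.0 * math.pi * t / period))
    return h


def sensed_value(x, y, t, g, h, sigma, rng=random):
    """Spatial pattern g(x, y) + temporal drift h(t) + zero-mean Gaussian noise."""
    return g(x, y) + h(t) + rng.gauss(0.0, sigma)
```

For example, sensed_value(0.5, 0.5, 3, g=lambda x, y: (x + y) / 2, h=make_h(10.0, 20.0, 24), sigma=0.1) draws one noisy reading for the node at (0.5, 0.5) at time step 3.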
Experiment
Considered two generation functions g to generate data
patterns over the 85 node sensor network
g1 : exp(-(x^2 + y^2))
g2 : (x + y) / 2
Considered error-free time interval set
 D = {0,10,20,30,40,50}
Considered exponential discount factor set
  = {0.2,0.4,0.6,0.8}
Q-learning and -reputation Schemes with
Linear and Two Extreme Discount Factors
Q-learning scheme detects the erroneous nodes earlier than
-reputation for distribution exp(-(x2 + y2))
Q-learning and -reputation Schemes with
Linear and Two Extreme Discount Factors
Q-learning scheme detects the erroneous nodes earlier than
-reputation for distribution (x + y)/2
Comparison Between -Reputation
Schemes with Different discount factors
-reputation schemes of lower discount factors detects the
erroneous nodes earlier for distribution exp(-(x2 + y2))
Comparison Between -Reputation
Schemes with Different discount factors
-reputation schemes of lower discount factors detects the
erroneous nodes earlier for distribution (x + y)/2
Conclusions
Q-Learning is more efficient than β-Reputation for higher
values of initial error free time steps
β-Reputation is more efficient than Q-learning at detecting the
first malicious node when the initial delay of the attack is
between 0 and 4 iterations
Among β-Reputation schemes with discount factors,
schemes with lower discount values exhibit higher
efficiency. The un-weighted one (λ = 1) is least efficient
Combining learning with reputation management makes this
scheme work, with the following observations:
 All faulty nodes are detected (no false negatives)
 No normal node is labeled faulty (no false positives)
Future Work
Testing with different complex data patterns.
Testing with different topologies.
Exploring the possibility of developing a more robust scheme.
Handling sophisticated collusion.
 Hierarchical structure : nodes at a higher level may collude.
THANK YOU