Apweb220PPt - NEC Laboratories America

Download Report

Transcript Apweb220PPt - NEC Laboratories America

IntruMine: Mining Intruders in Untrustworthy
Data of Cyber-physical Systems
Lu-An Tang, Quanquan Gu, Xiao Yu, Jiawei
Han, Thomas La Porta, Alice Leung, Tarek
Abdelzaher, Lance Kaplan
2015/7/18
Outline
Introduction
Related Works
IntruMine Framework
Experiments
2015/7/18
DAIS UIUC
2
Introduction
Cyber Physical System: Integrate physical devices (e.g.,
sensors, cameras) with cyber components to form a
situation aware analytical system
One important application of CPS: monitor the
designated region and detect intruders on battlefields
Igloo White: Use UGS (Unattended Ground Sensors) to
detect enemy infantry and vehicle
REMBASS(REmotely Monitored BAttlefield Sensor
System): seismic-acoustic sensor, infrared sensor and
magnetic sensor
Ground Tactical System, etc
2015/7/18
DAIS UIUC
3
The Scenario: A Line in the Sand
Deploy a large set of simple, inexpensive sensors (“sand”) to
blanket a battlefield and obtain fine grained situational
awareness to see through the “fog of war”
2015/7/18
DAIS UIUC
4
The Intruder Mining Task
Basic requirement: Whether there is an intruder
in the region or not?
Advanced requirements:
How many intruders? Where are they?
What (type) is this intruder? The unarmed citizen,
the soldiers and the vehicles have different levels of
threat
2015/7/18
DAIS UIUC
5
Intruder Mining is a Difficult Task I
Untrustiworthy data: The sensors are passive sensors
(low cost and energy saving)
highly unreliable. Many previous studies report that the false
reading may be as high as 40%
No advanced knowledge: The users usually cannot
provide in advance the knowledge of the intruders e.g.,
energy(emitted signal strength), speed, number in
advance. In constrast, they would like to get such
informaiton from the system
2015/7/18
DAIS UIUC
6
Intruder Mining is a Difficult Task II
No training datasets: It is hard to manually label the
sensor data. The sensor data are collected with a
specific deployment plan. There is no common training
set/models
Large-scale data: The network may contain hundreds,
even thousands sensors, each sensor generates a record
in minutes, even seconds
2015/7/18
DAIS UIUC
7
Our Contribution
We propose a novel framework called IntruMine, to
find real intruders from untrustworthy sensor data
Use a monitoring graph model to construct the relationship
between intruders and sensors
Design a gradient algorithm to estimate the detailed
information of the intruders
Introduce the concepts of intruder confidence to verify the
detected intruders
In experiments on both real and synthetic dataset, show that
IntruMine yields higher precision and time efficiency than
existing methods
2015/7/18
DAIS UIUC
8
Outline
Introduction
Related Works
IntruMine Framework
Experiments
2015/7/18
DAIS UIUC
9
Intruder Detection in Sensor Network
Many works focus on the deployment of sensors,
minimize the sensor number and cost (optimization)
Many works focus on answer “whether there is an
intrusion or not” – not to compute the intruder details
Major metric:
Intrusion distance: how soon the intruder can be detected
Energy cost: to save energy and cost for deployment plan
2015/7/18
DAIS UIUC
10
Our Focus
IntruMine actually complements those technologies
The sensor network has already been deployed
The sensor are densely deployed
The CPS can receive the sensor data
Focus on sensor data analysis, not the physical level
2015/7/18
DAIS UIUC
11
Intruder Detection by Signal Processing
Detect the intruder location by Maximum Likelihood
algorithm(ML) /Particle filter of acoustic signal
Only single intruder
No untrustiworthy data (use synthetic datasets for
experiment)
Make several assumptions of the intruders, such as: the
energy and the speed of intruder are already known in
advance
Prove the ML method can be seen as a special case of
IntruMine framework
2015/7/18
DAIS UIUC
12
Sensor Localization by Data Mining
Learning intruder location from wifi, with the ideas of
transfer learning
Scenario is different:
The fixed nodes are the APs that send out signals, and the
sensors are moving and receiving signals
Locate the sensor with the received signals
Supervise learning or Semi-supervised learning (transfer
learning)
2015/7/18
DAIS UIUC
13
Outline
Introduction
Related Works
IntruMine Framework
Experiments
2015/7/18
DAIS UIUC
14
Problem Definition
Given a sensor network with the spatial locations of
each sensor, the sensor record set R, the task is:
estimate each intruder’s spatial locations
compute their energy to help indentify their types
Passive sensors: mechnism is known, r(s) = f(e(o), d(s,o))
s13=62
s12=62
s8=8
s3=61
s2=80
o1
s1=91
s6=55
o2
s14=23
s5=34
s17=6
s4=7
s16=9
2015/7/18
s9=43
s7=59
s10=42
s15=43
DAIS UIUC
intruder
sensor with
high reading
s11=29
sensor with
low reading
15
The IntruMine Framework
Two steps:
Rough Estimation
Find the break point: initialize the information of intruders
by the alarming sensors
find out the set of monitoring sensors for each intruder,
and build up their relationship with an monitoring graph
Detailed Estimation
Use a gradient algorithm to estimate the details
Verify the detected intruders
2015/7/18
DAIS UIUC
16
The Monitoring Graph
The CPS has a large set of deployed sensors, but not all
of them are relevant to intruder mining
Use the monitoring graph to select out the relevant
sensors and make a rough initialization for the
intruders
Algorithm:
Scan the reading set and find out the peak sensors
For each peak sensor, we initialize an intruder at its position
The intruder’s energy = the peak sensor’s reading
Calculate the monitoring sensor set for such intruder
2015/7/18
DAIS UIUC
17
Example of Monitoring Graph
s12=62
s3=61
s2=80
o3
s6=55
s1=91
o1
s4=7
s13=62
s8=8
s14=23
s10=42
g1
s15=43
g2
e1=91
s11=29
s16=9
o4
s1
s7
g3
e2=59
s2
s12
g4
e3=43
s13
o1
s4
o2
s10
2015/7/18
intruder
sensor with
high reading
sensor with
low reading
o2
s17=6
s5=34
s9=43
s7=59
s15
s16
o3
s14
DAIS UIUC
e4=62
o4
s17
18
The Problem of Monitoring Graph
s3=61
s2=80
s6=55
s1=91
o1
s4=7
s12=62
s13=62
o3
s7=59
s14=23
s9=43
intruder
sensor with
high reading
sensor with
low reading
o2
s17=6
s5=34
s8=8
s11=29
s10=42
s16=9
o4
s15=43
The
monitoring ggraph
localize theg3influence of false
g1
g4
2
s1 readings, but also
contain
somes12mistakes
s7
s15
e =91
e =59
e =43
e =62
1
s2
3
s13 intruder)
o3 is caused by false reading (ghost
o1
s4
2
o2
4
s16
o3
o4
s10 a normal sensorss
s17
o2 is linked with
1417, but o2 ’s actually
location is a little higher and it should not influence s17
(wrong location and energy)
2015/7/18
DAIS UIUC
19
Principle of Estimation
With the initialized intruders, the system can compute
the estimated readings for each sensor
The difference between the estimated readings and real
readings should be minimized
Using a gradient algorithm to adjust the intruder’s
information (spatial coordinates and energy) iteratively
2015/7/18
DAIS UIUC
20
Iteratively Computation
Let θ = (x1, y1, e(o1), x2, y2, e(o2), … xk, yk, e(ok))
θ at nth iteration is calculated from n-1th iteration
 
n
i
Algorithm:
n 1
i
n 1
1 

n 1
n i
Initialize the θ with the peak sensor’s information
Calculate the total reading difference
Adjust θ with above equation
Update the linked structure of monitoring graph
Repeat step 2-4 until stable (reading difference is less than
threshold)
2015/7/18
DAIS UIUC
21
Example: Estimate the Information
s13=62
o3
s3=61
s2=80
s1=91
intruder
s12=62
s9=43
s7=59
s6=55
s14=23
o1
s4=7
s17=6
s5=34
s16=9
g1
s1
s2
s3
s4
s5
s6
s8=8
o2
s10=42
s15=43
o4
g2
e1=113
2015/7/18
o1
s7
s9
s10
s11
s11=29
initialized
intruder
sensor with
high reading
sensor with
low reading
g3
e2=72
o2
s12
g4
e3=67
s15
e4=32
s13
s14
DAIS UIUC
o3
o4
s17
22
Verify the Intruders
Verify the intruders with the reading difference
If the deviation of estimated reading with respect to the true
reading is not within 3 standard deviations (i.e., too large
difference), then the intruder itself might be false
g1
s1
s2
s3
s4
s5
s6
g2
e1=113 s7
s9
s10
o1
s11
τ(o1)=0.81
2015/7/18
g3
e2=72
τ(o2)=0.94
o2
s12
g4
e3=67 s15
intruder
e4=32
s13
s14
o3
τ(o3)=0
DAIS UIUC
o4
s17
τ(o4)=0.46
false
intruder
sensor with
high reading
sensor with
low reading
23
Outline
Introduction
Related Works
IntruMine Framework
Experiments
2015/7/18
DAIS UIUC
24
Experiment Setting
Baseline: TruAlarm (TA) Maximum Likelihood
Estimation (ML)
Dataset: 1 Real dataset, 3 synthetic dataset
2015/7/18
DAIS UIUC
25
Experiment Result: Efficiency
IM costs only 10-20% time of ML and TA
2015/7/18
DAIS UIUC
26
Experiment Results: Precision and Recall
ML require the exact number of intruder as input
Only compare the precision and recall of TA and IM
2015/7/18
DAIS UIUC
27
Energy Error and Position Error
The accuracy of ML is highly influenced by the faulty
sensor data, while IM’s accuracy is high
2015/7/18
DAIS UIUC
28
Conclusion
We have introduced IntruMine to efficienctly and
effectively detect intruders from untrustiworthy sensor
data in cyber-physical system
Monitoring graph: building the relationship between sensors
and intruders
Gradient algorithm: estimate the detailed information
iteratively
Intruder Verification: Eliminate the ghost intruders according
the reading differences
Thank you very much!
Any Questions?
2015/7/18
DAIS UIUC
29