Transcript Slide 1

DETECTION OF ATTACKS ON
COGNITIVE CHANNELS
Annarita Giani
Institute for Security Technology Studies
Thayer School of Engineering
Dartmouth College
Hanover, NH
Berkeley, CA
October 12, 2006
Outline
1. Motivation and Terminology
2. Process Query System (PQS) Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Malware and Detection
Malware timeline:
• 1940: Von Neumann studied self-reproducing mathematical automata
• 1945: Grace Hopper. MIT - First Computer Bug
• 1951: Von Neumann demonstrated how to create self-reproducing automata
• 1959: Penrose: Self-reproducing machines
• ~1960: Stahl reproduces Penrose idea in machine code on an IBM 650
• 1970: Computer viruses on ARPANET
• 1982: First virus in the wild
• 1988: Morris worm
• 1990s: Malicious programs exploit vulnerabilities in applications and operating systems; Web Defacements (FOCUS OF MOST SECURITY WORK)
• 1999: Melissa virus, damage = $80 M
• 2001: Code Red worm, damage = $2 B
• now: Phishing Attacks, Misinformation, Covert Channel, Multi Stage Attacks, Exfiltration of information (OUR FOCUS)
Detection timeline:
• 70s: System Admins directly monitor user activities
• Late 70s - early 80s: System Admins review audit logs for evidence of unusual behavior
• Programs analyze audit logs, usually at night
• 1990s: First Commercial Antivirus
• 1991: Norton Antivirus released by Symantec
• 90s: Real time IDS
THEORETICAL WORK
• 1972: J.P. Anderson, Computer Security Technology Planning Study, ESD-TR-73-51, ESD/AFSC, Bedford, MA
• 1984: D. Denning, An Intrusion Detection Model, IEEE Transactions on Software Engineering, Vol. SE-13(2)
• 1988: M. Crosby, Haystack Project, Lawrence Livermore Laboratories
• 1989: Stalker, a Commercial Product from the Haystack Project. First HIDS
• 1990: L. Heberlein et al., A Network Security Monitor, Symposium on Research in Security and Privacy. First NIDS
• 1994: Netranger, from ASIM (Air Force). First Commercial NIDS
Intrusion Detection Systems (IDS) are mainly based on signature matching and anomaly detection.
Cognitive Channels
A cognitive channel is a communication channel between the
user and the technology being used. It conveys what the user
sees, reads, hears, types, etc.
Cognitive
Channel
Network
Channel
SERVER
CLIENT
USER
Focus of the current protection
and detection approaches
The cognitive channel is the weakest link in the whole framework.
Little investigation has been done on detecting attacks on this channel.
Cognitive Attacks
Our definition is from an engineering point of view.
Cognitive attacks are computer attacks over a cognitive channel.
They exploit the attention of the user to manipulate her
perception of reality and/or gain advantages.
COGNITIVE HACKING. The user’s attention is focused on the channel. The
attacker exploits this fact and uses malicious information to mislead her.
COVERT CHANNELS. The user is unaware of the channel. The attacker uses
a medium not perceived as a communication channel to transfer information.
PHISHING. The user's attention is attracted by the exploit. The information is
used to lure the victim into using a new channel and then to create a false
perception of reality with the goal of exploiting the user’s behavior.
Cognitive Hacking
The user's attention is focused on the channel. The attacker exploits
this fact and uses malicious information in the channel to mislead her.
1. Attacker: Makes a fake web site
2. Misleading information from a web site
3. Victim: Acts on the information from the web site
4. Attacker: Obtains advantages from user actions
Covert Channels
The user is unaware of the channel. The attacker uses a medium not perceived as a communication channel to transfer information.
1. Attacker: Codes data into inter-packet delays, taking care to avoid drawing the attention of the user.
2. User: Does not see the inter-packet delay as a communication channel and does not notice any communication.
Phishing
The user's attention is attracted by the exploit. The information is
used to lure the victim into using a new channel and then to create a
false perception of reality with the goal of exploiting the user’s
behavior.
1. Send a fake email: a misleading email to get user attention ("Visit http://www.cit1zensbank.com")
2. The user visits the bogus web site
3. The user enters First name, Last name, Account number, SSN
4. The bogus web site hands First name, Last name, Account number, SSN to the attacker
Why current IDS cannot be applied to
attacks on cognitive channels
• Sophistication of attack approaches.
• Increasing frequency and changing nature of attacks.
• Inherent limits of network-based IDS.
• Inability to identify attackers’ goals.
• Inability to identify new attack strategies.
• No guidance for response.
• Often simplistic analysis.
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Process Query System
[Diagram: observable events coming from sensors, together with models, feed the PQS ENGINE; its tracking algorithms produce hypotheses.]
Framework for Process Detection
FORWARD PROBLEM: An Environment consists of Multiple Processes (l1 = router failure, l2 = worm, l3 = scan) that produce Events in the Real World over Time, that are seen as Unlabelled Sensor Reports.
INVERSE PROBLEM: Process Detection (PQS) resolves the Unlabelled Sensor Reports into Hypotheses (Hypothesis 1: Track 1, Track 2, Track 3; Hypothesis 2: Track 1, Track 2; …) with Track Scores.
The Hypotheses are used for control and for Indicators and Warnings that detect complex attacks and anticipate the next steps, for example:
129.170.46.3 is at high risk
129.170.46.33 is a stepping stone
......
[Figure: Sample Console plotting Track Scores (0 to 1) against Time (0 to 200), labelled Service Degradation.]
Hierarchical PQS Architecture
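[Diagram: two-tier PQS hierarchy.]
TIER 1: separate PQS instances, each with its own TIER 1 Models, turn sensor Events (TIER 1 Observations) into TIER 1 Hypotheses. The Tier 1 models are Scanning, Infection, Data Access and Exfiltration; the event sources are Snort, IP Tables, Tripwire, Samba, and the Flow and Covert Channel Sensor.
TIER 2: the TIER 1 Hypotheses become TIER 2 Observations for a PQS with More Complex Models (TIER 2 Models), which produces the TIER 2 Hypotheses (RESULTS).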
Hidden Discrete Event System Models
Dynamical systems with discrete state spaces that are:
Causal - next state depends only on the past
Hidden – states are not directly observed
Observable - observations conditioned on hidden state are
independent of previous states
Example. Hidden Markov Model
N States
M Observation symbols
State transition Probability Matrix, A
Observation Symbols Distribution, B
Initial State Distribution, π
HDESM models are general
HDESM Process Detection Problem
Identifying and tracking several (causal discrete state) stochastic
processes (HDESMs) that are only partially observable.
TWO MAIN CLASSES OF PROBLEMS
Hidden State Estimation: Determine the “best” hidden states
sequence of a particular process that accounts for a given
sequence of observations.
Discrete Source Separation: Determine the “most likely”
process-to-observation association
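The hidden state estimation problem for a single HMM can be sketched with the standard Viterbi algorithm. This is a minimal illustration, not the PQS implementation: the two states ("normal", "attack"), the two observation symbols and every probability below are hypothetical values chosen only to make the example run.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for symbol in obs[1:]:
        step = {}
        for s in states:
            # Pick the predecessor r that maximizes P(best path to r) * A[r][s].
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            step[s] = (prob * emit_p[s][symbol], path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])


# Hypothetical model: N = 2 states, M = 2 observation symbols.
STATES = ("normal", "attack")
PI = {"normal": 0.9, "attack": 0.1}                      # initial distribution
A = {"normal": {"normal": 0.9, "attack": 0.1},           # state transitions
     "attack": {"normal": 0.2, "attack": 0.8}}
B = {"normal": {"quiet": 0.9, "alert": 0.1},             # observation symbols
     "attack": {"quiet": 0.3, "alert": 0.7}}

prob, path = viterbi(["quiet", "alert", "alert"], STATES, PI, A, B)
print(path, prob)
```

Source separation is the harder of the two problems because the observations must first be assigned to processes; the sketch above assumes that assignment is already known.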
Discrete Source Separation Problem
HDESM Example (HMM):
3 states + transition probabilities
n observable events: a,b,c,d,e,…
Pr( state | observable event ) given/known
Observed event sequence:
….abcbbbaaaababbabcccbdddbebdbabcbabe….
Catalog of
Processes
Which combination of which process models “best” accounts for the observations?
Events not associated with a known process are “ANOMALIES”.
An analogy....
What does
hbeolnjoulor
mean?
Events are:
hbeolnjoulor
Models = French + English words (+ grammars!)
hbeolnjoulor = hello + bonjour
Intermediate hypotheses include tracks:
ho + be
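The analogy can be made concrete with a small sketch that searches a word catalog for two words whose interleaving explains the event stream. It only illustrates the idea of source separation; the vocabulary is a hypothetical stand-in for the catalog of process models, and a real PQS scores hypotheses probabilistically instead of requiring an exact match.

```python
from functools import lru_cache
from itertools import combinations_with_replacement


def is_interleaving(stream, a, b):
    """True if stream splits into two tracks that spell a and b in order."""
    if len(stream) != len(a) + len(b):
        return False

    @lru_cache(maxsize=None)
    def fits(i, j):
        # The first i letters of a and j letters of b explain stream[:i + j].
        if i + j == len(stream):
            return True
        nxt = stream[i + j]
        return bool(
            (i < len(a) and a[i] == nxt and fits(i + 1, j))
            or (j < len(b) and b[j] == nxt and fits(i, j + 1))
        )

    return fits(0, 0)


def separate(stream, vocabulary):
    """List every pair of catalog words whose interleaving explains stream."""
    words = sorted(vocabulary)
    return [(a, b) for a, b in combinations_with_replacement(words, 2)
            if is_interleaving(stream, a, b)]


# Hypothetical catalog of "process models" (French + English words).
VOCABULARY = {"hello", "help", "bonjour", "bonsoir", "world"}
print(separate("hbeolnjoulor", VOCABULARY))
```

The pairs (i, j) visited by `fits` play the role of the intermediate hypotheses: after four events the tracks are "he" and "bo", or equivalently "ho" and "be" if the letters are assigned the other way.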
PQS applications
• Vehicle tracking
• Worm propagation detection
• Plume detection
• Dynamic Social Network Analysis
• Cyber Situational Awareness
• Fish Tracking
• Autonomic Computing
• Border and Perimeter Monitoring
• First Responder Sensor Network
• Protein Folding
TRAFEN (TRacking and Fusion ENgine):
Software implementation of a PQS
Example – vehicle tracking
(Valentino Crespi, Diego Hernando)
T
T+1
T+2
Continuous Kinematic Model
Linear Model with Gaussian noise
Multiple Hypothesis Tracking
D. Reid. An Algorithm for Tracking Multiple Targets – IEEE Transactions on Automatic
Control, 1979
Use Kalman Filter
T
T+1
Hypotheses
T+2
Predictions
Track = process instance
Hypothesis = consistent tracks
Given a set of “hypotheses” for an event stream of length k-1, update the
hypotheses to length k to explain the new event (based on model description).
Model vehicle Kinematics
States: $x(t_{k+1}) = \Phi(t_k)\, x(t_k) + \Gamma\, w(t_k)$
$x(t_k)$: State of target at time $t_k$
$\Phi(t_k)$: Prediction Matrix
$\Gamma$: Precision Matrix
$w(t_k)$: Sequence of normal r.v. with zero mean and covariance $Q(t_k)$
Model Measurement
Observe the state of the target through a noisy measurement: $z(t_k) = H x(t_k) + \nu(t_k)$
$z(t_k)$: Measure (observation)
$H$: “Observable” Matrix: extracts observable information from state.
$\nu(t_k)$: Sequence of normal r.v. with zero mean and covariance $R$
State Estimation
Kalman filters are used for predictions.
Kalman Filters
[Diagram: Kalman filter (KF) cycle.]
1. Prediction $\bar{x}(t_k), \bar{P}(t_k)$: the estimation given the observations before $t_k$ (state prediction and error covariance prediction).
2. The noisy observation $z(t_k)$ arrives and the KF corrects the estimation given the new observation.
3. Estimation output $\hat{x}(t_k), \hat{P}(t_k)$: estimated state and error covariance estimation.
4. New prediction $\bar{x}(t_{k+1}), \bar{P}(t_{k+1})$, which the KF corrects with $z(t_{k+1})$ to give $\hat{x}(t_{k+1}), \hat{P}(t_{k+1})$.
Kalman Equations
System’s state before the new observation (Normal Multivariate):
$x(t_k) \mid z^{k-1} \sim N(\bar{x}(t_k), \bar{P}(t_k))$
Estimation (output):
$\hat{x}(t_k) = \bar{x}(t_k) + K(t_k)\,[z(t_k) - H\bar{x}(t_k)]$
$\hat{P}(t_k) = \bar{P}(t_k) - \bar{P}(t_k) H^T (H \bar{P}(t_k) H^T + R)^{-1} H \bar{P}(t_k)$
$K = \hat{P} H^T R^{-1}$
K is the Kalman Gain: minimizes the updated error covariance matrix (mean-square error) $E[(x_k - \hat{x}_k)(x_k - \hat{x}_k)^T]$
System’s state after the new observation:
$x(t_k) \mid z^{k} \sim N(\mu, \sigma^2)$, with $\mu = \hat{x}(t_k)$ and $\sigma^2 = \hat{P}(t_k)$
New Prediction:
$\bar{x}(t_{k+1}) = \Phi(t_k)\, \hat{x}(t_k)$
$\bar{P}(t_{k+1}) = \Phi(t_k)\hat{P}(t_k)\Phi(t_k)^T + \Gamma Q(t_k) \Gamma^T$
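The correction and prediction steps can be sketched for the scalar case, where every matrix reduces to a number. This is an illustration under stated assumptions, not the tracker used in the talk: the function names are hypothetical, the process-noise term is passed as a single variance `q`, and the measurement values are made up.

```python
def kf_update(x_bar, p_bar, z, h, r):
    """Correct the prediction (x_bar, p_bar) with the noisy measurement z."""
    # Updated error covariance: P_hat = P - P H (H P H + R)^-1 H P
    p_hat = p_bar - p_bar * h * (1.0 / (h * p_bar * h + r)) * h * p_bar
    # Kalman gain in the form K = P_hat H R^-1
    k = p_hat * h / r
    # Updated state estimate
    x_hat = x_bar + k * (z - h * x_bar)
    return x_hat, p_hat


def kf_predict(x_hat, p_hat, phi, q):
    """Propagate the estimate one step through the linear model."""
    return phi * x_hat, phi * p_hat * phi + q


# Hypothetical run: constant position (phi = 1), three noisy measurements.
x_bar, p_bar = 0.0, 1.0
for z in (2.0, 1.8, 2.1):
    x_hat, p_hat = kf_update(x_bar, p_bar, z, h=1.0, r=1.0)
    x_bar, p_bar = kf_predict(x_hat, p_hat, phi=1.0, q=0.1)
    print(round(x_hat, 3), round(p_hat, 3))
```

In the scalar case the gain simplifies to P H / (H P H + R), which is the more familiar textbook form of the same quantity.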
Real time Fish Tracking
(Alex Jordan )
• Track the fish in the fish tank
• Very strong example of the power of PQS
– Fish swim very quickly and erratically
– Lots of missed observations
– Lots of noise
– Classical Kalman filters don’t work (non-linear
movement and acceleration)
– “Easier” than getting permission to track people (we
mistakenly thought)
Fish Tracking Details
• 5 Gallon tank with 2 red Platys named
Bubble and Squeak
• Camera generates a stream of
“centroids”:
For each frame a series of (X,Y) pairs
is generated.
• Model describes the kinematics of a
fish:
The model evaluates if new (X,Y)
pairs could belong to the same fish,
based on measured position,
momentum, and predicted next
position. This way, multiple “tracks”
are formed. One for each object.
• Model was built in under 3 days!!!
• Infrared Camera: Detect and differentiate people by behavior, not appearance
Autonomic Server Monitoring
(Chris Roblee)
• Objective: Detect and predict deteriorating service situations
• Hundreds of servers and services
• Various non-intrusive sensors check for:
– CPU load
– Memory footprint
– Process table (forking behavior)
– Disk I/O
– Network I/O
– Service query response times
– Suspicious network activities (e.g. Snort)
• Models describe the kinematics of failures and attacks:
The model evaluates load balancing problems, memory leaks,
suspicious forking behavior (like /bin/sh), service hiccups correlated
with network attacks…
Server Compromise Model:
Integration of host CPU load sensors and IDS sensor allows
detection of attacks that neither sensor could detect on its own.

1. Snort NIDS sensor output
...
Nov 21 20:57:16 [10.0.0.6] snort: [1:613:7] SCAN myscan [Classification: attempted-recon] [Priority: 2]: {TCP} 212.175.64.248 -> 10.0.0.24
...

2. Monitored host sensor output (system level)
Current system record for host 10.0.0.24 (10 records):
Average memory over previous 10 samples: 251.000
Average CPU over previous 10 samples: 0.970
| time       | mem used | CPU load | num procs | flag |
|------------|----------|----------|-----------|------|
| 1101094903 | 251      | 0.970    | 64        |      |
| 1101094911 | 252      | 0.820    | 64        |      |
| 1101094920 | 251      | 0.920    | 64        |      |
| 1101094928 | 251      | 0.930    | 64        |      |
| 1101094937 | 251      | 0.870    | 65        |      |
| 1101094946 | 251      | 0.970    | 65        |      |
| 1101094955 | 251      | 0.820    | 65        |      |
| 1101094964 | 253      | 1.220    | 65        | !    |
| 1101094973 | 255      | 1.810    | 65        | !    |
| 1101094982 | 258      | 2.470    | 65        | !    |

3. PQS Tracker Output
Last Modified: Mon Nov 21 21:01:03
Model Name: server_compromise1
Likelihood: 0.9182
Target: 10.0.0.24
Optimal Response: SIGKILL proc 6992

[Timeline: observations o1, o2, o3 arrive between t0 and t3; the response (SIGKILL) follows at t4.]
Airborne Plume Tracking
(Glenn Nofsinger)
Forward Problem - drift and diffusion
Inverse Problem - locate sources and types of releases
[Figure: concentration map of an airborne agent on a grid (both axes 0 to 182, color scale 0.0 to 159.4), with an airborne agent sensor on the DC Mall.]
Dynamic Social Network Analysis
(Wayne Chung)
Detect "business" and "social" processes, not static artifacts.
Sensors...communication events
Models...social processes
“Static” Analysis: [Diagram: nodes A and B connected by communication links.]
“Dynamic” Analysis: [State diagram of a process between A and B: A asks B to join a project (invite); B replies (question/accept); the outcome is join or not join; when B accepts, A adds B to a list of recipients (A → B, C, …).]
Other dynamic patterns shown: large group joining; new member active introducing others.
PQS in Computer Security
(Alex Barsamian, Vincent Berk, Ian De Souza, Annarita Giani)
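[Network diagram: the Internet connects through a BRIDGE to a DMZ (WWW, Mail) and to workstations (WS) running WinXP and LINUX. Numbered sensors (DIB:s, BGP, IPTables, Snort, Tripwire, SaMBa) send observations to the PQS ENGINE.]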
Sensors and Models
Sensors:
1. DIB:s - Dartmouth ICMP-T3 Bcc: System
2. Snort, Dragon - Signature Matching IDS
3. IPtables - Linux Netfilter firewall, log based
4. Samba - SMB server, file access reporting
5. Flow sensor - Network analysis
6. ClamAV - Virus scanner
7. Tripwire - Host filesystem integrity checker
Models:
1. Noisy Internet Worm Propagation – fast scanning
2. Email Virus Propagation – hosts aggressively send emails
3. Low&Slow Stealthy Scans – of our entire network
4. Unauthorized Insider Document Access – insider information theft
5. Multistage Attack – several penetrations, inside our network
6. DATA movement
7. TIER 2 models
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Phishing Attack
The act of sending an e-mail to a user falsely claiming to be an
established legitimate enterprise in an attempt to scam the user into
surrendering private information.
The e-mail directs the user to visit a web site where they are asked
to update personal information.
1. The user receives an e-mail: "Visit http://www.cit1zensbank.com"
2. On the bogus web site the user enters First name, Last name, Account number, SSN
3. The bogus web site hands First name, Last name, Account number, SSN to the attacker
Complex Phishing Attack Steps
Hosts in the diagram: Victim, Web page (Madame X), Attacker, Stepping stone.
Addresses shown: 100.20.3.127, 51.251.22.183, 165.17.8.126, 100.10.20.9.
• The victim, as usual, browses the web and visits a web page, where he inserts a username and password (the same used to access his machine).
• The username and password are recorded.
• The attacker accesses the user machine using the username and password.
• The attacker uploads some code.
• The attacker downloads some data.
Complex Phishing Attack Observables
1. RECON, Sept 29 11:17:09. SNORT: KICKASS_PORN; DRAGON: PORN HARDCORE
2. ATTEMPT, Sept 29 11:23:56. SNORT: SSH (Policy Violation)
3. DATA UPLOAD, Sept 29 11:23:56. FLOW SENSOR: NON-STANDARD-PROTOCOL
5. DATA DOWNLOAD, Sept 29 11:24:07. FLOW SENSOR
[Diagram: each observable is drawn as an arrow from SOURCE to DEST among the Victim, the web server used (Madame X), the Attacker and the Stepping stone. Addresses shown: 100.20.3.127, 51.251.22.183, 165.17.8.126, 100.10.20.9.]
Flow Sensor
• Based on the libpcap interface for packet capturing.
• Packets with the same source IP, destination IP, source port, destination
port, protocol are aggregated into the same flow.
• Timestamp of the last packet
• # packets from Source to Destination
• # packets from Destination to Source
• # bytes from Source to Destination
• # bytes from Destination to Source
• Array containing delays in microseconds between packets in the flow
We did not use Netflow only because it does not have all the fields that we need.
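The aggregation rule above can be sketched as follows. This is not the libpcap-based sensor itself: the packet tuples are hypothetical, already-parsed records with timestamps in microseconds, the field names are made up, and matching reply packets to the flow of the reverse 5-tuple is an assumption suggested by the per-direction counters.

```python
def aggregate_flows(packets):
    """Group (ts_us, src, dst, sport, dport, proto, size) packets into flows."""
    flows = {}
    for ts_us, src, dst, sport, dport, proto, size in packets:
        key = (src, dst, sport, dport, proto)
        reverse = (dst, src, dport, sport, proto)
        if reverse in flows:
            # Reply traffic: Destination to Source of an existing flow.
            flow, direction = flows[reverse], "in"
        else:
            flow = flows.setdefault(key, {
                "last_ts_us": None, "pkts_out": 0, "pkts_in": 0,
                "bytes_out": 0, "bytes_in": 0, "delays_us": [],
            })
            direction = "out"
        if flow["last_ts_us"] is not None:
            # Delay in microseconds since the previous packet of this flow.
            flow["delays_us"].append(ts_us - flow["last_ts_us"])
        flow["last_ts_us"] = ts_us
        flow["pkts_" + direction] += 1
        flow["bytes_" + direction] += size
    return flows


# Hypothetical capture: one TCP exchange and one unrelated UDP packet.
PACKETS = [
    (0,    "10.0.0.5", "10.0.0.9", 40000, 80, "tcp", 60),
    (1500, "10.0.0.9", "10.0.0.5", 80, 40000, "tcp", 1500),
    (4000, "10.0.0.5", "10.0.0.9", 40000, 80, "tcp", 40),
    (5000, "10.0.0.5", "10.0.0.7", 40001, 53, "udp", 70),
]
for key, flow in aggregate_flows(PACKETS).items():
    print(key, flow)
```

Keeping the full array of inter-packet delays is what later makes the timing-based covert channel detection possible; NetFlow-style summaries would discard it.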
Two Models Based on the Flow Sensor
Low and Slow UPLOAD
• Volume: Tiny: 1-128b; Small: 128b-1Kb
• Packets: 4: 10-99; 5: 100-999; 6: > 1000
• Duration: 4: 1000-10000 s; 5: 10000-100000 s; 6: > 100000 s
• Balance: Out
• Percentage: > 80
UPLOAD
• Volume: Tiny: 1-128b; Small: 128b-1Kb; Medium: 1Kb-100Kb; Large: > 100Kb
• Packets: 1: one packet; 2: two packets; 3: 3-9; 4: 10-99; 5: 100-999; 6: > 1000
• Duration: 0: < 1 s; 1: 1-10 s; 2: 10-100 s; 3: 100-1000 s; 4: 1000-10000 s; 5: 10000-100000 s; 6: > 100000 s
• Balance: Out
• Percentage: > 80
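The two flow models can be read as predicates over binned flow attributes. The sketch below is one possible reading, with several assumptions: "b" is taken to mean bytes, volume is the total in both directions, the balance percentage is computed on bytes, boundary values fall into the lower or upper bin as coded, and the flow dictionaries are hypothetical.

```python
from bisect import bisect_right


def volume_class(total_bytes):
    if total_bytes <= 128:
        return "Tiny"
    if total_bytes <= 1024:
        return "Small"
    if total_bytes <= 100 * 1024:
        return "Medium"
    return "Large"


def packets_class(n_packets):
    # 1: one packet, 2: two, 3: 3-9, 4: 10-99, 5: 100-999, 6: 1000 or more
    return bisect_right([2, 3, 10, 100, 1000], n_packets) + 1


def duration_class(seconds):
    # 0: < 1 s, 1: 1-10 s, ..., 6: 100000 s or more
    return bisect_right([1, 10, 100, 1000, 10000, 100000], seconds)


def outbound_percentage(bytes_out, bytes_in):
    total = bytes_out + bytes_in
    return 100.0 * bytes_out / total if total else 0.0


def is_upload(flow):
    """UPLOAD model: any size, balance Out above 80 percent."""
    return outbound_percentage(flow["bytes_out"], flow["bytes_in"]) > 80


def is_low_and_slow_upload(flow):
    """Low and Slow UPLOAD: small volume, many packets, long duration."""
    total = flow["bytes_out"] + flow["bytes_in"]
    return (is_upload(flow)
            and volume_class(total) in ("Tiny", "Small")
            and packets_class(flow["packets"]) >= 4
            and duration_class(flow["duration_s"]) >= 4)


SLOW = {"bytes_out": 900, "bytes_in": 100, "packets": 50, "duration_s": 5000}
BULK = {"bytes_out": 500000, "bytes_in": 1000, "packets": 400, "duration_s": 30}
print(is_low_and_slow_upload(SLOW), is_upload(BULK), is_low_and_slow_upload(BULK))
```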
Phishing Attack Model 1 – very specific
[State diagram: states 1 to 7 connected by transitions labelled RECON, ATTEMPT, UPLOAD and DOWNLOAD.]
Phishing Attack Model 2 – less specific
[State diagram: states 1 to 7. Transition labels carry address constraints: RECON or ATTEMPT or COMPROMISE (dst); ATTEMPT (dst, src); ATTEMPT (dst, !src); ATTEMPT (dst, A); UPLOAD (dst, src); UPLOAD (dst); DOWNLOAD (src).]
Phishing Attack Model 3 – more general
[State diagram: states 1 to 7. Transition labels: RECON or ATTEMPT or COMPROMISE (COMP) with constraints (dst), (dst, src) or (dst, !src); UPLOAD (dst, src); UPLOAD (dst); DOWNLOAD (src).]
Phishing Attack Model 3 – Most general
[State diagram: states 1 to 4 connected by transitions labelled RECON, ATTEMPT, "ATTEMPT or UPLOAD", and DOWNLOAD.]
Stricter models reduce false positives, but less strict
models can detect unknown attack sequences
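The trade-off between strict and general models can be illustrated with two small state machines. Both transition tables below are hypothetical: the exact edges of the slide diagrams are not recoverable from the transcript, so only the event labels (RECON, ATTEMPT, UPLOAD, DOWNLOAD) are taken from the source. A real PQS scores tracks probabilistically and lets unexplained events start other tracks; this sketch only accepts or rejects one event sequence.

```python
# state -> {event label: next state}; state 4 is the final state in both.
STRICT_MODEL = {
    1: {"RECON": 2},
    2: {"ATTEMPT": 3},
    3: {"UPLOAD": 5},
    5: {"DOWNLOAD": 4},
}
GENERAL_MODEL = {
    1: {"RECON": 2},
    2: {"RECON": 2, "ATTEMPT": 3},
    3: {"ATTEMPT": 3, "UPLOAD": 3, "DOWNLOAD": 4},
}


def matches(model, events, start=1, final=4):
    """True if the event sequence drives the model from start to final."""
    state = start
    for event in events:
        state = model.get(state, {}).get(event)
        if state is None:          # no transition for this event
            return False
    return state == final


KNOWN = ["RECON", "ATTEMPT", "UPLOAD", "DOWNLOAD"]
VARIANT = ["RECON", "RECON", "ATTEMPT", "ATTEMPT", "DOWNLOAD"]
for name, model in (("strict", STRICT_MODEL), ("general", GENERAL_MODEL)):
    print(name, matches(model, KNOWN), matches(model, VARIANT))
```

The strict table accepts only the known sequence, while the general table also accepts the variant, which is the behavior the slide summarizes: fewer false positives versus coverage of unknown attack sequences.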
Air Force Rome Lab Blind Test
December 12-14, 2005
The collected data is an anonymized stream of network traffic, collected
using tcpdump. It resulted in hundreds of gigabytes of raw network traffic.
• Valuable feedback on performance and design
• Strengths:
– Number of sensors integrated
– Number of models
– Ease of sensor integration
– Ease of model building
• Drawback:
– System is real-time (results time-out)
Complex Phishing Attack Results
No observations coming from Dragon sensor and Flow sensor:
• Attack steps: 0 of 5
• Background attackers: 9 of 15
• Background scanners: 25 of 55
• Stepping stones: 0 of 1
Using Dragon and Flow observations:
• Attack steps: 5 of 5
• Background attackers: 10 of 15
• Background scanners: 23 of 55
• Stepping stones: 1 of 1
• False alarms: 1
Summary of Results
[Figure: three bar charts, Precision, Fragmentation and Mis-Associations (each 0 to 100), for scenarios 4s1, 4s3, 4s4, 4s13, 4s14, 4s5, 4s6, 4s8, 4s16 and 4s17, at Threshold Values 0.0, 0.5 and 0.75. The charts are annotated with goals relative to the average: one "GOAL: > AVERAGE" and two "GOAL: < AVERAGE".]
Scenario 4s14: Phishing attack
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Data Exfiltration
The Problem:
CNN.COM
Sunday, June 19, 2005 Posted: 0238 GMT (1038 HKT)
NEW YORK (AP) -- The names, banks and account numbers
of up to 40 million credit card holders may have been
accessed by an unauthorized user, MasterCard
International Inc. said.
PQS Approach:
Tier 1 models monitor outbound data. They are based on flow
analysis.
Tier 2 models correlate outbound data within a context to infer whether
it reflects normal system and user behavior or an ongoing attack.
Basic Ideas: An Example
Exfiltration modes:
• SSH
• HTTP
• FTP
• Email
• Covert channel
• Phishing
• Spyware
• Pharming
• Writing to media (paper, drives, etc.)
[Figure: bytes IN and OUT (0 to 600000) for nfs2.pqsnet.net against Time x 15 sec (0 to 350). Annotations: Normal activity, Scanning, Infection, Data Access, Increased outbound data. The assessment moves from Low Likelihood of Malicious Exfiltration to High Likelihood of Malicious Exfiltration.]
Hierarchical PQS Architecture
[Diagram repeated from the PQS Approach section: Tier 1 PQS instances (Scanning, Infection, Data Access, Exfiltration) pass their hypotheses as observations to a Tier 2 PQS with more complex models, which produces the RESULTS.]
Example PQS model: Macro in word document
for exfiltration
[State diagram: states 1 to 4 with transitions labelled RECON, TIER 1 VIRUS, Balanced Flow, Data Exfiltration, and "Balanced Flow or Data Exfiltration".]
A Word virus opens an FTP connection to a server and uploads documents.
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Covert Channel
• A communication channel is covert if it is neither designed
nor intended to transfer information at all. (Lampson
1973)
• Covert channels are those that use entities not normally
viewed as data objects to transfer information from one
subject to another. (Kemmerer 1983)
STORAGE: Information is leaked by hiding data in packet header fields: IP Identification, Offset, Options, TCP Checksum, TCP Sequence Numbers.
TIMING: Information is leaked by triggering or delaying events at specific time intervals.
Covert Channel in Interpacket Delays
[Diagram: the SENDER transmits ordinary packets across the INTERNET to the RECEIVER. The packet contents shown are the text "We shall not spend a large expense of time / Before we reckon with your several loves, / And make us even with you. My thanes and kinsmen, …". The delays between packets carry the covert bits 0 1 0 0 0 1 0 1 0, which the receiver reads from the arrival times while the text arrives unchanged.]
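A timing channel of this kind can be sketched in a few lines: the sender maps each covert bit to a short or a long inter-packet delay and the receiver thresholds the observed delays. The delay values, the threshold and the jitter figures are hypothetical; only the bit pattern is taken from the slide.

```python
SHORT_DELAY = 0.05   # seconds, encodes a 0 (assumed value)
LONG_DELAY = 0.15    # seconds, encodes a 1 (assumed value)
THRESHOLD = 0.10     # receiver's decision boundary (assumed value)


def encode(bits):
    """Sender side: one inter-packet delay per covert bit."""
    return [LONG_DELAY if bit else SHORT_DELAY for bit in bits]


def decode(delays):
    """Receiver side: recover the bits from the observed delays."""
    return [1 if delay > THRESHOLD else 0 for delay in delays]


BITS = [0, 1, 0, 0, 0, 1, 0, 1, 0]            # bit pattern shown on the slide
sent = encode(BITS)
# Network jitter perturbs each delay; large jitter flips bits.
JITTER = [0.01, -0.02, 0.03, 0.0, 0.02, -0.01, 0.04, 0.01, 0.08]
received = [delay + noise for delay, noise in zip(sent, JITTER)]
print(decode(sent))
print(decode(received))
```

In the jittered run the last delay grows from 0.05 s to 0.13 s and is decoded as 1 instead of 0, which is the kind of error the noisy channel model accounts for.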
Binary Asymmetric Channel
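[Diagram: a bit string passes through a Noisy Channel and one bit arrives as 0 ("ERROR: it should be a 1"). Channel model: inputs 0 and 1 map to outputs 0 and 1, with transitions labelled by the error probability Pe and its complement.]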
Binary Asymmetric Channel Capacity
Capacity: Highest amount of information per symbol that can be
transmitted with arbitrarily small error probability.
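For a binary channel whose two error probabilities differ, the capacity can be found numerically by maximizing the mutual information over the input distribution. The sketch below uses a plain grid search; the parameter names `e0` and `e1` and the example error rates are assumptions, not values from the experiments.

```python
from math import log2


def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def mutual_information(p1, e0, e1):
    """I(X;Y) for P(X=1) = p1, e0 = P(Y=1 | X=0), e1 = P(Y=0 | X=1)."""
    y1 = (1 - p1) * e0 + p1 * (1 - e1)          # P(Y = 1)
    return h2(y1) - ((1 - p1) * h2(e0) + p1 * h2(e1))


def capacity(e0, e1, steps=1000):
    """Maximum of I(X;Y) over a grid of input distributions."""
    return max(mutual_information(i / steps, e0, e1) for i in range(steps + 1))


print(capacity(0.0, 0.0))    # noiseless channel
print(capacity(0.1, 0.1))    # symmetric special case, equals 1 - h2(0.1)
print(capacity(0.02, 0.2))   # asymmetric example
```

The grid resolution bounds the accuracy; for the symmetric case the maximum sits at P(X=1) = 0.5, which the default grid contains exactly.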
[Figure: capacity in bits/symbol plotted against error probability, comparing sent and received data over 24 hops.]
Statistical Detection
$N(\mu)$ = # of packets with delay $\mu$
$N_{max}$ = max # of packets with the same delay
$\mu$ = sample mean of the delays
[Figure: histograms of the number of packets against delay (tenths of a second), one annotated $N(\mu)/N_{max} \approx 1$ and one labelled "Covert channel".]
Level of confidence: $1 - \frac{N(\mu)}{N_{max}}$
Threshold used in the PQS experiments: marked on the plot.
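The confidence measure can be computed directly from a delay histogram. The sketch below applies it to the 20-bin histogram shown on the Sensor slide (bins of 1/40 s); locating the mean with bin centers and reading N(μ) from the bin that contains the mean are assumptions about details the transcript does not give.

```python
def covert_confidence(histogram):
    """Return 1 - N(mu) / N_max for a histogram of inter-packet delays."""
    total = sum(histogram)
    # Sample mean expressed in bin units, using bin centers.
    mean_bin = sum(n * (i + 0.5) for i, n in enumerate(histogram)) / total
    n_mu = histogram[int(mean_bin)]      # packets in the bin holding the mean
    return 1.0 - n_mu / max(histogram)


# Histogram from the Sensor slide: two peaks, at 4/40 s and 8/40 s.
SENSOR_HISTOGRAM = [3, 0, 0, 16, 882, 2, 0, 17, 698, 2,
                    0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
# Hypothetical regular traffic: a single peak that contains the mean.
REGULAR_HISTOGRAM = [0, 1, 10, 1, 0]

print(covert_confidence(SENSOR_HISTOGRAM))
print(covert_confidence(REGULAR_HISTOGRAM))
```

With two delay values in use the mean falls between the peaks, where almost no packets lie, so the measure approaches 1; with a single peak the mean sits on the peak and the measure approaches 0.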
Sensor
For every traffic flow it registers the time delays between
consecutive packets.
Key: source ip, dest ip, source port, dest port
Attributes: Protocol, TotalSize, #Delays[20], Average delay, Cmax, Cmean
Example record:
source ip: 129.170.248.33
dest ip: 208.253.154.210
source port: 44806
dest port: 23164
#Delays[20]: 3 0 0 16 882 2 0 17 698 2 0 0 1 0 1 0 0 0 0 0
(3 delays between 0 sec and 1/40 sec; 882 delays between 4/40 sec and 5/40 sec)
[Figure: Number of Delays against Delay (tenths of a sec) for two flows from 129.170.248.33 to 208.253.154.210:
source port 56441, dest port 23041: $1 - N(\mu)/N_{max} = 0$
source port 56441, dest port 23036: $1 - N(\mu)/N_{max} = 1$]
Error rates and capacity
[Figure: Confidence, $1 - N(\mu)/N_{max}$, plotted against Error Probability and against Capacity (bits/symbol).]
Detection-Capacity Tradeoffs
[…] is a discrete random variable. A sample of […] is denoted by […].
Define a covert channel, […], which has the same sample space as […], in the sense that whenever a covert message is communicated, […] uses […], namely […] is a sample of […].
Let […] and […]; […] is the probability of FALSE ALARM.
Example:
Sample space: S = {a, b, c, d, e, f, g, h}
Symbols used for covert communication: D = {b, c, d}
The probability of D according to the natural distribution of symbols is the
false alarm rate.
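The example can be turned into a small calculation. The sets S and D come from the slide; the uniform natural distribution is a hypothetical choice, and the figure of log2 |D| bits is only an upper bound on the covert information per covert symbol (reached when the covert symbols are used uniformly), not the expression from the slide, which did not survive extraction.

```python
from math import log2

S = ["a", "b", "c", "d", "e", "f", "g", "h"]   # sample space
D = ["b", "c", "d"]                            # symbols used covertly

# Hypothetical natural distribution of symbols: uniform over S.
NATURAL = {symbol: 1.0 / len(S) for symbol in S}


def false_alarm_rate(natural, covert_symbols):
    """Probability that natural traffic emits a symbol from D."""
    return sum(natural[symbol] for symbol in covert_symbols)


def max_covert_bits(covert_symbols):
    """Upper bound on covert bits per covert symbol (uniform use of D)."""
    return log2(len(covert_symbols))


# Growing D carries more information but raises the false alarm rate.
for size in range(1, len(S) + 1):
    subset = S[:size]
    print(size, round(false_alarm_rate(NATURAL, subset), 3),
          round(max_covert_bits(subset), 3))
```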
Detection-Capacity Tradeoffs
Let […] be the amount of information communicated by the covert channel per sample from […].
Define […] to be the entropy of the distribution […].
Noting that […] by assumption, then […].
The expected covert information communicated per sample is […].
[Figure: Covert Information plotted against False alarm.]
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Flow Analysis = Data Reduction
[Pyramid diagram, top to bottom:]
• EVENTS: Hundreds per hour (Flow Aggregation, Flow Attribution: fewer events to be analyzed)
• FLOWS: Thousands per hour (Current Analysis)
• PACKETS: Hundreds of thousands per hour (How data move)
• BYTES: Billions per hour
Flow Attribution and Aggregation
FLOW ATTRIBUTION
• The final goal is to attribute flows to people. Intermediate steps are a required part of the attribution process.
• Uses logs that can explain a flow as legitimate or malicious.
• The goal is to explain flows.
FLOW AGGREGATION
• Recognizing that different flows (components), apparently totally unrelated, nevertheless belong to the same broader action (event).
• Views flows as components of broader activities.
• The goal is to correlate flows based on certain criteria.
Aggregation
Flow aggregation.
• Recognizing that different flows, apparently totally unrelated, nevertheless belong to the same broader event (activity).
• Flows are aggregated from captured network packets.
• We aggregate flows into activities.
• Example: User requests a webpage (all DNS and HTTP flows aggregated)
Activity aggregation.
• Recognizing that similar activities occur regularly at the same time, or dissimilar activities occur regularly in the same sequence.
• We correlate activities into activity groups, patterns.
• Examples: Nightly backups to all servers (each backup is an activity); User requests a sequence of webpages every morning.
Packet = Aggregated Bytes
Flow = Correlated Packets
Activity = Correlated Flows
Pattern = Correlated Activities
Web Surfing in Detail
The browser breaks the URL into three parts: the protocol ("http"), the
server name ("www.dartmouth.edu") and the file name ("index.html").
1. The browser communicates with a name server to translate the server name "www.dartmouth.edu" into an IP address, which it uses to connect to the server machine. (A FLOW IS INITIATED)
2. The browser forms a connection to the web server at that IP address on port 80. (A FLOW IS INITIATED)
3. Following the HTTP protocol, the browser sends a GET request to the server, asking for the file "http://www.dartmouth.edu/index.html".
4. The web server sends the HTML text for the Web page to the browser.
5. The browser reads the HTML tags and formats the page onto your screen.
6. The browser possibly initiates more DNS requests for media such as images and video.
7. The browser initiates more HTTP and/or FTP requests for media. (MULTIPLE FLOWS ARE INITIATED…)
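The first part of the walk-through, splitting the URL and naming the flows it triggers, can be sketched with the Python standard library. The `expected_flows` helper and its tuple layout are hypothetical; port 80 is the one named in step 2 and port 53 is the usual DNS port.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into protocol, server name and file name."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path.lstrip("/")


def expected_flows(url):
    """First flows a page request initiates (steps 1 and 2 above)."""
    _, server, _ = url_parts(url)
    return [
        ("udp", 53, "DNS lookup of " + server),
        ("tcp", 80, "HTTP connection to " + server),
    ]


URL = "http://www.dartmouth.edu/index.html"
print(url_parts(URL))
for flow in expected_flows(URL):
    print(flow)
```

Steps 6 and 7 repeat the same pair of flows for every embedded media object, which is why a single page view shows up as many flows that belong to one activity.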
Resulting Flows and Activity
[Figure: the flows in the activity, plotted individually, and the resulting activity.]
Activities and Flows
[Figure legend: UDP Flow, TCP Flow, Long Flow, Activity.]
Complex Activities ....
[Figure: correlated network flows within a LAN, annotated with TCP portscan, UDP portscan, regular UDP broadcasts (NTP), system upgrade, and regular browsing/download behavior.]
Flow + Snort Alerts
Scenario: several packets in a flow triggered IDS alerts
Snort rule 1560
generates an alert
when an attempt
is made to exploit a
known vulnerability
in a web server or a
web application.
Snort rule 1852
generates an alert
when an attempt is
made to access
the 'robots.txt' file
directly.
[Figure: a FLOW drawn together with the SNORT ALERTS raised by its packets.]
The flow can be characterized as malicious and further investigation must be done.
Current focus
Theoretical approach for clustering aggregated flows.
Flow = As defined
Activity = Aggregated flows
Pattern = Correlated Activities
Approach: Graph theory (flows are the nodes and the edges are between
correlated nodes).
We are thinking about defining a metric that captures the closeness
between two different activities to allow grouping into patterns.
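One way to make the graph view concrete: treat flows as nodes, connect correlated flows, read activities off as connected components, and compare two activities with a set-overlap score. Everything here is a sketch under assumptions. The correlation edges are given by hand, and the Jaccard index is only one candidate for the closeness metric, which the slide leaves undefined; the labels x, y, z, w, t, s echo the figure, but their split between the two activities is a guess.

```python
def connected_components(nodes, edges):
    """Group nodes (flows) into components (activities)."""
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                       # depth-first traversal
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


def jaccard(a, b):
    """Closeness of two activities as overlap of their flow labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical flows and correlation edges.
FLOWS = ["f1", "f2", "f3", "f4", "f5"]
CORRELATED = [("f1", "f2"), ("f2", "f3"), ("f4", "f5")]
print(connected_components(FLOWS, CORRELATED))

ACTIVITY_1 = {"x", "y", "z", "t", "s"}
ACTIVITY_2 = {"x", "y", "w", "t", "s"}
print(jaccard(ACTIVITY_1, ACTIVITY_2))
```

A threshold on such a score would then decide whether two activities are grouped into one pattern; choosing that threshold is part of the open question.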
[Figure: two activity graphs, Activity 1 and Activity 2, whose nodes are labelled x, y, z, w, t and s; most labels appear in both.]
Can they be grouped in one pattern?
Outline
1. Motivation and Terminology
2. PQS Approach
3. Implementation of a PQS detecting
a. Phishing
b. Data Exfiltration
c. Covert Channel
4. Flow Attribution and Aggregation
5. Conclusion and Acknowledgments
Contribution
• Identification of a new generation of threats.
• Identification and implementation of approaches
based on a Process Query System to detect
them.
• Introduction and implementation of flow attribution
and aggregation.
Work in Progress
• Build a theory of flow attribution and aggregation.
• Develop a theory of trackability to characterize phenomena
in the sense of multi-hypothesis tracking.
• Identification of new application domains.
• Statistical theory of undetectable covert communication.
Acknowledgements
Active Members
George Cybenko
Alex Barsamian
Marion Bates
Chad Behre
Vincent Berk
Valentino Crespi (Cal State LA)
Ian deSouza
Paul Thompson
Annarita Giani
Alumni
Robert Gray (BAE Systems)
G. Jiang (NEC LAB)
Naomi Fox (UMass, Ph.D. student)
Hrithik Govardhan, MS (Rocket Software)
Yong Sheng Ph.D. (Dartmouth CS postdoc)
Josh Peteet, MS (Greylock Partners)
Alex Jordan, MS (BAE Systems)
Chris Roblee, MS (Lawrence Livermore NL)
George Bakos (Northrop Grumman)
Doug Madory M.Sc. (BAE Systems)
Wayne Chung Ph.D. (IDA/CCS)
Glenn Nofsinger Ph.D. (BAE Systems)
Yong Sheng Ph.D (CS Dartmouth College)
Research Support: DARPA, DHS, ARDA/DTO, ISTS, I3P, AFOSR, Microsoft
www.pqsnet.net
Thanks!