Detection of Covert Channels through VPNs

Download Report

Transcript Detection of Covert Channels through VPNs

Noam Segev, Israel Chernyak, Evgeny Reznikov
Supervisor: Gabi Nakibly, Ph. D.


Create a covert channel detector that would
function in the described scenario.
The detector’s operation:
 A learning period of clear traffic
 Two traffic samples:
▪ A clean sample
▪ Traffic containing a covert channel

Goal: Correct classification of both samples

We used four detection methods in the
creation of the detector:
 BLOSUM
 PSPM
 Learning Algorithm
 Entropy-based approach
Taken from the field of bioinformatics
BLOSUM (BLOcks of Amino Acid SUbstitution
Matrix) is a substitution matrix used for sequence
alignment of proteins.
 PSPM(Position Specific Probability Matrix) is a
substitution matrix used for sequence alignment of
proteins.
 The algorithms constructs a substitution matrix of
probabilities for each amino acid to be present in
certain positions in the sequence.


We break down the learning communication to 10
groups of 10. (total length of 100)
 We defined the probability of a value to be

pi 

# packets _ with _ X i
# packets
We define the probability of a couple of values to be
qi1 ,i2 
si1 ,i2  # packets _ with _ X i1  # packets _ with _ X i2
s
a ,b
When receiving a new packet we compare to a
packet from
the original communication using the

 q 
 2 log 

p  p 


formula
 The values checked can be packet size or packet
delay

i, j
2
i
i
We break down the learning communication to 10
groups of 10. (total length of 100)
 We defined the probability of a value to be

pi , j 

# packets _ with _ X i _ at _ position _ j
# packets _ at _ position _ j
The values checked can be packet size or packet
delay
Learning Algorithm



There exists a range of weaker algorithms for
covert channel detection exist.
Each weak algorithm is either less accurate or
only good for detecting a certain type of
covert channel.
We utilized a learning algorithm in an
attempt to boost and combine the
effectiveness of several of the weaker
algorithms.

We used the C4.5 learning algorithm to
combine three of the weaker algorithms:
 Regularity detection
 Histogram of packet times/sizes
 Epsilon similarity

Regularity:

 i  j
regularity  STDEV 
| i, j : i 

i


 Histogram:


j





Stores and sorts the list of all inter-arrival
times between packets.
Pi – inter-arrival time i in the sorted list.
Epsilon similarity: the percentage of
|Pi - Pi+1|/Pi that are smaller than the epsilon.


During the semester we compiled a collection
of traffic samples created by 3 of the covert
channel programs designed by previous
teams, as well as some samples of randomly
generated traffic with normal distribution of
sizes and inter-arrival times.
The learning algorithm was given a training
set of the answers all the above methods
gave for each packet in the aforementioned
traffic.



Entropy measures the amount of disorder in a
system.
A covert channel injects information into
certain communication metrics, therefore
increasing the amount of order over these
metrics.
By measuring the amount of entropy of a
given channel over the above metrics we can
try to deduce the existence of a covert
channel.

We used the entropy calculation methods
presented in Gianvecchio &Wang ’07:
EN  X m   H  X1 ,..., X m  
 P  x ,..., x  log P  x ,..., x 
1
m
1
x1 ... xm
CE  X m | X m1   H  X m | X 1 ,..., X m1  
 H  X 1 ,..., X m   H  X 1 ,..., X m1 
CCE  X m | X m1   CE  X m | X m1   perc  X m   EN  X1 
m

Our method calculates the entropy of the
following variables:
 Packet delay
 Packet sizes
 Combined (size & time)
 Bursts (k-packet averages on packet size & delay)
 Peaks (maxima points of packet sizes and delays)


The challenge consisted of 3 simulations.
Each included:
 A learning phase on clean traffic.
 A detection phase on clean traffic to weed out
false positives.
 And a detection phase on traffic contaminated by
the covert channel.

In this challenge our detector hasn’t
generated any output – defined as a negative
detection result – due to an error (which was
found only later) which placed the output
statements in an unreachable “if” statement.


Due to the aforementioned error suffered by
our program, the results of the first challenge
were inconclusive.
We decided that we’d attempt to investigate
the sensitivity factor of our methods (since
we saw neither false nor true positives).


We hoped that the algorithm would detect a
pattern indicating which of the detection
methods should be trusted in which case.
In case it was needed, we intended to boost
the algorithm with providing more
information about the covert channel –
statistical information about packet
distribution, as well as the numerical values
computed by the aforementioned methods.

Unfortunately, it turned out that the covert
channels we chose to work with were mainly
detected by the histogram method, which
didn’t leave much room for maneuver with
the learning algorithm.

Several issues were detected in our program:
 The aforementioned error which prevented
output from being displayed.
 In the BLOSUM method, there were
miscalculations in the algorithm.
 The entropy calculations, albeit correct, suffered
from inefficiency, which forced us to reduce
several parameters, affecting the accuracy.

Sensitivity factors were tweaked throughout
the program.

The second challenge consisted of one
simulation, including, as in the first challenge:
 A learning phase on clean traffic.
 A detection phase on clean traffic to weed out
false positives.
 A detection phase on traffic contaminated by a
covert channel.


Unfortunately, after weeding out false
positives, no detection was made.
After some investigation, we discovered that
the BLOSUM method has, in fact, detected
the covert channel, but due to another error,
failed to report it.


Further refinement of the detection methods’
sensitivity thresholds is necessary.
In the learning algorithm method, the chosen
methods proved to be insufficiently robust.
Additionally, the lack of covert channel
communication samples further undermined
our efforts.

It would be interesting to see how the
learning algorithm fares given a large amount
of traffic samples, as well as stronger
methods such as the entropy and BLOSUM
methods we have implemented during this
project.