Honeypot Data Analysis

By: Michael Kuritzky and Guy Cepelevich
Supervisor: Amichai Shulman

2010
Wikipedia: “In computer terminology,
a honeypot is a trap set to detect, deflect,
or in some manner counteract attempts
at unauthorized use of information systems.
Generally it consists of a computer, data,
or a network site that appears to be part of
a network, but is actually isolated,
(un)protected, and monitored, and which
seems to contain information or a resource of
value to attackers.”
M. Kuritzky & G. Cepelevich, Technion




- Deploy a honeypot on the web.
- Gather information about the usage of the deployed honeypot (requests and replies).
- Store the data for future use.
- Devise a tool to conveniently review and manually analyze the info gathered from the honeypot in order to create automatic “rules” that will categorize and filter the existing and new information.

Deploy a honeypot on the web:


In order to entice possible attackers into using our honeypot, we “offered” them a service: an anonymizing proxy server, a very popular “tool” in the “scene”.
We used an Amazon EC2 (Elastic Compute Cloud) instance to run the anonymizing proxy.

Gather information about the usage of the
deployed honeypot (requests and replies):

We used Privoxy (available from sourceforge.net) to monitor the traffic, and we recorded and stored the raw traffic logs in an Amazon EBS (Elastic Block Store) volume.

Store the data for future use:

We wrote a parser to parse the raw Privoxy logs.
- The parser goes over the logs one line at a time (to avoid memory problems) and parses each line using several regular expressions (a.k.a. voodoo).

We also wrote a listener which registers with the parser and is called whenever the parser finishes parsing an entry.
- The listener inserts the parsed entry into a MySQL database for future analysis.
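
The parser and listener arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the one-line log layout (`<client-ip> <METHOD> <url>`), the class names, and the `entries` table with its columns are all assumptions made for the example, and real Privoxy logs look different.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed request; the fields are illustrative, not the project's schema. */
final class LogEntry {
    final String clientIp;
    final String method;
    final String url;

    LogEntry(String clientIp, String method, String url) {
        this.clientIp = clientIp;
        this.method = method;
        this.url = url;
    }
}

/** Callback invoked once for every entry the parser completes. */
interface EntryListener {
    void onEntry(LogEntry entry);
}

/** Reads a log one line at a time, so the whole file is never held in memory. */
final class LogParser {
    // Hypothetical line layout: "<client-ip> <METHOD> <url>".
    private static final Pattern LINE =
            Pattern.compile("^(\\d{1,3}(?:\\.\\d{1,3}){3})\\s+([A-Z]+)\\s+(\\S+)$");

    private final List<EntryListener> listeners = new ArrayList<>();

    void register(EntryListener listener) {
        listeners.add(listener);
    }

    /** Returns the number of lines that matched and were handed to the listeners. */
    int parse(Reader source) {
        int parsed = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = LINE.matcher(line.trim());
                if (!m.matches()) {
                    continue; // skip lines that do not fit the assumed layout
                }
                LogEntry entry = new LogEntry(m.group(1), m.group(2), m.group(3));
                for (EntryListener listener : listeners) {
                    listener.onEntry(entry);
                }
                parsed++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parsed;
    }
}

/** Listener that stores entries through JDBC; table and column names are hypothetical. */
final class JdbcEntryListener implements EntryListener {
    private final Connection connection;

    JdbcEntryListener(Connection connection) {
        this.connection = connection;
    }

    @Override
    public void onEntry(LogEntry entry) {
        String sql = "INSERT INTO entries (client_ip, method, url) VALUES (?, ?, ?)";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {
            stmt.setString(1, entry.clientIp);
            stmt.setString(2, entry.method);
            stmt.setString(3, entry.url);
            stmt.executeUpdate();
        } catch (SQLException e) {
            throw new IllegalStateException("could not store entry", e);
        }
    }
}
```

Because the parser only knows the `EntryListener` interface, the database listener can be swapped for an in-memory one (for example `parser.register(list::add)`) when testing the regular expressions without a MySQL server.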

We use the following tables to store the entries in the DB:

Devise a tool to conveniently review and
manually analyze the info gathered from the
honeypot in order to create automatic “rules”
that will categorize and filter the existing, and
new information:

This is the largest part of the system and is covered in the next few slides.

The system consists of 3 panels:

Entries Panel:
- Convenient display of entries from the DB (all entries, or entries matching a certain rule).
- Allows on-the-spot manipulation of the entries.

Rule Editing Panel:
- Interface for creating “rules” for automatic data manipulation.

Rule Management Panel:
- Interface for activating and deactivating existing rules.

Interest level

Many entries result from regular internet usage; those can often be automatically marked as uninteresting using our rule system.
Some entries, on the other hand, indicate potential attacks (SQL injection, automation, etc.). Those can be marked as interesting and then processed manually.

Tags

Using our rule system, the user can automatically assign tags to entries that match certain patterns (e.g. suspicious user-agents).
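
To make the idea of a rule concrete, here is a small sketch of how a rule might assign a tag and an interest level to matching entries. It is only an illustration under stated assumptions: the slides suggest the real rules are evaluated against the MySQL database, whereas this example swaps in an in-memory regular expression on the user-agent, and the names `TaggedEntry`, `TagRule` and `RuleEngine` are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Minimal stand-in for a stored entry; only the fields this sketch needs. */
final class TaggedEntry {
    final String userAgent;
    final Set<String> tags = new LinkedHashSet<>();
    boolean interesting;

    TaggedEntry(String userAgent) {
        this.userAgent = userAgent;
    }
}

/** A rule pairs a user-agent pattern with the tag and interest level it assigns. */
final class TagRule {
    final Pattern userAgentPattern;
    final String tag;
    final boolean marksInteresting;
    boolean active = true; // rules can be activated and deactivated

    TagRule(String userAgentRegex, String tag, boolean marksInteresting) {
        this.userAgentPattern = Pattern.compile(userAgentRegex, Pattern.CASE_INSENSITIVE);
        this.tag = tag;
        this.marksInteresting = marksInteresting;
    }

    boolean matches(TaggedEntry entry) {
        // find() looks for the pattern anywhere inside the user-agent string
        return userAgentPattern.matcher(entry.userAgent).find();
    }
}

final class RuleEngine {
    /** Applies every active rule to every entry and returns the number of matches. */
    static int apply(List<TagRule> rules, List<TaggedEntry> entries) {
        int matches = 0;
        for (TaggedEntry entry : entries) {
            for (TagRule rule : rules) {
                if (rule.active && rule.matches(entry)) {
                    entry.tags.add(rule.tag);
                    entry.interesting |= rule.marksInteresting;
                    matches++;
                }
            }
        }
        return matches;
    }
}
```

A rule such as `new TagRule("Windows 98", "suspicious-user-agent", true)` would tag an entry whose user-agent is `Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)` and leave a modern browser string untouched.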
A simple rule to catch porn.
The results: ~1000 entries. Most requests come from the 78.159.125.0 subnet.
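
An observation like "most requests come from one subnet" can be checked by grouping the client addresses of the matching entries by network. The sketch below assumes a /24 prefix (the slide does not state the mask) and uses a hypothetical helper class `SubnetCounter`; it is not taken from the project.

```java
import java.util.Map;
import java.util.TreeMap;

final class SubnetCounter {
    /** Maps a dotted IPv4 address to its /24 network, e.g. "a.b.c.d" becomes "a.b.c.0". */
    static String subnet24(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("not a dotted IPv4 address: " + ipv4);
        }
        return octets[0] + "." + octets[1] + "." + octets[2] + ".0";
    }

    /** Counts how many of the given client addresses fall into each /24 network. */
    static Map<String, Integer> countBySubnet(Iterable<String> clientIps) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String ip : clientIps) {
            counts.merge(subnet24(ip), 1, Integer::sum);
        }
        return counts;
    }
}
```

Sorting the resulting map by count would surface the dominant network among the entries a rule matched.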
Suspicious user-agents: users who claim to use Windows 98.
The results: ~9000 entries.

The project was written fully in Java, for the following reasons:
- Developers’ experience.
- Extensive integrated and third-party library support (e.g. JDBC for database connections).

In order to organize and save all the information gathered from our honeypot, we used a MySQL database. This platform was chosen for several reasons:
- Very common.
- Free.
- Easy to access.
- Existing management tools.
- Easy to write rules on the entries.
- Developers’ experience.




- Make the SQL queries more efficient (currently we have a problem dealing with databases with a large number of entries).
- Make the user-defined queries more structured and guided.
- Support for creating automatic queries from a multiple selection in the entries table.
- Support for reconstruction and “replay” of requests.