Passive Network Forensics

Passive Network Forensics:
Behavioral Classification of Network
Hosts Based on Connection Patterns
John McHugh (Dalhousie Univ. Canada),
Ron McLeod (TARA Canada),
Vagishwari Nagaonkar (Wipro Tech India)
ACM SIGOPS OSR (Operating Systems Review) 2008
Why do I select this paper?

Passive Network Forensics
  It only performs retrospective analyses to
    discover compromised hosts
    determine the nature of a host’s behavioral change
  It is not about backtracking the source of an attack
  It relies on NetFlow data
Host behavioral classification
  Define profiles to identify host “roles”
  How does a machine interact with others?

2008/10/8
Speaker: Li-Ming Chen

Motivation & Goals

Networks receive a substantial amount of malicious traffic every day
  Attacks are subtle
  Compromises can go unnoticed for long periods of time
Assume network compromises have already occurred
Retrospective analysis
  Describe the current network behavior (or behavior changes)
  Try to discover compromised hosts
  Based on flow records only

(Forensics) Challenges for deep analysis

Covert Surveillance
Catastrophic Loss
Significant Time Delay
Multi-Tenant Providers and Privacy
Network of Convenience

Assumptions in this paper
  No payload data
  IP addresses are not trustworthy over long time periods
  Port numbers may not reliably identify applications

Traffic Monitoring

What affects the choice of capture technique?
  Storage capacity
  Privacy
  Encrypted payload
Why use NetFlow data?
  Many routers support NetFlow
  Avoids privacy concerns
  Retained data volume is modest
Use the SiLK tool set to help analyze the data
  Developed by CERT
  Filtering, sorting, managing data…

Part 1: Profiling Interesting Behaviors
My Comment:
  Detailed and statistics-based, but boring to me.
  Profiles might work for a simple machine (one with a single role), but not for a complex one.
  Profiles are not general enough.

Observed interesting behaviors

Monitored network & dataset:
  ~80 hosts
    ~40 hosts are user workstations
  Observe (1) outside-to-inside, (2) inside-to-outside and (3) inside-to-inside flows
  Only focus on non-web flows
  Period: 18 months
How to measure?
  Monitor flows “from” a specific host
  Monthly measure
What’s interesting? (inside NetFlow data)
  Traffic volume
  Connection patterns
Interesting Behaviors: (cases)
  Baseline workstation behavior
  An on-line game server
  A compromised workstation (by a scanning worm)
  An NT file server
  A laser printer

Observed interesting behaviors (cont’d)

Behavior Profile:
  Monitor flows “from” a specific host
  Based on volume, connection patterns, protocol usage, and destination port usage

Volume & connection statistics
  Bytes transferred
  Internal Dst. IP contacted
  External Dst. IP contacted
Monthly protocol distribution for flows from host
  ICMP sent
  TCP sent
  UDP sent
  Total observed protocols
Destination Port Usage
  Measures: # of ports used, % of ports used, % of bytes transferred
  Port ranges: Port 0~1024, Port 1025~5000, Port > 5000

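The profile fields listed above could be computed from flow records roughly as follows. This is only a minimal sketch, not the paper's implementation: the `Flow` record, its field names, and `build_profile` are hypothetical, the protocol distribution is taken as a share of flows, and the byte share per port range is taken relative to all bytes the host sent.

```python
from collections import Counter
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are illustrative only."""
    src: str
    dst: str
    proto: int   # IP protocol number: 1 = ICMP, 6 = TCP, 17 = UDP
    dport: int
    nbytes: int


def port_band(port):
    """Map a destination port to one of the three ranges used on the slide."""
    if port <= 1024:
        return "0~1024"
    if port <= 5000:
        return "1025~5000"
    return ">5000"


def build_profile(flows, host, internal_net):
    """Summarise the flows sent "from" one host during one period."""
    net = ip_network(internal_net)
    out = [f for f in flows if f.src == host]
    total_bytes = sum(f.nbytes for f in out)
    dsts = {f.dst for f in out}
    internal = {d for d in dsts if ip_address(d) in net}
    proto_flows = Counter(f.proto for f in out)

    ports, band_bytes = {}, Counter()
    for f in out:
        if f.proto in (6, 17):  # ports only make sense for TCP and UDP
            band = port_band(f.dport)
            ports.setdefault(band, set()).add(f.dport)
            band_bytes[band] += f.nbytes
    n_ports = sum(len(p) for p in ports.values())

    return {
        "bytes": total_bytes,
        "internal_dst": len(internal),
        "external_dst": len(dsts) - len(internal),
        "proto_share": {p: n / len(out) for p, n in proto_flows.items()},
        "protocols_seen": len(proto_flows),
        "ports_used": {b: len(p) for b, p in ports.items()},
        "port_share": {b: len(p) / n_ports for b, p in ports.items()},
        "byte_share": {b: (n / total_bytes if total_bytes else 0.0)
                       for b, n in band_bytes.items()},
    }
```

Calling `build_profile(flows, "10.0.0.5", "10.0.0.0/24")` once per month would give one profile per host per month, matching the monthly measurement described on the slides.
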
Normal workstation vs. On-line game server

                         A normal workstation   An Internet game server
                                                (has been compromised and is being
                                                used as an on-line game server)
Bytes transferred        < 20 MB                45 GB
Internal DIP contacted   < 10                   3
External DIP contacted   < 20                   1.74 * 10^6
ICMP sent                < 2%                   ~1%
TCP sent                 > 70%                  ~9%
UDP sent                 < 30%                  ~90%
Total observed proto.    < 5                    3

Destination port usage, Internet game server
                 # of ports used   % of ports used   % of bytes trans.
Port 0~1024      45                0.07%             N/A
Port 1025~5000   3976              6%                1%
Port > 5000      60K               93%               99%

Destination port usage, normal workstation
                 # of ports used   % of ports used   % of bytes trans.
Port 0~1024      < 7               20~50%            N/A
Port 1025~5000   < 10              > 30%
Port > 5000      < 5               < 20%
Further % of bytes trans. values shown on the slide (rows unclear): > 90%, < 9%

Normal workstation vs. Scanning worm
Not a general case

                         A normal workstation   A normal workstation +
                                                (protocol/port) Scanning worm
Bytes transferred        < 20 MB                > 20 MB
Internal DIP contacted   < 10                   < 10
External DIP contacted   < 20                   >> 20
ICMP sent                < 2%                   < 2% (?)
TCP sent                 > 70%                  > 70% (?)
UDP sent                 < 30%                  < 30% (?)
Total observed proto.    < 5                    256 (all possible)

Destination port usage, normal workstation + scanning worm
                 # of ports used   % of ports used   % of bytes trans.
Port 0~1024      > 7               20~50%            N/A
Port 1025~5000   > 10              < 30%             < 90%
Port > 5000      > 5               > 20%             > 9%

Destination port usage, normal workstation
                 # of ports used   % of ports used   % of bytes trans.
Port 0~1024      < 7               20~50%            N/A
Port 1025~5000   < 10              > 30%
Port > 5000      < 5               < 20%
Further % of bytes trans. values shown on the slide (rows unclear): > 90%, < 9%

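One way to use the normal-workstation column is as a set of upper bounds and to report which metrics of a host exceed them. The sketch below assumes a profile is a plain dictionary; the key names, `BASELINE_LIMITS`, and `deviations` are hypothetical, and the bounds are simply the "<" values from the slides, with MB read as decimal megabytes.

```python
MB = 10**6  # assumption: decimal megabytes

# Upper bounds taken from the "normal workstation" column on the slides.
BASELINE_LIMITS = {
    "bytes": 20 * MB,        # bytes transferred < 20 MB
    "internal_dst": 10,      # internal DIP contacted < 10
    "external_dst": 20,      # external DIP contacted < 20
    "protocols_seen": 5,     # total observed protocols < 5
}


def deviations(profile, limits=BASELINE_LIMITS):
    """Return the metrics whose value is not below the baseline bound."""
    return {
        key: profile[key]
        for key, bound in limits.items()
        if key in profile and profile[key] >= bound
    }


if __name__ == "__main__":
    # Illustrative values in the spirit of the scanning-worm column.
    worm_like = {"bytes": 25 * MB, "internal_dst": 8,
                 "external_dst": 5000, "protocols_seen": 256}
    print(sorted(deviations(worm_like)))
```

A check like this only says that a host left the baseline, not why. The NT file server and the laser printer would also be flagged or look unusual, which is why the slides treat each role as its own profile.
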
NT file server & Laser printer
Can easily be regarded as anomalies

                         A Windows NT server   A laser printer
Bytes transferred        93 MB                 < 3 MB
Internal DIP contacted   All on host /24       3
External DIP contacted   Large number          5
ICMP sent                2%                    0.4%
TCP sent                 65%                   99.6%
UDP sent                 33%                   0%
Total observed proto.    < 5                   2

Dst. Port Usage, Windows NT server
  Uniform across every destination port
Dst. Port Usage, laser printer
  All > 1032 (except reports on port 0)
  Byte volume is uniformly distributed across ports 1032~58631

Behavioral Anomalies of P2P Activity

Use known ports to identify P2P traffic
Results:
  Byte volume vs. flow volume of overall P2P traffic
  Observe growth in unique source and destination IPs
  Observe flow volume of each P2P application

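Port-based identification as described above can be sketched as a lookup table plus two counters. The slides do not list the ports that were used, so the entries in `P2P_PORTS` are assumptions (commonly cited default ports for eDonkey and BitTorrent); the flow tuple layout and function names are hypothetical as well. Skype is left out because the slides do not say how it was identified.

```python
from collections import Counter

# Assumed default ports; the slides do not give the actual port list.
P2P_PORTS = {
    ("tcp", 4662): "eDonkey",
    ("udp", 4672): "eDonkey",
}
P2P_PORTS.update({("tcp", p): "BitTorrent" for p in range(6881, 6890)})


def label_flow(proto, sport, dport):
    """Return the application name if either port is in the table."""
    for port in (dport, sport):
        app = P2P_PORTS.get((proto, port))
        if app:
            return app
    return None


def p2p_volume(flows):
    """Count flows and bytes per application.

    flows: iterable of (proto, sport, dport, nbytes) tuples.
    """
    flow_count, byte_count = Counter(), Counter()
    for proto, sport, dport, nbytes in flows:
        app = label_flow(proto, sport, dport)
        if app:
            flow_count[app] += 1
            byte_count[app] += nbytes
    return flow_count, byte_count
```

Keeping flow counts and byte counts separate mirrors the byte-volume vs. flow-volume comparison in the results. It also shows the weakness of the approach: any application that moves to a non-default port disappears from both counters.
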
Some Results (P2P behaviors)
[Figures: overall P2P flow volume (TCP, UDP); eDonkey flow volume; BitTorrent flow volume]

Some Results (P2P behaviors)
[Figure: Skype flow volume of a single workstation]

Part 2: Unique Identity Assignment
My Comment:
  More interesting, but only preliminary work

The approach

Assumptions
  Identifying a host by its IP address is unavailable or untrustworthy
  Look to other information available in the flow record
Approach:
  Step 1: develop a frequency histogram
    Decide features (tuples)
    Compute the relative frequency of each tuple occurrence
    The frequency values represent the behavior of a host
  Step 2: train and test using a neural network

Develop a frequency histogram

Three-element tuples
  (Protocol, Destination Port, Byte Range)
  First two: usage
  Last one: volume
Example:
Define 23 possible tuples in the model

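Step 1 can be sketched as counting how often each model tuple occurs among a host's flows and dividing by the number of flows. The slides give neither the byte ranges nor the 23 tuples, so `BYTE_EDGES`, the sample `model_tuples`, and the function names below are assumptions for illustration only.

```python
from bisect import bisect_right
from collections import Counter

# Assumed byte-range boundaries; the slides do not give the actual ranges.
BYTE_EDGES = [100, 1_000, 10_000]


def byte_range(nbytes):
    """Index of the byte range a flow falls into (0 .. len(BYTE_EDGES))."""
    return bisect_right(BYTE_EDGES, nbytes)


def frequency_histogram(flows, model_tuples):
    """Relative frequency of each model tuple among one host's flows.

    flows: iterable of (protocol, dst_port, nbytes)
    model_tuples: ordered list of (protocol, dst_port, byte_range) tuples
    """
    counts = Counter((proto, dport, byte_range(n)) for proto, dport, n in flows)
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in model_tuples]


if __name__ == "__main__":
    flows = [(6, 25, 500), (6, 25, 800), (17, 53, 60), (6, 445, 20_000)]
    model_tuples = [(6, 25, 1), (17, 53, 0), (6, 80, 0)]  # hypothetical model
    print(frequency_histogram(flows, model_tuples))
```

The result is a fixed-length vector per host, which is the kind of input Step 2 would hand to the neural network. Flows that match no model tuple still count toward the total, so the vector need not sum to 1.
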
Conclusion

This paper presents a methodology for profiling host behavior based on connection patterns
  Using NetFlow records
  Profiles cover: a normal workstation, an Internet game server, a scanning worm, an NT file server, and a laser printer
  Also discusses the behavior of P2P applications
Presents an overview of a technique for identity classification of hosts
  Based on NetFlow characteristics other than the IP address

My comment

Not clear enough about the unique identity assignment approach
Profiling mechanism is too straightforward
P2P application identification relies only on ports
Something difficult to explain…
  Why not ignore it… (e.g., traffic on port 80)
View the network behaviors as a whole, rather than looking at only one specific aspect
Lacks a detailed literature survey

Appendix
Behaviors of an on-line game server
[Figure: accumulated monthly bytes per Dst. port (outbound); labels: game server, NT file server, “Remove this server”, unsuccessful attempts; port marks at 1024 and 5000]
[Figure: packet volume (log-scale) to the game server (hourly inbound, on port 27015)]