Correcting Congestion-Based Error in
Network Telescope’s Observations of
Worm Dynamics
Songjie Wei, Computer & Information Sciences Dept., University of Delaware
Jelena Mirkovic, Information Sciences Institute, University of Southern California
Proceedings of the ACM Internet Measurement Conference (IMC) 2008
Introduction
Worms compromise hosts and interfere with network operation
  Worms enlist bots, crash routers, congest links
Network telescopes monitor Internet-wide security incidents
  An allocated portion of IPv4 space, otherwise unused
  Collecting data about worms, spoofed DDoS, etc.
Network telescopes collect information about worm dynamics
  CAIDA telescope, WAIL telescope
  Code Red (2001), Slammer (2003), Witty worm (2004)
Motivation
How to Correct
Validation
Conclusions
Sample Telescope Observation of a Worm
Slammer’s early progress [Moore03] according to the WAIL telescope (/8) at University of Wisconsin
[Figure annotations: 80M x 404 B (worm scan size) x 8 / 256 (telescope size) ≈ 1 Gbps; "Too slow"]
[Moore03] David Moore, Vern Paxson, Stefan Savage, Colleen Shannon, Stuart Staniford, and Nicholas Weaver. Inside the Slammer Worm. IEEE Security and Privacy, Jul/Aug 2003.
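The bandwidth annotation on this slide can be rechecked with a few lines of arithmetic. This is a minimal sketch that assumes the 80M figure is a global rate in scans per second and that scans are spread uniformly over IPv4, so a /8 telescope is sent 1/256 of them; the variable names are illustrative.

```python
# Assumptions (not spelled out on the slide): "80M" is the global scan
# rate in scans per second, and scans are uniformly random over IPv4.
GLOBAL_SCANS_PER_SEC = 80e6    # "80M"
SCAN_SIZE_BYTES = 404          # Slammer worm scan size
TELESCOPE_FRACTION = 1 / 256   # a /8 covers 1/256 of the IPv4 space

# Bits per second aimed at the telescope's address block.
telescope_bps = GLOBAL_SCANS_PER_SEC * SCAN_SIZE_BYTES * 8 * TELESCOPE_FRACTION
print(f"{telescope_bps / 1e9:.2f} Gbps")
```

Under these assumptions the result is about 1 Gbps, matching the slide's figure.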
Why Do We Need Accurate Observations?
Curiosity
  We should know what is REALLY going on
Sometimes the only way to test worm defenses is with a simulator or a model
  Internet-scale defenses
  Also new worm propagation approaches
Simulators and models must be validated against some ground truth, usually a telescope’s observations
  Inaccurate observations lead to inaccurate simulators and models
Network Telescope’s Limitations
Aggressive worms saturate Internet links and cause packet drops
  Slammer (404-byte worm packet) infected 75,000+ hosts, with a peak rate of 26,000 scans per second per infected host
  Witty infected 12,000 computers, generating 90 Gbps
Telescopes may observe only some worm infectees and receive only a portion of the worm scans sent to them
  Slow or short-lived infectees are less likely to be observed
  Worm scans get dropped at crashed routers or on congested links (intermediate network links, the telescope’s access link)
Telescope’s Inference of Worm Dynamics
Scaling up a telescope’s local observation by the telescope’s size to infer the global worm dynamics
  For a /8 telescope: local_scans x 256 = global_scans, local_infected = global_infected?
The traditional inference may be wrong, due to congestion and packet loss
  Underestimated number of infected hosts
  Underestimated worm scanning rates (both the global rate and individual infectees’ rates)
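The traditional scale-up and its failure under loss can be illustrated with a small sketch. The scan rate and the 40% loss figure below are hypothetical numbers chosen for illustration, not measurements from the talk.

```python
IPV4_SIZE = 2 ** 32

def naive_scale_up(local_scans, prefix_len=8):
    """Traditional inference: local scans times (IPv4 size / telescope size)."""
    telescope_size = 2 ** (32 - prefix_len)
    return local_scans * (IPV4_SIZE // telescope_size)

# Hypothetical infectee: 25,600 scans/s Internet-wide, uniformly random,
# with 40% of the scans headed to the telescope dropped on the way.
true_global = 25_600
sent_to_telescope = true_global / 256       # scans/s aimed at the /8
received = sent_to_telescope * (1 - 0.4)    # scans/s that actually arrive

estimate = naive_scale_up(received)
print(estimate, "<", true_global)           # the estimate falls short
```

Because the scale-up multiplies whatever arrives, any loss on the path shows up directly as an underestimate of the global rate.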
Our Research Objectives
Measure the congestion severity based on
telescope’s observation
Estimate scan loss and arrival ratio for each infectee
Infer worm’s global dynamics
Correct telescope’s local observation with
consideration of network congestion and packet loss
Scale up the local worm observation correctly to reflect the global dynamics of worm propagation in the Internet
Related Work
Worm simulation scale-down [Weaver04]
Not clear how you would use it to scale up
Worm forensics [Kumar05, Hama06]
More accurate than our method, but needs the worm source and human involvement
Complementary
Correcting telescope bias [Zou03]
Bias occurs due to delay between a host’s infection
and scans seen by telescope
Minor for /8 telescopes
Assumptions about Worms and Internet
Constant infectee scanning rate
Scanning rate is bounded by infectee’s capability (CPU,
memory, access bandwidth, OS, etc.)
Packet loss mainly due to congestion
Many incidents and countermeasures cause packet
loss
Congestion, routing failure, security countermeasures, etc.
Hosts sharing routing paths have the same loss ratio
No significant loss in early stage of worm spread
Congestion gradually builds up as more hosts are
infected
Early stage is the time when no or only a few packets are dropped
Scan Arrival Ratio
Inferring the constant scanning rate
$S = \frac{1}{T} \sum_{t=1}^{T} R(t)$
  S: the infectee’s constant sending rate to the telescope
  T: duration of the early stage
  R(t): scans received by the telescope during time t
Scan arrival ratio: the percentage of worm scans sent by an infectee to the telescope that is successfully received
$P(t) = \frac{R(t)}{S}$
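The two definitions on this slide translate directly into code. The sketch below assumes the per-time-unit scan counts for one infectee are already available as a list; the function names and the cap at 1.0 (to absorb measurement noise) are assumptions added here, not part of the talk.

```python
def constant_sending_rate(received, T):
    """S: mean scans per time unit over the early stage t = 1..T."""
    return sum(received[:T]) / T

def scan_arrival_ratio(received, T):
    """P(t) = R(t) / S for every time unit.

    Assumption: ratios are capped at 1.0, since noise can push R(t)
    slightly above the inferred constant rate S.
    """
    S = constant_sending_rate(received, T)
    return [min(r / S, 1.0) for r in received]

# Hypothetical observation: steady arrivals for 4 time units, then loss.
received = [100, 98, 102, 100, 60, 40]
print(constant_sending_rate(received, T=4))
print(scan_arrival_ratio(received, T=4))
```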
Inferring the End of the Early Stage
Epidemic model of worm propagation
  Cliff C. Zou, Lixin Gao, Weibo Gong, and Don Towsley. “Monitoring and Early Warning for Internet Worms,” ACM CCS, 2003.
T is when the number of infected hosts departs from the exponential model
  Due to congestion or end of slow start phase
End of Early Stage: Time T
We measure the match between the number of
infected hosts and the exponential model
R-squared: a statistical measure, between 0 and 1,
shows fit between a model’s prediction and real values
0 means totally different, and 1 means perfect match
We define T as the time when R-squared value starts to
decrease continuously from 1
Hosts infected before T are called early infectees
Underestimating T is OK, as long as we identify
enough early infectees
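One way to turn the R-squared rule into code is sketched below. It is not the authors' implementation: it assumes the exponential model is fitted by least squares on the logarithm of the infected count, reads "starts to decrease continuously" as a run of consecutive drops, and uses hypothetical parameters `min_points`, `run`, and `eps`.

```python
import math

def r_squared_exponential(counts):
    """R^2 of a least-squares line fitted to log(counts) against time."""
    n = len(counts)
    xs = range(n)
    ys = [math.log(c) for c in counts]   # counts must be positive
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot

def end_of_early_stage(counts, min_points=3, run=3, eps=1e-9):
    """Return the 0-based time index T after which R^2 drops `run` times in a row."""
    r2 = [r_squared_exponential(counts[:k])
          for k in range(min_points, len(counts) + 1)]
    for i in range(len(r2) - run):
        if all(r2[j + 1] < r2[j] - eps for j in range(i, i + run)):
            return i + min_points - 1    # last time index of the fitted prefix
    return len(counts) - 1               # no sustained drop found

# Hypothetical infected-host counts: doubling, then flattening.
counts = [1, 2, 4, 8, 16, 32, 64, 128, 160, 180, 190, 195, 198]
print(end_of_early_stage(counts))
```

A smaller `run` makes the detector more eager and tends to underestimate T, which the slide notes is acceptable as long as enough early infectees remain.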
Scan Arrival Ratio for Late Infectees
Infectees observed after time T are identified as
late infectees
We cannot infer their constant scanning rates directly
because their scans are always observed with loss
We differentiate between two cases
If we know the ratio of another infectee sharing the
same inter-AS routing path, we use this known ratio
Infectees sharing a routing path suffer the same congestion
Otherwise we use the average of ratios of all known
early infectees
Better than random guess
More accurate when congestion is closer to the telescope
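The two cases for late infectees amount to a lookup with a fallback, sketched below. Representing an inter-AS routing path as a tuple of AS numbers and keeping known ratios in a dictionary are assumptions made for illustration.

```python
def late_infectee_ratio(path, ratio_by_path, early_ratios):
    """Pick a scan arrival ratio for a late infectee.

    path:          hypothetical key for the inter-AS routing path,
                   e.g. a tuple of AS numbers
    ratio_by_path: known ratios of infectees, keyed by their path
    early_ratios:  ratios of all known early infectees
    """
    if path in ratio_by_path:
        # Case 1: another infectee shares the path, so reuse its ratio.
        return ratio_by_path[path]
    # Case 2: fall back to the average over all known early infectees.
    return sum(early_ratios) / len(early_ratios)

known = {(64500, 64501): 0.5, (64502, 64501): 0.75}
early = [0.5, 0.75, 1.0]
print(late_infectee_ratio((64500, 64501), known, early))  # shared path
print(late_infectee_ratio((64510, 64501), known, early))  # fallback average
```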
Inferring Worm Dynamics
Global number of infected hosts
  Number of infected hosts in the Internet
  Also hosts that are infected but not yet seen by telescope due to congestion
Worm’s Internet scanning rate
  Number of worm scans sent into the Internet per second
Infectees’ scanning rate distribution
  Reflect the infectees’ aggressiveness
  Limited by infectee’s features and configurations
Global Number of Infected Hosts
∆Ir(t) = Ir(t) – Ir(t-1)
Ir(t): number of infectees by time t
∆Io(t) = Io(t) – Io(t-1)
Io(t): number of infectees observed by time t
Uo(t): number of infectees not observed by time t
Pagg(t): aggregated scan arrival ratio of infectees
Smed(t-1): median number of scans sent to the telescope per second per infectee
Probability that an infectee is seen by the telescope during time t:
$1 - (1 - P_{agg}(t-1))^{S_{med}(t-1)}$
Probability that an infectee is NOT seen by the telescope during time t:
$(1 - P_{agg}(t-1))^{S_{med}(t-1)}$
$\Delta I_r(t) = \frac{\Delta I_o(t)}{1 - (1 - P_{agg}(t-1))^{S_{med}(t-1)}} - U_o(t-1)$
$U_o(t) = (\Delta I_r(t) + U_o(t-1)) \cdot (1 - P_{agg}(t-1))^{S_{med}(t-1)}$
Inference Example
t   ∆Io  Pagg  Smed  Io   ∆Ir      Uo       Ir
0   0    1     0     0    0        0        0
1   1    0.8   5     1    1        0        1
2   3    0.5   5     4    3.00096  0.00096  4.00096
3   2    0.5   5     6    2.06     0.065    6.06
4   3    0.2   5     9    3.03     0.097    9.10
5   6    0.2   5     15   8.83     2.92     17.92
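The example values can be reproduced by iterating the recurrences for ∆Ir(t) and Uo(t) from the previous slide. The sketch below is one reading of those recurrences; treating a ratio of 1 as "every infectee is seen" (the t = 0 row) is an assumption added to avoid an undefined 0^0 case.

```python
def infer_infected(delta_io, p_agg, s_med):
    """Return a list of (delta_Ir, Uo, Ir) tuples, one per time step."""
    ir, uo = 0.0, 0.0
    rows = [(0.0, 0.0, 0.0)]                 # t = 0: nothing infected yet
    for t in range(1, len(delta_io)):
        p, s = p_agg[t - 1], s_med[t - 1]
        # Probability that an infectee is NOT seen during one time unit.
        # Assumption: with no loss (p == 1) every infectee is seen.
        p_unseen = 0.0 if p >= 1.0 else (1.0 - p) ** s
        d_ir = delta_io[t] / (1.0 - p_unseen) - uo   # newly infected hosts
        uo = (d_ir + uo) * p_unseen                  # still unobserved
        ir += d_ir
        rows.append((d_ir, uo, ir))
    return rows

# Inputs taken from the example on this slide.
delta_io = [0, 1, 3, 2, 3, 6]
p_agg = [1, 0.8, 0.5, 0.5, 0.2, 0.2]
s_med = [0, 5, 5, 5, 5, 5]
for t, (d_ir, uo, ir) in enumerate(infer_infected(delta_io, p_agg, s_med)):
    print(t, round(d_ir, 5), round(uo, 5), round(ir, 5))
```

At t = 5 this gives roughly 17.92 infected hosts against 15 observed, so about 2.92 hosts are inferred to be infected but not yet seen.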
Variation of Scan Arrival to Telescope
[Figure: scans per second (y-axis, 0 to 140) against time in seconds (x-axis, 1 to 40). Annotations: early stage (no congestion) up to time T, congestion afterwards; constant sending rate to the telescope.]
A sample observation of scans from an infectee to a /8 telescope
(from our simulation of Witty worm)
Worm’s Internet Scanning Rate
Ri(t): number of scans received by the telescope from infectee i during time t
Pi(t): scan pass ratio of infectee i during time t
$\text{Scanning Rate} = \frac{I_r(t)}{I_o(t)} \times \sum_{i \in I_o(t)} \frac{R_i(t)}{P_i(t)} \times \frac{\text{IPv4}}{\text{telescope\_size}}$
  Ir(t) / Io(t): counting those infectees that are not yet observed
  Ri(t) / Pi(t): each infectee’s sending rate
  IPv4 / telescope_size: scaling up local observation to global dynamics
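The three factors of the scanning-rate formula can be computed as in the sketch below. The dictionary layout, the function name, and the sample numbers are assumptions for illustration; the default telescope size corresponds to a /8.

```python
IPV4_SIZE = 2 ** 32

def internet_scanning_rate(received, ratios, inferred_infected,
                           telescope_size=2 ** 24):
    """Estimate the worm's Internet-wide scans per time unit.

    received:          {infectee id: Ri(t)} for observed infectees
    ratios:            {infectee id: Pi(t)}, each greater than 0
    inferred_infected: Ir(t), including hosts not yet observed
    """
    observed = len(received)                              # Io(t)
    # Each observed infectee's loss-corrected sending rate to the telescope.
    sent_to_telescope = sum(received[i] / ratios[i] for i in received)
    unobserved_factor = inferred_infected / observed      # Ir(t) / Io(t)
    scale = IPV4_SIZE / telescope_size                    # 256 for a /8
    return unobserved_factor * sent_to_telescope * scale

# Hypothetical step: two observed infectees, three inferred in total.
received = {"a": 50, "b": 30}
ratios = {"a": 0.5, "b": 1.0}
print(internet_scanning_rate(received, ratios, inferred_infected=3))
```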
Infectees’ Scanning Rate
We use the maximum of an infectee’s inferred scanning rates as the infectee’s original scanning rate
$B_i = \max_{t \ge 0} \left( \frac{R_i(t)}{P_i(t)} \times \frac{\text{IPv4}}{\text{telescope\_size}} \right)$
  Bi: scanning rate of infectee i
  Ri(t) / Pi(t): sending rate of infectee i at time t
  IPv4 / telescope_size: scaling up from telescope’s observation to the whole Internet
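Taking the maximum over time is a one-liner once the per-time-unit counts and ratios are known. The sketch below assumes both series are aligned lists for a single infectee; skipping time units with a zero ratio is an added assumption to avoid division by zero.

```python
def infectee_scanning_rate(received_series, ratio_series,
                           telescope_size=2 ** 24):
    """Bi: the largest loss-corrected, scaled-up rate seen for one infectee."""
    scale = 2 ** 32 / telescope_size          # 256 for a /8 telescope
    return max(r / p * scale
               for r, p in zip(received_series, ratio_series)
               if p > 0)                      # assumption: skip zero ratios

# Hypothetical infectee: Ri(t) and Pi(t) over three time units.
print(infectee_scanning_rate([100, 60, 40], [1.0, 0.5, 0.5]))
```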
Inferring Global Dynamics of Witty Worm
Witty worm trace
From CAIDA telescope, a /8 block of IPv4
Scans begin on March 19, 2004 at 4:45am UTC
Original IP addresses
Ground truth [Kumar05]
Our experiment
Avoid infectees within NATs
Filter out infectees with fewer than 20 scans
First 75 minutes, 11,326 infectees, 45.5 M scans
[Kumar05] A. Kumar, V. Paxson and N. Weaver, “Exploiting Underlying
Structure for Reconstruction of an Internet-Scale Event”, IMC 2005.
Inferring T - End of the Early Stage
[Figure: R-squared value and # of observed infectees]
Witty Worm’s Internet Scanning Rate
Larger T tends to underestimate the packet loss and thus
the Internet-wide worm propagation strength.
Infectees’ Bandwidth
Two error sources:
1. We use average scan arrival ratio for some late infectees
2. We lack information about slow/short-life infectees
Conclusions and Future Work
We investigate the congestion effect on network
telescope’s observation of worm propagation
We propose ways to estimate congestion packet
loss, and to correct telescope’s observation
We correct CAIDA telescope’s observation of Witty
We plan to extend our work for other error
sources
Influence of various telescope sizes on our analysis
NATs, filters, non-uniform scanning worms
Questions? Comments?
Jelena Mirkovic
Songjie Wei
[PAWS] S. Wei and J. Mirkovic, “A Realistic Simulation of Internet-Scale Events,”
Proceedings of the 2006 VALUETOOLS Conference, October 2006