Internet - Computer Science & Engineering

Lecture 2:
Internet Measurement
CS 790g: Complex Networks
Final Project
 Analyze a network
 What it should be
 More than just a measurement of network characteristics
 An interpretation of measurement results
 If applicable:
 discovery of community or other structures
 motifs
 weights, thresholds
 longitudinal data (how the network changes over time)
 Visualizations of the network that point out a particular feature
 Qualitative comparison with other networks
 What it should not be
 a literature review
 recapitulation of existing work
 raw analysis of data
 The data can be artificially generated or a real-world dataset
Final Project
 New network model
 What it should be
 Method for generating a network
 e.g. preferential attachment
 optimization with respect to different criteria
 Analysis of resulting network
 comparison with random graphs
 how do attributes change depending on model parameters
 What it should not be
 an already thoroughly explored model
Final Project
 Theory development
 What it should be
 An algorithm to analyze the network
 e.g. clustering or community detection algorithm
 webpage ranking algorithm
 OR a process that is influenced by the network
 gossip spreading
 games such as the prisoner’s dilemma
 Analysis of algorithm on several different networks
 What it should not be
 an exact replica of an existing algorithm applied to a network where
it has already been studied
Final Project
 Epidemic Characterization
 What it should be
 In-depth study of an epidemic phenomenon
 fads in online content;
 virus and worm spreading in information networks;
 or word-of-mouth in product marketing
 What it should not be
 a replica of an existing study
Internet
 Web of interconnected networks
 Grows with no central authority
 Autonomous Systems optimize local communication efficiency
 The building blocks are engineered and studied in depth
 Global entity has not been characterized
 Most real-world complex networks have non-trivial properties.
 Global properties cannot be inferred from local ones
 Engineered with large technical diversity
 Range from local campuses to transcontinental backbone
providers
Internet Measurements
 Need for Internet measurements arises due to
commercial, social, and technical issues
 Realistic simulation environment for developed products
 Improve network management
 Robustness with respect to failures/attacks
 Comprehend spreading of worms/viruses
 Know social trends in Internet use
 Scientific discovery
 Scale-free (power-law), Small-world, Rich-club, Disassortativity, …
Internet Topology Measurement
[Figures omitted; source: CAIDA 2006]
Internet Topology Measurements
Probing
 Direct probing
[Figure: vantage point and routers A, B, C, D; labels IPB, IPD, TTL=64]
 Indirect probing
[Figure: vantage point and routers A, B, C, D; probe labels IPD TTL=1 and TTL=2; address labels IPB, IPC]
http://www.caida.org/publications/animations/active_monitoring/traceroute.mpg
Internet Topology Measurement
Topology Collection (traceroute)
 Probe packets are carefully constructed to elicit intended response
from a probe destination
[Figure: vantage point S, routers A, B, C, and destination D; probes with TTL=1 through TTL=4; address labels IPA, IPB, IPC, IPD]
 traceroute probes all nodes on a path towards a given destination
 TTL-scoped probes obtain ICMP error messages from routers on the path
 Each ICMP message carries the IP address of the intermediate router as its source
 Merging end-to-end path traces yields the network map
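The steps on this slide can be sketched as a small simulation. It sends no real packets: the `PATHS` table, the node names, and the `send_probe` helper are hypothetical stand-ins for the network, chosen only to show how TTL-scoped probes reveal one hop at a time and how merged traces form a map.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical router-level paths from the vantage point to two destinations.
# Each list holds the hops a packet traverses, ending at the destination.
PATHS: Dict[str, List[str]] = {
    "D": ["A", "B", "C", "D"],
    "E": ["A", "B", "E"],
}


def send_probe(destination: str, ttl: int) -> str:
    """Simulate one TTL-scoped probe and return the address that answers.

    Every hop decrements the TTL; the hop where it reaches zero replies
    with an ICMP error whose source address identifies that hop.
    """
    path = PATHS[destination]
    return path[min(ttl, len(path)) - 1]


def traceroute(source: str, destination: str, max_ttl: int = 30) -> List[str]:
    """Probe with TTL=1, 2, ... until the destination itself answers."""
    trace = [source]
    for ttl in range(1, max_ttl + 1):
        hop = send_probe(destination, ttl)
        trace.append(hop)
        if hop == destination:
            break
    return trace


def merge_traces(traces: List[List[str]]) -> Set[Tuple[str, ...]]:
    """Merge path traces into an undirected edge set (the network map)."""
    edges: Set[Tuple[str, ...]] = set()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            edges.add(tuple(sorted((a, b))))
    return edges


traces = [traceroute("S", dest) for dest in PATHS]
network_map = merge_traces(traces)
print(traces)
print(sorted(network_map))
```

The two traces share the links S-A and A-B, so the merged map has five links rather than seven. A real tool would also have to handle hops that never answer, which this sketch ignores.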
Internet Topology Measurement:
Background
[Figure: Internet2 backbone with routers S, U, K, C, W, N, L, A, H, their interfaces (s.1, s.2, s.3, u.1, ...), and end hosts d, e, f; a trace to Seattle and a trace to NY are marked]
Topology Sampling
Issues
 Sampling to discover networks
 Infer characteristics of the topology
 Different studies considered
 Effect of sample size [Barford 01]
 Sampling bias [Lakhina 03]
 Path accuracy [Augustin 06]
 Sampling approach [Gunes 07]
 Utilized protocol [Gunes 08]
 ICMP echo request
 TCP SYN
 UDP port unreachable
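The effect of sample size can be illustrated with a toy experiment. This is a sketch under strong assumptions, not a reproduction of any cited study: the topology is a hypothetical ring with chords, and each vantage point is assumed to observe exactly one shortest-path (BFS) tree, whereas real Internet routes follow routing policy.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def ring_with_chords(n: int) -> Dict[int, List[int]]:
    """Hypothetical test topology: an n-node ring plus chords that connect
    each even-numbered node to the node halfway around the ring."""
    adj: Dict[int, Set[int]] = {v: set() for v in range(n)}
    for v in range(n):
        nxt = (v + 1) % n
        adj[v].add(nxt)
        adj[nxt].add(v)
        if v % 2 == 0:
            w = (v + n // 2) % n
            adj[v].add(w)
            adj[w].add(v)
    return {v: sorted(nbrs) for v, nbrs in adj.items()}


def bfs_tree_edges(adj: Dict[int, List[int]], source: int) -> Set[Edge]:
    """Links seen when one shortest path is traced from source to every node."""
    visited = {source}
    queue = deque([source])
    edges: Set[Edge] = set()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                edges.add((min(u, v), max(u, v)))
                queue.append(v)
    return edges


def coverage(adj: Dict[int, List[int]], vantage_points: Iterable[int]) -> float:
    """Fraction of all links discovered by the given vantage points."""
    total = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}
    seen: Set[Edge] = set()
    for s in vantage_points:
        seen |= bfs_tree_edges(adj, s)
    return len(seen) / len(total)


graph = ring_with_chords(12)
for k in (1, 2, 4):
    print(k, "vantage points:", round(coverage(graph, range(k)), 2))
```

A single vantage point can see at most a spanning tree, here 11 of the 15 links. Adding vantage points can only add links, which is one simple way to see why the sample, not only the network, shapes the inferred topology.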
Anonymous Router Resolution
Problem
 Anonymous routers do not respond to traceroute probes and appear as a * in path traces
 The same router may appear as a * in multiple traces.
 Anonymous nodes belonging to the same router should be resolved.
 Anonymity Types
1. Ignore all ICMP packets
2. ICMP rate-limiting
3. Ignore ICMP when congested
4. Filter ICMP at border
5. Private IP address
Anonymous Router Resolution
Problem
[Figure: Internet2 backbone with routers S, N, C, U, W, K, L, A, H and end hosts d, e, f]
Traces
•d--L-S-e
•d--A-W--f
•e-S-L--d
•e-S-U--C--f
•f--C---d
•f--C--U-S-e
Anonymous Router Resolution
Problem
[Figure: the sampled network (routers S, U, K, C, N, L, H, A, W; end hosts d, e, f) and the resulting network built from the traces above, in which only S, U, C, L, A, W and d, e, f are labeled]
Graph Based Induction
Common Structures
[Figure: examples of four common structures: parallel nodes, clique, complete bipartite, star]
Alias Resolution:
 Each interface of a router has an IP address.
 A router may respond with different IP addresses to different queries.
[Figure: a router labeled Denver with interface labels .5, .7, .13, .18, .33]
 Alias Resolution is the process of grouping the interface IP addresses of each router into a single node.
 Inaccuracies in alias resolution may result in a network map that
 includes artificial links/nodes
 misses existing links
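The grouping step can be sketched with a union-find structure. How alias evidence is obtained is not covered on this slide, so the pairwise evidence in `pairs` is simply assumed as input; the addresses are hypothetical and come from documentation ranges, and the `AliasSets` class is an illustration, not a known tool.

```python
from typing import Dict, Iterable, List, Tuple


class AliasSets:
    """Union-find over interface addresses; each set stands for one router."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, ip: str) -> str:
        """Return the representative address of the set containing ip."""
        self.parent.setdefault(ip, ip)
        while self.parent[ip] != ip:
            self.parent[ip] = self.parent[self.parent[ip]]  # path halving
            ip = self.parent[ip]
        return ip

    def union(self, a: str, b: str) -> None:
        """Record that addresses a and b belong to the same router."""
        self.parent[self.find(a)] = self.find(b)


def group_aliases(ips: Iterable[str],
                  alias_pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group addresses into routers given pairwise alias evidence."""
    ips = list(ips)
    sets = AliasSets()
    for ip in ips:
        sets.find(ip)  # register every address, even ones with no evidence
    for a, b in alias_pairs:
        sets.union(a, b)
    groups: Dict[str, List[str]] = {}
    for ip in ips:
        groups.setdefault(sets.find(ip), []).append(ip)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical addresses and assumed pairwise alias evidence.
ips = ["192.0.2.5", "192.0.2.7", "192.0.2.13", "198.51.100.1"]
pairs = [("192.0.2.5", "192.0.2.7"), ("192.0.2.7", "192.0.2.13")]
routers = group_aliases(ips, pairs)
print(routers)
```

Evidence is transitive here: `.5` and `.13` end up on the same router although no pair links them directly. That also means one wrong pair merges two routers, which is one way the inaccuracies listed above arise.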
IP Alias Resolution
Problem
[Figure: Internet2 backbone with routers S, U, K, C, W, N, L, A, H, their interfaces (s.1, s.2, s.3, u.1, ...), and end hosts d, e, f]
Traces
• d - h.4 - l.3 - s.2 - e
• d - h.4 - a.3 - w.3 - n.3 - f
• e - s.1 - l.1 - h.1 - d
• e - s.1 - u.1 - k.1 - c.1 - n.1 - f
• f - n.2 - c.2 - k.2 - h.2 - d
• f - n.2 - c.2 - k.2 - u.2 - s.3 - e
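The traces above can be turned into a map with and without alias resolution. The sketch assumes a perfect resolver, `router_of`, that relies on the naming convention of this example (interface `x.n` belongs to router `X`); real alias resolution has no such shortcut and must infer the grouping from measurements.

```python
from typing import Callable, List, Set, Tuple

TRACES: List[str] = [
    "d - h.4 - l.3 - s.2 - e",
    "d - h.4 - a.3 - w.3 - n.3 - f",
    "e - s.1 - l.1 - h.1 - d",
    "e - s.1 - u.1 - k.1 - c.1 - n.1 - f",
    "f - n.2 - c.2 - k.2 - h.2 - d",
    "f - n.2 - c.2 - k.2 - u.2 - s.3 - e",
]


def router_of(address: str) -> str:
    """Assumed ground truth: interface 'x.n' belongs to router 'X'.

    End hosts such as 'd' contain no dot and are returned unchanged.
    """
    name, dot, _ = address.partition(".")
    return name.upper() if dot else name


def build_map(traces: List[str],
              resolve: Callable[[str], str] = lambda a: a
              ) -> Tuple[Set[str], Set[Tuple[str, ...]]]:
    """Merge traces into node and link sets, mapping each hop via resolve."""
    nodes: Set[str] = set()
    links: Set[Tuple[str, ...]] = set()
    for line in traces:
        hops = [resolve(h.strip()) for h in line.split(" - ")]
        nodes.update(hops)
        for a, b in zip(hops, hops[1:]):
            if a != b:  # skip self-links created by merging interfaces
                links.add(tuple(sorted((a, b))))
    return nodes, links


raw_nodes, raw_links = build_map(TRACES)
res_nodes, res_links = build_map(TRACES, router_of)
print(len(raw_nodes), "nodes and", len(raw_links), "links without alias resolution")
print(len(res_nodes), "nodes and", len(res_links), "links with alias resolution")
```

Without resolution the six traces yield 22 nodes; with it they collapse to 12, the nine routers plus the three end hosts.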
IP Alias Resolution
Problem
[Figure: the sampled network (routers S, U, K, C, N, L, H, A, W; end hosts d, e, f) and the sample map without alias resolution, whose nodes are the observed interfaces (s.1, s.2, s.3, u.1, u.2, k.1, k.2, c.1, c.2, n.1, n.2, n.3, w.3, l.1, l.3, a.3, h.1, h.2, h.4) and d, e, f]
Genuine Subnet Resolution
Problem
 Alias resolution
 IP addresses that belong to the same router
[Figure: interface labels IP1 through IP6]
 Subnet resolution
 IP addresses that are connected over the same medium
[Figure: interface labels IP1, IP2, IP3]
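One ingredient of subnet resolution, grouping addresses by a shared prefix, can be sketched with Python's standard `ipaddress` module. The prefix length is an assumed input and the addresses are hypothetical; finding the right prefix length from measurements, and confirming that the addresses really share a medium, is the hard part and is not attempted here.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_subnet(addresses: Iterable[str], prefix_len: int) -> Dict[str, List[str]]:
    """Group addresses that fall into the same prefix of the given length.

    Sharing a prefix only makes addresses candidates for the same subnet;
    it does not prove that they are connected over the same medium.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for addr in addresses:
        network = ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        groups[str(network)].append(addr)
    return dict(groups)


# Hypothetical interface addresses from the documentation range 192.0.2.0/24.
observed = ["192.0.2.1", "192.0.2.2", "192.0.2.5", "192.0.2.6", "192.0.2.9"]
candidates = group_by_subnet(observed, 30)
for subnet, members in candidates.items():
    print(subnet, members)
```

With an assumed /30 prefix, `.1` and `.2` form one candidate subnet and `.5` and `.6` another, while `.9` is left alone because no neighbor in its prefix was observed.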
Autonomous System Level
http://www.caida.org/publications/animations/active_monitoring/as_core.mpg
Traffic Measurements
 Monitoring and measuring network traffic
 to produce better models of network behavior
 to diagnose failures and detect anomalies
 to defend against unwanted traffic
 Live weather map
 Internet2
 PlanetLab
Code-Red Worm
 On July 19, 2001, more than 359,000 computers connected to the
Internet were infected with the Code-Red (CRv2) worm in less than
14 hours
 Spread
Sapphire Worm
 was the fastest computer worm in history
 doubled in size every 8.5 seconds
 infected more than 90 percent of vulnerable hosts within 10
minutes.
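The quoted doubling time can be turned into rough numbers. The sketch below assumes pure exponential growth, which can only describe the early phase of an outbreak, before the pool of vulnerable hosts starts to run out; the function names are illustrative.

```python
import math

DOUBLING_TIME_S = 8.5  # doubling time quoted on the slide


def growth_factor(elapsed_s: float, doubling_time_s: float = DOUBLING_TIME_S) -> float:
    """Multiplicative growth after elapsed_s seconds of unchecked doubling."""
    return 2.0 ** (elapsed_s / doubling_time_s)


def time_to_grow(factor: float, doubling_time_s: float = DOUBLING_TIME_S) -> float:
    """Seconds of unchecked doubling needed to grow by the given factor."""
    return doubling_time_s * math.log2(factor)


print("growth after one minute:", round(growth_factor(60.0)))
print("seconds for a thousandfold increase:", round(time_to_grow(1000.0), 1))
```

Under this assumption the infected population grows by more than a hundredfold in the first minute and a thousandfold in under a minute and a half, which is why a worm of this kind leaves no time for a human response.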
Witty Worm
 reached its peak activity after approximately 45 minutes
 at which point the majority of vulnerable hosts had been infected
 World
 USA
Nyxem Email Virus
 Estimate of total number of infected computers is
between 470K and 945K
 At least 45K of the infected computers were also
compromised by other forms of spyware or botware
 Spread
Scam Hosting
 Study dynamics of scam hosting infrastructure
Measurement Studies
 Glasnost
 tests whether BitTorrent is being blocked or throttled
 BW-meter
 Measurement tools for the capacity and load of Internet paths
 NPAD Diagnostics Servers
 Automatic diagnostic server for troubleshooting end-systems and
last-mile network problems
 iPlane
 construct a router interface-level atlas of the Internet
 measuring link attributes
 Hubble
 find persistent Internet black holes as they occur
Internet Measurements
 The Internet is man-made, so why do we need to
measure it?
 Because we still don’t really understand it
 Sometimes things go wrong
 Malicious users
 Measurement for network operations
 Detecting and diagnosing problems
 What-if analysis of future changes
 Measurement for scientific discovery
 Creating accurate models that represent reality
 Identifying new features and phenomena