Internet Measurement Tutorial
Yuval Shavitt
School of Electrical Engineering
http://www.eng.tau.ac.il/~shavitt
Motivation
• Wide area networks are too complex to
grasp
– Many protocols at various levels interact and
affect behavior
• Many applications have performance
requirements
– End-to-end delay and loss, reliability
Motivation (2)
• It's an interesting complex system
– Has emergent characteristics like many living
systems:
• Biological systems
• Social networks
TCP/IP Protocols
Application: TELNET, FTP, SMTP, DNS
Transport: TCP, UDP
Network: IP
Physical + Data link: LAN, wireless, WAN
Internet Measurement Challenges
Internet Measurement Challenges (1)
• Network size:
– 100,000,000s hosts, 1,000,000s routers, ~30,000 ASes
• Network Complexity
– Interaction between components, protocols,
applications, users
• All change over time
– New applications are added
– New protocol versions (TCP)
– New router design (AQM)
Internet Measurement Challenges (2)
• Not engineered for measurement:
– Initial design had no measurement thinking
– Distributed management
• Tendency not to share data
• Blocking measurement attempts (“don’t ping my
network”)
– NATs, Firewalls, …
Success Stories
DIMES+BGP (Feb 05)
“On the self-similar nature of Ethernet traffic”
W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson
IEEE/ACM Transactions on Networking, February 1994.
• Thorough analysis of Bellcore LAN traces established self-similar properties of the packet arrival process.
“On power-law relationships of the internet topology”
M. Faloutsos, P. Faloutsos, and C. Faloutsos,
ACM SIGCOMM 1999, Aug./Sept. 1999.
• Analysis of the RouteViews BGP database established the
power-law characteristics of the Internet topology.
[Figure: degree distribution Pr(k) on log-log axes, log(Pr(degree)) versus log(degree)]
Why do we measure the Internet?
• Already mentioned:
– Because it is there!
– Operational reasons
• We cannot improve the Internet if we don’t
understand it
– We cannot understand it if we don’t measure
– We cannot build effective models or simulators
if we don’t measure
Long term objectives
• Monitor the Internet in real time
• Manage the Internet
– Monitor and react before things go bad
What can we measure in the Internet?
• Structure
– Topology (router/network) connectivity, link capacities,
link loss, available bandwidth, routing
• Traffic
– End-to-end performance, packet arrival process
(congestion built-up)
• Users and applications
– WWW, peer-to-peer, streaming
• Malicious behavior
– Attack patterns, port scans
Where can we measure the Internet?
How to choose representative measurement points?
Example: traffic samples
– LAN traffic vs. WAN traffic
– Inside an ISP vs. between continents
– Country biases
– Commercial location vs. educational
– More locations are better
How can we measure the Internet?
• Active measurements
– Probes: Traceroute, ping, packet trains
– Application simulation
• Passive measurement
– Logs (WWW)
– Monitors, sniffers
Measurement resources on the
WWW
CAIDA:
www.caida.org/tools/taxonomy
SLAC:
www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html
When should we measure the Internet?
• Diurnal and weekly traffic cycles
• Time scales depend on “what” and “how”
• Passive measurements are typically continuous
– Can generate huge data sets
– Log access problems
– Privacy concerns
• Active measurements are typically discrete
– Important characteristics can be missed
– Probes can be filtered and/or detected
Who is measuring the Internet?
• Businesses do a great deal of measurement
– Mostly do not share with the research community
– examples:
• Akamai: http delay from server side
• HP (Mercury): http delay from client side
• Google: everything
• Academia and Research institutes
– Publish papers, but data may not always be available
• Internet Statistics and Metrics Analysis (ISMA)
– CAIDA attempt to create a global meta-data database
Publishing Internet Measurement
Studies
• All major networking conferences &
journals accept measurement papers
– ACM SIGCOMM, IEEE INFOCOM, ACM
SIGMETRICS
• Dedicated meetings:
– ACM Internet Measurement Conf. (IMC, IMW)
– Passive & Active Measurements Conf. (PAM)
– TridentCom
Active Measurement Techniques
Active Probes
• Active probes send stimulus (packets) into the
network and then measure the response
– Done on network, transport and application layers
• Active probes are useful to measure various
things:
– Delay, delay jitter, and loss
– Topology and routing behavior
– Capacity, bandwidth, and throughput
Simple delay/loss probing with ping
C:\>ping www.fer.hr
Pinging www.fer.hr [161.53.72.111] with 32 bytes of data:
Reply from 161.53.72.111: bytes=32 time=113ms TTL=49
Reply from 161.53.72.111: bytes=32 time=111ms TTL=49
Reply from 161.53.72.111: bytes=32 time=113ms TTL=49
Reply from 161.53.72.111: bytes=32 time=118ms TTL=49
Ping statistics for 161.53.72.111:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 111ms, Maximum = 118ms, Average = 113ms
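The summary that ping prints can also be recomputed from the individual reply lines, which is handy when many runs are collected. The following is a minimal sketch, assuming Windows-style output like the sample above; the function name `summarize_ping` and the returned field names are illustrative and not part of any tool.

```python
import re


def summarize_ping(output: str) -> dict:
    """Summarize Windows-style ping output: loss and min/max/mean RTT."""
    # Each successful reply carries "time=113ms" (or "time<1ms").
    rtts = [int(ms) for ms in re.findall(r"time[=<](\d+)ms", output)]
    sent_match = re.search(r"Sent = (\d+)", output)
    sent = int(sent_match.group(1)) if sent_match else len(rtts)
    lost = sent - len(rtts)
    return {
        "sent": sent,
        "received": len(rtts),
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "min_ms": min(rtts) if rtts else None,
        "max_ms": max(rtts) if rtts else None,
        "avg_ms": sum(rtts) / len(rtts) if rtts else None,
    }


SAMPLE = """\
Reply from 161.53.72.111: bytes=32 time=113ms TTL=49
Reply from 161.53.72.111: bytes=32 time=111ms TTL=49
Reply from 161.53.72.111: bytes=32 time=113ms TTL=49
Reply from 161.53.72.111: bytes=32 time=118ms TTL=49
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
"""

stats = summarize_ping(SAMPLE)
print(stats)
```

Note that the exact mean of the four samples is 113.75 ms, while ping itself reports a whole number of milliseconds.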
ICMP
ICMP is the IP error diagnosis protocol.
IP header
Type
Code
Checksum
Sequence number
Any ICMP data
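The header layout above can be illustrated by building an echo request (type 8, code 0) in memory. This is a sketch only: it follows the standard echo format, which carries a 16-bit identifier in front of the sequence number (the slide lists only the sequence number), and the identifier, sequence, and payload values are arbitrary assumptions. Nothing is sent on the network.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit words, as used by IP and ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Pack type, code, checksum, identifier, sequence, then the data."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"probe")
print(packet.hex())
```

A receiver can validate the message by running the same checksum over the whole packet; a correct packet yields zero.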
ICMP Message Types
Type No.  Meaning
0         Echo reply (used by PING)
3         Destination unreachable
4         Source quench
5         Redirect
8         Echo (used by PING)
9         Router advertisement
10        Router solicitation
11        Time exceeded
12        Parameter problem
13        Timestamp
14        Timestamp reply
15        Information request
16        Information reply
Application layer “ping”
• One can generate application layer
messages to test application reaction time
• Most common:
– TCP SYN message to port 80
traceroute
• Useful to learn the route characteristics
between two hosts.
• Sends a series of probes to successive nodes
along a route to an intended destination and
records the source address and time delay of
the message returned by each.
• Based on ICMP “TTL expired” message
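The probing loop described above can be sketched as a small simulation. This is not a real traceroute: no packets are sent, the path is a hypothetical list of node names, and the reply names follow the UDP variant, in which routers answer with "TTL expired" and the destination answers with "port unreachable".

```python
def probe(path, ttl):
    """Return (responder, reply) for one simulated probe sent with this TTL.

    path lists the nodes after the source, ending with the destination.
    Each router decrements the TTL; the probe dies where the TTL reaches zero.
    """
    if ttl < len(path):
        return path[ttl - 1], "TTL expired"
    return path[-1], "port unreachable"  # the probe reached the destination


def traceroute(path, max_hops=30):
    """Send probes with successive TTLs until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        node, reply = probe(path, ttl)
        hops.append((ttl, node, reply))
        if reply == "port unreachable":
            break
    return hops


PATH = ["B", "C", "D", "E"]  # hypothetical routers B, C, D and destination E
for hop in traceroute(PATH):
    print(hop)
```

A real tool would also record the round trip time of each reply and typically sends several probes per TTL.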
IP datagram format (each row is 32 bits)
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | Internet checksum
32 bit source IP address
32 bit destination IP address
Options (if any)
data (variable length, typically a TCP or UDP segment)
Field notes:
– ver: IP protocol version number
– head. len: header length (bytes)
– type of service: “type” of data
– length: total datagram length (bytes)
– 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
– time to live: max number remaining hops (decremented at each router)
– upper layer: upper layer protocol to deliver payload to
– Options: e.g. timestamp, record route taken, specify list of routers to visit.
ICMP Message Types
Type No.  Meaning
0         Echo reply
3         Destination unreachable
4         Source quench
5         Redirect
8         Echo
9         Router advertisement
10        Router solicitation
11        Time exceeded
12        Parameter problem
13        Timestamp
14        Timestamp reply
15        Information request
16        Information reply
Destination unreachable codes
Type  Code  Description
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
traceroute
[Figure: time-sequence diagram over nodes A, B, C, D, E. Regular UDP packets are sent with successive TTLs; they are answered by ICMP “TTL expired” messages and, finally, by an ICMP “port unreachable” message.]
traceroute versions
• UNIX:
– By default sends UDP packets
• Starts at port 33435 and increments the port per packet!
– traceroute -I sends ICMP “ECHO request”
– tcptraceroute uses TCP SYN messages
• If the port is closed, gets a RST reply
• If the port is open, gets a SYN ACK and replies with RST
• Best to overcome firewalls
• Windows
– ICMP “ECHO request”
C:\>tracert www.fer.hr
Tracing route to www.fer.hr [161.53.72.111]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.200.254
  2    19 ms    20 ms    19 ms  vxr.tau.ac.il [132.66.8.10]
  3    17 ms    22 ms    20 ms  c6509.tau.ac.il [132.66.8.20]
  4    21 ms    19 ms    19 ms  tel-aviv.tau.ac.il [132.66.4.1]
  5    19 ms    23 ms    18 ms  gp1-tau-fe.ilan.net.il [128.139.191.70]
  6    20 ms    20 ms    20 ms  iucc.il1.il.geant.net [62.40.103.69]
  7    69 ms    69 ms    69 ms  il.it1.it.geant.net [62.40.96.154]
  8    82 ms    82 ms    82 ms  it.ch1.ch.geant.net [62.40.96.33]
  9   101 ms    98 ms    98 ms  ch.at1.at.geant.net [62.40.96.1]
 10   105 ms   105 ms   105 ms  at.hu1.hu.geant.net [62.40.96.178]
 11   117 ms   112 ms   113 ms  hu.hr1.hr.geant.net [62.40.96.145]
 12   113 ms   115 ms   115 ms  carnet-gw.hr1.hr.geant.net [62.40.103.218]
 13   120 ms   122 ms   123 ms  193.198.228.6
 14   114 ms   112 ms   119 ms  193.198.229.10
 15   120 ms   119 ms   119 ms  161.53.16.14
 16   114 ms   114 ms   113 ms  duality.cc.fer.hr [161.53.72.111]
Trace complete.
C:\>tracert www.colbud.hu
Tracing route to www.colbud.hu [81.182.250.153]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.200.254
  2    19 ms    21 ms    18 ms  vxr.tau.ac.il [132.66.8.10]
  3    20 ms    21 ms    21 ms  c6509.tau.ac.il [132.66.8.20]
  4    21 ms    20 ms    19 ms  tel-aviv.tau.ac.il [132.66.4.1]
  5    20 ms    22 ms    19 ms  gp1-tau-fe.ilan.net.il [128.139.191.70]
  6    26 ms    22 ms    21 ms  iucc.il1.il.geant.net [62.40.103.69]
  7    91 ms    92 ms    92 ms  il.nl1.nl.geant.net [62.40.96.117]
  8    97 ms    97 ms    97 ms  nl.de1.de.geant.net [62.40.96.101]
  9    95 ms    96 ms    93 ms  ffm-b2-pos2-3.telia.net [213.248.77.89]
 10    96 ms    96 ms   150 ms  ffm-bb2-pos2-3-0.telia.net [213.248.64.177]
 11   110 ms   112 ms   114 ms  bpt-b1-pos2-0.telia.net [213.248.64.26]
 12     *        *        *     Request timed out.
 13   112 ms   110 ms   111 ms  10ge-0-0.core0-ip2.net.telekom.hu [145.236.85.2]
 14   112 ms   114 ms   110 ms  tenge1-2.core0.adatpark.hu [145.236.89.10]
 15   114 ms   112 ms   114 ms  fixip-lns2.adatpark.hu [195.228.253.58]
 16   120 ms   122 ms   124 ms  153-250-182-81.adsl-fixip.axelero.hu [81.182.250.153]
Trace complete.
Probing for link characteristics
• Packet dispersion techniques can be used to
infer characteristics of each link along an
Internet path.
– Bandwidth, queuing delays, propagation delay
– Cross traffic may cause problems
• Many tools are available:
– bprobe [CC97], clink [D99], nettimer [LB99],
pathchar [J97], pchar [M00], pathrate [DRM01]
Capacity
• Maximum IP-layer throughput that a flow
can get, without any cross traffic
[Figure: source, link 1, link 2, link 3, sink]
• $C_i$ = capacity of link i
• Path capacity $C = \min_i \{C_i\}$
Available Bandwidth
• Maximum IP-layer throughput that a flow
can get, given (stationary) cross traffic
[Figure: source, link 1, link 2, link 3, sink]
• $u_i$ = utilization of link i
• Path available bandwidth $A = \min_i \{C_i (1 - u_i)\}$
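The two definitions (path capacity and path available bandwidth) translate directly into code. The sketch below uses made-up capacities and utilizations for a three-link path; the numbers are assumptions chosen so that the link limiting capacity differs from the link limiting available bandwidth.

```python
def path_capacity(capacities):
    """C = min over links of C_i."""
    return min(capacities)


def available_bandwidth(capacities, utilizations):
    """A = min over links of C_i * (1 - u_i)."""
    return min(c * (1.0 - u) for c, u in zip(capacities, utilizations))


# Hypothetical three-link path, capacities in Mbps.
CAPACITIES = [100.0, 40.0, 1000.0]
UTILIZATIONS = [0.75, 0.25, 0.5]

capacity = path_capacity(CAPACITIES)
avail = available_bandwidth(CAPACITIES, UTILIZATIONS)
print(capacity, avail)
```

Here link 2 sets the capacity (40 Mbps), but the heavily used link 1 sets the available bandwidth (25 Mbps).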
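Packet Pair Dispersion
• Packet transmission time: $\tau = L/C$
• Send two packets back-to-back
• Measure dispersion $\Delta$ at the receiver
• Estimate C as $L/\Delta$
[Figure: a packet pair crossing links of capacity C and 3C, with spacings labeled L/C, L/3C, and L/C]
• But cross-traffic ‘noise’ can affect $\Delta$.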
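The packet pair estimate can be sketched numerically. The example assumes an idle path (no cross traffic), in which case two back-to-back packets arrive spaced by the transmission time on the slowest link; the packet size and link capacities are made-up values.

```python
def dispersion_after_path(packet_size_bytes, capacities_bps):
    """Spacing of a back-to-back pair at the receiver on an idle path.

    Without cross traffic the spacing equals the largest per-link
    transmission time, max_i(L / C_i).
    """
    bits = packet_size_bytes * 8
    return max(bits / c for c in capacities_bps)


def estimate_capacity(packet_size_bytes, dispersion_s):
    """Packet pair estimate: C = L / dispersion, in bits per second."""
    return packet_size_bytes * 8 / dispersion_s


# Hypothetical path: 30 Mbps, 10 Mbps (bottleneck), 30 Mbps.
LINKS_BPS = [30e6, 10e6, 30e6]
PACKET_BYTES = 1500

delta = dispersion_after_path(PACKET_BYTES, LINKS_BPS)
estimate = estimate_capacity(PACKET_BYTES, delta)
print(delta, estimate)
```

With cross traffic, packets queued between the pair stretch the dispersion and the estimate drops below the true capacity, which is why practical tools filter many pair measurements.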
Pathchar
• Developed by Van Jacobson to:
“allows any user to find the bandwidth,
delay, average queue and loss rate of every
hop between any source & destination on
the Internet”
• Measure the path hop by hop
– Default: 32 probes per hop
Self-Loading Periodic Streams
(SLoPS) [Jain Dovrolis 02]
• SND sends a periodic UDP packet stream at rate R.
• R=L/T, L=packet size, T=period, K=number of packets
• Measure one way delay (OWD): $D_k = t_{arrive} - t_{send}$
• OWD variation: $\Delta D_k = D_{k+1} - D_k$ (independent of clock
offset)
• With stationarity & fluid model for the cross traffic, and
FIFO queues:
$\Delta D_k > 0$ if $R > A$; $\Delta D_k = 0$ if $R \le A$
Illustration of SLoPS
Periodic Stream: K packets, size L bytes, rate R = L/T
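The SLoPS idea can be illustrated with a toy fluid model of a single FIFO queue. This is only a sketch under stated assumptions: one bottleneck link, cross traffic modeled as a constant rate, and a made-up link capacity of 100 Mbps. Constant delay components (propagation, transmission) are left out because they cancel in the OWD differences.

```python
def owd_differences(rate_bps, avail_bps, capacity_bps, period_s, num_packets):
    """OWD differences D(k+1) - D(k) for a periodic stream, fluid FIFO model.

    While the stream rate exceeds the available bandwidth, the backlog
    grows by (R - A) * T bits per period, so each packet waits longer.
    """
    backlog_bits = 0.0
    delays = []
    for _ in range(num_packets):
        delays.append(backlog_bits / capacity_bps)  # queueing delay of this packet
        backlog_bits = max(0.0, backlog_bits + (rate_bps - avail_bps) * period_s)
    return [later - earlier for earlier, later in zip(delays, delays[1:])]


# A = 74 Mbps and K = 100 as in the real-data example; the capacity and the
# 100 microsecond period are assumptions of this sketch.
A_BPS, C_BPS, T_S, K = 74e6, 100e6, 100e-6, 100

above = owd_differences(96e6, A_BPS, C_BPS, T_S, K)  # R > A
below = owd_differences(37e6, A_BPS, C_BPS, T_S, K)  # R < A
print(min(above) > 0, max(below) == 0)
```

A measurement tool would then search for the rate R at which the trend changes from flat to increasing, and report that rate as the available bandwidth.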
Trends in Real Data
U. Oregon to U. Delaware (12 hops)
A=74Mbps (MRTG), K=100, T=100μs, L=1200B
R=96Mbps and 37Mbps
[Plots: OWD trends when R > A and when R < A]
Passive Measurement Techniques
Passive packet measurement
• Capture packets as they pass by
– Packet capture applications (tcpdump) on hosts use packet
capture filter
• Requires access to the wire
– Promiscuous mode or mirror ports to see other traffic
– Hardware-based solutions
• Endace, Inc.’s DAG cards …
OC12/48/192 (0.622/2.5/10Gbps)
• Programmable NIC cards (<$100)
• Issues:
– Timestamps
– Data volumes
– Privacy
tcpdump
• Can capture entire packet or n first bytes
• Timestamps each packet
• Can filter based on any combination of
header fields
12:40:18.501228 IP bakara.eng.tau.ac.il.23 > amirotem-pc.eng.tau.ac.il.2260: P 1:3(2) ack 1 win 8760 (DF)
12:40:18.692431 IP amirotem-pc.eng.tau.ac.il.2260 > bakara.eng.tau.ac.il.23: . ack 3 win 64162 (DF)
12:40:18.692775 IP bakara.eng.tau.ac.il.23 > amirotem-pc.eng.tau.ac.il.2260: P 3:10(7) ack 1 win 8760 (DF)
12:40:18.893601 IP amirotem-pc.eng.tau.ac.il.2260 > bakara.eng.tau.ac.il.23: . ack 10 win 64155 (DF)
Full Packet Capture
12:22:42.401784 IP (tos 0x0, ttl 128, id 37074, len 41) AMIROTEM.dummy.net.3214 > bakara.eng.tau.ac.il.23: P [tcp sum ok] 3535692137:3535692138(1) ack 1410929928 win 16196 (DF)
0x0000  4500 0029 90d2 4000 8006 2d02 c0a8 c803  E..)..@...-.....
0x0010  8442 300c 0c8e 0017 d2be 6169 5419 1508  .B0.......aiT...
0x0020  5018 3f44 1d9e 0000 6c                   P.?D....l
12:22:42.426889 IP (tos 0x0, ttl 252, id 33630, len 41) bakara.eng.tau.ac.il.23 > AMIROTEM.dummy.net.3214: P [tcp sum ok] 1:2(1) ack 1 win 9324 (DF)
0x0000  4500 0029 835e 4000 fc06 be75 8442 300c  E..).^@....u.B0.
0x0010  c0a8 c803 0017 0c8e 5419 1508 d2be 616a  ........T.....aj
0x0020  5018 246c 3875 0000 6c88 8888 8888       P.$l8u..l.....
12:22:42.600874 IP (tos 0x0, ttl 128, id 37075, len 41) AMIROTEM.dummy.net.3214 > bakara.eng.tau.ac.il.23: P [tcp sum ok] 1:2(1) ack 2 win 16195 (DF)
0x0000  4500 0029 90d3 4000 8006 2d01 c0a8 c803  E..)..@...-.....
0x0010  8442 300c 0c8e 0017 d2be 616a 5419 1509  .B0.......ajT...
0x0020  5018 3f43 169d 0000 73                   P.?C....s
12:22:42.617003 IP (tos 0x0, ttl 252, id 33631, len 41) bakara.eng.tau.ac.il.23 > AMIROTEM.dummy.net.3214: P [tcp sum ok] 2:3(1) ack 2 win 9324 (DF)
0x0000  4500 0029 835f 4000 fc06 be74 8442 300c  E..)._@....t.B0.
0x0010  c0a8 c803 0017 0c8e 5419 1509 d2be 616b  ........T.....ak
0x0020  5018 246c 3173 0000 7388 8888 8888       P.$l1s..s.....
Passive IP flow measurement
• An IP flow is defined by the five-tuple:
– src addr, src port, dst addr, dst port, protocol
• Cisco’s NetFlow
– Part of the IOS
– Provides template-based flow records
• Many tools can manipulate NetFlow data
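Grouping packets by the five-tuple is the core of any flow record. The following minimal sketch counts packets and bytes per flow; it does not reproduce the NetFlow record format, and the packet tuples use made-up documentation addresses.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The five-tuple that identifies an IP flow."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    protocol: str


def aggregate_flows(packets):
    """Sum packet and byte counts per five-tuple.

    packets yields (src_addr, src_port, dst_addr, dst_port, protocol, length).
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, sport, dst, dport, proto, length in packets:
        record = flows[FlowKey(src, sport, dst, dport, proto)]
        record["packets"] += 1
        record["bytes"] += length
    return dict(flows)


# Hypothetical packet headers.
PACKETS = [
    ("192.0.2.1", 2260, "192.0.2.9", 23, "tcp", 41),
    ("192.0.2.9", 23, "192.0.2.1", 2260, "tcp", 47),
    ("192.0.2.1", 2260, "192.0.2.9", 23, "tcp", 40),
    ("192.0.2.5", 5353, "192.0.2.9", 53, "udp", 72),
]

flows = aggregate_flows(PACKETS)
for key, record in flows.items():
    print(key, record)
```

Because the key includes direction, the two directions of a TCP connection appear as two separate unidirectional flows.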
FlowScan [Plonka00]
• Combines flow collection engine, database,
visualization tool
• Provides a near real-time visualization of
network traffic
• Breaks down traffic into well known service
or application
FlowScan
Examples
(May 2005)
Analysis of Flows
• Examining flows of packets one can determine
OOO packets:
– Losses
– Reorders
– TCP state machine
– Retransmissions
– Duplicates
• Analysis can be done on 1- or 2-directional flows
Diagnostic node
Local ISP
Internet
Unidirectional Flows
• Evaluates TCP seq. # and IP-ID patterns
– Assumption: the sender’s IP ID forms a monotonic increasing sequence
Brosh&Shavitt, Infocom’05
HTTP Logs
• Have data about the client IP, transaction
time, command (GET/POST), return code,
bytes transferred, referrer, metadata (browser
type, OS, languages, etc.)
• Tools are available to analyze HTTP logs
– Webalizer
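The fields listed above can be pulled out of an access-log line with a regular expression. This sketch assumes the common "combined" log layout (client, timestamp, request, status, bytes, referrer, user agent); the pattern and the function name `parse_line` are illustrative, and lines that deviate from this layout are simply skipped.

```python
import re

LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no body was returned.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


LINE = ('69.141.103.137 - - [15/May/2005:23:54:50 +0300] '
        '"POST /DIMES/server HTTP/1.1" 200 3 "-" "Java/1.4.1_03"')

entry = parse_line(LINE)
print(entry)
```

Counting parsed entries by client, path, referrer, or user agent gives the kind of summaries that log analyzers report.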
HTTP Log Example
24.77.192.99 - - [15/May/2005:23:54:59 +0300] "GET /science_down.gif HTTP/1.1" 200 1138 "http://www.netdimes.org/science.html
" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3"
68.231.117.28 - - [15/May/2005:23:52:05 +0300] "GET /ipmap.png HTTP/1.1" 200 4874697 "http://slashdot.org/" "Mozilla/5.0 (X11
; U; Linux i686; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /home_up.gif HTTP/1.1" 200 1096 "http://www.netdimes.org/" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /AboutUs_up.gif HTTP/1.1" 200 1169 "http://www.netdimes.org/" "Mozilla/4
.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.77.192.99 - - [15/May/2005:23:55:00 +0300] "GET /Install_down.gif HTTP/1.1" 200 1219 "http://www.netdimes.org/science.html
" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3"
69.141.103.137 - - [15/May/2005:23:54:50 +0300] "POST /DIMES/server HTTP/1.1" 200 3 "-" "Java/1.4.1_03"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /news_up.gif HTTP/1.1" 200 1086 "http://www.netdimes.org/" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /community_up.gif HTTP/1.1" 200 1199 "http://www.netdimes.org/" "Mozilla
/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /datastat_up.gif HTTP/1.1" 200 1233 "http://www.netdimes.org/" "Mozilla/
4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /science_up.gif HTTP/1.1" 200 1126 "http://www.netdimes.org/" "Mozilla/4
.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
71.106.2.53 - - [15/May/2005:23:55:00 +0300] "GET /favicon.ico HTTP/1.1" 200 5694 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.
1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
62.179.197.156 - - [15/May/2005:23:54:02 +0300] "GET /ipmap.png HTTP/1.1" 200 4874697 "http://slashdot.org/" "Mozilla/5.0 (Wi
ndows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /Install_up.gif HTTP/1.1" 200 1219 "http://www.netdimes.org/" "Mozilla/4
.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
24.236.177.187 - - [15/May/2005:23:55:00 +0300] "GET /EVERGROW40.gif HTTP/1.1" 200 4089 "http://www.netdimes.org/" "Mozilla/4
.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
195.252.52.155 - - [15/May/2005:23:55:00 +0300] "GET /science_down.gif HTTP/1.1" 200 1138 "http://www.netdimes.org/science.ht
ml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; sv-SE; rv:1.7.6) Gecko/20050318 Firefox/1.0.2"
[root@jupiter httpd]# grep "GET / " access_log |tail -10
68.54.223.47 - - [19/May/2005:12:36:20 +0300] "GET / HTTP/1.1" 200 14067 "-" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
132.76.80.118 - - [19/May/2005:12:49:44 +0300] "GET / HTTP/1.1" 304 "http://www.eng.tau.ac.il/~shavitt/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR
1.1.4322)"
24.169.148.213 - - [19/May/2005:13:06:58 +0300] "GET / HTTP/1.1" 200 14067 "-" "Mozilla/5.0
(Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
84.170.181.64 - - [19/May/2005:13:07:14 +0300] "GET / HTTP/1.1" 200 14067
"http://www.google.de/search?hl=de&q=dimes&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows
NT 5.1; SV1)"
130.240.136.220 - - [19/May/2005:13:07:25 +0300] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
81.72.13.30 - - [19/May/2005:13:11:00 +0300] "GET / HTTP/1.1" 200 14067
"http://www.miranet.it/php/Articolo.php?id=708" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
194.78.199.123 - - [19/May/2005:13:13:44 +0300] "GET / HTTP/1.1" 200 14067 "-" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)"
82.152.182.12 - - [19/May/2005:13:23:10 +0300] "GET / HTTP/1.1" 200 14067 "-" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1)"
80.119.126.44 - - [19/May/2005:13:38:08 +0300] "GET / HTTP/1.1" 200 14067 "-" "Mozilla/5.0 (Windows;
U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
80.250.186.101 - - [19/May/2005:13:46:14 +0300] "GET / HTTP/1.1" 200 14067
"http://distributed.ru/forum/?a=topic&topic=583" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
Example of Log Analysis
June 5th, 2005
Webalizer
access analysis
MultiQ
• Analyzing incoming packet streams
• Gaps between packets are used to calculate
bottleneck link speeds
• Multiple bottlenecks can be inferred
M&M, MIT
How does it work?
• 50% of traffic is comprised of 1500B packets
• Behavior at the second bottleneck:
Effect on dist.:
Keep b.n. gap
“white noise”
Gap shifts reveal
2nd bottleneck
Three bottlenecks with one strike
Nettimer:
A Tool for Measuring Bottleneck Link Bandwidth
Kevin Lai and Mary Baker,
"Nettimer: A Tool for Measuring
Bottleneck Link Bandwidth",
USENIX Symp. on Internet Tech.
and Sys., March 2001.
Topology Discovery
from IP to AS routes
C:\>tracert www.fer.hr
Tracing route to www.fer.hr [161.53.72.111]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.200.254
  2    19 ms    20 ms    19 ms  vxr.tau.ac.il [132.66.8.10]
  3    17 ms    22 ms    20 ms  c6509.tau.ac.il [132.66.8.20]
  4    21 ms    19 ms    19 ms  tel-aviv.tau.ac.il [132.66.4.1]
  5    19 ms    23 ms    18 ms  gp1-tau-fe.ilan.net.il [128.139.191.70]
  6    20 ms    20 ms    20 ms  iucc.il1.il.geant.net [62.40.103.69]
  7    69 ms    69 ms    69 ms  il.it1.it.geant.net [62.40.96.154]
  8    82 ms    82 ms    82 ms  it.ch1.ch.geant.net [62.40.96.33]
  9   101 ms    98 ms    98 ms  ch.at1.at.geant.net [62.40.96.1]
 10   105 ms   105 ms   105 ms  at.hu1.hu.geant.net [62.40.96.178]
 11   117 ms   112 ms   113 ms  hu.hr1.hr.geant.net [62.40.96.145]
 12   113 ms   115 ms   115 ms  carnet-gw.hr1.hr.geant.net [62.40.103.218]
 13   120 ms   122 ms   123 ms  193.198.228.6
 14   114 ms   112 ms   119 ms  193.198.229.10
 15   120 ms   119 ms   119 ms  161.53.16.14
 16   114 ms   114 ms   113 ms  duality.cc.fer.hr [161.53.72.111]
Trace complete.
Annotations:
– Hop 1: private network
– Hops 2-5: Tel Aviv Uni., ILAN MACHBA (AS378)
– Hops 6-12: DANTE, GEANT (AS20965)
– Hops 13-16: HR-ZZ, CARnet (AS2108)
AS-level path: 378, 20965, 2108
How to map IP to AS?
• BGP announcements
• Use public databases:
– Internet Routing Registry (IRR), http://www.irr.net
– whois servers
• Commercial databases
– MaxMind, etc.
• Problem: incomplete and out-of-date
• Due to acquisitions, mergers, break-ups of institutions
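Given a prefix-to-AS table (for example, one derived from BGP announcements), mapping a traceroute to an AS path is a longest-prefix match per hop followed by collapsing repeats. In the sketch below the AS numbers come from the annotated trace, but the prefix lengths are assumptions made for illustration and are not taken from any routing table.

```python
import ipaddress

# AS numbers as annotated on the trace; prefix lengths are ASSUMED.
PREFIX_TO_AS = {
    "132.66.0.0/16": 378,
    "128.139.0.0/16": 378,
    "62.40.96.0/19": 20965,
    "193.198.0.0/16": 2108,
    "161.53.0.0/16": 2108,
}
TABLE = [(ipaddress.ip_network(p), asn) for p, asn in PREFIX_TO_AS.items()]


def ip_to_as(ip, table):
    """Longest-prefix match; None if no prefix covers the address."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


def as_path(hop_ips, table):
    """Map each hop to an AS, skip unmapped hops, collapse repeats."""
    path = []
    for ip in hop_ips:
        asn = ip_to_as(ip, table)
        if asn is not None and (not path or path[-1] != asn):
            path.append(asn)
    return path


HOPS = ["192.168.200.254", "132.66.8.10", "132.66.8.20", "132.66.4.1",
        "128.139.191.70", "62.40.103.69", "62.40.96.154", "62.40.96.33",
        "62.40.96.1", "62.40.96.178", "62.40.96.145", "62.40.103.218",
        "193.198.228.6", "193.198.229.10", "161.53.16.14", "161.53.72.111"]

print(as_path(HOPS, TABLE))
```

The private first hop maps to no AS and is skipped; with an incomplete or out-of-date table, other hops would be dropped or misattributed in the same silent way.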
What is the AS level traceroute?
A-B-C
A
B
C
C-B-A
A
B
C
Are A and C neighbor ASes?
What AS does the middle router belong to, B or C?
The Internet Structure
routers
The Internet Structure
The AS graph
The Internet Structure
The AS graph
The PoP level graph
Delay Measurements
Minimum delay of a link
C:\>tracert www.fer.hr
Tracing route to www.fer.hr [161.53.72.111]
over a maximum of 30 hops:
Hop  RTT 1    RTT 2    RTT 3    Host                                          Min. delay  Link delay
  1  <1 ms    <1 ms    <1 ms    192.168.200.254                                    0
  2  19 ms    20 ms    19 ms    vxr.tau.ac.il [132.66.8.10]                       19         19
  3  17 ms    22 ms    20 ms    c6509.tau.ac.il [132.66.8.20]                     17         -2
  4  21 ms    19 ms    19 ms    tel-aviv.tau.ac.il [132.66.4.1]                   19          2
  5  19 ms    23 ms    18 ms    gp1-tau-fe.ilan.net.il [128.139.191.70]           18         -1
  6  20 ms    20 ms    20 ms    iucc.il1.il.geant.net [62.40.103.69]              20          2
  7  69 ms    69 ms    69 ms    il.it1.it.geant.net [62.40.96.154]                69         49
  8  82 ms    82 ms    82 ms    it.ch1.ch.geant.net [62.40.96.33]                 82         13
  9  101 ms   98 ms    98 ms    ch.at1.at.geant.net [62.40.96.1]                  98         16
 10  105 ms   105 ms   105 ms   at.hu1.hu.geant.net [62.40.96.178]               105          7
 11  117 ms   112 ms   113 ms   hu.hr1.hr.geant.net [62.40.96.145]               112          7
 12  113 ms   115 ms   115 ms   carnet-gw.hr1.hr.geant.net [62.40.103.218]       113          1
 13  120 ms   122 ms   123 ms   193.198.228.6                                    120          7
 14  114 ms   112 ms   119 ms   193.198.229.10                                   112         -8
 15  120 ms   119 ms   119 ms   161.53.16.14                                     119          7
 16  114 ms   114 ms   113 ms   duality.cc.fer.hr [161.53.72.111]                113         -6
Trace complete.
Min. delay and link delay are in ms. The link delay of a hop is its minimum delay minus that of the previous hop; negative delays appear at hops 3, 5, 14, and 16.
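The per-link figures for the trace above follow from two steps: take the minimum of the three RTTs at each hop, then subtract consecutive minima. A minimal sketch, treating "<1 ms" as 0:

```python
# Three RTT samples (ms) per hop from the www.fer.hr trace; "<1 ms" is 0.
RTTS_MS = [
    (0, 0, 0), (19, 20, 19), (17, 22, 20), (21, 19, 19),
    (19, 23, 18), (20, 20, 20), (69, 69, 69), (82, 82, 82),
    (101, 98, 98), (105, 105, 105), (117, 112, 113), (113, 115, 115),
    (120, 122, 123), (114, 112, 119), (120, 119, 119), (114, 114, 113),
]


def link_delays(rtts):
    """Return (per-hop minimum RTT, difference between consecutive minima)."""
    minima = [min(samples) for samples in rtts]
    links = [later - earlier for earlier, later in zip(minima, minima[1:])]
    return minima, links


minima, links = link_delays(RTTS_MS)
negative_hops = [hop for hop, d in enumerate(links, start=2) if d < 0]
print(minima)
print(links)
print(negative_hops)
```

The negative values show that three samples per hop are too few to reach the true minimum, and that each hop's reply may take a different return path or be delayed inside the router.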
A delay of a link inside TAU
Link Delay Measurements Histogram
[Figure: distribution of the delay among 1 ms bins; x-axis: link delay [ms], from -150 to 250; y-axis: count, from 0 to 5000; the negative delay region is marked.]
Auto-Correlation Histogram
[Figure: histogram autocorrelation on a log scale versus autocorrelation lag, from -300 to 300; series: autocorr of new histogram, autocorr of "sampled" histogram.]
Why periodic?
Maybe something wrong with the
code?
int gettimeofday(struct timeval* tv, struct timezone *tz)
{
    if(!tv)
        return -1;
    struct _timeb timebuffer;                        /* millisecond accuracy */
    _ftime(&timebuffer);
    tv->tv_sec = timebuffer.time;
    tv->tv_usec = timebuffer.millitm * 1000 + 500;   /* translate to μseconds */
    return 0;
}
New vs. Old timing routines
[Figure: link delay measurements histograms for the Old Version and the New Version; x-axis: delay [ms], from -100 to 100.]
How to define distance between ASes?
Maybe the same as between nodes?
• The distance between two ASes will be the distance
between the two border routers connecting them
[Figure: AS 378, AS 1248, and AS 701, with node delays labeled 20ms, 17ms, 26ms, 40ms, 14ms, 35ms, 89ms, 79ms, and 91ms; the distance between ASes is marked "?".]
from IP to AS routes
C:\>tracert www.fer.hr
Tracing route to www.fer.hr [161.53.72.111]
over a maximum of 30 hops:
[Slide repeats the annotated 16-hop trace to www.fer.hr shown earlier: private network at hop 1; Tel Aviv Uni. and ILAN MACHBA, AS378; DANTE and GEANT, AS20965; HR-ZZ and CARnet, AS2108. Added labels: "2ms" and "GEANT 2".]
Trace complete.
AS-level path: 378, 20965, 2108
DIMES AS distance definition (1)
• Define the following distances:
– MaxAS(n) – the maximum delay to a node in AS n.
– MinAS(n) – the minimum delay to a node in AS n.
• For AS edge (src,dest) define the distances:
MinASEdge(src,dest) = MinAS(dest)-MaxAS(src)
MaxASEdge(src,dest) = MaxAS(dest)-MaxAS(src)
– All distances are positive.
• Define: ASDiameter(n) = MaxAS(n)– MinAS(n)
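The definitions above can be written down directly. In this sketch the delay values are the ones that appear in the example figure on the following slides, but which delay belongs to which AS is an assumed reading of that figure. No clamping to positive values is applied.

```python
# Delays (ms) to nodes in each AS; the grouping per AS is an ASSUMPTION.
DELAYS_MS = {
    378: [20, 17, 26],
    1248: [40, 35, 89],
    701: [79, 91],
}


def max_as(delays, n):
    """MaxAS(n): the maximum delay to a node in AS n."""
    return max(delays[n])


def min_as(delays, n):
    """MinAS(n): the minimum delay to a node in AS n."""
    return min(delays[n])


def as_diameter(delays, n):
    """ASDiameter(n) = MaxAS(n) - MinAS(n)."""
    return max_as(delays, n) - min_as(delays, n)


def min_as_edge(delays, src, dest):
    """MinASEdge(src, dest) = MinAS(dest) - MaxAS(src)."""
    return min_as(delays, dest) - max_as(delays, src)


def max_as_edge(delays, src, dest):
    """MaxASEdge(src, dest) = MaxAS(dest) - MaxAS(src)."""
    return max_as(delays, dest) - max_as(delays, src)


diameters = {n: as_diameter(DELAYS_MS, n) for n in DELAYS_MS}
print(diameters)
print(min_as_edge(DELAYS_MS, 378, 1248), max_as_edge(DELAYS_MS, 378, 1248))
```

With overlapping delay ranges the raw MinASEdge difference can come out negative, so an implementation that requires positive distances has to add a floor on top of these formulas.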
DIMES AS distance definition (2)
[Figure: AS 378, AS 1248, and AS 701, with node delays labeled 20ms, 17ms, 26ms, 40ms, 35ms, 89ms, 79ms, and 91ms.]
MinASEdge(378,1248) = 9ms
MinASEdge(1248,701) = 1ms (non negative.)
MaxASEdge(378,1248) = 63ms
MaxASEdge(1248,701) = 56ms
DIMES AS Diameter definition
[Figure: AS 378, AS 1248, and AS 701, with node delays labeled 20ms, 17ms, 26ms, 40ms, 35ms, 89ms, 79ms, and 91ms, and the diameters marked as 9ms, 54ms, and 12ms.]
Measurement Projects
ETOMIC
(Evergrow Traffic Observatory Measurement InfrastruCture)
http://www.etomic.org
• Active precise one-way delay measurement.
• Specialized hardware.
• With packet train techniques one can
– Estimate available bandwidth
– Bottleneck capacity
– Perform network tomography
• 18 boxes were deployed in Europe.
• More have been deployed this year
ETOMIC Deployment
Hardware Structure
• A PC with a
– DAG card
• high precision sampling hardware
• high precision packet train generation
– GPS connection
• For synchronized timing
The GPS module
• Garmin 35HVS GPS receiver
• 1 s PPS signal
• RS 232 – RS 422 converter – max 100m cable
[Figure: GPS, RS232->422 converter, RS422->232 converter, PC serial port, DAG PPS input]
The Endace DAG 3.6GE card
• PCI bus – 32 bit 33 MHz
• Single port full packet capture at 10/100/1000 Mbit/s
• Precise timestamping
• Burst of patterned traffic generator – sending special packets at 10/100/1000 Mbit/s
ATOMIC -> ANME
Skitter
http://www.caida.org/tools/measurement/skitter
• Primarily intended to be used to measure
forward IP paths (each ‘hop’) from a source
to many destinations.
• traceroute based
• Based on FreeBSD box with kernel
modification for timestamp accuracy.
• Deployment: 20-30 skitter hosts, worldwide
(Half in the USA).
Skitter Goals
• Measure Forward IP Paths
skitter records each hop from a source to many destinations by
incrementing the "time to live" (TTL) of each IP packet header and
recording replies from each router (or hop) leading to the destination
host.
• Measure Round Trip Time
skitter collects round trip time (RTT) along with path (hop) data.
skitter uses ICMP echo requests as probes to a list of IP destinations.
• Track Persistent Routing Changes
skitter data can provide indications of low-frequency persistent routing
changes. Correlations between RTT and time of day may reveal a
change in either forward or reverse path routing.
• Visualize Network Connectivity
By probing the paths to many destination IP addresses spread
throughout the IPv4 address space, skitter data can be used to visualize
the directed graph from a source to much of the Internet.
Skitter Visualization
2003:
12,517 nodes
35,334 edges
RTT and loss plot
Archipelago (Ark)
• 43 monitors
– 3 commercial
• IPv4 & IPv6
• 25th, 50th, and 75th
percentiles
RouteViews
http://www.routeviews.org
• Peers with ~70 ASes (mostly backbones) to
collect their BGP paths
• The largest and most reliable source of AS
level routing and interconnectivity.
Animating BGP Routing
BGP Routing Table Growth
NLANR
http://www.nlanr.net
• The National Laboratory for Applied Network
Research (NLANR)
• Lots of measurement data
• Active Measurement Project (AMP)
– ~150 universities with high performance connections
measure to each other.
• Passive Measurement and Analysis (PMA)
– Headers taken daily from OC3 - OC48 speed links.
Ono
• A plugin for the Vuze (Azureus)
BitTorrent Client
• 3.5M measurements a day
• Over 3000 ASes a year
– A few hundred measurements per client
– Measure only to other clients
iPlane
An Information Plane for Distributed Services
• Performs traceroutes from PlanetLab nodes
and traceroute servers to construct a router
interface-level Internet map.
• Clustering interfaces into PoPs
– Based on TTL response time
• Latency prediction
PlanetLab
• 1080 nodes over 496 (academic) sites
• Bare bone machines. Load your own tool.
• Host various measurement projects:
– DIMES
– iPlane
– ScriptRoute: (flexible scripts)
Scamper
A tool for network measurement
• IPv4 & IPv6
• Parallel measurements
• Measurement rate control
• Measurement type: UDP, ICMP, TCP,
UDP-paris, and ICMP-paris.
– By default, UDP is used.