Presentation transcript (PowerPoint slides)

On the Stability of Network
Distance Monitoring
Yan Chen, Chris Karlof, Yaping Li and
Randy Katz
{yanchen, ckarlof, yaping, randy}@CS.Berkeley.EDU
EECS Department
UC Berkeley
1
Introduction
• Lots of applications/services may benefit from end-to-end
distance monitoring/estimation
– Mirror Selection
– VPN management/provisioning
– Overlay Routing/Location
– Peer-to-peer file system
– Cache-infrastructure Configuration
– Service Redirection/Placement
• Problem formulation:
Given N end hosts that may belong to different administrative
domains, how do we select a subset of them to serve as probes and
build an overlay distance monitoring service without knowing
the underlying topology?
• Solution: Internet Iso-bar
– Cluster hosts that perceive similar performance to the Internet
– For each cluster, select a monitor for active and continuous probing
– The first system to evaluate monitoring site selection and stability
with real Internet measurement data
– Compare with other distance estimation services: IDMaps, GNP
2
Related Work
• Existing Internet E2E distance estimation systems fall
into two categories:
– Clustering based (service-centric): IDMaps, Network Distance
Map, Internet Iso-bar
– Coordinate based (end-host-centric): triangulated schemes, GNP
• Pioneering work: IDMaps
– Clustering by IP address prefix (not very accurate)
– Based on the triangle inequality
– Number of hops only
– Neither dynamics nor stability addressed
• Network Distance Map
– Clustering based on network proximity rather than similarity
– Fixed monitors; no monitor placement/selection
• GNP:
– Each client maintains its own coordinates
– Distance estimated by applying a distance function over the
coordinates (see the sketch below)
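As a minimal illustration of the coordinate-based approach: each host keeps a D-dimensional coordinate, and any pair's distance is estimated from the two coordinates alone. The plain Euclidean distance and the function name below are illustrative assumptions, not GNP's actual implementation.

import math

def coordinate_estimate(coord_i, coord_j):
    # Estimate the distance between two hosts from their D-dimensional
    # coordinates by applying a distance function (here, Euclidean distance
    # is used only as an example of such a function).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coord_i, coord_j)))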
3
Framework of Internet Iso-bar
• Define correlation distance between each pair of hosts
• Apply one of the generic clustering methods below
– Limit the diameter of each cluster (max distance between any two hosts in the
cluster), and minimize the number of clusters
– Limit the number of clusters, then minimize the max diameter over
all clusters
• Choose the center of each cluster as its monitor
• Monitors periodically measure the distances among themselves
as well as to the hosts in their own clusters
• Inter-cluster distance estimation
dist(i, j) = dist(monitor_i, monitor_j)
• Intra-cluster distance estimation (i and j share monitor m; see the sketch below)
dist(i, j) = (dist(i, m) + dist(j, m)) / 2
• Inter-cluster estimation dominates
– Given K evenly distributed clusters, the ratio of inter- to intra-cluster estimations is K - 1
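Below is a minimal sketch of the two estimation rules, assuming a table of measured distances keyed by host pairs; the names (estimate_distance, cluster_of, monitor_of, measured) are illustrative, not part of the Internet Iso-bar implementation.

def estimate_distance(i, j, cluster_of, monitor_of, measured):
    # Estimate dist(i, j) using only distances that monitors have measured.
    ci, cj = cluster_of[i], cluster_of[j]
    if ci != cj:
        # Inter-cluster rule: use the distance between the two cluster monitors.
        return measured[(monitor_of[ci], monitor_of[cj])]
    # Intra-cluster rule: i and j share monitor m; average their distances to m.
    m = monitor_of[ci]
    return (measured[(i, m)] + measured[(j, m)]) / 2

With K evenly sized clusters, most host pairs fall in different clusters, so the inter-cluster rule is applied roughly K - 1 times as often as the intra-cluster rule.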
4
Correlation Distance
• Network distance based
– Using proximity: d_ij = measured network distance p_ij
– Using the Euclidean distance of the network distance vectors
V_i = [p_i1, p_i2, ..., p_in]^T:
d_ij = sqrt( sum_{1<=k<=n} (p_ik - p_jk)^2 )
– Using the cosine vector similarity of the network distance vectors:
d_ij = 1 - ( sum_{k=1..n} p_ik * p_jk ) / ( sqrt(sum_{k=1..n} p_ik^2) * sqrt(sum_{k=1..n} p_jk^2) )
• Geographical distance based
– Using proximity:
d_ij = sqrt( (long_i - long_j)^2 + (lat_i - lat_j)^2 )
(a sketch of these metrics follows below)
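A minimal sketch of these correlation-distance options, assuming each host i has a network distance vector p_i = [p_i1, ..., p_in] (e.g., RTTs to the n hosts) and, for the geographical variant, a (longitude, latitude) pair; the function names and inputs are illustrative only.

import math

def net_ed(p_i, p_j):
    # Euclidean distance between two network distance vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_i, p_j)))

def net_vs(p_i, p_j):
    # One minus the cosine similarity of two network distance vectors.
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm_i = math.sqrt(sum(a * a for a in p_i))
    norm_j = math.sqrt(sum(b * b for b in p_j))
    return 1 - dot / (norm_i * norm_j)

def geo_p(loc_i, loc_j):
    # Geographical proximity from (longitude, latitude) pairs.
    return math.sqrt((loc_i[0] - loc_j[0]) ** 2 + (loc_i[1] - loc_j[1]) ** 2)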
5
Properties Comparison
Notation: N = # of hosts; K = # of monitors; AP = # of address prefixes;
D = # of dimensions; I = # of iterations for optimization (proportional to
the # of variables, can be very large)
• Communication cost, offline setup
– IDMaps: O(K * AP)
– Internet Iso-bar: O(N^2) for net_*, O(N) for geo_p
– GNP: O(N^2) for landmark selection, O(K^2 + N * K) for random landmarks
• Communication cost, online update
– IDMaps: O(K^2 + AP)
– Internet Iso-bar: O(K^2 + N)
– GNP: O(K^2 + N * K)
• Computation cost, offline setup
– IDMaps: O(AP * K)
– Internet Iso-bar: O(N * K)
– GNP: O(K * N^2 * logN) + O(K^3 * D) * I(K * D) + O(N * K * D) * I(D)
• Computation cost, online update
– IDMaps: O(1)
– Internet Iso-bar: O(1)
– GNP: O(K^3 * D) * I(K * D) + O(N * K * D) * I(D)
6
Evaluation Methodology
• Experiments with the NLANR AMP data set
– 119 sites in the US (106 after filtering out mostly-off sites)
– Traceroute between every pair of hosts every minute
– Clustering uses the daily geometric mean of round-trip time (RTT)
(see the sketch below)
– Evaluation uses all 1440 daily RTT measurements
– Raw data: 6/24/00 – 12/3/01
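A minimal sketch of the daily geometric mean used for clustering, assuming rtts holds one day's RTT samples for a host pair (e.g., the 1440 per-minute measurements) and all samples are positive; this is an illustrative helper, not code from the study.

import math

def daily_geometric_mean(rtts):
    # Geometric mean = exp of the arithmetic mean of the log-RTTs.
    return math.exp(sum(math.log(r) for r in rtts) / len(rtts))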
7
Performance & Stability Evaluation
• Compare 6 distance estimation schemes (notation in parentheses)
– Clustering with proximity of network distance (net_p)
– Clustering with Euclidean distance of the network distance vector (net_ed)
– Clustering with vector similarity of the network distance vector (net_vs)
– Clustering with proximity of geographical distance (geo_p)
– GNP
» All schemes above use 15 clusters/landmarks
– Omniscient: using the original p_ij to predict the future p_ij (omni)
• Stability analysis
– Clustering / coordinate calculation with day 1 (birth date)
measurements
– Compute the relative prediction error (rpe) using day 2 (estimation date)
measurements (see the sketch below):
rpe = |predicted distance - measured distance| / min(predicted distance, measured distance)
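A minimal sketch of the relative prediction error, assuming both distances are positive; the function name is illustrative.

def relative_prediction_error(predicted, measured):
    # rpe = |predicted - measured| / min(predicted, measured)
    return abs(predicted - measured) / min(predicted, measured)

For example, predicting 60 ms when 40 ms was measured gives rpe = 20 / 40 = 0.5.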
8
Stability CDF of relative errors for 1-month (left) & 6-month (right)
Summary of 80th (left) & 90th (right) percentile relative error
9
Conclusion
• Omniscient always works best
– RTT is overall quite stable for the experimental sites and
period, but this needs further verification
– It cannot report congestion in a timely manner
– It requires the full N x N IP distance matrix, so scalability
tricks such as hierarchy do not apply
• GNP has better performance and stability than
clustering-based schemes
– But has much higher computation & communication cost for updates
• Using similarity of network distance for clustering works
much better than using proximity
• Geographical-proximity-based clustering works better
than network-proximity-based clustering
– Requires no measurement for clustering & monitor selection
– Provides reasonably good performance & stability
– But may be biased by the dataset used
10
Current Work
• Congestion/Failure Correlation of Clustered Hosts
– Can monitors report congestion/path outages in a timely manner? What about false alarms?
• Evaluation with Keynote Web Site Perspective Benchmarking
(Collaboration with Dr. Chris Overton@Keynote)
– Measure Web site performance from more than 100 agents on the
Internet
– Heterogeneous core network: various ISPs
– Heterogeneous access network:
» Dial up 56K, DSL and high-bandwidth business connections
– Choose the 40 most popular Web servers for benchmarking
– Problem: how to reduce the number of agents and/or servers while
still representing the majority of end-user performance over a
reasonably stable period?
11
Keynote Agent Locations
• America (including Canada, Mexico): 67 agents
– 29 cities: Houston, Toronto, LA, Minneapolis, DC, Boston, Miami, Dallas, NY,
SF, Cleveland, Philadelphia, Milwaukee, Chicago, Cincinnati, Portland,
Vancouver, Seattle, Phoenix, San Diego, Denver, Sunnyvale, McLean, Atlanta,
Tampa, St. Louis, Mexico, Kansas City, Pleasanton
– 14 ISPs: PSI, Verio, UUNET, C&W, Sprint, Qwest, Genuity, AT&T, XO,
Exodus, Level3, Intermedia, Avantel, SBC
• Europe: 25 agents
– 12 cities: London, Paris, Frankfurt, Munich, Oslo, Copenhagen, Amsterdam,
Helsinki, Milan, Stockholm, Madrid, Brussels
– 16 ISPs: PSI, Cerbernet, Oleane, Level3, ECRC, Nextra, UUNET,
TeleDanmark, KPNQwest, Inet, DPN, Xlink, Telia, Retevision, BT,
Telephonica
• Asia: 8 agents
– 6 cities: Seoul, Singapore, Tokyo, Shanghai, Hong Kong, Taipei
– 8 ISPs: BORANet, SingTel, IIJ, ChinaTel, HKT, Kornet, NTTCOM, HiNet
• Australia: 3 agents
– 3 cities: Sydney, Wellington, Melbourne
– 3 ISPs: OzeMail, Telstra-Saturn, Optus
12
Evaluation of Generic Clustering
Algorithms
• Limit-number clustering and limit-diameter clustering give
similar results, with limit-number a bit better
• Net_ed and net_vs give similar results, with net_vs a bit better
• Use limit-number clustering for the rest of the comparison
13
Performance
Evaluation
• Static and stability analysis at daily, tri-daily, weekly,
bi-weekly, monthly, and six-monthly intervals
14
15