Evaluation of the Proximity
between Web Clients and
their Local DNS Servers
Z. Morley Mao
Chuck Cranor, Fred Douglis,
Misha Rabinovich, Oliver Spatscheck,
and Jia Wang
Motivation: originator problem
Content Distribution Networks (CDNs)
  Try to deliver content from servers close to users
Current server selection mechanisms
  Uses Domain Name System (DNS)
Originator problem
  CDNs assume that clients are close to their local DNS servers
Verify the assumption that clients are close to their local DNS servers
Measurement setup
Three components
  1x1 pixel embedded transparent GIF image
    <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
  A specialized authoritative DNS server
    Allows hostnames to be wild-carded
  An HTTP redirector
    Always responds with “302 Moved Temporarily”
    Redirect to a URL with client IP address embedded
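The redirector's behavior can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `/tr.gif` path on the content server, and the exact response headers are assumptions; only the `IP10-0-0-1.cs.example.com` naming pattern and the 302 status line come from the slides.

```python
import ipaddress

# Wild-carded zone served by the specialized DNS server (from the slides).
CONTENT_DOMAIN = "cs.example.com"


def redirect_location(client_ip: str, path: str = "/tr.gif") -> str:
    """Build the redirect target with the client's IPv4 address embedded.

    The path default is an assumption made for this sketch.
    """
    addr = ipaddress.IPv4Address(client_ip)  # raises on malformed input
    label = "IP" + str(addr).replace(".", "-")
    return f"http://{label}.{CONTENT_DOMAIN}{path}"


def redirect_response(client_ip: str) -> str:
    """Return a raw HTTP response that always redirects (hypothetical headers)."""
    return (
        "HTTP/1.1 302 Moved Temporarily\r\n"
        f"Location: {redirect_location(client_ip)}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    print(redirect_response("10.0.0.1"))
```

Because every client gets a distinct hostname, the lookup cannot be answered from a shared cache entry and has to reach the authoritative name server.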
Embedded image request sequence
Participants: Client [10.0.0.1], Redirector for xxx.rd.example.com, Content server for the image, Local DNS server, Name server for *.cs.example.com
  1. HTTP GET request for the image
  2. HTTP redirect to IP10-0-0-1.cs.example.com
  4. Request to resolve IP10-0-0-1.cs.example.com
  5. Reply: IP address of content server
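On the name server side, each query for a wild-carded name reveals both the client (encoded in the hostname) and the local DNS server (the source of the query). The sketch below shows one way to recover that pair; the function name, the regular expression, and the sample LDNS address 192.0.2.53 are assumptions made for illustration.

```python
import ipaddress
import re
from typing import Optional, Tuple

# Matches names such as IP10-0-0-1.cs.example.com (optional trailing dot).
_NAME_RE = re.compile(
    r"^IP(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.cs\.example\.com\.?$",
    re.IGNORECASE,
)


def client_ldns_pair(qname: str, query_source_ip: str) -> Optional[Tuple[str, str]]:
    """Return (client IP, LDNS IP) for a wild-carded query, or None if the
    name does not carry a valid embedded IPv4 address."""
    match = _NAME_RE.match(qname)
    if match is None:
        return None
    try:
        client = ipaddress.IPv4Address(".".join(match.groups()))
    except ipaddress.AddressValueError:
        return None  # e.g. an octet above 255
    return (str(client), query_source_ip)


if __name__ == "__main__":
    # Hypothetical query arriving from an LDNS at 192.0.2.53.
    print(client_ldns_pair("IP10-0-0-1.cs.example.com", "192.0.2.53"))
```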
Measurement data/stats

Site   Participant                                Hit count    Duration
1      att.com                                    20,816,927   2 months
2,3    Personal Web pages (commercial domain)          1,743   3 months
4      Research lab                                  212,814   3 months
5-7    University site                             4,367,076   3 months
8-19   Personal Web pages (university domain)         26,563   3 months

Data type                                    Count
Client-LDNS associations                     4,253,157
HTTP requests                                25,425,123
Unique client IPs                            3,234,449
Unique LDNS IPs                              157,633
Client-LDNS associations with a common IP    56,086
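As a quick consistency check on the two tables, the per-site hit counts should add up to the reported number of HTTP requests. The snippet below only restates figures from the slides; the dictionary keys are labels chosen for this sketch.

```python
# Hit counts per participating site group, as listed in the slides.
hit_counts = {
    "att.com": 20_816_927,
    "personal pages (commercial domain)": 1_743,
    "research lab": 212_814,
    "university site": 4_367_076,
    "personal pages (university domain)": 26_563,
}

REPORTED_HTTP_REQUESTS = 25_425_123

total_hits = sum(hit_counts.values())

if __name__ == "__main__":
    print(total_hits, total_hits == REPORTED_HTTP_REQUESTS)
```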
Proximity metrics:
AS clustering
  Observes if client and LDNS belong to the same AS
Network clustering
  Network cluster based on BGP routing information using longest prefix match
  Observes if client and LDNS belong to the same network cluster
Roundtrip time correlation
  Correlation between message roundtrip times from a probe site to the client and its LDNS server
  Probe site represents a potential cache server location
  A crude metric, highly dependent on the probe site
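Network clustering can be illustrated with a small longest-prefix-match sketch. The prefix table here is made up for the example (a real one would be derived from BGP routing information, as the slides state), and the function names are hypothetical.

```python
import ipaddress
from typing import Iterable, Optional


def longest_prefix_match(
    ip: str, prefixes: Iterable[ipaddress.IPv4Network]
) -> Optional[ipaddress.IPv4Network]:
    """Return the most specific prefix containing ip, or None if none does."""
    addr = ipaddress.IPv4Address(ip)
    best = None
    for net in prefixes:
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


def same_network_cluster(client_ip: str, ldns_ip: str, prefixes) -> bool:
    """True if client and LDNS map to the same longest-matching prefix."""
    client_cluster = longest_prefix_match(client_ip, prefixes)
    ldns_cluster = longest_prefix_match(ldns_ip, prefixes)
    return client_cluster is not None and client_cluster == ldns_cluster


# Hypothetical prefix table, for illustration only.
PREFIXES = [
    ipaddress.IPv4Network("10.0.0.0/8"),
    ipaddress.IPv4Network("10.1.0.0/16"),
    ipaddress.IPv4Network("192.0.2.0/24"),
]

if __name__ == "__main__":
    print(same_network_cluster("10.1.2.3", "10.1.9.9", PREFIXES))
    print(same_network_cluster("10.1.2.3", "10.2.0.53", PREFIXES))
```

AS clustering follows the same pattern with a coarser mapping: each prefix is mapped to its origin AS, and the comparison is made on AS numbers rather than on prefixes.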

Proximity metric: traceroute divergence (TD)
[Figure: traceroute paths from the probe machine to the client and to its local DNS server, with hops numbered after the point of divergence]
•Use the last point of divergence
•TD=Max(3,4)=4
Sample probe sites: NJ(UUNET), NJ(AT&T), Berkeley(calren), Columbus(calren)
Size: 48,908 client-LDNS pairs
Median divergence: 4
Mean divergence: 5.8-6.2
Ratio of common to disjoint path length
  About 66% of pairs traced have a common path at least as long as the disjoint path
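The TD computation can be sketched as below, assuming each traceroute is available as an ordered list of hop identifiers from the probe machine to the destination, and assuming "last point of divergence" means the last hop that appears on both paths. The hop names and the function name are hypothetical; the example is built so that it reproduces the slide's TD=Max(3,4)=4.

```python
from typing import List


def traceroute_divergence(path_to_client: List[str], path_to_ldns: List[str]) -> int:
    """Return TD: the larger of the two hop counts from the last shared hop
    to the client and to the LDNS. Paths include the destination as last item."""
    # For each hop on the LDNS path, remember its last position.
    ldns_last_index = {hop: i for i, hop in enumerate(path_to_ldns)}
    # Walk the client path backwards to find the last hop shared by both paths.
    for i in range(len(path_to_client) - 1, -1, -1):
        hop = path_to_client[i]
        if hop in ldns_last_index:
            hops_to_client = len(path_to_client) - 1 - i
            hops_to_ldns = len(path_to_ldns) - 1 - ldns_last_index[hop]
            return max(hops_to_client, hops_to_ldns)
    raise ValueError("paths share no hop")


# Hypothetical paths: shared prefix, then 3 hops to the client, 4 to the LDNS.
shared = ["probe", "a", "b"]
client_path = shared + ["c1", "c2", "client"]
ldns_path = shared + ["d1", "d2", "d3", "ldns"]

if __name__ == "__main__":
    print(traceroute_divergence(client_path, ldns_path))
```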
Proximity analysis results: AS, network clustering

Metrics           Client IPs    HTTP requests
AS cluster        64% (88%)     69% (92%)
Network cluster   16% (66%)     24% (70%)

AS clustering: coarse-grained
Network clustering: fine-grained
Most clients not in same routing entity as their LDNS
Clients with LDNS in same cluster slightly more active
Numbers in red indicate improvement possible.
Impact on commercial CDNs
total # clients = 3,234,449

CDN (using AS clustering)                                           CDN X            CDN Y            CDN Z
Clients with CDN server in cluster                                  1,679,515        1,215,372        618,897
Verifiable clients                                                  1,324,022        961,382          516,969
Misdirected clients (% verifiable clients)                          809,683 (60%)    752,822 (77%)    434,905 (82%)
Clients with LDNS not in client’s cluster (% misdirected clients)   443,394 (55%)    354,928 (47%)    262,713 (60%)

CDN (using network aware clustering)                                CDN X            CDN Y            CDN Z
Clients with CDN server in cluster                                  264,743          156,507          103,448
Verifiable clients                                                  221,440          132,567          90,264
Misdirected clients (% verifiable clients)                          154,198 (68%)    125,449 (94%)    87,486 (96%)
Clients with LDNS not in client’s cluster (% misdirected clients)   145,276 (94%)    116,073 (93%)    84,737 (97%)

Verifiable client: A client with LDNS in cluster, responding to our request, and has at least one cache server in its cluster
Majority of “misdirected clients” for NAC have LDNS nonlocal
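The "% misdirected clients" row can be recomputed from the counts in the tables. The snippet below is an arithmetic illustration only; the dictionary layout and the `percent` helper are choices made for this sketch, and the counts are copied from the slides.

```python
# (misdirected clients, misdirected clients whose LDNS is not in the client's cluster)
counts = {
    "AS clustering": {
        "CDN X": (809_683, 443_394),
        "CDN Y": (752_822, 354_928),
        "CDN Z": (434_905, 262_713),
    },
    "network aware clustering": {
        "CDN X": (154_198, 145_276),
        "CDN Y": (125_449, 116_073),
        "CDN Z": (87_486, 84_737),
    },
}


def percent(part: int, whole: int) -> int:
    """Share of part in whole, rounded to the nearest whole percent."""
    return round(100 * part / whole)


ldns_outside_share = {
    clustering: {cdn: percent(outside, misdirected)
                 for cdn, (misdirected, outside) in per_cdn.items()}
    for clustering, per_cdn in counts.items()
}

if __name__ == "__main__":
    for clustering, shares in ldns_outside_share.items():
        print(clustering, shares)
```

Under network aware clustering, nearly all misdirected clients have an LDNS outside their own cluster, which is the observation the slide highlights.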
Conclusion
DNS based server selection works well for coarse-grained load-balancing
Server selection can be inaccurate if cache server density is high
Future work
Study alternatives to DNS based server selection
Improved proximity evaluation