Evaluation of the Proximity
between Web Clients and their
Local DNS Servers
Z. Morley Mao
UC Berkeley
C. Cranor, M. Rabinovich, O. Spatscheck, and J. Wang
AT&T Labs-Research
F. Douglis
IBM Research
Motivation

Content Distribution Networks (CDNs)

Attempt to deliver content from servers close to users
[Figure: origin servers, the Internet, cache servers, and clients]
DNS based server selection

Originator problem

Assumes that clients are close to their local
DNS servers
[Figure: Client.myisp.net asks its local DNS server (ns.myisp.net) for www.service.com; the local DNS server resolves the name through A.GTLD-SERVERS.NET and the authoritative DNS server ns.service.com]
Verify the assumption that clients are close to
their local DNS servers
Measurement setup

Three components

1x1 pixel transparent GIF image embedded in participating pages (e.g., www.att.com)
  <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
A specialized authoritative DNS server
  Allows hostnames to be wild-carded
An HTTP redirector
  Always responds with “302 Moved Temporarily”
  Redirects to a URL with the client IP address embedded
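The redirector behaviour described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names, the `/tr.gif` path on the target host, and the default zone `cs.example.com` are assumptions, and the `IPa-b-c-d` label format is taken from the example hostname IP10-0-0-1.cs.example.com used in this talk.

```python
def encode_client_ip(client_ip: str) -> str:
    """Turn '10.0.0.1' into the DNS label 'IP10-0-0-1' (assumed label format)."""
    octets = client_ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {client_ip!r}")
    return "IP" + "-".join(octets)


def build_redirect(client_ip: str, zone: str = "cs.example.com", path: str = "/tr.gif"):
    """Return the status line and headers the redirector would send.

    The zone and path defaults are hypothetical values for illustration.
    """
    location = f"http://{encode_client_ip(client_ip)}.{zone}{path}"
    return "302 Moved Temporarily", {"Location": location}
```

Because every client gets a hostname that no resolver has seen before, the follow-up DNS lookup cannot be answered from a cache and has to reach the authoritative name server.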
Embedded image request sequence
1. HTTP GET request for the image: client [10.0.0.1] to the redirector for xxx.rd.example.com
2. HTTP redirect to IP10-0-0-1.cs.example.com
4. Request to resolve IP10-0-0-1.cs.example.com: local DNS server to the name server for *.cs.example.com
5. Reply: IP address of the content server for the image
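On the name server side, the client address can be read back out of the queried name and paired with the address of the local DNS server that sent the query. The sketch below assumes the `IPa-b-c-d` label format from the example hostname; the function names and the in-memory set of associations are hypothetical, not part of the original system.

```python
import re

# First label of names such as "IP10-0-0-1.cs.example.com" (assumed format).
_LABEL = re.compile(r"^IP(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})$", re.IGNORECASE)


def decode_client_ip(query_name: str):
    """Recover the client IP from the queried name, or None if it does not match."""
    first_label = query_name.rstrip(".").split(".")[0]
    match = _LABEL.match(first_label)
    if match is None:
        return None
    octets = match.groups()
    if any(int(o) > 255 for o in octets):
        return None
    return ".".join(str(int(o)) for o in octets)


def record_association(associations: set, query_name: str, ldns_ip: str) -> None:
    """Store a (client IP, LDNS IP) pair for a query received from ldns_ip."""
    client_ip = decode_client_ip(query_name)
    if client_ip is not None:
        associations.add((client_ip, ldns_ip))
```

Storing pairs in a set keeps only unique client-LDNS associations, however many times the same client fetches the image.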
Measurement Data

Site   Participant                          Image hit count   Duration
1      att.com                              20,816,927        2 months
2,3    Personal pages (commercial domain)   1,743             3 months
4      AT&T research                        212,814           3 months
5-7    University sites                     4,367,076         3 months
8-19   Personal pages (university domain)   26,563            3 months
Measurement statistics

Data type                                                                 Count
Unique client-LDNS associations                                           4,253,157
HTTP requests                                                             25,425,123
Unique client IPs                                                         3,234,449
Unique LDNS IPs                                                           157,633
Client-LDNS associations where client and LDNS have the same IP address   56,086
Proximity metrics:




AS clustering
Network clustering
Traceroute divergence
Roundtrip time correlation
AS clustering

Autonomous System (AS)


A single administrative entity with unified
routing policy
Observes if client and LDNS belong to
the same AS
Network clustering




[Krishnamurthy, Wang SIGCOMM'00]
Based on BGP routing information using
the longest prefix match
Each prefix identifies a network cluster
Observes if client and LDNS belong to
the same network cluster
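A minimal sketch of the network-cluster test, assuming the BGP prefixes are available as a list of CIDR strings. The function names are hypothetical, and the linear scan stands in for whatever lookup structure a real analysis would use (typically a trie).

```python
import ipaddress


def longest_prefix_match(ip: str, prefixes):
    """Return the most specific prefix containing ip, or None if none matches."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


def same_network_cluster(client_ip: str, ldns_ip: str, prefixes) -> bool:
    """True when client and LDNS fall under the same longest-matching prefix."""
    client_cluster = longest_prefix_match(client_ip, prefixes)
    if client_cluster is None:
        return False  # an address outside every known prefix has no cluster
    return client_cluster == longest_prefix_match(ldns_ip, prefixes)
```

With the prefixes 10.0.0.0/8 and 10.1.0.0/16, the addresses 10.1.2.3 and 10.2.0.1 land in different clusters even though both lie inside 10.0.0.0/8, because the more specific /16 wins for the first address.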
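Traceroute divergence
• [Shaikh et al. INFOCOM'00]
• Use the last point of divergence
• Traceroute divergence: Max(3,4)=4
[Figure: the paths from the probe machine to the client and to the local DNS server share a common part and then diverge; the two diverging branches are 3 and 4 hops long]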
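The Max(3,4)=4 example can be reproduced with a short function. This sketch assumes each traceroute is given as a list of hop identifiers from the probe machine to the destination (destination included) and that the two paths do not rejoin after they split; the function name is hypothetical.

```python
def traceroute_divergence(path_to_client, path_to_ldns) -> int:
    """Hops from the last shared hop to the farther of the two endpoints."""
    common = 0
    for hop_a, hop_b in zip(path_to_client, path_to_ldns):
        if hop_a != hop_b:
            break
        common += 1
    # Remaining hops on each branch after the last point the paths share.
    return max(len(path_to_client) - common, len(path_to_ldns) - common)
```

For a client that is 3 hops past the split and a local DNS server that is 4 hops past it, the function returns 4, matching the slide.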
Roundtrip time correlation



Correlation between message roundtrip
times from a probe site to the client and
its LDNS server
The probe site represents a potential
cache server location
A crude metric, highly dependent on
the probe site
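One way to quantify such a correlation is the Pearson coefficient over paired measurements, one pair per client-LDNS association, all taken from the same probe site. The talk does not say which correlation measure was used, so Pearson is an assumption here, as is the function name.

```python
from math import sqrt


def pearson(xs, ys) -> float:
    """Pearson correlation between two equally long samples of round-trip times."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

A value near 1 would suggest that, as seen from this probe site, the LDNS round-trip time is a good stand-in for the client round-trip time; a different probe site can give a different answer, which is why the metric is called crude.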
Aggregate statistics of AS/network clustering

Metrics              # client clusters   # LDNS clusters   Total # clusters
AS clustering        9,215               8,590             9,570
Network clustering   98,001              53,321            104,950

More than 13,000 ASes in total
  The data covers close to 75% of all ASes
440,000 unique prefixes in total
  The data covers close to 25% of all possible network clusters
We have a representative data set
Proximity analysis:
AS, network clustering

Share of clients (and of their requests) whose LDNS is in the same cluster:

Metrics           Client IPs   HTTP requests
AS cluster        64%          69%
Network cluster   16%          24%

AS clustering: coarse-grained
Network clustering: fine-grained
Most clients are not in the same routing entity as their LDNS
Clients with LDNS in the same cluster are slightly more active
Proximity analysis:
Traceroute divergence

Probe sites: NJ(UUNET), NJ(AT&T), Berkeley(Calren), Columbus(Calren)
Sampled from top half of busy network clusters
Median divergence: 4
Mean divergence: 5.8-6.2
Ratio of common to disjoint path length
  72%-80% of the pairs traced have a common path at least as long as the disjoint path
Improved local DNS configuration

For client-LDNS associations not in the same cluster, do we know an LDNS in the client’s cluster?

                  Client IPs            HTTP requests
Metrics           Original   Improved   Original   Improved
AS cluster        64%        88%        69%        92%
Network cluster   16%        66%        24%        70%
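The "Original" and "Improved" columns can be thought of as two passes over the same associations. In this sketch, a client counts as "original" if one of its observed LDNS servers shares its cluster, and as "improved" if any LDNS seen anywhere in the data set does. The function name, the input shapes, and this per-client counting rule are assumptions made for illustration.

```python
def cluster_coverage(associations, cluster_of):
    """Return (original, improved) fractions of clients with an in-cluster LDNS.

    associations: iterable of (client_ip, ldns_ip) pairs
    cluster_of:   dict mapping every IP to its cluster identifier
    """
    associations = list(associations)
    clients = {client for client, _ in associations}
    # Clusters that contain at least one LDNS observed in the data set.
    ldns_clusters = {cluster_of[ldns] for _, ldns in associations}
    original = {c for c, l in associations if cluster_of[c] == cluster_of[l]}
    improved = {c for c in clients if cluster_of[c] in ldns_clusters}
    return len(original) / len(clients), len(improved) / len(clients)
```

The improved figure is an upper bound on what reconfiguration could achieve, since it assumes every client could be pointed at the in-cluster LDNS.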
Impact on commercial CDNs

Data set
  Client-LDNS associations
  LDNS-CDN associations
  Available CDN servers
Clients w/ CDN server in cluster
Verifiable clients: w/ responsive LDNS
Misdirected clients: directed to a cache not in client’s cluster
Clients with LDNS not in same cluster
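The nested client categories can be expressed as successive filters. This is a sketch under stated assumptions, not the authors' analysis code: the function name and input shapes are hypothetical, and an LDNS is treated as "responsive" when the data holds a CDN answer for it.

```python
def cdn_impact(clients, cluster_of, cdn_clusters, answer_for_ldns):
    """Count clients in each category for one CDN.

    clients:         list of (client_ip, ldns_ip) associations
    cluster_of:      dict mapping every IP (client, LDNS, cache) to a cluster id
    cdn_clusters:    set of clusters that contain at least one CDN server
    answer_for_ldns: dict mapping an LDNS IP to the cache IP the CDN returned
                     to it; unresponsive LDNS servers are simply absent
    """
    with_server = [(c, l) for c, l in clients if cluster_of[c] in cdn_clusters]
    verifiable = [(c, l) for c, l in with_server if answer_for_ldns.get(l) is not None]
    misdirected = [
        (c, l) for c, l in verifiable
        if cluster_of[answer_for_ldns[l]] != cluster_of[c]
    ]
    ldns_elsewhere = [(c, l) for c, l in misdirected if cluster_of[l] != cluster_of[c]]
    return {
        "with_cdn_server_in_cluster": len(with_server),
        "verifiable": len(verifiable),
        "misdirected": len(misdirected),
        "misdirected_with_ldns_elsewhere": len(ldns_elsewhere),
    }
```

Each category is a subset of the previous one, which is why the percentages on the following slides are given relative to the preceding row rather than to all clients.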
Impact on commercial CDNs
AS clustering

CDN                                         CDN X            CDN Y            CDN Z
Clients with CDN server in cluster          1,679,515        1,215,372        618,897
Verifiable clients                          1,324,022        961,382          516,969
Misdirected clients                         809,683 (60%)    752,822 (77%)    434,905 (82%)
  (% of verifiable clients)
Clients with LDNS not in client’s cluster   443,394 (55%)    354,928 (47%)    262,713 (60%)
  (% of misdirected clients)
Impact on commercial CDNs
Network clustering

CDN                                         CDN X            CDN Y            CDN Z
Clients with cache server in cluster        264,743          156,507          103,448
Verifiable clients                          221,440          132,567          90,264
Misdirected clients                         154,198 (68%)    125,449 (94%)    87,486 (96%)
  (% of verifiable clients)
Clients with LDNS not in client’s cluster   145,276 (94%)    116,073 (93%)    84,737 (97%)
  (% of misdirected clients)

Clients with cache server in cluster: less than 10% of all clients
Conclusion

Novel technique for finding client and local DNS associations
  Fast, non-intrusive, and accurate
DNS based server selection works well for coarse-grained load-balancing
  64% associations in the same AS
  16% associations in the same network cluster
Server selection can be inaccurate if server density is high
Related work

Measurement methodology
1. IBM (Shaikh et al.)
   Time correlation of DNS and HTTP requests from DNS and Web server logs
2. Boston University (Bestavros et al.)
   Assigning multiple IP addresses to a Web server
3. Web bugs
Differences from our work:
   Our methodology: efficient, accurate, non-intrusive
Proximity metrics
   Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS