Internet structure

Download Report

Transcript Internet structure

Principles in Communication Networks
• Instractor: Prof. Yuval Shavitt,
– Office hours: room 303 s/w eng. bldg., Tue 14:0015:00
• Prerequisites (‫)דרישות קדם‬:
– Introduction to computer communications (TAU,
Technion, BGU)
• Expectations from students:
–
–
–
–
probability
Queueing theory basics
Graph theory
Some programming skills
Course Syllabus (tentative)
• Internet structure
• Introduction to switching, router
types
• Use of Gen. Func.: HOL analysis,
TCP analysis.
• Matching algorithms and their
analysis
• CLOS networks: non-blocking
theorem, routing algorithms and
their analysis
• Event simulators – introduction
• Scheduling algorithms: WFQ,
W2FQ, priorities
• Distributed algorithms
Grade composition
•
•
•
•
Final exam
Paper presentation (20-30 minutes)
Critical review of a paper (best of two)
Home assignments (2-3)
Routing in the Internet
Routing in the Internet
Routing in the Internet is done in three levels:
– In LANs in the MAC layer:
• Spanning tree protocol for Ethernet Transparent bridge.
• Source routing for token rings
• Inside autonomous systems (ASes):
– RIP, OSPF, IS-IS, (E)IGRP
• Between ASes:
– BGP
Autonomous Systems
• Autonomous Routing Domains: A collection of
physical networks glued together using IP, that have
a unified administrative routing policy.
• An AS is an autonomous routing domain that has
been assigned a number.
… the administration of an AS appears to other ASes to
have a single coherent interior routing plan and presents a
consistent picture of what networks are reachable through it.
RFC 1930: Guidelines for creation, selection,
and registration of an Autonomous System
Internet Hierarchical Routing
C.b
a
Host
h1
C
b
A.a
Inter-AS
routing
between
A and B
A.c
a
d
c
b
A
Intra-AS routing
within AS A
B.a
a
c
B
Host
h2
b
Intra-AS routing
within AS B
Why different Intra- and Inter-AS routing ?
Policy:
• Inter-AS: admin wants control over how its traffic
routed, who routes through its net.
• Intra-AS: single admin, so no policy decisions
needed
Scale:
• hierarchical routing saves table size, reduced
update traffic
Performance:
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance
RIP
• A distance-vector protocol – (distributed
Bellman Ford)
• Developed in the 80s based on a Xerox
protocol
• RIP-2 is now often used due to its
simplicity
• Distance metric: minimum hop
OSPF / IS-IS
• Link state protocol – each node see the
entire network map and calculate shortest
paths using Dijksrta algorithm.
• Allows two level of hierarchy
• Authentication
• Complex
• IS-IS gain popularity among large ISPs
The structure of the Internet
How are routers connected?
• Why should we care?
– While communication protocols will work
correctly on ANY topology
– ….they may not be efficient for some
topologies
– Knowledge of the topology can aid in
optimizing protocols
The Internet as a graph
• Remember: the Internet is a collection of
networks called autonomous systems (ASs)
• The Internet graph:
– The AS graph
• Nodes: ASs, links: AS peering
– The router level graph
• Nodes: routers, links: fibers, cables, MW channels, etc.
– There are mid-level aggregation schemes
• How does it looks like?
Random graphs in Mathematics
The Erdös-Rényi model
• Generation:
– create n nodes.
– each possible link is added with probability p.
• Number of links: np
• If we want to keep the
number of links linear,
what happen to p as
n?
Poisson distribution
The Waxman model
• Integrating distance with the E-R model
• Generation
– Spread n nodes on a large enough grid.
– Pick a link uar and add it with prob. that
exponentially decrease with its length
– Stop if enough links
• Heavily used in the 90s
100
90
80
70
60
50
40
30
20
10
0
0
10
20
30
40
50
60
70
80
90
100
1999
The Faloutsos brothers
• Measured the Internet
AS and router graphs.
• Mine, she looks
different!
Notre Dame
• Looked at complex
system graphs: social
relationship, actors,
neurons, WWW
• Suggested a dynamic
generation model
The Faloutsos Graph
1995 Internet router topology
3888 nodes, 5012 edges, <k>=2.57
SCIENCE CITATION INDEX
Nodes: papers
Links: citations
25
Witten-Sander
PRL 1981
1736 PRL papers (1988)
2212
P(k) ~k-
( = 3)
(S. Redner, 1998)
Sex-web
Nodes: people (Females; Males)
Links: sexual relationships
4781 Swedes; 18-74;
59% response rate.
Liljeros et al. Nature 2001
Web power-laws
SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed.
Networks continuously expand by
the addition of new nodes
Examples:
WWW : addition of new documents
Citation : publication of new papers
(2) The attachment is NOT uniform.
A node is linked with higher probability to a node
that already has a large number of links.
Examples :
WWW : new documents link to well known sites
(CNN, YAHOO, NewYork Times, etc)
Citation : well cited papers are more likely to be cited again
(1) GROWTH :
Scale-free model
At every timestep we add a new node with m edges
(connected to the nodes already present in the system).
(2) PREFERENTIAL ATTACHMENT :
The probability Π that a new node will be connected to
node i depends on the connectivity ki of that node
ki
 ( ki ) 
 jk j
P(k) ~k-3
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
The Faloutsos Graph
node degree for AS20000102.m
4
10
3
10
2
10
1
10
0
10
0
10
1
10
2
10
3
10
4
10
Back to the Internet
• Understanding its structure and dynamics
– help applications (WWW, file sharing)
– help improving routing
– predict Internet growth
• So lets look at the data….
…Data?
• The Internet is an engineered system, so
someone must know how it is built, no?
• NO! It is an uncoordinated interconnection
of Autonomous Systems
(ASes=networks).
• No central database about Internet
structure.
• Several projects attempt to reveal the
structure: Skitter, RouteViews, …
The Internet Structure
routers
The Internet Structure
The AS graph
Revealing the Internet Structure
Revealing the Internet Structure
Revealing the Internet Structure
Revealing the Internet Structure
Diminishing return!

Deploying more
boxes does not
pay-off
7 new links
30 new links
NO new links
Revealing the Internet Structure
To obtain the ‘horizontal’
links we need strong
presence in the edge
What is DIMES?
DIMES
• Distributed Internet measurement and monitoring
– Based on software agents downloaded by volunteers
• Diminishing return?
– Software agents

– The cost of the first agent is very high
– each additional agent costs almost zero
• Capabilities
– Obtaining Internet maps at all granularity level
• connectivity, delay, loss, bandwidth, jitter, ….
– Tracking the Internet evolution in time
– Monitoring the Internet in real time
DIMES
Correlating the Internet with the World:
Geography, Economics, Social Sciences
The Internet as a complex system:
static and dynamic analysis
Distributed System Design:
Obtaining the Internet Structure
Diminishing Return?
• [Chen et al 02], [Bradford et al 01]: when
you combine more and more points of
view the return diminishes very fast
• What have they missed?
– The mass of the tail is significant
No. of views
Diminishing Return?
• [Chen et al 02], [Bradford et al 01]: when
you combine more and more points of
view the return diminishes very fast
• What have they missed?
– The mass of the tail is significant
No. of views
Diminish … shminimish
How many ASes see an edge?
~9000/6000
are seen
only by one
real world
complex system
Distributed System
Challenges
• It’s a distributed systems:
– Measurement traffic looks
malicious
• Flying under the NOC radar screens
(Agents cannot measure too much)
– Optimize the architecture:
• Minimize the number of measurements
• Expedite the discovery rate
• BUT agents are
– Unreliable
– Some move around
real world
complex system
Distributed System
Agents
• To be able to use agents wisely we need
agents profiles:
– Reliablility
– Location:
• Static
• Bi-homed: where mostly?
• Mobile: identify home base
– Abilities: what type of measurements can it
perform?
Agent shavitt
shavitt
9000
8000
7000
6000
Fairly stable
measurements
from Israel
Reappear in Spain
5000
2 idle weeks
4000
3000
2000
1000
0
31-Aug-04
5-Sep-04
10-Sep-04
15-Sep-04
20-Sep-04
shavitt
25-Sep-04
30-Sep-04
5-Oct-04
4
agent prinCompNet
x 10
1.8
Number of measurements
1.6
1.4
1.2
1
0.8
75
82
89
96
103
110
Days since project launched
117
124
131
138 140
Degree Distribution
DIMES+BGP (Feb 05)
12
10
Zipf plot
Pr(k)
log(degree)
8
<k>
6
4
k
DIMES+BGP (Feb 05)
2
14
0
12
log(Pr(degree))
10
8
6
4
2
0
0
2
4
6
log(degree)
8
10
12
0
2
4
6
8
log(rank)
10
12
14
16
Quantifying the Distribution
Data Set
• Data is obtained from DIMES
– Community-based infrastructure, using almost
1000 active measuring software agents
– Agents follow a script and perform ~2 probes
per minute (ICMP/UDP traceroute, ping)
– Most agents measure from a single AS (vp)
• But some (appear to) measure from more…
• Data need to be filtered to remove artifacts
– Traceroute data collected during March 2008
Filtering the data
• For each agent and each week, classify
how many networks it measured the
Internet from
Typical cases:
– ASi:15300, ASj:8
– ASi:10000, ASj:3178
– ASi:10000, ASj:412 , ASk:201
– 18000, 12, 11, 9, 9, 3, 3, 2, 2, 1, 1, 1, 1, 1,
….
Measurements Per Agent
Week 4,2008
Measurements per Network
500
Agents per Network
Filtering Results
• 96% of the agents have less than 4
different vps
• High degree ASs tend to have more
agents
• High number of measurements for all
vps degrees
Diminishing Returns?
• Barford et. al. – the utility of adding many
vps quickly diminishes
– In terms of ASes and AS-links
• Shavitt and Shir – utility indeed diminishes
but the tail is long and significant
– Tail is biased towards horizontal links
• We wish to quantify how different aspects
of AS-level topology are affected by
adding more vps
Creating topologies per VP
sort by
Topology Size
• The return (especially for AS links) does
not diminishes fast!
VP with small local
topology can contribute
many new links!
Direction of Detected Links
• For each link: Plot max adjacent AS
degree and max adjacent ASes degree
difference
High degree
difference –
indicates radial
links towards the
core
Low degree
difference – indicates
tangential links and
links between smallsize ASes
Convergence of Properties
• Taking several common AS-level graph
properties, and analyze their convergence
as local topologies are added
– Keeping the sort order by number of links
• Slow convergence indicates the need to
have broad and diverse set of vps
Density and Average Degree
Slow convergence of density and average degree – easy
to detect ASes but difficult to find all links
Power-law and Max Degree
Fast convergence of maximal
degree – core links are easily
detects
Fair convergence of
power-law exponent
Betweenness and Clustering
Fast convergence of max bc – Level3
(AS3356), a tier-1 AS is immediately
detected as having max bc
Radial links
decrease cc
Tangential links
increase cc
Revisitng Sampling Bias
• Lakhina et al. – AS degrees inferred from
traceroute sampling are biased
– ASes in vicinity to vps have higher degrees
– Power-law might be an artifact of this!
• Dall’asta et al. – no…it is quite possible to
have unbiased degrees with traceroutes
• Cohen et al. – when exponent is larger
than 2, resulting bias is neglible
Evaluating Sampling Bias
• For each AS find:
– All the vps that have it in their local topology
– The Valley-Free distance in hops
Up-hill to the core (c2p), side-ways in
the core (p2p) and down-hill from
the core (p2c)
Dataset VPs and Distances
Low degree ASes are
seen from less vps than
high-degree Ases…this
makes sense!
In our dataset, most ASes
have a vp that is only 1-2
hops away!
Average Distance per Degree
Low degree ASes are
seen from farther
vps…sampling bias?
No real bias!
•More VPs are located in high-degree ASes
•There are high-degree ASes that are seen from “far” vps
•Broad distribution – all ASes are pretty close-by to a vp!
Revisiting Diversity Bias
• What is the effect of diversity in vps geolocation and network type?
– Some infrastructures rely on academic
networks for vp distribution – does it have an
effect on the resulting topology?
• We compare iPlane and DIMES
– Classify AS into types: t1,t2, edu, comp, ix, nic
using Dimitropoulos et al.
Diversity Bias Evaluation
iPlane uses many PlanetLab nodes (edu), while DIMES resides
mostly at homes (tier-2)
Indeed DIMES have higher t2 and comp degrees and iPlane have
higher edu degrees – results are slightly biased to vps’ types!
In Search of Ground Truth
• One week is not sufficient for active
measurements
• Both iPlane and DIMES have lower
average degrees than RouteViews
– Except iPlane’s edu and ix!
– Diversity bias exists – need diverse vp
types!
Measuring Within a Network
• Comparing vp average degrees to quantify
the effect of measuring within a network
Indeed, the average degree when measuring within a network is mostly
higher (hmm…tier-1 doesn’t count cause most vps are the same!)
Conclusion
• VP distribution is important
– Number, AS type, geo-location
• AS-level graph properties are affected
– Some converge very fast
– Other converge slowly
• Community based projects have
practically unlimited growth potential!
Predicting Growth
OurGoal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
The Internet Structure
The AS graph
The Internet Structure
The PoP level graph
The AS graph
What the PoP is ?
• PoP – Point of Presence of the ISP
OurGoal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
– PoP level – strike the right balance
• Network size is reasonable
• Nodes are roughly the same size
• Has a good geographical grip (with some
exceptions)
• Other uses of PoP maps
– Network distance estimation
The Algorithm Input & Output
Pivot Idea: What is a graph
representation of the POP?
DIMES
a historical perspective
DIMES
• Comments in 2004 (expert meeting in UCSD)
– It will never fly
– You’ll be lucky to get 500 downloads in three
years
– You’ll never be able to clean the noise
– How will you deal with problemi (i=1,2,3,4,….)?
• Status in Feb 2009
–
–
–
–
–
Over 18,000 downloads (over 100 nations)
1200-1500 active agents every week
Measuring from over 200 ASes every week
Data is used world wide by EE, CS, Phys, Econ
The DIMES approach appears in GENI & FIRE
ARMENIA
AUSTRALIA
AUSTRIA
BELGIUM
CANADA
CHINA
CROATIA
CZECH REPUBLIC
ESTONIA
FINLAND
FRANCE
GERMANY
GREECE
GUATEMALA
IRAQ
IRELAND
ISRAEL
ITALY
JAPAN
Active Agents
Early 2008
http://www.netDimes.org