Internet structure
Download
Report
Transcript Internet structure
Principles in Communication Networks
• Instractor: Prof. Yuval Shavitt,
– Office hours: room 303 s/w eng. bldg., Tue 14:0015:00
• Prerequisites ()דרישות קדם:
– Introduction to computer communications (TAU,
Technion, BGU)
• Expectations from students:
–
–
–
–
probability
Queueing theory basics
Graph theory
Some programming skills
Course Syllabus (tentative)
• Internet structure
• Introduction to switching, router
types
• Use of Gen. Func.: HOL analysis,
TCP analysis.
• Matching algorithms and their
analysis
• CLOS networks: non-blocking
theorem, routing algorithms and
their analysis
• Event simulators – introduction
• Scheduling algorithms: WFQ,
W2FQ, priorities
• Distributed algorithms
Grade composition
•
•
•
•
Final exam
Paper presentation (20-30 minutes)
Critical review of a paper (best of two)
Home assignments (2-3)
Routing in the Internet
Routing in the Internet
Routing in the Internet is done in three levels:
– In LANs in the MAC layer:
• Spanning tree protocol for Ethernet Transparent bridge.
• Source routing for token rings
• Inside autonomous systems (ASes):
– RIP, OSPF, IS-IS, (E)IGRP
• Between ASes:
– BGP
Autonomous Systems
• Autonomous Routing Domains: A collection of
physical networks glued together using IP, that have
a unified administrative routing policy.
• An AS is an autonomous routing domain that has
been assigned a number.
… the administration of an AS appears to other ASes to
have a single coherent interior routing plan and presents a
consistent picture of what networks are reachable through it.
RFC 1930: Guidelines for creation, selection,
and registration of an Autonomous System
Internet Hierarchical Routing
C.b
a
Host
h1
C
b
A.a
Inter-AS
routing
between
A and B
A.c
a
d
c
b
A
Intra-AS routing
within AS A
B.a
a
c
B
Host
h2
b
Intra-AS routing
within AS B
Why different Intra- and Inter-AS routing ?
Policy:
• Inter-AS: admin wants control over how its traffic
routed, who routes through its net.
• Intra-AS: single admin, so no policy decisions
needed
Scale:
• hierarchical routing saves table size, reduced
update traffic
Performance:
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance
RIP
• A distance-vector protocol – (distributed
Bellman Ford)
• Developed in the 80s based on a Xerox
protocol
• RIP-2 is now often used due to its
simplicity
• Distance metric: minimum hop
OSPF / IS-IS
• Link state protocol – each node see the
entire network map and calculate shortest
paths using Dijksrta algorithm.
• Allows two level of hierarchy
• Authentication
• Complex
• IS-IS gain popularity among large ISPs
The structure of the Internet
How are routers connected?
• Why should we care?
– While communication protocols will work
correctly on ANY topology
– ….they may not be efficient for some
topologies
– Knowledge of the topology can aid in
optimizing protocols
The Internet as a graph
• Remember: the Internet is a collection of
networks called autonomous systems (ASs)
• The Internet graph:
– The AS graph
• Nodes: ASs, links: AS peering
– The router level graph
• Nodes: routers, links: fibers, cables, MW channels, etc.
– There are mid-level aggregation schemes
• How does it looks like?
Random graphs in Mathematics
The Erdös-Rényi model
• Generation:
– create n nodes.
– each possible link is added with probability p.
• Number of links: np
• If we want to keep the
number of links linear,
what happen to p as
n?
Poisson distribution
The Waxman model
• Integrating distance with the E-R model
• Generation
– Spread n nodes on a large enough grid.
– Pick a link uar and add it with prob. that
exponentially decrease with its length
– Stop if enough links
• Heavily used in the 90s
100
90
80
70
60
50
40
30
20
10
0
0
10
20
30
40
50
60
70
80
90
100
1999
The Faloutsos brothers
• Measured the Internet
AS and router graphs.
• Mine, she looks
different!
Notre Dame
• Looked at complex
system graphs: social
relationship, actors,
neurons, WWW
• Suggested a dynamic
generation model
The Faloutsos Graph
1995 Internet router topology
3888 nodes, 5012 edges, <k>=2.57
SCIENCE CITATION INDEX
Nodes: papers
Links: citations
25
Witten-Sander
PRL 1981
1736 PRL papers (1988)
2212
P(k) ~k-
( = 3)
(S. Redner, 1998)
Sex-web
Nodes: people (Females; Males)
Links: sexual relationships
4781 Swedes; 18-74;
59% response rate.
Liljeros et al. Nature 2001
Web power-laws
SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed.
Networks continuously expand by
the addition of new nodes
Examples:
WWW : addition of new documents
Citation : publication of new papers
(2) The attachment is NOT uniform.
A node is linked with higher probability to a node
that already has a large number of links.
Examples :
WWW : new documents link to well known sites
(CNN, YAHOO, NewYork Times, etc)
Citation : well cited papers are more likely to be cited again
(1) GROWTH :
Scale-free model
At every timestep we add a new node with m edges
(connected to the nodes already present in the system).
(2) PREFERENTIAL ATTACHMENT :
The probability Π that a new node will be connected to
node i depends on the connectivity ki of that node
ki
( ki )
jk j
P(k) ~k-3
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
The Faloutsos Graph
node degree for AS20000102.m
4
10
3
10
2
10
1
10
0
10
0
10
1
10
2
10
3
10
4
10
Back to the Internet
• Understanding its structure and dynamics
– help applications (WWW, file sharing)
– help improving routing
– predict Internet growth
• So lets look at the data….
…Data?
• The Internet is an engineered system, so
someone must know how it is built, no?
• NO! It is an uncoordinated interconnection
of Autonomous Systems
(ASes=networks).
• No central database about Internet
structure.
• Several projects attempt to reveal the
structure: Skitter, RouteViews, …
The Internet Structure
routers
The Internet Structure
The AS graph
Revealing the Internet Structure
Revealing the Internet Structure
Revealing the Internet Structure
Revealing the Internet Structure
Diminishing return!
Deploying more
boxes does not
pay-off
7 new links
30 new links
NO new links
Revealing the Internet Structure
To obtain the ‘horizontal’
links we need strong
presence in the edge
What is DIMES?
DIMES
• Distributed Internet measurement and monitoring
– Based on software agents downloaded by volunteers
• Diminishing return?
– Software agents
– The cost of the first agent is very high
– each additional agent costs almost zero
• Capabilities
– Obtaining Internet maps at all granularity level
• connectivity, delay, loss, bandwidth, jitter, ….
– Tracking the Internet evolution in time
– Monitoring the Internet in real time
DIMES
Correlating the Internet with the World:
Geography, Economics, Social Sciences
The Internet as a complex system:
static and dynamic analysis
Distributed System Design:
Obtaining the Internet Structure
Diminishing Return?
• [Chen et al 02], [Bradford et al 01]: when
you combine more and more points of
view the return diminishes very fast
• What have they missed?
– The mass of the tail is significant
No. of views
Diminishing Return?
• [Chen et al 02], [Bradford et al 01]: when
you combine more and more points of
view the return diminishes very fast
• What have they missed?
– The mass of the tail is significant
No. of views
Diminish … shminimish
How many ASes see an edge?
~9000/6000
are seen
only by one
real world
complex system
Distributed System
Challenges
• It’s a distributed systems:
– Measurement traffic looks
malicious
• Flying under the NOC radar screens
(Agents cannot measure too much)
– Optimize the architecture:
• Minimize the number of measurements
• Expedite the discovery rate
• BUT agents are
– Unreliable
– Some move around
real world
complex system
Distributed System
Agents
• To be able to use agents wisely we need
agents profiles:
– Reliablility
– Location:
• Static
• Bi-homed: where mostly?
• Mobile: identify home base
– Abilities: what type of measurements can it
perform?
Agent shavitt
shavitt
9000
8000
7000
6000
Fairly stable
measurements
from Israel
Reappear in Spain
5000
2 idle weeks
4000
3000
2000
1000
0
31-Aug-04
5-Sep-04
10-Sep-04
15-Sep-04
20-Sep-04
shavitt
25-Sep-04
30-Sep-04
5-Oct-04
4
agent prinCompNet
x 10
1.8
Number of measurements
1.6
1.4
1.2
1
0.8
75
82
89
96
103
110
Days since project launched
117
124
131
138 140
Degree Distribution
DIMES+BGP (Feb 05)
12
10
Zipf plot
Pr(k)
log(degree)
8
<k>
6
4
k
DIMES+BGP (Feb 05)
2
14
0
12
log(Pr(degree))
10
8
6
4
2
0
0
2
4
6
log(degree)
8
10
12
0
2
4
6
8
log(rank)
10
12
14
16
Quantifying the Distribution
Data Set
• Data is obtained from DIMES
– Community-based infrastructure, using almost
1000 active measuring software agents
– Agents follow a script and perform ~2 probes
per minute (ICMP/UDP traceroute, ping)
– Most agents measure from a single AS (vp)
• But some (appear to) measure from more…
• Data need to be filtered to remove artifacts
– Traceroute data collected during March 2008
Filtering the data
• For each agent and each week, classify
how many networks it measured the
Internet from
Typical cases:
– ASi:15300, ASj:8
– ASi:10000, ASj:3178
– ASi:10000, ASj:412 , ASk:201
– 18000, 12, 11, 9, 9, 3, 3, 2, 2, 1, 1, 1, 1, 1,
….
Measurements Per Agent
Week 4,2008
Measurements per Network
500
Agents per Network
Filtering Results
• 96% of the agents have less than 4
different vps
• High degree ASs tend to have more
agents
• High number of measurements for all
vps degrees
Diminishing Returns?
• Barford et. al. – the utility of adding many
vps quickly diminishes
– In terms of ASes and AS-links
• Shavitt and Shir – utility indeed diminishes
but the tail is long and significant
– Tail is biased towards horizontal links
• We wish to quantify how different aspects
of AS-level topology are affected by
adding more vps
Creating topologies per VP
sort by
Topology Size
• The return (especially for AS links) does
not diminishes fast!
VP with small local
topology can contribute
many new links!
Direction of Detected Links
• For each link: Plot max adjacent AS
degree and max adjacent ASes degree
difference
High degree
difference –
indicates radial
links towards the
core
Low degree
difference – indicates
tangential links and
links between smallsize ASes
Convergence of Properties
• Taking several common AS-level graph
properties, and analyze their convergence
as local topologies are added
– Keeping the sort order by number of links
• Slow convergence indicates the need to
have broad and diverse set of vps
Density and Average Degree
Slow convergence of density and average degree – easy
to detect ASes but difficult to find all links
Power-law and Max Degree
Fast convergence of maximal
degree – core links are easily
detects
Fair convergence of
power-law exponent
Betweenness and Clustering
Fast convergence of max bc – Level3
(AS3356), a tier-1 AS is immediately
detected as having max bc
Radial links
decrease cc
Tangential links
increase cc
Revisitng Sampling Bias
• Lakhina et al. – AS degrees inferred from
traceroute sampling are biased
– ASes in vicinity to vps have higher degrees
– Power-law might be an artifact of this!
• Dall’asta et al. – no…it is quite possible to
have unbiased degrees with traceroutes
• Cohen et al. – when exponent is larger
than 2, resulting bias is neglible
Evaluating Sampling Bias
• For each AS find:
– All the vps that have it in their local topology
– The Valley-Free distance in hops
Up-hill to the core (c2p), side-ways in
the core (p2p) and down-hill from
the core (p2c)
Dataset VPs and Distances
Low degree ASes are
seen from less vps than
high-degree Ases…this
makes sense!
In our dataset, most ASes
have a vp that is only 1-2
hops away!
Average Distance per Degree
Low degree ASes are
seen from farther
vps…sampling bias?
No real bias!
•More VPs are located in high-degree ASes
•There are high-degree ASes that are seen from “far” vps
•Broad distribution – all ASes are pretty close-by to a vp!
Revisiting Diversity Bias
• What is the effect of diversity in vps geolocation and network type?
– Some infrastructures rely on academic
networks for vp distribution – does it have an
effect on the resulting topology?
• We compare iPlane and DIMES
– Classify AS into types: t1,t2, edu, comp, ix, nic
using Dimitropoulos et al.
Diversity Bias Evaluation
iPlane uses many PlanetLab nodes (edu), while DIMES resides
mostly at homes (tier-2)
Indeed DIMES have higher t2 and comp degrees and iPlane have
higher edu degrees – results are slightly biased to vps’ types!
In Search of Ground Truth
• One week is not sufficient for active
measurements
• Both iPlane and DIMES have lower
average degrees than RouteViews
– Except iPlane’s edu and ix!
– Diversity bias exists – need diverse vp
types!
Measuring Within a Network
• Comparing vp average degrees to quantify
the effect of measuring within a network
Indeed, the average degree when measuring within a network is mostly
higher (hmm…tier-1 doesn’t count cause most vps are the same!)
Conclusion
• VP distribution is important
– Number, AS type, geo-location
• AS-level graph properties are affected
– Some converge very fast
– Other converge slowly
• Community based projects have
practically unlimited growth potential!
Predicting Growth
OurGoal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
The Internet Structure
The AS graph
The Internet Structure
The PoP level graph
The AS graph
What the PoP is ?
• PoP – Point of Presence of the ISP
OurGoal
• To measure the Internet evolution in time
– AS level - too coarse
– IP level - too fine
– PoP level – strike the right balance
• Network size is reasonable
• Nodes are roughly the same size
• Has a good geographical grip (with some
exceptions)
• Other uses of PoP maps
– Network distance estimation
The Algorithm Input & Output
Pivot Idea: What is a graph
representation of the POP?
DIMES
a historical perspective
DIMES
• Comments in 2004 (expert meeting in UCSD)
– It will never fly
– You’ll be lucky to get 500 downloads in three
years
– You’ll never be able to clean the noise
– How will you deal with problemi (i=1,2,3,4,….)?
• Status in Feb 2009
–
–
–
–
–
Over 18,000 downloads (over 100 nations)
1200-1500 active agents every week
Measuring from over 200 ASes every week
Data is used world wide by EE, CS, Phys, Econ
The DIMES approach appears in GENI & FIRE
ARMENIA
AUSTRALIA
AUSTRIA
BELGIUM
CANADA
CHINA
CROATIA
CZECH REPUBLIC
ESTONIA
FINLAND
FRANCE
GERMANY
GREECE
GUATEMALA
IRAQ
IRELAND
ISRAEL
ITALY
JAPAN
Active Agents
Early 2008
http://www.netDimes.org