slides - Duke Workshop on Sensing and Analysis of High
On a New Internet Traffic
Matrix (Completion) Problem
Walter Willinger
AT&T Labs–Research
Local Traffic Matrices
• At an individual router
– Gives traffic volumes
(number of bytes per time
unit: 5 min, 1 hour, 1 day)
between every input port and
output port on a router
– Typical routers have a small
number of ports, from 16 to
at most 256
• Available measurements
– NetFlow-enabled routers
provide direct measurements
– Routing data
– No need for inference!
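As a rough illustration of why no inference is needed at a single router, a local TM can be built by summing flow bytes per (input port, output port) pair. The record layout below (`in_port`, `out_port`, `bytes`) is a hypothetical simplification for this sketch, not the actual NetFlow export format.

```python
from collections import defaultdict

def local_traffic_matrix(flow_records):
    """Sum bytes per (input port, output port) pair for one time bin."""
    tm = defaultdict(int)
    for rec in flow_records:
        tm[(rec["in_port"], rec["out_port"])] += rec["bytes"]
    return dict(tm)

# Hypothetical flow records for a single 5-minute bin on one router.
records = [
    {"in_port": 1, "out_port": 3, "bytes": 1500},
    {"in_port": 1, "out_port": 3, "bytes": 500},
    {"in_port": 2, "out_port": 1, "bytes": 4000},
]
tm = local_traffic_matrix(records)
```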
Abilene Router (Washington, D.C.)
Local TM (Washington, D.C., 9/1/06)
Top 6 Local TM Elements (Wash. PoP)
Intra-Domain Traffic Matrices
• For an individual network
– Gives traffic volumes
(number of bytes per time
unit: 5 min, 1 hour, 1 day)
between every ingress
router/PoP and egress
router/PoP in a network
– Some of the larger
networks can have 1000’s
of routers or 100’s of PoPs
• Available measurements
– SNMP data provide
indirect measurements
(per link)
– Routing data
Intra-Domain TM Inference Problem
• Network-wide availability of SNMP data
(link loads)
• Relying only on SNMP data, solve
AX=Y
A: routing matrix; Y: link measurements
• In real networks, this is a massively
underconstrained problem
• Active area of research in 2000-2010
– Zhang, Roughan, Duffield, and Greenberg (2003)
– Zhang, Roughan, Lund, and Donoho (2003, 2005)
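To make "underconstrained" concrete, here is a toy sketch of AX=Y on a made-up three-node chain with two directed links and three origin-destination pairs. The topology and volumes are assumptions for illustration. The minimum-norm solution shown is just one of infinitely many solutions that match the link loads, and it is not the method of the papers cited above.

```python
import numpy as np

# Hypothetical chain a -> b -> c with directed links 0 = (a->b), 1 = (b->c).
# Columns are OD pairs a->b, b->c, a->c; A[i, j] = 1 if OD pair j uses link i.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
x_true = np.array([10.0, 20.0, 5.0])   # unknown TM elements (made-up volumes)
y = A @ x_true                          # link loads, as SNMP would report them

rank = np.linalg.matrix_rank(A)         # 2 independent equations, 3 unknowns

# lstsq returns the minimum-norm solution when the system is underdetermined.
x_min_norm, *_ = np.linalg.lstsq(A, y, rcond=None)
```

Here `x_min_norm` reproduces the link loads exactly but differs from `x_true`, which is why additional assumptions or measurements are needed to pin down the TM.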
Intra-Domain TM Inference Problem
• Applications
– Network engineering (capacity planning)
– Traffic engineering (what-if scenarios)
– Anomaly detection
– Enormously useful for daily network operations
– Textbook example of theory impacting practice
• Things changed around 2010 …
– NetFlow-enabled routers are now deployed
network-wide and provide direct measurements
– Can measure the intra-domain TM directly!
– Inference approach is no longer needed!
Example: Abilene Network
• High-speed education network
• 28 links
• 10 Gbps capacity on each link
• 11 Points of Presence (PoPs) with NetFlow
measurement capabilities
Abilene Traffic Matrix (9/1/06)
Top 12 Abilene TM Elements (1 week)
Intra-Domain TM: Open Problems
• Synthesis of realistic TMs
– Can’t be agnostic about the underlying
network!
– What information about the underlying
network is needed?
• Network-related root causes for observed
properties of measured TMs
– Low-rank, deviations from low-rank
– Sparsity
• Which measurements are more critical
than others for my network?
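One common way to probe the low-rank property is to stack TM snapshots as columns and look at how quickly the singular values decay. The sketch below runs on synthetic data only. The matrix shape, noise level, and the 95% energy threshold are all assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_od, n_bins, true_rank = 121, 288, 3   # e.g. 11 x 11 OD pairs, 5-minute bins in one day

# Synthetic series: a few shared temporal patterns plus small noise.
X = rng.random((n_od, true_rank)) @ rng.random((true_rank, n_bins))
X += 0.01 * rng.standard_normal(X.shape)

s = np.linalg.svd(X, compute_uv=False)            # singular values, descending
energy = np.cumsum(s**2) / np.sum(s**2)           # cumulative share of squared energy
effective_rank = int(np.searchsorted(energy, 0.95) + 1)  # components needed for 95%
```

Deviations from low rank would show up as a slower decay of `s` than the dominant components alone would predict.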
What can Intra-Domain TMs tell us?
• How much of the traffic that enters my
network in NYC is destined for ATL (per
hour, per day)?
• How much of the daily traffic on my
network is coming from (which) CDNs?
• How much of the hourly traffic that
enters my network in NYC and is destined
to ATL is coming from Netflix?
• How much traffic does my network carry
(per hour, per day)?
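Given a measured intra-domain TM, the first and last questions reduce to simple lookups and sums. A minimal sketch with hypothetical PoP names and made-up volumes follows. The per-source questions (CDNs, Netflix) would need the TM broken down by traffic source, which this sketch does not model.

```python
import numpy as np

pops = ["NYC", "ATL", "CHI"]                 # hypothetical PoP names
idx = {name: i for i, name in enumerate(pops)}

# tm[i, j] = volume entering at PoP i and leaving at PoP j in one hour (made-up values).
tm = np.array([[0, 40, 25],
               [30, 0, 10],
               [20, 15, 0]], dtype=float)

nyc_to_atl = tm[idx["NYC"], idx["ATL"]]      # ingress NYC, egress ATL
nyc_ingress = tm[idx["NYC"], :].sum()        # everything entering at NYC
total = tm.sum()                             # total traffic carried in this hour
```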
A Different Set of Questions
• How much traffic do Sprint and Verizon
exchange with one another (per hour, day)?
• How much traffic does Verizon get from
Netflix (per day, month)?
• What are the networks that exchange the
most traffic with Google?
• How much does Facebook’s traffic increase
on a monthly basis?
• How much traffic does the Internet carry
per day?
New Problem: Inter-Domain TM
• The Internet is a “network of networks”
– Individual networks are also called Autonomous
Systems (ASes)
– Today’s Internet consists of about 30K-40K
actively routed ASes
– We are getting a clearer picture of the AS-level
topology (i.e., which networks exchange routing
information with one another and hence presumably
also IP traffic)
• Inter-domain (or AS-level) traffic matrix
– Gives traffic volumes between ASes
– Completely unknown …
Inter-Domain TM: Highly Structured
• Some numbers …
– In 2010 the Internet carried some 20 EB/month
– In late 2009, AT&T carried some 20 PB/day
– There are some 20 AT&T-like large transit providers
in today’s Internet
• Some caveats …
– Large transit providers use multiple networks to run
their business (e.g., Verizon has some 230 ASes)
– Need to know how to map ASes to companies
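A back-of-the-envelope reading of these numbers, as a sketch: the three input figures come from the slide, while the 30-day month, decimal units, and the idea that every large provider carries about as much as AT&T are assumptions. The estimate also ignores traffic that crosses more than one provider and would be counted twice, and it mixes 2009 and 2010 figures.

```python
internet_eb_per_month = 20.0      # from the slide (2010)
att_pb_per_day = 20.0             # from the slide (late 2009)
n_large_providers = 20            # from the slide

DAYS_PER_MONTH = 30               # assumption
PB_PER_EB = 1000                  # assumption: decimal units

att_eb_per_month = att_pb_per_day * DAYS_PER_MONTH / PB_PER_EB
att_share = att_eb_per_month / internet_eb_per_month

# Crude estimate: treats every large provider as AT&T-sized and ignores
# traffic that is carried (and so counted) by more than one provider.
combined_share = n_large_providers * att_share
```

Under these assumptions, one AT&T-like provider accounts for roughly 3% of the monthly total, so a small set of large transit providers would touch a large fraction of all traffic.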
On Inter-Domain TM Completion
• Today’s formulation
– About 1% of the inter-domain TM elements are
responsible for a majority of all the traffic
– Inter-domain TM has low rank (does it?)
– (Non)standard TM completion problem
• Towards tomorrow’s formulation
– How to insist on strong validation criteria?
– What sort of new measurements are feasible
and can be used to check the validity of a
solution to today’s formulation of the
inter-domain TM completion problem?
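For reference, the standard low-rank completion problem can be sketched as follows: fill the unobserved entries by repeatedly projecting onto rank-r matrices while keeping the measured entries fixed. This is a generic textbook-style heuristic on synthetic data, not the formulation proposed in the talk. The function name `complete_low_rank`, the matrix size, the rank, and the 70% sampling rate are all assumptions, and the heavy concentration of traffic in about 1% of the elements is not modeled.

```python
import numpy as np

def complete_low_rank(M_obs, mask, rank, n_iters=300):
    """Fill unobserved entries by repeated truncated-SVD projection."""
    X = np.where(mask, M_obs, 0.0)            # start with zeros in the gaps
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        X = np.where(mask, M_obs, low_rank)   # keep measured entries, update the rest
    return X

rng = np.random.default_rng(0)
n = 20
M = np.outer(rng.uniform(1, 2, n), rng.uniform(1, 2, n))  # synthetic rank-1 "TM"
mask = rng.random((n, n)) < 0.7                           # roughly 70% of entries observed
M_hat = complete_low_rank(M, mask, rank=1)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)   # error against the known truth
```

The validation question raised above is exactly what this toy setup sidesteps: `rel_err` can be computed only because the synthetic ground truth `M` is known, which is not the case for the real inter-domain TM.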
Internet eXchange Points (IXPs)
[Figure: IXP diagram with a layer-2 switch interconnecting AS1 through AS5; labels include two content networks, Provider 1, and Provider 2]
Inter-Domain TM and IXPs
• Some numbers …
– There are some 300 IXPs worldwide that see
some 10-20% of all Internet traffic
– They involve some 4K ASes
– Most IXPs publish their hourly/daily total
traffic volume
– We are getting more and more accurate
peering matrices for these 300 IXPs
• New Twist …
– How to infer the local TM at each IXP?
– How to measure the local TM at each IXP?
Back to Inter-Domain TM Completion
• Tomorrow’s formulation
– Start with today’s formulation
• Accounts for large transit providers
– Incorporate IXP-specific information
• Accounts for large content providers
– New (non)standard TM completion problem
• … and repeat
– What other sources of new measurements?
– Promising candidates: CDNs (Akamai & co.)
– What types of measurements are more critical
than others?
Summary
• Intra-domain TM research
– Beautiful example of innovative research with
enormous practical benefits for network operators
– The intra-domain TM of an AS is a basic ingredient for
a first-principles approach to understanding the AS’s
router-level topology (forget “Network Science” …)
– Reminder that “change changes things”
• Inter-domain TM research
– Enormous practical value
– Adds new twist to generic matrix completion problem
– The inter-domain TM as critical ingredient for a
first-principles approach to understanding the Internet’s
AS-level topology (TBD)