Transcript slides

Toward an Optimization-Driven
Framework for Designing and Generating
Realistic Internet Topologies
David Alderson, Stanford
John Doyle, Caltech
Ramesh Govindan, USC
Walter Willinger, AT&T Labs-Research
HotNets Conference, Princeton University
October 28, 2002
Internet Topology
• Many types of topology
– Router-Level graphs
• Nodes represent routers
• Links represent one-hop IP connectivity
– Autonomous System (AS) graphs
• Nodes represent ASs
• Links represent business relationships (customers, peers)
• Need for more than just “connectivity”
– Annotated links (capacities, distances, delays)
– Annotated nodes (capacities, sizes, geographic
location for routers, PoP location for ASs)
Perhaps “geometry” as a better term?
An Important Problem
Understanding the topology of the current (and future)
Internet is important for many reasons
• Design and evaluation of networking protocols
– Topology affects performance, not correctness
• Understanding large-scale network behavior
– Closed-loop feedback: topology design vs. protocols
• Is the current design a result of the dominant routing protocols?
• Or are the presently used routing protocols the result of some
prevailing design principles?
– Ability to study what-if scenarios
• Operating policy shifts
• Economic changes in the ISP market
• Implications for tomorrow’s networks
– Provisioning requirements, Traffic engineering, Operating and
management policies
A Challenging Problem
Since the decommissioning of the NSFNet in 1995, it has
been difficult to obtain comprehensive knowledge about
the topology of the Internet
• The network has grown dramatically (number of hosts,
amount of traffic, number of ISPs, etc.)
• There have been economic incentives for ISPs to maintain
secrecy about their topologies
• Measurements about topology are possible/informative…
– Traceroute data for router-level graph
– BGP data for AS graph
• …but not as clean and complete as one would hope
– Traceroute: Mercator, Skitter, RocketFuel
– BGP: Oregon RouteView, NLANR, RIPE
“Standard” Approach
1. Choose a sequence of well-understood metrics or
observed features of interest, such as
– hierarchy
– node-degree distributions
– clustering coefficients
2. Develop a method that matches these metrics
Pros:
– Always possible to obtain a good “fit” on a chosen metric.
Cons:
– Hard to choose the “right” metric. What is “right” is apt to
vary, depending on the intended use of the topology.
– A method that does a good job of matching the chosen
metric often does not fit other metrics well.
– No predictive power.
We call this approach descriptive (evocative) modeling.
An Alternative Approach
1. Identify the causal forces at work in the design and
evolution of real topologies.
2. Develop methods that generate and evolve
topologies in a manner consistent with these forces.
Pros:
–
–
–
–
Ability to generate (and design!) topologies at different
levels of hierarchy
More realistic topology
Greater predictive power
Possibly reveal some relationship with routing protocols
Cons:
–
–
Difficult to identify the causal forces
Requires careful development and diligent validation
We call this approach explanatory modeling.
Our Approach: Focus on the ISP
• Capture and represent realistic drivers of Internet
deployment and operation at the level of the single ISP
• Many important networking issues are relevant at the
level of the ISP (e.g. configuration, management,
pricing, provisioning)
• Common topologies represented in terms of ISPs
– Router-level graphs  connectivity within the ISP
– AS graphs  connectivity between ISPs
• First-Order Objective: the ability to generate a
“realistic, but fictitious” ISP topology at different levels
of hierarchy
ISP Driving Forces
• Economic Factors:
– Cost of procuring, installing, and maintaining the
necessary facilities and equipment
– Limited budget for capital expenditures
– Need to balance expenditures with revenue streams
– Need to leverage investment in existing infrastructure
– Location of customers
• Technical Factors:
– Hardware constraints (e.g. router speeds, limited #
interfaces or line cards per router)
– Level 2 Technologies (Sonet, ATM, WDM)
– Existing legacy infrastructure
– Location and availability of dark fiber
Mathematical Framework
1. Use combinatorial optimization to represent the
problem and its constraints
–
–
–
Objectives (min cost, max profitability, satisfy demand)
Constraints (equip costs/capacities, legacy infrastructure)
Parameters (pricing, provisioning, facility location)
2. Study and explore how this framework allows for a
range of ISP behavior
–
Effect of objectives and constraints are the most important
3. Part of a more general framework
–
Highly Optimized Tolerance (HOT), Carlson and Doyle, 1999
4. We expect different formulations and solutions to
problems at different levels of hierarchy
–
Illustrative toy example: Fabrikant et al, 2002
Example: Network Access Design
• Occurs at the level of the metropolitan area
• Build a topology that connects customers to backbone
access points, satisfies service requirements, and
optimizes ISP objectives
• This problem has been
studied extensively (see
paper for references)
• Standard formulations
result in tree topologies
• Variations incorporate
realistic constraints
• Buy-at-bulk design
• Path redundancy
Hierarchical Optimization
An appropriate optimization problem can be formulated
at different levels within the network hierarchy
• Local (metropolitan) access
• Regional interconnectivity
• Backbone design
Consider the problem
faced at the regional
level in interconnecting
the access points…
A Regional ISP
A Regional ISP
A Regional ISP
A National ISP
Multiple ISPs
Multiple ISPs
Validation
• We expect that different formulations give rise to different
solutions
– Example: relative emphasis on connection cost (= geographic
distance) and transmission delay (= avg hop distance from graph
core) in local access design results in widely different node degree
distributions (Fabrikant et al)
– Example: different emphasis on accounting for number and location
of PoPs in AS design results in widely different types of hierarchical
structures.
• New formulations typically motivate looking at new/different
measurements
– Can one identify and obtain appropriate measurements?
– Example (router): geographic distance by use of NetGeo database
• New/different measurements improve chances for
successful and rigorous model validation
– Getting at the casual forces at work, not data-fitting
– Opportunities for close interactions with network designers and
operators
Benefits
• Generate realistic topologies
– Annotated graphs suitable for use in simulation
• Insight into drivers and constraints affecting topology
growth
– Predictive power
– What-if scenario analysis
• Insight into large-scale behavior
– Rough understanding of basic interaction between network
design and routing protocols
• Address important economic questions
– Pricing
– Provisioning
– Peering
Why It Might Not Work
1. The optimization framework may not adequately
capture the economic and technical factors driving
topological growth
2. The real decisions made by ISPs may be neither
consistent nor rational, thus outside the realm of
abstract mathematical representation
We expect this approach at a minimum to yield insight
into what ought to be done by ISPs in considering
both economic and technical factors
Part of A Bigger Picture
(http://netlab.caltech.edu)
• This work is part of a larger effort to develop a rigorous, coherent,
and reasonably complete mathematical theory for the Internet
• This new theory addresses the performance and robustness of
– the "horizontal" decentralized and asynchronous nature of
congestion control (TCP/AQM), routing (IP), and
design/provisioning (layer 2)
– the "vertical" separation into the layers of the TCP/IP protocol
stack from the application down to the link layer
• Recent progress with "horizontal" treatment of TCP/AQM
– view and formulate TCP/AQM as a distributed primal-dual
algorithm to maximizing aggregate utility over possible rates
– robustness and stability properties of TCP/AQM dynamics as
by-products
Part of A Bigger Picture (cont.)
• First attempts toward "vertical" integration of app/transport layers
– optimality of separating the application level "source" vs the
transport level "channel"
– optimal coding of "source" (mice/elephants) and "channel"
(TCP/AQM)
– assumes fixed routing (IP layer)
• First attempts toward "vertical" integration of transport/IP layers
– view and formulate TCP-AQM/IP as a distributed primal-dual
algorithm to maximizing aggregate utility over possible rates and
routes
– NP-hard problem: TCP/AQM and shortest-path routing cannot
jointly maximize utility over both rates and routes
– assumes fixed topology (link layer)
• This work is a first attempt at a horizontal treatment of
network design/provisioning (around layer 2) and at pushing the
vertical integration down to the link layer (or maybe layer 2.5?)
Thank You
David Alderson, [email protected]
John Doyle, [email protected]
Ramesh Govindan, [email protected]
Walter Willinger, [email protected]