20051025-network-maltz


Rethinking Network Control and Management
David A. Maltz
[email protected]
Context for Network Control and Management



Many different network environments

Access, backbone networks

Data-center networks, enterprise/campus
Many different technologies

Longest-prefix routing, label switching, circuit switching

IP, Ethernet, MPLS, optical circuits
Outsourcing of responsibility into the network

Middle-boxes: firewalls, network monitoring, …
Many different policies

Routing, reachability, transit, traffic engineering, robustness
ATT/CMU Study of 31 Production networks

Provider & enterprise networks (10-1200 routers)

Many different routing designs

Packet filters, multiple OSPF instances, multiple ASs
[Figure: lines in config file (0 to 2000) plotted against router ID (0 to 881)]
Fundamental Problem: Wrong Abstractions
[Figure: network diagram labeled with shell scripts, traffic engineering, planning tools, databases, configs, SNMP, netflow, modems, link metrics, routing policies, packet filters, and per-router OSPF, BGP, and FIB]
Management Plane
• Figure out what is happening in network
• Decide how to change it
Control Plane
• Multiple routing processes on each router
• Each router with different configuration program
• Huge number of control knobs: metrics, ACLs, policy
Data Plane
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIB or labels
Inside a Single Network
[Figure: network diagram labeled with shell scripts, traffic engineering, planning tools, databases, configs, SNMP, netflow, modems, link metrics, routing policies, packet filters, and per-router OSPF, BGP, and FIB]
Management Plane
• Figure out what is happening in network
• Decide how to change it
Control Plane
• Multiple routing processes on each router
• Each router with different configuration program
• Huge number of control knobs: metrics, ACLs, policy
Data Plane
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIB or labels
Control Plane: The Key Leverage Point

Great Potential: control plane determines the
behavior of the network


Reaction to events, reachability, services
Great Opportunities

Each network (administrative domain) has its own
control plane

A radical clean-slate control plane can be deployed
– Agnostic to user data format: IPv4/v6, Ethernet, circuit
– No changes to end-system software

Control plane is the nexus of network evolution
– Changing the control plane logic can smooth transitions in
network technologies and architectures
An Alternative: The 4D Architecture


Key principles

Network-level objectives

Network-wide views

Direct control
Corollaries

Predictable behavior (including overload threshold)

Zero device-specific or manual configuration

Data plane support for network-wide view

Define objectives in terms of organizationally salient
entities
Good Abstractions Reduce Complexity
[Figure: Management Plane, Control Plane, and Data Plane (Configs; FIBs, ACLs) compared with Decision Plane, Dissemination, and Data Plane (FIBs, ACLs)]
All decision making logic lifted out of control plane

Eliminates duplicate logic in management plane

Dissemination plane provides robust communication to/from data plane switches
Overview of the 4D Architecture
[Figure: Decision, Dissemination, Discovery, and Data planes, labeled with network-level objectives, network-wide views, and direct control]
Decision Plane:

All management logic implemented on centralized servers making all decisions

Decision Elements use views to compute data plane state that meets objectives, then directly write this state to routers
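A minimal sketch of what a Decision Element might compute, assuming a toy model that is not taken from the prototype: the network-wide view is a list of bidirectional links, the objective is a reachability matrix given as a set of allowed (source, destination) router pairs, routes are hop-count shortest paths, and disallowed pairs become packet filters at the source router. The name `compute_state` and the data layout are hypothetical.

```python
from collections import deque


def compute_state(links, reachability):
    """Compute data plane state from a network-wide view (hypothetical model).

    links: iterable of (router_a, router_b) bidirectional links (the view).
    reachability: set of (src, dst) pairs allowed to communicate (the objective).
    Returns (fibs, filters):
      fibs[router][dst] is the next hop toward dst,
      filters[router] is the set of destinations that router must drop.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    fibs = {router: {} for router in neighbors}
    for dst in neighbors:
        # Breadth-first search outward from dst: the node from which a router
        # is first reached is that router's next hop toward dst.
        next_hop = {dst: None}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for peer in sorted(neighbors[node]):
                if peer not in next_hop:
                    next_hop[peer] = node
                    queue.append(peer)
        for router, hop in next_hop.items():
            if router != dst:
                fibs[router][dst] = hop

    # Assumption: pairs missing from the reachability matrix are blocked
    # with a packet filter at the source router.
    filters = {
        src: {dst for dst in neighbors if dst != src and (src, dst) not in reachability}
        for src in neighbors
    }
    return fibs, filters


# Three routers in a line; B and C are not allowed to talk to each other.
links = [("A", "B"), ("B", "C")]
allowed = {("A", "B"), ("B", "A"), ("A", "C"), ("C", "A")}
fibs, filters = compute_state(links, allowed)
```

In this toy case router A forwards traffic for C through B, while B and C each hold a filter against the other. Both kinds of state would then be written directly to the routers.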
Concerns and Challenges


Distributed Systems issues

How will communication between routers and DEs
survive failures in the network?

Latency means DE’s view of network is behind
reality. Will the control loop be stable?

What is the overhead to/from the DEs?

What happens in a network partition?
Networking issues

Does the 4D simplify control and management?

Can we create logic to meet multiple objectives?
Evaluation of the 4D Prototype

Evaluated using Emulab (www.emulab.net)

Linux PCs used as routers (650–800 MHz)

Tested on 9 enterprise network topologies (10-100 routers each)
[Figure: example network with 49 switches and 5 DEs]
Performance of the 4D Prototype
Trivial prototype has performance comparable to well-tuned production networks



Recovers from single link failure in < 300 ms

< 1 s response considered “excellent”

Faster forwarding reconvergence possible
Survives failure of master Decision Element

New DE takes control within 1 s

No disruption unless second fault occurs
Gracefully handles complete network partitions

Less than 1.5 s of outage
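The slides do not describe how a new master Decision Element is chosen, so the following is only a sketch under stated assumptions: each DE sends periodic heartbeats, a DE counts as alive if its last heartbeat is within a timeout, and the live DE with the lowest ID becomes master. The function name `pick_master`, the heartbeat table, and the 1-second timeout are all hypothetical.

```python
def pick_master(last_heartbeat, now, timeout=1.0):
    """Return the ID of the master DE, or None if no DE is alive.

    last_heartbeat: {de_id: time in seconds of the last heartbeat seen}.
    Assumed rule (not from the slides): lowest ID among live DEs wins.
    """
    alive = [de for de, seen in last_heartbeat.items() if now - seen <= timeout]
    return min(alive) if alive else None


# All three DEs are heartbeating, so the lowest ID is master.
heartbeats = {"de1": 10.0, "de2": 10.2, "de3": 10.1}
master_before = pick_master(heartbeats, now=10.5)

# de1 goes silent; the others keep heartbeating and de2 takes over.
heartbeats.update({"de2": 11.4, "de3": 11.3})
master_after = pick_master(heartbeats, now=11.5)
```

Under these assumptions the takeover time is bounded by the heartbeat timeout, which is one way the "within 1 s" figure above could be met.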
Thanks!
Future Work



Scalability

Evaluate over 1-10K switches, 10-100K routes

Networks with backbone-like propagation delays
Structuring decision logic

Arbitrate among multiple, potentially competing objectives

Unify control when some logic takes longer than others
Protocol improvements

Better dissemination and discovery planes
Deployment in today’s networks

Data center, enterprise, campus, backbone (RCP)
Future Work


Expand relationships with security

Securing the infrastructure

Using 4D as mechanism for monitoring/quarantine
Formulate models that establish bounds of 4D

Scale, latency, stability, failure models, objectives
Generate evidence to support/refute principles
Themes of Network Control & Management
Holistic Design

Many different technologies – a few common problems

Find the right abstractions: exploit commonality
Clean Slate

How much autonomy do routers/switches need?

New principles for controlling networks

Separate networking issues from distributed system issues
Leverage Network Structure

Many different types of networks exist, each with different objectives and topologies
Recent Publications

G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A. Greenberg, G. Hjalmtysson, J. Rexford, “On
Static Reachability Analysis of IP Networks,” IEEE INFOCOM 2005, Orlando, FL,
March 2005.

J. Rexford, A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, G. Xie, J. Zhan, H.
Zhang, “Network-Wide Decision Making: Toward a Wafer-Thin Control Plane,”
Proceedings of ACM HotNets-III, San Diego, CA, November 2004.

D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A. Greenberg, H. Zhang, “Routing
Design in Operational Networks: A Look from the Inside,” Proceedings of the 2004
Conference on Applications, Technologies, Architectures, and Protocols for Computer
Communications (ACM SIGCOMM 2004), Portland, Oregon, 2004.

D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G. Hjalmtysson, A. Greenberg, J. Rexford,
“Structure Preserving Anonymization of Router Configuration Data,” Proceedings
of ACM/Usenix Internet Measurement Conference (IMC 2004), Sicily, Italy, 2004.
A Clean-slate Design

What are the fundamental causes of network problems?

How to secure the network and protect the infrastructure?

What functionality needs to be distributed – what can be
centralized?


How to reduce/simplify the software in networks?

What would a “RISC” router look like?
How to leverage technology trends?

CPU and link-speed growing faster than # of switches
Three Principles for Network Control & Management
Network-level Objectives:

Express goals explicitly

Security policies, QoS, egress point selection
Do not bury goals in box-specific configuration
[Figure: Reachability matrix, Traffic engineering rules, Management Logic]
Three Principles for Network Control & Management
Network-wide Views:

Design network to provide timely, accurate info

Topology, traffic, resource limitations
Give logic the inputs it needs
[Figure: Reachability matrix, Traffic engineering rules, Management Logic, Read state info]
Three Principles for Network Control & Management
Direct Control:

Allow logic to directly set forwarding state

FIB entries, packet filters, queuing parameters
Logic computes desired network state, let it implement it
[Figure: Reachability matrix, Traffic engineering rules, Management Logic, Read state info, Write state]
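Direct control can be pictured as a read/compute/write loop: the logic reads the current forwarding state, computes the desired state, and writes only the difference. The sketch below covers the last step. It assumes FIBs are modeled as plain dictionaries mapping destination prefix to next hop; the function `plan_writes` and the operation names "install" and "remove" are hypothetical, not part of any router API.

```python
def plan_writes(current, desired):
    """Return the writes that turn `current` forwarding state into `desired`.

    Both arguments map router -> {destination prefix: next hop}.
    Each write is a tuple (router, operation, prefix, next_hop).
    """
    writes = []
    for router in sorted(set(current) | set(desired)):
        have = current.get(router, {})
        want = desired.get(router, {})
        for prefix in sorted(set(have) | set(want)):
            if prefix not in want:
                # Entry is no longer wanted: remove it.
                writes.append((router, "remove", prefix, None))
            elif have.get(prefix) != want[prefix]:
                # Entry is new or points at the wrong next hop: (re)install it.
                writes.append((router, "install", prefix, want[prefix]))
    return writes


current = {"r1": {"10.0.0.0/8": "r2", "192.168.0.0/16": "r3"}}
desired = {"r1": {"10.0.0.0/8": "r3"}, "r2": {"10.0.0.0/8": "r3"}}
writes = plan_writes(current, desired)
```

Entries that already match are left alone, so the number of writes tracks the size of the change rather than the size of the network state.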
Overview of the 4D Architecture
Dissemination Plane:

Provides a robust communication channel to each router – and robustness is the only goal!

May run over same links as user data, but logically separate and independently controlled
Overview of the 4D Architecture
Discovery Plane:

Each router discovers its own resources and its local environment

E.g., the identity of its immediate neighbors
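One way to picture how per-router discovery feeds the network-wide view is to merge neighbor reports into a link list. This is a minimal sketch, assuming each router reports the set of neighbor IDs it has discovered and that a link is trusted only when both ends report each other. The function `build_view` and the report format are hypothetical.

```python
def build_view(reports):
    """Merge per-router neighbor reports into a set of confirmed links.

    reports: {router_id: set of neighbor IDs that router discovered}.
    Returns a set of (a, b) tuples with a < b, one per confirmed link.
    """
    links = set()
    for router, seen in reports.items():
        for peer in seen:
            # Assumption: keep a link only if the peer also reports this router.
            if router in reports.get(peer, set()):
                links.add(tuple(sorted((router, peer))))
    return links


# B claims to see C, but C has not reported B, so only A-B is confirmed.
reports = {"A": {"B"}, "B": {"A", "C"}, "C": set()}
view = build_view(reports)
```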
Overview of the 4D Architecture
Data Plane:

Spatially distributed routers/switches

Can deploy with today’s technology

Looking at ways to unify forwarding paradigms across technologies
Fundamental Problem: Conflation of Issues

Ideal case: all routing information flooded to all routers inside network

Robustness achieved via flooding
Reality: routing information filtered and aggregated extensively

Route filtering used to implement security and resource policies

Route aggregation used to achieve scalability
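To make the two mechanisms above concrete, here is a small illustration using Python's standard `ipaddress` module. The prefixes and the denied range are made-up examples, not data from the study: aggregation collapses adjacent prefixes into fewer routes, and filtering drops routes that fall inside a denied range.

```python
import ipaddress

# Hypothetical routes known inside one part of the network.
routes = [
    ipaddress.ip_network(prefix)
    for prefix in ("10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/23", "192.168.5.0/24")
]

# Aggregation: merge adjacent prefixes so fewer routes are advertised.
aggregated = list(ipaddress.collapse_addresses(routes))

# Filtering: suppress any route inside a range that policy says not to advertise.
denied = ipaddress.ip_network("192.168.0.0/16")
advertised = [net for net in aggregated if not net.subnet_of(denied)]

print([str(net) for net in aggregated])
print([str(net) for net in advertised])
```

The three 10.1.x.x prefixes collapse into a single 10.1.0.0/22, and the 192.168.5.0/24 route is filtered out. A router that receives only the advertised list can no longer tell which of the original prefixes exist, which is how filtering and aggregation cut into the robustness that flooding was meant to provide.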
4D Separates Distributed Computing Issues from Networking Issues
Distributed computing issues → protocols and network architecture

Overhead

Resiliency

Scalability
Networking issues → management logic

Traffic engineering and service provisioning

Egress point selection

Reachability control (VPNs)

Precomputation of backup paths
4D Can Leverage Network Structure


Decision plane logic can be specialized for
structure of each physical network

Distributed protocols must be prepared for
arbitrary topology graphs

4D enables network logic specialized differently
for access and for backbone

E.g., creating aggregation tree in access network
Advantages

Faster route computations

Retain flexibility to evolve network as needed

Support transition to 100x100 architecture
The Feasibility of the 4D Architecture
We designed and built a prototype of the 4D Architecture

4D Architecture permits many designs – prototype is a
single, simple design point

Decision plane

Contains logic to simultaneously compute routes and enforce reachability matrix

Multiple Decision Elements per network, using simple election protocol to pick master
Dissemination plane

Uses source routes to direct control messages

Extremely simple & robust

Quickly route around failed data links, even multiple failures
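The source-routing idea above can be sketched as follows, assuming the sender already knows the topology and which links have failed, and picks a hop-by-hop path with breadth-first search. The function `source_route`, the link representation, and the node names are hypothetical; the prototype's actual message format and path selection are not described in the slides.

```python
from collections import deque


def source_route(links, failed, src, dst):
    """Return the list of hops from src to dst avoiding failed links, or None.

    links, failed: iterables of (a, b) bidirectional links.
    """
    down = {frozenset(link) for link in failed}
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) not in down:
            neighbors.setdefault(a, []).append(b)
            neighbors.setdefault(b, []).append(a)

    # Breadth-first search from src, remembering how each node was reached.
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]  # the full route is carried in the message itself
        for peer in neighbors.get(node, []):
            if peer not in parent:
                parent[peer] = node
                queue.append(peer)
    return None


links = [("DE", "r1"), ("r1", "r2"), ("DE", "r3"), ("r3", "r2")]
normal = source_route(links, [], "DE", "r2")
rerouted = source_route(links, [("r1", "r2")], "DE", "r2")
cut_off = source_route(links, [("r1", "r2"), ("r3", "r2")], "DE", "r2")
```

Because the whole path travels with the message, intermediate routers need no routing state of their own to forward control traffic, and the sender can switch to another path as soon as it learns of a failure.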