Enabling Innovation inside the Network

Enabling Innovation Inside the Network
Jennifer Rexford
Princeton University
http://www.cs.princeton.edu/~jrex
The Internet: A Remarkable Story
• Tremendous success
–From research experiment to global infrastructure
• Brilliance of under-specifying
–Network: best-effort packet delivery
–Hosts: arbitrary applications
• Enables innovation in applications
–Web, P2P, VoIP, social networks, virtual worlds
• But, change is easy only at the edge… 
Inside the ‘Net: A Different Story…
• Closed equipment
–Software bundled with hardware
–Vendor-specific interfaces
• Over-specified
–Slow protocol standardization
• Few people can innovate
–Equipment vendors write the code
–Long delays to introduce new features
Impacts performance, security, reliability, cost…
How Hard are Networks to Manage?
• Operating a network is expensive
–Operations account for more than half of a network's total cost
–Yet, operator error causes most outages
• Buggy software in the equipment
–Routers with 20+ million lines of code
–Cascading failures, vulnerabilities, etc.
• The network is “in the way”
–Especially a problem in data centers
–… and home networks
Creating a Foundation for Networking
• A domain, not a discipline
–Alphabet soup of protocols
–Header formats, bit twiddling
–Preoccupation with artifacts
• From practice, to principles
–Intellectual foundation for networking
–Identify the key abstractions
–… and support them efficiently
• To build networks worthy of society’s trust
Rethinking the “Division of Labor”
Traditional Computer Networks
Data plane (packet streaming):
Forward, filter, buffer, mark, rate-limit, and measure packets
Traditional Computer Networks
Control plane (distributed algorithms):
Track topology changes, compute routes, install forwarding rules
Traditional Computer Networks
Management plane (human time scale):
Collect measurements and configure the equipment
Shortest-Path Routing
• Management: set the link weights
• Control: compute shortest paths
• Data: forward packets to next hop
[Figure: example topology with link weights 1, 1, 1, 1, and 3]
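To make the division of labor concrete, here is a minimal Python sketch of the control-plane step: Dijkstra's shortest-path computation over the configured link weights. The three-node topology and its weights are illustrative stand-ins, not taken from the figure.

import heapq

def shortest_paths(graph, source):
    """graph: node -> {neighbor: link weight}; returns node -> (distance, first hop)."""
    dist = {source: (0, None)}
    frontier = [(0, source, None)]
    while frontier:
        d, node, first_hop = heapq.heappop(frontier)
        if d > dist[node][0]:
            continue   # stale heap entry
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if neighbor not in dist or nd < dist[neighbor][0]:
                # Record the first hop so the data plane can forward to the next hop.
                hop = neighbor if first_hop is None else first_hop
                dist[neighbor] = (nd, hop)
                heapq.heappush(frontier, (nd, neighbor, hop))
    return dist

# Illustrative topology: management sets the weights, control computes the paths.
graph = {"A": {"B": 1, "C": 3},
         "B": {"A": 1, "C": 1},
         "C": {"A": 3, "B": 1}}
print(shortest_paths(graph, "A"))   # {'A': (0, None), 'B': (1, 'B'), 'C': (2, 'B')}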
Inverting the Control Plane
• Traffic engineering
–Change link weights
–… to induce the paths
–… that alleviate congestion
[Figure: same topology with one link weight raised from 1 to 5 to shift traffic off the congested link]
Avoiding Transient Anomalies
• Distributed protocol
–Temporary disagreement among the nodes
–… leaves packets stuck in loops
–Even though the change was planned!
15
1
1
1
3
13
Death to the Control Plane!
• Simpler management
–No need to “invert” control-plane operations
• Faster pace of innovation
–Less dependence on vendors and standards
• Easier interoperability
–Compatibility only in “wire” protocols
• Simpler, cheaper equipment
–Minimal software
Software Defined Networking (SDN)
Logically-centralized control (smart, slow)
API to the data plane (e.g., OpenFlow)
Switches (dumb, fast)
OpenFlow Networks
http://www.openflow.org
Data-Plane: Simple Packet Handling
• Simple packet-handling rules
– Pattern: match packet header bits
– Actions: drop, forward, modify, send to controller
– Priority: disambiguate overlapping patterns
– Counters: #bytes and #packets
1. src=1.2.*.*, dest=3.4.5.* → drop
2. src=*.*.*.*, dest=3.4.*.* → forward(2)
3. src=10.1.2.3, dest=*.*.*.* → send to controller
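As a sketch of how a switch applies such rules, the Python below matches a packet against prioritized wildcard patterns and updates the per-rule counters. The rule and packet encodings are illustrative, not an actual OpenFlow implementation.

class Rule:
    def __init__(self, priority, pattern, action):
        self.priority = priority   # disambiguates overlapping patterns
        self.pattern = pattern     # e.g. {"src": "1.2.*.*", "dest": "3.4.5.*"}
        self.action = action       # e.g. "drop", "forward(2)", "controller"
        self.packets = 0           # counters: #packets and #bytes
        self.bytes = 0

def field_matches(pattern, value):
    # Compare octet by octet; "*" matches any octet.
    return all(p in ("*", v) for p, v in zip(pattern.split("."), value.split(".")))

def handle(rules, packet):
    # Highest-priority matching rule wins; a table miss goes to the controller.
    for rule in sorted(rules, key=lambda r: -r.priority):
        if all(field_matches(p, packet[f]) for f, p in rule.pattern.items()):
            rule.packets += 1
            rule.bytes += packet["len"]
            return rule.action
    return "controller"

rules = [Rule(3, {"src": "1.2.*.*", "dest": "3.4.5.*"}, "drop"),
         Rule(2, {"dest": "3.4.*.*"}, "forward(2)"),
         Rule(1, {"src": "10.1.2.3"}, "controller")]
print(handle(rules, {"src": "1.2.9.9", "dest": "3.4.5.6", "len": 1500}))   # drop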
Controller: Programmability
[Figure: App #1, App #2, and App #3 run on top of a Network OS, which talks to the switches]
Events from switches: topology changes, traffic statistics, arriving packets
Commands to switches: (un)install rules, query statistics, send packets
Dynamic Access Control
• Inspect first packet of each connection
• Consult the access control policy
• Install rules to block or route traffic
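A sketch of what such an application might look like on the controller; the install_rule/send_packet calls and the policy are hypothetical, for illustration only.

def handle_first_packet(switch, packet, blocked={"10.0.0.99"}):
    # Invoked by the network OS when a packet misses in the flow table.
    pattern = {"src": packet["src"], "dest": packet["dest"]}
    if packet["src"] in blocked:
        # Block: drop all further packets of this connection in hardware.
        switch.install_rule(priority=2, pattern=pattern, action="drop")
    else:
        # Allow: route the connection and release the buffered first packet.
        switch.install_rule(priority=1, pattern=pattern, action="forward(2)")
        switch.send_packet(packet)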
Seamless Mobility/Migration
• See host sending traffic at new location
• Modify rules to reroute the traffic
Example: Server Load Balancing
• Pre-install load-balancing policy
• Split traffic based on source IP
[Figure: the switch splits client traffic between two server replicas using wildcard rules src=0* and src=1*]
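A sketch of generating the pre-installed rules; the virtual IP address and port numbers are made up for illustration.

def load_balancer_rules(virtual_ip, replicas):
    # replicas: list of (source-prefix pattern, output port) pairs, e.g. the
    # two halves of the client address space, src=0* and src=1*.
    return [{"priority": 1,
             "pattern": {"src": prefix, "dest": virtual_ip},
             "action": "forward(%d)" % port}
            for prefix, port in replicas]

for rule in load_balancer_rules("3.4.5.6", [("0*", 1), ("1*", 2)]):
    print(rule)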
Example Applications
• Dynamic access control
• Seamless mobility/migration
• Server load balancing
• Using multiple wireless access points
• Energy-efficient networking
• Adaptive traffic monitoring
• Denial-of-Service attack detection
• Network virtualization
See http://www.openflow.org/videos/
OpenFlow in the Wild
• Open Networking Foundation
– Creating Software Defined Networking standards
– Google, Facebook, Microsoft, Yahoo, Verizon, Deutsche Telekom, and many other companies
• Commercial OpenFlow switches
– HP, NEC, Quanta, Dell, IBM, Juniper, …
• Network operating systems
– NOX, Beacon, Floodlight, Nettle, ONIX, POX, Frenetic
• Network deployments
– Eight campuses, and two research backbone networks
– Commercial deployments (e.g., Google backbone)
Algorithmic Challenges in Software Defined Networking
Two-Tiered Computational Model
Smart, slow controller; dumb, fast switches
What problems can we solve efficiently in this two-tiered model?
Example: Hierarchical Heavy Hitters
• Measure large traffic aggregates
–Understand network usage
–Detect unusual activity
• Fixed aggregation level
–Top 10 source IP addresses?
–Top 10 groups of 256 IP addresses?
• Better:
–Identify the top IP prefixes, of any size
–Contributing a fraction T of the traffic
Heavy Hitters (HH)
• All IP prefixes contributing >= T of the link capacity
–HH: sends more than T = 10% of the link capacity (here, 100 units)
[Figure: binary tree of IP prefixes with traffic counts: **** = 80; 0*** = 40, 1*** = 40; 00** = 19, 01** = 21; 000* = 12, 001* = 7, 010* = 12, 011* = 9; leaves 0000 = 11, 0001 = 1, 0010 = 2, 0011 = 5, 0100 = 9, 0101 = 3, 0110 = 5, 0111 = 4. The heavy hitters (count >= 10) are ****, 0***, 1***, 00**, 01**, 000*, 010*, and 0000]
Hierarchical Heavy Hitters (HHH)
• All IP prefixes contributing >= T of capacity
• … excluding descendant HHHs
[Figure: the same prefix tree; the hierarchical heavy hitters are 0000 (11), 010* (12), 0*** (17, after excluding its descendant HHHs 0000 and 010*), and 1*** (40)]
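As a sketch of the definition itself (offline, with exact counts), the Python below finds the HHHs bottom-up, subtracting traffic already claimed by descendant HHHs. The leaf counts reproduce the 0*** subtree of the example; the 1*** aggregate (40) is an HHH on its own.

# Leaf counts from the example's 0*** subtree (binary-string prefixes).
LEAVES = {"0000": 11, "0001": 1, "0010": 2, "0011": 5,
          "0100": 9, "0101": 3, "0110": 5, "0111": 4}

def hierarchical_heavy_hitters(leaves, width=4, threshold=10):
    hhhs = {}
    # Walk from the longest prefixes up toward the root.
    for length in range(width, -1, -1):
        for prefix in {leaf[:length] for leaf in leaves}:
            total = sum(c for leaf, c in leaves.items() if leaf.startswith(prefix))
            # Exclude traffic already claimed by descendant HHHs.
            claimed = sum(c for p, c in hhhs.items()
                          if p.startswith(prefix) and p != prefix)
            if total - claimed >= threshold:
                hhhs[prefix] = total - claimed
    return hhhs

# '0000' (11), '010' (12), and '0' (17, excluding the first two) are HHHs.
print(hierarchical_heavy_hitters(LEAVES))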
Computational Model
• Previous work: custom hardware
– Streaming algorithms using sketches
– Fast and accurate, but not commodity hardware
• Our model: simplified OpenFlow
– Slow, smart controller
 Read traffic counters
 Adapt the rules installed in the switch
– Fast, dumb switch
 Ternary match on packet headers
 Traffic counters
• What algorithm should the controller run?
Monitoring HHHs
Priority  Prefix rule  Count
1         0000         11
2         010*         12
3         0***         17
4         ****         40
[Figure: the same prefix tree, with these rules monitoring each HHH and the leftover traffic]
Detecting New HHHs
• Monitor the children of each HHH
• Using at most 2/T rules (there are at most 1/T HHHs, each with two children)
[Figure: the same prefix tree; counters sit on the children of each HHH, so a child whose count crosses the threshold is detected as a new HHH]
Iteratively Adjust Wildcard Rules
• Expand: if count > T, install rule for children
• Collapse: if count < T, remove rule
Priority  Prefix rule  Count
1         0***         80
2         ****         0
[Figure: prefix tree for a new traffic distribution: 0*** = 80; 00** = 77, 01** = 3; 000* = 70, 001* = 7, 010* = 3, 011* = 0; leaves 0000 = 70, 0001 = 0, 0010 = 2, 0011 = 5, 0100 = 0, 0101 = 3, 0110 = 0, 0111 = 0. The 0*** rule's count of 80 exceeds T, so it is expanded into rules for 00** and 01**]
Iteratively Adjust Wildcard Rules
• Expand: if count > T, install rule for children
• Collapse: if count < T, remove rule
Priority  Prefix rule  Count
1         00**         77
2         01**         3
3         ****         0
[Figure: same tree; 00** (77) exceeds T and is expanded into 000* and 001*, while 01** (3) is below T and is collapsed]
Iteratively Adjust Wildcard Rules
• Expand: if count > T, install rule for children
• Collapse: if count < T, remove rule
Priority  Prefix rule  Count
1         000*         70
2         001*         7
3         ****         3
[Figure: same tree; the rules now isolate 000* (70) and 001* (7), while the wildcard **** rule catches the remaining 3 units]
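A sketch of the controller's adaptation loop under this expand/collapse policy, using binary-string prefixes; it is a simplification for illustration, not the full algorithm (real rule tables also need priorities and a catch-all rule).

def adjust(monitored, counts, threshold=10, width=4):
    """monitored: set of prefixes with rules; counts: prefix -> traffic this interval."""
    updated = set()
    for prefix in monitored:
        count = counts.get(prefix, 0)
        if count >= threshold and len(prefix) < width:
            # Expand: replace the rule with rules for both children.
            updated.update({prefix + "0", prefix + "1"})
        elif count < threshold and prefix:
            # Collapse: back off to the parent prefix.
            updated.add(prefix[:-1])
        else:
            updated.add(prefix)
    return updated

# One step of the example: 0*** counts 80 >= 10, so it expands to 00** and 01**.
print(adjust({"0"}, {"0": 80}))   # {'00', '01'}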
Using Leftover Rules
• Monitoring children of all HHHs
– Threshold fraction T of traffic (e.g., T=0.1)
– At most 2/T rules (e.g., 20 rules)
• Often far fewer rules are needed
– Some HHHs are very large
– So, there may be fewer than 1/T HHHs
• Using the extra rules
– Monitor nodes close to the threshold
– Most likely to become new HHHs soon!
Experimental Results
• CAIDA packet traces
– Trace of 400,000 IP packets/second
– Measuring HHHs for T=0.05 and T=0.10
– Time intervals of 1 to 60 seconds
• Accuracy
– Detect and measure ~9 out of 10 HHHs
– Large traffic aggregates are stable
• Speed
– Takes a few intervals to find new HHHs
– Meanwhile, report coarse-grain results
[Figure: coarse-grain reporting: the parent prefix 000* (12) is reported while its children 0000 (11) and 0001 (1) are still being uncovered]
Stepping Back
• Other monitoring problems
– Multi-dimensional HHH
– Detecting large changes
– Denial-of-Service attack detection
• Changing the computational model
– Efficient, generic support for traffic monitoring
– Good for many problems, rather than perfect for one
• Other kinds of problems
– Flexible server load balancing
– Latency-equalized routing
–…
Making SDN Work Better
• Distributed controllers
– Improve scalability, reliability, performance
– Need: efficient distributed algorithms
• Distributed policies
– Spread rules over multiple switches
– Need: policy transformation algorithms
• Verifying network properties
– Test whether the rules satisfy invariants
– Need: efficient verification algorithms
• Many more interesting problems!
Conclusion
• SDN is exciting
–Enables innovation
–Simplifies management
–Rethinks networking
• SDN is happening
–Practice: useful APIs and good industry traction
–Principles: start of higher-level abstractions
• Great research opportunity
–Practical impact on future networks
–Placing networking on a strong foundation