Towards Wireless Overlay Network Architectures

Quality of Service vs.
Any Service at All
IWQoS 2005, Passau, Germany
Randy H. Katz
Computer Science Division
Electrical Engineering and Computer Science Department
University of California, Berkeley
Berkeley, CA 94720-1776
1
Presentation Outline
• The Problem
• System and Network Trends
• Checking-Observing-Protecting Services
• Inspection-and-Action Boxes
• Annotation Layer
• Scenario
• Call to Action
2
Some Observations
• Internet reasonably robust to point problems
like link and router failures (“fail stop”)
• Successfully operates under a wide range of
loading conditions and over diverse
technologies
• During 9/11/01, the Internet worked reasonably
well under heavy traffic conditions and with
some major facilities failures in Lower
Manhattan
4
The Problem
• Networks awash in illegitimate traffic: port
scans, propagating worms, p2p file swapping
– Legitimate traffic starved for bandwidth
– Essential network services (e.g., DNS, NFS) compromised
• Needed: better network management of
services/applications to achieve good
performance and resilience even in the face of
network stress
– Self-aware network environment
– Observing and responding to traffic changes
– While sustaining the ability to control the network
5
From the Frontlines
• Berkeley Campus Network
– Unanticipated traffic surges render the network
unmanageable (and may cause routers to fail)
– Denial of service attacks, latest worm, or the newest file
sharing protocol largely indistinguishable
– In-band control channel is starved, making it difficult to
manage and recover the network
• Berkeley EECS Department Network (12/04)
– Suspected denial-of-service attack against DNS
– Poorly implemented/configured spam appliance adds to DNS
overload
– Traffic surges render it impossible to access Web or mount
file systems
• Network problems contribute to brittleness of
distributed systems
6
Why and How Networks Fail
• Complex phenomenology of failure
• Recent Berkeley experience suggests that traffic
surges render enterprise networks unusable
• Indirect effects of DoS traffic on network
infrastructure: role of unexpected traffic patterns
– Cisco Express Forwarding: random IP addresses flood route
cache forcing all traffic to go through router slow path—high
CPU utilization yields inability to manage router table updates
– Route Summarization: powerful misconfigured peer overwhelms
weaker peer with too many router table entries
– SNMP DoS attack: overwhelm SNMP ports on routers
– DNS attack: response-response loops in DNS queries generate
traffic overload
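The route-cache effect described above can be sketched with a toy LRU cache. This is an illustration only, not a model of any router implementation: the class name `RouteCache`, the cache size, and the traffic mixes are all assumptions chosen for the example.

```python
import random
from collections import OrderedDict


class RouteCache:
    """Fixed-size LRU cache of destination -> next hop (toy model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, dst):
        if dst in self.entries:
            self.entries.move_to_end(dst)  # refresh LRU position
            self.hits += 1
            return
        self.misses += 1  # slow path: full lookup on the router CPU
        self.entries[dst] = "next-hop"
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def miss_rate(destinations, capacity=256):
    cache = RouteCache(capacity)
    for dst in destinations:
        cache.lookup(dst)
    return cache.misses / (cache.hits + cache.misses)


rng = random.Random(0)
# Ordinary traffic: 10,000 packets spread over 100 popular destinations.
normal = [rng.randrange(100) for _ in range(10_000)]
# Scan-like traffic: 10,000 packets to random 32-bit addresses.
flood = [rng.getrandbits(32) for _ in range(10_000)]

normal_rate = miss_rate(normal)
flood_rate = miss_rate(flood)
print(f"normal miss rate: {normal_rate:.3f}, flood miss rate: {flood_rate:.3f}")
```

With few distinct destinations almost every lookup hits the cache; with random addresses almost every lookup misses, so nearly all packets take the slow path.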
7
Possible Approach
• New technology: packet flow manipulations at L4-L7
made possible by new programmable network elements (PNEs) and stateful routers
• Enables identification/segregation of traffic
– Good: protect it
– Bad: block it
– Suspicious: slow it
• Check/Observe/Protect Services (COPS)
– Inspection-and-Action Boxes (iBoxes)
– Annotation layer between routing and transport
• Yielding new service building blocks
– Beyond packet marking and annotation …
– To flow extraction and path-oriented statistics collection …
– Based on traffic analysis, model extraction, statistical correlation
& causality testing
8
Managing Edge Network
Services and Applications
• Not shrink wrap software—but cascaded “appliances”
• Data center in-a-box: blade servers, network storage
• Brittle to traffic surges and shifts, yielding network
disruption
[Figure: Edge Network Middleboxes. Server blades in the edge network reach the wide area network through middleboxes: traffic shaper, load balancer, IDS, firewall, egress checker.]
10
Appliances Proliferate:
Management Nightmare!
• Packeteer PacketShaper: traffic monitor and shaper
• Network Appliance NetCache: localized content delivery platform
• F5 Networks BIG-IP LoadBalancer: Web server load balancer
• Ingrian i225: SSL offload appliance
• Cisco SN 5420: IP-SAN storage gateway
• NetScreen 500: firewall and VPN
• Extreme Networks SummitPx1: L2-L7 application switch
• Nortel Alteon Switched Firewall: CheckPoint firewall and L7 switch
• Cisco IDS 4250-XL: intrusion detection system
11
Network Support for Tiered
Applications
[Figure: Datacenter Network(s). Two views: separate Web, App, and Database server tiers joined by LANs and fronted by a load balancer, firewall, and egress checker on the path to the wide area network; and a unified LAN in which load balancer, firewall, and egress checker run on blades alongside servers on demand.]
• Configure servers, storage, connectivity, and net functionality as needed
12
“The Computer is the Network”
• Emergence of Programmable Network Elements
– Network components where net services/applications execute
– Virtualization (hosts, storage, nets) and flow filtering (blocking,
delaying)
• Computation-in-the-Network is NOT Unlimited
– Packet handling complexity limited by latency/processing overhead
– NOT arbitrary per packet programming (aka active networking)
– Allocate general computation like proxies to network blades
• Beyond Per Packet Processing: Network Flows
– Managing/configuring network for performance and resilience
– Emergence of stateful routers for flow-based management
– Adaptation based on Observe (Monitor), Analyze (Detect, Diagnose),
Act (Redirect, Reallocate, Balance, Throttle)
13
Check
• Checkable Protocols: Maintain invariants and
techniques for checking and enforcing
protocols
– Listen & Whisper: well-formed BGP behavior
– Traffic Rate Control: Self-Verifiable Core Stateless Fair
Queuing (SV-CSFQ)
• Existing work requires changes to protocol end
points or routers on the path
– Difficult to retrofit checkability to existing protocols
without embedded processing in PNEs
– Develop building blocks for new protocols
» Observable protocol behavior
» Cryptographic techniques
» Statistical methods
15
Observe
• Observation and Action Points
– Network points where control is exercised,
traffic classified, resources allocated
– In-datapath statistics collection plus
annotating, prioritizing, shaping, blocking, …
– Inspection-and-Action Boxes (iBoxes)
» Prototyped on commercial PNEs
» Placed at Internet and Server edges of enterprise net
» Cascaded with existing routers to extend their
functionality
» Migration into (some current and) future router
architectures
16
Protect
• Protect Crucial Services
– Minimize and mitigate effects of attacks and traffic surges
– Classify traffic into good, bad, and ugly (suspicious)
» Good: standing patterns and operator-tunable policies
» Bad: evolves faster, harder to characterize
» Ugly: cannot immediately be determined as good or bad
– Filter the bad, slow the suspicious, maintain resources for the
good (e.g., control traffic)
» Sufficient to reduce false positives
» Some suspicious-looking good traffic may be slowed down,
but won’t be blocked
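The good/bad/ugly policy above can be sketched as a three-way classifier. This is a minimal illustration under assumed rules: the `Flow` fields, the port list, the blocklist entry, and the rate limit are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src: str
    dst_port: int
    pkts_per_sec: float


# Hypothetical operator-tunable policy (values invented for illustration).
GOOD_PORTS = {53, 179}             # e.g. DNS and BGP control traffic
BLOCKED_SOURCES = {"203.0.113.7"}  # a source already known to be bad
MAX_GOOD_RATE = 1000.0             # pkts/s still consistent with a standing pattern

ACTIONS = {"good": "forward", "ugly": "rate-limit", "bad": "drop"}


def classify(flow):
    """Return 'good', 'bad', or 'ugly' for a flow."""
    if flow.src in BLOCKED_SOURCES:
        return "bad"
    if flow.dst_port in GOOD_PORTS and flow.pkts_per_sec <= MAX_GOOD_RATE:
        return "good"
    # Neither provably good nor provably bad: slow it, never block it.
    return "ugly"


def act(flow):
    return ACTIONS[classify(flow)]
```

Because the default verdict is "ugly" rather than "bad", a misclassified legitimate flow is only slowed, which is the false-positive trade-off the slide describes.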
17
iBoxes: Observe, Analyze, Act
[Figure: Enterprise network architecture. Nodes labeled I, R, and E are arranged across the Internet or WAN edge, a distribution tier, and three access tiers serving network services end hosts, user end hosts, and server end hosts.]
Inspection-and-Action Boxes:
Deep multiprotocol packet inspection
No routing; observation & marking
Policing points: drop, fence, block
19
Generic Network Element
Architecture
[Figure: input ports and output ports, each with buffers, joined by an interconnection fabric; classification processors (CP) with “tag” memory; an action processor (AP) driven by rules and programs.]
20
RouterVM
• High-level specification environment for
describing packet processing
• Virtualized: abstracted view of underlying
hardware resources of target PNEs
– Portable across diverse architectures
– Simulate before deployment
• Services, policies, and standard routing
functions managed thru composed packet filters
– Generalized packet filter: trigger + action bundle, cascaded,
allows new services and policies to be implemented /
configured thru GUI
– New services can be implemented without new code through
library building blocks
Mel Tsai
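The "trigger + action bundle, cascaded" idea can be sketched as follows. This is not RouterVM code: the `Filter` class, the verdict strings, and the two example filters are assumptions made up to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, object]


@dataclass
class Filter:
    name: str
    trigger: Callable[[Packet], bool]  # does this filter apply to the packet?
    action: Callable[[Packet], str]    # returns "drop", "allow", or "next"


def run_cascade(filters: List[Filter], pkt: Packet) -> str:
    """Pass the packet through the filters in order until one decides."""
    for f in filters:
        if f.trigger(pkt):
            verdict = f.action(pkt)
            if verdict != "next":
                return verdict
    return "allow"  # assumed default when no filter decides


def mark_external(pkt: Packet) -> str:
    pkt["label"] = "external"  # annotate, then continue down the cascade
    return "next"


filters = [
    Filter("mark-external", lambda p: not str(p["src"]).startswith("10."), mark_external),
    Filter("drop-telnet", lambda p: p["dst_port"] == 23, lambda p: "drop"),
]
```

New behavior is added by composing another `Filter` into the list rather than by changing `run_cascade`, which mirrors the library-building-block claim on the slide.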
21
Extended Router Architecture
• Virtualized components representative of a “common”
router implementation
• Structure independent of particular hardware
• Figure callouts:
– Virtual line card instantiated for every port required by application
– Virtual backplane shuttles packets between line cards
– CPU handles routing protocols & mgmt tasks
– Compute engines perform complex, high-latency processing on flows
– Filters are key to flexibility
– Blue: “standard” components
– Yellow: components added & configured per-application
Mel Tsai
22
GPF “Fill-in” Specification
RouterVM Generalized Packet Filter (type L7)
• “Packet filter” as high-level, programmable building block for network appliance apps
Traditional Filter: FILTER 19 SETUP
NAME        example
Classification parameters:
SIP         any
SMASK       255.255.255.255
DIP         10.0.0.0
DMASK       255.255.255.0
PROTO       tcp,udp
SRC PORT    any
DST PORT    80
VLAN        default
Action:
ACTION      drop
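The traditional filter above can be read as a match-then-act rule. The sketch below encodes the same field values with Python's standard `ipaddress` module; the dictionary layout, the `apply_filter` helper, and the "pass" result for non-matching packets are assumptions, and the VLAN field is left out for brevity.

```python
import ipaddress

# Field values copied from FILTER 19; the structure is illustrative.
FILTER_19 = {
    "name": "example",
    "sip": "any",
    "dip": ipaddress.ip_network("10.0.0.0/255.255.255.0"),
    "proto": {"tcp", "udp"},
    "src_port": "any",
    "dst_port": 80,
    "action": "drop",
}


def matches(flt, pkt):
    """True if every classification parameter of the filter matches."""
    return (
        (flt["sip"] == "any" or ipaddress.ip_address(pkt["sip"]) in flt["sip"])
        and ipaddress.ip_address(pkt["dip"]) in flt["dip"]
        and pkt["proto"] in flt["proto"]
        and (flt["src_port"] == "any" or pkt["src_port"] == flt["src_port"])
        and pkt["dst_port"] == flt["dst_port"]
    )


def apply_filter(flt, pkt):
    # "pass" is an assumed result for packets the filter does not match.
    return flt["action"] if matches(flt, pkt) else "pass"


web_pkt = {"sip": "192.0.2.5", "dip": "10.0.0.17", "proto": "tcp",
           "src_port": 40000, "dst_port": 80}
print(apply_filter(FILTER_19, web_pkt))
```

A packet to port 80 on any host in 10.0.0.0/24 is dropped; packets to other subnets or ports fall through.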
23
GPF Action Language
• Basic set of assignments,
evaluations, expressions,
control-flow ops, “physical”
actions on packets/flows
– Control-flow: if-then-else, if-not
– Evaluation: ==, <=, >=, !=
– Packet flow control: allow,
unallow, drop, skip filter, jump
filter
– Packet manipulation: field
rewriting (ip_src = blah,
tcp_src = blah), truncation,
header additions
– Actions: NAT, loadbalance,
ratelimit, (perhaps others)
– Meta actions: packet generation,
logging, statistics gathering
• Basic Filter
– Simple L2-L4 header classifications
– Any RouterVM actions
• L7 Filter
– REs, TCP termination, ADU recon
• NAT Filter
– Capabilities beyond simple NAT action
available to all GPFs
• Content Caching
– Builds on L7 filter functionality
• WAN Link Compression
– Simple to specify, but requires lots of
computation
• IP-to-FC Gateway
– Requires own table format &
processing
• XML Preprocessing
– Not very well documented, and
difficulty is unknown…
24
Network-Level
Observe-Analyze-Act
• Observe
– Packet, path, protocol, service invocation statistical collection
and sampling: frequencies, latencies, completion rates
– Construct the collection infrastructure
• Analyze
– Determine correlations among observations
– “Normal” model discovery + anomaly detection
– Exploit SLT
• Act
– Experiment to test correlations
– Prioritize and throttle
– Mark and annotate
– Control theory? Distributed analyses and actions
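The "normal model discovery + anomaly detection" step can be sketched with a running mean and standard deviation. This is a deliberately simple stand-in for whatever statistical model the project used; the `Baseline` class, the 3-sigma threshold, and the sample rates are assumptions.

```python
import math


class Baseline:
    """Running mean/variance (Welford's method) as a toy 'normal' model."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def is_anomalous(self, x, threshold=3.0):
        """Flag x if it lies more than `threshold` std devs from the mean."""
        if self.n < 2:
            return False  # not enough history to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        if std == 0:
            return x != self.mean
        return abs(x - self.mean) / std > threshold


# Observe: hypothetical DNS requests per second during normal operation.
baseline = Baseline()
for rate in [100, 104, 98, 101, 97, 103, 99, 102]:
    baseline.update(rate)
```

An observation such as 900 requests per second would be flagged and could trigger one of the Act steps, while 101 would not.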
26
Network Layer Mechanism:
Annotations
[Figure: protocol stack with an Annotation layer inserted between Transport and Network: Application, Presentation, Session, Transport, Annotation, Network, Link, Phy.]
• Enhance network visibility:
disseminate observations,
communicate actions, provide in-band
network management actions,
iBox-to-iBox communications
• iBoxes label packets at annotation
layer but do not rewrite packet
contents
• Annotations stack, must be
removed from packets before
delivery to A-layer unaware end
nodes
• Expose annotations to application
layer?
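The stacking and stripping rules above can be sketched as a label stack carried next to an untouched payload. The `Packet` class and the helper names are hypothetical; the talk does not specify an encoding.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    payload: bytes
    annotations: List[str] = field(default_factory=list)


def annotate(pkt: Packet, label: str) -> None:
    """An iBox pushes a label; the packet contents are not rewritten."""
    pkt.annotations.append(label)


def deliver(pkt: Packet, a_layer_aware: bool) -> Packet:
    """Strip all annotations before handing off to an unaware end node."""
    if not a_layer_aware:
        pkt.annotations.clear()
    return pkt


pkt = Packet(b"GET / HTTP/1.1")
annotate(pkt, "external")    # pushed by a first iBox
annotate(pkt, "suspicious")  # pushed by a later iBox
```

Keeping annotations separate from the payload is what lets the last iBox remove them without having to undo any rewriting.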
27
Annotation Layer:
Simple Marking Example
[Figure: Enterprise network with a boundary router, an internal router, two iBoxes, and network services. External and internal traffic both arrive as packets that an iBox labels (Action: Mark packets). An iBox near the network services detects load and triggers an action: slow traffic with “external” labels.]
• Marking vs. rewriting approach
– E.g., mark packets as internally vs. externally sourced using IP
header options
• Prioritizing internal over external access to services
solves some, but not all, traffic surge problems
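The marking example can be sketched in two steps: one iBox labels packets by origin, another reorders them when the services are overloaded. The internal prefix 10.0.0.0/8, the dictionary packet format, and the "serve external last" policy are assumptions for illustration; a real deployment might delay or rate-limit instead.

```python
import ipaddress

INTERNAL = ipaddress.ip_network("10.0.0.0/8")  # hypothetical enterprise prefix


def mark(pkt):
    """Boundary iBox: label the packet by origin without rewriting it."""
    inside = ipaddress.ip_address(pkt["src"]) in INTERNAL
    return {**pkt, "label": "internal" if inside else "external"}


def service_order(pkts, overloaded):
    """Service-side iBox: under load, serve internal packets first."""
    if not overloaded:
        return list(pkts)
    # sorted() is stable, so arrival order is kept within each class.
    return sorted(pkts, key=lambda p: p["label"] == "external")


arrivals = [mark({"src": s, "id": i})
            for i, s in enumerate(["198.51.100.2", "10.2.3.4", "192.0.2.9", "10.9.9.9"])]
```

As the slide notes, this helps only when the surge is external; an internally generated surge would pass straight through.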
28
Annotation Layer:
iBox Piggybacked Control Plane
• Problem: Control plane starvation
• Use A-layer for iBox-to-iBox communication
– Passively piggyback on existing flows
– “Busy” parts of network have lots of control plane b/w
– Actively inject control packets to less active parts
– Embedded control info authenticated and sent redundantly
– Priority given to packets w/control when net under stress
• Network monitoring and statistics collection
dissemination subsystem
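The passive-piggyback and active-injection rules above can be sketched as a small sender-side queue. The `IBoxLink` class, the idle limit, and the packet dictionaries are hypothetical, and authentication and redundant sending are left out of this sketch.

```python
from collections import deque


class IBoxLink:
    """Control messages waiting to travel from one iBox to a neighbor."""

    def __init__(self, idle_limit=3):
        self.pending = deque()
        self.idle_ticks = 0
        self.idle_limit = idle_limit

    def queue_control(self, msg):
        self.pending.append(msg)

    def on_data_packet(self, pkt):
        """Passively piggyback one pending message on a passing packet."""
        self.idle_ticks = 0
        if self.pending:
            pkt = {**pkt, "control": self.pending.popleft()}
        return pkt

    def on_idle_tick(self):
        """Actively inject a control packet once the link has been quiet."""
        self.idle_ticks += 1
        if self.pending and self.idle_ticks >= self.idle_limit:
            self.idle_ticks = 0
            return {"control": self.pending.popleft(), "injected": True}
        return None


link = IBoxLink()
link.queue_control("dns-load=high")
link.queue_control("throttle=mail")
carried = link.on_data_packet({"payload": b"data"})
ticks = [link.on_idle_tick() for _ in range(3)]
```

Busy links drain the queue for free on data packets; quiet links fall back to injected control packets after `idle_limit` empty ticks.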
29
Scenario: Traffic Surge
Inhibiting Network Services
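[Figure: enterprise network with iBoxes I-I at the Internet edge, I-S at the server edge, and I-A at the access edge, linked by routers (R) through the distribution tier. The server edge hosts the primary and secondary DNS servers, the mail server, and the spam appliance.]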
• DNS Server swamped by excessive request traffic
– Observe: DNS timeouts, Web access traffic slowed, but also
higher than normal mail delivery latency implying busy server edge
(correlation between Mail Server and DNS Server utilization?)
– Root Cause: High DNS request rates generated by Spam Appliance
triggered by mail surge
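The correlation question raised above can be checked with a Pearson coefficient over utilization samples. The numbers below are invented for illustration and do not come from the Berkeley incident; the `pearson` helper is written out by hand to keep the sketch self-contained.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical utilization samples (fraction of capacity) over six intervals.
mail_util = [0.20, 0.25, 0.40, 0.70, 0.90, 0.95]
dns_util = [0.15, 0.18, 0.35, 0.65, 0.88, 0.97]
web_util = [0.50, 0.48, 0.52, 0.49, 0.51, 0.50]

r_mail_dns = pearson(mail_util, dns_util)
r_mail_web = pearson(mail_util, web_util)
```

A coefficient near 1 for mail versus DNS, and a much weaker one for mail versus Web, would point toward the mail path as the driver of DNS load. Correlation alone does not establish cause, which is why the talk pairs it with experiments.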
31
Scenario Continued
• How Diagnosed?
– I-S detects high link utilization and abnormally high DNS traffic
– Stats from I-I: high mail traffic, low outgoing web traffic,
incoming traffic high but link utilization not high
– Stats from I-A: lower web traffic, no unusual mail origination
– Problem localized to Server edge, but visibility limited
32
Scenario Continued
• Possible Action Responses
– Experiment: Redirect local DNS requests to Secondary DNS server:
if these complete, can infer the server is the problem, not the
network
– Throttle: Due to the Mail Server/DNS correlation, block/slow email traffic at
Server Edge: should expect reduced DNS server utilization
33
System Perspective Needed!
[Figure: layered system view between a client and a server. Each side has distributed middleware, an edge network with an iBox, and a router; between them sit SLT services, an application-specific overlay network, and the commodity Internet (IP network). Operator, user, and prototype applications sit at the top.]
• Programming abstractions for roll-back and wide-area distributed computations
• Crash-only services + observation infrastructure for system SLT
• Checkable protocols
• Fast detection & route recovery
• Observation infrastructure for network SLT
35
Hope for Emerging
Platforms
iBoxes implemented on
commercial PNEs
– Don’t: route or implement (full)
protocol stacks
– Do: protect routers and shield
network services
» Classify packets
» Extract flows
» Redirect traffic
» Log, count, collect stats
» Filter/shape traffic
36
Summary and Conclusions
• Processing-in-the-Network is real
– Networking plus processing in switched and routed
infrastructures
– Configuration and management of packet processing cast
onto PNEs (network appliances, blades, stateful routers)
• Needed: Unifying Framework
– Methods to specify functionality and processing
» RouterVM: Filtering, Redirecting, Transformation
» Map from policy intentions to network actions?
» Local observations/correct global behavior?
• Application-specific network processing based
on session extraction
37
Summary and Conclusions
• PNEs: foundation of a pervasive infrastructure
for observation and action at the network level
– iBoxes Observation and Action points
– Annotation Layer for marking and control
• Check-Observe-Protect paradigm for
protecting critical resources when network is
under stress
• Functionality eventually migrates into future
generations of routers
– E.g., Blades embedded in routers
38
Quality of Service
vs.
Any Service at All
Randy H. Katz
Thank You!
39