Towards Resilient Networks
using Programmable Networking
Technologies
Linlin Xie, Paul Smith, Mark Banfield, Helmut Leopold, James Sterbenz and David Hutchison
Computing Department, Lancaster University
Department of Electrical Engineering and Computer Science, University of Kansas
Telekom Austria AG
Presentation Outline
• Introduction to resilience networking
– Motivation
– Resilient networks
– Aims
– Approaches
• Scenario
– Flash Crowd Event to Web Servers
– Ill-Effect Detection
– Remediation
Linlin Xie
IWAN 2005
Motivation
• The Internet is a utility
– Consumers, businesses, governments
• Failures & attacks are inevitable
– Hurricane Katrina, 9/11, NE blackout…
– Link/device failures, DDoS…
• Current Internet and applications not resilient
• Need networking effort: network providers should take the responsibility
– to protect network resources and optimize the utilization
– to protect cross traffic as well as stricken customers
Resilient Networks
• Ability of a network to maintain an acceptable level of service in the face of challenges to normal operation, or to recover it within an acceptable period of time
• Example challenges to normal operation:
– Unusual traffic load (e.g. flash crowds)
– High mobility of nodes and sub-networks
– Weak and episodic connectivity of wireless channels
– Long delay paths
– Large-scale natural disasters
– Attacks against the network hardware, software, protocol infrastructure
– Natural faults of network components
General Resilience Aims
• Provide acceptable services to applications
– Ensure information is accessible
– Maintain end-to-end communication when possible
– Operation of distributed processing and networked
storage
• Resilient services must remain accessible
– Can degrade gracefully when necessary, but
ensure correctness
– Recover rapidly and automatically when challenges
dissipate
Role of Programmable Networks
• Challenges to normal operation will rapidly
change over time and space
• Prescribed solutions cannot be deployed
• Therefore, resilient networks must:
– Operate in real-time
– Be autonomic
– Be context-aware and “intelligent”
– Be dynamically extensible
• Programmable networking technologies are key
to enabling these facilities
Programmable Networking Facilities
• Dynamic extensibility and self-organisation
– Programmability allows the network to respond dynamically to challenges by altering its behaviour
– But this needs to be controlled in order to avoid misuse and potential harm (e.g., stealthy interfaces)
– A service that determines suitable locations to deploy services is required
• Traffic and network environment awareness
– Packet inspection at line speed
– Network information collection
• Cross-layer awareness and interaction
– Avoid waste of resources and enhance coordination
– How to achieve this, and its possible consequences, need further study
Related Work
• Knowledge Plane (“KP”, David Clark et al. MIT)
– Part of the KP's purpose is to detect faults and intrusions and mitigate their ill-effects
– It proposes to add a new plane into the Internet architecture
– The supporting technology is cognitive AI
– The purpose of KP covers a very broad range
– Cognitive AI is still in its initial stage of development
– No concrete mechanisms for resilience maintenance yet
• Autonomic Communications
– Efforts largely focused on self-configuring, self-managing, and
self-healing networked server systems
– Initiatives now focus on making communication systems autonomic
• Learn network context and automatically adapt
Related Work (Cont’d)
• COPS (Checking, Observing and Protecting
Services) (Randy Katz, UCB)
– Proposes to protect the network using iBoxes on the network edge
– Proposes an annotation layer between the IP and transport layers to carry information along with the traffic
• Other similar/related efforts
– Disruption Tolerant Networking (DTN)
• Meant to provide stable end-to-end paths for applications when network connectivity faces challenges
– Survivability
• Enable the system to fulfil its mission even in the presence of
attacks or failures (CMU)
– Resilience covers a broader range, including protection against unusual traffic load (e.g., flash crowds)
Resilience Networking Scenario
• Demonstrate the applicability of programmable
networks
• Flash Crowd Event
– Although flash crowd requests are legitimate, the damage they cause is as bad as that of malicious attacks
• Two activities investigated:
– Detecting ill-effects of a flash crowd on Web servers
– Remediation of a flash crowd event
Network Model
We take the role of the network provider (i.e. the ISP) to detect and mitigate ill-effects on the web server network (which subscribes to this service), and to protect resources and cross traffic in the ISP's own network
[Figure: ISP network model. Edge routers (ingress, egress, border) E1, E2, E3, E4, …, Ei and core routers R1, R2, R3, R4, …, Ri in the ISP core network; attached networks include LAN networks, a dial network and the web servers network.]
Ill-Effects Detection
• Detection basis:
– An increase in request rate in association with a decrease or levelling-off in response rate
• Detection location:
– The edge router that connects the web server
network to the ISP network
• Algorithm overview:
– compare actual observed response rate with the
expected one
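The detection basis above can be sketched as a per-interval check at the edge router. This is only an illustration under stated assumptions, not the authors' algorithm: the function name, the `tolerance` parameter and its default value are hypothetical and do not appear in the slides.

```python
def ill_effect_suspected(prev_requests, cur_requests,
                         prev_responses, cur_responses,
                         tolerance=0.05):
    """Return True when requests rose while responses fell or levelled off.

    All arguments are counts observed in two consecutive detection
    intervals. `tolerance` (hypothetical) is the relative change below
    which the response rate is treated as having levelled off.
    """
    if prev_requests <= 0 or prev_responses <= 0:
        return False  # no baseline to compare against
    request_growth = (cur_requests - prev_requests) / prev_requests
    response_growth = (cur_responses - prev_responses) / prev_responses
    return request_growth > tolerance and response_growth <= tolerance
```

For example, a jump from 100 to 300 requests with responses moving only from 100 to 102 is flagged, whereas requests and responses growing together are not.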
Ill-Effect Detection (Cont’d)
Mechanism based on the formulae:
where the sizes of response objects are estimated from the size distribution obtained by sampling the Content-Length field in the HTTP headers of the response traffic
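One way the size estimate could feed the comparison of actual and expected response volume is sketched below. The formulae themselves are not reproduced in this transcript, so this is an assumed simplification: it uses the mean of the sampled Content-Length values as the per-request object size, and the function names are hypothetical.

```python
from statistics import mean


def expected_response_bytes(num_requests, sampled_content_lengths):
    """Estimate the bytes the server should send for `num_requests` requests.

    Assumption: each request is answered by one object whose size follows
    the distribution of the sampled Content-Length values, so the sample
    mean is used as the per-request size.
    """
    if not sampled_content_lengths:
        raise ValueError("need at least one Content-Length sample")
    return num_requests * mean(sampled_content_lengths)


def response_ratio(actual_bytes, num_requests, sampled_content_lengths):
    """Ratio of the actual response volume over the expected one."""
    expected = expected_response_bytes(num_requests, sampled_content_lengths)
    if expected == 0:
        raise ValueError("expected volume is zero")
    return actual_bytes / expected
```

A ratio well below 1 in a detection interval would indicate that the server is returning less than the observed requests should produce.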
Simulation Setup
• Based on ns-2
• Topology
• α chosen to be 0.2
• Detection interval t set to be 30s
[Figure: simulation topology. Nodes: 20 clients, ingress edge router, egress edge router, web server. Link labels: LAN 10Mb/s, WAN 50Mb/s, LAN 15Mb/s.]
Simulation Setup (Cont’d)
Parameters set up as follows
Simulation Results
Flash crowd traffic simulation
Flash crowd starts at 500s
We use access link congestion to
simulate the server-side behavior
Simulation Results (Cont’d)
Detection results
Ratio of the actual response volume over the expected one
Simulation Results
Statistical distribution of ratio samples of background traffic:
N(1.10817, 0.2274772), where the second parameter is the standard deviation
The 95% confidence range of this distribution (mean ± 1.96 standard deviations) is [0.662315, 1.554025]
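The stated range can be reproduced from the two distribution parameters, which confirms that the second one is a standard deviation. The sketch below also shows how a ratio sample might be checked against the range; the function names are hypothetical, and using the range as a detection threshold is an assumption rather than something the slides state.

```python
MEAN = 1.10817       # mean of the background-traffic ratio samples (from the slide)
STD_DEV = 0.2274772  # second parameter of N(., .), read as a standard deviation
Z_95 = 1.96          # two-sided 95% quantile of the standard normal distribution


def confidence_range(mean, std_dev, z=Z_95):
    """Return (low, high) = mean -/+ z standard deviations."""
    return (mean - z * std_dev, mean + z * std_dev)


def outside_range(ratio, bounds):
    """True when a ratio sample falls outside the given (low, high) range."""
    low, high = bounds
    return ratio < low or ratio > high


low, high = confidence_range(MEAN, STD_DEV)
print(f"[{low:.6f}, {high:.6f}]")  # prints [0.662315, 1.554025]
```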
Remediation
• Drop excessive requests at the ingress edges of the network
– Pushback-like mechanism
• Opportunistic multi-path routing of large response traffic that tolerates packet reordering, to protect cross traffic from excessive QoS degradation
– Multiple routes database
– Path bandwidth information collection
– Split the response traffic in proportion to the available bandwidth of each path
• Must consider the possibility of having zero or just a few programmable routers in the core network
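The proportional split of response traffic can be illustrated as follows. This is a minimal sketch under assumptions: the function name, the path identifiers and the decision to give nothing to paths without spare bandwidth are hypothetical, and the slides do not specify how bandwidth information is collected.

```python
def split_by_available_bandwidth(total_bytes, available_bw):
    """Share `total_bytes` across paths in proportion to available bandwidth.

    `available_bw` maps a path identifier to its available bandwidth
    (any consistent unit). Paths with no spare bandwidth get nothing.
    """
    usable = {path: bw for path, bw in available_bw.items() if bw > 0}
    total_bw = sum(usable.values())
    if total_bw == 0:
        raise ValueError("no path has available bandwidth")
    shares = {path: 0.0 for path in available_bw}
    for path, bw in usable.items():
        shares[path] = total_bytes * bw / total_bw
    return shares
```

With paths offering 30, 10 and 0 units of available bandwidth, 1000 bytes of response traffic would be split 750 / 250 / 0.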
Scenario Conclusions
• Contributions
– Cross-layer coordination in detection
– Cross traffic protection in the network
• Future work
– Mitigation mechanism and experiments
– Design and improve a resilient network
infrastructure and architecture
Conclusions
• Resilient networks are crucial for the future
information society
• Programmable networking technology is
appropriate for building resilient networks
• Example flash crowd scenario demonstrates
the need for programmability, namely:
– cross-layer interaction
– dynamic extensibility
Thanks!
Questions?