The Edge of Smartness - University of Calgary

The Edge of Smartness
Carey Williamson
Department of Computer Science
University of Calgary
Email: [email protected]
Copyright © 2005 Department of Computer Science
Main Message
• Now, more than ever, we need “smart edge”
devices to enhance the performance,
functionality, and efficiency of the Internet
[Figure: two end-host protocol stacks (Application, Transport, Network, Data Link, Physical) connected through the core network]
The End-to-End Principle
• Central design tenet of the Internet (simple core)
• Represented in design of TCP/IP protocol stack
• Wikipedia: Whenever possible, communication
protocol operations should be defined to occur
at the end-points of a communications system
• Some good reading:
– J. Saltzer, D. Reed, and D. Clark, “End-to-End
Arguments in System Design”, ACM ToCS, 1984
– M. Blumenthal and D. Clark, “Rethinking the Design
of the Internet: The end to end arguments vs. the
brave new world”, ACM ToIT, 2001
The End-to-End Principle: Revisited
• Claim: The ongoing evolution of the Internet is
blurring our notion of what an end system is
• This is true for both client side and server side
– Client: mobile phones, proxies, middleboxes, WLAN
– Server: P2P, cloud, data centers, CDNs, Hadoop
• When something breaks in the Internet protocol
stack, we have to find a suitable retrofit to make
it work properly
• We have done this repeatedly for decades,
and will likely keep doing it again and again!
(Selected) Existing Examples
• Mobility: Mobile IP, MoM, Home/Foreign Agents
• Small devices: mobile portals, content transcoding
• Web traffic volume: proxy caching, CDNs
• Wireless: I-TCP, Proxy TCP, Snoop TCP, cross-layer
• IP address space: Network Address Translation (NAT)
• Multi-homing: smart devices, cognitive networks, SIP
• Big data: P2P file sharing, BT, download managers
• P2P file sharing: traffic classification, traffic shapers
• Security concerns: firewalls, intrusion/anomaly detection
• Intermittent connectivity: delay-tolerant networks (DTN)
• Deep space: inter-planetary IP
The Smart Edge
• Similar “tweaks” will be needed at server side
• Putting new functionality in a “smart edge”
device seems like a logical choice, for reasons
of performance, functionality, efficiency, security
• What is meant by “smart”?
– Interconnected: one or more networks; define basic
information units; awareness of location/context
– Instrumented: suitably represent user activities;
location, time, identity, and activity; perf metrics
– Intelligent: provisioning, management, adaptation;
appropriate decision-making in real-time
Example 1:
Redundant Traffic Elimination
Basic Principles of RTE
• If you can “remember” what you have
sent before, then you don’t have to
send another copy
• Redundant Traffic Elimination (RTE)
• Done using a dictionary of chunks and
their associated fingerprints
• Examples:
– Joke telling by certain CS professors
– Data deduplication in storage systems (90% savings)
– “WAN Optimization” in networks (20% savings)
Redundant Traffic Elimination (RTE)
• Purpose: Use bottleneck link more efficiently
• Basic idea: Use a cache of data chunks to avoid
transmitting identical chunks more than once
• RTE process:
– Divide IP packet into chunks
– Select a subset of chunks
– Store a cache of chunks at two ends
of a network link or path
– Transfer only chunks that are not cached
• Works within and across files
• Combines caching and chunking
[Figure: a packet divided into chunks (Chunk A, Chunk B, Chunk C) with "Distance" and "Overlap" annotations, and a chunk cache mapping each fingerprint to its chunk, where FP A = fingerprint (Chunk A)]
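The RTE process above can be sketched in a few lines of Python. This is a minimal illustration, not the scheme from the talk: it assumes fixed-size chunks, truncated SHA-1 fingerprints, and an unbounded cache, and it fingerprints every chunk rather than selecting a subset. The names CHUNK_SIZE, encode, decode, and encoded_size are hypothetical.

```python
import hashlib

CHUNK_SIZE = 64  # assumed fixed chunk size in bytes (hypothetical value)


def fingerprint(chunk):
    """Return a short fingerprint of a chunk (truncated SHA-1, an assumption)."""
    return hashlib.sha1(chunk).digest()[:8]


def split_chunks(payload, size=CHUNK_SIZE):
    """Divide a packet payload into non-overlapping fixed-size chunks."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def encode(payload, cache):
    """Sender side: replace chunks already in the cache by their fingerprints."""
    tokens = []
    for chunk in split_chunks(payload):
        fp = fingerprint(chunk)
        if fp in cache:
            tokens.append(("ref", fp))      # receiver already holds this chunk
        else:
            cache[fp] = chunk               # remember it for future packets
            tokens.append(("raw", chunk))
    return tokens


def decode(tokens, cache):
    """Receiver side: rebuild the payload, mirroring the sender's cache updates."""
    parts = []
    for kind, value in tokens:
        if kind == "ref":
            parts.append(cache[value])
        else:
            cache[fingerprint(value)] = value
            parts.append(value)
    return b"".join(parts)


def encoded_size(tokens):
    """Bytes on the wire, ignoring per-token framing overhead."""
    return sum(len(value) for _, value in tokens)


# Usage: the second copy of a payload travels as fingerprints only.
sender_cache, receiver_cache = {}, {}
packet = b"A" * 64 + b"B" * 64
first = encode(packet, sender_cache)
assert decode(first, receiver_cache) == packet
second = encode(packet, sender_cache)
assert decode(second, receiver_cache) == packet
```

Both ends must apply the same cache updates in the same order, which is why decode inserts every raw chunk it receives.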
RTE Process Pipeline
• Improve traditional RTE
• Exploit traffic non-uniformities:
– Packet size (bypass technique)
– Chunk popularity (new cache management scheme)
– Content type (content-aware RTE)
• Up to 50% more detected redundancy
[Figure: "Current" and "Proposed" RTE pipelines between two NICs. Stage labels: Packet, Chunking (no overlap), Next chunk, Fingerprinting, Chunk expansion, FIFO cache management, non-FIFO cache management, Forwarding. Decision labels (Yes/No): Large enough?, Overlap OK?, Content promising?]
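Two of the ideas above, bypassing small packets and managing the cache by chunk popularity, can be illustrated as follows. The slides do not specify either policy, so this is only one plausible reading: the threshold MIN_PACKET_SIZE, the function should_bypass, and the least-referenced eviction rule in PopularityCache are all assumptions.

```python
MIN_PACKET_SIZE = 128  # hypothetical threshold in bytes


def should_bypass(packet):
    """Skip RTE processing for packets too small to be worth chunking."""
    return len(packet) < MIN_PACKET_SIZE


class PopularityCache:
    """Chunk cache that evicts the least-referenced chunk instead of the oldest.

    This eviction rule is an assumed example of a non-FIFO policy.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.chunks = {}  # fingerprint -> chunk
        self.hits = {}    # fingerprint -> number of cache hits

    def lookup(self, fp):
        """Return the cached chunk (counting the hit) or None on a miss."""
        if fp in self.chunks:
            self.hits[fp] += 1
            return self.chunks[fp]
        return None

    def insert(self, fp, chunk):
        """Add a chunk, evicting the least-hit entry when the cache is full."""
        if fp in self.chunks:
            return
        if len(self.chunks) >= self.capacity:
            # Ties go to the earliest-inserted entry (dict insertion order).
            victim = min(self.hits, key=self.hits.get)
            del self.chunks[victim]
            del self.hits[victim]
        self.chunks[fp] = chunk
        self.hits[fp] = 0
```

With a capacity of two, inserting chunks a and b, hitting a once, and then inserting c evicts b; a FIFO cache would have evicted a, the chunk that had just proved popular.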
Main Sources of Redundancy
Type     Value   Description                  Example
Nulls    57.1%   Consecutive null bytes       0x00000000
Text     16.7%   Plain text (English)         Gnutella
HTTP      7.3%   HTTP directives              Content-Type:
Mixed     6.2%   Plain text and other chars   14pt font
Binary    5.8%   Random characters            0x27c46128
HTML      3.7%   HTML code fragments          <HTML> <p>
Char+1    3.2%   Repeated text chars          AAAAAAAz
RTE Summary
• Improves traditional RTE savings by up to 50%
• Techniques can be used individually or together
• RTE very beneficial for wireless traffic
– 30% of users have 10-50% redundant traffic
• Proposed a novel content-aware RTE
– Improve RTE savings by up to 38%
• Challenges of content-aware RTE
– Needs refinement to be able to work on real traces, or
exploit an appropriate traffic classification scheme
– Needs improvement in execution time
Example 2:
The TCP Incast Problem
Motivation
• Emerging IT paradigms
– Data centers, grid computing, HPC, multi-core
– Cluster-based storage systems, SAN, NAS
– Large-scale data management “in the cloud”
– Data manipulation via “services-oriented computing”
• Cost and efficiency advantages from IT trends,
economy of scale, specialization marketplace
• Performance advantages from parallelism
– Partition/aggregation, Hadoop, multi-core, etc.
– Think RAID at Internet scale! (1000x)
Problem Formulation
How to provide high goodput for data center applications?
• High-speed, low-latency network (RTT ≤ 0.1 ms)
• Highly-multiplexed link (e.g., 1000 flows)
• Highly-synchronized flows on bottleneck link
• Limited switch buffer size (e.g., 100 packets)
[Figure labels: TCP retransmission timeouts, TCP throughput degradation]
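A back-of-the-envelope calculation shows why these conditions hurt. The sketch below uses the example figures from the list (1000 flows, a 100-packet buffer, 0.1 ms RTT) plus two assumptions that are not in the slides: each flow sends a single packet in the same instant, and the sender's minimum retransmission timeout is 200 ms. The function names are hypothetical.

```python
def synchronized_burst(flows, buffer_pkts, pkts_per_flow=1):
    """Return (offered, dropped) packets when all flows send at the same instant."""
    offered = flows * pkts_per_flow
    dropped = max(0, offered - buffer_pkts)
    return offered, dropped


def rtts_lost_per_timeout(rto_us, rtt_us):
    """Number of round trips a flow sits idle while waiting for one timeout."""
    return rto_us // rtt_us


# Example figures from the list above; the 200 ms RTO is an assumption.
offered, dropped = synchronized_burst(flows=1000, buffer_pkts=100)
idle = rtts_lost_per_timeout(rto_us=200_000, rtt_us=100)
```

Under these assumptions, 900 of the 1000 packets overflow the buffer, and each flow that must wait for a timeout stays idle for 2000 round-trip times, which is how a short burst of loss turns into a collapse in goodput.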
Summary: TCP Incast Problem
• Data centers have specific network characteristics
• TCP-incast throughput collapse problem emerges
• Solutions:
– Tweak TCP parameters for this environment
– Redesign TCP for this environment
– Rewrite applications for this environment (Facebook)
– Smart edge coordination for uploads/downloads
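The last option can be illustrated with a small scheduling sketch. The slides do not say how a smart edge would coordinate transfers, so this is only one possible reading: an edge device releases requests in batches sized so that the simultaneous responses fit in the switch buffer. The function batch_requests and the parameter window_pkts are hypothetical.

```python
def batch_requests(servers, buffer_pkts, window_pkts):
    """Group servers into batches whose combined burst fits the switch buffer.

    Assumes each server answers with at most window_pkts packets at once.
    """
    max_concurrent = max(1, buffer_pkts // window_pkts)
    return [servers[i:i + max_concurrent]
            for i in range(0, len(servers), max_concurrent)]


# Usage: 1000 servers, a 100-packet buffer, and 2-packet bursts per server.
batches = batch_requests(list(range(1000)), buffer_pkts=100, window_pkts=2)
```

Here at most 50 servers respond at a time, so the 1000 requests are served in 20 rounds. The trade-off is added latency from serializing the batches in exchange for avoiding synchronized buffer overflow.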
Concluding Remarks
• We need “smart edge” devices to enhance the
performance, functionality, security, and
efficiency of the Internet (now more than ever!)
[Figure: two end-host protocol stacks (Application, Transport, Network, Data Link, Physical) connected through the core network]
Future Outlook and Opportunities
• Traffic classification
• QoS management
• Load balancing
• Security and privacy
• Cloud computing
• Virtualization everywhere
• Multipath TCP congestion control
• …