
Cheetah-UTK collaboration
Tao Li & Malathi Veeraraghavan
University of Virginia
May 3, 2007

Outline

• Potential topics for collaboration
• Cheetah goals with implications on network design
• CDN applications
• Collaboration with HOPI
• Three-domain network: UTK, Cheetah, HOPI
  • Data plane
  • Control plane
Potential topics for UTK-Cheetah collaboration

• Applications
  • CDN
  • DVTS
• Control-plane
  • Internetworking with the CHEETAH control-plane solution
  • Virtualizer for Force10 switches: needed to enable HOPI to serve as a testbed for simultaneous networking experiments
Proposed Three-Domain Network (UTK, Cheetah, HOPI)

[Network diagram: HOPI Force10 switches at the Seattle, Chicago, NYC, LA, and Washington PoPs with attached PC3 hosts and 10GbE links; a Gloriad Force10; UTK SERF and UTK Humanities Force10 switches with UTK servers on NxGbE links; an ORNL Force10; CHEETAH SN16k switches at ORNL, SLR, and ATL linked by OC192 circuits, with CHEETAH PCs and the end hosts Zelda1/2/3, Zelda4/5, and Wukong/Wuneng (GbE); a GMPLS UNI connects the Force10 and SN16k domains.]
Cheetah goals

• Original goals:
  • Support eScience applications, primarily the Terascale Supernova Initiative (TSI)
    • Connect the scientists' cluster at NCSU to the ORNL Cray
  • The TSI project is now complete
• 2007 refocus:
  • Design and demonstrate the use of an internetworking architecture consisting of a core circuit-switched network, with control-plane support for dynamic call-by-call bandwidth sharing, interconnected with connectionless packet-switched regional/enterprise networks
  • Quantifiable benefits to applications
Motivation

• Bottom-up:
  • Circuit switches, being position-based, are cheaper than packet switches for higher-rate interfaces and larger switching capacities
  • Core network switches need higher link rates and switching capacities
  • Therefore, use circuit switches in the core
Motivation (contd.)

• Today's Internet and Internet2 do use circuit switches in the core
  • e.g., OC192 links between Abilene routers traverse a SONET circuit-switched network
  • However, these circuits are provisioned: leased lines held for long durations (years); PVCs in ATM lingo
• For a network of circuit switches to qualify as a "network" as opposed to a set of "wires," bandwidth sharing must be implemented:
  • Control-plane call-by-call sharing
  • SVCs in ATM lingo
Top-down motivation

• Are there any applications for a "core" circuit-switched network with SVC capability?
• Without these, it is a technology-driven solution without a problem
  • Technology: control-plane protocols
  • The data-plane aspect of circuit switches is already in use in the form of PVCs
Applications for circuit-switched networks with SVC capability

• eScience applications:
  • Focus is on providing high bandwidth per call
  • Circuit/virtual-circuit (VC) capability is required enterprise-to-enterprise, not just in the core network
• General file-transfer applications:
  • Need to focus on the mode of bandwidth sharing because of scale
    • Request and obtain dedicated bandwidth and use it temporarily
    • In contrast with TCP bandwidth sharing on connectionless packet-switched networks, where the amount of bandwidth a flow receives can vary within the flow's duration
  • Has value even if the circuit/VC covers just the core network
Internetworking architecture for connectionless packet-switched enterprise/regional networks and a core circuit-switched network with SVC capability

[Diagram: hosts on two connectionless packet-switched enterprise/regional networks (e.g., roadways networks) reach each other through gateways (e.g., airports) attached to a core circuit-switched network (e.g., airlines network); the sender calls ahead before sending a flow that needs its own circuit.]

Key point: gateways need to be connection-oriented packet switches
Analogy: an airline passenger calls ahead and makes a flight reservation before driving on the roadways network to reach the airport (gateway)
Explanation

• If gateways are IP routers operated in connectionless mode (meaning there is no "connection setup" phase prior to data transfer), there is no simple way to trigger SVC setup and teardown
  • Hence the PVC usage of the core circuit-switched network
  • IP-over-ATM efforts suggested automatic flow classification: if X packets of a flow are detected in Y sec, assume it is a long-lived flow and initiate SVC setup
  • Didn't work. Why?
    • Technology problem: a guessing game to identify long-lived flows
    • More importantly: no value
      • End-to-end bandwidth sharing is still TCP based
What is a "connection-oriented" gateway?

• Example: squid web proxy (caching) server
• When an http request arrives, think of it as a data-plane packet with an implicit signaling call-setup request
• If the web proxy's secondary NIC (into the circuit-switched network) is tied up in circuits to web proxy servers other than the one identified as the best parent for this request, the call is effectively rejected
• The request then falls back to the primary-NIC path and uses the connectionless IP path between squid servers
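
This call-acceptance logic is not spelled out in code in the slides; the following is a minimal C++ sketch under stated assumptions (all type and function names here are hypothetical, not from the actual modified squid code):

    #include <cstdio>
    #include <string>

    // Hypothetical state of the proxy's secondary (circuit-side) NIC.
    struct CircuitNic {
        bool busy = false;      // tied up in an existing circuit?
        std::string peer;       // parent proxy at the far end of that circuit
    };

    // Stand-ins for the control-plane and data-plane actions.
    bool setup_circuit(const std::string& parent) {
        std::printf("SVC setup to %s\n", parent.c_str());
        return true;            // illustrative success
    }
    void fetch_over_circuit(const std::string& url) {
        std::printf("fetching %s over circuit\n", url.c_str());
    }
    void fetch_over_ip(const std::string& url) {
        std::printf("fetching %s over connectionless IP\n", url.c_str());
    }

    // Treat an arriving http request as an implicit call-setup request:
    // accept the "call" only if the circuit NIC is free or already holds a
    // circuit to the best parent; otherwise the call is effectively
    // rejected and the request falls back to the primary-NIC IP path.
    void handle_request(const std::string& url, const std::string& best_parent,
                        CircuitNic& nic) {
        if (!nic.busy && setup_circuit(best_parent)) {
            nic.busy = true;
            nic.peer = best_parent;
            fetch_over_circuit(url);
        } else if (nic.busy && nic.peer == best_parent) {
            fetch_over_circuit(url);   // reuse the existing circuit
        } else {
            fetch_over_ip(url);        // fall back: NIC busy with another parent
        }
    }

    int main() {
        CircuitNic nic;
        handle_request("http://example.org/a", "parent-proxy-1", nic);
        handle_request("http://example.org/b", "parent-proxy-2", nic);  // falls back
        return 0;
    }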
Is there any value?

• Clear value to the service provider:
  • Cheaper circuit switches at higher rates imply cheaper core-network connectivity services
  • Pass some of the savings on to users
• Value to the user:
  • A different mode of bandwidth sharing
  • Allows the user to pay for and obtain differentiated service on a per-flow basis
  • The pricing model can capture temporal fairness
Implications of goals on network design

• Data-plane:
  • Circuit granularity should be moderate/small, e.g., OC1
• Control-plane:
  • Fast call setup: with a 1-sec call setup delay, the holding time should be at least n times this number, e.g., n = 10
  • Call-handling throughput should be high (see the arithmetic check below)
    • e.g., 160Gbps switching capacity at a per-call rate of 300Mbps gives 533 concurrent calls
    • If the call holding time is 10 sec, the switch controller should handle 53 calls/sec
    • Distributed call controller
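
The throughput numbers above follow from simple arithmetic (Little's law); a small C++ check using the slide's own values:

    #include <cstdio>

    int main() {
        const double switching_capacity_bps = 160e9;  // 160Gbps fabric
        const double per_call_rate_bps = 300e6;       // 300Mbps per call
        const double holding_time_s = 10.0;           // mean call holding time

        // Concurrent calls the fabric can carry at this per-call rate.
        double concurrent = switching_capacity_bps / per_call_rate_bps;  // ~533

        // Little's law: call arrival rate = concurrency / holding time.
        double calls_per_sec = concurrent / holding_time_s;              // ~53

        std::printf("concurrent calls: %.0f, required setup rate: %.0f calls/sec\n",
                    concurrent, calls_per_sec);
        return 0;
    }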
Control-plane implications of goals

• General-purpose file-transfer applications:
  • Moderate bandwidth per call
  • Short call holding times
  • The control-plane solution only needs to support the immediate-request mode of bandwidth sharing
• eScience applications:
  • High bandwidth per call
  • Long (hours) holding times
  • The control-plane solution should support the book-ahead mode of bandwidth sharing
Book-ahead (BA) vs. immediate-request (IR) modes of bandwidth sharing

• m is the link capacity expressed in channels; e.g., if 1Gbps circuits are assigned on a 10Gbps link, m = 10
• Bandwidth sharing modes:
  • Large m (moderate-rate circuits): immediate-request
  • Small m (high-rate circuits):
    • Short calls: immediate-request (bank-teller model: wait in a queue)
    • Long calls: book-ahead (doctor's-office model: make an appointment)
• Call blocking probability is low when m is high; IR mode works (see the Erlang-B sketch below)
• Mean waiting time is proportional to mean call holding time
  • Can afford to use IR mode when m is small if calls are short
  • Implement some form of call queueing
  • But if circuit rates are high and holding times are long, book-ahead mode must be supported
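
The dependence of blocking on m can be checked with the standard Erlang-B formula; a short C++ sketch (the offered loads and channel counts below are illustrative, not from the slides):

    #include <cstdio>

    // Erlang-B blocking probability for offered load a (erlangs) on m
    // channels, via the standard stable recurrence:
    // B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)).
    double erlang_b(double a, int m) {
        double b = 1.0;
        for (int k = 1; k <= m; ++k)
            b = a * b / (k + a * b);
        return b;
    }

    int main() {
        // Same 80% offered utilization at three circuit granularities:
        // the coarser the circuits (smaller m), the higher the blocking.
        std::printf("m=2,  a=1.6:  B = %.3f\n", erlang_b(1.6, 2));
        std::printf("m=10, a=8.0:  B = %.3f\n", erlang_b(8.0, 10));
        std::printf("m=40, a=32.0: B = %.3f\n", erlang_b(32.0, 40));
        return 0;
    }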
Examples representative of the two classes of applications

    Metric                        Video telephony (IR application)   Video conferencing (BA application)
    Call arrival rate             High (many endpoints)              Low (one or two endpoints per enterprise)
    Call holding time             Short (~3 mins)                    Long (~1 hr)
    Bandwidth required per call   Low                                Higher
    Mode of request               Immediate usage                    Can plan ahead and reserve bandwidth
Outline check

• Potential topics for collaboration
• Cheetah goals with implications on network design
• CDN applications
• Collaboration with HOPI
• Three-domain network
  • Data plane
  • Control plane
Open-source CDN software

• OpenCDN
• Globule CDN
• CoralCDN
• CoDeeN
OpenCDN

• OpenCDN is an application-level content delivery network (CDN), suitable for live and recorded multimedia content distribution
• It creates an application-level multicast-relay tree, because IP multicast routing is not widely enabled in routers
• Architecture:
  • Data-plane: a tree of streaming servers forming a chain from the origin server to the client; new clients join the tree at the closest server
  • Control-plane: Request Routing and Distribution Manager (RRDM)
    • Origin and relay nodes register their streaming-protocol capabilities
    • Relay nodes register the IP address space they are willing to serve
• Currently supported streaming servers:
  • Darwin Streaming Server by Apple
  • Helix Universal Server by Real
Architecture

• Portal: web site that shows metadata for the content stored on origin servers available for streaming
• Nodes = relay servers; they run streaming clients on one side and streaming servers on the other
Fit of OpenCDN with Cheetah

• Is Cheetah appropriate for carrying concatenated streams of data, with relay nodes located at the Cheetah PoPs?
  • Circuit granularity: OC1 is too high for one stream
    • Could set up one OC1 between two relay nodes and carry multiple streams across this OC1
  • Holding time: could be large for popular TV stations; if circuits are held "all the time," leased circuits are better
• Alternative solution:
  • Use file transfers to copy video files between origin nodes and CDN servers at Cheetah PoPs
  • Add streaming servers to the CDN servers for serving local clients
  • Keep the RRDM registration and the software for identifying the ideal relay node for a client (the relay becomes the CDN server)
Globule CDN

• Globule is a collaborative content delivery network in which content providers (origins) share each other's server resources (an enterprise uses another enterprise's servers as replicas)
  • In contrast to using a commercial CDN service
• Each origin server maintains some backup servers and some replica servers
• Clients are redirected to their closest replica servers using a redirector (http or DNS)
• Requires upgrading the Apache Web Server with a Globule module
Fit of Globule CDN with Cheetah

• Deploy replica servers at Cheetah PoPs
• Use Cheetah for copying files
  • Between replica servers, if the origin server is on an enterprise/regional network, after the initial copy from the origin server to the closest replica server
  • Between the origin server and replicas, if the origin server is itself located at a PoP
• Are the updates to replicas automatic?
• If pre-configured, what criteria are used to determine which replica servers to use?
  • The paper mentions that only partial copies are maintained at replicas
  • See the IEEE Comm. Mag. Aug. 2006 paper
Akamai model

[Figure: steps in looking up a URL when a CDN is used. Courtesy: Tanenbaum's Fourth Edition slides from Prentice Hall]

Content Delivery Networks

[Figure: (a) original Web page; (b) the same page after transformation. Courtesy: Tanenbaum's Fourth Edition slides from Prentice Hall]
DNS or http redirection

• HTTP supports a Location header, which can be included in the response with the URL of the CDN server to which the http request is redirected (see the sketch after this list)
• DNS redirection: it appears that the DNS server serving the origin server must be modified to provide the IP address of an appropriate CDN server based on the client's location
• Which is appropriate for our deployment?
  • See our goals for deploying applications
  • Understand the scalability of call-by-call dynamic bandwidth sharing
  • Need to sign on web servers for this application, unlike with squid, where we need to sign on web clients
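
The Location mechanism itself is standard HTTP; a minimal C++ sketch of building such a redirect (the CDN host name is hypothetical, not one from this network):

    #include <cstdio>
    #include <string>

    // Build an HTTP 302 response that redirects the client to a CDN server.
    // "cdn.example.net" is an illustrative host name.
    std::string redirect_to_cdn(const std::string& path) {
        return "HTTP/1.1 302 Found\r\n"
               "Location: http://cdn.example.net" + path + "\r\n"
               "Content-Length: 0\r\n"
               "\r\n";
    }

    int main() {
        std::printf("%s", redirect_to_cdn("/video/lecture4.mov").c_str());
        return 0;
    }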
Outline check

• Potential topics for collaboration
• Cheetah goals with implications on network design
• CDN applications
• Collaboration with HOPI
  • Applications
  • Control-plane
  • Virtualizer
• Three-domain network
  • Data plane
  • Control plane
Application testing

• Applications
  • Selected to showcase the advantages of high-speed dedicated virtual circuits (VCs) between PCs located at HOPI PoPs
  • Mostly file-transfer applications
• Examples:
  • Web proxy (caching) servers: allow users not directly connected to HOPI to nevertheless use it (use HOPI VCs for inter-proxy file transfers)
  • CDN and web mirroring: locate these servers at HOPI PoPs, and use VCs for file movement between CDN servers/mirrors
  • IPTV: move video files between IPTV servers located at PoPs that serve local audiences
  • Email servers: SMTP-to-SMTP server file transfers
  • Storage and disaster recovery

CDN: Content Delivery Network; SMTP: Simple Mail Transfer Protocol
Process for application testing

• End hosts on which to run applications:
  • Use existing "support" PCs at HOPI PoPs, or
  • Collocate UVa-provided PCs at HOPI PoPs
• Obtain virtual circuits from the HOPI TSC as required for the experiments, and run tests
• Goal of deploying applications:
  • Actively solicit and sign on users
  • Need to generate sufficient traffic to understand the bandwidth-sharing aspects of a circuit-switched network
  • Quite different from running bbcp on two servers in an experiment to obtain the throughput of one flow
Control-plane testing

• Cheetah Control-Plane Module (CCPM)
  • Implements distributed bandwidth management (a sketch follows below)
  • One CCPM per HOPI Force10 switch, to manage the bandwidth of all the interfaces on that particular switch
• Dynamic virtual-circuit service for calls with
  • high call arrival rates
  • short durations
  • moderate bandwidth
  • immediate-request type
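
The CCPM internals are not shown in this presentation; the following is a hedged C++ sketch of what per-interface bandwidth management could look like (all names and structures are assumptions):

    #include <cstdio>
    #include <map>
    #include <string>

    // Hypothetical per-interface bandwidth bookkeeping for a CCPM-like module.
    class BandwidthManager {
        std::map<std::string, double> free_mbps_;  // interface -> unallocated Mb/s
    public:
        void add_interface(const std::string& name, double capacity_mbps) {
            free_mbps_[name] = capacity_mbps;
        }
        // Admit a call only if the interface still has the requested bandwidth.
        bool admit(const std::string& iface, double mbps) {
            auto it = free_mbps_.find(iface);
            if (it == free_mbps_.end() || it->second < mbps) return false;
            it->second -= mbps;
            return true;
        }
        void release(const std::string& iface, double mbps) {
            free_mbps_[iface] += mbps;
        }
    };

    int main() {
        BandwidthManager ccpm;
        ccpm.add_interface("gi-0/1", 1000.0);    // a 1GbE port
        bool ok1 = ccpm.admit("gi-0/1", 300.0);  // accepted
        bool ok2 = ccpm.admit("gi-0/1", 800.0);  // blocked: only 700 Mb/s left
        std::printf("call 1: %s, call 2: %s\n",
                    ok1 ? "accepted" : "blocked", ok2 ? "accepted" : "blocked");
        return 0;
    }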
Virtualizing HOPI PCs and Force10 switches

• Virtualizing PCs:
  • Invite contributions of PCs from researchers, to be located at HOPI PoPs
  • Slice these PCs and offer usage to the whole community
• Virtual Force10 switch:
  • To support multiple control-plane and management-plane projects
Outline check

• Potential topics for collaboration
• Cheetah goals with implications on network design
• CDN applications
• Collaboration with HOPI
  • Applications
  • Control-plane
  • Virtualizer
• Three-domain network
  • Data plane
  • Control plane
Proposed HOPI-CHEETAH-UTK Three-Domain Network: Data plane

[The three-domain network data-plane diagram shown earlier is repeated here.]
Cheetah control-plane solution

• Developed a software program called circuit-requestor for CHEETAH end hosts
• Use the built-in GMPLS control-plane software in the Sycamore switch controllers
• Current solution:
  • Supports only port-mapped GbE-STS3-7v-GbE circuits
• Planned upgrades:
  • Support VLANs mapped to sub-Gbps SONET circuits
CHEETAH architecture

[Diagram: two end hosts, each running CHEETAH software (application, DNS client, RSVP-TE module) over TCP/IP and C-TCP/IP stacks, with NIC 1 attached to the Internet and NIC 2 attached through a circuit gateway to the SONET circuit-switched network.]

• Based on the RSVP-TE code from KOM/DRAGON
  • About 40K lines of C++ code
• What we changed:
  • Modified the code to inter-operate with the Sycamore SN16000
  • Added admission control, session management, a user interface, etc.
  • Integrated code for DNS lookup from our partner CUNY
  • Designed and implemented APIs for general applications
  • About 4K lines of new code
CHEETAH end-host software (includes circuit-requestor + daemons)

[Diagram: in user space on the end host, the circuit-requestor and applications reach the CHEETAH Daemon (CD) over sockets through the CD API; the CD drives the RSVP-TE Daemon (RSVPD) through the RSVPD API, and the RSVPD exchanges RSVP-TE messages with the network; a DNS client performs the DNS lookup against the DNS server; C-TCP sits in kernel space behind the C-TCP API.]

Steps:
• DNS lookup (to support our scalability goal)
• Circuit setup signaling procedure (RSVP-TE)
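
The CD API itself is not specified in these slides; the following is a hypothetical C++ sketch of the call pattern an application might use against such a daemon (every name here is an assumption):

    #include <cstdio>
    #include <string>

    // Hypothetical handle returned by the CHEETAH Daemon for a circuit session.
    struct CircuitHandle { int session_id; bool ok; };

    // Stand-in for the CD API: in a real client this would be a socket
    // exchange with the CD, which performs the DNS lookup and drives the
    // RSVPD to signal the circuit.
    CircuitHandle cd_request_circuit(const std::string& remote_host,
                                     double bandwidth_mbps, int holding_time_s) {
        std::printf("requesting %.0f Mb/s to %s for %d s\n",
                    bandwidth_mbps, remote_host.c_str(), holding_time_s);
        return CircuitHandle{42, true};   // illustrative success
    }

    void cd_release_circuit(const CircuitHandle& h) {
        std::printf("releasing session %d\n", h.session_id);
    }

    int main() {
        CircuitHandle h = cd_request_circuit("far-end-host.example.org", 1000.0, 600);
        if (h.ok) {
            // ... transfer data over the second NIC using C-TCP ...
            cd_release_circuit(h);
        }
        return 0;
    }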
Cheetah control-plane solution

• Usage:
  • User logs in to a CHEETAH end host
• Option 1:
  • Uses the circuit-requestor program to request the setup of a dedicated 1Gb/s Ethernet-SONET-Ethernet circuit to another CHEETAH end host
  • Runs an application, such as a file transfer
  • Uses the circuit-requestor program to release the circuit
• Option 2:
  • User starts a file-transfer application and, unbeknownst to the user, the software decides whether it is appropriate to use a circuit and, if so, sets it up, transfers the file, and releases the circuit
Circuit-requestor usage

• To set up a new circuit:
  circuit-requestor setup domain-name-of-called-host bandwidth [holding-time]
  • Default holding time: 10 mins
  • Max holding time: 1 hour
  • Limit call holding time for fair bandwidth sharing
• To renew an existing circuit:
  circuit-requestor renew session-id [new-holding-time]
  • Release unused circuits if there is no renewal
• To release an existing circuit:
  circuit-requestor release session-id
• To check the status of the CHEETAH trunk:
  circuit-requestor status
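
A hypothetical session putting these commands together (the host name, bandwidth value, and session id are illustrative; the slides do not specify the units of the bandwidth argument):

    circuit-requestor setup zelda1.cheetah.example 1000 30
    circuit-requestor status
    circuit-requestor renew 7 15
    circuit-requestor release 7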
Option 2

• Modified the squid software so that the squid server automatically initiates circuit setup when it receives an http request that requires it to obtain the file from another squid server (parent) located at a CHEETAH/HOPI PoP
• Release the circuit if fewer than 10 packets are seen on the secondary NIC within 60 sec (ICP packets do appear between squid servers)
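
A minimal C++ sketch of this idle-release rule (the counters and helpers are assumptions; the actual squid modification is not shown in the slides):

    #include <cstdio>

    void release_circuit();                  // assumed control-plane call

    // Hypothetical idle-detection state for the secondary (circuit) NIC.
    struct IdleMonitor {
        int packets_in_window = 0;           // packets seen in the current window
        static const int kThreshold = 10;    // fewer than this => circuit idle
    };

    void on_packet(IdleMonitor& m) { ++m.packets_in_window; }

    // Invoked once per 60-second timer tick. ICP packets between squid
    // servers count toward the window, so the threshold of 10 keeps an
    // otherwise-idle circuit from being held open by those alone.
    void on_timer(IdleMonitor& m, bool& circuit_up) {
        if (circuit_up && m.packets_in_window < IdleMonitor::kThreshold) {
            release_circuit();
            circuit_up = false;
        }
        m.packets_in_window = 0;             // start a new window
    }

    void release_circuit() { std::printf("releasing idle circuit\n"); }

    int main() {
        IdleMonitor m;
        bool up = true;
        on_packet(m);        // only one packet in this 60-sec window...
        on_timer(m, up);     // ...so the circuit is released
        return 0;
    }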
Measurements

Signaling delays incurred in setting up a circuit between zelda1 and wuneng across the CHEETAH network:

    Circuit type   End-to-end circuit   Processing delay for Path    Processing delay for Resv
                   setup delay (s)      message(s) at sn16k-nc (s)   message(s) at sn16k-nc (s)
    OC-1           0.166103             0.091119                     0.008689
    OC-3           0.165450             0.090852                     0.008650
    1Gb/s EoS      1.645673             1.566932                     0.008697

Round-trip signaling message propagation plus emission delay between sn16k-atl and sn16k-nc: 0.025 s
Why the initial DNS lookup?

• To verify that the called end host is on the CHEETAH network and to obtain the MAC address of the second (CHEETAH) NIC
• Why is the MAC address necessary?
  • Need to program the ARP table to avoid wide-area ARP lookups
IP and MAC addressing issues

• Our completed Control-Plane Network Design document describes why we chose static public IP addresses for the second (CHEETAH) NICs at end hosts
  • Choose addresses based on the CHEETAH host's location, i.e., allocate addresses from the enterprise's public IP address space (e.g., UVA hosts: 128.143.x.x)
  • Reason: scalability
• Impact:
  • After the dedicated circuit is set up, the far-end NIC has an IP address from a different subnet
  • The default IP routing table entries will indicate that such an address is reachable only through the default gateway
• Our solution: automatically update the IP routing and ARP tables (see the sketch after this list)
  • Update the IP routing and ARP tables at both end hosts as the last step of circuit setup
    • Analogous to switch-fabric configuration
  • Routing table update: add an entry indicating that the remote host is directly reachable through the second (CHEETAH) NIC
  • ARP table update: add an entry for the MAC address of the remote CHEETAH NIC
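
On a Linux end host this last step could be done by shelling out to the classic route/arp tools; a hedged C++ sketch (the actual CHEETAH implementation is not shown here, and the interface name and addresses are illustrative):

    #include <cstdlib>
    #include <string>

    // After circuit setup, point the remote host's address at the second
    // (CHEETAH) NIC and pre-populate its ARP entry, so no wide-area ARP
    // lookup is ever needed on the circuit path.
    void install_circuit_routes(const std::string& remote_ip,
                                const std::string& remote_mac,
                                const std::string& nic) {
        std::string route_cmd = "route add -host " + remote_ip + " dev " + nic;
        std::string arp_cmd   = "arp -s " + remote_ip + " " + remote_mac;
        std::system(route_cmd.c_str());
        std::system(arp_cmd.c_str());
    }

    int main() {
        // Illustrative values only.
        install_circuit_routes("128.143.70.5", "00:11:22:33:44:55", "eth1");
        return 0;
    }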
Addressing in the CHEETAH network

• Can private and/or dynamic IP addresses be assigned to the data-plane and control-plane interfaces in GMPLS networks?
• Data-plane addresses:
  • Static
    • Need to be "called" by other clients
  • Public
    • Globally unique
    • Scalable to multiple autonomous systems
    • Private IP addresses would suffice if the goal for CHEETAH were only to create a small eScience network
• Control-plane addresses:
  • Static
    • Configured in the Traffic-Engineered (TE) link
  • Public
    • Same scalability reason
Control-plane security

• IPsec based
• Use Juniper NS-5 devices on the SN16000 control ports
• Use Openswan IPsec software on the Linux end hosts
• IPsec tunnels are created between the primary NICs of hosts and the NS-5 devices of switches
• IPsec tunnels are created between switch NS-5 devices for switch-to-switch control-plane exchanges
CCPM for HOPI Force10

• Apply the same approach as in the CHEETAH network
• Since the Force10 does not have a built-in GMPLS control-plane, we implemented the CCPM
• Run one CCPM per host