
Energy in Networks & Data Center Networks
Yanjun Yao
Department of EECS
University of Tennessee, Knoxville

Network Architecture
[Figure: a typical network architecture: end hosts attach to LAN switches, switches connect to routers, and routers connect to the Internet]

A Feasibility Study for Power Management in LAN Switches
Maruti Gupta, Satyajit Grover and Suresh Singh
Computer Science Department
Portland State University

Motivation and Goals
- Motivation:
  - Few dynamic power management schemes exist for Internet devices
- Goal:
  - A power management scheme for LAN switches
- Why switches?
  - Switches comprise the bulk of network devices in a LAN
  - They consume the largest percentage of energy among Internet devices

  Device          Approximate Number Deployed    Total AEC (TW-h)
  Hubs            93.5 million                   1.6
  LAN Switches    95,000                         3.2
  WAN Switches    50,000                         0.15
  Routers         3,257                          1.1

Related Work
- Estimating power consumption in switch fabrics:
  - Statistical traffic models [Wassal et al. 2001]
  - Various analytical models [G. Essakimuthu et al. 2002; D. Langen et al. 2000; C. Patel et al. 1997; Hang et al. 2002; Ye et al. 2002]
- Power management schemes for interconnection network fabrics:
  - DVS on links [Li et al. 2003]
  - On/off links [L. Peh et al. 2003]
  - Router power throttling [Li et al. 2003]

Feasibility
- What to do?
  - Put LAN switch components, interfaces, or entire switches to sleep
- Are there enough idle periods to justify sleeping?

[Figure: activity over a two-hour trace for an individual switch interface and for the switch as a whole, each showing distinct high- and low-activity periods; about 60% of the time the inter-activity time is greater than 20 seconds (x-axis: inter-activity time in seconds; y-axis: percentage of 2 hours)]

Models for Sleeping
- Basic sleep components:
  - There is no existing sleep model for switches
  - Each port has a line card
  - Each line card has a network processor, an ingress buffer, and an egress buffer
- The sleep model for a line card is obtained from the sleep models of its constituent parts
- Develop the sleep model based on the functionality of the line card

Models for Sleeping
- Interface state is preserved in all sleep modes
- HABS (Hardware Assisted Buffered Sleep):
  - An incoming packet wakes up the interface and is buffered
  - Powered on: input buffer and input circuits for receiving
- HAS (Hardware Assisted Sleep):
  - An incoming packet wakes up the switch interface but is lost
  - Powered on: receiver circuits only
- Simple Sleep:
  - A sleep timer is set
  - The interface wakes up only when the timer expires
- Assumption:
  - Transitioning from a deeper sleep to a lighter sleep takes time and results in a spike in energy consumption

A minimal state-machine sketch of the three modes follows below.

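The sketch is an illustration, not the paper's code: it encodes each sleep mode's wake trigger and whether the triggering packet survives, following the descriptions above.

```python
from enum import Enum, auto

class SleepState(Enum):
    WAKE = auto()    # fully powered, forwarding packets
    HABS = auto()    # input buffer + receive circuits powered
    HAS = auto()     # receive circuits powered only
    SIMPLE = auto()  # everything off; only the timer can wake the interface

def on_packet_arrival(state: SleepState) -> tuple[SleepState, bool]:
    """Return (next_state, packet_survives) when a packet hits the interface."""
    if state in (SleepState.WAKE, SleepState.HABS):
        return SleepState.WAKE, True    # HABS wakes the interface and buffers the packet
    if state is SleepState.HAS:
        return SleepState.WAKE, False   # HAS wakes the interface but the packet is lost
    return SleepState.SIMPLE, False     # Simple Sleep ignores incoming packets entirely

def on_timer_expiry(state: SleepState) -> SleepState:
    """Simple Sleep can only be left via its sleep timer."""
    return SleepState.WAKE if state is SleepState.SIMPLE else state
```
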
Implication of Sleeping
- Simple Sleep:
  - All incoming packets are lost
  - Poor throughput; the energy saving is offset by retransmissions
  - To use this state, we need:
    - For an interface connected to an end host: ACPI (Advanced Configuration and Power Interface) to inform the switch that the host is going to sleep
    - For an interface connecting switches: a guarantee that no packets will be sent to a sleeping interface
- HAS:
  - The packet that wakes up the interface is lost
  - To use it, we need:
    - A dummy packet sent ahead of the packets destined for the sleeping interface

Implication of Sleeping
- HABS:
  - Lower energy saving, since more circuitry stays powered
- Further simplifying the model:
  - Simple Sleep is used for:
    - Switch interfaces connected to end hosts with extended ACPI
  - HABS is used for:
    - Switch-to-switch interfaces
    - Switch-to-router interfaces
    - Switch interfaces connected to hosts without extended ACPI

Algorithms for Sleeping
- Questions:
  - When can an interface go to sleep?
  - How long should the sleep interval t_s be?
  - How long is the wake interval t_I between consecutive sleeps?
- Wake and Simple Sleep:
  - The switch interface sleeps when the end host goes to sleep
  - It wakes up periodically to check whether the host has woken up:
    - Once awake, the end host sends packets to the switch interface with period at most t_I
  - The interface remains awake while the end host is awake, until the host sleeps again

Algorithms for Sleeping
- Wake and HABS:
  - Make the decision after processing the last packet in the buffer:
    - If (x - δ)·e_s + e_w ≤ x·e_I, then sleep for time t_s = x - δ
    - Otherwise, stay awake
    - Here x is the time until the next packet arrives, δ the wake-up transition time, e_s the power drawn while asleep, e_w the wake-up energy, and e_I the power drawn while idle
- Two simple practical algorithms:
  - Estimated algorithm (see the sketch below):
    - Use an estimator x̂ for x; sleep if (x̂ - δ)·e_s + e_w ≤ x̂·e_I, where x̂ = α·x̂_(t-1) + (1 - α)·x_t
    - Sleep until woken up by an incoming packet
  - Estimated and Periodic algorithm:
    - For periodic traffic
    - Get the time y to the next periodic packet and determine x̂
    - The interface sleeps if (min(x̂, y) - δ)·e_s + e_w ≤ min(x̂, y)·e_I

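A minimal sketch of the Estimated algorithm's decision rule, assuming the symbol meanings given above; this is an illustration with placeholder numbers, not the authors' implementation.

```python
def update_estimate(x_hat_prev: float, x_t: float, alpha: float = 0.8) -> float:
    """EWMA predictor of the next idle gap: x_hat = a*x_hat_prev + (1 - a)*x_t."""
    return alpha * x_hat_prev + (1 - alpha) * x_t

def should_sleep(x_hat: float, delta: float, e_s: float, e_w: float, e_i: float) -> bool:
    """Sleep iff sleeping for (x_hat - delta) plus one wake-up costs less than idling for x_hat."""
    return x_hat > delta and (x_hat - delta) * e_s + e_w <= x_hat * e_i

# Example: ~2.4 s predicted gap, 50 ms wake-up time, sleep power 0.1,
# wake-up energy 0.05, idle power 1.0 (all placeholder values, arbitrary units).
x_hat = update_estimate(x_hat_prev=2.5, x_t=1.8)            # -> 2.36
if should_sleep(x_hat, delta=0.05, e_s=0.1, e_w=0.05, e_i=1.0):
    t_s = x_hat - 0.05                                      # sleep interval t_s = x_hat - delta
```
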
Estimated Energy Savings
- Energy saving is reported as the ratio E/E_s: energy with no sleeping divided by energy when sleeping (a sketch of this computation follows below)

[Figure: E/E_s for an individual switch interface versus time to wake up (seconds), plotted for sleep power e_s = 0.1 and e_s = 0.5 during low- and high-activity periods]

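A small illustration (assumed trace and placeholder numbers, not the paper's data) of how E/E_s can be computed over a sequence of idle gaps, sleeping whenever the rule from the Algorithms for Sleeping slide allows:

```python
def energy_ratio(idle_gaps, delta, e_s, e_w, e_i):
    """E / E_s: idle energy without sleeping over energy with the sleep rule applied."""
    e_no_sleep = sum(idle_gaps) * e_i
    e_sleep = 0.0
    for x in idle_gaps:
        if x > delta and (x - delta) * e_s + e_w <= x * e_i:
            e_sleep += (x - delta) * e_s + e_w   # sleep through the gap, then wake up
        else:
            e_sleep += x * e_i                   # gap too short to pay off: stay idle
    return e_no_sleep / e_sleep

print(energy_ratio([30.0, 0.2, 45.0], delta=0.5, e_s=0.1, e_w=0.05, e_i=1.0))  # ~9.8
```
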
Performance of Three Algorithms
[Figure: four panels of energy with no sleeping / energy when sleeping versus time to wake up (seconds): host M to switch interface, host Y to switch interface, and two switch-to-switch interfaces, each under light and heavy traffic; the Optimal, Estimated, and Estimated & Periodic curves nearly coincide]

The three algorithms have very similar performance.

Simulation Results
- Topology:
  - Six switches
  - Each host runs the STP protocol in addition to different data streams
- Data for the simulations is generated using a Markov Modulated Poisson Process (a sketch of such a generator follows below)
- Simulated in Opnet
- Interfaces evaluated:
  - Sw0 to sw4
  - Sw2 to mmpp22

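A minimal sketch of a Markov Modulated Poisson Process arrival generator; this is an assumption about the technique, not the paper's generator. A two-state Markov chain switches between a low and a high rate, and inter-arrival times are exponential at the current state's rate.

```python
import random

def mmpp_arrivals(duration: float,
                  rates=(0.5, 20.0),      # packets/s in the (low, high) activity states
                  switch_rate=0.05):      # state changes per second
    """Yield packet arrival times of a 2-state MMPP over [0, duration)."""
    t, state = 0.0, 0
    next_switch = random.expovariate(switch_rate)
    while t < duration:
        gap = random.expovariate(rates[state])
        if t + gap >= next_switch:        # the modulating chain flips first
            t = next_switch               # resampling the gap is valid: exponentials are memoryless
            state = 1 - state
            next_switch += random.expovariate(switch_rate)
            continue
        t += gap
        if t >= duration:
            break
        yield t

arrivals = list(mmpp_arrivals(7200.0))    # two hours of synthetic traffic
```
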
Simulation Result
- Switch-to-switch interfaces save more energy

[Figure: simulated energy with no sleeping / energy when sleeping versus time to wake up (seconds) for switch interfaces under HABS and under Simple Sleep, plus the percentage of packets lost under Simple Sleep]

Impact of Sleeping on Protocols and Topology Design
- Simple Sleep's impact on protocol design:
  - For periodic messages, the sleep time must be fine-tuned
  - All interfaces must be woken up for broadcasts
- Impact of network topology and VLANs on sleeping:
  - With redundant paths, traffic can be aggregated onto some of the paths and the rest put to sleep
  - However, STP already prunes the topology to a single spanning tree, which limits how much redundancy is left to exploit

Conclusion
- Sleeping in order to save energy is a feasible option in the LAN
- Three sleep models are proposed
- Two kinds of algorithms for transitioning between the wake and sleep states are shown
- Simulations evaluate the performance of HABS and Simple Sleep

Critique
- Three sleep models are proposed but only two of them are evaluated; HAS is eliminated without a good reason
- Hardware modifications are needed to support the three sleep models
- For the first simulation, HABS is said to be used for both experiments, yet different transition energies are used
- Packet delay is not evaluated

VL2: A Scalable and Flexible Data Center Network
Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, et al.
Microsoft Research

Architecture of Data Center Networks (DCN)
[Figure: the conventional hierarchical data center network topology]

Conventional DCN Problems
[Figure: the conventional DCN hierarchy of core routers (CR), aggregation routers (AR), and switches (S), annotated with oversubscription ratios of 1:5, 1:80, and 1:240 rising up the hierarchy; one subtree asks for more capacity ("I want more") while another has spare capacity it cannot lend ("I have spare ones, but…")]

- Static network assignment
- Fragmentation of resources
- Poor server-to-server connectivity
- Traffic flows affect each other
- Poor reliability and utilization

Objectives
- Uniform high capacity:
  - The maximum rate of a server-to-server traffic flow should be limited only by the capacity of the servers' network cards
  - Assigning servers to a service should be independent of network topology
- Performance isolation:
  - Traffic of one service should not be affected by the traffic of other services
- Layer-2 semantics:
  - Easily assign any server to any service
  - Configure a server with whatever IP address the service expects
  - A VM keeps the same IP address even after migration

Measurements and Implications of DCN
- Data-center traffic analysis:
  - The ratio of traffic between servers to traffic entering/leaving the data center is 4:1
  - Demand for bandwidth between servers is growing faster
  - The network is the bottleneck of computation
- Flow distribution analysis:
  - The majority of flows are small; the biggest flow size is 100 MB
  - The distribution of internal flows is simpler and more uniform
  - More than 50% of the time a server has about 10 concurrent flows, but at least 5% of the time it has more than 80 concurrent flows

Measurements and Implications of DCN
- Traffic matrix analysis:
  - Traffic patterns summarize poorly
  - Traffic patterns are unstable over time
- Failure characteristics:
  - Pattern of networking equipment failures: 95% of failures last < 1 min, 98% < 1 hr, 99.6% < 1 day, and 0.09% last > 10 days
  - There is no obvious way to eliminate all failures from the top of the hierarchy

Virtual Layer Two Networking (VL2)
- Design principles:
  - Randomizing to cope with volatility:
    - Valiant Load Balancing (VLB) spreads traffic across multiple intermediate nodes, independent of destination
  - Building on proven networking technology:
    - IP routing and forwarding technologies available in commodity switches
  - Separating names from locators:
    - A directory system maintains the mapping between names and locations
  - Embracing end systems:
    - A VL2 agent runs at each server

VL2 Addressing and Routing
- Switches run link-state routing and maintain only the switch-level topology, using location-specific addresses (LAs)
- Servers use flat names: application addresses (AAs)
- A directory service maintains the AA-to-LA mapping (e.g., x → ToR2, y → ToR3, z → ToR4); to reach a destination, the sender looks up its ToR and encapsulates the packet toward that LA (a sketch follows below)

[Figure: lookup and response between servers and the directory service; packets destined for y and z are tunneled to ToR3 and ToR4 respectively]

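A minimal, hypothetical sketch of the agent-side step; the directory contents mirror the slide's example mappings, but the function names and structure are illustrative, not VL2's API:

```python
# Illustrative AA -> ToR-LA directory, mirroring the slide's example mappings.
directory = {"x": "ToR2", "y": "ToR3", "z": "ToR4"}
cache: dict[str, str] = {}

def resolve(aa: str) -> str:
    """Return the LA of the destination's ToR, consulting a local cache first."""
    if aa not in cache:
        cache[aa] = directory[aa]   # stand-in for a lookup sent to a directory server
    return cache[aa]

def encapsulate(aa_dst: str, payload: bytes) -> tuple[str, str, bytes]:
    """Outer header carries the ToR LA; the inner header keeps the flat AA."""
    return resolve(aa_dst), aa_dst, payload

print(encapsulate("y", b"payload"))   # ('ToR3', 'y', b'payload')
```
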
Random Traffic Spreading over Multiple Paths
[Figure: VLB forwarding: all intermediate switches share one anycast address (IANY); a sender tunnels each flow up to a randomly chosen intermediate, which forwards it down to the destination's ToR, with separate links used for up paths and down paths]

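One plausible way to realize this per-flow spreading (an assumption about the mechanism, not VL2's exact code) is to hash the flow's 5-tuple to pick an intermediate, so a flow stays on one path while different flows spread randomly:

```python
import hashlib

INTERMEDIATES = ["Int1", "Int2", "Int3"]   # switches sharing the anycast address IANY

def pick_intermediate(src: str, dst: str, sport: int, dport: int, proto: str = "tcp") -> str:
    """Hash the flow 5-tuple so every packet of a flow uses the same intermediate."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return INTERMEDIATES[digest % len(INTERMEDIATES)]

# Two flows between the same pair of hosts may take different up paths:
print(pick_intermediate("x", "z", 5000, 80))
print(pick_intermediate("x", "z", 5001, 80))
```
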
VL2 Directory System
[Figure: the VL2 directory system. Lookup path: a server's agent (1) sends a lookup to a directory server (DS), which (2) replies. Update path: an agent (1) sends an update to a DS, which (2) sets it on a replicated state machine (RSM) server; the RSM (3) replicates the update to its peers, (4) acks, the DS (5) acks the agent, and (6) the update is disseminated to the other directory servers]

Evaluation
- Uniform high capacity:
  - All-to-all data shuffle stress test:
    - 75 servers, each delivering 500 MB
    - The maximal achievable goodput is 62.3 Gbps
    - VL2 achieves 58.8 Gbps, a network efficiency of 58.8/62.3 = 94%

Evaluation
- Fairness:
  - 75 nodes
  - Real data center workload
  - Plot Jain's fairness index for traffic to the intermediate switches (the index is sketched below)

[Figure: Jain's fairness index over time (0-500 s) for Aggr1, Aggr2, and Aggr3, staying between roughly 0.94 and 1.00]

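For rates x_1, ..., x_n, Jain's fairness index is (Σ x_i)² / (n · Σ x_i²), which equals 1 when all rates are identical. A minimal sketch:

```python
def jain_fairness(rates: list[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly equal rates."""
    n = len(rates)
    total = sum(rates)
    return total * total / (n * sum(r * r for r in rates)) if total else 0.0

print(jain_fairness([10.0, 10.0, 10.0]))   # 1.0
print(jain_fairness([10.0, 5.0, 0.0]))     # 0.6
```
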
Evaluation
- Performance isolation:
  - Two types of services:
    - Service one: 18 servers doing a single TCP transfer all the time
    - Service two (first experiment): 19 servers, each starting an 8 GB transfer over TCP every 2 seconds
    - Service two (second experiment): 19 servers bursting short TCP connections

Evaluation
- Convergence after link failures:
  - 75 servers
  - All-to-all data shuffle
  - Links between the intermediate and aggregation switches are disconnected during the run

Conclusion
- Studied traffic patterns in a production data center
- Designed, built, and deployed every component of VL2 on an 80-server testbed
- Applied VLB to spread traffic randomly over multiple paths
- Used flat addresses to separate server names from their locations

Critique
- Extra servers are needed to support the VL2 directory system:
  - Adds device cost
  - Hard to implement for data centers with tens of thousands of servers
- All links and switches are active at all times, which is not power efficient
- No evaluation of real-time performance

Comparison
                 LAN Switch                   VL2
Target           Save power on LAN switches   Achieve agility on DCN
Networks         LAN                          DCN
Traffic Pattern  Light for most of the time   Highly unpredictable
Object           Switches                     Whole network
Experiment       Simulation on Opnet          Real testbed

Q&A