
Green Computing
Omer Rana
[email protected]
The Need
Bill St. Arnaud (CANARIE, Inc)
Impact of ICT Industry
Bill St. Arnaud (CANARIE, Inc)
Virtualization Techniques
Bill St. Arnaud (CANARIE, Inc)
Virtual Machine Monitors (IBM, 1960s)
[Diagram: App / App / App / App running on CMS, MVS, CMS, CMS guest operating systems, on IBM VM/370, on IBM Mainframe hardware]
A thin software layer that sits between
hardware and the operating system— virtualizing and
managing all hardware resources
Ed Bugnion, VMWare
Old idea from the 1960s
• IBM VM/370 – A VMM for IBM mainframe
– Multiple OS environments on expensive hardware
– Desirable when few machines were around
• Popular research idea in 1960s and 1970s
– Entire conferences on virtual machine monitors
– Hardware/VMM/OS designed together
• Interest died out in the 1980s and 1990s.
– Hardware got cheap
– Operating systems got more powerful (e.g. multi-user)
Ed Bugnion, VMWare
A return to Virtual Machines
• Disco: Stanford research project (1996-):
– Run commodity OSes on scalable multiprocessors
– Focus on high-end: NUMA, MIPS, IRIX
• Hardware has changed:
– Cheap, diverse, graphical user interface
– Designed without virtualization in mind
• System Software has changed:
– Extremely complex
– Advanced networking protocols
– But even today:
• Not always multi-user
• With limitations, incompatibilities, …
Ed Bugnion, VMWare
The Problem Today
[Diagram: a single Operating System tied directly to the Intel Architecture]
Ed Bugnion, VMWare
The VMware Solution
[Diagram: two Operating Systems, each running on its own (virtualized) Intel Architecture]
Ed Bugnion, VMWare
VMware MultipleWorlds™ Technology
[Diagram: App / App / App / App running on Win 2000, Win NT, Linux, and Win 2000 guests, on VMware MultipleWorlds™, on the Intel Architecture]
A thin software layer that sits between Intel hardware and the
operating system— virtualizing and managing all hardware
resources
Ed Bugnion, VMWare
MultipleWorlds Technology
[Diagram: a "World" = an App plus its guest OS (Win 2000, Win NT, Linux, Win 2000), running on VMware MultipleWorlds over the Intel Architecture]
A world is an application execution environment
with its own operating system
Ed Bugnion, VMWare
Virtual Hardware
[Diagram: virtual hardware presented by the virtual machine monitor (VMM): parallel ports, serial/COM ports, Ethernet, floppy disks, IDE controller, SCSI controller, sound card, keyboard, mouse, and monitor]
Ed Bugnion, VMWare
Attributes of MultipleWorlds Technology
• Software compatibility
– Runs pretty much all software
• Low overheads/High performance
– Near “raw” machine performance
• Complete isolation
– Total data isolation between virtual machines
• Encapsulation
– Virtual machines are not tied to physical machines
• Resource management
Ed Bugnion, VMWare
Hosted VMware Architecture
Host Mode: VMware, acting as an application, uses the host to access other devices such as the hard disk, floppy, or network card.
VMM Mode: the VMware virtual machine monitor allows each guest OS to directly access the processor (direct execution).
VMware achieves both near-native execution speed and broad device support by transparently switching* between Host Mode and VMM Mode.
[Diagram: Guest OS Applications on a Guest Operating System inside the Virtual Machine, above the Virtual Machine Monitor; Host OS Apps, the VMware App and the VMware Driver on the Host OS; PC Hardware with CPU, Memory, Disks and NIC]
*VMware typically switches modes 1000 times per second
Ed Bugnion, VMWare
Hosted VMM Architecture
• Advantages:
– Installs and runs like an application
– Portable – host OS does I/O access
– Coexists with applications running on the host
• Limits:
– Subject to Host OS:
• Resource management decisions
• OS failures
• USENIX 2001 paper:
– J. Sugerman, G. Venkitachalam and B.-H. Lim, “Virtualizing I/O on VMware Workstation’s Hosted Architecture”.
Ed Bugnion, VMWare
Virtualizing a Network Interface
[Diagram: the Guest OS with its NIC driver, the VMM and VMApp, a VMDriver and Virtual Bridge in the Host OS, a Virtual Network Hub, and the Host OS NIC driver leading to the Physical NIC and the Physical Ethernet on the PC Hardware]
Ed Bugnion, VMWare
The rise of data centers
• Single place for hosting servers and data
• ISPs now have their machines hosted at data centers
• Run by large companies – like BT
• Manage
– Power
– Computation + Data
– Cooling systems
– Systems Admin + Network Admin
Data Centre in Tokyo
From: Satoshi Matsuoka
http://www.attokyo.co.jp/eng/facility.html
Martin J. Levy (Tier1 Research) and Josh Snowhorn (Terremark)
Requirements
• Power an important design constraint:
– Electricity costs
– Heat dissipation
• Two key options in clusters – enable scaling of:
– Operating frequency (square relation)
– Supply voltage (cubic relation)
(a first-order power model is sketched after this slide)
• Balance QoS requirements – e.g. fraction of workload to process locally – with power management
From: Salim Hariri, Mazin Yousif
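A first-order power model (a standard approximation, not taken from the slides above): dynamic CPU power is roughly P ≈ C · V² · f, i.e. linear in clock frequency and quadratic in supply voltage. Because supply voltage can usually be lowered roughly in proportion to frequency, scaling both together reduces power approximately with the cube of the scaling factor, which is the kind of square/cubic leverage the bullets above refer to.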
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
Martin J. Levy (Tier1 Research) and Josh Snowhorn (Terremark)
The case for power management in HPC
• Power/energy consumption a critical issue
– Energy = Heat; Heat dissipation is costly
– Limited power supply
– Non-trivial amount of money
• Consequence
– Performance limited by available power
– Fewer nodes can operate concurrently
• Opportunity: bottlenecks
– Bottleneck component limits performance of other components
– Reduce power of some components, not overall performance
• Today, the CPU is:
– A major power consumer (~100 W)
– Rarely the bottleneck, and
– Scalable in power/performance (frequency & voltage “gears”)
Is CPU scaling a win?
• Two reasons:
1. Frequency and voltage scaling – performance reduction is less than power reduction
2. Application throughput – throughput reduction is less than performance reduction
• Assumptions
– CPU is a large power consumer
– CPU is the driver
– Diminishing throughput gains
[Figures: (1) CPU power (P = ½ C V² f) vs. performance (frequency); (2) application throughput vs. performance (frequency)]
AMD Athlon-64
• x86 ISA
• 64-bit technology
• Hypertransport technology – fast memory bus
• Performance
– Slower clock frequency
– Shorter pipeline (12 vs. 20)
– SPEC2K results
• 2GHz AMD-64 is comparable to 2.8GHz P4
• P4 better on average by 10% & 30% (INT & FP)
• Frequency and voltage scaling
– 2000 – 800 MHz
– 1.5 – 1.1 Volts
From: Vincent W. Freeh (NCSU)
LMBench results
• LMBench
– Benchmarking suite
– Low-level micro-benchmark data
• Test each “gear”
Gear   Frequency (MHz)   Voltage (V)
0      2000              1.5
1      1800              1.4
2      1600              1.3
3      1400              1.2
4      1200              1.1
6       800              0.9
From: Vincent W. Freeh (NCSU)
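A rough worked example using the gear table above and the standard dynamic-power approximation P ≈ C · V² · f (an illustration, not a measurement from these slides): moving from gear 0 (2000 MHz, 1.5 V) to gear 6 (800 MHz, 0.9 V) cuts the clock to 40% of its top speed, but cuts dynamic power to roughly (800/2000) × (0.9/1.5)² ≈ 14% of its gear-0 value.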
Operating system functions
From: Vincent W. Freeh (NCSU)
Communication
From: Vincent W. Freeh (NCSU)
The problem
• Peak power limit, P
– Rack power
– Room/utility
– Heat dissipation
• Static solution: number of servers is
– N = P / Pmax
– where Pmax is the maximum power of an individual node
• Problem
– Peak power > average power (Pmax > Paverage)
– Does not use all power – N · (Pmax − Paverage) unused
– Under-performs – performance proportional to N
– Power consumption is not predictable
From: Vincent W. Freeh (NCSU)
Safe over provisioning in a cluster
• Allocate and manage power among M > N nodes
– Pick M > N, e.g. M = P / Paverage
– M · Pmax > P
– Plimit = P / M
• Goal
– Use more power, safely under the limit
– Reduce power (and peak CPU performance) of individual nodes
– Increase overall application performance
(a numeric sketch of this arithmetic follows this slide)
[Figures: power P(t) over time, shown against Pmax and Paverage for N nodes, and against Pmax, Paverage and Plimit for M nodes]
From: Vincent W. Freeh (NCSU)
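A minimal numeric sketch of the provisioning arithmetic above, with made-up power figures (the budget and per-node values are assumptions, not data from the slides):

```python
# Hedged sketch: static vs. over-provisioned node counts for a hypothetical budget.
P_budget = 10_000.0   # total power budget P in watts (assumed)
P_max = 250.0         # per-node peak power Pmax in watts (assumed)
P_average = 180.0     # per-node average power Paverage in watts (assumed)

N = int(P_budget // P_max)        # static provisioning: N = P / Pmax
M = int(P_budget // P_average)    # over-provisioning:   M = P / Paverage
P_limit = P_budget / M            # per-node cap so M nodes stay under the budget

print(f"N = {N} nodes (static), M = {M} nodes (over-provisioned), "
      f"per-node limit = {P_limit:.0f} W")
```

With these numbers, M = 55 capped nodes replace N = 40 uncapped ones, each limited to about 182 W instead of 250 W.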
Safe over provisioning in a cluster
• Benefits
– Less “unused” power/energy
– More efficient power use
• More performance under the same power limitation
– Let P be per-node performance (and P* the per-node performance at the reduced power limit)
– Then more performance means M · P* > N · P
– Or P*/P > N/M, i.e. P*/P > Plimit/Pmax
[Figures: power P(t) over time, showing the unused energy below Pmax for N nodes, and P(t) against Plimit for M nodes]
When is this a win?
• When P*/P > N/M, or P*/P > Plimit/Pmax
– In words: the power reduction is more than the performance reduction
• Two reasons:
1. Frequency and voltage scaling
2. Application throughput
[Figures: (1) CPU power vs. performance (frequency); (2) application throughput vs. performance (frequency)]
From: Vincent W. Freeh (NCSU)
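A hedged check of the win condition P*/P > N/M from the slide above, continuing the assumed numbers from the earlier sketch (the capped-performance figure is invented for illustration):

```python
# Does running M power-capped nodes beat N full-power nodes?
N, M = 40, 55          # node counts from the provisioning sketch above (assumed)
perf_full = 1.00       # per-node performance P at full power, normalised
perf_capped = 0.80     # per-node performance P* at the reduced limit (assumed)

if perf_capped / perf_full > N / M:     # equivalently P*/P > Plimit/Pmax
    print("win: aggregate performance of M capped nodes exceeds N uncapped nodes")
else:
    print("no win: the per-node slowdown outweighs the extra nodes")
```

Here 0.80 > 40/55 ≈ 0.73, so the over-provisioned cluster comes out ahead; if capping cost more than about 27% of per-node performance, it would not.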
Feedback-directed, adaptive power control
• Uses feedback to control power/energy consumption
– Given power goal
– Monitor energy consumption
– Adjust power/performance of CPU
• Several policies
– Average power
– Maximum power
– Energy efficiency: select slowest gear (g) such that
From: Vincent W. Freeh (NCSU)
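A minimal sketch of the feedback idea only (not the authors' algorithm, whose gear-selection formula is not reproduced above): measure power, compare against the goal, and step between frequency/voltage gears. read_power() and set_gear() are hypothetical stand-ins for platform-specific interfaces.

```python
import time

GEARS = [0, 1, 2, 3, 4, 6]   # gear numbers from the Athlon-64 table (0 = fastest)
POWER_GOAL = 60.0            # target average CPU power in watts (assumed)

def read_power() -> float:
    """Hypothetical: return the recently measured average CPU power in watts."""
    raise NotImplementedError

def set_gear(gear: int) -> None:
    """Hypothetical: switch the CPU to the given frequency/voltage gear."""
    raise NotImplementedError

def control_loop(interval_s: float = 1.0) -> None:
    idx = 0                               # start in the fastest gear
    while True:
        power = read_power()
        if power > POWER_GOAL and idx < len(GEARS) - 1:
            idx += 1                      # over budget: slower, lower-power gear
        elif power < POWER_GOAL and idx > 0:
            idx -= 1                      # headroom available: faster gear
        set_gear(GEARS[idx])
        time.sleep(interval_s)
```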
A more holistic approach: Managing a Data Center
CRAC: Computer Room Air Conditioning units
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
Location of Cooling Units
Six CRAC units are serving 1,000 servers, consuming 270 kW of power out of a total capacity of 600 kW
http://blogs.zdnet.com/BTL/?p=4022
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
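For scale (simple arithmetic on the figures above): 270 kW across 1,000 servers is about 270 W per server, i.e. 45% of the stated 600 kW capacity.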
From: Satoshi Matsuoka