CPU Intensive


Virtualization for the Win!
Scaling Electronic Sports League’s
servers way up
Sreeram Sammeta
Paul Lindberg
Intel
Agenda

MMO hosting is well-understood, but
hosting lots of LAN game servers can be
hard.

What is virtualization?

Can we virtualize LAN game server
code? Electronic Sports League (ESL)
tests showed it can be done.

New hardware and software technologies
let us virtualize more.

Do more with less!
Electronic Sports League*:
Largest online gaming
community in Europe

Electronic Sports
League (ESL) has >1
million active
members [1]

Mission-critical game
servers

Sensitive to
transaction latency

Often single-thread
CPU intensive
[1] Source: ESL web site, August 19, 2009
[Diagram: ESL game servers behind a firewall, connected to the Internet.]
Industry knows how to
host a “typical” MMO


Like any multitier IT shop
Database, compute, and network tiers, with a dedicated network between tiers
SW/HW designed for throughput and availability, as needed
[Diagram: game clients (1000s) connect over the Internet to the hosted tiers.]
Hosting lots of LAN game
servers can be harder
1 is easy
How would you host 100 LAN games? 1000? 10,000?
[Diagram: one server connected over the Internet to game clients (16 typical).]
Scaling up LAN game servers
could be hard + expensive


LAN game server code is sometimes not built to make hosting easy
• Usually CPU-intensive
• Assumes it owns the machine? (Network: single IP address? File system)
• Impractical/unmanageable to run many server procs on one machine
“Simple” way: A few game procs per server!
• Need lots of CPU “headroom”
• …and lots of servers
But lots of servers cost $$, space, and watts (W)
Is there a better way?
ESL had a challenge:
Host lots of game servers
without compromising

LAN game servers, like Counter-Strike* 1.6, are usually:
• CPU Intensive: Single Thread vs. Multi-Thread
• Memory Intensive: Size/Throughput/Latency
• Network I/O Intensive: Throughput/Latency
How could ESL host lots of game servers
(especially LAN games), to meet demand?
Must maintain
Quality of Service (QoS)!
Virtualization can help!

Virtualization is a best practice used by
many IT shops to consolidate servers
for cost and efficiency

Can we really get enough performance
in a virtual machine (VM) to satisfy
gamers?

Perception: Can’t!
We say: Can!
Here’s how…
Virtualization shares
hardware
[Diagram: without virtualization, each Windows* Server 2003 machine runs its Counter-Strike servers (CS1, CS2, CS3) directly on its own physical hardware (CPU, memory, storage, network). With virtualization, VMs (1) through (n), each with virtual hardware and its own Windows* Server 2003 running CS1, CS2, and CS3, share one set of physical hardware through the VMware* Hypervisor (OS + Virtual Machine Manager).]
ESL Proof of Concept (PoC)
showed it can be done


Hypothesis: Virtualization of gaming servers may be possible
Use the latest technologies
• Intel® Xeon® 7400 processor based servers
• Intel VMDq NICs
• VMware* ESX* 3.5U1 & NetQueue*
Test if virtualization adds network latency, in the Intel lab
Private testing @ ESL lab
Public testing on the Internet with real ESL members
Success!
[Diagram: VM1 through VMn share a 10 GbE VMDq NIC, which connects through a 10GbE switch, the ESL network, and a firewall to the Internet.]
PoC Hardware makes it easier
4 sockets with 6-cores each
[Diagram: 4x Intel® Xeon® processor 7400 Series sockets (1066 MHz); FBD memory, 32 slots, 32 GB tested (256 GB max); ESB2 I/O bridge; configurable PCI Express.]

Intel® Xeon® processor 7400 Series
Performance boost from 6-core with 16 MB L3 cache
Energy efficient boost from 45nm high-k process
technology
Enhanced hardware assist features for virtualization
Network I/O can be slow if
not tuned for virtualization
VMM overhead
• Switching load
• Interrupt bottleneck
>=60% of the NIC capacity unused
[Chart: Throughput (Gb/s), axis 0 to 10, with no VMDq. VM1 through VMn each have a virtual NIC and share one physical NIC on the LAN through the virtualization hypervisor.]
Throughput measures receive side (Rx) I/O performance of 10GbE LAN.
Source: Intel.
Queues for each VM give
near-native throughput
VMDq & NetQueue*
• Optimize switching
• Load balance interrupts
>2x throughput! Near native 10GbE
[Chart: Throughput (Gb/s), axis 0 to 10. No VMDq: 4.0; VMDq: 9.2; VMDq + Jumbo Frames: 9.5. VM1 through VMn each have a virtual NIC and share a NIC with VMDq on the LAN through VMware* with NetQueue*.]
Tests measure Wire Speed Receive (Rx) Side Performance With VMDq on Intel® 82598 10 Gigabit Ethernet Controller.
Source: Intel.
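The two headline claims on these slides (">=60% of the NIC capacity unused" and ">2x throughput") can be checked with simple arithmetic. The sketch below assumes the three chart values map to the bars as 4.0 Gb/s (no VMDq), 9.2 Gb/s (VMDq), and 9.5 Gb/s (VMDq with jumbo frames) on a 10 Gb/s link; that mapping is a reading of the chart labels, and the function names are illustrative.

```python
# Receive-side throughput in Gb/s, as read from the chart labels (assumed mapping).
LINK_CAPACITY_GBPS = 10.0
throughput_gbps = {
    "No VMDq": 4.0,
    "VMDq": 9.2,
    "VMDq + jumbo frames": 9.5,
}


def unused_fraction(measured_gbps, capacity_gbps=LINK_CAPACITY_GBPS):
    """Fraction of the link capacity left idle at the measured throughput."""
    return 1.0 - measured_gbps / capacity_gbps


def speedup(after_gbps, before_gbps):
    """How many times faster the 'after' configuration is than 'before'."""
    return after_gbps / before_gbps


for name, gbps in throughput_gbps.items():
    print(f"{name}: {gbps} Gb/s, {unused_fraction(gbps):.0%} of the link unused")

ratio = speedup(throughput_gbps["VMDq"], throughput_gbps["No VMDq"])
print(f"VMDq vs. no VMDq: {ratio:.1f}x")
```

Under these assumptions, the no-VMDq case leaves 60% of the link idle, and enabling VMDq gives a ratio above 2, consistent with the slide text.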
PoC Software fits it together
[Diagram: VM1 through VMn, each running Counter-Strike 1.6 on Windows Server 2003 32 bit, on VMware® ESX 3.5 U1, on an Intel® Xeon® Processor 7400 Series based server.]

VMware* ESX* 3.5 U1
Virtual Center* 2.5
• NetQueue* enabled (16 queues)
• 1 virtual CPU per VM
• 2GB memory per VM

Windows* Server* 2003 32-bit
• Counter-Strike* 1.6

It’s all about latency:
Don’t make it worse

Player sends ~40-200 byte update to server
Server sends ~2000 bytes in return
In-game transaction latency = round-trip network latency + game server processing time
[Chart: In-Game Latency (ms), axis 0 to 25, for Best LAN, Best Internet, and Typ. Internet.]
Source: ESL observations.
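The latency definition above is a simple sum, so the effect of any extra delay introduced by virtualization can be expressed as a share of the total. The sketch below uses hypothetical example values (a 4.0 ms round trip, 1.0 ms of server processing, and 0.1 ms of added network delay); none of these are measurements from the deck, and the function name is illustrative.

```python
def in_game_latency_ms(network_round_trip_ms, server_processing_ms):
    """In-game transaction latency: round-trip network latency plus
    game server processing time."""
    return network_round_trip_ms + server_processing_ms


# Hypothetical example values, not measurements from the deck.
ROUND_TRIP_MS = 4.0
PROCESSING_MS = 1.0
EXTRA_FROM_VIRTUALIZATION_MS = 0.1

baseline = in_game_latency_ms(ROUND_TRIP_MS, PROCESSING_MS)
virtualized = in_game_latency_ms(
    ROUND_TRIP_MS + EXTRA_FROM_VIRTUALIZATION_MS, PROCESSING_MS
)
relative_increase = (virtualized - baseline) / baseline

print(f"baseline:    {baseline:.1f} ms")
print(f"virtualized: {virtualized:.1f} ms")
print(f"increase:    {relative_increase:.1%}")
```

The point of the exercise is the ratio: a sub-millisecond addition matters only in proportion to the total the player already experiences.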
VMDq keeps latency near
native levels!
[Chart: Avg. latency (ms), axis 0 to 0.3, vs. Packet Size (bytes) at 64, 256, and 1024, for Native, VMDq, and No VMDq.]
Virtualization with no VMDq increases latency
VMDq latency is near-native
Negligible effect on latency!

<< In-Game Transaction Latency (5 ms best case)
Source: Intel Lab. Performance measured using the netperf 2.4.4 (UDP latency test with 8
parallel streams) benchmark running on Intel® Xeon® processors 7300 (2.93 GHz).
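For readers who want a feel for this kind of measurement, here is a minimal sketch of a UDP round-trip timing loop. It is not netperf and does not reproduce the Intel lab setup: it runs a small echo server on the loopback interface of one machine (an assumption made so the example is self-contained) and times single-stream round trips at the three packet sizes from the chart. Its numbers are not comparable to the published results.

```python
import socket
import statistics
import threading
import time


def run_echo_server(sock, stop_event):
    """Echo every datagram back to its sender until asked to stop."""
    sock.settimeout(0.1)
    while not stop_event.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_rtt_ms(server_addr, payload_size, samples=200):
    """Return the mean UDP round-trip time in milliseconds."""
    payload = b"x" * payload_size
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(1.0)
        for _ in range(samples):
            start = time.perf_counter()
            client.sendto(payload, server_addr)
            client.recvfrom(2048)
            rtts.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(rtts)


def main():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    stop = threading.Event()
    thread = threading.Thread(
        target=run_echo_server, args=(server, stop), daemon=True
    )
    thread.start()
    results = {}
    try:
        for size in (64, 256, 1024):  # packet sizes from the chart
            results[size] = measure_rtt_ms(server.getsockname(), size)
            print(f"{size:5d} bytes: {results[size]:.3f} ms mean round trip")
    finally:
        stop.set()
        thread.join()
        server.close()
    return results


if __name__ == "__main__":
    main()
```

To compare native and virtualized paths, the echo server would run on the host under test and the client on a separate machine, with everything else held constant.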
Live tests find ideal load
ESL virtual game servers on Xeon 7400
[Chart: Power (Watts), axis 600 to 750, and CPU (%), axis 0 to 100, vs. # of VMs (24, 32, 36, 40).]
Private ESL & Public Internet testing revealed no impact on In-Game Transaction Latency!
Source: ESL Lab. Performance measured using esxtop & power meter with reference s/w stack running on Intel® Xeon® processors 7400 (2.67 GHz).
Replace 18 servers with 1!
Before: 1P Core™2 Duo (2 cores), 6 game servers (3 per core), 72 gamers
After: 4P Xeon 7400 (24 cores), 108 game servers (3 per VM, 36 VMs), 1296 gamers
• 18x game servers per machine (18:1 consolidation ratio)
• 18x gamers per machine
Same CPU headroom, same user experience!


Source: ESL Lab. Performance measured using esxtop & power
meter with reference s/w stack running on Intel® Xeon® processors
7400 (2.67 GHz). Power savings calculated based on ESL actual
power rate & Yahoo $/€ exchange rate as of 2008-08-12.
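The consolidation figures follow directly from the per-machine counts, and the same arithmetic answers the earlier question of how many machines it takes to host 100, 1000, or 10,000 LAN games. The sketch below uses only the numbers on this slide; the 12 gamers per game server is implied by "6 game servers, 72 gamers" rather than stated, and the helper name is illustrative.

```python
import math

# Figures from the slide.
BEFORE_GAME_SERVERS = 6        # 1P Core 2 Duo: 2 cores x 3 game servers per core
AFTER_VMS = 36                 # 4P Xeon 7400
AFTER_GAME_SERVERS_PER_VM = 3
GAMERS_BEFORE = 72

# Implied by the slide (72 gamers / 6 game servers), not stated directly.
GAMERS_PER_GAME_SERVER = GAMERS_BEFORE // BEFORE_GAME_SERVERS

after_game_servers = AFTER_VMS * AFTER_GAME_SERVERS_PER_VM
consolidation_ratio = after_game_servers // BEFORE_GAME_SERVERS
gamers_after = after_game_servers * GAMERS_PER_GAME_SERVER


def machines_needed(game_servers, per_machine):
    """Whole machines required to host the given number of game servers."""
    return math.ceil(game_servers / per_machine)


print(f"{after_game_servers} game servers, {gamers_after} gamers, "
      f"{consolidation_ratio}:1 consolidation")
for wanted in (100, 1000, 10_000):
    before = machines_needed(wanted, BEFORE_GAME_SERVERS)
    after = machines_needed(wanted, after_game_servers)
    print(f"{wanted:6d} game servers: {before} machines before, {after} after")
```

This treats every game server as equally demanding, which is a simplification; real capacity planning would also keep the CPU headroom the slides call out.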
18:1 consolidation yields
big efficiency
[Diagram: before, 18 Intel® Core™2 Duo machines, each running Windows Svr 2003 with 6 Counter-Strike (CS) servers. After, 18 servers into 1: one Intel® Xeon® Processor 7400 series based server running VMware* ESX 3.5 U1 with 36 VMs, each running Win 2003 with 3 CS servers.]
18:1 consolidation gives
big power savings
Before: 1P Core 2 Duo (2 cores), 350 W / machine (6300 W / 18 machines), power $731K / 1000 machines
After: 4P Xeon 7400 (24 cores), 710 W / machine, power $83K / equivalent machines
Power savings: $648K per year for each 1000 machines converted!
• Power and other factors give a good Total Cost of Ownership (TCO)

Source: ESL Lab. Performance measured using esxtop & power meter with reference s/w stack running on Intel® Xeon® processors 7400 (2.67 GHz). Power savings calculated based on 24x7x365 usage, ESL actual power rate & Yahoo $/€ exchange rate as of 2008-08-12. Actual performance and savings may vary.
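The power figures can be roughly reproduced from the slide's own numbers. The deck does not state ESL's electricity rate, so the sketch below backs an implied rate out of the "before" cost ($731K per year for 1000 machines at 350 W, 24x7x365) and applies it to the consolidated machines. Rounding 1000/18 up to 56 machines is also an assumption about what "equivalent machines" means.

```python
import math

HOURS_PER_YEAR = 24 * 365          # the deck assumes 24x7x365 usage

# Figures from the slide.
BEFORE_WATTS = 350                 # 1P Core 2 Duo machine
AFTER_WATTS = 710                  # 4P Xeon 7400 machine
CONSOLIDATION_RATIO = 18
BEFORE_MACHINES = 1000
BEFORE_COST_USD = 731_000          # per year for 1000 machines


def annual_kwh(watts, machines):
    """Energy used in a year by identical machines running continuously."""
    return watts / 1000 * HOURS_PER_YEAR * machines


# Assumption: derive the electricity rate from the slide's "before" cost.
implied_rate_usd_per_kwh = BEFORE_COST_USD / annual_kwh(BEFORE_WATTS, BEFORE_MACHINES)

# Assumption: "equivalent machines" means 1000 / 18, rounded up.
after_machines = math.ceil(BEFORE_MACHINES / CONSOLIDATION_RATIO)
after_cost_usd = annual_kwh(AFTER_WATTS, after_machines) * implied_rate_usd_per_kwh
savings_usd = BEFORE_COST_USD - after_cost_usd

print(f"implied rate: ${implied_rate_usd_per_kwh:.4f} per kWh")
print(f"after: {after_machines} machines, ${after_cost_usd / 1000:.0f}K per year")
print(f"savings: ${savings_usd / 1000:.0f}K per year")
```

Under these assumptions the result lands on the slide's $83K and $648K figures after rounding, which suggests the published numbers were derived in a similar way.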
ESL and gamers loved it!
"Playing on virtualized
gameservers running on
Intel® and VMware*
technologies gives
professional gamers no
disadvantages compared
with playing on a non
virtualized server.
Everything ran smoothly
and I did not notice
anything unusual. A
perfect setup for
professional gaming."
—Navid Javadi aka
mousesports|Kapio
"The new Quad-core Intel®
Xeon® processor 7400
series were completely
overwhelming in all terms.
The Intel Xeon MP processor
… based servers with Intel
VMDq technology enable us
to efficiently run our servers
with reduced costs and
without any negative
impacts."
—Bjoern Metzdorf
Director Information
Technology
Electronic Sports League
New virtualization tech
might help you, too!

Do you have “non-virtualize-able”
apps? Really? Try them!
• Can have very low network latency
• Can consolidate many servers into 1


Consolidating servers can lead to
big efficiency and power savings
Read more:
http://software.intel.com/sites/billboard/archive/performance-sensitiveapplication.php
Legal Disclaimer
• INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH
INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR
OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY
THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND
CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED
WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS
INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A
PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY
PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL
PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE
SUSTAINING APPLICATIONS.
• Intel may make changes to specifications and product descriptions at any time,
without notice.
• All products, dates, and figures specified are preliminary based on current
expectations, and are subject to change without notice.
• Intel processors, chipsets, and desktop boards may contain design defects or errors
known as errata, which may cause the product to deviate from published
specifications. Current characterized errata are available on request.
• Code names featured are used internally within Intel to identify products that are in
development and not yet publicly announced for release. Customers, licensees and
other third parties are not authorized by Intel to use code names in advertising,
promotion or marketing of any product or services and any such use of Intel's
internal code names is at the sole risk of the user
• Performance tests and ratings are measured using specific computer systems and/or
components and reflect the approximate performance of Intel products as measured
by those tests. Any difference in system hardware or software design or
configuration may affect actual performance.
• Intel, Intel Inside, Xeon and the Intel logo are trademarks of Intel Corporation in the
United States and other countries.
• *Other names and brands may be claimed as the property of others.
• Copyright © 2009 Intel Corporation.