Multicore Programming


Summary of Topics:
Console Architecture
Meaning of Paper’s Title/Why the Video Game
Developer HATED the new
Techniques/Problems
The Future
Video Game Architecture
For the most part, the same as a computer:
Very closely tied to the operating system.
PCs have almost always had games.
Mac gaming is sparse, though it has recently increased.
Linux users have to compile/make their own.
Console games = primarily single-core processors…until 2005.
XBOX 360
• 3.2 GHz “Xenon” triple-core PowerPC, 2 hardware threads per core
• 512 MB main RAM
• 500 MHz ATI “Xenos” GPU
• CPU accesses memory through the GPU!
• GPU has 10 MB of embedded frame-buffer RAM
XBOX 360 vs. Playstation 3
Xbox 360: Triple-Core PPC
• 512 MB, 700 MHz GDDR3, shared by CPU and GPU
• CPU accesses memory through the GPU!
• GPU has 10 MB of embedded frame-buffer RAM
PS3: Multicore Cell Engine
• 512 MB total
• 256 MB 700 MHz GDDR3 video RAM for the GPU
• 256 MB 3.2 GHz XDR main RAM for the CPU
Cell Architecture
Multiple Synergistic Processing Elements (SPEs), each attached to its own
local store, which feeds through a DMA engine into the on-chip bus. One
separate PPE (Power Processing Element), with L1 and L2 caches.
Developers are having some serious problems with this model.
Why So Unhappy?
Delays, setbacks, etc. = unhappy fans.
Yu Suzuki, on Virtua Fighter for the Saturn: “One very fast central
processor would be preferable...I think that only one in
100 programmers are good enough to get this kind of
speed out of the Saturn.”
Not implementing parallelism or taking advantage of the multicore
architecture = unhappy fans.
But if game developers do implement parallelism, the game will
be delayed – by 6 months? 1 year?
Beginning Techniques
• Patches, so computers at least realize there are multiple
cores available.
• Intel released several multicore aids, especially in
the beginning (coaxing people into it)
• Building Blocks
• Codeplay’s sieve compilers
• Broke a program into “sieve blocks” where automatic
parallelization could be utilized
What do we do today?
Multithreading from the ground up
Decent (and fast!) parallelization
One of two main ways:
Every process on a different thread
Dependencies galore~!
“Best” Multithreading Approach
Main gaming thread, with branches coming off for specific parts of the
game and splintering into other threads.
Particularly beastly programs get their own multithreading
implementations.
Networking and I/O get their own threads.
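A minimal sketch of this layout, assuming a single I/O worker fed through a queue: the main game thread hands off load requests and only polls for finished ones, so it never blocks on a read. The names `io_worker` and `run_game` and the fake asset data are hypothetical, not taken from any engine.

```python
import queue
import threading

def io_worker(requests, results):
    """Dedicated I/O thread: services load requests until it sees None."""
    while True:
        name = requests.get()
        if name is None:              # sentinel: shut down
            break
        # Stand-in for a blocking disk or network read (hypothetical).
        results.put((name, f"<data for {name}>"))

def run_game(frames, assets):
    """Main game thread: never blocks on I/O, only polls for finished loads."""
    requests, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=io_worker, args=(requests, results))
    worker.start()
    for name in assets:
        requests.put(name)
    loaded = {}
    for _ in range(frames):
        # Drain whatever the I/O thread finished since the last frame.
        while True:
            try:
                name, data = results.get_nowait()
            except queue.Empty:
                break
            loaded[name] = data
        # ... update and render the frame here ...
    requests.put(None)
    worker.join()
    # Pick up anything that finished after the last frame.
    while not results.empty():
        name, data = results.get()
        loaded[name] = data
    return loaded
```

The same pattern would repeat for networking or any other subsystem that gets its own thread.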
CASE EXAMPLE: Kameo
Achieved 2.2~2.5 cores in 6 months.
Rendering and decompression were moved onto separate threads.
The latter saved space on the DVD and improved load times for the game.
Additionally, file I/O was split across two threads – one for reading
and one for decompressing.
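The read/decompress split can be sketched as a two-stage pipeline. This is only an illustration under assumptions: `zlib` stands in for whatever compression the game actually used, and the in-memory `blobs` list stands in for files on the DVD.

```python
import queue
import threading
import zlib

def reader(blobs, q):
    """Stage 1: stands in for the thread that reads compressed files off disc."""
    for blob in blobs:
        q.put(blob)
    q.put(None)                       # sentinel: no more input

def decompressor(q, out):
    """Stage 2: decompresses each blob as soon as the reader hands it over."""
    while True:
        blob = q.get()
        if blob is None:
            break
        out.append(zlib.decompress(blob))

def load_all(blobs):
    q = queue.Queue(maxsize=4)        # bounded, so the reader cannot run far ahead
    out = []
    t1 = threading.Thread(target=reader, args=(blobs, q))
    t2 = threading.Thread(target=decompressor, args=(q, out))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return out
```

With a real disc, the reader spends most of its time waiting on the drive, so the decompressor can work on one file while the next is still being read.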
Best Processes for MT
File decompression – improves load times.
Rendering – separate update and render; can be problematic.
Physics engine? – Physics/Update/Render, but latency issues.
Graphical fluff – always and forever.
Artificial intelligence – position independence of data, cache coherency.
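The update/render split above can be sketched as two threads joined by a small handoff queue. This is a simplified illustration, not how any particular engine does it: the update thread publishes an immutable snapshot each frame and the render thread draws whatever snapshot it receives, which is also where the latency comes from (the renderer always shows a state the simulation has already moved past). All names here are hypothetical.

```python
import queue
import threading

def update_loop(n_frames, handoff):
    """Update thread: advances the simulation and publishes one snapshot per frame."""
    x = 0.0
    for frame in range(n_frames):
        x += 1.5                      # stand-in for game logic / physics
        handoff.put((frame, x))       # tuples are immutable, so safe to share
    handoff.put(None)                 # sentinel: simulation finished

def render_loop(handoff, drawn):
    """Render thread: draws each snapshot it receives."""
    while True:
        snap = handoff.get()
        if snap is None:
            break
        drawn.append(snap)            # stand-in for issuing draw calls

def run_frames(n_frames):
    handoff = queue.Queue(maxsize=1)  # bounded, so update cannot run far ahead of render
    drawn = []
    updater = threading.Thread(target=update_loop, args=(n_frames, handoff))
    renderer = threading.Thread(target=render_loop, args=(handoff, drawn))
    updater.start()
    renderer.start()
    updater.join()
    renderer.join()
    return drawn
```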
Cascade Project
Fixes dataflow by sending data from the parent to the child
before the parent has completed!
Respects dependencies; divides up the AI
Resulted in reducing “the average time per frame from
15.5ms using a single thread to 7.8ms using eight threads.”
Roughly a 2× speedup (about 50% less time per frame)!
Work in progress – CDML
Lists constraints in the language instead of working them out later.
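For reference, the arithmetic behind the quoted frame times works out as follows. The parallel-efficiency figure is an addition here, computed from the same two numbers and the eight threads mentioned in the quote:

```latex
\[
\text{speedup} = \frac{15.5\ \text{ms}}{7.8\ \text{ms}} \approx 1.99,
\qquad
\text{time saved} = \frac{15.5 - 7.8}{15.5} \approx 49.7\%,
\qquad
\text{efficiency} = \frac{1.99}{8} \approx 25\%
\]
```

So eight threads roughly halve the frame time, well short of the ideal eightfold gain.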
Multithreading is Tricky
Threads can fight over the cache
Dependencies
Data corruption, deadlocks
Bugs might not be apparent right away
Debugging sets developers back
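A minimal sketch of the data-corruption point, using a hypothetical shared score counter. The read-modify-write on `state["score"]` is guarded by a lock. Without the lock, updates from different threads can interleave and be lost, and the bug may only show up occasionally.

```python
import threading

def add_score(state, lock, times):
    """Each thread bumps the shared score `times` times."""
    for _ in range(times):
        with lock:                    # read-modify-write must not interleave
            state["score"] += 1

def run_scoring(n_threads=4, times=10000):
    state = {"score": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_score, args=(state, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["score"]
```

Holding one lock at a time, as here, also avoids deadlock. Deadlocks become possible once threads take several locks in different orders.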
The Future
ARM’s GPU/CPU Chip
Intel’s Larrabee Chip
Mobile Gaming Platforms laugh for now…
Unreal Engine 4 – “We’re waiting for massively
multicore processors.”
It’s just not that easy anymore.
Thanks for watching!