Transcript Slide 1
Multi-core systems
System Architecture COMP25212
Daniel Goodman
Advanced Processor Technologies Group
Processor Designs
Power
Single Threaded Performance
Throughput
Reliability
......
Classifying Processors
SISD
• Single Instruction Single Data
• Uniprocessor
SIMD
• Single Instruction Multiple Data
• Vector processors & vector operations (MMX & SSE)
MIMD
• Multiple Instructions Multiple Data
• Multi-cores (multiprocessors)
MISD
• Multiple Instructions Single Data (no common examples known)
SPMD: Single Program Multiple Data
• (MIMD plus only one program)
• clusters
Classifying Processors
RISC
• Reduced instruction set
• Small number of very fast simple instructions
• Complex instructions are constructed from many smaller instructions
CISC
• Complex instruction set
• Lots of instructions
• Can be slow, but do a lot of work per instruction
Graphics Processing Units (GPU)
HD video and games are computationally very
demanding (beyond even the best CPUs)
Extremely parallel, each pixel is independent
Quite different emphasis and evolution for GPUs
• Fine to perform non-graphics tasks poorly or not at all
• Large number of cores, each highly multithreaded
(768–1024 concurrent threads per Nvidia core)
• Additional threads are queued until the earlier threads
complete
• Shared register file
• Each core is SIMD
• No coherency between cores
• No communication between groups of threads
• Very fast memory access
Graphics Processing Units (GPU)
Coalesced Memory Access
Un-Coalesced Memory Access
SpiNNaker Massively Parallel System
Fabricated SpiNNaker CMP
•Fabricated in UMC 130nm L130E
•CMP Die Area - 101.64 sq.mm
•Over 100 million transistors
•Power consumption of 1W at 1.2V
when all the processor cores are
operating
•Peak Performance – 4 GIPS
Constructing Clusters, Data Centres and
Super Computers
Composing Multi-cores
[Diagram: two multi-core chips on a motherboard, each with its own DRAM memory, linked to each other and to an Input/Output Hub by QPI or HT]
Composing Multiple Computers
[Diagram: multiple multi-core computers joined by an interconnection network]
Clusters/Super Computers/Data Centres
All terms are overloaded and misused
All have lots of CPUs on lots of motherboards
Clusters/Super Computers are used to run one large
task very quickly, e.g. a simulation
Clusters/Farms/Data centres do thousands of
independent tasks in parallel, e.g. Google Mail
The distinction becomes blurred with services such as
Google
The main difference is the network between CPUs
Building a Cluster/SC/DC
•Large numbers of self-contained
computers in a small form factor
•These are optimised for cooling
and power efficiency
•Racks house 10s–100s of CPUs
•They normally also contain separate units
for networking and power distribution
•They are self-contained
Building a Cluster/SC/DC
•Sometimes a rack is not big enough
•How many new computers a day go into a data centre?
•What does this mean for reliability?
Building a Cluster/SC/DC
Join lots of racks
Add power distribution,
network and cooling
For Super Computers add
racks dedicated to storage
K Super Computer
Water cooled
6D network for fault tolerance
RISC processors (SPARC64 VIIIfx)
90,000 processors
Questions
?