Transcript Slide 1

Multi-core systems
System Architecture COMP25212
Daniel Goodman
Advanced Processor Technologies Group
Processor Designs





 Power
 Single Threaded Performance
 Throughput
 Reliability
 ......
Classifying Processors
 SISD
• Single Instruction Single Data
• Uniprocessor
 SIMD
• Single Instruction Multiple Data
• Vector processor & vector operations (MMX & SSE); an SSE sketch follows this list
 MIMD
• Multiple Instructions Multiple Data
• Multi-cores (multiprocessors)
 MISD
• Multiple Instructions Single Data
• No widely used examples
 SPMD: Single Program Multiple Data
• MIMD where every processor runs the same program
• Clusters
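To make the SIMD category concrete, here is a minimal sketch using the SSE intrinsics named in the list above: a single instruction (_mm_add_ps) performs four float additions at once. The array contents are arbitrary example data.

```c
#include <stdio.h>
#include <xmmintrin.h>   /* SSE intrinsics */

int main(void)
{
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float c[4];

    __m128 va = _mm_loadu_ps(a);      /* load 4 floats into one register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb);   /* one instruction, four additions */
    _mm_storeu_ps(c, vc);

    printf("%.1f %.1f %.1f %.1f\n", c[0], c[1], c[2], c[3]);
    return 0;
}
```

Compiles on an x86 machine with, for example, `gcc -msse simd.c`; the single `_mm_add_ps` call is exactly the "single instruction, multiple data" pattern.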
Classifying Processors
 RISC
• Reduced instruction set
• Small number of very fast simple instructions
• Complex operations are built from sequences of these simple instructions
 CISC
• Complex instruction set
• Lots of instructions
• Individual instructions can be slow, but each does a lot of work; a comparison sketch follows this list
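A schematic sketch of the RISC/CISC contrast above: the same C statement can map to one complex read-modify-write instruction on a CISC machine, or to a short sequence of simple load/add/store instructions on a RISC machine. The mnemonics in the comments are illustrative, not taken from any specific ISA.

```c
/* One C statement, two possible instruction-level translations. */
void increment(int *counter)
{
    *counter += 1;
    /* CISC style: a single instruction that reads, modifies and writes memory
     *     add [counter], 1
     * RISC (load/store) style: the same work as three simple instructions
     *     load  r1, [counter]
     *     add   r1, r1, 1
     *     store r1, [counter]
     */
}
```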
Graphics Processing Units (GPU)
 HD video and games are computationally very demanding (beyond even the best CPUs)
 Extremely parallel, each pixel is independent
 Quite different emphasis and evolution for GPUs
• Fine to perform non-graphics tasks poorly or not at all
• Large number of cores, each highly multithreaded (768-1024 concurrent threads per NVIDIA core)
• Additional threads are queued until the earlier threads complete
• Shared register file
• Each core is SIMD; a lockstep sketch follows this list
• No coherency between cores
• No communication between groups of threads
• Very fast memory access
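As a rough illustration of the "each core is SIMD" point above, the sketch below emulates on the CPU how a group of GPU threads executes one instruction in lockstep, each lane working on its own data element. The warp size of 32 is an assumption for illustration; real GPUs vary by vendor and generation.

```c
#include <stdio.h>

#define WARP_SIZE 32   /* assumed number of lanes executing in lockstep */

/* One "instruction" applied across all lanes: out[lane] = a[lane] * b[lane]. */
static void simd_step(const float *a, const float *b, float *out)
{
    for (int lane = 0; lane < WARP_SIZE; lane++)   /* every lane, same operation */
        out[lane] = a[lane] * b[lane];
}

int main(void)
{
    float a[WARP_SIZE], b[WARP_SIZE], out[WARP_SIZE];
    for (int i = 0; i < WARP_SIZE; i++) { a[i] = (float)i; b[i] = 2.0f; }

    simd_step(a, b, out);                 /* one lockstep step */
    printf("lane 5: %.1f\n", out[5]);     /* prints 10.0 */
    return 0;
}
```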
Graphics Processing Units (GPU)
Coalesced Memory Access
Un-Coalesced Memory Access
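A hedged C sketch of the two access patterns named above: on a GPU, neighbouring threads that read neighbouring addresses can have their loads merged (coalesced) into a few wide memory transactions, while strided accesses force many separate transactions. The function names, thread count and stride here are hypothetical, chosen only to show the indexing difference.

```c
#include <stdlib.h>

#define NUM_THREADS 256
#define STRIDE      32   /* assumed stride that breaks coalescing */

/* Coalesced: thread tid reads data[tid]; a group of neighbouring threads
 * touches one contiguous block of memory. */
float coalesced_read(const float *data, int tid)
{
    return data[tid];
}

/* Un-coalesced: thread tid reads data[tid * STRIDE]; every thread in the
 * group touches a different region, so many separate transactions are needed. */
float uncoalesced_read(const float *data, int tid)
{
    return data[tid * STRIDE];
}

int main(void)
{
    float *data = calloc(NUM_THREADS * STRIDE, sizeof(float));
    float sum = 0.0f;
    for (int tid = 0; tid < NUM_THREADS; tid++)
        sum += coalesced_read(data, tid) + uncoalesced_read(data, tid);
    free(data);
    return (int)sum;
}
```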
SpiNNaker Massively Parallel System
Fabricated SpiNNaker CMP
•Fabricated in UMC 130nm L130E
•CMP Die Area - 101.64 sq.mm
•Over 100 million transistors
•Power consumption of 1W at 1.2V when all the processor cores are operating
•Peak Performance – 4 GIPS
Constructing Clusters, Data Centres and Super Computers
Composing Multi-cores
[Diagram: two multi-core chips on a motherboard, each with its own (DRAM) memory, connected to each other and to an Input/Output Hub by QPI or HT links]
Composing Multiple Computers
[Diagram: many multi-core computers joined by an Interconnection Network]
Clusters/Super Computers/Data Centres
 All terms overloaded and misused
 Have lots of CPUs on lots of motherboards
 Clusters/Super Computers are used to run one large task very quickly, e.g. a simulation
 Clusters/Farms/Data Centres run thousands of independent tasks in parallel, e.g. Google Mail
 The distinction becomes blurred with services such as Google
 Main difference is the network between CPUs; a message-passing sketch follows this list
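A sketch of how one large task is split across the machines of a cluster, assuming the cluster is programmed with MPI (a common message-passing library; the slides do not prescribe one). Each process sums its own slice of the range 0..N-1, and the partial sums are combined over the interconnection network.

```c
#include <stdio.h>
#include <mpi.h>

#define N 1000000L

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?  */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many in total?   */

    /* Each process handles its own slice of the problem. */
    long long local = 0;
    for (long long i = rank; i < N; i += size)
        local += i;

    /* Combine the partial results across the network. */
    long long total = 0;
    MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %lld\n", total);

    MPI_Finalize();
    return 0;
}
```

Run with, for example, `mpirun -np 4 ./sum`; each process may sit on a different node, so the quality of the network between them (the "main difference" noted above) determines how cheap the final reduction is.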
Building a Cluster/SC/DC
•Large numbers of self-contained computers in a small form factor
•These are optimised for cooling and power efficiency
•Racks house tens to hundreds of CPUs
•They normally also contain separate units for networking and power distribution
•They are self-contained
Building a Cluster/SC/DC
•Sometimes a rack is not big enough
•How many new computers a day go into a data centre?
•What does this mean for reliability?
Building a Cluster/SC/DC
 Join lots of racks
 Add power distribution, network and cooling
 For Super Computers, add racks dedicated to storage
K Super Computer




Water-cooled
6D network for fault tolerance
RISC processors (SPARC64 VIIIfx)
90,000 processors
Questions
?