Transcript SMP OS

Operating System Issues in
Multi-Processor Systems
John Sung
Hardware Engineer
Compaq Computer Corporation
www.compaq.com
Outline
Multi-Processor Hardware Issues
 Snoopy Bus System Architecture
 AMD Athlon’s Snoopy Protocol
 ccNUMA System Architecture
 AMD Athlon’s LDT System Bus
 SGI Origin’s ccNUMA System Architecture
 Alpha 21364 System Architecture
 ccNUMA and CPU Scheduling
 Conclusion

Multi-Processor Hardware Issues

 Bandwidth/Latency
   Processor to Processor
   Processor to Memory
   Processor to I/O
 Scalability
   Increase performance as you increase CPU/Memory
 Coherency/Synchronization
   Give software coherent view of memory
   Provide synchronization primitives
Snoopy Bus System Architecture
 A bus connects processors, memory, and I/O
 Scales up to ~16 processors
 Limited by bus bandwidth
 Cache Coherency Protocol
   Snoops the bus for memory traffic
   Each processor/cache set has to “listen” for addresses in its cache
   Does the “right thing” to give software a coherent view of memory
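
The snooping behaviour on this slide can be sketched as a toy write-invalidate model. This is a minimal sketch, assuming three cache-line states (Modified, Shared, Invalid); it is not the Athlon protocol, and every class and method name here is invented for illustration.

```python
# Toy write-invalidate snooping model (MSI states); all names are illustrative.
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"


class Bus:
    """Shared bus: broadcasts every transaction to all other attached caches."""

    def __init__(self):
        self.memory = {}   # address -> value
        self.caches = []

    def broadcast(self, op, addr, source):
        for cache in self.caches:
            if cache is not source:
                cache.snoop(op, addr)


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # address -> (State, value)
        bus.caches.append(self)

    def state(self, addr):
        return self.lines.get(addr, (State.INVALID, None))[0]

    def read(self, addr):
        st, value = self.lines.get(addr, (State.INVALID, None))
        if st is State.INVALID:
            # Read miss: announce it so a MODIFIED owner writes back first.
            self.bus.broadcast("read", addr, self)
            value = self.bus.memory.get(addr, 0)
            self.lines[addr] = (State.SHARED, value)
        return value

    def write(self, addr, value):
        if self.state(addr) is not State.MODIFIED:
            # Gain exclusive ownership: every other copy is invalidated.
            self.bus.broadcast("write", addr, self)
        self.lines[addr] = (State.MODIFIED, value)

    def snoop(self, op, addr):
        """React to another cache's bus transaction for an address we may hold."""
        st, value = self.lines.get(addr, (State.INVALID, None))
        if st is State.MODIFIED:
            self.bus.memory[addr] = value          # write back dirty data
        if op == "write" and st is not State.INVALID:
            self.lines[addr] = (State.INVALID, None)
        elif op == "read" and st is State.MODIFIED:
            self.lines[addr] = (State.SHARED, value)


bus = Bus()
c0, c1 = Cache(bus), Cache(bus)
c0.write(0x40, 7)          # c0 holds the line MODIFIED
seen = c1.read(0x40)       # c0 snoops the read, writes back, drops to SHARED
c1.write(0x40, 9)          # c0 snoops the write and invalidates its copy
```

Every miss and every ownership change goes through `Bus.broadcast`, which is why bus bandwidth is the limit on processor count in this style of system.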
Snoopy Bus System Architecture (diagram)
 Three CPU cores, each with its own cache, share one bus
 Memory and I/O blocks (three of each) attach to the same bus
ccNUMA System Architecture
 Cache-Coherent Non-Uniform Memory Access
 Memory is distributed and attached to processors
 A network connects the processor/memory sets
 Each processor owns part of the memory space
 Cache coherency protocol
   Gives software a coherent view of memory
   Protocol primitives for synchronization
   Directory to keep track of who has a copy of memory
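
The directory idea can be sketched as a small bookkeeping structure at a block's home node. This is a minimal sketch under assumed message names (`writeback`, `flush`, `invalidate`); it does not reproduce the SGI Origin or Alpha 21364 protocol, and the class is hypothetical.

```python
# Toy directory for one home node; class, method, and message names are illustrative.
class Directory:
    """Tracks, per memory block, which nodes hold a cached copy."""

    def __init__(self):
        self.sharers = {}   # block -> set of node ids with a read-only copy
        self.owner = {}     # block -> node id holding the block exclusively

    def read(self, block, node):
        """Return the messages the home node must send before `node` may read."""
        messages = []
        current = self.owner.get(block)
        if current == node:
            return messages                       # requester already owns it
        if current is not None:
            messages.append(("writeback", current))   # fetch the dirty copy
            del self.owner[block]
            self.sharers.setdefault(block, set()).add(current)
        self.sharers.setdefault(block, set()).add(node)
        return messages

    def write(self, block, node):
        """Return the messages the home node must send before `node` may write."""
        messages = []
        current = self.owner.get(block)
        if current == node:
            return messages
        if current is not None:
            messages.append(("flush", current))   # write back, then drop the copy
        for other in sorted(self.sharers.pop(block, set()) - {node}):
            messages.append(("invalidate", other))
        self.owner[block] = node
        return messages


d = Directory()
d.read(0x100, node=0)
d.read(0x100, node=1)
on_write = d.write(0x100, node=2)   # only nodes 0 and 1 are contacted
on_read = d.read(0x100, node=0)     # only the owner, node 2, is contacted
```

Unlike a snoopy bus, messages go only to the nodes the directory lists, so traffic does not grow with every processor added to the network.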
ccNUMA System Architecture (diagram)
 Two nodes, each with a CPU core, cache, memory, directory, I/O, and a network router
 A network fabric connects the routers
SGI Origin System Architecture
SGI CrayLink™
 Node = 2 CPUs and their caches
 Module = Memory + Directory + HUB
 2 Modules per Router
 System = Modules + Routers + CrayLink™ Network

SGI CrayLink™ (figure: Processor, System, Network, Bisectional Bandwidth)
ccNUMA and CPU Scheduling Issues
OS’s Questions

 Single CPU System
   What to schedule next?
 ccNUMA System
   What to schedule next?
   Which CPU to schedule it on?
   Where should the process information be located?
   One or many instances of the OS?
OS’s Choices for a Process

 Single CPU System
   Process has 1 choice
   Process information has 1 choice
 ccNUMA System with N CPUs and M Memories
   Process has N choices
   Process information has M choices per virtual page
   “Distance” between a process and its information
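
One way to make the “distance” idea concrete is to weight each page's expected accesses by the hop count between a candidate CPU and the memory holding that page. This is a minimal sketch, assuming a made-up 3-CPU, 3-memory topology; the hop counts, page names, and access counts are all invented for illustration.

```python
# Hypothetical placement cost model; hop counts and access counts are made up.
def placement_cost(cpu, page_home, accesses, distance):
    """Sum of (accesses to a page) x (hops from cpu to the memory holding it)."""
    return sum(accesses[page] * distance[cpu][home]
               for page, home in page_home.items())


def best_cpu(page_home, accesses, distance):
    """Pick the CPU (one of N choices) with the lowest total access distance."""
    return min(range(len(distance)),
               key=lambda cpu: placement_cost(cpu, page_home, accesses, distance))


# distance[c][m]: hops from CPU c to memory m (N = 3 CPUs, M = 3 memories).
distance = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
page_home = {"code": 0, "heap": 2, "stack": 2}     # virtual page -> memory
accesses = {"code": 10, "heap": 50, "stack": 40}   # expected accesses per page

costs = [placement_cost(c, page_home, accesses, distance) for c in range(3)]
chosen = best_cpu(page_home, accesses, distance)
```

Here CPU 2 wins because the heavily used heap and stack pages live in its local memory. The same table could be read the other way round to pick a memory for each page once the CPU is fixed.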
Context Switch Penalty

 Single CPU System
   Saving/Restoring process state (PCB)
   Scheduling routine
 ccNUMA System
   Saving/Restoring process state (PCB)
   Scheduling routine
   Moving process’s information
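
The two penalty lists differ by a single term, which a small cost model makes explicit. This is a rough sketch; the microsecond figures are placeholders rather than measurements, and the `SwitchCosts` type is hypothetical.

```python
# Hypothetical cost model; every number below is a placeholder, not a measurement.
from dataclasses import dataclass


@dataclass
class SwitchCosts:
    save_restore_us: float      # saving/restoring process state (PCB)
    scheduler_us: float         # running the scheduling routine
    move_page_us: float = 0.0   # moving one page of process information


def switch_penalty(costs, pages_moved=0):
    """Total penalty in microseconds; pages_moved is 0 on a single-CPU system."""
    return (costs.save_restore_us + costs.scheduler_us
            + pages_moved * costs.move_page_us)


costs = SwitchCosts(save_restore_us=2.0, scheduler_us=3.0, move_page_us=10.0)
same_cpu = switch_penalty(costs)                    # no information moves
other_node = switch_penalty(costs, pages_moved=8)   # 8 pages follow the process
```

With these placeholder numbers the movement term dominates as soon as a few pages have to follow the process.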
Some Common Sense

 Replicate parts of the OS across processors
   System calls will happen often
 Minimize process movement
   Cost of moving a process to another CPU is high
   Less than swapping to disk, most of the time
   Higher than simple context switching
 But if you have to move a process
   Minimize the amount of information to move
   Opportunity for a cache?
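
These rules of thumb could be turned into a simple affinity heuristic: stay on the home CPU unless waiting for it costs more than moving, and when a move is unavoidable take only the most-used pages. This is an illustrative sketch only; the function names, thresholds, and page statistics are assumptions, not a description of any real scheduler.

```python
# Illustrative affinity heuristic; thresholds and page statistics are assumptions.
def should_migrate(wait_on_home_us, idle_cpu_available, move_cost_us):
    """Move a process only when waiting for its home CPU costs more than moving."""
    return idle_cpu_available and wait_on_home_us > move_cost_us


def pages_to_move(access_counts, budget):
    """If a move is unavoidable, take only the `budget` most-used pages."""
    hottest = sorted(access_counts, key=access_counts.get, reverse=True)
    return hottest[:budget]


access_counts = {"p0": 5, "p1": 90, "p2": 40, "p3": 1}   # page -> recent accesses
stay = should_migrate(wait_on_home_us=50, idle_cpu_available=True, move_cost_us=200)
move = should_migrate(wait_on_home_us=500, idle_cpu_available=True, move_cost_us=200)
moved = pages_to_move(access_counts, budget=2)
```

Pages left behind stay reachable through the coherent memory system at a higher access cost, so moving only the hot set works much like filling a cache on the new node.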
Conclusion

 Hardware
   Bandwidth and Latency for performance
   Cache Coherency for correctness
 Operating System
   ccNUMA adds complexity in CPU scheduling
   Higher HW performance = Lower Context Switch Penalty => more flexibility in scheduling choices for a process
References

 Alpha
   http://www.digital.com/alphaoem/present/ev7forum98.ppt
   http://www.compaq.com/InnovateForum99/presentation/session31/
   http://www.digital.com/alphaoem/
 AMD
   http://www.amd.com/products/cpg/mpf/speech/slides99.ppt
 SGI
   http://www-europe.sgi.com/origin/numa_tech.html
 Benchmarks
   http://www.spec.org/
   http://www.tpc.org/
Abbreviation Index
AMD - Advanced Micro Devices
SGI - Silicon Graphics Inc.
ECC - Error Correction Code
SECDED - Single Error Correct Double Error Detect
API - Alpha Processor Inc
AGP - Accelerated Graphics Port
DDR DRAM - Double Data Rate Dynamic RAM
LDT - Lightning Data Transport
PCI - Peripheral Component Interconnect
CMOS - Complementary Metal Oxide Semiconductor
CAS - Column Address Strobe
TPC-C - Transaction Processing Performance Council Benchmark
ccNUMA - Cache-Coherent Non-Uniform Memory Access
SMP - Symmetric Multi-Processing