PPT - RAMP - University of California, Berkeley

RAMP-White
Hari Angepat
Derek Chiou
University of Texas at Austin
Motivation
• Coherent shared memory multiprocessor simulator
• Support for existing programming models
  – Operating system support
  – Single image
  – Programming libraries
  – Legacy applications
Outline
• Leon3 Integration
  – Default architecture
  – RAMP-White architecture
  – Baseline Design features
  – Status
• PPC405 MP OS Support
  – Plan9 Operating System
  – Porting Status
Support for Leon3
• Previous design utilized PPC405 hard-cores
  – Dual-node, multi-image Linux kernels
  – Segmented global address space
• Added support to use Leon3/Grlib components
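One common way to segment a global address space is to let the upper address bits select the owning node and the lower bits give the offset within that node's memory. The slides do not say how the earlier PPC405 design laid out its segments, so the sketch below is only an illustration: `SEGMENT_BITS`, `split_address`, and `join_address` are hypothetical names and values.

```python
SEGMENT_BITS = 28  # hypothetical: 256 MiB of address space per node


def split_address(global_addr):
    """Decode a global address into (node ID, offset within that node)."""
    node = global_addr >> SEGMENT_BITS
    offset = global_addr & ((1 << SEGMENT_BITS) - 1)
    return node, offset


def join_address(node, offset):
    """Encode (node ID, offset) back into a global address."""
    if not 0 <= offset < (1 << SEGMENT_BITS):
        raise ValueError("offset does not fit in one segment")
    return (node << SEGMENT_BITS) | offset


print(split_address(0x10000040))  # prints (1, 64)
```

With this kind of layout, each node can tell from the top bits alone whether an access is local or must be forwarded to another node.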
Leon3 Default Architecture
• Original Leon3MP design is a shared-bus model
  – Processor cores share the AHB bus
  – Peripherals connected to AHB
  – Interrupt and DSU have direct processor links
[Figure: two Leon3 cores, each with Mst, Slv, Int, and Dbg ports, on a shared AHB bus alongside the MP IntCntrl, DSU, Eth, and DDR blocks]
RAMP-White System Architecture
• Adapt bus interfaces to a point-to-point connection scheme
[Figure: RAMP-White system architecture. Components shown: two Leon3 cores (Mst, Slv, Int, Dbg ports), each with a Leon3 shim, Intersection Unit, NIU, Router, AHB shim, AHB bus, and DDR2; plus MP IntCntrl, DSU, and Eth]
Processor Support
• IBM PowerPC 405
  – Hard-core, 300 MHz
  – Non-coherent I/D caches
  – PLB bus interface
  – Uni-processor Linux
• Gaisler Leon3
  – Soft-core, ~65 MHz
  – Write-through, snoopy cache-coherent
  – AMBA AHB bus interface
  – SMP Linux
Default Leon3 Core Interfaces
• 4 bi-directional channels
  – Master bus interface
    • Services icache/dcache fills
  – Slave bus interface
    • Invalidates dcache entries via snooping address bus
  – Interrupt channel
    • Driven from multiprocessor interrupt controller
  – Debug channel
    • Grmon DSU interface
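The interplay between the master port (cache fills) and the slave port (snoop invalidations) can be illustrated with a toy model. This is a minimal sketch, not the Leon3 implementation: `Bus`, `Core`, and their methods are hypothetical names, each cache line holds a single word, and a store is assumed to update the writer's cache only on a hit.

```python
class Bus:
    """Shared bus with backing memory; every write is visible to snoopers."""

    def __init__(self):
        self.memory = {}
        self.cores = []

    def read(self, addr):
        return self.memory.get(addr, 0)

    def write(self, writer, addr, value):
        self.memory[addr] = value
        # Every other core observes the write address on its slave port.
        for core in self.cores:
            if core is not writer:
                core.snoop(addr)


class Core:
    def __init__(self, bus):
        self.bus = bus
        self.dcache = {}
        bus.cores.append(self)

    def load(self, addr):
        if addr not in self.dcache:
            # Miss: fill the line through the master port.
            self.dcache[addr] = self.bus.read(addr)
        return self.dcache[addr]

    def store(self, addr, value):
        # Assumption: update the local copy on a hit, no allocate on a miss.
        if addr in self.dcache:
            self.dcache[addr] = value
        # Write-through: memory is always updated.
        self.bus.write(self, addr, value)

    def snoop(self, addr):
        # Invalidate any stale copy of the written address.
        self.dcache.pop(addr, None)


bus = Bus()
cpu0, cpu1 = Core(bus), Core(bus)
cpu1.load(0x100)            # cpu1 caches the old value (0)
cpu0.store(0x100, 42)       # write-through; cpu1's copy is invalidated
print(cpu1.load(0x100))     # prints 42 after a fresh fill
```

Because memory is always current under write-through, invalidating the stale copy is enough; no core ever has to supply data to another.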
Integrating Off-the-Shelf Processor
• Generally there is a tight coupling between pipeline/cache/bus-interface
  – Thus, prefer to keep existing port interface
  – Also provides forward compatibility with soft-cores
• Therefore, add bus shim to convert from processor-specific bus to White connections
[Figure: Leon3 core with its Mst and Slv ports attached to an AHB shim and its Int and Dbg ports attached to an Int/Dbg shim]
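The shim idea, keeping the processor's bus-style port and translating each transaction into a request/reply message pair, can be sketched in software. Everything here is an assumption made for illustration: the `Request` and `Reply` fields, the `BusShim` and `MemoryEndpoint` classes, and the example address are hypothetical and do not reflect the actual RAMP-White message format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    src: int                    # hypothetical ID of the requesting node
    addr: int
    write: bool
    data: Optional[int] = None


@dataclass
class Reply:
    dst: int
    addr: int
    data: Optional[int] = None


class MemoryEndpoint:
    """Stand-in for whatever services requests on the far side."""

    def __init__(self):
        self.mem = {}

    def handle(self, req):
        if req.write:
            self.mem[req.addr] = req.data
            return Reply(dst=req.src, addr=req.addr)
        return Reply(dst=req.src, addr=req.addr, data=self.mem.get(req.addr, 0))


class BusShim:
    """Offers bus-like read/write calls; exchanges messages underneath."""

    def __init__(self, node_id, endpoint):
        self.node_id = node_id
        self.endpoint = endpoint
        self.round_trips = 0    # each bus access costs one request/reply pair

    def _transact(self, req):
        self.round_trips += 1
        reply = self.endpoint.handle(req)
        assert reply.dst == self.node_id
        return reply

    def bus_write(self, addr, data):
        self._transact(Request(self.node_id, addr, True, data))

    def bus_read(self, addr):
        return self._transact(Request(self.node_id, addr, False)).data


shim = BusShim(node_id=0, endpoint=MemoryEndpoint())
shim.bus_write(0x40000000, 7)
print(shim.bus_read(0x40000000), shim.round_trips)  # prints: 7 2
```

The processor side sees only `bus_read` and `bus_write`, so the core itself is untouched; the cost is one message round trip per access.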
Processor Adaptation Issues
• Increased FPGA resources
  – Support for RAMP-White infrastructure
• Performance impact
  – Request/reply interaction adds latency
• Platform detection
  – Static mapping of bus-connected devices
  – Linux/Grmon use configuration registers to detect platform configuration at run-time
  – If devices are connected indirectly, must populate mapping correctly
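The platform-detection concern can be made concrete with a small model of a configuration table that run-time software scans. This is a sketch under stated assumptions: `ConfigRecord`, `build_config_table`, and `probe` are hypothetical names, the record layout is a simplification rather than GRLIB's actual configuration-register format, and all device names and addresses are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigRecord:
    name: str   # hypothetical device label
    base: int   # base address software should use
    size: int   # size of the address range in bytes


def build_config_table(local_devices, remote_devices):
    """Combine directly attached devices with ones reached indirectly."""
    ordered = sorted(list(local_devices) + list(remote_devices),
                     key=lambda rec: rec.base)
    # Reject overlapping ranges: a wrong mapping would misdirect accesses.
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.base + prev.size > cur.base:
            raise ValueError(f"{prev.name} overlaps {cur.name}")
    return ordered


def probe(table, addr):
    """Mimic run-time detection: find the device that owns addr, if any."""
    for rec in table:
        if rec.base <= addr < rec.base + rec.size:
            return rec.name
    return None


table = build_config_table(
    [ConfigRecord("irq_ctrl", 0x80000200, 0x100)],
    [ConfigRecord("ddr2", 0x40000000, 0x10000000),
     ConfigRecord("eth", 0x80000B00, 0x100)],
)
print(probe(table, 0x40001000))  # prints ddr2
```

The point of the sketch is that devices behind a shim must still appear in the table with the addresses software expects, otherwise detection silently misses them.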
Baseline Design
• Dual Leon3 cores with bus shims
• Intersection Unit acts as message handler
  – Will be used to support pluggable coherency engines
• Network interface
• Ring topology
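Routing on a ring is simple enough to sketch in a few lines. The slides do not say whether the ring is unidirectional or bidirectional, so the example below assumes a unidirectional ring in which each router forwards to its successor; `ring_route` is a hypothetical helper, not part of the RAMP-White code.

```python
def ring_route(src, dst, num_nodes):
    """Return the nodes a message visits travelling one way round the ring."""
    if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
        raise ValueError("node ID out of range")
    path = [src]
    node = src
    while node != dst:
        node = (node + 1) % num_nodes  # each router forwards to its successor
        path.append(node)
    return path


print(ring_route(3, 1, 4))  # prints [3, 0, 1]
```

In the dual-node baseline every remote message takes exactly one hop, but the same rule extends unchanged as nodes are added.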
System Software
• Bootstrap:
  – JTAG for initial configuration
  – Ethernet for system init, kernel loading, debug
• Linux 2.6.21 SMP Kernel
• SnapGear Linux root file-system
• Pthreads programming libraries
Current Prototype
• Single FPGA, dual core, RAMP-White infrastructure
• ICache enabled, DCache disabled
• Boots Linux in SMP mode via debug memory initialization
• Ethernet and NFS mount support
• Compact initramfs root file system compiled from SnapGear sources
Future Work
• Near-term
  – Cleanup/bug-fixing/stabilization of dual-node platform
  – Integrate simple microcoded coherency engine for Spring 2008 Parallel Comp Arch class
    • Lab to be given out by last week of Feb
  – Expand cache hierarchy with soft-core cache models
  – Expand design to support multiple FPGAs
Multiprocessor PPC405 OS Support
• Previous PPC RAMP-White design used multiple independent operating system images
  – Pseudo-SMP support in Linux kernel was non-trivial to implement
• Alternative strategy opened up by recent work over the summer by IBM in porting Plan9 to BlueGene PPC
Plan9 Background
• Research OS from Bell Labs, open-sourced in 2000
• Unix-style operating system
• Resources exposed as file trees
• Per-process namespaces
• Standard protocol for sharing resources
Plan9 for HPC Applications
• IBM port of Plan9 on a BlueGene grid
  – Part of DoE FastOS initiative
  – Allows distributed resource management and sharing across a large grid
  – Lighter weight kernel that has less intrusive effect on HPC apps
  – Ported to PPC440 with support for JTAG-based debug and bootstrapping
Plan9 for RAMP-White
• Smaller, lower complexity operating system
• Allows flexible sharing of physical resources
  – Memory, ethernet, disk
• Can expose multiple cores as CPU servers
  – Allows easy task execution/debugging on remote cores
• Leverage resurgent interest in using Plan9 for HPC
  – HPC applications ported to Plan9
Porting Plan9
• Worked in collaboration with Eric Van Hensbergen
• Resurrected the CerfCube 405 platform
  – Removed assumptions regarding PPC405 SoC
  – Adding support for Xilinx peripherals
    • Serial port, interrupt controller, network
• Currently boots Plan9 with limited console
  – Bootstrap, virtual memory, console initialization
  – Still completing work on interrupt controller, network
Questions?