360N: Computer Architecture Spring 2005

RAMP-White
Derek Chiou
The University of Texas at Austin
© Derek Chiou

High Level Characteristics
Coherent distributed shared memory machine
Scalable at the same level as other RAMP machines
  1K eventual target
Intended to be ISA/Architecture independent
  Use different cores
  All RAMP efforts are intended to be ISA independent
Intended to integrate components from other RAMP participants
  A testbed for sharing of IP

Our Additions
New code in Bluespec rather than Verilog/VHDL
  More configurable
  That’s what my group is using
Embedded PowerPC as one core
  Leon was decided on after we started, and only recently boots Debian
  Wanted to determine issues of different cores
  My research needs fast cores
Eventually an SMP OS, initially multi-OS shared space

Issues
Architecture
Implementation
Operating System
Sharing IP
  Language
  Maturity
  Infrastructure (CVS, etc.)

Three Stages (for Implementation Ease)
Incoherent shared memory
  No hardware global cache, just global shared memory support
    Optimal cache for local memory
  However, software can maintain coherence if necessary
    Network virtual memory
    Run a simulator on top of the processor
  Good for testing, modeling many targets
Ring-based coherence (scalable bus)
  Requires a coherent cache
  Running essentially a snoopy protocol
  True coherence engine not required
  But, very restricted communication
General network-based coherence
  Requires general coherence engine

Generalized Architecture
[Block diagram: Proc, $ (cache), Intersection Unit (IU), Network Interface Unit (NIU), PLB, Mem MC and OPB bridge, with blocks marked as ISA dependent or ISA independent]

Intersection Unit
Sits between the Processor (cache), PLB bus and NIU
Processor interface
  Slave
  Eventually snoop
Memory interface (PLB)
  Master
  Slave
  Eventually snoop
Network interface
  Master
Programmable regions
  Global (local and remote)
  Local
Incoherent version is a special case
Hooks for coherency engine
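The programmable-region idea can be sketched in software as an address decode step. This is only an illustrative model, not the Bluespec implementation: the `Region` and `IntersectionUnit` classes, the page size, and the round-robin interleaving of global pages across nodes are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL = "local"                  # private memory, served over the PLB
    GLOBAL_LOCAL = "global-local"    # global page whose home is this node
    GLOBAL_REMOTE = "global-remote"  # global page homed elsewhere, goes to the NIU


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    is_global: bool

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


class IntersectionUnit:
    """Hypothetical model of the IU address decode (not the real hardware)."""

    def __init__(self, node_id, num_nodes, regions, page_size=4096):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.regions = list(regions)
        self.page_size = page_size

    def home_node(self, addr, region):
        # Assumption: global pages are interleaved round-robin across nodes.
        page = (addr - region.base) // self.page_size
        return page % self.num_nodes

    def route(self, addr):
        """Decide where a processor request for addr should be sent."""
        for region in self.regions:
            if region.contains(addr):
                if not region.is_global:
                    return Target.LOCAL
                if self.home_node(addr, region) == self.node_id:
                    return Target.GLOBAL_LOCAL
                return Target.GLOBAL_REMOTE
        raise ValueError(f"address {addr:#x} is not in any programmed region")
```

With two nodes, one local region and one global region, node 0 would serve the first global page itself and forward the second one to the NIU. The incoherent version then falls out as the case where no snooping or coherence hooks are attached to either path.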

Network Interface Unit
Split into two components
  Msg composition/Queuing
  Net transmit/receive
    Insert/extract for ring
    Intended to permit other transmit/receive
One input/one output
  Creates a simple unidirectional ring
  Can interface to more advanced fabrics
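As a rough illustration of insert/extract on a one-input/one-output unidirectional ring, the following sketch simulates a few NIUs in software. The `Message` and `RingNIU` classes, the `run_ring` helper, and the policy that ring traffic has priority over local inserts are assumptions for illustration, not details taken from the slides.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    src: int
    dst: int
    payload: str


class RingNIU:
    """Hypothetical NIU model: a send queue plus ring insert/extract."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.send_queue = deque()  # message composition/queuing half
        self.received = []         # messages extracted for this node

    def send(self, dst, payload):
        self.send_queue.append(Message(self.node_id, dst, payload))

    def step(self, incoming):
        """One ring slot: consume one input, produce one output (or None)."""
        if incoming is not None:
            if incoming.dst == self.node_id:
                self.received.append(incoming)  # extract; slot is now free
            else:
                return incoming                 # forward; ring traffic wins
        if self.send_queue:
            return self.send_queue.popleft()    # insert into the free slot
        return None


def run_ring(nius, cycles):
    """Advance the ring; node i's output feeds node (i + 1) % n."""
    n = len(nius)
    links = [None] * n  # links[i] is the message arriving at node i
    for _ in range(cycles):
        outputs = [niu.step(links[i]) for i, niu in enumerate(nius)]
        links = [outputs[(i - 1) % n] for i in range(n)]
```

Because each node only ever looks at its single input and single output, the transmit/receive half could in principle be swapped for a different fabric without touching the queuing half.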

Operating System
Started by looking at PowerPC
Wanted an SMP OS
  Knew we didn’t have coherent cache
  But, also missing TLB Invalidation & OpenPIC (interprocessor interrupts, bring-up)
  But, do have load-reservation/store-conditional instructions
Leon is SMP-capable, so should avoid these issues
Starting with separate OSes
  Region of memory is global
  mmap
  (no Block Address Translation (BAT) so need to manage global pages)
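The multi-OS approach, where each OS image maps the same global region with mmap, can be imitated on a single host. In this sketch a temporary file stands in for the global memory region and two mappings in one process stand in for two OS images; the region size and the helper names `open_global_region` and `demo` are hypothetical.

```python
import mmap
import os
import struct
import tempfile

GLOBAL_REGION_SIZE = mmap.PAGESIZE  # assumption: one page of global memory


def open_global_region(path):
    """Map the backing file; each OS image would do this independently."""
    fd = os.open(path, os.O_RDWR)
    try:
        # A file-backed mapping is shared by default, so writes are
        # visible through every other mapping of the same file.
        return mmap.mmap(fd, GLOBAL_REGION_SIZE)
    finally:
        os.close(fd)


def demo():
    # The temporary file is a stand-in for the hardware global region.
    with tempfile.NamedTemporaryFile(delete=False) as backing:
        backing.write(b"\0" * GLOBAL_REGION_SIZE)
        path = backing.name
    image_a = open_global_region(path)
    image_b = open_global_region(path)
    try:
        struct.pack_into("<I", image_a, 0, 0xCAFE)     # "image A" writes
        return struct.unpack_from("<I", image_b, 0)[0]  # "image B" reads
    finally:
        image_a.close()
        image_b.close()
        os.unlink(path)
```

On a single host the page cache keeps the two mappings consistent. On incoherent shared memory hardware, software would additionally have to manage caching of the global pages itself.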

Status: Hari Angepat
Bluespec learned
NIU code complete and unit tested
IU code complete, being tested on XUP
  2 PowerPC processors
  2 IUs, 1 MC with an arbiter
Supports interfaces
  Processor Slave
  PLB Master
  NIU
Targets Phase 1 (incoherent shared memory)
Hardware intended to target different ISAs
Some preliminary OS work
  SMP-linux investigation
  Multi-image mmap interface currently targeted
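The slides say only that the two IUs share one memory controller through an arbiter. One common way to do that is round-robin arbitration, sketched below. The round-robin policy and the `RoundRobinArbiter` class are assumptions; the actual arbitration scheme is not described in the source.

```python
class RoundRobinArbiter:
    """Hypothetical arbiter granting one requester per cycle, in rotation."""

    def __init__(self, num_requesters):
        self.num_requesters = num_requesters
        # Start so that requester 0 wins the first contested cycle.
        self.last_grant = num_requesters - 1

    def grant(self, requests):
        """requests[i] is True if requester i wants the memory controller.

        Returns the index granted this cycle, or None if nobody is asking.
        """
        for offset in range(1, self.num_requesters + 1):
            candidate = (self.last_grant + offset) % self.num_requesters
            if requests[candidate]:
                self.last_grant = candidate
                return candidate
        return None
```

With both IUs requesting every cycle, grants alternate between them, so neither IU can starve the other.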

Our Long Term Plans
Phase 1, XUP, complete end of 1Q07
  With multi-OS support (with help from Stanford?)
Phase 2, 1 BEE2 board, hopefully 2Q07
  Larger scalability, BEE2, Berkeley MC, Leon?, RDL?
Phase 3, hopefully 4Q07
  Arbitrary network, cache coherency engine, SMP OS?, Leon?, RDL?
x86 CMP/SMP on top of RAMP-White
  Full cycle accurate (separate timing model)
  RAMP-White executes functional model in parallel
    Heterogeneous hosts!
  Start with Phase 1 (separate team)
  For Phase 3, tie target coherence system to RAMP-White
    Cache maintained by target coherence, not by host coherence

Sharing IP: Some Preliminary Experience
We looked at RAMP-Red XUP
  Used some code (PLB master)
  Red-BEE is not ready to distribute
Berkeley’s code on CVS repository
  Looking for switch code
  But, we can’t use memory controller because we don’t have BEE2 board yet
Bluespec
  We are spinning almost all of our own code right now
Would like to steal software
  OS (kernel proxy)
  SMP OS port
  MPI reference design in BEE2 repository
    Is that RAMP-Blue?
Naming
A central CVS repository for RAMP code?

Sharing Over the Long Term
Processor is shared
  Leon
  PowerPC
  MicroBlaze
MC is shared
  Xilinx or Berkeley
Coherent cache can be shared
  Transactional/traditional
  Borrow Stanford’s?
Coherency engine can be shared
  CMU/Stanford
IU functionality can be shared
  Trying to make ours general
NIU can be shared
  Borrow half from Berkeley?
Network can be shared
  Borrow Berkeley’s?
Everything else
  Peripherals

Conclusions
RAMP-White is started
  Hari has been working full time for 1 semester
  Have a clear first direction
  Architecture looks fairly flexible
Would like to discuss how to share IP better so we don’t reinvent the wheel