RISC ARCHITECTURE BY TEDDY LEE

RISC ARCHITECTURE
BY
TEDDY LEE
TOPICS
• REVIEW OF RISC
• RISC ARCHITECTURE
• RISC VS. CISC
• PA-RISC HP ARCHITECTURE
RISC REVIEW
RISC – Reduced Instruction Set Computer
Main Features:
• One cycle execution
• Pipelining
• Large Number of Registers
RISC ARCHITECTURE
Features:
• Word Width
• Split or common cache
• On-chip or off-chip cache
• Write buffer
• Prefetch buffer
• Harvard or Princeton Architecture
• Common register file or private registers
FEATURES
• Word Width:
Most RISC processors use a 32-bit internal and external
word width
• Split or Common Cache:
A cache is needed between the RISC processor and main
memory; it can be split into separate instruction and data
caches or shared as one common cache
FEATURES cont.
• On-Chip or Off-Chip Cache:
The choice is a design trade-off: an on-chip cache shortens access
time, while an off-chip cache simplifies the design of the integer unit
Example – SPARC chip
• Write Buffer:
Holds pending stores so the processor does not have to wait
for each write to memory to complete
FEATURES cont.
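• Prefetch Buffer:
Fetches instructions ahead of execution so they are ready
when the processor needs them
• Harvard or Princeton Architecture:
Harvard designs use separate paths for instructions and data;
Princeton designs share one path for both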
Examples:
Motorola 88000
MIPS R3000
FEATURES cont.
• Common Register File or Private Registers
Common Register File – can be accessed by all execution units
Private Registers – each set belongs to one execution unit
Examples:
Motorola 88000
IBM RS/6000
RISC VS. CISC
Multiplication Example
Let’s find the product of two numbers, one stored at address
location 2:3 and the other at 5:2, and then store the result
back into 2:3.
RISC VS. CISC cont.
CISC Approach:
MULT 2:3, 5:2
Higher Level Language:
int a, b;
a = a * b;
RISC VS. CISC cont.
RISC APPROACH:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
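The two approaches above can be compared with a small Python sketch. It is a toy model only: the dictionary used as memory, the function names `cisc_mult` and `risc_mult`, and the sample values 6 and 7 are assumptions made for illustration, not part of any real instruction set.

```python
# Toy model of the multiplication example. The memory layout and the
# function names are illustrative assumptions, not a real ISA.

def cisc_mult(memory, dst, src):
    """One complex instruction: MULT dst, src operates directly on memory."""
    memory[dst] = memory[dst] * memory[src]
    return 1  # number of instructions issued


def risc_mult(memory, dst, src):
    """Four simple instructions: LOAD, LOAD, PROD, STORE."""
    regs = {}
    regs["A"] = memory[dst]              # LOAD A, 2:3
    regs["B"] = memory[src]              # LOAD B, 5:2
    regs["A"] = regs["A"] * regs["B"]    # PROD A, B
    memory[dst] = regs["A"]              # STORE 2:3, A
    return 4  # number of instructions issued


mem_cisc = {"2:3": 6, "5:2": 7}
mem_risc = dict(mem_cisc)
n_cisc = cisc_mult(mem_cisc, "2:3", "5:2")
n_risc = risc_mult(mem_risc, "2:3", "5:2")
print(mem_cisc["2:3"], n_cisc)
print(mem_risc["2:3"], n_risc)
```

Both versions leave the same product in location 2:3; they differ only in how many instructions are issued and where the operands live while the work is done.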
RISC VS. CISC cont.
ADVANTAGES
CISC                                         RISC
Emphasis on hardware                         Emphasis on software
Includes multi-clock complex instructions    Single-clock, reduced instruction only
Memory-to-memory: “LOAD” and “STORE”         Register to register: “LOAD” and “STORE”
incorporated in instructions                 are independent instructions
Small code sizes, high cycles per second     Low cycles per second, large code sizes
Transistors used for storing                 Spends more transistors
complex instructions                         on memory registers
RISC VS. CISC cont.
The Performance Equation
CISC APPROACH ANALYSIS: CISC attempts to minimize the
number of instructions per program, sacrificing the number of
cycles per instruction.
RISC APPROACH ANALYSIS: RISC does the opposite,
reducing the cycles per instruction at the cost of the number of
instructions per program.
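The trade-off described above is usually written as the following equation. This is the standard textbook form, given here as an assumption about which equation the slide showed:

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

CISC aims to shrink the last factor and RISC the middle one; the first factor is the clock period.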
PA-RISC Architecture from HP
What is PA-RISC?
•Definition: PA-RISC stands for Precision Architecture,
Reduced Instruction Set Computer
• Current versions run the MPE/IX and HP-UX
operating systems.
•Hewlett-Packard was the first computer company to replace
their entire CISC machine families with RISC machines.
PA-RISC HP cont.
Some Features:
• Machine Instruction Formats
• Registers
• Delayed Branching
• Multiply Instruction
Machine Instruction Format
PA-RISC machines use an instruction set based on 32-bit
general-purpose registers
LDW    d(s,b),t        d     = displacement
STH    r,d(s,b)        t     = target register
LDO    d(b),t          i     = immediate value
LDIL   i,t             s     = space id
EXTRS  r,p,len,t       r     = source register
                       p     = bit position
                       b     = base register
                       (s,b) = memory address
                       len   = number of bits
Assembler Code         Explanation
LDIL  $2000,8          R8:=$2000
LDO   32(8),8          R8:=R8 + 32
LDW   -12(0,5),21      R21:=memory(R5-12)
STH   8,0(0,21)        memory(R21):=R8
LDH   -8(0,5),22       R22:=memory(R5-8)
EXTRS 22,31,16,22      R22:=sign-extended(R22)
LDO   -1(22),22        R22:=R22 - 1
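The last three instructions of the assembler sequence (load a halfword, sign-extend it, subtract one) can be sketched in Python. This is a minimal model, assuming bits are numbered from 0 (most significant) to 31 (least significant) and that EXTRS takes the field of `len` bits whose rightmost bit is at position `p`. The function name `extrs` and the sample halfword are hypothetical.

```python
def extrs(value, p, length):
    """Extract `length` bits whose rightmost bit is bit `p`, sign-extended.

    Assumes a 32-bit word with bits numbered 0 (most significant)
    to 31 (least significant).
    """
    shift = 31 - p                                  # distance of bit p from the LSB
    field = (value >> shift) & ((1 << length) - 1)  # isolate the field
    sign_bit = 1 << (length - 1)
    if field & sign_bit:                            # negative: extend with ones
        field -= 1 << length
    return field


# LDH -8(0,5),22 would leave a halfword in the low 16 bits of R22.
r22 = 0x0000FFFE            # sample halfword 0xFFFE, which is -2 as a 16-bit value
r22 = extrs(r22, 31, 16)    # EXTRS 22,31,16,22
r22 = r22 - 1               # LDO -1(22),22
print(r22)                  # -3
```

The sketch ignores memory and space registers; it only shows why the sign extension is needed before the halfword can be used in 32-bit arithmetic.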
Machine Instruction Format cont.
Arithmetic:          ADD@ and SUB@.
Branches:            B@ as in BL Branch and Link, BV Branch Vectored.
Compare and Branch:  C@ as in COMIBF, COMpare Immediate and Branch If False.
Extract:             EXTRS for signed and EXTRU for unsigned.
Load:                L@ as in LDH load halfword, LDO load offset.
Shift:               SH@ as in SH2ADD Shift 2 and Add.
Store:               ST@ as in STB Store Byte, STW Store Word.
Registers
The PA-RISC uses 32 general purpose registers
R0     = bit bucket and source of zero value
R1     = target of ADDIL (Add Immediate Literal)
R2     = RP Return Pointer where BL places address and where BV gets it
R23    = fourth parameter of a procedure call
R24    = third parameter of a procedure call
R25    = second parameter of a procedure call
R26    = first parameter of a procedure call
R27    = DP Data Pointer to base of global data
R28-29 = function result in R28 if 32-bits, both if 64-bits
R30    = SP Stack Pointer to parameters and exit data
R31    = receives target branch address in BLE instruction
Registers cont.
PA-RISC systems also have eight 32-bit Space Registers
SR 0 = return address of inter-space procedure calls
SR 1 = Temporary use for constructing long pointers
SR 2 = Temporary use for constructing long pointers
SR 3 = Temporary use for constructing long pointers
SR 4 = Code space
SR 5 = process private data: stack and heap
SR 6 = Shared data
SR 7 = System public code, literals, and data
Delayed Branching
Problems with Branching
• The ideal goal for the PA-RISC architecture is to complete the
execution of a useful instruction in each machine cycle. The
branch instruction is hard to implement in one cycle.
• Pipelining overlaps the execution of several instructions, but a
branch disrupts the pipeline: the instruction fetched right after the
branch may not be the one that should execute next.
Delayed Branching cont.
Solution
• Delay the effect of the branch by
one cycle
• The instruction that follows the branch
(located in a delay slot) is executed
before control passes to the branch
destination.
• Let the compiler look for an instruction
to put in the delay slot, one that can be
executed during the branch operation
Delayed Branching cont.
Example:
BL opencarton ; branch
LDW 26 ... ; load word into register during delay
BL closecarton ; branch
NOP ; code 8000240, actually OR 0,0,0
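The behaviour in the example above can be imitated with a toy interpreter. This is a rough sketch under simplifying assumptions: the tuple-based program format, the `run` function and the `HALT` pseudo-instruction are invented for illustration, and the branch is a plain jump that does not model the link and return of BL.

```python
def run(program, delay_slot=True):
    """Execute a toy program and return the order in which operations run.

    Each entry is (label, op, arg). 'B' branches to the label named in arg,
    'HALT' stops, and anything else is treated as an ordinary instruction.
    """
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    trace = []
    pc = 0
    while pc < len(program):
        _, op, arg = program[pc]
        if op == "HALT":
            break
        if op == "B":
            trace.append(f"B {arg}")
            if delay_slot and pc + 1 < len(program):
                # The instruction after the branch runs before the jump lands.
                _, slot_op, slot_arg = program[pc + 1]
                trace.append(f"{slot_op} {slot_arg} (delay slot)")
            pc = labels[arg]
        else:
            trace.append(f"{op} {arg}")
            pc += 1
    return trace


program = [
    (None, "B", "opencarton"),
    (None, "LDW", "26"),           # fills the delay slot
    (None, "HALT", ""),
    ("opencarton", "ADD", "1"),
    (None, "HALT", ""),
]
print(run(program))
print(run(program, delay_slot=False))
```

With the delay slot enabled the load is executed between the branch and its target; without it the load is skipped, which is why a compiler that finds nothing useful must fill the slot with a NOP.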
Delayed Branching cont.
Delayed branching is as if we could pack our bags while flying to our
destination:
1. book our flight
2. reserve hotel room
3. reserve rental car
4. fly to destination
5. (pack suitcase during the delay slot)
6. collect baggage
7. get rental car
8. check into hotel
Multiplying Instruction
• Integer multiply and divide are not supported by hardware on a PA-RISC
• Multiplication by a constant is frequent, so the compiler optimizes it
at compile time
• A multiply can be converted to a series of additions and smaller multiplications
Example:
120 = 10 x 12
120 = (5 x 12) + (5 x 12)
120 = ((4 x 12) + 12) + ((4 x 12) + 12)
Multiplying Instruction cont.
Solution
• Use Shift and Add machine instructions to multiply a register by 2, 4,
or 8 and add to any register in one cycle.
SH2ADD x,x,x ; shift x 2 bits (multiply by 4), add to x, store in x
ADD x,x,x    ; add register x to itself and store in x
• The compiler can convert a multiplication by a constant into a series of
Shift and Add instructions
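The two-instruction sequence above multiplies x by 10, which a short Python sketch can confirm. The helper names `sh2add`, `times10` and `mul_by_constant` are hypothetical, and the general routine uses a plain binary decomposition of the constant, which is not necessarily the sequence a PA-RISC compiler would choose.

```python
def sh2add(a, b):
    """Model of SH2ADD: shift a left by 2 bits (multiply by 4), then add b."""
    return (a << 2) + b


def times10(x):
    """Multiply by 10 with the two instructions from the slide."""
    x = sh2add(x, x)    # SH2ADD x,x,x : 4x + x = 5x
    x = x + x           # ADD x,x,x    : 5x + 5x = 10x
    return x


def mul_by_constant(x, c):
    """Multiply x by a non-negative constant using only shifts and adds."""
    result = 0
    shift = 0
    while c:
        if c & 1:                   # this bit of the constant is set
            result += x << shift    # add x * 2**shift
        c >>= 1
        shift += 1
    return result


print(times10(12))                  # 120
print(mul_by_constant(12, 10))      # 120
```

Both calls reproduce the 120 = 10 x 12 example from the previous slide without a multiply instruction.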
RISC Systems and Processors
• PA-RISC Systems - http://www.testdrive.hp.com/systems/pa-risc.shtml
• RISC Processor - http://www.atmel.com/products/avr/
• MIPS Technology - http://www.mips.com/
References
1. http://www.robelle.com/library/smugbook/pa-risc.html
2. http://cse.stanford.edu/class/sophomore-college/projects00/risc/whatis/index.html
3. http://www.inf.fu-berlin.de/lehre/WS94/RA/RISC-9.html
4. Anthony J. Dos Reis, Assembly Language and Computer Architecture Using C++ and Java (United States: Course Technology, 2004).