Jan 26 - Zhang Penghui - Jan 25, 2015 622 PM
Lecture 2-Berkeley RISC
Penghui Zhang
Guanming Wang
Hang Zhang
1. What Is RISC?
1.1 RISC idea
developed from the realization that the vast majority of
programs did not use the vast majority of a processor’s
instructions.
including only those instructions that were really used
using the space that had been used for the removed
circuitry for other circuits that would speed the system
up instead.
1.2 How RISC achieves its goal
adding many more registers
small bits of memory hold temporary values that can be
accessed at negligible cost
the speed of the processor would be more closely
defined by its clock speed
1.3 Comparison Between RISC and CISC
2. RISC I
2.1 RISC I Design Goals
High-level language programming
Cost-effective system in both hardware and software
Simple instructions, each one word (32 bits) long
“cost” of each statement type
2.2 RISC I Architecture
31 instructions in a few similar formats, all 32 bits long
Execution time
Instructions between registers and memory
2.3 Micro-architecture of RISC I
Instruction execution pattern
1. Read two registers
2. Perform an operation on them
3. Store the result
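The three-step pattern above can be sketched as a small register-to-register executor. This is only an illustration: the operation names, the `execute` helper, and the 32-entry register list are assumptions made for the example, not RISC I mnemonics or its actual data path.

```python
import operator

# Hypothetical operation table; names are illustrative, not RISC I mnemonics.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}

MASK32 = 0xFFFFFFFF  # keep results within one 32-bit word


def execute(regs, op, dest, src1, src2):
    """Run one register-to-register instruction on the list `regs`."""
    a = regs[src1]                    # 1. read two registers
    b = regs[src2]
    result = OPS[op](a, b) & MASK32   # 2. perform an operation on them
    regs[dest] = result               # 3. store the result
    return result


regs = [0] * 32
regs[1] = 7
regs[2] = 5
execute(regs, "add", 3, 1, 2)
print(regs[3])  # 12
```

Every instruction in this sketch follows the same read, operate, store sequence, which is what makes a regular data path possible.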
EX. Data-Path of RISC I Chip
2.4 Design environment of RISC I
UNIX environment on a VAX 11/780
Regular parts of the chip
Control section
3. RISC II
3.1 Background
RISC II microprocessor
• Meets the requirements identified by the code analysis.
• The majority of the chip is occupied by the data unit.
• Conventional microprocessors, by contrast, were dominated by
control.
• The majority of the data unit consists of a large register
file with 138 registers.
• RISC work at Berkeley had turned from the Gold design to the
new Blue design.
• The savings due to the new design were tremendous.
• Gold contained 78 registers in 6 windows.
• Blue contained 138 registers: 8 windows of 16 registers each,
plus another 10 globals.
• The final Blue design, fabbed as RISC II, implemented all of
the RISC instruction set with only 39,000 transistors.
The RISC II register file
3.2 Difference
The key difference was simpler cache circuitry that
eliminated one line per bit
The other major change was to include an "instruction-format expander"
RISC II proved to be much more successful in silicon and
in testing outperformed almost all minicomputers on
almost all tasks.
3.3 Architecture Of RISC II
It is the evolution of the RISC I design.
Reading is accomplished by selectively discharging one
of the two precharged bit-line buses
The design is based on a two-bus, two-port register
cell.
The RISC II architecture used a three-stage pipeline.
Data Path of RISC II
3.4 Implementation
Three Machine Cycles:
Instruction fetch and decode.
Register read, operate, and temporary latching of result.
Write result back into the register file.
These three cycles are overlapped
A new instruction begins every machine cycle,
except for Load and Store instructions.
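The overlap of the three machine cycles can be illustrated with a small timing sketch. The slide only says that Load and Store are exceptions; the one-cycle penalty used here (`load_store_extra=1`), the function names, and the instruction names are assumptions made for illustration.

```python
STAGES = ("fetch/decode", "read/operate", "write-back")


def schedule(instructions, load_store_extra=1):
    """Return (name, start_cycle, end_cycle) for each instruction.

    Assumption: a load or store occupies `load_store_extra` additional
    cycles and delays the start of the next instruction by the same amount.
    """
    timeline = []
    start = 0
    for name in instructions:
        extra = load_store_extra if name in ("load", "store") else 0
        end = start + len(STAGES) - 1 + extra
        timeline.append((name, start, end))
        start += 1 + extra  # normally a new instruction starts every cycle
    return timeline


def total_cycles(instructions, load_store_extra=1):
    """Number of cycles until the last instruction has finished."""
    timeline = schedule(instructions, load_store_extra)
    return max(end for _, _, end in timeline) + 1 if timeline else 0


print(total_cycles(["add", "sub", "or"]))    # 5
print(total_cycles(["add", "load", "sub"]))  # 6
```

Without overlap, three instructions of three cycles each would need nine cycles; with overlap they finish in five under these assumptions.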
4. Architectural inheritance
4.1 Features used
A load-store architecture
Fixed-length 32-bit instructions
3-address instruction formats
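A fixed-length, 3-address format can be sketched as bit-field packing into one 32-bit word. The field layout below (7-bit opcode, three 5-bit register fields, low 10 bits unused) is a hypothetical layout chosen for the example; it is not the actual RISC or ARM encoding.

```python
# Hypothetical field layout for illustration only:
# [31:25] opcode, [24:20] dest, [19:15] src1, [14:10] src2, [9:0] unused
OPCODE_SHIFT, DEST_SHIFT, SRC1_SHIFT, SRC2_SHIFT = 25, 20, 15, 10


def encode(opcode, dest, src1, src2):
    """Pack a 3-address instruction into a single 32-bit word."""
    if not 0 <= opcode < 128:
        raise ValueError("opcode must fit in 7 bits")
    for reg in (dest, src1, src2):
        if not 0 <= reg < 32:
            raise ValueError("register number must fit in 5 bits")
    return (
        (opcode << OPCODE_SHIFT)
        | (dest << DEST_SHIFT)
        | (src1 << SRC1_SHIFT)
        | (src2 << SRC2_SHIFT)
    )


def decode(word):
    """Unpack a 32-bit word into (opcode, dest, src1, src2)."""
    return (
        (word >> OPCODE_SHIFT) & 0x7F,
        (word >> DEST_SHIFT) & 0x1F,
        (word >> SRC1_SHIFT) & 0x1F,
        (word >> SRC2_SHIFT) & 0x1F,
    )


word = encode(0x01, 3, 1, 2)  # e.g. "r3 = r1 op r2" with made-up opcode 1
print(hex(word))              # 0x2308800
print(decode(word))           # (1, 3, 1, 2)
```

Because every instruction has the same length and the fields sit at fixed positions, decoding reduces to a few shifts and masks.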
4.2 Features rejected
4.2.1 Register windows
The register banks on the Berkeley RISC processors
incorporated a large number of registers, 32 of which were
visible at any time
Procedure entry and exit instructions moved the visible
‘window’ to give each procedure access to new registers
The principal problem with register windows is the large
chip area occupied by the large number of registers
This feature was therefore rejected on cost grounds
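The window mechanism can be sketched as a mapping from a visible register number to a physical register. The totals (138 physical registers, 8 windows of 16, 10 globals, 32 visible) come from the slides; the split of the visible registers into 10 globals, 6 outgoing, 10 local, and 6 incoming, the direction in which a call moves the window, and all names below are assumptions made for this example.

```python
GLOBALS = 10      # r0..r9, shared by every procedure
WINDOWS = 8
PER_WINDOW = 16   # registers owned by each window
OUTGOING = 6      # assumed overlap shared between caller and callee


def physical(reg, cwp):
    """Map visible register `reg` (0..31) in window `cwp` to a physical index."""
    if not 0 <= reg < 32:
        raise ValueError("only 32 registers are visible at any time")
    if reg < GLOBALS:                       # r0..r9: globals
        return reg
    base = GLOBALS + PER_WINDOW * cwp
    if reg < 16:                            # r10..r15: outgoing parameters
        return base + 10 + (reg - 10)
    if reg < 26:                            # r16..r25: locals
        return base + (reg - 16)
    caller = (cwp - 1) % WINDOWS            # r26..r31: the caller's outgoing
    return GLOBALS + PER_WINDOW * caller + 10 + (reg - 26)


def call(cwp):
    """Procedure entry: move the visible window (direction is an assumption)."""
    return (cwp + 1) % WINDOWS


caller_cwp = 0
callee_cwp = call(caller_cwp)
# A value the caller writes to r10 is visible to the callee as r26.
print(physical(10, caller_cwp) == physical(26, callee_cwp))  # True
```

Under these assumptions, parameters are passed without copying, but the register file needs 10 + 8 × 16 = 138 cells, which is the chip-area cost the slide refers to.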
4.2.2 Delayed branches
Branches cause pipeline problems since they interrupt the
smooth flow of instructions
Most RISC processors ameliorate the problem by using
delayed branches where the branch takes effect after the
following instruction has executed
On the original ARM delayed branches were not used
because they made exception handling more complex
In the long run this has turned out to be a good decision
since it simplifies re-implementing the architecture with a
different pipeline
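The effect of a delayed branch can be sketched with a toy interpreter. The two-instruction "ISA" (`set`, `jump`) and the `run` helper are hypothetical and exist only for this illustration; a branch placed in a delay slot is not given any special treatment.

```python
def run(program, delayed):
    """Execute `program` and return (trace of executed addresses, registers).

    With delayed=True a jump takes effect only after the following
    instruction (the delay slot) has executed.
    """
    regs = {}
    trace = []
    pc = 0
    pending = None  # branch target waiting for the delay slot to finish
    while pc < len(program):
        instr = program[pc]
        trace.append(pc)
        taken, pending = pending, None
        next_pc = pc + 1
        if instr[0] == "set":
            regs[instr[1]] = instr[2]
        elif instr[0] == "jump":
            if delayed:
                pending = instr[1]   # takes effect after the next instruction
            else:
                next_pc = instr[1]   # takes effect immediately
        if taken is not None:
            next_pc = taken          # the delay slot has just executed
        pc = next_pc
    return trace, regs


program = [
    ("jump", 3),     # 0
    ("set", "a", 1), # 1: delay slot
    ("set", "b", 2), # 2: skipped in both modes
    ("set", "c", 3), # 3: branch target
]

print(run(program, delayed=True)[0])   # [0, 1, 3]
print(run(program, delayed=False)[0])  # [0, 3]
```

In the delayed case the instruction at address 1 runs even though the branch precedes it, which is the behavior that ties the programmer-visible semantics to a particular pipeline.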
4.2.3 Single-cycle execution of all instructions
Although the ARM executes most data processing
instructions in a single clock cycle, many other instructions
take multiple clock cycles
Single cycle operation of all instructions is only possible with
separate data and instruction memories, which were
considered too expensive for the intended ARM application
areas
The ARM was designed to use the minimum number of
cycles required for memory access