Transcript CE550
Computer Organization EECC 550
Week 1
Week 2
Week 3
• Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Notation (RTN). [Chapters 1, 2]
• Instruction Set Architecture (ISA) Characteristics and Classifications: CISC Vs. RISC. [Chapter 2]
• MIPS: An Example RISC ISA. Syntax, Instruction Formats, Addressing Modes, Encoding & Examples. [Chapter 2]
• Central Processor Unit (CPU) & Computer System Performance Measures. [Chapter 1]
• CPU Organization: Datapath & Control Unit Design. [Chapter 4]
Week 4
Week 5
• MIPS Single Cycle Datapath & Control Unit Design.
– MIPS Multicycle Datapath and Finite State Machine Control Unit Design.
Week 6
Week 7
Week 8
3rd Edition Ch. 5
– Microprogrammed Control Unit Design.
3rd Edition Ch. 4
3rd Edition Ch. 5 (not in 4th)
3rd Edition Ch. 5 (not in 4th Edition)
Microprogramming Project
• Midterm Review and Midterm Exam
• CPU Pipelining. [Chapter 4]
• The Memory Hierarchy: Cache Design & Performance. [Chapter 5]
3rd Edition Ch. 6
3rd Edition Ch. 7
• The Memory Hierarchy: Main & Virtual Memory. [Chapter 5]
Week 9
• Input/Output Organization & System Performance Evaluation. [Chapter 7]
Week 10
• Computer Arithmetic & ALU Design. [Chapter 3] If time permits.
Week 11
• Final Exam.
3rd Edition Ch. 8
EECC550 - Shaaban
#1 Lec # 1 Winter 2011 11-29-2011
Computing System History/Trends +
Instruction Set Architecture (ISA) Fundamentals
• Computing Element Choices:
– Computing Element Programmability
– Spatial vs. Temporal Computing
– Main Processor Types/Applications
• General Purpose Processor Generations
• The Von Neumann Computer Model
• CPU Organization (Design)
• Recent Trends in Computer Design/performance
• Hierarchy of Computer Architecture
• Hardware Description: Register Transfer Notation (RTN)
• Computer Architecture Vs. Computer Organization
• Instruction Set Architecture (ISA):
– Definition and purpose
– ISA Specification Requirements
– Main General Types of Instructions
– ISA Types and characteristics
– Typical ISA Addressing Modes
– Instruction Set Encoding
– Instruction Set Architecture Tradeoffs
– Complex Instruction Set Computer (CISC)
– Reduced Instruction Set Computer (RISC)
– Evolution of Instruction Set Architectures
Chapters 1, 2 (both editions)
ISA
CPU Design
Computing Element Choices
• General Purpose Processors (GPPs): Intended for general purpose computing (desktops, servers, clusters..)
• Application-Specific Processors (ASPs): Processors with ISAs and architectural features tailored towards specific application domains
– E.g. Digital Signal Processors (DSPs), Network Processors (NPs), Media Processors, Graphics Processing Units (GPUs), Vector Processors??? ...
• Co-Processors: A hardware (hardwired) implementation of specific algorithms with limited programming interface (augment GPPs or ASPs)
• Configurable Hardware:
– Field Programmable Gate Arrays (FPGAs)
– Configurable array of simple processing elements
• Application Specific Integrated Circuits (ASICs): A custom VLSI hardware solution for a specific computational task
• The choice of one or more depends on a number of factors including:
- Type and complexity of computational algorithm (general purpose vs. specialized)
- Desired level of flexibility/programmability
- Performance requirements
- Development cost/time
- System cost
- Power requirements
- Real-time constraints
The main goal of this course is the study of fundamental design techniques for General Purpose Processors
Computing Element Choices
(Figure: the computing element choices arranged along a software-to-hardware spectrum: General Purpose Processors (GPPs), Application-Specific Processors (ASPs), Configurable Hardware, Co-Processors, Application Specific Integrated Circuits (ASICs). Programmability/flexibility increases toward the software (GPP) end; specialization, development cost/time, and performance/chip area/watt (computational efficiency) increase toward the hardware (ASIC) end.)
Processor: Programmable computing element that runs programs written using a pre-defined set of instructions
ISA Requirements → Processor Design
Selection Factors:
- Type and complexity of computational algorithms (general purpose vs. specialized)
- Desired level of flexibility
- Performance
- Development cost
- System cost
- Power requirements
- Real-time constraints
The main goal of this course is the study of fundamental design techniques for General Purpose Processors
Computing Element Choices:
Computing Element Programmability
Fixed Function (Hardware):
• Computes one function (e.g. FP-multiply, divider, DCT)
• Function defined at fabrication time
• e.g. hardware (ASICs)
Programmable (Processor, Software):
• Computes “any” computable function (e.g. Processors)
• Function defined after fabrication
Parameterizable Hardware:
Performs limited “set” of functions, e.g. Co-Processors
Processor = Programmable computing element that runs programs written using pre-defined instructions
Computing Element Choices:
Spatial vs. Temporal Computing
Spatial (using hardware): Defined by fixed functionality and connectivity of hardware elements. (Figure: Hardware Block Diagram)
Temporal (using software/program running on a processor): (Figure: a Processor executing Instructions (Program) over Time)
Processor = Programmable computing element that runs programs written using a pre-defined set of instructions
Main Processor Types/Applications
• General Purpose Computing & General Purpose Processors (GPPs) (64 bit)
– High performance: In general, faster is always better.
– RISC or CISC: Intel P4, IBM Power4, SPARC, PowerPC, MIPS ...
– Used for general purpose software
– End-user programmable
– Real-time performance may not be fully predictable (due to dynamic arch. features)
– Heavy weight, multi-tasking OS - Windows, UNIX
– Normally, low cost and power not a requirement (changing)
– Servers, Workstations, Desktops (PC’s), Notebooks, Clusters …
• Embedded Processing: Embedded processors and processor cores (16-32 bit)
– Cost, power, code-size and real-time requirements and constraints
– Once real-time constraints are met, a faster processor may not be better
– e.g.: Intel XScale, ARM, 486SX, Hitachi SH7000, NEC V800...
– Often require Digital signal processing (DSP) support or other application-specific support (e.g. network, media processing)
– Single or few specialized programs – known at system design time
– Not end-user programmable
– Real-time performance must be fully predictable (avoid dynamic arch. features)
– Lightweight, often realtime OS or no OS
– Examples: Cellular phones, consumer electronics …
• Microcontrollers (8 bit)
– Extremely code size/cost/power sensitive
– Single program
– Small word size - 8 bit common
– Usually no OS
– Highest volume processors by far
– Examples: Control systems, Automobiles, industrial control, thermostats, ...
Examples of Application-Specific Processors (ASPs): embedded processors and microcontrollers.
(Figure annotations: volume increases from GPPs toward microcontrollers; cost/complexity increases from microcontrollers toward GPPs.)
The main goal of this course is the study of fundamental design techniques for General Purpose Processors
Processor = Programmable computing element that runs programs written using pre-defined instructions
The Processor Design Space
(Figure: performance plotted against processor cost, i.e. chip area, power, and complexity.)
– Microprocessors (GPPs): Performance is everything & software rules
– Embedded processors: Real-time constraints, specialized applications, low power/cost constraints
– Application specific architectures for performance
– Microcontrollers: Cost is everything
The main goal of this course is the study of fundamental design techniques for General Purpose Processors
Processor = Programmable computing element that runs programs written using a pre-defined set of instructions
General Purpose Processor/Computer System Generations
Classified according to implementation technology:
• The First Generation, 1946-59: Vacuum Tubes, Relays, Mercury Delay Lines:
– ENIAC (Electronic Numerical Integrator and Computer): First electronic computer, 18000 vacuum tubes, 1500 relays, 5000 additions/sec (1946).
– First stored program computer: EDSAC (Electronic Delay Storage Automatic Calculator), 1949.
• The Second Generation, 1959-64: Discrete Transistors.
– e.g. IBM mainframes
• The Third Generation, 1964-75: Small and Medium-Scale Integrated (MSI) Circuits.
– e.g. mainframes (IBM 360), minicomputers (DEC PDP-8, PDP-11).
• The Fourth Generation, 1975-Present: The Microcomputer. VLSI-based Microprocessors (single-chip processor)
– First microprocessor: Intel’s 4-bit 4004 (2300 transistors), 1971.
– Personal Computers (PCs), laptops, PDAs, servers, clusters …
– Reduced Instruction Set Computer (RISC) 1984
Common factor among all generations:
All target the Von Neumann Computer Model or paradigm
The Von Neumann Computer Model
• Partitioning of the programmable computing engine into components:
– Central Processing Unit (CPU): Control Unit (instruction decode, sequencing of operations), Datapath (registers, arithmetic and logic unit, connections, buses …).
– Memory: Instruction (program) and operand (data) storage.
– Input/Output (I/O) sub-system: I/O bus, interfaces, devices.
– The stored program concept: Instructions from an instruction set are fetched from a common memory and executed one at a time.
AKA Program Counter (PC) Based Architecture: The Program Counter (PC) points to next instruction to be processed.
(Figure: Computer System = CPU (Control; Datapath: registers, ALU, buses) + Memory (instructions, data) + I/O Devices (Input, Output).)
Major CPU Performance Limitation: The Von Neumann computing model implies sequential execution one instruction at a time
Another Performance Limitation: Separation of CPU and memory
(The Von Neumann memory bottleneck)
Generic CPU Machine Instruction Processing Steps
(Implied by The Von Neumann Computer Model)
Instruction Fetch: Obtain instruction from program storage (memory). The Program Counter (PC) points to next instruction to be processed.
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value or status
Result Store: Deposit results in storage for later use
Next Instruction: Determine successor or next instruction (i.e. update PC to fetch next instruction to be processed)
Major CPU Performance Limitation: The Von Neumann computing model
implies sequential execution one instruction at a time
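The processing steps above can be sketched as a small interpreter loop. This is only an illustration under stated assumptions: the toy instruction format (opcode, destination, two sources), the opcodes, and the dictionary used as memory are hypothetical and are not part of the lecture or of any real ISA.

```python
# Hypothetical toy machine: each instruction is (opcode, dest, src1, src2),
# and "memory" is a dict from names to values. Both are assumptions made
# only to illustrate the fetch/decode/execute sequence.
def run(program, memory):
    pc = 0                                   # Program Counter: index of next instruction
    while pc < len(program):
        instr = program[pc]                  # Instruction Fetch
        opcode, dest, src1, src2 = instr     # Instruction Decode
        if opcode == "halt":
            break
        a, b = memory[src1], memory[src2]    # Operand Fetch
        if opcode == "add":                  # Execute
            result = a + b
        elif opcode == "sub":
            result = a - b
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        memory[dest] = result                # Result Store
        pc += 1                              # Next Instruction (sequential successor)
    return memory


memory = {"x": 7, "y": 5, "sum": 0, "diff": 0}
program = [
    ("add", "sum", "x", "y"),
    ("sub", "diff", "x", "y"),
    ("halt", None, None, None),
]
final_memory = run(program, memory)
print(final_memory["sum"], final_memory["diff"])
```

The loop handles exactly one instruction per iteration, which is the sequential-execution limitation noted on the slide.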
Hardware Components of Computer Systems
Five classic components of all computers:
1. Control Unit; 2. Datapath; 3. Memory; 4. Input; 5. Output
Processor = Central Processing Unit (CPU): components 1 and 2 (Control Unit, Datapath)
I/O: components 4 and 5 (Input, Output)
(Figure: Computer = Processor (active; Control Unit and Datapath) + Memory (passive; where programs, data live when running) + Devices: Input (Keyboard, Mouse, etc.), Output (Display, Printer, etc.), and Disk (I/O).)
CPU Organization (Design)
• Datapath Design: Components & their connections needed by ISA instructions
– Components: Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., Registers, ALU, Shifters, Logic Units, ...).
– Connections: Ways in which these components are interconnected (buses connections, multiplexors, etc.).
– How information flows between components.
• Control Unit Design: Control/sequencing of operations of datapath components to realize ISA instructions
– Logic and means by which such information flow is controlled.
– Control and coordination of FUs operation to realize the targeted Instruction Set Architecture to be implemented (can either be implemented using a finite state machine or a microprogram).
• Hardware description with a suitable language, possibly using Register Transfer Notation (RTN).
CPU organization is AKA CPU Microarchitecture.
ISA = Instruction Set Architecture
The ISA forms an abstraction layer that sets the requirements for both compiler and CPU designers
A Typical Microprocessor Layout: The Intel Pentium Classic (1993 - 1997, 60 MHz - 233 MHz)
(Figure: die photo with labeled regions: Control Unit, Datapath, First Level of Memory (Cache).)
Computer System Components
CPU Core
Recently 1 or 2 or 4 processor cores per chip
1 GHz - 3.8 GHz
4-way Superscalar
All Non-blocking caches
RISC or RISC-core (x86):
L1 16-128K
1-2 way set associative (on chip), separate or unified
Deep Instruction Pipelines
L1 L2 256K- 2M 4-32 way set associative (on chip) unified
Dynamic scheduling
L3 2-16M
8-32 way set associative (off or on chip) unified
CPU
Multiple FP, integer FUs
Dynamic branch prediction
L2
Hardware speculation
Examples: Alpha, AMD K7: EV6, 200-400 MHz
Intel PII, PIII: GTL+ 133 MHz
L3
SDRAM
Caches
Intel P4
800 MHz
PC100/PC133
100-133MHZ
64-128 bits wide
2-way interleaved
~ 900 MBYTES/SEC (64bit)
Current Standard
Double Data
Rate (DDR) SDRAM
PC3200
200 MHZ DDR
64-128 bits wide
4-way interleaved
~3.2 GBYTES/SEC
(one 64bit channel)
~6.4 GBYTES/SEC
(two 64bit channels)
Front Side Bus (FSB)
Off or On-chip
adapters
Memory
Controller
Memory Bus
RAMbus DRAM (RDRAM)
400MHZ DDR
16 bits wide (32 banks)
~ 1.6 GBYTES/SEC
I/O Buses
NICs
Controllers
Example: PCI, 33-66MHz
32-64 bits wide
133-528 MBYTES/SEC
PCI-X 133MHz 64 bit
1024 MBYTES/SEC
Memory
Disks
Displays
Keyboards
Networks
I/O Devices:
North
Bridge
South
Bridge
Chipset
I/O Subsystem
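The peak transfer rates quoted for the memories and buses above follow from clock rate × transfers per clock × bus width. A minimal sketch of that arithmetic, assuming 1 MByte = 10^6 bytes and using a hypothetical helper name (`peak_bandwidth_mb_per_s` is not from the lecture):

```python
def peak_bandwidth_mb_per_s(clock_mhz, bus_width_bits, transfers_per_clock=1):
    """Peak transfer rate in MBytes/sec, taking 1 MByte = 10**6 bytes."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# DDR devices transfer on both clock edges, hence transfers_per_clock=2.
ddr_pc3200 = peak_bandwidth_mb_per_s(200, 64, transfers_per_clock=2)
rdram = peak_bandwidth_mb_per_s(400, 16, transfers_per_clock=2)
pci_fast = peak_bandwidth_mb_per_s(66, 64)

print(ddr_pc3200)  # one 64-bit PC3200 channel
print(rdram)       # 16-bit RDRAM channel
print(pci_fast)    # 66 MHz, 64-bit PCI
```

These values match the slide's figures of about 3.2 GBYTES/SEC for one 64-bit PC3200 channel, about 1.6 GBYTES/SEC for RDRAM, and 528 MBYTES/SEC for the fastest PCI configuration. Real sustained rates are lower than these peaks.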
Microprocessor Performance Increase 1984-2000
SPEC CPU2000 Performance
> 100x performance increase in one decade
Microprocessor Transistor Count Growth Rate
Currently ~ 3 Billion
Moore’s Law:
2X transistors/Chip
Every 1.5-2 years
(circa 1970)
Intel 4004
(2300 transistors)
~ 1,300,000x transistor density increase in the last 40 years
Still holds today
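As a rough check of the growth figures above, the sketch below compounds a doubling every 2 years over 40 years, starting from the 4004's 2300 transistors. The 2-year period and the 40-year span are assumptions chosen from the ranges on the slide, and the function name is hypothetical.

```python
def projected_transistors(start_count, years, doubling_period_years):
    """Transistor count after compounding one doubling per period."""
    return start_count * 2 ** (years / doubling_period_years)


growth_factor = 2 ** (40 / 2.0)                  # 20 doublings in 40 years
projected = projected_transistors(2300, 40, 2.0)
observed_ratio = 3e9 / 2300                      # ~3 billion today vs. the 4004

print(f"growth factor: {growth_factor:,.0f}x")
print(f"projected count: {projected:,.0f}")
print(f"observed ratio: {observed_ratio:,.0f}x")
```

Under these assumptions the projection lands at roughly 2.4 billion transistors, the same order of magnitude as the ~3 billion quoted, and the observed ratio is about 1.3 million.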
A current Multi-core Microprocessor Example
The increase in transistor chip density allows integrating more
than one processor core per chip
A benefit of Moore’s Law
AMD Barcelona (Opteron X4) : 4 processor cores on one chip
Increase of Capacity of VLSI Dynamic RAM
(DRAM) Memory Chips
1.55X/yr, or doubling every 1.6 years
(Also follows Moore’s Law)
~ 17,000x DRAM chip capacity increase in 20 years
Computer Technology Trends:
Evolutionary but Rapid Change
• Processor:
– 1.5-1.6X performance improvement every year; over 100X performance in last decade.
• Memory:
– DRAM capacity: > 2x every 1.5 years; 1000X size in last decade.
– Cost per bit: Improves about 25% or more per year.
– Only 15-25% performance improvement per year.
• Disk:
– Capacity: > 2X in size every 1.5 years.
– Cost per bit: Improves about 60% per year.
– 200X size in last decade.
– Only 10% performance improvement per year, due to mechanical limitations.
(Performance gap compared to CPU performance causes system performance bottlenecks.)
• Expected State-of-the-art PC Fourth Quarter 2011:
– Processor clock speed: ~ 3500 MegaHertz (3.5 Giga Hertz) with 2-8 processor cores on a single chip
– Memory capacity: ~ 8000 MegaByte (8 Giga Bytes)
– Disk capacity: ~ 3000 GigaBytes (3 Tera Bytes)
A Simplified View of The
Software/Hardware Hierarchical Layers
Hierarchy of Computer Architecture
(Figure: layered view, from software at the top to hardware at the bottom.)
Software:
– Application
– Operating System
– Compiler
– Program representations: High-Level Language Programs, Assembly Language Programs, Machine Language Program
– Firmware, e.g. BIOS (Basic Input/Output System)
Software/Hardware Boundary: Instruction Set Architecture (ISA). The ISA forms an abstraction layer that sets the requirements for both compiler and CPU designers.
Hardware:
– Instr. Set Proc., I/O system
– Datapath & Control: Microprogram, Register Transfer Notation (RTN)
– Digital Design: Logic Diagrams
– Circuit Design: Circuit Diagrams
– Layout: VLSI placement & routing
Levels of Program Representation
High Level Language Program:
temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;
(Compiler)
Assembly Language Program (MIPS Assembly Code):
lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)
(Assembler)
Machine Language Program:
0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111
(Machine Interpretation)
Control Signal Specification, written in Register Transfer Notation (RTN) or as a microprogram:
ALUOP[0:3] <= InstReg[9:11] & MASK
The high level, assembly, and machine language programs are software; the control signal specification is hardware; the ISA is the boundary between them.
ISA Requirements → Processor Design
ISA = Instruction Set Architecture. The ISA forms an abstraction layer that sets the requirements for both compiler and CPU designers
A Hierarchy of Computer Design
Level  Name                           Modules                   Primitives                          Descriptive Media
1      Electronics                    Gates, FF’s               Transistors, Resistors, etc.        Circuit Diagrams
2      Logic                          Registers, ALU’s ...      Gates, FF’s ….                      Logic Diagrams
3      Organization                   Processors, Memories      Registers, ALU’s …                  Register Transfer Notation (RTN)
4      Microprogramming               Assembly Language         Microinstructions                   Microprogram
5      Assembly language programming  OS Routines               Assembly language Instructions      Assembly Language Programs
6      Procedural Programming         Applications, Drivers ..  OS Routines, High-level Languages   High-level Language Programs
7      Application                    Systems                   Procedural Constructs               Problem-Oriented Programs
The lowest levels are Low Level - Hardware, level 4 (Microprogramming) is Firmware, and the highest levels are High Level - Software.
Hardware Description
• Hardware visualization:
– Block diagrams (spatial visualization):
Two-dimensional representations of functional units and their
interconnections.
– Timing charts (temporal visualization):
Waveforms where events are displayed vs. time.
• Register Transfer Notation (RTN):
AKA Register Transfer Language (RTL)
– A way to describe microoperations capable of being performed
by the data flow (data registers, data buses, functional units) at
the register transfer level of design (RT).
– Also describes conditional information in the system which
cause operations to come about.
– A “shorthand” notation for microoperations.
• Hardware Description Languages:
– Examples: VHDL: VHSIC (Very High Speed Integrated
Circuits) Hardware Description Language, Verilog.
Register Transfer Notation (RTN)
• Independent RTN:
– No predefined data flow is assumed (i.e No datapath design yet)
– Describe actions on registers and memory locations without
regard to nonexistence of direct paths or intermediate registers.
– Useful to describe functionality of instructions of a given ISA.
• Dependent RTN:
– When RTN is used after the data flow (datapath design) is
assumed to be frozen.
– No data transfer can take place over a path that does not exist.
– No RTN statement implies a function the data flow hardware is
incapable of performing.
• The general format of an RTN statement:
Conditional information : Action1; Action2; …
• The conditional statement is often an AND of literals (status and
control signals) in the system (a p-term). The p-term is said to imply
the action.
• Possible actions include transfer of data to/from registers/memory
data shifting, functional unit operations etc.
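The general statement format above (a p-term that implies one or more actions) can be modeled in a few lines. This is a sketch under assumptions: actions are restricted to simple register-to-register transfers, all actions of one statement are assumed to read the register values from before the statement fires, and the signal names CTL and T0 are used only as illustrative literals.

```python
def apply_rtn(signals, p_term, actions, registers):
    """Apply 'p_term : action1; action2; ...' to a dict of register values.

    signals   -- dict of signal name -> bool (status and control signals)
    p_term    -- list of signal names that must all be active (AND of literals)
    actions   -- list of (destination, source) pairs, meaning dest <- source
    registers -- dict of register name -> value, updated in place
    """
    if all(signals[name] for name in p_term):
        before = dict(registers)   # assumption: every action reads the old values
        for dest, src in actions:
            registers[dest] = before[src]
    return registers


regs = {"A": 1, "B": 2}
apply_rtn({"CTL": True, "T0": False}, ["CTL", "T0"], [("A", "B")], regs)
idle = dict(regs)    # p-term not satisfied, so nothing changes
apply_rtn({"CTL": True, "T0": True}, ["CTL", "T0"], [("A", "B")], regs)
fired = dict(regs)   # p-term satisfied, so A receives a copy of B
print(idle, fired)
```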
RTN Statement Examples
A ← B
or
R[A] ← R[B]
where R[X] means the content of register X
– A copy of the data in entity B (typically a register) is
placed in Register A
– If the destination register has fewer bits than the source,
the destination accepts only the lowest-order bits.
– If the destination has more bits than the source, the value
of the source is sign extended to the left.
CTL · T0: A = B
– The contents of B are presented to the input of
combinational circuit A
– This action to the right of “:” takes place when control
signal CTL is active and signal T0 is active.
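The two width rules stated above for a register transfer (truncate to the lowest-order bits, or sign extend to the left) can be checked with a short sketch. Register contents are modeled as unsigned Python integers holding two's complement bit patterns; the function name and the explicit width parameters are assumptions for illustration.

```python
def transfer(value, src_bits, dst_bits):
    """Model the transfer dest <- source between registers of different widths."""
    value &= (1 << src_bits) - 1                  # keep only the source register's bits
    if dst_bits <= src_bits:
        return value & ((1 << dst_bits) - 1)      # destination accepts lowest-order bits
    if value >> (src_bits - 1):                   # source sign bit is 1
        upper_ones = ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
        value |= upper_ones                       # sign extend to the left
    return value


print(hex(transfer(0xABCD, 16, 8)))   # narrower destination
print(hex(transfer(0x80, 8, 16)))     # wider destination, negative source
print(hex(transfer(0x7F, 8, 16)))     # wider destination, positive source
```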
RTN Statement Examples
MD ← M[MA]
or MD ← Mem[MA]
– Means the memory data (MD) register receives the contents of the main memory (M or Mem) as addressed from the Memory Address (MA) register.
AC(0), AC(1), AC(2), AC(3)
– Register fields are indicated by parentheses.
– The concatenation operation is indicated by a comma.
– Bit AC(0) is bit 0 of the accumulator AC
– The above expression means AC bits 0, 1, 2, 3
– More commonly represented by AC(0-3)
E · T3: CLRWRITE
– The control signal CLRWRITE is activated when the condition E · T3 is active.
Computer Architecture Vs. Computer Organization
• The term Computer architecture is sometimes erroneously restricted to computer instruction set design, with other aspects of computer design called implementation.
• More accurate definitions:
– Instruction Set Architecture (ISA): The actual programmer-visible instruction set, which serves as the boundary or interface between the software and hardware. The ISA forms an abstraction layer that sets the requirements for both compiler and CPU designers.
– Implementation of a machine has two components:
• Organization (CPU microarchitecture, CPU design): includes the high-level aspects of a computer’s design such as: The memory system, the bus structure, the internal CPU unit which includes implementations of arithmetic, logic, branching, and data transfer operations.
• Hardware (hardware design and implementation): Refers to the specifics of the machine such as detailed logic design and packaging technology.
• In general, Computer Architecture refers to the above three aspects:
1- Instruction set architecture 2- Organization. 3- Hardware.
Instruction Set Architecture (ISA)
“... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation.”
– Amdahl, Blaauw, and Brooks, 1964.
(The programmer here is the assembly programmer or the compiler.)
The ISA forms an abstraction layer that sets the requirements for both compiler and CPU designers
The instruction set architecture is concerned with:
• Organization of programmable storage (memory & registers):
Includes the amount of addressable memory and number of
available registers.
• Data Types & Data Structures: Encodings & representations.
• Instruction Set: What operations are specified.
• Instruction formats and encoding.
• Modes of addressing and accessing data items and instructions
• Exceptional conditions.
Evolution of Instruction Set Architectures
(Figure: evolution tree, from earliest to most recent.)
Single Accumulator (EDSAC 1949) [No ISA]
→ Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
→ Separation of Programming Model from Implementation (i.e. CPU Design):
– High-level Language Based (B5000 1963)
– Concept of an ISA Family (IBM 360 1964)
→ General Purpose Register (GPR) Machines:
– Complex Instruction Sets (CISC) (Vax, Motorola 68000, Intel x86 1977-80)
– Load/Store Architecture (CDC 6600, Cray 1 1963-76)
→ Reduced Instruction Set Computer (RISC) (MIPS, SPARC, HP-PA, PowerPC, . . . 1984..)
Computer Instruction Sets
• Regardless of computer type, CPU structure, or
hardware organization, every machine instruction must
specify the following:
– Opcode (Operation Code): Which operation to perform. Example: add, load, and branch.
– Where to find the operand or operands, if any: Operands may be contained in CPU registers, main memory, or I/O ports. Operand locations can be explicitly specified in the instruction or implied.
– Where to put the result (destination), if there is a result: May be explicitly mentioned or implicit in the opcode.
– Where to find the next instruction: Without any explicit branches, the instruction to execute is the next instruction in the sequence or a specified address in case of jump or branch instructions.
Instruction Set Architecture (ISA) Specification Requirements
(Figure: Instruction Fetch → Instruction Decode → Operand Fetch → Execute → Result Store → Next Instruction)
• Instruction Format or Encoding:
– How is it decoded?
• Location of operands and result (addressing
modes):
– Where other than memory?
– How many explicit operands?
– How are memory operands located?
– Which can or cannot be in memory?
• Data type and Size.
• Operations:
– Which operations are supported?
• Successor instruction:
– Jumps, conditions, branches.
• Fetch-decode-execute is implicit.
Main General Types of Instructions
1. Data Movement Instructions, possible variations:
– Memory-to-memory.
– Memory-to-CPU register.
– CPU-to-memory.
– Constant-to-CPU register.
– CPU-to-output.
– etc.
2. Arithmetic Logic Unit (ALU) Instructions:
– Logic instructions
– Integer Arithmetic Instructions
– Floating Point Arithmetic Instructions
3. Branch (Control) Instructions:
– Unconditional jumps.
– Conditional branches.
Examples of Data Movement Instructions
Instruction      Meaning                                              Machine
MOV A,B          Move 16-bit data from memory loc. A to loc. B        VAX11
lwz R3,A         Move 32-bit data from memory loc. A to register R3   PPC601
li $3,455        Load the 32-bit integer 455 into register $3         MIPS R3000
MOV AX,BX        Move 16-bit data from register BX into register AX   Intel X86
LEA.L (A0),A2    Load the address pointed to by A0 into A2            MC68000
Examples of ALU Instructions
Instruction      Meaning                                                                                       Machine
MULF A,B,C       Multiply the 32-bit floating point values at mem. locations A and B, and store result in loc. C   VAX11
nabs r3,r1       Store the negative absolute value of register r1 in r3                                        PPC601
ori $2,$1,255    Store the logical OR of register $1 with 255 into $2                                          MIPS R3000
SHL AX,4         Shift the 16-bit value in register AX left by 4 bits                                          Intel X86
ADD.L D0,D1      Add the 32-bit values in registers D0, D1 and store the result in register D1                 MC68000
Examples of Branch Instructions
Instruction      Meaning                                                                                                  Machine
BLBS A, Tgt      Branch to address Tgt if the least significant bit at location A is set.                                 VAX11
bun r2           Branch to location in r2 if the previous comparison signaled that one or more values was not a number.   PPC601
beq $2,$1,32     Branch to location PC+4+32 if contents of $1 and $2 are equal.                                           MIPS R3000
JCXZ Addr        Jump to Addr if contents of register CX = 0.                                                             Intel X86
BVS next         Branch to next if overflow flag in CC is set.                                                            MC68000
Operation Types in The Instruction Set
Operator Type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, or
Data transfer            Loads-stores (move on machines with memory addressing)
Control                  Branch, jump, procedure call, and return, traps.
System                   Operating system call/return, virtual memory management instructions ...
Floating point           Floating point operations: add, multiply ....
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Media                    The same operation performed on multiple data (e.g. Intel MMX, SSE)
Instruction Usage Example:
Top 10 Intel X86 Instructions
Rank   Instruction              Integer Average Percent total executed
1      load                     22%
2      conditional branch       20%
3      compare                  16%
4      store                    12%
5      add                      8%
6      and                      6%
7      sub                      5%
8      move register-register   4%
9      call                     1%
10     return                   1%
       Total                    96%
Observation: Simple instructions dominate instruction usage frequency.
CISC to RISC observation
Types of Instruction Set Architectures
According To Operand Memory Addressing Fields
Memory-To-Memory Machines:
– Operands obtained from memory and results stored back in memory by any
instruction that requires operands.
– No local CPU registers are used in the CPU datapath.
– Include:
• The 4-address Machine.
• The 3-address Machine.
• The 2-address Machine.
(Machine = ISA or CPU targeting a specific ISA type)
The 1-address (Accumulator) Machine:
– A single local CPU special-purpose register (accumulator) is used as the source of
one operand and as the result destination.
The 0-address or Stack Machine:
– A push-down stack is used in the CPU.
General Purpose Register (GPR) Machines:
– The CPU datapath contains several local general-purpose registers which can
be used as operand sources and as result destinations.
– A large number of possible addressing modes.
– Load-Store or Register-To-Register Machines: GPR machines where only
data movement instructions (loads, stores) can obtain operands from memory
and store results to memory.
CISC to RISC observation (load-store simplifies CPU design)
Types of Instruction Set Architectures
Memory-To-Memory Machines:
The 4-Address Machine/ISA
• No program counter (PC) or other CPU registers are used.
• Instruction encoding has four address fields to specify:
– Location of first operand.
– Location of second operand.
– Place to store the result.
– Location of next instruction.
Instruction: add Res, Op1, Op2, Nexti
Meaning: Res ← Op1 + Op2
or more precise RTN:
M[ResAddr] ← M[Op1Addr] + M[Op2Addr]
(Figure: Memory holds Op1 at Op1Addr, Op2 at Op2Addr, Res at ResAddr, and the next instruction Nexti at NextiAddr. The CPU holds only the adder (+).)
Instruction Format (encoding):
Field       Bits   Purpose
add         8      Opcode: which operation
ResAddr     24     Where to put result
Op1Addr     24     Where to find operands
Op2Addr     24     Where to find operands
NextiAddr   24     Where to find next instruction
Instruction Size: 13 bytes
Can address 2^24 bytes = 16 MBytes
Types of Instruction Set Architectures
Memory-To-Memory Machines:
The 3-Address Machine/ISA
• A program counter (PC) is included within the CPU which points to the next instruction.
• No CPU storage (general-purpose registers).
Instruction: add Res, Op1, Op2
Meaning: Res ← Op1 + Op2
or more precise RTN:
M[ResAddr] ← M[Op1Addr] + M[Op2Addr]
PC ← PC + 10 (Increment PC)
(Figure: Memory holds Op1 at Op1Addr, Op2 at Op2Addr, Res at ResAddr, and the next instruction Nexti at NextiAddr. The CPU holds the adder (+) and a 24-bit Program Counter (PC), which says where to find the next instruction.)
Instruction Format (encoding):
Field     Bits   Purpose
add       8      Opcode: which operation
ResAddr   24     Where to put result
Op1Addr   24     Where to find operands
Op2Addr   24     Where to find operands
Instruction Size: 10 bytes
Can address 2^24 bytes = 16 MBytes
Types of Instruction Set Architectures
Memory-To-Memory Machines:
The 2-Address Machine/ISA
• The 2-address Machine: Result is stored in the memory address of one of the operands.
Instruction: add Op2, Op1
Meaning: Op2 ← Op1 + Op2
or more precise RTN:
M[Op2Addr] ← M[Op1Addr] + M[Op2Addr]
PC ← PC + 7 (Increment PC)
(Figure: Memory holds Op1 at Op1Addr, Op2 and later Res at Op2Addr, and the next instruction Nexti at NextiAddr. The CPU holds the adder (+) and a 24-bit Program Counter (PC), which says where to find the next instruction.)
Instruction Format (encoding):
Field     Bits   Purpose
add       8      Opcode: which operation
Op2Addr   24     Where to find operands; also where to put result
Op1Addr   24     Where to find operands
Instruction Size: 7 bytes
Can address 2^24 bytes = 16 MBytes
Types of Instruction Set Architectures
The 1-address (Accumulator) Machine/ISA
• A single register (accumulator) in the CPU is used as the source of one
operand and result destination.
Instruction: add Op1
Meaning: Acc ← Acc + Op1
or more precise RTN:
Acc ← Acc + M[Op1Addr]
PC ← PC + 4 (Increment PC)
Memory: Op1Addr: Op1; NextiAddr: Nexti (where to find next instruction)
CPU: adder (+), Accumulator (where to find operand2, and where to put result), and a 24-bit Program Counter (PC); can address 2^24 bytes = 16 MBytes
Instruction Format (encoding), instruction size: 4 bytes
Field:  Opcode (add)     Op1Addr
Bits:   8                24
Use:    Which operation  Where to find operand1
Types of Instruction Set Architectures
The 0-address (Stack) Machine/ISA
• A push-down stack is used in the CPU.
TOS = Top Entry in Stack
SOS = Second Entry in Stack
Memory: Op1Addr: Op1; Op2Addr: Op2; ResAddr: Res; NextiAddr: Nexti
CPU: stack (TOS: Op2, Res; SOS: Op1; etc.), adder (+), and a 24-bit Program Counter (PC); can address 2^24 bytes = 16 MBytes
Instruction: push Op1
Meaning: TOS ← M[Op1Addr]
Instruction Format (4 bytes):
Field:  Opcode (push)  Op1Addr
Bits:   8              24
Use:    Opcode         Where to find operand
Instruction: add
Meaning: TOS ← TOS + SOS
Instruction Format (1 byte):
Field:  Opcode (add)
Bits:   8
Instruction: pop Res
Meaning: M[ResAddr] ← TOS
Instruction Format (4 bytes):
Field:  Opcode (pop)  ResAddr
Bits:   8             24
Use:    Opcode        Memory destination
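A minimal sketch of the push/add/pop behaviour, assuming that add removes both TOS and SOS and pushes their sum (the usual stack-machine convention; the slide only gives TOS ← TOS + SOS). The function name run_stack_program, the tuple encoding of instructions, and the symbolic addresses are hypothetical.

```python
# Toy 0-address (stack) machine: push and pop take a memory address, add takes none.
SIZE_BYTES = {"push": 4, "pop": 4, "add": 1}  # sizes from the instruction formats


def run_stack_program(program, memory):
    """Execute (opcode, [address]) tuples; return the total code size in bytes."""
    stack = []
    code_size = 0
    for opcode, *operand in program:
        if opcode == "push":        # TOS <- M[addr]
            stack.append(memory[operand[0]])
        elif opcode == "add":       # assumed: pop TOS and SOS, push their sum
            tos = stack.pop()
            sos = stack.pop()
            stack.append(tos + sos)
        elif opcode == "pop":       # M[addr] <- TOS
            memory[operand[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        code_size += SIZE_BYTES[opcode]
    return code_size


memory = {"Op1Addr": 3, "Op2Addr": 4, "ResAddr": 0}
program = [("push", "Op1Addr"), ("push", "Op2Addr"), ("add",), ("pop", "ResAddr")]
code_size = run_stack_program(program, memory)
print(memory["ResAddr"], code_size)
```

Computing Res = Op1 + Op2 takes four instructions here, but only the push and pop instructions carry a 24-bit address.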
Types of Instruction Set Architectures
General Purpose Register (GPR) Machines
• CPU contains several general-purpose registers which can
be used as operand sources and result destination.
Eight general purpose Registers (GPRs) assumed here: R1-R8
Memory: Op1Addr: Op1; NextiAddr: Nexti
CPU: registers R1-R8, adder (+), and a 24-bit Program Counter (PC); load moves data from memory to a register, store moves data from a register to memory
Instruction: load R8, Op1
Meaning: R8 ← M[Op1Addr]
PC ← PC + 5
Instruction Format:
Field:  Opcode (load)  Register (R8)  Op1Addr
Bits:   8              3              24
Use:    Opcode         Destination    Where to find operand1
Size = 4.375 bytes rounded up to 5 bytes
Instruction: add R2, R4, R6
Meaning: R2 ← R4 + R6
PC ← PC + 3
Instruction Format:
Field:  Opcode (add)  Des (R2)  Operands (R4, R6)
Bits:   8             3         3 + 3
Size = 2.125 bytes rounded up to 3 bytes
Instruction: store R2, Op2
Meaning: M[Op2Addr] ← R2
PC ← PC + 5
Instruction Format:
Field:  Opcode (store)  Register (R2)  Destination address
Bits:   8               3              24
Size = 4.375 bytes rounded up to 5 bytes
Here the add instruction has three register specifier fields,
while load and store instructions have one register specifier field
and one memory address specifier field.
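The instruction sizes quoted so far follow from adding the field widths and rounding up to a whole number of bytes. A short sketch of that arithmetic, with a hypothetical helper name (instruction_size) and the field widths taken from the formats above:

```python
def instruction_size(field_bits):
    """Return (total bits, size in whole bytes) for a list of field widths."""
    total_bits = sum(field_bits)
    whole_bytes = -(-total_bits // 8)  # ceiling division: round up to a byte boundary
    return total_bits, whole_bytes


formats = {
    "3-address add": [8, 24, 24, 24],   # opcode + three memory addresses
    "GPR load/store": [8, 3, 24],       # opcode + register + memory address
    "GPR add": [8, 3, 3, 3],            # opcode + three register specifiers
}
for name, fields in formats.items():
    bits, size = instruction_size(fields)
    print(f"{name}: {bits} bits = {bits / 8} bytes, stored as {size} bytes")
```

A 3-bit register specifier is enough for the eight registers assumed here, which is why the register add fits in 3 bytes while the memory-to-memory add needs 10.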
Expression Evaluation Example with 3-, 2-, 1-, 0-Address, And GPR Machines
For the expression A = (B + C) * D - E, where A-E are in memory

3-Address:
  add A, B, C
  mul A, A, D
  sub A, A, E
  3 instructions; code size: 30 bytes; 9 memory accesses for data

2-Address:
  load A, B
  add A, C
  mul A, D
  sub A, E
  4 instructions; code size: 28 bytes; 11 memory accesses for data

1-Address (Accumulator):
  load B
  add C
  mul D
  sub E
  store A
  5 instructions; code size: 20 bytes; 5 memory accesses for data

0-Address (Stack):
  push B
  push C
  add
  push D
  mul
  push E
  sub
  pop A
  8 instructions; code size: 23 bytes; 5 memory accesses for data

GPR, Register-Memory:
  load R1, B
  add R1, C
  mul R1, D
  sub R1, E
  store A, R1
  5 instructions; code size: 25 bytes; 5 memory accesses for data

GPR, Load-Store:
  load R1, B
  load R2, C
  add R3, R1, R2
  load R1, D
  mul R3, R3, R1
  load R1, E
  sub R3, R3, R1
  store A, R3
  8 instructions; code size: 34 bytes; 5 memory accesses for data
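The totals on this slide can be re-derived from the per-instruction sizes given on the earlier slides. The sketch below is an illustration only: each instruction is written as a hypothetical (size in bytes, data memory accesses) pair. Two assumptions are mine rather than the slides': a register-memory ALU instruction is taken to be 5 bytes (opcode + register + 24-bit address), and the 2-address load A, B is counted as two data accesses (read B, write A). Both are consistent with the totals shown.

```python
def tally(instructions):
    """Return (instruction count, code size in bytes, data memory accesses)."""
    count = len(instructions)
    code_size = sum(size for size, _ in instructions)
    accesses = sum(acc for _, acc in instructions)
    return count, code_size, accesses


# Each entry is (size_bytes, data_accesses), in program order.
machines = {
    "3-address": [(10, 3)] * 3,                       # two reads + one write each
    "2-address": [(7, 2)] + [(7, 3)] * 3,             # load A,B then three ALU ops
    "accumulator": [(4, 1)] * 5,
    "stack": [(4, 1), (4, 1), (1, 0), (4, 1), (1, 0), (4, 1), (1, 0), (4, 1)],
    "GPR register-memory": [(5, 1)] * 5,
    "GPR load-store": [(5, 1), (5, 1), (3, 0), (5, 1), (3, 0), (5, 1), (3, 0), (5, 1)],
}
for name, program in machines.items():
    count, code_size, accesses = tally(program)
    print(f"{name}: {count} instructions, {code_size} bytes, {accesses} data accesses")
```

The load-store version has the largest code size but keeps all intermediate results in registers, so its data traffic is limited to the four loads and one store.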
Instruction Set Architecture Tradeoffs
(Machine = CPU or ISA)
• 3-address machine: shortest code sequence; a large number of bits
per instruction; large number of memory accesses.
• 0-address (stack) machine: longest code sequence; shortest
individual instructions; more complex to program.
• General purpose register machine (GPR):
– Operands are addressed by selecting among a small set of
registers using a short register address (all new ISAs since
1975).
– Advantages of GPR (why GPR?):
1 • Low number of memory accesses. Faster, since register access
is currently still much faster than memory access.
2 • Registers are easier for compilers to use.
3 • Shorter, simpler instructions.
• Load-Store Machines (AKA register-register GPR): GPR machines where memory addresses
are only included in data movement instructions (loads/stores)
between memory and registers (all new ISAs designed after 1980).
CISC to RISC observation (load-store simplifies CPU design)
Typical GPR ISA Memory Addressing Modes
Addressing Mode    Sample Instruction       Meaning
Register           Add R4, R3               R4 ← R4 + R3
Immediate          Add R4, #3               R4 ← R4 + 3
Displacement       Add R4, 10 (R1)          R4 ← R4 + Mem[10 + R1]
Indirect           Add R4, (R1)             R4 ← R4 + Mem[R1]
Indexed            Add R3, (R1 + R2)        R3 ← R3 + Mem[R1 + R2]
Absolute           Add R1, (1001)           R1 ← R1 + Mem[1001]
Memory indirect    Add R1, @ (R3)           R1 ← R1 + Mem[Mem[R3]]
Autoincrement      Add R1, (R2) +           R1 ← R1 + Mem[R2]; R2 ← R2 + d
Autodecrement      Add R1, - (R2)           R2 ← R2 - d; R1 ← R1 + Mem[R2]
Scaled             Add R1, 100 (R2) [R3]    R1 ← R1 + Mem[100 + R2 + R3*d]
CISC to RISC observation
(fewer addressing modes simplify CPU design)
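The memory addressing modes in the table differ only in how the operand address is formed. A hedged sketch of that computation follows; the function name effective_address, its parameters, and the register and memory contents are hypothetical, and d is assumed to be the operand size in bytes (4 here), which the slide does not define.

```python
def effective_address(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the memory address an operand is read from for a given mode."""
    if mode == "displacement":      # Mem[const + R[base]]
        return const + regs[base]
    if mode == "indirect":          # Mem[R[base]]
        return regs[base]
    if mode == "indexed":           # Mem[R[base] + R[index]]
        return regs[base] + regs[index]
    if mode == "absolute":          # Mem[const]
        return const
    if mode == "memory_indirect":   # Mem[Mem[R[base]]]
        return mem[regs[base]]
    if mode == "scaled":            # Mem[const + R[base] + R[index] * d]
        return const + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


# Hypothetical register and memory contents for the example.
regs = {"R1": 1000, "R2": 8, "R3": 2000}
mem = {2000: 3000}
print(effective_address("displacement", regs, mem, base="R1", const=10))
print(effective_address("scaled", regs, mem, base="R2", index="R3", const=100))
```

Register and immediate modes are left out because they do not form a memory address, and the autoincrement/autodecrement modes are omitted because they also update the base register.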
Addressing Modes Usage Example
For 3 programs running on VAX, ignoring direct register mode:
Displacement:                  42% avg, 32% to 55%
Immediate:                     33% avg, 17% to 43%
Register deferred (indirect):  13% avg, 3% to 24%
Scaled:                        7% avg, 0% to 16%
Memory indirect:               3% avg, 1% to 6%
Misc:                          2% avg, 0% to 3%
75% displacement & immediate
88% displacement, immediate & register indirect.
Observation: In addition to register direct, the displacement,
immediate and register indirect addressing modes are important.
CISC to RISC observation
(fewer addressing modes simplify CPU design)
Displacement Address Size Example
Avg. of 5 SPECint92 programs vs. avg. of 5 SPECfp92 programs
[Figure: Int. Avg. and FP Avg. series; x-axis: Displacement Address Bits Needed (0 to 15); y-axis: 0% to 30%]
For displacement addressing mode:
1% of addresses > 16 bits
12 - 16 bits of displacement needed
CISC to RISC observation
Instruction Set Encoding
Considerations affecting instruction set encoding:
– The number of registers and addressing modes
supported by ISA.
– The impact of the size of the register and addressing
mode fields on the average instruction size and on the
average program size.
– To encode instructions into lengths that will be easy to
handle in the implementation. At a minimum, instruction lengths
should be a multiple of bytes.
• Instruction Encoding Classification:
1. Fixed length encoding: Faster and easiest to implement in
hardware. e.g. Simplifies design of pipelined CPUs
2. Variable length encoding: Produces smaller instructions.
3. Hybrid encoding.
CISC to RISC observation
to reduce code size
Three Examples of Instruction Set Encoding
Variable Length Encoding: VAX (1-53 bytes)
  Operations & no. of operands | Address specifier 1 | Address field 1 | ... | Address specifier n | Address field n
Fixed Length Encoding: MIPS, PowerPC, SPARC (all instructions are 4 bytes each)
  Operation | Address field 1 | Address field 2 | Address field 3
Hybrid Encoding: IBM 360/370, Intel 80x86
  Operation | Address specifier | Address field
  Operation | Address specifier 1 | Address specifier 2 | Address field
  Operation | Address specifier | Address field 1 | Address field 2
ISA Examples
Machine           Number of GPRs    Architecture                      Year
EDSAC             1                 accumulator                       1949
IBM 701           1                 accumulator                       1953
CDC 6600          8                 load-store                        1963
IBM 360           16                register-memory                   1964
DEC PDP-8         1                 accumulator                       1965
DEC PDP-11        8                 register-memory                   1970
Intel 8008        1                 accumulator                       1972
Motorola 6800     1                 accumulator                       1974
DEC VAX           16                register-memory, memory-memory    1977
Intel 8086        1                 extended accumulator              1978
Motorola 68000    16                register-memory                   1980
Intel 80386       8                 register-memory                   1985
MIPS              32                load-store                        1985
HP PA-RISC        32                load-store                        1986
SPARC             32                load-store                        1987
PowerPC           32                load-store                        1992
DEC Alpha         32                load-store                        1992
HP/Intel IA-64    128               load-store                        2001
AMD64 (EMT64)     16                register-memory                   2003
Examples of GPR Machines
For Arithmetic/Logic (ALU) Instructions
Max. number of      Max. number of operands    Examples (ISAs)
memory addresses    allowed + destination
0                   3                          SPARC, MIPS, PowerPC, ALPHA
1                   2                          Intel 80386, Motorola 68000
2 or 3              2 or 3                     VAX
Complex Instruction Set Computer (CISC) ISAs
• Emphasizes doing more with each instruction:
– Thus fewer instructions per program (more compact code).
• Why? Motivated by the high cost of memory and hard disk
capacity when original CISC architectures were proposed:
– When M6800 was introduced: 16K RAM = $500, 40M hard disk = $55,000
– When MC68000 was introduced (circa 1980): 64K RAM = $200, 10M HD = $5,000
• Original CISC architectures evolved with faster, more
complex CPU designs, but backward instruction set
compatibility had to be maintained (e.g. X86).
• Wide variety of addressing modes:
• 14 in MC68000, 25 in MC68020
• A number of instruction modes for the location and number of
operands:
• The VAX has 0- through 3-address instructions.
• Variable-length instruction encoding (to reduce code size).
Example CISC ISAs
Motorola 680X0
GPR ISA (Register-Memory)
18 addressing modes:
• Data register direct.
• Address register direct.
• Immediate.
• Absolute short.
• Absolute long.
• Address register indirect.
• Address register indirect with postincrement.
• Address register indirect with predecrement.
• Address register indirect with displacement.
• Address register indirect with index (8-bit).
• Address register indirect with index (base).
• Memory indirect postindexed.
• Memory indirect preindexed.
• Program counter indirect with index (8-bit).
• Program counter indirect with index (base).
• Program counter indirect with displacement.
• Program counter memory indirect postindexed.
• Program counter memory indirect preindexed.
Operand size:
• Range from 1 to 32 bits, 1, 2, 4, 8, 10, or 16 bytes.
Instruction Encoding:
• Instructions are stored in 16-bit words.
• The smallest instruction is 2 bytes (one word).
• The longest instruction is 5 words (10 bytes) in length.
Example CISC ISA:
Intel 80386 (X86 or IA-32)
GPR ISA (Register-Memory)
12 addressing modes:
• Register.
• Immediate.
• Direct.
• Base.
• Base + Displacement.
• Index + Displacement.
• Scaled Index + Displacement.
• Based Index.
• Based Scaled Index.
• Based Index + Displacement.
• Based Scaled Index + Displacement.
• Relative.
Operand sizes:
• Can be 8, 16, 32, 48, 64, or 80 bits long.
• Also supports string operations.
Instruction Encoding:
• The smallest instruction is one byte.
• The longest instruction is 12 bytes long.
• The first bytes generally contain the opcode, mode specifiers, and register fields.
• The remaining bytes are for address displacement and immediate data.
Reduced Instruction Set Computer (RISC) ISAs (~1984)
• Focuses on reducing the number and complexity of instructions of the
ISA.
RISC: Simplify ISA → Simplify CPU Design → Better CPU Performance
– Motivated by simplifying the ISA and its requirements to (RISC goals):
• Reduce CPU design complexity.
• Improve CPU performance.
– CPU Performance Goal: Reduced number of cycles needed per
instruction. At least one instruction completed per clock cycle.
• Simplified addressing modes supported.
– Usually limited to immediate, register indirect, register displacement,
indexed.
• Load-Store GPR: Only load and store instructions access memory.
– (Thus more instructions are usually executed than CISC)
• Fixed-length instruction encoding.
– (Designed with CPU instruction pipelining in mind).
• Support of delayed branches.
• Examples: MIPS, HP PA-RISC, SPARC, Alpha, POWER, PowerPC.
Example RISC ISA:
PowerPC
Load-Store GPR
8 addressing modes:
• Register direct.
• Immediate.
• Register indirect.
• Register indirect with immediate index (loads and stores).
• Register indirect with register index (loads and stores).
• Absolute (jumps).
• Link register indirect (calls).
• Count register indirect (branches).
Operand sizes:
• Four operand sizes: 1, 2, 4 or 8 bytes.
Instruction Encoding:
• Instruction set has 15 different formats with many minor variations.
• All are 32 bits (4 bytes) in length.
Example RISC ISA:
HP Precision Architecture (HP PA-RISC)
Load-Store GPR
7 addressing modes:
• Register
• Immediate
• Base with displacement
• Base with scaled index and displacement
• Predecrement
• Postincrement
• PC-relative
Operand sizes:
• Five operand sizes ranging in powers of two from 1 to 16 bytes.
Instruction Encoding:
• Instruction set has 12 different formats.
• All are 32 bits (4 bytes) in length.
Example RISC ISA:
SPARC
Load-Store GPR
5 addressing modes:
• Register indirect with immediate displacement.
• Register indirect indexed by another register.
• Register direct.
• Immediate.
• PC relative.
Operand sizes:
• Four operand sizes: 1, 2, 4 or 8 bytes.
Instruction Encoding:
• Instruction set has 3 basic instruction formats with 3 minor variations.
• All are 32 bits (4 bytes) in length.
Example RISC ISA:
DEC Alpha AXP
Load-Store GPR
4 addressing modes:
• Register direct.
• Immediate.
• Register indirect with displacement.
• PC-relative.
Operand sizes:
• Four operand sizes: 1, 2, 4 or 8 bytes.
Instruction Encoding:
• Instruction set has 7 different formats.
• All are 32 bits (4 bytes) in length.
RISC ISA Example:
MIPS R3000 (32-bit), AKA MIPS-I
Load-Store GPR
Instruction Categories:
• Load/Store.
• Computational.
• Jump and Branch.
• Floating Point (using coprocessor).
• Memory Management.
• Special.
5 Addressing Modes:
• Register direct (arithmetic).
• Immediate (arithmetic).
• Base register + immediate offset (loads and stores).
• PC relative (branches).
• Pseudodirect (jumps).
Registers: R0 - R31, PC, HI, LO
Operand Sizes:
Memory accesses in any multiple between 1 and 4 bytes.
Instruction Encoding: 3 Instruction Formats, all 32 bits (4 bytes) wide.
R:  OP | rs | rt | rd | sa | funct
I:  OP | rs | rt | immediate
J:  OP | jump target
MIPS is the target ISA for CPU design in this course
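As a rough illustration of the three formats, the sketch below packs fields into 32-bit words. The field widths (OP 6 bits; rs, rt, rd, sa 5 bits each; funct 6 bits; immediate 16 bits; jump target 26 bits) are the standard MIPS-I layout and are not listed on this slide; the function names and the example register numbers are hypothetical.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """R-format: OP(6) rs(5) rt(5) rd(5) sa(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I-format: OP(6) rs(5) rt(5) immediate(16); immediate kept to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J-format: OP(6) jump target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


# add R8, R17, R18: OP = 0, funct = 0x20 (rd = 8, rs = 17, rt = 18)
add_word = encode_r(op=0, rs=17, rt=18, rd=8, sa=0, funct=0x20)
# lw R8, 4(R17): OP = 0x23, base register rs = 17, destination rt = 8
lw_word = encode_i(op=0x23, rs=17, rt=8, immediate=4)
print(f"{add_word:08x} {lw_word:08x}")
```

Because every format is exactly 32 bits and OP, rs and rt sit in the same bit positions in the R and I formats, the decoder can read those fields before it knows which format applies.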