Lecture 12 - Instruction Set(1)


Lecture 7
Instruction Set
CS311-Computer Organization
Instruction Set
Lecture 7 -1
Lecture 7:
Instruction Set
In this lecture, we will study
• Languages of computers
• Meaning of an Instruction Set
• Components of an instruction
• Four types of operation
• Basic instruction set
• Instructions on three different computer architectures
– Stack machine architecture
– AC machine architecture
– GPR machine architecture
• Instruction set design
– Trade-off
– Minimal instruction set
• Powerful instructions
Lecture 7:
Instruction Set
In this lecture, we will study (cont.)
• Classification of instruction types by the location of the operands
– Stack Instruction
– AC Instruction
– R-R Instruction
– R-M Instruction
– M-M Instruction
• Classification of instruction types by the number of operands whose addresses are represented in the instruction
– 3-address Instruction
– 2-address Instruction
– 1-address Instruction
– 0-address Instruction
Languages of Computers
• Machine Language
– Programs consist of machine instructions
– Directly executable without preprocessing
– Direct manipulation of machine registers
– Efficient in view of machine resource utilization
– Difficult to program
• Assembly Language
– Improved version of machine language with emphasis on user-friendliness
» Symbolic machine language (symbols for operations and addresses)
– An assembler is needed to translate into a machine language program
• High-Level Language
– Programs consist of statements, each of which can be translated into several machine language instructions
– Need a compiler to translate into a machine language program
– Relatively easy to program compared to ML or AL
– Hardware resource utilization may be inefficient
We are dealing with Machine Languages in this class
Semantic Gap Between
ML and HLL
• As Hardware cost goes down, Software cost goes up
– Shortage of programmers
– Unreliable Software => Unreliable Computers
[Figure: system cost versus year, split into HW and SW components]
• Response: Keep the programming cost down
– Develop powerful, complex user-friendly HLL
– HLL programmers are easy to train
• Greater Semantic Gap between HLL and Machine Language
– Execution inefficiency
– Software complexity
– Compiler complexity
• To offset the semantic gap
– Large instruction set
– Variety of addressing modes
– Hardware/Firmware implementation of HLL primitives
Instruction Set
Boundary between designers (architects) and programmers
– For designers: Specification of the function of the CPU
– For programmers: A pool of functions from which they choose to use in the program
"One would expect that human language should directly reflect the characteristics of human intellectual capabilities that language should be a direct mirror of mind in ways which other systems of knowledge and belief cannot." - Noam Chomsky
Instruction Set
– Language of a machine
– Characterizes the machine’s capability and behavior
Performance Issues
– Memory Bandwidth (MB) is used 1/2 for Instructions and 1/2 for Data
– For efficient utilization of MB, instruction representation must be as compact as possible while still being compatible with data
– The von Neumann Bottleneck exists in MB
Memory BW Usage
• Half of Memory Bandwidth allocated to CPU is used for
Instruction and another half for Data
• Consider AC-machine
– ADD X, and LDA X
– Half of BW for Instruction Fetch from Memory and
another half of BW for Operand Fetch from Memory
• Consider GPR-machine
– ADD R1, X and LD R1, X
– Half of BW for Instruction Fetch from Memory and
another half of BW for operand Fetch from Memory
Memory Bandwidth Issue
• Total Memory Bandwidth is used by CPU and I/O
[Figure: total memory bandwidth over time, alternating between instruction execution and I/O]
• Memory Bandwidth given to CPU is used for Instruction Fetches and Operand Fetches or Operand Stores
• Consider an AC-machine: ADD X, or LDA X
[Figure: memory bandwidth given to CPU; successive instructions each pass through IF, D/IP, OF, and E]
Machine Language
Machine Language
– Vocabulary
» Operations
» Addressing Modes for operands’ addresses and the next instruction
address
– Syntax
» Methods of representing operation(OP-code), operands, addresses
in an instruction
> Instruction format
> Encoding of Instruction fields
– Grammar
» Rules of using instructions to make a program
Components of an Instruction
Operation Code(OP-code)
– Format specifier
» Long / Short
» Field definition
– Operation
– Types of operands
Operand Address(es)
– Operand itself
– Addresses themselves (including abbreviated ones)
– Address modification specification
» Automatic indexing
» Relative address
– Sequencing
Four Types of Operations
Functional Operations(Process information)
– Arithmetic Operations
» Add, Shift, complement
– Logical Operations
» AND, OR, I
Transfer Operations(Move information)
– Moving information between CPU and memory
» Load, Store
Control Operations(Decide the order of instruction execution)
– Controlling the order of instruction execution
– Conditional and unconditional branches
Input/Output Operations(Move information)
The most important type of operation is the Functional Operation
Basic Instructions
Functional: ADD, AND, CPA, CPC, ROL, CLA, CLC, INC
Transfer: LDA, STA (LD, ST)
Control: JMP, JNA, JZA, JZC (SMA, SZA, SZC)
Input/Output: INP, OUT
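Time Out
• A 93-year-old man who had married a very young woman visited his doctor and boasted that the couple was expecting a child.
• The doctor replied, "Let me tell you a story. A very forgetful friend of mine went hunting, and he took an umbrella instead of a gun. Suddenly a lion appeared and charged at him, so he aimed the umbrella at the lion and fired. The lion dropped dead on the spot."
• "Nonsense! Someone standing nearby must have fired a gun for him," said the old man.
• "Exactly right," the doctor agreed.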
Instruction Set
and Computer Architecture
Computer Architectures are classified into three classes according to the register structure used for operand storage
– Stack Computer Architecture
– AC Computer Architecture
– General Purpose Register Computer Architecture
[Figure: block diagrams of the Stack, AC, and GPR architectures, showing the input bus, the operand storage (stack, AC with other registers, or general purpose registers), the ALU, and the output bus]
Stack Computer Architecture
[Figure: stack S with entries 0 to n-1, stack pointer SP, Full (F) and Empty (E) flags, and ALU]
Instruction                  Operation
PUSH X                       if F=1, then S overflow;
                             else SP <- SP+1, S[SP] <- M[X],
                                  if SP=(n-1), then F <- 1, E <- 0
POP X                        if E=1, then empty S;
                             else M[X] <- S[SP], SP <- SP-1, F <- 0,
                                  if SP=(n-1), E <- 1;
Unary Instr. (Shift Right)   if E=1, then empty S;
                             else ALU <- S[SP], S[SP] <- ALU;
Binary Instr. (ADD)          if E=1, then empty S;
                             else ALU <- S[SP], SP <- SP-1,
                                  if SP=(n-1), then E <- 1, empty S;
                                  else ALU <- S[SP], S[SP] <- ALU
Characteristics
of the Stack Architecture
• Instruction length is short
– No need to represent the address(es) of operand(s) in
functional instructions
• Instruction execution time is fast
– Operand(s) access is fast because they are in the
stack(register)
• Operand(s) must be stored in the stack before operating on
them
– Inconvenient to prepare data in the stack
– Frequent use of PUSH and POP instructions to prepare
data in the stack - memory access
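The stack operations can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the class name `StackMachine` is hypothetical, SP is assumed to wrap modulo n (so an empty stack rests at SP = n-1), E is assumed to be cleared on every successful push, and F is assumed to be cleared whenever an entry is consumed.

```python
class StackMachine:
    """Toy n-entry register stack with Full (F) and Empty (E) flags."""

    def __init__(self, n, memory):
        self.n = n
        self.S = [0] * n
        self.SP = n - 1            # assumed: SP wraps modulo n, empty stack rests at n-1
        self.F, self.E = 0, 1
        self.M = memory            # dict standing in for main memory

    def push(self, x):             # PUSH X
        if self.F:
            raise OverflowError("S overflow")
        self.SP = (self.SP + 1) % self.n
        self.S[self.SP] = self.M[x]
        self.E = 0                 # assumed: cleared on every successful push
        if self.SP == self.n - 1:
            self.F = 1

    def pop(self, x):              # POP X
        if self.E:
            raise IndexError("empty S")
        self.M[x] = self.S[self.SP]
        self.SP = (self.SP - 1) % self.n
        self.F = 0
        if self.SP == self.n - 1:
            self.E = 1

    def add(self):                 # binary functional instruction, no address field
        if self.E:
            raise IndexError("empty S")
        alu = self.S[self.SP]
        self.SP = (self.SP - 1) % self.n
        self.F = 0                 # assumed: consuming an entry clears F
        if self.SP == self.n - 1:
            self.E = 1
            raise IndexError("empty S")
        self.S[self.SP] = alu + self.S[self.SP]


memory = {"A": 1, "B": 2, "C": 3, "R": 0}
machine = StackMachine(4, memory)
for name in ("A", "B", "C"):       # three PUSHes prepare the data
    machine.push(name)
machine.add()
machine.add()
machine.pop("R")
print(memory["R"])
```

Only PUSH and POP name a memory address; the two ADDs carry no address at all, which is why the functional instructions are short but the data preparation is not.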
AC Computer Architecture
[Figure: Input Bus, AC, Other Registers, ALU, Output Bus]
Instruction                     Operation
Unary Instruction (CPA)         AC <- f(AC)
Binary Instruction (ADD X)      AC <- f(AC, M[X])
Transfer Instruction (LDA X)    AC <- M[X]
                     (STA X)    M[X] <- AC
Characteristics:
- Execution of binary instructions is slow
» One of the operands must be read from memory
- Instruction length is longer than in the stack architecture
» The memory address of one operand must be specified in the instruction, although AC (a data register) can be implied
- Frequency of LDA/STA instructions is high
» There is only one data register
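A small Python sketch can make the memory traffic of an AC machine visible. The function name `run_ac` and the tuple encoding of instructions are hypothetical, and CPA is assumed to be a bitwise complement of AC; the sketch counts one memory access per instruction fetch plus one per operand fetch or store.

```python
def run_ac(program, memory):
    """Run (opcode, address) pairs on a one-register AC machine.

    Returns the final AC value and the number of memory accesses.
    """
    ac = 0
    accesses = 0
    for opcode, x in program:
        accesses += 1                  # instruction fetch
        if opcode == "LDA":
            ac = memory[x]
            accesses += 1              # operand fetch
        elif opcode == "ADD":
            ac = ac + memory[x]
            accesses += 1              # operand fetch
        elif opcode == "STA":
            memory[x] = ac
            accesses += 1              # operand store
        elif opcode == "CPA":
            ac = ~ac                   # assumed meaning: complement AC, no operand access
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return ac, accesses


memory = {"A": 1, "B": 2, "C": 3}
result, accesses = run_ac([("LDA", "A"), ("ADD", "B"), ("ADD", "C")], memory)
print(result, accesses)
```

Every LDA, ADD, and STA costs two accesses, so half of the traffic is instruction fetch and half is operand traffic.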
GPR Computer Architecture
[Figure: Input Bus, Registers, ALU, Output Bus]
Instruction                            Operation
Unary Instruction (COMP R1, R2)        R1 <- f(R2)
Binary Instruction (ADD R1, R2)        R1 <- f(R1, R2)
                or (ADD R1, R2, R3)    R3 <- f(R1, R2)
Transfer Instruction (LD R1, X)        R1 <- M[X]
                     (ST R1, X)        M[X] <- R1
Characteristics:
- Instruction length is short because register addresses are used for operands
- Instruction execution time is fast because all the operands are in the registers
- Frequency of using LD/ST instructions depends on the number of registers
- The chance of keeping the results of operations in GPRs is high because there are many registers
A Minimal Instruction Set:
BN Instruction
BN Instruction (Branch on Negative)
The instruction set consists of 1 instruction - a minimal instruction set
BN a1, a2, a3
    M[a1] <- M[a1] - M[a2]
    AC <- M[a1] - M[a2]
    If AC < 0, then PC <- a3
We will see another minimal instruction later
Programming with BN Instruction
assuming M[0] = 0 and M[1] = 1
LDA a1
    BN a1, 0, a3
STA a1
    Content of AC is somewhere in memory; if it is in M[a1], then no operation; if it is in M[a4], then
    BN a2, a2, a3      /M[a2] <- 0
    BN a2, a4, a3      /M[a2] <- - M[a4]
    BN a1, a1, a3      /M[a1] <- 0
    BN a1, a2, a3      /M[a1] <- M[a4]
ADD X, Y
    BN a1, a1, a3      /M[a1] <- 0
    BN a1, Y, a3       /M[a1] <- - M[Y]
    BN X, a1, a3       /M[X] <- M[X] + M[Y]
SUB X, Y
    BN X, Y, a3        /M[X] <- M[X] - M[Y]
JMP X
    BN a1, a1, a3      /M[a1] <- 0
    BN a1, 1, X        /AC <- -1, PC <- X
a3: labels the next instruction in sequence
Inefficient: a lot of waste of Memory Space and Memory Bandwidth
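The BN sequences can be checked with a tiny interpreter. The following Python sketch makes several simplifying assumptions that are not in the slides: the program is held in a list separate from data memory, branch targets are indices into that list, and each a3 is set to the index of the following instruction so that the branch never changes the order of execution. The names `run_bn` and `add_xy` are hypothetical.

```python
def run_bn(program, memory, max_steps=1000):
    """Execute BN triples (a1, a2, a3) until PC leaves the program."""
    pc = 0
    ac = 0
    for _ in range(max_steps):
        if not 0 <= pc < len(program):
            break
        a1, a2, a3 = program[pc]
        ac = memory[a1] - memory[a2]      # AC <- M[a1] - M[a2]
        memory[a1] = ac                   # M[a1] <- M[a1] - M[a2]
        pc = a3 if ac < 0 else pc + 1     # branch on negative
    return ac


def add_xy(x, y, temp, start):
    """BN sequence for ADD X, Y, assumed to sit at program index `start`."""
    return [
        (temp, temp, start + 1),          # M[temp] <- 0
        (temp, y, start + 2),             # M[temp] <- -M[Y]
        (x, temp, start + 3),             # M[X] <- M[X] + M[Y]
    ]


memory = {"X": 5, "Y": 7, "T": 99}
run_bn(add_xy("X", "Y", "T", 0), memory)
print(memory["X"])
```

One addition costs three instructions, each with three memory addresses and three data accesses, which illustrates the waste of memory space and bandwidth.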
Instruction Set Design:
Operations to Include in the Instruction Set
Trade-off 3 Es(Elegance, Efficiency, Environment)
Elegance
– Completeness(Even Bn instruction is complete)
– Symmetry:
AC <= f(AC, M[X]) and M[X] <= f(AC, M[X])
– Flexibility, Generality
Efficiency
– Space
» Bit budget
> Efficient specification of address
> Fewer instructions require fewer bits to encode OP-code
– Frequency of use arguments
– Bandwidth arguments (NOP simply wastes memory bandwidth)
– Ratio of overheads: non-functional to functional
Environment
– Multiprogramming(Relocation, Protection, Sharing)
– Code generation by compilers (compilers favor only a small portion of the instruction set)
Powerful Instructions:
Multiple Operations
Instruction Cycle
– I-F, I-P, O-F: Instruction Fetch, Decode, Opd addr decision, and fetch. This is the Overhead for Execution (O)
– E: Execution (Operation) (E)
Rich, Powerful Instruction:
– Instruction with longer Execution Time (E) to balance the overhead penalty (O)
– Instruction which has a large E/O
[Figure: instruction timelines with the same overhead O (IF, IP, OF) followed by a short or a long execution time E]
Powerful Instructions
• Extended Arithmetic Function
– Multiply, Divide, Trigonometric Functions, etc.
• Automatic Indexing
– BCT R1, addr
(R1 <- R1 - 1, if R1 = 0 then PC <- addr)
– BXLE R1, R3, addr
(R1 <- R1 + R3; if R3=odd and R1 < R3, then PC <- addr; if R3=even and R1 < R3+1, then PC <- addr)
• Subroutine Linkage
– JMS X
(M[X] <- PC, PC <- X+1)
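The JMS linkage can be sketched in Python as follows. The slide only gives the call side; returning by jumping to the address saved in M[X] is an assumption added here for illustration, and the function names `jms` and `return_from` are hypothetical. PC is assumed to already point at the instruction after JMS when the call executes.

```python
def jms(memory, pc, x):
    """JMS X: save the return address in M[X] and continue at X+1."""
    memory[x] = pc          # M[X] <- PC
    return x + 1            # PC <- X+1


def return_from(memory, x):
    """Assumed return convention: jump to the address saved in M[X]."""
    return memory[x]


memory = [0] * 16
pc_after_call = 5                       # address following the JMS instruction
pc = jms(memory, pc_after_call, 10)     # subroutine header word at address 10
print(pc, memory[10])
pc = return_from(memory, 10)
print(pc)
```

A single instruction both saves the return address and transfers control, which is what makes it a powerful instruction; the price is that the saved address lives inside the subroutine body.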
Powerful Instructions:
Meta Instruction
Meta Instruction
– EXECUTE X or EXECUTE R, X
» Next instruction address is generated from X
» R is used for address modification of the instruction at X
» After execution at X, normal in-line sequencing resumes
REPEAT instruction (UNIVAC 1103A)
REPEAT j, n, w
next instruction u, v
– The next instruction is executed n times using the addresses specified by u and v and the modification information specified by j
j=0: u and v do not change
j=1: v <- v+1
j=2: u <- u+1
j=3: v <- v+1, u <- u+1
– After the next instruction has been executed n times, the program continues at the fixed location F using w. F contains an unconditional branch instruction
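The address modification performed by REPEAT can be sketched in Python. This models only the j and n fields; the jump through the fixed location F using w is left out, and the repeated instruction `move` (M[v] <- M[u]) is a hypothetical two-address instruction chosen for illustration.

```python
def repeat(j, n, instruction, u, v, memory):
    """Execute `instruction` n times, stepping u and/or v as selected by j."""
    for _ in range(n):
        instruction(memory, u, v)
        if j in (2, 3):
            u += 1          # j=2 or j=3: u <- u+1
        if j in (1, 3):
            v += 1          # j=1 or j=3: v <- v+1


def move(memory, u, v):
    """Hypothetical repeated instruction: M[v] <- M[u]."""
    memory[v] = memory[u]


memory = [1, 2, 3, 0, 0, 0]
repeat(3, 3, move, 0, 3, memory)    # j=3: both addresses advance, giving a block copy
print(memory)
```

With j=1 the same call would instead fill three words with M[u], since only v advances.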
Powerful Instructions:
Multiple Register Move
Process State Exchange (Context Switch):
Instructions required in the multiprogramming environments
LM R1, R5, addr     R1 <- M[addr], R2 <- M[addr+1], …, R5 <- M[addr+4]
SM R1, R5, addr     M[addr] <- R1, M[addr+1] <- R2, …, M[addr+4] <- R5
Otherwise
LD R1, addr
LD R2, addr+1
…
LD R5, addr+4
XJ (Exchange Jump of CDC 6000 series)
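A minimal Python sketch of the multiple register move follows. The function names `lm` and `sm` mirror the mnemonics above; representing the register file and memory as Python lists is an assumption made for illustration.

```python
def lm(regs, memory, first, last, addr):
    """LM Rfirst, Rlast, addr: load consecutive registers from consecutive words."""
    for offset, r in enumerate(range(first, last + 1)):
        regs[r] = memory[addr + offset]


def sm(regs, memory, first, last, addr):
    """SM Rfirst, Rlast, addr: store consecutive registers to consecutive words."""
    for offset, r in enumerate(range(first, last + 1)):
        memory[addr + offset] = regs[r]


regs = [0] * 8
memory = list(range(100, 120))      # memory[k] holds 100 + k
lm(regs, memory, 1, 5, 10)          # R1..R5 <- M[10]..M[14]
print(regs)
sm(regs, memory, 1, 5, 0)           # M[0]..M[4] <- R1..R5
print(memory[:5])
```

One LM replaces five LD instructions, so only one instruction fetch is paid for five operand transfers during a context switch.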
Classifications of Instruction
• Location of operands for the Functional Instructions
– Registers including AC
– Memory
– Stack
• Number of addresses that need to be explicitly specified in an instruction
4 addresses: 2 Source Operands, 1 Result (Destination), and the Next instruction address
– 0-address instruction
– 1-address instruction
– 2-address instruction
– 3-address instruction
Classification of Instruction Type - Operand Location:
Stack Instruction
• Stack instruction is used only by the Stack Computer Architectures
– Functional instructions
» Unary: S[SP] <- fu(S[SP]) (SP does not change)
> SHR, CMP, INC,...
» Binary: S[SP-1] <- fb(S[SP],S[SP-1]), SP <- SP-1 (SP changes to SP-1)
> ADD, AND, ...
– Transfer instructions are also different from the transfer instructions of other architectures
» Load: PUSH X
> SP <- SP+1, S[SP] <- M[X] (remember to check for stack overflow)
» Store: POP X
> M[X] <- S[SP], SP <- SP-1 (remember to check for empty stack)
Characteristics
of the Stack Instruction
• Short instruction
• 1-cycle instruction(functional)
– 1 memory cycle for instruction fetch
– No need for memory access for operand
• Need to prepare data in the order of calculations
• Frequent use of PUSH and POP instructions, which are 2-cycle instructions
Classification of Instruction Type by the Location of Operands:
AC Instruction
• AC instruction is used only by the AC Computer Architectures
– Functional instructions
» Unary: AC <- fu(AC)
> SHR, CMP, INC,...
» Binary: AC <- fb(AC, M[X])
> ADD X, AND X, ...
– Transfer instructions are also different from the transfer
instructions used by other architectures
» Load: LDA X
> AC <- M[X]
» Store: STA X
> M[X] <- AC
Characteristics of AC Instruction
• Instruction length is medium
• 2-cycle instruction(functional)
– 1 memory cycle for instruction fetch
– 1 memory access for operand fetch
• Frequent use of LD/ST instructions because there is only
one data register
Classification of Instruction Type by the Location of Operands:
R-R Instruction
• R-R instruction is used only by the GPR Computer Architectures
– Functional instructions
» Unary: R1 <- fu(R2) or R1 <- fu(R1)
> SHR R1, R2; CMP R1, R2; INC R1, R2; ... or
> SHR R1; CMP R1; INC R1; ...
» Binary: R3 <- fb(R1, R2) or R1 <- fb(R1, R2)
> ADD R1, R2, R3; AND R1, R2, R3; ... or
> ADD R1, R2; AND R1, R2; ...
– Transfer instructions are common to all GPR machines
» Load: LD R1, X
> R1 <- M[X]
» Store: ST R1, X
> M[X] <- R1
Characteristics of R-R Instruction
• Instruction length is short - register address
• 1-cycle instruction(functional)
– 1 memory cycle for instruction fetch
– No need for memory access for operands
• Frequency of using LD/ST instructions is low because there
are plenty of registers
Classification of Instruction Type by the Location of Operands:
R-M Instruction
• R-M instruction is used only by the GPR Computer Architectures
– Functional instructions
» Unary: R1 <- fu(R1)
> SHR R1; CMP R1; INC R1; ...
» Binary: R1 <- fb(R1, M[X])
> ADD R1, X; AND R1, X; ...
– Transfer instructions are common to all GPR machines
» Load: LD R1, X
> R1 <- M[X]
» Store: ST R1, X
> M[X] <- R1
Characteristics of R-M
Instruction
• Instruction length is intermediate
• 2-cycle instruction (functional)
– 1 memory cycle for instruction fetch
– 1 memory cycle for 1 operand fetch
• Frequency of using LD/ST instructions is low
– There are plenty of registers
– Binary functional instructions load data without using LD instructions
• Program becomes shorter
– The R-R pair LD R2, A followed by ADD R1, R2 is replaced by the single R-M instruction ADD R1, A
Classification of Instruction Type by the Location of Operands:
M-M Instruction
• M-M instruction is used only by high performance GPR architectures
– Functional instructions
» Unary: M[X1] <- fu(M[X1])
> SHR X1; CMP X1; INC X1; ...
» Binary: M[X1] <- fb(M[X1], M[X2])
> ADD X1, X2; AND X1, X2; ...
> This architecture also has ADD R, X type of R-M instructions
– Transfer instructions are common to all GPR machines
» Load: LD R, X
> R <- M[X]
» Store: ST R, X
> M[X] <- R
Characteristics of M-M
Instruction
• Instruction length is extremely long
• 4-cycle instruction(functional)
– 1 memory cycle for instruction fetch
– 2 memory cycles for operand fetch
– 1 memory cycle for result operand store
• Frequency of using LD/ST instructions is very low
– Binary functional instructions load data without using LD
instructions
• Program becomes short
– Two LDs, one ST, and one ADD instruction are replaced by a single ADD instruction
Example 1
3 numbers a, b, and c are stored in memory locations A, B, and C.
Assume that there are plenty of data registers, except in the AC architecture.
Compare the five different architectures when calculating
R <- M[A]+M[B]+M[C]
Program Comparison
S instruction    AC instruction    R-R instruction    R-M instruction    M-M instruction
PUSH A           LDA A             LD  R1,A           LD  R1,A           ADD A,B
PUSH B           ADD B             LD  R2,B           ADD R1,B           ADD A,C
PUSH C           ADD C             LD  R3,C           ADD R1,C           LD  R1,A
ADD                                ADD R1,R2
ADD                                ADD R1,R3
Example 1 - Analysis
Assumption:
OP-code: 6 bits
R addr: 4 bits
M addr: 10 bits
Operand: 24 bits
Length of a functional instruction:
S Instruction: 6 bits
AC Instruction: 16 bits
R-R Instruction: 14 bits
R-M Instruction: 20 bits
M-M Instruction: 26 bits (6 + 10 + 10)
Instr Set Efficiency = No. of Func. Instr. / No. of Other Instr.

Type of    No. of    No. of    M-CPU                   Instruction Set
instr.     Instr.    Cycles    transferred bits        Efficiency
S          5         8         instr: 60, data: 72     0.67
AC         3         6         instr: 48, data: 72     2.0
R-R        5         8         instr: 88, data: 72     0.67
R-M        3         6         instr: 60, data: 72     2.0
M-M        3         10        instr: 72, data: 168    2.0

– AC and R-M: Number of instructions and number of cycles are low. Highly efficient and fast. But… see the next example
– M-M: Spends a lot of MB
– In scientific computation environments, a lot of intermediate results are produced and they remain in the registers or stack. In this situation, the efficiency of S and R-R instructions could become higher than that of AC, R-M, and M-M instructions.
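The figures in the Example 1 analysis can be recomputed from the field widths. The Python sketch below is an illustration, not part of the original slides: the helper names `instr` and `analyze` are hypothetical, each instruction is described by how many register addresses, memory addresses, and operand transfers it needs, and a cycle is counted as one memory access (one for the instruction fetch plus one per operand transferred).

```python
OP, R_ADDR, M_ADDR, OPERAND = 6, 4, 10, 24   # field widths in bits


def instr(n_reg, n_mem, n_data, functional):
    """Describe one instruction by its address fields and operand transfers."""
    return {
        "bits": OP + n_reg * R_ADDR + n_mem * M_ADDR,
        "data": n_data,                # operands moved between memory and CPU
        "functional": functional,
    }


PUSH = instr(0, 1, 1, False)
ADD_S = instr(0, 0, 0, True)
LDA = instr(0, 1, 1, False)
ADD_AC = instr(0, 1, 1, True)
LD = instr(1, 1, 1, False)
ADD_RR = instr(2, 0, 0, True)
ADD_RM = instr(1, 1, 1, True)
ADD_MM = instr(0, 2, 3, True)          # two operand fetches and one result store

programs = {
    "S": [PUSH, PUSH, PUSH, ADD_S, ADD_S],
    "AC": [LDA, ADD_AC, ADD_AC],
    "R-R": [LD, LD, LD, ADD_RR, ADD_RR],
    "R-M": [LD, ADD_RM, ADD_RM],
    "M-M": [ADD_MM, ADD_MM, LD],
}


def analyze(program):
    """Return (instructions, cycles, instruction bits, data bits, efficiency)."""
    count = len(program)
    cycles = sum(1 + i["data"] for i in program)
    instr_bits = sum(i["bits"] for i in program)
    data_bits = sum(i["data"] * OPERAND for i in program)
    functional = sum(i["functional"] for i in program)
    other = count - functional
    efficiency = functional / other if other else float("inf")
    return count, cycles, instr_bits, data_bits, round(efficiency, 2)


for name, program in programs.items():
    print(name, analyze(program))
```

Passing a program with no transfer instructions, such as `[ADD_S, ADD_S]`, yields an infinite efficiency, because the ratio has no non-functional instructions in its denominator.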
Example 2
Solve the same problem when the three numbers are intermediate results, so that they are stored in registers or in a stack.
Thus, it is natural to assume the following:
Stack architecture: a, b, and c are in the stack
AC architecture: Only a is in AC, and b and c are in M[B] and M[C]
GPR architecture: a, b, and c are stored in Ra, Rb, and Rc, respectively; however, for this example
For R-M instruction: only a is stored in Ra, and b and c are in memory
For M-M instruction: All three data are in memory
Program Comparison
S instruction    AC instruction    R-R instruction    R-M instruction    M-M instruction
ADD              ADD B             ADD Ra,Rb          ADD Ra,B           ADD A,B
ADD              ADD C             ADD Ra,Rc          ADD Ra,C           ADD A,C
                                                                         LD  Ra,A
Example 2 - Analysis
Type of    No. of    No. of    M-CPU                   Efficiency
instr.     Instr.    Cycles    transferred bits
S          2         2         instr: 12, data: 0      Infinity
AC         2         4         instr: 32, data: 48     Infinity
R-R        2         2         instr: 28, data: 0      Infinity
R-M        2         4         instr: 40, data: 48     Infinity
M-M        3         10        instr: 72, data: 168    2.0

– S and R-R: Both number of instructions and number of cycles are low, with less use of MB. Highly efficient and fast
– M-M: Spends a lot of MB
– Infinity is a little exaggeration, but it is certain that, in actual program runs, the efficiency is high.
– But in reality, it is not possible to avoid using transfer instructions, and there should be control and input/output instructions.
Classification of Instruction Type by the Location of Operands:
Conclusion
• Functional instructions of the S and R-R types do not require memory access for operand fetch, which results in 1-cycle instructions
• S and R-R instructions are short, and the program can be coded with a small number of S and R-R instructions
• Functional instructions of the AC and R-M types require one memory access for an operand, which results in 2-cycle instructions
• Functional instructions of the M-M type require two memory accesses for operands and one memory access for storing the result, which results in 4-cycle instructions
• Memory Bandwidth utilization of S and R-R instructions is efficient, but that of M-M instructions is very inefficient
Classification by the
Number of Addresses
• 0-address Instruction
– Specify only OP-code; address(es) of operand(s) is (are) implied (Stack architecture)
– Require PUSH and POP instructions for transferring data between Memory and Stack, which are 1-address instructions
• 1-address Instruction
– Specify address of 1 operand in addition to OP-code (AC architecture)
– Address of the other operand and the address of the result are implied to be AC
– Uniform instruction type for all instructions (LDA and ADD both are 1-address)
• 2-address Instruction (2g-instruction, (1+g)-instruction)
– Specify addresses of 2 operands in addition to OP-code (R-R, R-M, M-M architectures)
– Address of the result is implied to be one of the addresses of the operands
• 3-address Instruction (3g-instruction, (1+2g)-instruction, …)
– Specify addresses of 2 operands and the result (R-R architecture)
– This type of instruction is hardly used in M-M architecture because of the instruction length
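The effect of the address count on instruction length can be sketched in Python. This is a hedged illustration: it assumes that "g" in the notation above denotes a general register address, that a bare number counts memory addresses, and it reuses the field widths from Example 1 (OP-code 6 bits, register address 4 bits, memory address 10 bits). The function name `instruction_bits` is hypothetical.

```python
def instruction_bits(fmt, op=6, g=4, m=10):
    """Length in bits of an instruction whose address format is e.g. '1+2g'.

    Each '+'-separated term is either 'kg' (k register addresses)
    or 'k' (k memory addresses); 'g' alone means one register address.
    """
    bits = op
    for term in fmt.split("+"):
        term = term.strip()
        if term.endswith("g"):
            bits += int(term[:-1] or 1) * g
        else:
            bits += int(term) * m
    return bits


for fmt in ("0", "1", "2g", "1+g", "2", "3g", "1+2g", "3"):
    print(fmt, instruction_bits(fmt))
```

Under these widths a 3-address instruction with three memory addresses needs 36 bits, against 18 bits for the 3g form, which is consistent with the remark that 3-address instructions are hardly used in M-M architectures.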