Lectures for 2nd Edition

Chapter 3
Machine Language Instructions
CSE 45432 SUNY New Paltz
Generic Examples of Instruction Format Widths
[Figure: example instruction formats for Variable, Fixed, and Hybrid widths]
If code size is most important, use variable length instructions
If performance is most important, use fixed length instructions
General Purpose Registers Dominate
1975-1995: all machines use general purpose registers
Expect any new instruction set architecture to use general purpose registers
Advantages of registers
• registers are faster than memory
• registers are easier for a compiler to use
  - e.g., (A*B) - (C*D) - (E*F) can do the multiplies in any order (vs. a stack, which constrains the order)
• registers can hold variables
  - memory traffic is reduced, so the program is sped up (since registers are faster than memory)
  - code density improves (since a register is named with fewer bits than a memory location)
Addressing Objects: Endianness and Alignment
• Big Endian: address of most significant byte = word address. IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
• Little Endian: address of least significant byte = word address. Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
[Figure: a 4-byte word with the msb on the left and the lsb on the right. Little endian numbers the bytes 3 2 1 0, so byte 0 is the lsb; big endian numbers them 0 1 2 3, so byte 0 is the msb.]
• Alignment: require that objects fall on an address that is a multiple of their size.
[Figure: aligned and not-aligned objects placed at byte offsets 0 1 2 3]
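The byte-order and alignment rules above can be checked with a small sketch. It is only an illustration: the helper names `word_bytes` and `is_aligned` and the sample value 0x0A0B0C0D are assumptions, not part of the lecture.

```python
def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


def is_aligned(address, size):
    """An object is aligned when its address is a multiple of its size."""
    return address % size == 0


# Hypothetical sample word: the msb is 0x0A and the lsb is 0x0D.
word = 0x0A0B0C0D

# Big endian: byte 0 (the word address) holds the most significant byte.
big = word_bytes(word, "big")

# Little endian: byte 0 (the word address) holds the least significant byte.
little = word_bytes(word, "little")
```

With this sketch, `big[0]` is the msb and `little[0]` is the lsb, and `is_aligned(6, 4)` is false because 6 is not a multiple of the 4-byte word size.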
Top 10 80x86 Instructions
Rank  Instruction               Integer average (percent of total executed)
1     load                      22%
2     conditional branch        20%
3     compare                   16%
4     store                     12%
5     add                       8%
6     and                       6%
7     sub                       5%
8     move register-register    4%
9     call                      1%
10    return                    1%
      Total                     96%
Simple instructions dominate instruction frequency
Machine Language Instructions:
• More primitive than higher level languages
• Very restrictive
• We’ll be working with the MIPS instruction set architecture
– similar to other architectures developed since the 1980's
– used by NEC, Nintendo, Silicon Graphics, Sony
Design goals: maximize performance and minimize cost, reduce design time
MIPS ISA
MIPS assumes 32 CPU registers ($0, ..., $31)
• All arithmetic instructions have 3 operands
• Operand order is fixed (destination first in assembly instruction)
• Operands of arithmetic instructions are in registers
• Simple memory addressing mechanism
C code:
    A = B + C + D;
    E = F - A;
MIPS code (produced by the compiler):
    add $t0, $s1, $s2
    add $s0, $t0, $s3
    sub $s4, $s5, $s0
$t0, $s1, $s2, … are symbolic names for registers (translated to the corresponding numbers by the assembler).
Register Usage Conventions
Name       Register number   Usage
$zero      0                 the constant value 0
$v0-$v1    2-3               values for results and expression evaluation
$a0-$a3    4-7               arguments
$t0-$t7    8-15              temporaries
$s0-$s7    16-23             saved
$t8-$t9    24-25             more temporaries
$gp        28                global pointer
$sp        29                stack pointer
$fp        30                frame pointer
$ra        31                return address
Registers hold 32 bits of data
Register zero always has the value zero (even if you try to write it)
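The name-to-number translation done by the assembler, and the behaviour of register zero, can be sketched as follows. This is a minimal model, not an actual assembler: `REGISTER_NUMBERS` and `RegisterFile` are hypothetical names introduced for illustration.

```python
# Symbolic register names mapped to numbers, following the usage conventions.
REGISTER_NUMBERS = {"$zero": 0, "$gp": 28, "$sp": 29, "$fp": 30, "$ra": 31}
for prefix, first, count in [("$v", 2, 2), ("$a", 4, 4), ("$t", 8, 8), ("$s", 16, 8)]:
    for i in range(count):
        REGISTER_NUMBERS[f"{prefix}{i}"] = first + i
REGISTER_NUMBERS["$t8"] = 24
REGISTER_NUMBERS["$t9"] = 25


class RegisterFile:
    """Hypothetical model of 32 registers, each holding 32 bits of data."""

    def __init__(self):
        self.values = [0] * 32

    def read(self, name):
        return self.values[REGISTER_NUMBERS[name]]

    def write(self, name, value):
        number = REGISTER_NUMBERS[name]
        if number != 0:  # writes to register zero are ignored
            self.values[number] = value & 0xFFFFFFFF  # keep only 32 bits


regs = RegisterFile()
regs.write("$zero", 5)         # has no effect
regs.write("$t0", 2**32 + 7)   # only the low 32 bits are kept
```

After these two writes, `regs.read("$zero")` is still 0 and `regs.read("$t0")` is 7.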
Memory Organization
• Viewed as a large, single-dimension array, with an address.
• A memory address is an index into the array
• A word in MIPS is 32 bits long (4 bytes)
• "Byte addressing" means that the index points to a byte of memory.
[Figure: byte view of memory, addresses 0, 1, 2, 3, 4, 5, 6, ..., each holding 8 bits of data; word view of memory, addresses 0, 4, 8, 12, ..., each holding 32 bits of data]
• 2^32 bytes with byte addresses from 0 to 2^32 - 1
• 2^30 words with byte addresses 0, 4, 8, ..., 2^32 - 4
• Words are aligned! (What are the least 2 significant bits of a word address?)
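The arithmetic behind byte addressing and word alignment can be sketched as follows. The names `MEMORY_BYTES`, `WORD_SIZE`, and `word_index` are assumptions made for this illustration only.

```python
MEMORY_BYTES = 2**32   # byte addresses run from 0 to 2^32 - 1
WORD_SIZE = 4          # a MIPS word is 32 bits, i.e. 4 bytes


def word_index(byte_address):
    """Map an aligned byte address to its word number.

    An aligned word address is a multiple of 4, so its two least
    significant bits are both 0.
    """
    if byte_address & 0b11 != 0:
        raise ValueError("unaligned word address")
    return byte_address >> 2


number_of_words = MEMORY_BYTES // WORD_SIZE    # 2^30 words
last_word_address = MEMORY_BYTES - WORD_SIZE   # 2^32 - 4
```

Here `word_index(12)` gives word 3, while `word_index(6)` raises an error because address 6 has a non-zero bit among its two lowest bits.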
Load and Store Instructions
• A memory address = content of a register + an immediate constant
C code:
    A[8] = h + A[8];
MIPS code:
    lw $t0, 32($s3)      // Load word
    add $t0, $s2, $t0
    sw $t0, 32($s3)      // Store word
• The compiler stores the address of the first element of array A in register $s3.
• It is assumed that the value of h is stored in register $s2.
• Store word has destination last
• Remember arithmetic operands are registers, not memory!
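The three-instruction sequence above can be traced with a small model of memory and registers. This is a sketch under stated assumptions: the base address `A_BASE`, the array contents, the value of h, and the helpers `load_word` and `store_word` are all hypothetical.

```python
def load_word(memory, base, offset):
    """lw: the address is the content of a register plus an immediate constant."""
    return memory[base + offset]


def store_word(memory, base, offset, value):
    """sw: same address computation, but the register is written to memory."""
    memory[base + offset] = value & 0xFFFFFFFF


A_BASE = 0x1000                                        # hypothetical address of A[0]
memory = {A_BASE + 4 * i: 10 * i for i in range(10)}   # hypothetical data: A[i] = 10*i
regs = {"$s2": 5, "$s3": A_BASE, "$t0": 0}             # h = 5, $s3 = address of A[0]

# A[8] lives at byte offset 8 * 4 = 32 from the start of the array.
regs["$t0"] = load_word(memory, regs["$s3"], 32)           # lw  $t0, 32($s3)
regs["$t0"] = (regs["$s2"] + regs["$t0"]) & 0xFFFFFFFF     # add $t0, $s2, $t0
store_word(memory, regs["$s3"], 32, regs["$t0"])           # sw  $t0, 32($s3)
```

The offset 32 in the assembly is the element index 8 multiplied by the 4-byte word size; the addition itself only ever touches registers.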
Summary so far:
• MIPS
  – loading words but addressing bytes
  – arithmetic on registers only
• Instruction            Meaning
  add $s1, $s2, $s3      $s1 <= $s2 + $s3
  sub $s1, $s2, $s3      $s1 <= $s2 - $s3
  lw $s1, 100($s2)       $s1 <= Memory[$s2+100]
  sw $s1, 100($s2)       Memory[$s2+100] <= $s1
Example
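C code:
    void swap(int v[], int k)
    { int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }
MIPS code:
    swap:
      muli $2, $5, 4     // $2 = k * 4
      add $2, $4, $2     // $2 = address of v[k]
      lw $15, 0($2)      // $15 = v[k]
      lw $16, 4($2)      // $16 = v[k+1]
      sw $16, 0($2)      // v[k] = $16
      sw $15, 4($2)      // v[k+1] = $15
      jr $31             // return to caller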
Control Instructions
• Decision making instructions
– alter the control flow,
– i.e., change the "next" instruction to be executed
• MIPS conditional branch instructions:
    bne $t0, $t1, Label     // branch if $t0 != $t1
    beq $t0, $t1, Label     // branch if $t0 == $t1
• Example:
    if (i==j) h = i + j;

            bne $s0, $s1, Label
            add $s3, $s0, $s1
    Label:  ....
Control Instructions (Continued)
• MIPS unconditional branch instructions:
    j label
• Example:
    if (i!=j)
        h=i+j;
    else
        h=i-j;

            beq $s4, $s5, Lab1
            add $s3, $s4, $s5
            j Lab2
    Lab1:   sub $s3, $s4, $s5
    Lab2:   ...
Control Instructions (Continued)
• We have beq and bne; what about branch-if-less-than?
• New instruction:
    slt $t0, $s1, $s2
  meaning
    if $s1 < $s2 then
        $t0 = 1
    else
        $t0 = 0
• Can use this instruction to build "blt $s1, $s2, Label"
  – can now build general control structures
• Note that the assembler needs a register to do this,
  – so there are usage conventions for registers
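One way to build "blt" from slt and bne can be sketched as follows. This is an illustration only: the function names are hypothetical, and the use of $at as the scratch register is an assumption about what the assembler would pick.

```python
def slt(rs, rt):
    """Set on less than: 1 if rs < rt (signed comparison), else 0."""
    return 1 if rs < rt else 0


def blt_taken(s1, s2):
    """Model of the pseudo instruction blt $s1, $s2, Label.

    Assumed expansion (scratch register $at is an assumption):
        slt $at, $s1, $s2
        bne $at, $zero, Label
    Returns True when the branch to Label would be taken.
    """
    at = slt(s1, s2)   # slt $at, $s1, $s2
    return at != 0     # bne $at, $zero, Label
```

The branch is taken for `blt_taken(3, 7)` and not taken for `blt_taken(7, 3)` or `blt_taken(4, 4)`. The extra register is the reason the assembler needs one reserved for its own use.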
Machine Language
• Instructions, like registers and words of data, are also 32 bits long
• R-type instruction format:
  – Example: add $t0, $s1, $s2
  – registers have numbers, $t0=8, $s1=17, $s2=18
    op       rs      rt      rd      shamt   funct
    000000   10001   10010   01000   00000   100000
  – rs and rt are the source registers, rd is the destination register, funct is the op-code extension
• I-type instruction format:
  – Example: lw $t0, 32($s2)
    op   rs   rt   16 bit number
    35   18   8    32
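The field layouts can be checked by packing the fields into a 32-bit word. The sketch below is an illustration under assumptions: `encode_r` and `encode_i` are hypothetical helper names, and the register numbers follow the slide's numbering ($t0=8, $s1=17, $s2=18).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-type instruction: 6 + 5 + 5 + 5 + 5 + 6 = 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, immediate):
    """Pack an I-type instruction: 6 + 5 + 5 + 16 = 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# add $t0, $s1, $s2: op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32
add_word = encode_r(0, 17, 18, 8, 0, 32)

# lw $t0, 32($s2): op = 35, rs = 18 (base register), rt = 8 (register loaded)
lw_word = encode_i(35, 18, 8, 32)
```

Printing `format(add_word, "032b")` reproduces the six bit fields of the R-type example in order.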
Overview of MIPS
• Simple instructions all 32 bits wide
• Very structured, no unnecessary baggage
• Only three instruction formats
    R   op   rs   rt   rd   shamt   funct
    I   op   rs   rt   16 bit address
    J   op   26 bit address
• In branch instructions, address is relative to PC (next instruction)
    bne $t4,$t5,Label ==> PC = (PC+4) + 4 × offset if $t4 != $t5, where offset is the 16 bit word offset that encodes Label
• In jump instructions, the 26 bit address (shifted left by 2 bits) is combined with the 4 high order bits of PC
  – Address boundaries of 256 MB.
• Pseudo instructions are assembly instructions that are translated by the assembler into one or more MIPS instructions
  – Example: move $t0, $t1 ==> add $t0, $t1, $0
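The two target-address calculations can be sketched as follows. The function names and the sample PC values are assumptions made for illustration; the field widths are the 16 bit and 26 bit address fields of the I and J formats.

```python
def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + 4 * sign-extended 16-bit word offset."""
    if offset16 & 0x8000:          # sign-extend a negative offset
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


def jump_target(pc, address26):
    """Pseudodirect: the 4 high-order bits of PC + 4, then the 26-bit address, then 00."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)


# Hypothetical program counter values used only for this example.
taken = branch_target(0x00400000, 25)      # PC + 4 + 100
jumped = jump_target(0x00400000, 2500)     # byte address 10000
```

Because a jump only supplies 26 + 2 = 28 address bits, a jump stays within the 2^28 byte (256 MB) region selected by the high-order PC bits.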
MIPS 5 Addressing Modes
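1. Immediate addressing: op | rs | rt | Immediate (the operand is the constant in the instruction)
2. Register addressing: op | rs | rt | rd | ... | funct (the operand is a register)
3. Base addressing: op | rs | rt | Address (the operand is the byte, halfword, or word in memory at Register + Address)
4. PC-relative addressing: op | rs | rt | Address (the target is the word in memory at PC + Address)
5. Pseudodirect addressing: op | Address (the target is the word in memory at Address combined with the upper bits of PC)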
Constants
• Immediate instructions (2nd operand is a constant):
    addi $29, $29, 4     // Add Immediate
    slti $8, $18, 10     // Set Less Than Immediate
    andi $29, $29, 6     // AND Immediate
    ori $29, $29, 4      // OR Immediate
• To load a 32 bit constant into a register, load each 16 bit half separately
    lui $t0, 1010101010101010     // First: "load upper immediate"
  $t0 = 1010101010101010 0000000000000000 (lower half filled with zeros)
  Then must get the lower order bits right, i.e.,
    ori $t0, $t0, 1010101010101010     // OR immediate

         1010101010101010 0000000000000000
    OR   0000000000000000 1010101010101010
    =    1010101010101010 1010101010101010
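The two-step constant load can be reproduced with plain bit operations. This is a minimal sketch; the helper names `lui` and `ori` simply mirror the instructions and are not a real library API.

```python
def lui(imm16):
    """Load upper immediate: place the 16-bit constant in the upper half, zeros below."""
    return (imm16 & 0xFFFF) << 16


def ori(value, imm16):
    """OR immediate: the 16-bit constant is zero-extended before the OR."""
    return value | (imm16 & 0xFFFF)


pattern = 0b1010101010101010

t0 = lui(pattern)       # 1010101010101010 0000000000000000
t0 = ori(t0, pattern)   # 1010101010101010 1010101010101010
```

The OR works because lui leaves the lower 16 bits as zeros, so the second immediate drops into place unchanged.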
To summarize:
MIPS operands
• 32 registers
  Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
  Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
• 2^30 memory words
  Example: Memory[0], Memory[4], ..., Memory[4294967292]
  Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
MIPS assembly language
Category             Instruction               Example               Meaning                                    Comments
Arithmetic           add                       add $s1, $s2, $s3     $s1 <= $s2 + $s3                           Three operands; data in registers
                     subtract                  sub $s1, $s2, $s3     $s1 <= $s2 - $s3                           Three operands; data in registers
                     add immediate             addi $s1, $s2, 100    $s1 <= $s2 + 100                           Used to add constants
Data transfer        load word                 lw $s1, 100($s2)      $s1 <= Memory[$s2 + 100]                   Word from memory to register
                     store word                sw $s1, 100($s2)      Memory[$s2 + 100] <= $s1                   Word from register to memory
                     load byte                 lb $s1, 100($s2)      $s1 <= Memory[$s2 + 100]                   Byte from memory to register
                     store byte                sb $s1, 100($s2)      Memory[$s2 + 100] <= $s1                   Byte from register to memory
                     load upper immediate      lui $s1, 100          $s1 <= 100 * 2^16                          Loads constant in upper 16 bits
Conditional branch   branch on equal           beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100         Equal test; PC-relative branch
                     branch on not equal       bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100         Not equal test; PC-relative
                     set on less than          slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0       Compare less than; for beq, bne
                     set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0       Compare less than constant
Unconditional jump   jump                      j 2500                go to 10000                                Jump to target address
                     jump register             jr $ra                go to $ra                                  For switch, procedure return
                     jump and link             jal 2500              $ra = PC + 4; go to 10000                  For procedure call
Alternative Architectures
• We've focused on architectural issues
– basics of MIPS assembly language and machine code
– we’ll build a processor to execute these instructions.
• Design alternative:
– provide more powerful operations
– goal is to reduce number of instructions executed
– danger is a slower cycle time and/or a higher CPI
• Sometimes referred to as “RISC vs. CISC”
– virtually all new instruction sets since 1982 have been RISC
– VAX: minimize code size, make assembly language easy; instructions from 1 to 54 bytes long!
• We’ll look at PowerPC and 80x86
PowerPC
• Indexed addressing
  – example: lw $t1,$a0+$s3     // $t1=Memory[$a0+$s3]
  – What do we have to do in MIPS?
• Update addressing
  – update a register as part of load (for marching through arrays)
  – example: lwu $t0,4($s3)     // $t0=Memory[$s3+4];$s3=$s3+4
  – What do we have to do in MIPS?
• Others:
  – load multiple/store multiple
  – a special counter register “bc Loop”: decrement counter, if not 0 goto loop
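One possible answer to the two "What do we have to do in MIPS?" questions is to split each PowerPC-style instruction into two MIPS instructions. The sketch below models that under assumptions: the function names, the scratch register $t2, and the memory contents are hypothetical and chosen only for illustration.

```python
def indexed_load_mips(regs, memory):
    """Indexed addressing needs an explicit add before the load."""
    regs["$t2"] = regs["$a0"] + regs["$s3"]   # add $t2, $a0, $s3 ($t2 is an assumed scratch register)
    regs["$t1"] = memory[regs["$t2"] + 0]     # lw  $t1, 0($t2)


def update_load_mips(regs, memory):
    """Update addressing needs an explicit add after the load."""
    regs["$t0"] = memory[regs["$s3"] + 4]     # lw   $t0, 4($s3)
    regs["$s3"] = regs["$s3"] + 4             # addi $s3, $s3, 4


memory = {100: 11, 104: 22, 108: 33}          # hypothetical word contents

indexed = {"$a0": 100, "$s3": 4, "$t1": 0, "$t2": 0}
indexed_load_mips(indexed, memory)            # $t1 = Memory[$a0 + $s3]

update = {"$s3": 100, "$t0": 0}
update_load_mips(update, memory)              # $t0 = Memory[$s3 + 4]; $s3 = $s3 + 4
```

In both cases MIPS executes one more instruction than the PowerPC form, which is the trade-off between more powerful operations and instruction count that this part of the lecture is about.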
80x86
• 1978: The Intel 8086 is announced (16 bit architecture)
• 1980: The 8087 floating point coprocessor is added
• 1982: The 80286 increases address space to 24 bits, +instructions
• 1985: The 80386 extends to 32 bits, new addressing modes
• 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance)
• 1997: MMX is added
“This history illustrates the impact of the “golden handcuffs” of compatibility”
“adding new features as someone might add clothing to a packed bag”
“an architecture that is difficult to explain and impossible to love”
A dominant architecture: 80x86
• Complexity:
– Instructions from 1 to 17 bytes long
– one operand must act as both a source and destination
– one operand can come from memory
– complex addressing modes
e.g., “base or scaled index with 8 or 32 bit displacement”
• Saving grace:
– the most frequently used instructions are not too difficult to build
– compilers avoid the portions of the architecture that are slow
“what the 80x86 lacks in style is made up in quantity, making it beautiful
from the right perspective”
Summary
• Instruction complexity is only one variable
– lower instruction count vs. higher CPI / lower clock rate
• Design Principles:
1 Simplicity favors regularity
2 Smaller is faster
3 Good design demands compromise
4 Make the common case fast
• Instruction set architecture
– a very important abstraction indeed!
Fallacy: More powerful instructions mean higher performance
  – e.g., the Repeat Prefix (REP) in 80x86
Fallacy: Write in assembly language to obtain the highest performance
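The trade-off between instruction count, CPI, and clock rate can be made concrete with the usual relation CPU time = instruction count × CPI / clock rate. Every number below is hypothetical and chosen only to illustrate the point; none comes from a measurement.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical design with simple instructions: more instructions, low CPI, fast clock.
simple = cpu_time(1_000_000, 1.5, 500e6)

# Hypothetical design with more powerful instructions: fewer instructions,
# but a higher CPI and a slower clock.
powerful = cpu_time(700_000, 2.5, 400e6)
```

With these made-up figures the design that executes 30% fewer instructions is still slower overall, which is why a lower instruction count alone does not imply higher performance.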
MIPS: Software conventions for Registers
Number   Name     Use
0        zero     constant 0
1        at       reserved for assembler
2-3      v0-v1    expression evaluation & function results
4-7      a0-a3    arguments
8-15     t0-t7    temporary: caller saves (callee can clobber)
16-23    s0-s7    callee saves (caller can clobber)
24-25    t8-t9    temporary (cont’d)
26-27    k0-k1    reserved for OS kernel
28       gp       Pointer to global area
29       sp       Stack pointer
30       fp       frame pointer
31       ra       Return Address (HW)
Plus a 3-deep stack of mode bits.