Lectures for 2nd Edition
Download
Report
Transcript Lectures for 2nd Edition
MIPS Assembly language
In computer programs we have :
Data
Instructions (Arithmetic & control)
We need a memory & a CPU:
1
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
What should the CPU do?
• We need to transfer data from registers to the memory and
vice versa
• We need to perform arithmetic operations on the data
residing in registers
• We need to have the ability of controlling the flow of the
program (Ifs and Jumps)
2
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The machine understands only bits!
3
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Humans prefer a language “closer” to English
4
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Labels will improve the situation
5
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
High-level language is even better
6
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The process of Compilation
7
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Preface
We’ll learn:
•
How a computer is built
•
Some performance analysis
•
Advanced subject (Pipeline, caches) that are used in
all modern computers
The course book:
Computer Organization & Design The hardware/software
interface, David A. Patterson and John L. Hennessey.
Second Edition 1998
8
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Machine language, Assembly and C
• The CPU understands machine Language only
High-level
language
program
(in C)
• Assembly language is easier to understand:
- Abstraction
- A unique translation
Looks like regular English. Depends
on the specific CPU. Every CPU has a
different set of Assembly instructions
since it has different sets of registers
and machine operations.
• C language (a high level language):
- The translation is
not unique !
C compiler
Assembly
language
program
(for MIPS)
C is closer to regular English.( The
high level language can be adjusted to
the desired processing). High
productivity: In a single sentence we
define many machine operations. And
the most important: It does not depend
on the CPU !!!
- It depends on the Compiler
and the optimization requested
- It is portable (fits every CPU)
When do we use Assembly?
Nowadays we use Assembly only when we have to:
* When processing time is critical and we need “manually” adjustments of the code
* When all resources of the CPU (e.g., Overflow, Index registers etc.) are needed, but not
supported by the hig level language
* When memory is critical an an optimization of its management is required (e.g., in DSPs)
swap(int v[], int k)
{int temp;
temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;
}
swap:
muli $2, $5,4
add $2, $4,$2
lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)
jr $31
Assembler
Binary machine
language
program
(for MIPS)
00000000101000010000000000011000
00000000100011100001100000100001
10001100011000100000000000000000
10001100111100100000000000000100
10101100111100100000000000000000
10101100011000100000000000000100
00000011111000000000000000001000
9
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Instruction Set Architecture
• There is similarity between Assembly languages of different CPUs
• We’ll learn the MIPS Assembly language. It was developed in the
80’s and was used in workstation such as Silicon Graphics, NEC,
Sony.
• RISC v. CISC
– Reduced Instruction Set Computer - MIPS
– 8086 - Complex Instruction Set Computer
Our motto is: “More is less”
By that we mean that a smaller set of simpler instructions results
with a better performance. This is so, since the h/w required is
simpler, it is therefore faster and takes less Silicon space, which
means less expensive.
10
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
1st design rule: Simplicity favors Regularity
Arithmetic operations:
• MIPS
addi a,b,100
add a,b,c
# a=b+100
# a=b+c
• 8086
ADD EAX,B # EAX= EAX+B
We prefer a simple mechanism with minimalistic
instructions such as: R3 = R1 op R2 on a CPU which
might be easier to program since it has instructions like:
R5 = ( R1 op1 R2) op2 (R3 op3 R4 )
but is much harder to design and implement
11
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
2nd design rule: Smaller is faster
• Arithmetic operations are allowed only on registers
• The operands could be registers or a single constant
• There are only 32 registers
( spilling is applied when needed) Page 115
• A register contains a word = 32 bits = 4 bytes
The rgisters are
• We use conventions
$1,$2 …
$s0,$s1 ... - C variables
$t1,$t2 … - temporary vars.
An example:
f=(g+h)-(k+j) #
add $t0,$s1,$s2
add $t1,$s3,$s4
sub $s0, $t0, $t1
denoted $0 - $31 or
by names related to
their job. There is a
convention of their
jobs.
Page 110
$s0=f, $s1=g, $s2=h, $s3=k, $s4=j
The example describes how the sentence: f=(g+h)(k+j) is translated into Assembly language instructions.
The registers are the heart of the CPU. Accessing
registers is faster than accessing the memory. We
access 3 registers simultaneously: read from two and
write to the 3rd.
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
12
Policy of Use Conventions
Name Register number
$zero
0
$v0-$v1
2-3
$a0-$a3
4-7
$t0-$t7
8-15
$s0-$s7
16-23
$t8-$t9
24-25
$gp
28
$sp
29
$fp
30
$ra
31
Usage
the constant value 0
values for results and expression evaluation
arguments
temporaries
saved
more temporaries
global pointer
stack pointer
frame pointer
return address
Other registers are: $at = $1 which is reserved for the Assembler
and $k0, $k1 = $26, $27 which are reserved for the Operating System
13
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The memory
• The memory is a long array
• The address is the index to the array
• Byte addressing - The index is in bytes
•
The maximal size of memory available is
0
1
2
3
4
5
.6..
8 bits of data
230 words = 232 bytes
8 bits of data
8 bits of data
8 bits of data
8 bits of data
8 bits of data
8 bits of data
14
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Accessing the memory
Page 112
• Load and Store instructions only
• in LW we load a word, but the address we access is
given in bytes
lw $s1,100($s2)
# $s1=Memory[$s2+100]
base register = pointer to the array
sw $s1,100($s2)
offset = the location in the array
# Memory[$s2+100]=$s1
15
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Accessing bytes
• There are also lb (load byte) and
sb (store byte) instructions
• These are useful for handling “char”s since
ASCII characters are stored in bytes
ASCII: American Standard Code For information
Interchange
• Another code, Unicode, store characters in
two bytes each
16
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
ASCII
17
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Accessing memory
• An example:
A is an array of words
The address of A is in $3
h is in $2
Page 114
A
0
32 bits of data
4
8
12
MIPS = Big endian
A[0]
A[1]
A[2]
0
1
2
3
INTEL = Little endian
16
3
C code:
A[2] = h + A[2];
MIPS code:
lw $t0, 8($s3)
add $t0, $s2,$t0
sw $t0, 8($s3)
2
1
0
# $t0=$s3[8]
# $t0=$s2+$t0
# $s3[8]=$t0
18
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Machine language
• All instructions have the same size, 32 bits
• In Intel’s 8086 the size changes from 1 to 17 bytes
• An R-type instruction:
add $s1,$s2,$s3
# $s1=$17,$s2=$18, $s3=$19
Format of R-type instruction:
0
18
19
17
0
32
31
0
000000 10010
0H
10011
10001
12H
13H
11H
op
rs
rt
rd
6
5
5
5
op - opecode
rt- register source no 2
funct - function
00000
0H
shamt
5
100000
20H
funct
6
rs - register source
rd - register destination
shamt - shift amount
19
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
3rd design rule: A good design requires some compromises
The compromise here is using different dormats to different types of instructions. We could have increased the
number of bits per instruction so we have enough bits for all fields of all types of instructions. However, many of
them would have been unused most of the time. So we use 3 types . (A different size conflicts with simplicuty).
• Minimize the types. Use similar instructions.
•
6
5
5
5
5
R
op
rs
rt
rd
I
op
rs
rt
16 bit address
J
op
shamt
6
funct
26 bit address
Example: lw $s1, 32($s2)
6
5
5
35
18
17
op
rs
rt
# $s1 =$17, $s2=18
5
5
6
32
16 bit number
20
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The program in memory
• The program (code) is stored in memory just like data
Processor
Memory
memory for data, programs,
compilers, editors, etc.
Performing the program
•
The Program Counter (PC) - a special register keeping the
address of the next instruction
•
We read the code word from the memory (using the PC as the
pointer)
•
We then increment the PC
21
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Branch vs Jump
• Jump - is an absolute jump without any conditions
j
label
• Branch - a relative conditioned jump
bne $1,$2,label
•
# $1!=$2 go to label
An example:
if (i!=j)
h=i+j;
else
h=i-j;
($s3=h
Lab1:
Lab2:
$s4 =i
$s5=j )
beq $s4, $s5, Lab1
add $s3, $s4, $s5
j Lab2
sub $s3, $s4, $s5
...
22
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Addresses in Branches and Jumps
•
•
Instructions:
bne $t4,$t5,Label
beq $t4,$t5,Label
j Label
Next instruction is at Label if $t4!= $t5
Next instruction is at Label if $t4 = $t5
Next instruction is at Label
Formats:
I
op
J
op
rs
rt
16 bit address
26 bit address
we see that
branch - is a relative jump in a range of 2^16 words
We assume that most of the jumps are local = branches
•
Beq $s1,$s2,25
# if ($s1 ==$s2) go to PC +4 +25*4
23
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Actual coding example
Loop: lw
$8, save($19) # $8=save[i]
bne $8, $21,Exit #Goto Exit if save[i]<> k
add $19,$19,$20
j
Loop
# i:=i+j
# Goto Loop
Exit:
SAVE - 1000
80,000
35
19
8
1000
80,004
5
8
21
2
80,008
0
19
20
80,012
2
80,016
19
0
32
20,000
24
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
An important rule:
In MIPS Assembly instructions we write
code addresses in words
data addresses in bytes
when a MIPS CPU accesses memory it always gives the
address in bytes
About long jumps:
We cannot do:
bne $8,$21,far_adrs
since there are only 16 bits of
offset in branch instructions.
But we can do:
beq $8,$21,nxt
j far_adrs
nxt:
This is a service given by the Assembler
25
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
How do we code a Branch-if-less-than?
slt $t0, $s1, $s2
means
if
$s1 < $s2 then
$t0 = 1
else
$t0 = 0
• We can use slt to “build” a blt instruction
blt $s0,$s1,Less
slt
bne
$at,$s0,$s1
$at,$zero,Less
• blt is a psedoinstruction
# $t0 gets 1 if $s0<$s1
# go to Less if $t0 != 0
Also:
move $t1,$s4
stands for: add $t1,$s4,$zero
• Assembler uses $at (= $1) for pseudoinstructions
26
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Design rule 4: Build the common case faster
• Most arithmetic operation use small constants
addi $29, $29, 4
So we have many instructions with 16 bit constants: addi, slti, andi, ori, xori
• Large constants
These will be take longer to load, but only one more instruction is added
1010101010101010
0000000000000000
lui $t0,1010101010101010
0000000000000000
1010101010101010
ori $t0,$t0,1010101010101010
1010101010101010
1010101010101010
27
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
MIPS Assembly language summary
MIPS operands
Name
32 registers
Example
Comments
$s0-$s7, $t0-$t9, $zero, Fast locations for data. In MIPS, data must be in registers to perform
$a0-$a3, $v0-$v1, $gp,
arithmetic. MIPS register $zero always equals 0. Register $at is
$fp, $sp, $ra, $at
reserved for the assembler to handle large constants.
Memory[0],
2
30
memory Memory[4], ...,
words
Memory[4294967292]
Accessed only by data transfer instructions. MIPS uses byte addresses, so
sequential words differ by 4. Memory holds data structures, such as arrays,
and spilled registers, such as those saved on procedure calls.
28
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
RISC v. CISC
I - number of instructions in program
T - time of the clock cycle
CPI - number of clock cycles per instruction
RISC:
I
T CPI
I
T CPI
CISC:
29
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
An exercise
A C code and two MIPS Assembly translations are given below:
while (save[i]!=k) do
i=i+j ;
save:array [ 0..100] of word
k is kept in $21. $19 has i and $20 has j.
•
First translation:
Loop: muli
lw
bne
add
j
Exit:
$9,$19,4
$8,save($9)
$8,$21,Exit
$19,$19,$20
Loop
# Temporary reg $9:=i*4
# Temporary reg $8:=save[i]
# Goto Exit if save[i]<>k
# i:=i+j
# Goto Loop
30
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
exercise (cont.)
•
2nd translation
muli
lw
bne
Loop: add
muli
lw
beq
Exit:
•
$9,$19,4
$8,save($9)
$8,$21,Exit
$19,$19,$20
$9,$19,4
$8,save($9)
$8,$21,Loop
# Temporary reg $9:=i*4
# Temporary reg $8:=save[i]
# Goto Exit if save[i]<>k
# i:=i+j
# Temporary reg $9:=i*4
# Temporary reg $8:=save[i]
# Goto Loop if save[i]=k
Assuming the loop is performed 10 times, what is the number of instructions
performed in the two translations?
31
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Compiler (Assembler) and Linker
A.asm
compiler
Page 161
A.obj
P.exe
loader
Memory
linker
B.asm compiler
B.obj
C.lib
(c.obj)
32
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
Linker operation
m: .word 2
sw 7, m($3)
Pages 156-160
2
compiler
A.asm
2
sw 7,0($3)
3
A.obj
linker
4
s: .word 3,4
j
3
4
k
lw $1,s ($2)
k: add $1,$2,$3
B.asm
compiler
j 2
lw $1,0($2)
add $1,$2,$3
B.obj
sw 7,0($3)
j
3
lw $1,4($2)
add $1,$2,$3
P.exe
33
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The structure of an object file
•
•
•
•
•
•
Page 161
Object file header
text segment
data segment
relocation information
symbol table
debugging information
34
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.