Transcript Y86 ISA

Adapted by J. Bétréma
CS:APP Chapter 4
Computer Architecture
Instruction Set Architecture
Randal E. Bryant
Carnegie Mellon University
http://csapp.cs.cmu.edu
CS:APP
Instruction Set Architecture
Layers, from software down to hardware:
  Application Program
  Compiler | OS
  ISA (the processor's instruction set, or rather that of the processor model)
  CPU Design
  Circuit Design
  Chip Layout

X86 Evolution: Programmer’s View
  Name       Date    Transistors
  8086       1978    29K
    • 16-bit processor. Basis for IBM PC & DOS
    • Limited to 1MB address space. DOS only gives you 640K
  386        1985    275K
    • Extended to 32 bits. Added "flat addressing"
    • Capable of running Unix
    • Linux/gcc uses no instructions introduced in later models
  Pentium    1993    3.1M

X86 Evolution: Programmer’s View
  Name           Date    Transistors
  Pentium III    1999    8.2M
    • Added "streaming SIMD" instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data
    • Our fish machines
  Pentium 4      2001    42M
    • Added 8-byte formats and 144 new instructions for streaming SIMD mode

X86 Evolution: Clones
Advanced Micro Devices (AMD)
• Historically
  » AMD has followed just behind Intel
  » A little bit slower, a lot cheaper
• Recently
  » Recruited top circuit designers from Digital Equipment Corp.
  » Exploited fact that Intel distracted by IA64
  » Now are close competitors to Intel
• Developing own extension to 64 bits

New Species: IA64
  Name         Date    Transistors
  Itanium      2001    10M
    • Extends to IA64, a 64-bit architecture
    • Radically new instruction set designed for high performance
    • Will be able to run existing IA32 programs
      » On-board "x86 engine"
    • Joint project with Hewlett-Packard
  Itanium 2    2002    221M
    • Big performance boost

Y86 Processor State
State diagram:
  Program registers:  %eax  %esi
                      %ecx  %edi
                      %edx  %esp
                      %ebx  %ebp
  Condition codes:    OF  ZF  SF
  PC
  Memory

• Program Registers
  » Same 8 as with IA32. Each 32 bits
• Program Counter
  » Indicates address of instruction

Condition codes
• Single-bit flags set by arithmetic or logical instructions
  » OF: Overflow    ZF: Zero    SF: Negative
• CF (carry) is missing, so we work on signed integers.
• Correspondence with the NZVC notation:
  » N = SF    Z = ZF    V = OF

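The way an addition sets the three flags can be sketched in a few lines of Python. This is only an illustration, not the simulator's code: the names `to_signed` and `addl_flags` are made up for this example, and the overflow rule used is the usual two's-complement one (both operands have the same sign and the result has the other sign).

```python
MASK = 0xFFFFFFFF  # Y86 registers are 32 bits wide


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


def addl_flags(a, b):
    """Return (result, ZF, SF, OF) for the 32-bit addition a + b."""
    result = (a + b) & MASK
    zf = int(result == 0)          # ZF: result is zero
    sf = int(result >> 31)         # SF: sign bit of the result
    a_neg, b_neg = to_signed(a) < 0, to_signed(b) < 0
    r_neg = to_signed(result) < 0
    # OF: operands share a sign, and the result has the opposite sign.
    of = int(a_neg == b_neg and r_neg != a_neg)
    return result, zf, sf, of


print(addl_flags(0x7FFFFFFF, 1))   # largest positive value plus one
print(addl_flags(5, 0xFFFFFFFB))   # 5 + (-5)
```

The first call overflows (the sum of two positive numbers comes out negative), so OF and SF are both 1; the second yields zero with only ZF set.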
Memory
• Byte-addressable storage array
• Words stored in little-endian byte order

Machine Words
Machine has "word size"
• Nominal size of integer-valued data
  » Including addresses
• Most current machines are 32 bits (4 bytes)
  » Limits addresses to 4GB
  » Becoming too small for memory-intensive applications
• High-end systems are 64 bits (8 bytes)
  » Potentially address ≈ 1.8 × 10^19 bytes
• Machines support multiple data formats
  » Fractions or multiples of word size
  » Always integral number of bytes

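The two address-space figures follow directly from the word size. A minimal check in Python (the function name `address_space_bytes` is just for this example):

```python
def address_space_bytes(word_bits):
    """Number of distinct byte addresses with a word of `word_bits` bits."""
    return 2 ** word_bits


# 32-bit words: 2^32 bytes, i.e. 4 GB (counting 1 GB as 2^30 bytes).
print(address_space_bytes(32) // 2 ** 30, "GB")

# 64-bit words: 2^64 bytes, about 1.8 x 10^19.
print(f"{address_space_bytes(64):.1e} bytes")
```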
Word-Oriented Memory Organization
Addresses specify byte locations
• Address of first byte in word
• Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

  Byte addr.    32-bit words    64-bit words
  0000-0003     Addr = 0000     Addr = 0000
  0004-0007     Addr = 0004
  0008-000b     Addr = 0008     Addr = 0008
  000c-000f     Addr = 000c

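As a small illustration of the addressing rule, the address of the word containing a given byte can be computed by rounding the byte address down to a multiple of the word size. This sketch assumes aligned words, as in the slide's figure; `word_address` is a name chosen for the example.

```python
def word_address(byte_addr, word_bytes):
    """Address of the first byte of the (aligned) word containing byte_addr."""
    return byte_addr - (byte_addr % word_bytes)


for addr in (0x0003, 0x0007, 0x000b, 0x000d):
    print(f"byte {addr:04x}: "
          f"32-bit word at {word_address(addr, 4):04x}, "
          f"64-bit word at {word_address(addr, 8):04x}")
```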
Byte Ordering Example
• Big Endian: least significant byte has highest address
• Little Endian: least significant byte has lowest address
Example
• Variable n has 4-byte representation 0x01234567
• Address given by &n is 0x100

                   0x100  0x101  0x102  0x103
  Big Endian        01     23     45     67
  Little Endian     67     45     23     01

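The example above can be reproduced with Python's built-in `int.to_bytes`. The helper `dump` and the base address argument are only there to mirror the slide's layout.

```python
n = 0x01234567

big = n.to_bytes(4, "big")        # most significant byte first
little = n.to_bytes(4, "little")  # least significant byte first


def dump(data, base=0x100):
    """Map each address (starting at `base`) to the byte stored there, in hex."""
    return {base + i: f"{byte:02x}" for i, byte in enumerate(data)}


print("big endian:   ", dump(big))
print("little endian:", dump(little))

# Reading the bytes back with the matching order recovers n.
print(hex(int.from_bytes(little, "little")))
```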
Y86 Instructions
Format
• 1 to 6 bytes of information read from memory
  » Can determine instruction length from first byte
  » Not as many instruction types, and simpler encoding than with IA32
• Each accesses and modifies some part(s) of the program state
• PC is incremented by the length of the instruction, so that it points to the next instruction

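A rough sketch of how the first byte determines the length, and how the PC then steps from one instruction to the next. The table covers only the instructions that appear in these slides, with lengths read off their encodings; the names `LENGTH_BY_ICODE`, `instruction_length` and `instruction_offsets` are hypothetical.

```python
# Instruction lengths in bytes, keyed by the high nibble of the first byte
# (the instruction code). Only instructions shown in these slides are listed.
LENGTH_BY_ICODE = {
    0x0: 1,  # nop
    0x1: 1,  # halt
    0x2: 2,  # rrmovl: code byte + register byte
    0x3: 6,  # irmovl: code byte + register byte + 4-byte immediate
    0x4: 6,  # rmmovl: code byte + register byte + 4-byte displacement
    0x5: 6,  # mrmovl: same layout as rmmovl
    0x6: 2,  # OPl (addl, subl, andl, xorl)
    0x7: 5,  # jXX: code byte + 4-byte destination
}


def instruction_length(first_byte):
    """Length of the instruction that starts with `first_byte`."""
    return LENGTH_BY_ICODE[first_byte >> 4]


def instruction_offsets(code):
    """Offsets at which successive instructions start in `code`."""
    offsets, pc = [], 0
    while pc < len(code):
        offsets.append(pc)
        pc += instruction_length(code[pc])  # PC advances by the length
    return offsets


# irmovl $0xabcd,%edx ; rrmovl %esp,%ebx ; addl %eax,%esi ; nop ; halt
code = bytes.fromhex("3082cdab0000" "2043" "6006" "00" "10")
print(instruction_offsets(code))
```

For this byte string the instructions start at offsets 0, 6, 8, 10 and 11.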
Encoding Registers
Each register has 4-bit ID

  %eax  0      %esi  6
  %ecx  1      %edi  7
  %edx  2      %esp  4
  %ebx  3      %ebp  5

• Same encoding as in IA32
  » IA32 uses only 3 bits to encode a register
• Register ID 8 indicates "no register"
  » Will use this in our hardware design in multiple places

Instruction Example
Addition Instruction
  Generic form:            addl rA, rB
  Encoded representation:  6 0 rA rB

• Add value in register rA to that in register rB
  » Store result in register rB
  » Note that Y86 only allows addition to be applied to register data
• Set condition codes based on result
• e.g., addl %eax,%esi    Encoding: 60 06
• Two-byte encoding
  » First indicates instruction type
  » Second gives source and destination registers

Arithmetic and Logical Operations
  Instruction              Generic form    Encoding
  Add                      addl rA, rB     6 0 rA rB
  Subtract (rA from rB)    subl rA, rB     6 1 rA rB
  And                      andl rA, rB     6 2 rA rB
  Exclusive-Or             xorl rA, rB     6 3 rA rB

In each encoding, the first digit (6) is the instruction code and the second digit is the function code.
• Refer to generically as "OPl"
• Encodings differ only by "function code"
• Set condition codes as side effect
• Missing (among others): inc (increment), dec, sh (shift) ...

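To make the two-byte encoding concrete, here is a small encoder sketch for the four OPl instructions. It uses the register IDs and function codes given in the slides; the dictionary and function names are invented for the example.

```python
REGISTER_ID = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3,
               "%esp": 4, "%ebp": 5, "%esi": 6, "%edi": 7}
FUNCTION_CODE = {"addl": 0, "subl": 1, "andl": 2, "xorl": 3}


def encode_opl(op, ra, rb):
    """Encode `op rA, rB` as two bytes: (6, fn) then (rA, rB)."""
    first = (0x6 << 4) | FUNCTION_CODE[op]               # instruction code 6 + function code
    second = (REGISTER_ID[ra] << 4) | REGISTER_ID[rb]    # source and destination registers
    return bytes([first, second])


print(encode_opl("addl", "%eax", "%esi").hex())  # the slide's example, 60 06
print(encode_opl("xorl", "%eax", "%ebx").hex())
```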
Arithmetic and Logical Operations (2)
In RISC (Reduced Instruction Set Computer) architectures, operations involve 3 registers:
• two source registers for the operands
• one destination register for the result
Caution: here (x86 or Y86) the second register also serves as the destination, so the second operand is overwritten.

Move Operations
  Instruction          Transfer                   Encoding
  rrmovl rA, rB        Register --> Register      2 0 rA rB
  irmovl V, rB         Immediate --> Register     3 0 8 rB   V
  rmmovl rA, D(rB)     Register --> Memory        4 0 rA rB  D
  mrmovl D(rB), rA     Memory --> Register        5 0 rA rB  D

• Like the IA32 movl instruction
• Simpler format for memory addresses
• Give different names to keep them distinct

Move Instruction Examples
  IA32                         Y86                          Encoding
  movl $0xabcd, %edx           irmovl $0xabcd, %edx         30 82 cd ab 00 00
  movl %esp, %ebx              rrmovl %esp, %ebx            20 43
  movl -12(%ebp),%ecx          mrmovl -12(%ebp),%ecx        50 15 f4 ff ff ff
  movl %esi,0x41c(%esp)        rmmovl %esi,0x41c(%esp)      40 64 1c 04 00 00
  movl $0xabcd, (%eax)         (no Y86 equivalent)
  movl %eax, 12(%eax,%edx)     (no Y86 equivalent)
  movl (%ebp,%eax,4),%ecx      (no Y86 equivalent)

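The four encodings in the table can be regenerated with a short sketch. It assumes the layouts from the Move Operations slide (code byte, register byte, then a 4-byte little-endian constant); the function names simply reuse the mnemonics and are not part of any official tool.

```python
import struct

REGISTER_ID = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3,
               "%esp": 4, "%ebp": 5, "%esi": 6, "%edi": 7}
NO_REGISTER = 8  # register ID meaning "no register"


def _register_byte(ra, rb):
    """Pack two 4-bit register IDs into one byte."""
    return bytes([(ra << 4) | rb])


def _word(value):
    """4-byte little-endian encoding; negative values wrap to two's complement."""
    return struct.pack("<I", value & 0xFFFFFFFF)


def rrmovl(ra, rb):
    return b"\x20" + _register_byte(REGISTER_ID[ra], REGISTER_ID[rb])


def irmovl(value, rb):
    return b"\x30" + _register_byte(NO_REGISTER, REGISTER_ID[rb]) + _word(value)


def rmmovl(ra, disp, rb):
    return b"\x40" + _register_byte(REGISTER_ID[ra], REGISTER_ID[rb]) + _word(disp)


def mrmovl(disp, rb, ra):
    return b"\x50" + _register_byte(REGISTER_ID[ra], REGISTER_ID[rb]) + _word(disp)


print(irmovl(0xabcd, "%edx").hex(" "))
print(rrmovl("%esp", "%ebx").hex(" "))
print(mrmovl(-12, "%ebp", "%ecx").hex(" "))
print(rmmovl("%esi", 0x41c, "%esp").hex(" "))
```

Note how the displacement -12 becomes f4 ff ff ff: the constant is stored least significant byte first.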
Load and Store
These are the expensive instructions!
Load
  mrmovl 12(%ebp),%ecx        general case (displacement)
  mrmovl (%ebp),%ecx          register ebp serves as a pointer
  mrmovl 0x1b8,%ecx           absolute read address
Store
  rmmovl %esi,0x41c(%esp)     general case (displacement)
  rmmovl %esi,(%esp)          register esp serves as a pointer
  rmmovl %esi,0x41c           absolute write address

Displacement addressing
  mrmovl 12(%ebp),%ecx        general case (displacement)
The word at address ebp + 12 is loaded into register ecx.
Load = read.
  rmmovl %esi,0x41c(%esp)     general case (displacement)
Register esi is saved at address esp + 0x41c.
Store = save = write.

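The two examples can be mimicked on a toy memory. Everything here is an assumption made for illustration: the 4 KB `bytearray`, the register values, and the helper names `effective_address`, `load_word` and `store_word`.

```python
import struct


def effective_address(base, displacement):
    """Base register value plus displacement, wrapped to 32 bits."""
    return (base + displacement) & 0xFFFFFFFF


def load_word(memory, address):
    """Read a 4-byte little-endian word (what mrmovl does)."""
    return struct.unpack_from("<I", memory, address)[0]


def store_word(memory, address, value):
    """Write a 4-byte little-endian word (what rmmovl does)."""
    struct.pack_into("<I", memory, address, value & 0xFFFFFFFF)


memory = bytearray(0x1000)  # toy memory, all zeros
regs = {"%ebp": 0x100, "%esp": 0x200, "%esi": 0xDEADBEEF, "%ecx": 0}
store_word(memory, 0x10c, 42)  # put a known value at ebp + 12

# mrmovl 12(%ebp),%ecx : load the word at address ebp + 12 into ecx
regs["%ecx"] = load_word(memory, effective_address(regs["%ebp"], 12))

# rmmovl %esi,0x41c(%esp) : store esi at address esp + 0x41c
store_word(memory, effective_address(regs["%esp"], 0x41c), regs["%esi"])

print(regs["%ecx"], memory[0x61c:0x620].hex())
```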
Jump Instructions
  Instruction                   Generic form    Encoding
  Jump Unconditionally          jmp Dest        7 0  Dest
  Jump When Less or Equal       jle Dest        7 1  Dest
  Jump When Less                jl Dest         7 2  Dest
  Jump When Equal               je Dest         7 3  Dest
  Jump When Not Equal           jne Dest        7 4  Dest
  Jump When Greater or Equal    jge Dest        7 5  Dest
  Jump When Greater             jg Dest         7 6  Dest

• Refer to generically as "jXX"
• Encodings differ only by "function code"
• Based on values of condition codes
• Same as IA32 counterparts
• Encode full destination address
  » Unlike PC-relative addressing seen in IA32

Conditional jumps
• Jump When Less: the jump is taken if the result of the last operation (addl, subl, andl, xorl) is negative; do not forget overflow!
  » (SF = 1 and OF = 0) or (SF = 0 and OF = 1)
  » in short, SF ≠ OF
• Jump When Equal: the jump is taken if the result of the last operation is zero, i.e. ZF = 1
• Jump When Less or Equal: SF ≠ OF or ZF = 1
Negations:
• Jump When Greater or Equal: SF = OF
• Jump When Not Equal: ZF = 0
• Jump When Greater: SF = OF and ZF = 0

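The conditions listed above translate directly into a small decision function. This is a sketch for checking one's understanding, with a made-up name (`jump_taken`), not the simulator's implementation.

```python
def jump_taken(name, zf, sf, of):
    """Decide whether jump `name` is taken, given the three condition codes."""
    less = sf != of  # "less" means SF and OF differ
    conditions = {
        "jmp": True,
        "jle": less or zf == 1,
        "jl": less,
        "je": zf == 1,
        "jne": zf == 0,
        "jge": not less,
        "jg": not less and zf == 0,
    }
    return conditions[name]


# Negative result without overflow: "less" holds.
print(jump_taken("jl", zf=0, sf=1, of=0))
# Positive-looking result produced by an overflow: "less" still holds.
print(jump_taken("jl", zf=0, sf=0, of=1))
# Zero result: je, jle and jge are taken, jg is not.
print([jump_taken(j, zf=1, sf=0, of=0) for j in ("je", "jle", "jge", "jg")])
```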
Jumps (concluded)
• A jump is performed by modifying PC.
• The instructions rrmovl, irmovl, rmmovl, mrmovl never modify the condition codes.
• Logical operations andl, xorl: ZF and SF are set according to the result of the operation, OF = 0 (cleared).
• andl %eax, %eax     # does not modify the register
  je ...              # jump taken if eax = 0
  jl ...              # jump taken if eax < 0
• xorl %eax, %ebx
  je ...              # jump taken if eax = ebx (but this test overwrites ebx)

Miscellaneous Instructions
  nop     0 0
    • Don't do anything
  halt    1 0
    • Stop executing instructions
    • IA32 has comparable instruction, but can't execute it in user mode
    • We will use it to stop the simulator

Object Code
In real life:

Code for sum
  0x401040 <sum>:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
  • Total of 13 bytes
  • Each instruction 1, 2, or 3 bytes
  • Starts at address 0x401040

Assembler
• Translates .s into .o
• Binary encoding of each instruction
• Nearly-complete image of executable code
• Missing linkages between code in different files

Linker
• Resolves references between files
• Combines with static run-time libraries
  » E.g., code for malloc, printf
• Some libraries are dynamically linked
  » Linking occurs when program begins execution

Turning C into Object Code


• Code in files p1.c p2.c
• Compile with command: gcc -O p1.c p2.c -o p
  » Use optimizations (-O)
  » Put resulting binary in file p

  text      C program (p1.c p2.c)
              |  Compiler (gcc -S)
  text      Asm program (p1.s p2.s)
              |  Assembler (gcc or as)
  binary    Object program (p1.o p2.o) + Static libraries (.a)
              |  Linker (gcc or ld)
  binary    Executable program (p)

Disassembling Object Code
Disassembled
  00401040 <_sum>:
     0:  55          push  %ebp
     1:  89 e5       mov   %esp,%ebp
     3:  8b 45 0c    mov   0xc(%ebp),%eax
     6:  03 45 08    add   0x8(%ebp),%eax
     9:  89 ec       mov   %ebp,%esp
     b:  5d          pop   %ebp
     c:  c3          ret
     d:  8d 76 00    lea   0x0(%esi),%esi

Disassembler
  objdump -d p
• Useful tool for examining object code
• Analyzes bit pattern of series of instructions
• Produces approximate rendition of assembly code
• Can be run on either a.out (complete executable) or .o file

Pseudo object code
Source file (.ys):
  Loop:
    mrmovl (%ecx),%esi    # get *Start
    addl %esi,%eax        # add to sum
    irmovl $4,%ebx
    addl %ebx,%ecx        # Start++
    irmovl $-1,%ebx
    addl %ebx,%edx        # Count--
    jne Loop              # Stop when 0

The yas assembler translates the source file into a .yo file, which is then run by the Y86 simulator:
  0x057: 506100000000 | Loop: mrmovl (%ecx),%esi
  0x05d: 6060         |       addl %esi,%eax
  0x05f: 308304000000 |       irmovl $4,%ebx
  0x065: 6031         |       addl %ebx,%ecx
  0x067: 3083ffffffff |       irmovl $-1,%ebx
  0x06d: 6032         |       addl %ebx,%edx
  0x06f: 7457000000   |       jne Loop

.yo file
  0x057: 506100000000 | Loop: mrmovl (%ecx),%esi
  0x05d: 6060         |       addl %esi,%eax
  0x05f: 308304000000 |       irmovl $4,%ebx
  0x065: 6031         |       addl %ebx,%ecx
  0x067: 3083ffffffff |       irmovl $-1,%ebx
  0x06d: 6032         |       addl %ebx,%edx
  0x06f: 7457000000   |       jne Loop

1. Label = address computed by the assembler (true of any assembler)
2. The file is "executed" by a simulator

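To see what "executed by a simulator" can mean, here is a deliberately tiny interpreter that runs exactly the bytes of the listing. It is a sketch, not the Y86 simulator: it handles only the four instruction types used by the loop, tracks only ZF, and relies on assumptions that are not in the slides (the array is placed at address 0x100, %ecx and %edx are preset to Start and Count, and execution stops when the PC falls off the end of the loop).

```python
import struct

MASK = 0xFFFFFFFF

# Bytes of the loop as listed in the .yo file, loaded at address 0x057.
LOOP_HEX = ("506100000000" "6060" "308304000000" "6031"
            "3083ffffffff" "6032" "7457000000")
LOOP_START = 0x057
LOOP_END = LOOP_START + len(LOOP_HEX) // 2  # first address after jne (assumed stop point)


def run_loop(values):
    """Sum `values` by interpreting the loop bytes; return the final %eax."""
    if not values or len(values) > 64:
        raise ValueError("need 1 to 64 values (the body runs before Count is tested)")
    memory = bytearray(0x200)
    code = bytes.fromhex(LOOP_HEX)
    memory[LOOP_START:LOOP_START + len(code)] = code
    array_base = 0x100  # assumed location of the array
    for i, value in enumerate(values):
        struct.pack_into("<I", memory, array_base + 4 * i, value & MASK)

    reg = [0] * 8         # indexed by register ID: eax=0, ecx=1, edx=2, ebx=3, ...
    reg[1] = array_base   # %ecx = Start
    reg[2] = len(values)  # %edx = Count
    zf = 0
    pc = LOOP_START
    while pc != LOOP_END:
        first = memory[pc]
        if first == 0x50:    # mrmovl D(rB),rA
            ra, rb = memory[pc + 1] >> 4, memory[pc + 1] & 0xF
            disp = struct.unpack_from("<I", memory, pc + 2)[0]
            reg[ra] = struct.unpack_from("<I", memory, (reg[rb] + disp) & MASK)[0]
            pc += 6
        elif first == 0x30:  # irmovl V,rB
            rb = memory[pc + 1] & 0xF
            reg[rb] = struct.unpack_from("<I", memory, pc + 2)[0]
            pc += 6
        elif first == 0x60:  # addl rA,rB (sets ZF; SF and OF omitted here)
            ra, rb = memory[pc + 1] >> 4, memory[pc + 1] & 0xF
            reg[rb] = (reg[rb] + reg[ra]) & MASK
            zf = int(reg[rb] == 0)
            pc += 2
        elif first == 0x74:  # jne Dest: jump if ZF = 0, else fall through
            dest = struct.unpack_from("<I", memory, pc + 1)[0]
            pc = dest if zf == 0 else pc + 5
        else:
            raise ValueError(f"unsupported instruction byte {first:#x} at {pc:#x}")
    return reg[0]


print(run_loop([1, 2, 3]))
```

With the values 1, 2, 3 the loop runs three times and leaves 6 in %eax. The jump destination 0x57 stored in the last instruction is the address the assembler computed for the label Loop.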
CISC Instruction Sets


• Complex Instruction Set Computer
• Dominant style through mid-80's
Stack-oriented instruction set
• Use stack to pass arguments, save program counter
• Explicit push and pop instructions
Arithmetic instructions can access memory
• addl %eax, 12(%ebx,%ecx,4)
  » Requires memory read and write
  » Complex address calculation
Condition codes
• Set as side effect of arithmetic and logical instructions
Philosophy
• Add instructions to perform "typical" programming tasks

RISC Instruction Sets


• Reduced Instruction Set Computer
• Internal project at IBM, later popularized by Hennessy (Stanford) and Patterson (Berkeley)
Fewer, simpler instructions
• Might take more to get given task done
• Can execute them with small and fast hardware
Register-oriented instruction set
• Many more (typically 32) registers
• Use for arguments, return pointer, temporaries
Only load and store instructions can access memory
• Similar to Y86 mrmovl and rmmovl
No condition codes
• Test instructions return 0/1 in register

MIPS Registers
  Number     Name       Use
  $0         $0         Constant 0
  $1         $at        Reserved Temp.
  $2-$3      $v0-$v1    Return Values
  $4-$7      $a0-$a3    Procedure arguments
  $8-$15     $t0-$t7    Caller Save Temporaries: may be overwritten by called procedures
  $16-$23    $s0-$s7    Callee Save Temporaries: may not be overwritten by called procedures
  $24-$25    $t8-$t9    Caller Save Temp
  $26-$27    $k0-$k1    Reserved for Operating Sys
  $28        $gp        Global Pointer
  $29        $sp        Stack Pointer
  $30        $s8        Callee Save Temp
  $31        $ra        Return Address

MIPS Instruction Examples
R-R           Op | Ra | Rb | Rd | 00000 | Fn
  addu $3,$2,$1       # Register add: $3 = $2+$1
R-I           Op | Ra | Rb | Immediate
  addu $3,$2, 3145    # Immediate add: $3 = $2+3145
  sll $3,$2,2         # Shift left: $3 = $2 << 2
Branch        Op | Ra | Rb | Offset
  beq $3,$2,dest      # Branch when $3 = $2
Load/Store    Op | Ra | Rb | Offset
  lw $3,16($2)        # Load Word: $3 = M[$2+16]
  sw $3,16($2)        # Store Word: M[$2+16] = $3

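As an illustration of these fixed 32-bit formats, the fields can be packed with shifts. The slide does not give field widths or opcode values; the sketch assumes the standard MIPS32 layout (6-bit Op, 5-bit register fields, 6-bit Fn, 16-bit Offset) and the usual opcode and function values for addu, lw and sw. The function names are hypothetical.

```python
def encode_r_r(op, ra, rb, rd, fn):
    """Pack Op | Ra | Rb | Rd | 00000 | Fn into one 32-bit word."""
    return (op << 26) | (ra << 21) | (rb << 16) | (rd << 11) | fn


def encode_load_store(op, ra, rb, offset):
    """Pack Op | Ra | Rb | Offset, with a 16-bit two's-complement offset."""
    return (op << 26) | (ra << 21) | (rb << 16) | (offset & 0xFFFF)


ADDU_OP, ADDU_FN = 0x00, 0x21  # assumed MIPS32 values, not from the slide
LW_OP, SW_OP = 0x23, 0x2B      # assumed MIPS32 values, not from the slide

addu_word = encode_r_r(ADDU_OP, 2, 1, 3, ADDU_FN)  # addu $3,$2,$1
lw_word = encode_load_store(LW_OP, 2, 3, 16)       # lw $3,16($2)
sw_word = encode_load_store(SW_OP, 2, 3, 16)       # sw $3,16($2)

for name, word in (("addu", addu_word), ("lw", lw_word), ("sw", sw_word)):
    print(f"{name}: {word:08x}")
```

Every instruction occupies exactly one word, in contrast with the 1 to 6 byte Y86 encodings.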
CISC vs. RISC
Original Debate
• Strong opinions!
• CISC proponents: easy for compiler, fewer code bytes
• RISC proponents: better for optimizing compilers, can make run fast with simple chip design
Current Status
• For desktop processors, choice of ISA not a technical issue
  » With enough hardware, can make anything run fast
  » Code compatibility more important
• For embedded processors, RISC makes sense
  » Smaller, cheaper, less power

Summary
Y86 Instruction Set Architecture
• Similar state and instructions as IA32
• Simpler encodings
• Somewhere between CISC and RISC
How Important is ISA Design?
• Less now than before
  » With enough hardware, can make almost anything go fast
• Intel is moving away from IA32
  » Does not allow enough parallel execution
  » Introduced IA64
    - 64-bit word sizes (overcome address space limitations)
    - Radically different style of instruction set with explicit parallelism
    - Requires sophisticated compilers