x86 Assembly Language - CSE @ IITD
PowerPoint Slides
Computer Organisation and Architecture
Smruti Ranjan Sarangi,
IIT Delhi
Chapter 5 x86 Assembly Language
PROPRIETARY MATERIAL. © 2014 The McGraw-Hill Companies, Inc. All rights reserved. No part of this PowerPoint slide may be displayed, reproduced or distributed in any form or by any
means, without the prior written permission of the publisher, or used beyond the limited distribution to teachers and educators permitted by McGraw-Hill for their individual course preparation.
PowerPoint Slides are being provided only to authorized professors and instructors for use in preparing for classes using the affiliated textbook. No other use or distribution of this PowerPoint slide
is permitted. The PowerPoint slide may not be sold and may not be distributed or be used by any student or any other third party. No part of the slide may be reproduced, displayed or distributed in
any form or by any means, electronic or otherwise, without the prior written permission of McGraw Hill Education (India) Private Limited.
1
These slides are meant to be used along with the book: Computer
Organisation and Architecture, Smruti Ranjan Sarangi, McGrawHill 2015
2
Visit: http://www.cse.iitd.ernet.in/~srsarangi/archbooksoft.html
Overview of the x86 ISA
It is not one ISA
It is a family of ISAs
The great-grandfather in the family
Is the 8-bit 8080 microprocessor used in the mid-seventies
The grandfather is the 16-bit 8086 microprocessor
released in 1978
The parents are the 32 bit processors : 80386, 80486,
Pentium, and Pentium IV
The current generation of processors are 64 bit
processors : Intel Core i3, i5, i7
3
Main Features of the x86 ISA
It is a CISC ISA
Has more than 300 instructions
Instructions can have a source/
destination memory operand
Uses the stack for passing arguments, and
return addresses
Uses segmented memory
4
Outline
x86 Machine Model
Simple Integer Instructions
Branch Instructions
Advanced Memory Instructions
Floating Point Instructions
Encoding the x86 ISA
5
View of Registers
Modern Intel machines are still ISA compatible
with the arcane 16 bit 8086 processor
In fact, due to market requirements, a 64 bit
processor needs to be ISA compatible with all
32 bit, and 16 bit ISAs
What do we do with registers?
Do we define a new set of registers for each
type of x86 ISA? ANSWER : NO
6
View of Registers – II
Consider the 16 bit x86 ISA – It has 8 registers: ax, bx,
cx, dx, sp, bp, si, di
Should we keep the old registers, and create a new
set of registers in a 32 bit processor?
NO – Widen the 16 bit registers to 32 bits.
If the processor is running a 16 bit program, then it
uses the lower 16 bits of every 32 bit register.
7
View of Registers – III
64 bits    32 bits    16 bits
rax        eax        ax
rbx        ebx        bx
rcx        ecx        cx
rdx        edx        dx
rsp        esp        sp
rbp        ebp        bp
rsi        esi        si
rdi        edi        di
8 registers with 64, 32, and 16 bit variants
The 64 bit ISA has 8 extra registers : r8 - r15
8
x86 can even Support 8 bit Registers
16 bits    Upper 8 bits    Lower 8 bits
ax         ah              al
bx         bh              bl
cx         ch              cl
dx         dh              dl
For the first four 16 bit registers
The lower 8 bits are represented by : al, bl, cl, dl
The upper 8 bits are represented by : ah, bh, ch, dh
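The relationship between a 32 bit register and its 16 bit and 8 bit views can be sketched with bit masks. This is only an illustrative model in Python, not x86 code; the function name sub_registers is hypothetical.

```python
def sub_registers(eax: int) -> dict:
    """Return the 16 bit and 8 bit views of a 32 bit register value."""
    ax = eax & 0xFFFF        # lower 16 bits of eax
    al = ax & 0xFF           # lower 8 bits of ax
    ah = (ax >> 8) & 0xFF    # upper 8 bits of ax
    return {"ax": ax, "ah": ah, "al": al}


views = sub_registers(0x12345678)
print({name: hex(value) for name, value in views.items()})
```

Writing to al or ah in a real program changes only the corresponding byte of ax; the model above covers only the read direction.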
9
x86 Flags Registers and PC
64 bits    32 bits    16 bits
rflags     eflags     flags
rip        eip        ip
Fields in the flags register
Field    Condition     Semantics
OF       Overflow      Set on an overflow
CF       Carry flag    Set on a carry or borrow
ZF       Zero flag     Set when the result is 0, or the comparison leads to an equality
SF       Sign flag     Sign bit of the result
Similar to the SimpleRisc flags register
It has 16 bit, 32 bit, and 64 bit variants
The PC is known as IP (instruction pointer)
10
Floating-point Registers
[Figure: FP register stack, st0 to st7]
x86 has 8 (80 bit) floating-point registers
st0 – st7
They are also arranged as a stack
st0 is the top of the stack
We can perform both register operations, as well as
stack operations
11
View of Memory
x86 follows a segmented memory model
Each address in x86 is actually an offset from the start of the
segment.
For example, an instruction address is an offset in the code
segment
The starting address of the code segment is maintained in a
code segment (CS) register
[Figure: Conceptual view (CS register, address, memory)]
12
Segmentation in x86
16 bit segment registers : cs, ds, ss, es, fs, gs
x86 has 6 different segment registers
Each register is 16 bits wide
Code segment (cs), data segment (ds), stack segment
(ss), extra segment (es), extra segment 1 (fs), extra
segment 2 (gs)
13
Segmented vs Linear Memory Model
In a linear memory model (e.g. SimpleRisc, ARM) the address
specified in the instruction is sent to the memory system
There are no segment registers
What are the advantages of a segmented memory model?
The contents of the segment registers can be changed by the
operating system at runtime.
Can map the text section(code) to another part of memory, or in
principle to other devices also (discussed in Chapter 10)
Stores cannot modify the instructions in the text section.
REASON : Stores use the data segment, and instructions use the
code segment
14
How does Segmentation Work
The segment registers nowadays contain an index into a
segment descriptor table
This is because 16 bits are not sufficient to store a memory
address
Modern x86 processors have two kinds of segment
descriptor tables
LDT (Local Descriptor Table), 1 per process, typically not used
nowadays
GDT (Global Descriptor Table), contains up to 8191 usable entries
Each entry in these tables contains the starting address of the
segment
15
Segment Descriptor Cache
Every memory access needs to access the GDT or LDT :
VERY SLOW
Use a segment descriptor cache (SDC) at each processor
that stores a copy of the relevant entries in the GDT
Lookup the SDC first
If an entry is not there, send a request to the GDT
Quick, fast, and efficient
16
Memory Addressing Mode
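\[
\text{address} =
\begin{Bmatrix} cs: \\ ds: \\ ss: \\ es: \\ fs: \\ gs: \end{Bmatrix}
\underbrace{\begin{Bmatrix} eax \\ ebx \\ ecx \\ edx \\ esp \\ ebp \\ esi \\ edi \end{Bmatrix}}_{\text{base}}
+
\underbrace{\begin{Bmatrix} eax \\ ebx \\ ecx \\ edx \\ ebp \\ esi \\ edi \end{Bmatrix}}_{\text{index}}
*
\underbrace{\begin{Bmatrix} 1 \\ 2 \\ 4 \\ 8 \end{Bmatrix}}_{\text{scale}}
+
\underbrace{\text{displacement}}_{\text{offset}}
\]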
x86 supports a base, a scaled index and an offset (known as the
displacement)
Each of the fields is optional
17
Examples of Addressing Modes
Memory operand          Value of the address    Addressing mode
                        (in register transfer
                        notation)
[eax]                   eax                     register-indirect
[eax + ecx*2]           eax + 2 * ecx           base-scaled-index
[eax + ecx*2 - 32]      eax + 2 * ecx - 32      base-scaled-index-offset
[edx - 12]              edx - 12                base-offset
[edx*2]                 edx * 2                 scaled-index
[0xFFE13342]            0xFFE13342              memory-direct
x86 supports memory direct addressing
The address can just be the index
It can be a combination of the base, scaled index,
and displacement
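The address computation in the table above can be sketched as a small Python function. This is a simplified model under stated assumptions: the segment base is ignored, the result wraps around at 32 bits, and the name effective_address and the register values are hypothetical.

```python
def effective_address(base=0, index=0, scale=1, displacement=0, bits=32):
    """Model of: address = base + index * scale + displacement.

    Every field is optional (defaults to 0, scale defaults to 1).
    The segment base is deliberately left out of this sketch.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) % (1 << bits)


# Assumed register contents, chosen only for illustration
eax, ecx, edx = 0x1000, 0x10, 0x2000

print(hex(effective_address(base=eax, index=ecx, scale=2, displacement=-32)))  # [eax + ecx*2 - 32]
print(hex(effective_address(base=edx, displacement=-12)))                      # [edx - 12]
print(hex(effective_address(displacement=0xFFE13342)))                         # [0xFFE13342]
```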
18
Basic x86 Assembly
We shall use the NASM assembler in this book
Available at : http://www.nasm.us
Generic structure of an assembly statement
<label> : <assembly instruction> ; <comment>
Comments are preceded by a ;
x86 assembly instructions
Typically in the 1 and 2 address format
2 address format : <instruction> <operand 1> <operand 2>
<operand 1> is typically both the source and destination
20
Basic x86 Assembly – II
Rules for operands (for most instructions)
Both the operands can be registers
At most one of them can be an immediate
At most one of them can be a memory location
A memory operand is enclosed in []
Rules for immediates
The maximum size of an immediate is equal to the size of a
memory address
For example, for a 32 bit machine, the maximum size of
an immediate is 32 bits
21
Basic x86 Assembly – III
We shall use the 32 bit flavour of x86 in this book
Readers can seamlessly write 16 bit x86 programs
Simply use the registers : ax, bx, cx, dx, sp, bp, si, di
Readers can also write 64 bit programs by using the registers :
rax, rbx, rcx, rdx, rsp, rbp, rsi, rdi, and r8 – r15
22
The mov instruction
Semantics                       Example         Explanation
mov (reg/mem), (reg/mem/imm)    mov eax, ebx    eax ← ebx
Extremely versatile instruction
Can be used to load an immediate
Load and store values to memory
Move values between registers
Example
mov ebx, [esp + eax*4 - 12]
23
movsx and movzx instructions
Semantics               Example          Explanation
movsx reg, (reg/mem)    movsx eax, bx    eax ← sign extend(bx), the second operand is either 8 or 16 bits
movzx reg, (reg/mem)    movzx eax, bx    eax ← zero extend(bx), the second operand is either 8 or 16 bits
The regular mov instruction assumes that the
source and destination have the same size
The movsx and movzx instructions fill the upper bits of the
destination with the sign bit of the source, or with zeros, respectively
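The difference between the two extensions can be sketched in Python. The helper names sign_extend and zero_extend are hypothetical; they only model what movsx and movzx do to the bits.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Model of movzx: keep the low from_bits bits, upper bits become 0."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """Model of movsx: copy the sign bit of the source into the upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # sign bit of the source is set
        value -= 1 << from_bits          # interpret the source as negative
    return value & ((1 << to_bits) - 1)  # wrap to the destination width


bx = 0xFFFE                      # -2 as a 16 bit signed number
print(hex(sign_extend(bx, 16)))  # 0xfffffffe
print(hex(zero_extend(bx, 16)))  # 0xfffe
```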
24
Exchange Instruction
Semantics                    Example                  Explanation
xchg (reg/mem), (reg/mem)    xchg eax, [eax + edi]    swap the contents of eax and [eax + edi]
Exchanges the contents of <operand 1> and
<operand 2>
25
Stack push and pop Instructions
Semantics             Example     Explanation
push (reg/mem/imm)    push ecx    temp ← ecx; esp ← esp - 4; [esp] ← temp
pop (reg/mem)         pop ecx     temp ← [esp]; esp ← esp + 4; ecx ← temp
An x86 processor is aware of the stack
It is aware that the stack pointer is stored in the
register, esp
The push instruction decrements the stack pointer, and then
saves the value at the new top of the stack
The pop instruction returns the contents at the top of the
stack, and then increments the stack pointer
26
Specifying Memory Operand Sizes
The processor knows the size of a register operand
from its name
eax is a 32 bit operand
ax is a 16 bit operand
What about memory operands ?
push [eax] → How many bytes need to be pushed ?
Solution : Use a modifier
push dword [eax] ; pushes 32 bits
Similarly, we need to use modifiers for other instructions such as
pop (when the number of bytes to be transferred is not known)
27
Modifiers
Modifier    Size
byte        8 bits
word        16 bits
dword       32 bits
qword       64 bits
What is the value of ebx in this code snippet?
mov eax, 10
mov [esp], eax
push dword [esp]
mov ebx, [esp]
Answer: 10
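The answer can be checked with a toy model of the stack. This is a sketch in Python, assuming a word-granular memory stored in a dictionary and an arbitrary starting value of esp; the class Stack32 and the function run_snippet are hypothetical names.

```python
class Stack32:
    """Toy model of a 32 bit stack that grows towards lower addresses."""

    def __init__(self, esp=0x1000):
        self.esp = esp   # assumed initial stack pointer
        self.mem = {}    # address -> 32 bit value

    def store(self, addr, value):
        self.mem[addr] = value & 0xFFFFFFFF

    def load(self, addr):
        return self.mem.get(addr, 0)

    def push(self, value):
        self.esp -= 4                  # esp <- esp - 4
        self.store(self.esp, value)    # [esp] <- value

    def pop(self):
        value = self.load(self.esp)    # temp <- [esp]
        self.esp += 4                  # esp <- esp + 4
        return value


def run_snippet():
    s = Stack32()
    eax = 10                      # mov eax, 10
    s.store(s.esp, eax)           # mov [esp], eax
    s.push(s.load(s.esp))         # push dword [esp] (operand is read before esp changes)
    ebx = s.load(s.esp)           # mov ebx, [esp]
    return ebx, s.esp


print(run_snippet())
```

The operand of push is read first, which mirrors the temp variable in the semantics of the push instruction.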
28
ALU Instructions
Semantics                       Example         Explanation
add (reg/mem), (reg/mem/imm)    add eax, ebx    eax ← eax + ebx
sub (reg/mem), (reg/mem/imm)    sub eax, ebx    eax ← eax - ebx
adc (reg/mem), (reg/mem/imm)    adc eax, ebx    eax ← eax + ebx + (carry bit)
sbb (reg/mem), (reg/mem/imm)    sbb eax, ebx    eax ← eax - ebx - (carry bit)
All of these are 2 operand instructions
The first operand is both the source and destination
Example : Add registers eax, and ebx. Save the result in ecx
add eax, ebx
mov ecx, eax
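A common use of adc is to add numbers that are wider than a register. The following Python sketch models a 64 bit addition built from a 32 bit add followed by an adc; the function names add32 and add64, and the choice of register pairs in the comments, are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """Model of add/adc: returns (32 bit result, carry out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64 bit numbers held as (upper, lower) 32 bit halves."""
    lo, carry = add32(a_lo, b_lo)       # e.g. add eax, ebx (sets the carry flag)
    hi, _ = add32(a_hi, b_hi, carry)    # e.g. adc edx, ecx (consumes the carry flag)
    return hi, lo


hi, lo = add64(0x1, 0xFFFFFFFF, 0x0, 0x1)
print(hex(hi), hex(lo))  # 0x2 0x0
```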
29
Single Operand ALU Instructions
Semantics        Example    Explanation
inc (reg/mem)    inc edx    edx ← edx + 1
dec (reg/mem)    dec edx    edx ← edx - 1
neg (reg/mem)    neg edx    edx ← -1 * edx
Write an x86 assembly code snippet to compute: eax = -1 * (eax + 1).
Answer:
inc eax
neg eax
30
Compare Instruction
Semantics                       Example             Explanation
cmp (reg/mem), (reg/mem/imm)    cmp eax, [ebx+4]    compare the values in eax, and [ebx+4], and set the flags
cmp (reg/mem), (reg/mem/imm)    cmp ecx, 10         compare the content of ecx with 10, and set the flags
Similar to SimpleRisc, the cmp instruction
sets the flags
31
Multiplication and Division Instructions
Semantics                   Example                   Explanation
imul (reg/mem)              imul ecx                  edx:eax ← eax * ecx
imul reg, (reg/mem)         imul ecx, [eax + 4]       ecx ← ecx * [eax + 4]
imul reg, (reg/mem), imm    imul ecx, [eax + 4], 5    ecx ← [eax + 4] * 5
idiv (reg/mem)              idiv ebx                  Divide (edx:eax) by the contents of ebx; eax contains the quotient, and edx contains the remainder.
The imul instruction has three variants
1 operand form → Saves the 64 bit result in edx:eax
eax contains the lower 32 bits, and edx contains the
upper 32 bits
32
imul Instruction - II
2 operand form
The first operand (source and destination) has to be a
register
The second operand can either be a register or
memory location
33
imul Instruction - III
3 operand form
First operand (destination) → register
First source operand (register or memory)
Second source operand (immediate)
34
idiv Instruction
Semantics         Example     Explanation
idiv (reg/mem)    idiv ebx    Divide (edx:eax) by the contents of ebx; eax contains the quotient, and edx contains the remainder.
Takes a single operand (register or memory)
Dividend is contained in edx:eax
edx contains the upper 32 bits
eax contains the lower 32 bits
The input operand contains the divisor
eax contains the quotient
edx contains the remainder
When the dividend is negative, set edx to -1 (sign extension of
eax into edx)
35
Example
Write an assembly code snippet to divide -50 by 3. Save the quotient in
eax, and remainder in edx.
Answer:
mov edx, -1
mov eax, -50
mov ebx, 3
idiv ebx
At the end eax contains -16, and edx contains -2.
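The role of edx in this example can be sketched with a Python model of signed division that truncates towards zero, as idiv does. The function names to_signed and idiv32 are hypothetical, and the overflow check only approximates the fault that a real processor raises when the quotient does not fit in 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def idiv32(edx, eax, divisor):
    """Model of idiv: divide edx:eax by divisor, return (quotient, remainder)."""
    dividend = to_signed(((edx & MASK32) << 32) | (eax & MASK32), 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error")
    quotient = abs(dividend) // abs(d)      # truncate towards zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d     # takes the sign of the dividend
    if not -(1 << 31) <= quotient < (1 << 31):
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder


print(idiv32(0xFFFFFFFF, -50 & MASK32, 3))  # edx = -1: (-16, -2)
print(idiv32(0, -50 & MASK32, 3))           # edx = 0: eax is treated as a large positive number
```

With edx set to 0, the same bit pattern in eax is read as 4294967246, which is why the sign extension into edx is needed for a negative dividend.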
36
Logical Instructions
Semantics                       Example         Explanation
and (reg/mem), (reg/mem/imm)    and eax, ebx    eax ← eax AND ebx
or (reg/mem), (reg/mem/imm)     or eax, ebx     eax ← eax OR ebx
xor (reg/mem), (reg/mem/imm)    xor eax, ebx    eax ← eax XOR ebx
not (reg/mem)                   not eax         eax ← ∼ eax
and, or, and xor are standard 2 operand ALU
instructions where the first operand is also
the destination
The not instruction is a 1 operand instruction
37
Shift Instructions
Semantics                 Example       Explanation
sar (reg/mem), imm        sar eax, 3    eax ← eax >> 3
shr (reg/mem), imm        shr eax, 3    eax ← eax >>> 3
sal/shl (reg/mem), imm    sal eax, 2    eax ← eax << 2
sar (shift arithmetic right)
shr (shift logical right)
sal/shl (shift left)
The second operand (shift amount) is an immediate in these
examples (x86 also allows the shift amount to be held in the cl register)
38
Example
What is the output of this code snippet?
mov eax, 0xdeadfeed
sar eax, 4
Answer: 0xfdeadfee
What is the output of this code snippet?
mov eax, 0xdeadfeed
shr eax, 4
Answer: 0xdeadfee
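Both answers can be reproduced with a short Python model of the two right shifts. The function names sar and shr mirror the instructions, but the code is only a sketch of their effect on a 32 bit value.

```python
MASK32 = 0xFFFFFFFF


def shr(value, amount):
    """Logical right shift: zeros enter from the left."""
    return (value & MASK32) >> amount


def sar(value, amount):
    """Arithmetic right shift: copies of the sign bit enter from the left."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32              # reinterpret as a negative number
    return (value >> amount) & MASK32  # Python's >> is arithmetic for negative ints


print(hex(sar(0xdeadfeed, 4)))  # 0xfdeadfee
print(hex(shr(0xdeadfeed, 4)))  # 0xdeadfee
```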
39
Simple Branch Instructions
Semantics              Example             Explanation
jmp <label>            jmp .foo            jump to .foo
j<condcode> <label>    j<condcode> .foo    jump to .foo if the <condcode> condition is satisfied
jmp is a simple unconditional branch
instruction
The conditional branches are of the form :
j<condcode> such as je, jne
41
Condition Codes in x86
Condition code    Meaning
o                 Overflow
no                No overflow
b                 Below (unsigned less than)
nb                Not below (unsigned greater than or equal to)
e/z               Equal or zero
ne/nz             Not equal or not zero
be                Below or equal (unsigned less than or equal)
s                 Sign bit is 1 (negative)
ns                Sign bit is 0 (0 or positive)
l                 Less than (signed less than)
le                Less than or equal (signed)
g                 Greater than (signed)
ge                Greater than or equal (signed)
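The difference between the signed and unsigned condition codes can be sketched by modelling how cmp sets the flags. This Python model follows the standard x86 flag definitions for a subtraction; the names cmp_flags, CONDITIONS, and taken are hypothetical.

```python
def cmp_flags(a, b, bits=32):
    """Model of cmp a, b: compute a - b and return the resulting flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,
        "CF": a < b,                                  # borrow in the unsigned subtraction
        "SF": bool(result & sign),
        "OF": bool((a ^ b) & (a ^ result) & sign),    # signed overflow
    }


CONDITIONS = {
    "o": lambda f: f["OF"],
    "no": lambda f: not f["OF"],
    "b": lambda f: f["CF"],
    "nb": lambda f: not f["CF"],
    "e": lambda f: f["ZF"],
    "ne": lambda f: not f["ZF"],
    "be": lambda f: f["CF"] or f["ZF"],
    "s": lambda f: f["SF"],
    "ns": lambda f: not f["SF"],
    "l": lambda f: f["SF"] != f["OF"],
    "le": lambda f: f["ZF"] or f["SF"] != f["OF"],
    "g": lambda f: not f["ZF"] and f["SF"] == f["OF"],
    "ge": lambda f: f["SF"] == f["OF"],
}


def taken(cond, a, b):
    """Would j<cond> be taken after cmp a, b?"""
    return CONDITIONS[cond](cmp_flags(a, b))


# 0xFFFFFFFF is -1 when signed, but the largest value when unsigned
print(taken("l", 0xFFFFFFFF, 1))  # True: -1 < 1 (signed)
print(taken("b", 0xFFFFFFFF, 1))  # False: 0xFFFFFFFF is not below 1 (unsigned)
```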
42
Example : Test if a number in eax is
prime. Put the result in eax
x86 assembly code
; assumes that the number in eax is greater than 2
    mov ebx, 2          ; starting index
    mov ecx, eax        ; ecx contains the original number
.loop:
    mov edx, 0          ; required for correct division
    idiv ebx
    cmp edx, 0          ; compare the remainder
    je .notprime        ; number is composite
    inc ebx
    mov eax, ecx        ; set the value of eax again
    cmp ebx, eax        ; compare the index and the number
    jl .loop            ; end of the loop
    mov eax, 1          ; number is prime
    jmp .exit           ; exit
.notprime:
    mov eax, 0
.exit:
43
Function Call and Return
Instructions
Semantics       Example      Explanation
call <label>    call .foo    Push the return address on the stack. Jump to the label .foo.
ret             ret          Return to the address saved on the top of the stack, and pop the entry
The call instruction jumps to the <label>, and
pushes the return address on the stack
The ret instruction pops the stack top (assumed to contain the
return address), and jumps to it
44
What does a typical function do ?
Extracts the arguments from the stack
Creates space on the stack to store the
activation block
Spills some registers (if required)
Calls other functions
Does some processing
Restores the stack pointer
Returns
45
Example of a Recursive Function
Write a recursive function to compute the factorial of a number (≥ 1) stored in
eax. Save the result in ebx.
Answer:
x86 assembly code
factorial:
    mov ebx, 1          ; default return value
    cmp eax, 1          ; compare num (input) with 1
    jz .return          ; return if input is equal to 1
    ; recursive step
    push eax            ; save input on the stack
    dec eax             ; num--
    call factorial      ; recursive call
    pop eax             ; retrieve input
    imul ebx, eax       ; prod = prod * num
.return:
    ret                 ; return
46
Implementing a Function
Using push and pop instructions is fine for small
functions
For large functions that have a lot of internal variables,
it might be necessary to push and pop a lot of values
from the stack
For languages like C++ that dynamically declare local
variables, it might be difficult to keep track of the size
of the activation block.
x86 programs thus save the starting value of esp in
the ebp register. At the end they set esp to ebp.
47
Recursive function for factorial :
without push/pop instructions
x86 assembly code
factorial:
    mov eax, [esp+4]    ; get the value of eax from the stack
    push ebp            ; *** save ebp
    mov ebp, esp        ; *** save the stack pointer
    mov ebx, 1          ; default return value
    cmp eax, 1          ; compare num (input) with 1
    jz .return          ; return if input is equal to 1
    ; recursive step
    sub esp, 8          ; create space on the stack
    mov [esp+4], eax    ; save input on the stack
    dec eax             ; num--
    mov [esp], eax      ; push the argument
    call factorial      ; recursive call
    mov eax, [esp+4]    ; retrieve input
    imul ebx, eax       ; prod = prod * num
.return:
    mov esp, ebp        ; *** restore the stack pointer
    pop ebp             ; *** restore ebp
    ret                 ; return
48
Enter and Leave Instructions
Semantics       Example        Explanation
enter imm, 0    enter 32, 0    push ebp (push the value of ebp on the stack); mov ebp, esp (save the stack pointer in ebp); esp ← esp - 32
leave           leave          mov esp, ebp (restore the value of esp); pop ebp (restore the value of ebp)
push ebp ; mov ebp, esp ; sub esp, <stack size> is a
standard sequence of operations
The enter instruction does all the three operations
mov esp, ebp ; pop ebp
Standard sequence at the end of a function
Both the operations are done by the leave instruction
49
Example with enter and leave
x86 assembly code
factorial:
    mov eax, [esp+4]    ; read the argument
    enter 8, 0          ; *** save ebp and esp
    mov ebx, 1          ; default return value
    cmp eax, 1          ; compare num (input) with 1
    jz .return          ; return if input is equal to 1
    ; recursive step
    mov [esp+4], eax    ; save input on the stack
    dec eax             ; num--
    mov [esp], eax      ; push the argument
    call factorial      ; recursive call
    mov eax, [esp+4]    ; retrieve input
    imul ebx, eax       ; prod = prod * num
.return:
    leave               ; *** load esp and ebp
    ret                 ; return
50
Advanced Memory Instructions
These instructions are useful in moving a large
sequence of bytes from one location to another
Also known as string instructions
They make special use of the edi and esi
registers
edi contains the default destination
esi contains the default source
52
The lea instruction
The lea (load effective address) inst. is
used to load an address in to the edi and
esi registers
In general, lea can be used to load an address in to
any register
lea ebx, [ecx + edx*2 + 16]
ebx ← ecx + 2 * edx + 16
53
stosd instruction
The stosd instruction does not have any
operands
It saves the value in eax to [edi] (memory location in
edi)
If the value of the DF flag in the flags register is 1
edi ← edi – 4
If the value in the DF flag in the flags register is 0
edi ← edi + 4
It is a post-indexed addressing mode
54
lodsd instruction
The lodsd instruction does not have any
operands
It loads the value in [esi] into eax (esi contains the memory
location)
If the value of the DF flag in the flags register is 1
esi ← esi – 4
If the value in the DF flag in the flags register is 0
esi ← esi + 4
It is a post-indexed addressing mode
55
Summary of Memory Instructions
Semantics        Example                        Explanation
lea reg, mem     lea ebx, [esi + edi*2 + 10]    ebx ← esi + edi*2 + 10
stos(b/w/d/q)    stosd                          [edi] ← eax; edi ← edi + 4 * (−1)^DF
lods(b/w/d/q)    lodsd                          eax ← [esi]; esi ← esi + 4 * (−1)^DF
movs(b/w/d/q)    movsd                          [edi] ← [esi]; esi ← esi + 4 * (−1)^DF; edi ← edi + 4 * (−1)^DF
std              std                            DF ← 1
cld              cld                            DF ← 0
DF → Direction Flag
movsd : [edi] ← [esi]
Auto increments esi, and edi based on the DF flag
std : Sets the DF flag to 1
cld : Sets the DF flag to 0
56
What is the value of eax after executing this code snippet?
mov dword [esp+4], 192
lea esi, [esp+4]
lea edi, [esp+8]
movsd
mov eax, [esp+8]
Answer: 192. The movsd instruction transfers 4 bytes from the memory address
specified in esi to the memory address specified in edi. Since we write
192 to the memory address specified in esi, we shall read back the same
value in the last line.
57
Power of String Instructions
    cld                 ; DF = 0
    mov ebx, 0          ; initialisation of the loop index
.loop:
    movsd               ; [edi] <-- [esi]
    inc ebx             ; increment the index
    cmp ebx, 10         ; loop condition
    jne .loop
Copy a 10 element array
Starting address of source array in esi
Starting address of destination array in edi
58
The rep prefix
Semantics    Example      Explanation
rep inst     rep movsd    val ← ecx; Execute the movsd instruction val times; ecx ← 0
Repeats a given instruction n times
n is the value stored in ecx
    cld                 ; DF = 0
    mov ecx, 10         ; Set the count to 10
    rep movsd           ; Execute movsd 10 times
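The effect of rep movsd can be sketched in Python. This is a model under simplifying assumptions: memory is a dictionary keyed by byte address that holds one 32 bit value per entry, and the addresses 0x100 and 0x200 are arbitrary. The function names mirror the instructions but are otherwise hypothetical.

```python
def movsd(mem, esi, edi, df=0):
    """Model of movsd: [edi] <- [esi], then step esi and edi by 4 (direction set by DF)."""
    mem[edi] = mem[esi]
    step = -4 if df else 4
    return esi + step, edi + step


def rep_movsd(mem, esi, edi, ecx, df=0):
    """Model of rep movsd: repeat movsd until the count in ecx reaches 0."""
    while ecx != 0:
        esi, edi = movsd(mem, esi, edi, df)
        ecx -= 1
    return esi, edi, ecx


# Source array of 10 elements at the assumed address 0x100
mem = {0x100 + 4 * i: i * i for i in range(10)}
esi, edi, ecx = rep_movsd(mem, esi=0x100, edi=0x200, ecx=10)   # cld; mov ecx, 10; rep movsd
print([mem[0x200 + 4 * i] for i in range(10)])
print(hex(esi), hex(edi), ecx)
```

After the copy, esi and edi point just past the end of the two arrays, and ecx is 0.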
59
FP Machine Model
[Figure: FP machine model (constants, integer registers, FP registers, memory)]
There is no direct connection between integer and
FP registers
They can only communicate through memory
No way to load floating-point immediates directly
61
FP Load Instructions
Semantics    Example             Explanation
fld mem      fld dword [eax]     Pushes an FP number stored in [eax] to the FP stack
fld reg      fld st1             Pushes the contents of st1 to the top of the stack
fild mem     fild dword [eax]    Pushes an integer stored in [eax] to the FP stack after converting it to a floating-point number
The fld instruction pushes the value of the first
operand (register/mem) to the FP stack
The fild instruction pushes an integer stored in
memory to the FP stack
62
Assembler Directives
There are two ways to load an FP immediate
Store its hex representation to memory, and use the fld
instruction to bring the value to a FP register.
Use an assembler directive to store the immediate as a
constant before the program starts. Then use the fld
instruction to transfer the value to the FP stack.
In NASM :
section .data
num: dd 2.392
Declares a 32 bit floating-point constant : 2.392 in the data section
63
Assembler Directives – II
Furthermore, the assembler associates the label
num with the memory address that saves 2.392
In the assembly program, we need to write:
fld dword [num]
With this method, we do not have to save the
hex (binary) representation of a FP number. The
assembler will automatically do it for us.
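The hex representation that the first method needs can be computed with a few lines of Python. This sketch uses the standard struct module to obtain the IEEE 754 single precision bit pattern of a number; the helper names float32_bits and bits_to_float32 are hypothetical.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32 bit IEEE 754 single precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float32(bits: int) -> float:
    """Interpret a 32 bit pattern as a single precision number."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


bits = float32_bits(2.392)
print(hex(bits))               # the value that `dd 2.392` asks the assembler to store
print(bits_to_float32(bits))   # close to 2.392, but rounded to single precision
```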
64
FP Exchange
Semantics    Example     Explanation
fxch reg     fxch st3    Exchange the contents of st0 and st3
fxch         fxch        Exchange the contents of st0 and st1
Exchanges the contents of two floating point
registers
st0 is always one of the FP registers
65
FP Store Instruction
Semantics     Example              Explanation
fst mem       fst dword [eax]      [eax] ← st0
fst reg       fst st4              st4 ← st0
fstp mem      fstp dword [eax]     [eax] ← st0; pop the FP stack
fist mem      fist dword [eax]     [eax] ← int(st0)
fistp mem     fistp dword [eax]    [eax] ← int(st0); pop the FP stack
The fst instruction saves the value of st0 to memory
The fist instruction converts the FP value to an integer,
and then saves it in memory.
With the 'p' suffix, the inst. also pops the FP stack
66
Example
A 32 bit floating point number is loaded in st0. Convert it to an integer and
save its value in eax.
Answer:
    fist dword [esp+4]  ; save st0 to [esp+4]
    mov eax, [esp+4]
67
Variants of the FP add instruction
Semantics                   Example              Explanation
fadd mem                    fadd dword [eax]     st0 ← st0 + [eax]
fadd reg, reg               fadd st0, st1        st0 ← st0 + st1 (one of the registers has to be st0)
faddp reg, reg              faddp st1, st0       st1 ← st0 + st1; pop the FP stack
fiadd mem                   fiadd dword [eax]    st0 ← st0 + float([eax])
fadd reg (NASM specific)    fadd st1             st0 ← st0 + st1
fadd adds two FP numbers
faddp additionally pops the stack
fiadd adds an integer in the first memory
operand to st0
68
Subtraction, Multiplication,
Division
Semantics    Example             Explanation
fsub mem     fsub dword [eax]    st0 ← st0 - [eax]
fmul mem     fmul dword [eax]    st0 ← st0 * [eax]
fdiv mem     fdiv dword [eax]    st0 ← st0 / [eax]
fsub, fmul, fdiv have exactly the same
form (variants) as the fadd instruction
69
Example: Arithmetic Mean
Compute the arithmetic mean of two integers stored in eax and ebx. Save the
result (in 64 bits) in esp+4. Assume that the memory address, two, contains the
constant 2.
Answer:
    ; load the inputs to the FP stack
    mov [esp], eax
    mov [esp+4], ebx
    fild dword [esp]
    fild dword [esp+4]
    fadd st0, st1       ; compute the sum
    fdiv dword [two]    ; arithmetic mean
    fstp qword [esp+4]  ; save the result to [esp+4]
                        ; used the qword identifier
                        ; for specifying 64 bits
70
Instructions for Special Functions
Semantics    Example    Explanation
fabs         fabs       st0 ← |st0|
fsqrt        fsqrt      st0 ← √st0
fcos         fcos       st0 ← cos(st0)
fsin         fsin       st0 ← sin(st0)
71
Example: Geometric Mean
Compute the geometric mean of two integers stored in eax and ebx. Save the
result (in 64 bits) in esp+4.
Answer:
    ; load the inputs to the FP stack
    mov [esp], eax
    mov [esp+4], ebx
    fild dword [esp]
    fild dword [esp+4]
    fmul st0, st1       ; compute the product
    fsqrt               ; geometric mean
    fstp qword [esp+4]  ; save the result to [esp+4]
                        ; used the qword identifier
                        ; for specifying 64 bits
72
Compare Instructions
Semantics          Example             Explanation
fcomi reg, reg     fcomi st0, st1      compare st0 and st1, and set the eflags register (first register has to be st0)
fcomip reg, reg    fcomip st0, st1     compare st0 and st1, and set the eflags register; pop the FP stack
The fcomi instruction compares the values
of two FP registers and sets the flags
NOTE : It sets the flags for unsigned
comparison
73
Example
Compare sin(2θ) and 2sin(θ)cos(θ). Verify that they have the same value for any
given value of θ. Assume that θ is stored in the data section at the label theta, and
the threshold for floating point comparison is stored at label threshold. Save the
result in eax (1 if equal, and 0 if unequal).
Answer:
    ; compute sin(2*theta), and save in [esp]
    fld dword [theta]
    fadd st0, st0       ; st0 = theta + theta
    fsin
    fstp dword [esp]    ; store the value
    ; compute (2*sin(theta)*cos(theta))
    fld dword [theta]
    fst st1             ; st1 = st0 = theta
    fsin                ; st0 = sin(theta)
    fxch                ; swap st0 and st1 (st1=sin(theta))
    fcos                ; st0 = cos(theta)
    fmul st0, st1       ; st0 = sin(theta) * cos (theta)
    fadd st0, st0       ; st0 = st0 + st0
74
Example – II
    ; compute the modulus of the difference
    fld dword [esp]     ; load (sin(2*theta))
    fsub st0, st1       ; st0 = sin(2*theta) - 2*sin(theta)cos(theta)
    fabs
    ; compare
    fld dword [threshold]
    fcomi st0, st1      ; compare
    ja .equal           ; threshold > difference (a for above)
    mov eax, 0          ; else not equal
    jmp .exit
.equal:
    mov eax, 1          ; values are equal
.exit:
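The same threshold test can be sketched in Python to check the logic of the example. The function name check_double_angle and the default threshold of 1e-6 are assumptions; the source does not state the value stored at the label threshold.

```python
import math


def check_double_angle(theta: float, threshold: float = 1e-6) -> int:
    """Return 1 if sin(2*theta) and 2*sin(theta)*cos(theta) agree within threshold, else 0."""
    lhs = math.sin(2 * theta)
    rhs = 2 * math.sin(theta) * math.cos(theta)
    difference = abs(lhs - rhs)               # fsub followed by fabs
    return 1 if threshold > difference else 0  # fcomi followed by ja


for theta in (0.0, 0.5, 1.0, 2.5):
    print(theta, check_double_angle(theta))
```

Comparing against a threshold, rather than testing for exact equality, tolerates the small rounding differences between the two computations.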
75
Stack Cleanup Instructions
Semantics    Example      Explanation
ffree reg    ffree st4    Free st4
finit        finit        Reset the status of the FP unit including the FP stack and registers
76
Overview of Instruction Encoding
Field           Size           Presence
Prefix          1-4 bytes      optional
Opcode          1-3 bytes      required
ModR/M          1 byte         optional
SIB             1 byte         optional
Displacement    1/2/4 bytes    optional
Immediate       1/2/4 bytes    optional
The optional 1-4 byte prefix modifies the behaviour of the
instruction
Examples :
Can be used to specify the rep prefix
The lock prefix is used to specify that an instruction
executes atomically in a multiprocessor system
78
The ModR/M Byte
Field    Mod    Reg    R/M
Bits     2      3      3
Determines the addressing modes of the operands
Mod bits : (addressing mode of one of the operands)
00 → Register indirect addressing mode
01 → Indirect addressing mode with 1 byte displacement
10 → Indirect addressing mode with 4 byte displacement
11 → Register direct addressing mode
79
The ModR/M Byte – II
The Reg field specifies the register operand
(if necessary)
The Mod and R/M bits determine the format
of the memory operand (if it exists)
If R/M = 100 , we get the scale index and
base from the subsequent SIB byte
80
Scale Index Base
Field    Scale    Index    Base
Bits     2        3        3
There are four values of the scale : 00 (1), 01 (2), 10 (4),
11 (8)
Both the index and base are 3 bits each, and follow the
register encoding scheme
Some rules :
esp cannot be an index
The offset in the memory address can only be
specified in the displacement field
81
Register Encoding
Code    Register
000     eax
001     ecx
010     edx
011     ebx
100     esp
101     ebp
110     esi
111     edi
** If the R/M bits are 100 , then we use the SIB byte
** If Mod = 00, and R/M = 101 (ebp), we use memory direct addressing
The 32 bit displacement is used as the memory address
82
Example
Encode the instruction: add ebx, [edx + ecx*2 + 32]. Assume that the opcode
for the add instruction is 0x03.
Answer:
Let us calculate the value of the ModR/M byte. In this case, our displacement
fits within 8 bits. Hence, we can set the Mod bits equal to 01 (corresponding to
an 8 bit displacement). We need to use the SIB byte because we have a scale
and an index. Thus, we set the R/M bits to 100. The destination register is ebx.
Its code is 011. Thus, the ModR/M byte is : 01 011 100 (0x5C)
Now, let us calculate the value of the SIB byte. The scale is equal to 2 (01). The
index is ecx(001), and the base is edx (010). Hence, the SIB byte is: 01 001 010
= 0x4A.
The last byte is the displacement, which is equal to 0x20.
Thus, the encoding of the instruction is : 03 5C 4A 20 (in hex)
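The steps of this example can be sketched as a small Python encoder. It handles only the case used above (a register destination, a base-scaled-index memory operand, and an 8 bit displacement), and it takes the opcode as an input, exactly as the example assumes 0x03 for add. The function and table names are hypothetical.

```python
REG_CODES = {"eax": 0b000, "ecx": 0b001, "edx": 0b010, "ebx": 0b011,
             "esp": 0b100, "ebp": 0b101, "esi": 0b110, "edi": 0b111}
SCALE_CODES = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def modrm_byte(mod, reg, rm):
    """Pack the Mod (2 bits), Reg (3 bits), and R/M (3 bits) fields."""
    return (mod << 6) | (reg << 3) | rm


def sib_byte(scale, index, base):
    """Pack the Scale (2 bits), Index (3 bits), and Base (3 bits) fields."""
    return (scale << 6) | (index << 3) | base


def encode_reg_mem_disp8(opcode, dest, base, index, scale, displacement):
    """Encode: <opcode> dest, [base + index*scale + displacement] with an 8 bit displacement."""
    if index == "esp":
        raise ValueError("esp cannot be an index")
    if not -128 <= displacement <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    return bytes([
        opcode,
        modrm_byte(0b01, REG_CODES[dest], 0b100),   # Mod = 01, R/M = 100 (SIB follows)
        sib_byte(SCALE_CODES[scale], REG_CODES[index], REG_CODES[base]),
        displacement & 0xFF,
    ])


# add ebx, [edx + ecx*2 + 32]
print(encode_reg_mem_disp8(0x03, "ebx", "edx", "ecx", 2, 32).hex(" "))  # 03 5c 4a 20
```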
83
THE END
84