Transcript Assembly

Assembly Programming
Assembly vs. Higher level languages

There are NO type definitions for variables.
 All kinds of data are stored in the same registers.
 We need to know what kind of data we are working with in order to use the right instructions.
 Memory = a large, byte-addressable array.
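To illustrate the point that the bits themselves carry no type, here is a minimal C sketch. The helper names are illustrative and not from the slides, and the sketch assumes a 32-bit IEEE 754 float and two's complement integers, as on IA32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the same 32 bits as a signed integer. */
int32_t bits_as_signed(uint32_t bits) {
    int32_t s;
    memcpy(&s, &bits, sizeof s);   /* copy the raw bytes, no conversion */
    return s;
}

/* Hypothetical helper: read the same 32 bits as a float
   (assumes float is 32 bits wide). */
float bits_as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the raw bytes, no conversion */
    return f;
}
```

Under those assumptions, the pattern 0xFFFFFFFF reads as -1 when treated as a signed integer, and 0x3F800000 reads as 1.0 when treated as a float. Only the instruction we choose decides which reading applies.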

Only a limited set of registers is used to store data while running the program.
 If we need more room, we must save the data into memory and later reread it.

No special structures (instructions) for “if” / “switch” /
“loops” (for, while, do-while), or even functions!
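As a rough sketch of what this means in practice, the C function below writes a loop using only a comparison, a conditional jump (if + goto), and labels, which is close to the shape a compiler produces. The function name and labels are made up for illustration.

```c
/* Sum of 1..n, written without any loop construct:
   only a test, a conditional jump, and an unconditional jump back. */
int sum_to_n(int n) {
    int sum = 0;
    int i = 1;
loop_test:
    if (i > n) goto loop_done;   /* conditional jump out of the loop */
    sum += i;
    i++;
    goto loop_test;              /* unconditional jump back to the test */
loop_done:
    return sum;
}
```

It behaves like the equivalent while loop; the structure is expressed entirely with jumps.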
Reminder: The Compilation Process
C Support for Multi-File Programs
 Modular programming is built into C: the software is built in two stages:
1. Compilation of each source file (.c) separately
2. Linking of all the files together
[Diagram: file1.c, file2.c, ..., filen.c -> Compiler -> file1.o, file2.o, ..., filen.o -> Linker -> runfile]
Linux Compilation Process
Five stages: Preprocessing, Parsing, Translation, Assembling, and Linking.
 All 5 stages are driven by one program in UNIX, gcc.
 gcc is the C compiler of choice for most UNIX systems. It is actually just a front end that executes various other programs corresponding to each stage in the compilation process.
 To get it to print out the commands it executes at each step, use gcc -v.
How to - Disassembly of code

Compilation of code:
 gcc -c code.c
 We get the file: code.o

Disassembly:
 objdump -d code.o
 We get assembly-like code that represents the C code in the file code.c.

Or:
 gcc -S code.c
 We get a code.s file that contains the assembly code generated by the compiler.
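To try these commands, any small source file will do. The following code.c is a hypothetical example (the function name sum is not from the slides):

```c
/* code.c: a tiny, hypothetical source file to compile and disassemble */
int sum(int x, int y) {
    int t = x + y;   /* expect to find an add instruction for this line */
    return t;
}
```

The exact assembly produced depends on the gcc version, the target architecture, and the optimization flags, so the listing will vary between machines.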
Standard data types
Assembly
In Assembly: size = type of variable.
Words, double words ….





Due to its origins as a 16-bit architecture that expanded into a 32-bit one, Intel uses the term “word” to refer to a 16-bit data type, and not a 32-bit data type as we mentioned before. This is kept for backward compatibility.
32-bit quantities are called “double words”.
64-bit quantities are called “quad words”.
Most instructions we will encounter operate on bytes or double words.
Each instruction has 3 variants, depending on its suffix (‘b’ - byte / ‘w’ - word / ‘l’ - double word).
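As a small sketch, the Intel size names can be modelled with the fixed-width types of C's stdint.h. The constant names below are illustrative only.

```c
#include <stdint.h>

/* Sizes in bytes of the Intel data units, modelled with fixed-width C types. */
enum {
    BYTE_SIZE        = sizeof(uint8_t),    /* suffix 'b' */
    WORD_SIZE        = sizeof(uint16_t),   /* suffix 'w' */
    DOUBLE_WORD_SIZE = sizeof(uint32_t),   /* suffix 'l' */
    QUAD_WORD_SIZE   = sizeof(uint64_t)    /* 64-bit quantity */
};
```

The constants come out as 1, 2, 4, and 8 bytes respectively.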
The Registers



An IA32 CPU contains a set of eight registers storing 32-bit values. These registers are used to store integer data as well as pointers.
The register names all begin with %e (extended), but otherwise they have peculiar names.
In the original 8086 CPU each register had a specific purpose (and hence got its name). Today most of these purposes are less significant.
 Some instructions use fixed registers as sources and/or destinations.
 Within procedures there are different conventions for saving and restoring the first three registers (%eax, %ecx, and %edx) than for the next three (%ebx, %edi, and %esi).
 %ebp and %esp contain pointers to important places in the program stack.
The Register File
Partial access to a register



The low-order two bytes of the first four registers can be independently read or written by the byte operation instructions. This feature was provided to allow backward compatibility.
When a byte instruction updates one of these single-byte “register elements,” the remaining three bytes of the register do not change.
The same goes for the low-order 16 bits of each register, using word operation instructions.
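The effect of a partial write can be sketched in C with masks. These two functions are only a model of the behavior described above, with made-up names; they are not how the hardware is implemented.

```c
#include <stdint.h>

/* Model of a byte write to the lowest byte of a register (e.g. %al):
   only bits 0-7 change, the upper three bytes are preserved. */
uint32_t write_low_byte(uint32_t reg, uint8_t value) {
    return (reg & 0xFFFFFF00u) | value;
}

/* Model of a word write to the low 16 bits of a register (e.g. %ax):
   only bits 0-15 change, the upper two bytes are preserved. */
uint32_t write_low_word(uint32_t reg, uint16_t value) {
    return (reg & 0xFFFF0000u) | value;
}
```

For example, writing the byte 0xBB into a register holding 0x12345678 gives 0x123456BB.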
Operand Forms
Move to / from memory Instructions
Important Suffixes
 ‘l’ - double word.
 ‘w’ - word.
 ‘b’ - byte.
 ‘s’ - single (for floating point).

movl Operand Combinations
Source   Destination   Instruction           C Analog
Imm      Reg           movl $0x4,%eax        temp = 0x4;
Imm      Mem           movl $-147,(%eax)     *p = -147;
Reg      Reg           movl %eax,%edx        temp2 = temp1;
Reg      Mem           movl %eax,(%edx)      *p = temp;
Mem      Reg           movl (%eax),%edx      temp = *p;

Cannot do memory-memory transfers with a single instruction!
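A consequence of this restriction is that a C statement such as *dst = *src needs two moves. The sketch below spells out the two steps; the function name and the registers mentioned in the comments are illustrative assumptions.

```c
/* *dst = *src looks like one step in C, but takes two moves in assembly:
   the value must pass through a register. */
void copy_word(int *dst, const int *src) {
    int temp = *src;   /* Mem -> Reg, e.g. movl (%eax),%ecx */
    *dst = temp;       /* Reg -> Mem, e.g. movl %ecx,(%edx) */
}
```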
Important prefixes
 i - immediate (constant) value.
 r - register.
 m - memory.

 rrmovl = move a double word from one register to another.
 irmovw = move the word given as an immediate value into a register.
movb & movw



The movb instruction is similar to movl, but it moves just a single byte. When one of the operands is a register, it must be one of the eight single-byte register elements.
Similarly, the movw instruction moves two bytes. When one of its operands is a register, it must be one of the eight two-byte register elements.
Both the movsbl and the movzbl instructions serve to copy a byte and to set the remaining bits in the destination:
 movsbl - sign extension.
 movzbl - zero extension.
MOVSBL and MOVZBL








MOVSBL sign-extends a single byte and copies it into a double-word destination.
MOVZBL expands a single byte to 32 bits with 24 leading zeros and copies it into a double-word destination.
Example (each instruction is applied to the initial values):
%eax = 0x12345678
%edx = 0xAAAABBBB
MOVB %dh, %al       %eax = 0x123456BB
MOVSBL %dh, %eax    %eax = 0xFFFFFFBB
MOVZBL %dh, %eax    %eax = 0x000000BB
Another example
(Assume initially that %dh = 8D, %eax = 98765432)
movb %dh,%al        %eax = 9876548D
movsbl %dh,%eax     %eax = FFFFFF8D
movzbl %dh,%eax     %eax = 0000008D
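The two extensions can be modelled in C as follows. This is a sketch with made-up function names, written with explicit masks so that it does not rely on how a particular compiler converts signed values.

```c
#include <stdint.h>

/* Model of movsbl: widen a byte to 32 bits by copying its sign bit
   (bit 7) into the upper 24 bits. */
uint32_t movsbl_model(uint8_t src) {
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

/* Model of movzbl: widen a byte to 32 bits with 24 leading zeros. */
uint32_t movzbl_model(uint8_t src) {
    return (uint32_t)src;
}
```

With the source byte 0x8D, the first function yields 0xFFFFFF8D and the second yields 0x0000008D, matching the results listed above.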
C vs. Assembly example
Arithmetic & Logical Operations
Arithmetic & Logical Operations (2)


With the exception of leal, each of these instructions has a counterpart that operates on words (16 bits) and on bytes (by replacing the suffix).
Again, a single instruction cannot have both of its operands in memory.
“Load Effective Address” (leal)



The “Load Effective Address” (leal) instruction is actually a variant of the movl instruction.
Its first operand appears to be a memory reference, but instead of reading from the designated location, the instruction copies the effective address to the destination. It does not access memory!
This instruction can be used to generate pointers for later memory references without accessing memory.
leal (2)



The leal instruction can be used to compactly describe common arithmetic operations.
If register %edx contains value x, then the instruction:
leal 7(%edx,%edx,4), %eax
will set register %eax to 5x + 7.
It is commonly used to perform simple arithmetic:

(%eax = x; %ecx = y)
 leal 6(%eax), %edx            %edx = x+6
 leal (%eax,%ecx), %edx        %edx = x+y
 leal (%eax,%ecx,4), %edx      %edx = x+4y
 leal 7(%eax,%eax,8), %edx     %edx = 9x+7
 leal 0xA(,%ecx,4), %edx       %edx = 4y+10
 leal 9(%eax,%ecx,2), %edx     %edx = x+2y+9
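The address computation behind these examples is disp + base + index * scale. A minimal C model, with a hypothetical function name, could look like this:

```c
#include <stdint.h>

/* Model of the effective address disp(base, index, scale).
   On IA32 the scale must be 1, 2, 4, or 8; an omitted base or index
   is modelled here by passing 0. Arithmetic wraps modulo 2^32. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale) {
    return disp + base + index * scale;
}
```

For instance, with x = 3, effective_address(7, x, x, 4) gives 5x + 7 = 22, mirroring leal 7(%edx,%edx,4), %eax.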
Logical Shift
Logical Shift - every bit in the operand is simply moved a given number of bit positions, and the vacant bit positions are filled in with zeros.
 Useful as multiplication or division of unsigned integers by powers of two.
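In C, shifts on unsigned types are logical shifts, so the idea can be sketched directly. The function names are illustrative, and k is assumed to be between 0 and 31.

```c
#include <stdint.h>

/* Multiply an unsigned value by 2^k with a logical left shift
   (bits shifted past bit 31 are lost). */
uint32_t times_pow2(uint32_t x, unsigned k) {
    return x << k;
}

/* Divide an unsigned value by 2^k with a logical right shift
   (the vacated high bits are filled with zeros). */
uint32_t div_pow2(uint32_t x, unsigned k) {
    return x >> k;
}
```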

Arithmetic Right Shift



A bitwise operation that shifts all of the bits of its operand.
Every bit in the operand is simply moved a given number of bit positions, and the vacant bit positions are filled in with the leftmost bit.
When shifting to the right, the leftmost bit is the sign bit, so the shift acts as a sort of sign extension. Useful for dividing signed integers by 2^n (the result is rounded toward negative infinity).
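Right-shifting a negative signed value in C is implementation-defined, so the sketch below models the arithmetic shift explicitly with unsigned operations. The function name is made up, and k is assumed to be between 0 and 31.

```c
#include <stdint.h>

/* Model of an arithmetic right shift by k (0..31): the vacated high bits
   are filled with copies of the sign bit. */
int32_t sar_model(int32_t x, unsigned k) {
    if (x >= 0)
        return (int32_t)((uint32_t)x >> k);        /* sign bit is 0 */
    /* Logical shift, then refill the top k bits with 1s. */
    uint32_t bits = ((uint32_t)x >> k) | (uint32_t)~(UINT32_MAX >> k);
    /* Convert the two's complement bit pattern back to a negative value. */
    return -(int32_t)(uint32_t)~bits - 1;
}
```

Note the rounding: shifting -7 right by one position gives -4, whereas the C expression -7 / 2 gives -3, because C division truncates toward zero.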
Shift

Either logical or arithmetic

k is a number between 0 and 31, or the single-byte
register %cl

Suppose that x and n are stored at memory locations
with offsets 8 and 12, respectively, relative to the
address in register %ebp
 movl 12(%ebp), %ecx     get n
 movl 8(%ebp), %eax      get x
 sall $2,%eax            x <<= 2
 sarl %cl,%eax           x >>= n
C vs. Assembly example
mul & div Instructions
Imull Example
movl $4823424, %eax
movl $-423, %ebx
imull %ebx
What happens is:
EDX:EAX = FFFFFFFFh:86635D80h
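The result can be checked with a small C model of the one-operand multiply: compute the full 64-bit signed product and split it into two 32-bit halves. The function and parameter names are illustrative.

```c
#include <stdint.h>

/* Model of one-operand imull: EDX:EAX = EAX * operand (signed, 64-bit). */
void imull_model(int32_t eax, int32_t operand,
                 uint32_t *edx_out, uint32_t *eax_out) {
    int64_t product = (int64_t)eax * (int64_t)operand;
    uint64_t bits = (uint64_t)product;            /* two's complement pattern */
    *edx_out = (uint32_t)(bits >> 32);            /* high-order 32 bits */
    *eax_out = (uint32_t)(bits & 0xFFFFFFFFu);    /* low-order 32 bits */
}
```

Calling it with 4823424 and -423 yields 0xFFFFFFFF for the high half and 0x86635D80 for the low half, since the product is -2040308352.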
‫ נועם חזון‬,‫תמר שרוט‬

cltd - converts the signed long in EAX to a signed double long in EDX:EAX by extending the most significant bit (sign bit) of EAX into all bits of EDX.

idiv - divides the 64-bit integer EDX:EAX (EDX holds the most significant four bytes and EAX the least significant four bytes) by the specified operand value. The quotient is stored in EAX, and the remainder is placed in EDX.
Code example




(x at %ebp+8, y at %ebp+12)
movl 8(%ebp),%eax     Put x in %eax
imull 12(%ebp)        Multiply by y
pushl %edx            Push high-order 32 bits
pushl %eax            Push low-order 32 bits
Yet, another example





(x at %ebp+8, y at %ebp+12)
movl 8(%ebp),%eax     Put x in %eax
cltd                  Sign extend into %edx
idivl 12(%ebp)        Divide by y
pushl %eax            Push x / y
pushl %edx            Push x % y
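The cltd and idivl pair can be modelled in C as a sign extension to 64 bits followed by a division that yields both quotient and remainder. This is a sketch with made-up names; it assumes y is not 0 and that the quotient fits in 32 bits (so not INT32_MIN divided by -1), cases in which the real instruction raises a divide error.

```c
#include <stdint.h>

/* Model of cltd followed by idivl.
   Assumes y != 0 and a quotient that fits in 32 bits. */
void cltd_idivl_model(int32_t x, int32_t y,
                      int32_t *quotient, int32_t *remainder) {
    int64_t dividend = (int64_t)x;            /* cltd: sign-extend into EDX:EAX */
    *quotient  = (int32_t)(dividend / y);     /* ends up in %eax */
    *remainder = (int32_t)(dividend % y);     /* ends up in %edx */
}
```

Both C and idivl truncate the quotient toward zero, so dividing -7 by 2 gives a quotient of -3 and a remainder of -1.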