dest - Homepage of Professor Ivan A. Escobar Broitman


Assembly
Language
http://iescobar.com
Msc. Ivan A. Escobar Broitman
January-May 2012
CHAPTER 1
Introduction
Microprocessor
 Silicon chip that contains a
central processing unit (CPU).
 The “Brain” of all personal computers,
most workstations, and a great number of
digital devices.
 In charge of program execution.
 It can be RISC or CISC.
2
Bus
Connections
Address Bus
CPU
Memory
I/O
Control Bus
Data Bus
3
Bus Connections (continued)
 A processor communicates with the system’s
memory and I/O circuits by means of signals
that travel through a set of cables or
connections known as buses.
 Address Bus: Holds the memory address that will be
accessed.
 Data Bus: Holds the piece of data to read or write.
 Control Bus: Indicates the operation to be done
(read or write).
4
CPU Instructions
 Each instruction has:
 an opcode (operation code), that
indicates which operation to perform.
 zero or more operands, which may be
registers, constants or memory
locations.
5
Fetch-Execute Cycle
 Fetch:
1. Fetch an instruction from memory.
2. Decode the instruction to determine the
operation.
3. Fetch data from memory if necessary.
 Execute:
4. Perform the operation on the data.
5. Store the result in memory if needed.
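The five steps above can be sketched with a toy machine in Python. This is only an illustration: the opcode numbers (`LOAD`, `ADD`, `STORE`, `HALT`) and the single-accumulator design are invented for the sketch and are not x86 opcodes.

```python
# Toy accumulator machine: each instruction is an (opcode, operand) pair.
# The opcode numbers are hypothetical, chosen only for this sketch.
LOAD, ADD, STORE, HALT = 0, 1, 2, 3

def run(program, data):
    """Run the program against a data memory (a list of ints) and return that memory."""
    pc = 0    # program counter: index of the next instruction
    acc = 0   # accumulator register
    while True:
        opcode, operand = program[pc]   # 1. fetch the instruction
        pc += 1
        if opcode == LOAD:              # 2. decode the operation
            acc = data[operand]         # 3. fetch data from memory
        elif opcode == ADD:
            acc += data[operand]        # 4. perform the operation
        elif opcode == STORE:
            data[operand] = acc         # 5. store the result in memory
        elif opcode == HALT:
            return data
        else:
            raise ValueError(f"unknown opcode {opcode}")

# Adds data[0] and data[1] and stores the sum in data[2].
print(run([(LOAD, 0), (ADD, 1), (STORE, 2), (HALT, 0)], [2, 3, 0]))
```

A real CPU overlaps these steps in hardware; the loop only shows their logical order.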
6
RISC: Reduced
Instruction Set
Computer
 Microprocessor that uses a relatively
small number of fast but simple
instructions.
 Cheaper to design and produce because
it requires fewer transistors.
 Mainly used in workstations.
7
CISC: Complex
Instruction Set
Computer
 Microprocessor that uses a significantly
large amount of complex (specialized)
instructions.
 Mainly used for Intel’s x86 architecture.
8
Programming Languages
High Level Language
Assembly Language
Machine Code
Hardware
9
Machine Code
• Lowest-level programming language.
• Each CPU instruction is represented as an opcode, which is an unsigned integer number.
• The only language that the computer really understands.
• Difficult for human beings to understand.
10
Machine Code Example
 The opcode for adding one to the
accumulator in the Intel x86 is:
01000000b
or
0x40
11
Assembly
Language
 Same instruction set as
machine code.
 Each opcode is replaced by
a symbolic name.
 Less cryptic for human
beings.
12
Assembly Language Example
 The Intel x86 assembly language
instruction that adds one to the
accumulator is:
inc eax
13
Assembler
 In order to execute a
program written in
assembly language, it
first has to be translated
to machine code using a
special program called
an assembler.
inc eax → Assembler → 0x40
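As a rough illustration of what an assembler does, the sketch below translates the single instruction used in the example. It assumes the one-byte `inc` encoding of 32-bit mode (0x40 plus the register number); the function name `assemble_inc` is made up for this example, and a real assembler handles far more than this.

```python
# Register numbers used by the one-byte "inc r32" encoding in 32-bit mode.
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
             "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def assemble_inc(line):
    """Translate a line such as 'inc eax' into its one-byte machine code."""
    mnemonic, register = line.lower().split()
    if mnemonic != "inc":
        raise ValueError("this sketch only knows the inc instruction")
    return bytes([0x40 + REGISTERS[register]])

print(assemble_inc("inc eax").hex())
```

The symbolic name and the register are simply looked up and combined into the number the CPU expects.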
14
High Level Language
• Its instructions are less primitive than those of assembly language and machine code.
• Program text is much more like natural language.
• Easier for human beings to understand.
• Examples: FORTRAN, LISP, COBOL, BASIC and C.
15
Compiler
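• A program written in a high level language may be translated to machine code using a compiler.
High level source:
    if(x == 0)
        x = x + 5;
Compiler output (assembly language), followed by the machine code the assembler produces for each line:
    cmp esi,0      0x81FE00000000
    jne .L1        0x7506
    add esi,5      0x81C605000000
.L1: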
16
Interpreter
 An interpreter
translates a high level
language program to
an intermediate form
that is subsequently
executed by a virtual
machine.
[Diagram: source (IF X = 0 THEN X = X + 5) → Interpreter: Translator → Intermediate Form → Virtual Machine]
17
Assembly Language
Advantages
 Program execution speed.
 Executable code size.
 “Bare bones” programming:
 special instructions (FPU, MMX)
 I/O ports
 special CPU modes of operation
18
Assembly Language
Disadvantages
 Error prone.
 Long and tedious to write.
 Difficult to understand and modify.
 Strongly tied to a specific computer
architecture.
19
Commonly Used
Assembly Language
Applications
 Operating Systems
 Device Drivers
 Communication Software
 Real Time Systems
 Embedded Systems
 Graphics
20
Reasons for
Studying Assembly
Language
• To understand some of the low level details of how a real computer operates.
• To get to know some technologies that can only be adequately understood using assembly language.
• To obtain a better appreciation of the inner workings of a compiler.
21
What’s next?
Programming
Languages
Course
Computer
Science
(ISC)
Microprocessors
Course
Computer
Engineering
(ISE)
Assembly Language
Course
22
CHAPTER 2
The Intel x86
Architecture
Moore’s
Law
• In 1965, Intel’s co-founder Gordon Moore observed that the number of components on a microchip was doubling at a regular pace.
• In its popular form: approximately every 18 months microchips double their power, while their cost stays roughly the same.
24
Intel Processors
[Figure: transistor count (logarithmic scale, 10K to 10M) against year (1970 to 2000) for the 4004, 8080, 8086, 80286, 80386, 80486, P5, P6 and P7.]
25
Moore’s Law
26
4004 (1971)
 First microprocessor.
 Built by Intel for Busicom
calculators.
 4-bit registers.
 108 kHz.
 2,300 transistors.
 640 bytes of memory.
27
8080 (1974)
 Used in the MITS Altair 8800, the first
commercial personal computer.
 8-bit registers.
 16-bit address bus.
 2 MHz.
 6,000 transistors.
 64 KB of memory.
29
8086/8088 (1978)
 Used in the original IBM PC.
 Intel’s first 16-bit microprocessor.
 20-bit address bus.
 16-bit (8086) and 8-bit (8088) data bus.
 4.77+ MHz.
 29,000 transistors.
 Addressable memory 1 MB.
31
80286 (1982)
• Used in the original IBM PC/AT.
• 24-bit address bus.
• 16-bit data bus.
• 6+ MHz.
• 134,000 transistors.
• Multitasking, protected mode and virtual memory.
• Addressable memory 16 MB.
33
80386 (1985)
 32-bit registers.
 32-bit address bus.
 32-bit data bus.
 Pipelining.
 16+ MHz.
 275,000 transistors.
 Addressable memory 4 GB.
35
P4: 80486 (1989)
 Better execution speed.
 Integrated floating point unit (FPU).
 8 KB L1 cache.
 25+ MHz.
 1’200,000 transistors.
 Addressable memory 4Gb.
37
P5: Pentium (1993)
 64-bit data bus.
 8 KB L1 cache for data and 8 KB for code.
 Dual pipeline for integer operations.
 60+ MHz.
 3’100,000 transistors.
 Addressable Memory 4Gb.
39
P6: Pentium Pro (1995)
• 36-bit address bus.
• 256 KB L2 cache.
• Superpipelining.
• Speculative and out-of-order execution.
• 150+ MHz.
• 5,500,000 transistors.
• Addressable memory 64 GB.
41
P55C: Pentium MMX (1997)
• Classic Pentium with MMX technology: 64-bit SIMD multimedia and communication extensions.
• 16 KB L1 cache for data and 16 KB for code.
• 166+ MHz.
• 4,500,000 transistors.
• Addressable memory 4 GB.
43
Klamath: Pentium II (1997)
• Pentium Pro with MMX technology.
• 16 KB L1 cache for data and 16 KB for code.
• 512 KB L2 cache.
• 233+ MHz.
• 7,500,000 transistors.
• Addressable memory 64 GB.
44
New P6 processors
 Pentium II Xeon (“Pentium II on steroids”)
 L2 cache runs at full processor speed.
 Designed for the computer server market.
 Celeron (“the Castrated One”)
 Pentium II with no L2 cache.
 Designed for the sub-$1,000 PC market.
46
New PII XEON
47
CELERON
48
Katmai: Pentium III (1999)
 Pentium II with 128-bit SIMD floating point
oriented extension to the MMX
technology.
 Processor serial number in order to
“enhance security”.
 450+ MHz.
 Addressable memory 64 GB.
49
Pentium IV (2000)
• 0.18-micron manufacturing process.
• 42 million transistors on a single chip.
• 1.4 to 3.0 GHz.
• Bus speed 400 MHz.
51
Merced: Itanium (2000)
 Intel Architecture-64 (IA-64).
 Developed jointly by Intel and Hewlett-Packard.
 Hardware x86 emulation.
 Not RISC or CISC, but EPIC (Explicitly
Parallel Instruction Computing).
 600 MHz and 1,000 MHz.
 Tens of millions of transistors.
53
x86 Basic Structure
[Block diagram of the CPU. Components: bus interface (to RAM), code cache, data cache, branch predictor, decode & prefetch unit, execution unit (integer ALU, registers), floating point unit.]
54
x86 Basic Structure (continued)
• Execution Unit: two parallel integer pipelines enable the CPU to read, interpret, execute and dispatch two instructions simultaneously.
• Branch Predictor: the branch prediction unit tries to guess which sequence will be executed each time the program contains a conditional jump, so that the Prefetch and Decode Unit can get the instructions ready in advance.
55
x86 Basic Structure (continued)
• Floating Point Unit: third execution unit, where non-integer calculations are performed.
• Primary Cache: two on-chip caches, one for code and one for data, are far quicker than the external memory.
• Bus Interface: this brings a mixture of code and data into the CPU, separates the two ready for use, and then recombines them and sends them back out.
56
x86 Modes of
Operation
• The operating mode determines which instructions and architectural features are accessible.
• The Intel Architecture supports three operating modes:
- Real Mode
- Protected Mode
- Virtual-8086 Mode
57
Real Mode
 Mode in which all x86 processors boot.
 The CPU works like a very fast 8086.
 Can only access up to 1 MB of memory.
 Only one task is executed at a time.
58
Real Mode
• In real-address mode, the IA-32 processor can access 1 MB of memory using 20-bit addresses in the range 0 to FFFFF hex. The basic problem that Intel engineers had to solve was that the original 8086 processor had only 16-bit registers, so it was impossible to represent a 20-bit address directly.
• They came up with a scheme known as segmented memory. All memory is divided into 64 KB units called segments, as shown in the figure:
59
Real Mode
60
Real Mode
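• An analogy might be a large building:
- Segment = a floor.
- Offset = a room on that floor.
• Example: 8000:0250 represents offset 0250 hex in segment 8000 hex. Every segment starts at an address that is a multiple of 10 hex, so the last zero of that starting address (80000 hex) is dropped when it is stored in a segment register.
• To calculate the linear address (all values in hexadecimal):
- segment x 10 + offset
- 8000 x 10 + 0250 = 80250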
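The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `linear_address` is chosen for the example, and the 20-bit wrap-around models the original 8086, which has only 20 address lines.

```python
def linear_address(segment, offset):
    """Real-mode address: segment * 16 + offset, truncated to 20 bits as on the 8086."""
    return ((segment << 4) + offset) & 0xFFFFF

# The slide's example, 8000:0250
print(hex(linear_address(0x8000, 0x0250)))
```

Shifting left by 4 bits is the same as multiplying by 10 hex, which is why the segment value looks like the address with its last hexadecimal zero removed.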
61
Real Mode
 A typical program has three segments:
 Code (CS)
 Data (DS)
 Stack (SS)
62
Protected Mode
 Allows multitasking.
 Each program has its own memory
protected from other programs.
 Extended memory: more than 1 MB of
memory available.
 Supports virtual memory.
63
Protected Mode
 When a processor is running in protected
mode, each program can address up to
4GB of memory.
 It uses the flat memory model.
 It only requires a 32 bit integer to hold the
address of any instruction or variable.
64
Protected Mode
 A typical program has three segments:
 Code (CS)
 Data (DS)
 Stack (SS)
65
Virtual-8086 Mode
 Allows simultaneous execution of two or
more programs designed to work in real
mode, each program having up to 1 MB of
independent memory.
66
Registers
 A register is a special high-speed storage
area within the CPU.
 The x86 processors have several registers
available for the application programmer,
grouped as follows:
 General-purpose data registers.
 Segment registers.
 Status and control registers (EIP and
EFLAGS registers).
67
General-Purpose Data Registers
 These eight 32-bit registers are available
for holding the following data items:
 Integer operands for logical and arithmetic
operations.
 Pointers (memory addresses).
68
General-Purpose Data Registers (continued)
32-bit (bits 31-0)   16-bit (bits 15-0)   bits 15-8   bits 7-0   Name
eax                  ax                   ah          al         Accumulator
ebx                  bx                   bh          bl         Base
ecx                  cx                   ch          cl         Count
edx                  dx                   dh          dl         Data
69
General-Purpose Data Registers (continued)
32-bit (bits 31-0)   16-bit (bits 15-0)   Name
esp                  sp                   Stack Pointer
ebp                  bp                   Base Pointer
esi                  si                   Source Index
edi                  di                   Destination Index
70
Segment Registers
• The six segment registers hold 16-bit segment selectors.
• A segment selector points to a special structure in memory called a segment descriptor. Several segment descriptors are grouped together into a descriptor table.
• A segment descriptor contains addressing and control information which is used to control how a 32-bit linear address is generated.
71
Segment Registers (continued)
Register (16 bits)   Name
cs                   Code Segment
ds                   Data Segment
es                   Extra Segment
fs                   Extra Segment
gs                   Extra Segment
ss                   Stack Segment
72
Segment Registers (continued)
[Diagram: a segment register holds a segment selector, which picks one segment descriptor out of the descriptor table in memory.]
Segment information held in a segment descriptor:
• Base address
• Size
• Privilege Level:
- private OS function
- OS service
- device driver
- application program
• Type:
- read-only
- read/write
- execute-only
- execute/read
...
73
Instruction Pointer Register
 The instruction pointer (EIP) is a 32-bit
register that contains the offset in the
current code segment for the next
instruction to be executed.
[Diagram: eip, the Instruction Pointer, bits 31 to 0.]
74
Instruction Pointer Register (continued)
 It is advanced from one instruction
boundary to the next in straight-line code
or it is moved ahead or backwards by a
number of instructions when executing
flow control instructions such as jumps or
subroutine calls.
 It cannot be accessed directly by software.
75
Flags Register
 This 32-bit register is a
collection of individual status
and control bits called flags.
 Each flag is usually
manipulated independently
and not as a set.
76
Flags Register (continued)
Bit   Flag   Name
0     CF     Carry flag
2     PF     Parity flag
4     AF     Auxiliary flag
6     ZF     Zero flag
7     SF     Sign flag
10    DF     Direction flag
11    OF     Overflow flag
The remaining bits of eflags (up to bit 31) are not shown.
77
Flags Register (continued)
• Carry Flag: set if the result of an arithmetic operation involving unsigned numbers overflows.
• Overflow Flag: set if the result of an arithmetic operation involving signed numbers overflows.
• Sign Flag: set if the result of an arithmetic or logical operation is negative.
• Zero Flag: set if the result of an arithmetic or logical operation is zero.
78
Flags Register (continued)
• Parity Flag: set if the result of an arithmetic or logical operation has an even number of 1 bits in its 8 least significant bits.
• Auxiliary Flag: set if the result of an arithmetic operation has a carry out from the low-order nibble. Used in binary-coded decimal (BCD) operations.
• Direction Flag: explicitly set or cleared by the programmer in order to modify the behavior of some special string operations.
79
Memory
Organization
 The memory that the processor
addresses on its bus is called
physical memory.
 Physical memory is organized as a
sequence of 8-bit bytes. Each byte is
assigned a unique address, called a
physical address.
80
Memory Organization (continued)
 The physical address space ranges from
zero to a maximum of 2^32 - 1 (4 GB).
 When employing the processor’s memory
management facilities, programs DO NOT
directly address physical memory.
Instead, they access memory using a
memory model.
81
Flat Memory Model
 Memory appears to a program as a
single, continuous address space,
called a linear address space. All
code and data are contained in this
address space.
82
Flat Memory Model (continued)
• The linear address space is byte addressable, with addresses running contiguously from 0 to 2^32 - 1 (0x00000000 to 0xFFFFFFFF).
83
Paging
• The x86 supports translation of linear (virtual) addresses into physical addresses through paging.
• Special tables map portions of the virtual addresses into physical memory locations.
• Physical memory is divided into page frames, each 4 KB in size.
• The operating system copies a certain number of pages from the storage device to main memory.
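With 4 KB pages, the low 12 bits of a linear address are the offset inside the page and the remaining bits are the page number. The sketch below assumes a simplified, single-level `page_table` dictionary (an invention for the example; the real x86 tables are multi-level structures kept in memory) and treats a missing entry as a page fault.

```python
PAGE_SIZE = 4096  # 4 KB page frames

def split_linear(address):
    """Return (page number, offset within the page) for a linear address."""
    return address // PAGE_SIZE, address % PAGE_SIZE

def translate(address, page_table):
    """Map a linear address to a physical one.

    page_table maps page number -> page frame number (hypothetical structure).
    A missing entry stands in for a page fault.
    """
    page, offset = split_linear(address)
    if page not in page_table:
        raise LookupError(f"page fault on page {page:#x}")
    return page_table[page] * PAGE_SIZE + offset

# Page 0x403 is assumed to live in frame 0x12.
print(hex(translate(0x00403ABC, {0x403: 0x12})))
```

The offset is carried over unchanged; only the page number is replaced by a frame number.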
84
Paging (continued)
[Diagram: the virtual memory address space is backed partly by physical memory and partly by the disk drive.]
• When a program needs a page that is not in main memory, the operating system copies the required page into memory and copies another page back to the disk.
• Each time a page is needed that is not currently in memory, a page fault occurs.
85
Generating a Physical Address
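Logical address = 16-bit selector : 32-bit offset
1. The 16-bit selector picks a segment descriptor.
2. Base address from the segment descriptor + 32-bit offset = 32-bit linear address.
3. Paging translates the 32-bit linear address into a 32-bit physical address.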
86
32-bit Offset
32-bit offset = base register + (index register x scale factor) + displacement
• 32-bit base register: eax, ebx, ecx, edx, esi, edi, ebp, esp
• 32-bit index register: eax, ebx, ecx, edx, esi, edi, ebp
• Scale factor: 1, 2, 4, 8
• Displacement: 8-bit, 32-bit
87
32-bit Offset Example
MOV EAX, [ESI + ECX * 4 + 12]
• Base register: ESI
• Index register: ECX
• Scale factor: 4
• Displacement: 12
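The offset computed by this addressing mode can be worked out by hand or with a short Python sketch. The register values used below (ESI = 0x1000, ECX = 3) are hypothetical, picked only to have numbers to plug in.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """32-bit offset = base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF

# MOV EAX, [ESI + ECX * 4 + 12] with assumed values ESI = 0x1000 and ECX = 3
print(hex(effective_address(base=0x1000, index=3, scale=4, displacement=12)))
```

With a scale factor of 4, this form is a natural fit for indexing an array of 32-bit elements that starts 12 bytes past the address in ESI.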
88
Byte Order
 When a value is stored in
memory in multiple bytes, two
distinct byte orders may be used:
 Big-Endian
 Little-Endian
Big End
Little end
89
Byte Order (continued)
• In big-endian architectures, the most significant byte is stored at the lowest address. In little-endian architectures, the least significant byte is stored at the lowest address.
• The terms big-endian and little-endian are derived from the Lilliputians of Jonathan Swift's Gulliver's Travels, whose major political issue was whether soft-boiled eggs should be opened at the big end or the little end.
90
Byte Order (continued)
 Intel x86 and DEC VAX systems store
multibyte values in little-endian order.
 HP, IBM and Motorola 68K systems store
multibyte values in big-endian order.
 The Power PC is a bi-endian processor: it
supports both big and little-endian byte
ordering.
91
Byte Order Example
 The byte ordering for the number 1025
stored in 4 bytes is:
1025 = 00000000 00000000 00000100 00000001b
Address   little-endian   big-endian
03        00000000b       00000001b
02        00000000b       00000100b
01        00000100b       00000000b
00        00000001b       00000000b
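The same example can be reproduced in Python, whose `int.to_bytes` method takes the byte order as an argument. In the output, the first byte printed is the one at the lowest address (address 00).

```python
value = 1025

little = value.to_bytes(4, "little")   # least significant byte first
big = value.to_bytes(4, "big")         # most significant byte first

print("little-endian:", little.hex(" "))
print("big-endian:   ", big.hex(" "))

# Reading the bytes back with the matching order recovers the number.
print(int.from_bytes(little, "little"), int.from_bytes(big, "big"))
```

Reading little-endian bytes as if they were big-endian would give a different number, which is why byte order matters whenever binary data moves between architectures.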
92
CHAPTER 3
The Linux Operating
System
Operating
System
 Software that makes
hardware usable.
 Manages such things
as: memory, screen
display, keyboard
input, disk files and
printer output.
User
Application
Programs
Operating
System
Hardware
94
UNIX
 Operating system developed at Bell
Labs in the early 1970s by Ken Thompson
and Dennis Ritchie.
 One of the first operating systems to be written in a high-level programming language, namely C.
95
UNIX (continued)
 The name UNIX was intended as a pun on
a previous OS called MULTICS (and was
written UNICS at first: UNiplexed
Information and Computing System).
 Leading operating system for workstations
96
Linux
 Free UNIX-type operating
system originally created by
Linus Torvalds at the
University of Helsinki in
Finland.
 Developed under the GNU
General Public License, the
source code for Linux is
freely available to everyone.
97
Linux (continued)
 Linux is an independent POSIX (Portable
Operating System Interface for UNIX)
implementation and includes: multitasking,
multi-user, multiprocessing, virtual
memory, shared libraries and TCP/IP
networking.
 Currently implemented in a wide range of
platforms, including: x86, Alpha, SPARC,
68K and PowerPC.
98
GNU Project
• Short for GNU's Not UNIX.
• A UNIX-compatible software system developed by the Free Software Foundation (FSF).
• The philosophy behind GNU is to produce software that is non-proprietary. Anyone can download, modify and redistribute GNU software. The only restriction is that they cannot limit further redistribution.
• The GNU project was started in 1983 by Richard Stallman at MIT.
99
POSIX
 Acronym for Portable
Operating System
Interface for UNIX.
 Set of IEEE and ISO
standards that define
an interface between programs
and operating systems.
 Supported by most UNIX systems
and Windows NT.
100
Multitasking
 The ability to execute more
than one task (program) at
the same time.
 The CPU switches from one program to
another so quickly that it gives the
appearance of executing all of the
programs at the same time.
101
Multitasking (continued)
 There are two basic types of multitasking:
 Preemptive multitasking: the operating
system assigns CPU time slices to each
program.
 Cooperative multitasking: each program
can control the CPU for as long as it needs it.
If a program is not using the CPU, however, it
can allow another program to use it
temporarily.
 Linux supports preemptive multitasking.
102
Multi-user
 Computer systems that
support two or more
simultaneous users.
 All mainframes and
minicomputers and most
workstations are multi-user
systems.
103
Multiprocessing
 Since version 2.0, Linux
has the ability to run in
multiprocessor
architectures.
 The OS can distribute
several applications in
true parallel fashion
across several CPUs.
104
Virtual Memory
If it’s there and you can see it → it’s real
If it’s not there and you can see it → it’s virtual
If it’s there and you can’t see it → it’s transparent
If it’s not there and you can’t see it → you erased it!
(IBM poster explaining virtual memory, circa 1978.)
105
Virtual Memory (continued)
 Technique that increases the amount of apparent memory available on a system.
 A swap space is an area on disk in which
the OS stores images of running programs
when memory is tight.
 The Linux virtual memory system uses a
swap space to implement paging.
106
Shared Libraries
• A library is a collection of precompiled routines that a program can use.
• In a static library, all library functions that a program requires are made part of an executable, which can make it rather large.
• In a shared library, function code is not directly included in an executable file. Instead, the OS dynamically links a running program to the required routines contained in the shared library.
107
Shared Libraries (continued)
 Shared libraries have two important
advantages:
 Small executable files.
 Several programs running at the same time
can share a single copy of the library code.
108
TCP/IP Networking
 Acronym for
Transmission Control
Protocol/Internet Protocol.
 Consists of a suite of communications
protocols used to connect hosts on the
Internet.
 Allows services such as: e-mail, telnet, ftp
and http.
109
CHAPTER 4
The Netwide
Assembly Language
nasm: The Netwide Assembler
 Free and portable x86 assembler
originally developed by Simon Tatham
and Julian Hall.
 It supports a range of object file formats,
including Linux ELF, NetBSD/FreeBSD,
COFF, Microsoft 16-bit OBJ and Win32.
111
Development Cycle
editor → assembly language file (*.asm) → nasm → object file (*.o) → ld (linker) → ELF executable file
112
ld: The Linker
• An object file isn’t directly executable; it first needs to be fed into a linker (also known as a link-loader or link-editor).
• The linker does the following tasks:
- identifies the initial program entry point (_start label)
- binds symbolic references to memory addresses
- unites all the object and library files
- produces an executable ELF file
113
ELF File
 The Executable and Linkable
Format was designed by the
UNIX System Laboratories.
 Used by contemporary Linux
implementations as its standard
executable file format.
 Supports shared libraries
(dynamic linking).
114
a.out File
• a.out is the default file name given to executable files by UNIX linkers.
• It means “assembler output”, in spite of being linker output!
• On the PDP-7 computer, there was no linker. Executable programs were created directly by the assembler. The name stuck, even when linkers started to appear on newer machines.
115
Building a Program
Editing:
$ vi test.asm
$ ls
test.asm
Assembly:
$ nasm -f elf test.asm
$ ls
test.asm  test.o
Linking:
$ ld -s -o test test.o
$ ls
test  test.asm  test.o
Execution (the ./ prefix is needed, because plain "test" runs the shell's built-in test command):
$ ./test
116
Linux-NASM
Program Skeleton
bits 32                  ; -- 32 bit program

section .data            ; -- Start data segment
    ; put initialized data here

section .bss             ; -- Start bss segment
    ; put non-initialized data here

section .text            ; -- Start code segment
    global _start        ; -- Export "_start" label
_start:                  ; -- Define "_start" label
    ; put program code here
    mov eax, 1           ; -- Exit system call
    mov ebx, 0           ; -- exit code #0
    int 0x80
117
Segments
 A segment on UNIX is a
section of related stuff
in a binary.
 ELF files have three segments:
 TEXT for storing code
 DATA for storing initialized data
 BSS for non-initialized data
118
NASM Source Code
 Every NASM program
source line has the
following four fields:
label: instruction operands ; comment
 Every field is optional.
 The number of operands depends on the
instruction.
119
Instructions
• Mnemonics that represent x86 opcodes.
• Generate code that produces actions at run time.
Pseudo-Instructions
• Not real x86 instructions (they don’t produce any actions at run time).
• Are used in the instruction field because that’s the most convenient place to put them.
120
Directives
 Statements that
allow us to control
how a program is
assembled.
 They only work at
assembly time
(they don’t directly
produce any
machine code).
121
bits
Directive
 Specifies if NASM must produce code that
will run in 16 or 32-bit mode.
 ELF files only support 32-bit mode:
bits 32
 May be omitted for ELF files.
122
section .data Directive
 States the beginning of the initialized data
segment.
 An image of this segment’s data is
physically stored in the executable file.
 This segment contains read/write data.
123
Pseudo-Instructions for the
Data Segment
Pseudo-Instruction   Meaning              Size (bits)
db                   Define byte          8
dw                   Define word          16
dd                   Define double word   32
dq                   Define quadword      64
dt                   Define ten bytes     80
124
section .bss Directive
• States the beginning of the non-initialized data segment.
• Only the size of the data is stored in the executable file. Once the program is loaded into memory, all the data in this section is set to zero.
• This segment contains read/write data.
• BSS means “Block Started by Symbol”, a pseudo-instruction from the old IBM 704 assembler, carried over into UNIX.
125
Pseudo-Instructions for the
BSS Segment
Pseudo-Instruction   Meaning               Size (bits)
resb                 Reserve byte          8
resw                 Reserve word          16
resd                 Reserve double word   32
resq                 Reserve quadword      64
rest                 Reserve ten bytes     80
126
section .text Directive
 States the beginning of the segment that
contains the program’s executable
instructions.
 This segment is read-only.
127
System Calls
 Processes access kernel facilities via the
system call interface.
 System calls are the only way a program
can communicate with the outside world.
 In assembly language, interrupt 0x80 is
used to make system calls.
128
System calls (continued)
[Diagram: Process → system calls (INT 0x80) → Linux Kernel → I/O Devices (display, keyboard, mouse, disks, printer, etc.)]
129
sys_exit
 Terminate current process, return exit
code to caller.
EAX ← 1
EBX ← exit code
INT 0x80
130
sys_read
 Read a number of bytes from a given
input device.
EAX ← 3
EBX ← file descriptor (0 = stdin)
ECX ← buffer address
EDX ← number of bytes to read
INT 0x80
131
sys_write
 Write a number of bytes to a given output
device.
EAX ← 4
EBX ← file descriptor (1 = stdout)
ECX ← buffer address
EDX ← number of bytes to write
INT 0x80
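The same two kernel services can be reached from a high level language. As a rough analogy (not the assembly interface itself), Python's `os.write` and `os.read` are thin wrappers over the write and read system calls, taking a file descriptor plus a buffer or a byte count. The sketch uses a pipe instead of the terminal so that the result is easy to check.

```python
import os

read_fd, write_fd = os.pipe()               # a pair of file descriptors

# Like sys_write: file descriptor and buffer; returns the number of bytes written.
written = os.write(write_fd, b"hello\n")

# Like sys_read: file descriptor and the maximum number of bytes to read.
data = os.read(read_fd, 16)

os.close(read_fd)
os.close(write_fd)
print(written, data)
```

In assembly the same three pieces of information travel in EBX, ECX and EDX, with the call number in EAX selecting which service is wanted.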
132
CHAPTER 5
x86 Integer
Instructions
Condition Codes
Suffix        Meaning                               Flags
O             Overflow                              OF=1
NO            No Overflow                           OF=0
C, B, NAE     Carry, Below, Not Above nor Equal     CF=1
NC, NB, AE    No Carry, Not Below, Above or Equal   CF=0
134
Condition Codes (continued)
Suffix        Meaning                               Flags
Z, E          Zero, Equal                           ZF=1
NZ, NE        Not Zero, Not Equal                   ZF=0
BE, NA        Below or Equal, Not Above             CF=1 OR ZF=1
A, NBE        Above, Not Below nor Equal            CF=0 AND ZF=0
135
Condition Codes (continued)
Suffix        Meaning                               Flags
S             Sign                                  SF=1
NS            Not Sign                              SF=0
P, PE         Parity, Parity Even                   PF=1
NP, PO        Not Parity, Parity Odd                PF=0
136
Condition Codes (continued)
Suffix        Meaning                               Flags
L, NGE        Less, Not Greater nor Equal           SF<>OF
GE, NL        Greater or Equal, Not Less            SF=OF
LE, NG        Less or Equal, Not Greater            ZF=1 OR SF<>OF
G, NLE        Greater, Not Less nor Equal           ZF=0 AND SF=OF
137
Condition Codes (continued)
 Above and Below are used for unsigned
integer comparisons.
 Greater and Less are used for signed
integer comparisons.
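The condition code tables translate directly into predicates over the flags. The Python sketch below lists one suffix per row of the tables (synonyms are noted in comments); the names `CONDITIONS` and `condition_holds` are invented for the example, and the flag values at the end are just sample inputs.

```python
# Each entry maps a condition-code suffix to a test on a dictionary of flags.
CONDITIONS = {
    "o":  lambda f: f["OF"] == 1,
    "no": lambda f: f["OF"] == 0,
    "b":  lambda f: f["CF"] == 1,                          # same as c, nae
    "ae": lambda f: f["CF"] == 0,                          # same as nc, nb
    "e":  lambda f: f["ZF"] == 1,                          # same as z
    "ne": lambda f: f["ZF"] == 0,                          # same as nz
    "be": lambda f: f["CF"] == 1 or f["ZF"] == 1,          # same as na
    "a":  lambda f: f["CF"] == 0 and f["ZF"] == 0,         # same as nbe
    "s":  lambda f: f["SF"] == 1,
    "ns": lambda f: f["SF"] == 0,
    "p":  lambda f: f["PF"] == 1,                          # same as pe
    "np": lambda f: f["PF"] == 0,                          # same as po
    "l":  lambda f: f["SF"] != f["OF"],                    # same as nge
    "ge": lambda f: f["SF"] == f["OF"],                    # same as nl
    "le": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],    # same as ng
    "g":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],   # same as nle
}

def condition_holds(suffix, flags):
    """Return True if the condition code named by suffix is satisfied."""
    return CONDITIONS[suffix.lower()](flags)

# Sample flags: a borrow occurred, but no signed overflow and a non-negative result.
flags = {"CF": 1, "ZF": 0, "SF": 0, "OF": 0, "PF": 0}
print(condition_holds("b", flags), condition_holds("g", flags))
```

With these sample flags both "below" and "greater" hold at once, which is possible only because one is an unsigned reading and the other a signed reading of the same comparison.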
138
Flow Control Instructions
 JMP
 Jcc
 CALL
 RET
139
JMP: jump
• Syntax:
JMP dest
• Operation (absolute jump):
EIP ← dest
• Operation (relative jump):
EIP ← EIP + dest
• Flags affected: none.
140
Unconditional Jumps
 Jmp statement label
 We have two types of jumps,
 Intersegment
 Intrasegment
 Address can be in a register, variable or
label.
141
Unconditional Jumps
• Example:
Start: Mov Ax, 0
       Inc Ax
       Jmp Start
142
Jcc: short jump conditional
• Syntax:
Jcc dest
• Operation:
if(cc)
EIP ← EIP + dest
endif
• Notes: cc is any of the condition codes. dest must be within a signed 8-bit range (-128 to 127).
• Flags affected: none.
143
Jcc: near jump conditional
• Syntax:
Jcc NEAR dest
• Operation:
if(cc)
EIP ← EIP + dest
endif
• Notes: cc is any of the condition codes. dest must be within a signed 32-bit range.
• Flags affected: none.
144
Conditional Jumps
 Dependent on condition codes.
 Example:
JZ: jump if the zero flag is set.
145
Conditional Codes
• Example: code the following routine using assembly language instructions.
Add a value to x;
If x < 0
Then
… (body for negative condition)
Else if x = 0
… (body for zero condition)
Else
… (body for positive condition)
End if
146
Conditional Codes
• Solution:
        add x, eax       ; add a value to x
        jns elseIfZero   ; jump if x is not negative
        …                ; code for negative condition
        jmp endCheck
elseIfZero:
        jnz elsePos      ; jump if x is not zero
        …                ; code for zero condition
        jmp endCheck
elsePos:
        …                ; code for positive condition
endCheck:
147
Comparing Instructions
• CMP op1, op2
• This instruction calculates op1 - op2, like a SUB instruction, but it does not modify the operands; it only modifies the flags register.
• The resulting flag values are then used to decide what to do next.
• We have to decide whether the operands are to be treated as signed or unsigned, because different condition codes apply in each case.
148
Compare Examples
 OP1= 3B
 OP2= 3B
 CF=OF=SF=0
 ZF=1
 OP1==OP2 signed and unsigned
149
Compare Examples
 OP1= 3B
 OP2= 15
 OP1-OP2= 26
 CF=OF=SF=ZF=0
 OP1>OP2 signed and unsigned
150
Compare Examples
 OP1=15
 OP2= F6
 OP1-OP2=1F
 CF=1 – borrow
 SF=OF=ZF=0
 Signed operation = op1>op2
 Unsigned operation =op1 < op2
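The three examples can be verified with a small model of an 8-bit CMP. This is a sketch under the assumption that both operands are given as unsigned byte values (0 to 255); the function name `cmp8` is made up for the example.

```python
def cmp8(op1, op2):
    """Flags left by an 8-bit CMP op1, op2 (op1 - op2 is computed, then discarded)."""
    def signed(x):
        # Interpret a byte value as a two's complement number.
        return x - 0x100 if x & 0x80 else x

    result = (op1 - op2) & 0xFF
    true_signed = signed(op1) - signed(op2)
    return {
        "CF": int(op1 < op2),                        # unsigned borrow
        "ZF": int(result == 0),
        "SF": result >> 7,                           # top bit of the result
        "OF": int(not -128 <= true_signed <= 127),   # signed result does not fit
    }

for op1, op2 in [(0x3B, 0x3B), (0x3B, 0x15), (0x15, 0xF6)]:
    print(f"{op1:02X} - {op2:02X}:", cmp8(op1, op2))
```

In the third case F6 is 246 when read as unsigned but -10 when read as signed, which is why the same subtraction reports "below" for unsigned operands and "greater" for signed ones.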
151
Compare Examples
Legal examples:
cmp eax, 356
cmp value, 03dh
cmp bh, '$'
Illegal example (the first operand cannot be an immediate value):
cmp 1000, total
152
Compare Programming Ex.
 Code the following routine in assembly
language.
If val < 10
Then
add 1 to xcount;
Else
add 1 to ycount;
End if;
153
Compare Programming Ex
• Solution (with val held in ebx):
        cmp ebx, 10      ; value < 10 ?
        jnl Elsey
        inc xcount       ; add 1 to xcount
        jmp endVal
Elsey:
        inc ycount       ; add 1 to ycount
endVal:
154
Programming Ex #2
 Code the following routine in assembly
language:
If (total >= 100) or (count = 10)
Then
add value to total;
End if
155
Programming Ex2 Solution
Cmp total, 100
Jge addValue
Cmp cx, 10
Jne endAddCheck
addValue:
Mov ebx, value
Add total, ebx
endAddCheck:
156
While Loops
While continuation condition loop
…{ body}
end while;
The continuation condition is a boolean
expression.
157
While loop exercise
 Design an assembly language module to
implement the following high level
language instructions.
While (sum < 1000) loop
…{body increment sum}
End while;
158
While Loops Exercise 2
 Design an assembly language module to
implement the following high level
language instructions.
X:=1
twoTox:=1;
While twoTox</number
multiply twoTox by 2;
End while;
Subtract 1 from x;
159
Homework
160
CALL: call subroutine
• Syntax:
CALL dest
• Operation (absolute call):
ESP ← ESP - 4
[ESP] ← EIP
EIP ← dest
• Operation (relative call):
ESP ← ESP - 4
[ESP] ← EIP
EIP ← EIP + dest
• Flags affected: none.
161
RET: return from subroutine
• Syntax:
RET
• Operation:
EIP ← [ESP]
ESP ← ESP + 4
• Flags affected: none.
162
Data Transfer Instructions
 MOV
 CMOVcc
 SETcc
 XCHG
 XLATB
 PUSH
 POP
 PUSHF
 POPF
 PUSHA
 POPA
163
MOV: move data
• Syntax:
MOV dest, src
• Operation:
dest ← src
• Flags affected: none.
164
CMOVcc: conditional move
• Syntax:
CMOVcc dest, src
• Operation:
if(cc)
dest ← src
endif
• Notes: cc is any of the condition codes.
• Flags affected: none.
165
SETcc: set conditional
• Syntax:
SETcc dest
• Operation:
if(cc)
dest ← 1
else
dest ← 0
endif
• Notes: cc is any of the condition codes.
• Flags affected: none.
166
XCHG: exchange data
• Syntax:
XCHG op1, op2
• Operation:
temp ← op1
op1 ← op2
op2 ← temp
• Flags affected: none.
167
XLATB: translate byte
• Syntax:
XLATB
• Operation:
AL ← [EBX + AL]
• Notes: AL is treated as an unsigned byte.
• Flags affected: none.
168
PUSH: push data on stack
• Syntax:
PUSH op
• Operation:
ESP ← ESP - 4
[ESP] ← op
• Flags affected: none.
169
POP: pop data from stack
• Syntax:
POP dest
• Operation:
dest ← [ESP]
ESP ← ESP + 4
• Flags affected: none.
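The PUSH and POP operations can be modeled with a dictionary standing in for memory. This is only a sketch: the class name `Stack` and the starting value of ESP (0x1000) are assumptions made for the example.

```python
class Stack:
    """Minimal model of the x86 stack with 32-bit pushes and pops."""

    def __init__(self, esp=0x1000):
        self.esp = esp      # hypothetical initial stack pointer
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                                # ESP <- ESP - 4
        self.memory[self.esp] = value & 0xFFFFFFFF   # [ESP] <- op

    def pop(self):
        value = self.memory[self.esp]                # dest <- [ESP]
        self.esp += 4                                # ESP <- ESP + 4
        return value

stack = Stack()
stack.push(1)
stack.push(2)
print(hex(stack.esp))             # the stack grew downwards by 8 bytes
print(stack.pop(), stack.pop())   # values come back in reverse order
```

Because ESP decreases on every push, the stack grows towards lower addresses, and the last value pushed is always the first one popped.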
170
PUSHF: push flags register
• Syntax:
PUSHF
• Operation:
ESP ← ESP - 4
[ESP] ← EFLAGS
• Flags affected: none.
171
POPF: pop flags register
 Syntax:
POPF
 Operation:
EFLAGS  [ESP]
ESP
 ESP + 4
X X X X X X X
of df sf zf af pf cf
172
PUSHA: push all registers
• Syntax:
PUSHA
• Operation:
temp ← ESP
ESP ← ESP - 0x20
[ESP + 0x1C] ← EAX
[ESP + 0x18] ← ECX
[ESP + 0x14] ← EDX
[ESP + 0x10] ← EBX
[ESP + 0x0C] ← temp
[ESP + 0x08] ← EBP
[ESP + 0x04] ← ESI
[ESP + 0x00] ← EDI
- - - - - - -
of df sf zf af pf cf
173
POPA: pop all registers
• Syntax:
POPA
• Operation:
EDI ← [ESP + 0x00]
ESI ← [ESP + 0x04]
EBP ← [ESP + 0x08]
EBX ← [ESP + 0x10]
EDX ← [ESP + 0x14]
ECX ← [ESP + 0x18]
EAX ← [ESP + 0x1C]
ESP ← ESP + 0x20
- - - - - - -
of df sf zf af pf cf
174
Flow Control Instructions
• JMP
• Jcc
• CALL
• RET
175
JMP: jump
• Syntax:
JMP dest
• Operation (absolute jump):
EIP ← dest
• Operation (relative jump):
EIP ← EIP + dest
- - - - - - -
of df sf zf af pf cf
176
Jcc: short jump conditional
• Syntax:
Jcc dest
• Operation:
if(cc)
  EIP ← EIP + dest
endif
• Notes: cc is any of the condition codes. dest must be within a signed 8-bit range (-128 to 127).
- - - - - - -
of df sf zf af pf cf
177
Jcc: near jump conditional
• Syntax:
Jcc NEAR dest
• Operation:
if(cc)
  EIP ← EIP + dest
endif
• Notes: cc is any of the condition codes. dest must be within a signed 32-bit range.
- - - - - - -
of df sf zf af pf cf
178
CALL: call subroutine
• Syntax:
CALL dest
• Operation (absolute call):
ESP ← ESP - 4
[ESP] ← EIP
EIP ← dest
• Operation (relative call):
ESP ← ESP - 4
[ESP] ← EIP
EIP ← EIP + dest
- - - - - - -
of df sf zf af pf cf
179
RET: return from subroutine
• Syntax:
RET
• Operation:
EIP ← [ESP]
ESP ← ESP + 4
- - - - - - -
of df sf zf af pf cf
180
Arithmetic Instructions
• CLC
• STC
• CMC
• ADD
• ADC
• INC
• SUB
• SBB
• DEC
• NEG
• CMP
• MUL
• IMUL
• DIV
• IDIV
• CBW
• CWD
• CDQ
• CWDE
• MOVSX
• MOVZX
181
CLC: clear carry flag
• Syntax:
CLC
• Operation:
CF ← 0
- - - - - - 0
of df sf zf af pf cf
182
STC: set carry flag
• Syntax:
STC
• Operation:
CF ← 1
- - - - - - 1
of df sf zf af pf cf
183
CMC: complement carry flag
• Syntax:
CMC
• Operation:
CF ← ~CF
- - - - - - X
of df sf zf af pf cf
184
ADD: add integers
• Syntax:
ADD dest, orig
• Operation:
dest ← dest + orig
X - X X X X X
of df sf zf af pf cf
185
ADD examples
• AX: 0075
• CX: 01A2
add ax, cx
• Results:
• AX: 0217
• CX: 01A2
• SF=ZF=CF=OF=0
186
ADD examples
• AX: 77AC
• CX: 4B35
add ax, cx
• Results:
• AX: C2E1
• CX: 4B35
• SF=OF=1; ZF=CF=0
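The flag values in the two ADD examples can be checked with a small C model. This is only a sketch under stated assumptions: the names add16 and Flags16 are hypothetical helpers introduced here for illustration, and the model covers the 16-bit form only.

```c
#include <stdint.h>

/* Hypothetical container for the result and the four flags shown on the slides. */
typedef struct {
    uint16_t result;
    int sf, zf, cf, of;
} Flags16;

/* Models a 16-bit "ADD dest, orig" and derives SF, ZF, CF and OF. */
Flags16 add16(uint16_t dest, uint16_t orig)
{
    Flags16 r;
    uint32_t wide = (uint32_t)dest + (uint32_t)orig;  /* the carry lands in bit 16 */

    r.result = (uint16_t)wide;
    r.sf = (r.result >> 15) & 1;                      /* sign = most significant bit */
    r.zf = (r.result == 0);
    r.cf = (int)((wide >> 16) & 1);                   /* unsigned overflow */
    /* signed overflow: both operands share a sign that the result does not have */
    r.of = ((~(dest ^ orig) & (dest ^ r.result)) >> 15) & 1;
    return r;
}
```

CF reports overflow for unsigned operands and OF for signed operands, which is why the second example sets OF while leaving CF clear.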
187
ADC: add with carry
• Syntax:
ADC dest, orig
• Operation:
dest ← dest + orig + CF
X - X X X X X
of df sf zf af pf cf
188
INC: increment integer
• Syntax:
INC dest
• Operation:
dest ← dest + 1
X - X X X X -
of df sf zf af pf cf
189
INC examples
• ECX: 00 00 01 A2
inc ecx
• Results:
• ECX: 00 00 01 A3
• SF=ZF=OF=0
190
INC examples
• EDX: 7F FF FF FF
inc edx
• Results:
• EDX: 80 00 00 00
• SF=OF=1; ZF=0
191
SUB: subtract integers
• Syntax:
SUB dest, orig
• Operation:
dest ← dest - orig
X - X X X X X
of df sf zf af pf cf
192
SUB examples
• EAX: 00 00 00 75
• ECX: 00 00 01 A2
sub eax, ecx
• Results:
• EAX: FF FF FE D3
• ECX: 00 00 01 A2
• SF=CF=1; ZF=OF=0 (CF is set because the unsigned subtraction needs a borrow)
193
SUB examples
• DX: FF 20
• Word at Value: FF 20
sub dx, Value
• Results:
• DX: 00 00
• Value: FF 20
• ZF=1; PF=1; the rest are zero.
194
SBB: subtract with borrow
• Syntax:
SBB dest, orig
• Operation:
dest ← dest - orig - CF
X - X X X X X
of df sf zf af pf cf
195
DEC: decrement integer
• Syntax:
DEC dest
• Operation:
dest ← dest - 1
X - X X X X -
of df sf zf af pf cf
196
DEC examples
• BX: 00 01
dec bx
• Results:
• BX: 00 00
• ZF=1; SF=OF=0
197
DEC examples
• AL: F5
dec al
• Results:
• AL: F4
• SF=1; OF=ZF=0
198
NEG: negate
• Syntax:
NEG dest
• Operation:
dest ← - dest
• Notes: Sets CF, unless dest is zero, in which case CF is cleared.
X - X X X X X
of df sf zf af pf cf
199
NEG examples
• BX: 01 A2
neg bx
• Results:
• BX: FE 5E
• SF=1; ZF=0
200
NEG examples
• DH: F5
neg dh
• Results:
• DH: 0B
• SF=ZF=0
201
NEG examples
• EAX: 00 00 00 00
neg eax
• Results:
• EAX: 00 00 00 00
• SF=0; ZF=1
202
CMP: compare integers
• Syntax:
CMP op1, op2
• Operation:
NULL ← op1 - op2
X - X X X X X
of df sf zf af pf cf
203
MUL: unsigned integer multiply
• Syntax:
MUL orig
• Operation:
case(size(orig))
  8: AX ← AL * orig
  16: DX:AX ← AX * orig
  32: EDX:EAX ← EAX * orig
endcase
• Notes: CF and OF are cleared if the high-order half of the result is zero; otherwise they are set. orig cannot be an immediate value.
X - ? ? ? ? X
of df sf zf af pf cf
204
MUL examples
• AX: 00 05
• BX: 00 02
• DX: ?? ??
mul bx
• Results:
• DX: 00 00
• AX: 00 0A
• CF=OF=0
205
MUL examples
• AL: 05
• Byte at Factor: FF
mul Factor
• Results:
• AX: 04 FB
• CF=OF=1
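The 8-bit form of MUL can be modeled in a few lines of C. This is a sketch only; mul8 and Mul8Result are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical result of the 8-bit form: AX <- AL * orig (unsigned). */
typedef struct {
    uint16_t ax;
    int cf_of;      /* CF and OF always receive the same value after MUL */
} Mul8Result;

Mul8Result mul8(uint8_t al, uint8_t orig)
{
    Mul8Result r;
    r.ax = (uint16_t)((unsigned)al * (unsigned)orig);
    r.cf_of = (r.ax >> 8) != 0;   /* set when the high-order half (AH) is not zero */
    return r;
}
```

With AL = 05 and Factor = FF the product is 5 * 255 = 1275 = 0x04FB, so AH is not zero and both flags are set.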
206
IMUL: signed integer multiply
• Syntax #1:
IMUL orig
• Operation:
case(size(orig))
  8: AX ← AL * orig
  16: DX:AX ← AX * orig
  32: EDX:EAX ← EAX * orig
endcase
207
IMUL examples
• AX: 00 05
• BX: 00 02
• DX: ?? ??
imul bx
• Results:
• DX: 00 00
• AX: 00 0A
• CF=OF=0
208
IMUL examples
• AL: 05
• Byte at Factor: FF
imul Factor
• Results:
• AX: FF FB (Factor is read as the signed value -1, so the product is -5)
• CF=OF=0 (AX is the sign extension of AL)
209
IMUL: signed integer multiply (continued)
• Syntax #2:
IMUL dest, orig
• Operation:
dest ← dest * orig
X - ? ? ? ? X
of df sf zf af pf cf
210
IMUL examples
• EBX: 00 00 00 0A
imul ebx, 10
• Note: the source may be an immediate value.
• Results:
• EBX: 00 00 00 64
• CF=OF=0
211
IMUL: signed integer multiply (continued)
• Syntax #3:
IMUL dest, orig, const
• Operation:
dest ← orig * const
• Notes: CF and OF are cleared if the result is the same size as the multiplicand.
X - ? ? ? ? X
of df sf zf af pf cf
212
IMUL examples
• Word at Value: 08 F2
• BX: ?? ??
imul bx, Value, 1000
• Results:
• BX: F1 50
• CF=OF=1
213
DIV: unsigned integer divide
• Syntax:
DIV orig
• Operation:
case(size(orig))
  8: AL ← AX / orig
     AH ← AX % orig
  16: AX ← DX:AX / orig
      DX ← DX:AX % orig
  32: EAX ← EDX:EAX / orig
      EDX ← EDX:EAX % orig
endcase
? - ? ? ? ? ?
of df sf zf af pf cf
214
DIV
Source (divisor)   Other operand (dividend)   Quotient   Remainder
byte               AX                         AL         AH
word               DX:AX                      AX         DX
double word        EDX:EAX                    EAX        EDX
215
DIV examples
• EDX: 00 00 00 00
• EAX: 00 00 00 64
• EBX: 00 00 00 0D
div ebx (100/13)
• Results:
• EDX: 00 00 00 09
• EAX: 00 00 00 07
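The 32-bit form of DIV can be sketched in C by treating EDX:EAX as one 64-bit dividend. The names div32 and Div32Result are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical pair of output registers for the 32-bit form of DIV. */
typedef struct {
    uint32_t eax;   /* quotient */
    uint32_t edx;   /* remainder */
} Div32Result;

/* Models "DIV orig" with a 32-bit divisor. The caller must guarantee that
   orig is not zero and that the quotient fits in 32 bits; the real
   instruction raises a divide error in either of those cases. */
Div32Result div32(uint32_t edx, uint32_t eax, uint32_t orig)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;   /* EDX:EAX */
    Div32Result r;
    r.eax = (uint32_t)(dividend / orig);
    r.edx = (uint32_t)(dividend % orig);
    return r;
}
```

Clearing EDX before the division, as the example does, is what makes the 64-bit dividend equal to the unsigned value held in EAX.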
216
IDIV: signed integer divide
• Syntax:
IDIV orig
• Operation:
case(size(orig))
  8: AL ← AX / orig
     AH ← AX % orig
  16: AX ← DX:AX / orig
      DX ← DX:AX % orig
  32: EAX ← EDX:EAX / orig
      EDX ← EDX:EAX % orig
endcase
? - ? ? ? ? ?
of df sf zf af pf cf
217
CBW: convert byte to word
• Syntax:
CBW
• Operation:
AX ← SignExtend(AL)
- - - - - - -
of df sf zf af pf cf
218
CBW examples
• AL: 53
cbw
• Results:
• AX: 00 53
219
CBW examples
• AL: C6
cbw
• Results:
• AX: FF C6
220
CWD: convert word to dword
• Syntax:
CWD
• Operation:
DX:AX ← SignExtend(AX)
- - - - - - -
of df sf zf af pf cf
221
CWD example
• AX: 07 0D
• DX: ?? ??
cwd
• Results:
• DX: 00 00
• AX: 07 0D
222
CDQ: convert dword to qword
• Syntax:
CDQ
• Operation:
EDX:EAX ← SignExtend(EAX)
- - - - - - -
of df sf zf af pf cf
223
CDQ example
• EAX: FF FF FA 13
• EDX: ?? ?? ?? ??
cdq
• Results:
• EDX: FF FF FF FF
• EAX: FF FF FA 13
224
CWDE: convert word to dword extended
• Syntax:
CWDE
• Operation:
EAX ← SignExtend(AX)
- - - - - - -
of df sf zf af pf cf
225
CWDE example
• AX: FF 2A
cwde
• Results:
• EAX: FF FF FF 2A
226
MOVSX: move data with sign extend
• Syntax:
MOVSX dest, orig
• Operation:
dest ← SignExtend(orig)
• Notes: orig must be smaller than dest.
- - - - - - -
of df sf zf af pf cf
227
MOVSX examples
• Word at Value: 07 0D
movsx ecx, Value
• Results:
• ECX: 00 00 07 0D
228
MOVSX examples
• Word at Value: F7 0D
movsx ecx, Value
• Results:
• ECX: FF FF F7 0D
229
MOVZX: move data with zero extend
• Syntax:
MOVZX dest, orig
• Operation:
dest ← ZeroExtend(orig)
• Notes: orig must be smaller than dest.
- - - - - - -
of df sf zf af pf cf
230
MOVZX examples
• Word at Value: 07 0D
movzx ecx, Value
• Results:
• ECX: 00 00 07 0D
231
MOVZX examples
• Word at Value: F7 0D
movzx ecx, Value
• Results:
• ECX: 00 00 F7 0D
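The difference between sign extension and zero extension can be reproduced in C. This is a sketch for the word-to-dword case only; sign_extend16 and zero_extend16 are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* MOVSX-like: copies of bit 15 fill the upper 16 bits of the result. */
uint32_t sign_extend16(uint16_t value)
{
    return (value & 0x8000u) ? (0xFFFF0000u | value) : (uint32_t)value;
}

/* MOVZX-like: the upper 16 bits of the result are always zero. */
uint32_t zero_extend16(uint16_t value)
{
    return (uint32_t)value;
}
```

Both functions agree whenever bit 15 of the source is clear, which matches the 07 0D examples; they differ only for values such as F7 0D.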
232
Logical and Bitwise Instructions
• AND
• OR
• XOR
• NOT
• TEST
• SHL
• SHR
• SAR
• ROL
• ROR
• RCL
• RCR
233
AND: bitwise and
• Syntax:
AND dest, orig
• Operation:
dest ← dest & orig
• Notes:
0&0=0
0&1=0
1&0=0
1&1=1
0 - X X ? X 0
of df sf zf af pf cf
234
OR: bitwise or
• Syntax:
OR dest, orig
• Operation:
dest ← dest | orig
• Notes:
0|0=0
0|1=1
1|0=1
1|1=1
0 - X X ? X 0
of df sf zf af pf cf
235
XOR: bitwise xor
• Syntax:
XOR dest, orig
• Operation:
dest ← dest ^ orig
• Notes:
0^0=0
0^1=1
1^0=1
1^1=0
0 - X X ? X 0
of df sf zf af pf cf
236
NOT: bitwise not
• Syntax:
NOT dest
• Operation:
dest ← ~dest
• Notes:
~0 = 1
~1 = 0
NOT does not modify any flag.
- - - - - - -
of df sf zf af pf cf
237
TEST: test bits
• Syntax:
TEST op1, op2
• Operation:
NULL ← op1 & op2
0 - X X ? X 0
of df sf zf af pf cf
238
SHL: shift left
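• Syntax:
SHL dest, count
• Operation:
CF ← [msb ... lsb] ← 0
(the bit leaving the msb goes to CF; a 0 enters at the lsb)
? - X X ? X X
of df sf zf af pf cf
239
SHR: shift right
• Syntax:
SHR dest, count
• Operation:
0 → [msb ... lsb] → CF
(a 0 enters at the msb; the bit leaving the lsb goes to CF)
? - X X ? X X
of df sf zf af pf cf
240
SAR: shift arithmetic right
• Syntax:
SAR dest, count
• Operation:
msb → [msb ... lsb] → CF
(the msb keeps its value, so the sign is preserved; the bit leaving the lsb goes to CF)
? - X X ? X X
of df sf zf af pf cf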
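The difference between the logical and the arithmetic right shift can be seen in a byte-sized C model. This is a sketch that is valid only for counts from 0 to 7; shr8 and sar8 are hypothetical names, and the CF output is left out to keep it short.

```c
#include <stdint.h>

/* SHR-like: zeros enter at the msb. Valid for counts 0..7. */
uint8_t shr8(uint8_t value, unsigned count)
{
    return (uint8_t)(value >> count);
}

/* SAR-like: copies of the original msb enter at the msb. Valid for counts 0..7. */
uint8_t sar8(uint8_t value, unsigned count)
{
    uint8_t shifted = (uint8_t)(value >> count);
    if (value & 0x80u) {
        /* fill the vacated high-order bits with ones */
        shifted |= (uint8_t)(0xFFu << (8 - count));
    }
    return shifted;
}
```

For the byte F4 (-12 as a signed value) a shift by 2 gives 3D with SHR but FD (-3) with SAR, so SAR behaves as a signed division by a power of two that rounds toward negative infinity.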
241
ROL: rotate left
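• Syntax:
ROL dest, count
• Operation:
CF ← [msb ... lsb] ← msb
(the bit leaving the msb re-enters at the lsb and is also copied to CF)
? - - - - - X
of df sf zf af pf cf
242
ROR: rotate right
• Syntax:
ROR dest, count
• Operation:
lsb → [msb ... lsb] → CF
(the bit leaving the lsb re-enters at the msb and is also copied to CF)
? - - - - - X
of df sf zf af pf cf
243
RCL: rotate through carry left
• Syntax:
RCL dest, count
• Operation:
CF ← [msb ... lsb] ← CF
(CF takes part in the rotation as an extra bit: the old CF enters at the lsb and the bit leaving the msb becomes the new CF)
? - - - - - X
of df sf zf af pf cf
244
RCR: rotate through carry right
• Syntax:
RCR dest, count
• Operation:
CF → [msb ... lsb] → CF
(the old CF enters at the msb and the bit leaving the lsb becomes the new CF)
? - - - - - X
of df sf zf af pf cf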
245
String Instructions
• CLD
• STD
• REP STOSB
• REP STOSW
• REP STOSD
• REP MOVSB
• REP MOVSW
• REP MOVSD
246
CLD: clear direction flag
• Syntax:
CLD
• Operation:
DF ← 0
- 0 - - - - -
of df sf zf af pf cf
247
STD: set direction flag
• Syntax:
STD
• Operation:
DF ← 1
- 1 - - - - -
of df sf zf af pf cf
248
REP STOSB: repeat store string byte
• Syntax:
REP STOSB
• Operation:
while(ECX <> 0)
  [EDI] ← AL
  if(DF = 0)
    EDI ← EDI + 1
  else
    EDI ← EDI - 1
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
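The REP STOSB pseudocode maps almost line by line onto a C loop. This is a sketch only: rep_stosb is a hypothetical name, and its parameters stand in for the EDI, AL and ECX registers and the DF flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Models REP STOSB. Returns the final value of EDI; ECX is zero when the
   loop ends. The caller must make sure every address touched is valid. */
uint8_t *rep_stosb(uint8_t *edi, uint8_t al, size_t ecx, int df)
{
    while (ecx != 0) {
        *edi = al;          /* [EDI] <- AL */
        if (df == 0)
            edi++;          /* forward direction */
        else
            edi--;          /* backward direction */
        ecx--;
    }
    return edi;
}
```

With DF = 0 the effect on memory is the same as filling ecx bytes with the value in AL, which is why a buffer fill is the typical use of this instruction.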
249
REP STOSW: repeat store string word
• Syntax:
REP STOSW
• Operation:
while(ECX <> 0)
  [EDI] ← AX
  if(DF = 0)
    EDI ← EDI + 2
  else
    EDI ← EDI - 2
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
250
REP STOSD: repeat store string dword
• Syntax:
REP STOSD
• Operation:
while(ECX <> 0)
  [EDI] ← EAX
  if(DF = 0)
    EDI ← EDI + 4
  else
    EDI ← EDI - 4
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
251
REP MOVSB: repeat move string byte
• Syntax:
REP MOVSB
• Operation:
while(ECX <> 0)
  BYTE [EDI] ← BYTE [ESI]
  if(DF = 0)
    ESI ← ESI + 1
    EDI ← EDI + 1
  else
    ESI ← ESI - 1
    EDI ← EDI - 1
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
252
REP MOVSW: repeat move string word
• Syntax:
REP MOVSW
• Operation:
while(ECX <> 0)
  WORD [EDI] ← WORD [ESI]
  if(DF = 0)
    ESI ← ESI + 2
    EDI ← EDI + 2
  else
    ESI ← ESI - 2
    EDI ← EDI - 2
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
253
REP MOVSD: repeat move string dword
• Syntax:
REP MOVSD
• Operation:
while(ECX <> 0)
  DWORD [EDI] ← DWORD [ESI]
  if(DF = 0)
    ESI ← ESI + 4
    EDI ← EDI + 4
  else
    ESI ← ESI - 4
    EDI ← EDI - 4
  endif
  ECX ← ECX - 1
endwhile
- - - - - - -
of df sf zf af pf cf
254
CHAPTER 6
Mixing C and
Assembly Language
Modularization
• Most programs consist of a number of separate parts, called modules.
• Source modules are separately edited and compiled or assembled in order to produce the corresponding object modules.
• All the object modules are linked together to produce an executable program.
256
Modularization
(continued)
[Diagram: source modules (*.asm, *.c) are translated by nasm and gcc into object modules (*.o); ld (linker) combines the object modules with the standard C library and the start file crt0.o to produce an ELF executable file.]
257
Exporting & Importing
Names in Assembly
Language
• Any assembly language label may be exported to other modules using the global directive.
258
Exporting & Importing Names in
Assembly Language (continued)
• The global directive must appear before the definition of the corresponding symbol.
• If a module exports a certain label, any other module may import it.
• To import a label, the extern directive must be used.
• A label cannot be defined and declared extern in the same module.
259
Assembly Export/Import Example
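module1.asm:

bits 32

section .data
global alpha            ; export alpha to other modules
extern beta             ; beta is defined in module2.asm
alpha   dd 500

section .text
global _start           ; entry point used by the linker
extern func             ; func is defined in module2.asm
_start:
        inc     dword [alpha]
        inc     byte [beta]
        call    func
        mov     eax, 1          ; Linux system call 1: exit
        mov     ebx, 0          ; exit status
        int     0x80

module2.asm:

bits 32

section .data
global beta             ; export beta to other modules
extern alpha            ; alpha is defined in module1.asm
beta    db 10

section .text
global func
func:
        xor     eax, eax
        mov     al, [beta]
        add     [alpha], eax
        ret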
260
Assembly Export/Import Example (continued)
• Building the program:
$ nasm -f elf module1.asm
$ nasm -f elf module2.asm
$ ld -s module1.o module2.o -o program
$ ls
module1.asm
module2.asm
module1.o
module2.o
program
261
Exporting & Importing
Names in ANSI C
• By default, all function names and global variables are exportable to other modules.
• If a name should be kept local to a module, it must be declared static.
262
Exporting & Importing
Names in ANSI C (continued)
• To indicate that a name may be defined in some other module, the extern modifier must be used in the variable or function prototype declaration.
• The extern modifier is optional in function prototype declarations.
• It is not an error to declare a name extern and to have it defined as well in the same module.
263
ANSI C Export/Import Example
int x;              /* defines an exportable variable */
static int y;       /* defines a local module variable */

/* import x if not defined in this module */
extern int x;

/* import h if not defined in this module */
extern int h(int, int);

int f(int a, int b) /* defines an exportable function */
{
    return a + b;
}

static int g(int c) /* defines a local module function */
{
    return c + c;
}
264
x86 and GCC Data types
GCC Data Type   Size in bytes   Assembly Language Equivalent
char            1               byte
short           2               word
int             4               dword
long            4               dword
long long       8               qword
float           4               dword
double          8               qword
long double     10              tword
void *          4               dword
265
Register Usage
• Functions return their values in the following registers:
• AL for char
• AX for short
• EAX for int, long and void *
• EDX:EAX for long long
• ST0 for floating point
266
Register Usage (continued)
• Registers EAX, ECX, EDX (not EBX) may be changed by the function; all other registers must be saved and restored.
• Flags may be changed by the procedure with the following restriction: the direction flag is 0 by default. The direction flag may be set temporarily, but must be cleared before any call or return.
267
Passing Parameters
• The parameters received by a C function, or a C-callable assembly language subroutine, are passed through the stack.
• Parameters are pushed onto the stack in reverse order, that is, from right to left.
• This means that the first parameter is always the nearest to the top of the stack.
268
Passing Parameters (continued)
• After the parameters are pushed onto the stack, a CALL instruction to the desired function or subroutine is executed.
• When the function or subroutine returns, the parameters are still on the stack and must be removed by the caller. This may be done using POP instructions or by adjusting the ESP register directly through an ADD instruction.
269
Subroutine Prologue
• The first two instructions in a C-callable subroutine that receives arguments should be:
push ebp
mov ebp, esp
• This saves the EBP value, so that it can now point to the current top of stack.
270
Subroutine Prologue (continued)
• After this prologue, the stack has the following layout:
EBP+n ... EBP+8   Subroutine parameters
EBP+4             CALL return address
EBP (= ESP)       Original value of EBP
271
Subroutine Epilogue
• In order to undo the subroutine prologue, the following instructions must be the last in a C-callable subroutine:
pop ebp
ret
272
CHAPTER 8
Floating Point
Instructions
FPU: Floating Point Unit
• The FPU (Intel x87) is used for mathematical computations that require floating point numbers.
• Uses the IEEE 754 standard for floating point numbers.
• Works in parallel with the other x86 units.
274
FPU Registers
• CPU and FPU have separate sets of registers, mutually inaccessible.
• The FPU has a stack of eight 80-bit registers.
• The register at the top of the stack is called ST0, the one below is ST1 and so on.
• All values in the FPU registers are stored as real extended numbers (80-bit). All computations take place using this precision.
275
FPU Registers (continued)
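[Diagram: the FPU register stack st0 to st7. Each 80-bit register holds the sign in bit 79, the exponent in the bits between the sign and bit 63, and the mantissa in bits 63 to 0.]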
276
x87 Data Types
• The values contained in the FPU registers may be converted to and from the following data types:
x87 Data Type             Number of Bytes   NASM Type   ANSI C Type
word integer              2                 word        short
short integer             4                 dword       int
long integer              8                 qword       long long
packed BCD integer        10                tword       not available
single precision real     4                 dword       float
double precision real     8                 qword       double
extended precision real   10                tword       long double
• The long long type is a GCC extension to ANSI C.
277
FPU Operations
• Most FPU operations involve pushing and popping values to and from the register stack.
• When a value is pushed to the stack, register ST0 becomes ST1, ST1 becomes ST2 and so on, thus making space in ST0 for the pushed value.
278
FPU Operations (continued)
• The opposite occurs when the stack is popped: ST1 becomes ST0, ST2 becomes ST1 and so on.
• Instructions that refer to memory usually require a size prefix: word, dword, qword or tword.
279
Using FPU Instructions
1. Reset FPU (FINIT).
2. Copy data from memory into FPU registers.
3. Process data.
4. Copy data from FPU registers back into memory.
280
Types of FPU Operations
• Real Transfers
• Integer Transfers
• Packed BCD Transfers
• Loading Constants
• Addition
• Normal Subtraction
• Reversed Subtraction
• Multiplication
• Normal Division
• Reversed Division
• Transcendental Instructions
• Comparisons
• Miscellaneous Operations
281
Types of FPU Operations (continued)
• Descriptions of most FPU operations can be found in the FPU Operation Tables.
282
CHAPTER 9
SIMD Instructions
Data Transfer Instructions
• MOVD
• MOVQ
284
MOVD: move dword
• Syntax:
MOVD dest, orig
• Operation:
dest ← orig
• Notes: dest and orig may be MMX registers, memory locations or 32-bit integer registers. When the destination operand is an MMX register, the 32-bit source value is written to the low-order 32 bits of the 64-bit MMX register and zero-extended to 64 bits. When the source operand is an MMX register, the low-order 32 bits of the MMX register are written to the 32-bit integer register or 32-bit memory location selected with the destination operand.
285
MOVQ: move qword
• Syntax:
MOVQ dest, orig
• Operation:
dest ← orig
• Notes: orig and dest can be either an MMX register or a memory location; however, data cannot be transferred from one memory location to another memory location.
286
Arithmetic Instructions
• PADDB
• PADDW
• PADDD
• PADDSB
• PADDSW
• PADDUSB
• PADDUSW
• PSUBB
• PSUBW
• PSUBD
• PSUBSB
• PSUBSW
• PSUBUSB
• PSUBUSW
287
Arithmetic Instructions (continued)
• PMULLW
• PMULHW
• PMADDWD
288
Data Range Limits for
Saturation
Data Type       Lower Limit               Upper Limit
                Decimal    Hexadecimal    Decimal    Hexadecimal
Signed Byte     -128       0x80           127        0x7F
Signed Word     -32,768    0x8000         32,767     0x7FFF
Unsigned Byte   0          0x00           255        0xFF
Unsigned Word   0          0x0000         65,535     0xFFFF
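The limits in the table can be illustrated with one lane of each kind of packed byte addition. This is a sketch only; the three function names are hypothetical and each models a single byte lane rather than a whole MMX register.

```c
#include <stdint.h>

/* One lane of signed saturated byte addition (PADDSB-like). */
int8_t add_sat_s8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;
    if (sum > 127)
        return 127;          /* clamp to the upper limit 0x7F */
    if (sum < -128)
        return -128;         /* clamp to the lower limit 0x80 */
    return (int8_t)sum;
}

/* One lane of unsigned saturated byte addition (PADDUSB-like). */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (sum > 255u) ? (uint8_t)255u : (uint8_t)sum;
}

/* One lane of truncated byte addition (PADDB-like): the result wraps around. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

Adding 200 and 100 gives 255 with unsigned saturation but 44 with truncation, which is the kind of wrap-around artifact that saturated arithmetic avoids in image and audio processing.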
289
PADDB: packed truncated byte addition
• Syntax:
PADDB dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 8 bytes; each sum is truncated to 8 bits.
290
PADDW: packed truncated word addition
• Syntax:
PADDW dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 4 words; each sum is truncated to 16 bits.
291
PADDD: packed truncated dword addition
• Syntax:
PADDD dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 2 dwords; each sum is truncated to 32 bits.
292
PADDSB: packed signed saturated byte addition
• Syntax:
PADDSB dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 8 bytes, with signed saturation.
293
PADDSW: packed signed saturated word addition
• Syntax:
PADDSW dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 4 words, with signed saturation.
294
PADDUSB: packed unsigned saturated byte addition
• Syntax:
PADDUSB dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 8 bytes, with unsigned saturation.
295
PADDUSW: packed unsigned saturated word addition
• Syntax:
PADDUSW dest, orig
• Operation:
dest[i] ← dest[i] + orig[i] for each of the 4 words, with unsigned saturation.
296
PSUBB: packed truncated byte subtraction
• Syntax:
PSUBB dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 8 bytes; each difference is truncated to 8 bits.
297
PSUBW: packed truncated word subtraction
• Syntax:
PSUBW dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 4 words; each difference is truncated to 16 bits.
298
PSUBD: packed truncated dword subtraction
• Syntax:
PSUBD dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 2 dwords; each difference is truncated to 32 bits.
299
PSUBSB: packed signed saturated byte subtraction
• Syntax:
PSUBSB dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 8 bytes, with signed saturation.
300
PSUBSW: packed signed saturated word subtraction
• Syntax:
PSUBSW dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 4 words, with signed saturation.
301
PSUBUSB: packed unsigned saturated byte subtraction
• Syntax:
PSUBUSB dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 8 bytes, with unsigned saturation.
302
PSUBUSW: packed unsigned saturated word subtraction
• Syntax:
PSUBUSW dest, orig
• Operation:
dest[i] ← dest[i] - orig[i] for each of the 4 words, with unsigned saturation.
303
PMULLW: packed multiply low word (signed)
• Syntax:
PMULLW dest, orig
• Operation:
dest[i] ← low-order 16 bits of (dest[i] * orig[i]) for each of the 4 words.
304
PMULHW: packed multiply high word (signed)
• Syntax:
PMULHW dest, orig
• Operation:
dest[i] ← high-order 16 bits of (dest[i] * orig[i]) for each of the 4 words.
305
PMADDWD: packed multiply and add (signed)
• Syntax:
PMADDWD dest, orig
• Operation:
The 4 words of dest are multiplied by the corresponding words of orig, and adjacent 32-bit products are added:
low dword of dest ← dest[0] * orig[0] + dest[1] * orig[1]
high dword of dest ← dest[2] * orig[2] + dest[3] * orig[3]
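The multiply-and-add operation can be written out in C for clarity. This is a sketch only: pmaddwd here is a hypothetical C function, not the instruction itself, and it returns the two result dwords as raw 32-bit patterns.

```c
#include <stdint.h>

/* Models PMADDWD on two 64-bit operands, each seen as four signed words.
   out[0] is the low dword and out[1] the high dword of the new dest,
   stored as two's complement bit patterns. */
void pmaddwd(const int16_t dest[4], const int16_t orig[4], uint32_t out[2])
{
    int64_t low  = (int64_t)dest[0] * orig[0] + (int64_t)dest[1] * orig[1];
    int64_t high = (int64_t)dest[2] * orig[2] + (int64_t)dest[3] * orig[3];

    out[0] = (uint32_t)low;    /* keep the low-order 32 bits of each sum */
    out[1] = (uint32_t)high;
}
```

Each result dword is a two-term dot product, so a longer dot product of 16-bit vectors can be built by accumulating these partial sums.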
306
Logical Instructions
• PAND
• POR
• PXOR
• PANDN
307
PAND: bitwise qword and
• Syntax:
PAND dest, orig
• Operation:
dest ← dest & orig
308
POR: bitwise qword or
• Syntax:
POR dest, orig
• Operation:
dest ← dest | orig
309
PXOR: bitwise qword xor
• Syntax:
PXOR dest, orig
• Operation:
dest ← dest ^ orig
310
PANDN: bitwise qword and/not
• Syntax:
PANDN dest, orig
• Operation:
dest ← ~dest & orig
311
Shift Instructions
• PSLLW
• PSLLD
• PSLLQ
• PSRLW
• PSRLD
• PSRLQ
• PSRAW
• PSRAD
312
PSLLW: packed word logical shift left
• Syntax:
PSLLW dest, orig
• Operation:
dest[i] ← dest[i] << orig for each of the 4 words.
313
PSLLD: packed dword logical shift left
• Syntax:
PSLLD dest, orig
• Operation:
dest[i] ← dest[i] << orig for each of the 2 dwords.
314
PSLLQ: packed qword logical shift left
• Syntax:
PSLLQ dest, orig
• Operation:
dest ← dest << orig
315
PSRLW: packed word logical (unsigned) shift right
• Syntax:
PSRLW dest, orig
• Operation:
dest[i] ← dest[i] >> orig for each of the 4 words; zeros enter at the msb.
316
PSRLD: packed dword logical (unsigned) shift right
• Syntax:
PSRLD dest, orig
• Operation:
dest[i] ← dest[i] >> orig for each of the 2 dwords; zeros enter at the msb.
317
PSRLQ: packed qword logical (unsigned) shift right
• Syntax:
PSRLQ dest, orig
• Operation:
dest ← dest >> orig; zeros enter at the msb.
318
PSRAW: packed word arithmetic (signed) shift right
• Syntax:
PSRAW dest, orig
• Operation:
dest[i] ← dest[i] >> orig for each of the 4 words; the sign bit is replicated.
319
PSRAD: packed dword arithmetic (signed) shift right
• Syntax:
PSRAD dest, orig
• Operation:
dest[i] ← dest[i] >> orig for each of the 2 dwords; the sign bit is replicated.
320
Comparison Instructions
• PCMPEQB
• PCMPEQW
• PCMPEQD
• PCMPGTB
• PCMPGTW
• PCMPGTD
321
PCMPEQB: packed compare for equal bytes
• Syntax:
PCMPEQB dest, orig
• Operation:
dest[i] ← (dest[i] == orig[i]) for each of the 8 bytes.
All ones if true, all zeros if false.
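The "all ones or all zeros" result can be reproduced lane by lane in C. This is a sketch only; pcmpeqb is a hypothetical C function that treats a uint64_t as an MMX register.

```c
#include <stdint.h>

/* Models PCMPEQB on 64-bit operands: each byte lane of the result is
   0xFF where the lanes are equal and 0x00 where they differ. */
uint64_t pcmpeqb(uint64_t dest, uint64_t orig)
{
    uint64_t mask = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t d = (uint8_t)(dest >> (8 * i));
        uint8_t s = (uint8_t)(orig >> (8 * i));
        if (d == s)
            mask |= (uint64_t)0xFF << (8 * i);
    }
    return mask;
}
```

Because the result is a bit mask rather than a flag, it can be fed straight into PAND and PANDN to select elements without any conditional jump.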
322
PCMPEQW: packed compare for equal words
• Syntax:
PCMPEQW dest, orig
• Operation:
dest[i] ← (dest[i] == orig[i]) for each of the 4 words.
All ones if true, all zeros if false.
323
PCMPEQD: packed compare for equal dwords
• Syntax:
PCMPEQD dest, orig
• Operation:
dest[i] ← (dest[i] == orig[i]) for each of the 2 dwords.
All ones if true, all zeros if false.
324
PCMPGTB: packed compare for greater than bytes (signed)
• Syntax:
PCMPGTB dest, orig
• Operation:
dest[i] ← (dest[i] > orig[i]) for each of the 8 bytes.
All ones if true, all zeros if false.
325
PCMPGTW: packed compare for greater than words (signed)
• Syntax:
PCMPGTW dest, orig
• Operation:
dest[i] ← (dest[i] > orig[i]) for each of the 4 words.
All ones if true, all zeros if false.
326
PCMPGTD: packed compare for greater than dwords (signed)
• Syntax:
PCMPGTD dest, orig
• Operation:
dest[i] ← (dest[i] > orig[i]) for each of the 2 dwords.
All ones if true, all zeros if false.
327
Conversion Instructions
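• PACKSSWB
• PACKSSDW
• PACKUSWB
• PUNPCKLBW
• PUNPCKLWD
• PUNPCKLDQ
• PUNPCKHBW
• PUNPCKHWD
• PUNPCKHDQ
328
PACKSSWB: pack words into bytes with signed saturation
• Syntax:
PACKSSWB dest, orig
• Operation:
The 4 signed words of dest and the 4 signed words of orig are converted to signed bytes with saturation. The bytes from dest fill the low-order half of the result and the bytes from orig fill the high-order half.
329
PACKSSDW: pack dwords into words with signed saturation
• Syntax:
PACKSSDW dest, orig
• Operation:
The 2 signed dwords of dest and the 2 signed dwords of orig are converted to signed words with saturation. The words from dest fill the low-order half of the result and the words from orig fill the high-order half.
330
PACKUSWB: pack words into bytes with unsigned saturation
• Syntax:
PACKUSWB dest, orig
• Operation:
The 4 signed words of dest and the 4 signed words of orig are converted to unsigned bytes with saturation. The bytes from dest fill the low-order half of the result and the bytes from orig fill the high-order half.
331
PUNPCKLBW: unpack low packed bytes
• Syntax:
PUNPCKLBW dest, orig
• Operation:
The 4 low-order bytes of dest and orig are interleaved. Result bytes, from low to high: dest[0], orig[0], dest[1], orig[1], dest[2], orig[2], dest[3], orig[3].
332
PUNPCKLWD: unpack low packed words
• Syntax:
PUNPCKLWD dest, orig
• Operation:
The 2 low-order words of dest and orig are interleaved. Result words, from low to high: dest[0], orig[0], dest[1], orig[1].
333
PUNPCKLDQ: unpack low packed dwords
• Syntax:
PUNPCKLDQ dest, orig
• Operation:
The low-order dwords of dest and orig are combined. Result dwords, from low to high: dest[0], orig[0].
334
PUNPCKHBW: unpack high packed bytes
• Syntax:
PUNPCKHBW dest, orig
• Operation:
The 4 high-order bytes of dest and orig are interleaved. Result bytes, from low to high: dest[4], orig[4], dest[5], orig[5], dest[6], orig[6], dest[7], orig[7].
335
PUNPCKHWD: unpack high packed words
• Syntax:
PUNPCKHWD dest, orig
• Operation:
The 2 high-order words of dest and orig are interleaved. Result words, from low to high: dest[2], orig[2], dest[3], orig[3].
336
PUNPCKHDQ: unpack high packed dwords
• Syntax:
PUNPCKHDQ dest, orig
• Operation:
The high-order dwords of dest and orig are combined. Result dwords, from low to high: dest[1], orig[1].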
337
Empty MMX State Instruction
• EMMS
338
EMMS: empty MMX state
• Syntax:
EMMS
• Notes: Should be used at the end of a sequence of MMX instructions in order to allow subsequent FPU instructions.
339
CHAPTER 10
Interrupt Handling
Interrupting Program Execution
• An interrupt is an asynchronous event that is typically triggered by hardware (I/O device).
• An exception is a synchronous event that is generated when the processor detects one or more predefined conditions while executing an instruction.
341
Interrupting Program Execution (continued)
• When an interrupt or exception is signaled, the processor halts execution of the current task and switches to a handler procedure that has been written specifically to handle the interrupt or exception condition.
342
Interrupting Program Execution (continued)
• The processor accesses the handler procedure through an entry in the interrupt descriptor table (IDT).
• When the handler has completed handling the interrupt or exception, program control is returned to the interrupted task.
343
Interrupt Descriptor Table
• The IDT comprises up to 256 8-byte gate descriptors.
• A gate is the mechanism that allows a task to execute code in a different privilege level.
• Each gate descriptor contains the segment selector, offset and privilege level of its corresponding handler procedure.
• The address and size of the IDT are stored in the 48-bit Interrupt Descriptor Table Register (IDTR).
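The contents of a gate descriptor can be pictured as a C structure. This is a sketch under stated assumptions: the field layout follows the 32-bit interrupt gate format described in the Intel architecture manuals rather than anything shown on these slides, and the names GateDescriptor, gate_offset and gate_dpl are hypothetical.

```c
#include <stdint.h>

/* Assumed layout of one 8-byte gate descriptor (32-bit protected mode). */
typedef struct {
    uint16_t offset_low;    /* handler offset, bits 0..15 */
    uint16_t selector;      /* segment selector of the handler's code segment */
    uint8_t  reserved;      /* not used by interrupt and trap gates */
    uint8_t  type_attr;     /* present bit, privilege level and gate type */
    uint16_t offset_high;   /* handler offset, bits 16..31 */
} GateDescriptor;

/* Rebuilds the 32-bit handler offset from its two halves. */
uint32_t gate_offset(const GateDescriptor *g)
{
    return ((uint32_t)g->offset_high << 16) | g->offset_low;
}

/* Extracts the descriptor privilege level (bits 5..6 of type_attr). */
unsigned gate_dpl(const GateDescriptor *g)
{
    return (g->type_attr >> 5) & 3u;
}
```

Splitting the offset into two halves is a leftover of the 16-bit descriptor format; the handler address is only usable once both halves are joined.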
344
Interrupt Descriptor Table Register
[Diagram: the 48-bit IDTR holds the 32-bit IDT base address in bits 47 to 16 and the IDT limit in bits 15 to 0. The base address points to a table of 8-byte descriptors (gate for interrupt 0, gate for interrupt 1, ..., gate for interrupt n); each gate leads to the handler procedure code for its interrupt. The IDT may begin at any address in physical memory.]
345
SIDT: store IDTR
• Syntax:
SIDT dest
• Operation:
dest ← IDTR
- - - - - - -
of df sf zf af pf cf
346
Hardware Interrupts
• The x86 processor has two pins that can be attached to external interrupt-generating devices. These pins, or input lines, are:
• INTR: maskable interrupts
• NMI: nonmaskable interrupts
347
Interrupt Flag
• The interrupt flag IF is contained in the EFLAGS register.
• The INTR input line may be enabled or disabled through software (running at the correct privilege level) with the use of the STI (set IF) and CLI (clear IF) instructions. This means that INTR may be masked (disabled).
• The NMI input line is nonmaskable, which means it may not be disabled.
348
The PIC 8259
• The 8259 Programmable Interrupt Controller (PIC) chip accepts interrupts from up to eight different devices. If any one of the devices requests service, the 8259 will assert the CPU's INTR input line and place an interrupt vector number on the CPU's data bus.
• Several PICs may be cascaded in order to support up to 64 different devices.
349
The PIC 8259 (continued)
• A typical PC uses two PICs to provide 15 interrupt inputs (seven on the master PIC, with its eighth input coming from the slave PIC, which processes its own eight inputs).
• In modern motherboards, the 8259 is usually incorporated into a larger chip as part of the chipset.
350
PIC and CPU Connections
[Diagram: a slave PIC 8259 and a master PIC 8259, each with inputs IRQ0 to IRQ7. The master PIC drives the INTR pin of the x86 CPU, and the vector number travels over data bus lines D0 to D7.]
351
PIC Inputs for a PC (Real Mode)
8259 Pin   Vector Number   Device
IRQ 0      0x08            Timer chip
IRQ 1      0x09            Keyboard
IRQ 2      0x0A            Cascade for slave controller (IRQ 8-15)
IRQ 3      0x0B            Serial port 2
IRQ 4      0x0C            Serial port 1
IRQ 5      0x0D            Parallel port 2 in AT, reserved in PS/2 systems
IRQ 6      0x0E            Diskette drive
IRQ 7      0x0F            Parallel port 1
IRQ 8/0    0x70            Real-time clock
IRQ 9/1    0x71            CGA vertical retrace (and other IRQ 2 devices)
IRQ 10/2   0x72            Reserved
IRQ 11/3   0x73            Reserved
IRQ 12/4   0x74            Reserved in AT, auxiliary device on PS/2 systems
IRQ 13/5   0x75            FPU interrupt
IRQ 14/6   0x76            Hard disk controller
IRQ 15/7   0x77            Reserved
352
Interrupts and Exceptions
(Protected Mode)
Vector Number   Description            Source
0               Divide error           DIV and IDIV instructions
1               Debug                  Any code or data reference
2               NMI interrupt          Nonmaskable external interrupt
3               Breakpoint             INT 3 instruction
4               Overflow               INTO instruction
5               Bound range exceeded   BOUND instruction
6               Invalid opcode         UD2 instruction or reserved opcode
7               Device not available   No math coprocessor
8               Double fault           Any instruction that can generate an exception, an NMI, or an INTR.
9               Reserved
10              Invalid TSS            Task switch or TSS access.
353
Interrupts and Exceptions (continued)
Vector Number   Description            Source
11              Segment Not Present    Loading segment registers or accessing system segments.
12              Stack Segment Fault    Stack operations and SS register loads.
13              General Protection     Any memory reference and other protection checks.
14              Page Fault             Any memory reference.
15              Reserved
16              Floating-Point Error   Floating-point or WAIT/FWAIT instruction.
17              Alignment Check        Any data reference in memory.
18              Machine Check          Model dependent.
19-31           Reserved
32-255          Maskable Interrupts    External interrupt from INTR pin or INT n instruction.
354
Signals
• Linux traps all interrupts and exceptions that are generated by the system.
• Under some circumstances, the operating system will send a signal to a running process informing it that an exceptional situation has occurred.
355
Signals (continued)
• Some signals report errors such as references to invalid memory addresses; others report asynchronous events, such as disconnection of a phone line.
356
Hardware Interrupts & Signals
[Diagram: PIC 8259, CPU x86, OS kernel and Process, connected by the following steps.]
1. A device generates a hardware interrupt.
2. CPU calls the handler procedure provided by the OS kernel.
3. If required, the OS kernel sends a signal to a process.
357
Software Exceptions & Signals
[Diagram: Process, CPU x86 and OS kernel, connected by the following steps.]
1. Process generates a software exception.
2. CPU calls the handler procedure provided by the OS kernel.
3. OS kernel sends a signal to the offending process.
358
Signal Handling
• A programmer may arrange for a particular signal to be ignored or to be processed by a special piece of code called a signal handler.
359
Signal Handling (continued)
• In the latter case, the process that receives the signal suspends its current flow of control, executes the signal handler, and then resumes the original flow of control when the signal handler finishes.
360
Predefined Signals
• There are 31 different signals defined for UNIX.
• A programmer may choose one of the following actions for a particular signal:
• Trigger a user-supplied signal handler
• Trigger the default kernel-supplied handler
• Ignore it
361
Default Signal Handlers
• DUMP: terminate the process and generate a core (memory) image file
• QUIT: terminate the process without generating a core image file
• IGNORE: ignore and discard the signal
• SUSPEND: suspend the process
362
List of Signals
Macro       Signal Number   Default Action   Description
SIGHUP      1               quit             Hangup
SIGINT      2               quit             Interrupt
SIGQUIT     3               dump             Quit
SIGILL      4               dump             Illegal instruction
SIGTRAP     5               dump             Trace trap (for debugging)
SIGIOT      6               dump             IO Trap instruction
SIGBUS      7               dump             Bus error
SIGFPE      8               dump             Floating Point Exception
SIGKILL     9               quit             Kill (cannot be caught, blocked or ignored)
SIGUSR1     10              quit             User defined signal 1
363
List of Signals (continued)
Macro       Signal Number   Default Action   Description
SIGSEGV     11              dump             Segmentation violation
SIGUSR2     12              quit             User defined signal 2
SIGPIPE     13              quit             Write on a pipe with no one to read it
SIGALRM     14              quit             Alarm clock
SIGTERM     15              quit             Software termination signal
SIGCHLD     17              ignore           Child status has changed
SIGCONT     18              ignore           Continue after stop
SIGSTOP     19              suspend          Stop (cannot be caught, blocked or ignored)
SIGTSTP     20              suspend          Stop signal generated from keyboard
364
List of Signals (continued)
Macro       Signal Number   Default Action   Description
SIGTTIN     21              suspend          Background read attempted from control terminal
SIGTTOU     22              suspend          Background write attempted to control terminal
SIGURG      23              ignore           Urgent condition present on socket
SIGXCPU     24              quit             CPU time limit exceeded
SIGXFSZ     25              quit             File size limit exceeded
SIGVTALRM   26              quit             Virtual time alarm
SIGPROF     27              quit             Profiling timer alarm
SIGWINCH    28              ignore           Window size changed
SIGLOST     29              quit             Resource lost
365
Setting a Signal Handler
• The signal system call allows a process to specify the action that it will take when a particular signal is received.
366
Setting a Signal Handler (continued)
• It takes two parameters (from left to right):
1. The code number of the signal to be reprogrammed.
2. The address of a user-defined function, which will be executed when the specified signal arrives, or zero (SIG_DFL) to use the default handler, or one (SIG_IGN) to ignore the signal.
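The two parameters can be seen at work in a short C sketch. It uses the standard signal and raise functions from signal.h; the names got_signal, on_signal and demo_signal are hypothetical and exist only for this illustration, and raise is used so that the process sends the signal to itself.

```c
#include <signal.h>

/* Written by the handler; sig_atomic_t is the type the C standard
   guarantees can be safely assigned from a signal handler. */
volatile sig_atomic_t got_signal = 0;

/* User-defined handler: receives the number of the signal that arrived. */
void on_signal(int signum)
{
    got_signal = signum;
}

/* Installs the handler for SIGINT, sends SIGINT to the current process,
   then restores the default action. Returns the signal number seen by
   the handler, or -1 if the handler could not be installed. */
int demo_signal(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);               /* the handler runs before raise returns */
    signal(SIGINT, SIG_DFL);     /* back to the default handler */
    return got_signal;
}
```

Keeping the handler down to a single assignment is a deliberate choice: the handler interrupts the normal flow of control at an arbitrary point, so it should do as little as possible.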
367