Data Transfer Instructions (cont.)

Assembly Language Fundamentals
Chapter 2
Directives and Instructions
 Assembly language statements are either directives or
instructions
 Instructions are executable statements. They are translated
by the assembler into machine instructions. Ex:
call MySub ;transfer of control
mov ax,5 ;data transfer
 Directives tell the assembler how to generate machine code
and allocate storage. Ex:
count db 50 ;creates 1 byte of storage initialized to 50
A Template for Assembly Language Programs
 .386 = directive to accept
all instructions of 386 and
previous processors (use
.586 to assemble Pentium
specific instructions)
 end = directive that marks
the end of the program
 main = label of the entry
point of the program (first
instruction to execute)
 ret = instruction that
returns the control to the
caller (here the Win32
console)
 Macros to perform I/O are
included in csi2121.inc
.386
.model flat
include csi2121.inc
.data
;data allocation
;directives here
.code
main:
;instructions here
ret
end
The FLAT Memory Model
 The .model flat directive tells the assembler to generate code
that will run in protected mode and in 32-bit mode
 It also asks the assembler to do whatever is needed so
that code, stack, and data share the same 32-bit memory
segment
 All the segment registers will be loaded with the correct
values at load time and do not need to be changed by the
programmer
 Only the offset part of a logical address becomes relevant
 Each data byte (or instruction) is referred to only by a 32-bit
offset address
 The directives .code and .data mark the beginning of the
code and data segments. They are used only for protection:
 .code is read-only
 .data is read and write
Steps to Produce an Executable File
[Figure: source file -> assembler -> object file -> linker (which also takes input from a library) -> executable file]
 The assembler produces an object file from the assembly
language source
 The object file contains machine language code with some
external and relocatable addresses that will be resolved by
the linker. Their values are undetermined at that stage.
 The linker extracts object modules (compiled procedures)
from a library and links them with the object file to produce
the executable file.
 The addresses in the executable file are all resolved but they
are still logical addresses.
Using Borland’s BCC32
 All these steps are performed with the command:
bcc32 -v hello.asm
 The bcc32 command calls TASM32 to assemble
and produce an object file
 It then calls ILINK32 to link this object file with the
C/C++ library functions and Win32 functions used
by the program to produce the executable file
hello.exe
 The -v option produces full debugging info
 See the LabInfo page for all the info you need
Names
 A name identifies either:
 a variable
 a label
 a constant
 a keyword (assembler-reserved word).
Names (Cont.)
 A variable is a symbolic name for a location in memory that
was allocated by a data allocation directive. Ex:
count db 50 ;allocates 1 byte to variable count
 A label is a name given to an instruction. It must be followed
by ‘:’. Ex:
main:
mov eax, 5
xor eax, ebx
jmp main
Names (Cont.)
 The first character must be a letter or any one of
‘@’, ‘_’, ‘$’, ‘?’
 subsequent characters can include digits
 A programmer chosen name must be different
from an assembler reserved word
 avoid using ‘@’ as the first character since many
keywords start with it
 When called from bcc32, the TASM32 assembler is
case sensitive for user-defined words but case
insensitive for the assembler reserved words
Integer Constants
 Integer constants are made of numerical digits
with, possibly, a sign and a suffix. Ex:
 -23 (a negative integer, base 10 is default)
 1011b (a binary number)
 1011 (a decimal number)
 0A7Ch (a hexadecimal number)
 A7Ch (this is the name of a variable; a
hexadecimal number must start with a decimal
digit)
Character and String Constants
 They are any sequence of characters enclosed
either in single or double quotation marks.
Embedded quotes are permitted. Ex:
 'A'
 'ABC'
 "Hello World!"
 "123" (this is a string, not a number)
 "This isn't a test"
 'Say "hello" to him'
Simple Data Allocation Directives
 The DB (define byte) directive allocates storage for
one or more byte values
[name] DB initval [,initval]
 Each initializer can be any constant. Ex:
a db 10, 32, 41h ;allocate 3 bytes
b db 0Ah, 20h, 'A' ;same values as above
 A question mark (?) in the initializer leaves the initial
value of the variable undefined. Ex:
c db ? ;the initial value for c is undefined
 Everything that follows ";" is ignored by the
assembler. It is thus a comment
Simple Data Allocation Directives (cont.)
 A string is stored as a sequence of characters. Ex:
aString db "ABCD"
bString DB 'A','B','C','D' ;same values
cString db 41h,42h,43h,44h ;same values again
 The (offset) address of a variable is the address of its first byte.
Ex: If the following data segment starts at address 0
.data
Var1 db "ABC"
Var2 db "DEFG"
 The address of Var1 is 0 = the address of ‘A’
 The address of ‘B’ is 1
 The address of ‘C’ is 2
 The address of Var2 is 3
 The address of ‘E’ is 4 …
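The offset rule above can be checked with a short Python sketch. Python is used here only as a calculator, and `layout` is a hypothetical helper, not something TASM provides: each variable simply starts where the previous one ended.

```python
# Model of offset assignment in a data segment (hypothetical helper).
def layout(variables, start=0):
    offsets = {}
    address = start
    for name, data in variables:
        offsets[name] = address      # a variable's address = address of its first byte
        address += len(data)         # next variable follows immediately
    return offsets

segment = [("Var1", b"ABC"), ("Var2", b"DEFG")]
offsets = layout(segment)
print(offsets)               # {'Var1': 0, 'Var2': 3}
print(offsets["Var2"] + 1)   # 4, the offset of 'E'
```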
Simple Data Allocation Directives (cont.)
 Define Word (DW) allocates a sequence of words.
Ex:
A dw 1234h, 5678h ; allocates 2 words
 Intel’s x86 are little endian processors: the lowest
order byte (of a word or double word) is always
stored at the lowest address.
 Ex: if variable A (above) is located at address 0, we
have:
 address: 0   1   2   3
 value:   34h 12h 78h 56h
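As a quick cross-check of the byte order, here is a small Python sketch (not assembler output) that packs the same two words in little-endian form and lists the byte found at each address, assuming the variable starts at address 0:

```python
import struct

# "<H" means little-endian unsigned 16-bit word, the x86 byte order.
memory = struct.pack("<HH", 0x1234, 0x5678)   # models: A dw 1234h, 5678h

for address, value in enumerate(memory):
    print(address, f"{value:02X}h")
# 0 34h
# 1 12h
# 2 78h
# 3 56h
```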
Simple Data Allocation Directives (cont.)
 Define Double Word (DD) allocates a sequence of double
words. Ex:
B dd 12345678h ;allocates 1 double word
 If this variable is located at address 0, we have:
 address: 0   1   2   3
 value:   78h 56h 34h 12h
 If a value fits into a byte, it will be stored in the lowest-order
byte available. Ex:
V dw 'A'
 the value will be stored as:
 address: 0   1
 value:   41h 00h
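The two layouts above can be reproduced, and read back, with a few lines of Python. This is only a model of the memory image; the names `b_bytes` and `v_bytes` are illustrative.

```python
# B dd 12345678h, as laid out in memory from offset 0
b_bytes = (0x12345678).to_bytes(4, "little")
# V dw 'A': the character code sits in the low byte, 00h pads the high byte
v_bytes = ord("A").to_bytes(2, "little")

print([f"{x:02X}h" for x in b_bytes])   # ['78h', '56h', '34h', '12h']
print([f"{x:02X}h" for x in v_bytes])   # ['41h', '00h']

# Reading the four bytes back in little-endian order recovers the value
print(hex(int.from_bytes(b_bytes, "little")))   # 0x12345678
```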
Simple Data Allocation Directives (cont.)
 The DUP operator enables us to repeat values when
allocating storage. Ex:
a db 100 dup(?) ;100 bytes, uninitialized
b db 3 dup("Ho") ;6 bytes: "HoHoHo"
 DUP can be nested:
c db 2 dup('a', 2 dup('b')) ;this allocates 6 bytes: 'abbabb'
 DUP must be used with data allocation directives
 There is a bug in some TASM32 versions:
b db 3 dup("Ho")
 will allocate 6 bytes that are filled with 0 (i.e. the specified
initial values are ignored).
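To see what a correct assembler should produce for the DUP examples, the operator can be modelled in Python. The `dup` function below is a hypothetical stand-in written for this illustration, not a TASM feature: it concatenates its initializers and repeats them.

```python
# Hypothetical model of DUP: repeat a list of initializers `count` times.
def dup(count, *items):
    return b"".join(items) * count

b = dup(3, b"Ho")                    # b db 3 dup("Ho")
c = dup(2, b"a", dup(2, b"b"))       # c db 2 dup('a', 2 dup('b'))
print(b)   # b'HoHoHo'
print(c)   # b'abbabb'
```

Comparing these bytes with what a debugger shows is one way to spot the TASM32 bug mentioned above.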
Constants
 We can use the equal-sign (=) directive or the EQU
directive to give a name to a constant. Ex:
one = 1 ;this is a constant
two equ 2; also a constant
 For simple numeric constants like these, the EQU and = directives are equivalent
 The assembler does not allocate storage to a
constant (in contrast with data allocation
directives)
 It merely substitutes, at assembly time, the value
of the constant at each occurrence of the
assigned name
Constants (cont.)
 In place of a constant, we can use a constant
expression involving the standard operators used
in HLLs: +, -, *, /
 Ex: the following constant expression is evaluated
and given a name at assembly time:
A = (-3 * 8) + 2
 A constant can be defined in terms of another
constant:
B = (A+2)/2
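The two constant expressions can be evaluated by hand or, as a sanity check, in Python. Note the assumption: Python's `//` is used for the integer division, which matches the assembler here only because the division is exact.

```python
A = (-3 * 8) + 2      # same expression as the constant A
B = (A + 2) // 2      # integer division; exact in this case
print(A, B)           # -22 -10
```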
Exercise 1
 Suppose that the following data segment starts at
address 0
.data
A DW 1,2
B DW 6ABCh
Z EQU 232
C DB 'ABCD'
A) Find the address of variable A.
B) Find the address of variable B.
C) Find the address of variable C.
D) Find the address of character ‘C’.
Data Transfer Instructions
 The MOV instruction transfers the content of the source
operand to the destination operand
mov destination,source
 This changes the content of destination (but not the content
of source)
 Both operands must be of the same size.
 An operand can be either direct or indirect
 Direct operands (this chapter) are either:
 Immediate (a constant): noted imm
 Register: noted reg
 Memory variable (with displacement), noted mem
 Indirect operands are used for indirect addressing (later
chapter)
Data Transfer Instructions (cont.)
 Some restrictions on MOV:
 imm cannot be the destination operand...
 EIP cannot be an operand
 Source and destination cannot both be mem.
Direct memory-to-memory data transfer is
forbidden!
mov wordVar1,wordVar2; illegal
Data Transfer Instructions (cont.)
 The type of an operand is given by its size (byte,
word, doubleword…)
 Both operands of MOV must be of the same type
 Type check is done by the assembler
 The type assigned to a mem operand is given by
its data allocation directive (DB, DW…)
 The type assigned to a register is given by its size
 An imm source operand of MOV must fit into the
size of the destination operand
Data Transfer Instructions (cont.)
 Examples of MOV usage:
mov bh, 255; 8-bit operands
mov al, 256; error: constant too large
mov bx, AwordVar; 16-bit operands
mov bx, AbyteVar; error: size mismatch
mov edx, AdoublewordVar;32-bit operands
mov cx, bl ; error: size mismatch
mov wordVar1,wordVar2 ;error: mem-to-mem
MOVZX: Move with Zero Extend
 Often we want to move the content of a source operand into
a destination operand of larger size
 The MOVZX instruction does this operation by filling with
zeros the high order part of the destination. Usage:
MOVZX destination,source
 Immediate operands are not allowed here
 The size of destination must be strictly larger than the size
of source
 Example:
mov bh, 80h
movzx ah,bh ;illegal, size mismatch
movzx ax,bh ;AX = 0080h
movzx ecx,ax ;ECX = 00000080h
 Notice that if the signed value in the source operand is
negative, then MOVZX will not preserve the sign.
mov bh, 80h ;BH = 80h (negative)
movzx ax,bh ;AX = 0080h (positive)
MOVSX: Move with Sign Extend
 We can use the MOVSX instruction to preserve the sign of
the source operand. Usage:
MOVSX destination,source
 The high order part of the destination operand will be the
sign extension of the source operand
 The sign extension of a negative number is …111111
 The sign extension of a positive number is …0000000
 Examples:
mov bh, 80h ;BH = 80h (negative)
movsx ax,bh ;AX = FF80h (negative)
;FFh is the sign extension of 80h
mov bl, 7Ah ;BL = 7Ah (positive)
movsx ax,bl ;AX = 007Ah (positive)
;00h is the sign extension of 7Ah
 MOVSX preserves the signed value whereas MOVZX
preserves the unsigned value
 Immediate operands are not allowed and the size of
destination must be strictly larger than the size of source.
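The difference between the two extensions can be sketched in Python. The functions `movzx` and `movsx` below are illustrative models named after the instructions, with the operand widths passed explicitly; they are not how the CPU is implemented.

```python
def movzx(value, src_bits):
    """Zero-extend: keep the low src_bits, upper bits become 0."""
    return value & ((1 << src_bits) - 1)

def movsx(value, src_bits, dst_bits):
    """Sign-extend: copy the source's top bit into all upper bits."""
    value &= (1 << src_bits) - 1
    if value >> (src_bits - 1):   # sign bit set: fill the upper part with 1s
        value |= ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
    return value

print(f"{movzx(0x80, 8):04X}h")       # 0080h
print(f"{movsx(0x80, 8, 16):04X}h")   # FF80h
print(f"{movsx(0x7A, 8, 16):04X}h")   # 007Ah
```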
Data Transfer Instructions (cont.)
 We can add a displacement to a memory operand to access a
memory value without a name. Ex:
.data
arrB db 10h, 20h
arrW dw 1234h, 5678h
 arrB+1 refers to the location one byte beyond the beginning of
arrB and arrW+2 refers to the location two bytes beyond the
beginning of arrW.
mov al,arrB ;AL = 10h
mov al,arrB+1 ;AL = 20h (mem with displacement)
mov ax,arrW+2 ;AX = 5678h
mov ax,arrW+1 ;AX = 7812h (little endian convention!)
mov ax,arrW-2 ;AX = 2010h (negative displacement permitted)
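The values loaded above follow directly from the memory image. A minimal Python model, assuming the data segment starts at offset 0 and using illustrative helpers `byte_at` and `word_at`:

```python
import struct

# arrB db 10h, 20h   followed by   arrW dw 1234h, 5678h
memory = bytes([0x10, 0x20]) + struct.pack("<HH", 0x1234, 0x5678)
arrB, arrW = 0, 2   # offsets, assuming the segment starts at 0

def byte_at(offset):
    return memory[offset]

def word_at(offset):
    # a 16-bit load reads two consecutive bytes, low byte first
    return int.from_bytes(memory[offset:offset + 2], "little")

print(f"{byte_at(arrB + 1):02X}h")   # 20h
print(f"{word_at(arrW + 2):04X}h")   # 5678h
print(f"{word_at(arrW + 1):04X}h")   # 7812h
print(f"{word_at(arrW - 2):04X}h")   # 2010h
```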
Data Transfer Instructions (cont.)
 The XCHG instruction exchanges the content of
the source and destination operands:
XCHG destination,source
 Only mem and reg operands are permitted (and
must be of the same size)
 The operands cannot both be mem (direct mem-to-mem exchange is forbidden).
 To exchange the content of word1 and word2, we
have to do:
mov ax,word1
xchg word2,ax
mov word1,ax
Exercise 2
 Given the following data segment
.data
A dw 1234h,-1
B dd 55h,66778899h
 Indicate whether each of the following instructions is legal. If it is, indicate
the value, in hexadecimal, of the destination operand
immediately after the instruction is executed (please verify
your answers with a debugger)
MOV eax,A
MOV bx,A+1
MOV bx,A+2
MOV dx,A+4
MOV cx,B+1
MOV edx,B+2
Simple Arithmetic Instructions
 The ADD instruction adds the source to the
destination and stores the result in the
destination (source remains unchanged)
ADD destination,source
 The SUB instruction subtracts the source from
the destination and stores the result in the
destination (source remains unchanged)
SUB destination,source
 Both operands must be of the same size and
they cannot be both mem operands
 Recall that to perform A - B the CPU in fact
performs A + NEG(B)
Simple Arithmetic Instructions (cont.)
 ADD and SUB affect all the status flags according to the result
of the operation
 ZF (zero flag) = 1 iff the result is zero
 SF (sign flag) = 1 iff the msb of the result is one
 OF (overflow flag) = 1 iff there is a signed overflow
 CF (carry flag) = 1 iff there is an unsigned overflow
 Signed overflow: when the operation generates an out-of-range (erroneous) signed value
 Unsigned overflow: when the operation generates an out-of-range (erroneous) unsigned value
More on Overflows
 An unsigned overflow occurs if and only if (IFF) the
unsigned value of the result does not fit into the
destination operand
 This occurs IFF the unsigned interpretation of
the result is erroneous
 It is signaled by CF=1
 A signed overflow occurs IFF the signed value of
the result does not fit into the destination operand
 This occurs IFF the signed interpretation of the
result is erroneous
 It is signaled by OF=1
Simple Arithmetic Instructions (cont.)
 Both types of overflow occur independently and are
signaled separately by CF and OF
mov al,0FFh
add al,1 ; AL=00h, OF=0, CF=1
mov al,7Fh
add al,1 ; AL=80h, OF=1, CF=0
mov al,80h
add al,80h ; AL=00h, OF=1, CF=1
 Hence: we can have either type of overflow or both of
them at the same time
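The three cases above can be reproduced with a small Python model of ADD. The function `add` is an illustrative sketch that computes CF and OF from their definitions; it is not a description of the hardware.

```python
def add(a, b, bits=8):
    """Return (result, CF, OF) for an ADD on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = (a & mask) + (b & mask)
    result = total & mask
    cf = int(total > mask)   # unsigned overflow: carry out of the top bit
    # signed overflow: both operands share a sign that the result lacks
    of = int(((a ^ result) & (b ^ result) & sign) != 0)
    return result, cf, of

for a, b in [(0xFF, 0x01), (0x7F, 0x01), (0x80, 0x80)]:
    result, cf, of = add(a, b)
    print(f"{a:02X}h + {b:02X}h -> {result:02X}h  OF={of} CF={cf}")
# FFh + 01h -> 00h  OF=0 CF=1
# 7Fh + 01h -> 80h  OF=1 CF=0
# 80h + 80h -> 00h  OF=1 CF=1
```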
Overflow Example
mov ax,4000h
add ax,ax
;AX = 8000h
 Unsigned Interpretation:
 The sum of the 2 magnitudes 4000h + 4000h
gives 8000h. This is the result in AX (the
unsigned value of the result is correct). CF=0
 Signed Interpretation:
 we add two positive numbers: 4000h + 4000h
 and have obtained a negative number!
 the signed value of the result in AX is erroneous.
Hence OF=1
Overflow Example
mov ax,8000h
sub ax,0FFFFh
;AX = 8001h
 Unsigned Interpretation:
 from the magnitude 8000h we subtract the
larger magnitude FFFFh
 the unsigned value of the result is erroneous.
Hence CF=1
 Signed Interpretation:
 We subtract -1 from the negative number 8000h
and obtained the correct signed result 8001h.
Hence OF=0
Overflow Example
mov ah,40h
sub ah,80h
;AH = C0h
 Unsigned Interpretation:
 we subtract from 40h the larger number 80h
 the unsigned value of the result is wrong.
Hence CF=1
 Signed Interpretation:
 we subtract from 40h (64) a negative number 80h
(-128) to obtain a negative number
 the signed value of the result is wrong. Hence
OF=1
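The two SUB examples can be checked the same way. The `sub` function below is an illustrative Python model: CF is set when a borrow is needed, and OF when the operands have different signs and the result's sign differs from that of the original destination.

```python
def sub(a, b, bits=16):
    """Return (result, CF, OF) for a SUB on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    cf = int(a < b)   # unsigned overflow: a borrow was needed
    of = int(((a ^ b) & (a ^ result) & sign) != 0)
    return result, cf, of

r, cf, of = sub(0x8000, 0xFFFF)
print(f"{r:04X}h CF={cf} OF={of}")   # 8001h CF=1 OF=0
r, cf, of = sub(0x40, 0x80, bits=8)
print(f"{r:02X}h CF={cf} OF={of}")   # C0h CF=1 OF=1
```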
Exercise 3
 For each of these instructions, give the content (in
hexadecimal) of the destination operand and the
CF and OF flags immediately after the execution of
the instruction (verify your answers with a
debugger).
 ADD AX,BX when AX contains 8000h and BX
contains FFFFh.
 SUB AL,BL when AL contains 00h and BL contains
80h.
 ADD AH,BH when AH contains 2Fh and BH
contains 52h.
 SUB AX,BX when AX contains 0001h and BX
contains FFFFh.
Simple Arithmetic Instructions (cont.)
 The INC (increment) and DEC (decrement)
instructions add 1 to or subtract 1 from a single
operand (mem or reg operand)
INC destination
DEC destination
 They affect all status flags, except CF. Suppose that
initially CF=OF=0:
mov bh,0FFh ;CF=0, OF=0
inc bh ;BH=00h, CF=0, OF=0
mov bh,7Fh ;CF=0, OF=0
inc bh ;BH=80h, CF=0, OF=1
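A short Python model makes the point about CF explicit: in the hypothetical `inc` function below, the incoming CF is simply passed through, while OF is recomputed from the result.

```python
def inc(value, cf, bits=8):
    """Return (result, CF, OF) for INC; CF is passed through untouched."""
    mask = (1 << bits) - 1
    result = (value + 1) & mask
    # signed overflow happens only when the largest positive value wraps
    of = int(result == 1 << (bits - 1))
    return result, cf, of

print(inc(0xFF, cf=0))   # (0, 0, 0): wraps to 00h, yet CF stays 0
print(inc(0x7F, cf=0))   # (128, 0, 1): 7Fh -> 80h is a signed overflow
```

Contrast this with ADD, where adding 1 to 0FFh sets CF.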
Simple Arithmetic Instructions (cont.)
 The NEG instruction computes the two's
complement of its operand
NEG destination
 Where destination is either mem or reg
 CF=0 IFF the result is 0
 OF=1 IFF there is a signed overflow. Ex:
mov ax,-5
neg ax; CF = 1, OF = 0
mov ax,8000h
neg ax; CF=1, OF=1 signed overflow!
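The flag rules for NEG can be sketched in Python as follows. The `neg` function is an illustrative model on 16-bit operands by default; it returns the result together with CF and OF.

```python
def neg(value, bits=16):
    """Return (result, CF, OF) for NEG on a `bits`-wide operand."""
    mask = (1 << bits) - 1
    value &= mask
    result = (-value) & mask             # two's complement
    cf = int(value != 0)                 # CF=0 only when the operand (and result) is 0
    of = int(value == 1 << (bits - 1))   # the most negative value has no positive counterpart
    return result, cf, of

print(neg(-5))       # (5, 1, 0)
print(neg(0x8000))   # (32768, 1, 1): the result is still 8000h
print(neg(0))        # (0, 0, 0)
```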
I/O on the Win32 Console
 Our programs will communicate with the user via the Win32
console (the MS-DOS box)
 Input is done on the keyboard
 Output is done on the screen
 Modern OSes like Windows forbid user programs from interacting
directly with I/O hardware
 User programs can only perform I/O operation via system
calls
 For simplicity, our programs will perform I/O operations by
using macros that are provided in the csi2121.inc file
 These macros call C library functions like printf()
which, in turn, call the Win32 API
 Hence, these I/O operations will be slow but simple to use
and easy to migrate to another OS
 We will examine the mechanisms involved in I/O operations
later in the course
Character Output
 The putch macro prints on the screen the character whose
ASCII code is given by the operand. Usage:
putch source
 Where source must be a 32-bit operand
 i.e. either imm, reg32, or mem32 (a double word variable)
.data
aword dw 41h
adword dd 61h
.code
putch aword ;error: 16-bit operand
putch adword ;'a' is written on screen
putch 'b' ;'b' is written on screen
mov eax,'c'
putch eax ;'c' is written on screen
putch ax ;error: 16-bit operand
Character Output (cont.)
 Also: the cursor will advance one position after
printing the character
 The putch macro calls the putchar() function from
the C library. Hence:
 The number 10 = 0Ah will direct the cursor to the
beginning of the next line (the “newline character”
in C). So the <CR> and <LF> functions are both
performed on the screen.
putch 10 ;move the cursor to the
;beginning of next line
String Output
 To print a string, use the following macro:
putstr source
 Where source must be mem operand (i.e. the name of a
variable). It cannot be a reg or imm operand.
 This macro calls printf(“%s”, ) of the C library. Hence:
 The number 10 = 0Ah will move the cursor to the beginning of
the next line (the “newline character” in C)
 The string must be a null-terminated string: the last
character must have ASCII code 0h. Ex:
.data
msg db "hello",0ah,"world",0h
.code
putstr msg ;prints 'hello' on one line
;and 'world' on the next line
Integer Output
 To print the signed value of an integer, use:
putint source
 Where source must be a 32-bit operand
 i.e. either imm, reg32, or mem32 (a double word variable) .
Ex:
.data
aword dw 243
adword dd -266
.code
putint aword ;error: 16-bit operand
putint adword ;-266 is written on screen
putint -1 ; -1 is written on screen
mov eax,0FFFFFFFFh
putint eax ;-1 is written on screen
putint ax ;error: 16-bit operand
Character Input
 To read a character from the keyboard, we will use
the getch macro. Usage:
getch
 This macro calls getchar() from the C library. So it uses a
memory buffer that we will call the input buffer.
 Upon execution of getch, the input buffer is first examined.
 If the input buffer is empty, then getch waits for the user to
enter an input line (a sequence of char ended by <CR>).
 Each character that the user enters (at the keyboard) is
copied into the input buffer
 When the user enters the <CR>: the screen cursor moves to
the next line, the value 0Ah is stored in the input buffer and
control passes to the instruction following getch
 The ASCII code of the first character entered on the keyboard
will be stored in AL. The remaining bits of EAX are filled with
zeros. Ex:
mov eax,-1
getch ; eax=41h if the user first hits ‘A’
Character Input (cont.)
 Example: Suppose that the input buffer is initially empty
and, upon execution of getch, the user enters
“hello”+<CR> on the keyboard.
 Then, when the control returns to the instruction following
getch, EAX contains 068h (= ‘h’) and the input buffer looks
like this:
[Figure: input buffer holding ‘h’ ‘e’ ‘l’ ‘l’ ‘o’ 0Ah, with the pointer to next char at ‘e’ and the pointer to last char at 0Ah]
 If the input buffer is not empty when getch is executed, then
EAX will get loaded with the ASCII code of the next character
in the input buffer and the pointer to the next char will
increase by one.
 The input buffer is empty only when the pointer to the next
char points beyond the last character (i.e: 0Ah)
 The user is prompted only when the input buffer is empty
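The buffering rules above can be modelled in a few lines of Python. The class `InputBuffer` is a hypothetical stand-in for the buffer behind getch: the lines the user will type are supplied up front, and a new line is consumed only when the buffer is empty.

```python
class InputBuffer:
    """Hypothetical model of the line-buffered input behind getch."""
    def __init__(self, typed_lines):
        self.typed_lines = list(typed_lines)   # lines the user will type, without <CR>
        self.buffer = []                        # pending character codes
        self.prompts = 0                        # how many times the user had to type

    def getch(self):
        if not self.buffer:                     # empty: wait for a whole input line
            line = self.typed_lines.pop(0)
            self.buffer = [ord(ch) for ch in line] + [0x0A]
            self.prompts += 1
        return self.buffer.pop(0)               # the value getch leaves in EAX

keyboard = InputBuffer(["hello"])
print(hex(keyboard.getch()))                     # 0x68, the code of 'h'
print([hex(code) for code in keyboard.buffer])   # pending: 'e' 'l' 'l' 'o' 0Ah
```

Calling `keyboard.getch()` five more times drains the buffer without prompting again; only the call after that would wait for new input.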
Character Input (example)
 Try to understand this program:
.386
.model flat
include csi2121.inc
.code
main:
putch '?'
putch 10
getch
putch eax
getch
putch eax
getch
putch eax
ret
end
 It first prints “?” and moves the cursor to
the next line awaiting user input
 When the user enters “abcdef” +
<CR>, the program displays (before
exiting):
abc
 But if, instead, the user enters “a” +
<CR>, the program displays:
a
and the cursor moves to the next line
awaiting user input. If the user then
enters “bcdef”+<CR>, the program
prints on the next line (before exiting):
b
I/O Example: Case Conversion
.386
.model flat
include csi2121.inc
.data
msg1 db "Enter a lower case letter: ",0
msg2 db 'In upper case it is: '
char db ?,0
.code
main:
putstr msg1
getch ;char in eax and goto next line
sub al,20h ;converts to upper case
mov char,al
putstr msg2
ret
end
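The conversion step relies on the fact that, in ASCII, each lower case letter is exactly 20h above its upper case counterpart. A Python sketch of the `sub al,20h` step (the function name `to_upper` is illustrative) also shows what happens when the input is not a lower case letter, a case the program above does not check for:

```python
def to_upper(code):
    """Mirror `sub al,20h`: meaningful only for ASCII 'a'..'z'."""
    return (code - 0x20) & 0xFF

print(hex(ord("a") - ord("A")))    # 0x20
print(chr(to_upper(ord("a"))))     # A
print(chr(to_upper(ord("A"))))     # ! (41h - 20h = 21h): input was not lower case
```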