intro to assembler
The Assembly Language Level
Part A – Introduction, Macros, and
Conditional Assembly
Assembly Language Level
• Implemented by translation (rather than via
interpretation).
• Translator
– Program that converts a user’s program written in
some language to another language
– Source language = language in which the original
program is written
– Target language = language to which source
language is converted
Translation vs. interpretation
• In translation, the original program in the
source language is not directly executed.
• Translation produces an object program or
executable binary program.
• Execution only begins after translation has
completed.
• When an object program is executed, only
three levels are in evidence:
1. OS Machine
2. ISA
3. microarchitecture
Assembler (assembly language)
• assembler = translator for a source language
which is basically a symbolic representation
for a numerical target language (machine
language)
• compiler = translator where the source
language is a high-level language (such as Java
or C) and the target language is either a
numerical machine language or a symbolic
representation of one
• Pure assembly language:
– each statement produces exactly one machine
instruction
– one-to-one correspondence between machine
instructions and statements in the assembler
program
• Symbolic assembler is easier for humans to
understand than numeric machine code.
Why use Assembler?
• Access to machine.
– Provides access to all features and instructions
available on the target machine.
• Performance.
– execution time and memory
Why shouldn’t we use Assembler?
• Not portable.
– Assembler programs can only run on the target
architecture.
• Not easy to understand.
– harder to debug
– harder to maintain
– requires more time to write
Compromise
• Tuning (code profiling)
– Note that 10% of the code is often executed 90%
of the time.
– So one may wish to write all of the code in a HLL,
identify the 10% above, and then rewrite that 10%
in assembler.
Reasons for studying Assembler
1. The success or failure of a large project may
depend upon performance.
2. Embedded systems often have very little
memory (for reasons of expense or power).
3. Compilers must either produce assembler or
machine code. (So the compiler writer must
know assembler.)
4. Allows us to examine the real machine. (And
someone must be able to design, build, and
test these machines.)
Assemblers we will use
• MASM
– Microsoft’s assembler
– Used to study the Pentium architecture
• gcc/g++
– The Linux/Unix c/c++ compilers will also accept
assembler.
– Used to study the Ultra SPARC architecture
• Mixed language programming
– Assembler and C/C++
Assembler statement
• Parts:
1. label field
2. opcode
3. operand[s]
4. comment
Format of an Assembly Language Statement:
computation of N=I+J via Pentium
Tanenbaum, Structured Computer
Organization, Fifth Edition, (c) 2006
Pearson Education, Inc. All rights reserved.
Format of an Assembly Language Statement:
computation of N=I+J via SPARC
Assembler statement
• Parts:
1. label field
• assigns name to code or data
• always followed by : (SPARC)
• only code labels are followed by :
(Pentium)
2. opcode
3. operand[s]
4. comment
Assembler statement
• Parts:
1. label field
2. opcode
• symbolic name for a machine instruction, or
• pseudoinstruction (assembler directive)
3. operand[s]
4. comment
Pseudoinstructions (1)
Some of the pseudoinstructions available in the
Pentium 4 assembler (MASM).
Pseudoinstructions (2)
Some of the pseudoinstructions available in the
Pentium 4 assembler (MASM).
Assembler statement
• Parts:
1. label field
2. opcode
3. operand[s]
• specifies addresses, const values, or
registers (see caution below)
– Intel is dst, src
– Sun is src, dst
4. comment
Assembler statement
• Parts:
1. label field
2. opcode
3. operand[s]
4. comment
• ; (Pentium)
• ! (SPARC)
Representing numbers (ints)
• Fixed, finite number of bits.

bits   bytes   C/C++         Intel       Sun
8      1       char          [s]byte     byte
16     2       short         [s]word     half
32     4       int or long   [s]dword    word
64     8       long long     [s]qword    xword
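The C/C++ column of the table can be checked with the exact-width types from <stdint.h>. This is only a sketch: the helper names (bytes_in_8_bit, add_u8, and so on) are made up for illustration, and it assumes a platform that provides int8_t through int64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* One helper per row of the table; each returns the size in bytes.
   The helper names are illustrative and not part of the slides. */
size_t bytes_in_8_bit(void)  { return sizeof(int8_t);  }  /* Intel [s]byte,  Sun byte  */
size_t bytes_in_16_bit(void) { return sizeof(int16_t); }  /* Intel [s]word,  Sun half  */
size_t bytes_in_32_bit(void) { return sizeof(int32_t); }  /* Intel [s]dword, Sun word  */
size_t bytes_in_64_bit(void) { return sizeof(int64_t); }  /* Intel [s]qword, Sun xword */

/* "Fixed, finite number of bits" means results wrap around:
   an unsigned 8-bit value can only hold 0..255. */
uint8_t add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

For instance, add_u8(250, 10) yields 4 because 260 does not fit in 8 bits.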
MASM (Microsoft’s Assembler)
Defining data in MASM:
[label] <tab> type <tab> val1[,…,valn] <tab> [;comment]
[label] <tab> type <tab> ?             <tab> [;comment]
[label] <tab> type <tab> n dup (?)     <tab> [;comment]
[label] <tab> type <tab> n dup (k)     <tab> [;comment]
For unsigned int data, type may be byte, word, dword,
or qword (earlier versions: db, dw, dd, dq).
For signed int data, type may be sbyte, sword, sdword,
or sqword.
Example of defining data in MASM
        .data                           ;begin data section
count   dword   0                       ;counter - init to 0
tbl     byte    1, 2, 4, 8, 16, 32, 64, 128 ;power-of-2 table
szTemp  byte    16 dup (?)              ;buffer for messages
szPrime byte    "%08lx", 0              ;message format string
value1  byte    'a'
value2  byte    ?
value3  word    10h
        byte    10b
        byte    4 dup ("stack")         ;20 bytes: stack stack stack stack
        …
        .code                           ;begin code section
        …
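For readers who know C better than MASM, the data definitions above correspond roughly to the declarations below. This is a sketch rather than an exact equivalent: hex_word, bin_byte, and dup_text are names invented here, and C zero-fills szTemp whereas MASM's ? leaves the bytes uninitialized.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

uint32_t count = 0;                               /* count   dword 0           */
uint8_t  tbl[] = {1, 2, 4, 8, 16, 32, 64, 128};   /* tbl     byte  1, 2, ...   */
char     szTemp[16];                              /* szTemp  byte  16 dup (?)  */
char     szPrime[] = "%08lx";                     /* szPrime byte  "%08lx", 0  */
uint16_t hex_word = 0x10;                         /* word 10h: hexadecimal 16  */
uint8_t  bin_byte = 2;                            /* byte 10b: binary 2        */

/* Hypothetical helper mirroring  n dup ("text"):  writes n back-to-back
   copies of text into dst (no terminating 0) and returns the byte count. */
size_t dup_text(char *dst, size_t n, const char *text)
{
    size_t len = strlen(text);
    for (size_t i = 0; i < n; i++)
        memcpy(dst + i * len, text, len);
    return n * len;
}
```

Calling dup_text(buf, 4, "stack") fills 20 bytes, matching the comment on the last MASM definition.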
Symbols vs. data
• Symbols
– don’t use storage
– value can’t change at run time but can be changed during
assembly (compilation)
        N         =       100
        table     byte    N dup (0)
– $ is the predefined location counter (not the PC)
        list      byte    10, 20, 30, 40
        listSize  =       ($ - list)
• Data
– variables
– use storage
– values can change at run time
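A rough C analogy may help here: a #define or enum constant plays the role of a symbol (no storage, fixed before run time), while an array or variable is data. The sketch below reuses the names N, table, and list from the slide; LIST_SIZE, table_size, list_size, and set_first are illustrative additions.

```c
#include <stddef.h>
#include <stdint.h>

#define N 100                        /* like  N = 100 : a symbol, no storage   */
uint8_t table[N];                    /* like  table byte N dup (0) : data      */

uint8_t list[] = {10, 20, 30, 40};   /* like  list byte 10, 20, 30, 40         */

/* like  listSize = ($ - list) : worked out by the translator, not at run time */
enum { LIST_SIZE = sizeof list };

size_t table_size(void) { return sizeof table; }
size_t list_size(void)  { return LIST_SIZE; }

/* Data can change at run time; the symbols N and LIST_SIZE cannot. */
void set_first(uint8_t v) { list[0] = v; }
```

Changing list[0] through set_first leaves LIST_SIZE untouched, which is the distinction the slide draws.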
;======================================================================
        TITLE   Program Template
;
; File:         template.asm
; Author:       George J. Grevera, Ph.D.
; Date:         12/23/2004
; Description:
;   This file contains a simple console program. It also illustrates a
;   template that you should adopt for all of your code.
; Build:
;   See build.bat for a command line oriented way to assemble and link
;   this program. Simply type the following to create an executable:
;       \masm32\bin\ml /c /coff /Cp /nologo /Zd /Zi /Fl /Fm /FR /DDebug template.asm
;       \masm32\bin\link /nologo /map /debug /subsystem:console template.obj
;   or
;       build template
;======================================================================
; (this should not require any changes)
        .686                            ;instructions for Pentium Pro (or better)
        .model  flat, stdcall           ;no crazy segments!
        option  casemap:none            ;case sensitive
;----------------------------------------------------------------------
; (insert needed external definitions here)
        .nolist                         ;listing off
        include \masm32\include\windows.inc
        include \masm32\include\masm32.inc
        include \masm32\include\kernel32.inc
        include \masm32\macros\macros.asm
        includelib \masm32\lib\masm32.lib
        includelib \masm32\lib\user32.lib
        includelib \masm32\lib\kernel32.lib
        .list                           ;listing on
;----------------------------------------------------------------------
; (insert symbol definitions here)
PROMPT  equ     "<hit return>"          ;user prompt
CR      =       13                      ;carriage return
LF      =       10                      ;linefeed
        .data
; (insert variables definitions here)
;----------------------------------------------------------------------
        .code
; (insert executable instructions here)
main    PROC                            ;program execution begins here
        print   SADD(CR,LF,"Hello, world.",CR,LF,CR,LF) ;output a message
        mov     eax, input(PROMPT)      ;prompt the user
        exit                            ;end of program
main    ENDP
; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; (insert additional procedures/functions here)
        END     main
;======================================================================
Symbols are not the same as data! They are used as placeholders for values.
With each symbol replaced by its value, the code section reads:
;----------------------------------------------------------------------
        .code
; (insert executable instructions here)
main    PROC                            ;program execution begins here
        print   SADD(13,10,"Hello, world.",13,10,13,10) ;output a message
        mov     eax, input("<hit return>")      ;prompt the user
        exit                            ;end of program
main    ENDP
; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; (insert additional procedures/functions here)
        END     main
;======================================================================
ADVANCED TOPIC: MACROS
Macro Definition, Call, Expansion
[Figure: Assembly language code for interchanging P and Q twice. (a) Without a macro. (b) With a macro, including the macro definition.]
Analogous to #define in C/C++.
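Since the slide draws the analogy to #define, here is what interchanging P and Q twice might look like with a C macro. This is a sketch, not the textbook's code: SWAP, swap_once, and swap_twice are illustrative names, and P and Q are assumed to be plain ints.

```c
/* SWAP is a hypothetical macro name. The do { ... } while (0) wrapper
   makes the expansion behave like a single statement. */
#define SWAP(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

int P = 1;
int Q = 2;

/* Interchange P and Q once. */
void swap_once(void)
{
    SWAP(P, Q);
}

/* Interchange P and Q twice. The preprocessor pastes the macro body in
   at both places, just as the assembler expands each macro call. */
void swap_twice(void)
{
    SWAP(P, Q);
    SWAP(P, Q);
}
```

After swap_twice the values are back where they started; the point is that the source mentions SWAP twice while the translated program contains two full copies of the body.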
Macro Definition, Call, Expansion
[Figure: Comparison of macro calls with procedure calls.]
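One practical consequence of the difference can be sketched in C (not assembler): a macro argument is pasted into the body as text, so it is evaluated once per occurrence, whereas a procedure argument is evaluated once at the call. All names below are illustrative and do not come from the figure.

```c
static int calls = 0;

/* Counts how many times it has been evaluated. */
static int next_value(void) { return ++calls; }

/* Expanded before the program runs; the argument text appears twice. */
#define DOUBLE_MACRO(x) ((x) + (x))

/* Called at run time; the argument is evaluated once, then passed in. */
static int double_proc(int x) { return x + x; }

int evaluations_with_macro(void)
{
    calls = 0;
    (void)DOUBLE_MACRO(next_value());   /* becomes next_value() + next_value() */
    return calls;
}

int evaluations_with_proc(void)
{
    calls = 0;
    (void)double_proc(next_value());
    return calls;
}
```

The macro version evaluates next_value() twice and the procedure version once.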
Macros with parameters
Nearly identical sequences of statements.
(a) Without a macro. (b) With a macro.
Macro example (without parameters)
CR      =       13              ;carriage return
LF      =       10              ;linefeed
;this macro outputs a newline (cr,lf)
newline MACRO
        print   SADD(CR,LF)
        ENDM
Macro example (with parameters)
We would like to write something like the following:
;output registers
show    " eax:", [esp+28]       ;output eax
show    " ebx:", [esp+16]       ;output ebx
show    " ecx:", [esp+24]       ;output ecx
show    " edx:", [esp+20]       ;output edx
newline                         ;end line
Macro example (with parameters)
        .data
szTemp  byte    16 dup (?)      ;buffer for messages
szPrime byte    "%08lx", 0      ;message format string
        .code
show    MACRO   caption, value
        ;note: 'value' must not be in eax, ecx, or edx
        ; (because they may be modified by the first call).
        print   SADD(caption)
        mov     eax, value
        invoke  wsprintf, offset szTemp, offset szPrime, eax
        print   offset szTemp
        ENDM
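The show macro has a close C counterpart, sketched below under a few assumptions: SHOW, SZ_PRIME, format_hex, and show_demo are names invented for this example, snprintf stands in for wsprintf, and fputs stands in for print. The buffer size and format string are taken from the slide.

```c
#include <stdio.h>

#define SZ_PRIME "%08lx"          /* message format string (szPrime)  */
static char szTemp[16];           /* buffer for messages (szTemp)     */

/* Hypothetical C counterpart of show: print the caption, format the
   value as 8 hex digits into szTemp, then print the buffer. */
#define SHOW(caption, value)                                              \
    do {                                                                  \
        fputs((caption), stdout);                                         \
        snprintf(szTemp, sizeof szTemp, SZ_PRIME, (unsigned long)(value)); \
        fputs(szTemp, stdout);                                            \
    } while (0)

/* Formatting step only, so the result can be inspected. */
const char *format_hex(unsigned long value)
{
    snprintf(szTemp, sizeof szTemp, SZ_PRIME, value);
    return szTemp;
}

/* Example use of the macro with two made-up register values. */
void show_demo(void)
{
    SHOW(" eax:", 0x2AUL);
    SHOW(" ebx:", 0xDEADBEEFUL);
    putchar('\n');
}
```

As in the MASM version, both parameters are substituted textually, so the caption and value can be any expression that is valid at the point of use.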
Procedures/functions
When a procedure is called,
execution of the procedure
always begins at the first
statement of the procedure.
ADVANCED TOPIC: CONDITIONAL ASSEMBLY
Conditional assembly: IF
(not to be confused with .IF)
IF expression1
    ifstatements
[[ELSEIF expression2
    elseifstatements]]
[[ELSE
    elsestatements]]
ENDIF
Grants assembly of ifstatements if expression1 is true (nonzero) or elseifstatements if expression1 is false (0) and expression2 is true.
Analogous to #if in C/C++.
Conditional assembly: IF
PROCESSOR = 80386               ;Set to 8086 for 8086-only code
        .
        .
        .
IF PROCESSOR eq 80386
        shl     ax, 4
ELSE                            ;Must be 8086 processor.
        mov     cl, 4
        shl     ax, cl
ENDIF
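The same selection can be written with the C preprocessor, which the slides name as the analogue of IF. The sketch below is an approximation: shift_left_4 is an illustrative function name, and the #elif and #error branches are additions that show the ELSEIF form rather than a literal translation of the two-way IF/ELSE above.

```c
#define PROCESSOR 80386   /* set to 8086 for 8086-only code */

/* Only one of the branches below reaches the compiler; the others are
   discarded by the preprocessor before compilation. */
unsigned shift_left_4(unsigned ax)
{
#if PROCESSOR == 80386
    return ax << 4;            /* mirrors: shl ax, 4             */
#elif PROCESSOR == 8086
    unsigned cl = 4;           /* mirrors: mov cl, 4 / shl ax, cl */
    return ax << cl;
#else
#error "unsupported PROCESSOR value"
#endif
}
```

Either branch computes the same result; what changes is which instructions end up in the translated program.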
Conditional assembly: IFDEF & IFNDEF
IFDEF name
    statements
[ELSE
    statements]
ENDIF
Grants assembly if name is a previously defined label, variable, or symbol.
Analogous to #ifdef in C/C++.

IFNDEF name
    statements
[ELSE
    statements]
ENDIF
Grants assembly if name has not been defined.
Analogous to #ifndef in C/C++.
Conditional assembly: IFDEF & IFNDEF
DEBUG = 0
IFDEF DEBUG
        print   char "In PrintMat",cr,lf,0
ENDIF
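A point worth noticing in this example is that DEBUG is defined with the value 0, and a definedness test looks only at whether the name exists, not at its value. The C sketch below shows the same behaviour with #ifdef and contrasts it with #if; the function names are illustrative.

```c
#define DEBUG 0

/* #ifdef asks "is DEBUG defined?", so the value 0 does not matter. */
int ifdef_sees_debug(void)
{
#ifdef DEBUG
    return 1;   /* this branch is compiled */
#else
    return 0;
#endif
}

/* #if asks "is DEBUG nonzero?", so the value 0 switches the code off. */
int if_sees_debug(void)
{
#if DEBUG
    return 1;
#else
    return 0;   /* this branch is compiled */
#endif
}
```

To disable code guarded by a definedness test, remove the definition rather than setting it to 0.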
Note the /DDebug option on the ml command line below: it defines the symbol Debug at assembly time.
;======================================================================
        TITLE   Program Template
;
; File:         template.asm
; Author:       George J. Grevera, Ph.D.
; Date:         12/23/2004
; Description:
;   This file contains a simple console program. It also illustrates a
;   template that you should adopt for all of your code.
; Build:
;   See build.bat for a command line oriented way to assemble and link
;   this program. Simply type the following to create an executable:
;       \masm32\bin\ml /c /coff /Cp /nologo /Zd /Zi /Fl /Fm /FR /DDebug template.asm
;       \masm32\bin\link /nologo /map /debug /subsystem:console template.obj
;   or
;       build template
;======================================================================
; (this should not require any changes)
        .686                            ;instructions for Pentium Pro (or better)
        .model  flat, stdcall           ;no crazy segments!
        option  casemap:none            ;case sensitive
;----------------------------------------------------------------------
Conditional compilation in C/C++
• Not supported by Java.
#define Debug
…
for (int i=0; i<100; i++) {
#ifdef Debug
printf( "in loop. i = %d \n", i );
#endif
…
}
Conditional compilation in C/C++
• How is this different?

#define Debug
…
for (int i=0; i<100; i++) {
    #ifdef Debug
        printf( "in loop. i = %d \n", i );
    #endif
    …
}

bool Debug = true;
…
for (int i=0; i<100; i++) {
    if (Debug) {
        printf( "in loop. i = %d \n", i );
    }
    …
}
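One way to explore the question is to put both forms side by side. In the sketch below the printf calls are replaced by a counter so the effect can be checked, and the run-time flag is renamed debug_flag so it does not clash with the macro Debug; both changes are assumptions made for this example.

```c
#include <stdbool.h>

#define Debug                 /* compile-time switch: delete this line and the
                                 #ifdef block below is never compiled          */
bool debug_flag = true;       /* run-time switch: tested on every iteration    */

/* Counts the messages the #ifdef version would print. */
int messages_compile_time(int n)
{
    int printed = 0;
    for (int i = 0; i < n; i++) {
#ifdef Debug
        printed++;            /* stands in for printf */
#endif
    }
    return printed;
}

/* Counts the messages the if (Debug) version would print. */
int messages_run_time(int n)
{
    int printed = 0;
    for (int i = 0; i < n; i++) {
        if (debug_flag) {
            printed++;        /* stands in for printf */
        }
    }
    return printed;
}
```

Setting debug_flag to false while the program runs silences the second version; the first can only be changed by editing the source (or the compiler command line) and recompiling.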
Conditional compilation in C/C++
• Like MASM, C/C++ supports:
#define
#if / #ifdef / #ifndef
…
#elif
…
#else
…
#endif
Definitions (including macro definitions) in C/C++
• A variety of definitions:
#define A
#define B 12.7
#define MyAbs(x) \
    if (x < 0)   \
        x = -x;
Note the absence of any types!
How does this differ from variable and function definitions?
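A small experiment can make the question concrete. Because MyAbs carries no types, the same text expands correctly for an int and for a double, while a function must commit to one parameter type. The wrapper functions below are illustrative; only MyAbs comes from the slide.

```c
/* MyAbs as defined on the slide: pure text substitution, no types. */
#define MyAbs(x) \
    if (x < 0)   \
        x = -x;

/* A function definition, by contrast, fixes the parameter and result types. */
int my_abs_int(int x)
{
    return x < 0 ? -x : x;
}

/* The macro body is pasted in and compiled with whatever type v has. */
int abs_via_macro_int(int v)
{
    MyAbs(v)
    return v;
}

double abs_via_macro_double(double v)
{
    MyAbs(v)
    return v;
}
```

The macro also modifies its argument in place and produces no value, which is another way it differs from a function such as my_abs_int.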