Transcript cont.
Chapter II: Assembler
Chapter goal:
Introduce the fundamental functions that any assembler must perform.
Assign machine addresses
Translate mnemonic operation codes to machine language equivalents.
Overview:
Basic Assembler Functions
Machine-Dependent Assembler Features
Machine-Independent Assembler Features
Assembler Design Options
2: Assembler
1
Basic Assembler Functions
(Using SIC as an Example)
Assembler directives:
START: Specify name and starting address for the program
END: …
BYTE: Generate character or hexadecimal constant
WORD: Generate one-word integer constant
RESB: Reserve the indicated number of bytes for a data area
RESW: …
Example of a SIC Assembler Language Program
A Simple SIC Assembler
The translation steps
Convert mnemonic operation codes to their machine language
equivalent.
Convert symbolic operands to their equivalent machine
addresses.
Build the machine instructions in the proper format.
Convert the data constants specified in the source program into
their internal machine representations.
Write the object program and the assembly listing.
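The translation steps above can be sketched for a single statement. This is a minimal illustration, not the assembler from the slides: the table excerpts and the helper names assemble_instruction and assemble_constant are assumptions made for the example, and the symbol addresses are illustrative values. It assumes the standard SIC format of an 8-bit opcode, a 1-bit index flag, and a 15-bit address.

```python
# Hypothetical excerpts of the tables an assembler would hold.
OPTAB = {"LDA": 0x00, "STL": 0x14, "STCH": 0x54, "JSUB": 0x48}
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039}  # illustrative addresses


def assemble_instruction(mnemonic, operand):
    """Build one 3-byte SIC instruction as six hex digits."""
    opcode = OPTAB[mnemonic]                      # mnemonic -> machine opcode
    indexed = operand.endswith(",X")              # indexed addressing sets the x bit
    symbol = operand[:-2] if indexed else operand
    address = SYMTAB[symbol]                      # symbolic operand -> address
    word = (opcode << 16) | (0x8000 if indexed else 0) | address
    return f"{word:06X}"


def assemble_constant(directive, operand):
    """Convert a WORD or BYTE constant to its internal representation."""
    if directive == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    if directive == "BYTE" and operand.startswith("C'"):
        return "".join(f"{ord(ch):02X}" for ch in operand[2:-1])
    if directive == "BYTE" and operand.startswith("X'"):
        return operand[2:-1].upper()
    raise ValueError(f"unsupported constant: {directive} {operand}")


print(assemble_instruction("STL", "RETADR"))
print(assemble_instruction("STCH", "BUFFER,X"))
print(assemble_constant("BYTE", "C'EOF'"))
```

Writing the object program and the listing is left out here; only the per-statement conversion is shown.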
Output: the object program
The Object code for the above program
The Format for Object Program
The object program will later be loaded into
memory for execution.
Three types of records for object program format
Header: contains the program name, starting address,
and length.
Text: contains the translated instructions and data of
the program
End: marks the end of the object program and
specifies the address in the program where
execution is to begin.
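The three record types can be sketched as simple string builders. The column widths used here (a 6-character name, 6 hex digits for addresses and lengths, 2 hex digits for the byte count of a Text record) follow the usual SIC textbook layout and are an assumption, since the slide text does not spell them out. The function names are hypothetical.

```python
def header_record(name, start, length):
    """H + program name (6 columns) + starting address + length in bytes."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """T + starting address + byte count + the object code itself."""
    body = "".join(object_codes)
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_executable):
    """E + address of the first instruction to execute."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, ["141033", "482039", "001036"]))
print(end_record(0x1000))
```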
The object program
Two Passes of our Simple Assembler
The Data Structures
Two major data structures:
Operation code table (OPTAB)
Symbol table (SYMTAB)
Note: SYMTAB is usually organized as a hash table
for efficiency of insertion and retrieval.
A location counter (LOCCTR) is also needed to assign addresses.
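A rough sketch of how these structures cooperate during the first pass is given below. It is a simplified illustration under stated assumptions, not the algorithm from the slides: statements are already split into (label, opcode, operand) tuples, OPTAB is a small excerpt, and the names pass1 and byte_length are hypothetical. Python dictionaries are hash tables, which matches the note about SYMTAB.

```python
OPTAB = {"LDA": 0x00, "STA": 0x0C, "RSUB": 0x4C}  # excerpt only


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    if operand.startswith("C'"):
        return len(body)
    if operand.startswith("X'"):
        return len(body) // 2
    raise ValueError(f"bad BYTE operand: {operand}")


def pass1(lines):
    """Assign an address to every label; return (SYMTAB, program length)."""
    symtab = {}
    start = 0
    if lines and lines[0][1] == "START":
        start = int(lines[0][2], 16)      # START operand is hexadecimal
        lines = lines[1:]
    locctr = start                        # LOCCTR begins at the start address
    for label, opcode, operand in lines:
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr        # label gets the current LOCCTR value
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3                   # SIC instructions and words are 3 bytes
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            raise ValueError(f"invalid operation code: {opcode}")
    return symtab, locctr - start
```

The second pass would then look up each operand in the returned SYMTAB while generating object code.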
The Algorithm (Pass 1)
The Algorithm (Pass 2)
Machine-Dependent Assembler Features
(using SIC/XE as an example)
Addressing modes
Immediate addressing modes:
COMP #0
Indirect addressing:
J @RETADR
The extended instruction format
+LDT #4096
Most of the register-to-memory instructions are assembled
using either program-counter relative or base relative
addressing. If neither program-counter relative nor base relative
addressing can be used, then the 4-byte extended format (Format 4)
must be used.
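The choice between the two relative modes can be sketched as a range check on the displacement. The sketch assumes the standard SIC/XE limits for the 12-bit displacement field (-2048 to 2047 for program-counter relative, 0 to 4095 for base relative); the function name choose_displacement and the sample addresses are illustrative assumptions.

```python
def choose_displacement(target, pc, base=None):
    """Pick an addressing mode for a Format 3 instruction.

    target: address of the operand
    pc:     address of the next instruction
    base:   contents of the base register, or None if it is not available
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return ("pc", disp & 0xFFF)       # 12-bit two's complement
    if base is not None:
        disp = target - base
        if 0 <= disp <= 4095:
            return ("base", disp)         # unsigned 12-bit displacement
    return ("format4", target)            # neither fits: Format 4 is required


print(choose_displacement(0x0030, 0x0003))
print(choose_displacement(0x0006, 0x001A))
print(choose_displacement(0x0036, 0x1051, base=0x0033))
```

In this sketch the "format4" result only signals that the extended format is required; in the source language the programmer requests it with the + prefix, as in +LDT #4096.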
Example of a SIC/XE Assembler Language Program
Output: the object program
The Object code for the above program
Program Relocation
An object program that contains the information necessary to
perform this kind of modification is called a relocatable program.
Program Relocation (cont.)
We can solve the relocation problem in the following
way:
1. When the assembler generates the object code for the JSUB
instruction we are considering, it will insert the address of
RDREC relative to the start of the program. (This is the reason we
initialized the location counter to 0 for the assembly.)
2. The assembler will also produce a command for the loader,
instructing it to add the beginning address of the program to the
address field in the JSUB instruction at load time.
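The loader's side of this scheme can be sketched as follows. It is an illustration under assumptions: each "command for the loader" is represented as a hypothetical (offset, half_bytes) pair giving where the address field starts and how many hex digits it occupies, and the sample instruction bytes and load address are illustrative.

```python
def relocate(memory, load_address, modifications):
    """Add load_address to each address field named in modifications.

    memory:        bytearray holding the program as assembled from address 0
    modifications: list of (offset, half_bytes) pairs; the field occupies the
                   low-order half_bytes hex digits of the bytes at offset
    """
    for offset, half_bytes in modifications:
        nbytes = (half_bytes + 1) // 2
        field = int.from_bytes(memory[offset:offset + nbytes], "big")
        mask = (1 << (4 * half_bytes)) - 1
        value = (field + load_address) & mask       # relocated address
        field = (field & ~mask) | value             # keep the flag bits above it
        memory[offset:offset + nbytes] = field.to_bytes(nbytes, "big")


# A Format 4 JSUB at relative address 0006 whose 20-bit address field is 01036.
memory = bytearray(10)
memory[6:10] = bytes.fromhex("4B101036")
relocate(memory, 0x5000, [(7, 5)])
print(memory[6:10].hex().upper())
```

Only the address field changes; the opcode and flag bits of the instruction are left as assembled.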
Machine-Independent Assembler Features
Literals
Symbol-Defining Statements
Expressions
Program Blocks
Control Sections and Program Linking
Literal
It is often convenient for the programmer to be able to write the
value of a constant operand as a part of the instruction that uses
it. Such an operand is called a literal.
E.g., (In Fig 2.9)
45    001A  ENDFIL  LDA  =C'EOF'  032010
215   1062  WLOOP   TD   =X'05'   E32011
The difference between a literal and an immediate operand: with
immediate addressing, the operand value is assembled as part of
the machine instruction. With a literal, the assembler generates the
specified value as a constant at some other memory location.
Literal (cont.)
Literal pools: Normally literals are placed into a pool at the
end of the program. The assembly listing of a program containing
literals usually includes a listing of this literal pool, which shows
the assigned addresses and the generated data values.
The assembler directive LTORG is used for creating the literal
pool.
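One way to picture a literal pool is a small table that collects literals as they are referenced and assigns addresses when a pool is placed (at LTORG or at the end of the program). The class name LiteralTable, its methods, and the sample addresses are assumptions made for this sketch.

```python
def literal_bytes(literal):
    """Data generated for a literal such as =C'EOF' or =X'05'."""
    kind, text = literal[1], literal[3:-1]
    if kind == "C":
        return text.encode("ascii")
    if kind == "X":
        return bytes.fromhex(text)
    raise ValueError(f"unsupported literal: {literal}")


class LiteralTable:
    def __init__(self):
        self.pending = []    # literals seen since the last pool was placed
        self.address = {}    # literal -> assigned address

    def reference(self, literal):
        """Record a literal operand; duplicates share one pool entry."""
        if literal not in self.address and literal not in self.pending:
            self.pending.append(literal)

    def place_pool(self, locctr):
        """Assign addresses starting at locctr; return the updated LOCCTR."""
        for literal in self.pending:
            self.address[literal] = locctr
            locctr += len(literal_bytes(literal))
        self.pending.clear()
        return locctr


table = LiteralTable()
for operand in ["=C'EOF'", "=X'05'", "=C'EOF'"]:
    table.reference(operand)
next_locctr = table.place_pool(0x002D)
print(table.address, hex(next_locctr))
```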
Program demonstrating additional assembler features
The above program with object code
Symbol-Defining Statements
Most assemblers provide an assembler directive that allows the
programmer to define symbols and specify their values.
The assembler directive: EQU
E.g.,
symbol  EQU   value
Usage sample:
        +LDT  #4096
can be written as
MAXLEN  EQU   4096
        +LDT  #MAXLEN
Symbol-Defining Statements (An example…)
STAB    RESB  1100
SYMBOL  EQU   STAB
VALUE   EQU   STAB+6
FLAGS   EQU   STAB+9
        LDA   VALUE,X
Expressions
Assemblers generally allow arithmetic expressions formed
according to the normal rules using the operators +, -, *, and /.
E.g.,
MAXLEN  EQU  BUFEND-BUFFER
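Evaluating such an expression amounts to replacing each symbol with its SYMTAB value and applying the usual precedence rules. The sketch below borrows Python's own expression parser for that purpose; the function name evaluate and the symbol values are assumptions for illustration, and division is treated as integer division.

```python
import ast


def evaluate(expression, symtab):
    """Evaluate an operand expression over +, -, * and / using symtab values."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return symtab[node.id]            # symbol -> its assigned value
        if isinstance(node, ast.BinOp):
            left, right = walk(node.left), walk(node.right)
            if isinstance(node.op, ast.Add):
                return left + right
            if isinstance(node.op, ast.Sub):
                return left - right
            if isinstance(node.op, ast.Mult):
                return left * right
            if isinstance(node.op, ast.Div):
                return left // right          # integer division (assumption)
        raise ValueError(f"unsupported expression: {expression}")

    return walk(ast.parse(expression, mode="eval"))


SYMTAB = {"BUFFER": 0x0036, "BUFEND": 0x1036}  # illustrative addresses
print(evaluate("BUFEND-BUFFER", SYMTAB))
```

A real assembler would also have to track whether each term is relative or absolute; that check is omitted here.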
Program Blocks
The source program logically contained subroutines, data areas,
etc. However they were handled by the assembler as one entity,
resulting in a single block of object code.
Note:
The term program blocks refers to segments of code that are
rearranged within a single object program unit, and the term control
sections refers to segments that are translated into independent
object program units.
The assembler directive USE indicates which portions of the
source program belong to the various blocks.
Example of a program with multiple program blocks
The above program with object code
Program Blocks
Pass 1
Use a separate location counter for each program block.
Pass 2
The assembler needs the address of each symbol relative to
the start of the object program.
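The two steps can be sketched together: keep one location counter per block while scanning, then assign each block a starting address and convert block-relative symbol values to program-relative ones. The statement format (label, opcode, operand, size) and the function name layout_blocks are assumptions made for this sketch; the empty string stands for the default block.

```python
def layout_blocks(statements):
    """Return (symbol addresses relative to program start, block start addresses)."""
    locctr = {"": 0}          # one location counter per block
    order = [""]              # blocks in order of first appearance
    symtab = {}               # label -> (block, offset within block)
    current = ""
    for label, opcode, operand, size in statements:
        if opcode == "USE":
            current = operand                 # switch to the named block
            if current not in locctr:
                locctr[current] = 0
                order.append(current)
            continue
        if label:
            symtab[label] = (current, locctr[current])
        locctr[current] += size
    # After the scan, lay the blocks out one after another.
    start, address = {}, 0
    for block in order:
        start[block] = address
        address += locctr[block]
    absolute = {label: start[block] + offset
                for label, (block, offset) in symtab.items()}
    return absolute, start


statements = [
    ("FIRST", "STL", "RETADR", 3),
    ("", "USE", "CDATA", 0),
    ("RETADR", "RESW", "1", 3),
    ("", "USE", "", 0),
    ("NEXT", "LDA", "LENGTH", 3),
    ("", "USE", "CDATA", 0),
    ("LENGTH", "RESW", "1", 3),
]
print(layout_blocks(statements))
```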
The object program
The loading processes
Control sections and program linking
A control section is a part of the program that maintains its identity
after assembly; each such control section can be loaded and
relocated independently of the others.
Note:
1. The assembler has no idea where any other control section will
be loaded at execution time.
2. References between control sections are called external
references.
Two assembler directives:
1. EXTDEF: defines the external symbols that may be used
by other sections.
2. EXTREF: names the symbols that are used in this
control section and defined elsewhere.
Illustration of control sections and program linking
The above program with object code
Control sections and program linking (cont.)
The two new record types are Define and Refer.
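As a rough illustration, the two record types can be built the same way as the other object program records. The layout assumed here (6-column symbol names, 6 hex digits for each defined symbol's relative address) follows the usual SIC/XE textbook convention and is not stated in the slide text; the function names and sample values are hypothetical.

```python
def define_record(symbols):
    """D + (name, relative address) for each symbol listed in EXTDEF."""
    return "D" + "".join(f"{name:<6.6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """R + the name of each symbol listed in EXTREF."""
    return "R" + "".join(f"{name:<6.6}" for name in names)


print(define_record([("BUFFER", 0x33), ("BUFEND", 0x1033), ("LENGTH", 0x2D)]))
print(refer_record(["RDREC", "WRREC"]))
```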
The object program
Assembler Design Options
– One-pass Assembler
Main problem:
One needs to solve the forward reference problem.
Solution:
Require that all data areas be defined in the source program before
they are referenced.
In order to reduce the size of the problem, many one-pass
assemblers prohibit forward references to data items.
Usually, one-pass assemblers generate object code in
memory for immediate execution. No object program is
written out, and no loader is needed.
This kind of assembler is called a load-and-go assembler.
Assembler Design Options
– One-pass Assembler (cont.)
If an instruction operand is a symbol that has not
yet been defined, the operand address is omitted
when the instruction is assembled.
The address of the operand field of the instruction that
refers to the undefined symbol is added to a list of forward
references associated with the symbol table entry.
When the definition for a symbol is encountered, the
forward reference list for that symbol is scanned, and the
proper address is inserted into any instructions previously
generated.
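The bookkeeping described above can be sketched with a symbol table that keeps a forward reference list per undefined symbol. The class name OnePassSymtab, the use of a dictionary as a stand-in for memory, and the sample addresses are assumptions made for this illustration.

```python
class OnePassSymtab:
    def __init__(self, memory):
        self.memory = memory     # address of operand field -> address value
        self.defined = {}        # symbol -> address
        self.forward = {}        # undefined symbol -> list of operand-field addresses

    def reference(self, symbol, field_address):
        """Called when an instruction uses symbol as its operand."""
        if symbol in self.defined:
            self.memory[field_address] = self.defined[symbol]
        else:
            self.memory[field_address] = 0   # operand address omitted for now
            self.forward.setdefault(symbol, []).append(field_address)

    def define(self, symbol, address):
        """Called when the label definition of symbol is encountered."""
        self.defined[symbol] = address
        # Scan the forward reference list and patch earlier instructions.
        for field_address in self.forward.pop(symbol, []):
            self.memory[field_address] = address


memory = {}
symtab = OnePassSymtab(memory)
symtab.reference("RDREC", 0x2013)   # used before it is defined
symtab.reference("RDREC", 0x201F)
symtab.define("RDREC", 0x203D)      # both earlier operand fields are patched
print({hex(k): hex(v) for k, v in memory.items()})
```

Any symbol still present in the forward reference lists at the end of the program is an undefined symbol and would be reported as an error.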
Sample program for a one-pass assembler
Object code in memory and symbol table entries
for above program (after scanning line 40)
Object code in memory and symbol table entries
for above program (after scanning line 160)
Object program from one-pass assembler for
above program
Assembler Design Options
– Multi-pass Assembler
Example of multi-pass assembler operation