Linking and Loading
Download
Report
Transcript Linking and Loading
Linking and Loading
Notes by Dr. Baker
Edited by Dr. Stoecklin
1
Context: Programming Language Implementation
• Direct Execution
– done by hardware
– program must be translated into machine instructions, in main
memory
– example: compilation, linking, and loading of C program
• Interpretation
– done by software
– program is never translated all the way to machine
instructions
– example: interpretation of shell scripts by a shell program
– example: interpretation of Java byte code by a JVM
– ultimately depends on a directly executed interpreter program
2
Compilation, Assembly,
Linking and Loading
3
Compilation, Linking and Loading
4
Libraries
• statically linked
• dynamically linked
• shareable
5
Static Linker Functions
• combine object modules into a single load module
• linkage = replacement of symbolic addresses (relative
to other module origins)
– by absolute addresses
if used with an absolute loader
– by offsets relative to the new load module origin
if used with a relocating loader
• relocation = replacement of addresses relative to local
module origin
– by absolute addresses
if used with an absolute loader
– by offsets relative to the new load module origin
if used with a relocating loader
6
Example of Static Linkage & LinkTime Relocation: Source Code
f.c:
}
main.c:
int f (int x) {
if (x <= 1)
return x;
return x * f (x - 1);
}
#include <stdio.h>
extern int f (int x);
int i = 2;
char format[] = "f (%d) = %d\n";
int main (int argc, char const *argv[]) {
int j;
j = f (i);
printf (format, i, j);
return 0;
7
Example of Static Linkage & LinkTime Relocation: Linker Actions
8
Complete Example
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
Makefile
main.c - C source code
main.E - expanded source code
main.s - assembly code
main.o - object code
main.obj - human-readable object code dump
main.nm - dump of link names
f.c - C source code
f.E - expanded source code
f.s - assembly code
f.o - object code
f.obj - human-readable object code dump
f.nm - dump of link names
linkex - loadable main program
linkex.obj - human-readable object code dump
linkex.out – output of linkage
linkex.nm - dump of link names
9
Compilation Phases & Intermediate Files
main.c
main.s
main.E
main.ob
j
main.n
m
f.c
f.s
f.E
f.obj
f.nm
10
Linking All the (Static) Components
main.c
f.c
linkex.ou
t
11
Linker Options
•
By selecting the linker options
– "-Xlinker -verbose" and "-Xlinker -print-map"
•
•
•
•
•
•
•
•
•
•
•
•
•
•
we can see what the linker, which is invoked as the final stage of gcc,
Did to produce the final executable program.
The linker output, which we redirected to file linkex.out,
shows that it opened the following object and library files:
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.oLOAD main.o
LOAD f.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../libc.so
LOAD /lib/libc.so.6LOAD /usr/lib/libc_nonshared.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o
– The files with "crt" in their name are components of the C-language run time system. The
files with "lib" in their name are libraries.
12
Function f Before Relocation & Linking
Offset
Hex Instr
Symbolic Instruction
0:
55
push
%ebp
1:
89 e5
mov
%esp,%ebp
3:
83 7d 08 01
cmpl
$0x1,0x8(%ebp)
7:
7f 07
jg
10
9:
8b 45 08
mov
0x8(%ebp),%eax
c:
89 c0
mov
%eax,%eax
e:
eb 18
jmp
28
10:
83 ec 0c
sub
$0xc,%esp
13:
8b 45 08
mov
0x8(%ebp),%eax
16:
48
dec
%eax
17:
50
push
%eax
18:
e8 fc ff ff ff
call
19 [f]
1d:
83 c4 10
add
$0x10,%esp
20:
89 c0
mov
%eax,%eax
22:
0f af 45 08
imul
0x8(%ebp),%eax
26:
89 c0
mov
%eax,%eax
28:
c9
leave
29:
c3
ret
2a:
89 f6
mov
13
%esi,%esi
Function main Before Relocation & Linking
Offset
Hex Instr
Symbolic Instruction
0:
55
push
%ebp
1:
89 e5
mov
%esp,%ebp
3:
83 ec 08
sub
$0x8,%esp
6:
83 ec 0c
sub
$0xc,%esp
9:
ff 35 00 00 00 00
pushl
0x0 [i]
f:
e8 fc ff ff ff
call
10 [f]
14:
83 c4 10
add
$0x10,%esp
17:
89 c0
mov
%eax,%eax
19:
89 45 fc
mov
%eax,0xfffffffc(%ebp)
1c:
83 ec 04
sub
$0x4,%esp
1f:
ff 75 fc
pushl
0xfffffffc(%ebp)
22:
ff 35 00 00 00 00
pushl
0x0 [i]
28:
68 00 00 00 00
push
$0x0 [format]
2d:
e8 fc ff ff ff
call
2e [printf]
32:
83 c4 10
add
$0x10,%esp
35:
b8 00 00 00 00
mov
$0x0,%eax
3a:
c9
leave
3b:
c3
ret
14
Relocations of Symbols
Offset
Symbol
0x08048400
main
0x0804843c
f
0x080494ec
format
0x080494e8
i
15
After Relocation and Linking
Offset
Offset in Module
Hex Instr Symbolic Instruction
- main 0x08048400 55
push
%ebp
0x08048401 89 e5
mov
%esp,%ebp
0x08048403 83 ec 08
sub
$0x8,%esp
0x08048406 83 ec 0c
sub
$0xc,%esp
0x08048409 ff 35 08 04 94 e8
pushl
0x0 [i]
0x08048400 e8 08 04 84 3c
call
10 [f]
0x08048414 83 c4 10
add
$0x10,%esp
0x08048417 89 c0
mov
%eax,%eax
0x08048419 89 45 fc
mov
%eax,0xfffffffc(%ebp)
0x0804841c 83 ec 04
sub
$0x4,%esp
0x0804841f ff 75 fc
pushl
0xfffffffc(%ebp)
0x08048422 ff 35 08 04 94 e8
pushl
0x0 [i]
0x08048428 68 08 04 94 ec
push
$0x0 [format]
0x0804842d e8 fc ff ff ff
call
2e [printf]
0x08048432 83 c4 10
add
$0x10,%esp
0x08048435 b8 00 00 00 00
mov
$0x0,%eax
0x0804843a c9
leave
0x0804843b c3
ret
-f-
0x0804843c 55
push
%ebp
0x0804843d 89 e5
mov
%esp,%ebp
0x0804843f 83 7d 08 01
cmpl
$0x1,0x8(%ebp)
0x08048443 7f 07
jg
10
0x08048445 8b 45 08
mov
0x8(%ebp),%eax
0x08048448 89 c0
mov
%eax,%eax
0x0804844a eb 18
jmp
28
0x0804844c 83 ec 0c
sub
$0xc,%esp
0x0804844f 8b 45 08
mov
0x8(%ebp),%eax
0x08048452 48
dec
%eax
0x08048453 50
push
%eax
0x08048454 e8 08 04 84 3c
call
19 [f]
0x08048459 83 c4 10
add
$0x10,%esp
0x0804845c 89 c0
mov
%eax,%eax
16
Loading
17
Types of Loaders
•
•
•
•
absolute loader
relocating loader
dynamic relocating loader
dynamic linking & relocating loader
18
Absolute Loader
• oldest and most primitive
• load module always goes into the same
location in main memory
• location is predetermined at link time
• addresses in code are all absolute
• addresses are bound at link time
19
Relocating Loader
allows module to be loaded into
any location in main memory
but not moved after initial
loading
location is determined at load
time
addresses in code are all relative
load module format must
identify locations of addresses
that need relocation,
and the bases to which they are
relative
the latter is called the relocation
dictionary
addresses are bound at load time
20
Dynamic Relocating Loader
• allows module to be moved after initial
loading
• needed to support swapping
• addresses are kept in relative form
• base is added to address at run time
• special hardware is required
– base & bounds register model
– multiple segment/base register model (e.g.
IBM 360, Intel x86)
21
Dynamic Linking Loader
• Performs linking on demand
• May be done at load time, all at once
• May be done at run time, incrementally
–
–
–
–
can save main memory space
can save time
can defer binding of installation-dependent details
can make system updates and bug-fixes much
simpler
– can break working software
– can also be a security hole
22
Shareable Object Libraries
• several processes can share one copy in
memory
• code must be reentrant
• enabled by dynamic linking and virtual
memory (to come)
• allows code sharing between processes
• allows OS to make better use of main memory
• allows OS to reduce the amount of disk I/O
needed for loading and swapping processes
23
Static vs Dynamic Linking and Loading
24
Review of Binding Times for Functions
•
•
•
•
•
Programming/coding
Compilation/assembly
Static linking (load module creation)
Program load
Run time (dynamic loading on first
reference)
25
Dynamic Memory Allocation
26
Context and Motivation
• Main memory management to processes
(before virtual memory)
• Memory management within a process
(malloc() implementation)
• Memory management within the kernel
• Disk buffer management
27
Operations to Support
• allocate (size in bytes)
• free (pointer to block of storage)
28
Implementation Issues
•
•
•
•
When allocating:
How to keep track of the free blocks?
How to choose which free block to allocate?
What to do with left-over fragment, if selected
free block is larger than request?
• When freeing:
• How to discover the size of the block of
memory to free, given only a pointer?
• How to coalesce adjacent free blocks (to
reverse fragmentation)?
29
Coalescing Free Blocks
30
How to Discover the Size of the
Block Given only a Pointer?
The usual method:
31
How to Keep Track of the Free Blocks?
• implicit list of all blocks, using lengths
requires longer searching for a free block
32
How to Keep Track of the Free Blocks?
• explicit list of just the free blocks, using pointers
allows faster searching for free block
variations allows faster searching and/or insertion:
segregated free lists (blocks grouped by size)
sorted free list (by size, possibly a tree)
33
Placement: How to Choose
Which Block to Allocate
• First Fit
– always search from beginning of list
– can take linear time in total number of blocks (implicit list) or
number of free blocks (explicit list)
– tends to cause small blocks to accumulate at front of list (why?)
– tends to preserve larger blocks at back of list (why?)
• Next Fit
– search from end of previous search
– average search time may be shorter (why?)
– tends to fragment storage into smaller blocks (why?)
• Best Fit
– choose block closest in size that fits
– search entire list (implicit list) or sort list by size (explicit list)
– average search time may be smaller (solution: segregated list, or
tree)
– fragments tend to be small and many (solution: round up size)
34
What to Do with Left-over Fragment?
• General rules:
• split the block - if the fragment is large
enough
• don't split the block - if the fragment is
small
• Buddy System:
• only split into blocks of size that is an
integer power of two
35
How to Coalesce Adjacent Free Blocks
• Subproblems:
• finding neighbor blocks
• determining whether neighbor blocks are
free
• updating the free list
36
Finding Neighbor Blocks
• Traverse entire free list - O(n)
• Keep size at each end - O(1)
37
Determining Whether
Neighbor Blocks Are Free
• Traverse entire free list - O(n)
• Boundary tags - O(1)
38
Updating The Free List
• Traverse entire free list - O(n)
• Boundary tags - O(1)
39
Implementation Details: C
Pointer Arithmetic
• pointer addition is different from integer
addition
• if p is of type int *, p + 1 points to the
next integer (not byte) after the integer
pointed to by p
40