Linux details: Program execution and the dynamic linker

Program Execution in Linux
David Ferry, Chris Gill
CSE 522S - Advanced Operating Systems
Washington University in St. Louis
St. Louis, MO 63143
Creating an Executable File
Two stages:
• Compilation
• Linking
The compiler translates source code to machine code.
The linker connects binary files to libraries to create an executable.
//Source code
#include <stdio.h>
int foo = 20;
int main( int argc, char* argv[]){
    printf("Hello, world!\n");
    return 0;
}
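Source code → Compiler → Relocatable object file → Linker → Executable file
Relocatable object file (symbol table):
00000000 D foo
00000000 T main
         U puts
(GCC typically turns this printf() call into a call to puts(), which is why puts is the undefined symbol.)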
The Symbol Table
Program binaries have a symbol table that keeps track of data and code.
Example:
#include <stdio.h>
int foo = 10;
int bar = 20;
int main( int argc, char* argv[] ){
    printf("Hello, world!\n");
    return 0;
}
The linker must resolve undefined symbols before the program can be run!
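As a rough illustration of how declarations end up in the symbol table, the sketch below shows one translation unit with defined and undefined symbols. The names zeroed, hidden, sum_symbols, and greet are hypothetical and not from the slides; the letters in the comments are the classes nm typically reports for an unoptimized GCC build, and they may differ with other compilers or flags.

```c
#include <stdio.h>

int foo = 10;            /* defined, initialized global: typically nm class D (.data) */
int zeroed = 0;          /* zero-initialized global: typically nm class B (.bss) */
static int hidden = 3;   /* file-local data: typically lowercase d (not visible to other files) */

/* Defined function: nm class T (.text). */
int sum_symbols(void) {
    return foo + zeroed + hidden;
}

/* puts() is only declared by <stdio.h>. The object file records it as
   U (undefined), and the linker must find its definition in the C library. */
void greet(void) {
    puts("Hello, world!");
}
```

Compiling this file with gcc -c and running nm on the resulting object file should list puts as undefined until the file is linked against the C library.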
Static vs. Dynamic Linking
Static linking – required code and data are copied into the executable when it is built (at link time)
Dynamic linking – required code and data are linked to the executable at runtime
Static: my_program.o → executable containing Program Code, Program Data, and Library Code
Dynamic: my_program.o + libc.so → executable containing Program Code and Program Data; Library Code stays in libc.so
Parts of a Program
A program has two components:
• Data
• Code
Either component may be:
• static (fixed at compile time)
• dynamic (linked at run time)
The compiler creates static sections as part of a binary.
The linker links dynamic sections from other binaries.
Virtual Address Space (high to low addresses):
0xc000_0000
Stack
Memory Map Segment
Heap
.bss
.data
.text
0x0000_0000
Program Segmentation
Static code:
• .text segment
Dynamic code:
• Memory map segment
Static data:
• .data segment (initialized)
Dynamic data:
• Memory map segment
Initialized at runtime:
• Stack
• Heap
• .bss
Virtual Address Space, from 0xc000_0000 down to 0x0000_0000: Stack, Memory Map Segment, Heap, .bss, .data, .text
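To see the segments on a live process, a minimal sketch such as the following prints one sample address from each region. The names initialized_global, uninitialized_global, and print_layout are hypothetical. Casting a function pointer to void * for printing is a common POSIX idiom rather than strict ISO C, and the printed addresses and their relative order depend on the platform, on address space layout randomization, and on whether the binary is position-independent.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* static data with an initial value: .data */
int uninitialized_global;      /* zero-filled when the program is loaded: .bss */

/* Prints one sample address per region; returns 0 on success, -1 on failure. */
int print_layout(void) {
    int local = 0;                           /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap */
    if (dynamic == NULL)
        return -1;

    printf(".text  %p\n", (void *)print_layout);
    printf(".data  %p\n", (void *)&initialized_global);
    printf(".bss   %p\n", (void *)&uninitialized_global);
    printf("heap   %p\n", (void *)dynamic);
    printf("stack  %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

Calling print_layout() from main() on several runs usually shows the stack and heap addresses changing while the grouping of .text, .data, and .bss stays the same.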
Running a Statically Linked Program
A statically linked program is entirely self-contained.
The loader creates a valid process by loading a binary image into memory
• On Linux, the execve() system call
The C runtime initializes the process to execute normal C code
• Usually called crt0.o
The C Runtime
• Initializes the C stack and heap
• Sets up argc and argv
• Calls user-specified program constructors and destructors
• Does C library initialization
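One way to observe the constructor and destructor step is the GCC/Clang function attribute shown in this sketch. This is a compiler extension, not ISO C, and the names setup_done, before_main, after_main, and constructor_has_run are hypothetical.

```c
#include <stdio.h>

static int setup_done = 0;

/* GCC/Clang extension: the startup code runs this before main(). */
__attribute__((constructor))
static void before_main(void) {
    setup_done = 1;
    puts("constructor: runs before main()");
}

/* Runs during process teardown, after main() returns or exit() is called. */
__attribute__((destructor))
static void after_main(void) {
    puts("destructor: runs after main()");
}

/* Lets main() check that the constructor has already run. */
int constructor_has_run(void) {
    return setup_done;
}
```

If main() calls constructor_has_run() as its first statement, it should already see the value 1, because the runtime has run before_main() by then.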
Running a Statically Linked Program
1. User calls fork() on an existing process to get a new process address space
2. execve() reads the program into memory
3. Execution starts at _start() in the C runtime, which sets up the environment
4. C runtime eventually calls main()
5. After main() returns, the C runtime does some cleanup
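Steps 1 and 2 above can be sketched from the parent's point of view as follows. The helper name run_program is hypothetical, the child is given an empty environment for simplicity, and steps 3 to 5 happen inside the new program image, so they do not appear in this code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at `path` in a child process and returns its exit status,
   or -1 if the child could not be created or did not exit normally. */
int run_program(const char *path, char *const argv[]) {
    pid_t pid = fork();               /* step 1: duplicate the calling process */
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        char *const envp[] = { NULL };
        execve(path, argv, envp);     /* step 2: replace the process image */
        perror("execve");             /* reached only if execve() failed */
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/true" with an argv of { "true", NULL } should return 0 on a typical Linux system, while a path that does not exist makes the child exit with status 127.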
Running a Dynamically Linked Program
• Some functions and data are not yet present in the process address space when the program starts
• The dynamic linker (ld.so on Linux, distinct from the build-time linker ld) maps these into the memory map segment on demand
Virtual Address Space: Stack, Memory Map Segment, Heap, .bss, .data, .text
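On Linux, the shared objects currently mapped into a process can be inspected through /proc/self/maps. The sketch below is an illustration only: the function name list_shared_objects is hypothetical, it counts mappings rather than libraries (one library usually has several mappings), and it simply matches the substring ".so" in each line.

```c
#include <stdio.h>
#include <string.h>

/* Prints every mapping of this process whose path names a shared object.
   Returns the number of such mappings, or -1 if the map cannot be read. */
int list_shared_objects(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int count = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, ".so") != NULL) {  /* e.g. a libc.so.6 mapping */
            fputs(line, stdout);
            count++;
        }
    }
    fclose(maps);
    return count;
}
```

A statically linked build of the same code would be expected to report no library mappings, which is a quick way to compare the two linking styles.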
Linking at Runtime
At build time:
• The path of the dynamic linker (ld.so) is recorded in the program
• Addresses of dynamic functions are replaced with calls into the linker
At runtime the linker does lazy binding:
• Program runs as normal until it encounters an unresolved function
• Program jumps to the linker
• Linker maps the shared library into the address space and replaces the unresolved address with the resolved address
Runtime Linker Implementation
Uses a procedure linkage table (PLT) to do lazy binding
//Source code
#include <stdio.h>
int foo = 20;
int main( int argc, char* argv[]){
    printf("Hello, world!\n");
    return 0;
}
Address space before the first call to printf():
Stack
Heap
.bss
Procedure Linkage Table (PLT): linker_stub()
.data
.text
Runtime Linker Implementation
Uses a procedure linkage table (PLT) to do lazy binding
Source code: same program as on the previous slide
Address space after the first call to printf():
Stack
Library with printf() function
Heap
.bss
Procedure Linkage Table (PLT): library printf()
.data
.text
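The before and after pictures can be imitated in plain C with a function pointer that starts out aimed at a resolver stub. This is only an analogy for lazy binding, not how ld.so is implemented: plt_slot, linker_stub, library_print, call_via_plt, and times_resolved are all hypothetical names, and a real PLT entry is machine code generated by the linker.

```c
#include <stdio.h>

typedef int (*print_fn)(const char *);

static int resolver_calls = 0;          /* how often the slow path ran */
static int linker_stub(const char *s);  /* stand-in for the dynamic linker */

/* One hypothetical "PLT slot": it initially points at the stub. */
static print_fn plt_slot = linker_stub;

/* Stand-in for the library's real function. */
static int library_print(const char *s) {
    return printf("%s\n", s);
}

/* The first call lands here: resolve the symbol, patch the slot, forward. */
static int linker_stub(const char *s) {
    resolver_calls++;
    plt_slot = library_print;
    return plt_slot(s);
}

/* What the program calls; every call goes through the slot. */
int call_via_plt(const char *s) {
    return plt_slot(s);
}

/* Reports how many times the stub had to resolve the symbol. */
int times_resolved(void) {
    return resolver_calls;
}
```

Calling call_via_plt() twice and then times_resolved() should report a single resolution: only the first call pays the lookup cost, and later calls go straight to library_print() through the patched slot.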
Static vs. Dynamic Linking
Static:
• Does not need to look up libraries at runtime
• Does not need extra PLT indirection
• Duplicates library code on disk in every executable
Dynamic:
• Less disk space (7K vs 571K for hello world)
• Shared libraries may already be in memory and hot in the cache
• Incurs lookup and indirection overheads
Executable File Format
The current binary file format is called ELF - Executable and Linking Format
• First part of the file is the ELF header, which describes the contents of the rest of the file
• Segments contain data & code needed at runtime
• Sections contain linking & relocation data
• Adds additional sections beyond .text, .data, etc.:
  .rodata – read-only data
  .debug – debugging symbol table
  and more…
• GCC adds its own sections…
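As a small illustration of the ELF header, the sketch below reads the identification bytes at the start of a file and checks them against the constants in the Linux <elf.h> header. The function name elf_class_of is hypothetical, and the code looks only at the first EI_NIDENT bytes rather than parsing the full header.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns ELFCLASS32 or ELFCLASS64 if `path` starts with a valid ELF
   identification block, 0 if it does not, and -1 if it cannot be opened. */
int elf_class_of(const char *path) {
    unsigned char ident[EI_NIDENT];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(ident, 1, sizeof ident, f);
    fclose(f);
    if (got != sizeof ident)
        return 0;                              /* too short to be ELF */
    if (memcmp(ident, ELFMAG, SELFMAG) != 0)   /* magic bytes "\177ELF" */
        return 0;

    int cls = ident[EI_CLASS];
    return (cls == ELFCLASS32 || cls == ELFCLASS64) ? cls : 0;
}
```

On Linux, elf_class_of("/proc/self/exe") inspects the running program's own binary, so it should report the class the program was built for.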
Binary File Utilities
• nm – prints symbol table
• objdump – prints all binary data
• readelf – prints ELF data
• pmap – prints memory map of a running process
• ldd – prints dynamic library dependencies of a binary
• strip – strips symbol data from a binary