Transcript 52223_10

52.223 Low Level Programming
Lecturer: Duncan Smeed
The Interface Between High-Level and
Low-Level Languages
High Level Linking
 Programming in assembly language (AL) - also known as low-level language (LLL) - is now seldom used to develop complete
applications.
 Generally speaking, where convenience and development time
are more important than speed or code size, applications can be
effectively written in a high-level language. Such programs
can then be optimised using AL where necessary.
 AL may need to be used to control high-speed hardware, access
low-level DOS services, and so on.
 What we will be concerned with in this section is the interface,
the connection, between high-level languages and assembly
language.
General Conventions
 The following general considerations need to be
addressed when calling AL subroutines from high-level languages:
• Naming Conventions
• Calling Conventions
Naming Conventions
 When calling an AL subroutine from another
language, any identifiers that are shared between the
two languages must be compatible with both.
 These are known as external identifiers.
 The linker resolves references to external identifiers,
but can only do so if the naming conventions are
consistent.
 C does not change the case of names, and it is case
sensitive. Many C compilers also prefix external
identifiers with an underscore (_) character, although
this depends on the compiler and platform.
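As an illustration of the underscore convention described above, the sketch below builds the name a linker would see for a given C identifier. The helper external_name and its underscore_prefix flag are hypothetical names introduced for this example; whether a real toolchain adds the prefix is an assumption that has to be checked for each compiler and platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes into `out` the external (link-level) name
 * that a C compiler would emit for `identifier`.
 * Case is preserved; a leading underscore is added only when the
 * target's convention asks for one (underscore_prefix != 0). */
static void external_name(const char *identifier, int underscore_prefix,
                          char *out, size_t out_size)
{
    size_t prefix_len = underscore_prefix ? 1 : 0;

    /* The output buffer must hold prefix, name and terminator. */
    assert(strlen(identifier) + prefix_len < out_size);

    if (underscore_prefix)
        out[0] = '_';
    strcpy(out + prefix_len, identifier);
}
```

With the prefix enabled, PrintNumber becomes _PrintNumber, so an AL routine meant to be called from such a compiler would have to export its label in exactly that form, with the same capitalisation.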
Calling (HLL to LL) Conventions
 The calling conventions used by a program refer to
the low-level details about how procedures/functions
are called. This is often referred to as a low-level
protocol.
 We need to know the following:
• Which registers must be restored
• The parameter (argument) passing order
• Whether arguments are passed by value or reference
• How the stack pointer will be restored
• How function results will be returned
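Two of the points in the list above, passing by value versus by reference and returning a function result, can be seen at the C level. The following is a minimal sketch; the function names are illustrative and not part of the course material.

```c
#include <assert.h>
#include <stddef.h>

/* Pass by value: the callee receives a copy, so the caller's
 * variable is unchanged. The result comes back as the return value. */
static int doubled_by_value(int x)
{
    x = x * 2;      /* modifies the local copy only */
    return x;
}

/* Pass by reference (in C, by passing an address): the callee
 * updates the caller's variable directly and returns nothing. */
static void double_by_reference(int *x)
{
    assert(x != NULL);
    *x = *x * 2;
}
```

An AL subroutine standing in for either function would have to follow the same agreement: read a value in the first case, or read an address and store through it in the second.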
...Calling (HLL to LL) Conventions
 It is often difficult for the same subroutine to be called
by different languages.
 A subroutine called by Pascal, for instance, expects its
parameters to be in a different order from the
subroutine called by C.
 One principle seems to be almost universal: when an HLL calls
a subroutine, it pushes its arguments on the stack
before executing a CALL instruction.
 But beyond this basic principle, languages vary in their
calling conventions.
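The difference in parameter order can be modelled with a toy stack. The sketch below assumes the usual conventions, in which C pushes arguments right to left and Pascal pushes them left to right; the ToyStack type and its functions are hypothetical and only imitate what the compiled code does with the real stack.

```c
#include <assert.h>

#define STACK_WORDS 16

/* A toy model of a downward-growing stack (illustrative only). */
typedef struct {
    int words[STACK_WORDS];
    int sp;                 /* index of the top-of-stack word */
} ToyStack;

static void toy_init(ToyStack *s) { s->sp = STACK_WORDS; }

static void toy_push(ToyStack *s, int value)
{
    assert(s->sp > 0);      /* guard against overflow of the toy stack */
    s->words[--s->sp] = value;
}

/* C convention: arguments are pushed right to left,
 * so the first argument ends up on top of the stack. */
static void push_args_c(ToyStack *s, const int *args, int count)
{
    for (int i = count - 1; i >= 0; i--)
        toy_push(s, args[i]);
}

/* Pascal convention: arguments are pushed left to right,
 * so the last argument ends up on top of the stack. */
static void push_args_pascal(ToyStack *s, const int *args, int count)
{
    for (int i = 0; i < count; i++)
        toy_push(s, args[i]);
}
```

For the arguments (1, 2, 3), the C order leaves 1 on top of the stack and the Pascal order leaves 3 on top, which is why a subroutine written for one convention reads the wrong values under the other.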
...Calling (HLL to LL) Conventions
 The calling program (caller) needs to know how arguments are
to be passed to the subroutine.
 The caller also needs to know if it is responsible for restoring
the original value of the stack pointer after the call.
 The called subroutine (callee) uses a calling convention to
decide how parameters are to be received.
 The callee also needs to know whether or not it is responsible
for restoring the stack pointer before returning.
 If the subroutine is a function, i.e. one that returns a result, then
the convention for returning the function result also needs to be
known.
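One commonly cited reason for making the caller responsible for restoring the stack pointer in C is that only the caller knows how many arguments it passed. The sketch below uses the standard <stdarg.h> macros to show a callee that has to be told the argument count; the name sum_ints is illustrative.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments. The callee cannot discover how many
 * arguments were passed; it relies on `count`, supplied by the caller. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument */
    va_end(ap);
    return total;                   /* function result returned to the caller */
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. Under a convention where the callee removed a fixed number of bytes before returning, a variable-length call like this could not be cleaned up correctly.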
Stack Structure
Procedure Linking
 An IA-32 processor provides two pointers for linking
of procedures: the stack-frame base pointer and the
return instruction pointer.
 When used in conjunction with a standard software
procedure-call technique, these pointers permit reliable
and coherent linking of procedures.
Stack-Frame Base Pointer
 The stack is typically divided into frames. Each stack frame can
then contain local variables, parameters to be passed to another
procedure, and procedure linking information.
 The stack-frame base pointer (contained in the EBP register)
identifies a fixed reference point within the stack frame for the
called procedure.
 To use the stack-frame base pointer, the called procedure
typically copies the contents of the ESP register into the EBP
register prior to pushing any local variables on the stack.
 The stack-frame base pointer then permits easy access to data
structures passed on the stack, to the return instruction pointer,
and to local variables added to the stack by the called
procedure.
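The role of EBP as a fixed reference point can be sketched with a small software model of the stack. Everything here (FrameModel, push32, read_at_ebp, enter_frame) is a hypothetical model written for illustration, assuming 4-byte stack slots and a single parameter; it is not how a compiler is implemented.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 32

/* Toy 32-bit machine state: word-addressed memory plus ESP and EBP,
 * both held as byte offsets into `mem` (illustrative model only). */
typedef struct {
    uint32_t mem[MEM_WORDS];
    uint32_t esp;
    uint32_t ebp;
} FrameModel;

static void model_init(FrameModel *m)
{
    m->esp = MEM_WORDS * 4;     /* empty stack: ESP just past the end */
    m->ebp = 0;
}

static void push32(FrameModel *m, uint32_t value)
{
    assert(m->esp >= 4);
    m->esp -= 4;
    m->mem[m->esp / 4] = value;
}

/* Reads the word at EBP + offset, as in the operand form offset(%ebp). */
static uint32_t read_at_ebp(const FrameModel *m, int32_t offset)
{
    uint32_t addr = m->ebp + (uint32_t)offset;
    assert(addr / 4 < MEM_WORDS);
    return m->mem[addr / 4];
}

/* Models a call followed by the callee's prologue. */
static void enter_frame(FrameModel *m, uint32_t arg,
                        uint32_t return_addr, uint32_t local)
{
    push32(m, arg);             /* caller: parameter                 */
    push32(m, return_addr);     /* CALL: return instruction pointer  */
    push32(m, m->ebp);          /* callee: save the old EBP          */
    m->ebp = m->esp;            /* callee: copy ESP into EBP         */
    push32(m, local);           /* callee: local variable below EBP  */
}
```

After enter_frame, the parameter is found at EBP+8, the return instruction pointer at EBP+4, the saved EBP at EBP+0 and the local variable at EBP-4, however much further ESP moves.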
Return Instruction Pointer
 Prior to branching to the first instruction of the called
procedure, the CALL instruction pushes the address in
the EIP register onto the current stack.
 This address is then called the return instruction
pointer and it points to the instruction where execution
of the calling procedure should resume following a
return from the called procedure.
 Upon returning from a called procedure, the RET
instruction pops the return-instruction pointer from the
stack back into the EIP register. Execution of the
calling procedure then resumes.
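The pairing of CALL and RET described above can be modelled in a few lines. This is a sketch under simplifying assumptions: CallModel, do_call and do_ret are hypothetical names, and the model tracks only EIP and a stack of return addresses.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 8

/* Toy model of EIP and a stack of return addresses (illustrative only). */
typedef struct {
    uint32_t eip;                   /* address of the next instruction */
    uint32_t stack[STACK_SLOTS];
    int top;                        /* number of slots in use */
} CallModel;

/* CALL: when the call executes, EIP already holds the address of the
 * instruction after the CALL. That value is pushed, then control
 * transfers to `target`. */
static void do_call(CallModel *m, uint32_t target)
{
    assert(m->top < STACK_SLOTS);
    m->stack[m->top++] = m->eip;    /* push the return instruction pointer */
    m->eip = target;
}

/* RET: pop the return instruction pointer back into EIP. */
static void do_ret(CallModel *m)
{
    assert(m->top > 0);
    m->eip = m->stack[--m->top];
}
```

If EIP holds 0x105 when a call to 0x200 is made, EIP becomes 0x200 during the call and returns to 0x105 after do_ret, so the caller resumes at the instruction following its CALL.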
Parameter Passing
 Parameters can be passed between procedures in any
of three ways:
• through general-purpose registers,
• on the stack, or
• in an argument list
Passing Parameters Through the General-Purpose Registers
 The processor does not save the state of the general-purpose registers on procedure calls.
 A calling procedure can thus pass up to six parameters
to the called procedure by copying the parameters into
any of these registers (except the ESP and EBP
registers) prior to executing the CALL instruction.
 The called procedure can likewise pass parameters
back to the calling procedure through general-purpose
registers.
Passing Parameters on the Stack
 To pass a large number of parameters to the called
procedure, the parameters can be placed on the stack,
in the stack frame for the calling procedure.
 Here, it is useful to use the stack-frame base pointer
(in the EBP register) to make a frame boundary for
easy access to the parameters.
 The stack can also be used to pass parameters back
from the called procedure to the calling procedure.
Passing Parameters in an Argument List
 An alternate method of passing a larger number of
parameters (or a data structure) to the called procedure
is to place the parameters in an argument list in one of
the data segments in memory.
 A pointer to the argument list can then be passed to the
called procedure through a general-purpose register or
the stack.
 Parameters can also be passed back to the calling
procedure in this same manner.
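In C, an argument list of this kind corresponds naturally to a structure whose address is passed to the callee. The sketch below is illustrative only; RectArgs and rect_measure are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument list: inputs and outputs grouped in memory. */
typedef struct {
    int width;      /* input  */
    int height;     /* input  */
    int area;       /* output, filled in by the callee */
    int perimeter;  /* output, filled in by the callee */
} RectArgs;

/* Only one pointer is passed; the callee reads its parameters from the
 * list and passes results back through the same list. */
static void rect_measure(RectArgs *args)
{
    assert(args != NULL);
    args->area = args->width * args->height;
    args->perimeter = 2 * (args->width + args->height);
}
```

However many fields the list contains, the call itself passes a single address, which can travel in a register or on the stack.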
Some examples...
Block Scope
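#include <stdio.h>

#define PutDigit(c) {putchar((c)+'0');\
                     putchar('\n');}

void Scope(void) {
    int level = 1; PutDigit(level);             /* outer block: prints 1 */
    {   int level = 2; PutDigit(level);         /* hides the outer level: prints 2 */
        {   int level = 3; PutDigit(level); }   /* innermost level: prints 3 */
        PutDigit(level);                        /* level 3 is out of scope: prints 2 */
    }
    PutDigit(level);                            /* back in the outer block: prints 1 */
}

int main( ) {
    Scope(); return 0;
}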
Recursive Functions
 How does the following program work?
#include <stdio.h>

#define PrintDigit(c) (putchar((c)+'0'))

void PrintNumber(unsigned int number) {
    if (number >= 10)
        PrintNumber(number/10);     /* print the leading digits first */
    PrintDigit(number%10);          /* then print the last digit */
}

int main( ) {
    PrintNumber(1984);
    putchar('\n');
    return 0;
}
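One way to see what the recursion is doing is to replace the call stack with an explicit array. The sketch below is not the lecture's program but a hypothetical equivalent, format_number, that writes the digits to a buffer instead of printing them; each pending digit corresponds to one suspended call of PrintNumber.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the decimal digits of `number` into `out` (NUL-terminated).
 * An explicit array plays the role of the call stack: digits are
 * pushed from least to most significant, then popped in reverse,
 * which is the order in which the recursive calls print them. */
static void format_number(unsigned int number, char *out, size_t out_size)
{
    char pending[sizeof(unsigned int) * 3];  /* at most 3 digits per byte */
    size_t depth = 0;
    size_t pos = 0;

    do {
        pending[depth++] = (char)('0' + number % 10);   /* "push" a digit */
        number /= 10;
    } while (number != 0);

    assert(depth < out_size);   /* room for the digits plus terminator */
    while (depth > 0)
        out[pos++] = pending[--depth];                  /* "pop" a digit */
    out[pos] = '\0';
}
```

For 1984 the digits are pushed as 4, 8, 9, 1 and popped as 1, 9, 8, 4, matching the order in which the recursive version prints them as its calls return.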
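...Recursive Functions
main:
    pushl   %ebp                    # save the caller's frame pointer
    movl    %esp, %ebp              # establish main's stack frame
    subl    $8, %esp                # reserve 8 bytes for outgoing arguments
    andl    $-16, %esp              # align the stack to a 16-byte boundary
    movl    $1984, (%esp)           # argument: number = 1984
    call    PrintNumber
    movl    stdout, %eax            # load the stdout stream pointer
    movl    %eax, 4(%esp)           # second argument: stream
    movl    $10, (%esp)             # first argument: '\n' (ASCII 10)
    call    _IO_putc                # putchar('\n')
    movl    $0, %eax                # return value 0
    movl    %ebp, %esp              # release the frame
    popl    %ebp                    # restore the caller's frame pointer
    ret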
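...Recursive Functions
PrintNumber:
    pushl   %ebp                    # save the caller's frame pointer
    movl    %esp, %ebp              # establish the stack frame
    subl    $24, %esp               # reserve 24 bytes below EBP
    movl    %ebx, -4(%ebp)          # save %ebx in the frame
    movl    8(%ebp), %ebx           # %ebx = number (the parameter)
    cmpl    $9, %ebx
    jbe     .L2                     # number <= 9: skip the recursive call
    movl    $-858993459, %edx       # 0xCCCCCCCD, multiplier for division by 10
    movl    %ebx, %eax
    mull    %edx                    # %edx:%eax = number * 0xCCCCCCCD
    shrl    $3, %edx                # %edx = number / 10
    movl    %edx, (%esp)            # argument: number / 10
    call    PrintNumber             # recursive call
.L2: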
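...Recursive Functions
.L2:
    movl    $-858993459, %edx       # 0xCCCCCCCD, multiplier for division by 10
    movl    %ebx, %eax
    mull    %edx                    # %edx:%eax = number * 0xCCCCCCCD
    shrl    $3, %edx                # %edx = number / 10
    leal    (%edx,%edx,4), %eax     # %eax = 5 * (number / 10)
    addl    %eax, %eax              # %eax = 10 * (number / 10)
    movl    %ebx, %edx
    subl    %eax, %edx              # %edx = number % 10
    addl    $48, %edx               # add '0' (ASCII 48)
    movl    stdout, %eax            # load the stdout stream pointer
    movl    %eax, 4(%esp)           # second argument: stream
    movl    %edx, (%esp)            # first argument: digit character
    call    _IO_putc                # PrintDigit(number%10)
    movl    -4(%ebp), %ebx          # restore %ebx
    movl    %ebp, %esp              # release the frame
    popl    %ebp                    # restore the caller's frame pointer
    ret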
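The listing contains no divide instruction: the constant $-858993459 is the bit pattern 0xCCCCCCCD, and multiplying by it, keeping the upper half of the 64-bit product and shifting right by 3 gives number/10. The C sketch below mirrors those instructions; div10 and mod10 are names chosen here for illustration.

```c
#include <stdint.h>

/* Division by 10 as in the listing: multiply by 0xCCCCCCCD
 * (printed there as the signed value -858993459), keep the high
 * 32 bits of the 64-bit product, then shift right by 3. */
static uint32_t div10(uint32_t number)
{
    uint64_t product = (uint64_t)number * UINT32_C(0xCCCCCCCD);
    uint32_t high = (uint32_t)(product >> 32);   /* %edx after mull */
    return high >> 3;                            /* shrl $3, %edx   */
}

/* Remainder as in the listing: number - 10 * (number / 10). */
static uint32_t mod10(uint32_t number)
{
    uint32_t q = div10(number);
    uint32_t five_q = q + q * 4;            /* leal (%edx,%edx,4), %eax */
    return number - (five_q + five_q);      /* addl %eax,%eax then subl */
}
```

For example, div10(1984) is 198 and mod10(1984) is 4, the same values the program obtains for number/10 and number%10.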