Development tools and Programming for embedded processors
Download
Report
Transcript Development tools and Programming for embedded processors
嵌入式處理器架構與程式設計
王建民
中央研究院 資訊所
2008年 7月
Contents
Introduction
Computer Architecture
ARM Architecture
Development Tools
GNU Development Tools
ARM Instruction Set
ARM Assembly Language
ARM Assembly Programming
GNU ARM ToolChain
Interrupts and Monitor
2
Lecture 4
Development Tools
Outline
Compilers
Assemblers
Linkers and Loaders
Runtime Environment
4
What is a compiler?
A program translator
Source language
E.g., C, C++, Java, Pascal
Target language
E.g., assembly language for x86, MIPS, ARM
5
Historical Background
Machine language first
1957: First FORTRAN compiler
18 programmer-years of effort
Extremely ad hoc
Today’s techniques were created in response
to the difficulties of implementing early
compilers
6
Phases of a Compiler
Analysis (“front end”)
Synthesis (“back end”)
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Intermediate Code Optimization
Target Code Generation/Optimization
Front & back ends share symbol table
7
Lexical Analysis
Aka “scanning”, transform characters into
tokens
Example:
TDOUBLE
TIDENT
TOP
double f = sqrt(-1); TIDENT
TLPAREN
TOP
TINTCONSTANT
TRPAREN
TSEP
(“double”)
(“f”)
(“=“)
(“sqrt”)
(“(“)
(“-”)
(“1”)
(“)”)
(“;”)
8
Syntax Analysis
Aka “parsing”
Uses context-free grammars
Structural validation
Creates parse tree or derivation
9
Derivation of “sqrt(-1)”
Expression -> UnaryExpression
Expression -> FuncCall
Expression -> TINTCONSTANT
UnaryExpression -> TOP Expression
FuncCall
-> TIDENT TLPAREN Expression TRPAREN
Expression
-> FuncCall
-> TIDENT TLPAREN
-> TIDENT TLPAREN
-> TIDENT TLPAREN
-> TIDENT TLPAREN
Expression TRPAREN
UnaryExpression TRPAREN
TOP Expression TRPAREN
TOP TINTCONSTANT TRPAREN
10
Parse Tree of “sqrt(-1)”
Expression
FuncCall
Expression
UnaryExpression
Expression
TIDENT
TLPAREN
TOP
TINTCONSTANT
TRPAREN
11
Semantic Analysis
“Does it make sense”?
Checking semantic rules, such as
Is variable declared?
Are operand types compatible?
Do function arguments match function
declarations?
Types
12
Intermediate Code Generation
A program for an abstract machine
Requirements
Easy to generate from parse tree
Easy to translate into target code
A variety of forms
Quadruple or three-address code
Register transfer language
13
Intermediate Code Example
Three-address code (TAC)
j = 2 * i + 1;
if (j >= n)
j = 2 * i + 3;
return a[j];
L0:
t1 = 2 * i
t2 = t1 + 1
j = t2
t3 = j < n
if t3 goto L0
t4 = 2 * i
t5 = t4 + 3
j = t5
t6 = a[j]
return t6
14
Intermediate Code Optimization
Inhibiting code generation of unreachable
code segments
Getting rid of unused variables
Eliminating multiplication by 1 and addition
by 0
Loop optimization
Common sub-expression elimination
. . ., etc.
15
Code Optimization Example
Before
L0:
t1 = 2 * i
t2 = t1 + 1
j = t2
t3 = j < n
if t3 goto L0
t4 = 2 * i
t5 = t4 + 3
j = t5
t6 = a[j]
return t6
After
t1 = 2 * i
j = t1 + 1
t3 = j < n
if t3 goto L0
j = t1 + 3
L0:
t6 = a[j]
return t6
16
Target Code Generation
Example: a in %o0, i in %o1, n in %o2, j in %g2
t1 = 2 * i
sll %o1, 1, %o1
j = t1 + 1
t3 = j < n
if t3 goto L0
add
cmp
blt
nop
add
j = t1 + 3
L0: t6 = a[j]
return t6
delayed branch
%o1, 1, %g2
%g2, %o2
.LL3
%o1, 3, %g2
.LL3: sll %g2, 2, %g2
retl
ld [%o0+%g2], %o0
17
Pascal Example: Source Code
PROGRAM STATS
VAR
SUM,SUMSQ,I,VALUE,MEAN,VARIANCE : INTEGER
BEGIN
SUM := 0;
SUMSQ := 0;
FOR I := 1 TO 100 DO
BEGIN
READ(VALUE);
SUM := SUM + VALUE;
SUMSQ := SUMSQ + VALUE * VALUE
END;
MEAN := SUM DIV 100;
VARIANCE := SUMSQ DIV 100 – MEAN * MEAN;
WRITE(MEAN,VARIANCE)
END.
18
Pascal Example: Token Coding
Token
PROGRAM
VAR
BEGIN
END
END.
INTEGER
FOR
READ
WRITE
TO
DO
Code
1
2
3
4
5
6
7
8
9
10
11
Token
;
:
,
:=
+
*
DIV
(
)
id
int
Code
12
13
14
15
16
17
18
19
20
21
22
23
19
Scanner Output: Token Stream I
Line
1
2
3
4
5
Token type
1
22
2
22
14
22
14
22
14
22
14
22
14
22
13
6
3
22
15
Toke specifier
Line
^STATS
6
^SUM
^SUMSQ
7
^I
^VALUE
^MEAN
^VARIANCE
^SUM
8
9
Token type
23
12
22
15
23
12
7
22
15
23
10
23
11
3
8
20
22
21
12
Toke specifier
#0
^SUMSQ
#0
^I
#1
#100
^VALUE
20
Scanner Output: Token Stream II
Line
10
11
12
13
Token type
22
15
22
16
22
12
22
15
22
16
22
18
22
4
12
22
15
22
19
Toke specifier
^SUM
Line
^SUM
14
^VALUE
^SUMSQ
^SUMSQ
^VALUE
^VALUE
15
^MEAN
^SUM
16
Token type
23
12
22
15
22
19
23
17
22
18
22
12
9
20
22
14
22
21
5
Toke specifier
#100
^VARIANCE
^SUMSQ
#100
^MEAN
^MEAN
^MEAN
^VARIANCE
21
Pascal Example: BNF Grammar
1
<prog>
2
3
4
5
6
7
8
9
10
11
<prog-name>
<decl-list>
<dec>
<type>
<id-list>
<stmt-list>
<stmt>
<assign>
<exp>
<term>
12
13
14
15
16
17
<factor>
<read>
<write>
<for>
<index-exp>
<body>
::= PROGRAM <prog-name> VAR <decl-list>
BEGIN <stmt-list> END.
::= id
::= <dec> | <decl-list> ; <dec>
::= <id-list> : <type>
::= INTEGER
::= id | <id-list> , id
::= <stmt> | <stmt-list> ; <stmt>
::= <assign> | <read> | <write> | <for>
::= id := <exp>
::= <term> | <exp> + <term> | <exp> - <term>
::= <factor> | <term> * <factor>
| <term> DIV <factor>
::= id | int | ( <exp> )
::= READ ( <id-list> )
::= WRITE ( <id-list> )
::= FOR <index-exp> DO <body>
::= id := <exp> TO <exp>
::= <stmt> | BEGIN <stmt-list> END
22
Parser Output: Parse Tree I
23
Parser Output: Parse Tree II
24
Intermediate Code I
Operation Op1
Op2
Result
(1) :=
#0
SUM
{SUM := 0}
(2) :=
#0
SUMSQ
{SUMSQ := 0}
(3) :=
#1
I
{FOR I := 1 TO 100}
(4) JGT
I
(5) CALL
XREAD
(6) PARAM
VALUE
(7) +
SUM
(8) :=
i1
(9) *
VALUE
VALUE
i2
(10) +
SUMSQ
i2
i3
(11) :=
i3
(12) +
I
#100
(15)
{READ(VALUE)}
VALUE
i1
SUM
{SUM :=
SUM + VALUE}
{SUMSQ := SUMSQ
+ VALUE * VALUE}
SUMSQ
#1
i4
{end of FOR loop}
25
Intermediate Code II
Operation Op1
(13) :=
Op2
i4
Result
I
(14) J
(4)
(15) DIV
SUM
#100
i5
(16) :=
i5
(17) DIV
SUMSQ
#100
i6
(18) *
MEAN
MEAN
i7
(19) -
i6
i7
i8
(20) :=
i8
(21) CALL
XWRITE
{WRITE
(22) PARAM
MEAN
(MEAN,VARIANCE)}
(23) PARAM
VARIANCE
VARIANCE)}
MEAN
{MEAN :=
SUM DIV 100}
{VARIANCE :=
SUMSQ DIV 100
- MEAN * MEAN}
VARIANCE
26
Assembly Code I
27
Assembly Code II
28
Compiler Issues
Symbol Table Management
Scoping
Error Handling & Recovery
Passes
One-pass vs. multi-pass
Most compilers are one-pass up to code
optimization phase
Several passes are usually required for code
optimization
29
End-to-End Compilation
source Lexical tokens
code analysis
IR
gen.
IR
Syntax
analysis
AST
IR
optimization
IR
parse
tree
Semantic
analysis
Target code
generation
Assembly code
Assembler
object
code
Linker & executable
Execution
Loader
code
30
Outline
Compilers
Assemblers
Linkers and Loaders
Runtime Environment
31
C versus Assembly Language
C is called a “portable assembly language”
Allows low level operations on bits and bytes
Allows access to memory via use of pointers
Integrates well with assembly language
functions
Advantages over assembly code
Easier to read and understand source code
Requires fewer lines of code for same function
Doesn’t require knowledge of the hardware
32
C versus Assembly Language
Good reasons for learning assembly
language
In time-critical sections of code, it is possible to
improve performance with assembly language
It is a good way to learn how a processor works
In writing a new operating system or in porting
an existing system to a new machine, there are
sections of code which must be written in
assembly language
33
Best of Both Worlds
Integrating C and assembly code
Convenient to let C do most of the work and
integrate with assembly code where needed
Make our gas routines callable from C
Use C compiler conventions for function calls
Preserve registers that C compiler expects
saved
34
GNU vs. Intel Assembler1
In this course, we will be using the GNU
assembler, referred to as “gas”
Available on UNIX machines as “i386-as”
The GNU assembler uses the AT&T syntax
(instead of official Intel/Microsoft syntax)
Text is written using Intel assembly language
syntax which is not the same as GNU syntax
Local references will provide gas notes for the
text sections with Intel assembly language
35
GNU vs. Intel Assembler2
Overall, the follow are the key differences
between the Intel and the gas syntax:
The GNU operation codes have a size indicator
that is not present on the Intel operation codes
Intel: MOV gas: movb, movw, or movl
The GNU operands are in the opposite order
from the Intel operands
Intel: MOV dest, source gas: movb source, dest
36
GNU vs. Intel Assembler3
The GNU register names are preceded by a %
that is not present on the Intel register names
GNU constants are represented differently from
the Intel constant representations
Intel: MOV AH, AL gas: movb %al, %ah
Intel: MOV AL, 0AH gas: movb $0xa, %al
Comments are indicated with # instead of ;
Intel: ; comment here gas: # comment here
37
GNU vs. Intel Assembler4
You should familiarize yourself with both
Intel and GNU assembly language syntax
Even for GNU assembler, the syntax may
not be the same on different platforms.
You may need to use Intel syntax in your
professional work someday
38
The Four Field Format1
The Label Field
A label is a symbol followed by :
Can be referred to as a representation of the
address
The ‘Opcode’ Field
Mnemonic to specify the instruction and size
Unnecessary to remember instruction code values
Directives to guide the work of the assembler
In GNU assembly language, directive begins with .
39
The Four Field Format2
The Operand Field(s)
On which the instruction operates
Zero, one, or two operands depending on the
instruction
The Comment Field
Comment contains documentation
It begins with a # anywhere and goes to the end
of the line
40
Symbolic Constants
Allow use of symbols for numeric values
Perform same function as #define in C
Format is:
SYMBOL = value
Example:
NCASES = 8
movl $NCASES, %eax
41
Assembly Coding for a C Function
General form for a C function in assembly:
.globl
.text
_mycode
_mycode:
_mydata:
. . .
ret
.data
.long
.end
17
42
Assembler Directives1
Defining a label for external reference (call)
.globl
_mycode
Defining code section of program
.text
Defining data section of program
.data
End of the Assembly Language
.end
43
Assembler Directives2
Defining / initializing storage locations:
.long
.word
.byte
0x12345678
0x1234
0x12
# 32 bits
# 16 bits
# 8 bits
Defining / initializing a string
.ascii
.asciz
“Hello World\n\0”
“Hello World\n”
44
C Function Coding Conventions
Same function name as used in the calling C
program except with a leading _
Use only %eax, %ecx, and %edx to avoid
using registers that the C compiler expects
to be preserved across a call and return
Save/restore other registers on stack as
needed
Return value in %eax before “ret”
instruction
45
Example #1: Sum Two Numbers
C “driver” to execute sum2.s is called
sum2c.c
extern int sum2(void);
int main(void)
{
printf(“Sum2 returned %d\n”, sum2());
return 0;
}
46
Example #1: Assembly Code
Assembly code for sum2.s
#sum2.s -- Sum of
.text
.globl
_sum2: movl $8,
addl $3,
ret
two numbers
_sum2
%eax
%eax
#number in eax
47
How to pass parameters?
How would you modify the source code so
that sum2(int a, int b) returns the sum of
two integer parameters a and b?
extern int sum2(int a,int b);
int main(void)
{
printf(“3 + 8 = %d\n”, sum2(3,8));
return 0;
}
48
Addressing Memory and I/O
With gas, we’ve already seen operands with:
% for registers (part of the processor itself)
$ for immediate data (part of the instruction
itself)
Accessing memory versus I/O
Memory
I/O
Read
movb address, %al
inb address, %al
Write
movb %al, address
outb %al, address
49
Addressing Memory1
Direct addressing for memory
Intel uses [ ] gas does not use any “operator”
Example:
.text
movl %eax, 0x1234
movl 0x1234, %edx
. . .
50
Addressing Memory2
Direct addressing for memory
Gas allows use of a variable name for address
Examples:
total:
.text
movl %eax, total
movl total, %edx
. . .
.data
.long
0
51
Addressing Memory3
Direct addressing for memory
Why can’t we write an instruction such as this?
movl
first, second
Intel instruction set does not support
instructions to move a value from memory to
memory!
Must always use a register as an
intermediate location for the value being
moved
52
Addressing Memory4
Indirect Addressing
Defined as using a register as the address of the
memory location to access in an instruction
movl
movb
$0x1234, %ebx
(%ebx), %al
Memory
%ebx
0x00001234
One byte
%al
53
Addressing Memory5
Indirect Addressing
May also be done with a fixed offset, e.g. 4
movl $0x1234, %ebx
movb 4(%ebx), %al
Memory
%ebx
0x00001234
+4
%al
One byte
54
Outline
Compilers
Assemblers
Linkers and Loaders
Runtime Environment
55
Linker and Loader Functions
Allocation
Loading
Relocation
Linking
Execution
56
Program Relocation
問題:如果程式載入的起始位址和原始
程式的預定值不同,則執行時無法得到
預期的結果
Absolute Program
Relocatable Program
解決方法
利用PC relative或base relative定址模式
提供address modification的資訊給loader,於
程式載入執行前完成修正的工作
57
real address = starting address + offset
Program Linking
允許獨立的程式單元,可以個別組譯,
執行時才和其他程式單元連結
Section:可以獨立組譯、載入和重定位的程
式單元,通常用於副程式或程式的邏輯單元
External definition:定義section內的symbols
給其他section使用
External reference:宣告section內所使用的
symbols為定義在其他section的external
symbols
59
Classification
Loader: a system program that performs the
loading function.
Absolute Loader
Relocating Loader
Linking Loader
Linker: a system program that performs the
linking function.
Linkage Editor
Binder
64
Linkage Editors1
作用
Linkage editor執行linking和部分relocation的
工作,並將連結好的程式寫到檔案或程式庫
說明
連結好的程式稱為load module或executable
image
執行時只需利用簡單的relocating loader就可
以將程式載入記憶體內
載入的動作只需一個pass即可完成
65
Linkage Editors2
比較
Linking loader於每次執行時,都要進行所有
linking的工作,並須處理automatic library
search和external references;linkage editor只
須於產生load module時進行一次上述工作
Linking loader將連結好的程式直接置於記憶
體內;linkage editor將連結好的程式置於檔
案或程式庫內
Linking loader較浪費時間,適合於每次執行
都要重新組譯的狀況;linkage editor適合執
行一個程式很多次而不必重新組譯的狀況
67
Dynamic Linking1
作用
程式執行前不做任何連結的工作,直到程式
執行後,當一個副程式第一次被呼叫時才被
載入記憶體,並和程式的其他部分連結
也稱為dynamic loading或load-on-call
比較
Linkage editor在程式被載入執行前進行連結
Linking loader在程式被載入時進行連結
Dynamic linking將連結延遲到程式執行後
68
Dynamic Linking2
優點
副程式真正用到時才將其載入,可節省載入
和連結沒有被執行到的副程式所需的時間和
佔用的空間
允許使用者以交談的方式呼叫任何副程式,
而不必載入整個程式庫
系統可以隨時動態地進行reconfiguration
缺點
系統較複雜而且有較多的額外工作(overhead)
69
Dynamic Linking3
方法
將dynamic loader視為作業系統的一部分,凡
是需要動態載入的程式都必須透過作業系統
服務來呼叫
呼叫副程式時,程式發出load-and-call的作
業系統服務請求,並以副程式名稱為參數
作業系統檢查內在表格以決定副程式是否已
載入,如尚未載入則將副程式由程式庫載入
作業系統將控制權轉移給被呼叫的副程式
被呼叫的副程式完成處理後,先將控制權傳
回作業系統,再由作業系統傳回原來的程式
72
Outline
Compilers
Assemblers
Linkers and Loaders
Runtime Environment
73
Runtime Environment
To understand the environment in which
your final output will be running.
How a program is laid out in memory:
Code
Data
Stack
Heap
How function callers and callees pass info
74
Executable Layout in Memory
From low memory up:
Code (text segment, instructions)
Static (constant) data
Global data
Dynamic data (heap)
Runtime stack (procedure calls)
Review of what’s in each section:
stack
heap
globl
static
code
75
Text Segment (Executable Code)
Actual machine instructions
Code segment write-protected, so running
code can’t overwrite itself.
Arithmetic / logical / comparison / branch /
jump / load / store / move / …
(Debugger can overwrite it.)
You’ll create the precursor for the code in
this segment by emitting assembly code.
Assembler will build final text.
76
Data Segment1
Data Objects
Whose size is known at compile time
Whose lifetime is the full run of the program
(not just during a function invocation)
Static data includes things that won’t change
(can be write-protected):
Virtual-function dispatching tables
String literals used in instructions
Arithmetic literals could be, but more likely
incorporated into instructions.
77
Data Segment2
Global data (other than static)
Variables declared global
Local variables declared static (in C)
Declared local to a function.
Retain values even between invocations of that
function (lifetime is whole run).
Semantic analysis ensures that static locals are
not referenced outside their function scope.
78
Dynamic Data (Heap)1
Data created by malloc or New.
Heap data lives until deallocated or until
program ends. (Sometimes longer than you
want, if you lose track of it.)
Garbage collection / reference counting are
ways of automatically de-allocating dead
storage in the heap.
79
Dynamic Data (Heap)2
Heap allocation starts at bottom of heap (lower
addresses) and allocates upward.
Requirements of alignment, specifics of allocation
algorithm may cause storage to be allocated out of
(address) order.
*p3
p1 = new Big();
p2 = new Medium();
*p2
*p4
p3 = new Big();
p4 = new Tiny();
So (int)p2 > (int)p1
*p1
0x1000000
But (int)p4 < (int)p3
Compare pointers for equality, not < or >.
80
Runtime Stack1
Data used for function invocation:
Variables declared local to functions (including
main) aka “automatic” data.
Except for statics (in data segment)
Variables declared in anonymous blocks inside
function.
Arguments to function (passed by caller).
Temporaries used by generated code (not
representing names in source).
Possibly value returned by callee to caller.
81
Runtime Stack2
Types of data that can be allocated on
runtime stack:
In C, all kinds of data: simple types, structs,
arrays.
C++: stack can hold objects declared as class
type, as well as pointer type.
Some languages don’t allow arrays on stack.
82
Stack Terminology1
A stack is an abstract data type.
Top
Base
Push new value onto Top; pop value off Top.
Higher elements are more recent, lower
elements are older.
83
Stack Terminology2
Stack implementation can grow any direction.
MIPS stack grows downward (from higher
memory addresses to lower).
Possible difficulty with terminology.
Some people (and documents) talk about going “up”
and “down” the stack.
Some use the abstraction, where “up” means “more
recent”, towards Top.
Some (including gdb) say “up” meaning “towards older
entries”, toward Base.
84
Other Resources
Caches (very fast)
Physical memory (fast)
Virtual memory (swapping is slower)
Possibly multiple levels
Includes main memory + swap space on disk
Registers (the fastest)
85
Storage Layout Issues
Variables (local & file-scope)
Functions
Objects
Arrays
Strings
86
Arrays
C uses row-major order
Whole first row is stored first, then whole
second row, … then whole nth row.
Fortran uses column-major order
Whole first column is stored first, then whole
second col, … then whole kth col.
Storage still a big block, but a column is
contiguous instead of a row.
87
Generating Code for Array Refs
In C, use size of element and range of each
dimension to compute the offset of any
given element:
If A has m rows and n columns of 4-byte
elements:
&A [i] [j] is &A + 4 * (n * i + j)
88
C struct Objects
Structs in C are stored in adjacent words,
enough for all fields to be aligned:
struct cow {
char milk;
// 3 slack-bytes after this
char* name; // aligned on single word } Cow;
‘A’
0x20000
“Bossy”
89
Making Function Calls
Each active function call has its own unique
stack frame
Frame pointer
Static link
Return address
Arguments
Local variables and temporaries
Who does which (caller vs callee)?
How do callers and callees communicate?
90
Who does what?1
Before a function call, the calling routine:
Saves any necessary registers
Pushes arguments onto the stack
Sets up the static link (if appropriate)
Saves the return address into $ra
Jumps (or branches) to the target (AKA the
callee, the called function)
91
Who does what?2
During a function call, the called routine:
Saves any necessary registers
Sets up the new frame pointer
Makes space for any locals or temporaries
Does its work
Sets up return value in $v0
Works only for integer or pointer values
Tears down frame pointer and static link
Restores any saved registers
Jumps to return address (saved on stack)
92
Who does what?3
After a function call, the calling routine:
Removes return address and parameters from
the stack
Gets return value from $v0
Restores any saved registers
Continues executing
93
Parameter Passing
Call by value
Call by value-result
Supported by Ada
Call by reference
Supported by C
Supported by Fortran
Call by name
Like C preprocessor macros
94
An Example Program
int dump(arg1, arg2, arg3, stop)
int arg1, arg2, arg3, *stop;
{
int loc1 = 5, loc2 = 6, loc3 = 7;
int *p;
printf("Address
Content\n");
for (p = stop; p >= (int*)(&p); p--)
printf("%8x:
%8x\n", p, *p);
return 9;
}
int main(argc, argv, envp)
int argc;
char *argv[], *envp[];
{
int var1 = 1, var2 = 2, var3 = 3;
var3 = dump(var1, var2, var3, &envp);
}
95
Sample Output
Address
bffff9a8:
bffff9a4:
bffff9a0:
bffff99c:
bffff998:
bffff994:
bffff990:
bffff98c:
bffff988:
bffff984:
bffff980:
bffff97c:
bffff978:
bffff974:
bffff970:
bffff96c:
bffff968:
bffff964:
bffff960:
bffff95c:
bffff958:
Content
bffff9ec
bffff9e4
1
420158d4
bffff9b8
1
2
3
bffff998
4212a2d0
4212aa58
bffff9a8
3
2
1
80483d1
bffff998
5
6
7
bffff958
Comment
envp
argv
argc
return address (crt0)
fp (crt0) <- fp (main)
var1
var2
var3
stop
arg3
arg2
arg1
return address (main)
fp (main) <- fp (dump)
loc1
loc2
loc3
p
96