Symbol table
Download
Report
Transcript Symbol table
Program design and
analysis
Designing embedded programs is more
difficult and challenging as compared to
PC/workstation programs.
Design challenges:
-
rich functionality
Meet system deadlines
Fit into restricted memory
Meet power requirements
Debugging
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Assembly and linking
Steps in compilation:
HLL
HLL
HLL
compiler
Symbolic to bit
level translation
assembly
assembly
assembly
code
assembler
Object
assembly
code
assembly
loader
© 2000 Morgan
Kaufman
Executable
binary
Overheads for Computers as
Components
linker
Exec built from
many files, addresses
linked, …
Multiple-module programs
Programs may be composed from several
files addresses for instructions/data are done by linker
Addresses become more specific during
processing:
relative addresses are measured
relative to the start of a module;
absolute addresses are measured
relative to the start of the CPU address
space.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Assemblers
Major tasks:
generate binary for symbolic
instructions;
translate labels into addresses;
handle pseudo-ops (data, etc.).
Starting point in
Assembly labels:
label1
© 2000 Morgan
Kaufman
ORG 100
ADR r4,c
memory
Pseudo-op
Overheads for Computers as
Components
Two-pass assembly
Pass 1:
Scan code to determine address of each
module generate symbol table
Pass 2:
Assembles instructions using labels
computed in first pass generate binary
instructions
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Symbol table
PLC
xx
yy
ADD r0,r1,r2
ADD r3,r4,r5
CMP r0,r3
SUB r5,r6,r7
assembly code
© 2000 Morgan
Kaufman
xx
yy
0x8
0x10
symbol table
Overheads for Computers as
Components
Symbol table generation
Use program location counter (PLC) to
determine address of each location.
Scan program, keeping count of PLC.
Addresses are generated at assembly
time, not execution time.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Symbol table example
PLC=0x7
Symbol table
ADD
r0,r1,r2
PLC=0xb
xx
ADD r3,r4,r5
PLC=0xf
CMP r0,r3
PLC=0x13
yy
SUB r5,r6,r7
xx
yy
0xb
0x13
NOTE: Assume that
each instruction takes
4 memory locations
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Pseudo-operations
Pseudo-ops do not generate instructions:
ORG sets program location.
EQU generates symbol table entry without
advancing PLC.
Example:
- Label FOO=5 will be
added to symbol table
add r0, r1, r2
FOO EQU 5
- BAZ will remain the
nd
same
as
when
2
BAZ SUB r3,r4,#FOO
statement is not present
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Linking
Programs are written as several smaller
pieces
Programs may use library routines (.o files)
Linker combines several object modules
into a single executable module.
Jobs:
put modules in order;
resolve labels across modules.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Externals and entry points
entry point
xxx
yyy
ADD r1,r2,r3
a
B a external reference
%1
ADR r4,yyy
ADD r3,r4,r5
Some labels are both
defined and used in
the same file
Some will be defined in
a single file but used
elsewhere
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Module ordering
Controlling where code modules are loaded
in memory is important, e.g., :
- Interrupt code must be placed at precise
locations in memory
- Code has to be placed in specific memory
locations in EPROM/RAM.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
… Module ordering
Code modules must be placed in absolute
positions in the memory space.
Load map or linker flags (given by user)
control the order of modules.
module1
module2
module3
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Dynamic linking
Some operating systems link modules
dynamically at run time:
shares one copy of library among all
executing programs Saves storage space as we
do not have to link separate copies of commonly used routines
such as I/o to every executable program
allows programs to be updated with
new versions of libraries.
Some delay is introduced before
program starts execution
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program design and
analysis
We need to control:
where interrupt code is placed
where data and instructions are placed
execution speed, memory/power consumption
Compilation flow.
Basic statement translation.
Basic optimizations.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Compilation:
High level Low level code
Compilation strategy:
compilation = translation + optimization
High level language translation
Better instruction sequence
Compiler determines quality of code:
use of CPU resources, e.g., register usage
memory access scheduling;
code size.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Basic compilation phases
HLL
parsing, symbol table, semantic analysis
machine-independent
optimizations
machine-dependent
optimizations
assembly
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
e.g., arithmetic
simplifications
(use one memory
location for x in
x= c*x )
Statement translation and
optimization
Source code is translated into
intermediate form such as CDFG.
CDFG is transformed/optimized.
CDFG is translated into instructions with
optimization decisions machine dependent
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Arithmetic expressions
a*b + 5*(c-d)
b
a
c
*
expression
5
-Some machines perform
memory to memory arithmetic
x
w
-In many machines, e.g.,
ARM, we have to load
variables into registers
*
y
+
z
DFG
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
d
Temporary
variables
Arithmetic expressions,
cont’d.
b
a
1
c
*
2
5
3
4
*
+
d
-
ADR r4,a
MOV r1,[r4]
ADR r4,b
MOV r2,[r4]
MUL r3,r1,r2
ADR r4,c
MOV r1,[r4]
ADR r4,d
MOV r5,[r4]
SUB r6,r4,r5
2
MUL r7,r6,#5
3
ADD r8,r7,r3
DFG
© 2000 Morgan
Kaufman
1
code
Overheads for Computers as
Components
4
An obvious
optimization:
Use a register
whose value is
not needed
Control code generation
if (a+b > 0)
x = 5;
else
x = 7;
a+b>0
x=7
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
x=5
Control code generation,
cont’d. CDFG model (1-2-3/other walk)
1
3
a+b>0
T
x=5
2
ADR r5,a
LDR r1,[r5]
ADR r5,b
LDR r2,b
ADD r3,r1,r2
BLE label3
1
LDR r3,#5
ADR r5,x
STR r3,[r5]
B stmtent
x=7
label3
© 2000 Morgan
Kaufman
LDR r3,#7
ADR r5,x
STR r3,[r5]
stmtent ...
2
Overheads for Computers as
Components
3
Procedure linkage:
CPU’s
subroutine call is not usually sufficient to support
procedures in programming languages
Need code to:
call and return;
pass parameters and results.
Parameters and returns are passed on
stack.
Procedures with few parameters may use
registers.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
A Typical Procedure Linkage:
f(arg_0, …, arg_n-1)
Push arg_n-1 (onto stack)
…
Push arg_1
Push arg_0
Store return address in a register
Branch to procedure B (f, Return address)
De-allocate arguments POP n words off the
stack
Use a register for return variable of f(.)
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Stack-based management- handled by
compiler
A block of memory called “stack frame”
is required for every call of a function.
The stack grows and shrinks during
execution difficult to predict worst case at compile time
SP
Total
Stack
space
© 2000 Morgan
Kaufman
SP
Used
Space
Overheads for Computers as
Components
SP
Procedure stacks
proc1(int a) {
int b, c;
b=5;
proc2(b);
c=2+b;
…
}
growth
proc1
FP
frame pointer: defines
end of last frame
SP
stack pointer: defines
end of current frame
© 2000 Morgan
Kaufman
proc2
5
accessed relative to SP
Overheads for Computers as
Components
ARM procedure linkage
APCS (ARM Procedure Call Standard):
BL foo
r0-r3 pass parameters into procedure. Extra
parameters are put on stack frame.
r0 holds return value.
r4-r7 hold register values.
r11 is frame pointer, r13 is stack pointer.
r10 holds limiting address on stack size to
check for stack overflows.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
8051, PIC:
Static Allocation- no runtime memory allocation
How each byte of RAM is used is
established at compile time
Global and static data are allocated in
fixed locations.
Local data is stored in a block set for
each function, i.e., local data x is stored in
the same place for each invocation nonreentrant code
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Data layout in memory:
1D, 2D arrays, structures
Compiler: must translate references to
data structures to references in raw
memory address computations
Different types of data structures use
different data layouts.
Some offsets into data structure can be
computed at compile time, others must be
computed at run time.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
One-dimensional arrays
C array name points to 0th element:
aptr
a[0]
a[1]
= *(aptr + 1)
a[2]
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Two-dimensional arrays:
different possibilities
2-dim array is accessed thru 1-dim array
Row-major layout:
a[0,0]
a[0,1]
...
N
...
a[1,0]
a[1,1]
© 2000 Morgan
Kaufman
M
Calculation done at
run time when i, j
change
= a[i*M+j]
Overheads for Computers as
Components
Structures
Fields within structures are static offsets:
aptr
struct {
int field1;
char field2;
} mystruct;
field1
field2
struct mystruct a, *aptr = &a;
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
4 bytes
*(aptr+4)
Addition
done at
compile
time
(a.filed2)
Expression simplification:
Machine independent
Constant folding:
8+1 = 9
Algebraic:
a*b + a*c = a*(b+c)
Strength reduction:
Addition done at compile
time
a*2 a<<1
for (i=0 ; i<NOPS+1; i++) …
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Dead code elimination
Dead code:
Compile flag
#define DEBUG 0
if (DEBUG) dbg(p1);
0
0
Can be eliminated by
analysis of control
flow, constant folding.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
1
dbg(p1);
Procedure inlining
Eliminates procedure linkage overhead:
int foo(a,b,c) { return a + b - c;}
z = foo(w,x,y);
Inlining result: z = w + x - y;
C++ provides an inline construct.
More code/memory size
may slow down cached systems
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Loop transformations
Loops are compact but can contribute to a
large fraction of computation time
Goals:
reduce loop overhead;
improve memory system performance.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Loop unrolling
Reduces loop overhead, enables
some other optimizations.
for (i=0; i<4; i++)
a[i] = b[i] * c[i];
iteration and test contribute to overhead
a[0]=b[0]*c[0], …, a[3]=b[3]*c[3]
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Loop fusion
Fusion combines two loops into 1:
for (i=0; i<N; i++) a[i] = b[i] * 5;
for (j=0; j<N; j++) w[j] = c[j] * d[j];
for (i=0; i<N; i++) {
a[i] = b[i] * 5; w[i] = c[i] * d[i];
}
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Register allocation
Goals:
choose register to hold each variable;
determine lifespan of variable in the
register.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Register lifetime graph
w = a + b; t=1
x = c + w; t=2
y = c + d; t=3
Naive allocation:
Assign each variable to
separate registers (7
registers)
a
b
c
d
w
x
y
1
2
3
time
Maximum 3 registers in use
3 registers required
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Using your compiler
Understand various optimization levels (O1, -O2, etc.)
Study the assembly language output.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program design and
analysis
Optimizing for execution time.
Optimizing for energy/power.
Optimizing for program size.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Motivation
Embedded systems must often meet
deadlines.
Faster may not be fast enough.
Need to be able to analyze execution
time.
Worst-case, not typical.
Need techniques for reliably improving
execution time.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Run times will vary
Program execution times depend on
several factors:
Input data values different execution
paths
State of the instruction, data caches.
Pipelining effects.
Other: e.g., Floating point operations- most
sensitive to data values
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Measuring program speed
CPU simulator.
I/O may be hard.
May not be totally accurate.
Hardware timer start/stop at beginning/end of code
Requires board, instrumented program.
Logic analyzer.
Limited logic analyzer memory buffer.
Relies on code being able to produce
identifiable bus events.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program performance
metrics
Average-case:
For typical data values, whatever they are.
Worst-case:
for meeting deadlines
For any possible input set.
Best-case:
For any possible input set.
Too-fast programs may cause critical
races at system level timing
requirements should be met as well
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Performance analysis
Elements of program performance
(Shaw):
execution time = program path + instruction
timing
Path is the sequence of instructions and
depends on data values.
Instruction timing depends on: data
dependency, pipeline/cache behavior.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program paths:
longest path != longest
execution time
Consider for loop:
i=0; f=0;
for (i=0, f=0, i<N; i++)
f = f + c[i]*x[i];
Loop initiation block
executed once.
Loop test executed
N+1 times.
Loop body and
variable update
executed N times.
© 2000 Morgan
Kaufman
i<N
N
Y
f = f + c[i]*x[i];
Overheads for Computers as
Components
i = i+1;
Loop optimizations
Loops are good targets for optimization.
Basic loop optimizations:
code motion;
induction-variable elimination;
strength reduction (x*2 -> x<<1).
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Code motion:
Moving unnecessary
code out of a loop
Performed at every iteration,
i.e., N*(M-1) executions
i=0; Xi=0;
= N*M
i<N*M
i<X
Y
for (i=0; i<N*M; i++)
z[i] = a[i] + b[i];
z[i] = a[i] + b[i];
i = i+1;
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
N
Induction variable
elimination
Induction variable: value derived from
loop iteration, e.g., loop index.
Consider loop:
for (i=0; i<N; i++)
for (j=0; j<M; j++)
z[i,j] = b[i,j];
Rather than re-compute i*M+j for each
array in each iteration, share induction
variable between arrays, increment at end
of loop body.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Induction variable
elimination
for (i=0; i<N; i++)
for (j=0; j<M; j++){
zbinduct = i*M + j;
*(zptr+zbinduct)=*(bptr+zbinduct);
}
OR: Eliminate multiplication (strength reduction)
zbinduct=0;
for (i=0; i<N; i++)
for (j=0; j<M; j++){
*(zptr+zbinduct)=*(bptr+zbinduct);
zbinduct++;
© 2000 Morgan }
Overheads for Computers as
Kaufman
Components
Performance optimization
hints
Simple high-level statements may be
time consuming, e.g., dyn. mem. alloc
malloc( )
Use registers efficiently.
Use page mode memory accesses
arrange data contiguously so that they go in one
page
Analyze cache behavior
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Energy/power optimization
Energy: ability to do work.
Most important in battery-powered systems.
Power: energy per unit time.
Important even in wall-plug systems---power
becomes heat Controlling power increases
reliability and cost
Memory accesses are a major component of power
consumption Reduce memory access
Turn-off parts of the system when not needed
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Measuring energy
consumption
Execute a small loop, measure current:
Ammeter
I
while (TRUE)
a();
CPU
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Calculate power:
Power with a() in the
while loop –
Power without a() in the
while loop
Sources of energy
consumption
Relative energy per operation:
memory transfer: 33 most expensive
external I/O: 10
SRAM write: 9
SRAM read: 4.4
multiply: 3.6
add: 1
register access (<1) most efficient
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Cache behavior is
important
Cache is power hungry built from SRAM
Energy consumption has a sweet spot as
cache size changes:
cache too small: program thrashes,
burning energy on external memory
accesses;
cache too large: cache itself burns too
much power.
There is an optimum cache size
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Optimizing for energy
First-order optimization:
high performance = low energy.
Not many instructions trade speed for
energy.
Generally speaking: Making program run
faster also reduces energy consumption.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Optimizing for program
size
Goal:
reduce hardware cost of memory;
reduce power consumption of memory units.
Two opportunities:
data;
instructions.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Data size minimization
Reuse constants, variables, data buffers in
different parts of code.
Select buffer size carefully
Requires careful verification of correctness.
Generate data using instructions- on-thefly Although code takes space, there may be some
net space saving
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Reducing code size
Avoid function inlining.
Encapsulate functions in subroutines
Due to overhead there is a minimum size function body
for which a subroutine makes sense
Choose CPU with compact instructions.
Use specialized instructions where
possible e.g. Multiply-Accumulate is smaller and
faster than separate arithmetic operations
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program design and
analysis
Program validation and testing.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Goals
Make sure software works as intended.
We will concentrate on functional testing--performance testing is harder.
What tests are required to adequately test
the program?
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Basic testing procedure
Provide the program with inputs.
Execute the program.
Compare the outputs to expected results.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Types of software testing
Black-box: tests are generated without
knowledge of program internals.
Clear-box (white-box): tests are
generated from the program structure.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Program design and
analysis
Software modem.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Theory of operation
Frequency-shift keying:
separate frequencies for 0 and 1.
0
1
time
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
FSK encoding
Generate waveforms based on current bit:
0110101
© 2000 Morgan
Kaufman
bit-controlled
waveform
generator
Overheads for Computers as
Components
A/D converter
FSK decoding
zero filter
detector
0 bit
one filter
detector
1 bit
Passes frequencies in the range
representing 1 and reject those
representing 0.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Transmission scheme
Send data in 8-bit bytes. Arbitrary spacing
between bytes.
Byte starts with 0 start bit.
Receiver measures length of start bit to
synchronize itself to remaining 8 bits.
start (0)
bit 1
bit 2
Length denotes start bit
© 2000 Morgan
Kaufman
bit 3
...
bit 8
Samples taken at midpoints
Overheads for Computers as
Components
Requirements
Inputs
Analog sound input, reset button.
Outputs
Analog sound output, LED bit display.
Functions
Transmitter: Sends data from memory
in 8-bit bytes plus start bit.
Receiver: Automatically detects bytes
and reads bits. Displays current bit on
LED.
1200 baud.
Performance
Manufacturing cost
Power
Physical
size/weight
© 2000 Morgan
Kaufman
Dominated by microprocessor and
analog I/O
Powered by AC.
Small desktop object.
Overheads for Computers as
Components
Specification
Line-in*
1
1
Receiver
sample-in()
bit-out()
input()
Transmitter
1
1
bit-in()
sample-out()
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Line-out*
output()
System architecture
Interrupt handlers for samples:
input and output.
Transmitter.
Receiver.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Transmitter
Waveform generation by table lookup.
float sine_wave[N_SAMP] = { 0.0, 0.5,
0.866, 1, 0.866, 0.5, 0.0, -0.5, -0.866, -1.0, 0.866, -0.5, 0};
time
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Receiver
Filters (FIR for simplicity) use circular
buffers to hold data.
Timer measures bit length.
State machine recognizes start bits, data
bits.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Hardware platform
CPU.
A/D converter.
D/A converter.
Timer.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
Component design and
testing
Easy to test transmitter and receiver on
host.
Transmitter can be verified with speaker
outputs.
Receiver verification tasks:
start bit recognition;
data bit recognition.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components
System integration and
testing
Use loopback mode to test components
against each other.
Loopback in software or by connecting D/A
and A/D converters.
© 2000 Morgan
Kaufman
Overheads for Computers as
Components