Matlab Extensions for the
Development, Testing and Verification
of Real-Time DSP Software
David P. Magee
Communication Systems Engineer
Texas Instruments
Dallas, TX
Presentation Outline
DSP Software Development
DSP Simulator
Introduction to Intrinsics
FFT Example
Algorithm Optimization Results
Other Matlab and Simulink Extensions
Closing Remarks
Q&A
DSP Software Development
Common steps for DSP software development
Step 1: Develop Understanding
Develop Floating Point Simulation, then Debug Simulation
Step 2: Address Scaling Issues
Develop Fixed Point Simulation, then Debug Simulation
Step 3: Optimize for Performance
Develop Assembly Code, then Debug Assembly Code
Issues with the 3 Step Approach
Each step takes time and resources
Algorithm testing at each stage
Multiple versions of the algorithm: version control headaches
Evaluation of processor instruction set compatibility and MIPS requirements often occurs late in the software development cycle
Debugging algorithms on a pipelined and/or parallel processor can be very difficult (the problem is getting more difficult as processors become more complicated)
Can the development cycle be improved ? Yes !
Improved Software Development Cycle
Merge Steps 2 and 3
Step 1: Develop Understanding
Develop Floating Point Simulation, then Debug Simulation
Step 2: Address Scaling Issues and Optimize for Performance Simultaneously
Develop Fixed Point Simulation and Assembly Code Simultaneously, then Debug Simulation and Assembly Code
Question: How can these steps be combined ?
Matlab + DSP Simulator
Develop Floating Point and Fixed Point Simulations in a single development environment - Matlab
Develop and test C/C++ code for Fixed Point Simulation in cooperation with the DSP Simulator
Migrate the C/C++ code directly to the target DSP
[Figure: block diagram with labels Matlab Simulation Environment, Floating Point Simulation, Fixed Point Simulation, System Simulation, Host Environment, DSP Simulator]
DSP Simulator in Matlab
Develop and Debug Fixed Point C/C++ Code in Matlab
Benefits:
Accelerate the development and analysis of DSP code
A mechanism to implement your IP blocks in efficient DSP code
Process large amounts of data
Compare fixed point and floating point algorithm implementations
Provide mixed simulation environment with fixed point and floating point algorithm implementations
Advanced graphing capabilities
[Figure: block diagram with labels DSP Simulator, C/C++ code, MEX-file, Matlab]
What is a MEX-file ?
A file containing one function that interfaces C/C++ code to the Matlab shell
MathWorks specifies the syntax for this function
    void mexFunction(int nlhs, mxArray *plhs[],
                     int nrhs, const mxArray *prhs[])
See http://www.mathworks.com and enter "mex files" into their search engine
What is a DSP Simulator ?
A library of functions that simulate the
mathematical operations of DSP assembly
instructions.
For TI DSPs, the compiler recognizes special
functions called Intrinsics and maps them directly
into inline assembly instructions
In the DSP Simulator, make each function represent
a supported compiler Intrinsic
Intrinsic Example
ADD2: adds the upper 16-bit halves and the lower 16-bit halves of two 32-bit registers independently
Intrinsic: dst = _add2(src1,src2)
Assembly Instruction: ADD2 (.unit) src1,src2,dst
Example:
C code
    Function Example() {
        .
        y = _add2(a,b);
        .
    }
Compile
C6x Assembly Code
    .
    ADD2 .S1 A1,A2,A3
    .
DSP Simulator Example
C Code with _add2() Intrinsic
C code
    Function Example() {
        .
        y = _add2(a,b);
        .
    }
DSP Simulator
    typedef struct _REG32X2
    {
        short lo;
        short hi;
    } reg32x2;

    int32 _add2(int32 a,int32 b) {
        int32 y;
        reg32x2 *pa,*pb,*py;
        pa = (reg32x2 *)&a; pb = (reg32x2 *)&b;
        py = (reg32x2 *)&y;
        py->lo = pa->lo+pb->lo;
        py->hi = pa->hi+pb->hi;
        return(y);
    } // end of _add2() function
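The struct-overlay version of _add2() on this slide relies on the host's byte order and on casting between pointer types. As a minimal sketch of the same lane-wise addition written with shifts and masks instead, the function below may help; the name sim_add2 is hypothetical (it is not taken from C6xSimulator.c), and wraparound of each 16-bit lane on overflow is assumed, matching the plain + used in the slide's code.

```c
#include <stdint.h>

/* Hypothetical host-side model of ADD2 (name is an assumption):
 * the two 16-bit halves are added independently, so a carry out of
 * the low half never reaches the high half. */
int32_t sim_add2(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t lo = (ua + ub) & 0x0000FFFFu;          /* low lanes, wrap mod 2^16  */
    uint32_t hi = ((ua >> 16) + (ub >> 16)) << 16;  /* high lanes, wrap mod 2^16 */
    return (int32_t)(hi | lo);
}
```

For example, sim_add2(0x00010002, 0x00030004) gives 0x00040006, and sim_add2(0x0000FFFF, 0x00000001) gives 0 because the low-lane carry is discarded.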
DSP Simulator
How many Intrinsics exist for each DSP family ?
TMS320C54x: 36
TMS320C55x: 42
TMS320C62x: 59
TMS320C64x: 135
TMS320C64x+: 162
TMS320C67x: 68
Most algorithms previously written in assembly code can now be expressed in C/C++ code with Intrinsic function calls
DSP Simulator
Consists of two files
C6xSimulator.c
C6xSimulator.h
Contains C functions for representing the
numerical operations of 158 DSP assembly
instructions
Can control endianness with a symbolic constant
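One way a symbolic constant could select the byte order is sketched below. The slides do not give the constant's actual name, so SIM_LITTLE_ENDIAN, split32 and host_is_little_endian are all hypothetical; only the reg32x2 layout idea comes from the presentation. The sketch assumes the struct of two 16-bit fields has no padding.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical constant: 1 if the host stores the low half first. */
#ifndef SIM_LITTLE_ENDIAN
#define SIM_LITTLE_ENDIAN 1
#endif

/* Field order follows the selected byte order so that .lo and .hi
 * always name the low and high 16-bit halves of a 32-bit word. */
typedef struct {
#if SIM_LITTLE_ENDIAN
    int16_t lo;
    int16_t hi;
#else
    int16_t hi;
    int16_t lo;
#endif
} reg32x2;

/* Copy the bytes of a 32-bit word into the two-field view. */
static reg32x2 split32(int32_t v)
{
    reg32x2 r;
    memcpy(&r, &v, sizeof r);
    return r;
}

/* Runtime check that the constant matches a little-endian host. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host with the default setting, split32(0x12345678) yields lo = 0x5678 and hi = 0x1234; on a big-endian host the constant would be set to 0 to get the same field values.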
DSP Simulator and C++
DSP Simulator works in C++ programming
environments
Partition data into appropriate types (real, complex) and bit widths (8/16/32 bits)
Write functions in C++
Use operator overloading for required data types to
map operators to the desired Intrinsic functions
Benefit: Operator overloading allows for easy
migration to next generation DSP
instruction sets
Using the DSP Simulator
Develop C/C++ code with Intrinsic function calls
Compile and link the C/C++ code and the DSP
Simulator to form a Matlab executable file
Debug and evaluate the performance of the fixed
point algorithms in Matlab
Rely on TI tools to generate an optimized assembly
version of the C/C++ code for the target DSP
Benefit: One version of C/C++ code runs in Matlab
and in the target DSP !
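The "one version of C/C++ code" idea can be sketched with conditional compilation. This assumes the TI C6000 compiler predefines _TMS320C6X; the ADD2 wrapper macro, host_add2 and add_packed are hypothetical names introduced for illustration. In the workflow described in the slides the host-side function would come from C6xSimulator.h; a small stand-in is defined inline here only so the sketch compiles on its own.

```c
#include <stdint.h>

#ifdef _TMS320C6X
/* Target build: the TI compiler maps _add2() to the ADD2 instruction. */
#define ADD2(a, b) _add2((a), (b))
#else
/* Host build: stand-in for the simulator library (assumption). */
static int32_t host_add2(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t lo = (ua + ub) & 0xFFFFu;
    uint32_t hi = ((ua >> 16) + (ub >> 16)) << 16;
    return (int32_t)(hi | lo);
}
#define ADD2(a, b) host_add2((a), (b))
#endif

/* Algorithm code: identical source for host and target.
 * Each 32-bit word holds two packed 16-bit samples. */
void add_packed(const int32_t *x, const int32_t *y, int32_t *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = ADD2(x[i], y[i]);
}
```

Only the small block at the top differs between builds; add_packed itself is untouched when the code moves from the Matlab host to the DSP.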
Migrating C/C++ Code to the DSP
How does it work ?
C/C++ code can directly access DSP assembly instructions
without actually writing assembly code
Benefit: Eliminate headaches associated with assembly
programming
Pipeline scheduling
Register allocation
Unit allocation
Stack manipulation
Parallel instruction debug
Conclusion: Make the compiler do the hard work !
When is the C/C++ Code Optimized ?
Look at compiler report in the assembly file to
determine unit loading.
Look at the assembly code. Are all the units being used each cycle ?
Try to balance loading by using a different sequence of Intrinsics to perform the same overall mathematical operation.
e.g. X * 4 => X << 2
May require manual unrolling of loops.
Determine the ideal number of MAC operations for
an algorithm and compare it to the compiler report
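The comparison against an ideal MAC count can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the number of MACs the processor can issue per cycle is left as a parameter because the slides do not state it.

```c
/* Lower bound on kernel cycles if every multiplier slot were busy
 * on every cycle (ceiling division). */
long ideal_mac_cycles(long total_macs, long macs_per_cycle)
{
    return (total_macs + macs_per_cycle - 1) / macs_per_cycle;
}

/* Percentage of the ideal throughput reached by a measured cycle
 * count, e.g. one read from the compiler report. */
double mac_efficiency_percent(long ideal_cycles, long measured_cycles)
{
    return 100.0 * (double)ideal_cycles / (double)measured_cycles;
}
```

As a made-up example, a 64-tap FIR producing 100 outputs needs 6400 MACs; if 4 MACs per cycle were available, the ideal count would be 1600 cycles, and a reported 2000 cycles would correspond to 80% of the ideal.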
Limitations
DSP software engineer must perform algorithm mapping from floating point to fixed point manually
ranges for floating point values
fixed point scaling issues
saturation issues
DSP software architecture is limited to the
creativity of the software engineer
Recommendation: Develop an automated tool that
converts Matlab/Simulink floating
point files to fixed point DSP C/C++
code using the programming
guidelines discussed in the paper.
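To illustrate the scaling and saturation issues listed under Limitations, here is a minimal sketch of a manual floating point to fixed point mapping. It assumes a Q15 format (16-bit signed, 15 fractional bits), which the slides do not specify; float_to_q15 and sat_add16 are hypothetical helper names.

```c
#include <stdint.h>

/* Convert a value in roughly [-1, 1) to Q15 with rounding.
 * Out-of-range inputs saturate instead of wrapping. */
int16_t float_to_q15(double x)
{
    double scaled = x * 32768.0;
    if (scaled >= 32767.0)
        return INT16_MAX;
    if (scaled <= -32768.0)
        return INT16_MIN;
    /* round half away from zero, then truncate */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* 16-bit addition that clips at the range limits rather than wrapping. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX)
        return INT16_MAX;
    if (s < INT16_MIN)
        return INT16_MIN;
    return (int16_t)s;
}
```

Here float_to_q15(0.5) gives 16384, while float_to_q15(1.0) saturates to 32767 because +1.0 is not representable in Q15; sat_add16(30000, 10000) clips to 32767.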
FFT Example
Developed an FFT for the C64x DSP architecture
Briefly discuss
FFT Functions
FFT Simulation File
Comparison between hand coded assembly and C code with Intrinsics
Software development time
Software performance
FFT Functions
The FFT functions
Main FFT function
First FFT stage
Radix-2 stage
Radix-4 stage
Last FFT stage
Example: Radix-2 stage
Uses mpyhir() and mpylir() Intrinsics
    // inside the Radix-2 stage
    for(k=Nover2;k>0;k--)
    {
        .
        // compute the real part
        // (x0.real-x1.real)*w1.real
        reg2 = _mpyhir(w1,reg1real);
        // (x0.imag-x1.imag)*w1.imag
        reg3 = _mpylir(w1,reg1imag);
        reg2 -= reg3;
        // compute the imag part
        // (x0.imag-x1.imag)*w1.real
        reg4 = _mpyhir(w1,reg1imag);
        // (x0.real-x1.real)*w1.imag
        reg5 = _mpylir(w1,reg1real);
        reg4 += reg5;
        .
    }
Note: Twiddle factor indexing not shown in this Example
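A host-side model of the two multiply Intrinsics used in the Radix-2 stage might look like the sketch below. It assumes the documented C64x behavior as I understand it: the selected signed 16-bit half of src1 is multiplied by the signed 32-bit src2, 2^14 is added, and the sum is shifted right by 15. The names sim_mpyhir, sim_mpylir, round_shift15 and twiddle_mul are hypothetical and are not taken from C6xSimulator.c. The twiddle factor w1 is assumed to hold its real part in the upper half and its imaginary part in the lower half, as the slide's comments imply.

```c
#include <stdint.h>

/* Add 2^14, then floor-divide by 2^15. Written with / and % so the
 * result does not depend on how >> treats negative values. */
static int32_t round_shift15(int64_t p)
{
    p += 0x4000;
    int64_t q = p / 32768;
    if (p % 32768 < 0)
        q -= 1;
    return (int32_t)q;
}

/* Model of _mpyhir: upper 16 bits of src1 times 32-bit src2. */
int32_t sim_mpyhir(int32_t src1, int32_t src2)
{
    int16_t hi = (int16_t)(uint16_t)((uint32_t)src1 >> 16);
    return round_shift15((int64_t)hi * src2);
}

/* Model of _mpylir: lower 16 bits of src1 times 32-bit src2. */
int32_t sim_mpylir(int32_t src1, int32_t src2)
{
    int16_t lo = (int16_t)(uint16_t)((uint32_t)src1 & 0xFFFFu);
    return round_shift15((int64_t)lo * src2);
}

/* Complex multiply of (dreal + j*dimag) by the packed twiddle w1,
 * following the same sequence as the Radix-2 stage on the slide. */
void twiddle_mul(int32_t w1, int32_t dreal, int32_t dimag,
                 int32_t *yreal, int32_t *yimag)
{
    *yreal = sim_mpyhir(w1, dreal) - sim_mpylir(w1, dimag);
    *yimag = sim_mpyhir(w1, dimag) + sim_mpylir(w1, dreal);
}
```

With w1 = 0x40000000 (real part 0.5 in Q15, imaginary part 0), twiddle_mul scales the input (1000, -2000) to (500, -1000).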
FFT Simulation File
The simulation file is a Matlab script file
Performs the simulation
Calls the floating point Matlab FFT function fft()
Calls the fixed point FFT function ti_fft()
Compares the frequency responses of fixed point and floating point FFTs in Matlab
Computes the SNR, NSR, etc. using Matlab
    % test_fft.m
    % initialize some parameters
    Nin = 64;
    N = 128;
    NumFFTs = 1000;
    % create a random input, one row per FFT
    h = rand(NumFFTs,Nin);
    % zero-pad each row from Nin to N samples (append columns, not rows)
    h = [h,zeros(NumFFTs,N-Nin)];
    % compute FFT along each row using Matlab function
    Hd = fft(h,[],2);
    % call the fixed point function
    % (h1dfilt is the input name used on the slide; it is not defined in this listing)
    [H] = ti_fft(h1dfilt,Nin,N);
    % compute the NSR in dB scale
    e = Hd-H;
    NSR = 10*log10(sum(abs(e).^2,2)...
        ./sum(abs(Hd).^2,2));
FFT Development Time
Software Development Time Comparison
Time required to develop hand-coded assembly functions: 2-3 person months
Time required to develop C code with Intrinsic function calls: 2-3 person weeks
Development time is reduced by a factor of 4 to 5 !
FFT Performance Comparison
Metric: Kernel sizes and cycle counts
Kernel sizes for hand-coded assembly functions
FirstFFTStage: 18*(N/16)
R2Stage: 7*(N/8)
R4Stage: 12*(N/8)
LastFFTStage: 24*(N/16)
Kernel sizes for C code with Intrinsic function calls
FirstFFTStage: 19*(N/16)
R2Stage: 8*(N/8)
R4Stage: 14*(N/8)
LastFFTStage: 27*(N/16)
Intrinsics performance is within 15% of assembly !
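The per-stage gap can be checked directly from the kernel figures, since the N/8 or N/16 factor is the same for both versions and cancels. The helper below is a hypothetical name introduced for this check.

```c
/* Extra kernel cost of the Intrinsics version relative to hand-coded
 * assembly, in percent. The common N/8 or N/16 factor cancels, so
 * only the leading coefficients are needed. */
double overhead_percent(int intrinsics_coeff, int assembly_coeff)
{
    return 100.0 * (intrinsics_coeff - assembly_coeff) / assembly_coeff;
}
```

Applied to the four stages (19 vs 18, 8 vs 7, 14 vs 12, 27 vs 24), this gives roughly 5.6%, 14.3%, 16.7% and 12.5%, so the 15% figure is best read as approximate for the R4 stage.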
Algorithm Optimization Results
Algorithm     C Intrinsics   Code Size   Assembly   Code Size
              (cycles)       (bytes)     (cycles)   (bytes)
autocorr      97             800         66         384
bit_unpack    124            192         108        192
bk_massey     302            416         262        320
ch_search     485            1056        321        864
dotprod       32             160         29         160
fir_cplx      1227           832         985        448
forney        195            864         156        864
maxval        38             128         34         128
rs_encoder    463            512         402        288
syndrome      510            1216        478        1152
vecsum        45             96          40         96
In most cases, Intrinsics performance is within 10% !
Matlab Function Libraries
For a particular DSP application
The DSP Simulator emulates the numerical behavior of the DSP instructions
Power User develops a library of optimized algorithms that contain Intrinsic function calls
General user writes C/C++ code that calls the optimized functions in the library
The user's C/C++ code is compiled with the DSP Simulator, the library and the MEX-file
User tests the algorithms for performance, evaluates cycle counts, etc. in Matlab
The same C/C++ code is migrated directly to the target DSP
[Figure: block diagram with labels DSP Simulator, Library (Function 1, Function 2, Function N), C/C++ code, MEX-file, Matlab]
Matlab Function Library Examples
Math Library and Communications Library: OuterProduct, VectorSum, FIR, RS, BF, InnerProduct, ChanEst, NoiseEst, Viterbi
Controls Library: NoiseEst, Hinf, SlidingMode, PID, ResEqu
Benefit: Ability to share fixed-point DSP C/C++ code
and test vectors between multiple users
IC
Closing Remarks
DSP Simulator Benefits
Develop fixed point DSP code in Matlab
Easily compare floating point and fixed point algorithm
implementations in Matlab
Bit-true, fixed point simulations
Reduce software development time by a factor of 4 to 5
Incorporate DSP code into higher level system simulations
Debugging code in Matlab is easier than in a real-time system
Easily evaluate/predict MIPS requirements
Run the same C/C++ source code in Matlab and in the DSP
Easily migrate algorithms to new DSP instruction sets
Develop software before next generation DSPs are available
Q&A
Thanks for attending my presentation !