Slide 1 - MOS-AK Research Group

ADMS CMI Project:
Extensions and
Spectre/UltraSim Benchmarking
Sergey Sukharev
09/22/06 Montreux
Abstract
This paper presents an approach to implementing, at the CMI/XML level, most of
the critical features required by external customers to implement device
models in Spectre and UltraSim. It also presents benchmark data from Spectre
and UltraSim running ADMS device models.
There is no need to change the ADMS software. Changes are made instead to the
common properties shared by all simulators.
The translation is specified using XML scripts, so Cadence intellectual
property is safe.
Better support of the latest Verilog-A LRM (Language Reference Manual) is
important. For example, support for voltage contributions and the time integral
operator makes it possible to translate more complex device models such
as HICUM and PSP, which are now CMC standard models.
ADMS CMI project status
• Support Verilog-A compact models until SpectreVerilog-A
performance has been improved.
• Add only critical features on customer demand basis.
• Supporting/Consulting on Verilog-A compact model
requirements common to ADMS and SpectreVerilog-A
• We have interesting results to show on ADMS
performance in Spectre and UltraSim!
Customers / Verilog-A Device Model Developers
• EKV LDMOS (European project RobusPIC)
implemented.
• EKV 3.0 planned.
• HiCUM level 0 implemented, level 2 being implemented
by developers.
• Philips models (PSP NQS Charge) supported.
• SONY RPI TFT: Works well in Spectre and UltraSim.
• Freescale MICA models (moscap3, rbody) implemented.
Most of the critical functionality added to ADMS/CMI
• NQS support (hidden state)
• Limexp (for Spectre and UltraSim)
• $simparam(“”) (for gdev, gmin, )
• $analysis(“”) (tran, ac, dc, static)
• M-factor (number of instances used in parallel)
• Analog function (before we had to use macros)
• Voltage contribution ( V(p,n) <+ complex expression )
• Branch alias in the contrib statement (V(branch) <+ expression)
• “string” parameter type (for “n”/“p” – type mosfet)
• Compare/CopyState for UltraSim (requested by Sony)
• Switch branch (for parasitic resistors, including node collapsing) –
implementation in progress
• “ddx( )” for table model output parameters – in progress
• “idt( )” (Needed by PSP NQS Charges)
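To illustrate the Limexp item above, here is a minimal sketch of one common way a limited exponential can be realized: the exponential is continued linearly above a breakpoint so that value and slope stay continuous and Newton iterations cannot overflow. The breakpoint X_MAX and both function names are assumptions made for this illustration; the slides do not say how ADMS CMI, Spectre, or UltraSim implement it.

```python
import math

# Assumed breakpoint for illustration only; real simulators pick their own.
X_MAX = 80.0


def limexp(x, x_max=X_MAX):
    """Exponential that continues linearly above x_max to avoid overflow."""
    if x < x_max:
        return math.exp(x)
    # First-order continuation: matches exp() in value and slope at x_max.
    return math.exp(x_max) * (1.0 + (x - x_max))


def limexp_derivative(x, x_max=X_MAX):
    """Derivative of limexp; constant above the breakpoint."""
    if x < x_max:
        return math.exp(x)
    return math.exp(x_max)


if __name__ == "__main__":
    # math.exp(1000.0) would raise OverflowError; limexp stays finite.
    print(limexp(0.5), limexp(1000.0))
```

The derivative is returned separately because a compiled device model has to supply both the current and its Jacobian entry to the simulator.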
Shifting to common Verilog-A projects
• Performance evaluation in Spectre and UltraSim using
admsbsim3v3 device model.
– *** Make sure partitioning works ***
• Improve convergence in Spectre when gdev is not
specified.
• Algorithm for “switch-branch/node collapsing”.
Performance evaluation in Spectre
• 5 tests from CircuitSim90 benchmark suite using bsim3v3
model
– Built-in vs. ADMS CMI vs. Spectre Verilog-A interpreter
• Platform Solaris and Linux
• C language Compilers:
– Sun WorkShop v8, CFLAGS: -xO5
– Linux gcc, CFLAGS: -O3 -ffast-math
• Added “gdev” to bsim3v3 Verilog-A model for better DC
convergence (can also use “gmethod=node”).
• ADMS matched built-in performance in Spectre!!
– 0.8x – 1.2x
sram ( 1,008 transistors; 2,373 equations )
Linux: ADMS/Built-in CPU time ratio: 0.92
                Built-in    ADMS        Verilog-A
Iterations      42,228      36,579      36,579
Tran CPU time   221 s       205 s       4,562 s
Memory          11 M        12 M        123 M
Solaris: ADMS/Built-in CPU time ratio: 0.8
                Built-in    ADMS        Verilog-A
Iterations      40,586      36,579      36,579
Tran CPU time   527 s       421 s       14,156 s
Memory          15 M        15 M        166 M
sqrt ( 1,188 transistors; 2,900 equations )
Linux: ADMS/Built-in CPU time ratio: 1.0
                Built-in    ADMS        Verilog-A
Iterations      8,255       7,722       7,723
Tran CPU time   51.3 s      51.5 s      1,105 s
Memory          11 M        13 M        128 M
Solaris: ADMS/Built-in CPU time ratio: 0.9
                Built-in    ADMS        Verilog-A
Iterations      8,254       7,722       7,723
Tran CPU time   124 s       111 s       1,001 s
Memory          14 M        14 M        179 M
add32 ( 1,984 transistors; 5,092 equations )
Linux: ADMS/Built-in CPU time ratio: 1.1
                Built-in    ADMS        Verilog-A
Iterations      32,527      33,779      33,779
Tran CPU time   365 s       400 s       8,051 s
Memory          16 M        16.6 M      154 M
Solaris: ADMS/Built-in CPU time ratio: 0.89
                Built-in    ADMS        Verilog-A
Iterations      32,527      33,779      33,779
Tran CPU time   935 s       834 s       24,728 s
Memory          24 M        25 M        242 M
mem_plus ( 7,454 transistors; 17,788 equations )
Linux: ADMS/Built-in CPU time ratio: 1.18
                Built-in    ADMS        Verilog-A
Iterations      3,165       3,342       3,342
Tran CPU time   148 s       175 s       3,144 s
Memory          46 M        48 M        335 M
Solaris: ADMS/Built-in CPU time ratio: 1.03
                Built-in    ADMS        Verilog-A
Iterations      3,165       3,342       3,342
Tran CPU time   469 s       483 s       9,682 s
Memory          56 M        60 M        641 M
ram2k ( 13,880 transistors; 32,632 equations )
Linux: ADMS/Built-in CPU time ratio: 1.2
                Built-in    ADMS        Verilog-A
Iterations      6,354       6,535       6,535
Tran CPU time   600 s       717 s       12,300 s
Memory          56 M        61 M        520 M
Solaris: ADMS/Built-in CPU time ratio: 1.02
                Built-in    ADMS        Verilog-A
Iterations      6,354       6,535       6,535
Tran CPU time   1,686 s     1,713 s     35,003 s
Memory          74 M        81 M        1.1 G
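The CPU time ratios quoted on the Spectre slides can be rechecked from the transient CPU times. The short script below is only a reader's cross-check, not part of the original benchmark flow; the dictionary and function names are made up for this sketch, and the numbers are the Linux "Tran CPU time" entries copied from the slides above.

```python
# Linux Spectre transient CPU times in seconds, copied from the slides:
# (Built-in, ADMS CMI, Spectre Verilog-A interpreter)
SPECTRE_LINUX_CPU = {
    "sram": (221.0, 205.0, 4562.0),
    "sqrt": (51.3, 51.5, 1105.0),
    "add32": (365.0, 400.0, 8051.0),
    "mem_plus": (148.0, 175.0, 3144.0),
    "ram2k": (600.0, 717.0, 12300.0),
}


def cpu_ratios(times):
    """Return (ADMS/Built-in, Verilog-A/ADMS) CPU time ratios."""
    builtin, adms, interpreter = times
    return adms / builtin, interpreter / adms


if __name__ == "__main__":
    for name, times in SPECTRE_LINUX_CPU.items():
        adms_vs_builtin, interp_vs_adms = cpu_ratios(times)
        print(f"{name:9s} ADMS/Built-in = {adms_vs_builtin:.2f}   "
              f"Verilog-A/ADMS = {interp_vs_adms:.1f}")
```

The recomputed ADMS/Built-in ratios agree with the slide values to within rounding, and on these five Linux runs the Verilog-A interpreter comes out roughly 17x to 22x slower than the ADMS CMI code.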
Performance evaluation in UltraSim
• The same benchmarks and the same ADMS CMI device
model code as in Spectre were used.
• Matched performance without table models ( Partitioning
works!! )
• Built-in is faster because of table models.
– Investigating generic table model.
• Fixing bugs in UltraSim, enhancing ADMS CMI to
support interface
– Enhancing ADMS CMI to support “ddx(..)” to generate
output parameters for representative models.
add32 ( 1,984 transistors; 5,092 equations )
Linux: ADMS/Built-in CPU time ratio: 1.2
                Built-in       ADMS           Verilog-A
Iterations      3.754 M        3.753 M        41.597 K
Tran CPU time   86.680 s       106.230 s      8110.970 s
Memory          32.3690 Mb     38.5934 Mb     389.1293 Mb
ram2k ( 13,880 transistors; 32,632 equations )
Linux: ADMS/Built-in CPU time ratio: 1.7
                Built-in       ADMS           Verilog-A
Iterations      428.942 K      434.253 K      6.269 K
Tran CPU time   97.880 s       168.020 s      8796.700 s
Memory          121.4766 Mb    153.7780 Mb    2.0955 Gb
sqrt ( 1,188 transistors; 2,900 equations )
Linux: ADMS/Built-in CPU time ratio: 0.8
                Built-in       ADMS           Verilog-A
Iterations      22.594 K       22.968 K       4.732 K
Tran CPU time   58.380 s       48.680 s       602.570 s
Memory          162.7547 Mb    168.6514 Mb    273.4203 Mb
sram ( 1,008 transistors; 2,373 equations )
Linux: ADMS/Built-in CPU time ratio: 1.7
                Built-in       ADMS           Verilog-A
Iterations      721.427 K      711.812 K      55.509 K
Tran CPU time   95.650 s       166.730 s      5066.190 s
Memory          111.5829 Mb    138.9702 Mb    248.1950 Mb
mem_plus ( 7,454 transistors; 17,788 equations )
Linux: ADMS/Built-in CPU time ratio: 1.09
                Built-in       ADMS           Verilog-A
Iterations      156.064 K      160.209 K      2.918 K
Tran CPU time   85.420 s       93.960 s       2578.510 s
Memory          159.9371 Mb    182.2137 Mb    1.1770 Gb
Keys to success and future work
• Pre-compute Model/Instance constants in initialization stage
• Share model parameters and model constants.
• Generate C-code that reflects the handwritten Verilog-A
equations in the module.
• Don’t generate what is never used in the module.
• Use the highest possible compiler optimization flags.
– ADMS CMI code was sensitive to -xO5, built-in code was
not.
• Future work:
– Develop an algorithm to handle “switch-branch”, including
“collapse node”.
– Create an algorithm to support the “ddx(..)” operator.
Thanks for your attention