Synopsys Low Power Solutions


Synopsys
Low Power Solutions
for ASIC Design Flow
Outline
• Market Drivers
• Synopsys Low Power Solution
• DesignPower
• Power Compiler
© 1998 Synopsys, Inc.
Confidential & Proprietary

Technology Trend
[Chart: Synopsys Revenue, by market segment with projected growth. Microprocessor: 20% CAGR; Computers / Peripherals: 15% CAGR; Wireless: 24% CAGR; Networking: 26% CAGR. Segment share labels: 24%, 6%, 10%, 8%, 13%, 39%.]
Increasing Complexity & Integration
Higher Performance, Lower Power, Lower Cost
Source: Dataquest (Projected CAGRs, 1996-2000)

Power Issues are Business Issues
• Risk
  - A field failure could cost millions of dollars
• Profitability
  - The package cost exceeds the target device price
• Competitiveness
  - The competing product has longer battery life with more functionality

The Traditional Medicine No Longer Works
• Spreadsheet based power estimation
  - Not accurate enough (short-circuit power ~30%)
  - Various modules require different formulas (datapath, memory)
• Reduced supply voltage
  - Should be carefully traded off against performance
• Smaller geometry
  - Is usually coupled with an increased number of transistors on a chip
• Manual implementation of power reduction techniques
  - Reduces designers' productivity and impacts Time-to-Market

Complete Spectrum of Low Power Solutions
[Diagram: tools arranged by abstraction level (RTL, Gate, Transistor, Polygon) and by function (Optimization, Analysis, Extraction and Characterization). Tools shown: Power Compiler, DesignPower, PowerGate, PowerArc, AMPS, PowerMill, RailMill, Arcadia.]
RTL example shown in the diagram:
    if (a < b) then
        z <= '1';
    else
        z <= '0';
    end if;

Synopsys’ Complete Low Power Methodology
[Diagram: two parallel flows.
 ASIC Flow: RTL Source, RTL Power Optimization, Power Analysis, Compile (DC), Gate level Power Optimization, Gate level Power Analysis, Place & Route.
 Custom Flow: Schematic, Transistor level Power Analysis & Diagnosis, Transistor level Power Optimization, Layout, Extraction.
 Also shown: Full Chip Net List, Full Chip Power Analysis.]

Power Library Infrastructure
Power Library is the Infrastructure for accurate Power Analysis and Optimization
• Robust Power Modeling
• Automatic Characterization
• Unified power library
• Broad library support

Robust Power Modeling
Dynamic
• Switching Power (Isw) [70-90%]
  - Also referred to as capacitive power
  - $P_{switching} = \frac{1}{2} V^2 \sum_{\text{nets } i} C_i \cdot TR_i$
• Internal (Short-Circuit) Power (Iint) [10-30%]
  - Also referred to as short-circuit power
  - $P_{internal} = \sum_{\text{cells } i} E_{int,i}(\text{output load}, \text{input transition}) \cdot TR_i$
Static
• Leakage Power (Ileak) [<< 1%]
  - Sub-threshold leakage dominates; some is due to substrate leakage
  - $P_{leak} = \sum_{\text{cells } i} P_{leak,i}$
[Figure: CMOS inverter (P and N transistors between Vdd and Gnd, input In, output Out driving Cload) showing the currents Isw, Iint and Ileak, with an input-transition waveform (V versus t).]
Complete power model provides infrastructure for analysis and optimization
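The three formulas on this slide can be evaluated directly once per-net and per-cell data are available. The following is a minimal sketch, not how DesignPower is implemented: the capacitances, toggle rates, internal energies and leakage values are made-up illustration numbers, and the internal energy per toggle is taken as an already looked-up value rather than a table indexed by output load and input transition.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Net:
    cap_farads: float       # C_i: capacitance of net i
    toggles_per_sec: float  # TR_i: toggle rate of net i


def switching_power(v: float, nets: List[Net]) -> float:
    """P_switching = 1/2 * V^2 * sum over nets of (C_i * TR_i)."""
    return 0.5 * v ** 2 * sum(n.cap_farads * n.toggles_per_sec for n in nets)


def internal_power(cells: List[Tuple[float, float]]) -> float:
    """P_internal = sum over cells of (E_int_i * TR_i).

    Each entry is (energy per toggle in joules, toggle rate in 1/s).
    """
    return sum(e_int * tr for e_int, tr in cells)


def leakage_power(cell_leakage_watts: List[float]) -> float:
    """P_leak = sum over cells of P_leak_i."""
    return sum(cell_leakage_watts)


# Hypothetical two-net, two-cell design at a 3.3 V supply.
nets = [Net(20e-15, 50e6), Net(10e-15, 100e6)]
cells = [(0.02e-12, 50e6), (0.01e-12, 100e6)]
leaks = [1e-9, 2e-9]

p_sw = switching_power(3.3, nets)
p_int = internal_power(cells)
p_leak = leakage_power(leaks)
p_total = p_sw + p_int + p_leak

print(f"switching {p_sw * 1e6:.2f} uW, internal {p_int * 1e6:.2f} uW, "
      f"leakage {p_leak * 1e9:.1f} nW")
```

With these illustration values the switching term dominates and leakage is negligible, which is the same ordering the slide gives for the three components.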

Robust Power Modeling (cont.)
• State-dependent power model
  - Power consumption varies with various operation modes
  [Diagram: RAM with pins DATAIN, ADDRESS, RD_WR, CLK, CS and DATAOUT.]
• Path-dependent power model
  - Power consumption varies with various input to output paths
  [Diagram: Cell XYZ with inputs A, B, C, D, E and output Z.]
Accurate analysis for state- and path-dependent functions such as RAMs, I/Os and multilevel cells

PowerArc for Library Characterization
• Automatic, accurate characterization for cells and megacells
• The same library shared by all gate-level tools
• Availability September 98
[Flow diagram: SPICE Netlists, Process Specs. and the Current Synopsys Lib. (Synopsys .lib) feed PowerArc, which produces Synopsys .lib + Power. Library Compiler produces the Synopsys .db. Tools shown using the library: PowerGate, DesignPower, Power Compiler.]

The Most Comprehensive Library Support in the Industry
ASIC/Library Vendor     Status
Alcatel Mietec          Available Now
Seiko/Epson             Available Now
Fujitsu                 Available Now
Sony                    Available Now
GEC Plessey             Available Now
Lucky Goldstar          Under development
IBM                     Available Now
Rohm                    Under development
Lucent Technologies     Available Now
Ricoh                   Available Now
LSI Logic               Available Now
Toshiba                 Available Now
Motorola                Available Now
OKI Semiconductor       Available Now
NEC                     Available Now
Matsushita              Available Now
SGS Thomson             Available Now
Mitsubishi              Available Now
Symbios Logic           Available Now
Hitachi                 Available Now
Temic-Matra MHS         Available Now
Aspec                   Available Now
Texas Instruments       Available Now
VLSI Technology         Available Now
TSMC CBA                Available Now
Artisan                 Available Now
Samsung                 Available Now

Power Analysis is a Key Enabling Technology
• Power Analysis is essential for Low Power Management
• Fast and accurate analysis early in the design process
  - Enables creation of low power designs
  - Drives knowledge-based architectural and implementation decisions
• Detailed and comprehensive analysis at the later stages
  - Ensure that power budget and constraints are satisfied
  - Power Signoff

DesignPower for Early Power Analysis
[Flow diagram: RTL Design, Power Compiler (RTL Clock Gating), Design Compiler with Power Compiler, Place & Route, Power optimized design. DesignPower and PowerGate are shown alongside the flow.]
• Early visibility into the power consumption
  - Focus your efforts where the opportunities are
  - Power budgeting
  - Fast tradeoff analysis for power
Analysis is the enabling technology to design for low power

Identification of Power Problems
The Problem: Power budget was exceeded on AM2910 design
Proposed Solution: Identify power hungry modules and look for opportunities to reduce power

                   Cell         Driven Net    Tot Dynamic            Cell
                   Internal     Switching     Power                  Leakage
Cell               Power        Power         (% Cell/Tot)           Power       Attrs
----------------------------------------------------------------------------------------
STACK_BLK          247.1813     1486.9333     1734.115 (14%)         295.8000    h
REG_BLK             24.8037      700.8896      725.693  (3%)          29.7000    h
UPC_BLK             12.6486      679.9627      692.611  (2%)          13.2000    h
MUX_OUT_BLK         35.3713      201.3174      236.689 (15%)          27.0000    h
CNTL_BLK            22.1711      111.6987      133.870 (17%)          16.2000    h
----------------------------------------------------------------------------------------
Totals (5 cells)    34.218uW     318.080uW     352.298uW (10%)       381.900nW

[Figure callout: Stack Module]
DesignPower quickly isolates power problems
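To make the "identify power hungry modules" step concrete, the per-module rows of the report above can be ranked by total dynamic power. This is only a sketch of the post-processing a designer might do by hand or in a script; the numbers are copied from the report rows and kept in the report's own units, and the dictionary layout is an assumption for illustration.

```python
# (cell internal power, driven net switching power) per module,
# copied from the report rows in the report's own units.
modules = {
    "STACK_BLK": (247.1813, 1486.9333),
    "REG_BLK": (24.8037, 700.8896),
    "UPC_BLK": (12.6486, 679.9627),
    "MUX_OUT_BLK": (35.3713, 201.3174),
    "CNTL_BLK": (22.1711, 111.6987),
}

# Total dynamic power of a module = internal + switching.
total_dynamic = {name: internal + switching
                 for name, (internal, switching) in modules.items()}
design_total = sum(total_dynamic.values())

# Rank modules from most to least power hungry.
ranked = sorted(total_dynamic.items(), key=lambda item: item[1], reverse=True)

for name, power in ranked:
    share = 100.0 * power / design_total
    print(f"{name:12s} {power:10.3f} {share:5.1f}% of design dynamic power")
```

The stack module comes out first, accounting for roughly half of the design's dynamic power, which matches the slide's choice of the stack as the block to optimize.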

DesignPower Enables Intelligent Decisions
The Problem: The stack module of the AM2910 was consuming too much power
Proposed Solution: Gate the clock so that the registers are only clocked during the write cycles
Original Area: 1730 Equivalent Gates
Gated Area: 1528 Equivalent Gates
No performance impact!
[Chart: Power Tradeoffs for AM2910 Stack. Vertical axis ticks from 45 to 80; horizontal axis: Percentage of time in write cycle, from 5% through 50% to 95%. Legend entry: Original Ungated Design.]
DesignPower quantifies power savings

Early Analysis Leads to Power Savings
National Semiconductor Success
A LAN switch ASIC of 200K gates and 41 memories characterized for state-dependent power.
DesignPower revealed excessive power consumption by the memories due to redundant read cycles.
The RTL was fixed and the power consumption was reduced.

DesignPower: inputs & outputs
Inputs
• Switching Activity Information, from VHDL or Verilog RTL Simulation or from VHDL or Verilog Gate-Level Simulation
• Library
• Gate-Level Netlist
Output: Power Report
• Total Design
• Modules
• Individual Nets
• Individual Cells

Switching Activity Information
• Toggle-Rate (Tr) is the number of toggles per time-unit, and is used for the power calculation. Tr = TC / DURATION
• Static-Probability (Sp) is the portion of time a node is at a logic value of “1”, and is used for switching activity propagation and power calculation. Sp = T1 / (T1 + T0 + TX)
Fields: TC = # of toggles, T1 = time in ‘1’, T0 = time in ‘0’, TX = time in ‘x’
(DESIGN "ex")
(TIMESCALE 1ns)
(DURATION 1000)
(INSTANCE TOP
  (PORT
    (DINA (TC 500) (T1 400) (T0 504) (TX 96))
    (COUNT (TC 4328) (T1 783) (T0 217))
  )
  (INSTANCE U_VA_30
    (NET
      (CI (TC 800) (T1 300) (T0 600) (TX 100))
      (SO (TC 815) (T1 300) (T0 249) (TX 451))
    )
  )
)
[A further fragment, "(INSTANCE E/E2", is cut off in the source.]
Example: Switching Activity Interchange Format (SAIF)
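As a worked example of the two definitions, the toggle rate and static probability can be computed from the TC, T1, T0 and TX values in the SAIF example. This is a small sketch: the dictionary of records is a hand-copied stand-in for a parsed SAIF file, not the output of any Synopsys utility, and TX is taken as 0 where the example omits it.

```python
DURATION = 1000  # from (DURATION 1000), in TIMESCALE units of 1 ns


def toggle_rate(tc: int, duration: int) -> float:
    """Tr = TC / DURATION, in toggles per timescale unit."""
    return tc / duration


def static_probability(t1: int, t0: int, tx: int = 0) -> float:
    """Sp = T1 / (T1 + T0 + TX): fraction of time the node is at logic 1."""
    return t1 / (t1 + t0 + tx)


# Records copied from the SAIF example; TX defaults to 0 when absent.
records = {
    "DINA": {"TC": 500, "T1": 400, "T0": 504, "TX": 96},
    "COUNT": {"TC": 4328, "T1": 783, "T0": 217, "TX": 0},
    "CI": {"TC": 800, "T1": 300, "T0": 600, "TX": 100},
    "SO": {"TC": 815, "T1": 300, "T0": 249, "TX": 451},
}

activity = {
    name: (toggle_rate(r["TC"], DURATION),
           static_probability(r["T1"], r["T0"], r["TX"]))
    for name, r in records.items()
}

for name, (tr, sp) in activity.items():
    print(f"{name:6s} Tr = {tr:.3f} toggles/ns, Sp = {sp:.3f}")
```

For DINA this gives Tr = 0.5 toggles per ns and Sp = 0.4. In every record the T1, T0 and TX times add up to the 1000 ns duration.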

Switching Activity Generation - RTL
• Activity of the synthesis invariant nodes is captured during RTL simulation
  - sequential outputs, hierarchical boundaries, black-box pins
• Utilizes a zero-delay cycle-based propagation engine
• Same activity is used for both analysis and optimization
• New switching activity is required when the synthesis invariant behavior is changed

RTL Switching Activity Flow
[Flow diagram: RTL Design, HDL Compiler, SAIF (fwd), RTL Simulation, SAIF (back). VCD is also shown as a simulation output.]
• SAIF (fwd) includes the RTL constructs to be monitored
• SAIF (back) includes the switching activity of these constructs

Gate-Level Switching Activity Flow
[Flow diagram: Gate-Level Design, Library Compiler, SAIF (lib), Gate-Level Simulation, sim2dp, SAIF (back).]
• Switching activity for most of the nodes is captured during gate-level simulation

Switching Activity: RTL vs. Gate-Level
• RTL Switching Activity:
  - Available early in the design process
  - Fast
  - Accurate
  - Does not account for glitches
  - Does not fully support state- and path-dependency
• Gate-Level Switching Activity:
  - Very accurate
  - Accounts for glitches
  - State- and path-dependency support
  - Requires lengthy gate-level simulation
  - Usually done at the later stages of the design process
[Figure: chip block diagram with A/D, D/A, P/S, S/P, DMA, µp, µc, Memory, Mega Cells and Control Logic blocks.]

Simulation Interface
DesignPower and Power Compiler
[Table: switching-activity interfaces by simulator (Verilog-XL, VCS, VSS, MTI, IKOS) and abstraction level. RTL entries: SAIF (PLI) and VCD. Gate-Level entries: SAIF (PLI), SAIF and sim2dp.]
PowerGate
[Table: interfaces by simulator (Verilog-XL, VCS) at Gate-Level: PIF (PLI), with one entry marked (Oct 1998).]

PowerGate for Detailed Power
[Flow diagram: RTL Design, Power Compiler (RTL Clock Gating), Design Compiler with Power Compiler, Place & Route, Power optimized design. DesignPower and PowerGate are shown alongside the flow.]
• Power verification at the later stages of the design cycle
  - Ensure that power budget and constraints are satisfied
  - Time-based peak power and time-average power at user-defined intervals
  - Identify power hungry vectors / instructions
  - Isolate power problems in time
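The idea of reporting peak and time-average power at user-defined intervals can be illustrated with a few lines of code. This is a sketch of the concept only and not PowerGate's algorithm or output format: the power trace, its units and the "twice the overall average" flagging rule are all made up for illustration.

```python
from typing import List, Tuple


def windowed_stats(power: List[float], window: int) -> List[Tuple[int, float, float]]:
    """Split a uniformly sampled power trace into intervals of `window` samples.

    Returns (start index, average power, peak power) for each interval.
    """
    stats = []
    for start in range(0, len(power), window):
        chunk = power[start:start + window]
        stats.append((start, sum(chunk) / len(chunk), max(chunk)))
    return stats


# Hypothetical trace in mW: mostly quiet, with a short burst in the middle.
trace = [2.0, 2.0, 2.0, 2.0, 2.0, 9.0, 9.0, 2.0, 2.0, 2.0, 2.0, 2.0]

overall_average = sum(trace) / len(trace)
stats = windowed_stats(trace, window=4)

# Flag intervals whose peak is far above the overall average
# (the factor of 2 is an arbitrary illustration threshold).
hot_intervals = [s for s in stats if s[2] > 2 * overall_average]

print(f"overall average: {overall_average:.2f} mW")
for start, avg, peak in stats:
    print(f"samples {start}-{start + 3}: average {avg:.2f} mW, peak {peak:.2f} mW")
```

The overall average looks modest, but the per-interval view isolates the burst to the middle interval, which is the kind of problem a single average figure hides.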

Identify Excessive Power In Time
[Diagram: Control Logic 1 (Address 1) and Control Logic 2 (Address 2) access a Dual-port RAM over a Common Data Bus. A plot of Power versus Time marks the Average level.]
• The average power consumption looks OK, yet is there a problem with the memory?
• Is the memory cycle valid? (address collision)
• Is there data contention? (are both ports in the read mode?)

Power Compiler
• Industry's first and only RTL & Gate-Level power optimizer
• Push-Button power reduction at RT and Gate Levels
[Timeline: Gate Level, 10/1996; RTL, 8/1997.]

Power Compiler @ RTL
Push-button reduction in power at the RT-Level
RTL Clock-Gating
• No changes required to the RTL code
• Can deliver significant reduction in power
  - Power reduction is design dependent
  - We have seen 30% - 60% power reduction in some designs
Downstream Dependencies
• Logic Synthesis
• Testability
• Clock Tree Synthesis
[Flow diagram: RTL Source, Power Compiler Clock-Gating (elaborate -gate_clock), Un-mapped Net-List + Constraints, Design Compiler.]

Automatic Clock-Gating @ RTL
RTL source:
    always @(posedge CLK)
        if (EN)
            D_out <= D_in;
• Synchronous-load-enable implementation (elaborate)
  [Diagram: FSM drives EN; D_in and D_out connect to the Register Bank, which is clocked by CLK.]
• Gated clock implementation (elaborate -gate_clock)
  [Diagram: FSM drives EN through a Latch; CLK is gated to produce G_CLK, which clocks the Register Bank; D_in and D_out connect to the bank.]

Clock-Gating @ RTL - Power Savings
Power Savings by clock-gating
• Reduced internal power consumption at the clock-gated flip-flops
• No need for Muxes to re-circulate the data for these flip-flops (saves Power & Area)
• Reduced power consumption by the clock network
Power Saving dependency
• # of load-enable registers
• % of disabled cycles
[Diagram: gated-clock register bank (FSM, EN, Latch, CLK, G_CLK, D_in, D_out) with callouts 1, 2 and 3.]
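A small behavioral model shows why the saving depends on the percentage of disabled cycles. The sketch below is a cycle-level abstraction written for illustration, not generated hardware: it compares a load-enable register bank, which is clocked every cycle, with a gated-clock bank, which only sees a clock pulse when EN is high. The stimulus values are arbitrary.

```python
from typing import List, Tuple


def load_enable_bank(d_in: List[int], en: List[int], q0: int = 0) -> Tuple[List[int], int]:
    """Bank clocked every cycle; a mux recirculates the old value when EN is 0."""
    q, outputs, clock_events = q0, [], 0
    for d, e in zip(d_in, en):
        q = d if e else q      # mux selects new data or the held value
        clock_events += 1      # register sees CLK on every cycle
        outputs.append(q)
    return outputs, clock_events


def gated_clock_bank(d_in: List[int], en: List[int], q0: int = 0) -> Tuple[List[int], int]:
    """Bank clocked by G_CLK, which pulses only in cycles where EN is 1."""
    q, outputs, clock_events = q0, [], 0
    for d, e in zip(d_in, en):
        if e:
            q = d
            clock_events += 1  # register sees a clock pulse only when enabled
        outputs.append(q)
    return outputs, clock_events


# Arbitrary stimulus: the bank is enabled in 3 of 8 cycles.
d_in = [3, 7, 7, 1, 4, 4, 9, 2]
en = [1, 0, 0, 1, 0, 0, 0, 1]

out_le, clocks_le = load_enable_bank(d_in, en)
out_gc, clocks_gc = gated_clock_bank(d_in, en)
fraction_saved = 1 - clocks_gc / clocks_le

print(out_le == out_gc, clocks_le, clocks_gc, fraction_saved)
```

Both models produce the same D_out sequence, while the gated bank receives clock pulses only in the enabled cycles, so the fraction of clock events removed equals the fraction of disabled cycles. Actual power savings also depend on the number of registers gated and on the clock network, as the slide notes.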

Clock-Gating Styles
Extensive user control
• Latch-based or latch-free gating style
• Which register banks to gate or exclude from gating
• Positive (AND) or negative (OR) gating logic
• Minimal bit-width of gated registers
[Diagrams: three gating circuits, each taking EN and CLK and producing GCLK: Latch-free {OR}, Latch-free {INV NAND BUF}, Latch-based {NAND INV}.]

RTL Clock-Gating - Report
===============================================================================
|                            | Included | Width | Enable | Setup | Clock |
| Flip-Flop Name (Bit-Width) | Excluded | Cond. | Cond.  | Cond. | Gated |
===============================================================================
| out1_reg (8)               |    -     |  yes  |  yes   |  yes  |  yes  |
| out2_reg (2)               |    -     |  no   |  yes   |  yes  |  no   |
===============================================================================
Summary:
                                  Flip-Flops Banks        Bit-Width
                                number   percentage    number   percentage
Clock gated (total):               1         50           8         80
Clock not gated because
  Bank was excluded:               0          0           0          0
  Bank width too small:            1         50           2         20
  Bank always enabled:             0          0           0          0
  Setup condition violated:        0          0           0          0
Total:                             2        100          10        100
Information: The following instances of design SNPS_CLOCK_GATE_HIGH_<module>
have been created and must be uniquified for a hierarchical compile:
clk_gate_out1_reg
clk_gate_out2_reg
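The summary percentages in the report follow from the bank list and the minimum bit-width rule. The sketch below reproduces that arithmetic; the threshold value of 3 is a hypothetical setting chosen so that the 2-bit bank is rejected as "too small", since the report does not state the actual minimum width used.

```python
MIN_BITWIDTH = 3  # hypothetical minimum bank width; not stated in the report

# (register bank name, bit-width) as listed in the report.
banks = [("out1_reg", 8), ("out2_reg", 2)]

gated = [(name, width) for name, width in banks if width >= MIN_BITWIDTH]
too_small = [(name, width) for name, width in banks if width < MIN_BITWIDTH]

total_banks = len(banks)
total_bits = sum(width for _, width in banks)

gated_bank_pct = 100.0 * len(gated) / total_banks
gated_bit_pct = 100.0 * sum(width for _, width in gated) / total_bits
small_bank_pct = 100.0 * len(too_small) / total_banks
small_bit_pct = 100.0 * sum(width for _, width in too_small) / total_bits

print(f"Clock gated:          {len(gated)} banks ({gated_bank_pct:.0f}%), "
      f"{sum(w for _, w in gated)} bits ({gated_bit_pct:.0f}%)")
print(f"Bank width too small: {len(too_small)} banks ({small_bank_pct:.0f}%), "
      f"{sum(w for _, w in too_small)} bits ({small_bit_pct:.0f}%)")
```

One of two banks is gated (50%), but because it is the wide one, 8 of the 10 flip-flop bits (80%) end up behind a gated clock.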

Clock-Gating @ RTL - Dependencies
• Logic Synthesis
  - Power Compiler automatically generates set-up and hold constraints on the gating element
  - Combinatorial set-up and hold checks are performed by DC
• Testability
  - Medium and high testability options for controllability & observability of the enable signal
  - Test Compiler and DC XP can handle the gating circuitry during rule-checking and ATPG
• Clock-Tree-Synthesis
  - Supported by many ASIC vendors and tools providers
  - Contact your vendor for details

Clock-Gating - Medium Testability
[Diagram: gated-clock register bank (FSM, EN, Latch, CLK, G_CLK, D_in, D_out) with a TEST_MODE input added to the gating logic.]
• TEST_MODE enables override of clock-gating during scan-in and scan-out
  - Asserting TEST_MODE during the parallel mode will make FSM faults un-testable

Clock-Gating - High Testability
[Diagram: gated-clock register bank (FSM, EN, Latch, CLK, G_CLK, TEST_MODE, D_in, D_out) with an Observability Register, clocked by CLK, that also collects Other Observability Nodes.]
• All FSM faults are testable
• Testability logic does not consume power
• Higher area cost

Power Compiler @ Gate-Level
[Flow diagram. Inputs: Gate-Level Netlist, Switching Activity, Constraints (timing, power, area), Tech Library, Parasitic (Capacitance). Tools: Design Compiler with Power Compiler. Output: Power Optimized Gate-Level Netlist.]
dc_shell> compile -incremental

Power Compiler @ Gate-Level
• Optimizes power simultaneously with area and timing
• New optimization technologies added for power
  - Activity-based optimizations minimize power subject to power constraints
  - Power added to the synthesis optimization cost function
  - 10% - 20% push-button reduction in power
• Works within timing constraints
  - No increase in negative slack
• Requires synthesis libraries updated for power
• Completely integrated with Links-to-Layout methodology

Optimization Priorities
Priority       Cost Type        Constraints
1 (highest)    Design Rule      Max Trans, Max Fanout
2              Delay            Clock Period, Max_delay, Min_delay
3              Dynamic Power    Max Dynamic Power
4              Leakage Power    Max Leakage Power
5 (lowest)     Area             Max Area
• The optimization priorities are hard coded
• Try tightening/loosening the constraints to get the required speed/power/area trade-offs
Power Compiler works within the specified timing constraints
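One way to picture fixed priorities is a lexicographic comparison of constraint violations: a candidate is preferred if it has a smaller violation in the highest-priority cost type where the two candidates differ. The sketch below illustrates that idea only. It is a simplification and not Design Compiler's actual cost function; the `Cost` type and the numbers are made up.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    """Constraint violations in priority order (0.0 means the constraint is met)."""
    design_rule: float
    delay: float
    dynamic_power: float
    leakage_power: float
    area: float


# Hypothetical candidates for the same block.
# A meets timing but misses its dynamic power target.
candidate_a = Cost(design_rule=0.0, delay=0.0, dynamic_power=5.0,
                   leakage_power=0.0, area=0.0)
# B meets the power target but violates timing slightly.
candidate_b = Cost(design_rule=0.0, delay=0.2, dynamic_power=0.0,
                   leakage_power=0.0, area=0.0)

# Tuples compare field by field, so delay is decided before dynamic power.
best = min(candidate_a, candidate_b)
print("preferred:", "A" if best is candidate_a else "B")
```

Candidate A wins because delay outranks dynamic power, which is consistent with the slide's statement that power is optimized within the specified timing constraints. Loosening the timing constraint would turn B's delay violation into 0.0 and let its lower power win.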

Cell Sizing Example
[Diagram: three two-input cells (labelled an2a and an2c). Inputs a, b drive net n1 and inputs c, d drive net n2; n1 and n2 drive the output f. Annotations: "Sized up" on the critical path, "Sized down" on the low activity net.]
                 Before sizing                    After sizing
Delay (a,f)      reqd = 4, actual = 3.3           reqd = 4, actual = 3.5
Cload            f = 4; n1, n2 = 2                f = 3; n1 = 2.5, n2 = 1.5
TR (inputs)      a, b = .25, c, d = .5            a, b = .25, c, d = .5
TR (nets)        n1 = .125, n2 = .25, f = .56     n1 = .125, n2 = .25, f = .56
Power            4.125                            3.69
Note: Internal power effects (i.e. edge rate) also considered

Factoring Example
Function: f = ab + bc + cd
• The function f is not on the critical path
• The signals a, b, c and d are all the same bit width
• Signal b is a high activity net
• The two implementations below are equivalent from both timing and area criteria
Implementation 1: f = b(a + c) + cd
Implementation 2: f = ab + c(b + d)
[Diagrams: gate-level networks for each implementation, with inputs a, b, c, d and output f.]
Net Result: network toggling and power are reduced
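The two factored forms can be checked against the original function by exhaustive enumeration, and the number of times b appears in each form gives a rough proxy for how many gate inputs the high-activity net has to drive. This is an illustration of the reasoning, not Power Compiler's factoring algorithm; treating literal count as a stand-in for load is an assumption.

```python
from itertools import product


def f_original(a, b, c, d):
    return (a and b) or (b and c) or (c and d)


def f_factored_on_b(a, b, c, d):
    return (b and (a or c)) or (c and d)   # f = b(a + c) + cd


def f_factored_on_c(a, b, c, d):
    return (a and b) or (c and (b or d))   # f = ab + c(b + d)


# Exhaustive check over all 16 input combinations.
equivalent = all(
    bool(f_original(*v)) == bool(f_factored_on_b(*v)) == bool(f_factored_on_c(*v))
    for v in product([0, 1], repeat=4)
)

# Occurrences of the high-activity signal b in each expression.
b_loads = {
    "ab + bc + cd": "ab + bc + cd".count("b"),
    "b(a + c) + cd": "b(a + c) + cd".count("b"),
    "ab + c(b + d)": "ab + c(b + d)".count("b"),
}

print(equivalent, b_loads)
```

All three forms compute the same function. Factoring on b leaves the high-activity signal feeding a single gate input instead of two, which is presumably why that form toggles less of the network when b is the busy net.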
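
Pin Swapping Example
[Diagram: one cell with inputs a, b, c, d and output f, shown before and after nets a and d are swapped.]
                         Before                    After
Pin with Cpin = C1       net a, toggle rate = .4   net d, toggle rate = .8
Pin with Cpin = 1.5C1    net d, toggle rate = .8   net a, toggle rate = .4
Move high toggle nets to lower capacitance pins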
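The benefit of the swap can be estimated by weighting each pin capacitance with the toggle rate of the net connected to it, using the values from the slide. The sketch below treats the sum of toggle rate times pin capacitance as a proxy for the switching power spent on these two pins; the function name and the normalization C1 = 1 are choices made for illustration.

```python
from typing import List, Tuple


def pin_switching_cost(assignments: List[Tuple[float, float]]) -> float:
    """Sum of (toggle rate * pin capacitance) over the listed pins."""
    return sum(toggle_rate * cpin for toggle_rate, cpin in assignments)


C1 = 1.0  # normalize the smaller pin capacitance to 1

# (toggle rate of the connected net, pin capacitance)
before = [(0.4, C1), (0.8, 1.5 * C1)]  # high-toggle net on the larger pin
after = [(0.8, C1), (0.4, 1.5 * C1)]   # high-toggle net moved to the smaller pin

cost_before = pin_switching_cost(before)
cost_after = pin_switching_cost(after)
reduction = 1 - cost_after / cost_before

print(f"before {cost_before:.2f} C1, after {cost_after:.2f} C1, "
      f"reduction {100 * reduction:.1f}%")
```

The weighted capacitance drops from 1.6 C1 to 1.4 C1, a 12.5% reduction on these two pins, with no change to the logic function.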

Phase Assignment Example
[Diagram: two implementations around a 2:1 Mux with inputs A (TR = .7) and B (TR = .3); one has area = 7 and the other area = 6.]
Implementation tradeoff criteria:
• toggle rates of inputs and outputs
• pin capacitance of library cell
Solution requires:
• dynamic power cost function
• actual toggle rates
• accurate cell libraries

Push-Button Power Reduction by Power Compiler
Intel Success (Presented by Intel at SNUG 1998)
A graphics chip for which both power and area are critical, synthesized to a 0.35µm library at 3.3 Volts.
Achieved 12%, 21% and 24% reduction in power on 3 blocks with 2% or less area increase.
Lucent Success
An ISDN Transceiver ASIC, 40K gates block, synthesized to a 0.35µm library.
Achieved 12% push-button power reduction with 3.3% area increase.

ASIC Low-Power Methodology
[Flow diagram with phases Design Exploration, Design Implementation and Physical Design. Flow: RTL Design, Power Compiler (RTL Clock Gating), Design Compiler with Power Compiler, Place & Route, Power optimized design. Analysis: DesignPower (with RTL SA from RTL Simulation and SA from Gate Simulation) and PowerGate (Diagnosis). Also shown: SNPS .db, Cap., and a Speed versus Accuracy axis.]

Links-to-Layout for Power
[Flow diagram: Power Compiler and Physical Design exchange PDEF, SDF and set_load data; Floorplan Manager; decision "Met Constraints?" with branches No and Yes, where Yes leads to the Lowest power implementation. Plots: "Before: timing constraints not met" and "After: timing constraints met".]
The lowest power silicon within your timing constraints

Summary
• Power Analysis
  - Early visibility into the power dissipation
  - Evaluate architectural and implementation tradeoffs
  - Detailed and comprehensive analysis at the later stages of the design cycle
• Power Optimization
  - Push-button power reduction at RT and Gate levels
  - Simultaneous optimization for timing, power and area
  - RTL simulation support for gate-level optimization
• Synopsys provides a Complete Solution
  - A complete set of power analysis, optimization and diagnosis tools
  - RT, Gate and transistor level support