Lower Power Synthesis - VADA

L3: Lower Power Design
Overview (2)
Prof. Jun Dong Cho, Sungkyunkwan University
http://vlsicad.skku.ac.kr
SungKyunKwan Univ.
VADA Lab.
Low-Power Design Flow
developed at LIS
Low Power Design Flow I
[Figure: design-flow diagram. Boxes: System-Level Specification; Function Partitioning and HW/SW Allocation; Behavioral Description; Software Functions; Power-driven Behavioral Transformation; Processor Selection; Power Conscious Behavioral Description; Software Optimization; High-Level Synthesis and Optimization; To RT-Level Design. Power analysis is attached at the system, behavioral, software, and RT levels.]
Low Power Design Flow II
[Figure: design-flow diagram, continued. Boxes: RT-level Description; Data-path; Controller; RTL Library; RTL mapping; RTL Macrocells; Processor; Control and Steering Logic; Memory; Standard cell Library; Logic Synthesis and Optimization; Gate-level Description; High-Level Synthesis and Optimization; Switch-level Description. Power analysis is attached at the gate and switch levels.]
Execution unit idle time (PowerPC 603)
[Figure: bar chart of idle time on a 0 to 100 axis for the Special Register, Floating Point, Fixed Point, and Load/Store units, for SPECfp92 and SPECint92.]
System Integration
Power Consumption in Multimedia Systems
• LCD: 54.1%, HDD: 16.8%, CPU: 10.7%, VGA/VRAM: 9.6%, SysLogic: 4.5%, DRAM: 1.1%, Others: 3.2%
• 5-55 Mode:
– Display mode: CPU is in sleep mode (55 minutes), LCD (VRAM + LCDC) is active
– CPU mode: display is idle (5 minutes), looking up / data retrieval
• Handwriting recognition: biggest power consumer (memory, system bus active)
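The breakdown above suggests where savings matter most. As a minimal sketch, assuming a component's power can be cut by some fraction while everything else stays unchanged, the share of total system power saved is simply the component's share times that fraction. The function name `system_saving` is hypothetical and only for illustration.

```python
# Power breakdown from the slide, in percent of total system power.
BREAKDOWN = {
    "LCD": 54.1,
    "HDD": 16.8,
    "CPU": 10.7,
    "VGA/VRAM": 9.6,
    "SysLogic": 4.5,
    "DRAM": 1.1,
    "Others": 3.2,
}


def system_saving(component: str, reduction: float) -> float:
    """Percent of total power saved when one component's power drops by
    the fraction `reduction` (0.0 to 1.0) and all others are unchanged."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return BREAKDOWN[component] * reduction


# Hypothetical scenario: halve each component in turn.
for name in BREAKDOWN:
    print(f"{name:9s} halved -> {system_saving(name, 0.5):5.2f}% of total saved")
```

Under this simple model, halving the LCD power saves about 27% of the total, while halving the CPU power saves only about 5%.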
Reducing Waste
• Locality of reference
• Demand-driven / Data-driven
computation
• Application-specific processing
• Preservation of data correlations
• Distributed processing
Energy-Efficient Design
1) Reduce the supply voltage
– Switching energy drops quadratically with the supply voltage
– This drop is accompanied by reduced circuit speed
2) Minimize switching capacitance
– Exploit locality of reference with distributed computational structures, minimizing global interactions
– Enforce a demand-driven policy that eliminates switching activity in unused modules
– Preserve temporal correlation in data streams by minimizing the degree of hardware sharing
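The voltage trade-off in point 1 can be sketched numerically. The energy model E = C * Vdd^2 is the standard first-order CMOS expression; the delay model (proportional to Vdd / (Vdd - Vt)^2) and all numeric values below are assumptions chosen for illustration, not figures from the slides.

```python
def switching_energy(capacitance: float, vdd: float) -> float:
    """Energy drawn from the supply per charge/discharge cycle: E = C * Vdd^2."""
    return capacitance * vdd ** 2


def relative_delay(vdd: float, vt: float) -> float:
    """First-order gate delay, proportional to Vdd / (Vdd - Vt)^2 (assumed model)."""
    if vdd <= vt:
        raise ValueError("vdd must exceed the threshold voltage")
    return vdd / (vdd - vt) ** 2


C_LOAD = 1e-12   # switched capacitance in farads (hypothetical)
V_T = 0.7        # threshold voltage in volts (hypothetical)
V_REF = 5.0      # reference supply in volts (hypothetical)

for vdd in (5.0, 3.3, 2.5, 1.5):
    e_ratio = switching_energy(C_LOAD, vdd) / switching_energy(C_LOAD, V_REF)
    d_ratio = relative_delay(vdd, V_T) / relative_delay(V_REF, V_T)
    print(f"Vdd={vdd:.1f} V  energy x{e_ratio:.2f}  delay x{d_ratio:.2f}")
```

Halving the supply from 5 V to 2.5 V cuts the switching energy to one quarter, while the delay under the assumed model grows, which is the speed penalty the slide refers to.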
Switching Activity
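A minimal sketch of how switching activity can be estimated, and why hardware sharing can destroy temporal correlation: the code counts bit toggles on an 8-bit bus carrying two slowly varying streams, first on dedicated buses and then time-multiplexed onto one shared bus. The sample values and the helper name `toggles` are hypothetical.

```python
def toggles(words, width=8):
    """Total number of bit transitions between consecutive words."""
    mask = (1 << width) - 1
    return sum(bin((a ^ b) & mask).count("1") for a, b in zip(words, words[1:]))


# Two slowly varying (temporally correlated) streams, hypothetical data.
stream_a = [16, 17, 18, 19, 20, 21, 22, 23]
stream_b = [240, 239, 238, 237, 236, 235, 234, 233]

# Dedicated hardware: each bus only sees its own stream.
dedicated = toggles(stream_a) + toggles(stream_b)

# Shared hardware: samples alternate A, B, A, B, ... on one bus.
interleaved = [w for pair in zip(stream_a, stream_b) for w in pair]
shared = toggles(interleaved)

print("dedicated buses:", dedicated, "toggles")
print("shared bus:     ", shared, "toggles")
```

For this data the shared bus toggles far more often than the two dedicated buses combined, because consecutive words on the shared bus come from unrelated streams.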
Eliminating Redundant Computations
Power saving concepts
• Use parallel computation at a low frequency.
• Reduce pipeline stages to save registers (try to avoid hazards).
• Disable input toggling when the block is idle.
• Use minimum gate sizes to reduce the toggle current.
• For outputs with large fanout, speed up the transition to reduce the short-circuit current (invest toggle current in order to save short-circuit current).
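The first concept, parallel computation at low frequency, can be sketched with the first-order dynamic power model P = C * Vdd^2 * f. Running N copies of a unit at f / N keeps the throughput, and the relaxed timing allows a lower supply. The supply voltages and the 15% capacitance overhead for routing and multiplexing below are assumptions for illustration only.

```python
def dynamic_power(c_eff: float, vdd: float, freq: float) -> float:
    """First-order dynamic power: P = C_eff * Vdd^2 * f."""
    return c_eff * vdd ** 2 * freq


def parallel_power(c_eff, vdd_low, freq, n_units, overhead=0.15):
    """Power of n_units copies, each clocked at freq / n_units.

    `overhead` models extra capacitance for routing and multiplexing
    (assumed value, not from the slides).
    """
    total_c = n_units * c_eff * (1.0 + overhead)
    return dynamic_power(total_c, vdd_low, freq / n_units)


C_EFF = 1e-12    # farads (hypothetical)
FREQ = 50e6      # hertz (hypothetical)

reference = dynamic_power(C_EFF, 5.0, FREQ)            # single unit at 5 V
two_way = parallel_power(C_EFF, 2.9, FREQ, n_units=2)  # two units at 2.9 V
ratio = two_way / reference
print(f"two-way parallel power = {ratio:.2f} x reference")
```

With these assumed numbers the two-way parallel version uses roughly 39% of the reference power at the same throughput, at the cost of about twice the area.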
Power Management
• DPM (Dynamic Power Management): stops the clock switching of a specific unit; the unit clocks come from clock regenerators, which produce two clocks, C1 and C2. The logic: 0.3%; power savings: 10-20%.
• SPM (Static Power Management): saves power dissipation in the steady state. When the system (or subsystem) remains idle for a significant period of time, the entire chip (or subsystem) is shut down.
• Identify power-hungry modules and look for opportunities to reduce power.
• If f is increased, one has to increase the transistor size or Vdd.
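The SPM idea of shutting down after "a significant period" of idleness can be sketched as a timeout policy. This is one possible policy, not necessarily the one the slides have in mind; the function name, the timeout, and all power and energy values are hypothetical.

```python
def idle_energy(idle_periods, timeout, p_idle, p_off, e_wakeup):
    """Energy (joules) spent over a list of idle periods (seconds).

    The subsystem stays powered for `timeout` seconds of each idle period,
    then shuts down; every shutdown costs `e_wakeup` joules to resume.
    """
    total = 0.0
    for t in idle_periods:
        if t <= timeout:
            total += p_idle * t
        else:
            total += p_idle * timeout + p_off * (t - timeout) + e_wakeup
    return total


periods = [0.5, 2.0, 30.0, 120.0]   # observed idle periods in seconds (hypothetical)

always_on = idle_energy(periods, timeout=float("inf"),
                        p_idle=1.0, p_off=0.05, e_wakeup=2.0)
managed = idle_energy(periods, timeout=5.0,
                      p_idle=1.0, p_off=0.05, e_wakeup=2.0)
print(f"always on: {always_on:.2f} J, with shutdown: {managed:.2f} J")
```

Short idle periods are left alone because the wake-up cost would outweigh the saving; only the long ones trigger a shutdown.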
Power Management
• Use the right supply and the right frequency for each part of the system.
• If one has to wait for some input to occur, only a small circuit needs to wait; it wakes up the main circuit when the input occurs.
• Another technique is to reduce the base frequency for tasks that can be executed slowly.
• PowerPC 603 is a 2-issue processor (2 instructions read at a time) with 5 parallel execution units. 4 modes:
– Full-on mode for full speed
– Doze mode, in which the execution units are not running
– Nap mode, which also stops the bus clocking
– Sleep mode, which stops the clock generator, with or without the PLL (20-100 mW)
• Superpipelined MIPS R4200: 5-stage pipeline; MIPS R4400: 8-stage, 2 execution units, f/2 in reduced mode.
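The four PowerPC 603 modes listed above can be summarized as a small table of what keeps running in each mode. The sketch below encodes only what the slide states; the selection function `deepest_mode` and its parameters are hypothetical and do not describe how the processor itself chooses a mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerMode:
    name: str
    execution_units: bool   # execution units running
    bus_clock: bool         # bus clocking running
    clock_generator: bool   # clock generator running


# Ordered from highest to lowest power, as described on the slide.
MODES = [
    PowerMode("Full on", True, True, True),
    PowerMode("Doze", False, True, True),
    PowerMode("Nap", False, False, True),
    PowerMode("Sleep", False, False, False),
]


def deepest_mode(need_execution, need_bus, need_clock_generator):
    """Return the lowest-power mode that keeps the required resources on
    (hypothetical policy, for illustration only)."""
    for mode in reversed(MODES):
        if need_execution and not mode.execution_units:
            continue
        if need_bus and not mode.bus_clock:
            continue
        if need_clock_generator and not mode.clock_generator:
            continue
        return mode
    raise ValueError("no mode satisfies the requirements")


print(deepest_mode(False, True, True).name)    # bus must stay clocked
print(deepest_mode(False, False, False).name)  # nothing needs to run
```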
TI
• Two DSPs, the TMS320C541 and TMS320C542, reduce power, chip count, and system cost for wireless communication applications
• C54X DSPs, 2.7 V, 5 V, Low-Power Enhanced Architecture DSP (LEAD) family: with three different power-down modes, these devices are well suited for wireless communications products such as digital cellular phones, personal digital assistants, and wireless modems; low power on voice coding and decoding
• The TMS320LC548 features:
– 15-ns (66 MIPS) or 20-ns (50 MIPS) instruction cycle times
– 3.0- and 3.3-V operation
• 32K 16-bit words of RAM and 2K 16-bit words of boot ROM on-chip
• Integrated Viterbi accelerator that reduces the Viterbi butterfly update to four instruction cycles for GSM channel decoding
• Powerful single-cycle instructions (dual operand, parallel instructions, conditional instructions)
• Low-power standby modes
Low-power embedded system design
• Low-power embedded applications: PDAs, mobile phones, etc.; power-efficient processor cores (ARM)
• Cache/memory organization for low power
• Power management on embedded system chips; comparative analysis of power drawn by subsystems (CPU, hard disk, display, and standby) of notebooks
High-level optimization for low power
• use of parallel and/or pipelined structures,
• the choice of data representations,
• the exploitation of signal correlations,
• the synchronization of signals for glitching minimization, and an accurate analysis of the shared resources.
• At the algorithmic level, applying arithmetic and logic transformations to the block diagram
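As one illustration of how the choice of data representation affects switching, the sketch below compares the number of bit transitions when a 4-bit counter is encoded in plain binary versus Gray code. Gray code is a swapped-in example here; the slides do not name a specific representation.

```python
def gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def transitions(words) -> int:
    """Total number of bit flips between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


counts = list(range(16))                      # 4-bit counter, 0 to 15
binary_flips = transitions(counts)
gray_flips = transitions([gray(n) for n in counts])

print("binary:", binary_flips, "bit flips")
print("Gray:  ", gray_flips, "bit flips")
```

Gray code flips exactly one bit per step (15 flips over the sequence), while plain binary needs 26, so a sequentially accessed address bus, for example, would switch less with the Gray encoding.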
VLSI Signal Processing Design Methodology
• pipelining, parallel processing, retiming, folding, unfolding, look-ahead, relaxed look-ahead, and approximate filtering
• bit-serial, bit-parallel and digit-serial architectures, carry save architecture
• redundant and residue systems
• Viterbi decoder, motion compensation, 2-D filtering, and data transmission systems
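Of the transformations listed, look-ahead is easy to show on a first-order recursive filter. The sketch below, under the assumption of zero initial conditions, rewrites y[n] = a*y[n-1] + x[n] as y[n] = a^2*y[n-2] + a*x[n-1] + x[n]. The recursion then spans two samples, which leaves room for pipelining the feedback loop (and hence a lower supply voltage at the same throughput). Function names and test data are hypothetical.

```python
def iir_direct(x, a):
    """Reference filter: y[n] = a * y[n-1] + x[n], zero initial state."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * prev + sample
        y.append(prev)
    return y


def iir_lookahead(x, a):
    """One-step look-ahead form: y[n] = a^2 * y[n-2] + a * x[n-1] + x[n]."""
    y = []
    for n, sample in enumerate(x):
        y_nm2 = y[n - 2] if n >= 2 else 0.0
        x_nm1 = x[n - 1] if n >= 1 else 0.0
        y.append(a * a * y_nm2 + a * x_nm1 + sample)
    return y


signal = [1.0, 0.5, -0.25, 2.0, 0.0, -1.0, 0.75]   # hypothetical input
direct = iir_direct(signal, 0.9)
ahead = iir_lookahead(signal, 0.9)
max_error = max(abs(p - q) for p, q in zip(direct, ahead))
print(f"largest difference between the two forms: {max_error:.2e}")
```

Both forms produce the same output up to rounding; the look-ahead form trades one extra multiply-add in the feed-forward path for a longer feedback loop.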
Power-hungry Applications
• Signal Compression: HDTV Standard,
ADPCM, Vector Quantization, H.263, 2-D
motion estimation, MPEG-2 storage
management
• Digital Communications: Shaping Filters,
Equalizers, Viterbi decoders, Reed-Solomon
decoders