Temperature-Aware Design
Presented by Mehul Shah
4/29/04
The Problem
Power & Thermal densities are increasing
Operating Vdd scaling much more slowly (ITRS)
Currently at 50 W/cm²; 100 W/cm² at 50 nm technology
Power density doubles every 3 years
Cost of cooling rising exponentially
$1 - $3 per Watt of power dissipation
Packages designed for worst case power
Hot spots – heat dissipation non-uniform across chip
Low-Power design techniques not sufficient
Big Hammer : Global Clock Gating limits performance
Impact of Temperature on Design
Increased Delay, Lower Reliability
Slower Transistors
Carrier mobility lower at higher temperature
Inverter 35% slower at 110 °C vs. 60 °C
Higher Leakage Power
By orders of magnitude at higher temperature
Leakage becoming more significant than switching power
Higher Metal Resistivity
Copper 39% more resistive at 120 °C vs. 20 °C
Lower Mean-Time-To-Failure (MTF)
MTF = MTF_0 * exp(E_a / (k_b * T))
MTF decreases exponentially with temperature
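To get a feel for the MTF relation on this slide, the sketch below compares the MTF at two operating temperatures. The activation energy of 0.7 eV is an assumed, illustrative value (the slides do not give one), and the function name is hypothetical. MTF_0 cancels in the ratio, so it is not needed.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant k_b in eV/K


def mtf_ratio(temp_hot_c, temp_cool_c, activation_energy_ev):
    """Return MTF(hot) / MTF(cool) from MTF = MTF_0 * exp(E_a / (k_b * T)).

    Temperatures are given in degrees Celsius and converted to kelvin.
    """
    t_hot_k = temp_hot_c + 273.15
    t_cool_k = temp_cool_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_hot_k - 1.0 / t_cool_k
    )
    return math.exp(exponent)


# E_a = 0.7 eV is an assumption chosen only for illustration.
ratio = mtf_ratio(110.0, 60.0, activation_energy_ev=0.7)
print(f"MTF at 110 C relative to MTF at 60 C: {ratio:.4f}")
```

Under this assumed activation energy the ratio is well below 1, which is the exponential sensitivity the slide refers to.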
Moral of the Story
Problem: Temperature adversely affects power, performance & reliability
Solution: “Temperature-Aware” Design
Temperature Aware Design
Thermal Modeling
Estimate Operating Temperature
Simple : Allow architects to easily reason about thermal effects
Detailed : Model runtime temperature at Functional-Unit granularity
Computationally Efficient
Flexible : Easily extend to novel architectures
Dynamic Thermal Management
Use runtime behavior and thermal status to adjust/distribute workload among Functional-Units
Talk Outline
Thermal Modeling
Model Description
Validation & Case Studies
Dynamic Thermal Management
Results
Conclusions
References
Kevin Skadron et al., “Temperature-Aware Microarchitecture”
Wei Huang et al., “Compact Thermal Modeling for Temperature-Aware Design”
Thermal Modeling
Thermal model interacts with Power, Performance, Reliability models
Design convergence requires several iterations
Heat Flow vs. Electrical Phenomenon
Both can be described by the same differential equations
Describe design as a Thermal RC circuit
Heat Flow = Electrical Current
Temperature = Voltage
Heat Absorption Capacity = Capacitance
Functional Block = Node
Solve RC equations to obtain Node Temperature
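The analogy above can be made concrete with a very small network. The sketch below is not HotSpot itself: it is a two-node thermal RC circuit integrated with forward Euler, where each node is a functional block, injected power acts as a current source, and temperature acts as voltage. All component values and names are hypothetical and chosen only so the numbers are easy to check by hand.

```python
def simulate_thermal_rc(power_w, r_vertical, r_lateral, cap, t_ambient, dt, steps):
    """Forward-Euler integration of a two-node thermal RC circuit.

    power_w    : heat injected into each node (W), the "current sources"
    r_vertical : resistance from each node to ambient (K/W)
    r_lateral  : resistance between the two nodes (K/W)
    cap        : thermal capacitance of each node (J/K)
    Returns the two node temperatures after `steps` time steps.
    """
    temps = [t_ambient, t_ambient]
    for _ in range(steps):
        # Heat flowing laterally from node 0 to node 1 (negative if reversed).
        lateral = (temps[0] - temps[1]) / r_lateral
        incoming = [-lateral, lateral]
        updated = []
        for i in range(2):
            to_ambient = (temps[i] - t_ambient) / r_vertical[i]
            d_temp = (power_w[i] + incoming[i] - to_ambient) / cap[i] * dt
            updated.append(temps[i] + d_temp)
        temps = updated
    return temps


# Hypothetical values: a hot block (10 W) next to a cool block (2 W).
temps_final = simulate_thermal_rc(
    power_w=[10.0, 2.0],
    r_vertical=[2.0, 2.0],
    r_lateral=4.0,
    cap=[0.05, 0.05],
    t_ambient=45.0,
    dt=1e-3,
    steps=20000,
)
print([round(t, 2) for t in temps_final])
```

Solving the two steady-state node equations by hand for these values gives 61 °C and 53 °C, which the simulation converges to. The lateral resistor is what lets the cooler block draw heat away from the hot one.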
HotSpot Package
Equivalent Model
Equivalent Model (Continued)
Die Area divided into micro-architectural blocks
Spreader and Sink each divided into five blocks
One block for the area under the die (Rsp, Rhs)
Four trapezoids for the area not covered by the die
Rconvective represents thermal resistance from package to air
RC Model
Vertical R’s : heat flow between layers
Lateral R’s : heat diffusion within a layer
R1 = Block1 to Spreader, R2 = Block1 to rest of the chip
R = t / (k * A)
t : thickness
k : thermal conductivity of the material
A : Cross-sectional area
C = c * t * A
c : thermal capacitance per unit volume
Require empirical scaling factor due to lumped model
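As a worked instance of the R and C formulas above, the sketch below computes the vertical resistance and the capacitance of one block. The material values (k = 100 W/(m·K), c = 1.75e6 J/(m³·K)) and the block dimensions are illustrative assumptions, not figures from the slides, and the `scaling` parameter is a hypothetical stand-in for the empirical factor mentioned in the last bullet.

```python
def thermal_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """R = t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def thermal_capacitance(vol_heat_capacity_j_per_m3k, thickness_m, area_m2, scaling=1.0):
    """C = c * t * A, in J/K, times an optional empirical scaling factor."""
    return scaling * vol_heat_capacity_j_per_m3k * thickness_m * area_m2


# Assumed example: a 4 mm x 4 mm block on a 0.5 mm thick die.
block_area = 4e-3 * 4e-3   # m^2
die_thickness = 0.5e-3     # m

r_block = thermal_resistance(die_thickness, 100.0, block_area)
c_block = thermal_capacitance(1.75e6, die_thickness, block_area)
print(f"R = {r_block:.4f} K/W, C = {c_block:.4f} J/K")
```

For these assumed numbers R is 0.3125 K/W and C is 0.014 J/K, so the block's own time constant R * C is a few milliseconds.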
HotSpot Validation
Fallacy of Using a Power Metric
Compact Thermal Model
Equivalent Model
Equivalent Model (Cont.)
Compact Model vs. HotSpot
Arbitrary granularity grid
Thermal interface material
Spreader and Interface material under the die are divided at the same granularity as the chip
Primary Heat Flow Path
Rvertical = t / (k * A)
C = Alpha * cp * ρ * t * A
Alpha : scaling factor to account for the lumped capacitor model
cp : specific heat
ρ : material density
Equivalent Model (Secondary Path)
Interconnect Thermal Model
Self-heating power & wire length prediction
Pself = I^2 * R
R = ρm * L / Am
ρm : metal resistivity, L : wire length, Am : wire cross-sectional area
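The two interconnect formulas above combine into a short calculation. In the sketch below, the copper resistivity (about 1.7e-8 Ω·m near room temperature), the wire geometry, and the current are assumed, illustrative values. The 1.39 factor reuses the earlier statement in this talk that copper is 39% more resistive at 120 °C than at 20 °C.

```python
def wire_resistance(resistivity_ohm_m, length_m, cross_section_m2):
    """R = rho_m * L / A_m, in ohms."""
    return resistivity_ohm_m * length_m / cross_section_m2


def self_heating_power(current_a, resistance_ohm):
    """P_self = I^2 * R, in watts."""
    return current_a ** 2 * resistance_ohm


# Assumed example: a 1 mm long wire, 0.2 um x 0.4 um cross-section, 1 mA.
rho_cu_cool = 1.7e-8            # ohm*m, approximate room-temperature copper
wire_len = 1e-3                 # m
wire_area = 0.2e-6 * 0.4e-6     # m^2
current = 1e-3                  # A

r_cool = wire_resistance(rho_cu_cool, wire_len, wire_area)
r_hot = wire_resistance(1.39 * rho_cu_cool, wire_len, wire_area)
p_cool = self_heating_power(current, r_cool)
p_hot = self_heating_power(current, r_hot)
print(f"R = {r_cool:.1f} ohm, P_self = {p_cool * 1e6:.1f} uW (cool)")
print(f"R = {r_hot:.1f} ohm, P_self = {p_hot * 1e6:.1f} uW (hot)")
```

At a fixed current the self-heating power scales with resistance, so a hotter wire dissipates more, which in turn raises its temperature further.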
Equivalent Model (Secondary Path, Cont.)
Equivalent Thermal Resistance
Model Validation & Evaluation (Primary)
Transient
Steady State
Model Validation (Secondary)
Case Study
Thermal Management
Dynamic Thermal Management
Emergency Threshold : temperature above which chip is in thermal violation
Trigger Threshold : temperature above which DTM is applied
DTM Techniques
Temperature-Tracking Frequency Scaling
Feedback controlled Fetch Toggling
Migrating Computation
Dynamic Voltage Scaling (DVS)
Global Clock Gating
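To illustrate how the trigger and emergency thresholds interact with a DTM technique, the sketch below runs a single-node thermal RC model under a simple proportional controller that reduces the fetch duty cycle once the trigger threshold is crossed. This is a simplified stand-in for feedback-controlled fetch toggling, not the controller from the referenced work. The thresholds (80 °C and 85 °C), the gain, the assumption that power scales linearly with duty cycle, and all thermal values are hypothetical.

```python
def run_dtm(gain, trigger_c=80.0, emergency_c=85.0, steps=2000, dt=1e-3):
    """Simulate one thermal node with proportional fetch throttling.

    gain : duty-cycle reduction per degree above the trigger threshold;
           gain = 0.0 disables DTM entirely.
    Returns (peak temperature, number of steps spent above the emergency
    threshold).
    """
    r_th, c_th, t_ambient = 0.8, 0.05, 45.0  # K/W, J/K, deg C (assumed)
    full_power = 60.0                        # W at 100% fetch duty (assumed)
    temp = t_ambient
    peak = temp
    violations = 0
    for _ in range(steps):
        overshoot = max(0.0, temp - trigger_c)
        duty = max(0.0, 1.0 - gain * overshoot)  # throttle above trigger
        power = full_power * duty                # assumed linear in duty
        temp += dt * (power - (temp - t_ambient) / r_th) / c_th
        peak = max(peak, temp)
        if temp > emergency_c:
            violations += 1
    return peak, violations


peak_off, violations_off = run_dtm(gain=0.0)
peak_on, violations_on = run_dtm(gain=0.5)
print(f"No DTM:   peak {peak_off:.1f} C, {violations_off} steps in violation")
print(f"With DTM: peak {peak_on:.1f} C, {violations_on} steps in violation")
```

Without throttling this hypothetical chip settles near 93 °C, above the emergency threshold. With throttling it settles just above the trigger threshold and never reaches the emergency threshold, at the cost of a reduced fetch rate.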
DTM Results
Conclusions
Accurate Thermal models are essential for early design estimation
Models are similar to electrical RC networks
Arbitrary granularity for localized temperature information
Model all parts of the package
Architectural Techniques can reduce demands on the IC package by
Dynamically adjusting workload to avoid emergencies
Reducing Hot Spots