A Case for Dynamic Pipeline Scaling
Microarchitectural Approaches to
Exceeding the Complexity Barrier
Eric Rotenberg
Center for Embedded Systems Research (CESR)
Department of Electrical & Computer Engineering
North Carolina State University
www.tinker.ncsu.edu/ericro
Microarchitectural Approaches to Exceeding the Complexity Barrier
© Eric Rotenberg
1
Complexity Barrier
(general-purpose systems)
• Deep submicron designs
– Technology/circuit trends
• Billions of transistors & multi-GHz clock rates
• Low voltage for power management
• Risk-prone circuit techniques (e.g., dynamic logic) for
performance
• Reduced design tolerances overall
• Highly prone to transient faults (i.e., single-event upsets)
– Microarchitecture trends
• Increasingly sophisticated techniques for exploiting instruction-level parallelism
• Functional verification becoming intractable
• Highly prone to design faults (i.e., bugs)
Time Redundancy
• Conventional time redundancy
– Run program twice, compare answers
– Can detect transient faults
– Doubles execution time
[Figure: processor]
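The run-twice-and-compare scheme can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: `run_with_time_redundancy` and `TransientFaultDetected` are hypothetical names, and the program is assumed to be a deterministic callable without side effects, so that any disagreement between the two runs can be attributed to a fault.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientFaultDetected(Exception):
    """Raised when the two redundant runs disagree (hypothetical name)."""


def run_with_time_redundancy(program: Callable[[], T]) -> T:
    """Run `program` twice and compare the answers.

    Assumes `program` is deterministic and side-effect free.
    """
    first = program()   # first execution
    second = program()  # second execution doubles the running time
    if first != second:
        raise TransientFaultDetected(f"results differ: {first!r} vs {second!r}")
    return first


if __name__ == "__main__":
    print(run_with_time_redundancy(lambda: sum(range(10))))
```

The sketch detects a mismatch but cannot tell which run was corrupted, and it pays the full cost of the second execution, which is the overhead the following slides try to reduce.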
Microarchitectural Approaches to Exceeding the Complexity Barrier
© Eric Rotenberg
3
Simultaneous Multithreading (SMT)
• Execute multiple programs on wide
superscalar processor at same time
– Single program does not fully utilize
parallelism of wide superscalar processor
– Running two programs simultaneously takes
less time than running them consecutively
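A toy issue-slot model illustrates why sharing a wide machine helps. Everything here is an assumption made for illustration: each thread is reduced to an instruction count and a constant per-cycle parallelism (`ilp`), the machine to a fixed issue `width`, and the function names are hypothetical. Real SMT behavior depends on caches, branch prediction, and resource contention that this sketch ignores.

```python
import math


def cycles_alone(n_instr: int, ilp: int, width: int) -> int:
    """Cycles for one thread that can issue at most `ilp` instructions per cycle."""
    return math.ceil(n_instr / min(ilp, width))


def cycles_smt(n_a: int, n_b: int, ilp_a: int, ilp_b: int, width: int) -> int:
    """Cycles when two threads share the same issue slots each cycle.

    Assumes ilp_a, ilp_b, and width are all at least 1.
    """
    cycles = 0
    while n_a > 0 or n_b > 0:
        issued_a = min(ilp_a, n_a, width)             # thread A takes its slots first
        issued_b = min(ilp_b, n_b, width - issued_a)  # thread B fills leftover slots
        n_a -= issued_a
        n_b -= issued_b
        cycles += 1
    return cycles


if __name__ == "__main__":
    consecutive = 2 * cycles_alone(300, ilp=3, width=8)
    simultaneous = cycles_smt(300, 300, ilp_a=3, ilp_b=3, width=8)
    print(consecutive, simultaneous)
```

With these made-up numbers, neither thread alone fills the eight issue slots, so running both together finishes in fewer cycles than running them back to back.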
AR-SMT
[Rotenberg, FTCS-29, June 1999]
• Run two copies of program at the same
time, one slightly ahead of the other
– Advanced stream (A-stream) passes its
control flow and data flow outcomes to
redundant stream (R-stream) for checking
[Figure: SMT processor running the A-stream and the R-stream]
AR-SMT
[Rotenberg99, FTCS-29]
• Higher processor utilization reduces overhead
of time redundancy
• Can reduce overhead further
– R-stream has oracle view of the future!
– R-stream uses A-stream control flow and data flow
as 100% accurate branch/value predictions
1. R-stream executes more efficiently, yielding resources back
to A-stream
2. Exploits existing prediction verification datapaths
– “Misprediction” implies transient fault occurred
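The A-stream/R-stream interaction can be sketched as a small simulation. This is an illustrative model only, not the AR-SMT hardware: the three-operation register "ISA", the `delay` parameter, the bit-flip fault injection, and all function names are assumptions. The A-stream pushes each result into a delay buffer; the trailing R-stream re-executes the same instruction and treats the buffered value as a prediction, so a mismatch plays the role of a "misprediction".

```python
from collections import deque
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def execute(instr, regs):
    """Execute (dest, op, src1, src2) on a register dict and return the result."""
    dest, op, a, b = instr
    value = OPS[op](regs[a], regs[b])
    regs[dest] = value
    return value


def ar_smt_run(program, init_regs, fault_at=None, delay=2):
    """Return the index of the first mismatching instruction, or None.

    `fault_at` injects a single bit flip into the A-stream result at that
    index (a stand-in for a transient fault).
    """
    a_regs, r_regs = dict(init_regs), dict(init_regs)
    delay_buffer = deque()
    r_pc = 0
    for step in range(len(program) + delay):
        if step < len(program):                 # A-stream runs ahead
            value = execute(program[step], a_regs)
            if step == fault_at:
                value ^= 1                      # corrupt the A-stream outcome
                a_regs[program[step][0]] = value
            delay_buffer.append(value)
        if step >= delay and delay_buffer:      # R-stream trails by `delay`
            predicted = delay_buffer.popleft()
            if execute(program[r_pc], r_regs) != predicted:
                return r_pc                     # "misprediction": fault detected
            r_pc += 1
    return None


if __name__ == "__main__":
    prog = [("r2", "add", "r0", "r1"), ("r3", "mul", "r2", "r2"), ("r4", "sub", "r3", "r0")]
    print(ar_smt_run(prog, {"r0": 2, "r1": 3}))
    print(ar_smt_run(prog, {"r0": 2, "r1": 3}, fault_at=1))
```

The sketch only models detection. It does not model the efficiency gain the slide describes, where the R-stream uses the buffered outcomes to break its own control and data dependences.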
AR-SMT Performance
[Figure: AR-SMT performance. Recovered axis and legend labels: normalized execution time (single thread = 1) for comp, gcc, go, jpeg, and li; 4 PE and 8 PE; % of all cycles versus number of PEs used (0 to 8); A-stream and R-stream.]
DIVA
[Austin99, MICRO32]
• Innovative approach for dynamically detecting
and recovering from design faults
• Add simple checker processor at commit stage
of complex processor
– Complex core passes results to checker
– Checker re-executes instructions to confirm
correctness of results before committing them
– Checker is simple hence verifiable
– Dynamic verification relieves burden of finding all
design bugs before shipping
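The commit-stage check can be sketched as follows. This is a simplified model, not the DIVA design itself: the tuple format for core results, the three-operation "ISA", and the name `diva_commit` are assumptions. The checker recomputes each result from committed (architectural) register values, compares it with what the complex core produced, and always commits its own value.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def diva_commit(core_results, arch_regs):
    """Check and commit results from the complex core.

    core_results: iterable of (dest, op, src1, src2, core_value).
    arch_regs: committed register state, updated in place.
    Returns the number of mismatches (recoveries).
    """
    recoveries = 0
    for dest, op, a, b, core_value in core_results:
        checked = OPS[op](arch_regs[a], arch_regs[b])  # simple re-execution
        if checked != core_value:
            recoveries += 1  # a real design would also flush and restart the core
        arch_regs[dest] = checked  # only the checker's value is committed
    return recoveries


if __name__ == "__main__":
    regs = {"r0": 2, "r1": 3}
    results = [("r2", "add", "r0", "r1", 5), ("r3", "mul", "r2", "r2", 24)]  # 24 is a core bug
    print(diva_commit(results, regs), regs["r3"])
```

Because the checker reads operands from already-committed state, each check is independent of the core's speculation, which is one way to picture why a simple checker can keep up.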
DIVA
[Austin99, MICRO32]
[Figure: complex processor core, simple checker processor, and register file/memory]
• Simple checker can keep up with complex core
– Like R-stream in AR-SMT, checker not bound
by control/data dependences
Complexity Barrier
(embedded systems)
• What constrains embedded system complexity?
– Conventional wisdom
• Power, cost, etc.
• Not unique to embedded … general-purpose and embedded
systems benefit alike from technology scaling in these respects
– Real-time constraints demand analyzability
• Need safe bound on worst-case execution time (WCET)
• Worst-case timing analysis tools
– Scalar in-order pipeline with caches and static branch prediction
is a complexity limit
• Contemporary processors excluded
– Superscalar, OOO execution, dynamic branch prediction, etc.
– Unsafe because can’t statically bound WCET
VISA
[Anantaraman et al. 03, ISCA-30]
• Virtual simple architecture (VISA)
– VISA is timing specification of hypothetical simple
processor
– WCET derived for task assuming VISA
– Speculatively execute task on complex pipeline
• Divide task into multiple sub-tasks
• Sub-tasks assigned soft deadlines (checkpoints) based on
latest allowable completion time on VISA
• Safe progress on unsafe processor confirmed for as long as
checkpoints are met
• If miss checkpoint, reconfigure complex pipeline to simple
operating mode that directly implements VISA
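The checkpoint scheme can be sketched with a little arithmetic. The slide does not give the formula, so the one used here is an assumption: the checkpoint for sub-task i is the deadline minus a mode-switch overhead minus the VISA WCETs of sub-tasks i through the last, which conservatively assumes a sub-task that misses its checkpoint has made no progress. The function names and the `switch_overhead` parameter are hypothetical.

```python
from itertools import accumulate


def checkpoints(sub_wcets, deadline, switch_overhead=0):
    """Soft deadline for each sub-task (assumed formula, see lead-in).

    checkpoint[i] = deadline - switch_overhead - sum(sub_wcets[i:])
    """
    # Suffix sums: remaining[i] is the VISA WCET of sub-tasks i..end.
    remaining = list(accumulate(reversed(sub_wcets)))[::-1]
    return [deadline - switch_overhead - r for r in remaining]


def first_missed_checkpoint(finish_times, sub_wcets, deadline, switch_overhead=0):
    """Index of the first sub-task that finishes after its checkpoint on the
    complex pipeline, or None if every checkpoint is met.

    A returned index marks where the pipeline would switch to simple mode.
    """
    cps = checkpoints(sub_wcets, deadline, switch_overhead)
    for i, (finish, cp) in enumerate(zip(finish_times, cps)):
        if finish > cp:
            return i
    return None


if __name__ == "__main__":
    wcets = [4, 3, 5]  # per-sub-task WCETs on the simple (VISA) pipeline
    print(checkpoints(wcets, deadline=20))
    print(first_missed_checkpoint([2, 5, 9], wcets, deadline=20))
    print(first_missed_checkpoint([2, 13, 14], wcets, deadline=20))
```

Under this formula, missing a checkpoint still leaves enough time to finish all remaining work in simple mode before the deadline, which is what makes speculative execution on the complex pipeline safe.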
VISA
[Anantaraman et al. 03, ISCA-30]
[Figure: layered stack, top to bottom: EDF scheduler, DVS scheduling, etc.; worst-case timing analysis; WCET abstraction; virtual simple architecture; complex processor with simple mode]
• Circumvent worst-case timing analysis of
complex processors by dynamically confirming
behavior bounded by WCET of simpler proxy
Summary
• Microarchitectural approaches provide
innovative alternatives for dealing with
uncertainty
– Theme: View any unsafe system as
speculative, provide dynamic checking and
recovery from “mispredictions”
– Mispredictions
• Transient faults
• Design faults
• WCET faults