Validating a Modern Microprocessor

Download Report

Transcript Validating a Modern Microprocessor

Validating A Modern
Microprocessor
Bob Bentley
Intel Corporation
Enterprise Microprocessor Group
Hillsboro, Oregon U.S.A.
Page 1
Moore’s Law - 1965
Page 2
Source: Intel Museum
Moore’s Law - 40 Years Later
Process Name P854
P856
P858
Px60
P1262
P1264 P1266
1st Production 1995
1997
1999
2001
2003
2005
2007
65nm
45nm
Lithography
0.35mm 0.25mm 0.18mm 0.13mm 90nm
Gate Length
0.35mm 0.20mm 0.13mm <70nm <50nm <35nm <25nm
Wafer Size
(mm)
200
200
200
200/300 300
300
300
A new process every two years
Page 3
Source: Intel
300mm Semiconductor Economics
Fab
Pilot line
R&D process team
$3 billion
$1-2 billion
$0.5-1 billion
$5 billion investment requires high volume
to achieve reasonable unit cost
Page 4
Source: Intel
The Validation Challenge
 Microprocessor validation continues to be driven by the
economics of Moore’s Law
– Each new process generation doubles the number of transistors
available to microprocessor architects and designers
– Some of this increase is consumed by larger structures (caches,
TLB, etc.), which have no significant impact to validation
– The rest goes to increased complexity:
– Out-of-order, speculative execution machines
– Deeper pipelines
– New technologies (Hyper-Threading, 64-bit extensions,
virtualization, security, …
– Multi-core designs
– Increased complexity => increased validation effort and risk
High volumes magnify the cost of a
validation escape
Page 5
Microprocessor Design
Page 6
Microprocessor Design Scope
 Typical lead CPU design requires:
– 500+ person design team:
–
–
–
–
logic and circuit design
physical design
validation and verification
design automation
– 2-2½ years from start of RTL development to A0
tapeout
– 9-12 months from A0 tapeout to production qual (may
take longer for workstation/server products)
One design cycle = 2 process generations
Page 7
Pentium® 4 Processor
 RTL coding started: 2H’96
– First cluster models released: late ’96
– First full-chip model released: Q1’97
 RTL coding complete: Q2’98
– “All bugs coded for the first time!”






RTL under full ECO control: Q2’99
RTL frozen: Q3’99
A-0 tapeout: December ’99
First packaged parts available: January 2000
First samples shipped to customers: Q1’00
Production ship qualification granted: October 2000
Page 8
RTL – A Moving Target
# Files Checked In
RTL Coding Complete
Total # Lines of RTL
# Lines Changed
3000 files, 1.3M lines total
(including comments, white space)
250K lines changed
in one week
A0 tapeout
Functionality Focused
2001-04
2001-02
2000-12
2000-10
2000-08
2000-06
2000-04
2000-02
1999-12
1999-10
1999-08
1999-06
1999-04
1999-02
1998-12
1998-10
1998-08
1998-06
1998-04
1998-02
1997-12
1997-10
1997-08
1997-06
1997-04
1997-02
1996-12
1996-10
1996-08
1996-06
1996-04
1996-02
First Full-Chip
RTL Model
Timing Focused
Page 9
Microprocessor Validation
Page 10
RTL validation environment
 RTL model is MUCH slower than real silicon
– A full-chip simulation with checkers runs at ~20 Hz on a
Pentium® 4 class machine
– We use a compute farm containing ~6K CPUs running 24/7 to
get tens of billions of simulation cycles per week
– The sum total of Pentium® 4 RTL simulation cycles run prior to
A0 tapeout < 1 minute on a single 2 GHz system
 Pre-silicon validation has some advantages …
– Fine-grained (cycle-by-cycle) checking
– Complete visibility of internal state
– APIs to allow event injection
 … but no amount of dynamic validation is enough
– A single dyadic extended-precision (80-bit) FP instruction has
O(10**50) possible combinations
– Exhaustive testing is impossible, even on real silicon
Page 11
Pentium® 4 Formal
Verification
 First large-scale effort at Intel (~60 person years) to
apply formal verification techniques to CPU design
– Applying FV to a moving target is a big challenge!
 Mostly model checking, with some later work using
theorem proving to connect FP proofs to IEEE 754
 More than 14,000 properties in key areas:
– FP Execution units
– Instruction decode
– Out-of-order control mechanisms
 Found ~20 “high quality” bugs that would have been
hard to detect by dynamic testing
 No silicon bugs found to date in areas proved by FV 
Page 12
Sources of Bugs
1200
1000
800
600
400
200
0
Q4'96 Q1'97 Q2'97 Q3'97 Q4'97 Q1'98 Q2'98 Q3'98 Q4'98 Q1'99 Q2'99 Q3'99 Q4'99
CTE
Full-chip
Inspection
Other
Page 13
Unit-Level Coverage
Page 14
51'99
48'99
45'99
42'99
39'99
36'99
33'99
30'99
27'99
24'99
21'99
18'99
15'99
12'99
09'99
06'99
03'99
52'98
49'98
46'98
43'98
40'98
Pre-silicon Bug Rate
100
90
80
70
60
50
40
30
20
10
0
Page 15
Formal Verification
Page 16
Pentium® 4 Formal Property
Verification
 Objective:
– Complement other validation
activities
– Correctness, not bug hunting
 Strategy:
– Prove unit properties first,
then multiple-unit protocols &
assumptions
– Maintain properties in face of
an evolving design
System Bus
Level 1 Data Cache
Bus Unit
Level 2 Cache
Execution Units
Memory Subsystem
Integer and FP Execution Units
Fetch/
Decode
Trace
Cache
Microcode
ROM
 Tools:
– Model checking
– Symbolic simulation
– Theorem proving
BTB/Branch Prediction
Front End
Out-oforder
execution
logic
Retirement
Branch History Update
Out-of-order Engine
Pentium® 4 Basic Block Diagram
Page 17
Floating Point Verification Multiplication
S2
S1
 Verified adherence to
IEEE 754 spec including
Mantissa
results, flags & faults
Booth
datapath
Encoder
 Design specified as very C
Exponent
O
Partial Products
low-level RTL
datapath
N
generator
T
…
 Huge capacity challenge R
O
for traditional MC
Wallace Tree
L
techniques
Adder Network
 Verification required
multiple person years
Rounder logic
and iterations
 Ultimate solution: combined bit-vector level checked
results using theorem proving
 Proof framework used on subsequent designs
Page 18
Combining Formal and Dynamic
Verification
 On Pentium® 4, we treated FPV and dynamic
verification as essentially independent activities
 Current projects are seeking to exploit synergy
between the two techniques
 Currently using SAT solver technology as a
bridge between the two worlds
– SAT solvers provide much greater capacity, reducing
or eliminating the need for problem decomposition
– Allows us to do bug hunting (falsification) in addition
to verification
– Use dynamic validation to confirm or refute counterexamples
Page 19
High-Level Formal Verification
 Formal Design Process: Work with architects &
designers while design is being developed
– Abstract model of architecture built and formally checked
– Fast focus on algorithm problems
– Intensive and quick feedback for every definition change
– Find errors in early design approaches
 Goal: Completely verify architectural correctness
– Avoid finding architectural bugs late in design cycle, when fixing
them is much more difficult and costly
– Drive rigor, completeness and correctness of the high level design
– Speed up design phase based upon complete specification
– Cleaner, simpler design – fewer warts
Page 20
Verifying Architectural
Correctness
 Top Level Specification
 Architectural Feature Model (AFM)
– Declare players, describe their roles in upholding top-level
specification
– Model check for violations
 Microarchitectural Block Model (UBM)
– Player event oriented detail: case-by-case description of how each
block reacts to communications
– Verify by inserting UBM into AFM
– Fully expose and understand uarch and role in supporting top level
– Analyze uarch alternatives: UBMs can be developed fairly quickly
 Microarchitectural State Model (USM)
– One-to-one correspondence with final set of UBMs
– Describe uarch in a state-centric way
Page 21
Future Opportunities &
Challenges
Page 22
Tools & Methodology
 Full verification with SAT-based model checkers
– Currently only feasible on simple examples
– Overlapping loops/feedback in real designs cannot be handled
– Explicit induction schemes
 Accumulate design coverage for SAT-based techniques
– Need to integrate formal and dynamic (simulation) coverage to
fully understand coverage holes
 Expand formal methods beyond functional correctness
– Other areas of concern include performance, power, timing …
– These areas may be amenable to formal analysis
 Other tool opportunities:
– Parallel and Distributed Checkers
– Debugging, interactive Model Checking
Page 23
Methodology drivers
 Regression
– RTL is “live”, and changes frequently until the very last stages of
the project
– Model checking automation at lower levels allows regression to
be automated and provides robustness in the face of ECOs
 Debugging
– Need to be able to demonstrate FV counter-examples to
designers and architects
– Designers want a dynamic test that they can simulate
– Waveform viewers, schematic browsers, etc. can help to bridge
the gap
 Verification in the large
– Proof design: how do we approach the problem in a systematic
fashion?
– Proof engineering: how do we write maintainable and modifiable
proofs?
Page 24
Reasoning
 Integrated reasoning within checker
– Current status:
– Theorem provers have integrated decision procedures, model
checkers don’t have reasoning capability
– Not much improvement in exhaustive MC capacity, need
mechanical assistance for problem decomposition
– Want lightweight framework to direct MC/BMC proof strategy
– Case split, abstraction, symmetry detection, “what if”, …
– User guided problem decomposition
– Standard HDLs make it difficult for FV to automatically identify
symmetry
 Reasoning about changes
– Want to fully verify design during development, but real
designs go through many iterations
Page 25
Other Challenges
 Dealing with constantly-changing specifications
– Specification changes are a reality in design
– Properties and proofs should be readily adapted
– How to engineer agile and robust regressions?
 Protocol Verification
– This problem has always been hard
– Getting harder (more MP) and more important (intra-die
protocols make it more expensive to fix bugs)
 Verification of embedded software
– S/W for large SoCs has impact beyond functional
correctness (power, performance, …)
– Not all S/W verification techniques apply because H/W
abstraction is less feasible
– One example is microcode verification
Page 26