Transcript Slide 1

Moore’s Law in Microprocessors
Transistors on lead microprocessors double every 2 years
[Figure: transistors (MT, log scale from 0.001 to 1000) vs. year, 1970-2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Annotation: 2X growth in 1.96 years!]
1
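The doubling claim on the Moore's Law slide can be turned into a yearly rate with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical (not from the slides) and it assumes smooth exponential growth.

```python
def annual_growth_rate(factor: float, years: float) -> float:
    """Compound yearly growth rate implied by growing `factor`-fold in `years` years."""
    return factor ** (1.0 / years) - 1.0


def growth_over(factor: float, years: float, elapsed: float) -> float:
    """Total growth multiple after `elapsed` years at the same exponential pace."""
    return factor ** (elapsed / years)


# "2X growth in 1.96 years" from the slide
rate = annual_growth_rate(2.0, 1.96)
thirty_year_multiple = growth_over(2.0, 1.96, 30.0)

print(f"implied growth per year: {rate:.1%}")
print(f"implied growth over 30 years: {thirty_year_multiple:,.0f}x")
```

Doubling every 1.96 years works out to roughly 42% more transistors per year.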
Evolution in DRAM Chip Capacity
[Figure: Kbit capacity/chip (log scale) vs. year, 1980-2010. Data points: 64, 256, 1,000, 4,000, 16,000, 64,000, 256,000, 1,000,000, 4,000,000, 16,000,000, and 64,000,000 Kbit. Feature-size labels: 1.6-2.4 µm, 1.0-1.2 µm, 0.7-0.8 µm, 0.5-0.6 µm, 0.35-0.4 µm, 0.18-0.25 µm, 0.13 µm, 0.1 µm, 0.07 µm. Annotation: 4X growth every 3 years!]
2
Die Size Growth
Die size grows by 14% to satisfy Moore’s Law
[Figure: die size (mm, log scale) vs. year, 1970-2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Annotations: ~7% growth per year; ~2X growth in 10 years.]
3
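The three die-size figures (14%, ~7% per year, ~2X in 10 years) can be checked against each other. The sketch below assumes the 14% figure refers to a two-year step; the slide does not state the period, so treat that as an assumption. Function names are hypothetical.

```python
import math


def compound(rate_per_year: float, years: float) -> float:
    """Growth multiple after compounding `rate_per_year` for `years` years."""
    return (1.0 + rate_per_year) ** years


def doubling_time(rate_per_year: float) -> float:
    """Years needed to double at a fixed compound yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)


two_year_multiple = compound(0.07, 2)    # assumed period for the 14% figure
ten_year_multiple = compound(0.07, 10)   # compare with "~2X growth in 10 years"
years_to_double = doubling_time(0.07)

print(f"2 years at 7%/yr:  {two_year_multiple:.3f}x")
print(f"10 years at 7%/yr: {ten_year_multiple:.3f}x")
print(f"doubling time:     {years_to_double:.1f} years")
```

Under that reading the numbers agree: 7% per year compounds to a little over 14% in two years and just under 2X in ten.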
Clock Frequency
Lead microprocessor frequency doubles every 2 years
[Figure: frequency (MHz, log scale from 0.1 to 10000) vs. year, 1970-2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6. Annotation: 2X every 2 years. Courtesy, Intel]
4
Power Dissipation
Lead microprocessor power continues to increase
[Figure: power (Watts, log scale from 0.1 to 100) vs. year, 1971-2000, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6.]
Power delivery and dissipation will be prohibitive
5
Power Density
[Figure: power density (W/cm2, log scale from 1 to 10000) vs. year, 1970-2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, and P6, with reference levels marked for a hot plate, a nuclear reactor, and a rocket nozzle.]
Power density too high to keep junctions at low temp
6
Design Productivity Trends
[Figure: two log-scale curves vs. year, 1981-2009. Complexity: Logic Transistor per Chip (M), complexity growth rate 58%/Yr. compounded. Productivity: (K) Trans./Staff - Mo., productivity growth rate 21%/Yr. compounded.]
Complexity outpaces design productivity
Courtesy, ITRS Roadmap
7
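The two compound rates on the productivity slide imply a gap that widens every year. A minimal sketch, assuming both rates stay constant; the function name and the reading of the ratio as relative design effort are illustrative assumptions, not taken from the slides.

```python
def design_gap(years: float,
               complexity_rate: float = 0.58,
               productivity_rate: float = 0.21) -> float:
    """Factor by which chip complexity outgrows designer productivity after `years`.

    If effort is roughly transistors divided by transistors per staff-month,
    this ratio is also the growth in staff-months needed per chip.
    """
    return ((1.0 + complexity_rate) / (1.0 + productivity_rate)) ** years


for elapsed in (1, 5, 10):
    print(f"after {elapsed:2d} years: gap x{design_gap(elapsed):.1f}")
```

At these rates the gap grows by about 30% per year, which is the sense in which complexity outpaces design productivity.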
SIA Roadmap
Year                   1999   2002      2005   2008   2011   2014
Feature size (nm)      180    130       100    70     50     35
Mtrans/cm2             7      14-26     47     115    284    701
Chip size (mm2)        170    170-214   235    269    308    354
Signal pins/chip       768    1024      1024   1280   1408   1472
Clock rate (MHz)       600    800       1100   1400   1800   2200
Wiring levels          6-7    7-8       8-9    9      9-10   10
Power supply (V)       1.8    1.5       1.2    0.9    0.6    0.6
High-perf power (W)    90     130       160    170    174    183
Battery power (W)      1.4    2.0       2.4    2.0    2.2    2.4
8
9
10
Design Abstraction Levels
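[Figure: design abstraction levels, from highest to lowest: SYSTEM, MODULE (adder symbol), GATE, CIRCUIT (Vin, Vout), DEVICE (terminals G, S, D over n+ regions).]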
11
Major Design Challenges
• Microscopic issues
– ultra-high speeds
– power dissipation and supply rail drop
– growing importance of interconnect
– noise, crosstalk
– reliability, manufacturability
– clock distribution
• Macroscopic issues
– time-to-market
– design complexity (millions of gates)
– high levels of abstractions
– reuse and IP, portability
– systems on a chip (SoC)
– tool interoperability

Year   Tech. (µm)   Complexity   Frequency   3 Yr. Design Staff Size   Staff Costs
1997   0.35         13 M Tr.     400 MHz     210                       $90 M
1998   0.25         20 M Tr.     500 MHz     270                       $120 M
1999   0.18         32 M Tr.     600 MHz     360                       $160 M
2002   0.13         130 M Tr.    800 MHz     800                       $360 M
12
13
14
15
16
17
18
19
20
21
Programmable Logic Technologies
• Fuse and anti-fuse
– Fuse makes or breaks link between two wires
– Typical connections are 50-300 ohm
– One-time programmable (testing before programming?)
– Very high density
• EPROM and EEPROM
– High power consumption
– Typical connections are 2K-4K ohm
– Fairly high density
• RAM-based
– Memory bit controls a switch that connects/disconnects two wires
– Typical connections are 0.5K-1K ohm
– Can be programmed and re-programmed in the circuit
– Low density
22
Altera EPLD (Erasable Programmable Logic Devices)
• Historical Perspective
– PALs: same technology as programmed-once bipolar PROM
– EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light
• Altera building block = MACROCELL
[Figure: macrocell block diagram. Labels: CLK; Clk MUX; 8 Product Term AND-OR Array + Programmable MUX's; AND ARRAY; Seq. Logic Block (Q); Output MUX; Invert Control (programmable polarity); F/B MUX (programmable feedback); pad; I/O Pin.]
23
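The combinational part of a macrocell (a sum of up to 8 product terms followed by a programmable inversion) can be modelled in a few lines. This is a sketch under stated assumptions, not Altera's implementation: the data layout, names, and the idea of passing the polarity bit as a boolean are hypothetical, and the sequential logic block and feedback MUX are left out.

```python
from typing import Dict, List

# A product term maps an input name to the value it must have.
# Inputs that are not listed are don't-cares for that term.
ProductTerm = Dict[str, bool]

MAX_PRODUCT_TERMS = 8  # from "8 Product Term AND-OR Array" on the slide


def eval_product_term(term: ProductTerm, inputs: Dict[str, bool]) -> bool:
    """AND of all literals in one product term."""
    return all(inputs[name] == wanted for name, wanted in term.items())


def macrocell_output(terms: List[ProductTerm],
                     inputs: Dict[str, bool],
                     invert: bool = False) -> bool:
    """OR the product terms together, then apply the programmable polarity bit."""
    if len(terms) > MAX_PRODUCT_TERMS:
        raise ValueError("more product terms than one macrocell provides")
    sum_of_products = any(eval_product_term(t, inputs) for t in terms)
    return sum_of_products != invert  # XOR with the invert control


# Example: a XOR b needs two product terms.
xor_terms = [{"a": True, "b": False}, {"a": False, "b": True}]
print(macrocell_output(xor_terms, {"a": True, "b": False}))               # XOR
print(macrocell_output(xor_terms, {"a": True, "b": False}, invert=True))  # XNOR
```

The polarity bit is useful when the complement of a function needs fewer product terms than the function itself.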
Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells
[Figure: macrocell clock modes, personalized by EPROM bits. In each mode an EPROM cell sets the Clk MUX, which chooses between Global CLK and OE/Local CLK for the flipflop (Q).]
Synchronous Mode: flipflop controlled by global clock signal; local signal computes output enable
Asynchronous Mode: flipflop controlled by locally generated clock signal
+ Seq Logic: could be D or T, positive or negative edge triggered
+ product term to implement clear function
24
Actel Logic Module
Basic Module is a Modified 4:1 Multiplexer
[Figure: three 2:1 MUXes with data inputs D0-D3, select inputs SOA, SOB, S0, and S1, and output Y.]
Example: Implementation of S-R Latch
[Figure: 2:1 MUXes wired with constants "0" and "1", inputs S and R, and output Q.]
25
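A multiplexer-based module like this can be explored with a small behavioural model. The wiring below is an assumption, since the figure did not survive extraction: SOA selects between D0 and D1, SOB selects between D2 and D3, and the OR of S0 and S1 selects between the two results. The S-R latch mapping is one possible wiring (with S taking priority when both inputs are 1), not necessarily the one drawn on the slide.

```python
def mux2(d0: int, d1: int, sel: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def logic_module(d0, d1, d2, d3, soa, sob, s0, s1) -> int:
    """Assumed module wiring: two first-level MUXes feeding an output MUX."""
    upper = mux2(d0, d1, soa)
    lower = mux2(d2, d3, sob)
    return mux2(upper, lower, s0 | s1)


def and2(a: int, b: int) -> int:
    """2-input AND mapped onto one module by tying unused inputs to constants."""
    return logic_module(0, 0, 0, b, 0, 1, a, 0)


def sr_latch_next(q: int, s: int, r: int) -> int:
    """Next state of an S-R latch built from the module with Q fed back to D0.

    R=1 selects the constant 0, S=1 selects the constant 1, otherwise Q holds.
    """
    return logic_module(q, 0, 1, 1, r, 0, s, 0)


q = 0
for s, r in [(1, 0), (0, 0), (0, 1), (0, 0)]:
    q = sr_latch_next(q, s, r)
    print(f"S={s} R={r} -> Q={q}")
```

Tying data and select inputs to constants or to other signals is how one module is personalized into different gates.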
Actel Interconnect
[Figure: interconnection fabric showing logic modules, horizontal tracks, vertical tracks, and anti-fuses.]
26
Xilinx Programmable Gate Arrays
[Figure: chip layout showing IOBs, CLBs, and wiring channels.]
• CLB - Configurable Logic Block
– 5-input, 1 output function
– or 2 4-input, 1 output functions
– optional register on outputs
• Built-in fast carry logic
• Can be used as memory
• Three types of routing
– direct
– general-purpose
– long lines of various lengths
• RAM-programmable
– can be reconfigured
27
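The CLB bullets above (a 5-input function, or two 4-input functions, usable as memory) follow from treating each function generator as a small lookup table. The sketch below is a simplified model, not the actual Xilinx circuit: the class and function names are hypothetical, and it assumes both 4-input tables share the same four inputs while a fifth input selects between them.

```python
from itertools import product


class LUT4:
    """4-input lookup table: 16 stored bits addressed by the input values.

    The same 16 bits can be viewed as a 16x1 memory.
    """

    def __init__(self, func):
        # Fill the table by evaluating `func` on every input combination.
        self.bits = [int(bool(func(*ins))) for ins in product((0, 1), repeat=4)]

    def __call__(self, a: int, b: int, c: int, d: int) -> int:
        return self.bits[(a << 3) | (b << 2) | (c << 1) | d]


def make_5_input_function(func5):
    """Build a 5-input function from two LUT4s plus a selector on input e."""
    f = LUT4(lambda a, b, c, d: func5(a, b, c, d, 0))  # half with e = 0
    g = LUT4(lambda a, b, c, d: func5(a, b, c, d, 1))  # half with e = 1

    def h(a, b, c, d, e):
        return g(a, b, c, d) if e else f(a, b, c, d)

    return h


def parity5(*bits):
    """Example target function: 1 when an odd number of inputs are 1."""
    return sum(bits) % 2


clb_parity = make_5_input_function(parity5)
print(clb_parity(1, 0, 1, 1, 0))
```

Splitting on one input and selecting between two smaller tables is the standard Shannon expansion, which is why any 5-input function fits.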
[Figure: architecture overview. CLBs connected through programmable interconnect and switch matrices.]
[Figure: I/O Blocks (IOBs). Labels: Pad, Vcc, Input Buffer, Delay, Output Buffer, Slew Rate Control, Passive Pull-Up/Pull-Down, D and Q flip-flop terminals.]
[Figure: Configurable Logic Blocks (CLBs). Labels: inputs C1-C4 (H1, DIN, S/R, EC), G1-G4, F1-F4, and clock K; function generators G, F, and H producing G', F', and H'; S/R Control; flip-flops with D, Q, SD, RD, and EC; outputs X and Y.]
28
The Xilinx 4000 CLB
29
Xilinx 4000 Interconnect
30
Switch Matrix
31
Xilinx 4000 Interconnect Details
32
Computer-Aided Design
• Can't design FPGAs by hand
– Way too much logic to manage, hard to make changes
• Hardware description languages
– Specify functionality of logic at a high level
• Validation: high-level simulation to catch specification errors
– Verify pin-outs and connections to other system components
– Low-level to verify mapping and check performance
• Logic synthesis
– Process of compiling HDL program into logic gates and flip-flops
• Technology mapping
– Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs)
33
CAD Tool Path (cont’d)
• Placement and routing
– Assign logic blocks to functions
– Make wiring connections
• Timing analysis - verify paths
– Determine delays as routed
– Look at critical paths and ways to improve
• Partitioning and constraining
– If design does not fit or is unroutable as placed, split into multiple chips
– If design is too slow, prioritize critical paths, fix placement of cells, etc.
– Few tools to help with these tasks exist today
• Generate programming files - bits to be loaded into chip for configuration
34
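The timing-analysis step (determine delays as routed, then look at critical paths) amounts to a longest-path search over the routed netlist. A minimal sketch, assuming the netlist is an acyclic graph with one delay per node; the function name, node names, and delay values are all hypothetical, and real tools also model setup times and clock skew.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(delays, edges):
    """Return (total_delay, path) for the slowest path through a DAG.

    delays: node -> delay contributed by that node
    edges:  list of (source, destination) connections
    """
    preds = {node: set() for node in delays}
    for src, dst in edges:
        preds[dst].add(src)

    arrival = {}    # latest arrival time at each node's output
    came_from = {}  # predecessor on the slowest path into each node
    for node in TopologicalSorter(preds).static_order():
        worst = max(preds[node], key=lambda p: arrival[p], default=None)
        came_from[node] = worst
        arrival[node] = delays[node] + (arrival[worst] if worst is not None else 0.0)

    end = max(arrival, key=arrival.get)
    total = arrival[end]
    path = []
    node = end
    while node is not None:  # walk back from the slowest endpoint
        path.append(node)
        node = came_from[node]
    path.reverse()
    return total, path


# Made-up delays in nanoseconds for illustration.
delays = {"in": 0.0, "lut1": 2.0, "lut2": 2.5, "route": 1.0, "ff": 0.5}
edges = [("in", "lut1"), ("in", "lut2"), ("lut1", "route"),
         ("route", "ff"), ("lut2", "ff")]
total, path = critical_path(delays, edges)
print(total, " -> ".join(path))
```

Once the slowest path is known, it becomes the candidate for the constraint and placement fixes listed above.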
Xilinx CAD Tools
• Verilog (or VHDL) used to specify logic at a high level
– Combine with schematics, library components
• Synopsys
– Compiles Verilog to logic
– Maps logic to the FPGA cells
– Optimizes logic
• Xilinx APR - automatic place and route (simulated annealing)
– Provides controllability through constraints
– Handles global signals
• Xilinx Xdelay - measure delay properties of mapping and aid in iteration
• Xilinx XACT - design editor to view final mapping results
35
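Placement by simulated annealing, mentioned above for the place-and-route step, can be illustrated with a toy example. This is a generic textbook sketch, not how the Xilinx tool works internally: the cost function (Manhattan wire length), the swap move, the cooling schedule, and all names and parameter values are assumptions chosen for brevity.

```python
import math
import random


def wirelength(placement, nets):
    """Total Manhattan distance over all two-pin nets."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def anneal_placement(blocks, nets, grid, steps=5000, t0=5.0, alpha=0.999, seed=0):
    """Place blocks on a grid x grid array; returns (best, best_cost, initial_cost).

    Moves swap the positions of two blocks, so empty slots are never used
    (a simplification of this sketch).
    """
    rng = random.Random(seed)
    slots = [(x, y) for x in range(grid) for y in range(grid)]
    rng.shuffle(slots)
    placement = dict(zip(blocks, slots))  # random starting placement

    cost = initial_cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    temp = t0
    for _ in range(steps):
        a, b = rng.sample(blocks, 2)
        placement[a], placement[b] = placement[b], placement[a]
        delta = wirelength(placement, nets) - cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost += delta
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            placement[a], placement[b] = placement[b], placement[a]  # undo swap
        temp *= alpha
    return best, best_cost, initial_cost


# Nine blocks wired as a chain on a 3x3 array.
blocks = [f"b{i}" for i in range(9)]
nets = [(f"b{i}", f"b{i + 1}") for i in range(8)]
best, best_cost, initial_cost = anneal_placement(blocks, nets, grid=3)
print(f"wire length: {initial_cost} -> {best_cost}")
```

Accepting some worsening moves early on is what lets the search escape poor local arrangements; constraints such as fixed cell positions would be added as restrictions on which swaps are allowed.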