Distributed Loss Compensation for
Low-latency On-chip Interconnects
Class Presentation For
Advanced VLSI Design Course
Instructor: Dr. Fakhraie
Presented By: Fahimeh Alsadat Hoseini
Winter 2006
University of Tehran
Dept. of Electrical and Computer Engineering
Major Reference:
ISSCC 2006 / SESSION 21 / ADVANCED CLOCKING, LOGIC AND SIGNALING TECHNIQUES / 21.7
Outline
Scaling trends and motivation
Prior work on low-latency repeater-less links by
exploiting transmission-line behavior
Negative-impedance converter (NIC)
System architecture
Transmitter design
Receiver design
Measurement results
Summary and conclusions
References
Scaling Trends - ITRS Roadmap
On-chip wire delay scaling more slowly than gate delay.
Impact of scaling is worst on global wiring.
Jose et al., ISSCC, 2006.
Motivation
Wire delays (D) grow quadratically with wire length: $D \propto R_{wire} C_{wire} L^2$
- Unacceptably large for long wires.
- Wire bandwidths, which are inversely proportional to D, degrade.
Latency is controlled with repeater insertion which allows linear scaling of
delay with length.
- Break long wires into N shorter segments
- Drive each one with an inverter or buffer
- Optimal number of repeaters can be determined to minimize delay
- Repeaters, often inserted with CAD tools, consume a significant
fraction of on-chip power and area in microprocessors.
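The contrast between quadratic and linear delay scaling can be sketched numerically. The model below is a simplification and is not taken from the slides: it assumes an Elmore-style delay of 0.38·r·c·l² for each distributed RC segment plus a fixed per-repeater delay `t_rep`. The per-unit-length values `R_PER_MM` and `C_PER_MM` and the value of `T_REP` are hypothetical.

```python
import math

ELMORE = 0.38  # assumed 50%-delay coefficient for a distributed RC line


def unrepeated_delay(r, c, length):
    """Delay of one unbroken wire; grows with the square of its length."""
    return ELMORE * r * c * length ** 2


def repeated_delay(r, c, length, t_rep, n):
    """Delay of the same wire split into n equal segments, one repeater each."""
    segment = length / n
    return n * (ELMORE * r * c * segment ** 2 + t_rep)


def optimal_segments(r, c, length, t_rep):
    """Integer n that minimizes repeated_delay under this simplified model."""
    n_real = length * math.sqrt(ELMORE * r * c / t_rep)
    # The delay is convex in n, so the best integer is next to n_real.
    candidates = {max(1, math.floor(n_real)), max(1, math.ceil(n_real))}
    return min(candidates, key=lambda n: repeated_delay(r, c, length, t_rep, n))


# Hypothetical wire and repeater parameters (not from the slides).
R_PER_MM = 100.0     # ohm/mm
C_PER_MM = 0.2e-12   # F/mm
T_REP = 20e-12       # s, intrinsic delay of one repeater

if __name__ == "__main__":
    for length_mm in (7, 14, 28):
        n = optimal_segments(R_PER_MM, C_PER_MM, length_mm, T_REP)
        bare = unrepeated_delay(R_PER_MM, C_PER_MM, length_mm)
        rep = repeated_delay(R_PER_MM, C_PER_MM, length_mm, T_REP, n)
        print(f"{length_mm:>2} mm: unrepeated {bare * 1e12:7.1f} ps, "
              f"repeated {rep * 1e12:6.1f} ps with N = {n}")
```

Under these assumptions, doubling the length quadruples the unrepeated delay but only about doubles the optimally repeated delay, at the cost of roughly twice as many repeaters.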
Nearly ‘speed-of-light’ wires
The time-domain solution is:
• In this case, A is a constant. Gamma, the propagation constant,
provides information about the characteristics of this line. The
imaginary part of gamma, denoted by beta, is inversely related to the
phase velocity, and the real part of gamma, denoted by alpha, is the
attenuation constant.
R. Chang, Thesis, 2002.
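For reference, the standard textbook form of a forward-traveling sinusoidal wave on a lossy line, which matches the description of A, gamma, alpha and beta given here, can be written as follows. The per-unit-length parameters R, L, G and C, the position z and the angular frequency ω are symbols assumed for this sketch (A is taken as real); they are not taken from the slide.

```latex
\begin{aligned}
v(z,t) &= \operatorname{Re}\{A\, e^{-\gamma z} e^{j\omega t}\}
        = A\, e^{-\alpha z} \cos(\omega t - \beta z), \\
\gamma &= \alpha + j\beta = \sqrt{(R + j\omega L)(G + j\omega C)}, \\
v_p &= \frac{\omega}{\beta}.
\end{aligned}
```

With G negligible, $\omega L \ll R$ gives the diffusive RC behavior, while $\omega L \gg R$ gives $\beta \approx \omega\sqrt{LC}$ and a frequency-independent phase velocity $v_p \approx 1/\sqrt{LC}$.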
Nearly ‘speed-of-light’ wires
[Figure: plot with the RC region and the LC-dominated region marked]
R. Chang, Thesis, 2002.
Nearly ‘speed-of-light’ wires
Take advantage of the inductance-dominated high-frequency regime
of on-chip interconnects
Peak phase velocity is speed of light in SiO2
Reduce low-frequency spectral components of the signal, which
introduce ISI and lag the LC-dominated response
Chang et al., JSSC, 2003.
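The speed-of-light limit mentioned above can be put into numbers with a short calculation. This is a sketch under one assumption that is not in the slides: a relative permittivity of about 3.9 for SiO2 (`EPS_R_SIO2`), a commonly quoted value.

```python
import math

C0_M_PER_S = 299_792_458.0  # speed of light in vacuum
EPS_R_SIO2 = 3.9            # assumed relative permittivity of SiO2


def min_latency_ps_per_mm(eps_r):
    """Time of flight per millimetre for a wave travelling at c / sqrt(eps_r)."""
    velocity_m_per_s = C0_M_PER_S / math.sqrt(eps_r)
    velocity_mm_per_s = velocity_m_per_s * 1e3
    return 1e12 / velocity_mm_per_s


if __name__ == "__main__":
    print(f"{min_latency_ps_per_mm(EPS_R_SIO2):.2f} ps/mm")
```

With this permittivity the lower bound on latency comes out at roughly 6.6 ps/mm.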
Distributed Loss Compensation
Used in long-distance telephone network before the
introduction of optics for long-haul communications
Clock distribution networks
- Standing wave oscillators [O’Mahony et al., JSSC 2003]
- Rotary traveling-wave oscillator arrays [Wood et al., JSSC 2001]
Distributed amplifiers use similar ideas to extend the unity-gain bandwidth
Negative Impedance Converter
Pole ≈ -gm/(2C) ; zero ≈ 1/(2RC)
Match frequency-dependent loss characteristics
Jose et al., ISSCC, 2006.
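To see how the choice of C moves the NIC pole and zero, the two approximations above can be evaluated directly. This is only an illustrative sketch: it assumes the expressions are angular frequencies in rad/s, and the values of `GM` and `R` are hypothetical. The two capacitor values are the ones quoted in the attenuation plots.

```python
import math


def nic_pole_zero_hz(gm, r, c):
    """Evaluate pole ~ -gm/(2C) and zero ~ 1/(2RC), converted from rad/s to Hz."""
    pole_rad_per_s = -gm / (2.0 * c)
    zero_rad_per_s = 1.0 / (2.0 * r * c)
    two_pi = 2.0 * math.pi
    return pole_rad_per_s / two_pi, zero_rad_per_s / two_pi


GM = 2e-3   # S, hypothetical transconductance
R = 1e3     # ohm, hypothetical resistance

if __name__ == "__main__":
    for c in (50e-15, 600e-15):  # capacitor values from the plots
        pole_hz, zero_hz = nic_pole_zero_hz(GM, R, c)
        print(f"C = {c * 1e15:.0f} fF: pole {pole_hz / 1e9:.3f} GHz, "
              f"zero {zero_hz / 1e9:.3f} GHz")
```

Both corner frequencies scale as 1/C, and their ratio depends only on the product gm·R.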
NIC Attenuation Compensation
[Figure: curves labeled "Without NICs" and "With NICs"; panels "Increasing R, C = 50 fF" and "Increasing R, C = 600 fF"]
A larger capacitance C increases the amount of loss compensation at
higher frequencies
Negative α leads to instability, which can cause excessive
ringing or oscillations
Jose et al., ISSCC, 2006.
Latency Comparison
[Figure: latency (ps) vs. wire width (µm) for a 14 mm link; curves: optimally buffered link, NIC link]
NIC links have lower latencies at higher widths due to
transmission-line behavior of interconnects for widths > 2 µm
For very small widths (large R), the interconnect is predominantly
in the RC domain
Jose et al., ISSCC, 2006.
Power Comparison
[Figure: energy (pJ/bit) vs. wire width (µm) for a 14 mm link; curves: optimally buffered link, NIC link]
Power consumed increases rapidly for widths < 4µm due to the
large number of NIC elements required
Increasing bit energy in the optimally repeated case is due to large
number of repeaters needed to drive the additional C
Jose et al., ISSCC, 2006.
Test-Chip Components
[Figure: test-chip layout with dimensions 5 mm and 1.67 mm marked; labeled blocks: SRAM, driver, receiver, error counter, PLL, serpentine serial link, LFSR]
3-Gbps link in 0.18-µm technology with a 1.5-GHz system clock
17-bit LFSR for generating PRBS and an error counter for obtaining
BER
Far-end and near-end waveforms obtained by pico-probing
Jose et al., ISSCC, 2006.
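The PRBS-plus-error-counter arrangement can be modeled in a few lines. This is a minimal software sketch, not the test chip's circuit: the slides give only the register length (17 bits), so the feedback taps (stages 17 and 14, a commonly tabulated maximal-length choice), the seed and the injected error positions are all assumptions.

```python
LFSR_BITS = 17
MASK = (1 << LFSR_BITS) - 1


def lfsr_step(state):
    """Advance the register by one clock; return (new_state, output_bit)."""
    out = (state >> 16) & 1
    # Assumed taps: XOR of stage 17 (bit 16) and stage 14 (bit 13).
    feedback = ((state >> 16) ^ (state >> 13)) & 1
    return ((state << 1) | feedback) & MASK, out


def prbs(seed, n_bits):
    """Return the first n_bits of the pseudo-random bit sequence."""
    if not 0 < seed <= MASK:
        raise ValueError("seed must be a nonzero 17-bit value")
    state, bits = seed, []
    for _ in range(n_bits):
        state, bit = lfsr_step(state)
        bits.append(bit)
    return bits


def sequence_period(seed=1):
    """Count clocks until the register returns to its starting state."""
    state, _ = lfsr_step(seed)
    steps = 1
    while state != seed:
        state, _ = lfsr_step(state)
        steps += 1
    return steps


def count_errors(sent, received):
    """Error counter: number of positions where the received bit differs."""
    return sum(s != r for s, r in zip(sent, received))


if __name__ == "__main__":
    sent = prbs(seed=1, n_bits=1000)
    received = list(sent)
    for position in (10, 500):      # hypothetical channel errors
        received[position] ^= 1
    errors = count_errors(sent, received)
    print(f"period = {sequence_period()}, BER = {errors / len(sent):.4f}")
```

In hardware the receiver would typically run a second, synchronized LFSR instead of storing the transmitted sequence; the list comparison here only keeps the sketch short.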
System Architecture
Data bandwidth: 2 / clock period [bits/s] (two bits per clock cycle, DDR: 2 × 1.5 GHz = 3 Gbps)
Clock frequency: 1.5 GHz
Jose et al., ISSCC, 2006.
Transmitter Design
Driver consists of M1-2 and termination resistor RT
Predrivers use pseudo-NMOS logic
Id sets the bias point of the NICs
Jose et al., ISSCC, 2006.
Receiver Design
[Figure labels: 70 mV far-end swing; calibration caps]
Modified StrongARM latch:
• Low-swing differential receiver
• Small aperture time for high data rate
Line termination:
• N-well resistor with value of 2Zo
• Excessive capacitive loading at receiver inputs can introduce ISI
Jose et al., ISSCC, 2006.
Receiver Sampling-point Calibration
Calibration is performed at the receiver end
Clock skew compensation between transmitter and receiver
Link latency compensation
Jose et al., ISSCC, 2006.
Measurement Results - α obtained from measured S-parameters
The NICs cause noticeable bandwidth reduction at frequencies
beyond ≈ 9 GHz
The NICs contribute towards a significant reduction in α from ≈
50 MHz to 7 GHz
Jose et al., ISSCC, 2006.
Measurement Results - β obtained from measured S-parameters
Phase velocity decreases (β increases) at high frequencies
Jose et al., ISSCC, 2006.
Summary
                          Distributed loss      Optimally repeated    Optimally repeated
                          compensation (DDR)    link (DDR)            link (DDR)
Throughput                3 Gbps                3 Gbps                3 Gbps
Clock frequency           1.5 GHz               1.5 GHz               1.5 GHz
Width / Length            8 µm / 14 mm          0.3 µm / 14 mm        8 µm / 14 mm
Link latency              12.1 ps/mm            55 ps/mm              18.6 ps/mm
Number of NICs/repeaters  7                     18                    5
Power consumed            0.16 pJ/bit/mm        0.17 pJ/bit/mm        0.5 pJ/bit/mm
Jose et al., ISSCC, 2006.
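The per-millimetre figures in the summary can be turned into end-to-end numbers for the 14 mm link. The sketch below only multiplies the tabulated values by the link length. The dictionary keys are labels chosen here, and the pairing of the three energy values with the three links follows the order in which they appear in the table, which is an assumption about its layout.

```python
LINK_LENGTH_MM = 14

# (latency in ps/mm, energy in pJ/bit/mm), as read from the summary table.
LINKS = {
    "NIC link, 8 um wide": (12.1, 0.16),
    "repeated link, 0.3 um wide": (55.0, 0.17),
    "repeated link, 8 um wide": (18.6, 0.5),
}


def link_totals(latency_ps_per_mm, energy_pj_per_bit_mm, length_mm=LINK_LENGTH_MM):
    """Return (total latency in ps, energy per bit in pJ) over the full length."""
    return latency_ps_per_mm * length_mm, energy_pj_per_bit_mm * length_mm


if __name__ == "__main__":
    nic_latency, _ = link_totals(*LINKS["NIC link, 8 um wide"])
    for name, values in LINKS.items():
        latency_ps, energy_pj = link_totals(*values)
        print(f"{name:28s} {latency_ps:6.1f} ps  {energy_pj:5.2f} pJ/bit  "
              f"{latency_ps / nic_latency:4.2f}x NIC latency")
```

Read this way, the NIC link crosses 14 mm in about 169 ps, against about 770 ps for the narrow repeated link and about 260 ps for the wide repeated link.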
Conclusions
As technology scales, on-chip latencies are increasingly
becoming a bottleneck for on-chip performance
Optimally repeated RC delays represent latencies that are as
much as 3× those determined by the speed of light in SiO2
Repeaters consume a growing fraction of power and silicon
area
Using distributed loss compensation with NICs enables
arbitrarily long links with a significant latency and energy/bit/mm
advantage over optimally repeated RC links
References
• A. P. Jose and K. L. Shepard, “Distributed Loss Compensation for Low-latency On-chip Interconnects,” IEEE International Solid-State Circuits Conference, 2006.
• A. P. Jose, G. Patounakis and K. L. Shepard, “Near Speed-of-light On-chip Interconnects using Pulsed Current-mode Signaling,” Symp. VLSI Circuits, June 2005.
• R. T. Chang, et al., “Near speed-of-light signaling over on-chip electrical interconnects,” IEEE J. Solid-State Circuits, vol. 38, no. 5, May 2003.
• R. Chang, “Near Speed-of-Light On-Chip Electrical Interconnects,” Ph.D. dissertation, Department of Electrical Engineering, Stanford University, November 2002.