Principles of Computer Architecture Dr. Mike Frank
Physical Limits of Computing
Dr. Mike Frank
CIS 6930, Sec. #3753X
Spring 2002
Lecture #23
Adiabatic Electronics & CMOS
Mon., Mar. 11
Administrivia & Overview
• Don’t forget to keep up with homework!
– We are 7 out of 14 weeks into the course.
• You should have earned ~50 points by now.
• Course outline:
– Part I&II, Background, Fundamental Limits - done
– Part III, Future of Semiconductor Technology - done
– Part IV, Potential Future Computing Technologies - done
– Part V, Classical Reversible Computing
• Fundamentals of Adiabatic Processes & logic - last Wed. & Fri.
(----------------------- Spring Break ------------------------)
• Adiabatic electronics & CMOS logic families - TODAY
• Limits of adiabatics: Leakage and clock/power supplies. - Wed. 3/13
• RevComp theory I: Emulating Irreversible Machines - Fri. 3/15
• RevComp theory II: Bounds on Space-Time Overheads - Mon. 3/18
• (plus ~7 more lectures…)
– Part VI, Quantum Computing
– Part VII, Cosmological Limits, Wrap-Up
Adiabatic electronics & CMOS implementations
Conventional Gates are Irreversible
• Logic gate behavior (on receiving new input):
– Many-to-one transformation of local state!
– Required to dissipate at least bT by Landauer's principle (taking b = k_B ln 2, one bit of entropy, so bT = k_B T ln 2)
– Incurs ½CV² dissipation in 2 out of 4 cases.
Example: Static CMOS inverter (input "in", output "out").
Transformation of local state:
Just before transition:    After transition:
in out                     in out
0  0                       0  1
0  1                       0  1
1  0                       1  0
1  1                       1  0
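The many-to-one character of this transformation can be checked by enumeration. The sketch below is only an illustration: the function name `static_inverter` and the tuple encoding of (in, out) states are assumptions made here, not part of the lecture.

```python
from itertools import product

def static_inverter(state):
    """Map an (in, out) state just before the transition to the state after it."""
    inp, _old_out = state          # the old output value is simply overwritten
    return (inp, 1 - inp)

before = list(product([0, 1], repeat=2))       # all four (in, out) pairs
after = [static_inverter(s) for s in before]

# Two distinct prior states land on each final state: the map is many-to-one.
distinct_after = set(after)

# The output node is charged or discharged whenever "out" actually changes.
switching_cases = [s for s in before if static_inverter(s)[1] != s[1]]

print(len(before), "states before ->", len(distinct_after), "states after")
print("cases with an output swing:", switching_cases)
```

Four prior states collapse onto two final states, and the output node swings in exactly two of the four cases, matching the "2 out of 4" count on the slide.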
Exact formula:
$E_{\mathrm{diss}} = f\,\left[1 + f\,\left(e^{-1/f} - 1\right)\right] CV^2$
for frequency reduction $f :\equiv RC/t$
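Reading the exact formula as E_diss = f[1 + f(e^(-1/f) - 1)]CV² with f = RC/t, its two limits can be checked numerically. This is a minimal sketch; the function name `ramp_dissipation` and the normalization C = V = 1 are choices made here for illustration.

```python
import math

def ramp_dissipation(f, C=1.0, V=1.0):
    """Dissipation for a ramp of duration t, with f = RC/t, per the formula above."""
    # math.expm1(x) computes e**x - 1 accurately for small |x|.
    return f * (1.0 + f * math.expm1(-1.0 / f)) * C * V**2

for f in (1000.0, 1.0, 0.001):
    print(f"f = {f:g}: E_diss / CV^2 = {ramp_dissipation(f):.6g}")
```

For f much greater than 1 (a sudden transition) the result tends to ½CV², the conventional switching loss; for f much less than 1 (a slow ramp) it tends to f·CV² = CV²(RC/t), the adiabatic limit.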
Adiabatic Rules for Transistors
• Rule 1: Never turn on a transistor if it has a
nonzero voltage across it!
– I.e., between its source & drain terminals.
– Why: This erases info. & causes ½CV² dissipation.
• Rule 2: Never apply a nonzero voltage across a
transistor even during any on/off transition!
– Why: When partially turned on, the transistor has
relatively low R, so it gets high P = V²/R dissipation.
– Corollary: Never turn off a transistor if it has a
nonzero current going through it!
• Why: As R gradually increases, the V=IR voltage drop
will build, and then rule 2 will be violated.
Adiabatic Rules continued
• Transistor Rule 3: Never suddenly change the
voltage applied across any on transistor.
– Why: So transition will be more reversible;
dissipation will approach CV²(RC/t), not ½CV².
Adiabatic rules for other components:
• Diodes: Don’t use them at all!
– There is always a built-in voltage drop across them!
• Resistors: Avoid moderate network resistances.
– e.g. stay away from the range > 10 kΩ and < 1 MΩ
• Capacitors: Minimize, reliability permitting.
– Note: Dissipation scales with C²!
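The quadratic dependence on capacitance follows from the adiabatic-limit expression quoted under Rule 3. A short derivation, assuming R and the transition time t are held fixed while C varies:

```latex
E_{\mathrm{diss}} \;\approx\; CV^{2}\,\frac{RC}{t} \;=\; \frac{R\,C^{2}V^{2}}{t}
\qquad (t \gg RC)
```

By contrast, the conventional ½CV² loss is only linear in C, so reducing capacitance pays off more in the adiabatic regime.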
Transistor Rules Summarized
Legal transitions in green. (For n- or p-FETs.)
Dissipative states and transitions in red.
[Figure: state-transition diagram of transistor states, each labeled by its two terminal voltages (high/low) and by whether the transistor is on or off.]
Transformation of local state:
Just before transition:    After transition:
in out                     in out
0  ½                       0  1
1  ½                       1  0
Input-Barrier, Clocked-Bias Retractile
• Cycle of operation:
– Inputs raise or lower barriers
• Do logic w. series/parallel barriers
– Clock applies bias force which changes state, or not
• Must reset output prior to input.
• Combinational logic only!
• Examples: Hall’s logic, SCRL gates, rod logic interlocks
[Figure: potential-energy diagram with states 0, N, and 1, showing the input barrier height and the clocked force applied.]
Retractile Logic w. SCRL gates
• Simple combinational logic of any depth N:
– Requires N timing phases
– Non-pipelined
– No sequential reuse of HW (even worse)
• Sequential logic is required!
[Figure: timing diagram of the retractile cascade; horizontal axis: Time.]
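A retractile cascade of depth N can be sketched as an event schedule. The ordering used here is an assumption made for illustration: stages are asserted first to last and then retracted last to first, so that no stage is retracted while a later stage still depends on its output. The function name `retractile_schedule` is hypothetical.

```python
def retractile_schedule(n_stages):
    """Return (tick, stage, action) events for one cycle of a depth-N cascade.

    Assumed ordering: assert stages 1..N in turn, then retract them N..1.
    """
    asserts = [(k, k, "assert") for k in range(1, n_stages + 1)]
    retracts = [(2 * n_stages - k + 1, k, "retract")
                for k in range(n_stages, 0, -1)]
    return asserts + retracts

for tick, stage, action in retractile_schedule(3):
    print(f"tick {tick}: {action} stage {stage}")
```

In this model each stage needs its own timing signal (N in total), and the hardware produces one result per 2N ticks, which is why the scheme is neither pipelined nor reused sequentially.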
Sequential Retractile Logic
• Approach #1 (Hall ‘92):
– After every N stages, invoke an irreversible latch
• stores the output of the last stage
– Then, retract all the stages,
– and begin a new cycle
• Problems:
– Reduces dissipation by at most a factor of N
– Also reduces HW efficiency by order N!
• In worst case, compared to a pipelined, sequential circuit
• Approach #2 (Knight & Younis, ‘93):
– The “store output” stage can also be reversible!
– Gives fully-adiabatic, sequential, pipelined circuits!
• N can be as small as 1 or 2 & still have arbitrarily high Q
Simple Reversible CMOS Latch
• Uses a standard CMOS transmission gate
• Sequence of operation:
(1) input initially matches latch contents (output)
(2) input changes → output changes
(3) latch closes
(4) input removed
[Figure: CMOS transmission gate between in and out, with control signals labeled P.]
Before input:    Input arrived:    Input removed:
in out           in out            in out
a  a             a  a              a  a
                 b  b              a  b
Resetting a Reversible Latch
• Can reversibly unlatch data as follows:
(exactly the reverse of the latching process)
– (1) Data value d stored on memory node M.
– (2) Present an exact copy of d on input.
– (3) Open the latch (connecting input to M).
• No dissipation since voltage levels match
– (4) Retract the copy of d from the input.
• Retracts copy stored in latch also.
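The latching and unlatching sequences can be traced with a toy model that counts violations of Rule 1 (turning on a transistor across a voltage difference). Everything here is a simplifying assumption for illustration: the class name `ReversibleLatch`, symbolic levels "a" and "b", and the convention (following the slides) that an "open" latch conducts.

```python
class ReversibleLatch:
    """Toy model: an input node, a memory node M, and a T-gate between them."""

    def __init__(self, value):
        self.inp = value        # level driven onto the input
        self.mem = value        # level held on memory node M
        self.gate_open = True   # open = conducting, closed = isolating
        self.violations = 0     # count of dissipative (non-adiabatic) events

    def drive_input(self, value):
        self.inp = value
        if self.gate_open:
            self.mem = value    # M follows the input through the gate

    def close(self):
        self.gate_open = False

    def open(self):
        if self.inp != self.mem:
            self.violations += 1   # gate turned on across a voltage difference
            self.mem = self.inp
        self.gate_open = True

# Latch the value b, then unlatch it reversibly by presenting a copy.
latch = ReversibleLatch("a")
latch.drive_input("b"); latch.close(); latch.drive_input("a")
stored = latch.mem
latch.drive_input(stored); latch.open(); latch.drive_input("a")

# Opening the latch without a copy of its contents is dissipative.
bad = ReversibleLatch("a")
bad.drive_input("b"); bad.close(); bad.drive_input("a")
bad.open()

print("stored:", stored, "| clean violations:", latch.violations,
      "| careless violations:", bad.violations)
```

The reversible sequence finishes with zero violations, while opening the latch without first presenting a copy of d incurs one, which is the reason a copy of the stored data is required.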
Input-Bias Clocked-Barrier Logic
• Cycle of operation:
– Data input applies bias
• Add forces to do logic
– Clock signal raises barrier
– Data input bias removed
• Can amplify/restore input signal in clocking step.
• Can reset latch reversibly given copy of contents.
• Examples: Adiabatic QDCA, SCRL latch, rod logic latch, PQ logic, buckled logic
[Figure: state diagram over states 0, N, and 1, with transitions labeled Input “0”, Input “1”, Clock barrier up, and Retract input.]
SCRL 6-tick clock cycle
[Figure, one panel per tick: SCRL gate with in and out nodes.]
Initial state: All gates off, all nodes neutral.
Tick #1: Input goes valid, forward T-gate opens.
Tick #2: Forward gate charges, output goes valid. (Tick #1 of subsequent gate.)
Tick #3: Forward T-gate closes, reverse gate charges.
Tick #4: Reverse T-gate opens, forward gate discharges.
Tick #5: Reverse gate discharges, input goes neutral.
Tick #6: Reverse T-gate closes, output goes neutral. Ready for next input!
24 ticks/cycle in this version, which includes 2-level retractile stages.
Some Interesting Questions
• About pipelined, sequential, fully-adiabatic
CMOS logic:
– Q: Does it require these intermediate voltage levels?
• A: No, you can get by with only 2 different levels.
– Q: What is the minimum number of externally
provided timing signals you can get away with?
• A: 4 (12 if split levels are used)
– Q: Can the order-N different timing signals needed
for long retractile cascades be internally generated
within an adiabatic circuit?
• A: Yes, but not statically, unless N² hardware is used
– where N is the number of stages per full sequential cycle
• We now demonstrate these answers.
2LAL: 2-level Adiabatic Logic
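• Use simplified T-gate symbol:
[Figure: simplified T-gate symbol with control signal P.]
• Basic buffer element:
– cross-coupled T-gates
[Figure: buffer element with in, out, and timing signals φ1, φ2.]
• Only 4 timing signals φ1 … φ4, 4 ticks per cycle:
– φi rises during tick i
– φi falls during tick ((i+1) mod 4)+1
[Figure: waveforms of φ1 … φ4 versus tick # 1 2 3 4.]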
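The rise and fall rules for the four timing signals can be tabulated directly. This is a sketch under stated assumptions: signals and ticks are both numbered 1 to 4, and each signal holds its level between its rise tick and its fall tick. The helper names are hypothetical.

```python
N_TICKS = 4

def rise_tick(i):
    """Tick during which timing signal i rises."""
    return i

def fall_tick(i):
    """Tick during which timing signal i falls."""
    return ((i + 1) % N_TICKS) + 1

def levels_over_cycle(i):
    """Level of signal i at the end of ticks 1..4, in steady state."""
    level = 0
    for _cycle in range(2):                  # first pass is a warm-up cycle
        trace = []
        for tick in range(1, N_TICKS + 1):
            if tick == rise_tick(i):
                level = 1
            elif tick == fall_tick(i):
                level = 0
            trace.append(level)
    return trace

for i in range(1, N_TICKS + 1):
    print(f"signal {i}: rises in tick {rise_tick(i)}, "
          f"falls in tick {fall_tick(i)}, levels {levels_over_cycle(i)}")
```

Each signal rises in one tick, stays high for one tick, falls in the next, and stays low for one tick, so the four signals are the same waveform shifted by one tick each.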
2LAL Cycle of Operation
[Figure: step-by-step operation of the 2LAL buffer as in, out, and the two timing signals rise and fall, including the in=0, out=0 case.]
Shift Register Structure
• 1-tick delay per logic stage:
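[Figure: chain of buffer stages from in to out, driven by successive timing signals 1, 2, 3, 4.]
• Logic pulse timing & propagation:
[Figure: a pulse on in shown propagating stage by stage against ticks 1 2 3 4 ...]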
More complex logic functions
• Non-inverting Boolean functions:
[Figure: T-gate networks combining inputs A and B.]
• For inverting functions, must use quad-rail logic encoding:
[Figure: rails A0 and A1, shown for A=0 and A=1.]
• Zero-transistor “inverters.”
– To invert, just swap the rails!
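Rail swapping can be illustrated with a logical model of the encoding. This sketch models only a pair of rails (A0, A1) per signal, exactly one of which is active. It does not model any further rails the quad-rail encoding carries, and all function names are hypothetical.

```python
def encode(bit):
    """Represent a logic value as rails (A0, A1); exactly one rail is active."""
    return (1 - bit, bit)

def decode(rails):
    a0, a1 = rails
    assert a0 != a1, "invalid encoding: exactly one rail must be active"
    return a1

def invert(rails):
    """Zero-transistor inverter: just swap the two rails."""
    a0, a1 = rails
    return (a1, a0)

def and_gate(a, b):
    """Each output rail is a non-inverting function of the input rails."""
    a0, a1 = a
    b0, b1 = b
    return (a0 | b0, a1 & b1)   # out0 = A0 OR B0, out1 = A1 AND B1

for x in (0, 1):
    for y in (0, 1):
        nand = invert(and_gate(encode(x), encode(y)))
        print(x, y, "->", decode(nand))
```

Because both rails are always available, NAND is obtained from non-inverting AND and OR networks plus a rail swap, with no gate ever computing an inversion.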
Hardware Efficiency issues
• Hardware efficiency: How many logic
operations per unit hardware per unit time?
• Hardware spacetime complexity: How much
hardware for how much time per logic op?
• We’re interested in minimizing:
(# of transistors) × (# of ticks) / (gate cycle)
• SCRL inverter, w. return path:
– (8 transistors) × (6 ticks) = 48 transistor-ticks
• Quad-rail 2LAL buffer stage:
– (16 transistors) × (4 ticks) = 64 transistor-ticks
More SCRL vs. 2LAL
• SCRL reversible NAND, w. all inverters:
– (23 transistors) × (6 ticks) = 138 T-ticks
• Quad-rail 2LAL AND:
– (48 transistors) × (4 ticks) = 192 T-ticks
• Result of comparison: Although 2LAL
minimizes # of rails, and # ticks/cycle, it does
not minimize overall spacetime complexity.
• The question of whether 6-tick SCRL
minimizes per-op spacetime complexity among
pipelined adiabatic CMOS logics is still open.
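The spacetime figures quoted in the comparison can be reproduced with a few lines. The transistor and tick counts are the ones given in the slides; the dictionary layout and the function name `transistor_ticks` are illustrative choices.

```python
def transistor_ticks(transistors, ticks_per_cycle):
    """Spacetime cost of one gate cycle, in transistor-ticks."""
    return transistors * ticks_per_cycle

# (transistor count, ticks per cycle), as quoted in the slides.
designs = {
    "SCRL inverter, w. return path": (8, 6),
    "Quad-rail 2LAL buffer stage": (16, 4),
    "SCRL reversible NAND, w. all inverters": (23, 6),
    "Quad-rail 2LAL AND": (48, 4),
}

costs = {name: transistor_ticks(*spec) for name, spec in designs.items()}
for name, cost in costs.items():
    print(f"{name}: {cost} transistor-ticks")
```

In both pairings the 2LAL version has fewer ticks per cycle but enough extra transistors that its product is larger, which is the point of the comparison.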