Digital Systems Design 2
Programmable Logic and Storage Devices
Chapter 8: “Advanced Digital Design with the Verilog HDL”,
Michael D. Ciletti.
Memory, CPLDs and FPGAs
Chapter 10: “Digital Design Principles and Practices”, John
F. Wakerly, Prentice Hall, 2001, Third Edition
Programmable Logic and Storage
Devices
With the advancement of hardware technology, the density, complexity, and size of
field-programmable gate arrays (FPGAs) have grown, so that FPGAs now provide an
attractive and cost-efficient alternative to semi-custom
application-specific integrated circuits (ASICs).
The opportunity to realize large circuits in FPGAs has
created pressure for a change in the method by which
circuits are designed for FPGA-based applications:
Schematic entry tools can be productive and efficient when
designs are small.
The trend is toward larger and larger designs targeted for FPGAs.
Thus, language-based design methodology has become
essential to FPGA-based design flows.
Programmable Logic and Storage
Devices
Technologies available for implementing digital circuits range from:
Standard integrated circuits (ICs) used in low-density/low-performance
applications,
To cell-based and full-custom ICs for high-density/high-performance circuits.
Standard ICs:
Can be manufactured cheaply,
Implement very limited, basic functionality at low levels of integration.
Customized ICs:
Implement specialized functionality with a high level of integration,
Have a small market,
Create inventory risk because the quantities that could be sold do not warrant
the expense of their development and production.
Programmable Logic Devices:
Lie between the two extremes of density and performance that characterize standard
parts and full-custom circuits.
Born out of necessity created by two conflicting realities:
Large, dense, high-performance circuits cannot be built practically or economically
from discrete devices.
Dedicated ICs cannot be produced economically to satisfy a diversity of low-volume
applications.
Programmable Logic and Storage
Devices
PLDs:
Read-Only Memories (ROM)
Programmable Logic Arrays (PLA)
Programmable Array Logic (PAL)
Complex Programmable Logic Devices (CPLD)
Field Programmable Gate Arrays (FPGA), and
Mask-Programmable Gate Arrays (MPGA).
Programmable Logic Devices
For most up-to-date PLDs see:
www.e-insite.net/ednmag
PLDs have
a fixed architecture
Functionality is programmed for a specific application
Programming is done by:
Manufacturer - mask-programmable logic devices
(MPLD)
End-User – field-programmable logic devices (FPLD).
Three basic characteristics distinguish PLDs from each
other:
1. An architecture of identical basic functional units
2. A programmable interconnection fabric, and
3. A programming technology.
Programmable Logic Devices
The first type of PLDs considered has the AND-OR plane structure shown in
the figure.
This type of architecture is used to implement ROMs, PLAs, and PALs.
It implements Boolean expressions in Sum of Products (SOP) form:
AND plane forms product terms selectively from the inputs, and
OR plane forms outputs from sums of selected product terms.
A programmable interconnect fabric joins the two planes, so that the outputs
implement sum-of-product expressions of the inputs.
Whether and how a plane can be programmed determines the particular
type of PLD that is implemented by the overall structure.
[Figure: AND-OR plane structure of a programmable logic device. The inputs feed the AND plane, which forms the product terms; the OR plane combines the product terms into the outputs.]
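The AND-OR structure described above can be sketched in a few lines of Python. This is only an illustrative model: the function names, the way a product term is written as a list of (input index, required value) pairs, and the example programming in TERMS and CONNECTIONS are all assumptions made for the sketch, not part of any real device.

```python
def and_plane(inputs, product_terms):
    """Return one bit per product term: 1 when every literal in the term matches."""
    return [int(all(inputs[i] == v for i, v in term)) for term in product_terms]


def or_plane(products, connections):
    """Each output ORs together the product terms listed in its connection row."""
    return [int(any(products[p] for p in row)) for row in connections]


def evaluate_pld(inputs, product_terms, connections):
    """One pass through the AND plane followed by the OR plane (sum of products)."""
    return or_plane(and_plane(inputs, product_terms), connections)


# Hypothetical programming for inputs (a, b, c):
#   P0 = a AND b,  P1 = (NOT a) AND c,  P2 = b AND c
TERMS = [[(0, 1), (1, 1)], [(0, 0), (2, 1)], [(1, 1), (2, 1)]]
#   out0 = P0 OR P1,  out1 = P2
CONNECTIONS = [[0, 1], [2]]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print((a, b, c), evaluate_pld([a, b, c], TERMS, CONNECTIONS))
```

In this model, making TERMS fixed and CONNECTIONS programmable corresponds to a ROM-like device, while making both programmable corresponds to a PLA-like device.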
Storage Devices
The architecture used to implement PLDs lends itself
to implementation of storage devices.
Storage Devices can be:
Read-Only, or
Random Access
depending on whether the contents of a memory cell
can be written during normal operation of the device.
ROM (read-only memory) is a device programmed to
hold certain contents, which remain unchanged during
operation and after power is removed from the
device.
RAM (random-access memory), in contrast, has contents that
can be changed during operation, and they vanish
when the power is removed.
Read-Only Memory (ROM)
A 2^n x b ROM consists of an addressable array of semiconductor memory
cells organized as 2^n words of b bits each.
ROM interface:
n inputs, the address lines.
b outputs, called bit lines.
ROM is non-volatile memory: its contents are preserved even when no
power is applied.
Read-Only Memory (ROM)
AND-OR planes for a ROM:
[Figure: the address inputs A(0) ... A(n-1) drive a nonprogrammable address decoder (the AND plane), which forms 2^n minterms (word lines). The OR plane is the 2^n x b memory array, which drives the b outputs (bit lines) D(0) ... D(b-1).]
Using ROM for “Random”
Combinational Functions
ROM can be used to perform any combinational function. ROM will
actually store for each input bit-pattern (input address) the
corresponding output bit-pattern.
Example: Truth table for a 3-input, 4-output combinational logic
function.
Inputs       Outputs
A2 A1 A0     D3 D2 D1 D0
0  0  0      1  1  1  0
0  0  1      1  1  0  1
0  1  0      1  0  1  1
0  1  1      0  1  1  1
1  0  0      0  0  0  1
1  0  1      0  0  1  0
1  1  0      0  1  0  0
1  1  1      1  0  0  0
Equivalent 2-to-4 decoder with
output-polarity control
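The truth table above can be read directly as the contents of an 8 x 4 ROM. The following sketch stores the eight words in a list and compares a ROM read against a behavioral 2-to-4 decoder with polarity control. Treating A2 as the polarity input and A1, A0 as the select inputs is an interpretation of the table, and the function names are illustrative.

```python
# ROM contents taken row by row from the truth table (address = A2 A1 A0).
ROM = [0b1110, 0b1101, 0b1011, 0b0111, 0b0001, 0b0010, 0b0100, 0b1000]


def rom_read(a2, a1, a0):
    """Look up the stored word and return it as [D3, D2, D1, D0]."""
    address = (a2 << 2) | (a1 << 1) | a0
    word = ROM[address]
    return [(word >> bit) & 1 for bit in (3, 2, 1, 0)]


def decoder_with_polarity(pol, i1, i0):
    """Reference model: one-hot decode, active-high if pol=1, active-low if pol=0."""
    one_hot = 1 << ((i1 << 1) | i0)
    word = one_hot if pol else (~one_hot & 0xF)
    return [(word >> bit) & 1 for bit in (3, 2, 1, 0)]


if __name__ == "__main__":
    for address in range(8):
        bits = ((address >> 2) & 1, (address >> 1) & 1, address & 1)
        print(bits, rom_read(*bits), decoder_with_polarity(*bits))
```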
Using ROM for “Random”
Combinational Functions
Another example that can be built with ROM is
unsigned binary multiplication.
A typical realization of a 4x4 multiplier requires too many
product terms (36) to obtain a one-pass
multiplier through a conventional PLD’s AND-OR array.
With ROM one can realize the function with one pass
through a 2^8 x 8 (256 x 8) ROM.
Contents of a ROM are normally specified by a file
that contains one entry for every address in the ROM.
The nice thing about ROM-based design is that one
can usually write a simple program in a high-level
language to calculate what should be stored in the
ROM.
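As an illustration of such a program, the sketch below computes the contents of a 256 x 8 ROM for the 4x4 unsigned multiplier. The address layout (one operand in the high four address bits, the other in the low four) and the hex listing format are assumptions; a real flow would write whatever file format the ROM programmer expects.

```python
def build_multiplier_rom():
    """Return 256 bytes: entry at address {x[3:0], y[3:0]} holds the product x * y."""
    rom = []
    for address in range(256):
        x = address >> 4      # assumed: high nibble is the first operand
        y = address & 0xF     # assumed: low nibble is the second operand
        rom.append(x * y)     # at most 15 * 15 = 225, so it fits in 8 bits
    return rom


def format_hex_listing(rom, per_line=16):
    """Render the contents as 'base: b0 b1 ...' lines, one entry per address."""
    lines = []
    for base in range(0, len(rom), per_line):
        data = " ".join(f"{value:02x}" for value in rom[base:base + per_line])
        lines.append(f"{base:02x}: {data}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_hex_listing(build_multiplier_rom()))
```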
Two-dimensional decoding
Suppose that one wants to build a 128 x 1 ROM.
A straightforward solution requires a 7-to-128 decoder:
128 7-input NAND gates,
14 buffers and inverters with a fanout of 64 each.
ROMs with 1M bits or more are available commercially, and
they do not use a linear decoder structure, which would
require a 20-to-1,048,576 decoder.
The structure used is called two-dimensional decoding.
This structure enables reduction of the decoder size to
something on the order of the square root of the number of
addresses.
The basic idea in two-dimensional decoding is to arrange the
ROM cells in an array that is as close as possible to square.
In the next illustration a possible internal structure for a 128x1
ROM is depicted.
Two-dimensional decoding
As can be seen, two-dimensional decoding allows a 128x1
ROM to be built with a 3-to-8 decoder and a 16-input
multiplexer (whose complexity is comparable to that of a 4-to-16 decoder).
A 1Mx1 ROM could be built with a 10-to-1024 decoder and a
1024-input multiplexer, which is a lot simpler than the
one-dimensional alternative.
Besides reducing decoding complexity,
two-dimensional decoding has one other benefit: it
leads to a chip whose physical dimensions are close to
square, which is important for chip fabrication and packaging.
In ROMs with multiple data outputs the storage arrays
corresponding to each data output may be made narrower
in order to achieve an overall chip layout that is closer to
square. For example, the next figure shows the possible
layout of a 32K x 8 ROM.
Possible layout of a 32K x 8 ROM
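The 128x1 case of two-dimensional decoding described above can be modeled as follows. This is a behavioral sketch only: the assignment of A6..A4 to the row decoder and A3..A0 to the column multiplexer is an assumption, as are the function names.

```python
ROWS, COLS = 8, 16  # 8 x 16 = 128 cells, as close to square as a 128x1 ROM gets


def make_array(bits):
    """Arrange 128 bits (index = address) as 8 rows of 16 columns."""
    assert len(bits) == ROWS * COLS
    return [bits[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


def read_bit(array, address):
    """Read one bit using a 3-to-8 row decoder and a 16-input column multiplexer."""
    row = (address >> 4) & 0x7   # assumed: A6..A4 drive the 3-to-8 decoder
    col = address & 0xF          # assumed: A3..A0 drive the 16-input multiplexer
    row_select = [int(r == row) for r in range(ROWS)]  # one-hot decoder outputs
    # The selected row drives all 16 column lines at once.
    column_lines = [
        int(any(row_select[r] and array[r][c] for r in range(ROWS)))
        for c in range(COLS)
    ]
    return column_lines[col]     # the multiplexer picks one column line


if __name__ == "__main__":
    contents = [(a * 37 + 11) % 3 == 0 for a in range(128)]  # arbitrary test data
    array = make_array([int(b) for b in contents])
    print([read_bit(array, a) for a in range(16)])
```

The decoder has 8 outputs and the multiplexer 16 inputs, compared with the 128 outputs of the one-dimensional decoder.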
Commercial ROM Types
A modern ROM is fabricated as a single IC chip; one that
stores 4M bits with a price under $5.
Various methods are used to “program” the information
stored in a ROM:
Mask Programmable ROMs.
Manufacturer has to be provided with the ROM content in order
to create one or more customized masks to manufacture ROMs
with the required pattern.
ROM manufacturers impose a mask charge of several thousand
dollars for the customized aspects of mask-ROM production.
Because of mask charges and the four-week delay typically
required to obtain programmed chips, mask ROMs are
normally used today only in very high-volume applications.
For low-volume applications there are more cost-effective
choices, discussed next.
Commercial ROM Types
Programmable read-only memory (PROM)
Similar to a mask ROM, except that the customer may store
data values (program the PROM) in just a few minutes.
A PROM is manufactured with all of its diodes or transistors
“connected”. This corresponds to having all bits at a
particular value (typically “1”). A PROM programmer can then be
used to set desired bits to the opposite value.
In bipolar PROMs this is done by vaporizing tiny fusible links
inside the PROM corresponding to each bit.
A link is vaporized by selecting it using the PROM’s address
and data lines, and then applying a high-voltage pulse (10-30 V) to the device through a special input pin.
Early reliability problems with vaporized-link technology were
solved, and reliable fusible-link technology is used nowadays
not only in bipolar PROMs but also in bipolar PLD circuits.
Commercial ROM Types
Erasable programmable read-only memory (EPROM):
EPROM is programmable just like PROM.
In addition, it can be “erased” to the all-1s state by exposing it
to ultraviolet light.
EPROM uses a different technology called “floating-gate MOS”.
EPROM manufacturers guarantee that a properly programmed
bit will retain 70% of its charge for at least 10 years, even if
the part is stored at 125°C.
Insulating material surrounding the floating gate becomes
slightly conductive if it is exposed to ultraviolet light with a
certain wavelength which provides for the EPROM content to
be erased.
Most common application of EPROMs is to store programs in
microprocessor systems.
EPROMs are typically used during development. ROMs and
PROMs are used once the program is finalized because usually
they cost less than EPROMs of similar capacity.
Commercial ROM Types
Electrically Erasable Programmable Read-Only
Memory (EEPROM).
It is like an EPROM except that individual stored bits
may be erased electrically.
Floating gates in an EEPROM are surrounded by a
much thinner insulating layer and can be erased by
applying a voltage of the opposite polarity to the
charging voltage to the non-floating gate.
Large EEPROMs (1M bit and larger) allow erasing only
in fixed-size blocks, typically 128-512 Kbits (16-64
Kbytes) at a time. These memories are typically
called flash EPROMs or flash memories.
EEPROM can be reprogrammed only a limited number
of times (the insulating layer wears out).
Logic Symbols for standard EPROMs in
28-pin dual in-line packages.
ROM Applications
In addition to the most common application of ROMs
for program storage in microprocessor systems,
there are many other applications that can provide a
low-cost realization of a complex or “random”
combinational logic function.
Example of Voice Signals:
When an analog voice signal enters a typical
telephone system, it is sampled 8,000 times per
second and converted into a sequence of 8-bit bytes
representing the analog signal at each sampling
point.
This example will show how ROM-based circuits can
easily deal with this highly encoded information.
Coding Voice Samples
The simplest 8-bit encoding of the sign and
amplitude of an analog signal would be an
8-bit integer in the two’s complement or
signed-magnitude system.
8-bit linear encoding yields
only $2^8 = 256$ different values.
This corresponds to a dynamic range in
signal power of $20\log_{10}(256) \approx 48$ dB.
By comparison, compact audio discs use a
16-bit linear encoding with a theoretical
dynamic range of $20\log_{10}(2^{16}) \approx 96$ dB.
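The two dynamic-range figures above follow from the same expression, 20 times the base-10 logarithm of the number of levels. A short sketch (the function name is illustrative):

```python
import math


def dynamic_range_db(levels):
    """Dynamic range in dB for an encoding that distinguishes `levels` values."""
    return 20 * math.log10(levels)


if __name__ == "__main__":
    print(f"8-bit linear:  {dynamic_range_db(2 ** 8):.1f} dB")
    print(f"16-bit linear: {dynamic_range_db(2 ** 16):.1f} dB")
```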
Coding of Voice Samples
The North American telephone network uses an 8-bit companded encoding
called μ-law PCM (pulse code modulation).
The next figure shows the format of an 8-bit coded byte: a sort of floating
point representation containing sign (S), exponent (E) and mantissa (M)
fields.
The analog value V represented by a byte in this format is given by the
formula:
$V = (1 - 2S)\,[\,2^{E}(2M + 33) - 33\,]$
An analog signal represented in this format can range from
-8159*k to +8159*k, where k is an arbitrary scale factor.
The range of the signals is 2*8159 and the smallest difference that can be
represented is only 2 (when E=0), so the dynamic range is
$20\log_{10}(8159) \approx 78$ dB.
[Figure: format of the 8-bit μ-law byte. Bit 7: S (sign); bits 6-4: E (exponent); bits 3-0: M (mantissa).]
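The byte format and the formula for V can be combined into a small decoder. This is a sketch of the formula exactly as stated, with S in bit 7, E in bits 6-4 and M in bits 3-0; the function names are illustrative, and any bit inversion applied on a real transmission line is ignored here.

```python
def mulaw_fields(byte):
    """Split an 8-bit code into its sign, exponent and mantissa fields."""
    s = (byte >> 7) & 0b1
    e = (byte >> 4) & 0b111
    m = byte & 0b1111
    return s, e, m


def mulaw_to_linear(byte):
    """Apply V = (1 - 2S) * (2^E * (2M + 33) - 33) to one code byte."""
    s, e, m = mulaw_fields(byte)
    return (1 - 2 * s) * ((2 ** e) * (2 * m + 33) - 33)


if __name__ == "__main__":
    for code in (0x00, 0x01, 0x10, 0x7F, 0xFF):
        print(f"{code:#04x} -> {mulaw_to_linear(code)}")
```

Evaluated this way, the largest magnitude the formula produces is 8031 (E=7, M=15); the ±8159 limit quoted in the text presumably refers to the outer edge of the top quantization interval rather than to a decoded value.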
Coding of Voice Samples
In many types of phone connections voice signal is
purposely attenuated by a few decibels to make things work
better.
Given a μ–law PCM byte, a digital attenuator must produce
a different PCM byte that represents the original analog
signal multiplied by a specified attenuation factor.
One way to build a digital attenuator is shown in the next
figure.
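Each block in the figure can be built with perhaps a dozen
MSI chips or a CPLD or FPGA.
[Figure: digital attenuator. An 8-bit μ-law input feeds a μ-law-to-linear decoder, followed by a 14x14 multiplier and a linear-to-μ-law encoder that produces the 8-bit μ-law output; the linear signals between the blocks are 14 bits wide.]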
Coding of Voice Samples
An alternative realization of the digital
attenuator uses a single
inexpensive 8K x 8 ROM instead.
This ROM can apply any of 32
different attenuation factors to a μ-law
input byte.
The high-order address bits select a table,
and the low-order address bits select
an entry.
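A program that fills such an 8K x 8 ROM might look like the sketch below. Several things here are assumptions made for illustration: table i attenuates by i dB, the five high-order address bits hold the table number, the decoder is the formula for V given earlier, and the encoder simply picks the code whose decoded magnitude is nearest (a stand-in for a real μ-law encoder).

```python
def mulaw_to_linear(byte):
    """Decode with V = (1 - 2S) * (2^E * (2M + 33) - 33)."""
    s, e, m = (byte >> 7) & 1, (byte >> 4) & 0x7, byte & 0xF
    return (1 - 2 * s) * ((2 ** e) * (2 * m + 33) - 33)


# Decoded magnitudes of the 128 positive codes; they increase with the code value.
MAGNITUDES = [mulaw_to_linear(code) for code in range(128)]


def nearest_code(magnitude):
    """Return the 7-bit code whose decoded magnitude is closest to `magnitude`."""
    return min(range(128), key=lambda c: abs(MAGNITUDES[c] - magnitude))


def build_attenuator_rom(n_tables=32, step_db=1.0):
    """Address = {table[4:0], byte[7:0]}; table i attenuates by i * step_db dB."""
    rom = [0] * (n_tables * 256)
    for table in range(n_tables):
        gain = 10 ** (-(table * step_db) / 20)
        for byte in range(256):
            scaled = abs(mulaw_to_linear(byte)) * gain
            # Keep the input's sign bit so that -0 stays distinct from +0.
            rom[(table << 8) | byte] = (byte & 0x80) | nearest_code(scaled)
    return rom


if __name__ == "__main__":
    rom = build_attenuator_rom()
    print(len(rom), hex(rom[(6 << 8) | 0x7F]))
```

Changing the attenuation steps, or switching to a different coding law, only changes the program that generates the table, not the hardware around the ROM.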
Digital Conference Circuit
In the analog telephone network, it is easy to make a conference
connection between three or more parties:
Just connect the analog phone wires together and you get an analog
summing junction.
In the digital network, a digital conference circuit must include a
digital adder that produces output samples corresponding to the
sums of the input samples.
We have seen how to create binary adders for 8-bit operands.
However, binary adders cannot process μ-law PCM bytes directly:
The 8-bit μ-law PCM bytes must be converted to 14-bit linear format,
The signals can then be added,
The resulting signal must then be converted back to 8-bit μ-law PCM, as in the
previous example.
Again, one could build a complex adder circuit, or alternatively the same
function can be performed by a single 64K x 8 ROM.
The ROM has 16 address inputs, accommodating two 8-bit μ-law PCM
operands.
For each pair of operand values, the corresponding ROM address
contains the pre-computed 8-bit μ-law PCM sum.
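The 64K x 8 adder ROM can be generated along the same lines. In this sketch the first operand is assumed to occupy the high address byte and the second the low byte, decoding uses the formula for V given earlier, encoding picks the nearest representable magnitude, and out-of-range sums are clipped to the largest code. All of these are illustrative choices rather than a description of a real conference circuit.

```python
import bisect


def mulaw_to_linear(byte):
    """Decode with V = (1 - 2S) * (2^E * (2M + 33) - 33)."""
    s, e, m = (byte >> 7) & 1, (byte >> 4) & 0x7, byte & 0xF
    return (1 - 2 * s) * ((2 ** e) * (2 * m + 33) - 33)


# Decoded magnitudes of the 128 positive codes, in increasing order.
MAGNITUDES = [mulaw_to_linear(code) for code in range(128)]


def linear_to_mulaw(value):
    """Encode to the nearest code, clipping magnitudes beyond the largest code."""
    sign = 0x80 if value < 0 else 0x00
    target = min(abs(value), MAGNITUDES[-1])        # clip out-of-range sums
    i = bisect.bisect_left(MAGNITUDES, target)      # first magnitude >= target
    if i > 0 and target - MAGNITUDES[i - 1] <= MAGNITUDES[i] - target:
        i -= 1                                      # the lower neighbor is closer
    return sign | i


def build_adder_rom():
    """Entry at address {a[7:0], b[7:0]} holds the mu-law code for a + b."""
    return [
        linear_to_mulaw(mulaw_to_linear(a) + mulaw_to_linear(b))
        for a in range(256)
        for b in range(256)
    ]


if __name__ == "__main__":
    rom = build_adder_rom()
    print(len(rom), hex(rom[(0x7F << 8) | 0x7F]))
```

The clipping line is the kind of special case that costs one statement in the table generator but would need extra hardware in a discrete adder.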
ROM-based Designs (Advantages)
Previous two examples illustrate many advantages of building complex
combinational functions with ROMs.
Most complex functions
are generally difficult to design with custom digital logic, whereas a
ROM realization of those functions is straightforward.
For a moderately complex function, a ROM-based circuit is usually faster
than a circuit using multiple SSI/MSI devices and PLDs, and often faster
than an FPGA or custom LSI chip in a comparable technology.
The program that generates the ROM contents can easily be structured to
handle unusual or undefined cases that would require additional hardware
in any other design. For example, the adder function of the previous example
can easily handle out-of-range sums.
A ROM’s function is easily modified just by changing the stored pattern,
usually without changing any external connections. For example, the PCM
attenuator and adder ROMs in the previous examples can be changed to use
8-bit A-law PCM, the standard digital voice coding in Europe.
The prices of ROMs and other structured logic devices are always dropping,
making them more economical and their densities are always increasing,
expanding the scope of problems that can be solved with a single chip.
ROM-based Designs
(Disadvantages)
For simple to moderately complex
functions, a ROM-based circuit
may cost more,
consume more power, or
run slower
than a circuit using a few SSI/MSI devices and
PLDs or a small FPGA.
For functions with more than 20 inputs, a ROM-based circuit is impractical because of the
limit on ROM sizes that are available. For
example, one wouldn’t build a 16-bit adder
in ROM, because it would require billions and
billions of bits.
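The size limit is easy to quantify, since a ROM with n address inputs and b outputs holds 2^n x b bits. In the sketch below the 16-bit adder is assumed to have 32 address inputs (two 16-bit operands, no carry-in) and 17 outputs (sum plus carry); the function name is illustrative.

```python
def rom_size_bits(n_inputs, n_outputs):
    """Total storage of a ROM with n_inputs address lines and n_outputs bit lines."""
    return (2 ** n_inputs) * n_outputs


if __name__ == "__main__":
    print("4x4 multiplier (256 x 8):", rom_size_bits(8, 8))
    print("mu-law adder (64K x 8):  ", rom_size_bits(16, 8))
    print("16-bit adder (2^32 x 17):", rom_size_bits(32, 17))
```

Each additional input doubles the ROM size, which is why the approach stops being practical somewhere around 20 inputs.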
Complex Programmable Logic
Devices
Since their inception years ago, programmable logic
devices have been very flexible workhorses of digital
design.
As IC technology advanced, there was naturally great
interest in creating larger PLD architectures to take
advantage of increased chip density. The question is
why didn’t manufacturers just scale the existing
architectures?
For example, if DRAM densities increased by a factor
of 64 over the last 10 years, why couldn't
manufacturers scale the 16V8 (16 input signals and their
complements, and a number of 16-variable product
terms) to create a “128V64”? Such a device would have
64 I/O pins, and some number (say 8) of 128-variable
product terms for each of its 64 logic macrocells.
Complex Programmable Logic
Devices
Could this new “128V64” chip
combine the functions of a larger
collection of 16V8s and offer terrific
performance and flexibility, using any
input in any output function?
The new chip would be very flexible,
but it would not have good
performance.
How to expand PLD architecture?
Increase # of inputs and outputs in a conventional
PLD?
E.g., 16V8 --> 20V8 --> 22V10.
Why not --> 32V16 --> 128V64 ?
Problems:
n times the number of inputs and outputs requires n^2 times
as much chip area, which is too costly.
Logic gets slower as the number of inputs to the AND array
increases.
Solution: multiple PLDs with a relatively small
programmable interconnect.
Less general than a single large PLD, but can use
software “fitter” to partition into smaller PLD blocks.
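The n^2 growth can be illustrated with a rough count of programmable connections in the AND array. The model below is a simplification introduced for this sketch: every product term can connect to the true and complement form of every input, each macrocell is assumed to have 8 product terms, and the connection count is used as a stand-in for chip area.

```python
def and_array_connections(n_inputs, n_macrocells, terms_per_macrocell=8):
    """Count programmable connection points in a simplified AND array."""
    columns = 2 * n_inputs                      # true and complement of each input
    rows = n_macrocells * terms_per_macrocell   # one row per product term
    return columns * rows


if __name__ == "__main__":
    small = and_array_connections(16, 8)     # "16V8"-like array
    large = and_array_connections(128, 64)   # hypothetical "128V64"
    print(small, large, large // small)
```

Scaling both inputs and outputs by 8 multiplies the count by 64 in this model, which is the n^2 behavior stated above.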
CPLDs vs. FPGAs
CPLD architecture:
A small number of largish PLDs (e.g., “36V18”) on a single chip.
Programmable interconnect between the PLDs.
CPLDs vs. FPGAs
FPGA architecture:
A much larger number of smaller programmable logic blocks,
Embedded in a sea of lots and lots of programmable interconnect.
CPLD families
Identical individual PLD blocks (Xilinx “FBs”)
replicated in different family members.
Different number of PLD blocks
Different number of I/O pins
Many CPLDs have fewer I/O pins than macrocells
“Buried” Macrocells -- provide needed logic terms
internally but these outputs are not connected
externally.
IC package size dictates # of I/O pins but not the
total # of macrocells.
Typical CPLD families have devices with differing
resources in the same IC package.
Xilinx XC9500 CPLD Family
The Xilinx XC9500 series is a family of CPLDs with a
similar architecture but varying numbers of external
input/output pins and internal PLDs (which Xilinx calls
function blocks, or FBs).
Each internal PLD has 36 inputs and 18 macrocells
and outputs and might be called “36V18”.
As shown in the table in the next slide, devices in the
family are named according to the number of
macrocells they contain.
The smallest has 2 FBs and 36 macrocells, and
The largest has 16 FBs and 288 macrocells.
Xilinx CPLDs
Notice overlap in resource availability in a particular package.
Xilinx CPLDs
Another feature of this family is that a given chip, such as the
XC95108, is available in several different packages. This is
important not only to accommodate different manufacturing
practices but also to provide some choice and potential savings in
the number of external I/O pins provided. In most applications, it
is not necessary for all internal signals of a state machine or
subsystem to be visible to and used by the rest of the system.
Thus, even though the XC95108 has 108 macrocells, the outputs
of at most 69 of them can be connected externally in the 84-pin
PLCC version of the device.
In fact many of the 69 I/O pins would typically be used for inputs, in
which case even fewer outputs would be visible externally.
Note that the remaining macrocell outputs are still quite usable
internally, since they can be hooked up internally through the CPLD’s
programmable interconnect.
Macrocells whose outputs are usable only internally are sometimes
called buried macrocells.
Xilinx 9500-family CPLD
architecture
I/O pins can be used as input, output or bidirectional
pins according to the device’s programming.
Special purpose pins:
GCK – global clock
GSR – global set/reset
GTS – global three-state controls;
one of these signals can be selected in each
macrocell to output enable the corresponding output
driver when the macrocell’s output is hooked up to an
external I/O pin.
Only 4 FBs are shown in the previous schematic
diagram; however, the XC9500 architecture scales to
accommodate 16 FBs in the XC95288.
Xilinx 9500-family CPLD
architecture
Regardless of the specific family member, each FB
programmably receives 36 signals from the switch
matrix.
The inputs to the switch matrix are the 18 macrocell
outputs from each of the FBs and the external inputs
from the I/O pins.
Each FB also has 18 outputs that run “under” the
switch matrix as shown in the previous figure
connecting to the I/O blocks.
These are merely the output-enable signals for the
I/O block output drivers;
They are used when the FB macrocell’s output is
hooked up to an external I/O pin.
9500-family function blocks (FBs)
architecture
18 macrocells per FB
36 inputs per FB (partitioning challenge, but also
reason for relatively compact size of FBs)
Macrocell outputs can go to I/O cells or back into
switch matrix to be routed to this or other FBs.
9500-family function blocks (FBs)
architecture
The basic XC9500 FB programmable
AND array has just 90 product terms.
However, it also has product-term
allocation.
This mechanism allows a macrocell’s
unused product terms to be used by
other nearby macrocells in the same FB.
Next slide depicts a logic diagram of the
XC9500 product-term allocator and
macrocell.
9500-series macrocell (18 per FB)
[Figure: logic diagram of the 9500-series macrocell. Labeled features: up to 5 product terms; programmable inversion or XOR product term; set control; reset control; global clock or product-term clock; OE control.]
9500-series product-term allocator
[Figure: logic diagram of the product-term allocator. Programmable steering elements share terms from the macrocells above and below.]
9500-series I/O block
Analog controls in addition to
logic ones:
1. Slew-rate control. The rise and fall times of the output signals can be set to be fast or slow.
2. Pull-up resistor. When enabled, the pull-up resistor prevents output pins from floating as the CPLD is powered up. This is useful if the outputs are used to drive active-low enable inputs of other logic that is not supposed to be enabled during power-up.
3. User-programmable ground. This feature reallocates an I/O pin to be a ground pin rather than a signal pin. Extra ground pins are needed to handle the high dynamic currents that flow when multiple outputs switch simultaneously.
Switch matrix for XC95108
Could be anything from a
limited set of multiplexers to a
full crossbar.
Multiplexer -- small, fast, but
difficult fitting
Crossbar -- easy fitting but
large and slow
Finding a complete set of
connections through a sparse
switch matrix is an NP-complete
problem.
For each different CPLD-based
design, a set of switch-matrix
connections must be found by
“fitter” software.
Typically this software,
together with the overall CPLD
design, is part of the
manufacturer’s “secret sauce”.
FPGAs
Historically, FPGA architectures and companies began around the
same time as CPLDs.
FPGAs are closer to “programmable ASICs” -- large emphasis on
interconnection routing
Xilinx launched the world’s first commercial FPGA in 1985, with the
vintage XC2000 device family.
XC3000 and XC4000 families soon followed, setting the stage for
today’s Spartan and Virtex device families.
Each evolution of devices brought improvements in density,
performance, voltage levels, pin counts, and functionality.
Thus XC4000, Spartan and Spartan/XL devices have the same basic
architecture.
Timing is difficult to predict -- multiple hops vs. the fixed delay of a
CPLD’s switch matrix.
But more “scalable” to large sizes.
FPGA programmable logic blocks have only a few inputs and 1 or 2
flip-flops, but there are a lot more of them compared to the
number of macrocells in a CPLD.
[Figure: general FPGA chip architecture. The programmable logic blocks are a.k.a. CLBs, “configurable logic blocks”.]
Xilinx 4000-series FPGAs
FPGA specsmanship
Two flip-flops per CLB, plus two per
I/O cell.
25 “gates” per CLB if used for logic.
32 bits of RAM per CLB if not used for
logic.
All of this is valid only if your design
has a “perfect fit”.
Configurable Logic Block (CLB)
CLB function generators (F, G, H)
Use RAM to store a truth table
F, G: 4 inputs, 16 bits of RAM each
H: 3 inputs, 8 bits of RAM:
16x2 dual port RAM or
32X1 single port RAM.
RAM is loaded from an external PROM at system
initialization.
Broad capability using F, G, and H:
Any 2 funcs of 4 vars, plus a func of 3 vars
Any func of 5 vars
Any func of 4 vars, plus some funcs of 6 vars
Some funcs of 9 vars, including parity and 4-bit
cascadable equality checking
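The idea of a function generator as a small RAM holding a truth table, and of building any 5-variable function from F, G and H, can be sketched as follows. The LUT class and the helper are illustrative models, not the actual XC4000 configuration mechanism: F holds the function for the fifth variable equal to 0, G holds it for the fifth variable equal to 1, and H acts as a 2-to-1 multiplexer between them.

```python
class LUT:
    """An n-input function generator: 2^n bits of RAM indexed by the inputs."""

    def __init__(self, n_inputs, func):
        self.n = n_inputs
        # "Configuration": fill the RAM with the truth table of func.
        self.ram = [int(bool(func(*self._bits(a)))) for a in range(2 ** n_inputs)]

    def _bits(self, address):
        return [(address >> i) & 1 for i in range(self.n)]

    def __call__(self, *inputs):
        address = sum(bit << i for i, bit in enumerate(inputs))
        return self.ram[address]


def five_input_function(func5):
    """Build any function of 5 variables from two 4-input LUTs and one 3-input LUT."""
    f = LUT(4, lambda a, b, c, d: func5(a, b, c, d, 0))   # cofactor for e = 0
    g = LUT(4, lambda a, b, c, d: func5(a, b, c, d, 1))   # cofactor for e = 1
    h = LUT(3, lambda f_out, g_out, e: g_out if e else f_out)

    def clb(a, b, c, d, e):
        return h(f(a, b, c, d), g(a, b, c, d), e)

    return clb


if __name__ == "__main__":
    parity = five_input_function(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
    print(parity(1, 0, 1, 1, 0))
```

The two 4-input tables use 16 bits each and the 3-input table uses 8, matching the RAM sizes listed for F, G and H.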
Dedicated Fast Carry and Borrow
Logic
The F and G function generators of the XC4000 family have:
separate dedicated logic for fast carry and borrow generation,
with dedicated routing to link the extra signal to the function
generator in the adjacent CLB.
One function generator (F) can be used to add a0+b0, and
Second function generator (G) can generate a1+b1.
The fast carry will forward the carry to the next CLB above
or below.
Fast carry and borrow logic increases the efficiency and
performance of adders, subtractors, accumulators,
comparators, and counters.
CLB input and output connections - buried in the sea of interconnect
XC4000 Interconnect Resources
Three types of general-purpose interconnect:
1. Single-length lines,
2. Double-length lines, and
3. Long lines
A grid of horizontal and vertical single-length lines
connect an array of switch boxes.
Switch boxes provide a reduced number of
connections between signal paths within each box
(not a full crossbar switch).
XC4000 Interconnect Resources
In the XC4000 there is a rich set of connections between single-length lines and the CLB inputs and outputs.
Capabilities for nearest-neighbor and across-the-chip connection
between CLBs.
The two “Single” groups are optimized for flexible connectivity between
adjacent blocks, without the limitation of the small number of unidirectional
wires in the “Direct Connect” groups.
With “single” wires it is possible to connect a CLB to another that’s
more than one hop away, but they would have to go through a
programmable switch for each hop which adds delay.
Wires in the “Double” groups travel past two CLBs before hitting a
switch, so they provide shorter delays for longer connections.
The “Long” groups of wires do not go through any programmable
switches at all: instead, they travel all the way across or down a row or
column and are driven by three-state drivers near the CLB.
Detail connections controlled by
RAM bits
Programmable Switch Matrix (PSM)
Each diamond in the shaded area indicating the PSM
is a programmable switch element (PSE) that can
connect any line to any other, as shown in the next
slide under (b).
While the PSM is essential, using it has a price:
signals incur a small delay each time they hop
through a PSE.
High-quality FPGA fitter software does not search for
just any CLB placement and wire connections that
work.
The “placement and routing” tool spends a lot of time
trying to optimize device performance by finding a
placement that allows short connections, and then
routing the connections themselves.
Programmable Switch Matrix
[Figure: programmable switch matrix, showing a programmable switch element and the connections it can make (turning the corner, etc.).]
The fitter’s job
Partition logic functions into CLBs
Arrange the CLBs
Interconnect the CLBs
Minimize the number of CLBs used
Minimize the size and delay of interconnect
used
Work with constraints
“Locked” I/O pins
Critical-path delays
Setup and hold times of storage elements
I/O blocks
Problems common to CPLDs and
FPGAs
Pin locking
Small changes, and certainly large ones, can
cause the fitter to pick a different allocation of
I/O blocks and pinout.
Locking too early may make the resulting circuit
slower or not fit at all.
Running out of resources
Design may “blow up” if it doesn’t all fit on a
single device.
On-chip interconnect resources are much richer
than off-chip; e.g., barrel-shifter example.
Larger devices are exponentially more
expensive.