Transcript Memories

COMP541
Memories - I
Montek Singh
Oct 10, 2016
Topics
- Overview of Memory Types
- Read-Only Memory (ROM): PROMs, FLASH, etc.
- Random-Access Memory (RAM)
  - Static today
  - Dynamic next
- Verilog descriptions of memories

Types of Memory
- Many dimensions
  - Read Only vs. Read/Write (or write seldom)
  - Volatile vs. Non-Volatile
  - Requires refresh or not
- Look at ROM first to examine interface

Non-Volatile Memory Technologies
- Mask (old) → ROM
  - read-only memory
- Fuses (old) → PROM
  - programmable read-only memory
- Erasable → EPROM
  - erasable programmable read-only memory
- Electrically erasable → EEPROM
  - electrically-erasable programmable read-only memory
  - today called FLASH!
  - used everywhere!

Details of ROM
- Memory that is permanent
- k address lines
- 2^k items
- n bits per item

Notional View of Internals
- Main components:
  - decoder for address decoding → selects one row
  - “wired-OR” per bit → ORs together minterms
    - ORing done by connecting the outputs of what are effectively tristate buffers

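The decoder-plus-wired-OR structure described above can be sketched behaviorally in SystemVerilog. This is only an illustration under assumptions, not the circuit on the slide: the module name `rom_notional`, the 2-bit address, the 4-bit word, and the contents of `fuse` are all made up for the example.

```systemverilog
// Hypothetical 4-word x 4-bit ROM built the "notional" way:
// a one-hot address decoder selects one row, and each output bit
// is the OR over all rows of (row selected AND fuse intact).
module rom_notional (
    input  logic [1:0] addr,
    output logic [3:0] dout
);
    // fuse[r][b] = 1 means the link for row r, bit b is intact.
    // These contents are invented for the example.
    localparam logic [3:0] fuse [4] = '{4'b0101, 4'b0011, 4'b1100, 4'b1001};

    logic [3:0] row_sel;               // one-hot decoder output
    assign row_sel = 4'b0001 << addr;

    always_comb begin
        dout = '0;
        for (int r = 0; r < 4; r++)
            if (row_sel[r]) dout |= fuse[r];   // models the wired-OR per bit
    end
endmodule
```

Because only one row is selected at a time, the OR simply passes through the fuse pattern of the addressed row.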
Programmed Truth Table
ROM after programming
- Remember:
  - OR is a “wired OR”
  - output is 1 if any of the rows with an intact fuse is 1
  - 0 otherwise

Mask ROMs
- Oldest technology
  - Originally “mask” used as last step in manufacturing
  - Specify metal layer (connections)
- Used for volume applications
  - Long turnaround
  - Used for applications such as embedded systems and, in the old days, boot ROM
  - but cheap to mass produce!

Programmable ROM (PROM)
- Early ones had fusible links
- High voltage would blow out links
- Fast to program
- Single use

UV EPROM
- Erasable PROM
  - Common technologies used UV light to erase the complete device
  - Took about 10 minutes
- Holds state as charge in very well insulated areas of the chip
- Nonvolatile for several (10?) years

EEPROM
- Electrically Erasable PROM
  - Similar technology to UV EPROM
  - Erased in blocks by higher voltage
  - Programming is slower than reading
- Today’s flavor is called “flash memory”
  - Digital cameras, MP3 players, BIOS
  - Limited life
  - Some support individual word write, some block
- Our boards have it:
  - A flash memory chip on our Nexys boards
  - Has a “boot block” that is carefully protected
  - We will learn to use it in upcoming labs

How Flash Works
- Special transistor with floating gate
  - This is a part of the device surrounded by insulation
  - So charge placed there can stay for years
- Aside: some newer devices store multiple bits of info in a cell
- Interested in this?
  - Let’s cover briefly

Flash
- Add an extra gate to an nMOS transistor
  - a “floating gate” below the actual control gate
  - floating gate is isolated from everything else
  - can hold electrons for a while
- charge on floating gate determines bit value stored
  - electrons deposited → negative charge does not allow the transistor to turn on
  - if no electrons on floating gate → transistor can be turned on by the control gate
https://en.wikipedia.org/wiki/Flash_memory

Flash
- Add an extra gate to an nMOS transistor
  - charge on floating gate determines bit value stored
  - floating gate can be cleared using high voltage
    - erased → ‘1’ value
  - cannot erase individual bits: must clear an entire “block” or “page”
  - can write individual bits
- for fast write speeds:
  - must have empty blocks available
  - speed slows down as memory fills
  - thus, garbage collection is important
  - overprovisioning used in SSDs

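The erase and write rules above (erase works only on a whole block and leaves every bit at 1; writing can change individual bits) can be captured in a small behavioral sketch. This is not a model of any real flash device: the module name `flash_block_model`, the `erase` and `prog` control signals, and the default sizes are assumptions. Programming is modeled as an AND with the old contents, since a write can only move bits from the erased value 1 to 0.

```systemverilog
// Behavioral sketch of one flash erase block (illustrative only).
module flash_block_model #(
    parameter int Dbits = 8,    // assumed word width
    parameter int Nloc  = 16    // assumed number of words per erase block
)(
    input  logic                    clock,
    input  logic                    erase,   // erase the entire block
    input  logic                    prog,    // program (write) one word
    input  logic [$clog2(Nloc)-1:0] addr,
    input  logic [Dbits-1:0]        din,
    output logic [Dbits-1:0]        dout
);
    logic [Dbits-1:0] mem [Nloc-1:0];

    always_ff @(posedge clock)
        if (erase)
            for (int i = 0; i < Nloc; i++) mem[i] <= '1;  // every bit back to 1
        else if (prog)
            mem[addr] <= mem[addr] & din;                 // bits can only go 1 -> 0

    assign dout = mem[addr];
endmodule
```

In this sketch, writing new data over a word that is not erased gives the AND of old and new values, which is why a rewrite normally needs an erased block first.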
Read/Write Memories
- Flash is obviously writeable
  - But not meant to be written rapidly (say at CPU rates)
  - And often writing needs erasure of entire blocks
- For frequent writing, use RAM

Random Access Memories
- So called because it takes the same amount of time to address any particular location
  - Not entirely true for modern DRAMs, but somewhat true…
- First look at asynchronous static RAM
  - reading and writing typically controlled by “handshakes”
  - clock may still be present, but actions controlled by handshake signals

Simple View of RAM
- Typical parameters:
  - some word size n
  - some capacity 2^k
  - k bits of address line
- Need a line to specify reading or writing
  - typically only one wire needed
  - sometimes two separate ones

Example: 1K x 16 memory
- RAM comes in a variety of sizes
  - from 1-bit wide
  - main issue is no. of pins available on chip
- Memory size often specified in bytes
  - This would be a 2KB memory
- 10 address lines (=1K locations)
- 16 data lines (=2 bytes/location)

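As a quick check of the 2KB figure, the arithmetic for the 1K x 16 example works out as follows (taking K as 1024):

```latex
2^{10}\ \text{locations} \times 16\ \tfrac{\text{bits}}{\text{location}}
  = 16384\ \text{bits}
  = \frac{16384}{8}\ \text{bytes}
  = 2048\ \text{bytes}
  = 2\ \text{KB}
```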
Writing
- Sequence of steps
  - Set up address lines
  - Set up data lines
  - Activate write line (e.g., maybe a positive edge)

Reading
- Steps
  - Set up address lines
  - Activate read line
  - Data available soon
    - for asynchronous memory: simply after a specified amount of time
    - for synchronous memory: after a clock edge

Chip Select
- Enable:
  - Usually a line to enable the chip
  - Why?

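The write steps, read steps, and chip-enable line from the last few slides can be put together in one behavioral sketch. Everything here is an assumption for illustration: the module name `sram_chip`, the signal names `cs`, `we`, and `oe`, the 1K x 16 size, and the choice to write on the rising edge of `we` (the slides only say "maybe a positive edge"). Real parts differ in polarity and timing.

```systemverilog
// Hypothetical asynchronous RAM chip with a chip-select input.
module sram_chip #(
    parameter int Abits = 10,   // 2^10 = 1K locations
    parameter int Dbits = 16    // 16-bit words
)(
    input  logic             cs,    // chip select: chip ignores requests when 0
    input  logic             we,    // write line: write happens on its rising edge
    input  logic             oe,    // read line (output enable)
    input  logic [Abits-1:0] addr,
    input  logic [Dbits-1:0] din,
    output tri   [Dbits-1:0] dout   // floats (Z) unless selected and reading
);
    logic [Dbits-1:0] mem [2**Abits-1:0];

    // Write: address and data are set up first, then 'we' is activated.
    always_ff @(posedge we)
        if (cs) mem[addr] <= din;

    // Read: data follows the address while the chip is selected and reading.
    assign dout = (cs && oe && !we) ? mem[addr] : 'z;
endmodule
```

One reason for a chip-select line shows up in the output assignment: a deselected chip leaves `dout` floating, so several chips can share the same data wires.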
Timing: Writing
Timing: Reading
Static vs. Dynamic RAM
- Different internal implementations: SRAM vs. DRAM
- DRAM:
  - DRAM stores charge in a capacitor
    - Disappears after a short period of time
    - Must be refreshed
  - Small cell size
  - Higher storage density → larger capacities
- SRAM:
  - SRAM easier to use
  - Uses transistors (think of it as a latch)
  - Faster
  - More expensive per bit
  - Smaller capacities

Structure of SRAM
- Internally, each bit stored in a “latch”
  - One memory cell per bit
  - Cell consists of a few transistors
  - Not really a latch made of NANDs/NORs, but logically equivalent
  - Behaves like an SR latch
- Control logic
  - also need extra logic around the latch to make it work like a memory cell

Structure of SRAM
- Several optimized circuits often used
  - replace a full-fledged SR latch with something simpler, smaller, faster…
  - Not really a latch made of NANDs/NORs, but logically equivalent
  - Behaves like an SR latch
  - e.g., a simpler 6-transistor memory cell
    - wordline corresponds to Select
    - (bitline, bitline’) corresponds to (B, B’) as well as (C, C’)
[Figure: 6-transistor memory cell; labels: bitline, bitline’, wordline]

Example: A Simple Organization
- Note:
  - In reality, more complex
  - Only one word-line is “on” at a time
[Figure: 4 x 3 memory array. A 2:4 decoder takes the 2-bit Address and turns on one of wordline3..wordline0; the cells in that row connect to bitline2..bitline0, which are read out as Data2..Data0. Stored bits are listed below in the order they appear in the figure labels.]
Address  Wordline   Stored bits (bitline2 bitline1 bitline0)
11       wordline3  0 1 0
10       wordline2  1 0 0
01       wordline1  1 1 0
00       wordline0  0 1 1

Zoom in: A single bit slice
- Operation:
  - Cells connected to form 1 bit position (column)
  - Word Select enables one latch from address lines
    - only this cell is writable
    - only this cell is read
  - B (and B’) set by:
    - Read/Write’
    - Data In
    - Bit Select
- Outputs are C and C’
  - if enabled, output value of cell
  - if disabled, typically output floating

Let’s look at a single bit cell
[Figure: bit cell connected to a wordline and a bitline, holding a stored bit]
Example:
(a) stored bit = 0: wordline = 1 → bitline = 0; wordline = 0 → bitline = Z
(b) stored bit = 1: wordline = 1 → bitline = 1; wordline = 0 → bitline = Z

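The read behavior of a bit cell, as seen from the bitline, can be written as a one-line tri-state assignment. This is a minimal sketch of the read side only; the module name `bitcell_read` and the `stored` input standing in for the cell contents are assumptions, and the write path is left out.

```systemverilog
// Read-side behavior of one bit cell (write path omitted).
module bitcell_read (
    input  logic wordline,
    input  logic stored,    // stand-in for the value held in the cell
    output tri   bitline
);
    // wordline = 1: the cell drives its stored bit onto the bitline.
    // wordline = 0: the cell is disconnected and the bitline floats (Z).
    assign bitline = wordline ? stored : 1'bz;
endmodule
```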
Bit Slices and Modules
- Entire column of cells
  - called a bit slice
  - basically a 1-bit wide memory!
- Module
  - module refers to a single chip of memory
  - 1-bit wide memory chips are quite common!

Inside an SRAM Bit Cell
- Actual implementation does not use a real SR latch!
  - a tinier approximation is used
  - logically behaves very much like an SR latch
  - but much smaller and faster!
[Figure: bit cell symbol (wordline, bitline, stored bit) next to its circuit-level implementation (wordline, bitline, bitline’)]

16 X 1 RAM “Chip”
- Now shows address decoder
  - selects appropriate location

Row/Column Layout
- For larger RAMs:
  - decoder becomes pretty big
  - also run into chip layout issues
- Typically:
  - larger memories use “2D” matrix layout
  - see next slide

16 X 1 RAM as 4 X 4 Array
- Two decoders
  - Row
  - Column
- Address just broken up
- Not visible from outside on SRAMs

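A behavioral sketch of the 4 X 4 organization shows how the address is "just broken up". The module name `ram16x1_4x4`, the clocked write, and the choice of the upper two address bits for the row and the lower two for the column are assumptions; the slide does not say which bits go where.

```systemverilog
// Sketch of a 16 x 1 RAM stored as a 4 x 4 array of cells.
module ram16x1_4x4 (
    input  logic       clock,
    input  logic       wr,
    input  logic [3:0] addr,
    input  logic       din,
    output logic       dout
);
    logic [3:0] cells [4];           // 4 rows, each 4 columns wide

    logic [1:0] row, col;
    assign row = addr[3:2];          // feeds the row decoder (assumed split)
    assign col = addr[1:0];          // feeds the column decoder (assumed split)

    always_ff @(posedge clock)
        if (wr) cells[row][col] <= din;

    assign dout = cells[row][col];   // asynchronous read of the addressed cell
endmodule
```

From outside, the module still looks like a 16 x 1 memory with a single 4-bit address, which matches the point that the split is not visible externally.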
Not the same as 8 X 2 RAM!
- Minor change in logic and pins
- Spot the difference!

Spot the difference!

Realistic Sizes
- Example: 256Kb memory organized 32K X 8
  - Single-column layout would need 15-bit decoder with 32K outputs!
- Better organization:
  - A 2D (i.e., square) layout with:
    - 9-bit row and 6-bit column decoders

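The 9-bit and 6-bit decoder widths follow from making the cell array square. The worked arithmetic below assumes each row holds whole 8-bit words, so the column decoder picks one word out of a row:

```latex
\begin{aligned}
32\text{K} \times 8 &= 2^{15} \times 2^{3} = 2^{18}\ \text{bits} = 2^{9} \times 2^{9}\ \text{cells} \\
\text{rows} &= 2^{9} = 512 \quad \text{(9-bit row decoder)} \\
\text{words per row} &= 2^{9} / 2^{3} = 2^{6} = 64 \quad \text{(6-bit column decoder)} \\
\text{address bits} &= 9 + 6 = 15
\end{aligned}
```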
SRAM Performance
- Latency and Throughput important
  - Current ones have cycle times in low nanoseconds
    - say 1-2ns (top-end ones even lower)
  - Used as cache (typically on-chip or off-chip secondary cache)
- Sizes up to 8Mbit or so for fast chips
  - Expensive ones can go a bit bigger
- Energy/power
  - SRAMs also better for low power vs. DRAMs

Wider Memory
- What if you don’t have enough bit width?
  - use multiple chips side-by-side

Larger/Wider Memories
- Made up from sets of chips
- Consider a 64K by 8 RAM
  - our building block

Larger
- Let’s build a larger memory
  - 256K X 8
- Decoder for high-order 2 bits
  - Selects chip
- Look at selection logic
  - Address ranges
  - Tri-state outputs

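The selection logic for the 256K X 8 memory can be sketched as follows. `ram64kx8` is a hypothetical behavioral stand-in for the 64K by 8 building block (its port names and clocked write are assumptions), and `ram256kx8` is an illustrative wrapper, not the circuit from the slide.

```systemverilog
// Hypothetical 64K x 8 building block with chip select and tri-state output.
module ram64kx8 (
    input  logic        cs, wr, clock,
    input  logic [15:0] addr,
    input  logic [7:0]  din,
    output tri   [7:0]  dout
);
    logic [7:0] mem [65535:0];
    always_ff @(posedge clock)
        if (cs && wr) mem[addr] <= din;
    assign dout = cs ? mem[addr] : 'z;    // floats when not selected
endmodule

// 256K x 8 memory built from four of the blocks above.
module ram256kx8 (
    input  logic        wr, clock,
    input  logic [17:0] addr,
    input  logic [7:0]  din,
    output tri   [7:0]  dout
);
    logic [3:0] cs;
    assign cs = 4'b0001 << addr[17:16];   // 2:4 decoder on the high-order 2 bits

    // Chip i covers addresses i*64K through (i+1)*64K - 1.
    // All four outputs share dout; only the selected chip drives it.
    for (genvar i = 0; i < 4; i++) begin : chip
        ram64kx8 u (.cs(cs[i]), .wr(wr), .clock(clock),
                    .addr(addr[15:0]), .din(din), .dout(dout));
    end
endmodule
```

The tri-state outputs are what allow the four `dout` buses to be wired together without a separate output multiplexer.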
SystemVerilog
Behavioral descriptions of:
ROM, single-ported RAM, dual-ported RAM, etc.
SystemVerilog: 1-port RAM
- RAM example
  - single-ported → one address (for reading and writing)
    - whether read or written is determined by “write enable”
  - clock
    - all writes take place on clock tick
  - reads are asynchronous
    - i.e., output after a propagation delay without waiting for a clock tick
[Figure: 1-port RAM block with inputs addr, din, wr, clock and output dout]

SystemVerilog: 1-port RAM
// The actual storage where data resides
logic [Dbits-1:0] mem [Nloc-1:0];

// Write operation on clock tick if write enabled
always_ff @(posedge clock)
    if(wr) mem[addr] <= din;

// Reading is asynchronous, no clock involved
assign dout = mem[addr];

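The three statements above can be wrapped into a complete module. The sketch below is one possible packaging, not the course's official template: the module name `ram_1port`, the parameter defaults, and the use of `$clog2` for the address width are assumptions. A small testbench (also hypothetical) follows to show a write and an asynchronous read-back.

```systemverilog
// One possible complete module around the 1-port RAM statements.
module ram_1port #(
    parameter int Nloc  = 16,   // number of locations (assumed default)
    parameter int Dbits = 8     // bits per location (assumed default)
)(
    input  logic                    clock,
    input  logic                    wr,
    input  logic [$clog2(Nloc)-1:0] addr,
    input  logic [Dbits-1:0]        din,
    output logic [Dbits-1:0]        dout
);
    logic [Dbits-1:0] mem [Nloc-1:0];       // the actual storage

    always_ff @(posedge clock)              // write on clock tick if enabled
        if (wr) mem[addr] <= din;

    assign dout = mem[addr];                // asynchronous read
endmodule

// Minimal self-checking testbench: write one word, then read it back.
module ram_1port_tb;
    logic       clock = 0, wr;
    logic [3:0] addr;
    logic [7:0] din, dout;

    ram_1port #(.Nloc(16), .Dbits(8)) dut (.*);

    always #5 clock = ~clock;

    initial begin
        wr = 1; addr = 4'd3; din = 8'hA5;
        @(posedge clock); #1;               // the write takes effect on this tick
        wr = 0;
        assert (dout == 8'hA5) else $error("read-back failed");
        $finish;
    end
endmodule
```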
SystemVerilog: 2-port RAM
- RAM example
  - 2 ports
    - one read-write port (using addr1)
    - one read-only port (using addr2)
  - 2 outputs: dout1 and dout2
  - only one data input: din
[Figure: 2-port RAM block with inputs addr1 (read-write), addr2 (read-only), din, wr, clock and outputs dout1, dout2]

SystemVerilog: 2-port RAM
// The actual storage where data resides
logic [Dbits-1:0] mem [Nloc-1:0];

// Write operation on clock tick if write enabled
always_ff @(posedge clock)
    if(wr) mem[addr1] <= din;

// Reading is asynchronous, no clock involved
assign dout1 = mem[addr1];
assign dout2 = mem[addr2];

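As with the 1-port case, the 2-port statements can be wrapped into a full module. This is a sketch under assumptions: the module name `ram_2port`, the parameter defaults, and the `$clog2`-sized address ports are not from the slides.

```systemverilog
// One possible complete module around the 2-port RAM statements.
module ram_2port #(
    parameter int Nloc  = 16,   // number of locations (assumed default)
    parameter int Dbits = 8     // bits per location (assumed default)
)(
    input  logic                    clock,
    input  logic                    wr,
    input  logic [$clog2(Nloc)-1:0] addr1,   // read-write port
    input  logic [$clog2(Nloc)-1:0] addr2,   // read-only port
    input  logic [Dbits-1:0]        din,
    output logic [Dbits-1:0]        dout1,
    output logic [Dbits-1:0]        dout2
);
    logic [Dbits-1:0] mem [Nloc-1:0];        // the actual storage

    always_ff @(posedge clock)               // only addr1 can be written
        if (wr) mem[addr1] <= din;

    assign dout1 = mem[addr1];               // both reads are asynchronous
    assign dout2 = mem[addr2];
endmodule
```

Both ports read the same `mem` array, so a value written through `addr1` is visible on `dout2` as soon as `addr2` points at that location.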
SystemVerilog: register file
- Register file
  - 3 ports
    - two read-only ports (using ReadAddr1 and ReadAddr2)
    - one write-only port (using WriteAddr)
  - 2 outputs: ReadData1 and ReadData2
  - one data input: WriteData
  - special case: reading $0 always returns 0
[Figure: 3-port register file block with inputs ReadAddr1, ReadAddr2, WriteAddr, WriteData, wr, clock and outputs ReadData1, ReadData2]

SystemVerilog: register file
Skeleton only. You fill in the details (Lab 8).

// The actual storage where data resides
logic [Dbits-1:0] rf [Nloc-1:0];

// Write operation on clock tick if write enabled
always_ff @(posedge clock)
    if(wr) rf[…] <= …;

// Reading is asynchronous, no clock involved.
// Reading $0 must always return 0.
assign ReadData1 = … ? … rf[…];
assign ReadData2 = … ? … rf[…];

SystemVerilog: memory initialization
- Specify a file that contains initial values
  - one value per line:
    - hex or binary
  - use $readmemh for hex
  - use $readmemb for binary

logic [Dbits-1:0] mem[Nloc-1:0];
// Specifies the file that contains initial values
initial $readmemh("mem_data.txt", mem, 0, Nloc-1);
always_ff @(posedge clock)
    …
assign …

SystemVerilog: ROM example
- ROM example
  - single-ported
  - read-only, no writing
    - no clock needed
  - reads are asynchronous
    - i.e., output appears after a propagation delay without waiting for a clock tick

logic [Dbits-1:0] mem [Nloc-1:0];
initial $readmemh("mem_data.txt", mem, 0, Nloc-1);
// Read operation only, no writes
assign dout = mem[addr];

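Putting the initialization and read statements into a full module might look like the sketch below. The module name `rom`, the parameter defaults, and the `initfile` parameter are assumptions added for illustration; only the three statements in the body come from the slide. With `Dbits = 8`, the file named by `initfile` would hold one hex value per line, for example `00`, `1F`, `A5`, `FF`, up to `Nloc` lines.

```systemverilog
// One possible complete ROM module, initialized from a hex file.
module rom #(
    parameter int Nloc     = 16,              // number of locations (assumed default)
    parameter int Dbits    = 8,               // bits per location (assumed default)
    parameter     initfile = "mem_data.txt"   // one hex value per line
)(
    input  logic [$clog2(Nloc)-1:0] addr,
    output logic [Dbits-1:0]        dout
);
    logic [Dbits-1:0] mem [Nloc-1:0];

    // Load the initial contents; nothing ever writes mem afterwards.
    initial $readmemh(initfile, mem, 0, Nloc-1);

    assign dout = mem[addr];                  // asynchronous read, no clock
endmodule
```

Making the file name a parameter is a design choice for this sketch; it lets several ROM instances load different contents without editing the module.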
Summary
- Today we looked at:
  - Quick look at non-volatile memory
  - Static RAM
  - SystemVerilog templates for memories
- Next topic:
  - Dynamic RAM
    - Complex, largest, cheap
    - Much more design effort to use