Transcript Memories

COMP541
Memories - I
Montek Singh
Feb 25, 2010
1
Topics
 Midterm Test
 Thursday after Spring Break
 Lab Preview: VGA character terminal
 Overview of Memory Types
 ROMs: PROMs, FLASH, etc.
 RAMs
 Random-Access Memory (RAM)
 Static today
 Dynamic next
2
Lab: VGA Display Driver Architecture
 No frame buffer
 Character terminal
[Figure: block diagram of the VGA display driver. Labels: From/To CPU, Screen, Character Memory, Bitmap Memory (bitmaps by rows), VGA Driver (RGB, HSync, VSync), Timing Generator (Valid, VSync, HSync)]
3
Character Memory
 Dual ported
 Memory mapped
 CPU writes
 Could read also
 How many characters?
[Figure: the VGA display block diagram again. Labels: To CPU, Screen, Character Memory, Bitmap Memory (bitmaps by rows), VGA Driver (RGB, HSync, VSync), Timing Generator (Valid, VSync, HSync)]
4
Bitmap Memory
 What bitmap size?
 5x7 at least
 Codes
 http://www.piclist.com/techref/datafile/charsets.htm
 http://www.piclist.com/techref/datafile/charset/8x8.htm
 Indexed by character memory
 So what code to store in character memory?
 What size should memory be?
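One way to picture "indexed by character memory" is a sketch in which the character code and the row within the glyph together form the bitmap memory address. This is only an illustration under assumptions that are not from the slides: 8x8 glyphs, 7-bit ASCII codes, a module called char_bitmap, and a font file called font8x8.hex. With those assumptions the memory needs 128 x 8 = 1024 rows of 8 bits, or 8Kb.

```verilog
// Hypothetical bitmap ROM: 128 characters x 8 rows, 8 pixels per row.
// The 8x8 font size and 7-bit ASCII codes are assumptions, not lab requirements.
module char_bitmap (
    input  wire       clk,
    input  wire [6:0] char_code,  // code read from character memory
    input  wire [2:0] glyph_row,  // which of the 8 rows of the glyph
    output reg  [7:0] pixels      // one row of the glyph, one bit per pixel
);
    reg [7:0] rom [0:1023];       // 128 * 8 = 1024 rows = 8Kb

    // Font file name is a placeholder; one 2-digit hex row per line.
    initial $readmemh("font8x8.hex", rom);

    // Address = {character code, row within glyph}
    always @(posedge clk)
        pixels <= rom[{char_code, glyph_row}];
endmodule
```

A larger glyph or a bigger character set changes only the address and data widths.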
5
VGA driver
 Just sends hsync, vsync
 Track current row/column
 Something the Timing Generator should provide the VGA Driver
 Generates color
 When valid
 Maybe smaller than VGA
 What character code? ASCII?
 How many rows and columns?
[Figure: the VGA display block diagram again. Labels: To CPU, Screen, Character Memory, Bitmap Memory (bitmaps by rows), VGA Driver (RGB, HSync, VSync), Timing Generator (Valid, VSync, HSync)]
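Tracking the current row and column can be sketched as a mapping from the pixel position to a character cell. The numbers here are assumptions for illustration, not lab requirements: 640x480 visible pixels and 8x8 glyphs, which gives 80 columns by 60 rows (4800 characters). The module and signal names are hypothetical.

```verilog
// Hypothetical mapping from pixel position to character cell.
// Assumes 640x480 visible pixels and 8x8 glyphs: 80 columns x 60 rows.
module char_cell (
    input  wire [9:0]  x,          // current pixel column, 0..639
    input  wire [9:0]  y,          // current pixel row, 0..479
    output wire [12:0] char_addr,  // index into character memory, 0..4799
    output wire [2:0]  glyph_row,  // row within the 8x8 glyph
    output wire [2:0]  glyph_col   // column within the 8x8 glyph
);
    wire [6:0] col = x[9:3];       // x / 8, 0..79
    wire [6:0] row = y[9:3];       // y / 8, 0..59

    // row * 80 + col, written as row*64 + row*16 to avoid a multiplier
    assign char_addr = {row, 6'b0} + {row, 4'b0} + col;
    assign glyph_row = y[2:0];
    assign glyph_col = x[2:0];
endmodule
```

Choosing a power-of-two glyph size keeps the division and remainder down to simple bit selects.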
6
Possibilities
 Code color into some bits of character?
 Other possibilities
 Sprites for games?
 Your own Nintendo
 Ideas?
7
RAM on FPGA
 Ours has 28 blocks, each 18Kb (bits, not bytes!)
 They call it block RAM
 Block RAM: One or two ports, and several possible layouts
 Often you’ll use it as a 16Kb RAM module
8
Using from Verilog
 It’s a primitive
 Instantiate a block (here called R1)
RAMB16_S1 R1 (
    .DO(out),     // 1-bit Data Output
    .ADDR(addr),  // 14-bit Address Input
    .CLK(clk),    // Clock
    .DI(in),      // 1-bit Data Input
    .EN(ena),     // RAM Enable Input
    .SSR(1'b0),   // Synchronous Set/Reset Input
    .WE(we)       // Write Enable Input
);
9
4-Wide Block
RAMB16_S4 RAMB16_S4_inst (
    .DO(DO),      // 4-bit Data Output
    .ADDR(ADDR),  // 12-bit Address Input
    .CLK(CLK),    // Clock
    .DI(DI),      // 4-bit Data Input
    .EN(EN),      // RAM Enable Input
    .SSR(SSR),    // Synchronous Set/Reset Input
    .WE(WE)       // Write Enable Input
);
10
Wider Have Parity
RAMB16_S18 RAMB16_S18_inst (
    .DO(DO),      // 16-bit Data Output
    .DOP(DOP),    // 2-bit parity Output
    .ADDR(ADDR),  // 10-bit Address Input
    .CLK(CLK),    // Clock
    .DI(DI),      // 16-bit Data Input
    .DIP(DIP),    // 2-bit parity Input
    .EN(EN),      // RAM Enable Input
    .SSR(SSR),    // Synchronous Set/Reset Input
    .WE(WE)       // Write Enable Input
);
11
Can Initialize Block RAM
RAMB16_S1 #(
    .INIT(1'b0),                 // Value of output RAM registers at startup
    .SRVAL(1'b0),                // Output value upon SSR assertion
    .WRITE_MODE("WRITE_FIRST"),  // WRITE_FIRST, READ_FIRST or NO_CHANGE
    // The following INIT_xx declarations specify the initial contents of the RAM
    // Address 0 to 4095
    .INIT_00(256'h0000000000000000000000000000000000000000000000000000000000000F1F),
    .INIT_01(256'h0000000000000000000000000000000000000000000000000000000000000000),
    …
    .INIT_3E(256'h0000000000000000000000000000000000000000000000000000000000000000),
    .INIT_3F(256'h0000000000000000000000000000000000000000000000000000000000000000)
) RAMB16_S1_inst (
    .DO(data),    // 1-bit Data Output
    .ADDR(addr),  // 14-bit Address Input
    .CLK(clk),    // Clock
    .DI(DI),      // 1-bit Data Input
    .EN(EN),      // RAM Enable Input
    .SSR(SSR),    // Synchronous Set/Reset Input
    .WE(WE)       // Write Enable Input
);
Note that addresses go right to left, top to bottom
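The "right to left" ordering can be checked with a small simulation-only sketch. It assumes the 1-bit-wide layout, where each address takes one bit of the INIT string; the module name init_order_demo is made up for this example. With the INIT_00 value shown above (ending in 0F1F), addresses 0 to 4 and 8 to 11 hold a 1 and the other addresses in the first 16 hold a 0.

```verilog
// Simulation-only sketch: how a 1-bit-wide block RAM would see INIT_00 = ...0F1F.
module init_order_demo;
    reg [255:0] init_00;
    integer a;

    initial begin
        init_00 = 256'h0F1F;  // same value as INIT_00 above; upper digits are zero
        // Address 0 is the rightmost (least significant) bit of the string.
        for (a = 0; a < 16; a = a + 1)
            $display("addr %0d holds %b", a, init_00[a]);
    end
endmodule
```

For wider layouts each address takes a group of bits instead of one, still counted from the right.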
12
Synthesizer Can Also Infer
 Careful how you specify (see XST manual).
module inferRAM(clk, addr, din, data, we);
    input        clk;
    input  [8:0] addr;   // 512 locations
    input  [7:0] din;    // write data
    output [7:0] data;   // by 8 bits
    input        we;

    reg [7:0] mem [511:0];
    reg [8:0] ra;        // registered read address

    always @ (posedge clk)
    begin
        if (we)
            mem[addr] <= din;
        ra <= addr;
    end

    assign data = mem[ra];
endmodule
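Since the character memory is described earlier as dual ported, a two-port variant of the inferred RAM may be useful to see. This is a minimal sketch, not the form any particular manual prescribes: the module name dual_port_ram and its port names are hypothetical, and whether a synthesizer maps it onto block RAM depends on the tool and its coding guidelines.

```verilog
// Hypothetical dual-port RAM sketch: port A writes (CPU), port B reads (display).
module dual_port_ram (
    input  wire       clk,
    input  wire       we_a,     // port A write enable
    input  wire [8:0] addr_a,   // port A address, 512 locations
    input  wire [7:0] din_a,    // port A write data
    input  wire [8:0] addr_b,   // port B read address
    output reg  [7:0] dout_b    // port B read data, valid one clock after addr_b
);
    reg [7:0] mem [0:511];

    always @(posedge clk) begin
        if (we_a)
            mem[addr_a] <= din_a;
        dout_b <= mem[addr_b];  // synchronous read, like block RAM
    end
endmodule
```

Keeping the read synchronous matters here: block RAM delivers data at clock edges, so an asynchronous read would not match it.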
13
Look at Test Code
 RAM testing example
 I’ll post online for tomorrow’s lab
 Note how memory values are specified
 Addresses go right-to-left, top-to-bottom
 See the Constraints Guide and Library manuals in Xilinx docs
14
Today’s lecture
15
Types of Memory
 Many dimensions
 Read Only vs Read/Write (or write seldom)
 Volatile vs Non-Volatile
 Requires refresh or not
 Look at ROM first to examine interface
16
Non-Volatile Memory Technologies
 Mask (old)
 Fuses (old)
 Electrically erasable
17
Details of ROM
 Memory that is permanent
 k address lines
 2^k items
 n bits
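The interface above can be sketched as a small lookup table. The example assumes k = 3 address lines and n = 4 data bits, so 2^3 = 8 items; the module name rom8x4 and the stored values are arbitrary and only for illustration.

```verilog
// Hypothetical ROM with k = 3 address lines and n = 4 data bits: 2^3 = 8 items.
// The stored values are arbitrary, for illustration only.
module rom8x4 (
    input  wire [2:0] addr,
    output reg  [3:0] data
);
    always @(*) begin
        case (addr)
            3'd0:    data = 4'b0000;
            3'd1:    data = 4'b0001;
            3'd2:    data = 4'b0100;
            3'd3:    data = 4'b1001;
            3'd4:    data = 4'b0000;
            3'd5:    data = 4'b1001;
            3'd6:    data = 4'b0100;
            3'd7:    data = 4'b0001;
            default: data = 4'bxxxx;  // only reached if addr has x or z bits
        endcase
    end
endmodule
```

Written this way, a ROM is just a truth table with one row per address.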
18
Notional View of Internals
19
Programmed Truth Table
20
Resulting Programming
 In truth, they’re laid out in 2D (row, col)
21
Mask ROMs
 Oldest technology
 Originally “mask” used as last step in manufacturing
 Specify metal layer (connections)
 Used for volume applications
 Long turnaround
 Used for applications such as embedded systems and, in the old days, boot ROM
22
Programmable ROM (PROM)
 First ones had fusible links
 High voltage would blow out links
 Fast to program
 Single use
23
UV EPROM
 Erasable PROM
 Common technologies used UV light to erase complete device
 Took about 10 minutes
 Holds state as charge in very well insulated areas of the chip
 Nonvolatile for several (10?) years
24
EEPROM
 Electrically Erasable PROM
 Similar technology to UV EPROM
 Erased in blocks by higher voltage
 Programming is slower than reading
 Some called flash memory
 Digital cameras, MP3 players, BIOS
 Limited life
 Some support individual word write, some block
 One on Xess board has 5 blocks
 Has a boot block that is carefully protected
25
How Flash Works
 Special transistor with floating gate
 This is part of device surrounded by insulation
 So charge placed there can stay for years
 Aside: some newer devices store multiple bits of info in a cell
 Interested in this? If so, we can cover in more detail w/ transistors
26
Read/Write Memories
 Flash is obviously writeable
 But not meant to be written rapidly (say at CPU rates)
 And often by blocks (disk replacement)
 On to RAM
27
Random Access Memories
 So called because it takes the same amount of time to address any particular location
 Not quite true for modern DRAMs
 First look at asynchronous static RAM
 Ones on the Xilinx chip are synchronous
 Data available at clock edges, like registers
 One on the board can be both
28
Simple View of RAM
 Of some word size n
 Some capacity 2^k
 k bits of address line
 Maybe have read line
 Strictly speaking may not need
 Have a write line
29
1K x 16 memory
 Variety of sizes
 From 1-bit wide
 Issue is no. of pins
 Memory size often specified in bytes
 This would be 2KB memory
 10 address lines and 16 data lines
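The sizing arithmetic on this slide can be written down as a parameterized sketch. It is a synchronous model for brevity, and the module and parameter names are illustrative rather than taken from the slides.

```verilog
// Hypothetical 1K x 16 synchronous RAM; parameter names are illustrative.
module ram_1k_x16 #(
    parameter ADDR_BITS = 10,           // 10 address lines: 2^10 = 1024 words
    parameter WIDTH     = 16            // 16 data lines
) (
    input  wire                 clk,
    input  wire                 we,
    input  wire [ADDR_BITS-1:0] addr,
    input  wire [WIDTH-1:0]     din,
    output reg  [WIDTH-1:0]     dout
);
    localparam DEPTH = 1 << ADDR_BITS;  // 1024 words

    // Capacity: 1024 words * 16 bits = 16,384 bits = 2,048 bytes (2KB)
    reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;
        dout <= mem[addr];
    end
endmodule
```

Each extra address bit doubles the depth, while the width sets the number of data pins.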
30
Writing
 Sequence of steps
 Setup address lines
 Setup data lines
 Activate write line (maybe a pos edge)
31
Reading
 Steps
 Setup address lines
 Activate read line
 Data available after specified amt of time
 For async
 Synchronous memories use a clock
32
Chip Select
 Usually a line to enable the chip
 Why?
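The write and read sequences from the previous two slides, together with the chip-select line, can be sketched as a behavioral model of an asynchronous SRAM. This is a simulation-style illustration under assumptions: active-high control lines, hypothetical signal names, and no timing parameters. Real parts usually have active-low pins and setup, hold and access times that this model ignores.

```verilog
// Hypothetical asynchronous SRAM model, 1K x 16, active-high controls.
module async_sram (
    input  wire        cs,    // chip select: the chip ignores everything when low
    input  wire        we,    // write line
    input  wire        oe,    // read (output enable) line
    input  wire [9:0]  addr,
    inout  wire [15:0] data   // shared, bidirectional data lines
);
    reg [15:0] mem [0:1023];

    // Write: address and data are set up first, then the write line rises.
    always @(posedge we)
        if (cs)
            mem[addr] <= data;

    // Read: drive the data lines only when selected and reading;
    // otherwise release them (high impedance).
    assign data = (cs && oe && !we) ? mem[addr] : 16'bz;
endmodule
```

Releasing the data lines whenever the chip is not selected is what allows several chips to share one set of data lines.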
33
Writing
34
Reading
35
Static vs Dynamic RAM
 SRAM vs DRAM
 DRAM stores charge in capacitor
 Disappears after short period of time
 Must be refreshed
 SRAM easier to use
 Uses transistors (think of it as latch)
 Faster
 More expensive per bit
 Smaller sizes
36
Structure of SRAM
 Control logic
 One memory cell per bit
 Cell consists of one or more transistors
 Not really a latch made of NANDs/NORs, but logically equivalent
37
Simple Organization
 In reality, more complex
 Note that only one wordline H at a time
[Figure: 4-word x 3-bit memory array. A 2:4 Decoder takes the 2-bit Address (00, 01, 10, 11) and drives wordline0 to wordline3; each wordline selects a row of three stored bits, which appear on bitline2 to bitline0 as Data2 to Data0]
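The organization in the figure can be sketched in a few lines: a decoder raises exactly one wordline, and each bitline carries the bit from the selected row. The module name array4x3 and the stored values are illustrative only and are not read off the slide.

```verilog
// Hypothetical 4-word x 3-bit array in the style of the figure.
// The stored values are illustrative only.
module array4x3 (
    input  wire [1:0] addr,
    output wire [2:0] data
);
    // 2:4 decoder: exactly one wordline is high for any address.
    wire [3:0] wordline = 4'b0001 << addr;

    // One row of stored bits per wordline.
    wire [2:0] row0 = 3'b011;
    wire [2:0] row1 = 3'b110;
    wire [2:0] row2 = 3'b100;
    wire [2:0] row3 = 3'b010;

    // Each bitline carries the bit from whichever row is selected.
    assign data = ({3{wordline[0]}} & row0)
                | ({3{wordline[1]}} & row1)
                | ({3{wordline[2]}} & row2)
                | ({3{wordline[3]}} & row3);
endmodule
```

In a real array the unselected cells leave the bitline floating rather than being ANDed out, but the selection logic is the same.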
38
Bit Slice
 Cells connected to form 1 bit position
 Word Select gates one latch from address lines
 Note it selects Reads also
 B (and B’) set by R/W, Data In and BitSelect
 Funny thing here when you write. What is it?
39
Bit Cells
[Figure: bit cell with a wordline, a bitline and a stored bit. Example, panels (a) and (b): with wordline = 1 the bitline carries the stored bit (0 or 1); with wordline = 0 the bitline is Z]
Bit Slice can Become Module
 Basically bit slice is an X1 memory
 Next
41
SRAM Bit Cell
[Figure: SRAM bit cell, showing the wordline, the bitlines and the stored bit]
16 X 1 RAM “Chip”
 Now shows decoder
43
Row/Column
 If RAM gets large, there is a large decoder
 Also run into chip layout issues
 Larger memories usually “2D” in a matrix layout
 Next Slide
44
16 X 1 RAM as 4 X 4 Array
 Two decoders
 Row
 Column
 Address just broken up
 Not visible from outside on SRAMs
45
Change to 8 X 2 RAM
 Minor change in logic
 Also pinouts
 What’s different?
46
Realistic Sizes
 Imagine a 256Kb memory organized as 32K X 8
 One column layout would need 15-bit decoder with 32K outputs!
 Can make a square layout with 9-bit row and 6-bit column decoders
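The square layout works out as follows: 15 address bits split into 9 row bits and 6 column bits, giving 512 rows, each holding 64 columns of 8-bit words, which is 512 x 512 bit cells. A minimal sketch of the split is below; which address bits go to the row decoder is an assumption, as are the module and signal names.

```verilog
// Hypothetical address split for a 32K x 8 memory laid out as a square array.
// 15 address bits = 9 row bits + 6 column bits.
module addr_split (
    input  wire [14:0] addr,
    output wire [8:0]  row,  // feeds the 9-bit row decoder (512 wordlines)
    output wire [5:0]  col   // feeds the 6-bit column decoder (64 word columns)
);
    assign row = addr[14:6];
    assign col = addr[5:0];
endmodule
```

The two decoders together have 512 + 64 outputs instead of the 32K outputs a single decoder would need.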
47
SRAM Performance
 Current ones have cycle times in low nanoseconds (say 2.5ns)
 Used as cache (typically on-chip or off-chip secondary cache)
 Sizes up to 8Mbit or so for fast chips
 SRAMs also common for low power
48
Wider Memory
 What if you don’t have enough bit width?
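One common answer is to put chips side by side: both get the same address and control lines, and each supplies part of every word. The sketch below builds a 64K x 16 memory from two 64K x 8 blocks. The building block is modeled as a synchronous RAM for brevity, and all module and signal names are hypothetical.

```verilog
// Hypothetical building block: a 64K x 8 RAM (synchronous, for brevity).
module ram_64k_x8 (
    input  wire        clk,
    input  wire        we,
    input  wire [15:0] addr,
    input  wire [7:0]  din,
    output reg  [7:0]  dout
);
    reg [7:0] mem [0:65535];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;
        dout <= mem[addr];
    end
endmodule

// Two 8-bit chips side by side make a 64K x 16 memory:
// same address and write enable to both, each holds half of every word.
module ram_64k_x16 (
    input  wire        clk,
    input  wire        we,
    input  wire [15:0] addr,
    input  wire [15:0] din,
    output wire [15:0] dout
);
    ram_64k_x8 hi (.clk(clk), .we(we), .addr(addr), .din(din[15:8]), .dout(dout[15:8]));
    ram_64k_x8 lo (.clk(clk), .we(we), .addr(addr), .din(din[7:0]),  .dout(dout[7:0]));
endmodule
```

No decoder is needed for widening, because every chip is active on every access.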
49
Larger/Wider Memories
 Made up from sets of chips
 Consider a 64K by 8 RAM
50
Larger
 256K X 8
 Decoder for high-order 2 bits
 Selects chip
 Look at selection logic
 Address ranges
 Tri-state outputs
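The selection logic can be sketched as a decoder on the two high-order address bits. The example assumes the 256K x 8 memory is built from four 64K x 8 chips; the module and signal names are hypothetical.

```verilog
// Hypothetical chip-select logic for a 256K x 8 memory built from four 64K x 8 chips.
module chip_select_256k (
    input  wire [17:0] addr,       // 18 address bits = 256K locations
    output wire [3:0]  cs,         // one chip select per 64K x 8 chip
    output wire [15:0] chip_addr   // low 16 bits go to every chip
);
    // Address ranges:
    //   chip 0: 0x00000-0x0FFFF    chip 1: 0x10000-0x1FFFF
    //   chip 2: 0x20000-0x2FFFF    chip 3: 0x30000-0x3FFFF
    assign cs        = 4'b0001 << addr[17:16];
    assign chip_addr = addr[15:0];
endmodule
```

Only the selected chip drives the shared data lines; the other three keep their outputs tri-stated.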
51
Deeper Memory
 Adding chips to increase storage, but keep same width
 Need decoder
52
Today
 Fast look at non-volatile memory
 Static RAM
 Next: Dynamic RAM
 Complex, largest, cheap
 Much more design effort to use
53