EECS150 - Digital Design
Lecture 20 - Memory
April 4&9, 2002
John Wawrzynek
Spring 2002
EECS150 - Lec19-memory
Page 1
Memory Basics
• Uses:
– data & program storage
– general purpose registers
– buffering
– table lookups
– CL implementation
– Whenever a large collection of state elements is required.
• Types:
– RAM - random access memory
– ROM - read only memory
– EPROM, FLASH - electrically programmable read only memory
• Example RAM: Register file
– regid = register identifier
– sizeof(regid) = log2(# of reg)
– WE = write enable
Register File Internals
• Functionally the regfile is equivalent to a 2-D array of flip-flops:
• Cell with write logic:
How do we go from "regid" to "SEL"?
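The 2-D array view can be sketched behaviorally in Verilog. This is only an illustration, assuming 8 registers of 16 bits with one write port and one read port; the module name, port names, and sizes are not from the lecture.

```verilog
// Behavioral sketch of a small register file: a 2-D array of flip-flops.
// Sizes and names are assumptions for illustration only.
module regfile (clk, WE, wr_regid, rd_regid, wr_data, rd_data);
  parameter WIDTH  = 16;   // bits per register
  parameter NREGS  = 8;    // number of registers
  parameter IDBITS = 3;    // sizeof(regid) = log2(NREGS)

  input               clk, WE;
  input  [IDBITS-1:0] wr_regid, rd_regid;
  input  [WIDTH-1:0]  wr_data;
  output [WIDTH-1:0]  rd_data;

  reg [WIDTH-1:0] regs [0:NREGS-1];   // the array of state elements

  // Write: only the row named by wr_regid loads new data, and only when WE is high.
  always @(posedge clk)
    if (WE)
      regs[wr_regid] <= wr_data;

  // Read: combinational selection of one row.
  assign rd_data = regs[rd_regid];
endmodule
```

Indexing the array with a regid hides the decoder; in hardware that index becomes the SEL lines asked about above.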
Regid (address) Decoding
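One way to get from "regid" to "SEL" is a binary-to-one-hot decoder. The sketch below assumes a 3-bit regid (8 registers) and gates the select lines with WE; both choices, and the module name, are assumptions made for illustration.

```verilog
// 3-to-8 decoder sketch: turns a regid into one-hot SEL lines.
module regid_decoder (regid, WE, SEL);
  input  [2:0] regid;
  input        WE;
  output [7:0] SEL;

  // Shift a single 1 into the position named by regid. Gating with WE
  // means no row is selected unless a write is requested.
  assign SEL = WE ? (8'b00000001 << regid) : 8'b00000000;
endmodule
```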
Standard Internal Memory Organization
• Special circuit tricks are used for the cell array to improve storage density. (We will look at these later.)
• RAM/ROM naming convention, examples:
– 32 X 8, "32 by 8" => 32 8-bit words
– 1M X 1, "1 meg by 1" => 1M 1-bit words
Read Only Memory (ROM)
• Functional Equivalence:
• Of course, full tri-state buffers are not needed at each cell point.
• Single transistors are used to implement zero cells. Logic ones are derived through precharging or a bit-line pullup transistor.
Column MUX in ROMs and RAMs:
• Controls physical aspect ratio
• In DRAM, allows reuse of chip address pins
Cascading Memory Modules (or chips)
• example: 256 X 8 ROM using 256 X 4 parts:
• example: 1K X 8 ROM using 256 X 4 parts:
• each module has tri-state outputs:
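The cascading idea can be sketched structurally. The `rom256x4` part below is hypothetical (its contents and its output-enable pin name `oe` are assumptions); the point is that the upper address bits enable one row of parts, and the tri-state outputs let all rows share the same data wires.

```verilog
// Hypothetical 256 x 4 ROM part with a tri-state output (contents not shown).
module rom256x4 (addr, oe, data);
  input  [7:0] addr;
  input        oe;           // output enable
  output [3:0] data;

  reg [3:0] cells [0:255];   // contents would be loaded separately

  assign data = oe ? cells[addr] : 4'bz;
endmodule

// 1K x 8 ROM built from eight 256 x 4 parts (4 rows of 2 parts).
module rom1kx8 (addr, data);
  input  [9:0] addr;
  output [7:0] data;

  // The upper two address bits pick one of the four rows.
  wire [3:0] row_en = 4'b0001 << addr[9:8];

  // Each row is two parts side by side: one drives data[7:4], one data[3:0].
  rom256x4 r0_hi (addr[7:0], row_en[0], data[7:4]);
  rom256x4 r0_lo (addr[7:0], row_en[0], data[3:0]);
  rom256x4 r1_hi (addr[7:0], row_en[1], data[7:4]);
  rom256x4 r1_lo (addr[7:0], row_en[1], data[3:0]);
  rom256x4 r2_hi (addr[7:0], row_en[2], data[7:4]);
  rom256x4 r2_lo (addr[7:0], row_en[2], data[3:0]);
  rom256x4 r3_hi (addr[7:0], row_en[3], data[7:4]);
  rom256x4 r3_lo (addr[7:0], row_en[3], data[3:0]);
endmodule
```

A single row of this arrangement (two parts, always enabled) is the 256 X 8 case.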
Definitions
• Bandwidth:
Total amount of data transferred out of a device or across an interface per unit time (usually Bytes/sec).
• Latency:
A measure of the time from a request for a data transfer until the data is received.
Memory Interfaces for Accessing Data
• Asynchronous (unclocked):
A change in the address results in data appearing.
• Synchronous (clocked):
A change in address, followed by an edge on CLK, results in data appearing. Sometimes, multiple requests may be outstanding.
• Volatile:
Loses its state when the power goes off.
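The difference between the asynchronous and synchronous read interfaces can be sketched side by side. The module below is a behavioral illustration only; the 64 x 4 size and all names are assumptions, and the memory contents are left unspecified.

```verilog
// Behavioral contrast of asynchronous and synchronous read interfaces.
module read_interfaces (CLK, addr, async_data, sync_data);
  input        CLK;
  input  [5:0] addr;
  output [3:0] async_data, sync_data;
  reg    [3:0] sync_data;

  reg [3:0] mem [0:63];   // contents not shown

  // Asynchronous: data follows any change in the address.
  assign async_data = mem[addr];

  // Synchronous: the address is sampled on the clock edge, and the
  // data appears only after that edge.
  always @(posedge CLK)
    sync_data <= mem[addr];
endmodule
```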
Example Memory Components:
• Volatile:
– Random Access Memory (RAM):
• DRAM "dynamic"
• SRAM "static"
• Non-volatile:
– Read Only Memory (ROM):
• Mask ROM "mask programmable"
• EPROM "erasable programmable"
• EEPROM "electrically erasable programmable"
• FLASH memory - similar to EEPROM with programmer integrated on chip
Volatile Memory Comparison
[Figure: SRAM and DRAM cell schematics, labeled with word lines and bit lines]
• SRAM Cell
– Larger cell => lower density, higher cost/bit
– No refresh required
– Simple read => faster access
– Standard IC process => natural for integration with logic
• DRAM Cell
– Smaller cell => higher density, lower cost/bit
– Needs periodic refresh, and refresh after read
– Complex read => longer access time
– Special IC process => difficult to integrate with logic circuits
In Desktop Computer Systems:
• SRAM (lower density, higher speed) used in CPU register file, on- and off-chip caches.
• DRAM (higher density, lower speed) used in main memory
• Closing the GAP: Innovation targeted towards higher bandwidth for memory systems:
– SDRAM - synchronous DRAM
– RDRAM - Rambus DRAM
– EDORAM - extended data out DRAM
– Three-dimensional RAM
– hyper-page mode DRAM video RAM
– multibank DRAM
Important DRAM Examples:
• EDO - extended data out (similar to fast-page mode)
– RAS cycle fetches rows of data from cell array blocks (long access time, around 100ns)
– Subsequent CAS cycles quickly access data from row buffers if within an address page (page is around 256 Bytes)
• SDRAM - synchronous DRAM
– clocked interface
– uses dual banks internally. Start access in one bank, then the next; then receive data from the first, then the second.
• DDR - Double data rate SDRAM
– Uses both rising (positive) and falling (negative) edges of the clock for data transfer (typically a 100 MHz clock with 200 MHz transfer).
• RDRAM - Rambus DRAM
– Entire data blocks are accessed and transferred out on a high-speed bus-like interface (500 MB/s, 1.6 GB/s)
– Tricky system level design. More expensive memory chips.
Non-volatile Memory
Used to hold fixed code (ex. BIOS), tables of data (ex. FSM next state/output logic), slowly changing values (date/time on computer)
• Mask ROM
– Used with logic circuits for tables etc.
– Contents fixed at IC fab time (truly write once!)
• EPROM (erasable programmable) & FLASH
– requires special IC process (floating gate technology)
– writing is slower than RAM. EPROM uses a special programming system to provide special voltages and timing.
– reading can be made fairly fast.
– rewriting is very slow, because erasure is required first (for EPROM, by UV light exposure).
FLASH Memory
• Electrically erasable
• In-system programmability and erasability (no special system or voltages needed)
• On-chip circuitry (FSM) to control erasure and programming (writing)
• Erasure happens in variable-sized "sectors" in a flash (16K - 64K Bytes)
See: http://developer.intel.com/design/flash/ for product descriptions, etc.
Relationship between Memory and CL
• Memory blocks can be (and often are) used to implement combinational logic functions:
• Examples:
– LUTs in FPGAs
– 1Mbit x 8 EPROM can implement 8 independent functions each of log2(1M)=20 inputs.
• The decoder part of a memory block can be considered a “minterm generator”.
• The cell array part of a memory block can be considered an OR function over a subset of rows.
• The combination gives us a way to implement logic functions directly in sum of products form.
• Several variations on this theme exist in a set of devices called Programmable logic devices (PLDs)
A ROM as AND/OR Logic Device
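As a small illustration of a memory block used as combinational logic, the sketch below stores the truth table of a full adder in an 8 x 2 lookup. The choice of function and all names are assumptions; each case label plays the role of one decoder minterm, and each output bit is the OR of the rows where that bit is 1.

```verilog
// An 8 x 2 "ROM" holding the truth table of a full adder.
// Address = {a, b, cin}; stored word = {cout, sum}.
module rom_full_adder (a, b, cin, sum, cout);
  input  a, b, cin;
  output sum, cout;

  reg [1:0] word;

  always @(a or b or cin)
    case ({a, b, cin})          // one row per minterm
      3'b000: word = 2'b00;
      3'b001: word = 2'b01;
      3'b010: word = 2'b01;
      3'b011: word = 2'b10;
      3'b100: word = 2'b01;
      3'b101: word = 2'b10;
      3'b110: word = 2'b10;
      3'b111: word = 2'b11;
    endcase

  assign {cout, sum} = word;
endmodule
```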
PLD Summary
PLA Example
PAL Example
Memory Blocks in FPGAs
• LUTs can double as small RAM blocks:
– 4-LUT is a 16x1 memory
– achieves 16x density advantage over using CLB flip-flops
• Newer FPGA families include additional on-chip RAM blocks (usually dual ported)
– Called “block-rams” in Xilinx Virtex series
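A 16x1 memory of the kind a LUT can provide might be described as below. This is a behavioral sketch with assumed names; whether a synthesis tool actually maps it onto LUT RAM rather than flip-flops depends on the tool and the target device.

```verilog
// 16 x 1 RAM sketch: synchronous write, asynchronous read.
module lut_ram16x1 (clk, we, addr, din, dout);
  input        clk, we;
  input  [3:0] addr;
  input        din;
  output       dout;

  reg [15:0] bits;             // the 16 storage bits of the LUT

  always @(posedge clk)
    if (we)
      bits[addr] <= din;       // write one bit on the clock edge

  assign dout = bits[addr];    // read behaves like an ordinary LUT lookup
endmodule
```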
Memory Specification in Verilog
• Memory modeled by an array of registers:
reg [15:0] memword [0:1023]; // 1,024 registers of 16 bits each
// Example Memory Block Specification
// ----------------------------
// Read and write operations of memory.
// Memory size is 64 words of 4 bits each.
module memory (Enable, ReadWrite, Address, DataIn, DataOut);
  input Enable, ReadWrite;
  input [3:0] DataIn;
  input [5:0] Address;
  output [3:0] DataOut;
  reg [3:0] DataOut;
  reg [3:0] Mem [0:63]; // 64 x 4 memory
  // Address and DataIn are included in the sensitivity list so that
  // reads and writes track their changes while the block is enabled.
  always @ (Enable or ReadWrite or Address or DataIn)
    if (Enable) begin
      if (ReadWrite)
        DataOut = Mem[Address]; // Read
      else
        Mem[Address] = DataIn; // Write
    end
    else
      DataOut = 4'bz; // High impedance state
endmodule
Error Correction Codes (ECC)
• Memory systems generate errors (accidentally flipped bits)
– DRAMs store very little charge per bit
– “Soft” errors occur occasionally when cells are struck by alpha particles or other environmental upsets.
– Less frequently, “hard” errors can occur when chips permanently fail.
• Where “perfect” memory is required
– servers, spacecraft/military computers, …
• Memories are protected against failures with ECCs
• Extra bits are added to each data-word
– extra bits are used to detect and/or correct faults in the memory system
– in general, each possible data word value is mapped to a unique “code word”. A fault changes a valid code word to an invalid one, which can be detected.
Simple Error Detection Coding
Parity Bit
• Each data value, before it is written to memory, is “tagged” with an extra bit to force the stored word to have even parity:
stored word = b7 b6 b5 b4 b3 b2 b1 b0 p, where p = b7 ⊕ b6 ⊕ ... ⊕ b0
• Each word, as it is read from memory, is “checked” by finding its parity (including the parity bit):
c = b7 ⊕ b6 ⊕ ... ⊕ b0 ⊕ p
• A non-zero parity indicates an error occurred:
– two errors (on different bits) are not detected (nor is any even number of errors)
– odd numbers of errors are detected.
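The tagging and checking steps map directly onto Verilog's reduction XOR operator. The two modules below are a minimal sketch for an 8-bit data value; the module and port names are assumptions.

```verilog
// On write: compute the even-parity bit for an 8-bit data value.
module parity_gen (b, p);
  input  [7:0] b;
  output       p;

  assign p = ^b;      // XOR of all data bits; makes the 9-bit word even
endmodule

// On read: check the stored 9-bit word {b7..b0, p}.
module parity_check (word, c);
  input  [8:0] word;
  output       c;

  assign c = ^word;   // 0 = parity is even (OK), 1 = odd number of bit errors
endmodule
```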
Hamming Error Correcting Code
• Use more parity bits to pinpoint bit(s) in error, so they can be corrected.
• Example: SEC on 4-bit data
– use 3 parity bits, with 4 data bits; results in a 7-bit code word
– 3 parity bits are sufficient to identify any one of 7 code word bits
– overlap the assignment of parity bits so that a single error in the 7-bit word can be corrected
• Group parity bits so they correspond to subsets of the 7 bits:
– p1 protects bits 1,3,5,7
– p2 protects bits 2,3,6,7
– p3 protects bits 4,5,6,7
Bit position number:  1  2  3  4  5  6  7
Code word bit:        p1 p2 d1 p3 d2 d3 d4
Positions in each group (binary = decimal):
p1: 001 = 1, 011 = 3, 101 = 5, 111 = 7
p2: 010 = 2, 011 = 3, 110 = 6, 111 = 7
p3: 100 = 4, 101 = 5, 110 = 6, 111 = 7
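The group assignment above can be sketched as an encoder for the 7-bit code word. Vectors are indexed from 1 so that `code[k]` is bit position k in the table; the module name and this indexing convention are assumptions for illustration.

```verilog
// Hamming (7,4) encoder sketch using the slide's bit positions.
module hamming74_encode (d, code);
  input  [4:1] d;       // d[1]..d[4] are data bits d1..d4
  output [7:1] code;    // code[k] is bit position k of the code word

  // Each parity bit forces even parity over its group.
  wire p1 = d[1] ^ d[2] ^ d[4];   // group 1,3,5,7 (d1, d2, d4)
  wire p2 = d[1] ^ d[3] ^ d[4];   // group 2,3,6,7 (d1, d3, d4)
  wire p3 = d[2] ^ d[3] ^ d[4];   // group 4,5,6,7 (d2, d3, d4)

  // Positions 7 down to 1: d4 d3 d2 p3 d1 p2 p1
  assign code = {d[4], d[3], d[2], p3, d[1], p2, p1};
endmodule
```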
Hamming Code Example
Bit position number:  1  2  3  4  5  6  7
Code word bit:        p1 p2 d1 p3 d2 d3 d4
– Note: parity bits occupy power-of-two bit positions in the code word.
– On writing, parity bits are assigned to force even parity over their respective groups.
– On reading, check bits (c1,c2,c3) are generated by finding the parity of each group along with its parity bit. If an error occurred in a group, the corresponding check bit will be 1; if no error, the check bit will be 0.
• Example: c = c3c2c1 = 101
– error in 4,5,6, or 7 (by c3=1)
– error in 1,3,5, or 7 (by c1=1)
– no error in 2, 3, 6, or 7 (by c2=0)
• Therefore the error must be in bit 5. Note that the check bits, read as the binary number c3c2c1, point to 5.
• By our clever positioning and assignment of parity bits, the check bits always address the position of the error!
• c=000 indicates no error
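The read-side check and correction can be sketched as follows. The module is an illustration with assumed names; vectors are indexed from 1 so that index k is bit position k, and the check bits are read as the binary number c3c2c1.

```verilog
// Hamming (7,4) single-error corrector sketch.
module hamming74_correct (code_in, code_out, c);
  input  [7:1] code_in;    // code word as read from memory
  output [7:1] code_out;   // code word with a single-bit error repaired
  output [3:1] c;          // check bits c3 c2 c1

  // Parity of each group, including its own parity bit.
  assign c[1] = code_in[1] ^ code_in[3] ^ code_in[5] ^ code_in[7];
  assign c[2] = code_in[2] ^ code_in[3] ^ code_in[6] ^ code_in[7];
  assign c[3] = code_in[4] ^ code_in[5] ^ code_in[6] ^ code_in[7];

  // One-hot mask with a 1 at position c. Position 0 does not exist in the
  // code word, so c = 000 leaves the word unchanged.
  wire [7:0] mask = 8'b00000001 << c;

  assign code_out = code_in ^ mask[7:1];   // flip the addressed bit
endmodule
```

With the slide's example, c = 101 sets mask bit 5, so only bit position 5 is inverted.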
Hamming Error Correcting Code
• Overhead involved in single error correction code:
– let p be the total number of parity bits and d the number of data bits in a p + d bit word.
– If p error correction bits are to point to the error bit (p + d cases) plus indicate that no error exists (1 case), we need:
2^p >= p + d + 1,
thus p >= log2(p + d + 1)
for large d, p approaches log2(d)
– For example, with d = 4, p = 3 works because 2^3 = 8 >= 3 + 4 + 1.
• Adding one extra parity bit covering the entire word can provide double error detection:
Bit position number:  1  2  3  4  5  6  7  8
Code word bit:        p1 p2 d1 p3 d2 d3 d4 p4
• On reading, the C bits are computed (as usual) plus the parity over the entire word, P:
C=0 P=0, no error
C!=0 P=1, correctable single error
C!=0 P=0, a double error occurred
C=0 P=1, an error occurred in p4 bit
• Typical modern codes in DRAM memory systems: 64-bit data blocks (8 bytes) with 72-bit code words (9 bytes). With d = 64, p = 7 satisfies 2^7 = 128 >= 7 + 64 + 1, and one more bit for the overall parity gives 8 check bits.
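The four read-side cases for the extended code can be sketched as a small classifier. It assumes C (the three check bits) and P (the parity over the entire 8-bit word) have already been computed; the module and output names are assumptions.

```verilog
// SEC-DED status decode for the 8-bit extended code word.
module secded_status (C, P, no_error, single_error, double_error, p4_error);
  input  [2:0] C;    // check bits over the three parity groups
  input        P;    // parity over the entire word, including p4
  output       no_error, single_error, double_error, p4_error;

  wire c_nonzero = |C;

  assign no_error     = ~c_nonzero & ~P;   // C=0,  P=0
  assign single_error =  c_nonzero &  P;   // C!=0, P=1: correctable, C gives position
  assign double_error =  c_nonzero & ~P;   // C!=0, P=0: detected, not correctable
  assign p4_error     = ~c_nonzero &  P;   // C=0,  P=1: only p4 itself flipped
endmodule
```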