Computer Architecture
PC Structure and Peripherals
Dr. Lihu Rappoport
DRAM
Basic DRAM chip
[Block diagram: the multiplexed address bus feeds a row address latch (strobed by RAS#) and a column address latch (strobed by CAS#); the row address decoder selects a row of the memory array, the column address decoder selects the column, and the selected cell connects to the data pins.]
DRAM access sequence
Put row address on the address bus and assert RAS# (Row Address Strobe) to latch the row
Put column address on the address bus and assert CAS# (Column Address Strobe) to latch the column
Get the data on the data bus
DRAM Operation
DRAM cell consists of transistor + capacitor
The capacitor keeps the state; the transistor guards access to it
Reading cell state:
raise the access line AL and sense the data line DL
[Diagram: 1T1C cell – transistor M, gated by access line AL, connects capacitor C to data line DL]
• Capacitor charged ⇒ current flows on the data line DL
Writing cell state:
set DL and raise AL to charge/drain the capacitor
Charging and draining a capacitor is not instantaneous
Leakage current drains the capacitor even when the transistor is closed
Each DRAM cell is therefore periodically refreshed, every 64ms
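As a back-of-the-envelope illustration of what the 64ms requirement costs, the sketch below assumes an 8192-row array and a ~70ns per-row refresh; both values are assumptions for illustration, not from the slides:

    # Hypothetical refresh-overhead estimate for a DRAM refreshed row by row
    REFRESH_PERIOD_MS = 64      # every cell must be refreshed within 64 ms
    NUM_ROWS = 8192             # assumed array size
    T_REFRESH_ROW_NS = 70       # assumed time to refresh one row (~tRC)

    interval_us = REFRESH_PERIOD_MS * 1e3 / NUM_ROWS   # gap between row refreshes
    overhead = (NUM_ROWS * T_REFRESH_ROW_NS) / (REFRESH_PERIOD_MS * 1e6)

    print(f"one row refreshed every {interval_us:.1f} us")    # ~7.8 us
    print(f"refresh consumes {overhead:.2%} of device time")  # ~0.9%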
DRAM Access Sequence Timing
[Timing diagram: Row i is driven on A[0:7] as RAS# falls; after tRCD (RAS/CAS delay), Col n is driven as CAS# falls; Data n appears after CL (CAS latency); tRP (row precharge) must elapse before RAS# can fall again with Row j.]
Put row address on the address bus and assert RAS#
Wait the RAS#-to-CAS# delay (tRCD) between asserting RAS# and CAS#
Put column address on the address bus and assert CAS#
Wait the CAS latency (CL) between the time CAS# is asserted and data ready
Row precharge time (tRP): the time to close the current row and open a new row
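To make the sequence concrete, here is a small latency sketch (mine, with illustrative timing values, all in ns) covering a page hit, a page miss, and an access to an idle bank:

    # Illustrative DRAM access latency using assumed timing values (ns)
    tRP_ns  = 20   # row precharge: close the current row
    tRCD_ns = 20   # RAS#-to-CAS# delay: row open -> column command
    tCL_ns  = 15   # CAS latency, expressed here in ns for simplicity

    def access_latency_ns(row_already_open: bool, row_hit: bool) -> int:
        if row_already_open and row_hit:
            return tCL_ns                       # page hit: column access only
        if row_already_open:
            return tRP_ns + tRCD_ns + tCL_ns    # page miss: close, open, read
        return tRCD_ns + tCL_ns                 # bank idle: open row, read

    print(access_latency_ns(True, True))    # 15
    print(access_latency_ns(True, False))   # 55
    print(access_latency_ns(False, False))  # 35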
DRAM controller
The DRAM controller gets an address and a command
Splits the address into row and column
Generates the DRAM control signals with the proper timing
[Block diagram: an address decoder on A[20:23] generates the chip select; a time-delay generator sequences RAS#, CAS#, and the mux select; an address mux drives A[10:19] (row) and then A[0:9] (column) onto the memory address bus; the DRAM exchanges data on D[0:7] under control of R/W#.]
DRAM data must be periodically refreshed
The DRAM controller performs the DRAM refresh, using a refresh counter
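The address-splitting step is simple arithmetic; a minimal sketch following the diagram's A[0:9] column / A[10:19] row / A[20:23] chip-select partition (the controller does this in hardware, of course):

    # Split a 24-bit physical address per the controller diagram:
    # A[0:9] column, A[10:19] row, A[20:23] chip select
    def split_address(addr: int):
        col  = addr         & 0x3FF   # 10 column bits
        row  = (addr >> 10) & 0x3FF   # 10 row bits
        chip = (addr >> 20) & 0xF     # 4 chip-select bits
        return chip, row, col

    chip, row, col = split_address(0x8BF2A7)
    print(f"chip={chip} row={row:#x} col={col:#x}")  # chip=8 row=0x2fc col=0x2a7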
Improved DRAM Schemes
Paged Mode DRAM
– Multiple accesses to different columns of the same row
– Saves the RAS# assertion and the RAS-to-CAS delay
[Timing diagram: RAS# stays asserted with a single Row latched; successive CAS# pulses latch Col n, Col n+1, Col n+2, and Data n, D n+1, D n+2 follow one after another.]
Extended Data Output RAM (EDO RAM)
– A data output latch allows the next column address to be presented in parallel with the current column's data
[Timing diagram: as in page mode, but the address for Col n+1 is driven while Data n is still on the data bus, so column accesses overlap data transfers.]
Improved DRAM Schemes (cont.)
Burst DRAM
– Generates the consecutive column addresses by itself
[Timing diagram: a single RAS#/CAS# pair with Row and Col n; Data n, Data n+1, Data n+2 stream out on consecutive cycles without further column addresses.]
Synchronous DRAM – SDRAM
All signals are referenced to an external clock (100MHz-200MHz)
Makes timing more precise with other system devices
4 banks – multiple pages open simultaneously (one per bank)
Command driven functionality instead of signal driven
Burst oriented read and write accesses
ACTIVE: selects both the bank and the row to be activated
• ACTIVE to a new bank can be issued while accessing current bank
READ/WRITE: selects the column
Successive column locations accessed in the given row
Burst length is programmable: 1, 2, 4, 8, and full-page
• May end full-page burst by BURST TERMINATE to get arbitrary burst length
A user-programmable Mode Register (see the sketch after this list)
CAS latency, burst length, burst type
Auto precharge: may automatically close the row after the last read/write of a burst
Auto refresh: internal counters generate refresh address
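As an illustration of how those Mode Register fields might be packed into the word written over the address pins, here is a sketch following the common JEDEC SDR SDRAM layout (A[2:0] burst length, A3 burst type, A[6:4] CAS latency); treat the exact bit positions as an assumption, not as this course's definitive register map:

    # Sketch: pack SDRAM mode-register fields (assumed JEDEC-style layout:
    # A[2:0] burst length, A3 burst type, A[6:4] CAS latency)
    BURST_LEN = {1: 0b000, 2: 0b001, 4: 0b010, 8: 0b011}  # full page: 0b111

    def mode_register(burst_length: int, interleaved: bool, cas_latency: int) -> int:
        assert cas_latency in (2, 3)
        word = BURST_LEN[burst_length]
        word |= (1 << 3) if interleaved else 0   # 0 = sequential burst order
        word |= cas_latency << 4
        return word

    print(bin(mode_register(burst_length=8, interleaved=False, cas_latency=2)))
    # 0b100011 -> CL=2, sequential, BL=8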
SDRAM Timing
[Timing diagram (BL = 1, CL = 2): ACT to Bank 0 Row i; after tRCD > 20ns, RD Col j and RD+PC Col k to Bank 0; ACT to Bank 1 Row m can issue tRRD > 20ns after the first ACT, followed by RD Col n; a new ACT to Bank 0 Row l must wait tRC > 70ns from the previous ACT to Bank 0, then RD Col q; Data j, k, n, q each appear CL = 2 cycles after the corresponding RD.]
tRCD: ACTIVE-to-READ/WRITE gap, in cycles: ⌈tRCD(MIN) / clock period⌉
tRC: minimum time between successive ACTIVE commands to different rows in the same bank
tRRD: minimum time between successive ACTIVE commands to different banks
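Because commands issue only on clock edges, the controller rounds each nanosecond constraint up to whole cycles. A small sketch using the >20ns and >70ns values above:

    import math

    def ns_to_cycles(t_ns: float, clock_mhz: float) -> int:
        cycle_ns = 1000.0 / clock_mhz
        return math.ceil(t_ns / cycle_ns)   # round up to whole clock cycles

    for clock in (100, 200):                # SDRAM clock range from the slide
        print(clock, "MHz:",
              "tRCD =", ns_to_cycles(20, clock), "cycles,",
              "tRC =", ns_to_cycles(70, clock), "cycles")
    # 100 MHz: tRCD = 2 cycles, tRC = 7 cycles
    # 200 MHz: tRCD = 4 cycles, tRC = 14 cycles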
DDR-SDRAM
2n-prefetch architecture
DRAM cells are clocked at the same speed as SDR SDRAM cells
Internal data bus is twice the width of the external data bus
Data capture occurs twice per clock cycle
• Lower half of the bus sampled at clock rise
• Upper half of the bus sampled at clock fall
[Block diagram: the SDRAM array delivers 2n bits (0:2n-1) per access; the two n-bit halves (0:n-1 and n:2n-1) are multiplexed onto the n-bit external bus, which at a 200MHz clock yields 400M transfers/sec.]
Uses 2.5V (vs. 3.3V in SDRAM)
Reduced power consumption
DDR SDRAM Timing
[Timing diagram (133MHz clock, CL = 2): ACT to Bank 0 Row i; RD Col j after tRCD > 20ns; ACT to Bank 1 Row m after tRRD > 20ns, then RD Col n; ACT to Bank 0 Row l after tRC > 70ns; each RD returns a burst of 4 (Data j, +1, +2, +3) at two transfers per clock.]
DIMMs
DIMM: Dual In-line Memory Module
A small circuit board that holds memory chips
64-bit wide data path (72 bits with parity)
Single sided: 9 chips, each with an 8-bit data bus
Dual sided: 18 chips, each with a 4-bit data bus
Data BW: 64 bits on each rising and falling edge of the clock
Other pins
Address – 14, RAS, CAS, chip select – 4, VDC – 17, Gnd – 18, clock – 4, serial address – 3, …
DDR Standards
DRAM timing, measured in I/O bus cycles, is specified by 3 numbers
CAS Latency – RAS-to-CAS Delay – RAS Precharge Time (CL-tRCD-tRP)
CAS latency (the latency to get data from an open page) in nsec:
CAS Latency × I/O bus cycle time
| Standard name | Mem clock (MHz) | I/O bus clock (MHz) | Cycle time (ns) | Data rate (MT/s) | VDDQ (V) | Module name | Peak transfer rate (MB/s) | Timings (CL-tRCD-tRP) | CAS latency (ns) |
| DDR-200 | 100  | 100  | 10  | 200  | 2.5 | PC-1600 | 1600  | —                       | —              |
| DDR-266 | 133⅓ | 133⅓ | 7.5 | 266⅔ | 2.5 | PC-2100 | 2133⅓ | —                       | —              |
| DDR-333 | 166⅔ | 166⅔ | 6   | 333⅓ | 2.5 | PC-2700 | 2666⅔ | —                       | —              |
| DDR-400 | 200  | 200  | 5   | 400  | 2.6 | PC-3200 | 3200  | 2.5-3-3 / 3-3-3 / 3-4-4 | 12.5 / 15 / 15 |
Total BW for DDR-400
3200 MByte/sec = 64 bit × 2 × 200MHz / 8 (bits/byte)
6400 MByte/sec for dual-channel DDR SDRAM
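The derived columns above (peak rate, CAS latency in ns) follow directly from the I/O clock; a small sketch reproducing them:

    # Reproduce the DDR table's derived columns from the I/O clock
    def peak_mb_per_s(io_clock_mhz: float, bus_bits: int = 64) -> float:
        # two transfers per clock, 8 bits per byte
        return io_clock_mhz * 2 * bus_bits / 8

    def cas_latency_ns(cl_cycles: float, io_clock_mhz: float) -> float:
        return cl_cycles * 1000.0 / io_clock_mhz

    print(peak_mb_per_s(200))        # 3200.0 -> PC-3200 (DDR-400)
    print(peak_mb_per_s(200) * 2)    # 6400.0 -> dual channel
    print(cas_latency_ns(3, 200))    # 15.0 ns
    print(cas_latency_ns(2.5, 200))  # 12.5 ns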
DDR2
DDR2 doubles the bandwidth
A DDR2-533 cell works at the same frequency as a DDR-266 cell or a PC133 cell
[Block diagram, repeated for successive DRAM generations: memory cell array → I/O buffers → data bus; each generation fetches more bits per internal access to feed a faster external bus.]
4n prefetch: internally reads/writes 4× the amount of data of the external bus
• Prefetching increases latency
Smaller page size: 1KB vs. 2KB
• Reduces activation power – the ACTIVATE command reads all bits in the page
8 banks in 1Gb densities and above
• Improves random access
1.8V (vs. 2.5V) operating voltage
• Significantly lower power
DDR2 Standards
| Standard name | Mem clock (MHz) | Cycle time | I/O bus clock (MHz) | Data rate (MT/s) | Module name | Peak transfer rate | Timings (CL-tRCD-tRP) | CAS latency (ns) |
| DDR2-400  | 100 | 10 ns   | 200 | 400  | PC2-3200 | 3200 MB/s | 3-3-3 / 4-4-4         | 15 / 20          |
| DDR2-533  | 133 | 7.5 ns  | 266 | 533  | PC2-4200 | 4266 MB/s | 3-3-3 / 4-4-4         | 11.25 / 15       |
| DDR2-667  | 166 | 6 ns    | 333 | 667  | PC2-5300 | 5333 MB/s | 4-4-4 / 5-5-5         | 12 / 15          |
| DDR2-800  | 200 | 5 ns    | 400 | 800  | PC2-6400 | 6400 MB/s | 4-4-4 / 5-5-5 / 6-6-6 | 10 / 12.5 / 15   |
| DDR2-1066 | 266 | 3.75 ns | 533 | 1066 | PC2-8500 | 8533 MB/s | 6-6-6 / 7-7-7         | 11.25 / 13.125   |
DDR3
30% power consumption reduction compared to DDR2
1.5V supply voltage, compared to DDR2's 1.8V
90 nanometer fabrication technology
Higher bandwidth
8-bit deep prefetch buffer (vs. 4-bit in DDR2 and 2-bit in DDR)
Transfer data rate
Effective clock rate of 800–1600 MHz, using both rising and falling edges of a 400–800 MHz I/O clock
DDR2: 400–800 MHz using a 200–400 MHz I/O clock
DDR: 200–400 MHz using a 100–200 MHz I/O clock
DDR3 DIMMs
240 pins, the same number as DDR2, and the same size
Electrically incompatible, with a different key notch location
DDR3 Standards
| Standard name | Mem clock (MHz) | I/O bus clock (MHz) | I/O bus cycle time (ns) | Data rate (MT/s) | Module name | Peak transfer rate (MB/s) | Timings (CL-tRCD-tRP) | CAS latency (ns) |
| DDR3-800  | 100  | 400   | 2.5    | 800   | PC3-6400  | 6400   | 5-5-5 / 6-6-6             | 12 1⁄2 / 15               |
| DDR3-1066 | 133⅓ | 533⅓  | 1.875  | 1066⅔ | PC3-8500  | 8533⅓  | 6-6-6 / 7-7-7 / 8-8-8     | 11 1⁄4 / 13 1⁄8 / 15      |
| DDR3-1333 | 166⅔ | 666⅔  | 1.5    | 1333⅓ | PC3-10600 | 10666⅔ | 8-8-8 / 9-9-9             | 12 / 13 1⁄2               |
| DDR3-1600 | 200  | 800   | 1.25   | 1600  | PC3-12800 | 12800  | 9-9-9 / 10-10-10 / 11-11-11 | 11 1⁄4 / 12 1⁄2 / 13 3⁄4 |
| DDR3-1866 | 233⅓ | 933⅓  | 1.07   | 1866⅔ | PC3-14900 | 14933⅓ | 11-11-11 / 12-12-12       | 11 11⁄14 / 12 6⁄7         |
| DDR3-2133 | 266⅔ | 1066⅔ | 0.9375 | 2133⅓ | PC3-17000 | 17066⅔ | 12-12-12 / 13-13-13       | 11 1⁄4 / 12 3⁄16          |
DDR2 vs. DDR3 Performance
The high latency of DDR3 SDRAM has a negative effect on streaming operations
[Benchmark chart comparing DDR2 and DDR3 on streaming workloads]
Source: xbitlabs
SRAM – Static RAM
True random access
High speed, low density, high power
No refresh
Address not multiplexed
DDR SRAM
2 READs or 2 WRITEs per clock
Common or Separate I/O
DDRII: 200MHz to 333MHz Operation; Density: 18/36/72Mb+
QDR SRAM
Two separate DDR ports: one read and one write
One DDR address bus: alternating between the read address and the write address
QDRII: 250MHz to 333MHz Operation; Density: 18/36/72Mb+
SRAM vs. DRAM
Random Access: access time is the same for all locations
|               | DRAM – Dynamic RAM       | SRAM – Static RAM       |
| Refresh       | Refresh needed           | No refresh needed       |
| Address       | Muxed: row + column      | Not multiplexed         |
| Access        | Not true "Random Access" | True "Random Access"    |
| Density       | High (1 transistor/bit)  | Low (6 transistors/bit) |
| Power         | Low                      | High                    |
| Speed         | Slow                     | Fast                    |
| Price/bit     | Low                      | High                    |
| Typical usage | Main memory              | Cache                   |
Read Only Memory (ROM)
Random Access
Non volatile
ROM Types
PROM – Programmable ROM
• Burnt once using special equipment
EPROM – Erasable PROM
• Can be erased by exposure to UV, and then reprogrammed
E2PROM – Electrically Erasable PROM
• Can be erased and reprogrammed on board
• Write time (programming) much longer than RAM
• Limited number of writes (thousands)
Hard Disks
Hard Disk Structure
Rotating platters coated with a magnetic surface
Each platter is divided into tracks: concentric circles
Each track is divided into sectors
• The smallest unit that can be read or written
Outer tracks have more space for sectors than inner tracks
• Constant bit density: record more sectors on the outer tracks
• Speed varies with track location
Moveable read/write head
Radial movement to access all tracks
Platter rotation to access all sectors
Buffer cache
A temporary data storage area used to enhance drive performance
[Figure: platters, tracks, and sectors, with the moveable read/write head]
Disk Access
Seek: position the head over the proper track
Average seek time: the sum of the times for all possible seeks / the total number of possible seeks
Due to locality of disk reference, the actual average seek is shorter: 4 to 12 ms
Rotational latency: wait for the desired sector to rotate under the head
The faster the drive spins, the shorter the rotational latency
Most disks rotate at 5,400 to 15,000 RPM
• At 7200 RPM: 8 ms per revolution
On average, the desired information is halfway around the disk
• At 7200 RPM: 4 ms
Transfer block: read/write the data
Transfer time is a function of sector size, rotation speed, and recording density (bits per inch on a track)
Typical values: 100 MB/sec
Disk Access Time = Seek time + Rotational latency + Transfer time + Controller time + Queuing delay
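Plugging typical values into the formula (the seek, block size, and controller numbers are assumed for illustration):

    # Illustrative disk access time using the formula above
    rpm           = 7200
    seek_ms       = 8.0      # assumed average seek
    block_kb      = 4        # assumed transfer size
    transfer_mbs  = 100.0    # typical sustained rate from the slide
    controller_ms = 0.2      # assumed; queuing delay ignored here

    rotation_ms = 60_000.0 / rpm       # ~8.33 ms per revolution
    latency_ms  = rotation_ms / 2      # on average, half a revolution
    transfer_ms = block_kb / 1024 / transfer_mbs * 1000

    total = seek_ms + latency_ms + transfer_ms + controller_ms
    print(f"{total:.2f} ms")           # ~12.4 ms, dominated by seek + rotation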
EIDE Disk Interface
EIDE, PATA, UltraATA, ATA 100, ATAPI: all the same interface
Used for connecting hard disk drives and CD/DVD drives
80-conductor cable, 40-pin dual-header connector
100 MB/s
EIDE controller integrated on the motherboard
EIDE controller has two channels
Primary and secondary, which work independently
Two devices per channel: master and slave, but equal
• The 2 devices take turns controlling the bus
If there are two devices in the system (e.g., a hard disk and a CD/DVD)
• It is better to put them on different channels
Avoid mixing slower (DVD) and faster (HDD) devices on the same channel
If doing a lot of copying from drive to drive
• Better performance by putting the devices on separate channels
Disk Interface – Serial ATA (SATA)
Point-to-point connection
Easier routing, easier installation, better reliability, improved airflow
1/6 the board area compared to an EIDE connector
4 wires for signaling + 3 ground to minimize impedance and crosstalk
Thinner (7 wires), flexible, longer cables
No master/slave jumper configuration needed when adding a 2nd SATA drive
Increased BW
Dedicated BW per device (no sharing)
SATA rev 1: 150 MB/sec
SATA rev 2: 300 MB/sec
SATA rev 3: 600 MB/sec
Current HDDs still do not utilize the SATA rev 3 BW
• HDD peak (not sustained) rate gets to 157 MB/s
• SSD gets to 250 MB/sec
Flash Memory
Flash is a non-volatile, rewritable memory
NOR Flash
Supports per-byte data read and write (random access)
• Erasing (setting all the bits) done only at block granularity (64-128KB)
• Writing (clearing a bit) can be done at byte granularity
Suitable for storing code (e.g. BIOS, cell phone firmware)
NAND Flash
Supports page-mode read and write (0.5KB – 4KB per page)
• Erasing (setting all the bits) done only at block granularity (64-128KB)
Reduced erase and write times
Greater storage density and lower cost per bit
Suitable for storing large data (e.g. pictures, songs)
• Similar to other secondary data storage devices
Flash Memory Principles of Operation
Information is stored in an array of memory cells
In single-level cell (SLC) devices, each cell stores one bit
Multi-level cell (MLC) devices store multiple bits per cell, using multiple levels of electrical charge
Each memory cell is made from a floating-gate transistor
Resembles a standard MOSFET, but with two gates instead of one
• A control gate (CG), as in other MOS transistors, placed on top
• A floating gate (FG), interposed between the CG and the MOSFET channel
The FG is insulated all around by an oxide layer ⇒ electrons placed on it are trapped
• Under normal conditions, it will not discharge for many years
When the FG holds a charge, it partially cancels the electric field from the CG
• Modifies the cell's threshold voltage (VT): more voltage has to be applied to the CG to make the channel conduct
Read-out: apply a voltage intermediate between the possible threshold voltages to the CG
• Test the channel's conductivity by sensing the current flow through the channel
• In an MLC device, sense the amount of current flow
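A sketch of the read-out rule for a 2-bit MLC cell: the sensed threshold is classified into one of four windows (the boundary voltages and the level-to-bits mapping are assumptions for illustration; real devices sense current):

    # Sketch: classify a sensed threshold voltage into one of four MLC levels.
    # Level boundaries and the bit mapping are assumed for illustration.
    import bisect

    VT_BOUNDARIES = [1.0, 2.0, 3.0]           # volts between the 4 charge states
    LEVEL_TO_BITS = ["11", "10", "01", "00"]  # erased cell reads as all-ones

    def read_cell(sensed_vt: float) -> str:
        level = bisect.bisect(VT_BOUNDARIES, sensed_vt)
        return LEVEL_TO_BITS[level]

    print(read_cell(0.4))   # '11' (no charge on the FG)
    print(read_cell(2.6))   # '01' (more charge -> higher VT)

An SLC device needs only one boundary and a single sense; MLC trades read margin for density.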
Flash Write Endurance
Typical number of write cycles
|            | SLC        | MLC     |
| NAND flash | 100K       | 1K – 3K |
| NOR flash  | 100K to 1M | 100K    |
Bad block management (BBM)
Performed by the device driver software, or by a HW controller
• E.g., SD cards include a HW controller that performs BBM and wear leveling
Map logical blocks to physical blocks
• Mapping tables stored in dedicated flash blocks, or
• Each block checked at power-up to create a bad-block map in RAM
Each write is verified, and the block is remapped in case of write failure
Memory capacity gradually shrinks as more blocks are marked bad
ECC compensates for bits that spontaneously fail
22 (24) bits of ECC code correct a one-bit error in 2048 (4096) data bits
If ECC cannot correct the error during a read, it may still detect the error
Computer Architecture 2011 – PC Structure and Peripherals
Flash Write Endurance (cont.)
Wear-leveling algorithms
Dynamic wear leveling
Maps Logical Block Addresses (LBAs) to physical flash memory addresses
Each time a block of data is written, it is written to a new location
• Link the new block
• Mark the original physical block as invalid
• Blocks that never get written remain in the same location
Static wear leveling
Evenly distributes data across the flash memory, moving data around
Prevents one portion from wearing out faster than another
The SSD's controller keeps a record of where data is placed on the drive as it is relocated from one portion to another
Periodically moves blocks which are not written
Allows these low-usage cells to be used by other data
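A toy model of the dynamic scheme just described (my sketch under assumed simplifications, not vendor firmware): LBAs map through a table, every write lands on a fresh physical block, and the old block is merely marked invalid:

    # Toy dynamic wear leveling: every logical write goes to a new physical block
    class FlashTranslationLayer:
        def __init__(self, num_physical_blocks: int):
            self.free = list(range(num_physical_blocks))  # erased blocks
            self.lba_to_phys = {}                         # mapping table
            self.invalid = set()                          # awaiting erase/GC

        def write(self, lba: int, data: bytes) -> int:
            new_block = self.free.pop(0)                  # fresh location
            old_block = self.lba_to_phys.get(lba)
            if old_block is not None:
                self.invalid.add(old_block)               # mark old data invalid
            self.lba_to_phys[lba] = new_block             # link the new block
            # ... program `data` into new_block ...
            return new_block

    ftl = FlashTranslationLayer(num_physical_blocks=1024)
    print(ftl.write(lba=7, data=b"x"))   # 0
    print(ftl.write(lba=7, data=b"y"))   # 1 -- rewrite lands elsewhere
    print(ftl.invalid)                   # {0}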
Solid State Drive – SSD
Most manufacturers quote the "burst rate" for performance numbers
Not the steady-state or average read rate
Any write operation requires an erase followed by the write
When an SSD is new, the NAND flash memory is pre-erased
Consumer-grade multi-level cell (MLC)
Stores 2 or more bits per flash memory cell
Sustains 2,000 to 10,000 write cycles
Notably less expensive than SLC drives
Enterprise-class single-level cell (SLC)
Stores 1 bit per flash memory cell
Lasts 10× the write cycles of an MLC
The more write/erase cycles, the shorter the drive's lifespan
Use wear-leveling algorithms to distribute writes evenly
DRAM cache to buffer data writes, reducing the number of write/erase cycles
Extra memory cells to be used when blocks of flash memory wear out
SSD (cont.)
Data in NAND flash memory is organized in fixed-size blocks
When any portion of the data on the drive is changed:
• Mark the block for deletion in preparation for the new data
• Read the current data on the block
• Redistribute the old data
• Lay down the new data in the old block
Old data is rewritten back
Typical write amplification is 15 to 20 (see the sketch below)
• For every 1MB of data written to the drive, 15MB to 20MB of space is actually needed
• Using write combining reduces write amplification to ~10%
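The space overhead quoted above is just multiplication; for completeness:

    # Write amplification: physical bytes written per logical byte written
    def physical_mb_written(logical_mb: float, write_amplification: float) -> float:
        return logical_mb * write_amplification

    for wa in (15, 20):
        print(f"WA={wa}: 1 MB from the host -> {physical_mb_written(1, wa)} MB of flash writes")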
Flash drives compared to HD drives:
Smaller, faster, lighter, noiseless, lower power
Withstand shocks up to 2000 G (like a 10-foot drop onto concrete)
More expensive (cost/byte): ~$2/GB vs. ~$0.1/GB for HDD
The Motherboard
Computer System Structure – 2009
[Block diagram: the CPU (cores + LLC) connects over the CPU bus to the North Bridge (GMCH), which integrates the on-board graphics (HDMI out) and the memory controller driving DDR2 channels 1 and 2 over the memory bus; an external graphics card sits on PCI Express ×16. The South Bridge (ICH) provides PCI Express ×1, USB, an IDE controller (old DVD drive), a SATA controller (hard disk), and PCI (sound card with speakers, LAN adapter); an I/O controller supplies the serial port, parallel port, floppy drive, keyboard, and mouse.]
Computer System – Nehalem
[Block diagram: in Nehalem the cache and memory controller move onto the CPU die, driving DDR3 channels 1 and 2 directly; the external graphics card is on PCI Express ×16; the North Bridge retains the on-board graphics (HDMI); the South Bridge provides PCI Express ×1, USB, and two SATA controllers (DVD drive, hard disk), plus PCI (sound card with speakers, LAN adapter) and an I/O controller for the serial port, parallel port, floppy drive, keyboard, and mouse.]
Computer System – Westmere
[Block diagram: Westmere puts the CPU die (cores, cache, and memory controller driving DDR3 channels 1 and 2) together with the North Bridge and its on-board graphics in a Multi Chip Package (MCP); a display link carries the display output to the South Bridge (PCH), where HDMI is attached; the PCH provides PCI Express ×1, USB, SATA (DVD drive, hard disk), PCI (sound card with speakers, LAN adapter), and an I/O controller for the serial port, parallel port, floppy drive, keyboard, and mouse; external graphics on PCI Express ×16.]
Computer System – Sandy Bridge
[Block diagram: Sandy Bridge integrates the cores, cache, GFX, memory controller (DDR3 channels 1 and 2, 1066-2133 MHz), and a System Agent on one die; external graphics on PCI Express ×16; a 4×DMI link plus a display link connect to the South Bridge (PCH), which drives the display outputs (D-sub, HDMI, DVI, DisplayPort), PCI Express ×1 expansion slots, SATA (DVD drive, hard disk), USB, the LAN adapter, an audio codec (line in/out, S/PDIF in/out), the BIOS, and a Super I/O chip on the LPC bus for the serial port, parallel port, floppy drive, and PS/2 keyboard/mouse.]
PCH Connections
LPC (Low Pin Count) Bus
Supports legacy, low BW I/O devices
Typically integrated in a Super I/O chip
• Serial and parallel ports, keyboard, mouse, floppy disk controller
Other: Trusted Platform Module (TPM), Boot ROM
Direct Media Interface (DMI)
The link between an Intel north bridge and an Intel south bridge
• Replaces the Hub Interface
DMI shares many characteristics with PCI-E
• Using multiple lanes and differential signaling to form a point-to-point link
Most implementations use a ×4 link, providing 10Gb/s in each direction
• DMI 2.0 (introduced in 2011) doubles the BW to 20Gb/s with a ×4 link
Flexible Display Interface (FDI)
Connects the Intel HD Graphics integrated GPU with the PCH south bridge
• where the display connectors are attached
Supports 2 independent 4-bit fixed-frequency links/channels/pipes at a 2.7GT/s data rate
Motherboard Layout – 1st Gen Core 2™
[Board photo, labeled: back panel connectors; processor core power connector; LGA775 processor socket; processor fan header; GMCH (North Bridge + integrated GFX); DIMM Channel A and Channel B sockets; main power connector; ICH (South Bridge + integrated audio); battery; 4 × SATA connectors; parallel ATA IDE connector; diskette drive connector; serial port header; front panel USB header; speaker; PCI add-in card connectors; PCI Express ×1 and ×16 connectors; IEEE1394a and audio headers; High Definition Audio header; rear chassis fan header.]
Motherboard Layout (Sandy Bridge)
[Board photo, labeled: back panel connectors; processor core power connector; processor socket; processor fan header; DIMM Channel A and Channel B sockets; main power connector; PCH; battery; SATA connectors; front and rear chassis fan headers; chassis intrusion header; front panel USB headers; BIOS setup configuration jumper; serial port header; speaker; PCI add-in card connector; PCI Express ×1 and ×16 connectors; IEEE1394a, audio, S/PDIF, and High Definition Audio headers.]
ASUS Sabertooth P67 B3 Sandy Bridge Motherboard
[Board photo.]
How to get the most out of memory?
[Diagram: single-channel DDR – the CPU with its L2 cache connects over the FSB (Front Side Bus) to the DRAM controller, which drives one DDR DIMM over the memory bus. Dual-channel DDR – the DRAM controller drives two DDR DIMMs on channels A and B; each DIMM pair must be the same.]
Balance FSB and memory bandwidth
An 800MHz FSB provides 800MHz × 64 bit / 8 = 6.4 GByte/sec
Dual-channel DDR-400 SDRAM also provides 6.4 GByte/sec
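The balance check from this slide, as a sketch:

    # Compare FSB bandwidth against total DRAM channel bandwidth
    def fsb_gb_per_s(fsb_mhz: float, bus_bits: int = 64) -> float:
        return fsb_mhz * bus_bits / 8 / 1000          # one transfer per FSB "MHz"

    def dram_gb_per_s(data_rate_mts: float, channels: int, bus_bits: int = 64) -> float:
        return data_rate_mts * bus_bits / 8 / 1000 * channels

    fsb = fsb_gb_per_s(800)               # 800 MHz FSB -> 6.4 GB/s
    mem = dram_gb_per_s(400, channels=2)  # dual-channel DDR-400 -> 6.4 GB/s
    print(fsb, mem, "balanced" if fsb == mem else "bottlenecked")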
How to get the most out of memory? (cont.)
Each DIMM supports 4 open pages simultaneously
The more open pages, the better the random-access performance
It is better to have more DIMMs
• n DIMMs: 4n open pages
DIMMs can be single-sided or dual-sided
Dual-sided DIMMs may have a separate CS (chip select) for each side
• In this case the number of open pages is doubled (goes up to 8)
• This is not a must – dual-sided DIMMs may also have a common CS for both sides, in which case there are only 4 open pages, as with single-sided
Motherboard Back Panel
[Back panel photo, labeled: USB 2.0 ports (three groups); USB 3.0 ports; eSATA; LAN port; IEEE 1394a; DVI-I, DisplayPort, and HDMI outputs; audio jacks – line in, line out/front speakers, mic in/side surround, rear surround, center/subwoofer, S/PDIF.]
System Start-up
Upon computer turn-on, several events occur:
1. The CPU "wakes up" and sends a message to activate the BIOS
2. The BIOS runs the Power-On Self Test (POST) to make sure system devices are working correctly:
Initialize system hardware and chipset registers
Initialize power management
Test RAM
Enable the keyboard
Test serial and parallel ports
Initialize floppy disk drives and hard disk drive controllers
Display system summary information
System Start-up (cont.)
3. During POST, the BIOS compares the system configuration data obtained from POST with the system information stored on a memory chip located on the motherboard
A CMOS chip, which is updated whenever new system components are added
Contains the latest information about system components
4. After the POST tasks are completed, the BIOS looks for the boot program responsible for loading the operating system
Usually, the BIOS looks on floppy disk drive A: followed by drive C:
5. After the boot program is loaded into memory, it loads the system configuration information (contained in the registry in a Windows® environment) and device drivers
6. Finally, the operating system is loaded
Backup
Western Digital HDDs
|                                | Caviar Green 2TB | Caviar Blue 1TB | Caviar Black 2TB |
| Maximum external transfer rate | 300MB/s          | 300MB/s         | 300MB/s          |
| Maximum sustained data rate    | —                | 126MB/s         | 138MB/s          |
| Average rotational latency     | —                | 4.2 ms          | 4.2 ms           |
| Spindle speed                  | —                | 7,200 RPM       | 7,200 RPM        |
| Cache size                     | —                | 32MB            | 64MB             |
| Platter size                   | 500GB            | 500GB           | 500GB            |
| Areal density                  | 400 Gb/in²       | 400 Gb/in²      | 400 Gb/in²       |
| Available capacities           | 2TB              | 1TB             | 2TB              |
| Idle power                     | —                | 6.1W            | 8.2W             |
| Read/write power               | —                | 6.8W            | 10.7W            |
| Idle acoustics                 | —                | 28 dBA          | 29 dBA           |
| Seek acoustics                 | —                | 33 dBA          | 30-34 dBA        |
HDD Example
Performance Specifications
Rotational Speed: 7,200 RPM (nominal)
Buffer Size: 64 MB
Average Latency: 4.20 ms (nominal)
Load/unload Cycles: 300,000 minimum
Transfer Rates
Buffer To Host (Serial ATA): 6 Gb/s (max)
Formatted Capacity
Capacity: 2 TB (2,000,398 MB formatted)
Interface: SATA 6 Gb/s
User Sectors Per Drive: 3,907,029,168
Physical Specifications
Acoustics
Idle Mode: 29 dBA (average)
Seek Mode 0: 34 dBA (average)
Seek Mode 3: 30 dBA (average)
Current Requirements / Power Dissipation
Read/Write: 10.70 Watts
Idle: 8.20 Watts
Standby: 1.30 Watts
Sleep: 1.30 Watts
DDR Comparison
| DDR SDRAM standard | Bus clock (MHz) | Internal rate (MHz) | Prefetch (min burst) | Transfer rate (MT/s) | Voltage | DIMM pins |
| DDR  | 100–200  | 100–200 | 2n | 200–400  | 2.5 | 184 |
| DDR2 | 200–533  | 100–266 | 4n | 400–1066 | 1.8 | 240 |
| DDR3 | 400–1066 | 100–266 | 8n | 800–2133 | 1.5 | 240 |
SSD vs HDD
SSD vs. HDD, attribute by attribute:
Random access time[57]: SSD ~0.1 ms; HDD 5–10 ms
Consistent read performance[61]: SSD read performance does not change based on where data is stored; on an HDD, if data is written in a fragmented way, reading it back has varying response times
Fragmentation: a non-issue for SSDs; HDD files may fragment, and periodic defragmentation is required to maintain top performance
Acoustic levels: SSDs have no moving parts and make no sound
Mechanical reliability: SSDs have no moving parts
Maintenance of temperature: SSDs produce less heat
Susceptibility to environmental factors: SSDs have no flying heads or rotating platters to fail as a result of shock, altitude, or vibration
Magnetic susceptibility: no impact on flash memory; magnets or magnetic surges can alter data on HDD media
Weight and size: SSDs are very light compared to HDDs
Parallel operation: some flash controllers can have multiple flash chips reading and writing different data simultaneously; HDDs have multiple heads (one per platter), but they are connected and share one positioning motor
Write longevity: flash-based SSDs have a limited number of writes (1-5 million or more) over the life of the drive; magnetic media do not have a similar write limit but are susceptible to eventual mechanical failure
Cost per capacity: SSD $0.90–2.00 per GB; HDD $0.05/GB for 3.5 in and $0.10/GB for 2.5 in drives
Storage capacity: SSD typically 4-256GB; HDD typically up to 1–2 TB
Read/write performance symmetry: less expensive SSDs typically have write speeds significantly lower than their read speeds, while higher-performing SSDs have balanced read and write speeds; HDDs generally have slightly lower write speeds than their read speeds
Free block availability and TRIM: SSD write performance is significantly impacted by the availability of free, programmable blocks; previously written data blocks that are no longer in use can be reclaimed by TRIM, but even with TRIM, fewer free, programmable blocks translate into reduced performance[29][75][76]; HDDs are not affected by free blocks or the operation (or lack) of the TRIM command
Power consumption: high-performance flash-based SSDs generally require 1/2 to 1/3 the power of HDDs; high-performance HDDs require 12-18 watts, and drives designed for notebook computers are typically 2 watts
PC Connections
| Name | Raw bandwidth (Mbit/s) | Transfer speed (MB/s) | Max. cable length (m) | Power provided | Devices per channel |
| eSATA              | 3,000  | 300     | 2 with eSATA HBA (1 with passive adapter) | No | 1 (15 with port multiplier) |
| eSATAp             | 3,000  | 300     | —    | 5 V/12 V[33]   | 1 (15 with port multiplier) |
| SATA revision 3.0  | 6,000  | 600[34] | 1    | No             | 1 per line |
| SATA revision 2.0  | 3,000  | 300     | 1    | No             | 1 per line |
| SATA revision 1.0  | 1,500  | 150[35] | 1    | No             | 1 per line |
| PATA 133           | 1,064  | 133.5   | 0.46 (18 in) | No     | 2 |
| SAS 600            | 6,000  | 600     | 10   | No             | 1 (>65k with expanders) |
| SAS 300            | 3,000  | 300     | 10   | No             | 1 (>65k with expanders) |
| SAS 150            | 1,500  | 150     | 10   | No             | 1 (>65k with expanders) |
| IEEE 1394-3200     | 3,144  | 393     | 100 (more with special cables) | 15 W, 12–25 V | 63 (with hub) |
| IEEE 1394-800      | 786    | 98.25   | 100[36]       | 15 W, 12–25 V | 63 (with hub) |
| IEEE 1394-400      | 393    | 49.13   | 4.5[36][37]   | 15 W, 12–25 V | 63 (with hub) |
| USB 3.0*           | 5,000  | 400[38] | 3[39]         | 4.5 W, 5 V    | 127 (with hub)[39] |
| USB 2.0            | 480    | 60      | 5[40]         | 2.5 W, 5 V    | 127 (with hub) |
| USB 1.0            | 12     | 1.5     | 3             | Yes           | 127 (with hub) |
| SCSI Ultra-640     | 5,120  | 640     | 12   | No             | 15 (plus the HBA) |
| SCSI Ultra-320     | 2,560  | 320     | 12   | No             | 15 (plus the HBA) |
| Fibre Channel over optic fibre  | 10,520 | 1,000 | 2–50,000 | No | 126 (16,777,216 with switches) |
| Fibre Channel over copper cable | 4,000  | 400   | 12       | No | 126 (16,777,216 with switches) |
| InfiniBand Quad Rate | 10,000 | 1,000 | 5 (copper)[41][42], <10,000 (fiber) | No | 1 with point-to-point; many with switched fabric |
| Thunderbolt        | 10,000 | 1,250   | 100  | 10 W           | 7 |