ARM Systems-on-chip - Electrical & Computer Engineering
CPE 626: Advanced VLSI Design
L01
Department of Electrical and
Computer Engineering
University of Alabama in Huntsville
Outline
Computer Engineering:
Motivation, Present, Future
Computer Engineering Methodology
Power as a Design Constraint
Stored-program Computer: MU0 Example
Digital System Modeling: Motivation
2
Why Computer Engineering?
CHANGE! It is exciting. It has never been more exciting!
It impacts every aspect of human life.
PC, 2002
PDA, 2002
ENIAC, 1946
(first general-purpose electronic computer)
Bionic, 2002
3
Why Such Change?
Continuous growth in performance
due to advances in technology (CMOS VLSI) and
innovations in computer design (RISC, RAID, ILP)
Lower cost
due to simpler development and higher volumes
These resulted in a significant enhancement
of the capability available to the computer user
Example: today's PC costing less than $1000
has more performance, main memory, and
disk storage than a $1 million computer in the 1970s
4
Computer Engineering Methodology
[Figure: design cycle. Evaluate Existing Systems for Bottlenecks -> Simulate New Designs and Organizations -> Implement Next Generation System -> (repeat)]
Inputs shown around the cycle: Market, Applications, Technology Trends, Benchmarks, Workloads, Implementation Complexity
5
Technology Trends
         Capacity          Speed/Latency
Logic    4x in 3 years     1.54x per year
DRAM     4x in 3-4 years   2x in 10 years
Disk     4x in 3-4 years   2x in 10 years
State of the art: Intel Pentium 4, 2.2 GHz, 0.13 microns, 42 million transistors
Reuters, Monday 11 June 2001: Intel engineers have designed and manufactured the world's smallest and fastest transistor of 0.02 microns in size. This will open the way for microprocessors of 1 billion transistors, running at 20 GHz by 2007.
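The trend figures mix multi-year and per-year rates. As a rough sketch, assuming steady compound growth (an assumption, not something the slide states), a multi-year factor can be converted to a per-year factor for comparison. The function name `annual_rate` is illustrative.

```python
def annual_rate(factor: float, years: float) -> float:
    """Per-year growth factor equivalent to `factor`-fold growth over `years`,
    assuming steady compound growth."""
    return factor ** (1.0 / years)


if __name__ == "__main__":
    # "4x in 3 years" and "2x in 10 years" expressed per year.
    print(f"4x in 3 years  -> {annual_rate(4, 3):.2f}x per year")
    print(f"2x in 10 years -> {annual_rate(2, 10):.2f}x per year")
```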
6
Pentium III Die Photo
EBL/BBL - Bus logic, Front, Back
MOB - Memory Order Buffer
Packed FPU - MMX Fl. Pt. (SSE)
IEU - Integer Execution Unit
FAU - Fl. Pt. Arithmetic Unit
MIU - Memory Interface Unit
DCU - Data Cache Unit
PMH - Page Miss Handler
DTLB - Data TLB
BAC - Branch Address Calculator
RAT - Register Alias Table
SIMD - Packed Fl. Pt.
RS - Reservation Station
BTB - Branch Target Buffer
IFU - Instruction Fetch Unit (+I$)
ID - Instruction Decode
ROB - Reorder Buffer
MS - Micro-instruction Sequencer
1st Pentium III, Katmai: 9.5 M transistors, 12.3 x 10.4 mm in a 0.25-micron process with 5 layers of aluminum
7
Pentium 4 Die Photo
42M transistors (PIII: 26M)
217 mm2 (PIII: 106 mm2)
L1 execution cache buffer: 12,000 micro-ops
8KB data cache
256KB L2$
8
Future Applications
Desktop: 90% of cycles will be spent on media
applications
video encode/decode, polygon & image-based graphics
audio processing, compression, music,
speech recognition/synthesis
modulation/demodulation at audio and video rates
Scientific desktops: high-performance FPs and graphics
Commercial servers: support for databases and
transaction processing, enhancement for reliability,
support for scalability
Embedded computing:
special support for graphics or video, power limitations
9
Future Directions
Develop architectural techniques that exploit semiconductor technology and workload characteristics in order to maximize performance at low cost
Conditions
new workloads are characterised by more exploitable parallelism
dominant wire delays on a billion-transistor chip will force hardware to be more distributed
Novel architectural techniques
Exploit parallelism
o multiprocessor on chip
o simultaneous multithreading
CPU-memory integration
o memory tolerating techniques
o flexible hierarchy to adapt to application
Reconfigurable computing
10
Power as a Design Constraint
Power becomes a critical issue
Portable and mobile platforms
battery-operated devices
Desktops, server farms
Reliability?
Power consumption: IT consumes 10% of the power used in the US
Power density: 30 W/cm2 in Alpha 21364
(3x that of a typical hot plate)
11
Power as a Design Constraint (cont’d)
$P = A C V^2 f + \tau A V I_{short} f + V I_{leak}$
First term: dynamic power consumption
Second term: power due to short-circuit current during transition ($\tau$ is the duration of the short-circuit current)
Third term: power due to leakage current
A (activity of gates) => turn off unused parts or use design techniques to minimize the number of transitions
Reduce the supply voltage, V: $f_{max} \propto (V - V_t)^2 / V$
Reduce threshold $V_t$: $I_{leak} \propto \exp(-q V_t / kT)$
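The trade-off on this slide can be explored numerically. The sketch below assumes the usual forms of the relations, P = A C V^2 f + tau A V I_short f + V I_leak, f_max proportional to (V - Vt)^2 / V, and I_leak proportional to exp(-q Vt / kT). The function names, the parameter `tau` (duration of the short-circuit current), and all numeric values are illustrative assumptions, not data from the slide.

```python
import math

Q = 1.602176634e-19   # electron charge, C
K = 1.380649e-23      # Boltzmann constant, J/K


def power_terms(A, C, V, f, tau, I_short, I_leak):
    """Return (dynamic, short_circuit, leakage) power in watts."""
    dynamic = A * C * V**2 * f
    short_circuit = tau * A * V * I_short * f
    leakage = V * I_leak
    return dynamic, short_circuit, leakage


def relative_fmax(V, Vt):
    """Maximum frequency up to a constant factor: (V - Vt)^2 / V."""
    return (V - Vt) ** 2 / V


def relative_leakage(Vt, T=300.0):
    """Leakage current up to a constant factor: exp(-q*Vt / (k*T))."""
    return math.exp(-Q * Vt / (K * T))


if __name__ == "__main__":
    # Illustrative values only.
    base = dict(A=0.1, C=1e-9, f=1e9, tau=1e-11, I_short=1e-3, I_leak=1e-3)
    for V in (1.8, 0.9):
        dyn, sc, leak = power_terms(V=V, **base)
        print(f"V={V} V: dynamic={dyn:.3f} W, short={sc:.2e} W, leak={leak:.2e} W")
    # Lowering Vt keeps f_max up at low V but raises leakage exponentially.
    for Vt in (0.5, 0.3):
        print(f"Vt={Vt} V: f_max~{relative_fmax(0.9, Vt):.3f}, "
              f"I_leak~{relative_leakage(Vt):.2e}")
```

Halving V cuts the dynamic term by a factor of four, which is why supply-voltage reduction is the first lever; the second loop shows why the accompanying threshold reduction is costly.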
12
Recap: Computer Architecture
Computer Architecture describes user’s view of the
computer: visible registers, data types, instruction set,
instruction formats, memory management table structures,
exception handling
Computer Organization describes the implementation of the
architecture, which is invisible to the user: pipeline structure,
caches, TLB, ...
13
Stored-program computer
[Figure: a processor (registers, instructions) connected by address and data lines to a memory holding instructions and data; memory addresses run from 00..00 (hex) to FF..FF (hex)]
14
Typical Hierarchy
Transistors
Logic gates, memory cells, special circuits
Single-bit adders, MUXs, flip-flops, decoders, coders
Word-wide adders, MUXs, registers, decoders, buses
ALUs, shifters, register files, memory blocks
Processor, peripheral cells, cache memories, MMUs
Integrated system chips
PCBs
Mobile phones, laptops, PCs, engine controllers
[Figure: transistor-level gate schematic with supply rails Vdd and Vss, inputs A and B, and output labelled A.B]
15
MU0 – A Simple Processor
Instruction format: 4-bit opcode followed by a 12-bit address S
Instruction set:
Instruction  Opcode  Effect
LDA S        0000    ACC := mem16[S]
STO S        0001    mem16[S] := ACC
ADD S        0010    ACC := ACC + mem16[S]
SUB S        0011    ACC := ACC - mem16[S]
JMP S        0100    PC := S
JGE S        0101    if ACC >= 0 PC := S
JNE S        0110    if ACC != 0 PC := S
STP          0111    stop
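The instruction set is small enough to capture in a short instruction-level simulator. The sketch below is one possible reading of the table, under these assumptions: ACC is a 16-bit two's-complement value, JGE tests the sign bit, execution starts at address 0, and opcodes above 0111 are treated as errors. The names `encode` and `run` are illustrative.

```python
# Opcodes from the MU0 instruction table.
LDA, STO, ADD, SUB, JMP, JGE, JNE, STP = range(8)


def encode(opcode, s=0):
    """Pack a 4-bit opcode and a 12-bit address S into one 16-bit word."""
    return ((opcode & 0xF) << 12) | (s & 0xFFF)


def run(mem, max_steps=10_000):
    """Run from address 0 until STP; return the final accumulator value.

    `mem` is a mutable list of 4096 16-bit words (assumed memory size,
    matching the 12-bit address field).
    """
    acc, pc = 0, 0
    for _ in range(max_steps):
        word = mem[pc]
        pc = (pc + 1) & 0xFFF
        opcode, s = word >> 12, word & 0xFFF
        if opcode == LDA:
            acc = mem[s]
        elif opcode == STO:
            mem[s] = acc
        elif opcode == ADD:
            acc = (acc + mem[s]) & 0xFFFF
        elif opcode == SUB:
            acc = (acc - mem[s]) & 0xFFFF
        elif opcode == JMP:
            pc = s
        elif opcode == JGE:
            if not acc & 0x8000:      # sign bit clear means ACC >= 0
                pc = s
        elif opcode == JNE:
            if acc != 0:
                pc = s
        elif opcode == STP:
            return acc
        else:
            raise ValueError(f"undefined opcode {opcode:04b}")
    raise RuntimeError("step limit reached without STP")


if __name__ == "__main__":
    mem = [0] * 4096
    mem[0:4] = [encode(LDA, 10), encode(ADD, 11), encode(STO, 12), encode(STP)]
    mem[10], mem[11] = 5, 7
    print(run(mem), mem[12])  # prints: 12 12
```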
16
MU0 Datapath Example
Program Counter – PC
Accumulator – ACC
Arithmetic-Logic Unit – ALU
Instruction Register – IR
Instruction Decode and Control Logic
Follow the principle that the memory will be the limiting factor in the design: each instruction takes exactly the number of clock cycles defined by the number of memory accesses it must make.
[Figure: MU0 datapath with PC, IR, ALU, ACC, control, and memory connected by the address bus and data bus]
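Under the principle that cycles equal memory accesses, the cost of a program can be estimated by counting accesses. The sketch below assumes that LDA, STO, ADD, and SUB need two accesses (one for the operand, one to fetch the next instruction) and that the jumps need only the fetch; this is a reading consistent with the execute/fetch states of the control logic, not a figure stated on this slide. STP makes no further fetch and is left out.

```python
# Assumed memory accesses (and therefore clock cycles) per instruction.
MEMORY_ACCESSES = {
    "LDA": 2, "STO": 2, "ADD": 2, "SUB": 2,   # operand access + next fetch
    "JMP": 1, "JGE": 1, "JNE": 1,             # next fetch only
}


def cycle_count(mnemonics):
    """Total clock cycles for a sequence of executed MU0 instructions."""
    return sum(MEMORY_ACCESSES[m] for m in mnemonics)


if __name__ == "__main__":
    trace = ["LDA", "ADD", "STO", "JMP"]
    print(cycle_count(trace))  # prints: 7
```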
17
MU0 Datapath Design
Assume that each instruction starts when it has arrived in the IR
Step 1: EX (execute)
LDA S: ACC <- Mem[S]
STO S: Mem[S] <- ACC
ADD S: ACC <- ACC + Mem[S]
SUB S: ACC <- ACC - Mem[S]
JMP S: PC <- S
JGE S: if (ACC >= 0) PC <- S
JNE S: if (ACC != 0) PC <- S
Step 2: IF (fetch the next instruction)
Either the PC or the address in the IR is issued to fetch the next instruction
The address is incremented in the ALU and the value is saved into the PC
Initialization
Reset input to start executing instructions from a known address; here it is 000hex
o provide zero at the ALU output and then load it into the PC register
18
MU0 RTL Organization
Control Logic
Asel
Bsel
ACCce (ACC change enable)
PCce (PC change enable)
IRce (IR change enable)
ACCoe (ACC output enable)
ALUfs (ALU function select)
MEMrq (memory request)
RnW (read/write)
Ex/ft (execute/fetch)
19
MU0 control logic
Inputs (left of the bar) and outputs (right of the bar); x = don't care
Instruction Opcode Reset Ex/ft ACCz ACC15 | Asel Bsel ACCce PCce IRce ACCoe ALUfs MEMrq RnW Ex/ft
Reset       xxxx   1     x     x    x     | 0    0    1     1    1    0     =0    1     1   0
LDA S       0000   0     0     x    x     | 1    1    1     0    0    0     =B    1     1   1
            0000   0     1     x    x     | 0    0    0     1    1    0     B+1   1     1   0
STO S       0001   0     0     x    x     | 1    x    0     0    0    1     x     1     0   1
            0001   0     1     x    x     | 0    0    0     1    1    0     B+1   1     1   0
ADD S       0010   0     0     x    x     | 1    1    1     0    0    0     A+B   1     1   1
            0010   0     1     x    x     | 0    0    0     1    1    0     B+1   1     1   0
SUB S       0011   0     0     x    x     | 1    1    1     0    0    0     A-B   1     1   1
            0011   0     1     x    x     | 0    0    0     1    1    0     B+1   1     1   0
JMP S       0100   0     x     x    x     | 1    0    0     1    1    0     B+1   1     1   0
JGE S       0101   0     x     x    0     | 1    0    0     1    1    0     B+1   1     1   0
            0101   0     x     x    1     | 0    0    0     1    1    0     B+1   1     1   0
JNE S       0110   0     x     0    x     | 1    0    0     1    1    0     B+1   1     1   0
            0110   0     x     1    x     | 0    0    0     1    1    0     B+1   1     1   0
STOP        0111   0     x     x    x     | 1    x    0     0    0    0     x     0     1   0
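The control logic is purely combinational, so it can be written as a function from the five inputs (opcode, Reset, Ex/ft, ACCz, ACC15) to the ten outputs. The sketch below transcribes the table row by row; don't-care outputs are returned as `None`, the output Ex/ft is named `ExFt`, and the function name `mu0_control` is illustrative rather than taken from the slides.

```python
# Outputs shared by every row that fetches the next instruction.
_FETCH = dict(Bsel=0, ACCce=0, PCce=1, IRce=1, ACCoe=0,
              ALUfs="B+1", MEMrq=1, RnW=1, ExFt=0)

# ALU function used in the execute step of the accumulator-loading opcodes.
_EXEC_FS = {0b0000: "=B", 0b0010: "A+B", 0b0011: "A-B"}


def mu0_control(opcode, reset, ex_ft, acc_z, acc15):
    """Return the MU0 control outputs for one row of the control table."""
    if reset:
        return dict(Asel=0, Bsel=0, ACCce=1, PCce=1, IRce=1, ACCoe=0,
                    ALUfs="=0", MEMrq=1, RnW=1, ExFt=0)
    if opcode in _EXEC_FS:                      # LDA, ADD, SUB
        if ex_ft == 0:                          # execute step
            return dict(Asel=1, Bsel=1, ACCce=1, PCce=0, IRce=0, ACCoe=0,
                        ALUfs=_EXEC_FS[opcode], MEMrq=1, RnW=1, ExFt=1)
        return dict(Asel=0, **_FETCH)           # fetch step
    if opcode == 0b0001:                        # STO
        if ex_ft == 0:
            return dict(Asel=1, Bsel=None, ACCce=0, PCce=0, IRce=0, ACCoe=1,
                        ALUfs=None, MEMrq=1, RnW=0, ExFt=1)
        return dict(Asel=0, **_FETCH)
    if opcode == 0b0100:                        # JMP: always taken
        return dict(Asel=1, **_FETCH)
    if opcode == 0b0101:                        # JGE: taken when ACC15 == 0
        return dict(Asel=0 if acc15 else 1, **_FETCH)
    if opcode == 0b0110:                        # JNE: taken when ACCz == 0
        return dict(Asel=0 if acc_z else 1, **_FETCH)
    if opcode == 0b0111:                        # STOP: no memory request
        return dict(Asel=1, Bsel=None, ACCce=0, PCce=0, IRce=0, ACCoe=0,
                    ALUfs=None, MEMrq=0, RnW=1, ExFt=0)
    raise ValueError(f"undefined opcode {opcode:04b}")


if __name__ == "__main__":
    print(mu0_control(0b0010, reset=0, ex_ft=0, acc_z=0, acc15=0))
```

Writing the table this way makes the pattern visible: every fetch row is identical except for Asel, and the jumps differ from a plain fetch only in whether Asel is set.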
20
MU0 ALU Design
ALU functions: A+B, A-B, B, B+1, 0 (used only when reset is active) => 4 functions
Control inputs:
Aen (enable operand A)
Binv (invert operand B)
Cin
[Figure: ALU logic with inputs A, B, Aen, Binv, Cin, and reset, and outputs sum and Cout]
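The four ALU functions can all be produced by one adder with the three control inputs Aen, Binv, and Cin. The sketch below assumes a 16-bit data width and a conventional encoding of the controls (for example, A-B as A plus the inverted B plus a carry-in of 1); the encoding table `ALU_CONTROLS` and the function name `alu` are illustrative assumptions, since the slide lists the signals but not their values per function.

```python
MASK = 0xFFFF  # assumed 16-bit data width

# Assumed (Aen, Binv, Cin) settings for each ALU function.
ALU_CONTROLS = {
    "A+B": (1, 0, 0),
    "A-B": (1, 1, 1),   # two's complement: A + ~B + 1
    "B":   (0, 0, 0),
    "B+1": (0, 0, 1),
}


def alu(a, b, aen, binv, cin, reset=0):
    """Return (sum, cout) of the adder-based ALU; reset forces the sum to 0."""
    if reset:
        return 0, 0
    a_in = (a & MASK) if aen else 0            # Aen gates operand A
    b_in = (~b if binv else b) & MASK          # Binv inverts operand B
    total = a_in + b_in + cin
    return total & MASK, total >> 16


if __name__ == "__main__":
    for name, controls in ALU_CONTROLS.items():
        print(name, alu(5, 3, *controls))
```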
21
Digital System Modeling: Motivation
Requirements specification
Functional specification
Testing and verification of the design
Formal verification of the correctness of the design
Automatic synthesis
22
Gajski and Kuhn’s Y Chart
Level             Behavioral          Structural        Physical/Geometry
Architectural     Systems             Processor         Physical Partitions
Algorithmic       Algorithms          Hardware Modules  Clusters
Functional Block  Register Transfer   ALUs, Registers   Floor Plans
Logic             Logic               Gates, FFs        Cell, Module Plans
Circuit           Transfer Functions  Transistors       Rectangles
Domains
Functional – operations performed by the system
Structural – how the system is composed
Geometry – how the system is laid out in physical space
23