Transcript Lecture3
Embedded Systems Design: A Unified
Hardware/Software Introduction
Chapter 3 General-Purpose Processors:
Software
Introduction
• General-Purpose Processor
– Processor designed for a variety of computation tasks
– Low unit cost, in part because manufacturer spreads NRE over large numbers of units
• Motorola sold half a billion 68HC05 microcontrollers in 1996 alone
– Carefully designed since higher NRE is acceptable
• Can yield good performance, size and power
– Low NRE cost, short time-to-market/prototype, high flexibility
• User just writes software; no processor design
– a.k.a. “microprocessor” – “micro” used when they were implemented on one or a few chips rather than entire rooms
Embedded Systems Design: A Unified
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Basic Architecture
• Control unit and datapath
– Note similarity to single-purpose processor
• Key differences
– Datapath is general
– Control unit doesn’t store the algorithm – the algorithm is “programmed” into the memory
[Figure: processor block diagram. Labels: Processor, Control unit (Controller, PC, IR), Datapath (ALU, Registers), Control/Status, I/O, Memory]
Datapath Operations
• Load
– Read memory location into register
• ALU operation
– Input certain registers through ALU, store back in register
• Store
– Write register to memory location
[Figure: processor block diagram showing a load from memory into a register, an ALU "+1" operation, and a store back to memory; the values 10 and 11 appear in the registers and in memory]
Control Unit
• Control unit: configures the datapath operations
– Sequence of desired operations (“instructions”) stored in memory – “program”
• Instruction cycle – broken into several sub-operations, each one clock cycle, e.g.:
– Fetch: Get next instruction into IR
– Decode: Determine what the instruction means
– Fetch operands: Move data from memory to datapath register
– Execute: Move data through the ALU
– Store results: Write data from register to memory
[Figure: processor block diagram with registers PC, IR, R0, R1. Memory contents:]
100 load R0, M[500]
101 inc R1, R0
102 store M[501], R1
...
500 10
501
...
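The instruction cycle described on this slide can be sketched as a small Python simulation. This is only an illustrative sketch, not how the hardware is built: the tuple encoding of instructions, the function name `run`, and the use of dictionaries for memory and registers are all assumptions made here. For simplicity, writing a result into a register is folded into the final "store results" step. The program is the three-instruction example from the slide, with M[500] initially holding 10.

```python
def run(program, memory, start_pc, num_instructions):
    """Step through instructions using the five sub-operations from the slide."""
    regs = {"R0": 0, "R1": 0}
    pc = start_pc
    cycles = 0
    for _ in range(num_instructions):
        # Fetch: get the next instruction into IR; PC then points to the next one
        ir = program[pc]
        pc += 1
        # Decode: determine what the instruction means
        op, dst, src = ir
        # Fetch operands: read from memory (load) or from a source register
        operand = memory[src] if op == "load" else regs[src]
        # Execute: move data through the ALU (only "inc" uses the ALU here)
        result = operand + 1 if op == "inc" else operand
        # Store results: write to memory (store) or to the destination register
        if op == "store":
            memory[dst] = result
        else:
            regs[dst] = result
        cycles += 5  # assumption: each sub-operation takes one clock cycle
    return regs, memory, pc, cycles


# Hypothetical encoding of the slide's program: (operation, destination, source)
program = {
    100: ("load", "R0", 500),    # load R0, M[500]
    101: ("inc", "R1", "R0"),    # inc R1, R0
    102: ("store", 501, "R1"),   # store M[501], R1
}
memory = {500: 10, 501: 0}
regs, memory, pc, cycles = run(program, memory, start_pc=100, num_instructions=3)
print(regs, memory[501], pc, cycles)
```

After the three instructions, R0 holds 10, R1 holds 11, M[501] holds 11, and 15 clock cycles have elapsed under the one-cycle-per-sub-operation assumption.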
Control Unit Sub-Operations
• Fetch
– Get next instruction into IR
– PC: program counter, always points to next instruction
– IR: holds the fetched instruction
[Figure: processor block diagram during fetch. PC = 100, IR = load R0, M[500]. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Control Unit Sub-Operations
• Decode
– Determine what the instruction means
[Figure: processor block diagram during decode. PC = 100, IR = load R0, M[500]. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Control Unit Sub-Operations
• Fetch operands
– Move data from memory to datapath register
[Figure: processor block diagram during fetch operands. PC = 100, IR = load R0, M[500], R0 = 10. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Control Unit Sub-Operations
• Execute
– Move data through the ALU
– This particular instruction does nothing during this sub-operation
[Figure: processor block diagram during execute. PC = 100, IR = load R0, M[500], R0 = 10. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Control Unit Sub-Operations
• Store results
– Write data from register to memory
– This particular instruction does nothing during this sub-operation
[Figure: processor block diagram during store results. PC = 100, IR = load R0, M[500], R0 = 10. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Instruction Cycles
[Figure: timing diagram. PC=100: Fetch, Decode, Fetch ops, Exec., Store results, one sub-operation per clk cycle. Processor state: PC = 100, IR = load R0, M[500], R0 = 10. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Instruction Cycles
[Figure: timing diagram. PC=100 and PC=101: each instruction goes through Fetch, Decode, Fetch ops, Exec., Store results, one sub-operation per clk cycle. Processor state: PC = 101, IR = inc R1, R0, R0 = 10, R1 = 11 (ALU performs +1). Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10]
Instruction Cycles
[Figure: timing diagram. PC=100, PC=101 and PC=102: each instruction goes through Fetch, Decode, Fetch ops, Exec., Store results, one sub-operation per clk cycle. Processor state: PC = 102, IR = store M[501], R1, R0 = 10, R1 = 11. Memory: 100 load R0, M[500]; 101 inc R1, R0; 102 store M[501], R1; 500 holds 10; 501 holds 11]
Architectural Considerations
• N-bit processor
– N-bit ALU, registers, buses, memory data interface
– Embedded: 8-bit, 16-bit, 32-bit common
– Desktop/servers: 32-bit, even 64-bit
• PC size determines address space
[Figure: processor block diagram (control unit, datapath, I/O, memory)]
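The relationship between PC size and address space can be made concrete with a short calculation. This is a minimal sketch; the function name `address_space` is an assumption, and whether each address names a byte or a word depends on the processor and is left open here.

```python
def address_space(pc_bits):
    """Number of distinct addresses a program counter of pc_bits bits can hold."""
    return 2 ** pc_bits


for bits in (8, 16, 32):
    print(f"{bits}-bit PC: {address_space(bits)} addressable locations")
```

For example, a 16-bit PC can address 65,536 locations, while a 32-bit PC can address 4,294,967,296.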
Architectural Considerations
• Clock frequency
– Inverse of clock period
– Clock period must be longer than the longest register-to-register delay in the entire processor
– Memory access is often the longest delay
[Figure: processor block diagram (control unit, datapath, I/O, memory)]
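As a rough illustration of how the longest delay bounds the clock frequency, the sketch below derives an upper limit from a set of delays. The delay values and the names `delays_ns` and `max_clock_frequency_hz` are hypothetical, chosen only for the example; real figures come from the processor's timing analysis.

```python
def max_clock_frequency_hz(path_delays_ns):
    """Upper bound on clock frequency: the inverse of the longest delay."""
    longest_ns = max(path_delays_ns.values())
    # The clock period must exceed longest_ns, so this value is a ceiling
    return 1e9 / longest_ns


# Hypothetical register-to-register delays in nanoseconds
delays_ns = {"alu": 4.0, "register_file": 2.0, "memory_access": 10.0}
f = max_clock_frequency_hz(delays_ns)
print(f"Clock must run below {f / 1e6:.0f} MHz")
```

With these made-up numbers, the 10 ns memory access dominates, so the clock must run below 100 MHz even though the ALU alone could go faster.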
Pipelining: Increasing Instruction
Throughput
[Figure: non-pipelined dish cleaning vs. pipelined dish cleaning. Stages: Wash, Dry; dishes 1-8 plotted against time]
[Figure: pipelined instruction execution. Stages: Fetch-instr., Decode, Fetch ops., Execute, Store res.; instructions 1-8 plotted against time]
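The throughput gain from pipelining can be estimated with a simple count of time steps. This is a sketch under idealized assumptions that are not stated on the slide: every stage takes exactly one time unit, and the pipeline never stalls. The function names are made up for illustration.

```python
def non_pipelined_time(items, stages):
    """Each item passes through all stages before the next one starts."""
    return items * stages


def pipelined_time(items, stages):
    """The first item takes `stages` steps; each later item finishes one step after."""
    return stages + (items - 1)


# Dish cleaning: 8 dishes, 2 stages (wash, dry)
print(non_pipelined_time(8, 2), pipelined_time(8, 2))
# Instruction execution: 8 instructions, 5 stages
print(non_pipelined_time(8, 5), pipelined_time(8, 5))
```

Under these assumptions, 8 dishes take 16 time units without pipelining and 9 with it, and 8 five-stage instructions take 40 versus 12.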
Superscalar and VLIW Architectures
• Performance can be improved by:
– Faster clock (but there’s a limit)
– Pipelining: slice up instruction into stages, overlap stages
– Multiple ALUs to support more than one instruction stream
• Superscalar
– Scalar: non-vector operations
– Fetches instructions in batches, executes as many as possible
• May require extensive hardware to detect independent instructions
– VLIW: each word in memory has multiple independent instructions
• Relies on the compiler to detect and schedule instructions
• Currently growing in popularity
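To give a feel for what "detecting independent instructions" involves, the sketch below checks whether two instructions touch the same registers in a conflicting way. It is a simplified illustration, not a description of any real superscalar design: the `(destination, sources)` encoding and the function name `independent` are assumptions, and memory dependencies are ignored.

```python
def independent(first, second):
    """True if `second` could execute alongside `first`.

    Each instruction is (destination_register, set_of_source_registers).
    """
    dst1, srcs1 = first
    dst2, srcs2 = second
    reads_result = dst1 in srcs2      # second reads what first writes
    overwrites_input = dst2 in srcs1  # second overwrites what first reads
    same_destination = dst1 == dst2   # both write the same register
    return not (reads_result or overwrites_input or same_destination)


load_r0 = ("R0", set())      # load R0, M[500]
inc_r1 = ("R1", {"R0"})      # inc R1, R0
inc_r3 = ("R3", {"R2"})      # hypothetical extra instruction: inc R3, R2

print(independent(load_r0, inc_r1))  # inc R1, R0 needs the loaded R0
print(independent(load_r0, inc_r3))  # no shared registers
```

A superscalar processor performs checks of this kind in hardware at run time, whereas a VLIW compiler performs them ahead of time when packing instructions into a word.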
Two Memory Architectures
• Princeton
– Fewer memory wires
• Harvard
– Simultaneous program and data memory access
[Figure: Harvard: processor connected to separate program memory and data memory. Princeton: processor connected to a single memory (program and data)]
Cache Memory
• Memory access may be slow
• Cache is small but fast memory close to processor
– Holds copy of part of memory
– Hits and misses
[Figure: Processor, Cache, Memory. Cache: fast/expensive technology, usually on the same chip. Memory: slower/cheaper technology, usually on a different chip]
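Hits and misses can be illustrated with a toy cache model. The slide does not say how the cache is organized, so this sketch assumes a direct-mapped cache with one address per line; the function name `simulate_direct_mapped`, the cache size, and the address trace are all hypothetical.

```python
def simulate_direct_mapped(addresses, num_lines):
    """Count hits and misses for a direct-mapped cache holding one address per line."""
    lines = [None] * num_lines  # which memory address each cache line currently holds
    hits = misses = 0
    for addr in addresses:
        index = addr % num_lines  # each address maps to exactly one line
        if lines[index] == addr:
            hits += 1             # copy already in the cache
        else:
            misses += 1           # must go to slower memory
            lines[index] = addr   # keep a copy for next time
    return hits, misses


trace = [500, 501, 500, 501, 504, 500]  # hypothetical sequence of memory accesses
hits, misses = simulate_direct_mapped(trace, num_lines=4)
print(hits, misses)
```

In this trace the repeated accesses to 500 and 501 hit, while 504 maps to the same line as 500 and evicts it, so the final access to 500 misses again: 2 hits and 4 misses in total.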
Programmer’s View
• Programmer doesn’t need detailed understanding of architecture
– Instead, needs to know what instructions can be executed
• Two levels of instructions:
– Assembly level
– Structured languages (C, C++, Java, etc.)
• Most development today done using structured languages
– But, some assembly level programming may still be necessary
– Drivers: portion of program that communicates with and/or controls
(drives) another device
• Often have detailed timing considerations, extensive bit manipulation
• Assembly level may be best for these
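The kind of bit manipulation a driver performs can be sketched as follows. Python is used here only to show the operations; as the slide notes, real drivers are often written at the assembly level or in a structured language such as C. The 8-bit register width, the bit positions, and the helper names are hypothetical.

```python
ENABLE_BIT = 0     # hypothetical bit position in a device control register
INTERRUPT_BIT = 3  # hypothetical bit position


def set_bit(value, bit):
    """Return value with the given bit set, kept within 8 bits."""
    return (value | (1 << bit)) & 0xFF


def clear_bit(value, bit):
    """Return value with the given bit cleared, kept within 8 bits."""
    return value & ~(1 << bit) & 0xFF


def is_set(value, bit):
    """True if the given bit is 1."""
    return ((value >> bit) & 1) == 1


ctrl = 0b00000000
ctrl = set_bit(ctrl, ENABLE_BIT)
ctrl = set_bit(ctrl, INTERRUPT_BIT)
print(f"{ctrl:08b}", is_set(ctrl, INTERRUPT_BIT))
ctrl = clear_bit(ctrl, ENABLE_BIT)
print(f"{ctrl:08b}", is_set(ctrl, ENABLE_BIT))
```

Each helper changes or tests a single bit while leaving the others untouched, which is the usual requirement when several device settings share one register.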