Transcript Lecture #2
Stuff to know!
Introductions
Instruction Set Architecture (ISA)
Today’s topics
Architecture overview
Machine instructions
Instruction Execution Cycle
CISC machines
RISC machines
Parallelism
Instruction-level
Processor-level
Internal representation
Microprograms
Limits of representation
External representation
Binary, octal, decimal, hexadecimal number systems
Terms
CPU: Central Processing Unit
ALU: Arithmetic/Logic Unit
Memory: storage for data and programs (separate from
CPU)
Register: fast temporary storage inside the CPU
Bus: parallel "wires" for transferring a set of electrical
signals simultaneously
Internal: Transfers signals among CPU components
Control: Carries signals for memory and I/O operations
Address: Links to specific memory locations
Data: Carries data between the CPU and memory
Microprogram: sequence of micro-instructions required
to execute a machine instruction
Cache: temporary storage for faster access
Note: caching takes place at many levels in a computer system
Registers
General/Temporary: fast local memory inside the CPU
one type of cache
Control: dictates current state of the machine
Status: indicates error conditions
IR: Instruction Register (holds current instruction)
IP: Instruction Pointer (holds memory address of next
instruction) [often called Program Counter (PC)]
MAR: Memory Address Register (holds address of
memory location currently referenced)
MDR: Memory Data Register (holds data being sent to or
retrieved from the memory address in the MAR)
Machine instructions
Each computer architecture provides a set of
machine-level instructions
Instruction Set Architecture (ISA)
Specific to one particular architecture
Like everything inside a computer, machine
instructions are implemented electrically
Micro-instructions set the switches in the control
register
Hypothetical CISC* machine
Shows hardware components
Does not show digital logic level or microprograms.
Shows how machine-level instructions can be
stored and executed.
Illustrates
Finite-state machine
*CISC
Complex Instruction Set Computer
Von Neumann architecture
Instruction execution cycle
Real computers …
Use the “stored program” concept
Von Neumann architecture
Program is stored in memory, and is executed
under the control of the operating system
Operate using an Instruction Execution
Cycle
Instruction Execution Cycle
1. Fetch next instruction (at address in IP) into IR.
2. Increment IP to point to next instruction.
3. Decode instruction in IR.
4. If instruction requires memory access,
   A. Determine memory address.
   B. Fetch operand from memory into a CPU register, or send
      operand from a CPU register to memory.
5. Execute micro-program for instruction.
6. Go to step 1.
Note: default execution is sequential
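One rough way to see the cycle in action is to simulate it in software. The C sketch below uses a tiny made-up instruction set (LOADI, ADD, HALT; not any real machine's ISA) and walks the IP through a small program stored in an array, fetching, decoding, and executing until HALT.

#include <stdio.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

int main(void) {
    /* "Memory": each instruction is 3 words: opcode, operand1, operand2. */
    int memory[] = {
        OP_LOADI, 0, 5,   /* R0 <- 5        */
        OP_LOADI, 1, 7,   /* R1 <- 7        */
        OP_ADD,   0, 1,   /* R0 <- R0 + R1  */
        OP_HALT,  0, 0
    };
    int reg[2] = {0, 0};  /* general registers */
    int ip = 0;           /* Instruction Pointer */

    for (;;) {
        /* Steps 1-2: fetch the instruction at IP, then advance IP. */
        int op = memory[ip], a = memory[ip + 1], b = memory[ip + 2];
        ip += 3;
        /* Steps 3-5: decode and execute. */
        if (op == OP_HALT) break;
        else if (op == OP_LOADI) reg[a] = b;
        else if (op == OP_ADD)   reg[a] = reg[a] + reg[b];
        /* Step 6: go back to step 1 (top of the loop). */
    }
    printf("R0 = %d\n", reg[0]);   /* prints R0 = 12 */
    return 0;
}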
Example CISC Instruction
ADD R1, mem1    ; (Add contents of memory location mem1 to register R1)
Example ADD Microprogram
(each microinstruction executes in one clock cycle)
1. Copy contents of R1 to ALU Operand_1
2. Move address of mem1 to MAR
3. Signal memory fetch (gets contents of memory address
   currently in MAR into MDR)
4. Copy contents of MDR into ALU Operand_2
5. Signal ALU addition
6. Check Status Register
7. Copy contents of ALU Result to R1
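The same data movement can be mimicked step by step in a short C sketch. The variable names (MAR, MDR, operand_1, and so on) just stand in for the registers described earlier; this illustrates what each micro-step moves, not how real microcode is written.

#include <stdio.h>

int main(void) {
    int memory[16] = {0};
    int mem1 = 4;            /* address of the operand (made-up value) */
    memory[mem1] = 30;       /* contents of mem1 */
    int R1 = 12;

    int operand_1, operand_2, result;   /* ALU inputs and output        */
    int MAR, MDR;                       /* memory address/data registers */

    operand_1 = R1;                     /* 1. copy R1 to ALU Operand_1          */
    MAR = mem1;                         /* 2. move address of mem1 to MAR       */
    MDR = memory[MAR];                  /* 3. memory fetch: MDR <- memory[MAR]  */
    operand_2 = MDR;                    /* 4. copy MDR to ALU Operand_2         */
    result = operand_1 + operand_2;     /* 5. signal ALU addition               */
    /* 6. a real CPU would check the Status Register here (overflow, etc.)      */
    R1 = result;                        /* 7. copy ALU Result back to R1        */

    printf("R1 = %d\n", R1);            /* prints R1 = 42 */
    return 0;
}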
Improving CISC
CISC speed (and convenience) is
increased by
more efficient microprograms
more powerful ISA level instructions
cache memory
more registers
wider buses
making it smaller
more processors
floating point instructions
Etc.
Clock Cycles
So how “slow” is this?
It isn’t slow
Signals travel at near light-speed
Clock cycle length determines CPU speed
(mostly)
Limitations of CISC
Improving a specific architecture requires
instructions to be backward compatible.
So … how about a different architecture?
RISC machines
Reduced Instruction Set Computer
Much smaller set of instructions at ISA level
Instructions are like CISC micro-instructions
RISC assembly level programs look much
longer (more instructions) than CISC
assembly level programs, but they execute
faster. Why?
RISC design principles
Instructions executed directly by hardware (no
microprograms).
Maximize rate of fetching instructions.
Instructions easy to decode
Instruction cache
Fetching operands, etc.
Only LOAD and STORE instructions reference
memory.
Plenty of registers
More speed improvement
Minimize memory and I/O accesses
Cache
Separate I/O unit (buffers/processing)
Separate network communication unit (NIC)
Etc.
Parallel processing
Parallelism (overview)
Instruction-level parallelism
pipeline
cache
Processor-level parallelism
multiprocessor (multiple CPUs, common memory)
multicomputer (multiple CPUs, each with own
memory)
Pipelining
Pipelining Equations!
For k execution stages, n instructions require k + (n – 1) clock cycles.
[Timing diagram: instructions I-1 and I-2 moving through stages S1–S6, comparing no pipelining (12 clock cycles for the two instructions) with a 6-stage pipeline (7 clock cycles).]
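A quick check of the formula in C: with k pipeline stages, an unpipelined machine needs k * n cycles for n instructions, while a pipelined one needs k + (n – 1). The sketch below just evaluates both expressions for the 6-stage case in the diagram.

#include <stdio.h>

int main(void) {
    int k = 6;   /* number of pipeline stages */
    /* Compare total clock cycles with and without pipelining. */
    for (int n = 1; n <= 4; n++) {
        int no_pipeline = k * n;        /* each instruction uses all k stages alone       */
        int pipelined   = k + (n - 1);  /* one instruction finishes per cycle after the first */
        printf("n = %d: %2d cycles unpipelined, %2d cycles pipelined\n",
               n, no_pipeline, pipelined);
    }
    return 0;    /* for n = 2: 12 cycles unpipelined, 7 pipelined */
}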
Instruction Caching
Hardware provides area for multiple
instructions in the CPU
Reduces number of memory accesses
Instructions are available for immediate
execution
Might cause problems with decision,
repetition, and procedure structures in
programs
Multiprocessor (shared memory)
Multicomputer (distributed memory)
Comparisons
Cache and Pipelining
Implemented in hardware
Multiprocessor
Difficult to build
Relatively easy to program
Multicomputer
Easy to build (given networking technology)
Extremely difficult to program
Other types of parallelism
Hybrid systems
Scalable architectures
Add more processors (nodes), without having
to re-invent the system
Simulated parallelism
Super-Scalar
Applications of Parallelism
Multi-user systems
Networks
Internet
Speed up single processes
Chess example
Other AI applications
Questions?
Parallelism
… more later
Internal representation (tomorrow!)
Data
Instructions
Addresses
Internal representation
Just like everything else in a computer, the
representation of numbers is implemented
electrically
switches set to off or on
with open (transparent) / closed (opaque) gates.
There are two states for each gate
The binary number system uses two digits (0 and 1)
In order to simplify discussion, we use the standard
shorthand to transcribe the computer
representation:
off is written as digit 0
on is written as digit 1
External representation
Use the binary number system to represent
numeric values electrically.
Switches (gates) are grouped into bytes,
words, etc., to represent the digits of a
binary number.
Note: The number of gates in a group
depends on the computer architecture and
the type of data represented. E.G.,
For Intel-based architectures
byte = 8-bits, word = 2 bytes (16 bits)
integers use 2, 4, 8, or 10 bytes
Binary number system
has 2 digits: 0 and 1 (binary digit)
has places and place values determined by powers
of 2.
(in theory) can uniquely represent any integer value
A binary representation is just another way of writing a
number that we are accustomed to seeing in decimal
form.
(in practice, inside the computer) representation is
finite
Representations with too many digits get chopped.
Internal representation
Place values (right-to-left) are 2^0, 2^1, 2^2, 2^3, 2^4, etc.
Bits are numbered (right-to-left) starting at 0
Place value depends on number of "bits" defined for the
type.
Example: A 16-bit integer might be
[Diagram: 16 switches labeled with bit numbers 15 down to 0, left to right; bits 7, 5, 4, and 1 are "on" (shown in red).]
… transcribed by a human as 0000000010110010
To convert to its familiar decimal representation, just add up the
place values of the places that are "on".
Converting binary to decimal
Place:  2^15  2^14  2^13  2^12  2^11  2^10  2^9  2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Value: 32768 16384  8192  4096  2048  1024  512  256  128   64   32   16    8    4    2    1
Bits:      0     0     0     0     0     0    0    0    1    0    1    1    0    0    1    0
in decimal form:
128 + 32 + 16 + 2 = 178
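The same add-up-the-place-values idea in a short C sketch (just an illustration, not tied to any particular machine): walk the bit string right-to-left, adding 2^i for each bit that is "on".

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *bits = "0000000010110010";   /* the 16-bit example above */
    int n = (int)strlen(bits);
    unsigned value = 0, place = 1;           /* place value starts at 2^0 */

    /* Scan right-to-left: the bit in position i has place value 2^i. */
    for (int i = n - 1; i >= 0; i--) {
        if (bits[i] == '1')
            value += place;
        place *= 2;
    }
    printf("%s = %u\n", bits, value);        /* prints ... = 178 */
    return 0;
}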
How many different codes (integers) can be represented using
16 bits?
What is the largest (unsigned) integer that can be represented
using 16 bits?
What is the largest (unsigned) integer that can be represented
using 32 bits?
Prove that for n-bit representation, the number of codes is 2^n,
the largest unsigned integer is 2^n – 1, and the largest signed integer
is 2^(n-1) – 1
Converting decimal to binary
Method 1: Removing largest powers of 2
Method 2: Successive division by 2
Converting decimal to binary
Example: 157
Method 1: Removing largest powers of 2
157 – 128 = 29
29 – 16 = 13
13 – 8 = 5
5 – 4 = 1
1 – 1 = 0
Powers removed (128, 16, 8, 4, 1) give 10011101
Method 2: Successive division by 2
157 ÷ 2 = 78 R 1
78 ÷ 2 = 39 R 0
39 ÷ 2 = 19 R 1
19 ÷ 2 = 9 R 1
9 ÷ 2 = 4 R 1
4 ÷ 2 = 2 R 0
2 ÷ 2 = 1 R 0
1 ÷ 2 = 0 R 1
Reading the remainders from bottom to top: 10011101
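Method 2 translates almost directly into code: repeatedly divide by 2, collect the remainders, and read them in reverse order. A minimal C sketch:

#include <stdio.h>

int main(void) {
    unsigned n = 157;
    char digits[33];      /* room for a 32-bit value */
    int i = 0;

    /* Successive division by 2: remainders are the binary digits, low bit first. */
    do {
        digits[i++] = (n % 2) ? '1' : '0';
        n /= 2;
    } while (n > 0);

    /* Print the remainders in reverse: the last remainder is the leading bit. */
    printf("157 = ");
    while (i > 0)
        putchar(digits[--i]);
    putchar('\n');        /* prints 157 = 10011101 */
    return 0;
}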
Numeric representation
We will show (later) exactly how an electrical
operation can be performed on two electrical
numeric representations to give an electrical
result that is consistent with the rules of
arithmetic.
… but not quite consistent …
Since the number of gates in each group (byte,
word, etc.) is finite, computers can represent
numbers with finite precision only.
Example:
Suppose that unsigned integer data is represented using
16 gates. Then the largest integer that can be
represented is 65535. What happens if we add 1 ?
If necessary, representations are truncated;
overflow / underflow can occur, and the Status
Register will be set
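This finite-precision behavior is easy to observe in C with a 16-bit unsigned type (a sketch; it sticks to unsigned arithmetic, which C defines to wrap around when truncated):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t x = 65535;              /* largest 16-bit unsigned value          */
    x = x + 1;                       /* result is truncated back to 16 bits    */
    printf("%u\n", (unsigned)x);     /* prints 0: the carry out of bit 15 is lost */
    return 0;
}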
Representing negative integers
Must specify size!
Specify n: number of bits (8, 16, 32, etc.)
There are 2^n possible "codes"
Separate the "codes" so that half of them
represent negative numbers.
(Note that exactly half of the codes have 1 in the
"leftmost" bit.)
Binary form of negative numbers
Several methods, each with disadvantages.
We will focus on twos-complement form
For a negative number x :
specify number of bits
start with binary representation of |x|
change every bit to its opposite, then add 1 to the
result.
Binary form of negative numbers
Example: -13 in 16-bit twos-complement
|-13| = 13 = 0000000000001101
ones-complement is 1111111111110010
add 1 to get 1111111111110011 = -13
Note that -(-13) should give 13. Try it.
Hexadecimal representation?
Convert binary to hex in the usual way
-13 = 1111111111110011 = FFF3 H = 0xfff3
Note: For byte, word, etc., if the first hex digit is greater
than or equal to 8, the value is negative.
Convert negative binary to decimal?
Find twos complement, convert, and prepend a minus
sign.
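Both the flip-and-add-1 recipe and the hexadecimal form can be checked with a short C sketch (the fixed-width types come from <stdint.h>; this is only an illustration of the rule above):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t magnitude = 13;
    uint16_t flipped   = ~magnitude;        /* ones-complement: change every bit     */
    uint16_t twos      = flipped + 1;       /* add 1: two's-complement form of -13   */

    /* The same bit pattern the hardware stores for (int16_t)-13. */
    uint16_t from_cast = (uint16_t)(int16_t)-13;

    printf("by hand:   0x%04X\n", twos);        /* prints 0xFFF3 */
    printf("from cast: 0x%04X\n", from_cast);   /* prints 0xFFF3 */
    return 0;
}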
Signed numbers using 4-bit
twos-complement form
-8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7
1000 1001 1010 1011 1100 1101 1110 1111 0000 0001 0010 0011 0100 0101 0110 0111
Notice that all of the negative numbers have 1 in
the leftmost bit. All of the non-negative
numbers have 0 in the leftmost bit.
For this reason, the leftmost bit is called the sign bit
Note: Nobody uses 4-bit representations
(“nibble”), but there’s not enough room to show
8-bit representations here.
You can extend this diagram to 8-bit, 16-bit, etc.
n-bit twos-complement form
The 2^n possible codes give
one code for zero
2^(n-1) – 1 positive numbers
2^(n-1) negative numbers
Note: zero is its own complement
Note: there is one “weird” number
01111111 + 1 = 10000000
127 + 1 = -128
(inconsistent with rules of arithmetic)
127 is the largest number that can be represented in 8
bits. This means that -(-128) cannot be represented
with 8 bits.
i.e., the 2's-complement of 10000000 is 10000000
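A sketch of the "weird" number in C: taking the two's complement of 10000000 gives back 10000000, so -(-128) cannot be represented in 8 bits (the arithmetic below is done in 8-bit unsigned form so the wrap-around is well defined):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t weird   = 0x80;                   /* bit pattern 10000000, i.e. -128 signed */
    uint8_t negated = (uint8_t)(~weird + 1);  /* flip every bit, then add 1             */

    printf("pattern:          0x%02X (%d signed)\n", weird,   (int8_t)weird);
    printf("two's complement: 0x%02X (%d signed)\n", negated, (int8_t)negated);
    /* Both lines print 0x80 (-128): the complement of 10000000 is 10000000. */
    return 0;
}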
Signed or Unsigned?
A 16-bit representation could be used for signed or
unsigned numbers
16-bit signed range is -32768 .. +32767
16-bit unsigned range is 0 .. 65535
Both forms use the same 2^16 codes
Example:
1010101010101010 unsigned is 43690 decimal
1010101010101010 signed is -21846
Programmer must tell the computer which form is being
used.
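The same 16 bits, read two ways, in a small C sketch (the casts simply tell the compiler which interpretation of the bit pattern to use):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t bits = 0xAAAA;   /* bit pattern 1010101010101010 */

    /* One bit pattern, two interpretations. */
    printf("unsigned: %u\n", (unsigned)bits);       /* prints 43690  */
    printf("signed:   %d\n", (int)(int16_t)bits);   /* prints -21846 */
    return 0;
}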
Other representations
Every integer number has a unique
representation in each base ≥ 2
Hexadecimal is commonly used for easily
converting binary to a more manageable
form.
example 16-bit binary to hexadecimal:
Binary:        0001 0111 1011 1101
Hexadecimal:      1    7    B    D
Write it as 0x17BD or 17BDh
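The conversion works because each hex digit covers exactly four bits. A C sketch that pulls out the four nibbles of a 16-bit value and prints them as hex digits:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t value = 0x17BD;   /* binary 0001 0111 1011 1101 */

    /* Extract each 4-bit group (nibble), most significant first. */
    for (int shift = 12; shift >= 0; shift -= 4) {
        int nibble = (value >> shift) & 0xF;
        printf("%X", nibble);
    }
    printf("\n");              /* prints 17BD */
    return 0;
}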
Questions?
Read Irvine Chapter 17.1