Lecture 12 - TAMU Computer Science Faculty Pages

Review
CPSC 321
Andreas Klappenecker
Administrative Issues
• Midterm is on October 12
• Allen Parish’s help session Friday 10:15-12:15
Questions? Project partners?
Today’s Menu
What happened so far...
History
One of the first calculation tools was the abacus, presumably
invented sometime between 1000 and 500 B.C.
Early History
• around 1600, John Napier invents Napier's bones, a
tool that helps in calculations
• 1621, William Oughtred invents the slide rule, which
exploits Napier’s logarithms to assist in calculations
• 1623, Wilhelm Schickard invents a mechanical device
to add, subtract, multiply and divide numbers
• 1642, Blaise Pascal invents his Arithmetic Machine
(which could only add)
• 1671, Gottfried Wilhelm von Leibniz invents the Step
Reckoner, a device that can perform additions,
subtractions, multiplications, divisions, and evaluation
of square roots (by stepped additions)
Early History
• Charles Babbage proposes in 1822 a machine to
calculate tables for logarithms and trigonometric
functions, called the Difference Engine.
• Before completing the machine, he invents in 1833 the
more sophisticated Analytical Engine that uses Jacquard
punch cards to control the arithmetic calculations
• The machine is programmable, has storage capabilities, and
control flow mechanisms – it is a general purpose computer.
• The Analytical Engine was never completed.
• Augusta Ada Lovelace writes the first program for the
Analytical Engine (to calculate Bernoulli numbers).
Some consider her the first programmer.
Z1
The Z1 computer was clocked at 1 Hz. Its memory consisted of 64
words of 22 bits each. Input and output were handled by a punch
tape reader and a punch tape writer.
The computer had two 22-bit registers and was able to
perform additions and subtractions (it was not a general purpose
computer).
Z3
(Art and photo courtesy of Horst Zuse)
Zuse constructed the Z3, a fully programmable general purpose
computer, in 1939-1941. Remarkably, it supported binary
floating-point arithmetic. It was clocked at 5.33 Hz, based on
relays, and had 64 words of 22 bits.
The small memory did not allow for storage of the program.
Performance
• Response time: time between start and
finish of the task (aka execution time)
• Throughput: total amount of work done
in a given time
Performance
(Absolute) Performance
Relative Performance
Amdahl’s Law
The execution time after making an
improvement to the system is given by
Exec time after improvement = I/A + E
I = execution time affected by improvement
A = amount of improvement
E = execution time unaffected
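As a quick numeric check of the formula above, here is a minimal Python sketch. The function names and the sample numbers (a 100 s run in which 80 s are affected and sped up by a factor of 4) are made up for illustration and are not from the lecture.

```python
def exec_time_after_improvement(affected, amount, unaffected):
    """Amdahl's Law as stated on the slide: I/A + E."""
    if amount <= 0:
        raise ValueError("amount of improvement must be positive")
    return affected / amount + unaffected


def speedup(affected, amount, unaffected):
    """Old execution time (I + E) divided by the new execution time."""
    old_time = affected + unaffected
    return old_time / exec_time_after_improvement(affected, amount, unaffected)


# Hypothetical example: I = 80 s, A = 4, E = 20 s.
new_time = exec_time_after_improvement(80.0, 4.0, 20.0)  # 80/4 + 20
overall = speedup(80.0, 4.0, 20.0)                       # 100 / new_time
print(new_time, overall)
```

Note that the unaffected part E bounds the result: no matter how large A becomes, the new execution time never drops below E.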
Assembly Language
        .text                       # code section
        .globl main
main:
        li      $v0, 4              # system call for print string
        la      $a0, str            # load address of string to print
        syscall                     # print the string
        li      $v0, 10             # system call for exit
        syscall                     # exit

        .data
str:
        .asciiz "Hello world!\n"    # NUL terminated string, as in C
Things to know...
• Instructions and pseudo-instructions
• Register conventions (strictly enforced)
• Machine language instruction given, find
the corresponding assembly language
instruction
• Code puzzles
• Solve a small programming task
Instruction Word Formats
• Register format
    field:  op-code  rs  rt  rd  shamt  funct
    bits:   6        5   5   5   5      6
• Immediate format
    field:  op-code  rs  rt  immediate value
    bits:   6        5   5   16
• Jump format
    field:  op-code  26 bit current segment address
    bits:   6        26
Machine Language
What does that mean?
• Machine language level programming means
that we have to provide the bit encodings for
the instructions
• For example, add $t0, $s1, $s2 represents
the 32-bit string
00000010001100100100000000100000
• Assembly language mnemonics usually
translate into one instruction
• We also have pseudo-instructions that
translate into several instructions
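To make the bit encoding concrete, the following Python sketch packs the six register-format fields into a 32-bit word and reproduces the bit string for add $t0, $s1, $s2. The helper name encode_r_format and the small register table are illustrative; the register numbers and the funct value 0x20 for add follow the standard MIPS conventions.

```python
# Register numbers for the registers used in this example (standard MIPS numbering).
REG = {"$t0": 8, "$s1": 17, "$s2": 18}


def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the R-format fields (6+5+5+5+5+6 bits) into one 32-bit word."""
    fields = (("op", op, 6), ("rs", rs, 5), ("rt", rt, 5),
              ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add rd, rs, rt uses op-code 0 and funct 0x20; here rd = $t0, rs = $s1, rt = $s2.
word = encode_r_format(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, 0x20)
bits = format(word, "032b")
print(bits)
```

Note the operand order: the destination register is written first in assembly but sits in the third register field (rd) of the instruction word.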
Watson, the case is clear…
• add $t0, $s1, $s2
• 00000010001100100100000000100000
    field:  op-code  rs  rt  rd  shamt  funct
    bits:   6        5   5   5   5      6
• 000000 10001 10010 01000 00000 100000
• Operation and function fields tell the computer to
perform an addition
• 000000 10001 10010 01000 00000 100000
• registers $17 ($s1), $18 ($s2) and $8 ($t0)
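Going the other way (machine word given, find the fields) is a matter of shifting and masking. This is a small Python sketch with a hypothetical helper name, decode_r_format; it assumes the word really is in register format.

```python
def decode_r_format(word):
    """Split a 32-bit register-format word into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
    }


fields = decode_r_format(int("00000010001100100100000000100000", 2))
print(fields)
```

With op = 0 and funct = 32 the instruction is an add, and the register fields 17, 18 and 8 name $s1, $s2 and $t0.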
Number Representations
• Signed and unsigned integers
• Number conversions
• Comparisons
• Overflow rules
Detecting Overflow
• No overflow when adding a positive and a negative
number
• No overflow when signs are the same for subtraction
• Overflow occurs when the value affects the sign:
    • overflow when adding two positives yields a negative
    • or, adding two negatives gives a positive
    • or, subtract a negative from a positive and get a negative
    • or, subtract a positive from a negative and get a positive
Detecting Overflow
Operation   Operand A   Operand B   Overflow if result
A+B         >=0         >=0         <0
A+B         <0          <0          >=0
A-B         >=0         <0          <0
A-B         <0          >=0         >=0
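The four overflow conditions translate directly into sign tests. The sketch below assumes 32-bit two's complement words; the names BITS, to_signed, add_overflows and sub_overflows are made up for this illustration.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low 32 bits of x as a two's complement number."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def add_overflows(a, b):
    """Overflow for A+B: operands have the same sign, result has the other."""
    result = to_signed(a + b)
    return (a >= 0 and b >= 0 and result < 0) or (a < 0 and b < 0 and result >= 0)


def sub_overflows(a, b):
    """Overflow for A-B: operands have different signs, result's sign differs from A's."""
    result = to_signed(a - b)
    return (a >= 0 and b < 0 and result < 0) or (a < 0 and b >= 0 and result >= 0)


INT_MAX = 2**31 - 1
INT_MIN = -2**31
print(add_overflows(INT_MAX, 1), add_overflows(5, -3))
print(sub_overflows(INT_MIN, 1), sub_overflows(7, 3))
```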
Logic Design: Build ALU, etc.
[Figure: 32-bit ALU built by chaining 32 one-bit ALUs, ALU0 through ALU31. Each ALUi takes the inputs ai and bi, a CarryIn, and the shared Operation lines, and produces Resulti and a CarryOut; the CarryOut of each stage feeds the CarryIn of the next stage.]
Logic Design
• Determine truth tables of combinatorial
circuits
• Determine combinatorial circuits from
truth tables
• Determine critical path
• Overflow detection in ALU
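SLT
• 4 operations
• Subtraction output available
• Connect MSB Set output w/ LSB Less input
• Overflow detection
[Figure: (a) 1-bit ALU with inputs a, b, Less, Binvert, CarryIn and Operation, a multiplexer with inputs 0-3 that selects Result, and a CarryOut output; (b) the ALU for the most significant bit, which additionally produces the Set and Overflow outputs through overflow detection.]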
Adders
• Ripple carry
• Carry-lookahead
Fast Adders
Iterate the idea, generate and propagate
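\[
\begin{aligned}
c_{i+1} &= g_i + p_i c_i \\
        &= g_i + p_i (g_{i-1} + p_{i-1} c_{i-1}) \\
        &= g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1} \\
        &= g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \dots + p_i p_{i-1} \cdots p_1 g_0 + p_i p_{i-1} \cdots p_1 p_0 c_0
\end{aligned}
\]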
Two level AND-OR circuit
Carry is known early!
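The expanded sum of products can be checked against the ripple recurrence with a few lines of Python. This is a sketch under the usual textbook assumption that g_i = a_i AND b_i (generate) and p_i = a_i OR b_i (propagate); the slide does not spell out these definitions, and the function names are invented here.

```python
def lookahead_carries(a, b, c0=0, width=4):
    """Carries [c0, c1, ..., c_width] from the two-level AND-OR expansion."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]  # generate
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(width)]  # propagate
    carries = [c0]
    for i in range(width):
        terms = []
        # Terms g_j * p_{j+1} * ... * p_i for j = i, i-1, ..., 0.
        for j in range(i, -1, -1):
            product = g[j]
            for k in range(j + 1, i + 1):
                product &= p[k]
            terms.append(product)
        # Final term p_i * ... * p_0 * c_0.
        product = c0
        for k in range(i + 1):
            product &= p[k]
        terms.append(product)
        carries.append(int(any(terms)))  # OR of all AND terms
    return carries


def ripple_carries(a, b, c0=0, width=4):
    """Carries from the recurrence c_{i+1} = g_i + p_i c_i, one stage at a time."""
    carries = [c0]
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        carries.append((ai & bi) | ((ai | bi) & carries[-1]))
    return carries


print(lookahead_carries(0b0111, 0b0001))
```

In hardware every AND term is evaluated in parallel, so each carry needs only two gate levels after g and p are available; the nested loops here only enumerate those terms.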
Multiplication
      0010      (multiplicand)
   x  1011      (multiplier)
      0010      x 1
     00100      x 1
    001000      x 0  (not added)
   0010000      x 1
   0010110      (product)
Start
1. Test Multiplier0
    • Multiplier0 = 1: 1a. Add multiplicand to product and
      place the result in Product register
    • Multiplier0 = 0: skip step 1a
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
    • No: < 32 repetitions, go back to step 1
    • Yes: 32 repetitions, done
[Figure: multiplication hardware with a 64-bit Multiplicand register (shift left), a 32-bit Multiplier register (shift right), a 64-bit ALU, a 64-bit Product register (write), and a control test.]
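The three flowchart steps map onto a short loop. The following Python sketch models the registers as plain integers and handles unsigned operands only; the function name multiply and the bits parameter (32 by default, 4 for the small worked example) are choices made for this illustration.

```python
def multiply(multiplicand, multiplier, bits=32):
    """Unsigned shift-and-add multiplication following the flowchart steps."""
    product_mask = (1 << (2 * bits)) - 1           # Product and Multiplicand are 2*bits wide
    mcand = multiplicand & ((1 << bits) - 1)       # starts in the low half of its register
    mplier = multiplier & ((1 << bits) - 1)
    product = 0
    for _ in range(bits):                          # one repetition per multiplier bit
        if mplier & 1:                             # 1.  test Multiplier0
            product = (product + mcand) & product_mask  # 1a. add multiplicand to product
        mcand = (mcand << 1) & product_mask        # 2.  shift Multiplicand left 1 bit
        mplier >>= 1                               # 3.  shift Multiplier right 1 bit
    return product


small = multiply(0b0010, 0b1011, bits=4)
print(format(small, "07b"))
```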
Booth Multiplication
Current and previous bit
00: middle of a run of 0s, no action
01: end of a run of 1s, add multiplicand
10: beginning of a run of 1s, subtract multiplicand
11: middle of a run of 1s, no action
Example: 0010 x 0110
Iteration   Step               Mcand   Product
0           Initial values     0010    0000 0110,0
1           00: no op          0010    0000 0110,0
            arith>> 1          0010    0000 0011,0
2           10: prod-=Mcand    0010    1110 0011,0
            arith>> 1          0010    1111 0001,1
3           11: no op          0010    1111 0001,1
            arith>> 1          0010    1111 1000,1
4           01: prod+=Mcand    0010    0001 1000,1
            arith>> 1          0010    0000 1100,0
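The worked example can be replayed in code. The sketch below is one possible Python rendering of Booth's algorithm, not the lecture's own implementation: the Product register is modeled as an upper half, a lower half (initially the multiplier) and one extra bit to the right. The name booth_multiply and the register modeling are assumptions; with an upper half of only `bits` bits, the most negative multiplicand (for example -8 with 4 bits) is not handled.

```python
def booth_multiply(multiplicand, multiplier, bits=4):
    """Booth multiplication of two's complement operands of `bits` bits each."""
    mask = (1 << bits) - 1
    mcand = multiplicand & mask
    upper = 0                   # upper half of the Product register
    lower = multiplier & mask   # lower half starts out holding the multiplier
    prev = 0                    # extra bit to the right of the product
    for _ in range(bits):
        pair = ((lower & 1) << 1) | prev        # current and previous bit
        if pair == 0b01:                        # end of a run of 1s
            upper = (upper + mcand) & mask
        elif pair == 0b10:                      # beginning of a run of 1s
            upper = (upper - mcand) & mask
        # Arithmetic shift right by 1 across upper, lower and the extra bit.
        prev = lower & 1
        lower = (lower >> 1) | ((upper & 1) << (bits - 1))
        sign = upper >> (bits - 1)
        upper = (upper >> 1) | (sign << (bits - 1))
    product = (upper << bits) | lower
    if product >> (2 * bits - 1):               # reinterpret as a signed 2*bits value
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(0b0010, 0b0110))
```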
IEEE 754 Floating Point Representation
• Float – 1 sign bit, 8 exponent bits, 23
bits for significand.
seeeeeeeefffffffffffffffffffffff
value = $(-1)^s \times F \times 2^{E-127}$
with $F = 1 + .ffffffff \ldots fff$
• Double – 1 sign bit, 11 exponent bits, 52
bits for significand
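The single-precision formula can be applied by hand or in a few lines of Python. This sketch covers normalized numbers only, since the formula above does not describe the reserved exponent values 0 and 255; the function names decode_float and float_to_bits are illustrative, while struct is Python's standard module for packing binary data.

```python
import struct


def decode_float(bits):
    """Apply (-1)^s x F x 2^(E-127) to a 32-bit pattern (normalized numbers only)."""
    s = (bits >> 31) & 0x1         # 1 sign bit
    e = (bits >> 23) & 0xFF        # 8 exponent bits
    f = bits & 0x7FFFFF            # 23 fraction bits
    if e == 0 or e == 0xFF:
        raise ValueError("reserved exponent: zero/denormalized or infinity/NaN")
    significand = 1 + f / 2**23    # F = 1 + .fff...f
    return (-1) ** s * significand * 2.0 ** (e - 127)


def float_to_bits(x):
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


pattern = float_to_bits(0.15625)   # 0.15625 = 1.25 x 2^-3
print(hex(pattern), decode_float(pattern))
```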
Processor
• Be able to build the datapath
• Be able to explain issues concerning
datapath and control
Control
[Figure: single-cycle datapath with control. Components: PC, instruction memory, register file (read register 1, read register 2, write register, write data, read data 1, read data 2), sign extend (16 to 32 bits), shift left 2, two adders (PC + 4 and branch target), ALU with Zero output, ALU control, data memory, and multiplexers. Control signals: RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite. Instruction fields shown: [31-26], [25-21], [20-16], [15-11], [15-0], [5-0].]
Single- versus Multicycle Processor
• What are the differences?
• What is executed during the 7th cycle?
• How many cycles do we need?
Summary
Step: Instruction fetch (all instructions)
    IR = Memory[PC]
    PC = PC + 4
Step: Instruction decode/register fetch (all instructions)
    A = Reg[IR[25-21]]
    B = Reg[IR[20-16]]
    ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Step: Execution, address computation, branch/jump completion
    R-type:            ALUOut = A op B
    Memory-reference:  ALUOut = A + sign-extend(IR[15-0])
    Branches:          if (A == B) then PC = ALUOut
    Jumps:             PC = PC[31-28] || (IR[25-0] << 2)
Step: Memory access or R-type completion
    R-type:            Reg[IR[15-11]] = ALUOut
    Memory-reference:  Load: MDR = Memory[ALUOut]
                       or Store: Memory[ALUOut] = B
Step: Memory read completion
    Memory-reference:  Load: Reg[IR[20-16]] = MDR
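Reading the summary as "each instruction class finishes in the last step that lists an action for it" gives the cycle counts used in the Python sketch below: 3 for branches and jumps, 4 for R-type and stores, 5 for loads. The class names, the helper functions and the three-instruction program are hypothetical; the sketch only shows how questions such as "what is executed during the 7th cycle?" can be answered mechanically.

```python
# Cycles per instruction class, i.e. the last step in the summary that has an action.
CYCLES = {
    "r-type": 4,   # finishes in "Memory access or R-type completion"
    "load": 5,     # finishes in "Memory read completion"
    "store": 4,    # finishes in "Memory access or R-type completion"
    "branch": 3,   # finishes in "Execution, address computation, branch/jump completion"
    "jump": 3,     # finishes in the same step as branches
}


def total_cycles(instruction_classes):
    """Total cycles for a sequence of instructions on the multicycle processor."""
    return sum(CYCLES[cls] for cls in instruction_classes)


def instruction_in_cycle(instruction_classes, cycle):
    """Return (instruction index, step number) active in the given 1-based cycle."""
    elapsed = 0
    for index, cls in enumerate(instruction_classes):
        if cycle <= elapsed + CYCLES[cls]:
            return index, cycle - elapsed
        elapsed += CYCLES[cls]
    raise ValueError("cycle is past the end of the program")


program = ["load", "r-type", "branch"]   # hypothetical instruction sequence
print(total_cycles(program), instruction_in_cycle(program, 7))
```

For this sequence the load occupies cycles 1-5, so cycle 7 is the second step (instruction decode/register fetch) of the R-type instruction.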
Need for Speed
• You need to be able to answer the
questions in a short time (75 minutes)
• Routine calculations, such as number
conversions, should not slow you down
• Read the chapters very carefully!
• Many repetitions will help you to gain a
better understanding