CS 232: Computer Architecture II - Parallel Programming Laboratory

Transcript CS 232: Computer Architecture II - Parallel Programming Laboratory

CS 232:
Computer Architecture II
Prof. Laxmikant (Sanjay) Kale
CS 232 Objectives:
• Learn about how a computer system, and specifically its
processor, works
–
–
–
–
Assembly language programming
Instruction Set Architecture
Basic Processor design: Data-path and control
An overview of advanced topics
• with specific topics covered in depth
• Example: Pipelining, Caches, I/O, Multiprocessors
Some of the Dos and Don’t
• Must use the class web page regularly
– Important announcements, syllabus, lecture notes (slides),
assignments
• Also, must read the class newsgroup regularly
–
–
–
–
Frequently asked questions
Timely response
Make sure you don’t post solutions
No trash tolerated!
Lets begin with numbers
• Say you wanted to build a machine that calculates the
exact square of any given number quickly.
– 100 years ago
– May want to use the same idea for other things
• calculate cubes?
• Other number calculations?
• Need to decide:
– How are you going to represent the numbers?
– What are you going to build the machine out of?
• Need a basic “smart” component
Representing numbers
• Focus on integers
• Examine our decimal number system:
– Much nicer compared to roman numbers
• How do you add two large numbers in roman representation?
• With weighted positions (ones, tens, hundreds) addition is
easy.
– In fact, we can describe a simple procedure (“algorithm”) for doing it.
– But choice of “10” as the base is somewhat arbitrary
• What base is better for our machine?
– Base 10: need 10 symbols
– Base 2: need 2 symbols (0,1)
• We can probably make machines that can represent 0 and 1
– so that they don’t mix with each other
Binary number representation
• You have done this in CS231
– 11011 is 27
– 35 is 100011
– Addition and subtraction rules are the same as decimal numbers
• Let us agree to use this in our machine
– Still have the problem of having to translate between decimal
system (that we understand) and binary system, that our machine
understands
– Assume we will do it manually, for now
Smart component: Switch
• A switch is either on or off
– To really compose switches to build a machine, we need to be
able to control a switch
• (I.e. if switch A is ON, it will also change switch B to the ON
position).
– Let us assume we have electricity
• Basics: Voltage, Current, Flows if switch is on
• Make a controllable switch
– such that if input voltage is “High”, the switch is “ON” (closed)
– Let us assume “High Voltage” represents 1 and “Low” 0.
Gates
• We can now connect switches together:
– Two switches (A and B) connected in series:
• If both are “ON”, the output is HIGH
• So, if the input to both switch A and switch B is High, the
output of the composite circuit is High
• Let us call this the “AND” circuit (or AND gate)
– You can also connect switches such that:
• If either of the switch is ON, the output is HIGH
• Connect the switches in parallel
• So, if input to A is HIGH or input to B is HIGH, output is
High: OR gate
– NOT gate:
• Input High, Output Low, and vice versa
Gates to Adders
• We can now connect gates together to make an adder:
– Half adder (ignores carry):
• Given two inputs, if exactly one of them 1, then output 1
• Also output carry if both are 1
• How can we connect gates together to make this circuit?
– Full adder:
• uses carry
• 3 inputs lines, two output lines:
– Multi-bit adder:
• Feed the carry of least significant bit to the next higher one
So, we have an adder
• But we needed to square a number….
• The “adder” is just a simple calculating device
– We could connect a lot more of those to make a multiplier, or a
squarer.
– But, we can reuse the same hardware.
– Challenge: just use one adder (say 32 bit adder).
• We need one more type of device:
–
–
–
–
–
To store intermediate results
“Memory”: or registers
We tell the adder to repeatedly add M to a running total.
So, we need to store the running total and M somewhere
Also need to to remember how many times to add (Count)
How to make registers
• Switches can be connected to make a memory device
– Latches and Flip-Flops
Reset
Set
NOR
NOR
A register is a set latches grouped together
What does it look like now?
• We have:
– 3 32 bit “registers” : M, result, count
– 1 32 bit adder
– How are they connected?
• We need to bring back M and result to the inputs of the adder
• Store the sum back in result
• bring “count” and “1” to the adder
• store sum in count
• We can use gates to select which register is connected to the
input of the adder: multiplexers
– Also, must decide when the count has reached M
– How to tell when to connect which input to the adder??
Multiplexers
• Multiplexers connect one of their inputs to the output,
– and allow one to choose which ones to select
– can be implemented with gates as well
Input
Control
Instructions and “Stored Program”
• Let us have a “control unit”
– Tells which inputs are to be connected,
– Where to store the output
• How does the control unit know what to do?
– Store instructions for it in another set of registers?
– What is an instruction?
• Identifies Input registers, and output register,
Registers
Multi
plexor
Control Unit
Adder
Instructions
The SIS CPU
Registers
Multi
plexor
Adder
Multi
plexor
PC: Program Counter
Control Unit
Instructions
memory
with
multiplexor
Let us add memory
• When the amount of data is large:
–
–
–
–
Can’t keep on adding registers
Memory: a linear, random-access storage
Cheaper, slower than registers
Now we need to bring the data from memory into registers and
vice versa
The SIS CPU
Data
Memory
Registers
Multi
plexors
Load and
Store
Unit
Control Unit
Adder
Multi
plexor
PC
Instructions
memory
with
multiplexor
The instruction set so far
• Add R1, R2, R3
– Add contents of registers R1 and R2 and store result in R3
• Load R1, R2
– Load contents of R2 with memory location indicated by R1
• Store R1,R2
– Store contents of R2 in memory location indicated by R1
• Need to decide when to stop adding (for squaring)
– It suffices to compare two registers, and change PC based on the
result
– BEQ R1,R2, R3
• If contents of R1 and R2 are identical, change PC to contents
of R3
A few more instructions
• Loading fixed numbers into registers:
– have to be careful: instructions have only 16 bits, of which 4 are
opcode
– LoadThis R1, Data
• It is called : LoadImmediate (LDI R1, Data)
• Data can be 8 bits only
• Where does it go? In the lower order 8 bits of the register?
• A couple more branching instructions
– BNE R1, R2, R3 // branch to R3 if not equal
– BR R1 // unconditional branch
The machine language
• Each instruction is:
–
–
–
–
16 bits
1 3-4 bit field to specify operation (need only 3 for now)
3 4-bit fields to identify operand registers
Assign meanings to opcode bit combinations:
0001: Load
1000: Input
0010: Store
1001: Output
0011: Add
1010: LDI
0100: BEQ
1011: SetZero
0101: BNE
0110: BR
The squaring program
• Read N
– and output square of N
0
Input R1
1000 0001 xxxx xxxx
2
setzero R2 // count = 0;
Complete the program
4
setzero R3 // result = 0
6
LDI R5, 00001010 // ten
8
LDI R4, 00000001 // R4 has one
10
add R1, R3, R3
12
ADD R4,R2, R2
14
BNE R1,R2, R5
16
Output R3
18
Stop
Question: is this program correct? Check
it and report: correct if needed
The point of this exercise:
• Get the “big picture”
– Bottom up:
• we know exactly how the whole machine works, assuming
only that a “controllable switch” is available
– The entire machine architecture, with all the fundamental
components is introduced in a simple setting
– A simpler machine language to get you started
• Later (soon!)
– We switch to a real machine language (MIPS, in Chapter 3)
– Chapter 3 is about instruction set architecture,
• So we will then ignore how it can be implemented
• return to implementation in Chapter 5
–
Another program:
• Suppose we want to print squares of M numbers, starting
with N
Higher level language:
– input sequence: 5, 3
– output sequence: 25, 36, 49
• Write in Higher language
• Then:
– convert to symbolic lang.
– then to machine language
• We will program in
– The “symbolic language”
– Called the assembly language
– Assemblers translate this
read n, m
x = n;
while (x<=m) {
r=0; c=0;
while (c!=x)
{ r = r+x; c= c+1;}
output r;
x = x+1;}

CS 232: Computer Architecture II - Parallel Programming Laboratory

Transcript CS 232: Computer Architecture II - Parallel Programming Laboratory

Directory