Transcript Here
Computer System
• Building blocks of a computer system:
• Using bits
– Binary data and operations
– Logic gates
•
•
•
•
•
Units of measuring amount of data
CPU vs. memory
(Operating System)
Programming languages
Models of computation, e.g. Turing Machine
Binary
• All information inside computer is in binary
• Smallest unit of data is the bit
• Only the values 0 and 1 are used
0 means “false” or “off” or the number 0
1 means “true” or “on” or the number 1
• Individual bit values can be manipulated with
Boolean operations: “and”, “or”, “not”, etc.
– In hardware, we implement these operations with
logic gates.
Boolean examples
• AND
– To graduate, you must have 128 credits and 2.0 GPA.
• OR
– Classics scholarship requires 3 years of Latin or 3 years of
Greek.
• XOR (“exclusive” or)
– To go to Cincinnati, you can fly or drive. In other words, it
doesn’t make sense to do both.
– Do you want a 2-door or a 4-door car?
• NOT
– If a statement is true, its negation is false, and vice versa.
Gates
• Basic building blocks of
CPU’s circuitry.
• Usually 2 inputs.
• X and Y could be 0 or 1.
• Combining gates into a
circuit:
– The output of one gate
becomes input to another.
– This is how more useful
operations are performed.
‘AND’ and ‘OR’
AND
OR
X
Y
ans
X
Y
ans
1
1
1
1
1
1
1
0
0
1
0
1
0
1
0
0
1
1
0
0
0
0
0
0
Note:
0 AND (anything) = 0
1 OR (anything) = 1
XOR
• XOR basically says,
“either but not both”
• The output is 1 if both
inputs are different.
XOR
X
Y
Ans
1
1
0
1
0
1
0
1
1
0
0
0
NOR, NAND
• NOR gate
– Negation of the OR
– Same as feeding output of OR into a NOT gate.
– Symbol for NOR gate is same as OR but with a
loop on the end.
• NAND gate
– Negation of the AND…. analogous to NOR.
• Interesting property:
– NOR and NAND are universal gates. Any other
boolean operation can be implemented by using
several NAND’s or several NOR’s.
Units of data size
• Bit = a single 0 or 1 value
•
Nibble = 4 bits = 1 hexadecimal digit
• Byte = 8 bits
•
•
•
•
Kilobyte (KB) = 210 bytes
Megabyte (MB) = 220 bytes
Gigabyte (GB) = 230 bytes
Terabyte (TB) = 240 bytes
• 210 = 1024, though 1000 is a close approx.
CPU and memory
• CPU’s job is to obey instructions and do calculations
• Memory system stores information for current and future
use
– CPU has tiny number of “registers” for calculations
– main memory (RAM) stores all files currently open
– Secondary memory (e.g. hard drive) is for long-term
storage of files
– Backup system: tape, external hard drive
• Other types of memory:
– Cache, between CPU and RAM
– Removable drive, e.g. USB or DVD
RAM
• Runs on electricity: volatile but fast
• Each byte is numbered and addressable
– Capable of holding a single character or small #
Address
Contents
0
“c”
1
“a”
2
“t”
3
9
4
25
5
100
…
…
CPU, memory
• Contrast between levels of memory
– Tradeoff between cost / size / speed
• Manipulating data by performing instructions
• “What is going on in the CPU?”
• Handout
– A simple machine language
Memory comparison
Type
Size
Access time
Cost per MB
CPU registers
256 bytes
1 ns
N/A
Cache
64 KB
2 ns
$ 20
RAM
512 MB
50 ns
$ 0.20
Disk
200 GB
100,000 ns
$ 0.0002
Numbers are approximate.
“ns” means nanosecond = 1 billionth of a second
Basic computer anatomy
• Inside a computer are 2 parts
– CPU
– Memory
– These are connected by a data bus: an “HOV lane” where traffic
can go either way.
• CPU contains:
– ALU: arithmetic and logic unit
– Control unit: figures out what to do next
– Registers to hold values needed for calculation
• Memory (RAM) contains:
– Software: list of instructions the CPU needs to perform
– Data: Input and output values need to be stored while program
runs
Stored program idea
• Program = software = list of instructions for CPU to do
• Programs reside in memory
• CPU will do 1 instruction at a time
• For each instruction, we do the following:
–
–
–
–
Fetch it from memory
Decode – figure out what it means
Execute – do it
And then we continue with the next instruction… until the
program is finished.
Simple example
• A program to add two numbers.
• This program may reside at bytes 100-116 in RAM.
• The two numbers we wish to add are located at bytes
200 and 204 in RAM.
• We want the result to go into memory at byte 208.
• Program may go something like this:
–
–
–
–
Load the value at Memory[200] into register 1.
Load the value at Memory[204] into register 2.
Add registers 1 and 2, and put result in register 3.
Store the value from register 3 into Memory[208].
• Note that the bus is communicating instructions (RAM to
CPU) as well as data (both ways).
Machine language
• Unfortunately, instructions for CPU can’t be in English,
French, etc.
• Machine language = binary (or hex) representation of
our instructions.
– Each type of computer has its own machine
language.
• This is the original form of “computer programming”.
• Verbs: Instruction set. e.g. Add, subtract, load, store…
• Nouns: Operands such as: registers, memory locations,
constants, other instructions
Verbs
3 kinds of instructions (instruction set)
• Data transfer, using the bus
– Load a value from memory into a CPU register
Very similar to fetching an instruction!
– Store a value from a CPU register into memory
• ALU
– Bit manipulation: AND, OR, XOR, NOT, shift left, shift right, …
– Arithmetic: add, sub, mul, div, remainder, =, <, >, , , ≠, …
• Control
– “Go to” another instruction in program. In other words, interrupt
normal sequence of instructions.
– Can be conditional or unconditional
Example language
• Let’s consider very simple HW.
• 256 bytes of RAM: addressable by 8 bits
• CPU contains
– Instruction register (to store contents of instruction)
– Program counter (to indicate instruction’s address)
– 16 general purpose registers: addressable by 4 bits
• Each register is 1 byte
• Each instruction is 2 bytes = 16 bits = 4 hex digits long
• Instruction format:
– First 4 bits are the opcode = specify which instruction type
– Other 12 bits are operand(s)
• What do instructions mean?
Machine language
• Machine language examples
– Don’t memorize…
• Instruction execution
• Operations in instruction set
– Performing arithmetic sometimes requires load / store
instructions in addition to the arithmetic instruction
– Instructions to manipulate bits directly
Example instructions
• Note: 16 possible opcodes: 4 bit opcode
• Note: 16 possible registers: register number also 4 bits
• Opcode 5 is used for adding
– Expects 3 register operands
– 5RST means R = S + T, where R, S and T are register numbers
– Ex. 5123 means
Add registers 2 and 3 and put result in register 1.
• Opcode 2 is for putting a constant in a register
– Expects a register operand, and an 8-bit constant operand
– 2RXX means R = XX, where XX is some 8-bit pattern
– Ex. 27c9 means
Put the hexidecimal “c9” into register 7.
• Try an example using both types of instructions.
More instructions
• Opcode 1 is for loading a memory value into a register
– Expects a register operand (4 bits), and a memory address from
which to load (8 bits).
– Ex. 1820 means to go out to memory at address [20], grab the
contents and load it into register 8. (It does not mean put the
number 20 in register 8.)
• Opcode 3 is a store = opposite of load
– Ex. 3921 means to take the value in register 9, and put it into
memory at location [21]. (It does not mean put the number 9 into
memory location 21.)
• Opcode C (hex code for 12) is for telling CPU it’s done.
– Expects operand to be 12 zero-bits.
Some practice
Refer to handout…
• How would we put the number 64 into memory
at address 12?
• How would we add the numbers 6 and 8 and put
the result in register 1?
• How would we add register 7 to register 5 and
put the answer in memory at address 32?
Execution
• In our example, each instruction is 2 bytes long.
• Program counter (PC) begins at address of first
instruction.
• For each instruction:
– Fetch (and increment PC by 2)
– Decode
– Execute
• Note that RAM contains both instructions and data,
separated from each other. For example, addresses 099 could be reserved for code.
Bitwise
• Operations that manipulate bits directly
– Logical
– Shift
Logic operations
• Work just like gates, but we do several bits in parallel.
• Examples
10101110
01101011
AND 11110000
AND 00011111
• Try the same examples with “OR” and “XOR”
• Observations:
– What happens when you AND with a 1? With a 0?
– What about OR’ing with a 1 versus a 0?
– What about XOR?
• ASCII code: how do you capitalize a letter?
Shift operations
• Given a bit pattern like 00011100, we can shift the bits
left to obtain:
00111000.
• If we shift to the right instead, 00011100 becomes
this:
00001110.
• We can even shift by more than one position.
– Shifting 01010000 by 3 bits right 00001010.
• Sometimes when we shift, 1’s fall off the edge.
– Shifting 01010000 by 2 bits left 01000000.
• When we shift, the “vacated” bits are usually 0.
Why shift?
• One application of a shift operation is to:
– Multiply by 2: left shift
– Divide by 2: right shift
– Try some examples – should look familiar with our earlier work
on binary numbers.
• One funny exception: dividing a (signed) negative
number by 2. We need a different operation: arithmetic
right shift.
– In this case, we want the vacated bit to be 1
– Example: –12 in signed is 11110100.
If we shift right by 1, we get 01111010, but it should be
this:
11111010.
Rotate
• Rotate operations work the same as shift… except that
the vacated bits come from the other end of the number.
• So, instead of 1’s falling off the edge, they rotate.
• For example, 01010000 rotated left by 2 becomes
01000001.
• Also: 00001111 rotated right by 3 becomes: 11100001.
Summary
• Here is a list of bitwise operators:
• Logical
– and, or, xor, not
• Shift
–
–
–
–
–
sll (Shift left logical)
srl (Shift right logical)
sra (Shift right arithmetic)
rol (Rotate left)
ror (Rotate right)
Language evolution
• Machine language
• Assembly language
– Like machine language, also unique to each manufacturer
• High-level language
–
–
–
–
–
FORTRAN, COBOL
Pascal, Algol, Ada
C, C++, C#
Java, Javascript, Python
many more
Example
• How would we calculate:
12 + 22 + 32 + … + 202 ?
• Let’s create our own solution, and see what the “code”
looks like in different types of languages:
– Machine language
– Assembly language
– High-level language
Machine language
00003000:
00004000:
00004004:
00004008:
0000400c:
00004010:
00004014:
00004018:
0000401c:
00004020:
00004024:
00000014
200c0001
20080000
3c0a0000
354a3000
8d4a0000
018a4822
1d200005
018c0018
00005812
010b4020
00004028:
0000402c:
00004030:
00004034:
218c0001
08001005
2008000a
0000000c
help me!
Assembly language
numValue: .word 20
__start:
addi $12, $0, 1
addi $8, $0, 0
lui $10, 0
ori $10, $10, 0x3000
lw $10, 0($10)
while:
sub $9, $12, $10
bgtz $3, end
mult $12, $12
mflo $11
add $8, $8, $11
addi $12, $12, 1
j while
end:
addi $8, $0, 10
syscall
HLL (Pascal)
var
sum : integer;
count : integer;
begin
sum := 0;
for count := 1 to 20 do
sum := sum + count * count;
writeln(sum);
end.
3 ways to create code
Write
HLL code
compiler
Write
assembly
code
assembler
Write
machine
code
Machine
code
Machine
code
What does a compiler do?
• Scan
– Break up program into tokens
– Remove comments
• Check syntax
– Understand the structure of the program
– Do all statements obey rules of language?
• Generate code
– Create appropriate machine/assembly instructions
– Optimize operations to save time
• We need a compiler for each language and
architecture.
Thinking
• How computers think
– Concept of “state”
– Turing machine model
– Finite state machine model
a.k.a. “Finite automaton”
State
• Fundamental concept for any computation
– Machine keeps track of where it is, what it needs
– a.k.a. Status, mode
– state may be stored in some memory cell
• Many examples
–
–
–
–
Logging in
Using a dialog box, or other user-interface
Fax machine, photocopier, telephone
Car transmission
Examples
• In a Tic-Tac-Toe game, the “state” of the
game would include:
– Whose turn it is
– Is the game over? Who won, or was it a tie?
State is determined by looking at the board.
• Backgammon (roll dice, move pieces…)
– Depending on your situation in the game, some
moves are illegal.
• Another way to think about states is to
consider all possible board configurations!
Turing machine
• Alan Turing, 1936
• Any general purpose machine must:
–
–
–
–
Work automatically
Be aware of what state it’s in
Have sufficient memory
Be able to do I/O, and be able to read the input many
times if necessary
• Powerful model, but tedious to work with
Adder example
• 4 possible final states, depending on the
inputs
– For example, (S = 0 and C = 0) would be one
outcome.
• Programming the details make working with
real TMs a headache.
x
y
S
z
C
Finite Automata
singular: finite automaton
• Simple model for machine behavior.
• Purpose is to accept or reject some input
– Examples: logging in, using a wizard, game
• At any given time, machine is in some “state”
– Start state
– Final (or accept, “happy”) states
– Dead states
• Transitions between states
Example
• Vending machine for 25¢ item.
0
+5
5
+5
+10
+10
+25
25
+5
20
10
+5
+5
15
Binary example
• We want a “word” starting with “101…”
1
need
101
need
01
0
need
1
1
1
0
0
0,1
What does this FA do?
1
A
0
B
1
0
Example
• We want a word with at least two 0’s.
need
two
1
•
0
need
one
0
1
What if we wanted exactly two 0’s?
0,1
Regular language
• Set of input strings that can be “accepted” or
recognized by a FA.
–
–
–
–
Credit card numbers
Social security numbers
Phone numbers
Date / Time (e.g. to enter into reservation system)
• Some FAs are too big to draw, so instead we
describe with regular expression.
– Shows general format of the input
Regular expression
• Use “wild cards” to make a general expression.
• ? = can replace any single character
• * = can replace any number of characters
• [ ] = can hold a range of possible valid characters
• Examples
105* = anything starting with 105
feb??.ppt = file names like feb25.ppt or feb04.ppt
furman*.xlsx = any spreadsheet about Furman
Version[123].txt = version1.txt, version2.txt, version3.txt
Dijkstra’s algorithm
• How do you find the shortest path in a network?
• General case solved by Edsger Dijkstra, 1959
4
7
9
6
7
3
8
4
2
1
3
6
• Let’s say we want to go from “A” to “Z”.
• The idea is to label each vertex with a
number – its best known distance from
A. As we work, we may find a cheaper
distance, until we “mark” or finalize the
vertex.
1. Label A with 0, and mark A.
2. Label A’s neighbors with their
distances from A.
3. Find the lowest unmarked vertex and
mark it. Let’s call this vertex “B”.
4. Recalculate distances for B’s
neighbors via B. Some of these
neighbors may now have a shorter
known distance.
5. Repeat steps 3 and 4 until you mark Z.
A
4
7
2
B
3
C
4
Z
First, we label A with 0. Mark A as final.
The neighbors of A are B and C. Label B
= 4 and C = 7.
Now, the unmarked vertices are B=4 and
C=7. The lowest of these is B.
Mark B, and recalculate B’s neighbors via
B. The neighbors of B are C and Z.
– If we go to C via B, the total
distance is 4+2 = 6. This is better
than the old distance of 7. So relabel C = 6.
– If we go to Z via B, the total
distance is 4 + 3 = 7.
A
4
7
2
B
3
C
4
Z
Now, the unmarked vertices are C=6 and
Z=7. The lowest of these is C.
Mark C, and recalculate C’s neighbors via
B. The only unmarked neighbor of C
is Z.
– If we go to Z via C, the total
distance is 6+4 = 10. This is worse
than the current distance to Z, so
Z’s label is unchanged.
The only unmarked vertex now is Z, so
we mark it and we are done. Its label
is the shortest distance from A.
A
4
7
2
B
3
C
4
Z
A
Postscript. I want to clarify something…
The idea is to label each vertex with a
number – its best known distance from
A. As we work, we may find a cheaper
distance, until we “mark” or finalize the
vertex.
4
2
B
When you mark a vertex and look to
recalculate distances to its neighbors:
– We don’t need to recalculate
distance for a vertex if marked. So,
only consider unmarked neighbors.
– We only update a vertex’s distance
if it is an improvement: if it’s shorter
than what we previously had.
7
3
C
4
Z
Shortest Paths
• Dijkstra’s algorithm:
What is the shortest distance between 2 points in a
network/graph ?
• A related problem:
What is the shortest distance for me to visit all the points
in the graph and return home?
This is called the traveling salesman problem. Nobody
knows how to solve this problem without doing an
exhaustive search! Open question in CS: why is this
problem so hard?
B
8
A
9
12
6
2
C
5
4
6
3
E
4
D