COMPENG 701 - McMaster University

BASICS
Hardware components
Computer Architecture
• Computer Organization
– The von Neumann architecture
– Same storage device for both instructions and data
– Processor components
• Arithmetic Logic Unit
• Control Unit
• Registers
Computer Architecture
• Device Controllers
– Memory mapped I/O
– Direct Memory Access (DMA)
• Instruction Set
– Data transfer operations
– Arithmetic / logic operations
– Control flow instructions
Computer components
• Central Processor Unit
– e.g., G3, Pentium III, RISC
• Random Access Memory
– generally lost when power cycled
• Video RAM
– amount sets screen size, color depth
• Read Only Memory
– used for boot
• Input/Output, through interfaces such as
– Small Computer System Interface (SCSI)
– Universal Serial Bus (USB)
– FireWire (video standard)
– Ethernet
• Hard Disk Drive (permanent storage),
Compact Disk Read-Only-Memory,
CD-Read/Write, DVD R/W, etc.
A typical architecture (fragment)
CPU basics
• Smallest thing a computer knows
– a bit 0 or 1 (false/true)
• CPU knows how to perform and, or, xor
(exclusive or) operations
– And returns true if both are true
– Or returns true if either true
– Xor returns true if different
• CPU is a massive collection of and and or gates
• A specific CPU has a set of instructions it can
execute (usually 50-100, machine language)
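The three bit operations above can be tabulated with a few lines of Python. This is only an illustrative sketch; the helper name `truth_table` is made up for this example and is not part of the course code.

```python
def truth_table():
    """Return rows (a, b, a AND b, a OR b, a XOR b) for all single-bit inputs."""
    rows = []
    for a in (0, 1):
        for b in (0, 1):
            # &, | and ^ are Python's bitwise and, or and xor operators
            rows.append((a, b, a & b, a | b, a ^ b))
    return rows


print("a b and or xor")
for a, b, and_, or_, xor_ in truth_table():
    print(a, b, and_, " ", or_, " ", xor_)
```

Note that and is true only in the row where both inputs are 1, while xor is true exactly in the rows where the inputs differ.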
CPU basics
• Number of instructions per second is set by the
“clock speed”
– e.g., 500 MHz Pentium III
• One clock tick is called a cycle
– modern CPUs can often execute >1 instruction per cycle
• Programs are sets of instructions to be executed by
the CPU
– compilers/linkers or interpreters translate your code into these instructions for you
• Floating point speed is measured in floating point
operations per second (FLOPS)
Data
Bits/Bytes and Words
• Bits are grouped into larger units
– 8 bits = 1 byte (still common)
– 2/4/8 bytes = word (varies between CPUs)
– Most desktop machines are 32-bit words,
64-bit machines are becoming more common
• set by data bus
– Why important?
• Sets minimum size unit you can access in program,
and often the precision for computations
Words and Bytes
• Number of unique values that can be represented
depends on number of bits
• With n bits one can have $2^n$ unique values
• For n = 8 (one byte) this gives $2^8 = 256$ values
– grouped into larger units to represent different data
• ASCII – American Standard Code for Information
Interchange
– Basic version is 7 bit (128 characters)
– A-Z, a-z, 0-9 and special characters
– Values <32 are “control characters”
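As a quick check of the counts on this slide, the sketch below computes the number of values for a given bit width and looks up a few ASCII codes. The function name `unique_values` is an assumption introduced for illustration; `ord` and `chr` are Python built-ins.

```python
def unique_values(n_bits):
    """Number of distinct values representable with n_bits bits."""
    return 2 ** n_bits


print(unique_values(7))   # size of the basic 7-bit ASCII table
print(unique_values(8))   # number of values in one byte

# ord() gives the character code, chr() goes the other way
for ch in ("A", "Z", "a", "z", "0", "9"):
    print(ch, ord(ch))

# codes below 32 are control characters, e.g. 10 is the line feed
print(repr(chr(10)))
```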
Numbers and bases
• Numbers represented in different base systems
– Binary base 2 (0-1)
– Octal base 8 (0-7)
– Hexadecimal base 16 (0-15, with A-F representing
10-15)
– E.g., $54_{10} = 36_{16} = 66_8 = 110110_2$
• Prefixes: kilo = 1024; mega = 1048576;
giga = 1073741824 (approximately $10^3$, $10^6$, $10^9$)
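The base conversions and prefixes above can be verified directly in Python. This is a small sketch using only built-ins; the constant names `KILO`, `MEGA` and `GIGA` are chosen here for illustration.

```python
n = 54

# Built-in formatting to binary, octal and hexadecimal (digits only, no prefix)
binary = format(n, "b")
octal = format(n, "o")
hexadecimal = format(n, "X")
print(binary, octal, hexadecimal)

# int(text, base) parses a string of digits in the given base
print(int("110110", 2), int("66", 8), int("36", 16))

# Binary prefixes are powers of two, close to but not equal to powers of ten
KILO, MEGA, GIGA = 2 ** 10, 2 ** 20, 2 ** 30
print(KILO, MEGA, GIGA)
```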
Instructions
Program execution
“The machine cycle”
Instruction composition
Stored program
Fetch step of the machine cycle I
Fetch step of the machine cycle II
Decoding the instruction
Mnemonics
• It is hard to remember commands as numbers
• Use words associated with the numbers
Some Assembly language
Operating System
OS
• The Operating System (OS)
– Controls every aspect of how the computer works.
– Not specific to a CPU type, but some OSs are often
associated with specific CPUs
• G3/4/5, 68x series: MacOS
• Pentium, x86: DOS (Windows)
• SPARC: Solaris (Unix)
– OS controls I/O and memory management
• Program implementations are dependent on OS
Programming interface to OS
• Depending on language used, OS
interface may or may not be important
• For Fortran, C, C++ when program is
linked OS routines are needed
– How to read from keyboard or file?
– How to write to screen or disk?
• In your program you do not need to go into
the low-level (OS) details
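To illustrate how a language hides the OS details, here is a minimal Python sketch that writes a file and reads it back. The helper name `round_trip` is hypothetical; the calls `tempfile.mkstemp`, `open` and `os.remove` are standard library, and the OS-specific work happens beneath them.

```python
import os
import tempfile


def round_trip(text):
    """Write text to a temporary file on disk and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)  # we reopen the file by name below
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)       # the runtime asks the OS to write to disk
        with open(path, "r", encoding="utf-8") as f:
            return f.read()     # and to read the bytes back
    finally:
        os.remove(path)         # clean up the temporary file


print(round_trip("hello, disk"))
```

The same program runs unchanged on different operating systems because the language runtime, not the program, talks to the OS.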
Storage in memory
• Memory is treated as a linear array of bytes,
addressed from 0 to <size of memory> - 1
• OS keeps track of used and free memory,
for use by programs and data
• Some computers do “byte-swapping”
– the bytes are not counted linearly but rather are
switched
– main (but not only) styles are Big Endian (HP, Sun,
Macs) and Little Endian (PC)
– affects ability to transfer binary data; TCP/IP defines a big-endian
“network byte order” for its own headers, but programs must convert their payload data themselves
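The two byte orders can be seen by packing the same 4-byte integer both ways. This sketch uses Python's standard `struct` module; the variable names are illustrative only.

```python
import struct
import sys

value = 0x01020304

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first
print(big.hex(), little.hex())
print("this machine is", sys.byteorder, "endian")

# Reading little-endian bytes as if they were big-endian gives a different number,
# which is why raw binary files do not always transfer between machines
misread = struct.unpack(">I", little)[0]
print(hex(misread))
```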
Basics revisited
Hard disks
• Contain the computer “file
system”
– allows access through file names
• Directory structure points to file
location
– one reason less space is available
than the nominal size of the disk
(another is calibration tracks)
• Actual content of HD and
directories depends on OS
– e.g., FAT16, FAT32, NTFS for
Windows, EXT2 for Linux
• In general, an OS can only use
its own file systems
Accessing RAM vs. HDD
• The table below lists the highest possible
bandwidth (peak bandwidth) for various
types of RAM
– However, RAM also has to
match the motherboard,
chipset and the CPU
system bus
• HDD ~ only 80 MB/s
• In MATLAB: try save,
load, pack, clear

Module type            Max. transfer, MB/s
SD RAM, PC100            800
SD RAM, PC133           1064
Rambus, PC800           1600
Rambus, Dual PC800      3200
DDR 266 (PC2100)        2128
DDR 333 (PC2700)        2664
DDR 400 (PC3200)        3200
DUAL DDR PC3200         6400
DUAL DDR2-400           8600
DUAL DDR2-533          10600
RAM and “fast” RAM/cache
• A CPU cache is a cache used
by the central processing unit of
a computer to reduce the
average time to access memory
– Access time: roughly speaking
“CPU speed against the bus speed”
Integers
• Integer numbers can be represented exactly (up
to the range allowed by the number of bytes)
• A 2-byte integer: unsigned 0 to 65535, signed
-32768 to 32767 (sometimes called short)
• A 4-byte integer: unsigned 0 to 4294967295,
signed -2147483648 to 2147483647
– (With a 32-bit address bus, can have 4Gbytes of
memory—reason max memory is limited in
computers)
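The integer ranges above follow directly from the number of bits. A minimal sketch, assuming two's complement for signed values; the helper names `int_range` and `wrap_unsigned` are invented for this example.

```python
def int_range(n_bytes, signed):
    """Smallest and largest value of an integer stored in n_bytes bytes."""
    bits = 8 * n_bytes
    if signed:
        # two's complement: one more negative value than positive
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def wrap_unsigned(value, n_bytes):
    """What an unsigned n_bytes integer holds after overflow (wraps around)."""
    return value % (2 ** (8 * n_bytes))


for n_bytes in (2, 4):
    print(n_bytes, "bytes:", int_range(n_bytes, False), int_range(n_bytes, True))

# One past the largest 2-byte unsigned value wraps back to zero
print(wrap_unsigned(65535 + 1, 2))

# A 32-bit address can name 2**32 distinct bytes, i.e. 4 Gbytes
print(2 ** 32 // 2 ** 30, "Gbytes")
```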
Floating point
• Representations vary between machines (often the
reason binary files cannot be shared)
– Precise layout of bits depends on machine and
format; all formats are $(\text{mantissa}) \times 2^{\text{exponent}}$
The IEEE standard for floating
point arithmetic
• Single precision (32 bits=4 bytes)
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF
0 1      8 9                    31
(bit 0: sign S; bits 1-8: exponent E; bits 9-31: fraction F)
The value V represented by the word may be determined as follows:
• If E=255 and F is nonzero, then V=NaN ("Not a number")
• If E=255 and F is zero and S is 1, then V=-Infinity
• If E=255 and F is zero and S is 0, then V=Infinity
• If 0<E<255 then $V = (-1)^S \times 2^{E-127} \times (1.F)$ where "1.F" is intended to represent
the binary number created by prefixing F with an implicit leading 1 and a
binary point
• If E=0 and F is nonzero, then $V = (-1)^S \times 2^{-126} \times (0.F)$. These are
"unnormalized" values.
• If E=0 and F is zero and S is 1, then V=-0
• If E=0 and F is zero and S is 0, then V=0
Single precision floating point
• In particular
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
0 10000000 00000000000000000000000 = $+1 \times 2^{128-127} \times 1.0 = 2$
0 10000001 10100000000000000000000 = $+1 \times 2^{129-127} \times 1.101_2 = 2^2 \times (2^3 + 2^2 + 1)/2^3 = 6.5$
1 10000001 10100000000000000000000 = $-1 \times 2^{129-127} \times 1.101_2 = -6.5$
0 00000001 00000000000000000000000 = $+1 \times 2^{1-127} \times 1.0 = 2^{-126}$
0 00000000 10000000000000000000000 = $+1 \times 2^{-126} \times 0.1_2 = 2^{-127}$
0 00000000 00000000000000000000001 = $+1 \times 2^{-126} \times 0.00000000000000000000001_2 = 2^{-149}$ (smallest positive value)
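The decoding rules for single precision can be written out as a short function and compared against the machine's own interpretation of the same bits. This is a sketch only: `decode_single` and `decode_with_struct` are hypothetical helper names, and the bit patterns are assembled from the examples on this slide.

```python
import math
import struct


def decode_single(bits):
    """Decode a 32-character string of 0s and 1s (spaces allowed) by the rules above."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    s = int(bits[0])          # sign bit
    e = int(bits[1:9], 2)     # 8-bit exponent field
    f = int(bits[9:], 2)      # 23-bit fraction field
    sign = -1.0 if s else 1.0
    if e == 255:
        return math.nan if f else sign * math.inf
    if e == 0:
        # "unnormalized" values (and signed zero): no implicit leading 1
        return sign * 2.0 ** -126 * (f / 2 ** 23)
    return sign * 2.0 ** (e - 127) * (1 + f / 2 ** 23)


def decode_with_struct(bits):
    """Let the struct module interpret the same 32 bits as a big-endian float."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]


examples = [
    "0 10000000 " + "0" * 23,
    "0 10000001 101" + "0" * 20,
    "1 10000001 101" + "0" * 20,
    "0 00000001 " + "0" * 23,
    "0 00000000 1" + "0" * 22,
    "0 00000000 " + "0" * 22 + "1",
]
for pattern in examples:
    print(pattern, "->", decode_single(pattern), decode_with_struct(pattern))
```

Both columns of the printed output should agree, since the hand-written rules and the hardware format describe the same encoding.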
Double precision floating point
S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 1        11 12                                                63
(bit 0: sign S; bits 1-11: exponent E; bits 12-63: fraction F)
The value V represented by the word may be determined as follows:
• If E=2047 and F is nonzero, then V=NaN ("Not a number")
• If E=2047 and F is zero and S is 1, then V=-Infinity
• If E=2047 and F is zero and S is 0, then V=Infinity
• If 0<E<2047 then $V = (-1)^S \times 2^{E-1023} \times (1.F)$ where "1.F" is intended to represent
the binary number created by prefixing F with an implicit leading 1 and a binary
point.
• If E=0 and F is nonzero, then $V = (-1)^S \times 2^{-1022} \times (0.F)$. These are "unnormalized"
values.
• If E=0 and F is zero and S is 1, then V=-0
• If E=0 and F is zero and S is 0, then V=0
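Going the other way, the three fields of a double can be pulled out of a Python float, which is stored in double precision on common platforms. A minimal sketch; `double_fields` is a hypothetical helper name.

```python
import struct


def double_fields(x):
    """Split a float into its (S, E, F) fields: 1, 11 and 52 bits."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))  # the 64 bits as an integer
    s = raw >> 63                    # sign bit
    e = (raw >> 52) & 0x7FF          # 11-bit exponent field
    f = raw & ((1 << 52) - 1)        # 52-bit fraction field
    return s, e, f


for x in (1.0, 6.5, -0.0, 0.1):
    s, e, f = double_fields(x)
    print(x, "S =", s, "E =", e, "F =", format(f, "052b"))
```

For 6.5 the exponent field is 1025 (that is, 2 + 1023) and the fraction starts with 101, mirroring the single precision example; for 0.1 the fraction is a repeating pattern cut off after 52 bits, so 0.1 is not stored exactly.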
Is the finite precision an issue?
• An extended example: condition number of a
symmetric matrix.
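Consider a system of linear equations
\[
\begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} 1999 \\ 1997 \end{pmatrix}
\]
and the perturbed system
\[
\begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix}
\begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix}
=
\begin{pmatrix} 1998.99 \\ 1997.01 \end{pmatrix}
\]
Example
Note
\[
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
but
\[
\begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix} = \begin{pmatrix} 20.97 \\ -18.99 \end{pmatrix}
\]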
What went wrong?
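Before looking for the cause, it helps to confirm that both solutions are exact and not themselves a rounding artifact. The sketch below solves the two systems with exact rational arithmetic using Cramer's rule; `solve_2x2` is a hypothetical helper written for this example, not the posted course code.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] (x, y)^T = (r1, r2)^T by Cramer's rule."""
    det = a * d - b * c
    x = (r1 * d - b * r2) / det
    y = (a * r2 - c * r1) / det
    return x, y


A = (Fraction(1000), Fraction(999), Fraction(999), Fraction(998))

x, y = solve_2x2(*A, Fraction(1999), Fraction(1997))
x_hat, y_hat = solve_2x2(*A, Fraction("1998.99"), Fraction("1997.01"))

print("original: ", float(x), float(y))
print("perturbed:", float(x_hat), float(y_hat))
print("determinant:", A[0] * A[3] - A[1] * A[2])
```

A change of 0.01 in each right-hand side entry moves the solution by roughly 20 in each component, so the sensitivity comes from the matrix itself rather than from floating point rounding.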
Recall an n-by-n matrix A is symmetric if $A = A^T$.
Fact (“spectral decomposition”):
If A is symmetric, it may be written as
$A = U D U^T$,
where D is a diagonal matrix and
U is unitary (i.e., $U U^T = I$, the identity matrix).
Example
Fact: “U is unitary” is “almost the same” as
“U is a rotation matrix”
• “almost the same” because U might include reflections
Fact: For $A = U D U^T$ as before, the diagonal
elements of D are the eigenvalues of A and the
columns of U are the right eigenvectors of A.
Recall t is an eigenvalue of A iff $\det(A - tI) = 0$;
u is the corresponding right eigenvector iff
$Au = tu$.
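For a 2-by-2 symmetric matrix the decomposition can be written in closed form, which makes the "rotation" picture concrete. The following is a sketch under that 2-by-2 assumption, using only the standard `math` module; the names `spectral_2x2` and `reconstruct` are invented here, and the test matrix [[2, 1], [1, 2]] is an arbitrary illustration rather than the class example.

```python
import math


def spectral_2x2(a, b, d):
    """Eigenvalues t1 >= t2 and rotation U for the symmetric matrix [[a, b], [b, d]].

    The columns of U are unit eigenvectors, so that A = U D U^T with D = diag(t1, t2).
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    t1, t2 = mean + radius, mean - radius
    # angle of the eigenvector belonging to t1
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    U = [[c, -s], [s, c]]
    return t1, t2, U


def reconstruct(t1, t2, U):
    """Multiply out U D U^T to check the decomposition."""
    (u11, u12), (u21, u22) = U
    return [
        [t1 * u11 * u11 + t2 * u12 * u12, t1 * u11 * u21 + t2 * u12 * u22],
        [t1 * u21 * u11 + t2 * u22 * u12, t1 * u21 * u21 + t2 * u22 * u22],
    ]


t1, t2, U = spectral_2x2(2.0, 1.0, 2.0)
print("eigenvalues:", t1, t2)
print("U:", U)
print("U D U^T:", reconstruct(t1, t2, U))
```

Here U is a pure rotation by the angle theta; swapping the sign of one column would give an equally valid U that includes a reflection.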
Example
• How does A act on x, step-by-step:
$Ax = U D U^T x = U D (U^T x) = U (D (U^T x))$,
that is, “rotate, scale, rotate back”.
Define the condition number of A as
$\kappa(A) = |t|_{\max}(A) / |t|_{\min}(A)$,
where $|t|_{\max}$ and $|t|_{\min}$ are the largest and the smallest
absolute values of the eigenvalues of A.
$\kappa(A)$ shows how far off the solution to $Ax = b$ may be from
the solution to $Ax = b + E$ (a measure of “relative
singularity”).
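The definition can be tried on small symmetric matrices. The sketch below assumes a 2-by-2 symmetric input and is not the posted course code; `condition_number_2x2` is a hypothetical name, and the two matrices are arbitrary illustrations. The smaller eigenvalue is obtained from the determinant (the product of the eigenvalues) instead of by subtracting two nearly equal numbers, which would itself lose precision.

```python
import math


def condition_number_2x2(a, b, d):
    """Ratio of largest to smallest absolute eigenvalue of [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    det = a * d - b * b
    if det == 0:
        return math.inf  # singular matrix: the smallest eigenvalue is zero
    # eigenvalue of largest magnitude, computed without cancellation
    t_big = mean + radius if mean >= 0 else mean - radius
    # the other eigenvalue follows from det = t_big * t_small
    t_small = det / t_big
    return abs(t_big) / abs(t_small)


print(condition_number_2x2(2.0, 1.0, 2.0))      # well-conditioned
print(condition_number_2x2(1.0, 1.0, 1.0001))   # nearly singular
print(condition_number_2x2(1.0, 1.0, 1.0))      # singular
```

A large value signals that small perturbations E of the right-hand side can produce large changes in the solution.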
Assignment 2
MATLAB and C code are posted on the web.
1. Using the MATLAB script from the class, try to identify the cache size on your
machine. Note that usually a number in MATLAB occupies 8 bytes (double precision
floating point), and that the function requires roughly 2 times the memory needed to
store the x vector (x in the input, x_new in the output). Explain your results. As a
sanity check, you might want to use a CPU info tool.
2. Condition number I: for the class example, answer the following:
a) find the matrix spectral decomposition and compute the matrix condition number,
b) find perturbations E of the right-hand side of the equation with (the Euclidean norm) $\|E\| = 1$
that give the largest and the smallest errors in the solution vector; plot each of the two
perturbations, together with their image under $U^T$, under $D^{-1} U^T$ and, finally, under $A^{-1}$,
c) given the equation for the curve $\|E\| = 1$, find its image under $A^{-1}$ and plot it,
d) find the explicit expression for the condition number $\kappa(A - gI)$ for g > 0.
3. Condition number II: given a 2x2 matrix with double-precision entries, what is the
worst condition number this matrix might have (as a real number)? Explain.
4. Condition number III: for question 2 of the first assignment, is there any
ill-conditioning? (Ill-conditioning refers to a matrix having a large condition number, hence
often resulting in numerical instability; if you happen to encounter complex
eigenvalues, you are on the wrong track; for the definition in class, the matrix must be
symmetric.) Explain your answer.
5. Bonus question: find all the distinct spectral decompositions of the identity matrix.