Transcript PPT
IT 252
Computer Organization
and Architecture
Number Representation
Chia-Chi Teng
Where Are We Now?
CS142 & 124
IT344
Review (do you remember from 124/104?)
• 8 bit signed 2’s complement binary #
-> decimal #
• 0111 1111 = ?
• 1000 0000 = ?
• 1111 1111 = ?
• Decimal # -> 8 bit signed 2’s
complement binary #
• 32 = ?
• -2 = ?
• 200 = ?
Decimal Numbers: Base 10
Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Example:
3271 =
(3x103) + (2x102) + (7x101) + (1x100)
Numbers: positional notation
• Number Base B B symbols per digit:
• Base 10 (Decimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Base 2 (Binary):
0, 1
• Number representation:
• d31d30 ... d1d0 is a 32 digit number
• value = d31 B31 + d30 B30 + ... + d1 B1 + d0 B0
• Binary:
0,1 (In binary digits called “bits”)
• 0b11010 = 124 + 123 + 022 + 121 + 020
= 16 + 8 + 2
#s often written = 26
0b… • Here 5 digit binary # turns into a 2 digit decimal #
• Can we find a base that converts to binary easily?
Hexadecimal Numbers: Base 16
• Hexadecimal:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
• Normal digits + 6 more from the alphabet
• In C, written as 0x… (e.g., 0xFAB5)
• Conversion: BinaryHex
• 1 hex digit represents 16 decimal values
• 4 binary digits represent 16 decimal values
1 hex digit replaces 4 binary digits
• One hex digit is a “nibble”. Two is a “byte”
• 2 bits is a “half-nibble”. Shave and a haircut…
• Example:
• 1010 1100 0011 (binary) = 0x_____ ?
Decimal vs. Hexadecimal vs. Binary
Examples:
1010 1100 0011 (binary)
= 0xAC3
10111 (binary)
= 0001 0111 (binary)
= 0x17
0x3F9
= 11 1111 1001 (binary)
How do we convert between
hex and Decimal?
MEMORIZE!
00
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
0
1
2
3
4
5
6
7
8
9
A
B
C
D
E
F
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
Precision and Accuracy
Don’t confuse these two terms!
Precision is a count of the number bits in a computer
word used to represent a value.
Accuracy is a measure of the difference between the
actual value of a number and its computer
representation.
High precision permits high accuracy but doesn’t
guarantee it. It is possible to have high precision
but low accuracy.
Example:
float pi = 3.14;
pi will be represented using all bits of the
significant (highly precise), but is only an
approximation (not accurate).
What to do with representations of numbers?
• Just what we do with numbers!
• Add them
• Subtract them
• Multiply them
• Divide them
• Compare them
• Example: 10 + 7 = 17
+
1
1
1
0
1
0
0
1
1
1
------------------------1
0
0
0
1
• …so simple to add in binary that we can
build circuits to do it!
• subtraction just as you would in decimal
• Comparison: How do you tell if X > Y ?
Visualizing (Mathematical) Integer Addition
• Integer Addition
• 4-bit integers u,
v
• Compute true
sum Add4(u , v)
• Values increase
linearly with u
and v
• Forms planar
surface
Add4(u , v)
Integer Addition
32
28
24
20
16
14
12
12
10
8
8
4
6
0
4
0
2
u
4
2
6
8
10
12
0
14
v
Visualizing Unsigned Addition
• Wraps Around
Overflow
• If true sum ≥ 2w
UAdd4(u , v)
• At most once
True Sum
16
2w+1
14
Overflow
12
10
8
2w
14
6
12
10
4
8
2
0
6
0
Modular Sum
4
0
2
u
4
2
6
8
10
12
0
14
v
BIG IDEA: Bits can represent anything!!
• Characters?
• 26 letters 5 bits (25 = 32)
• upper/lower case + punctuation
7 bits (in 8) (“ASCII”)
• standard code to cover all the world’s
languages 8,16,32 bits (“Unicode”)
www.unicode.com
• Logical values?
• 0 False, 1 True
• colors ? Ex:
Red (00)
Green (01)
Blue (11)
• locations / addresses? commands?
• MEMORIZE: N bits at most 2N things
How to Represent Negative Numbers?
• So far, unsigned numbers
• Obvious solution: define leftmost bit to be sign!
• 0 +, 1 –
• Rest of bits can be numerical value of number
• Representation called sign and magnitude
• x86 uses 32-bit integers. +1ten would be:
0000 0000 0000 0000 0000 0000 0000 0001
• And –1ten in sign and magnitude would be:
1000 0000 0000 0000 0000 0000 0000 0001
Shortcomings of sign and magnitude?
• Arithmetic circuit complicated
• Special steps depending whether signs are
the same or not
• Also, two zeros
• 0x00000000 = +0ten
• 0x80000000 = –0ten
• What would two 0s mean for programming?
• Therefore sign and magnitude abandoned
Another try: complement the bits
• Example:
710 = 001112 –710 = 110002
• Called One’s Complement
• Note: positive numbers have leading 0s,
negative numbers have leadings 1s.
00000
00001 ...
01111
10000 ... 11110 11111
• What is -00000 ? Answer: 11111
• How many positive numbers in N bits?
• How many negative numbers?
Standard Negative Number Representation
• What is result for unsigned numbers if tried
to subtract large number from a small one?
• Would try to borrow from string of leading 0s,
so result would have a string of leading 1s
3 - 4 00…0011 – 00…0100 = 11…1111
• With no obvious better alternative, pick
representation that made the hardware simple
• As with sign and magnitude,
leading 0s positive, leading 1s negative
000000...xxx is ≥ 0, 111111...xxx is < 0
except 1…1111 is -1, not -0 (as in sign & mag.)
• This representation is Two’s Complement
2’s Complement Number “line”: N = 5
00000 00001
11111
11110
00010
-1 0 1
11101
2
-2
-3
11100
-4
.
.
.
.
.
.
• 2N-1 nonnegatives
• 2N-1 negatives
• one zero
• how many
positives?
-15 -16 15
10001 10000 01111
00000
10000 ... 11110 11111
00001 ...
01111
Numeric Ranges
• Unsigned Values
• UMin
= 0
• Two’s Complement Values
• TMin
=
–2w–1
000…0
• UMax
=
2w
100…0
–1
• TMax
111…1
=
011…1
• Other Values
• Minus 1
111…1
Values for W = 16
UMax
TMax
TMin
-1
0
Decimal
65535
32767
-32768
-1
0
Hex
FF FF
7F FF
80 00
FF FF
00 00
Binary
11111111 11111111
01111111 11111111
10000000 00000000
11111111 11111111
00000000 00000000
2w–1 – 1
Values for Different Word Sizes
W
UMax
TMax
TMin
8
255
127
-128
16
65,535
32,767
-32,768
32
4,294,967,295
2,147,483,647
-2,147,483,648
• Observations
• |TMin | = TMax + 1
Asymmetric range
• UMax = 2 * TMax + 1
64
18,446,744,073,709,551,615
9,223,372,036,854,775,807
-9,223,372,036,854,775,808
C Programming
#include <limits.h>
Declares constants, e.g.,
ULONG_MAX
LONG_MAX
LONG_MIN
Values platform
specific
Unsigned & Signed Numeric Values
X
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
B2U(X)
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
B2T(X)
0
1
2
3
4
5
6
7
–8
–7
–6
–5
–4
–3
–2
–1
• Equivalence
• Same encodings for
nonnegative values
• Uniqueness
• Every bit pattern represents
unique integer value
• Each representable integer
has unique bit encoding
• Can Invert Mappings
• U2B(x) = B2U-1(x)
Bit pattern for unsigned
integer
• T2B(x) = B2T-1(x)
Bit pattern for two’s comp
integer
Two’s Complement Formula
• Can represent positive and negative numbers
in terms of the bit value times a power of 2:
d31 x -(231) + d30 x 230 + ... + d2 x 22 + d1 x 21 + d0 x 20
• Example: 1101two
= 1x-(23) + 1x22 + 0x21 + 1x20
= -23 + 22 + 0 + 20
= -8 + 4 + 0 + 1
= -8 + 5
= -3ten
Two’s Complement shortcut: Negation
*Check out www.cs.berkeley.edu/~dsw/twos_complement.html
• Change every 0 to 1 and 1 to 0 (invert or
complement), then add 1 to the result
• Proof*: Sum of number and its (one’s)
complement must be 111...111two
However, 111...111two= -1ten
Let x’ one’s complement representation of x
Then x + x’ = -1 x + x’ + 1 = 0 -x = x’ + 1
• Example: -3 to +3 to -3
x : 1111 1111 1111 1111 1111 1111 1111 1101two
x’: 0000 0000 0000 0000 0000 0000 0000 0010two
+1: 0000 0000 0000 0000 0000 0000 0000 0011two
()’: 1111 1111 1111 1111 1111 1111 1111 1100two
+1: 1111 1111 1111 1111 1111 1111 1111 1101two
You should be able to do this in your head…
What if too big?
• Binary bit patterns above are simply
representatives of numbers. Strictly speaking
they are called “numerals”.
• Numbers really have an number of digits
• with almost all being same (00…0 or 11…1) except
for a few of the rightmost digits
• Just don’t normally show leading digits
• If result of add (or -, *, / ) cannot be
represented by these rightmost HW bits,
overflow is said to have occurred.
00000 00001 00010
unsigned
11110 11111
Peer Instruction Question
X = 1111 1111 1110 1100two
Y = 0011 1010 0000 0000two
A. X > Y (if signed)
B. X > Y (if unsigned)
C. X = -19 (if signed)
0:
1:
2:
3:
4:
5:
6:
7:
ABC
FFF
FFT
FTF
FTT
TFF
TFT
TTF
TTT
Peer Instruction Question
A: False (X negative)
B: True
C: False(X = -20)
X = 1111 1111 1110 1100two
Y = 0011 1010 0000 0000two
A. X > Y (if signed)
B. X > Y (if unsigned)
C. X = -19 (if signed)
0:
1:
2:
3:
4:
5:
6:
7:
ABC
FFF
FFT
FTF
FTT
TFF
TFT
TTF
TTT
Number summary...
META: We often make design
decisions to make HW simple
• We represent “things” in computers as
particular bit patterns: N bits 2N things
• Decimal for human calculations, binary for
computers, hex to write binary more easily
• 1’s complement - mostly abandoned
00000
00001 ...
01111
10000 ... 11110 11111
• 2’s complement universal in computing:
cannot avoid, so learn
00000 00001 ... 01111
10000 ... 11110 11111
• Overflow: numbers ; computers finite,errors!
Information units
• Basic unit is the bit (has value 0 or 1)
• Bits are grouped together in units and operated on together:
• Byte = 8 bits
• Word = 4 bytes
• Double word = 2 words
• etc.
Encoding Byte Values
• Byte = 8 bits
• Binary
• Decimal:
000000002
010
to
to
25510
111111112
First digit must not be 0 in C
• Hexadecimal
0016
to
FF16
Base 16 number representation
Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
Write FA1D37B16 in C as 0xFA1D37B
•
Or 0xfa1d37b
0
1
2
3
4
5
6
7
8
9
A
B
C
D
E
F
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
Memory addressing
• Memory is an array of information units
– Each unit has the same size
– Each unit has its own address
– Address of an unit and contents of the unit at that address are
different
0
1
2
address
123
-17
0
contents
Addressing
• In most of today’s computers, the basic unit that can be addressed is
a byte. (how many bit is a byte?)
– x86 (and pretty much all CPU today) is byte addressable
• The address space is the set of all memory units that a program can
reference
– The address space is usually tied to the length of the registers
– x86 has 32-bit registers. Hence its address space is 4G bytes
– Older micros (minis) had 16-bit registers, hence 64 KB address
space (too small)
– Some current (Alpha, Itanium, Sparc, Altheon) machines have
64-bit registers, hence an enormous address space
Machine Words
• Machine Has “Word Size”
• Nominal size of integer-valued data
Including addresses
• Many current machines still use 32 bits (4 bytes) words
Limits addresses to 4GB
Becoming too small for memory-intensive applications
• New or high-end systems use 64 bits (8 bytes) words
Potential address space 1.8 X 1019 bytes
x86-64 machines support 48-bit addresses: 256 Terabytes
• Machines support multiple data formats
Fractions or multiples of word size
Always integral number of bytes
Addressing words
• Although machines are byte-addressable, 4 byte integers are the
most commonly used units
• Every 32-bit integer starts at an address divisible by 4
int at address 0
int at address 4
int at address 8
Word-Oriented Memory Organization
32-bit
Words
• Addresses Specify
Byte Locations
• Address of first byte in
word
• Addresses of
successive words differ
by 4 (32-bit) or 8 (64-bit)
64-bit
Words
Addr
=
0000
??
Addr
=
0000
??
Addr
=
0004
??
Addr
=
0008
??
Addr
=
0012
??
Addr
=
0008
??
Bytes Addr.
0000
0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0011
0012
0013
0014
0015
Data Representations
• Sizes of C Objects (in Bytes)
• C Data TypeTypical 32-bit
Intel IA32
x86-64
char
short
int
long
1
2
4
4
1
2
4
4
1
2
4
8
long long
float
double
long double
char *
8
4
8
8
4
8
4
8
10/12
4
8
4
8
10/16
8
•
Or any other pointer
Byte Ordering
How should bytes within multi-byte
word be ordered in memory?
Conventions
Big Endian: Sun, PPC Mac, Internet
Least significant byte has highest address
Little Endian: x86
Least significant byte has lowest address
Byte Ordering Example
• Big Endian
• Least significant byte has highest address
• Big End First
• Little Endian
• Least significant byte has lowest address
• Little End First
• Example
• Variable x has 4-byte representation 0x01234567
• Address given by &x is 0x100
Big Endian
Little Endian
0x100
0x101
0x102
0x103
01
23
45
45
67
0x100
0x101
0x102
0x103
67
45
23
01
Big-endian vs. little-endian
• Byte order within a word:
Value of Word #0
3
2
1
0
Little-endian
(we’ll use this)
0
1
2
3
Big-endian
Memory
address
0
byte
1
2
3
Word #0
Reading Byte-Reversed Listings
• Disassembly
• Text representation of binary machine code
• Generated by program that reads the machine code
• Example Fragment
Address
8048365:
8048366:
804836c:
Instruction Code
5b
81 c3 ab 12 00 00
83 bb 28 00 00 00 00
Assembly Rendition
pop
%ebx
add
$0x12ab,%ebx
cmpl
$0x0,0x28(%ebx)
Deciphering Numbers
Value:
Pad to 32 bits:
Split into bytes:
Reverse:
0x12ab
0x000012ab
00 00 12 ab
ab 12 00 00
Examining Data Representations
Code to Print Byte Representation of Data
Casting pointer to unsigned char * creates byte array
typedef unsigned char *pointer;
void show_bytes(pointer start, int len)
{
int i;
for (i = 0; i < len; i++)
printf("0x%p\t0x%.2x\n",
start+i, start[i]);
printf("\n");
}
Printf directives:
%p:Print pointer
%x: Print Hexadecimal
show_bytes Execution Example
int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));
Result (Linux):
int a = 15213;
0x11ffffcb8 0x6d
0x11ffffcb9 0x3b
0x11ffffcba 0x00
0x11ffffcbb 0x00
Representing & Manipulating Sets
• Representation
• Width w bit vector represents subsets of {0, …, w–1}
• aj = 1 if j A
01101001 { 0, 3, 5, 6 }
76543210
01010101 { 0, 2, 4, 6 }
76543210
• Operations
• &
Intersection
01000001
{ 0, 6 }
• |
Union
01111101
{ 0, 2, 3, 4, 5, 6 }
• ^
Symmetric difference
00111100
{ 2, 3, 4, 5 }
• ~
Complement
10101010
{ 1, 3, 5, 7 }
Bit-Level Operations in C
• Operations &, |, ~, ^ Available in C
• Apply to any “integral” data type
long, int, short, char, unsigned
• View arguments as bit vectors
• Arguments applied bit-wise
• Examples (Char data type)
• ~0x41 -->
0xBE
~010000012
• ~0x00 -->
~000000002
• 0x69 & 0x55
-->
101111102
-->
111111112
0xFF
-->
0x41
011010012 & 010101012 --> 010000012
• 0x69 | 0x55
-->
0x7D
011010012 | 010101012 --> 011111012
Contrast: Logic Operations in C
• Contrast to Logical Operators
• &&, ||, !
View 0 as “False”
Anything nonzero as “True”
Always return 0 or 1
Early termination
• Examples (char data type)
• !0x41 -->
• !0x00 -->
• !!0x41 -->
• 0x69 && 0x55
• 0x69 || 0x55
• p && *p
0x00
0x01
0x01
-->
-->
0x01
0x01
(avoids null pointer access)
Shift Operations
• Left Shift:
x << y
• Shift bit-vector x left y positions
•
Throw away extra bits on left
Fill with 0’s on right
• Right Shift: x >> y
• Shift bit-vector x right y positions
<< 3
00010000
Log. >> 2
00011000
Arith. >> 2
00011000
Argument x
10100010
<< 3
00010000
Log. >> 2
00101000
Arith. >> 2
11101000
Fill with 0’s on left
• Arithmetic shift
01100010
Throw away extra bits on right
• Logical shift
Argument x
Replicate most significant bit on
right
• Undefined Behavior
• Shift amount < 0 or word size
The CPU - Instruction Execution Cycle
• The CPU executes a program by repeatedly following this cycle
1. Fetch the next instruction, say instruction i
2. Execute instruction i
3. Compute address of the next instruction, say j
4. Go back to step 1
• Of course we’ll optimize this but it’s the basic concept
What’s in an instruction?
• An instruction tells the CPU
– the operation to be performed via the OPCODE
– where to find the operands (source and destination)
• For a given instruction, the ISA specifies
– what the OPCODE means (semantics)
– how many operands are required and their types, sizes
etc.(syntax)
• Operand is either
– register (integer, floating-point, PC)
– a memory address
– a constant
Reference slides
You ARE responsible for the
material on these slides (they’re
just taken from the reading
anyway) ; we’ve moved them to
the end and off-stage to give
more breathing room to lecture!
Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
physics.nist.gov/cuu/Units/binary.html
• Common use prefixes (all SI, except K [= k in SI])
Name
Abbr Factor
SI size
Kilo
K
210 = 1,024
103 = 1,000
Mega
M
220 = 1,048,576
106 = 1,000,000
Giga
G
230 = 1,073,741,824
109 = 1,000,000,000
Tera
T
240 = 1,099,511,627,776
1012 = 1,000,000,000,000
Peta
P
250 = 1,125,899,906,842,624
1015 = 1,000,000,000,000,000
Exa
E
260 = 1,152,921,504,606,846,976
1018 = 1,000,000,000,000,000,000
Zetta
Z
270 = 1,180,591,620,717,411,303,424
1021 = 1,000,000,000,000,000,000,000
Yotta
Y
280 = 1,208,925,819,614,629,174,706,176
1024 = 1,000,000,000,000,000,000,000,000
• Confusing! Common usage of “kilobyte” means
1024 bytes, but the “correct” SI value is 1000 bytes
• Hard Disk manufacturers & Telecommunications are
the only computing groups that use SI factors, so
what is advertised as a 30 GB drive will actually only
hold about 28 x 230 bytes, and a 1 Mbit/s connection
transfers 106 bps.
kibi, mebi, gibi, tebi, pebi, exbi, zebi, yobi
en.wikipedia.org/wiki/Binary_prefix
• New IEC Standard Prefixes [only to exbi officially]
Name
Abbr Factor
kibi
Ki
210 = 1,024
mebi
Mi
220 = 1,048,576
gibi
Gi
230 = 1,073,741,824
tebi
Ti
240 = 1,099,511,627,776
pebi
Pi
250 = 1,125,899,906,842,624
exbi
Ei
260 = 1,152,921,504,606,846,976
zebi
Zi
270 = 1,180,591,620,717,411,303,424
yobi
Yi
280 = 1,208,925,819,614,629,174,706,176
As of this
writing, this
proposal has
yet to gain
widespread
use…
• International Electrotechnical Commission (IEC) in
1999 introduced these to specify binary quantities.
• Names come from shortened versions of the
original SI prefixes (same pronunciation) and bi is
short for “binary”, but pronounced “bee” :-(
• Now SI prefixes only have their base-10 meaning
and never have a base-2 meaning.
The way to remember #s
• What is 234? How many bits addresses
(I.e., what’s ceil log2 = lg of) 2.5 TiB?
• Answer! 2XY means…
X=0 --X=1 kibi ~103
X=2 mebi ~106
X=3 gibi ~109
X=4 tebi ~1012
X=5 pebi ~1015
X=6 exbi ~1018
X=7 zebi ~1021
X=8 yobi ~1024
Y=0 1
Y=1 2
Y=2 4
Y=3 8
Y=4 16
Y=5 32
Y=6 64
Y=7 128
Y=8 256
Y=9 512
MEMORIZE!
Which base do we use?
• Decimal: great for humans, especially when
doing arithmetic
• Hex: if human looking at long strings of
binary numbers, its much easier to convert
to hex and look 4 bits/symbol
• Terrible for arithmetic on paper
• Binary: what computers use;
you will learn how computers do +, -, *, /
• To a computer, numbers always binary
• Regardless of how number is written:
• 32ten == 3210 == 0x20 == 1000002 == 0b100000
• Use subscripts “ten”, “hex”, “two” in book,
slides when might be confusing
Two’s Complement for N=32
0000 ... 0000
0000 ... 0000
0000 ... 0000
...
0111 ... 1111
0111 ... 1111
0111 ... 1111
1000 ... 0000
1000 ... 0000
1000 ... 0000
...
1111 ... 1111
1111 ... 1111
1111 ... 1111
0000 0000 0000two =
0000 0000 0001two =
0000 0000 0010two =
1111
1111
1111
0000
0000
0000
1111
1111
1111
0000
0000
0000
0ten
1ten
2ten
1101two =
1110two =
1111two =
0000two =
0001two =
0010two =
2,147,483,645ten
2,147,483,646ten
2,147,483,647ten
–2,147,483,648ten
–2,147,483,647ten
–2,147,483,646ten
1111 1111 1101two =
1111 1111 1110two =
1111 1111 1111two =
–3ten
–2ten
–1ten
• One zero; 1st bit called sign bit
• 1 “extra” negative:no positive 2,147,483,648ten
Two’s comp. shortcut: Sign extension
• Convert 2’s complement number rep.
using n bits to more than n bits
• Simply replicate the most significant bit
(sign bit) of smaller to fill new bits
• 2’s comp. positive number has infinite 0s
• 2’s comp. negative number has infinite 1s
• Binary representation hides leading bits;
sign extension restores some of them
• 16-bit -4ten to 32-bit:
1111 1111 1111 1100two
1111 1111 1111 1111 1111 1111 1111 1100two
Preview: Signed vs. Unsigned Variables
• Java and C declare integers int
• Use two’s complement (signed integer)
• Also, C declaration unsigned int
• Declares a unsigned integer
• Treats 32-bit number as unsigned
integer, so most significant bit is part of
the number, not a sign bit