Bits, Bytes, Ints
Download
Report
Transcript Bits, Bytes, Ints
CS 105
“Tour of the Black Holes of Computing!”
Computer Systems
Introduction
Rett Bull, Geoff Kuenning
Spring 2013
Topics:
–1–
Class Intro
Data Representation
CS 105
Course Theme
Abstraction is good, but don’t forget reality!
Many CS Courses emphasize abstraction
Abstract data types
Asymptotic analysis
These abstractions have limits
Especially in the presence of bugs
Need to understand underlying implementations
Useful outcomes
Become more effective programmers
Able to find and eliminate bugs efficiently
Able to tune program performance
Prepare for later “systems” classes in CS
Compilers, Operating Systems, File Systems, Computer
Architecture, Robotics, etc.
–2–
CS 105
Textbooks
Randal E. Bryant and David R. O’Hallaron,
“Computer Systems: A Programmer’s Perspective”, 2nd
Edition, Prentice Hall, 2011.
Brian Kernighan and Dennis Ritchie,
“The C Programming Language, Second Edition”, Prentice
Hall, 1988
Larry Miller and Alex Quilici
–3–
The Joy of C, Wiley, 1997
CS 105
Syllabus
–4–
Syllabus on Web: http://www.cs.hmc.edu/~geoff/cs105
Calendar defines due dates
Labs: cs105submit for some, others have specific directions
CS 105
Notes:
Work groups
You must work in pairs on all labs
Honor-code violation to work without your partner!
Corollary: showing up late doesn’t harm only you
Handins
Check calendar for due dates
Electronic submissions only
Grading Characteristics
Lab scores tend to be high
Serious handicap if you don’t hand a lab in
Tests & quizzes typically have a wider range of scores
I.e., they’re primary determinant of your grade
» …but not the ONLY one
–5–
Do your share of lab work and reading, or bomb tests
Do practice problems in book
CS 105
Facilities
Assignments will use Intel computer systems
Not all machines are created alike
Even Intel Macs aren’t necessarily compatible
Most modern machines (including Knuth) are 64-bit
Wilkes: x86/Linux specifically set up for this class
Log in on a Mac, then ssh to Wilkes
If you want fancy programs, start X11 first
Directories are cross-mounted, so you can edit on Knuth
or your Mac, and Wilkes will see your files
…or ssh into Wilkes from your dorm
All programs must run on Wilkes: that’s where we
grade
–6–
CS 105
CS 105
“Tour of the Black Holes of Computing”
Integers
Topics
Numeric Encodings
Unsigned & two’s complement
Programming Implications
C promotion rules
Basic operations
Addition, negation, multiplication
Programming Implications
Consequences of overflow
Using shifts to perform power-of-2 multiply/divide
CS 105
C Puzzles
Taken from old exams
Assume machine with 32-bit word size, two’s complement
integers
For each of the following C expressions, either:
Argue that it is true for all argument values, or
Give example where it is not true
• x < 0
((x*2) < 0)
• ux >= 0
Initialization
• x & 7 == 7
(x<<30) < 0
int x = foo();
• ux > -1
int y = bar();
• x > y
unsigned ux = x;
• x * x >= 0
unsigned uy = y;
• x > 0 && y > 0 x + y > 0
–8–
-x < -y
• x >= 0
-x <= 0
• x <= 0
-x >= 0
CS 105
Encoding Integers
Unsigned
B2U(X )
w1
xi 2
Two’s Complement
i
B2T (X ) xw1 2
i0
w1
w2
xi 2 i
i0
short int x = 15213;
short int y = -15213;
Sign
Bit
C short 2 bytes long
x
y
Decimal
15213
-15213
Hex
3B 6D
C4 93
Binary
00111011 01101101
11000100 10010011
Sign Bit
For 2’s complement, most-significant bit indicates sign
0 for nonnegative
1 for negative
–9–
CS 105
Encoding Integers (Cont.)
x =
y =
– 10 –
Weight
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
-32768
Sum
15213: 00111011 01101101
-15213: 11000100 10010011
15213
1
1
0
0
1
4
1
8
0
0
1
32
1
64
0
0
1
256
1
512
0
0
1
2048
1
4096
1
8192
0
0
0
0
15213
-15213
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1 -32768
-15213
CS 105
Numeric Ranges
Unsigned Values
UMin
= 0
Two’s-Complement Values
TMin
=
–2w–1
000…0
UMax
100…0
=
2w – 1
111…1
TMax
=
2w–1 – 1
011…1
Other Values
Minus 1
111…1
Values for W = 16
UMax
TMax
TMin
-1
0
– 11 –
Decimal
65535
32767
-32768
-1
0
Hex
FF FF
7F FF
80 00
FF FF
00 00
Binary
11111111 11111111
01111111 11111111
10000000 00000000
11111111 11111111
00000000 00000000
CS 105
Values for Different Word Sizes
W
UMax
TMax
TMin
8
255
127
-128
16
65,535
32,767
-32,768
32
4,294,967,295
2,147,483,647
-2,147,483,648
Observations
|TMin | =
C Programming
TMax + 1
Asymmetric range
UMax
=
64
18,446,744,073,709,551,615
9,223,372,036,854,775,807
-9,223,372,036,854,775,808
2 * TMax + 1
#include <limits.h>
K&R Appendix B11
Declares constants, e.g.,
ULONG_MAX
LONG_MAX
LONG_MIN
– 12 –
Values platform-specific
CS 105
An Important Detail
No Self-Identifying Data
Looking at a bunch of bits doesn’t tell you what they mean
Could be signed, unsigned integer
Could be floating-point number
Could be part of a string
Only the Program (Instructions) Knows for Sure!
– 13 –
CS 105
Unsigned & Signed
Numeric Values
X
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
– 14 –
B2U(X)
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
B2T(X)
0
1
2
3
4
5
6
7
–8
–7
–6
–5
–4
–3
–2
–1
Equivalence
Same encodings for
nonnegative values
Uniqueness
Every bit pattern represents
unique integer value
Each representable integer
has unique bit encoding
CS 105
Casting Signed to Unsigned
C Allows Conversions from Signed to Unsigned
short int
x = 15213;
unsigned short int ux = (unsigned short) x;
short int
y = -15213;
unsigned short int uy = (unsigned short) y;
Resulting Value
No change in bit representation
Nonnegative values unchanged
ux = 15213
Negative values change into (large) positive values
uy = 50323
– 15 –
CS 105
Relation Between
Signed & Unsigned
Weight
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
Sum
– 16 –
uy
=
-15213
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1 -32768
-15213
y + 2 * 32768
=
50323
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1
32768
50323
y + 65536
CS 105
Signed vs. Unsigned in C
Constants
By default are considered to be signed integers
Unsigned if have “U” as suffix
0U, 4294967259u
Casting
Explicit casting between signed & unsigned same as U2T and
T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
Implicit casting also occurs via assignments and procedure calls
tx = ux;
uy = ty;
– 17 –
CS 105
Casting Surprises
Expression Evaluation
If
you mix unsigned and signed in single expression, signed
values are implicitly cast to unsigned
Including comparison operations <, >, ==, <=, >=
Examples
for W = 32
Constant1
– 18 –
Constant2
0
-1
-1
0u
0
0u
2147483647
2147483647u
-1
(unsigned) -1
2147483647
2147483647
-2147483648
-2147483648
-2
-2
2147483648u
(int) 2147483648u
Relation Evaluation
CS 105
Casting Surprises
Expression Evaluation
If
you mix unsigned and signed in single expression, signed
values are implicitly cast to unsigned
Including comparison operations <, >, ==, <=, >=
Examples
for W = 32
Constant1
– 19 –
0
-1
-1
2147483647
2147483647U
2147483647u
-1
(unsigned) -1
2147483647
2147483647
2147483647
2147483647
Constant2
Relation Evaluation
0U
0u
0
0U
0u
-2147483648
-2147483648
-2
-2
2147483648U
2147483648u
(int) 2147483648U
2147483648u
==
<
>
>
<
>
>
<
>
unsigned
signed
unsigned
signed
unsigned
signed
unsigned
unsigned
CS 105
signed
Explanation of Casting Surprises
2’s Comp. Unsigned
Ordering Inversion
Negative Big Positive
TMax
2’s Comp.
Range
– 20 –
0
–1
–2
TMin
UMax
UMax – 1
TMax + 1
TMax
Unsigned
Range
0
CS 105
Sign Extension
Task:
Given w-bit signed integer x
Convert it to w+k-bit integer with same value
Rule:
Make k copies of sign bit:
X = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0
w
k copies of MSB
X
• • •
• • •
X
– 21 –
• • •
k
• • •
w
CS 105
Sign Extension Example
short int x = 15213;
int
ix = (int) x;
short int y = -15213;
int
iy = (int) y;
Decimal
Hex
15213
3B
15213 00 00 3B
-15213
C4
-15213 FF FF C4
x
ix
y
iy
– 22 –
6D
6D
93
93
Binary
00111011
00000000 00000000 00111011
11000100
11111111 11111111 11000100
01101101
01101101
10010011
10010011
Converting from smaller to larger integer data type
C automatically performs sign extension
CS 105
Why Should I Use Unsigned?
Be careful using
C compilers on some machines generate less efficient code
unsigned i;
for (i = 1; i < cnt; i++)
a[i] += a[i-1];
Easy to make mistakes
for (i = cnt-2; i >= 0; i--)
a[i] += a[i+1];
Do use when performing modular arithmetic
Multiprecision arithmetic
Other esoteric stuff
Do use when need extra bit’s worth of range
– 23 –
Working right up to limit of word size
E.g., array sizes
CS 105
Negating with Complement &
Increment
Claim: Following holds for 2’s complement
~x + 1 == -x
Complement
Observation: ~x + x == 1111…112 == -1
x
1001 1101
+ ~x
0110 0010
-1
1111 1111
Increment
~x + x + (-x + 1)
~x + 1
== -1 + (-x + 1)
== -x
Warning: Be cautious treating int’s as integers
– 24 –
OK here (associativity holds)
CS 105
Comp. & Incr. Examples
x = 15213
Decimal
x
15213
~x
-15214
~x+1 -15213
y
-15213
Hex
3B 6D
C4 92
C4 93
C4 93
Binary
00111011 01101101
11000100 10010010
11000100 10010011
11000100 10010011
0
0
~0
~0+1
– 25 –
Decimal
0
-1
0
Hex
00 00
FF FF
00 00
Binary
00000000 00000000
11111111 11111111
00000000 00000000
CS 105
Unsigned Addition
u
• • •
v
• • •
u+v
• • •
UAddw(u , v)
• • •
Operands: w bits
+
True Sum: w+1 bits
Discard Carry: w bits
Standard Addition Function
Ignores carry output
Implements Modular Arithmetic
s =
UAddw(u , v)
= u + v mod 2w
u v
UAddw (u,v)
w
u v 2
– 26 –
u v 2w
u v 2w
CS 105
Two’s-Complement Addition
u
• • •
v
• • •
u+v
• • •
TAddw(u , v)
• • •
Operands: w bits
+
True Sum: w+1 bits
Discard Carry: w bits
TAdd and UAdd have identical bit-level behavior
– 27 –
Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v
Will give s == t
CS 105
Detecting 2’s-Comp. Overflow
Task
2w–1
Given s = TAddw(u , v)
Determine if s = Addw(u , v)
Example
int s, u, v;
s = u + v;
PosOver
2w –1
0
Claim
Overflow iff either:
NegOver
u, v < 0, s 0 (NegOver)
u, v 0, s < 0 (PosOver)
– 28 –
CS 105
Multiplication
Computing exact product of w-bit numbers x, y
Either signed or unsigned
Ranges
Unsigned: 0 ≤ x * y ≤ (2w – 1) 2 = 22w – 2w+1 + 1
Up to 2w bits
Two’s complement min: x * y ≥ (–2w–1)*(2w–1–1) = –22w–2 + 2w–1
Up to 2w–1 bits (including 1 for sign)
Two’s complement max: x * y ≤ (–2w–1) 2 = 22w–2
Up to 2w bits, but only for (TMinw)2
Maintaining exact results
– 29 –
Would need to keep expanding word size with each product
computed
Done in software by “arbitrary-precision” arithmetic
packages
CS 105
Power-of-2 Multiply by Shifting
Operation
u << k gives u * 2k
Both signed and unsigned
Operands: w bits
True Product: w+k bits
Discard k bits: w bits
Examples
u · 2k
2k
0 ••• 0 1 0 ••• 0 0
• • •
UMultw(u , 2k)
•••
0 ••• 0 0
0 ••• 0 0
TMultw(u , 2k)
u << 3
u << 5 - u << 3
Most machines shift and add much faster than multiply
– 30 –
*
u
k
• • •
==
==
u * 8
u * 24
Compiler generates this code automatically
CS 105
Unsigned Power-of-2 Divide
by Shifting
Quotient of unsigned by power of 2
u >> k gives u / 2k
Uses logical shift
k
u
Operands:
Division:
Result:
x
x >> 1
x >> 4
x >> 8
– 31 –
/
2k
•••
•••
0 ••• 0 1 0 ••• 0 0
u / 2k
0 •••
•••
u / 2k
0 •••
•••
Division
15213
7606.5
950.8125
59.4257813
Binary Point
Computed
15213
7606
950
59
Hex
3B 6D
1D B6
03 B6
00 3B
.
•••
Binary
00111011 01101101
00011101 10110110
00000011 10110110
00000000 00111011
CS 105