CIS541_03_MachinePrecision

Transcript CIS541_03_MachinePrecision

CSE 541 – Numerical Methods
Machine Representation of Numbers
Blazing Fast
• Computers are extremely fast at performing
simple arithmetic.
– Count from one to 1,000,000,000,000 by one
- or – Execute on a Pentium-4 2GHz machine:
int i = 1;
While (1) {
printf( “%d/n”, i++ );
}
– Which reaches one trillion faster?
April 6, 2016
OSU/CIS 541
2
Blazing Fast
• Assume a human can count on average one
number per second (bad assumption).
• Implies 1 trillion seconds
• Or about 32,000 years!!!!
April 6, 2016
OSU/CIS 541
3
Blazing Fast
• Assume a human can count on average one
number per second (bad assumption).
• Implies 1 trillion seconds
• Or about 32,000 years!!!!
• Assume that a 2GHz machine can
increment by one each clock cycle.
• Implies 500 seconds!!!!
April 6, 2016
OSU/CIS 541
4
Machine Precision
• What is our current National Debt?
– (see http://www.brillig.com/debt_clock/)
• What is the annual revenue of a Fortune 100 company?
• How many molecules are in one mole of gas (under ideal
assumptions)
• How many pixels are displayed in a two hour HDTV
movie?
• What is the current balance of Bill Gate’s checking
account?
• How many ounces of Pepsi are consumed per year?
• How many bytes of storage does the Graphics Group own?
April 6, 2016
OSU/CIS 541
5
Machine Precision
• The point is that we live in a world with a
wide range of scales.
• Computers have limited capabilities.
• Let me repeat that: Computers have limited
capabilities.
April 6, 2016
OSU/CIS 541
6
Machine Precision
Computers have
limited
capabilities!!!
April 6, 2016
OSU/CIS 541
7
Accuracy
• Do we really need that much accuracy?
– American Express thinks so
– DoD thinks so (see web article on Patriot
missles)
April 6, 2016
OSU/CIS 541
8
Accuracy and Range
• 16-bit integers (common about 5 years ago)
216 = 65,536
• 32-bit integers
32
2 = 4,294,967,296
• Only 31-bits if you need signed integers.
• 64-bit integers
264  1.84 1019
• Still no Avagadro’s number: 6.022*10
April 6, 2016
OSU/CIS 541
23
9
Accuracy and Range
• 16-bit integers (common about 5 years ago)
216 = 65,536
• 32-bit integers
32
<< 1,000,000,000,000
• Only 31-bits if you need signed integers.
• 64-bit integers
2 = 4,294,967,296
264 = 1.84 1019
• Still no Avagadro’s number 6.022*10
April 6, 2016
OSU/CIS 541
23
10
Blazing Fast
• A 2GHZ 32-bit machine may be blazing
fast, but it will never reach one trillion
either, if care isn’t taken!!!
• Forever >> 32,000 years
– Humans win!!!
April 6, 2016
OSU/CIS 541
11
Accuracy and Standards
• What is int i and what is its range?
• ANSI C standard
– int – machine dependent
– short – machine dependent
Not much of a standard, huh?
April 6, 2016
OSU/CIS 541
12
Standard Range
• Corba
– long – 32-bit signed
– long long – 64 bit signed
• Java - standardized
April 6, 2016
OSU/CIS 541
13
Advice
• Never use int, long in C / C++
• Use the preprocessor to define your own types
#define CISint int \\ I want this to be 32-bit
#define CISshort short \\ 16-bit signed integer
• Hence, if you port to another architecture and need
to change it, only change it in one place.
• Likewise, use each packages types: glFloat, …
April 6, 2016
OSU/CIS 541
14
Numbers
•
•
•
•
Whole numbers: 0,1,2,…
Signed integers: …,-1,0,1,2,…
Rational numbers: a/b, 1/3, etc.
Irrational numbers: sqrt(2), e, pi, etc.
• Is there a difference between 0.1 and 1/3?
April 6, 2016
OSU/CIS 541
15
Representing Real Numbers
• In Base-10, we can
express any number as
a sum of weighted
power’s of ten.
• Hence:
3.141526
 3  100  1  101  4  102  1  103 
256  2  102  5  101  6  100
0.75  7  101  5  102
0.1  1  101
1 
  3  10i
3 i 1
April 6, 2016
OSU/CIS 541
16
Representing Real Numbers
• Using Base-2 or binary, what are our series?
• What coefficients can we have?
256  1  28
0.75  1  21  1  22
0.1  1  24  1  25  1  28 
1
?
3
April 6, 2016
OSU/CIS 541
17
Limiting Precision
• Truncate
– Truncate to what?
• Truncate to 8 significant digits.
– Okay:
3.1415926535897932384626433832795

3.1415926
April 6, 2016
OSU/CIS 541
18
Limiting Precision
• Round to nearest value:
3.1415926535897932384626433832795

3.1415927
• Rules:
– If value after last retained digit is > 5, then round-up.
– If value after last retained digit is < 5, then truncate.
– If value after last digit is =5, then:
• If all remaining digits are not zero, round-up.
• Else, randomly round-up or truncate.
April 6, 2016
OSU/CIS 541
19
Error
• We need to truncate rational and irrational
numbers to make them more manageable.
• Absolute Error: x  x  
– Typically want the absolute value:   x  x
• Relative Error:  
xx
x
• Can I have a very small Absolute error, but
a large relative error?
April 6, 2016
OSU/CIS 541
20
Condition of a Problem
• For a problem with input (data) x and output
y, y=F(x). The problem is said to be wellconditioned if small changes in x, lead to
small changes in y.
• Otherwise, we say the problem is illconditioned.
April 6, 2016
OSU/CIS 541
21
Stability of an Algorithm
• Stability indicates the sensitivity of an
algorithm for solving a problem.
• An algorithm is said to be stable if small
changes in the input x lead to small changes
in the output y.
• Otherwise, the algorithm is said to be
unstable.
April 6, 2016
OSU/CIS 541
22
Condition and Stability
• Condition => data
• Stability => algorithm
• Ill-conditioned – very hard to get a good result
with even the best algorithm.
• Stable – given good data (not ill-conditioned), the
algorithm will not yield drastically different
results if round-off or small noise is added to the
input data.
April 6, 2016
OSU/CIS 541
23
Precision
x = .256834*105
• The digit 2 is the most significant digit, while 4 is
the least significant digit.
• Precision should not be confused with accuracy.
Accuracy is how close your solution is to the
actual solution. Missed the bulls-eye by 2 inches.
• Precision is how good your estimate of the
accuracy is, 2.0”±0.003
April 6, 2016
OSU/CIS 541
24
Precision
• If you miss the bulls-eye, but repeatedly hit the
same location, then it is precise, but inaccurate.
• For instance, using Taylor’s series for the
approximation of sin(x), I can obtain a precise
number for values of x larger than 2, but the
solution would be inaccurate.
• In a computer, precision is usually how many
digits of accuracy your machine representation has
(number of bits in the mantissa).
April 6, 2016
OSU/CIS 541
25
Loss of Significance
• Consider the error for x-y using 5 decimal
digits of precision:
x = .3721448693
y =. 3720214371
x’ = .37214
y’ =. 37202
x’-y’ = .00012
x-y = .0001234322
April 6, 2016
OSU/CIS 541
26
Loss of Significance
• The relative error is:
( x  y )  ( x ' y ')
x y
.0000034322

 3 102
.0001234322
• However, the relative error of x’ and y’ is
only 1.3*10-5.
• We lost 3 significant digits.
April 6, 2016
OSU/CIS 541
27
Loss of Precision Theorem
• Examine the precision loss for x-y.
• Let x and y be normalized floating-point
y
2

1

2
numbers, with x > y > 0. If
x
for some positive integers p and q, then at
most p and at least q significant binary bits
are lost in the subtraction.
• Rule: The closer two numbers, the greater
the loss of significance.
p
April 6, 2016
OSU/CIS 541
q
28
Avoiding loss of Precision
• Use double precision or higher
• Modify the calculations to remove
subtraction of numbers close together.
• Consider: f ( x)  x  1 1 as x approaches 0.
• Reorder to remove the subtraction:
2
f ( x) 
April 6, 2016

2

x
1 1 
2
x 1 1 

2
 x 1 1 



OSU/CIS 541
x2
x 1 1
2
29
How do they do that?
• Microsoft Windows ships with a simple
calculator for its 32-bit operating system.
96
• What would you expect, if you typed in 2 ?
– or

296  0.3345
296

1 2
96
96
– 2 = 79,228,162,514,264,337,593,543,950,336
April 6, 2016
OSU/CIS 541
30
Performance .vs. Accuracy
• With a 2.2GHz computer, you can do a lot of
number crunching (two ops per clock cycle).
• Real-time applications
– Audio must be processed at 4KHz
– One trillion time steps of crash simulation
• Decide whether accuracy or performance is more important
• User interactivity (e.g., calculator):
– No need to calculate things faster than the user’s fat
fingers can type them.
• Use slower, but more accurate algorithms.
April 6, 2016
OSU/CIS 541
31
Integers
• Exact to within machine’s range:
– ±2N-1-1
• See the /usr/include/limits.h file
April 6, 2016
OSU/CIS 541
32
Fixed-Point Numbers
• Used to store floating point numbers with a fixed
range.
• Say I need values for the surface temperatures of
the earth. Restrict the values to the range (999,999).
• I could use signed integers, but would only have
three significant digits.
• I can also represent any temperature with fixed
point numbers of the form xxx.xxxxx to allow for
eight significant digits.
April 6, 2016
OSU/CIS 541
33
Implementing Fixed-point
Arithmetic
• Fixed point is easy to implement in most
cases, You simply ignore the decimal point
for all operations, use integers for your
calculations and then put the decimal point
back in when its needed (e.g., for I/O).
31.45330 => 31045330 (need to remember to divide by 100,000 later).
April 6, 2016
OSU/CIS 541
34
Floating Point Numbers
• Split into three parts:
– Sign
– Mantissa
– Exponent
• +6.024*1023
April 6, 2016
OSU/CIS 541
35
Floating-Point Numbers
• CPU architects have choices on how many
bits to allocate to each component.
• Sign() – 1-bit
• Exponent – n-bits (Determines the range of
numbers)
• Mantissa – whatever is left (Determines the
precision of the numbers)
April 6, 2016
OSU/CIS 541
36
IEEE Floating Point
• Single-precision:
• Sign() – 1-bit (s)
• Exponent – 8-bits (e)
• Mantissa – 23-bits (m)
 1
s
2
e 127
 (1.m) 2
• Notes:
• +0 and –0 are represented differently
• + and - (NaN’s) have special representations
• The mantissa is in normalized form, meaning the first digit is a
one. As such, it is not stored and we have 24-bits of precision.
April 6, 2016
OSU/CIS 541
37
Limits for Floats
• Look at the C include file float.h
#define
#define
*/
#define
#define
#define
#define
#define
#define
#define
#define
#define
DBL_DIG
DBL_EPSILON
15
/* # of decimal digits of precision */
2.2204460492503131e-016 /* smallest such that 1.0+DBL_EPSILON != 1.0
DBL_MANT_DIG
DBL_MAX
DBL_MAX_10_EXP
DBL_MAX_EXP
DBL_MIN
DBL_MIN_10_EXP
DBL_MIN_EXP
_DBL_RADIX
_DBL_ROUNDS
53
1.7976931348623158e+308
308
1024
2.2250738585072014e-308
(-307)
(-1021)
2
1
/*
/*
/*
/*
/*
/*
/*
/*
/*
#define
#define
*/
#define
#define
#define
#define
#define
#define
#define
#define
FLT_DIG
FLT_EPSILON
6
1.192092896e-07F
/* # of decimal digits of precision */
/* smallest such that 1.0+FLT_EPSILON != 1.0
FLT_GUARD
FLT_MANT_DIG
FLT_MAX
FLT_MAX_10_EXP
FLT_MAX_EXP
FLT_MIN
FLT_MIN_10_EXP
FLT_MIN_EXP
0
24
3.402823466e+38F
38
128
1.175494351e-38F
(-37)
(-125)
/*
/*
/*
/*
/*
/*
/*
April 6, 2016
OSU/CIS 541
# of bits in mantissa */
max value */
max decimal exponent */
max binary exponent */
min positive value */
min decimal exponent */
min binary exponent */
exponent radix */
addition rounding: near */
# of bits in mantissa */
max value */
max decimal exponent */
max binary exponent */
min positive value */
min decimal exponent */
min binary exponent */
38
Divide by zero
• What would happen when you ran this
program?
float x = 0.230134789;
float y = 0.230134788;
float inverse;
inverse = 1.0 / (x-y);
return 0;
April 6, 2016
OSU/CIS 541
39
Resolve and Real Numbers
• One of the beautiful benefits of C++ is the ability
to redefine or overload functions in a class
hierarchy.
• With the CSE department’s Resolve language, you
used the type, Real.
• This class (as well as Resolve) is used to ensure
that you do not divide by zero.
• Therefore, it overloads the operator”/” method,
performs a quick (???) check and then performs
the divide (if safe).
April 6, 2016
OSU/CIS 541
40
Designing An Extended
Precision Class
• C++ (and other object-oriented languages like
Java), allow most of the basic language to be
overloaded.
• You can use this feature to define new arithmetic
for:
– limiting numbers: 0.0 -> 1.0, 32-63, …
– embed debugging information: “print if value=pi
– Keep track of the error associated with the operation.
This is called Interval Analysis.
• Let’s look at designing an extended precision
class.
April 6, 2016
OSU/CIS 541
41
Extended Precision with C++
• Let’s say we want over 1000-bits of precision for
our application.
• Furthermore, let’s assume our numbers encode
temperatures and only range from
-273° to 65,000 ° C.
• Key questions:
– What should our basic number format be?
– What operations are we supporting?
– How do we implement those operations?
April 6, 2016
OSU/CIS 541
42
Decimal Type
• C# has a new decimal type:
– 128-bit format
– Base-10 arithmetic!
– 28 significant figures
• That should handle Bill Gates check-book!!!
– Fixed-point, floating point or integer?
– Bit layout?
April 6, 2016
OSU/CIS 541
43
Other C# features
• Keywords checked or unchecked.
– Controls whether to throw an OverflowException or
not.
– For integer arithmetic only.
– int wontThrow = unchecked( System.Int32.MaxValue + 1);
– int willThrow = checked( System.Int32.MaxValue + 1);
– Can also be used in blocks or as a compiler option.
• Many exceptions can be thrown and caught.
• Implicit type casting only happens when widening.
April 6, 2016
OSU/CIS 541
44

CIS541_03_MachinePrecision

Transcript CIS541_03_MachinePrecision

Directory