Accuracy, random

Transcript Accuracy, random

Python Crash Course
Accuracy
3rd year Bachelors
V1.0
dd 04-09-2013
Hour 7
Accuracy
Our society depends on software. This may be an obvious statement.
Software bugs cost the U.S. economy 60 billion dollars each year. It is
stated that a third of that cost could be eliminated by improved testing.
Bugs can cause accidents, also in science, although in astronomy
people usually do not end up in a hospital.
In this lecture we focus on floating point numbers. It is important to
realize that these numbers are represented by a limited number of bits
and therefore are limited to a certain precision. First we discuss three
sources of error:
– Rounding,
– Cancellation and
– Recursion.
Rounding
• Calculations use a fixed number of significant digits. After each
operation the result usually has to be rounded. The error is less than
(or equal) half a unit represented by the last bit of the internal
representation. In loops these errors can propagate and grow.
def step( a, b, n ):
""" Function to calculate intermediate steps in an interval """
h = (b-a) / n
x = a
print "%.18f" % (x)
while (x < b):
x = x + h
print "%.18f" % (x)
step (1.0, 2.0, 3)
1.000000000000000000
1.333333333333333259
1.666666666666666519
1.999999999999999778
2.333333333333333037
Rounding
• Enhancing the accuracy
– Replacing the condition in the while statement with a better one.
def step( a, b, n ):
""" Function to calculate intermediate steps in an interval """
eps = 1.0e-8
h = (b-a) / n
x = a
print "%.18f" % (x)
while (abs(x – b) > eps):
x = x + h
print "%.18f" % (x)
step(1.0, 2.0, 3)
1.000000000000000000
1.333333333333333259
1.666666666666666519
1.999999999999999778
Rounding
• Enhancing the accuracy
– Replacing the while with a for loop using an integer.
def step( a, b, n ):
""" Function to calculate intermediate steps in an interval """
eps = 1.0e-8
h = (b-a) / n
x = a
print "%.18f" % (x)
for i in range(0,n+1):
x = x + h
print "%.18f" % (x)
step(1.0, 2.0, 3)
1.000000000000000000
1.333333333333333259
1.666666666666666519
1.999999999999999778
Cancellation
Cancellation occurs from the subtraction of two almost equal numbers.
If you subtract two numbers with approximate equal values then the
errors are also comparable and in the subtraction we get a small value
with a relatively big error (which can be demonstrated with simple error
analysis). This effect is demonstrated when you try to calculate the
roots of the equation:
This equation has two analytical expressions for finding the roots:
With a=1.0e-5, b = 1.0e3 and c = 1.0e3
Cancellation
• Consider the following code
import numpy
def sqrt_single(x):
X = x.astype('f')
return numpy.sqrt(X)
# or use: X = numpy.cast['f'](x)
v = numpy.array([1.0e-5, 1.0e3, 1.0e3], 'f')
print "SINGLE precision type: ", v.dtype
a = v[0]; b = v[1]; c = v[2]
sq = sqrt_single(b*b-4.0*a*c)
xa1 = (-b + sq)/(2.0*a)
xa2 = (-b - sq)/(2.0*a)
print "\nx1,x2: ", xa1,xa2
print "We expect: x1*x2-c/a=0, but result is:", xa1*xa2-c/a
print "x1 from c/(a*x2)=", c/(a*xa2)
xb1 = (-2.0*c) / (b + sq)
xb2 = (-2.0*c) / (b - sq)
print "\nx1,x2: ", xb1,xb2
print "We expect x1*x2-c/a=0, but result is:", xb1*xb2-c/a
print "x2 from c/(a*x1)", c/(a*xb1)
print "\nsqrt should be 999.99998, but result is: %12f "% sq9
Cancellation
•
And the result is
SINGLE precision type:
float32
x1,x2: -3.05175788959 -100000002.526
We expect: x1*x2-c/a=0, but result is: 205175796.669
x1 from c/(a*x2)= -1.0
x1,x2: -1.0 -32768000.0
We expect x1*x2-c/a=0, but result is: -67232000.0
x2 from c/(a*x1) -100000002.526
sqrt should be 999.99998, but result is:
•
999.999939
This seems strange. Two analytical equivalent formulas do not generate
the same solution for x1 and x2. This is the effect of cancellation. Note
that the value of 4*|ac| is small compared to b^2. The square root results
in a value near b and the error in the square root will dominate. The
cancellation effects occur when we subtract b and a number near b. The
correct roots for the single precision approach are:
–
–
x1 = -1
x2 = -1.0e8
Recursion
Many scientific calculations use a new entity based on a
previous one. In such iterations the errors can accumulate
and destroy your computation. In another task we will
discuss Euler's method to solve a first order differential
equation y'=f(x,y).
Values for y are calculated with yn+1 = yn + h.f(x,y).
For small h the method is stable but the error increases.
Recursion,
The problem: to find a numerical approximation to y(t) where
Typically we use a fixed step size, i.e.,hn= h = constant
Recursion
def f(t,y):
value = y+t
return (value)
def euler(t0,y0,h,tmax):
t=t0; y=y0; td=[t0]; yd=[y0];
while t<tmax:
y = y + h*f(t,y)
yd.append(y)
t=t+h
td.append(t)
return(td,yd)
(tvals,yvals) = euler(0,1,.1,1)
for uv in zip(tvals, yvals):
print uv[0],uv[1]
Real life examples
• Software causing severe problems
– Patriot missile. Time conversion of integer to float with error
therefor missing target. Tested only for < 100 hours.
– Truncation of amounts in stock market transactions and currency
conversions.
– Exploding Ariane 5 rocket in 1996 due to limited size of integer
memory storage.
– Illegal input not handled correctly: USS Yorktown three hours no
propulsion due to database overflow.
Practical example
Practical examples
Random module
>>>
>>>
>>>
9
>>>
3
>>>
6
import random
random.seed()
random.randint(0,10)
random.randint(0,10)
random.randint(0,10)
>>> random.randrange( 0, 10, 2 )
6
>>> random.sample(xrange(100),10)
[18, 81, 22, 57, 78, 77, 15, 25, 92, 93]
>>> random.random()
0.81241742358112989
>>> random.random()
0.54302903272916259
>>> random.uniform(99.9,101.9)
100.44371626283052
Random numbers
Python’s random number generator
import random
for i in range(5):
# random float: 0.0 <= number < 1.0
print random.random(),
# random float: 10 <= number < 20
print random.uniform(10, 20),
# random integer: 100 <= number <= 1000
print random.randint(100, 1000),
# random integer: even numbers in 100 <= number < 1000
print random.randrange(100, 1000, 2)
0.946842713956
0.573613195398
0.363241598013
0.602115173978
0.526767588533
19.5910069381 709 172
16.2758417025 407 120
16.8079747714 916 580
18.386796935 531 774
18.0783794596 223 344
Warning: The random number generators provided in the standard library are pseudo-random generators. While this
might be good enough for many purposes, including simulations, numerical analysis, and games, but it’s definitely
not good enough for cryptographic use.
Random numbers
numpy’s random number generator
rand(d0, d1, ..., dn)
Random values in a given shape.
randn(d0, d1, ..., dn)
Return a sample (or samples) from the “standard
normal” distribution.
randint(low[, high, size])
Return random integers from low (inclusive) to high
(exclusive).
random_integers(low[, high, size])
Return random integers between low and high, inclusive.
random_sample([size])
Return random floats in the half-open interval [0.0, 1.0).
random([size])
Return random floats in the half-open interval [0.0, 1.0).
ranf([size])
Return random floats in the half-open interval [0.0, 1.0).
sample([size])
Return random floats in the half-open interval [0.0, 1.0).
choice(a[, size, replace, p])
Generates a random sample from a given 1-D array
bytes(length)
Return random bytes.
Random numbers
numpy’s random number generator, a few examples
beta(a, b[, size])
binomial(n, p[, size])
chisquare(df[, size])
exponential([scale, size])
gamma(shape[, scale, size])
normal([loc, scale, size])
poisson([lam, size])
power(a[, size])
standard_exponential([size])
standard_gamma(shape[, size])
standard_normal([size])
standard_t(df[, size])
uniform([low, high, size])
The Beta distribution over [0, 1].
Draw samples from a binomial distribution.
Draw samples from a chi-square distribution.
Exponential distribution.
Draw samples from a Gamma distribution.
Draw random samples from a normal (Gaussian) distribution.
Draw samples from a Poisson distribution.
Draws samples in [0, 1] from a power distribution with positive exponent a - 1.
Draw samples from the standard exponential distribution.
Draw samples from a Standard Gamma distribution.
Returns samples from a Standard Normal distribution (mean=0, stdev=1).
Standard Student’s t distribution with df degrees of freedom.
Draw samples from a uniform distribution.
The numpy.random library contains a few extra probability distributions commonly used in scientific
research, as well as a couple of convenience functions for generating arrays of random data.
Random
• Numpy random number generator
>>> import numpy.random as nr
>>> nr.rand(10)
array([ 0.21440458, 0.07646705,
0.63563963, 0.73611493,
0.24493176,
0.28994467,
>>> import numpy.random as nr
>>> nr.rand(3,2)
array([[ 0.65159297, 0.78872335],
[ 0.09385959, 0.02834748],
[ 0.8357651 , 0.43276707]])
• Pseudo randomness
>>> nr.seed([10])
>>> nr.rand(3)
array([ 0.57140259,
>>> nr.seed([10])
>>> nr.rand(3)
array([ 0.57140259,
0.42888905,
0.5780913 ])
0.42888905,
0.5780913 ])
0.33086711,
0.28596076,
0.35931123,
0.10779166])
Random generators
The normal distribution, also called Gaussian distribution, is an
extremely important probability distribution in many fields. It is a family
of distributions of the same general form, differing in their location and
scale parameters: the mean ("average") and standard deviation
("variability"), respectively.
>>> nr.normal(10.0,2.0,10)
array([ 9.10795834,
7.17735118,
9.99187537,
11.04096991, 12.09645086, 12.31425147,
10.4381389 ,
7.71719838])
8.89229777,
10.21820379,
Random generators
In probability theory and statistics, the Poisson distribution is a discrete
probability distribution. This random number counts the number of
successes in n independent experiments (where the probability of
success in each experiment is p) in the limit as n -> infinity and p -> 0
gets very small such that lambda = np >= 0 is a constant.
>>> samples = nr.poisson(10,100)
>>> samples
array([ 9, 17, 11, 12, 8, 5, 13,
5, 9, 10, 8, 12, 16, 4,
10, 8, 12, 16, 8, 8, 12,
13, 12, 8, 11, 9, 14, 14,
16, 11, 11, 5, 6, 13, 13,
10, 8, 10, 10, 7, 11, 7,
8,
8,
15,
11,
11,
13,
10,
8,
16,
11,
7,
11,
13, 5, 8, 11, 10, 10, 8, 7,
15, 4, 9, 9, 8, 12, 17, 9,
12, 15, 10, 17, 9, 8, 10, 10,
9, 14, 4, 4, 12, 18, 8, 11,
8, 11, 11, 14, 9, 6, 11, 13,
16, 9, 9, 13, 11, 7])
Introduction to language
End

Accuracy, random

Transcript Accuracy, random

Directory