Transcript PowerPoint

Announcements
• Assn 4 is posted. Note that due date is the 12th
(Monday) at 7pm. (Last assignment!)
• Final Exam on June 15, 9am. Room GOO247.
• Assn 3 solution is posted.
Spring 2006
CISC101 - Prof. McLeod
1
Last Time
• Variable Scope and Lifetime
• Sorting
Spring 2006
CISC101 - Prof. McLeod
2
Today
• Searching
– Sequential search
– Binary search
• Source and effects of round-off error.
• Start working on assignment?
• Exam prep (mostly Thursday).
Spring 2006
CISC101 - Prof. McLeod
3
Searching
• Sequential search pseudocode:
• Loop through array starting at the first element until the
value of target matches one of the array elements.
• If a match is not found, return –1.
Spring 2006
CISC101 - Prof. McLeod
4
Sequential Search
• As code:
public int seqSearch(int[] A, int target) {
for (int i = 0; i < A.length; i++)
if (A[i] == target) return i;
return –1;
}
Spring 2006
CISC101 - Prof. McLeod
5
Sequential Search, Cont.
• Works on any kind of dataset.
• Finds the first matching element – does not say
anything about how many matches there are!
• Best case – one comparison.
• Worst case (target not found) – n comparisons (n
is the same as A.length.
• On average - n/2 comparisons.
Spring 2006
CISC101 - Prof. McLeod
6
Searching in an Ordered Dataset
• How do you find a name in a telephone book?
Spring 2006
CISC101 - Prof. McLeod
7
Binary Search
• Binary search pseudocode. Only works on
ordered sets:
a) Locate midpoint of array to search.
b) Determine if target is in lower half or upper half of
array.
•
•
•
Spring 2006
If in lower half, make this half the array to search.
If in upper half, make this half the array to search.
Loop back to step a), unless the size of the array to
search is one, and this element does not match, in
which case return –1.
CISC101 - Prof. McLeod
8
Binary Search – Cont.
public int binSearch (int[] A, int key) {
int lo = 0;
int hi = A.length - 1;
int mid = (lo + hi) / 2;
while (lo <= hi) {
if (key < A[mid])
hi = mid else if (A[mid] <
lo = mid +
else return mid;
mid = (lo + hi) /
}
return -1;
} 2006
Spring
1;
key)
1;
2;
CISC101 - Prof. McLeod
9
Binary Search – Cont.
• For the best case, the element matches right at
the middle of the array, and the loop only
executes once.
• For the worst case, key will not be found and the
maximum number of loops will occur.
• Note that the loop will execute until there is only
one element left that does not match.
• Each time through the loop the number of
elements left is halved, giving the progression
below for the number of elements:
Spring 2006
CISC101 - Prof. McLeod
10
Binary Search – Cont.
• Number of elements to be searched progression:
n n n
n
n, , 2 , 3 , ..., m
2 2 2
2
• The last comparison is for n/2m, when the number
of elements is one (worst case).
• So, n/2m = 1, or n = 2m.
• Or, m = log2(n).
• So, the algorithm loops log(n) times for the worst
case.
Spring 2006
CISC101 - Prof. McLeod
11
Binary Search - Cont.
• Binary search with log(n) iterations for the worst
case is much better than n iterations for
sequential search!
• Major reason to sort datasets!
Spring 2006
CISC101 - Prof. McLeod
12
From Before:
• A float occupies 4 bytes
• A double occupies 8 bytes of memory
• (A byte is 8 bits, where a bit is zero or one.)
Spring 2006
CISC101 - Prof. McLeod
13
Storage of Real Numbers
• So, a real number can only occupy a finite
amount of storage in memory.
• This effect is very important for two kinds of
numbers:
– Numbers like 0.1 that can be written exactly in base
10, but cannot be stored exactly in base 2.
– Real numbers (like  or e) that have an infinite number
of digits in their “real” representation can only be
stored in a finite number of digits in memory.
• And, we will see that it has an effect on the
accuracy of mathematical operations.
Spring 2006
CISC101 - Prof. McLeod
14
Storage of “Real” or “Floating-Point” Numbers
- Cont.
• Consider 0.1:
(0.1)10 = (0.0 0011 0011 0011 0011 0011…)2
• What happens to the part of a real number that
cannot be stored?
• It is lost - the number is truncated.
• The “lost part” is called the “roundoff error”.
Spring 2006
CISC101 - Prof. McLeod
15
Storage of “Real” or “Floating-Point” Numbers
- Cont.
• Back to the storage of 0.1:
• Compute: 10000
 0.1
i 1
• And, compare to 1000.
float sum = 0;
for (int i = 0; i < 10000; i++)
sum += 0.1F;
System.out.println(sum);
Spring 2006
CISC101 - Prof. McLeod
16
Storage of “Real” or “Floating-Point” Numbers
- Cont.
• Prints a value of 999.9029 to the screen.
• If sum is declared to be a double then the
value: 1000.0000000001588 is printed to the
screen.
• So, the individual roundoff errors have piled up to
contribute to a cumulative error in this
calculation.
• As expected, the roundoff error is smaller for a
double than for a float.
• On the following charts, the x-axis is the counter,
i, from the code shown above.
Spring 2006
CISC101 - Prof. McLeod
17
float Accumulation
Progress of float Roundoff Error
4.00E-02
2.00E-02
0.00E+00
-2.00E-02
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
-4.00E-02
-6.00E-02
-8.00E-02
-1.00E-01
-1.20E-01
Spring 2006
CISC101 - Prof. McLeod
18
double Accumulation
Progress of double Roundoff Error
1.80E-10
1.60E-10
1.40E-10
1.20E-10
1.00E-10
8.00E-11
6.00E-11
4.00E-11
2.00E-11
0.00E+00
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
-2.00E-11
Spring 2006
CISC101 - Prof. McLeod
19
Roundoff Error
• This error is referred to in two different ways:
• The absolute error:
absolute error = |x - xapprox|
• The relative error:
relative error = (absolute error)  |x|
Spring 2006
CISC101 - Prof. McLeod
20
Roundoff Error - Cont.
• So for the calculation of 1000 as shown above,
the errors are:
Type
Absolute
Relative
float
0.0971
9.71E-5
double
1.588E-10
1.588E-13
• The relative error on the storage of 0.1 is the
absolute error divided by 1000.
Spring 2006
CISC101 - Prof. McLeod
21
The Effects of Roundoff Error
• Roundoff error can have an effect on any
arithmetic operation carried out involving real
numbers.
• For example, consider subtracting two numbers
that are very close together:
• Use the function
f ( x )  1  cos( x )
for example. As x approaches zero, cos(x)
approaches 1.
Spring 2006
CISC101 - Prof. McLeod
22
The Effects of Roundoff Error
• Using double variables, and a value of x of
1.0E-12, f(x) evaluates to 0.0.
• But, it can be shown that the function f(x) can also
be represented by f’(x):
2
sin ( x)
f ' ( x)  f ( x) 
1  cos( x)
• For x = 1.0E-12, f’(x) evaluates to 5.0E-25.
• The f’(x) function is less susceptible to roundoff
error.
Spring 2006
CISC101 - Prof. McLeod
23
The Effects of Roundoff Error - Cont.
• Another example. Consider the smallest root of
the polynomial: ax2+bx+c=0:
 b  b  4ac
x1 
2a
2
• What happens when ac is small, compared to b?
• It is known that for the two roots, x1 and x2:
c
x1 x2 
a
Spring 2006
CISC101 - Prof. McLeod
24
The Effects of Roundoff Error - Cont.
• Which leads to an equation for the root which is
not as susceptible to roundoff error:
x1 
 2c
b  b  4ac
2
• This equation approaches –c/b instead of zero
when ac << b2.
Spring 2006
CISC101 - Prof. McLeod
25
The Effects of Roundoff Error - Cont.
• The examples above show what can happen
when two numbers that are very close are
subtracted.
• Remember that this effect is a direct result of
these numbers being stored with finite accuracy in
memory.
Spring 2006
CISC101 - Prof. McLeod
26
The Effects of Roundoff Error - Cont.
• A similar effect occurs when an attempt is made to add a
comparatively small number to a large number:
boolean aVal = ((1.0E10 + 1.0E-20)==1.0E10);
System.out.println(aVal);
• Prints out true to the screen
• Since 1.0E-20 is just too small to affect any of the bit
values used to store 1.0E10. The small number would
have to be about 1.0E-5 or larger to affect the large
number.
• So, keep this behaviour in mind when designing
expressions!
Spring 2006
CISC101 - Prof. McLeod
27
The Effect on Summations
• Taylor Series are used to approximate many
functions. For example:
( 1)i 1 xi
ln( 1  x )  
i
i 1

• For ln(2):
( 1)i 1
1 1 1
ln( 2)  
 1     ...
i
2 3 4
i 1

Spring 2006
CISC101 - Prof. McLeod
28
The Effect on Summations
• Since we cannot loop to infinity, how many terms
would be sufficient?
• Since the sum is stored in a finite memory space,
at some point the terms to be added will be much
smaller than the sum itself.
• If the sum is stored in a float, which has about 7
significant digits, a term of about 1x10-8 would not
be significant. So, i would be about 108 - that’s a
lot of iterations!
Spring 2006
CISC101 - Prof. McLeod
29
The Effect on Summations - Cont.
• On testing using a float, it took 33554433
iterations and 25540 msec to compute! (sum no
longer changing, value = 0.6931375)
• Math.log(2) = 0.6931471805599453
• So, roundoff error had a significant effect and the
summation did not even provide the correct value.
A float could only provide about 5 correct
significant digits, tops.
• For double, about 1015 iterations would be
required! (I didn’t try this one…)
• So, this series does not converge quickly!
Spring 2006
CISC101 - Prof. McLeod
30
The Effect on Summations - Cont.
• Here is another way to compute natural logs:
1
1 x 
2 i 1
ln 
x
  2
1 x 
i 0 2i  1

• Using x = 1/3 will provide ln(2).
Spring 2006
CISC101 - Prof. McLeod
31
The Effect on Summations - Cont.
• For float, this took 8 iterations and <1msec
(value = 0.6931472).
• Math.log(2) = 0.6931471805599453
• For double, it took 17 iterations, <1 msec to give
the value = 0.6931471805599451
• Using the Windows calculator ln(2) =
0.693147180559945309417232121458177 (!!)
• So, the use of the 17 iterations still introduced a
slight roundoff error.
• The second version is much faster!!
Spring 2006
CISC101 - Prof. McLeod
32
Numeric Calculations
• Error is introduced into a calculation through two
sources (assuming the formulae are correct!):
– The inherent error in the numbers used in the
calculation.
– Error resulting from roundoff error.
• Often the inherent error dominates the roundoff
error.
• But, watch for conditions of slow convergence or
ill-conditioned matrices, where roundoff error will
accumulate and end up swamping out the
inherent error.
Spring 2006
CISC101 - Prof. McLeod
33
Numeric Calculations - Cont.
• Once a number is calculated, it is very important
to be able to estimate the error using both
sources, if necessary.
• The error must be known in order that the number
produced by your program can be reported in a
valid manner.
• This is a non-trivial topic in numeric calculation
that we will not discuss in this course.
Spring 2006
CISC101 - Prof. McLeod
34