Sorting, Computer Arithmetic and Algebra

Download Report

Transcript Sorting, Computer Arithmetic and Algebra

IOI/ACM/Supercom 2004
Training
Session 7
Dr Kan Min-Yen
http://www.comp.nus.edu.sg/~kanmy/talks/040608-IOItraining-pub.htm
12 Jun 2004
IOI/ACM/Supercom Session 7
1
Topics and outline



Sorting
Computer Arithmetic and Algebra
Invariants and Number Theory
12 Jun 2004
IOI/ACM/Supercom Session 7
2
Sorting

Problems and Parameters

Comparison-based sorting

Non-comparison-based
12 Jun 2004
IOI/ACM/Supercom Session 7
3
What sort of problems?
Sorting is usually not the end goal, but a
prerequisite





_________
_________
_________
_________
_________
12 Jun 2004
IOI/ACM/Supercom Session 7
4
Two properties of sorting

Stable


Items with the same value are ordered in the
same way after the sort as they were before
Important for doing ____________ sorts


E.g., sorting by First name, Last name
In-place: sorts the items without needing
extra space

___________________________________
12 Jun 2004
IOI/ACM/Supercom Session 7
5
Comparison-based sort



Based on comparing two items
Many variants, but not the only way to sort
Discuss only the important ones for
programming contests
Selection, Insertion, Merge and Quick
12 Jun 2004
IOI/ACM/Supercom Session 7
6
Comparison Sorts
Comparison Sort Animation:
http://math.hws.edu/TMCM/java/xSortLab/

Selection




Algo: find min or max of unsorted portion
Heapsort: is selection sort with heap data structure
Remember: ____________________
Insertion


Algo: insert unsorted item in sorted array
Remember: ____________________
12 Jun 2004
IOI/ACM/Supercom Session 7
7
Comparison Sorts

Merge




Idea: divide and conquer, recursion
Algo: merge two sorted arrays in linear time
Remember: _________________________
Quick



Idea: randomization, pivot
Algo: divide problem to smaller and larger half based on a pivot
Remember: ________________________!
A problem with sorting as its core may not be best solved with
generic tool. Think before using a panacea like Quicksort.
12 Jun 2004
IOI/ACM/Supercom Session 7
8
Miscellaneous sort
Most sorting relies on comparisons between two
items.

Proven to be Θ(n log n)
Example: sort an array of distinct integers ranged 1-k
We’ll go over


Radix sort
Counting sort
12 Jun 2004
IOI/ACM/Supercom Session 7
9
What is radix?


Radix is the same as base. In decimal
system, radix = 10.
For example, the following is a decimal
number with 4 digits
12 Jun 2004
0
3
2
8
1st
Digit
2nd
Digit
3rd
Digit
4th
Digit
IOI/ACM/Supercom Session 7
10
Radix Sort

Suppose we are given n d-digit decimal
integers A[0..n-1], radix sort tries to do the
following:
for j = d to 1 {
By ignoring digit-1 up to digit-(j-1), form the
sorted array of the numbers
}
12 Jun 2004
IOI/ACM/Supercom Session 7
11
Radix Sort (Example)
Original
0123
2043
9738
1024
0008
2048
1773
1239
12 Jun 2004
Sorted Array
if we only
consider
digit-4
0
Group
using
4th digit
1
2
3  0123, 2043, 1773
Ungroup
4  1024
5
6
7
8  9738, 0008, 2048
9  1239
IOI/ACM/Supercom Session 7
0123
2043
1773
1024
9738
0008
2048
1239
12
Radix Sort (Example)
0  0008
Original
0123
2043
1773
1024
9738
0008
2048
1239
12 Jun 2004
Group
using
3rd digit
1
2  0123, 1024
3  9738, 1239
4  2043, 2048
5
6
7  1773
8
9
IOI/ACM/Supercom Session 7
Sorted Array if
we only
consider digits- 0008
3 and 4
0123
Ungroup
1024
9738
1239
2043
2048
1773
13
Sorted Array if
we only
consider digit-2,
3, and 4
0  0008, 1024, 2043, 2048
0008
1  0123
1024
2  1239
Radix Sort (Example)
Original
0008
0123
1024
9738
1239
2043
2048
1773
12 Jun 2004
Group
using
2nd digit
3
4
5
6
7  9738, 1773
8
9
Ungroup
2043
2048
0123
1239
9738
1773
IOI/ACM/Supercom Session 7
14
Radix Sort (Example)
1024
2043
2048
0123
1239
9738
1773
12 Jun 2004
Done!
0008
0  0008, 0123
Original
0008
Group
using
1st digit
1  1024, 1239, 1773
2  2043, 2048
3
4
5
6
0123
Ungroup
1024
1239
1773
2043
7
8
9  9738
IOI/ACM/Supercom Session 7
2048
9738
15
Details on Radix Sort
A ______ sorting algorithm
Done from least signficant to most signficant
Can be used with a higher base for better
efficiency
1.
2.
3.
Decide whether it’s really worth it

Works for integers, but not real, floating point
4.

But see: http://codercorner.com/RadixSortRevisited.htm
The combination of 1 and 2 can be used for combining
sorts in general
12 Jun 2004
IOI/ACM/Supercom Session 7
16
Counting Sort



Works by counting the occurrences of each data value.
Assumes that there are n data items in the range of 1..k
The algorithm can then determine, for each input element,
the amount of elements less than it.


For example if there are 9 elements less than element x, then x
belongs in the 10th data position.
These notes are from Cardiff’s Christine Mumford:
http://www.cs.cf.ac.uk/user/C.L.Mumford/
12 Jun 2004
IOI/ACM/Supercom Session 7
17
Counting Sort




The first for loop initialises
C[ ] to zero.
The second for loop
increments the values in
C[], according to their
frequencies in the data.
The third for loop adds all
previous values, making
C[] contain a cumulative
total.
The fourth for loop writes
out the sorted data into
array B[].
12 Jun 2004
countingsort(A[], B[], k)
for i = 1 to k do
C[i] = 0
for j = 1 to length(A) do
C[A[j]] = C[A[j]] + 1
for 2 = 1 to k do
C[i] = C[i] + C[i-1]
for j = 1 to length(A) do B[C[A[j]]]
= A[j]
C[A[j]] = C[A[j]] - 1
IOI/ACM/Supercom Session 7
18
Counting Sort
Demo from Cardiff:
http://www.cs.cf.ac.uk/user/C.L.Mumford/trist
an/CountingSort.html

12 Jun 2004
IOI/ACM/Supercom Session 7
19
Counting and radix sort

What’s their complexity?



Why do they work so fast??


No comparisons are made
Both are stable sorts, but not _______


Radix sort: O(dn) = O(n), if d << n
Counting sort: 2k + 2n = O(n), if k << n
Can you fix them?
When to use?

______________________
12 Jun 2004
IOI/ACM/Supercom Session 7
20
Quiz and discussion
There are no right answers…





Which sort better used for a very large
random array?
For sorting a deck of cards?
For sorting a list of first name, last names?
For sorting single English words
For sorting an almost sorted set?
12 Jun 2004
IOI/ACM/Supercom Session 7
21
Computer Arithmetic and Algebra

Big Numbers



Arbitrary precision arithmetic
Multiprecision arithmetic
Computer Algebra

Dealing with algebraic
expressions a la Maple,
Mathematica
12 Jun 2004
IOI/ACM/Supercom Session 7
22
Arithmetic



Want to do standard arithmetic operations
on very large/small numbers
Can’t fit representation of numbers in
standard data types
What to do?

Sassy answer: ________________
12 Jun 2004
IOI/ACM/Supercom Session 7
23
Two representations


How to represent: 10 00000 00000 02003
Linked list: 1e16  2e3  3e0




Array: <as above>





Dense representation
Good for numbers with different or arbitrary widths
Where should the head of the linked list point to?
Sparse representation
Good for problems in the general case
Don’t forget to store the sign bit somewhere
Which base to choose: 10 or 32?
How to represent arbitrary large real numbers?
12 Jun 2004
IOI/ACM/Supercom Session 7
24
Standard operations
Most large number operations rely on techniques that
are the same as primary school techniques





Adding
Subtracting
Multiplying
Dividing
Exponentiation / Logarithms
12 Jun 2004
IOI/ACM/Supercom Session 7
25
Algorithms for big numbers

Addition of bigints x and y



Complicated part is to deal with the carry
What about adding lots of bigints together?
Solution: do all the adds first then worry about
the carries
12 Jun 2004
IOI/ACM/Supercom Session 7
26
Canonicalization and adding

Canonicalizing




12E2 + 34E0 => 1E3 + 2E2 + 3E1 + 4E0
Strip coefficient of values larger than B and carry to next
value
Reorder sparse representation if necessary
Adding


Iteratively add all Ai Bi … Zi then canonicalize
Hazard:

12 Jun 2004
Data type for each entry _________________________
____________________________ overflow will occur
IOI/ACM/Supercom Session 7
27
Algorithms for big numbers

Subtraction




Like addition but requires borrowing
(reverse carry)
Place the higher absolute magnitude number at
the top
Applicable to addition of mixed sign numbers
Comparison

Start from higher order bits then work
backwards
12 Jun 2004
IOI/ACM/Supercom Session 7
28
Multiplying

Given two big integers X and Y in canonical form:

the big integer Z = X*Y can be obtained thanks to the
formula:

Notes:



This is the obvious way we were taught in primary school, the
complexity is Θ(N2)
Canonicalize after each step to avoid overflow in coefficients
To make this operation valid, B2 must fit in the coefficient
12 Jun 2004
IOI/ACM/Supercom Session 7
29
Shift Multiplication

If a number is encoded in base B

Left shift multiplies by B
Right shift multiples by B

Yet another reason to use arrays vs. linked lists


If your problem requires many multiplications and
divisions by a fixed number b, consider using b as
the base for the number
12 Jun 2004
IOI/ACM/Supercom Session 7
30
Karatsuba Multiplication
We can split two big numbers in half:
X = X0 + B X1 and Y = Y0 + B Y1
Then the product XY is given by
(X0 + BX1) (Y0 + BY1)
This results in three terms:
X0Y0 + B (X0Y1 + X1Y0) + B2(X1Y1)
Look at the middle term. We can get the middle term almost
for free by noting that:
(X0 + X1) (Y0 + Y1) = X0Y0 + X0Y1 + X1Y0 + X1Y1
X0Y1 + X1Y0 = (X0 + X1) (Y0 + Y1) - X0Y0 - X1Y1
12 Jun 2004
IOI/ACM/Supercom Session 7
31
Karatsuba Demonstration
12 * 34
Given: X1 = 1, X0 = 2,
Y1 = 3, Y0 = 4
Calculate: X0Y0 = 8, X1Y1 = 3
(X0+X1)(Y0+Y1) = 3*7 = 21
Final middle term = 21 – 8 – 3 = 10
Solution: 8 + 10 * 101 + 3 * 102 = 408

Notes:
1.5)
 Recursive, complexity is about Θ(n
There’s a better way: FFT based multiplication not taught here that is Θ (n log n)
http://numbers.computation.free.fr/Constants/Algorithms/fft.html

12 Jun 2004
IOI/ACM/Supercom Session 7
32
Division
Algo: long division
(Skiena and Revilla, pg 109): Iterate



shifting the remainder to the left including the
next digit
Subtract off instances of the divisor
Demo:
http://www.mathsisfun.com/long_division2.html
12 Jun 2004
IOI/ACM/Supercom Session 7
33
Exponentiation





How do you calculate x256?
x256 = ((((((((x2)2)2)2)2)2)2)
What about x255?
Store x, x2, x4, x8, x16, x32, x64, x128 on the
way up.
Complexity Θ(log n)
12 Jun 2004
IOI/ACM/Supercom Session 7
34
Computer Algebra

Applicable when you need an exact
computation of something



1/7 does not exactly equal 0.142857142857
1/7 * 7 = 1
Or when you need consider polynomials with
different variables
12 Jun 2004
IOI/ACM/Supercom Session 7
35
Data structures

How to represent x5 + 2x – 1?




Dense (array): 15 04 03 02 21 -10
Sparse (linked list): [1,5][2,1][-1,0]
What about arbitrary
expressions?
Solution: use an
expression tree
(e.g. a+b*c)
12 Jun 2004
IOI/ACM/Supercom Session 7
36
Simplification: Introduction

But CA systems have to deal with equivalency between
different forms:



So an idea is to push all equations to a canonical form.


(x-2) (x+2)
X2 – 4
Like Maple “simplify”
Question: how do we implement simplify?
12 Jun 2004
IOI/ACM/Supercom Session 7
37
Simplify - Transforming Negatives
Why?


Kill off
subtraction,
negation
Addition is
commutative
To think about: How is this
related to big integer computation?
12 Jun 2004
IOI/ACM/Supercom Session 7
38
Simplify – Leveling Operators

Combine like binary trees to n-ary trees
12 Jun 2004
IOI/ACM/Supercom Session 7
39
Simplify – Rational Expressions

Expressions with * and / will be rewritten so that
there is a single / node at the top, with only *
operators below
12 Jun 2004
IOI/ACM/Supercom Session 7
40
Simplify – Collecting Like Terms
12 Jun 2004
IOI/ACM/Supercom Session 7
41
Exact / Rational Arithmetic


Need to store fractions. What data structure can be
used here?
Problems:





Keep 4/7 as 4/7 but 4/8 as 1/2
How to do this?
Simplification and factoring
Addition and subtraction need computation of greatest
common denominator
Note division of a/b  c/d is just a/b  d/c
12 Jun 2004
IOI/ACM/Supercom Session 7
42
Invariants


Sometimes a problem is easier than it looks
Look for another way to define the problem
in terms of indirect, fixed quantities
12 Jun 2004
IOI/ACM/Supercom Session 7
43
Invariants



Chocolate bar problem
You are given a chocolate bar, which consists of r
rows and c columns of small chocolate squares.
Your task is to break the bar into small squares.
What is the smallest number of splits to completely
break it?
12 Jun 2004
IOI/ACM/Supercom Session 7
44
Variants of this puzzle?





Breaking irregular shaped objects
Assembling piles of objects
Fly flying between two trains
Adding all integers in a range (1 … 1000)
Also look for simplifications or irrelevant
information in a problem
12 Jun 2004
IOI/ACM/Supercom Session 7
45
The Game of Squares and Circles



Start with c circles and s squares.
On each move a player selects two shapes.
These two are replaced by one according to the
following rule:


Identical shapes are replaced with a square. Different
shapes are replaced with a circle.
At the end of the game, you win if the last shape
is a circle. Otherwise, the computer wins.
12 Jun 2004
IOI/ACM/Supercom Session 7
46
What’s the key?


Parity of circles and squares is invariant
After every move:


if # circles was even, _____________
If # circles was odd, _____________
12 Jun 2004
IOI/ACM/Supercom Session 7
47
Sam Loyd’s Fifteen Puzzle


Given a configuration, decide whether it is
solvable or not:
8
7
12
15
1
5
2
13
6
3
9
14
4
10
11
Possible?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Key: Look for an invariant over moves
12 Jun 2004
IOI/ACM/Supercom Session 7
48
Solution to Fifteen


Look for inversion invariance
Inversion: when two tiles out of order in a row
inverted
not inverted
11

10
11
For a given puzzle configuration, let N denote the sum of:



10
the total number of inversions
the row number of the empty square.
After a legal move, an odd N remains odd whereas an even
N remains even.
12 Jun 2004
IOI/ACM/Supercom Session 7
49
Fifteen invariant
a
b
c
d

Note: if you are asked for optimal path, then
it’s a search problem
12 Jun 2004
IOI/ACM/Supercom Session 7
50
Story time: Euclid of Alexandria
(~325 BC – 265 BC)
Greek geometer
 Led school in Alexandria
 Probably wroteThe Elements, the
definitive math text until the 19th
century
Developed geometry, number theory and others
from a set of 5 postulates
Knowledge survived Dark Ages in the Western
world as it was translated into Arabic



12 Jun 2004
IOI/ACM/Supercom Session 7
51
Euclidian Algorithm for GCD



Rational calculation require the calculation of
the greatest common divisor
The Euclidean algorithm is a good way to do
this. It’s a recursive procedure:
gcd(N,M) = gcd(M, N mod M)
Demo:

http://www.math.umn.edu/~garrett/crypto/a01/
Euclid.html
12 Jun 2004
IOI/ACM/Supercom Session 7
52
Sieve of Eratosthenes
Initialization:
 Keep a boolean array [1...n] = PRIME. We will
change entries to COMPOSITE as we go on.
Algorithm:
 Set k=1. iterate towards √n
 Find the first PRIME in the list greater than k. (e.g.,
2.) Call it m. Change the numbers 2m, 3m, 4m, ...
to COMPOSITE.
 Set k=m and repeat.
12 Jun 2004
IOI/ACM/Supercom Session 7
53
Sieve of Eratosthenes
Demo from Univ. of Utah
(Peter Alfeld):
http://www.math.utah.edu/~alfeld/
Eratosthenes.html


Notes:

If at n in the sieve, we know that all numbers
not crossed out under n2 are also prime.
12 Jun 2004
IOI/ACM/Supercom Session 7
54
For self-review
Other topics related that you should cover on
your own
 Roman number conversion: these problems
crop up every once in a while
 Playing cards and other common games
 Divisibility criteria and modular arithmetic
12 Jun 2004
IOI/ACM/Supercom Session 7
55
For Fun: Monty Hall

Suppose you're on a game show, and you're given the choice of three
doors. Behind one door is a car, behind the others, goats. You pick a
door, say number 1, and the host, who knows what's behind the doors,
opens another door, say number 3, which has a goat. He says to you,
"Do you want to pick door number 2?" Is it to your advantage to switch
your choice of doors?
12 Jun 2004
IOI/ACM/Supercom Session 7
56
Analysis of the Monty Hall Dilemma
Door
case A
B
C
1
bad
bad
good
2
bad
good bad
3
good bad
bad

assume you choose door A: you have a 1/3 chance of a good prize.

But (this is key) Monty knows what is behind each door, and
shows a bad one.

In cases 1 and 2, he eliminates doors B and C respectively (which
happen to be the only remaining bad door) so a good door is left:
SWITCH!

Only in case 3 (you lucked out in your original 1 in 3 chances) does
switching hurt you.
12 Jun 2004
IOI/ACM/Supercom Session 7
57
References

Books and
websites used to
compile this
lecture
12 Jun 2004
IOI/ACM/Supercom Session 7
58
References


Skiena, S. and Revilla, M. Programming
Challenges
Gathen, J. and Gerhard, J. Modern Computer
Algebra
12 Jun 2004
IOI/ACM/Supercom Session 7
59
References

Arbitrary-Precision Arithmetic
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE144.HTM
* World of Seven
http://www.comp.nus.edu.sg/~stevenha/programming/programming.html

Common Mistakes in Online and Real-time Programming
Contests
http://www.acm.org/crossroads/xrds7-5/contests.html

Mathematical Constants and Computation
http://numbers.computation.free.fr/Constants/constants.html

Euclid’s Elements
http://mathworld.wolfram.com/Elements.html

Monty Hall Problem
http://www.cut-the-knot.org/hall.shtml

Cut the Knot
http://www.cut-the-knot.org/
12 Jun 2004
IOI/ACM/Supercom Session 7
60