IOI/ACM/Supercom 2004 Training

Download Report

Transcript IOI/ACM/Supercom 2004 Training

IOI 2005 Training
Dr Kan Min-Yen
Topics and outline



Sorting
Computer Arithmetic and Algebra
Invariants and Number Theory
Sorting

Problems and Parameters

Comparison-based sorting

Non-comparison-based
www.personaltouchmailing.com
What sort of problems?
Sorting is usually not the end goal, but a prerequisite







(Efficient) searching!
Uniqueness / Duplicates
Prioritizing (c.f. priority queues)
Median / Selection
Frequency Counting
Set operations
Target Pair
Two properties of sorting

Stable


Items with the same value are ordered in the
same way after the sort as they were before
Important for doing multiple stage sorts


E.g., sorting by First name, Last name
In-place: sorts the items without needing
extra space

Space proportional to the size of the input n
Comparison-based sort



Based on comparing two items
Many variants, but not the only way to sort
Discuss only the important ones for
programming contests
Selection, Insertion, Merge and Quick
Comparison Sorts
Comparison Sort Animation:
http://math.hws.edu/TMCM/java/xSortLab/

Selection




Algo: find min or max of unsorted portion
Heapsort: is selection sort with heap data structure
Remember: Minimize amount of swaps
Insertion


Algo: insert unsorted item in sorted array
Remember: Minimize amount of data
Comparison Sorts

Merge




Idea: divide and conquer, recursion
Algo: merge two sorted arrays in linear time
Remember: Don’t recurse to the base case if using this sort, not inplace
Quick



Idea: randomization, pivot
Algo: divide problem to smaller and larger half based on a pivot
Remember: Partition can be used to solve problems too!
A problem with sorting as its core may not be best solved with
generic tool. Think before using a panacea like Quicksort.
Miscellaneous sort
Most sorting relies on comparisons between two
items.

Proven to be Θ(n log n)
Example: sort an array of distinct integers ranged 1-k
We’ll go over


Radix sort
Counting sort
What is radix?


Radix is the same as base. In decimal
system, radix = 10.
For example, the following is a decimal
number with 4 digits
0
3
2
8
1st
Digit
2nd
Digit
3rd
Digit
4th
Digit
Radix Sort

Suppose we are given n d-digit decimal
integers A[0..n-1], radix sort tries to do the
following:
for j = d to 1 {
By ignoring digit-1 up to digit-(j-1), form the
sorted array of the numbers
}
Radix Sort (Example)
Original
0123
2043
9738
1024
0008
2048
1773
1239
0
Group
using
4th digit
Sorted Array
if we only
consider
digit-4
1
2
3  0123, 2043, 1773
Ungroup
4  1024
5
6
7
8  9738, 0008, 2048
9  1239
0123
2043
1773
1024
9738
0008
2048
1239
Radix Sort (Example)
0  0008
Original
0123
2043
1773
1024
9738
0008
2048
1239
Group
using
3rd digit
1
2  0123, 1024
3  9738, 1239
4  2043, 2048
5
6
7  1773
8
9
Sorted Array if
we only
consider digits- 0008
3 and 4
0123
Ungroup
1024
9738
1239
2043
2048
1773
Sorted Array if
we only
consider digit-2,
3, and 4
0  0008, 1024, 2043, 2048
0008
1  0123
1024
2  1239
Radix Sort (Example)
Original
0008
0123
1024
9738
1239
2043
2048
1773
Group
using
2nd digit
3
4
5
6
7  9738, 1773
8
9
Ungroup
2043
2048
0123
1239
9738
1773
Radix Sort (Example)
1024
2043
2048
0123
1239
9738
1773
Done!
0008
0  0008, 0123
Original
0008
Group
using
1st digit
1  1024, 1239, 1773
2  2043, 2048
3
4
5
6
7
8
9  9738
0123
Ungroup
1024
1239
1773
2043
2048
9738
Details on Radix Sort
A stable sorting algorithm
Done from least signficant to most signficant
Can be used with a higher base for better
efficiency
1.
2.
3.

Decide whether it’s really worth it
Works for integers, but not real, floating point
4.

But see: http://codercorner.com/RadixSortRevisited.htm
The combination of 1 and 2 can be used for combining
sorts in general
Counting Sort



Works by counting the occurrences of each data value.
Assumes that there are n data items in the range of 1..k
The algorithm can then determine, for each input element,
the amount of elements less than it.


For example if there are 9 elements less than element x, then x
belongs in the 10th data position.
These notes are from Cardiff’s Christine Mumford:
http://www.cs.cf.ac.uk/user/C.L.Mumford/
Counting Sort




The first for loop initialises
C[ ] to zero.
The second for loop
increments the values in
C[], according to their
frequencies in the data.
The third for loop adds all
previous values, making
C[] contain a cumulative
total.
The fourth for loop writes
out the sorted data into
array B[].
countingsort(A[], B[], k)
for i = 1 to k do
C[i] = 0
for j = 1 to length(A) do
C[A[j]] = C[A[j]] + 1
for 2 = 1 to k do
C[i] = C[i] + C[i-1]
for j = 1 to length(A) do B[C[A[j]]]
= A[j]
C[A[j]] = C[A[j]] - 1
Counting Sort
Demo from Cardiff:
http://www.cs.cf.ac.uk/user/C.L.Mumford/trist
an/CountingSort.html

Counting and radix sort

What’s their complexity?



Why do they work so fast??


No comparisons are made
Both are stable sorts, but not in-place


Radix sort: O(dn) = O(n), if d << n
Counting sort: 2k + 2n = O(n), if k << n
Can you fix them to be in-place?
When to use?

When you have lots of items in a fixed range
Quiz and discussion
There are no right answers…





Which sort better used for a very large
random array?
For sorting a deck of cards?
For sorting a list of first name, last names?
For sorting single English words
For sorting an almost sorted set?
Computer Arithmetic and Algebra

Big Numbers



Arbitrary precision arithmetic
Multiprecision arithmetic
Computer Algebra

Dealing with algebraic
expressions a la Maple,
Mathematica
Arithmetic



Want to do standard arithmetic operations
on very large/small numbers
Can’t fit representation of numbers in
standard data types
What to do?

Sassy answer: use java and BigInt class
Two representations


How to represent: 10 00000 00000 02003
Linked list: 1e16  2e3  3e0




Array: <as above>





Dense representation
Good for numbers with different or arbitrary widths
Where should the head of the linked list point to?
Sparse representation
Good for problems in the general case
Don’t forget to store the sign bit somewhere
Which base to choose: 10 or 32?
How to represent arbitrary large real numbers?
Standard operations
Most large number operations rely on techniques that
are the same as primary school techniques





Adding
Subtracting
Multiplying
Dividing
Exponentiation / Logarithms
Algorithms for big numbers

Addition of bigints x and y



Complicated part is to deal with the carry
What about adding lots of bigints together?
Solution: do all the adds first then worry about
the carries
Canonicalization and adding

Canonicalizing




12E2 + 34E0 => 1E3 + 2E2 + 3E1 + 4E0
Strip coefficient of values larger than B and carry
to next value
Reorder sparse representation if necessary
Adding


Iteratively add all Ai Bi … Zi then canonicalize
Hazard:

Data type for each entry must be able to hold
maximum expected data or overflow will occur
Algorithms for big numbers

Subtraction




Like addition but requires borrowing
(reverse carry)
Place the higher absolute magnitude number at
the top
Applicable to addition of mixed sign numbers
Comparison

Start from higher order bits then work
backwards
Multiplying

Given two big integers X and Y in canonical form:

the big integer Z = X*Y can be obtained thanks to the
formula:

Notes:



This is the obvious way we were taught in primary school, the
complexity is Θ(N2)
Canonicalize after each step to avoid overflow in coefficients
To make this operation valid, B2 must fit in the coefficient
Shift Multiplication

If a number is encoded in base B

Left shift multiplies by B
Right shift multiples by B

Yet another reason to use arrays vs. linked lists


If your problem requires many multiplications and
divisions by a fixed number b, consider using b as
the base for the number
Karatsuba Multiplication
We can split two big numbers in half:
X = X0 + B X1 and Y = Y0 + B Y1
Then the product XY is given by
(X0 + BX1) (Y0 + BY1)
This results in three terms:
X0Y0 + B (X0Y1 + X1Y0) + B2(X1Y1)
Look at the middle term. We can get the middle term almost
for free by noting that:
(X00 +
+X
X11)) (Y
(Y00 +
X00Y
Y00 + X
Y11
(X
+ YY11)) = X
+ XX11YY00 + XX11Y
X00Y
Y11 +
=
-
Karatsuba Demonstration
12 * 34
Given: X1 = 1, X0 = 2,
Y1 = 3, Y0 = 4
Calculate: X0Y0 = 8, X1Y1 = 3
(X0+X1)(Y0+Y1) = 3*7 = 21
Final middle term = 21 – 8 – 3 = 10
Solution: 8 + 10 * 101 + 3 * 102 = 408

Notes:
1.5)
 Recursive, complexity is about Θ(n
There’s a better way: FFT based multiplication not taught here that is Θ (n log n)
http://numbers.computation.free.fr/Constants/Algorithms/fft.html

Division
Algo: long division
(Skiena and Revilla, pg 109): Iterate



shifting the remainder to the left including the
next digit
Subtract off instances of the divisor
Demo:
http://www.mathsisfun.com/long_division2.html
Exponentiation





How do you calculate x256?
x256 = ((((((((x2)2)2)2)2)2)2)
What about x255?
Store x, x2, x4, x8, x16, x32, x64, x128 on the
way up.
Complexity Θ(log n)
Computer Algebra

Applicable when you need an exact
computation of something



1/7 does not exactly equal 0.142857142857
1/7 * 7 = 1
Or when you need consider polynomials with
different variables
Data structures

How to represent x5 + 2x – 1?




Dense (array): 15 04 03 02 21 -10
Sparse (linked list): [1,5][2,1][-1,0]
What about arbitrary
expressions?
Solution: use an
expression tree
(e.g. a+b*c)
Simplification: Introduction

But CA systems have to deal with equivalency between
different forms:



So an idea is to push all equations to a canonical form.


(x-2) (x+2)
X2 – 4
Like Maple “simplify”
Question: how do we implement simplify?
Simplify - Transforming Negatives
Why?


Kill off
subtraction,
negation
Addition is
commutative
To think about: How is this
related to big integer computation?
Simplify – Leveling Operators

Combine like binary trees to n-ary trees
Simplify – Rational Expressions

Expressions with * and / will be rewritten so that
there is a single / node at the top, with only *
operators below
Simplify – Collecting Like Terms
Exact / Rational Arithmetic


Need to store fractions. What data structure can be
used here?
Problems:





Keep 4/7 as 4/7 but 4/8 as 1/2
How to do this?
Simplification and factoring
Addition and subtraction need computation of greatest
common denominator
Note division of a/b  c/d is just a/b  d/c
Invariants


Sometimes a problem is easier than it looks
Look for another way to define the problem
in terms of indirect, fixed quantities
Invariants



Chocolate bar problem
You are given a chocolate bar, which consists of r
rows and c columns of small chocolate squares.
Your task is to break the bar into small squares.
What is the smallest number of splits to completely
break it?
Variants of this puzzle?





Breaking irregular shaped objects
Assembling piles of objects
Fly flying between two trains
Adding all integers in a range (1 … 1000)
Also look for simplifications or irrelevant
information in a problem
The Game of Squares and Circles



Start with c circles and s squares.
On each move a player selects two shapes.
These two are replaced by one according to the
following rule:


Identical shapes are replaced with a square. Different
shapes are replaced with a circle.
At the end of the game, you win if the last shape
is a circle. Otherwise, the computer wins.
What’s the key?


Parity of circles and squares is invariant
After every move:


if # circles was even, still will be even
If # circles was odd, still will be odd
Sam Loyd’s Fifteen Puzzle


Given a configuration, decide whether it is
solvable or not:
8
7
12
15
1
5
2
13
6
3
9
14
4
10
11
Possible?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Key: Look for an invariant over moves
Solution to Fifteen


Look for inversion invariance
Inversion: when two tiles out of order in a row
inverted
not inverted
11

10
11
For a given puzzle configuration, let N denote the sum of:



10
the total number of inversions
the row number of the empty square.
After a legal move, an odd N remains odd whereas an even
N remains even.
Fifteen invariant
a
b
c
d

Note: if you are asked for optimal path, then
it’s a search problem
Story time: Euclid of Alexandria
(~325 BC – 265 BC)
How many others have been on the
bestseller list for over 1,000 years?
Greek geometer
 Led school in Alexandria
 Probably wroteThe Elements, the
definitive math text until the 19th
century
Developed geometry, number theory and others
from a set of 5 postulates
Knowledge survived Dark Ages in the Western
world as it was translated into Arabic



Euclidian Algorithm for GCD



Rational calculation require the calculation of
the greatest common divisor
The Euclidean algorithm is a good way to do
this. It’s a recursive procedure:
gcd(N,M) = gcd(M, N mod M)
Demo:

http://www.math.umn.edu/~garrett/crypto/a01/
Euclid.html
Sieve of Eratosthenes
Initialization:
 Keep a boolean array [1...n] = PRIME. We will
change entries to COMPOSITE as we go on.
Algorithm:
 Set k=1. iterate towards √n
 Find the first PRIME in the list greater than k. (e.g.,
2.) Call it m. Change the numbers 2m, 3m, 4m, ...
to COMPOSITE.
 Set k=m and repeat.
Sieve of Eratosthenes
Demo from Univ. of Utah
(Peter Alfeld):
http://www.math.utah.edu/~alfeld/
Eratosthenes.html


Notes:

If at n in the sieve, we know that all numbers
not crossed out under n2 are also prime.
Review Time
For self-review
Other topics related that you should cover on
your own
 Roman number conversion: these problems
crop up every once in a while
 Playing cards and other common games
 Divisibility criteria and modular arithmetic
For Fun: Monty Hall

Suppose you're on a game show, and you're given the choice of three
doors. Behind one door is a car, behind the others, goats. You pick a
door, say number 1, and the host, who knows what's behind the doors,
opens another door, say number 3, which has a goat. He says to you,
"Do you want to pick door number 2?" Is it to your advantage to switch
your choice of doors?
Analysis of the Monty Hall Dilemma
Door
case A
B
C
1
bad
bad
good
2
bad
good bad
3
good bad
bad

assume you choose door A: you have a 1/3 chance of a good prize.

But (this is key) Monty knows what is behind each door, and
shows a bad one.

In cases 1 and 2, he eliminates doors B and C respectively (which
happen to be the only remaining bad door) so a good door is left:
SWITCH!

Only in case 3 (you lucked out in your original 1 in 3 chances) does
switching hurt you.
References

Books and
websites used to
compile this
lecture
References


Skiena, S. and Revilla, M. Programming
Challenges
Gathen, J. and Gerhard, J. Modern Computer
Algebra
References

Arbitrary-Precision Arithmetic
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE144.HTM
* World of Seven
http://www.comp.nus.edu.sg/~stevenha/programming/programming.html

Common Mistakes in Online and Real-time Programming
Contests
http://www.acm.org/crossroads/xrds7-5/contests.html

Mathematical Constants and Computation
http://numbers.computation.free.fr/Constants/constants.html

Euclid’s Elements
http://mathworld.wolfram.com/Elements.html

Monty Hall Problem
http://www.cut-the-knot.org/hall.shtml

Cut the Knot
http://www.cut-the-knot.org/