Algorithms and Data Structures

Download Report

Transcript Algorithms and Data Structures

Algorithms and Data
Structures
Simonas Šaltenis
Nykredit Center for Database Research
Aalborg University
[email protected]
September 10, 2001
Administration

People







Simonas Šaltenis
Anders B. Christensen
Daniel Povlsen
Home page http://www.cs.auc.dk/~simas/ad01
Check the homepage frequently!
Course book “ Introduction to Algorithms”, 2.ed
Cormen et al.
Lectures, Fib 15, A, 8:15-10:00, Mondays and
Thursdays
September 10, 2001
2
Administration (2)



Exercises: every week after the lecture
Exam: SE course, written
Troubles



Simonas Šaltenis
E1-215
[email protected]
September 10, 2001
3
Syllabus









Introduction (1)
Correctness, analysis of algorithms (2,3,4)
Sorting (1,6,7)
Elementary data structures, ADTs (10)
Searching, advanced data structures (11,12,13,18)
Dynamic programming (15)
Graph algorithms (22,23,24)
Computational Geometry (33)
NP-Completeness (34)
September 10, 2001
4
What is it all about?

Solving problems





Get me from home to work
Balance my checkbook
Simulate a jet engine
Graduate from SPU
Using a computer to help solve problems




Designing programs (architecture, algorithms)
Writing programs
Verifying programs
Documenting programs
September 10, 2001
5
Data Structures and Algorithms

Algorithm



Outline, the essence of a computational
procedure, step-by-step instructions
Program – an implementation of an
algorithm in some programming language
Data structure

Organization of data needed to solve the
problem
September 10, 2001
6
Overall Picture
Data Structure and
Algorithm Design Goals
Implementation
Goals
Robustness
Correctness
Adaptability
Efficiency
Reusability
September 10, 2001
7
Overall Picture (2)

This course is not about:




Programming languages
Computer architecture
Software architecture
Software design and implementation principles


Issues concerning small and large scale
programming
We will only touch upon the theory of
complexity and computability
September 10, 2001
8
History




Name: Persian mathematician Mohammed
al-Khowarizmi, in Latin became Algorismus
First algorithm: Euclidean Algorithm,
greatest common divisor, 400-300 B.C.
19th century – Charles Babbage, Ada
Lovelace.
20th century – Alan Turing, Alonzo Church,
John von Neumann
September 10, 2001
9
Algorithmic problem
Specification
of input

?
Specification
of output as
a function of
input
Infinite number of input instances satisfying the
specification. For example:

A sorted, non-decreasing sequence of natural
numbers. The sequence is of non-zero, finite length:


1, 20, 908, 909, 100000, 1000000000.
3.
September 10, 2001
10
Algorithmic Solution
Input instance,
adhering to
the
specification


Algorithm
Output
related to
the input as
required
Algorithm describes actions on the input
instance
Infinitely many correct algorithms for the same
algorithmic problem
September 10, 2001
11
Example: Sorting
OUTPUT
INPUT
a permutation of the
sequence of numbers
sequence of numbers
a1, a2, a3,….,an
2
5
4
10
Sort
7
Correctness
For any given input the algorithm
halts with the output:
• b1 < b2 < b3 < …. < bn
• b1, b2, b3, …., bn is a
permutation of a1, a2, a3,….,an
September 10, 2001
b1,b2,b3,….,bn
2
4
5
7
10
Running time
Depends on
• number of elements (n)
• how (partially) sorted
they are
• algorithm
12
Insertion Sort
A
3
4
6
8
9
1
7
2
j
5 1
n
i
Strategy
• Start “empty handed”
• Insert a card in the right
position of the already sorted
hand
• Continue until all cards are
inserted/sorted
September 10, 2001
for j=2 to length(A)
do key=A[j]
“insert A[j] into the
sorted sequence A[1..j-1]”
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
13
Analysis of Algorithms

Efficiency:



Running time
Space used
Efficiency as a function of input size:


Number of data elements (numbers, points)
A number of bits in an input number
September 10, 2001
14
The RAM model


Very important to choose the level of
detail.
The RAM model:

Instructions (each taking constant time):




Arithmetic (add, subtract, multiply, etc.)
Data movement (assign)
Control (branch, subroutine call, return)
Data types – integers and floats
September 10, 2001
15
Analysis of Insertion Sort

Time to compute the running time as a
function of the input size
for j=2 to length(A)
do key=A[j]
“insert A[j] into the
sorted sequence A[1..j-1]”
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
September 10, 2001
cost
c1
c2
0
c3
c4
c5
c6
c7
times
n
n-1
n-1
n-1
n



t
(t  1)
j 2 j
n
(t  1)
j 2 j
n-1
j2 j
n
16
Best/Worst/Average Case



Best case: elements already sorted 
tj=1, running time = f(n), i.e., linear time.
Worst case: elements are sorted in
inverse order
 tj=j, running time = f(n2), i.e., quadratic
time
Average case: tj=j/2, running time =
f(n2), i.e., quadratic time
September 10, 2001
17
Best/Worst/Average Case (2)

For a specific size of input n, investigate
running times for different input instances:
6n
5n
4n
3n
2n
1n
September 10, 2001
18
Best/Worst/Average Case (3)

For inputs of all sizes:
worst-case
average-case
Running time
6n
5n
best-case
4n
3n
2n
1n
1
2
3
4
5
6
7
8
9 10 11 12 …..
Input instance size
September 10, 2001
19
Best/Worst/Average Case (4)

Worst case is usually used:




It is an upper-bound and in certain application
domains (e.g., air traffic control, surgery)
knowing the worst-case time complexity is of
crucial importance
For some algorithms worst case occurs fairly
often
The average case is often as bad as the
worst case
Finding the average case can be very difficult
September 10, 2001
20
Growth Functions
1,00E+10
1,00E+09
1,00E+08
n
log n
sqrt n
n log n
100n
n^2
n^3
1,00E+07
T(n)
1,00E+06
1,00E+05
1,00E+04
1,00E+03
1,00E+02
1,00E+01
1,00E+00
1,00E-01
2
4
8
16
32
64
128
256
512
1024
n
September 10, 2001
21
Growth Functions (2)
1,00E+155
1,00E+143
1,00E+131
1,00E+119
n
log n
sqrt n
n log n
100n
n^2
n^3
2^n
1,00E+107
T(n)
1,00E+95
1,00E+83
1,00E+71
1,00E+59
1,00E+47
1,00E+35
1,00E+23
1,00E+11
1,00E-01
2
4
8
16
32
64
128
256
512
1024
n
September 10, 2001
22
That’s it?



Is insertion sort the best approach to
sorting?
Alternative strategy based on divide and
conquer
MergeSort




sorting the numbers <4, 1, 3, 9> is split into
sorting <4, 1> and <3, 9> and
merging the results
Running time f(n log n)
September 10, 2001
23
Example 2: Searching
OUTPUT
INPUT
• sequence of numbers (database)
• a single number (query)
a1, a2, a3,….,an; q
• an index of the found
number or NIL
j
2
5
4
10
7;
5
2
2
5
4
10
7;
9
NIL
September 10, 2001
24
Searching (2)
j=1
while j<=length(A) and A[j]!=q
do j++
if j<=length(A) then return j
else return NIL


Worst-case running time: f(n), average-case:
f(n/2)
We can’t do better. This is a lower bound for the
problem of searching in an arbitrary sequence.
September 10, 2001
25
Example 3: Searching
OUTPUT
INPUT
• sorted non-descending sequence
of numbers (database)
• a single number (query)
a1, a2, a3,….,an; q
• an index of the found
number or NIL
j
2
4
5
7
10;
5
2
2
4
5
7
10;
9
NIL
September 10, 2001
26
Binary search

Idea: Divide and conquer, one of the key design
techniques
left=1
right=length(A)
do
j=(left+right)/2
if A[j]==q then return j
else if A[j]>q then right=j-1
else left=j+1
while left<=right
return NIL
September 10, 2001
27
Binary search – analysis

How many times the loop is executed:



With each execution its length is cult in half
How many times do you have to cut n in half
to get 1?
lg n
September 10, 2001
28
Next Week


Correctness of algorithms
Asymptotic analysis, big O notation.
September 10, 2001
29