Introduction

Download Report

Transcript Introduction

COMP550: Algorithms & Analysis
Mon/Wed/Fri 9:05am - 9:55am (FB 009)
http://www.cs.unc.edu/~zsguo/comp550.html
Zhishan Guo
SN 126
[email protected]
Office Hours: After Class on Mon. & Wed.
Or By Appointment
UNC Chapel Hill
Z. Guo
Solving a Computational Problem
• Problem definition & specification
– specify input, output and constraints
• Algorithm design & analysis
– devise a correct & efficient algorithm
• Implementation planning
• Coding, testing and verification
UNC Chapel Hill
Our Focus
Primary Focus
Develop thinking ability
– formal thinking
(proof techniques & analysis)
– problem solving skills
(algorithm design and application)
About Coding/Programming…
UNC Chapel Hill
Goals
• Be very familiar with a collection of core algorithms.
• Be fluent in algorithm design paradigms: divide & conquer,
greedy algorithms, randomization, dynamic programming,
approximation methods.
• Be able to analyze the correctness and runtime performance
of a given algorithm.
• Be familiar with the inherent complexity (lower bounds &
intractability) of some problems.
• Be intimately familiar with basic data structures.
• Be able to apply techniques in practical problems.
UNC Chapel Hill
What Will We Be Doing
•
•
•
•
•
•
•
Examine interesting problems
Devise algorithms for solving them
Prove their correctness
Analyze their runtime performance
Study data structures & core algorithms
Learn problem-solving techniques
Applications in real-world problems
The inverted Classroom Style will be very limited.
UNC Chapel Hill
Congressional Apportionment
Article I, Section 2 of the United States Constitution
requires that Representatives and direct Taxes
shall be apportioned among the several States
which may be included within this Union,
according to their respective Numbers. . . The
Number of Representatives shall not exceed one
for every 30,000, but each State shall have at
Least one Representative. . .
UNC Chapel Hill
The Huntington-Hill Method
Currently, n = 50
and R = 435.
UNC Chapel Hill
Pseudo-code
• Well-written pseudocode reveals the internal structure
of the algorithm but hides irrelevant implementation
details, making the algorithm much easier to
understand, analyze, debug, and implement.
• The precise syntax of pseudocode is a personal choice,
but the overriding goal should be clarity and precision.
Ideally, pseudocode should allow any competent
programmer to implement the underlying algorithm,
quickly and correctly, in their favorite programming
language, without understanding why the algorithm
works.
UNC Chapel Hill
Textbook & References
• Introduction to Algorithms, 3rd Ed. by Cormen, Leiserson, Rivest, &
Stein (CLRS), McGraw Hill, 2009.
• Lecture slides will be put online
– Thanks to Profs. Mark Foskey, Ming Lin, Dinesh Manocha, Ketan MayerPatel, David Plaisted, Jack Snoeyink, Dr. Umamaheswari Devi, and Prof. Jeff
Erickson (UIUC).
OTHER REFERENCES:

Algorithmics: The Spirit of Computing, Harel

How to Solve It, Polya.

The Design and Analysis of Computer Algorithms,
Aho, Hopcroft and Ullman.

Algorithms, Sedgewick.

Algorithmics: Theory & Practice, Brassard & Bratley.

Writing Efficient Programs & Programming Pearls, Bentley.

The Science of Programming, by Gries.

The Craft of Programming, by Reynolds.
UNC Chapel Hill
Prerequisites
• Data Structure + Discrete Mathematics
• Assume that you know, or can recall with a
quick review, the materials in the following
chapters.
– Chapter 0, 1, and 2
– Section 3.2: growth of functions
– Chapter 10: elementary data structures
UNC Chapel Hill
Course Roadmap
•
•
•
•
•
•
•
•
Algorithmic Basics (4)
Divide and Conquer (5-6)
Randomized Algorithms (9-11)
Search Trees (5)
Graph Algorithms (5)
Dynamic Programming (3)
Greedy Algorithms (6)
Special Topics* (1-2)
• M13+W15+F14 = 42
UNC Chapel Hill
Course Work & Grades
• Homework: 25%
(total of 6-9, mostly design & analysis)
• Quizzes: 5%
(very basic materials)
• Mid-Term Exams: 40%
(total of 3, 50 min each, in class)
• Final Exam: 30%
• Active Class Participation: up to half a grade bonus
• Programming Projects: up to 10% bonus
UNC Chapel Hill
Examinations
• Quizzes: throughout the semester
• Midterms: 3 in class
• Final: 8am - 11am on May 4th, 2015
“Half” closed book, NO collaboration
UNC Chapel Hill
Homework Assignments
• Due at the beginning of each class on the due
date given
• No late homework will be accepted
• Lowest score will be dropped
• Can discuss in group, but must write/formulate
solutions alone (failure to explain your solution
orally to the instructor = cheat)
• Be neat, clear, precise, formal
– You’ll be graded on correctness, simplicity, elegance
& clarity
UNC Chapel Hill
Course Project
• Due on 23:59 Apr 27 (last day of class)
• No late project report or code will be accepted
• You are responsible for defining/proposing the
course project. You are encouraged to discuss with
me and/or submit a proposal earlier than Mar 30.
• It can be either some implementations of core
algorithms we cover (you are responsible for
showing the correctness of your code), or some
algorithmic study to any open problem.
• Group work is allowed though contribution of each
member must be clarified in the final report.
UNC Chapel Hill
Communication
• Visit instructor during office hours, by
appointment, or email correspondence
• All lecture notes and most of handouts are posted
at the course website:
http://www.cs.unc.edu/~zsguo/comp550.html
• Major messages are notified by email alias + on
course website
• Discussions -- face-to-face in groups, or on Piazza
• Student grades can be checked with the instructor
UNC Chapel Hill
Basic Courtesy
• Write/print assignments neatly & formally
• Please do not read newspaper & other materials, or
browse web in class
• When coming to the class late or leaving early, please
take an aisle seat quietly
• Remain quiet, except asking questions or answering
questions posed by instructors
– no whispers or private conversation
THANK YOU!!!
UNC Chapel Hill
How to Succeed in this Course
• Start early on all assignments.
DON'T procrastinate.
• Complete all reading
before class.
• Participate in class.
• Think in class.
• Review after each class.
• Be formal and precise on
all problems sets and exams
UNC Chapel Hill
Weekly Reading Assignment
Chapters 0, 1, 2, 3 and Appendix A
(Textbook: CLRS)
UNC Chapel Hill
Algorithms
• A tool for solving a well-specified computational
problem
Input
Algorithm
Output
• Example: sorting
input: A sequence of number
output: An ordered permutation of input
issues: correctness, efficiency, storage, etc.
UNC Chapel Hill
Example: Insertion Sort
for j=2 to length(A)
do key=A[j]
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
UNC Chapel Hill
Correctness Proofs
• Proving (beyond “any” doubt) that an
algorithm is correct.
– Prove that the algorithm produces correct output
when it terminates. Partial Correctness.
– Prove that the algorithm will necessarily
terminate. Total Correctness.
• Techniques
– Proof by Construction.
– Proof by Induction.
– Proof by Contradiction.
UNC Chapel Hill
Loop Invariant
• Logical expression with the following properties.
– Holds true before the first iteration of the loop –
Initialization.
– If it is true before an iteration of the loop, it remains true
before the next iteration – Maintenance.
– When the loop terminates, the invariant ― along with the
fact that the loop terminated ― gives a useful property
that helps show that the loop is correct – Termination.
• Similar to mathematical induction.
UNC Chapel Hill
Example: Insertion Sort
• Invariant: at the start of
each for loop, A[1…j-1]
consists of elements
originally in A[1…j-1] but in
sorted order; all other
elements are unchanged.
UNC Chapel Hill
for j=2 to length(A)
do key=A[j]
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
Example: Insertion Sort
• Invariant: at the start of
each for loop, A[1…j-1]
consists of elements
originally in A[1…j-1] but in
sorted order; all other
elements are unchanged

for j=2 to length(A)
do key=A[j]
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
Initialization: j = 2, the invariant trivially holds because
A[1] is a sorted array. √
UNC Chapel Hill
Example: Insertion Sort
for j=2 to length(A)
• Invariant: at the start of
do key=A[j]
each for loop, A[1…j-1]
i=j-1
consists of elements
while i>0 and A[i]>key
do A[i+1]=A[i]
originally in A[1…j-1] but in
i-sorted order; all other
A[i+1]:=key
elements are unchanged
 Maintenance: the inner while loop finds the position i
with A[i] <= key, and shifts A[j-1], A[j-2], …, A[i+1] right
by one position. Then key, formerly known as A[j], is
placed in position i+1 so that A[i]  A[i+1] < A[i+2].
A[1…j-1] sorted + A[j]
UNC Chapel Hill
A[1…j] sorted
Example: Insertion Sort
• Invariant: at the start of
each for loop, A[1…j-1]
consists of elements
originally in A[1…j-1] but in
sorted order; all other
elements are unchanged

for j=2 to length(A)
do key=A[j]
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
Termination: the loop terminates, when j=n+1. Then
the invariant states: “A[1…n] consists of elements
originally in A[1…n] but in sorted order.” √
UNC Chapel Hill
Running time
• Depends on input
(e.g., sorted/reversely)
• Depends on input size
(5 elements vs 500K)
for j=2 to length(A)
do key=A[j]
i=j-1
while i>0 and A[i]>key
do A[i+1]=A[i]
i-A[i+1]:=key
– Parameterize in input size (n)
• Want upper bounds (generally)
– Guarantee to the user
UNC Chapel Hill
Analysis
• Worst-Case (usually)
T(n) = max time on any input of size n
• Average-Case (sometimes)
T(n) = expected time over all inputs of size n
(Need assumption of…)
• Best-Case
UNC Chapel Hill
A Simple Example – Linear Search
INPUT: a sequence of n numbers, key to search for.
OUTPUT: true if key occurs in the sequence, false otherwise.
LinearSearch(A, key)
1
i1
2 while i ≤ n and A[i] != key
3
do i++
4 if i  n
5
then return true
6
else return false
A Simple Example – Linear Search
INPUT: a sequence of n numbers, key to search for.
OUTPUT: true if key occurs in the sequence, false otherwise.
LinearSearch(A, key)
1
i1
2 while i ≤ n and A[i] != key
3
do i++
4 if i  n
5
then return true
6
else return false
cost
1
1
1
1
1
1
Assign a cost of 1 to all statement executions.
Now, the running time ranges between
1+ 1+ 1 + 1 = 4 – best case
and
1+ (n+1)+ n + 1 + 1 = 2n+4 – worst case
times

n
i 2
1
1
x
x-1
1
1
1
A Simple Example – Linear Search
INPUT: a sequence of n numbers, key to search for.
OUTPUT: true if key occurs in the sequence, false otherwise.
LinearSearch(A, key)
1
i1
2 while i ≤ n and A[i] != key
3
do i++
4 if i  n
5
then return true
6
else return false
cost
1
1
1
1
1
1
times

n
i 2
1
1
x
x-1
1
1
1
If we assume that we search for a random item in the list,
on an average, Statements 2 and 3 will be executed n/2 times.
Running times of other statements are independent of input.
Hence, average-case complexity is
1+ n/2+ n/2 + 1 + 1 = n+3
Correctness Proof of Linear Search
• Use Loop Invariant for the while loop:
LinearSearch(A, key)
1
i1
2 while i ≤ n and A[i] != key
3
do i++
4 if i  n
5
then return true
6
else return false
UNC Chapel Hill
If the algm. terminates,
then it produces correct
result.
Initialization.
Maintenance.
Termination.
Argue that it terminates.
Correctness Proof of Linear Search
• Use Loop Invariant for the while loop:
– At the start of each iteration of the while loop, the
search key is not in the subarray A[1…i-1].
LinearSearch(A, key)
1
i1
2 while i ≤ n and A[i] != key
3
do i++
4 if i  n
5
then return true
6
else return false
UNC Chapel Hill
If the algm. terminates,
then it produces correct
result.
Initialization.
Maintenance.
Termination.
Argue that it terminates.
Worst-Case time &
Order of Growth
• Depends on computer (software vs. hardware)
• BIG IDEA - Asymptotic analysis
– Ignore machine dependent constants
– Look at the growth of T(n) as n -> +inf
–  Notation: we can ignore the lower-order terms, since
they are relatively insignificant for very large n. We can
also ignore leading term’s constant coefficients, since
they are not as important for the rate of growth in
computational efficiency for very large n.
UNC Chapel Hill
Comparisons of Algorithms
• Sorting
– insertion sort: (n2)
– merge sort: (n log n)
For 106 numbers, insertion sort takes 5.56 hrs on a
supercomputer using machine language and 16.67
min on a PC using C/C++ with merge sort.
Why Order of Growth Matters?
Computer speeds double every two years…
UNC Chapel Hill
Effect of faster machines
Ops/sec:
1M
2M
Gain
n*n alg
1000
1414
1.4
n log n alg
62700
118600
1.9
The number of items that can be sorted in one second
using an algorithm taking exactly n2 time as compared to
one taking n lg n time, assuming 1 million and 2 million
operations per second. Notice that, for the n lg n algorithm,
doubling the speed almost doubles the number of items
that can be sorted. (Order of growth matters!)
UNC Chapel Hill