Transcript Lecture 6

CS503: Sixth Lecture, Fall 2008
Linked Lists, Stacks, and Queues.
Michael Barnathan
Here’s what we’ll be learning:
• Data Structures:
– Linked Lists.
– Stacks.
– Queues.
– Dequeues.
– Priority Queues.
• Theory:
– Stacks in recursion.
Reminder: Linked Lists.
• Insertion: O(1)
• Access: O(N)
• Updating an element: O(1)
• Deleting an element: O(1)
• Search: O(N).
• Merge: O(1).
• Dynamically sized by nature.
– Just stick a new node at the end.
• Modifications are fast, but sequential node access is the killer.
– And you need to access the nodes before performing other operations on
them.
• Three main uses:
– When search/access is not very important (e.g. logs, backups).
– When you’re merging and deleting a lot.
– When you need to iterate through the list sequentially anyway.
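The cost difference between modification and access can be sketched in a few lines. This is a minimal illustration, not a full list class; Node, insertAfter, and get are names assumed for the example.

```java
// Minimal singly linked list sketch; all names here are illustrative.
class ListCostDemo {
    static class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // O(1): splice a new node in right after a node we already hold.
    static Node insertAfter(Node node, int value) {
        Node fresh = new Node(value);
        fresh.next = node.next;
        node.next = fresh;
        return fresh;
    }

    // O(N): reaching the index-th node means walking from the head.
    static Node get(Node head, int index) {
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current;
    }

    public static void main(String[] args) {
        Node head = new Node(10);
        insertAfter(head, 30);                  // list: 10 -> 30
        insertAfter(head, 20);                  // list: 10 -> 20 -> 30
        System.out.println(get(head, 1).value); // prints 20
        System.out.println(get(head, 2).value); // prints 30
    }
}
```

The insertion itself never loops; only get() does, which is why the node access dominates.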
Recursive Definition of Linked Lists
• Just like arrays, a linked list of size n is a linked
list of size n-1 plus a node.
– This is a pretty common definition for “linear”
data structures such as arrays and linked lists.
• Note that even though the recursive
definitions are the same, arrays and lists still
have different properties.
Searching and Sorting on Lists.
• Sequential access causes problems in our partition-based algorithms.
– You can’t perform binary search.
• Moving to the new middle is linear.
– Likewise, don’t use a list for the guessing game.
• All of the basic sorting algorithms we’ve learned can be
made to work on linked lists.
– In general, the constant-time modification speeds the
algorithms up, but the search behavior slows them down.
– The general runtime ends up the same.
– Faster sorting algorithms have problems, however.
• We’ll talk about them soon.
Any way to improve this?
• Insertion and deletion are constant time.
• But reaching the node to delete (or the node to insert
after) in the first place is linear.
– So it’s really the middle of the list that has
problems. We have direct pointers to the ends.
• When faced with problems of this type, ask
yourself “do we need this much power?”
– If the answer is no, restrict your data structures
for better performance.
Restricting to the ends.
• Stacks, queues, and dequeues are data
structures that restrict insertion, deletion, and
access to the end(s) of the structure.
– The primary difference between them is which
end(s) operations are performed on.
• These structures are often built on top of
linked lists.
– Through encapsulation, we can use a LinkedList as
a low-level structure but restrict the high-level
operations to the end of the structure.
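One way the encapsulation idea might look in Java is sketched below. ListStack is a hypothetical class name; the only library calls used are addFirst, removeFirst, getFirst, and isEmpty on java.util.LinkedList.

```java
import java.util.LinkedList;

// Hypothetical wrapper: the list is private, so only the "top" end is reachable.
class ListStack<T> {
    private final LinkedList<T> items = new LinkedList<>();

    void push(T value) { items.addFirst(value); }

    // Both of these throw NoSuchElementException on an empty stack.
    T pop() { return items.removeFirst(); }
    T peek() { return items.getFirst(); }

    boolean isEmpty() { return items.isEmpty(); }

    public static void main(String[] args) {
        ListStack<Integer> stack = new ListStack<>();
        stack.push(42);
        stack.push(50);
        System.out.println(stack.peek()); // prints 50
        System.out.println(stack.pop());  // prints 50
        System.out.println(stack.pop());  // prints 42
    }
}
```

Because callers never see the LinkedList, nothing in the middle can be touched, and every exposed operation is O(1).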
Stacks
• Stacks are like stacks of dishes.
– You can only add one to the top of the stack.
– You can only take one off of the top of the stack.
– You can’t even look at the dishes in the middle.
• If you tried to add one in the middle, you’d need to set aside
all of the dishes above it, add the new dish, then add the old
dishes back on top of it.
• Same with removal; if you just yanked a dish out of the
middle, you’d get porcelain all over the floor.
• You can only operate directly on the “top”
element of a stack.
Terminology
• Adding an element to the top: “push”.
• Removing an element from the top: “pop”.
• Most recently inserted element: “top”.
– Access the top value without removing it: “peek”.
• “push” and “pop” are sometimes used in other
data structures as well.
– They simply mean “add to/remove from the front”
– Java’s Stack (a subclass of Vector) and LinkedList both have push() and
pop(); the STL’s stack adapter has push() and pop() too.
LIFO
• Stacks exhibit what is called “last-in-first-out”
(LIFO) behavior.
– You add an element at the top of a stack.
– If you were to then remove an element, it would
be the one you just added – the last element you
inserted.
• The last element to go “into” the stack is the
first element to be taken “out of” the stack.
– Tip: You can reverse sequences of things this way.
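The reversal tip can be tried directly. The sketch below uses java.util.ArrayDeque as the stack (an assumption of convenience; any stack would do), and ReverseDemo and reverse are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ReverseDemo {
    // Push every character, then pop them all: LIFO order reverses the sequence.
    static String reverse(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            stack.push(c);
        }
        StringBuilder reversed = new StringBuilder();
        while (!stack.isEmpty()) {
            reversed.append(stack.pop());
        }
        return reversed.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("stack")); // prints kcats
    }
}
```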
Example:
[Figure: a stack with 42 on top of the rest of the stack (inaccessible). Push 50: 50 becomes the top, above 42. Pop: 50 is removed and 42 is the top again.]
Stacks in Recursion
• The system actually maintains function calls in a stack.
– That includes all of the parameters of the function.
– When you call a recursive function, say the printTo function we discussed last
class, you have an argument named “int n”.
– When it calls printTo(n-1), you invoke another function with its own copy of
“int n”.
– And so on.
• When you use “n”, you are accessing the top of the stack.
• When you call printTo(n-1), you are pushing n-1 onto the stack.
• When the function call exits, its “n” is popped from the stack.
• So what printTo() is doing is generating a stack of numbers from 1 to n,
then outputting the top element right before it’s popped.
– That is why we were able to reverse the order by moving the print statement
above the recursive call.
– We were outputting the top element right after it was pushed.
– Since stacks are LIFO, we pushed in descending order, but popped in
ascending order.
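printTo itself is not reproduced in this transcript, so the two methods below are an assumed reconstruction of the idea rather than the code from last class. The names printDescending and printAscending are made up, and the output is collected in a StringBuilder instead of calling System.out.println() at each level so it is easy to inspect.

```java
class PrintToDemo {
    // Output above the recursive call: each n is written right after it is pushed.
    static void printDescending(int n, StringBuilder out) {
        if (n < 1) return;
        out.append(n).append(' ');
        printDescending(n - 1, out);
    }

    // Output below the recursive call: each n is written right before it is popped.
    static void printAscending(int n, StringBuilder out) {
        if (n < 1) return;
        printAscending(n - 1, out);
        out.append(n).append(' ');
    }

    public static void main(String[] args) {
        StringBuilder onPush = new StringBuilder();
        printDescending(3, onPush);
        System.out.println(onPush.toString().trim()); // prints 3 2 1

        StringBuilder onPop = new StringBuilder();
        printAscending(3, onPop);
        System.out.println(onPop.toString().trim());  // prints 1 2 3
    }
}
```

The only difference between the two methods is which side of the recursive call the output sits on.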
Example:
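printTo(): Pushing in descending order, popping in ascending order.
[Figure: the call stack as printTo(n) recurses. Frames n, n-1, n-2, … are pushed down to 1, then popped from 1, 2, … back up to n.]
• System.out.println() above recursive call: printing on push.
• System.out.println() below recursive call: printing on pop.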
CRUD: Stacks
• Push: ?
• Pop: ?
• Peek: ?
• Search: ?
• Any ideas?
– Assume the underlying representation is a linked list.
– The performance is actually the same if you use a
vector underneath.
CRUD: Stacks
• Push: O(1)
• Pop: O(1)
• Peek: O(1)
• Search: O(n)
• Pushing, popping, and peeking are insertion, deletion,
and access at the end of a linked list.
• To search, you need to pop values one by one, check
them, then push them back on.
– You can store them in another stack to avoid reversing the
order (or, more accurately, to reverse the order twice).
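The pop-check-push-back search could be sketched like this, with a second stack holding the values that were set aside. StackSearchDemo and contains are hypothetical names, and ArrayDeque stands in for whichever stack class you use.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackSearchDemo {
    // O(n) search that leaves the stack in its original order.
    static boolean contains(Deque<Integer> stack, int target) {
        Deque<Integer> aside = new ArrayDeque<>();
        boolean found = false;
        while (!stack.isEmpty()) {
            int value = stack.pop();
            aside.push(value);       // first reversal: values land on the side stack
            if (value == target) {
                found = true;
                break;
            }
        }
        while (!aside.isEmpty()) {
            stack.push(aside.pop()); // second reversal: original order restored
        }
        return found;
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(contains(stack, 2)); // prints true
        System.out.println(stack.peek());       // prints 3
    }
}
```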
The System Stack and Exceptions
• Note that this stack is never declared.
• It’s automatically and transparently generated for you by the
system.
– This is just how it handles function calls.
• When a program throws an exception, you get a “trace” of this
stack.
– All of the functions called and what lines in each the exception passed
through.
• Exceptions pass up the stack until they are either caught or until
they pass main().
– You can catch exceptions at any level at or above the caller.
• For example, my EmployeeLoader class threw a FileNotFoundException in its
constructor. You caught it in main().
– If you don’t catch the exception when main() exits, the Java runtime
will catch it, output an error trace, and terminate.
• This is called stack unwinding. It happens in C++ too.
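A small sketch of an exception passing up the stack follows. It deliberately avoids EmployeeLoader, whose code is not shown here; level1, level2, level3, and run are made-up method names.

```java
class UnwindDemo {
    static void level3() {
        throw new IllegalStateException("thrown in level3");
    }

    static void level2() { level3(); } // no catch here: the exception passes through

    static void level1() { level2(); } // nor here

    // Catches two levels above the throw and reports the innermost traced frame.
    static String run() {
        try {
            level1();
            return "no exception";
        } catch (IllegalStateException e) {
            // Element 0 of the trace is the frame where the exception was created.
            return e.getMessage() + " at " + e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints thrown in level3 at level3
    }
}
```

Remove the try/catch in run() and the exception keeps travelling past main(), at which point the runtime prints the full trace.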
Queues
• Similar to stacks, except you insert at the back
and remove at the front.
• Think of it as a real queue… waiting on line at
a checkout counter, for example.
– New people are added to the back of the line.
– They leave from the front.
Terminology
• Enqueue (pronounced “N Q”): insert into the
back of the queue.
• Dequeue (pronounced “day Q” or “D Q”):
remove from the front of the queue.
• Back: whichever end you insert at.
• Front: whichever end you remove from.
• Back != Front (otherwise it’s a stack).
FIFO
• Queues are first-in-first-out (FIFO).
– Also referred to as first-come-first-served (FCFS).
• The first element inserted into the queue is
the first one that will leave.
– Example: 200 people waiting on line for Wiis.
– The ones who camped at the store the previous
night are the ones who will get them first.
– The people who showed up later will have to wait
for the others.
Example:
Enqueue 1, 2, 3.
Dequeue thrice.
1 goes in first, comes out first.
2 goes in second, comes out second.
3 goes in third, comes out third.
FIFO!
[Figure: the queue after enqueuing 1, 2, 3, with 1 at the front and 3 at the back; each dequeue removes the element at the front.]
CRUD: Queues
• Enqueue: O(1)
• Dequeue: O(1)
• Peek: O(1)
• Search: O(n)
• Since all we’ve done is change the end we add
to, the performance remains the same.
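A queue built the same way as a stack might look like the sketch below: the only change is that insertion goes to the opposite end of the list. ListQueue is a hypothetical name.

```java
import java.util.LinkedList;

// Hypothetical wrapper: insert at the back, remove at the front.
class ListQueue<T> {
    private final LinkedList<T> items = new LinkedList<>();

    void enqueue(T value) { items.addLast(value); }

    // Both of these throw NoSuchElementException on an empty queue.
    T dequeue() { return items.removeFirst(); }
    T peek() { return items.getFirst(); }

    boolean isEmpty() { return items.isEmpty(); }

    public static void main(String[] args) {
        ListQueue<Integer> queue = new ListQueue<>();
        queue.enqueue(1);
        queue.enqueue(2);
        queue.enqueue(3);
        System.out.println(queue.dequeue()); // prints 1
        System.out.println(queue.dequeue()); // prints 2
        System.out.println(queue.dequeue()); // prints 3
    }
}
```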
Deques:
• “Double-ended queues”.
– Sometimes spelled “dequeue”, but that’s confusing
because deletion from a queue is also called that.
• Usually pronounced “deck” or “day Q”.
• This is just a queue where you can add and remove at
both ends.
– It’s up to you whether to treat them as stacks or queues in
your program.
• Because these aren’t necessarily LIFO or FIFO unless
you use them that way, they’re not commonly used.
– If you want LIFO behavior, you can use a stack.
– If you want FIFO behavior, you can use a queue.
Deques: Terminology
• Push_front: Add an element to the front end.
• Push_back: Add an element to the back end.
• Pop_front: Remove the front element.
• Pop_back: Remove the back element.
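• Java and C++ provide equivalents on several linear data structures,
e.g. push_front()/pop_back() on the STL’s deque and list, and
addFirst()/removeLast() on Java’s LinkedList.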
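In Java the four operations map onto the java.util.Deque interface under different names. The sketch below uses ArrayDeque as one possible implementation; DequeDemo and demo are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeDemo {
    static String demo() {
        Deque<Integer> deque = new ArrayDeque<>();
        deque.addLast(2);                // push_back:  [2]
        deque.addLast(3);                // push_back:  [2, 3]
        deque.addFirst(1);               // push_front: [1, 2, 3]
        int front = deque.removeFirst(); // pop_front:  returns 1, leaves [2, 3]
        int back = deque.removeLast();   // pop_back:   returns 3, leaves [2]
        return front + " " + back + " " + deque.peekFirst();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1 3 2
    }
}
```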
CRUD: Deques
• Push front/back: O(1)
• Pop front/back: O(1)
• Peek: O(1)
• Search: O(n)
• Again, nothing is really changing here.
Priority Queues
• Like queues, but some people are more important and
get to cut the line.
– Imagine you’re waiting for that Wii and Bill Gates walks in.
He walks straight up to the cashier, buys it, and leaves.
– Your first thought would probably be “wow, even Bill Gates
has no confidence in the Xbox 360”.
– But your second would probably be “hey, he just cut the
whole line!”
• Yes, because Bill Gates is more important than you.
– Sorry, sorry. But you can be better programmers than he is.
Priority Queues
• Humor aside, this is how priority queues work.
• Every element has a value and a priority.
• The element with the highest priority is always
the next one to be removed from the queue.
• This is no longer FIFO or LIFO.
– The highest priority in is now first out.
• Obviously, guaranteeing this requires some
work, either on insertion or retrieval.
Priority Queues
• Are useful data structures:
– Most CPU schedulers use them.
– Print queues can use them.
– Elevators can use them.
– Businesses can use them to model their processes
and risks.
– Testers can use them to categorize bugs.
• They are appropriate whenever certain
elements should be prioritized over others.
The Insertion Strategy:
• One way to implement a priority queue is to use an
array or linked list underneath and insert elements into
it sorted by priority.
• This guarantees that the element at the front of the
queue is the one with highest priority.
• This incurs a cost:
– For arrays, finding the place to put the element is O(log n),
but shifting the elements over is O(n).
– For linked lists, insertion is O(1), but finding the right place
to insert into is O(n).
• Either way, this requires O(n) time per insertion.
• On the other hand, it only takes O(1) to dequeue.
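The insertion strategy on a linked list could be sketched as follows. To keep it short, each element is just an int that serves as its own priority (a simplification; the slides give every element a value and a priority), and SortedListPriorityQueue is a made-up name.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class SortedListPriorityQueue {
    // Kept in descending priority order, so the front is always the highest.
    private final LinkedList<Integer> items = new LinkedList<>();

    // O(n): walk to the first lower-priority element and insert before it.
    void enqueue(int priority) {
        ListIterator<Integer> it = items.listIterator();
        while (it.hasNext()) {
            if (it.next() < priority) {
                it.previous(); // step back so add() lands before the smaller element
                break;
            }
        }
        it.add(priority);      // O(1) splice once the position is known
    }

    // O(1): the highest priority is already at the front.
    int dequeue() { return items.removeFirst(); }

    boolean isEmpty() { return items.isEmpty(); }

    public static void main(String[] args) {
        SortedListPriorityQueue queue = new SortedListPriorityQueue();
        queue.enqueue(3);
        queue.enqueue(7);
        queue.enqueue(5);
        System.out.println(queue.dequeue()); // prints 7
        System.out.println(queue.dequeue()); // prints 5
        System.out.println(queue.dequeue()); // prints 3
    }
}
```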
The Selection Strategy:
• Another approach is to keep the array/list in
unsorted order and find the right element
when we dequeue.
• Insertion then becomes O(1).
• But access is then O(n).
– You have to search through the array linearly.
– Binary search cannot be used here, since the
structure is unsorted.
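For comparison, here is a sketch of the selection strategy on an unsorted array-backed list. As an illustrative simplification, each int is its own priority; UnsortedPriorityQueue is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class UnsortedPriorityQueue {
    private final List<Integer> items = new ArrayList<>();

    // O(1) amortized: just append, no ordering is maintained.
    void enqueue(int priority) { items.add(priority); }

    // O(n): linear scan for the highest priority, then remove it.
    // Throws IndexOutOfBoundsException if the queue is empty.
    int dequeue() {
        int best = 0;
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i) > items.get(best)) {
                best = i;
            }
        }
        return items.remove(best); // remove(int index), not remove(Object)
    }

    boolean isEmpty() { return items.isEmpty(); }

    public static void main(String[] args) {
        UnsortedPriorityQueue queue = new UnsortedPriorityQueue();
        queue.enqueue(3);
        queue.enqueue(7);
        queue.enqueue(5);
        System.out.println(queue.dequeue()); // prints 7
        System.out.println(queue.dequeue()); // prints 5
        System.out.println(queue.dequeue()); // prints 3
    }
}
```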
Queues and Sorts
• In either case, inserting elements into a priority queue then
removing them sorts them by priority.
• The insertion and selection strategies are analogous to
their respective sorts.
– In a selection-based queue, you must find the maximum priority
item and return it as if it were at the end of the queue, just as
you found the minimum value and swapped it to the end in
selection sort.
– In an insertion-based queue, you must insert the element in its
proper position, shifting elements beyond it over (in an array).
• There is another implementation of a priority queue using a
data structure called a heap.
– And consequently, another sorting algorithm, called heapsort.
– We will cover this when we get to heaps.
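Sorting by enqueueing everything and then dequeueing everything can be tried today with java.util.PriorityQueue, which the Java documentation describes as heap-based. This is only a sketch of the idea, not heapsort as it will be covered later; PrioritySortDemo and sortDescending are illustrative names, and 11 is an arbitrary initial capacity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class PrioritySortDemo {
    // Enqueue all values, then dequeue them: they come out highest-first.
    static List<Integer> sortDescending(List<Integer> values) {
        // PriorityQueue is lowest-first by default, so reverse the ordering.
        PriorityQueue<Integer> queue =
                new PriorityQueue<>(11, Collections.<Integer>reverseOrder());
        queue.addAll(values);
        List<Integer> sorted = new ArrayList<>();
        while (!queue.isEmpty()) {
            sorted.add(queue.poll());
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(Arrays.asList(3, 7, 5))); // prints [7, 5, 3]
    }
}
```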
CRUD: Priority Queues
• Enqueue (selection): O(1).
• Enqueue (insertion): O(n).
• Dequeue (selection): O(n).
• Dequeue (insertion): O(1).
• So you either pay on insert or access. For now, that’s
your tradeoff.
• The heap strategy is a compromise:
– O(log n) for both.
• But not really.
– O(log n) is far better than O(n).
Common Applications
• Stacks:
– The system stack keeps track of function calls.
– Pointers to free spaces on disk and memory in the OS are often accessed like stacks.
– You can implement a whole system using just a stack and a tape drive.
• Really. That’s what a Turing machine is.
– Parsers make use of these.
• Particularly a type of parser known as a “pushdown automaton”.
• These get used a lot in compilers.
– They’re handy for storing and reversing lists of numbers.
• Queues:
– Used a lot in synchronization of event-driven or multithreaded apps.
• Events stream in and get stored in a queue until the app can handle them.
– Used to model “arrivals” in general.
– Used in almost all CPU scheduling algorithms.
– Used for buffering device I/O.
– Used for print jobs.
• Priority queues:
– Used for everything queues are used for when priority is important, and then some.
(Push (Push (Push Pop) Pop) Pop)
• This is all we will cover on simple stacks and
queues. They are fairly simple structures.
• We will come back to priority queues when
we learn about heaps.
• The lesson:
– It is often better to store things, prioritize them,
and finish them one at a time than to attend to
everything as it demands your attention.
• Next class: Mergesort, Shellsort, Quicksort.
Assignment 2:
• Using the EmployeeLoader class, write a program that groups
employees (read from Employees.csv) by city.
• Compute the average salary for each city.
• Output the names and average salaries of the cities with the
25 highest average salaries in descending order.
– You are effectively implementing the following SQL query:
– SELECT City, AVG(Salary) FROM Employees GROUP BY City ORDER BY
AVG(Salary) DESC LIMIT 25
• Describe the data structures and algorithms you used and
why you chose them.
• Deadline: Tuesday, September 30.