Course Notes for
CS 0445
Data Structures
By
John C. Ramirez
Department of Computer Science
University of Pittsburgh
• These notes are intended for use by students in
CS0445 at the University of Pittsburgh and no one else
• These notes are provided free of charge and may not
be sold in any shape or form
• Material from these notes is obtained from various
sources, including, but not limited to, the following:
Data Structures and Abstractions with Java, 2nd and 3rd
Editions by Frank Carrano
Data Structures and the Java Collections Framework by
William Collins
Classic Data Structures in Java by Timothy Budd
Java By Dissection by Pohl and McDowell
Java Software Solutions (various editions) by John Lewis and
William Loftus
java.sun.com and its many sub-links
2
Goals of Course
• To learn, understand and be able to utilize
many of the data structures that are
fundamental to computer science
Data structures such as vectors, stacks,
queues, linked-lists and trees are used
throughout computer science
We should understand these from a user's
point of view:
• What are these data structures and how do I use
them in my programs?
3
Goals of Course
• To understand implementation issues related
to these data structures, and to see how
they can be implemented in the Java
programming language
Data structures can be implemented in various
ways, each of which has implications (ex: runtime differences, code complexity, modifiability)
We should understand these data structures
from an implementer's point of view:
• How can these data structures be effectively
implemented?
4
Goals of Course
• To understand and utilize programming
ideas and techniques utilized in data
structure implementation
Object-oriented programming, dynamic memory
utilization, recursion and other principles must
be understood in order to effectively implement
data structures
• What tools must I know and be able to use in order to
implement data structures?
5
Goals of Course
• To learn more of the Java programming
language and its features, and to become
more proficient at programming with it
Java is a very large language with extensive
capabilities
As your programming skills improve, you can
utilize more of these capabilities effectively
• Since I am working with Java, how well can I learn
and use the language and its features?
6
Lecture 1: Getting Started
• How and where can I use Java?
Java is a platform-independent language, and
thus can be used on many different platforms
• In CS 0401 last term, you used Mac Minis
• Some of you may have PCs or prefer PCs in the labs
• Programs written in either place should run the
same way
[Figure: one Java Program running on both a Win32 PC and a Sun Workstation]
7
Lecture 1: Review
• Last term in CS 0401 you learned many
things that you will need to use this term
• Basic Java program structure and syntax
• Java control structures and utilizing them
effectively
– Loops, conditionals, etc
• Java methods and method calls
– With and without parameters
– Static methods vs instance methods
• Java variables and objects
– Instance variables vs method variables
– Objects vs references
– Dynamic nature of Java objects
8
Lecture 1: Review
• Java classes
– Syntax for writing new classes
– Inheritance
– Polymorphism via method overriding and overloading
• Interfaces, how and why to use them
• Simple Java files and graphics
• Exception handling
If you are unsure about these things, look them over in
your notes and in your CS 0401 textbook
Also read Appendices A-F in the Carrano Data Structures
text
We will briefly go over Appendices B, C and D and
review some of these concepts in the first few lectures
9
Lecture 1: Classes and Objects
• Classes are blueprints for our data
The class structure provides a good way to
encapsulate the data and operations of a new
type together
• Instance data and instance methods
Access restrictions (i.e. data hiding through
private declarations) allow the implementation
details of the data type to be hidden from a user
• public, protected and private allow various levels
of accessibility
10
Lecture 1: Classes and Objects
User of the class knows the nature of the data,
and the public methods, but NOT the
implementation details
• But does not need to know them in order to use
the class
– Ex: BigInteger
• These are determined by the specifics of the private
data and the method implementations
We call this data abstraction
• This is related to abstract data types (ADTs), which
we will discuss shortly
11
Lecture 1: Classes and Objects
• Java classes determine the structure of Java
objects
public declarations (typically methods) give the
interface and functionality of the objects
• How the “outside world” communicates with the
objects
private declarations (typically data and some
methods) hide the implementation details from
the user
• To put it another way, Java objects are
instances of Java classes
12
Lecture 1: Classes and Objects
• Class declaration keywords:
public – accessible outside class
private – inaccessible outside class
protected – accessible only within class and
subclasses [and same package]
static – part of class rather than instance;
shared by all instances
final – constant – cannot be assigned
(variables), overridden (methods) or subclassed
(classes)
13
Lecture 1: References, Pointers and Memory
• Other than the primitive variables, all
Java variables are references
14
Lecture 1: References, Pointers and Memory
• Be careful when comparing
Know when you want to compare references or
contents
For reference variables, we typically need to use
a method to compare contents
• ex. for strings, equals
• u.equals(s) returns true
– We can redefine equals() for our own classes as well
Java does not allow operator overloading, so we
cannot redefine comparison operators to
compare contents – must use named methods
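The difference between comparing references and comparing contents can be sketched as follows. This is only an illustration: the class name EqualsDemo and its helper methods are hypothetical and do not come from the course handouts.

```java
// Illustrative sketch; EqualsDemo and its methods are hypothetical names.
class EqualsDemo {
    // True only when both variables refer to the very same object
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // True when the two objects have equal contents, as defined by equals()
    static boolean sameContents(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s = new String("data");
        String u = new String("data");   // distinct object, same characters
        String alias = s;                // second reference to the same object

        System.out.println(sameReference(s, u));      // compares references
        System.out.println(sameContents(s, u));       // compares contents
        System.out.println(sameReference(s, alias));  // both name one object
    }
}
```

Because s and u are created with two separate calls to new, they are two objects, so only the equals() comparison treats them as the same.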
15
Lecture 2: References, Pointers and Memory
• How do references and pointers relate?
Many programming languages (ex: C, C++,
Pascal) use pointer variables
• Pointers are variables that store addresses of other
memory locations
• Pointers allow indirect access of the data in objects
[Figure: pointer variables X and Y each hold the address of one of two objects, Object A (address 101001) and Object B (address 010010)]
16
Lecture 2: References, Pointers and Memory
• So the value stored in a pointer is an address
• However, if you dereference a pointer, you gain
access to the object it "points to"
X = Y;
// Changes what X points to
// X no longer has access to
// Object B
[Figure: after the assignment, X and Y both hold address 101001 (Object A); Object B, at address 010010, is no longer reachable through X]
17
Lecture 2: References, Pointers and Memory
• In C++, you dereference pointers using the *
operator
*X = *Y; // Changes contents of object
         // that X points to. The
         // value of X is unchanged
[Figure: X and Y still hold different addresses (101001 and 010010), and the objects at both addresses now have the contents of Object A]
18
Lecture 2: References, Pointers and Memory
• References in Java
Behave in a way similar to pointers, but with
more restriction
• Dereferencing is implicit – there is no dereference
operator
• Reference values (addresses) can be assigned but
they cannot be manipulated
But aliasing still occurs and you must be very
careful
• Be aware of when you want a new object or a
reference to an old one
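A small sketch of aliasing versus creating a new object is shown here. The class AliasDemo and its method names are hypothetical, chosen only for illustration.

```java
// Illustrative sketch; AliasDemo and its methods are hypothetical names.
class AliasDemo {
    // Returns a second reference to the same object (an alias)
    static StringBuilder alias(StringBuilder original) {
        return original;
    }

    // Returns a new object with the same contents (an independent copy)
    static StringBuilder copy(StringBuilder original) {
        return new StringBuilder(original);
    }

    public static void main(String[] args) {
        StringBuilder a = new StringBuilder("cat");
        StringBuilder b = alias(a);   // b and a refer to one object
        StringBuilder c = copy(a);    // c refers to a separate object

        b.append("s");                // change made through the alias
        System.out.println(a);        // the change is visible through a
        System.out.println(c);        // the copy is unaffected
    }
}
```

Assigning one reference variable to another never creates an object; only new (or a method that calls it) does.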
19
Lecture 2: References, Pointers and Memory
• Java Memory Use
All objects in Java are allocated dynamically
• Memory is allocated using the new operator
• Once allocated, objects exist for an indefinite period of
time
– As long as there is an active reference to the object
• Objects that have no references to them are no longer
accessible in the program
– Ex. Object B in slide 17
• These objects are marked for GARBAGE COLLECTION
20
Lecture 2: References, Pointers and Memory
The Java garbage collector is a process that runs
in the background during program execution
• When the amount of available memory runs low, the
garbage collector reclaims objects that have been
marked for collection
– A fairly sophisticated algorithm is used to determine
which objects can be garbage collected
– If you take CS 1621 or CS 1622 you will likely discuss
this algorithm in more detail
• If plenty of memory is available, there is a good
chance that the garbage collector will never run
See Example1.java
21
Lecture 2: Building New Classes
• Java has many predefined classes
Class library contains hundreds of classes, each
designed for a specific purpose
• See API –
http://download.oracle.com/javase/6/docs/api/
• However, in many situations we may need a
class that is not already defined
We will have to define it ourselves
There are two primary techniques for doing this
• Composition (Aggregation)
• Inheritance
22
Lecture 2: Composition
• With composition, we build a new class
using components (instance variables) that
are from previously defined classes
We compose the class from existent "pieces"
"Has a" relationship between new class and old
classes
New class has no special access to its instance
variable objects
Methods in new class are often implemented by
utilizing methods from the instance variable
objects
23
Lecture 2: Composition
public class CompoClass
{
    private String name;
    private Integer size;

    public CompoClass(String n, int i)
    {
        name = new String(n);
        size = Integer.valueOf(i); // new Integer(i) is deprecated in newer Java versions
    }

    // Replaces the character at index i of name with c
    public void setCharAt(int i, char c)
    {
        StringBuilder b = new StringBuilder(name);
        b.setCharAt(i, c);
        name = b.toString();
    }
}
We cannot access the inner representation of the String, and
String objects are immutable, so we must change it in the
rather convoluted way shown above
24
Lecture 2: Inheritance
• With inheritance, we build a new class
(subclass) by extending a previously defined
class (superclass)
Subclass has all of the properties (data and
methods) defined in the superclass
"Is a" relationship between subclass and
superclass
• Subclass is a superclass, and subclass objects can be
assigned to superclass variables
• Not vice versa!
– Superclass IS NOT a subclass and superclass objects
cannot be assigned to subclass variables
25
Lecture 2: Inheritance
// Assume SubFoo is a subclass of Foo – see notes
// below and on board
Foo f1;
SubFoo s1;
f1 = new Foo(); // fine
f1 = new SubFoo(); // also fine – however, now we
// only have access to the public methods and
// variables defined in class Foo
f1.foomethod(); // fine
f1.subfoomethod(); // illegal
((SubFoo)f1).subfoomethod(); // fine, since now the ref.
// has been cast to the actual class
s1 = new SubFoo(); // also fine – now all SubFoo
// public methods and variables are accessible
s1.subfoomethod(); // fine
s1.foomethod(); // also fine
s1 = new Foo(); // illegal
26
Lecture 2: Polymorphism
• Allows superclass and subclass objects to be
accessed in a regular, consistent way
Array or collection of superclass references can
be used to access a mixture of superclass and
subclass objects
If a method is defined in both the superclass
and subclass (with identical signatures), the
version corresponding to each class will be used
in a call from the array
• Idea is that the methods are similar in nature but the
redefinition in the subclass gears the method more
specifically to the data / properties of the subclass
27
Lecture 2: Polymorphism
• Ex. Each subclass overrides the move() method in
its own way
Animal [] A = new Animal[3];
A[0] = new Bird();
A[1] = new Person();
A[2] = new Fish();
for (int i = 0; i < A.length; i++)
    A[i].move();
• References are all the same, but objects are not
• Method invoked is that associated with the OBJECT,
NOT with the reference
[Figure: the array A referencing Bird, Person and Fish objects, each with its own move() method]
28
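The array example above can be made runnable with a minimal sketch. Only the names Animal, Bird, Person, Fish and move() come from the slide; the class bodies, the String return type of move() and the class PolymorphismDemo are assumptions made so that the result can be inspected.

```java
// Illustrative sketch: class bodies are assumptions; only the class names
// and move() come from the slide.
class Animal {
    String move() { return "Animal moves"; }
}

class Bird extends Animal {
    @Override String move() { return "Bird flies"; }
}

class Person extends Animal {
    @Override String move() { return "Person walks"; }
}

class Fish extends Animal {
    @Override String move() { return "Fish swims"; }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        Animal[] A = new Animal[3];   // all references have type Animal
        A[0] = new Bird();
        A[1] = new Person();
        A[2] = new Fish();
        for (int i = 0; i < A.length; i++) {
            // The version of move() that runs depends on the object's class
            System.out.println(A[i].move());
        }
    }
}
```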
Lecture 2: Polymorphism
• Polymorphism is implemented utilizing two
important ideas
1) Method overriding
• A method defined in a superclass is redefined in a
subclass with an identical method signature
• Since the signatures are identical, rather than
overloading the method (ad hoc polymorphism), it is
instead overriding the method
– For a subclass object, the definition in the subclass
replaces the version in the superclass, even if a
superclass reference is used to access the object
> Superclass version can still be accessed via the super
reference
29
Lecture 2: Polymorphism
2) Dynamic (or late) binding
• The code executed for a method call is associated
with the call during run-time
• The actual method executed is determined by the
type of the object, not the type of the reference
• Polymorphism is very useful if we want to
access collections of mixed data types
consistently
Ex: A collection of different graphical figures,
each with a draw() method
• Each is drawn differently, so it has a different draw()
method, but the call is consistent
30
Lecture 2: Abstract Classes
• Abstract classes
Sometimes in a class hierarchy, a class may be
defined simply to give cohesion to its subclasses
• No objects of that class will ever be defined
• But instance data and methods will still be inherited
by all subclasses
This is an abstract class
• Keyword abstract used in declaration
• One or more methods declared to be abstract and are
thus not implemented
• No objects may be instantiated
31
Lecture 2: Abstract Classes
Subclasses of an abstract class must implement
all abstract methods, or they too must be
declared to be abstract
Advantages
• Can still use superclass reference to access all
subclass objects in polymorphic way
– However, we need to declare the methods we will need
in the superclass, even if they are abstract
• No need to specifically define common data and
methods for each subclass - it is inherited
• Helps to organize class hierarchy
See API for many examples
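A minimal sketch of an abstract class and two concrete subclasses follows. The names Figure, Circle, Square and AbstractDemo are hypothetical; they are not taken from the course handouts or the API.

```java
// Illustrative sketch; Figure, Circle, Square and AbstractDemo are
// hypothetical names.
abstract class Figure {
    private final String name;            // common data, inherited by subclasses

    Figure(String name) { this.name = name; }

    String getName() { return name; }     // common method, written once

    abstract double area();               // declared here, implemented in subclasses
}

class Circle extends Figure {
    private final double radius;
    Circle(double radius) { super("circle"); this.radius = radius; }
    @Override double area() { return Math.PI * radius * radius; }
}

class Square extends Figure {
    private final double side;
    Square(double side) { super("square"); this.side = side; }
    @Override double area() { return side * side; }
}

class AbstractDemo {
    // Works on any mixture of Figure subclasses through superclass references
    static double totalArea(Figure[] figures) {
        double total = 0.0;
        for (Figure f : figures) {
            total += f.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Figure[] figures = { new Circle(1.0), new Square(2.0) };
        for (Figure f : figures) {
            System.out.println(f.getName() + ": " + f.area());
        }
        // Figure f = new Figure("x");  // would not compile: Figure is abstract
    }
}
```

Because area() is declared in Figure, it can be called through a Figure reference even though Figure itself never implements it.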
32
Lecture 3: Interfaces
• Java allows only single inheritance
A new class can be a subclass of only one parent
(super) class
There are several reasons for this, from both the
implementation (i.e. how to do it in the compiler
and interpreter) point of view and the programmer
(i.e. how to use it effectively) point of view
However, it is sometimes useful to be able to
access an object through more than one
superclass reference
33
Lecture 3: Interfaces
• Interfaces allow us to do this:
A Java interface is a named set of methods
• Think of it as an abstract class with no instance data
• Static constants are allowed
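• No static methods are allowed (true for the Java version these notes target; Java 8 and later also permit static and default methods in interfaces)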
Any Java class (no matter what its inheritance)
can implement an interface by implementing the
methods defined in it
A given class can implement any number of
interfaces
34
Lecture 3: Interfaces
Ex:
public interface Laughable
{
public void laugh();
}
public interface Booable
{
public void boo();
}
• Any Java class can implement Laughable by
implementing the method laugh()
• Any Java class can implement Booable by
implementing the method boo()
35
Lecture 3: Interfaces
• Ex:
public class Comedian implements Laughable, Booable
{
// various methods here (constructor, etc.)
    public void laugh()
    {
        System.out.println("Ha ha ha");
    }
    public void boo()
    {
        System.out.println("You stink!");
    }
}
36
Lecture 3: Interfaces
Recall our previous discussion of polymorphism
This behavior also applies to interfaces – the interface acts
as a superclass and the implementing classes implement
the actual methods however they want
An interface variable can be used to reference any object
that implements that interface
• However, only the interface methods are
accessible through the interface reference
Ex:
Laughable [] funny = new Laughable[3];
funny[0] = new Comedian();
funny[1] = new SitCom(); // implements Laughable
funny[2] = new Clown(); // implements Laughable
for (int i = 0; i < funny.length; i++)
funny[i].laugh();
See ex16.java, Laughable.java and Booable.java from CS 0401 Handouts
37
Lecture 3: “Generic” Operations
Let’s look at a simple example that should
already be familiar to you: Sorting
• In CS 0401 you should have discussed selection sort
• Simple algorithm:
– Find smallest, swap into location 0
– Find next smallest, swap into location 1, etc.
What if we want to sort different types (ints,
doubles, Strings, or any Java type)?
• We need to write a different method for each one!!!
– The argument array must match the parameter array
• Or do we??
– Can we write a single method that can sort anything?
> Use an interface! Discuss
38
Lecture 3: “Generic” Operations
Consider the (old) Comparable interface:
• It contains one method:
int compareTo(Object r);
• Returns a negative number if the current object is less
than r, 0 if the current object equals r and a positive
number if the current object is greater than r
• Look at Comparable in the API
Consider what we need to know to sort data:
• is A[i] less than, equal to or greater than A[j]
Thus, we can sort Comparable data without
knowing anything else about it
• Awesome!
• Polymorphism allows this to work
39
Lecture 3: “Generic” Operations
Think of the objects we want to sort as “black
boxes”
• We know we can compare them because they
implement Comparable
• We don’t know (or need to know) anything else
about them – even though they may have many other
methods / instance variables
– Show on board
Thus, a single sort method will work for an array
of any Comparable class
• Did I mention that this was awesome!?
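A single sort method of this kind might look like the sketch below. It is not the course's SortArray.java: the class name SelectionSortSketch is hypothetical, and the method is written with the parameterized Comparable<T> form rather than the older Object-based one so that it compiles without warnings.

```java
import java.util.Arrays;

// Illustrative sketch; SelectionSortSketch is a hypothetical name and this
// is not the course's SortArray.java.
class SelectionSortSketch {
    // Sorts any array whose elements can be compared with compareTo()
    static <T extends Comparable<? super T>> void selectionSort(T[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i;                        // index of smallest item so far
            for (int j = i + 1; j < a.length; j++) {
                if (a[j].compareTo(a[smallest]) < 0) {
                    smallest = j;
                }
            }
            T temp = a[i];                           // swap smallest into location i
            a[i] = a[smallest];
            a[smallest] = temp;
        }
    }

    public static void main(String[] args) {
        Integer[] numbers = {42, 7, 19, 3};
        String[] words = {"pear", "apple", "fig"};
        selectionSort(numbers);   // same method, Integer data
        selectionSort(words);     // same method, String data
        System.out.println(Arrays.toString(numbers));
        System.out.println(Arrays.toString(words));
    }
}
```

The method never looks inside the objects; the only thing it asks of them is the result of compareTo().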
40
Lecture 3: "Generic" Operations
Note: In JDK 1.5 Java improved its generic
abilities by introducing parameterized types,
interfaces and methods
• We will discuss these in more detail at different points
throughout the term
• Right now, let's just look at the Comparable interface
– Old Version
public interface Comparable
{ public int compareTo(Object rhs); }
– New Version
public interface Comparable<T>
{ public int compareTo(T rhs); }
41
Lecture 3: "Generic" Operations
Both versions allow arbitrary objects to be
compared
The difference is that in the parameterized
version, the types of the objects can be
established and checked at compile-time
With the original version, this could not be done
until run-time
To see this consider the parameter to the
compareTo() method
– In the original version it is Object
– In the parameterized version it is T (i.e. whichever type
is passed into the parameter)
42
Lecture 3: "Generic" Operations
• Now, for 2 objects, C1 and C2, consider the call
C1.compareTo(C2)
• In the original version, the compiler could not do any
type checking, since C2 can be any Object
– So if C2's object was incompatible with C1's object (i.e.
apples and oranges) this problem would not be known
until program execution
• In the new version, the compiler can check the type
of C2 and make sure it matches with the type set for
T in the definition of compareTo
– If the types are incompatible, the compiler will give an
error
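The contrast between the two versions can be sketched with two small classes. Apple, OldStyleOrange and CompareCheckDemo are hypothetical names used only for illustration; OldStyleOrange deliberately uses the raw (unparameterized) Comparable to imitate the original version.

```java
// Illustrative sketch; Apple, OldStyleOrange and CompareCheckDemo are
// hypothetical classes.
class Apple implements Comparable<Apple> {
    final int weight;
    Apple(int weight) { this.weight = weight; }

    @Override
    public int compareTo(Apple rhs) {           // parameter type fixed at compile time
        return Integer.compare(weight, rhs.weight);
    }
}

@SuppressWarnings("rawtypes")
class OldStyleOrange implements Comparable {    // raw interface, as in the old version
    final int weight;
    OldStyleOrange(int weight) { this.weight = weight; }

    @Override
    public int compareTo(Object rhs) {
        // This cast can only be checked when the program runs
        OldStyleOrange other = (OldStyleOrange) rhs;
        return Integer.compare(weight, other.weight);
    }
}

class CompareCheckDemo {
    // Returns true if comparing an orange to a non-orange fails at run time
    static boolean mismatchFailsAtRunTime() {
        OldStyleOrange orange = new OldStyleOrange(5);
        try {
            orange.compareTo("not an orange");  // compiles: any Object is accepted
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Apple a = new Apple(3);
        Apple b = new Apple(4);
        System.out.println(a.compareTo(b) < 0);
        // a.compareTo("not an apple");  // rejected by the compiler
        System.out.println(mismatchFailsAtRunTime());
    }
}
```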
43
Lecture 3: "Generic" Operations
• Why do we care?
– Compilation errors are typically much easier to resolve
than run-time errors
– We'd like to "push" as much of the error-checking as
possible to compile-time, while preserving the flexibility
of the language
– Parameterized types allow this to be done
Let's put all of this together in another handout
• See Example2.java, People.java, Worker.java,
Student.java, SortArray.java
44
Lecture 3: More on Java Generics
• Java allows for generic interfaces, classes
and methods
We saw interface example Comparable<T>
Let's look at a simple class example and a simple
method example
• We will revisit this topic again probably more than once
Let's try to (very simply) mimic the functionality
of a Java array
• We want to be able to create an object with an arbitrary
number of locations
– However, once created, the size is fixed
45
Lecture 3: More on Java Generics
• We want the underlying type to be any Java type
– However, it should be homogeneous – cannot mix
types (except if type is a superclass)
• We want to be able to assign a value to a location
• We want to be able to retrieve a value from a location
• We want to be able to tell the size of the array
Let's see how to do this with Java Generics
• See MyArray.java and Example3.java
– Note that MyArray is not really a useful type – it is just
meant to demonstrate parameterized Java types
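The requirements listed above could be met by a class along the following lines. This is a sketch only and is not the course's MyArray.java: the class name FixedArray and the method names set(), get() and length() are assumptions.

```java
// Illustrative sketch; NOT the course's MyArray.java. FixedArray and its
// method names are assumptions.
class FixedArray<T> {
    private final T[] data;    // underlying storage; its length never changes

    @SuppressWarnings("unchecked")
    FixedArray(int size) {
        // Java cannot create a T[] directly, so create an Object[] and cast
        data = (T[]) new Object[size];
    }

    // Assign a value to a location
    void set(int index, T value) { data[index] = value; }

    // Retrieve the value from a location
    T get(int index) { return data[index]; }

    // Report the (fixed) number of locations
    int length() { return data.length; }
}

class FixedArrayDemo {
    public static void main(String[] args) {
        FixedArray<String> names = new FixedArray<>(3);
        names.set(0, "Ada");
        names.set(1, "Alan");
        // names.set(2, Integer.valueOf(5));  // rejected by the compiler
        System.out.println(names.get(0) + " " + names.length());
    }
}
```

The type parameter T keeps the contents homogeneous: once a FixedArray<String> is declared, the compiler allows only String values (or subclasses of the declared type) to be stored.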
46
Lecture 4: Abstract Data Types
• Abstract Data Types (ADTs)
We are familiar with data types in Java
• For example some primitive data types: int, float,
double, boolean, or reference types such as String
We can think of these as a combination (or
encapsulation) of two things:
• The data itself and its representation in memory
– For classes these are the instance variables
• The operations by which the data can be manipulated
– For classes these are the methods
47
Lecture 4: Abstract Data Types
For example, the int type in Java
• We can think of it simply as whole numbers,
represented in some way in the computer, but this
would be incomplete
• What makes integers useful is the operations that we
can do on them, for example +, -, *, /, % and others
• It is understanding the nature of the data together
with the operations that can be done on it that make
ints useful to us
• We also discussed BigInteger previously
48
Lecture 4: Abstract Data Types
So where does the abstract part come in?
• Note that in order to use ints in our programs, we
ONLY need to know what they are and what their
operations are
– We do NOT need to know their implementation details
• Does it matter to me how the int or BigInteger is
represented in memory?
• Does it matter how the actual division operation is
done on the computer?
• For the purposes of using integers effectively in most
situations…NO!
49
Lecture 4: Abstract Data Types
• More generally speaking, an ADT is a data type (data
+ operations) whose functionality is separated from
its implementation
– The same functionality can result from different
implementations
– Users of the ADT need only to know the functionality
• Naturally, however, to actually be used, ADTs must be
implemented by someone at some point
– Implementer must be concerned with the
implementation details
In this course you will look at ADTs both from
the user's and implementer's point of view
50
Lecture 4: ADTs vs. Classes
• The previous slides should be familiar to you
We have already discussed the idea of data
abstraction from classes
ADTs are language-independent representations
of data types
• Can be used to specify a new data type that can then
be implemented in various ways using different
programming languages
Classes are language-specific structures that
allow implementation of ADTs
• Only exist in object-oriented or object-based
languages
51
Lecture 4: ADTs vs. Classes
A given ADT can be implemented in different
ways using different classes
• We will see some of these soon
• Ex: Stack, Queue, SortedList can be implemented in
different ways
A given class can in fact be used to represent
more than one ADT
• The Java class ArrayList can be used to represent a
Stack, Queue, Deque and other ADTs
52
Lecture 4: Interfaces as ADTs
• Consider again interfaces
Specify a set of methods, or, more generally a
set of behaviors or abilities
Do not specify how those methods are actually
implemented
Do not even specify the data upon which the
methods depend
• These fit reasonably well with ADTs
ADTs DO specify the data, but we can infer
much about the data based on the methods
53
Lecture 4: Interfaces as ADTs
• The text will typically use interfaces as ADTs
and classes as ADT implementations
Using the interface we will have to rely on
descriptions for the data rather than actual data
• The data itself is left unspecified and will be detailed
in the class(es) that implement the interfaces
– This is ok since the data is typically specific to an
implementation anyway
• Ex: ADT Stack
– Push an object onto the top of the Stack
– Pop an object off the top of the Stack
> At this (ADT) level we don't care how the data is actually
represented, as long as the methods work as specified
54
Lecture 4: ADTs for Collections of Data
• Many ADTs (especially in this course) are
used to represent collections of data
Multiple objects organized and accessed in a
particular way
The organization and access is specified by the
ADT
• Done through interfaces in Java
The specific implementation of the data and
operations can be done in various ways
• Done through classes in Java
We will examine many of these this term!
55
Lecture 4: ADT Bag
• Consider our first detailed ADT: the Bag
Think of a real bag in which we can place things
No rule about how many items to put in
No rule about the order of the items
No rule about duplicate items
No rule about what type of items to put in
• However, we will make it homogeneous by requiring
the items to be the same class or subclass of a
specific Java type
Let’s look at the interface
• See BagInterface.java
56
Lecture 4: ADT Bag
Note what is NOT in the interface:
• Any specification of the data for the collection
– We will leave this to the implementation
– The interface specifies the behaviors only
> However, the implementation is at least partially implied
> Must be some type of collection
• Any implementation of the methods
Note that other things are not explicitly in the
interface but maybe should be
• Ex: What the method should do
• Ex: How special cases should be handled
• We typically have to handle these via comments
57
Lecture 4: ADT Bag
Ex: public boolean add(T newEntry)
• We want to consider specifications from two points
of view:
1) What is the purpose / effect of the operation in the
normal case?
2) What unusual / erroneous situations can occur and
how do we handle them?
• The first point can be handled via preconditions and
postconditions
– Preconditions indicate what is assumed to be the state
of the ADT prior to the method's execution
– Postconditions indicate what is the state of the ADT
after the method's execution
– From the two we can infer the method's effect
58
Lecture 4: ADT Bag
– Ex: for add(newEntry) we might have:
Precondition:
Bag is in a valid state containing N items
Postconditions:
Bag is in a valid state containing N+1 items
newEntry is now contained in the Bag
– This is somewhat mathematical, so many ADTs also have
operation descriptions explaining the operation in plainer
terms
> More complex operations may also have more complex
conditions
– However, pre and postconditions can be very important
for verifying correctness of methods
59
Lecture 4: ADT Bag
• The second point is often trickier to handle
– Sometimes the unusual / erroneous circumstances are
not obvious
– Often they can be handled in more than one way
– Ex: for add(newEntry) we might have
> Bag is not valid to begin with due to previous error
> newEntry is not a valid object
– Assuming we detect the problem, we could handle it by
> Doing a "no op"
> Returning a false boolean value
> Throwing an exception
– We need to make these clear to the user of the ADT so
he/she knows what to expect
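One way to record the precondition, postconditions and error handling discussed above is in the comments of the interface. The sketch below is not the text's BagInterface.java: SimpleBag and ListBackedBag are hypothetical names, and rejecting null by returning false is just one of the handling choices listed, picked here as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; SimpleBag and ListBackedBag are hypothetical names and
// are not the text's BagInterface.java.
interface SimpleBag<T> {
    /**
     * Adds a new entry to this bag.
     * Precondition: the bag is in a valid state containing N items.
     * Postcondition: if true is returned, the bag contains N+1 items and
     * newEntry is one of them; otherwise the bag is unchanged.
     * @param newEntry the object to add; null is rejected in this sketch
     * @return true if the entry was added, false otherwise
     */
    boolean add(T newEntry);

    /** @return the number of items currently in the bag */
    int getCurrentSize();

    /** @return true if the bag contains an entry equal to anEntry */
    boolean contains(T anEntry);
}

// A deliberately simple implementation, used only to exercise the interface
class ListBackedBag<T> implements SimpleBag<T> {
    private final List<T> items = new ArrayList<>();

    @Override
    public boolean add(T newEntry) {
        if (newEntry == null) {
            return false;          // erroneous case handled by returning false
        }
        return items.add(newEntry);
    }

    @Override
    public int getCurrentSize() { return items.size(); }

    @Override
    public boolean contains(T anEntry) { return items.contains(anEntry); }
}

class BagSpecDemo {
    public static void main(String[] args) {
        SimpleBag<String> bag = new ListBackedBag<>();
        System.out.println(bag.add("pencil"));   // normal case
        System.out.println(bag.add(null));       // erroneous case
        System.out.println(bag.getCurrentSize());
    }
}
```

Throwing an exception or silently doing nothing would be equally valid designs; what matters is that the comment tells the user which one to expect.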
60
Lecture 4: Using a Bag
• A Bag is a simple ADT, but it can still be
useful
See examples in text
Here is another simple one
• Generate some random integers and count how many
of each number were generated
• There are many ways to do this, but one is with a bag
• See Example4.java
– Q: Is this the most efficient way of doing this?
– A: Hard to tell unless we can see how the Bag is
implemented
– Let’s do that next!
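The counting idea might be sketched as follows. This is not the course's Example4.java: TinyBag is a hypothetical stand-in for a real Bag implementation, and the method name getFrequencyOf() is assumed here for the operation that counts occurrences of an entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch; not the course's Example4.java. TinyBag is a
// hypothetical stand-in for a real Bag implementation.
class TinyBag<T> {
    private final List<T> items = new ArrayList<>();

    boolean add(T newEntry) { return items.add(newEntry); }

    // Counts how many entries are equal to anEntry by scanning every item
    int getFrequencyOf(T anEntry) {
        int count = 0;
        for (T item : items) {
            if (item.equals(anEntry)) {
                count++;
            }
        }
        return count;
    }
}

class CountRandomDemo {
    // Puts howMany random integers from the range [0, range) into a bag
    static TinyBag<Integer> generate(int howMany, int range, long seed) {
        Random rng = new Random(seed);
        TinyBag<Integer> bag = new TinyBag<>();
        for (int i = 0; i < howMany; i++) {
            bag.add(rng.nextInt(range));
        }
        return bag;
    }

    public static void main(String[] args) {
        int range = 5;
        TinyBag<Integer> bag = generate(100, range, 42L);
        for (int value = 0; value < range; value++) {
            System.out.println(value + " occurred " + bag.getFrequencyOf(value) + " times");
        }
    }
}
```

In this sketch each call to getFrequencyOf() scans the whole collection, so counting every possible value repeats that scan once per value. Whether a real Bag behaves the same way depends on its implementation, which is exactly the point of the question above.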
61
Lecture 5: Implementing a Bag
• Ok, now we need to look at a Bag from the
implementer's point of view
How to represent the data?
• Must somehow represent a collection of items
(Objects)
How to implement the operations?
Clearly, the implementation of the operations
will be closely related to the representation of
the data
• Minimally the data representation will "suggest" ways
of implementing the operations
62
Lecture 5: Array Implementation of a Bag
• Let's first consider using an array
Makes sense since it can store multiple values
and allow them to be manipulated in various
ways
private Object [] bag; // old way
private T [] bag; // current way
• Ok, but is just an array enough?
We know the size of an array object, once
created is fixed
We also know that our Bag must be able to
change in size (with adds and removes)
63
Lecture 5: Array Implementation of a Bag
Thus we need to create our array, then keep
track of how many locations are used with
some other variable
private int numberOfEntries;
But how big to make the array?
What if we run out of room?
• Note that the above questions are (mostly)
irrelevant to the client, but are quite important to
the implementer
Two approaches to take:
• Use a fixed size and when it fills it fills
• Dynamically resize when necessary (transparently)
64
Lecture 5: Fixed Size Array
• Fixed Size Array
Idea:
• Initialize array in the constructor
• Size is passed in as a parameter
• Once created, the size is constant as long as the bag is
being used
• Once array fills, any "add" operations will fail until
space is freed (through "remove" or "clear")
Advantage?
• Easier for programmer to implement
65
Lecture 5: Fixed Size Array
Disadvantages:
• ADT user (programmer) may greatly over-allocate the
array, wasting space
– Overcompensates to not run out of room
• Program user (non-programmer) may run out of room
at run-time
– If programmer does not do above
• Neither of these is desirable
However, let's briefly look at this implementation
anyway
• Much of it will be the same for the dynamic structure
– Only differences are when array fills
66
Lecture 5: Fixed Size Array
Let's start with the simple method we have been
discussing:
public boolean add (T newEntry)
{
// what do we need to do in the normal case?
// what do we do in the abnormal case?
}
• Recall our data:
private T [] bag;
private int numberOfEntries;
• Let's figure this out
– See board
67
Lecture 5: Fixed Size Array
Let’s look at code from text:
public boolean add(T newEntry)
{
boolean result = true;
if (isFull())
{
result = false;
}
else
{ // assertion: result is true here
bag[numberOfEntries] = newEntry;
numberOfEntries++;
} // end if
return result;
} // end add
68
Lecture 5: Fixed Size Array
How about a bit more complicated operation?
public boolean remove(T anEntry)
{}
• What do we need to do here?
• Think of the “normal case”
– Must first find the item
– Then must remove it
> How?
• Think of unusual or special cases
• Let’s work up some code / pseudocode on the board
69
Lecture 5: Fixed Size Array
Consider the author’s code
/**
Removes one occurrence of a given entry from this bag.
@param anEntry the entry to be removed
@return true if the removal was successful, or false
otherwise
*/
public boolean remove(T anEntry)
{
int index = getIndexOf(anEntry);
T result = removeEntry(index);
return anEntry.equals(result);
} // end remove
70
Lecture 5: Fixed Size Array
private int getIndexOf(T anEntry)
{
int where = -1;
boolean found = false;
for (int index = 0;
!found && (index < numberOfEntries); index++)
{
if (anEntry.equals(bag[index]))
{
found = true;
where = index;
} // end if
} // end for
return where;
} // end getIndexOf
71
Lecture 5: Fixed Size Array
private T removeEntry(int givenIndex)
{
T result = null;
if (!isEmpty() && (givenIndex >= 0))
{
result = bag[givenIndex]; // entry to remove
numberOfEntries--;
bag[givenIndex] = bag[numberOfEntries];
// replace entry to remove with last entry
bag[numberOfEntries] = null;
// remove reference to last entry
} // end if
return result;
} // end removeEntry
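The add and remove logic shown on the preceding slides can be gathered into one small class to see it run. This sketch is not the text's ArrayBag.java: FixedBagSketch is a hypothetical name, remove() is condensed into a single method, and isFull(), isEmpty() and getCurrentSize() are written here as assumptions about what those helpers do.

```java
// Illustrative sketch; FixedBagSketch is a hypothetical name and this is not
// the text's ArrayBag.java. Helper methods are assumptions.
class FixedBagSketch<T> {
    private final T[] bag;
    private int numberOfEntries;

    @SuppressWarnings("unchecked")
    FixedBagSketch(int capacity) {
        bag = (T[]) new Object[capacity];   // size is fixed from here on
        numberOfEntries = 0;
    }

    boolean isFull()  { return numberOfEntries == bag.length; }
    boolean isEmpty() { return numberOfEntries == 0; }
    int getCurrentSize() { return numberOfEntries; }

    boolean add(T newEntry) {
        if (isFull()) {
            return false;                   // abnormal case: no room
        }
        bag[numberOfEntries] = newEntry;    // normal case: use next free location
        numberOfEntries++;
        return true;
    }

    boolean remove(T anEntry) {
        int index = getIndexOf(anEntry);
        if (index < 0) {
            return false;                   // entry not found
        }
        numberOfEntries--;
        bag[index] = bag[numberOfEntries];  // move last entry into the gap
        bag[numberOfEntries] = null;        // drop the extra reference
        return true;
    }

    private int getIndexOf(T anEntry) {
        for (int i = 0; i < numberOfEntries; i++) {
            if (anEntry.equals(bag[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        FixedBagSketch<String> b = new FixedBagSketch<>(2);
        System.out.println(b.add("a") + " " + b.add("b") + " " + b.add("c"));
        System.out.println(b.remove("a") + " " + b.getCurrentSize());
    }
}
```

Filling the gap with the last entry avoids shifting the remaining items, which is acceptable because a Bag has no rule about the order of its items.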
72
Lecture 5: Fixed Size Array
Approach to implementing the other methods
should be the same
• What is the method supposed to do?
• What can go wrong and what do we do about it?
• Does our code do what we want it to do?
See text for discussion of more operations
See ArrayBag.java for entire implementation
• Note: Due to publisher restrictions, I am putting the
author’s implementations in a directory that is not
accessible outside of Pitt’s domain
– If you want to access these you must do so from a Pitt
IP address
73
Lecture 5: Dynamic Size Array
• Dynamic Size Array
Idea:
• Array is created of some initial size
– Constructor can allow programmer to pass the size in,
or we can choose some default initial size
• If this array becomes filled, we must:
– Create a new, bigger array
– Copy the data from the old array into the new one
– Assign the new array as our working array
• Some questions:
1) How big to make the new array?
2) How do we copy?
3) What happens to the old array?
Lecture 5: Dynamic Size Array
1) How big to make the new array?
• Clearly it must be bigger than the old array, but how
much bigger?
• What must we consider when deciding the size?
– If we make the new array too small, we will have to
resize often, causing a lot of overhead
– If we make the new array too large, we will be wasting
a lot of memory
– Let's make the new array 2X the size of the old one
–This way we have a lot of new space but are not
using outrageously more than we had before
–We will see more specifically why this was
chosen later
Lecture 5: Dynamic Size Array
2) How do we copy?
• This is pretty easy – just start at the beginning of
the old array and copy index by index into the new
array
• Note that we are copying references, so the objects
in the new array are the same objects that were in
the old array
3) What happens to the old array?
• It is garbage collected
Let's try this on the board, then look at code
• See ResizableArrayBag.java and Example5.java
• Note how it is largely the same as ArrayBag.java
Lecture 5: Dynamic Size Array
Let's look in particular at the resizing process
• Resizing is initiated when an add is performed on a bag
with a full array:
public boolean add(T newEntry)
{
ensureCapacity();
// add new entry after last current entry
bag[numberOfEntries] = newEntry;
numberOfEntries++;
return true;
} // end add
– Only difference from ArrayBag is ensureCapacity()
– The resizing process is transparent to the user of the
ResizableArrayBag class
> For this operation, add() always succeeds
Lecture 5: Dynamic Size Array
So what does ensureCapacity() do?
• Private method to do what we described:
private void ensureCapacity()
{
if (numberOfEntries == bag.length)
bag = Arrays.copyOf(bag, 2 * bag.length);
} // end ensureCapacity
• See Arrays.copyOf() API
• Note that instance variable numberOfEntries is not
changed
– Why?
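The resizing step can be tried in isolation. The sketch below is a stripped-down stand-in for ResizableArrayBag, not the author's code: the class name GrowableBagSketch, the Object[] storage and the capacity() helper are assumptions made for illustration. It suggests an answer to the question above: doubling changes only the physical capacity, while the logical size changes only when an entry is actually added.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for ResizableArrayBag (not the author's class).
class GrowableBagSketch<T> {
    private Object[] bag;          // physical storage; Object[] avoids a generic array cast
    private int numberOfEntries;   // logical size

    // Assumes initialCapacity >= 1 (doubling a length of 0 would never grow).
    GrowableBagSketch(int initialCapacity) {
        bag = new Object[initialCapacity];
        numberOfEntries = 0;
    }

    boolean add(T newEntry) {
        ensureCapacity();
        bag[numberOfEntries] = newEntry;
        numberOfEntries++;
        return true;
    }

    // Doubles the array only when it is full. The entry count is untouched
    // because the resize itself neither adds nor removes an entry.
    private void ensureCapacity() {
        if (numberOfEntries == bag.length)
            bag = Arrays.copyOf(bag, 2 * bag.length);
    }

    int getCurrentSize() { return numberOfEntries; }
    int capacity()       { return bag.length; }   // exposed only for this demonstration

    public static void main(String[] args) {
        GrowableBagSketch<String> b = new GrowableBagSketch<>(2);
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            b.add(s);
            System.out.println("size=" + b.getCurrentSize() + " capacity=" + b.capacity());
        }
    }
}
```

Starting from a capacity of 2, the capacity in this sketch grows to 4 on the third add and to 8 on the fifth, while the size goes up by exactly one per add.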
Lecture 5: Contiguous Memory Data Structures
• Both Bag implementations so far use
contiguous memory
Locations are located next to each other in
memory
• Given the address of the first location, we can find all of
the others based on an offset from the first
[Figure: array locations 0, 1, …, i, i+1, … laid out consecutively in memory]
Benefits of contiguous memory:
• We have direct access to individual items
– Access of item A[i] can be done in a single operation
Lecture 5: Contiguous Memory
• Direct access allows us to use efficient algorithms such
as Binary Search to find an item
• Arrays and array-based DS are also fairly simple and
easy to use
Drawbacks of contiguous memory
• Allocation of the memory must be done at once, in a
large block as we just discussed
– If we allocate too much memory we are being wasteful
– If we do not allocate enough, we will run out
> We have seen how our Bag can resize transparently, but
recall that this requires allocating new memory and
copying into it, which takes time to do
Lecture 5: Contiguous Memory
• Inserting or deleting data "at the middle" of an array
may require shifting of the other elements
– Also requires some time to do
– We did not need to do this with our Bag, but other data
structures (ex: Lists) may require this
We will discuss the details of "how much" time is
required later
• This deals with algorithm analysis
Lecture 6: Linked Data Structures
• Let's concentrate on the drawbacks of
contiguous memory
Is there an alternative way of storing a
collection of data that avoids these problems?
What if we can allocate our memory in small,
separate pieces, one for each item in the
collection
• Now we allocate exactly as many pieces as we need
• Now we do not have to shift items, since all of the
items are separate anyway
– Draw on board
Lecture 6: Linked Data Structures
But how do we keep track of all of the pieces?
• We let the pieces keep track of each other!
• Let each piece have 2 parts to it
– One part for the data it is storing
– One part to store the location of the next piece
> This is the idea behind a linked-list
[Figure: firstNode refers to a chain of four pieces, each holding data and a link to the next piece]
Lecture 6: Linked Data Structures
• Idea of Linked List:
If we know where the beginning of the list is
And each link knows where the next one is
Then we can access all of the items in the list
• Our problems with contiguous memory
now go away
Allocation can be done one link at a time, for
as many links as we need
New links can be "linked up" anywhere in the
list, without shifting needed
• Demonstrate on board
Lecture 6: Linked Lists
• How can we implement linked lists?
The key is how each link is implemented
As we said, two parts are needed, one for data
and one to store the location of the next link
• We can do this with a self-referential data type
class Node
{
private T data;
private Node next;
…
• A NODE is a common name for a link in a linked-list
• Note why it is called "self-referential"
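A minimal sketch of the self-referential idea, assuming a stand-alone class: the name NodeSketch, the type parameter E and the describe() helper are hypothetical and only serve the illustration. The type is "self-referential" because one of its fields has the same type as the class being declared.

```java
// Hypothetical stand-alone version of the slide's Node. A type parameter is
// added because this class is not nested inside a generic class.
class NodeSketch<E> {
    E data;               // the item stored in this link
    NodeSketch<E> next;   // reference to the next link, or null at the end

    NodeSketch(E data, NodeSketch<E> next) {
        this.data = data;
        this.next = next;
    }

    // Follows the next references from the given node and joins the data.
    static <E> String describe(NodeSketch<E> first) {
        StringBuilder sb = new StringBuilder();
        for (NodeSketch<E> cur = first; cur != null; cur = cur.next) {
            sb.append(cur.data);
            if (cur.next != null) sb.append(" -> ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        NodeSketch<String> third = new NodeSketch<>("C", null);
        NodeSketch<String> second = new NodeSketch<>("B", third);
        NodeSketch<String> first = new NodeSketch<>("A", second);
        System.out.println(describe(first));   // A -> B -> C
    }
}
```

Knowing only the first node is enough to reach every item, which is the property the linked list relies on.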
Lecture 6: Singly Linked Lists
• Linked-List Implementation Variations
Singly Linked List
• The simple linked-list we just discussed is a singly-linked list
– Links go in one direction only
– We can easily traverse the list from the front to the rear
– We CANNOT go backwards through the list at all
– This list is simple and (relatively) easy to implement, but
has the limitations of any "one way street"
– This implementation is developed in Chapter 3
Lecture 6: Singly Linked Lists
There are other variations of linked lists:
• Doubly linked list
• Circular linked list
We will discuss these shortly
For now we will keep things very simple
Lecture 6: Linked Bag Implementation
Let's look at this implementation a bit
public class LinkedBag<T> implements BagInterface<T>
{
private Node firstNode;
private int numberOfEntries;
…
private class Node
{
private T data;
private Node next;
private Node(T dataPortion)
{ this(dataPortion, null); }
private Node(T dataPortion, Node nextNode)
{ data = dataPortion; next = nextNode; }
} // class Node
…
} // class LinkedBag
– Note that Node is a private inner class
Lecture 6: Node As an Inner Class
Why is it done this way?
• Since Node is declared within LinkedBag, methods in
LinkedBag can access private declarations within Node
• This is a way to get "around" the protection of the
private data
– LinkedBag will be needing to access data and next of its
Nodes in many of its methods
– We could write accessors and mutators within Node to
allow this access
– However, it is simpler for the programmer if we can
access data and next directly
– They are still private and cannot be accessed outside of
LinkedBag
• On the downside, with this implementation, we cannot
use Node outside of the LinkedBag class
Lecture 6: Linked Bag Implementation
Now let's see how we would implement some of
our BagInterface methods
public boolean add (T newEntry)
{
Node newNode = new Node(newEntry); // create Node
newNode.next = firstNode;
// link it to prev. front
firstNode = newNode;
// set front to new Node
numberOfEntries++;
// increment entries
return true;
} // method add
• Compare to add() in the array
implementation
• What is different?
• Is this a problem?
Lecture 6: Linked Bag Implementation
• Trace on board
– Try a few adds in example
• Note insertion is at the front of the bag
– New node is created and newEntry is put in it
– New node becomes new front of list, push old front
back
• Are there any special cases?
– Ex: What if the bag is empty?
> firstNode will be null
> Will this be a problem?
> Any other special cases here?
Lecture 6: Linked Bag Implementation
Ok, that operation was simple
• How about something that requires a loop of some
sort?
• Let’s look at the contains() method
– Just like for the array, we will use sequential search
– Just like for the array, we start at the beginning and
proceed down the bag until we find the item or reach
the end
– So what is different?
> How do we “move down” the bag?
> How do we know when we have reached the end?
> Discuss
> Let’s look at the code
Lecture 7: Linked Bag Implementation
public boolean contains(T anEntry)
{
return getReferenceTo(anEntry) != null;
} // end contains
private Node getReferenceTo(T anEntry)
{
boolean found = false;
Node currentNode = firstNode;
while (!found && (currentNode != null))
{
if (anEntry.equals(currentNode.data))
found = true;
else
currentNode = currentNode.next;
} // end while
return currentNode;
} // end getReferenceTo
Lecture 7: Linked Bag Implementation
Let’s look at one more operation:
public boolean remove(T anEntry)
•We want to remove an arbitrary item from the Bag
– How do we do this?
– Think about the contains() method that we just
discussed
– How is remove similar and how is it different?
> Find the entry in question
> Then remove it
> For find we can use the getReferenceTo() method that
we just discussed
> So what about the actual remove part?
Lecture 7: Linked Bag Implementation
• Consider again the properties of a Bag
– The data is in no particular order
• We could remove the actual Node in question but
perhaps we can do it more easily
– The front Node is very easy to remove
> Trace on board
– So let’s copy the item in the front Node to the Node
that we want to remove
> Then we remove the front Node
– Logically, we have removed the data we want to
remove
> Keep in mind that the Nodes are not the data – they are
simply a mechanism for accessing the data
> Also keep in mind that this would NOT be ok if the data
need to stay in some kind of order
Lecture 7: Linked Bag Implementation
public boolean remove(T anEntry)
{
boolean result = false;
Node nodeN = getReferenceTo(anEntry);
if (nodeN != null)
{
nodeN.data = firstNode.data; // copy data from
// first Node
remove(); // removes first Node
result = true;
} // end if
return result;
} // end remove
Lecture 7: Linked Bag Implementation
public T remove()
{
T result = null;
if (firstNode != null)
{
result = firstNode.data;
firstNode = firstNode.next;
numberOfEntries--;
} // end if
return result;
} // end remove
Two steps to the removal
• Move firstNode to its next value
• Decrement numberOfEntries
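To trace the add and remove steps without the full class, here is a cut-down sketch. The class name LinkedBagSketch and the toString() helper are assumptions, and the search is inlined rather than delegated to getReferenceTo(); the logic otherwise follows the slides: add at the front, and remove by copying the front entry over the unwanted one before unlinking the front node.

```java
// Hypothetical cut-down LinkedBag (not the author's full class).
class LinkedBagSketch<T> {
    private Node firstNode;
    private int numberOfEntries;

    private class Node {
        private T data;
        private Node next;
        private Node(T dataPortion, Node nextNode) { data = dataPortion; next = nextNode; }
    }

    boolean add(T newEntry) {
        // Also correct for an empty bag: the new node's next is simply null.
        firstNode = new Node(newEntry, firstNode);
        numberOfEntries++;
        return true;
    }

    boolean remove(T anEntry) {
        Node nodeN = firstNode;
        while (nodeN != null && !anEntry.equals(nodeN.data))
            nodeN = nodeN.next;
        if (nodeN == null) return false;   // entry not in the bag
        nodeN.data = firstNode.data;       // overwrite the unwanted entry with the front entry
        firstNode = firstNode.next;        // unlink the front node
        numberOfEntries--;
        return true;
    }

    int getCurrentSize() { return numberOfEntries; }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (Node cur = firstNode; cur != null; cur = cur.next) {
            sb.append(cur.data);
            if (cur.next != null) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        LinkedBagSketch<String> bag = new LinkedBagSketch<>();
        for (String s : new String[] {"A", "B", "C", "D"}) bag.add(s);
        System.out.println(bag);   // [D, C, B, A]
        bag.remove("B");
        System.out.println(bag);   // [C, D, A]
    }
}
```

After the removal the remaining entries are in a different order than before, which is acceptable only because a Bag promises no ordering.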
Lecture 7: Linked Bag Implementation
There are other methods that we have not
discussed
Look over them in the text and in the source
code
Look at Example6.java
• Note how the data is ordered differently in the
different Bag implementations
– However, it is irrelevant to the functionality
Lecture 7: Node as a Separate Class
• Node class as a separate (non-inner) class
Some object-oriented purists believe it is better
to never "violate" the private nature of a class'
data
If done this way, the Node class must also be
a parameterized type
class Node<T>
{
private T data; // data portion
private Node<T> next; // link to next node
…
Lecture 7: Node as a Separate Class
Access to next and data fields must now be
done via accessors and mutators, so these must
be included in the Node<T> class
• Ex: getData(), getNextNode() accessors
• Ex: setData(), setNextNode() mutators
– Look at rest of Node<T> class code
• See handout
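As a rough idea of what such a class might contain, here is a sketch. Only the four accessor and mutator names come from the slide; the constructors, the package-private visibility and the NodeDemo driver are assumptions, and the handout version may differ.

```java
// Hypothetical driver showing that outside code must go through the methods.
class NodeDemo {
    public static void main(String[] args) {
        Node<String> second = new Node<>("B");
        Node<String> first = new Node<>("A", second);
        // Copy the next node's data into the first node using only accessors and mutators.
        first.setData(first.getNextNode().getData());
        System.out.println(first.getData());   // B
    }
}

// Hypothetical sketch of a stand-alone Node<T>; see the handout for the real class.
class Node<T> {
    private T data;         // data portion
    private Node<T> next;   // link to next node

    Node(T dataPortion) { this(dataPortion, null); }
    Node(T dataPortion, Node<T> nextNode) { data = dataPortion; next = nextNode; }

    T getData() { return data; }
    void setData(T newData) { data = newData; }
    Node<T> getNextNode() { return next; }
    void setNextNode(Node<T> nextNode) { next = nextNode; }
}
```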
Let's look at a method in LinkedBag.java we
have already discussed, but now using this
variation
• remove() method
• Differences from previous version are shown in red
Lecture 7: Node as a Separate Class
public boolean remove(T anEntry)
{
boolean result = false;
Node<T> nodeN = getReferenceTo(anEntry);
if (nodeN != null)
{
nodeN.setData(firstNode.getData());
remove();
result = true;
}
return result;
}
public T remove()
{
T result = null;
if (firstNode != null)
{
result = firstNode.getData();
firstNode = firstNode.getNextNode();
numberOfEntries--;
} // end if
return result;
}
Lecture 7: ADT List
• Consider another ADT: the List
We can define this in various ways – by its name
alone it is perhaps only vaguely specified
Let's look at how the text looks at it:
• Data:
– A collection of objects in a specific order and having the
same data type
– The number of objects in the collection
• Operations:
– add(newEntry)
– add(newPosition, newEntry)
– remove(givenPosition)
Lecture 7: ADT List
– clear()
– replace(givenPosition, newEntry)
– getEntry(givenPosition)
– contains(anEntry)
– getLength()
– isEmpty()
– isFull()
– toArray()
• See Ch. 12 for detailed specifications
• We will look at a few of these and see the similarities
to and differences from our Bag ADT
Lecture 7: Using a List
• Recall that at this point we are looking at a
List from a user's point of view
• So what can we use it for?
A List is a very general and useful structure
• See ListInterface.java
For example:
• We can use it for Last In First Out behavior (how?)
• We can use it for First in First Out behavior (how?)
• We can access the data by index and add/remove at a
given location
• We can search for an item within the list
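One possible answer to the two "how?" questions, sketched with the standard java.util.List rather than the author's ListInterface (so indexing here starts at 0, not 1). The class and helper names are hypothetical. Adding always happens at the end; the behavior depends only on which end we remove from.

```java
import java.util.ArrayList;
import java.util.List;

class ListAsStackAndQueue {
    // LIFO: the most recently added item (at the end) leaves first.
    static <T> T removeLast(List<T> list) {
        return list.remove(list.size() - 1);
    }

    // FIFO: the oldest item (at the front) leaves first.
    static <T> T removeFirst(List<T> list) {
        return list.remove(0);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("A");
        list.add("B");
        list.add("C");
        System.out.println(removeLast(list));    // C (last in, first out)
        System.out.println(removeFirst(list));   // A (first in, first out)
    }
}
```

Both helpers assume a non-empty list; on an empty one the standard List throws an IndexOutOfBoundsException.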
Lecture 7: ADT List
How about using it as a Bag?
• We could but would need to add the Bag methods
It may not be the ideal ADT for some of these
behaviors
• We will look at how some of these operations are
done and their efficiencies soon
However, we may choose to use it because it
can do ALL of these things
See Example7.java
Lecture 7: Java Standard List
• Standard Java has a List interface
Superset of the operations in author's
ListInterface
Some operations have different names
Special cases are handled differently
• Often with exceptions in the standard ADT
Indexing starts at 0
But the idea is the same
Look up List in the Java API
See Example7b.java
Lecture 8: Implementing a List
• Ok, now we need to look at a list from the
implementer's point of view
How to represent the data?
• Must somehow represent a collection of items
(Objects)
How to implement the operations?
Clearly, the implementation of the operations
will be closely related to the representation of
the data
• Minimally the data representation will "suggest" ways
of implementing the operations
Lecture 8: Array Implementation of a List
• Let's first consider using an array
Makes sense since it can store multiple values
and allow them to be manipulated in various
ways
private T [] list; // same as for Bag
We also need to keep track of the logical size
private int numberOfEntries;
To allow for an arbitrary number of items, we
will dynamically resize when needed
• Again, the same idea as for our Bag
Lecture 8: Array List Implementation
Let's start with an add method
• Unlike for Bag, with our List we can add at an
arbitrary index
public boolean add (int newPosition, T newEntry)
{
}
• Recall our data:
private T [] list;
private int numberOfEntries;
• Let's figure this out
– See board
Lecture 8: Array List Implementation
• Let's look at the code from the text
public boolean add(int newPosition, T newEntry)
{
boolean isSuccessful = true;
if ((newPosition >= 1) &&
(newPosition <= numberOfEntries + 1))
{
ensureCapacity();
makeRoom(newPosition);
list[newPosition - 1] = newEntry;
numberOfEntries++;
}
else
isSuccessful = false;
return isSuccessful;
} // end add
Lecture 8: Array List Implementation
How does makeRoom() work?
• A basic "shifting" algorithm
– However, be CAREFUL to shift from the correct side
– If you start on the wrong side you will copy, not shift
private void makeRoom(int newPosition)
{
assert (newPosition >= 1) && (newPosition <= numberOfEntries+1);
// move each entry to next higher index, starting at end of
// list and continuing until the entry at newPosition is moved
int newIndex = newPosition-1; int lastIndex = numberOfEntries-1;
for (int index = lastIndex; index >= newIndex; index--)
list[index+1] = list[index];
} // end makeRoom
– Try going the other way and see the result!
> Show on board
– Note also that the method is private – why?
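The "copy, not shift" effect is easy to reproduce. The sketch below is not the text's code: the class and method names are hypothetical, it works on a plain String array, and it uses 0-based indices rather than the 1-based positions of makeRoom(). It runs the loop in both directions on the same data.

```java
import java.util.Arrays;

class ShiftDirectionDemo {
    // Correct: start at the last entry and work down toward index.
    static void shiftFromEnd(String[] a, int size, int index) {
        for (int i = size - 1; i >= index; i--)
            a[i + 1] = a[i];
    }

    // Wrong direction: each assignment overwrites the entry that still
    // needs to be moved, so a[index] is copied all the way down the array.
    static void shiftFromFront(String[] a, int size, int index) {
        for (int i = index; i < size; i++)
            a[i + 1] = a[i];
    }

    public static void main(String[] args) {
        String[] good = {"A", "B", "C", "D", null};
        shiftFromEnd(good, 4, 1);
        System.out.println(Arrays.toString(good));   // [A, B, B, C, D]

        String[] bad = {"A", "B", "C", "D", null};
        shiftFromFront(bad, 4, 1);
        System.out.println(Arrays.toString(bad));    // [A, B, B, B, B]
    }
}
```

In the correct version the duplicate "B" at index 1 is the open slot that add() then overwrites with the new entry; in the wrong version "C" and "D" are lost.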
Lecture 8: Array List Implementation
What about removing data?
public T remove (int givenPosition)
{
}
• Since the data must stay contiguous, in a sense we
are doing the opposite of what we did to insert
– Remove and return the item
– Shift the remaining items over to fill in the gap
– Decrement numberOfEntries
Lecture 8: Array List Implementation
• Let's look at the code from the text
public T remove(int givenPosition)
{
T result = null; // return value
if ((givenPosition >= 1) && (givenPosition <=
numberOfEntries))
{
// get entry to be removed
assert !isEmpty();
result = list[givenPosition-1];
// move subsequent entries toward entry to be removed,
// unless it is last in list
if (givenPosition < numberOfEntries)
removeGap(givenPosition);
numberOfEntries--;
} // end if
return result; // return reference to removed entry
// or null if givenPosition is invalid
} // end remove
Lecture 8: Array List Implementation
How does removeGap() work?
• Again a basic "shifting" algorithm – now the other way
– We must still be careful about where to start
private void removeGap(int givenPosition)
{
assert(givenPosition >= 1) && (givenPosition <
numberOfEntries);
// shifts entries that are beyond the entry to be
// removed to next lower position.
int removedIndex = givenPosition-1;
int lastIndex = numberOfEntries-1;
for (int index = removedIndex; index < lastIndex; index++)
list[index] = list[index+1];
} // end removeGap
– Again try going the other way and see the result!
– Note that we did not need removeGap for the Bag, but
we do for List – why?
Lecture 8: Array List Implementation
Approach to implementing the other methods
should be the same
• What is the method supposed to do?
• What can go wrong and what do we do about it?
• Does our code do what we want it to do?
See text for discussion of more operations
See AList.java for entire implementation
• Note: As with other author’s code segments, I will put
this in a “Pitt only” directory
Lecture 8: Standard Java List Classes
• We mentioned previously that in standard
Java there is a List interface similar to the
author's ListInterface
So how is the standard List implemented?
Recall that for now we are considering only
array-based implementations
• ArrayList is a class developed as part of the standard
Java Collections Framework
– Built from scratch to implement the List interface
– Uses a dynamic expanding array (similar to what we
discussed but with a slightly different size increase
factor)
Lecture 8: Standard Java List Classes
– In real applications where a List is needed you will
likely use ArrayList
• Vector is a class created before the Java Collections
Framework was developed
– Designed to be a dynamically expanding collection
– When the Collections Framework was developed,
Vector was retrofitted into it through the addition of the
standard List methods
– Previous methods were also kept, so for a lot of
operations there are two (almost) equivalent methods
in the Vector class, for ex:
> public E remove(int index)
> public void removeElementAt(int index)
– Note return types
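A short sketch of the two methods side by side, using the real java.util.Vector. The class name VectorRemoveDemo and the helper removeTwo() are hypothetical.

```java
import java.util.Vector;

class VectorRemoveDemo {
    // Removes the element at index 1 with the List-style method, then the
    // element at index 0 with the older Vector method.
    static String removeTwo(Vector<String> v) {
        String removed = v.remove(1);   // returns the removed element
        v.removeElementAt(0);           // void: nothing is returned
        return removed;
    }

    public static void main(String[] args) {
        Vector<String> v = new Vector<>();
        v.add("A");
        v.add("B");
        v.add("C");
        System.out.println(removeTwo(v));   // B
        System.out.println(v);              // [C]
    }
}
```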
Lecture 8: Standard Java List Classes
There is one other interesting difference
between Vector and ArrayList
• Vector is synchronized and ArrayList is not
• What does this mean?
– If multiple Threads attempt to modify a Vector "at the
same time", only one will be allowed to do so
– Idea is that the data remains consistent when used
with multiple Threads
> ArrayList makes no such guarantee
• So what are Threads, you ask?
– Objects that allow parts of programs to execute in
"pseudo-parallel"
– We will not really discuss them here
> You may discuss them in another course
Lecture 8: Implementing a List
Note: Text has a version (p. 332) of author’s List
interface implemented using the Vector class
• This is SILLY!
• Vector already implements the standard List interface
using an array
– Using it via composition to implement a DIFFERENT List
interface does not make sense
– We are adding an unnecessary extra level of coding
– However, it does show how composition works and
how we implement most of the operations by calling
similar methods in the Vector class
> So if for some reason we REALLY needed the author’s
ListInterface we could do it
Lecture 8: Linked List Implementation
• Consider now implementing our
ListInterface using a linked data structure
Much of the implementation is identical to our
LinkedBag
• Singly-linked list structure
• Node inner class with data and next fields
• Adding a new item at the front of the list is identical
• Finding an item in the list is identical
However, there are some important differences
between the two
Lecture 8: Linked List Implementation
The List interface requires data to be kept in
positional order
• Thus, we cannot arbitrarily move data around
– Bag always removed Nodes from the front and moved
data to allow arbitrary delete
• We can also insert and remove in a given position
– Will need to add and remove Nodes from the middle of
the list
> This was not needed for LinkedBag
Let’s focus on the parts of the LL that differ from
the LinkedBag
• For example, consider the remove(int givenposition)
method
Lecture 8: Linked List Implementation
public T remove(int givenPosition)
• What do we need to do here?
– We must first get to the object at givenPosition (i)
> There is a private method getNodeAt() to do this
> We will see the code soon
– Then we must "remove" it
> We must do this in such a way that the rest of the list is
still connected
> We must link the previous node to the next node
[Figure: list starting at firstNode, showing the previous node, the node at position i, and the next node]
Lecture 8: Linked List Implementation
But notice that by the time we find the node
we want to delete, we have "passed" up the
node we need to link
• Since the links are one way we can't go back
[Figure: list starting at firstNode, showing nodeBefore, nodeToRemove (position i) and nodeAfter]
• Solution?
– Find the node BEFORE the one we want to remove
– Then get the one we want to remove and the one
after that, and change the links appropriately
Lecture 9: Linked List Implementation
Let's look at the getNodeAt() method:
/** Task: Returns a reference to the node at a given position.
* Precondition: List is not empty; 1 <= givenPosition <= length.
*/
private Node getNodeAt(int givenPosition)
{
assert !isEmpty() && (1 <= givenPosition) &&
(givenPosition <= numberOfEntries);
Node currentNode = firstNode;
// traverse the list to locate the desired node
for (int counter = 1; counter < givenPosition; counter++)
currentNode = currentNode.getNextNode();
assert currentNode != null;
return currentNode;
} // end getNodeAt
• Note that we start at the front of the list and follow
the links down to the desired index
– How does this compare to getting to a specific index in
an array-based list?
Lecture 9: Linked List Implementation
What if givenPosition > numberOfEntries?
An assertion error
This will crash our program!
So why don't we handle this possible error?
• The method is private
• The idea is that as class designers, we make sure the
error cannot occur – that is why it is an "assert"
• Users of the class cannot call this method, so there is
no problem for them
– Ex: See public T getEntry(int givenPosition)
> The index test is done BEFORE getNodeAt() is called
• We also saw this in the AList class
Lecture 9: Linked List Implementation
Other issues?
• What else should we be concerned with when trying
to delete a node?
– If the index is invalid we cannot delete
– Are there any special cases we have to worry about?
> This is VERY IMPORTANT in many data structures and
algorithms
> We discussed this for BagInterface but there were not
really any problems
– But what about for ListInterface?
> deleting the front node
> deleting the last remaining node (also the front node)
– Let's see, if the front node is deleted, the node before
it will be ???????????
> Special case!!!
Lecture 9: Linked List Implementation
• Let's look at the code:
public T remove(int givenPosition)
{
T result = null;
// initialize return value
if ((givenPosition >= 1) && (givenPosition <= numberOfEntries))
{
assert !isEmpty();
if (givenPosition == 1)
// case 1: remove first entry
{
result = firstNode.getData(); // save entry to be removed
firstNode = firstNode.getNextNode();
}
else
// case 2: givenPosition > 1
{
Node nodeBefore = getNodeAt(givenPosition-1);
Node nodeToRemove = nodeBefore.getNextNode();
Node nodeAfter = nodeToRemove.getNextNode();
nodeBefore.setNextNode(nodeAfter); // disconnect node to be removed
result = nodeToRemove.getData(); // save entry to be removed
} // end if
numberOfEntries--;
} // end if
return result;
// return removed entry, or null if operation fails
} // end remove
Lecture 9: Singly Linked List Variations
• First and Last References
We discussed before that if we are inserting a
node at the end of the list, we must traverse the
entire list first to find the current last node
This is inefficient if we do a lot of adds to the
end of the list [we'll discuss the particulars later]
We could save time if we kept an additional
instance variable (lastNode) that always refers
to the end of the list
• Now adding to the end of the list is easy!
• This was also suggested by a student – good idea!
However, it has some other interesting issues
Lecture 9: Singly Linked List Variations
• See on board and discuss what the issues might be
Thus, adding an extra instance variable to save
time with one operation can increase the
complexity of other operations
• Only by a small amount here, but we still need to
consider it
Let's look at an operation both without and with
the lastNode reference
• Text looks at add() methods so let's look at a different
one
• Let's try remove()
– Let's think about this
Lecture 9: Singly Linked List Variations
When, if at all, will we need to worry about the
lastNode reference?
With all of these methods we want to think
about
• The "normal" case, or what we usually expect
• The "special" case that may only occur under certain
circumstances
Normal case:
• We remove a node from the "middle" of the list and
the lastNode reference does not change at all
Can we think of 2 special cases here?
• They are somewhat related
Lecture 9: Singly Linked List Variations
1) Removing the last (end) node in the list
• This clearly will affect the lastNode reference
• How do we know when this case occurs?
• How do we handle it?
2) Removing the only node in the list
• Clearly this case is also 1) above, since the only
node is also the last node
• However, we should consider it separately, since
there may be special things that must be done if the
list is becoming empty
• How do we know when this case occurs?
• How do we handle it?
Lecture 9: Singly Linked List Variations
public T remove(int givenPosition)
{
T result = null;
if ((givenPosition >= 1) && (givenPosition <= numberOfEntries))
{
assert !isEmpty();
if (givenPosition == 1)
{
result = firstNode.getData();
firstNode = firstNode.getNextNode();
if (numberOfEntries == 1)
lastNode = null; // code to handle deleting only node
}
else
{
Node nodeBefore = getNodeAt(givenPosition-1);
Node nodeToRemove = nodeBefore.getNextNode();
Node nodeAfter = nodeToRemove.getNextNode();
nodeBefore.setNextNode(nodeAfter);
result = nodeToRemove.getData();
if (givenPosition == numberOfEntries)
lastNode = nodeBefore; // code to handle deleting last node
} // end if
numberOfEntries--;
} // end if
return result;
} // end remove
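To experiment with the two special cases, here is a self-contained sketch. It is not the text's class: the name TwoEndedListSketch, the direct field access in place of accessors, the inlined traversal in place of getNodeAt(), and the getLast() helper are all assumptions. Positions are 1-based, as in the slides.

```java
// Hypothetical minimal list with first and last references.
// Only add-at-end and remove are sketched.
class TwoEndedListSketch<T> {
    private Node firstNode, lastNode;
    private int numberOfEntries;

    private class Node {
        T data;
        Node next;
        Node(T d) { data = d; }
    }

    void add(T newEntry) {                 // add at the end, no traversal needed
        Node newNode = new Node(newEntry);
        if (numberOfEntries == 0) firstNode = newNode;
        else lastNode.next = newNode;
        lastNode = newNode;
        numberOfEntries++;
    }

    T remove(int givenPosition) {
        if (givenPosition < 1 || givenPosition > numberOfEntries) return null;
        T result;
        if (givenPosition == 1) {
            result = firstNode.data;
            firstNode = firstNode.next;
            if (numberOfEntries == 1) lastNode = null;   // list becomes empty
        } else {
            Node nodeBefore = firstNode;                 // walk to position givenPosition - 1
            for (int i = 1; i < givenPosition - 1; i++) nodeBefore = nodeBefore.next;
            Node nodeToRemove = nodeBefore.next;
            nodeBefore.next = nodeToRemove.next;
            result = nodeToRemove.data;
            if (givenPosition == numberOfEntries) lastNode = nodeBefore;   // removed the end node
        }
        numberOfEntries--;
        return result;
    }

    T getLast() { return lastNode == null ? null : lastNode.data; }   // for the demo only
    int getLength() { return numberOfEntries; }

    public static void main(String[] args) {
        TwoEndedListSketch<String> list = new TwoEndedListSketch<>();
        list.add("A");
        list.add("B");
        list.add("C");
        System.out.println(list.remove(3) + " last=" + list.getLast());   // C last=B
        System.out.println(list.remove(1) + " last=" + list.getLast());   // A last=B
        System.out.println(list.remove(1) + " last=" + list.getLast());   // B last=null
    }
}
```

If either lastNode update is left out, a later add() at the end would link the new node onto a node that is no longer in the list.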
Lecture 9: Singly Linked List Variations
• Circular Linked List
Now instead of null, the last node has a
reference to the front node
What is good about this?
Which node(s) should we keep track of?
• Why?
– Think about adding at the beginning or end
– Can be effectively used for a Queue (see board)
– We will look at this more later
[Figure: circular linked list with a lastNode reference]
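One way to see why a single lastNode reference can be enough: in a circular list the front is always lastNode.next, so both ends are reachable in one step. The following is a minimal sketch under that assumption, with hypothetical names (CircularQueueSketch, enqueue, dequeue); it is not the queue implementation covered later in the course.

```java
// Hypothetical circular singly linked queue that keeps only a lastNode reference.
class CircularQueueSketch<T> {
    private Node lastNode;   // lastNode.next is the front of the queue

    private class Node {
        T data;
        Node next;
        Node(T d) { data = d; }
    }

    void enqueue(T entry) {                 // add at the end
        Node n = new Node(entry);
        if (lastNode == null) {
            n.next = n;                     // a single node points to itself
        } else {
            n.next = lastNode.next;         // new end points to the front
            lastNode.next = n;
        }
        lastNode = n;
    }

    T dequeue() {                           // remove from the front
        if (lastNode == null) return null;
        Node front = lastNode.next;
        if (front == lastNode) lastNode = null;   // removing the only node
        else lastNode.next = front.next;
        return front.data;
    }

    boolean isEmpty() { return lastNode == null; }

    public static void main(String[] args) {
        CircularQueueSketch<Integer> q = new CircularQueueSketch<>();
        q.enqueue(1);
        q.enqueue(2);
        q.enqueue(3);
        System.out.println(q.dequeue());   // 1
        System.out.println(q.dequeue());   // 2
    }
}
```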
Lecture 9: Other Linked List Variations
• Doubly Linked List
Each node has a link to the one before and the
one after
• Call them previous and next
• Now we can easily traverse the list in either direction
– Gives more general access and can be more useful
– This is more beneficial if we have a reference to the end
of the list as well as the beginning, or we make it circular
– Used in standard JDK LinkedList and in author’s Deque
• Some operations may be somewhat faster
• But more overhead involved
– What overhead do we mean here?
• We may look in more detail if we have time
Lecture 9: Stacks
• One of the simplest and most commonly used
data structures is the Stack
Stack
• Data is added and removed from one end only (typically
called the top)
• Logically the top item is the only one that can even be
seen
– Think of a plate warmer in a buffet
• Fundamental Operations
– Push an item onto the top of the stack
– Pop an item from the top of the stack
– Peek at the top item without disturbing it
• See StackInterface.java
Lecture 9: Stacks
• A Stack organizes data by Last In First
Out, or LIFO (or FILO – First In Last Out)
• This access, although simple, is useful for
a variety of problems
Let's look at a few applications before we
discuss the implementation
• Run-time Stack for method calls (especially
recursive calls)
– We will see this when we discuss recursion
– When a method is called, its activation record is
pushed onto the run-time stack
– When it is finished, its activation record is popped
from the run-time stack
Lecture 10: Stacks
• Testing for matching parentheses
(()())() – match
((((())))) – match
((()) – don't match (not enough right parens)
())( – don't match (parens out of order)
([)] – don't match (wrong paren type)
• How can we code this using a Stack?
– Let's solve this problem together
– Ok, what do we need:
> A character variable to store the current character
> A Stack (we need to figure out how it's used)
> A way to input the data
Lecture 10: Stacks
• Discuss different cases and develop idea
– When do we push, when do we pop and how do we
test?
– Let's consider the cases one at a time and see what we
need to do to determine them
– Do on board
• Look at code: Driver.java & BalanceChecker.java
– From the Authors
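For comparison with the authors' BalanceChecker.java, here is one possible sketch of the idea. It is not their code: the class name BalanceSketch is hypothetical, it uses java.util.ArrayDeque as the stack instead of the course's StackInterface, and it reads the expression from a String. Push each left delimiter; on a right delimiter, pop and check that the types agree; at the end the stack must be empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BalanceSketch {
    static boolean isBalanced(String expression) {
        Deque<Character> open = new ArrayDeque<>();
        for (char ch : expression.toCharArray()) {
            if (ch == '(' || ch == '[' || ch == '{') {
                open.push(ch);
            } else if (ch == ')' || ch == ']' || ch == '}') {
                if (open.isEmpty()) return false;        // right delimiter with no partner
                char left = open.pop();
                if (!matches(left, ch)) return false;    // wrong delimiter type
            }                                            // all other characters are ignored
        }
        return open.isEmpty();                           // leftovers are unmatched left delimiters
    }

    private static boolean matches(char left, char right) {
        return (left == '(' && right == ')')
            || (left == '[' && right == ']')
            || (left == '{' && right == '}');
    }

    public static void main(String[] args) {
        String[] tests = {"(()())()", "((((()))))", "((())", "())(", "([)]"};
        for (String t : tests)
            System.out.println(t + " : " + isBalanced(t));
    }
}
```

The five strings in main are the examples listed on the earlier slide; the first two are reported as balanced and the last three are not.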
Lecture 10: Stacks
• Stacks can also be used to evaluate postfix expressions
Operators follow operands
• Useful since no parentheses are needed
Ex: 20 10 6 – 5 4 * + 14 - / = ??
General algorithm?
• Idea is that each operator seen is used on the two
most recently seen (or generated) operands
– So for example, the "–" is used on 10 and 6
– So what do we do with operands before seeing an
operator, or after we evaluate an intermediate result?
> Discuss and trace example on board
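After tracing by hand, the algorithm can be checked with a short sketch. The class name PostfixSketch is hypothetical, and the code assumes a well-formed, space-separated expression of integers with plain ASCII operators (+ - * /) and integer division; malformed input is not handled. Operands are pushed; each operator pops the two most recent operands and pushes the result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixSketch {
    // Evaluates a space-separated postfix expression of integers.
    static int evaluate(String postfix) {
        Deque<Integer> operands = new ArrayDeque<>();
        for (String token : postfix.trim().split("\\s+")) {
            if (isOperator(token)) {
                int right = operands.pop();   // most recently seen operand
                int left = operands.pop();    // the one seen before it
                operands.push(apply(token, left, right));
            } else {
                operands.push(Integer.parseInt(token));
            }
        }
        return operands.pop();                // the final result is the only item left
    }

    private static boolean isOperator(String token) {
        return token.equals("+") || token.equals("-")
            || token.equals("*") || token.equals("/");
    }

    private static int apply(String op, int left, int right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;    // "/" with integer division
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("20 10 6 - 5 4 * + 14 - /"));
    }
}
```

Note the order of the two pops: the first value popped is the right operand, which matters for subtraction and division.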
Lecture 10: Stacks
We can also use a stack to convert from infix
notation to postfix notation
• Ex: (a + b) * (c – d * e)
becomes ab+cde*–*
• This process is somewhat more complicated, since
we need to be able to handle operands, operators
(of different precedence) and possibly parentheses
• We will also need a StringBuilder (or StringBuffer)
to store the result
• This process is discussed in detail in Sections 5.11-5.16 of the text
– Read over it carefully – it is explained quite well in
the book
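As a companion to the text's discussion, here is a hedged sketch of the conversion. It is not the book's algorithm verbatim: the class name InfixToPostfixSketch is hypothetical, operands are assumed to be single letters or digits, only the left-associative operators + - * / (plain ASCII) and parentheses are handled, and malformed input is not checked. Operands go straight to the StringBuilder; operators wait on a stack until an operator of lower precedence, a parenthesis, or the end of the input releases them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class InfixToPostfixSketch {
    static String convert(String infix) {
        StringBuilder postfix = new StringBuilder();
        Deque<Character> operators = new ArrayDeque<>();
        for (char ch : infix.toCharArray()) {
            if (Character.isLetterOrDigit(ch)) {
                postfix.append(ch);                        // operands go straight to the output
            } else if (ch == '(') {
                operators.push(ch);
            } else if (ch == ')') {
                while (!operators.isEmpty() && operators.peek() != '(')
                    postfix.append(operators.pop());       // flush operators inside the parentheses
                if (!operators.isEmpty()) operators.pop(); // discard the '(' itself
            } else if (precedence(ch) > 0) {
                // Pop operators of higher or equal precedence first (left associativity).
                while (!operators.isEmpty() && operators.peek() != '('
                        && precedence(operators.peek()) >= precedence(ch))
                    postfix.append(operators.pop());
                operators.push(ch);
            }                                              // anything else (e.g. spaces) is skipped
        }
        while (!operators.isEmpty())
            postfix.append(operators.pop());
        return postfix.toString();
    }

    private static int precedence(char op) {
        switch (op) {
            case '+': case '-': return 1;
            case '*': case '/': return 2;
            default:            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("(a + b) * (c - d * e)"));   // ab+cde*-*
    }
}
```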
Lecture 10: Stacks
• Stack Implementation?
A Stack can easily be implemented using
either an array or a linked list
• Array:
– Push?
– Pop?
– See ArrayStack.java
• Linked List:
– Push?
– Pop?
– See LinkedStack.java
Lecture 10: Stacks
• In Java Collections Framework: class Stack extends
class Vector, defining the Stack operations
appropriately
– Look at code
– Note style problem: All Vector operations are still
available, allowing user to violate Stack restrictions
– Would have been better to make the Stack an
interface, as was done with the Queue
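The style problem can be demonstrated directly with the real java.util.Stack. The class and method names below are hypothetical. The second method shows the alternative that the java.util.Stack documentation itself points to, a Deque such as ArrayDeque used through push/pop/peek; note that a Deque still exposes operations at both ends, so it narrows the problem rather than removing it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Stack;

class StackLeakDemo {
    // Uses methods inherited from Vector to insert underneath the stack.
    static String bottomAfterInsert() {
        Stack<String> s = new Stack<>();
        s.push("A");
        s.push("B");
        s.add(0, "X");             // Vector method: insert at index 0 (the bottom)
        return s.firstElement();   // Vector method: look at the bottom
    }

    // A Deque used only through the stack-style operations.
    static String topOfDeque() {
        Deque<String> d = new ArrayDeque<>();
        d.push("A");
        d.push("B");
        return d.peek();
    }

    public static void main(String[] args) {
        System.out.println(bottomAfterInsert());   // X
        System.out.println(topOfDeque());          // B
    }
}
```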
Lecture 10: Algorithm Analysis
• Consider different ADT implementations
We have talked about efficiency differences, but
we have been somewhat vague about it
Now we will look at algorithm efficiencies in a
more formal way
• Mathematically
Why do we care about formalizing this?
• Consider all of the work involved in implementing a
new ADT
– It is non-trivial to get all of the operations working
correctly
– Many special cases and much debugging is required
Lecture 10: Algorithm Analysis
• If we could determine whether or not an
implementation was good before actually doing the
work, it could save us a lot of time
– Inefficient potential implementations could be
abandoned before they are even done
Ex: Sum of integers example in text (Sections 4.1-4.2)
Ex: One you should be familiar with
• Searching a sorted array
– Assume the array has N items in it
– Sequential search can take up to N tests to find the item
– Binary search will take at most about log_2 N tests to find the
item
> So is this a big difference?
144
Lecture 11: Algorithm Analysis
• Let's first look at the tests for 1 search:
N         log_2 N
8         3
16        4
32        5
64        6
…         …
1024      10
1M        20
145
Lecture 11: Algorithm Analysis
• Now consider multiple searches
Let's say for example I need to do 1 million
searches of 1 million items
• For sequential search this could be up to
– 1M x 1M = 1T = 10^12 tests. WOW!
• For binary search this would be
– 1M x 20 = 20M = 2 x 10^7 tests. What a difference
Assume each test takes a nanosecond (10^-9 seconds)
• For sequential search we need
– 10^12 x 10^-9 = 10^3 seconds
= (10^3 seconds)(1 minute/60 seconds)
= 16.67 minutes
146
Lecture 11: Algorithm Analysis
• For binary search we need
– 2 x 10^7 x 10^-9 = 0.02 sec
The difference is amazing
• Just rethinking our algorithm takes us from something
that would take minutes to something that just takes
a fraction of a second
• Other examples can have even more extreme
differences
– See CS 1501
• By analyzing our algorithm BEFORE implementing it,
we can thus avoid algorithms that will require too
much time to run
147
Lecture 11: Algorithms and Complexity
• Measuring Execution Time
How to compare execution times of algorithms?
• Certainly we can time them empirically
– This will give us actual run-times that we can use to
compare
– Very useful for algorithms/ADTs that have already
been developed into programs – already implemented
– But we said previously that often it is good to get a
ballpark on the runtime of an algorithm/ADT BEFORE
actually implementing it
> Perhaps we wouldn't want to go through the effort if
the algorithm is not going to be useful
148
Lecture 11: Algorithms and Complexity
Asymptotic analysis
• Do not time actual program – in fact we may not
necessarily even have a program
• Instead do the following:
1) Determine some key instruction or group of
instructions that controls the overall run-time
behavior of the algorithm
– For example, for sorting we need to compare items to
each other
– Even though sorting involves other instructions, we
can say that the overall run-time is directly
proportional to the number of comparisons done
149
Lecture 11: Algorithms and Complexity
2) Determine a formula / function for how the number
of key instructions increases as the problem size
increases (typically we use the variable N for the
problem size)
• We typically are concerned with two different
results
– Worst Case Time: What is the formula for the
MAXIMUM number of key instructions relative to N
> We should know what the worst case time can be so
that we can plan for it if necessary
– Average Case Time: What is the formula for the
AVERAGE number of key instructions relative to N
> How will the algorithm do normally?
150
Lecture 11: Algorithms and Complexity
3) Only worry about the order of magnitude
– We use the measure Big-O for this
– For a given formula, we ignore lower order terms and
constant multipliers
• Ex: Let's say we determine the formula for the
comparisons for a given sorting algorithm in the
worst case to be (N^2/2) – (N/2)
– We say the Big-O run-time of this sorting algorithm is
O(N^2)
• We ignore lower order terms because …
– they become less significant as the problem size
increases
> Compare some function growth rates to see this point –
see board.
151
Lecture 11: Algorithms and Complexity
• We ignore constant multipliers because …
– they can depend on programmer, lang., computer, etc.
> Program A written by Joe Schmoe runs in time 4N
> Program B written by Jill Schmill runs in time 2N
> Maybe Jill is a better programmer than Joe
> Maybe one compiler makes more efficient code than the
other
How about some simple examples:
• Constant time O(1)
Y = X;
i++;
• Linear time O(N)
for (int i = 0; i < N; i++)
do_some_constant_time_operation
152
Lecture 11: Algorithms and Complexity
• Quadratic Time O(N^2)
for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
        do_some_constant_time_op;
• We will look at some others as well later on
• So for searching that we mentioned previously:
– Sequential Search is O(N)
> Why? Single while loop with up to N iterations
– Binary Search is O(log_2 N)
> Why? This one is a bit trickier
> We still have a loop, but now the number of iterations is
very different
> Let's look at the code from the standard Java library
> In the java.util.Arrays class
153
Lecture 11: Algorithms and Complexity
public static int binarySearch(Object[] a, Object key) {
    int low = 0;
    int high = a.length - 1;
    while (low <= high) {
        int mid = (low + high) >>> 1; // unsigned shift avoids int overflow in low + high
        Object midVal = a[mid];
        int cmp = ((Comparable) midVal).compareTo(key);
        if (cmp < 0)
            low = mid + 1;
        else if (cmp > 0)
            high = mid - 1;
        else
            return mid; // key found
    }
    return -(low + 1); // key not found.
}
What is the "worst case" for this?
154
Lecture 11: Algorithms and Complexity
To simplify calculations we'll cheat a bit:
1) Assume that the array is cut exactly in half with each
iteration
– In reality it may vary by one element either way
2) Assume that the initial size of the array, N, is an
exact power of 2, or 2^K for some K
– In reality it can be any value
– However, it will not affect our results
Ok, so we have the following:
• Initially: N_0 = 2^K
• At iteration 1, N_1 = N_0/2 = 2^(K-1) (in terms of K)
• ...
• Last iteration is when N = 1 = 2^0 (in terms of K)
155
Lecture 11: Algorithms and Complexity
• We do one comparison (test) per iteration
• Thus we have a total of K+1 comparisons maximum
– But N = 2^K
– So K = log_2 N
– Which makes K + 1 = log_2 N + 1
• This leads to our final answer of O(log_2 N)
156
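The K + 1 bound can be checked empirically with a small counting sketch. The class and method names are hypothetical; the loop follows the same logic as the library code shown earlier, but on an int array so that the example stays short.

```java
class BinarySearchCountSketch {
    // Number of loop iterations used to search the sorted array a for key
    static int iterations(int[] a, int key) {
        int low = 0, high = a.length - 1, count = 0;
        while (low <= high) {
            count++;
            int mid = (low + high) >>> 1;
            if (a[mid] < key) low = mid + 1;
            else if (a[mid] > key) high = mid - 1;
            else break;                              // key found
        }
        return count;
    }

    // Largest iteration count over every present key and every gap between keys
    static int worstCase(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = 2 * i;    // even values, so odd keys are absent
        int worst = 0;
        for (int key = -1; key <= 2 * n; key++) {
            worst = Math.max(worst, iterations(a, key));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(8));     // prints 4  (K = 3, K + 1 = 4)
        System.out.println(worstCase(1024));  // prints 11 (K = 10, K + 1 = 11)
    }
}
```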
Lecture 11: Algorithms and Complexity
• Let's look at another example
Consider our Bag implementations
We can now formally analyze the run-times of
some operations, to determine which
implementation is better for which operations (if
at all)
157
Lecture 12: Algorithms and Complexity
add(newEntry)
• Recall that this version of the method adds to the end
of the list
• Runtime for Resizable Array ?
O(1)
We can go directly to the last location and insert there
• What about the time to resize?
The answer above is a bit deceptive
Some adds take significantly more time, since we have to first
allocate a new array and copy all of the data into it – O(N) time
So we have O(N) + O(1) = O(N) total
158
Lecture 12: Algorithms and Complexity
• So we have an operation that sometimes takes O(1)
and sometimes takes O(N)
How do we handle this issue?
• Amortized Time (see http://en.wikipedia.org/wiki/Amortized_analysis )
• Average time required over a sequence of operations
• Individual operations may vary in their run-time, but
we can get a consistent time for the overall
sequence
• Let's stick with the add() method for resizable bag
and consider 2 different options for resizing:
1) Increase the array size by 1 each time we resize
2) Double the array size each time we resize (which is
the way the authors actually did it)
159
Lecture 12: Algorithms and Complexity
1) Increase the array size by 1 each time we resize
• Note that with this approach, once we resize we will
have to do it with every add
• Thus rather than O(1) our add() is now O(N) all the
time
• Specifically, assume the initial array size is 1
– On insert 1 we just add the item (1 assignment)
– On insert 2 we allocate and assign 2 items
– On insert 3 we allocate and assign 3 items
– …
– Overall for N add() ops look at the total number of
assignments we have to make:
1 + 2 + 3 + … + N = N(N+1)/2, which is O(N^2)
160
Lecture 12: Algorithms and Complexity
2) Double the array size each time we resize
Insert #    # of assignments    End array size
1           1                   1
2           2 = 1 + 2^0         2
3           3 = 1 + 2^1         4
4           1                   4
5           5 = 1 + 2^2         8
…           1                   8
9           9 = 1 + 2^3         16
…           1                   16
17          17 = 1 + 2^4        32
…           1                   32
32          1                   32
161
Lecture 12: Algorithms and Complexity
Note that every row has 1 assignment for the new item
Rows that are 2^K + 1 for some K have an
additional 2^K assignments to copy data
So for N adds, we have a total of
• N assignments for the actual add
• 2^0 + 2^1 + … + 2^x for the copying
• What is x?
x = ceiling(log_2 N) – 1 ( = log_2 N – 1 if N is a power of 2)
• This gives us the geometric series
sum_{i=0}^{log_2 N - 1} 2^i = 2^{log_2 N} - 1 = N - 1, which is O(N)
162
Lecture 12: Algorithms and Complexity
Total is N + (N-1) = 2N-1, which is O(N)
Since we did N add() operations overall, our
amortized time is O(N)/N = O(1) – constant
Recall that when increasing by 1 we had O(N^2)
overall for the sequence, which gives us O(N) in
amortized time
• Note how much better our performance is when we
double the array size
Ok, that one was a bit complicated
• Had a good deal of math in it
• But that is what algorithm analysis is all about
• If you can do some math you can save yourself some
programming!
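The two resizing policies can also be compared by simply counting assignments. This is a minimal sketch with a hypothetical class name; it assumes an initial capacity of 1, as in the analysis above, and counts one assignment per copied item plus one for the new item.

```java
class ResizeCostSketch {
    // Total element assignments for n add() calls, starting from capacity 1.
    // doubling == true: capacity doubles when full; false: capacity grows by 1.
    static long assignments(int n, boolean doubling) {
        long total = 0;
        int capacity = 1, size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                total += size;                       // copy existing items to the new array
                capacity = doubling ? 2 * capacity : capacity + 1;
            }
            total += 1;                              // store the new item
            size++;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(assignments(32, true));   // prints 63  = 2N - 1
        System.out.println(assignments(32, false));  // prints 528 = N(N+1)/2
    }
}
```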
163
Lecture 12: Algorithms and Complexity
• What about the run-time for the singly linked list?
– Recall the add() method that adds to the front of the
list
– Discuss
Text discusses other Bag operations
• It turns out that for the Bag, the run-times for the
array and the linked list are the same for every
operation
• This will not always be the case
– We will see a difference soon for List operations
164
Lecture 12: Algorithms and Complexity
What about the Stack implementations?
• ArrayStack adds and removes from end of the array
– O(1) with no required resizing
– O(N) to add when resizing is necessary
– As we discussed with the Bag, even with the resizing if
we double the size, the add() method is amortized O(1)
• LinkedStack adds and removes from front of list
– O(1) always to create or remove node
So all Stack operations are constant time
165
Lecture 12: List Run-time Complexity
• What are the Big-O complexities for our List
implementations?
We saw for the Bag that it did not matter
(much) whether we used an array or a linked list
Can we say the same for the ListInterface?
Let’s look at one operation in particular to
highlight the difference:
• getEntry(int i)
• This accesses an arbitrary location in the list
• Let’s compare our AList and LList implementations
with regard to this operation
166
Lecture 12: List Run-time Complexity
For the AList, we simply index our array, and
can access entry in:
• O(1) time
What about our LList?
• Now it depends on the index
• Sequential access requires us to traverse the list i
Nodes to get to Node i
• Worst case?
– getEntry() is O(N) worst case for the LinkedList
> Note that it could be less, depending on where the object
is located
> So maybe we should also consider the average case
here, to be thorough
167
Lecture 12: List Run-time Complexity
To do this we need to make an assumption
about the index chosen
• Let's assume that all index values are equally likely
– If this is not the case, we can still do the analysis, if we
know the actual probability distribution for the index
choice
• Our assumption means that, given N choices for an
index, the probability of choosing a given index, i,
(which we will call P(i)) is
– 1/N for any i
• Let's define our key operation to be "looking at" a
node in the list
– So for a given index i, we will require i operations
– Let's call this value Ops(i)
168
Lecture 12: List Run-time Complexity
• Now we can define the average number of operations
to be:
Ave Ops = Sum_over_i (Ops(i) * P(i))
        = Sum_over_i (i * 1/N)
        = 1/N * Sum_over_i (i)
        = 1/N * [N * (N+1)]/2
        = (N+1)/2
• In an absolute sense, this is better than the worst
case, but asymptotically it is the same (why?)
• So in this case the worst and average cases are the
same
– This will not always be the case, as we will soon see
169
Lecture 13: Recursion
• Recursion
Idea
• Some problem P is defined/solved in terms of one or
more problems P', which are identical in nature to P
but smaller in size
Requirements
• 1 or more base cases in which no recursive call is
made
• 1 or more recursive cases in which the algorithm is
defined in terms of itself
• The recursive cases must eventually lead to a base
case
170
Lecture 13: Recursion
• Simple Examples of Recursive Algorithms
A lot of recursive problems have their origins in
mathematics
• Factorial – N!
– Iterative definition: N! = N * (N-1) * (N-2) * … * 1
– Recursive definition:
N! = N * (N-1)!   when N > 0
N! = 1            when N = 0
– Let's look at our 3 requirements:
> 1 base case when N = 0
> 1 recursive case when N > 0
> Since recursive case has argument of N-1, it should
always lead to a base case…but does it always?
> Be CAREFUL – make sure it always works!
171
Lecture 13: Recursion
Let's look at another simple example
• Integer Powers – X^N
– Iterative Definition: X * X * X * … * X – N times
– Recursive Definition:
X^N = X * X^(N-1)   when N > 0
X^N = 1             when N = 0
• Our 3 requirements
– 1 base case when N = 0
– 1 recursive case when N > 0
– Decrementing of N gives similar situation to that of
factorial
> Normally base case is always reached, unless N is initially
negative
172
Lecture 13: Recursion
• How to code using recursion?
Many recursive programs are very similar to the
underlying mathematical definitions
Let's look at Factorial:
public long factorial (int N)
{
    if (N < 0)
        throw new IllegalArgumentException();
    if (N <= 1)
        return 1;
    return N * factorial(N-1);
}
• Note that negative N generates an exception
• Function is calling itself, using the result in the return
expression
173
Lecture 13: Recursion
• How does recursion work?
2 important ideas allow recursion to work
• Activation Record (AR)
– A block of memory allocated to store parameters, local
variables and the return address during a
function/method call
– An AR is associated with each method CALL, so if a
method is called multiple times, multiple ARs are
created
• Run-Time Stack (RTS)
– Area of computer's memory which maintains ARs in
Last In First Out (LIFO) order
> We discussed Stacks in detail in Lecture 10
174
Lecture 13: Recursion
When a method is called
• An AR containing the parameters, return address and
local variables is pushed onto the top of the RTS
• If the method subsequently calls itself, a new, distinct AR
containing new data is pushed onto the top of the RTS
• The AR at the TOP of the RTS represents the currently
executing call
– ARs below represent previous calls that are waiting to be
returned to
• When top call terminates, control returns to the address
from top AR and then the top AR is popped from the RTS
See Example9.java
175
Lecture 13: Recursion
Let's look at one more simple example
• Sequential Search
– Find key in an array by checking each item in sequence
– We know how to do this iteratively
> Simple for loop or while loop to go through each item
> We have done this with the contains() method for both
the dynamic array bag and the linked bag
– Let's see how to do it recursively
> Remember to always consider the problem in terms of a
smaller problem of the same type
– Remember that we need
> Base case
> Recursive case
> Recursive calls must lead to base case
176
Lecture 13: Recursion
• In order to search for a key in an array of length N
we check the length
– If length == 0, we are done (base case not found)
– Else check the first element of the array
> If first element == key, we are done (base case
found)
> Else Sequential Search the remaining N-1 elements
(recursive case)
• Once we have this idea, we can quickly convert it
into code
See SeqSDemo.java
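The idea above converts almost directly into code. The following is a minimal sketch, not the code in SeqSDemo.java: the class name is hypothetical, and instead of shrinking the array a start index marks where the "remaining" elements begin.

```java
class RecursiveSeqSearchSketch {
    // Returns the index of key in a[start..a.length-1], or -1 if it is absent
    static <T> int search(T[] a, T key, int start) {
        if (start >= a.length) return -1;            // base case: nothing left (not found)
        if (a[start].equals(key)) return start;      // base case: found
        return search(a, key, start + 1);            // recursive case: the remaining N-1 items
    }

    public static void main(String[] args) {
        Integer[] data = {5, 3, 9};
        System.out.println(search(data, 9, 0));  // prints 2
        System.out.println(search(data, 7, 0));  // prints -1
    }
}
```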
177
Lecture 14: Exam 1
• Exam One
178
Lecture 15: Recursion and Divide and Conquer
• So far
Recursive algorithms that we have seen (see
text for more) are simple, and probably would
NOT be done recursively
• The iterative solutions work fine
• They are just used to demonstrate how recursion
works
However, recursion often suggests approaches
to problem solving that are more logical and
easier than without it
• For example, divide and conquer
179
Lecture 15: Recursion and Divide and Conquer
Let's look at one of our earlier recursive
problems – Power function, X^N
• We have already seen a simple iterative solution using
a for loop
• We have already seen and discussed a simple
recursive solution
– Note that the recursive solution does recursive calls
rather than loop iterations
– However both algorithms have the same runtime:
> We must do O(N) multiplications to complete the
problem
– Can we come up with a solution that is better in terms
of runtime?
> Let's try Divide and Conquer
180
Lecture 15: Recursion and Divide and Conquer
• Divide and Conquer
The idea is that a problem can be solved by
breaking it down to one or more "smaller"
problems in a systematic way
• Usually the subproblem(s) are a fraction of the size of
the original problem
• Usually the subproblem(s) are identical in nature to
the original problem
• It is fairly clear why these algorithms can typically be
solved quite nicely using recursion
181
Lecture 15: Recursion and Divide and Conquer
• We can think of each lower level as solving
the same problem as the level above
The only difference in each level is the size of the
problem, which is ½ of that of the level above it
Note how quickly the problem size is reduced
Classic problem: Can you fold a sheet of paper in
half more than 7 times? Google it – many links!
182
Lecture 15: Recursion and Divide and Conquer
How can we apply this to the Power fn?
• We typically need to consider two important things:
1) How do we break up or "divide" the problem into
subproblems?
– In other words, what do we do to the data to process it
before making our recursive call(s)?
2) How do we use the solutions of the subproblems to
generate the solution of the original problem?
– In other words, after the recursive calls complete, what
do we do with the results?
• For X^N the problem "size" is the exponent, N
– So a subproblem would be the same problem with a
smaller N
183
Lecture 15: Recursion and Divide and Conquer
• Let's try cutting N in half – use N/2
1) We want to define X^N somehow in terms of X^(N/2)
– We can't forget the base case
2) We need to determine how the original problem is
solved in terms of the solution X^(N/2)
– Do on board (and see notes below)
• Will this be an improvement over the other 2
versions of the function?
– It seems like it since the problem is being cut in half
each time
– Informal analysis shows we only need O(log_2 N)
multiplications in this case (see text)
> Same idea as the analysis for binary search
> Let's look at the code – Power.java
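One way the two versions might look is sketched below. This is not the code in Power.java: the class name, the method names and the mults counter are hypothetical, N is assumed to be non-negative, and overflow of long is ignored.

```java
class PowerSketch {
    static long mults;   // multiplication counter, for illustration only

    // Simple recursion: X^N = X * X^(N-1), so N multiplications
    static long slowPower(long x, int n) {
        if (n == 0) return 1;
        mults++;
        return x * slowPower(x, n - 1);
    }

    // Divide and conquer: X^N = (X^(N/2))^2, times one extra X when N is odd
    static long fastPower(long x, int n) {
        if (n == 0) return 1;                        // base case
        long half = fastPower(x, n / 2);             // one recursive call on half the exponent
        mults++;
        long result = half * half;
        if (n % 2 == 1) {
            mults++;
            result *= x;
        }
        return result;
    }

    public static void main(String[] args) {
        mults = 0;
        System.out.println(slowPower(2, 30) + " using " + mults + " multiplications");
        mults = 0;
        System.out.println(fastPower(2, 30) + " using " + mults + " multiplications");
    }
}
```

Note that fastPower makes only ONE recursive call and squares the result. Calling fastPower(x, n / 2) twice would bring the number of multiplications back up to O(N).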
184
Lecture 15: Recursion and Binary Search
• Now let's reconsider binary search, this
time using recursion
Recall that the data must be in order
You are searching for object S
How do we divide?
• Cut the array in half – makes sense since the
iterative version cuts the array in half as well
How do we conquer?
• This is trickier – in fact we may not really need to
do anything here at all – let's see
185
Lecture 15: Recursion and Binary Search
Ok, what about base case?
• Two cases actually
– Base case not found – array size is down to zero
– Base case found – key matches current item
What about the recursive case?
• Consider the middle element, M, and check if S is:
– Equal to M: you are done and you have found it
> One of the base cases
– Less than M: recurse to the left side of the array
– Greater than M: recurse to the right side of the array
• Same logic as the iterative version
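The base cases and recursive cases above can be sketched as follows. The class name is hypothetical and this is not the course's BSTest.java; returning -1 for "not found" is an assumption made for simplicity (the library version returns a negative insertion point instead).

```java
class RecursiveBinarySearchSketch {
    // Searches the sorted range a[low..high] for key; returns its index or -1
    static <T extends Comparable<? super T>> int search(T[] a, T key, int low, int high) {
        if (low > high) return -1;                   // base case: range is empty (not found)
        int mid = low + (high - low) / 2;            // middle element M
        int cmp = key.compareTo(a[mid]);
        if (cmp == 0) return mid;                    // base case: found
        if (cmp < 0) return search(a, key, low, mid - 1);   // recurse to the left side
        return search(a, key, mid + 1, high);               // recurse to the right side
    }

    public static void main(String[] args) {
        Integer[] data = {10, 20, 30, 40, 50};
        System.out.println(search(data, 40, 0, data.length - 1));  // prints 3
        System.out.println(search(data, 35, 0, data.length - 1));  // prints -1
    }
}
```

Nothing is done with the result after the recursive call returns, which is why the "conquer" step is essentially empty here.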
186
Lecture 15: Recursion and Binary Search
• Proceeding in this fashion removes ½ of the
remaining items from consideration with each guess
– i.e. with each recursive call
• Let's compare this to iterative binary search
• We will also compare it to sequential search
• See BSTest.java
– Counts the number of comparisons required for the
searches
– Clearly as N gets larger the difference becomes quite
significant
• Also read Chapter 18 of the Carrano text
– It discusses both sequential search and binary search
187
Lecture 15: More Recursion
• So far
Every recursive algorithm we have seen can be
done easily in an iterative way
• Even the divide and conquer algorithms (Binary
Search, Power function) have simple iterative
solutions
Can we tell if a recursive algorithm can be easily
done in an iterative way?
• Yes – any recursive algorithm that is exclusively tail
recursive can be done simply using iteration without
recursion
• Most algorithms we have seen so far are exclusively
tail recursive
188
Lecture 15: Tail Recursion
• So what is tail recursion?
Recursive algorithm in which the recursive call
is the LAST statement in a call of the method
• Look at algorithms so far to see this is true (ignore
trace versions, which add extra statements)
– Note Power does some math after the call, but it can
still be done easily in an iterative way, even the divide
and conquer version
• What are the implications of tail recursion?
Any tail recursive algorithm can be converted
into an iterative algorithm in a methodical way
• In fact some compilers do this automatically
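A small illustration of the methodical conversion, using sequential search on an int array. The names are hypothetical; the recipe shown (wrap the body in a loop and replace the tail call with an update of the parameters that change) is one common way to do it.

```java
class TailRecursionSketch {
    // Tail recursive: the recursive call is the last action and its result is returned as-is
    static boolean containsRec(int[] a, int key, int start) {
        if (start >= a.length) return false;
        if (a[start] == key) return true;
        return containsRec(a, key, start + 1);
    }

    // Same logic with the tail call replaced by a parameter update inside a loop
    static boolean containsIter(int[] a, int key, int start) {
        while (true) {
            if (start >= a.length) return false;
            if (a[start] == key) return true;
            start = start + 1;                       // was: containsRec(a, key, start + 1)
        }
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(containsRec(data, 15, 0) + " " + containsIter(data, 15, 0));
        System.out.println(containsRec(data, 7, 0) + " " + containsIter(data, 7, 0));
    }
}
```

The iterative version needs no new activation record per step, so it uses constant stack space.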
189
Lecture 16: Overhead of Recursion
• Why do we care?
Recursive algorithms have overhead associated
with them
• Space: each activation record (AR) takes up memory
in the run-time stack (RTS)
– If too many calls "stack up" memory can be a problem
– We saw this when we had to increase the stack size for
BSTest.java
• Time: generating ARs and manipulating the RTS takes
time
– A recursive algorithm will typically run more slowly than
an equivalent iterative version
190
Lecture 16: Overhead of Recursion
• So what good is recursion?
1) For some problems, a recursive approach is
more natural and simpler to understand than
an iterative approach
• Once the algorithm is developed, if it is tail
recursive, we can always convert it into a faster
iterative version (ex: binary search, power)
2) For some problems, it is very difficult to even
conceive an iterative approach, especially if
multiple recursive calls are required in the
recursive solution
Example: Backtracking problems
191
Lecture 16: Recursion and Backtracking
• Idea of backtracking:
Proceed forward to a solution until it becomes
apparent that no solution can be achieved
along the current path
• At that point UNDO the solution (backtrack) to a
point where we can again proceed forward
Example: 8 Queens Problem
• How can I place 8 queens on a chessboard such
that no queen can take any other in the next move?
– Recall that queens can move horizontally, vertically or
diagonally for multiple spaces
• See on board
192
Lecture 16: 8 Queens Problem
How can we solve this with recursion and
backtracking?
• We note that all queens must be in different rows
and different columns, so each row and each column
must have exactly one queen when we are finished
– Complicating it a bit is the fact that queens can move
diagonally
• So, thinking recursively, we see the following
– To place 8 queens on the board we need to
> Place a queen in a legal (row, column)
> Recursively place 7 queens on the rest of the board
• Where does backtracking come in?
– Our initial choices may not lead to a solution – we
need a way to undo a choice and try another one
> See example on board
193
Lecture 16: 8 Queens Problem
Using this approach we come up with the solution
as shown in 8-Queens handout
• JRQueens.java
Idea of solution:
• Each recursive call attempts to place a queen in a
specific column
– A loop is used, since there are 8 squares in the column
• For a given call, the state of the board from previous
placements is known (i.e. where are the other
queens?)
– This is used to determine if a square is legal or not
• If a placement within the column does not lead to a
solution, the queen is removed and moved "down" the
column
194
Lecture 16: 8 Queens Problem
• When all rows in a column have been tried, the call
terminates and backtracks to the previous call (in the
previous column)
• If a queen cannot be placed into column i, do not
even try to place one onto column i+1 – rather,
backtrack to column i-1 and move the queen that had
been placed there
• See handout for code details
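The column-by-column idea can be sketched compactly. This is NOT the handout's JRQueens.java: the class name and the rowOf array are hypothetical, and unlike a solver that stops at the first solution, this sketch keeps backtracking so that it counts every solution.

```java
class QueensSketch {
    static int n;            // board size
    static int[] rowOf;      // rowOf[c] = row of the queen already placed in column c
    static int solutions;

    // Is (row, col) safe given the queens in columns 0..col-1?
    static boolean legal(int col, int row) {
        for (int c = 0; c < col; c++) {
            if (rowOf[c] == row) return false;                        // same row
            if (Math.abs(rowOf[c] - row) == col - c) return false;    // same diagonal
        }
        return true;
    }

    // Each call tries to place one queen somewhere in column col
    static void place(int col) {
        if (col == n) { solutions++; return; }       // all columns filled
        for (int row = 0; row < n; row++) {
            if (legal(col, row)) {
                rowOf[col] = row;
                place(col + 1);
                // Returning here is the backtrack: the loop moves this queen "down"
            }
        }
        // All rows tried: the call ends and control returns to the previous column
    }

    static int countSolutions(int size) {
        n = size;
        rowOf = new int[size];
        solutions = 0;
        place(0);
        return solutions;
    }

    public static void main(String[] args) {
        System.out.println(countSolutions(8)); // prints 92
    }
}
```

The row chosen in each earlier column lives in that call's activation record (the loop variable row), which is the state the run-time stack keeps for us.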
• Why is this difficult to do iteratively?
We need to store a lot of state information as
we try (and un-try) many locations on the board
• For each column so far, where has a queen been
placed?
195
Lecture 16: 8 Queens Problem
The run-time stack does this automatically for us
via activation records
• Without recursion, we would need to store / update
this information ourselves
• This can be done (using our own Stack rather than
the run-time stack), but since the mechanism is
already built into recursive programming, why not
utilize it?
There are many other famous backtracking
problems
• http://en.wikipedia.org/wiki/Backtracking
196
Lecture 17: Towers of Hanoi
• Another Famous Recursive Algorithm:
Towers of Hanoi Problem
Problem:
• We have 3 towers
• On first tower we have disks of decreasing size
• Goal is to get all disks onto last tower, but
– We can only move one disk at a time
– We can never put a larger disk on top of a smaller one
Let's play and see why it is so difficult to solve in
an iterative way
• Volunteer?
197
Lecture 17: Towers of Hanoi
• Why is this problem difficult iteratively?
A recursive algorithm with a single recursive call
still provides a linear chain of calls
Calls build run-time stack
Stack shrinks as calls finish
198
Lecture 17: Execution Trees
When a recursive algorithm has 2 calls, the
execution trace is now a binary tree, as we saw
with the trace on the board
• This execution is more difficult to do without
recursion
– To do it, programmer must create and maintain his/her
own stack to keep all of the various data values
– This increases the likelihood of errors / bugs in the
code
Later we will see some other classic recursive
algorithms with multiple calls
• Ex: MergeSort, QuickSort
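Towers of Hanoi is a short example of an algorithm with 2 recursive calls. The sketch below is one standard formulation; the class name, the tower labels and the move format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class HanoiSketch {
    // Records the moves needed to shift `disks` disks from tower `from` to tower `to`
    static void solve(int disks, char from, char to, char spare, List<String> moves) {
        if (disks == 0) return;                      // base case: nothing to move
        solve(disks - 1, from, spare, to, moves);    // call 1: clear the smaller disks away
        moves.add(from + "->" + to);                 // move the largest disk
        solve(disks - 1, spare, to, from, moves);    // call 2: put the smaller disks back on top
    }

    public static void main(String[] args) {
        List<String> moves = new ArrayList<>();
        solve(3, 'A', 'C', 'B', moves);
        System.out.println(moves.size() + " moves: " + moves); // 7 moves for 3 disks
    }
}
```

Each call makes two further calls, so the trace is a binary tree, and N disks take 2^N - 1 moves.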
199
Lecture 17: Sorting
• Sorting is a very common and useful process
We sort names, salaries, movie grosses, Nielsen
ratings, home runs, populations, book sales, to
name a few
It is important to understand how sorting works
and how it can be done efficiently
By default, we will consider sorting in increasing
order:
• For all indices, i, j: if i < j, then A[i] <= A[j]
– Note we are allowing for duplicates here
– Note that for decreasing order we simply change right
side to A[i] >= A[j]
200
Lecture 17: Simple Sorts
• Simple Sorting Algorithms
Insertion Sort
Idea:
• "Remove" the items one at a time from the original
array and "Insert" them into a new array, putting
them into the correct sorted order as you insert
• We could accomplish this by using two arrays as
implied above, but that would double our memory
requirements
– We'd rather be able to sort in place
> Use only a constant amount of extra memory
201
Lecture 17: Simple Sorts
• To actually implement we are going to think of the
array in two parts
(Figure: the array divided into a SORTED section on the left and an UNSORTED section on the right)
• In each iteration of our outer loop, we will take an
item out of the UNSORTED section and put it into its
correct relative location in the SORTED section
Index:   0   1   2   3   4   5   6   7
        20  40  70  30  50  10  80  60
        20  30  40  70  50  10  80  60
        20  30  40  50  70  10  80  60
        10  20  30  40  50  70  80  60
202
Lecture 17: Simple Sorts
• Let's look at some code (from prev. text edition)
public static <T extends Comparable<? super T>> void insertionSort(T[] a, int n)
{
    insertionSort(a, 0, n - 1);
} // end insertionSort

public static <T extends Comparable<? super T>>
void insertionSort(T[] a, int first, int last)
{
    int unsorted, index;
    for (unsorted = first + 1; unsorted <= last; unsorted++)
    {
        // Assertion: a[first] <= a[first + 1] <= ... <= a[unsorted - 1]
        T firstUnsorted = a[unsorted];
        insertInOrder(firstUnsorted, a, first, unsorted - 1);
    } // end for
} // end insertionSort

private static <T extends Comparable<? super T>>
void insertInOrder(T element, T[] a, int begin, int end)
{
    int index;
    for (index = end; (index >= begin) && (element.compareTo(a[index]) < 0); index--)
    {
        a[index + 1] = a[index]; // make room
    } // end for
    a[index + 1] = element; // Assertion: a[index + 1] is available
} // end insertInOrder
203
Lecture 17: Simple Sorts
The code is a bit wordy – the authors present it
in this way to be more readable
Idea:
• Initial method has only array and length as params
• This calls an overloaded version with start and end
index values as params – allows us to sort only part of
the array if we want
– Each iteration in this method brings one more item
from the unsorted portion of the array into the sorted
portion
– It does this by calling another method to actually move
the value into its correct spot
> Values are shifted from left to right, leaving a "hole" in
the spot where the item should be
204
Lecture 17: Simple Sorts
• Run-time of InsertionSort?
Consider it in terms of comparisons of array items
• What is the WORST possible case scenario?
– Consider each iteration of the insertionSort loop
> when unsorted = 1, 1 comparison in insertInOrder method
> when unsorted = 2, 2 comparisons in insertInOrder method
> …
> when unsorted = N-1, N-1 comps in insertInOrder method
– Overall we get 1 + 2 + … + N-1 = (N-1)(N)/2
> Considering Big O, we have O(N^2)
• On average, the actual comparisons are a bit better, but it
is still O(N^2)
205
Lecture 17: Simple Sorts
Can we use InsertionSort on a linked list?
• What do you think?
• Yes – in fact it is probably more natural with a linked
list
– At each iteration simply remove the front node from the
list, and "insert it in order" into a second, new list
– In this case we are not creating ANY new nodes – just
moving the ones we have around
– Do demo on board
• Run-time?
– Same run-time, but interestingly, the worst case
situation is the opposite of that for the array version
> Discuss
• See p. 208 in text
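A minimal sketch of the linked list version follows. It is not the text's code: the class name, the int-valued Node and the sortArray helper are hypothetical. The sort itself creates no new nodes; it only relinks existing ones.

```java
class LinkedInsertionSortSketch {
    static class Node {
        int data;
        Node next;
        Node(int data, Node next) { this.data = data; this.next = next; }
    }

    // Repeatedly unlink the front node of the input list and splice it,
    // in order, into a second (sorted) list.
    static Node insertionSort(Node head) {
        Node sorted = null;
        while (head != null) {
            Node current = head;                     // remove the front node
            head = head.next;
            if (sorted == null || current.data < sorted.data) {
                current.next = sorted;               // new smallest: goes at the front
                sorted = current;
            } else {
                Node p = sorted;                     // walk from the front to the insertion point
                while (p.next != null && p.next.data <= current.data) p = p.next;
                current.next = p.next;
                p.next = current;
            }
        }
        return sorted;
    }

    // Helper for trying it out: array -> list -> sorted list -> array
    static int[] sortArray(int[] values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) head = new Node(values[i], head);
        head = insertionSort(head);
        int[] out = new int[values.length];
        int k = 0;
        for (Node p = head; p != null; p = p.next) out[k++] = p.data;
        return out;
    }
}
```

Because this sketch searches the sorted list from the FRONT, data that is already in increasing order forces a full walk on every insert, while reversed data inserts at the front immediately. That is the opposite of the array version, which searches from the end of the sorted section.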
206
Lecture 18: Simple Sorts
• Two other well-known simple sorts:
SelectionSort
• At iteration i of the outer loop, find the ith smallest
item and swap it into location i
i = 0 : find 0th smallest and swap into location 0
i = 1 : find 1th smallest and swap into location 1
…
i = N-1 : find (N-1)th smallest and swap into loc N-1
• Also a very simple implementation using nested for
loops (or method calls, as shown in text)
• We saw this algorithm earlier in the term with
Example2.java (go back and look at SortArray.java)
• See example on board
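A nested-loop version might look like the following sketch. The class name is hypothetical and an int array is used for brevity; this is not the code in SortArray.java.

```java
class SelectionSortSketch {
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;                             // index of the smallest item in a[i..end]
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;
            }
            int tmp = a[i];                          // swap it into location i
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {50, 30, 40, 70, 10, 80, 20};
        selectionSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [10, 20, 30, 40, 50, 70, 80]
    }
}
```

The outer loop can stop at N-2 because once the first N-1 locations are correct, the last item must be the largest.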
207
Lecture 18: Simple Sorts
BubbleSort
– Item j is compared to item j+1
– If data is sorted, item j should be less than or equal to item j+1
> In this case we do nothing
– If item j is greater than item j+1, they are out of order
> In this case we swap them
– Continue from beginning again until sorted
Index:   0   1   2   3   4   5   6
        50  30  40  70  10  80  20
        30  50  40  70  10  80  20
        30  40  50  70  10  80  20
        30  40  50  70  10  80  20
        30  40  50  10  70  80  20
        30  40  50  10  70  80  20
        30  40  50  10  70  20  80
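The compare-and-swap steps above can be sketched as one pass method plus a loop that repeats it. The names are hypothetical, and stopping when a pass makes no swaps is one common way to detect "until sorted".

```java
class BubbleSortSketch {
    // One left-to-right pass; returns true if any swap was made
    static boolean pass(int[] a) {
        boolean swapped = false;
        for (int j = 0; j < a.length - 1; j++) {
            if (a[j] > a[j + 1]) {                   // out of order: swap items j and j+1
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
                swapped = true;
            }
        }
        return swapped;
    }

    // Continue from the beginning again until a pass makes no swaps
    static void bubbleSort(int[] a) {
        while (pass(a)) {
            // nothing else to do; pass() does the work
        }
    }

    public static void main(String[] args) {
        int[] data = {50, 30, 40, 70, 10, 80, 20};
        pass(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [30, 40, 50, 10, 70, 20, 80]
        bubbleSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [10, 20, 30, 40, 50, 70, 80]
    }
}
```

After the first pass the largest item (80) is in its final location, matching the last row of the trace above.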
208
Lecture 18: Simple Sorts
Text also discusses recursive implementations of
InsertionSort and SelectionSort
• As with Sequential Search and some other simple
problems, this is more to show how it can be done
rather than something that we would actually do
• Read over these explanations and convince yourselves
that the recursive versions do the same thing as the
iterative versions
209
Lecture 18: Simple Sorts
SelectionSort also has O(N^2) run-time
Note that all of these simple sorting algorithms
have similar run-times in the worst case
• InsertionSort – O(N^2)
• SelectionSort – O(N^2)
• BubbleSort – O(N^2)
For a small number of items, their simplicity
makes them ok to use
But for a large number of items, this is not a
good run-time
We'd like to come up with something better
210
Lecture 18: Shellsort
• To improve on our simple sorts it helps to
consider why they are not so good
• Let's again consider InsertionSort
What about the algorithm makes its
performance poor?
Consider what occurs with each comparison
• Either nothing (if items are relatively in order)
• Or a data move of 1 location
– i.e. it only moves a small amount
• If the data is greatly out of order, it will take a lot of
comparisons to get into order
211
Lecture 18: Shellsort
If we can move the data farther with one comparison,
perhaps we can improve our run-time
This is the idea of Shellsort
• Rather than comparing adjacent items, we compare
items that are farther away from each other
• Specifically, we compare and "sort" items that are K
locations apart for some K
– i.e. We Insertionsort subarrays of our original array that
are K locations apart
• We gradually reduce K from a large value to a small
one, ending with K = 1
– Note that when K = 1 the algorithm is straight
Insertionsort
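To make the "K locations apart" idea concrete, here is a small sketch on an int array. It is an illustration only, with assumed names (ShellsortTraceSketch, gapPass), and it assumes K starts at half the array length and is halved each time; other sequences for K are possible.

```java
class ShellsortTraceSketch {
    /** Insertion-sorts every subarray whose items are k locations apart. */
    static void gapPass(int[] a, int k) {
        for (int unsorted = k; unsorted < a.length; unsorted++) {
            int item = a[unsorted];
            int index = unsorted - k;
            // Shift larger items k locations to the right
            while (index >= 0 && a[index] > item) {
                a[index + k] = a[index];
                index -= k;
            }
            a[index + k] = item;
        }
    }

    /** Reduces k from n/2 down to 1; the k = 1 pass is a plain Insertionsort. */
    static void shellSort(int[] a) {
        for (int k = a.length / 2; k > 0; k /= 2) {
            gapPass(a, k);
        }
    }
}
```

With 8 items this uses K = 4, then K = 2, then K = 1, so a single comparison in the early passes can move an item 4 or 2 locations at once.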
212
Lecture 18: Shellsort
Index:          0   1   2   3   4   5   6   7
Original:      40  20  70  60  50  10  80  30
After K = 4:   40  10  70  30  50  20  80  60
After K = 2:   40  10  50  20  70  30  80  60
After K = 1:   10  20  30  40  50  60  70  80
The idea is that by the time K = 1, most of the data will
not have very far left to move
213
Lecture 18: Shellsort
It seems like this algorithm will actually be
worse than Insertionsort – why?
• Its last "iteration" is a full Insertionsort
• Previous iterations do Insertionsorts of subarrays
Yet, when timed it actually outperforms
Insertionsort
• Exact analysis is tricky, and depends on initial value
for K
– Insertionsort actually has a very good run-time (O(N))
in the best case – Shellsort moves the data toward this
best case
• A good implementation will have about N^(3/2) execution time
– Compare to N^2 for large N
– See text for more details
214
Lecture 18: Shellsort
public static <T extends Comparable<? super T>>
       void shellSort(T[] a, int first, int last)
{
   int n = last - first + 1; // number of array elements
   for (int space = n / 2; space > 0; space = space / 2)
   {
      for (int begin = first; begin < first + space; begin++)
         incrementalInsertionSort(a, begin, last, space);
   } // end for
} // end shellSort

private static <T extends Comparable<? super T>>
        void incrementalInsertionSort(T[] a, int first, int last, int space)
{
   int unsorted, index;
   for (unsorted = first + space; unsorted <= last; unsorted = unsorted + space)
   {
      T firstUnsorted = a[unsorted];
      for (index = unsorted - space;
           (index >= first) && (firstUnsorted.compareTo(a[index]) < 0);
           index = index - space)
      {
         a[index + space] = a[index];
      } // end for
      a[index + space] = firstUnsorted;
   } // end for
} // end incrementalInsertionSort
215
Lecture 18: Improved Sorts
• Even Better Sorting Algorithms
If we approach sorting in a different way, we
can improve the run-time even more
How about using Divide and Conquer?
• General Idea:
– Define sorting an array of N items in terms of sorting
one or more smaller arrays (for example, of size N/2)
• As we said previously (for Binary Search), this works
well when implemented using recursion
– So we will look at the next two sorting algorithms
recursively
216
Lecture 18: Divide and Conquer Sorts
• How can we apply D and C to sorting?
Questions to consider:
1) How do we "divide" the problem into subproblems?
• Do we break the array in half, or in some other fragment?
• Do we break it up by index value, or in some other way?
2) How do we use the solutions of the subproblems to determine the overall solution?
• Once our recursive call(s) complete, what more needs to be done (if anything) to complete the sort?
• Let's examine these questions for MergeSort and
QuickSort, two famous D and C sorting algorithms
217
Lecture 19: Idea of MergeSort
1) How do we "divide" the problem?
Simply break the array in half based on index
value
• Given the initial array
Index:   0   1   2   3   4   5   6   7
Data:   40  80  60  20  30  10  70  50
• We divide it into
[40 80 60 20]   [30 10 70 50]
(indices 0-3)   (indices 4-7)
• We then recursively divide each side, getting
[40 80]   [60 20]   [30 10]   [70 50]
218
Lecture 19: Idea of MergeSort
• We continue recursively until we reach the base case
– We know any array of size 1 is sorted already
– In the case below, we have 8 "arrays", each of size 1
> Recall that physically, however, we still have only 1 array
> The subarrays are determined by index restrictions
Index:    0    1    2    3    4    5    6    7
Data:   [40] [80] [60] [20] [30] [10] [70] [50]
• Once the base case is reached, we have to determine
how to "put the pieces back together again"
219
Lecture 19: Idea of MergeSort
2) How do we use subproblem solutions to
solve the overall problem?
When the recursive calls complete, we will
have two sorted subarrays, one on the left and
one on the right
• Let's look at this from the first call's point of view
Index:    0   1   2   3     4   5   6   7
Data:   [20  40  60  80]  [10  30  50  70]
• How do we produce a single sorted array from these
two sorted subarrays?
220
Lecture 19: Idea of MergeSort
• We "merge" them together, moving the next
appropriate item into an overall sorted array
Index:            0   1   2   3     4   5   6   7
Sorted halves:  [20  40  60  80]  [10  30  50  70]
Merged result:  [10  20  30  40    50  60  70  80]
– Note that this is where we are really doing the "work"
of the sort. We are comparing items and moving them
based on those comparisons
221
Lecture 19: MergeSort
• Now we can look at pseudocode
MergeSort(A)
if (size of A > 1)
Break A into left and right halves
Recursively sort left half
Recursively sort right half
Merge sorted halves together
Looking at the pseudocode, the algorithm seems
pretty easy
• The only part that requires some thought is the merge
• Ok, let's look at some real code now
• See TextMergeQuick.java
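The pseudocode above could be turned into Java along these lines. This is a sketch on an int array with assumed names (MergeSortSketch and its helpers); it is not the code in TextMergeQuick.java. It uses one temporary array for merging, allocated once.

```java
class MergeSortSketch {
    /** Sorts the whole array in ascending order. */
    static void mergeSort(int[] a) {
        if (a.length > 1) {
            int[] temp = new int[a.length];   // extra storage used by merge
            mergeSort(a, temp, 0, a.length - 1);
        }
    }

    private static void mergeSort(int[] a, int[] temp, int first, int last) {
        if (first < last) {                       // size > 1
            int mid = first + (last - first) / 2; // break in half by index
            mergeSort(a, temp, first, mid);       // recursively sort left half
            mergeSort(a, temp, mid + 1, last);    // recursively sort right half
            merge(a, temp, first, mid, last);     // merge sorted halves together
        }
    }

    /** Merges sorted a[first..mid] and a[mid+1..last] via temp. */
    private static void merge(int[] a, int[] temp, int first, int mid, int last) {
        int left = first, right = mid + 1, out = first;
        while (left <= mid && right <= last) {
            // Take the smaller front item; <= keeps equal items in original order
            temp[out++] = (a[left] <= a[right]) ? a[left++] : a[right++];
        }
        while (left <= mid)   temp[out++] = a[left++];   // leftovers from left half
        while (right <= last) temp[out++] = a[right++];  // leftovers from right half
        for (int i = first; i <= last; i++) a[i] = temp[i]; // copy back
    }
}
```

The copy back from temp at the end of merge is the overhead that the later slides on MergeSort's memory use refer to.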
222
Lecture 19: MergeSort Runtime
• How long does MergeSort take to run?
Consider an original array of size N
The analysis is tricky due to the recursive calls
• Let's think of the work "level by level"
– At each level of the recursion we need to consider and
possibly move O(N) items
• Since the size is cut in half with each call, we have a total of O(log2 N) levels
• Thus in total we have N x log2 N work to do, so our runtime is O(N log2 N)
– Note that when multiplying Big-O terms, we do NOT
throw out the smaller terms
223
Lecture 19: MergeSort Runtime
• Keep in mind that we are looking at MergeSort "level by level" simply to do the analysis
• The actual execution of MergeSort is a tree execution, similar to what we did for Hanoi
– Note that we recursively sort the left side of the array,
going down all the way to the base case, and then
merging back, before we even consider the right side
– Draw execution flow on board
• Yet we know Towers of Hanoi required 2^N - 1 moves
while MergeSort only requires O(N lg N) comparisons
– Why this difference?
– Recall how the problem size decreases:
> Towers of Hanoi N-1
> MergeSort N/2
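One way to see the difference is to write both costs as recurrences. The names M and C below are introduced only for this comparison; both algorithms make two recursive calls, but the size of each call shrinks very differently.

```latex
\begin{align*}
\text{Towers of Hanoi:}\quad M(N) &= 2\,M(N-1) + 1,\quad M(1) = 1
   &&\Longrightarrow\quad M(N) = 2^{N} - 1 \\
\text{MergeSort:}\quad C(N) &= 2\,C(N/2) + O(N),\quad C(1) = 0
   &&\Longrightarrow\quad C(N) = O(N \log_2 N)
\end{align*}
```

Reducing the size by 1 gives about N levels of recursion with the number of calls doubling at each level, while halving the size gives only about log2 N levels.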
224
Lecture 19: MergeSort Overhead
MergeSort's runtime of O(Nlog2N) is a definite
improvement over our primitive sorts
• However, in order to "merge" we need an extra array
for temporary storage
– We are NOT sorting in place here
• This adds memory requirements
– Although O(N) memory these days is not that big of a
deal
• More importantly, copying to and from this extra
memory slows down the algorithm in real terms
– The asymptotic runtime is very good, but when actually
timed in practice we can do better
Let's try another approach: QuickSort
225
Lecture 19: Idea of QuickSort
1) How do we "divide" the problem?
QuickSort takes a different approach
• Instead of using index values to divide, break up the
data based on how it compares to a special data
value, called the pivot value.
• We compare all values to the pivot value, and place
them into 3 groups:
Data <= Pivot
Pivot
Data >= Pivot
• Since we are dividing by comparing values to
another value, note that the division may NOT be
exactly in half
226
Lecture 19: Idea of QuickSort
Let's look at an example:
Index:   0   1   2   3   4   5   6   7
Data:   40  80  60  20  30  10  70  50
Same original data as MergeSort example
• Now the "divide" has a different result
• Before we can divide, we need to choose the pivot value
– Can be any item – let's make it the last one, or A[last]
> We will later see a better way to do this
– In this case it is A[7] or the value 50
> However, at the end of the "divide", the pivot may end up in
a different index, since it should be "between" the two sides
227
Lecture 19: Idea of QuickSort
• Let's call this dividing partition
• Partition of our data using 50 as the pivot yields:
Index:    0   1   2   3     4     5   6   7
Data:   [40  10  30  20]   50   [80  70  60]
         <= pivot         pivot   >= pivot
– We will see how partition is implemented shortly
What does this achieve?
• Certainly the data is not yet sorted
• However, now we know that at least 1 item in the
array is in its CORRECT, sorted location
– Which one?
• The rest of the data is now "more sorted" than it was,
since it is at least on the correct "side" of the array
228
Lecture 19: Idea of QuickSort
• Naturally, the "divide" is not complete without
recursive calls
– For QuickSort, we can now recursively sort the left
"side" and the right "side"
> Recall that these sides may not be exactly ½ of the array
We are now ready for pseudocode:
QuickSort(A)
if (size of A > 1)
Choose a pivot value
Partition A into left and right sides
based on the pivot
Recursively sort left side
Recursively sort right side
229
Lecture 19: Idea of QuickSort
2) How do we use subproblem solutions to
solve the overall problem?
We don't have to do anything!
Note that we are comparing during partition
• Since the pivot is already in its correct spot, if we
recursively sort the left side and we recursively
sort the right side, the whole array is sorted
So even though we need to consider 2) here,
we don't need to do anything to accomplish
it (unlike MergeSort)
• However, implementing 1) for QuickSort requires
much work, also unlike MergeSort
230
Lecture 20: QuickSort
So how is the partition done?
We'd like to do this in place if possible
• No extra array/vector needed
Let's look at the code and trace the example on
the board
• See Quick.java
• Note that this is still a simple version
• We will look at the text version after we discuss the
run-time
231
Lecture 20: QuickSort
Partition: basic idea
• Start with a counter on the left of the array and a
counter on the right of the array
• As long as data at left counter is less than the pivot,
do nothing (just increment counter)
• As long as the data at right counter is greater than
the pivot, do nothing (just decr. counter)
– Idea here is that data is already on the correct side, so
we don't have to move it
• When left counter and right counter "get stuck", it
means there is data on the left that should be on the
right, and vice versa
– So swap the values and continue
232
Lecture 20: QuickSort
INITIALLY: left = 0, right = 7, pivot = 50, pivotIndex = 7, b = 0 (indexFromLeft), c = 6 (indexFromRight)
Index:   0   1   2   3   4   5   6   7
Data:   40  80  60  20  30  10  70  50
b moves from index 0 to index 1 (value 80); c moves from index 6 to index 5 (value 10)
– A[b] is greater than the pivot, but on left
– A[c] is less than the pivot but on right
– Swapping them puts things straight
Index:   0   1   2   3   4   5   6   7
Data:   40  10  60  20  30  80  70  50
b moves from index 1 to index 2 (value 60); c moves from index 5 to index 4 (value 30)
– A[b] is again greater than the pivot
– A[c] is again less than the pivot
– Swap again to put things straight
233
Lecture 20: QuickSort
Index:   0   1   2   3   4   5   6   7
Data:   40  10  30  20  60  80  70  50
b moves from index 2 to index 4 (value 60); c moves from index 4 to index 3 (value 20)
– The values again are on the "wrong side", but this time
note that b >= c
– This means we are done with the partition except for
one last step – what is that?
> We must put the pivot into the right place
> Swap A[pivotIndex] and A[indexFromLeft] (A[b])
> Set pivotIndex = indexFromLeft
Index:   0   1   2   3   4   5   6   7
Data:   40  10  30  20  50  80  70  60
234
Lecture 20: QuickSort
Index:         0   1   2   3     4     5   6   7
Data:        [40  10  30  20]   50   [80  70  60]
             left side (blue)  pivot  right side (orange)
– Now we recursively sort the left side (blue) and
recursively sort the right side (orange), and we are
finished
– Note that the pivot from this first partition is never
again touched – it is in its absolute correct spot
– The other items, however, could move considerably
within their sides of the array
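The partition traced above can be sketched as follows. This is an illustration under stated assumptions, not the code in Quick.java: it works on an int array, uses A[last] as the pivot, and borrows the counter names b and c from the slides; the class and method names are made up for the example.

```java
class SimpleQuickSortSketch {
    /** Sorts a[first..last] in place. */
    static void quickSort(int[] a, int first, int last) {
        if (first < last) {                        // size > 1
            int pivotIndex = partition(a, first, last);
            quickSort(a, first, pivotIndex - 1);   // recursively sort left side
            quickSort(a, pivotIndex + 1, last);    // recursively sort right side
        }
    }

    /** Partitions a[first..last] around a[last]; returns the pivot's final index. */
    static int partition(int[] a, int first, int last) {
        int pivot = a[last];
        int b = first;      // indexFromLeft
        int c = last - 1;   // indexFromRight
        while (true) {
            while (b < last && a[b] < pivot) b++;   // already on the correct side
            while (c > first && a[c] > pivot) c--;  // already on the correct side
            if (b >= c) break;                      // counters met or crossed
            swap(a, b, c);                          // both counters were "stuck"
            b++;
            c--;
        }
        swap(a, b, last);   // put the pivot between the two sides
        return b;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

On the example data, partition(a, 0, 7) performs the same two swaps as the trace and then moves the pivot 50 to index 4.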
235
Lecture 20: QuickSort
• How long does QuickSort take to run?
The performance of QuickSort depends on the
"quality" of the divide
• Depends on how other values relate to the pivot
Let's look at 2 different scenarios:
1) Pivot is always the middle value in a partition
• Show on board
• This execution trace is similar to that of MergeSort,
and the overall Big-O runtime is also O(Nlog2N)
– However, since an extra array is NOT needed in
QuickSort, the measured runtime will usually be faster
than MergeSort
236
Lecture 20: QuickSort
2) Pivot is always an extreme element in a partition
• Note that this is not the index of the pivot, but
rather where the pivot ends up after the partition is
complete
– Show on board
– Develop and discuss run-time
– Recall the idea of divide and conquer
> Recursive calls are a fraction of original size (ex: ½)
– However, in this case the recursive calls are only one
smaller than the original size (N-1)
> Thus we are losing the power of divide and conquer in
this case
– Run-time ends up being O(N^2)
> Same as the simple sorts
237
Lecture 20: QuickSort
So which run-time will we actually get?
• It depends on how the data is originally distributed
and how the pivot is chosen
– Our simple version of Quicksort picks A[last] as the
pivot
> This makes the interesting worst case of the data being
already sorted!
> Reverse sorted data is also a worst case
238
Lecture 20: QuickSort
• We can make the worst case less likely to occur by
choosing the pivot in a more intelligent way
– The text version uses Median of Three
• Median of Three Idea:
– Don't pick the pivot from any one index
– Rather consider 3 possibilities each time we partition
> A[first], A[mid], A[last]
– Order these items and put the smallest value back into
A[first], the middle into A[mid] and the largest into
A[last]
> So now we know that A[first] <= A[mid] <= A[last]
– Now use A[mid] as the pivot
• Now reconsider already sorted data
– Now it is a best case!
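The ordering step of Median of Three might look like the sketch below. It only shows how A[first], A[mid] and A[last] are put in order and the pivot index chosen; it is not the text's implementation, and the names are assumptions for this example.

```java
class MedianOfThreeSketch {
    /**
     * Orders a[first], a[mid], a[last] so that a[first] <= a[mid] <= a[last]
     * and returns mid, the index of the value to use as the pivot.
     */
    static int sortFirstMiddleLast(int[] a, int first, int last) {
        int mid = first + (last - first) / 2;
        if (a[first] > a[mid]) swap(a, first, mid);
        if (a[mid] > a[last])  swap(a, mid, last);   // largest is now in a[last]
        if (a[first] > a[mid]) swap(a, first, mid);  // smallest is now in a[first]
        return mid;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

For already sorted data the three items are unchanged and a[mid] is the true median of the whole range, which is why that input becomes a best case.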
239
Lecture 20: QuickSort
However, median of three does not guarantee
that the worst case (N^2) will not occur
• It only reduces the likelihood and makes the situation
in which it would occur not obvious
So we say:
• The EXPECTED run-time of QuickSort is O(N log2 N)
• The WORST CASE run-time of QuickSort is O(N^2)
For code, see
• TextMergeQuick.java
240
Lecture 20: QuickSort
• Other variations / optimizations:
What if we choose the pivot index randomly?
• For each call, choose a random index between first
and last (inclusive) and use that as the pivot
• Worst case?
– Could be just as bad as the simple pivot choice
• Average case?
– It is very unlikely that a random pivot will always be
bad
– Overall this should give good results
– However, we have overhead of generating random
numbers
241
Lecture 20: QuickSort
When to stop recursion?
• Simple QuickSort stops when logical size is 1
• However, benefit of divide and conquer decreases as
problem size gets smaller
– At some point, the cost of the recursion outweighs the
D and C savings
– So choose a size > 1 to stop recursing and switch to
another (good) algorithm at that point
– What to choose?
> InsertionSort!!!
> Why? Even though it is poor overall, if the data is
“mostly” sorted due to QuickSort, we will be close to the
best case for InsertionSort and maybe we will get better
overall results!
– See TextMergeQuick.java
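A sketch of the "stop early and switch to InsertionSort" idea is shown below. The cutoff value MIN_SIZE = 10 is an assumption chosen for illustration, and the partition here is a simple last-element partition rather than the text's median-of-three version. This variant insertion-sorts each small piece as soon as it is reached; running one InsertionSort over the whole array at the end is another option.

```java
class HybridQuickSortSketch {
    static final int MIN_SIZE = 10;   // assumed cutoff; tune by timing

    static void sort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    static void quickSort(int[] a, int first, int last) {
        if (last - first + 1 < MIN_SIZE) {
            insertionSort(a, first, last);      // small piece: stop recursing
        } else {
            int p = partition(a, first, last);
            quickSort(a, first, p - 1);
            quickSort(a, p + 1, last);
        }
    }

    static void insertionSort(int[] a, int first, int last) {
        for (int i = first + 1; i <= last; i++) {
            int item = a[i];
            int j = i - 1;
            while (j >= first && a[j] > item) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = item;
        }
    }

    /** Simple partition around a[last]; returns the pivot's final index. */
    static int partition(int[] a, int first, int last) {
        int pivot = a[last];
        int boundary = first;                   // a[first..boundary-1] < pivot
        for (int i = first; i < last; i++) {
            if (a[i] < pivot) {
                swap(a, i, boundary);
                boundary++;
            }
        }
        swap(a, boundary, last);
        return boundary;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```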
242
Lecture 20: QuickSort vs MergeSort
• So which do we prefer, MergeSort or
QuickSort?
MergeSort has a more consistent runtime than
QuickSort
However, in the normal case, QuickSort outperforms MergeSort
• Due to the extra array and copying of data, MergeSort is "normally" slower than QuickSort
• This is why many predefined sorts in programming languages are actually QuickSort
– Ex: In the JDK, Arrays.sort() uses a QuickSort variant for arrays of primitive types (arrays of objects are sorted with a merge-based algorithm instead)
243
Lecture 21: Iterators
• Recall what the ListInterface (and List) is
A set of methods that indicates the behavior
of classes that implement it
Nothing is specified about how the classes
that implement List are themselves
implemented
• The data could be stored in an array, as in the author's AList class
• The data could be stored in a linked list, as in the author's LList class
• The data could be stored in some other way
244
Lecture 21: Iterators
Question: How can users of any List class
access the data in a sequential way?
• We could copy the data into an array and return the
array – then we can access the array
– This is what the toArray() method does
• Can we do it without having to make a new array?
An iterator is a program component that allows
us to iterate through a list in a sequential way,
regardless of how the list is implemented
• The details of HOW we progress are left up to the implementer
• The user of the interface just knows it goes through the data
245
Lecture 21: Iterators
• Why do we need these?
• What good are they?
We will see that the implementation can be a
bit convoluted, leading to questions like "are
these things really worth while?"
• Iterators are good for two main reasons:
1) They allow multiple "iterations" to co-exist on
the same underlying object
2) They can tailor the implementation of the
iteration to the underlying data structure,
without requiring the client to know it
246
Lecture 21: Iterators
1) Multiple "co-existing" iterations
Consider the following situation:
• We have a set of data and we want to find the mode
of that set
– What is the mode? Statistics anyone?
• How can we do this?
– Start at the first value
> See how many times it occurs – i.e. search through the
rest of the list
– Proceed to the next value
> Do the same
– Continue all the way through, keeping track of the
value with the highest count
247
Lecture 21: Iterators
• Show on board
• Note that we have two separate "iterations" through
the list being accessed in the same code
– One is going through the list, identifying each item
– The other is counting the occurrences of that item
– Logically, they are separate, even though they are
progressing through the same list
• For a List, we can also do this with nested for loops
and the get() method
– However, the implementation of get() is very inefficient
for a linked list
> As we discussed in Example8.java
– This leads us to the next point…
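The two co-existing iterations can be sketched with the standard java.util.Iterator. This is an illustration, not the board example: the class name ModeSketch is made up, elements are assumed to be non-null, and for simplicity the inner iteration counts over the whole list rather than only the rest of it.

```java
import java.util.Iterator;
import java.util.List;

class ModeSketch {
    /** Returns a most frequently occurring item, or null for an empty list. */
    static <T> T mode(List<T> list) {
        T best = null;
        int bestCount = 0;
        Iterator<T> outer = list.iterator();        // identifies each item
        while (outer.hasNext()) {
            T candidate = outer.next();
            int count = 0;
            Iterator<T> inner = list.iterator();    // counts its occurrences
            while (inner.hasNext()) {
                if (candidate.equals(inner.next())) {
                    count++;
                }
            }
            if (count > bestCount) {
                bestCount = count;
                best = candidate;
            }
        }
        return best;
    }
}
```

Each iterator keeps its own position, so the inner loop can restart from the beginning without disturbing where the outer loop is.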
248
Lecture 21: Iterators
2) Tailor the implementation to the data
structure
Consider again Example8.java
• When printing out either the AList or the LList, we
use get(i) to get the next item
• For the AList this is fine, since we have direct access
to the locations
• However, for the LinkedList this is TERRIBLE
– get(0) – 1 operation
– get(1) – 2 operations
– get(2) – 3 operations
…
249
Lecture 21: Iterators
As we discussed, this gives us 1 + 2 + 3 + …
• Result is O(N^2) for list of size N
Why is it so poor for a linked list?
• Each get() operation restarts at the beginning of the list
What if we could "remember" where we stopped
the last time and resume from there the next
time?
An iterator tailored to a linked list can do this for
us, thereby saving a LOT of time
• Show on board
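The total for printing the whole list with get(i) can be written out as a sum, counting i operations for the i-th call as on the previous slide:

```latex
\sum_{i=1}^{N} i \;=\; 1 + 2 + 3 + \dots + N \;=\; \frac{N(N+1)}{2} \;=\; O(N^{2})
```

An iterator that remembers its position does one step per item instead, for a total of N steps.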
250
Lecture 21: Iterators
• Consider the following methods:
public boolean hasNext();
– See if there are any elements remaining to iterate through
public T next();
– Retrieve and return the next element in the sequence,
advancing the iterator by one position
public void remove();
– Remove the last item that was returned (via a call to
next()) from the underlying data structure
Consider these separate from any other
functionality that a given class might have
• So we will make them an interface
251
Lecture 21: Iterators
• Consider the Java Iterator interface:
public interface Iterator<T>
{
public boolean hasNext();
public T next();
public void remove();
}
This is a simple iterator that can be used with
most Collections
But how is this interface implemented?
• Also, where is it implemented?
– We want it to be part of a List, but how can that be done,
since List is itself an interface?
– This is a bit convoluted, so we need to consider this
carefully
252
Lecture 21: Iterators
• There are two ways we can implement this
interface:
Internally: A list includes these methods
amongst the other methods that it already has
• This solves problem 2) because we can tailor the implementation to the underlying class
• However, it does NOT solve problem 1) since we still only have one "state" available in the iteration
• Discuss
Externally: A new object is created "on top" of
the list that implements these methods
253
Lecture 21: Iterators
• We write our list classes so that each has the ability to
generate an iterator object that allows sequential
access to its elements, without violating data
abstraction
– Thus the iterator object is separate (but related to) the
underlying list that it iterates over
• Multiple iterator objects can be created for a given list,
each with its own current "state"
The external implementation will thus be
preferable and is the technique that is used in
standard Java, so we will look at this one in
more detail
254
Lecture 21: Iterators
• Note: This code depends heavily on object-oriented ideas and coding, so keep that in mind
Idea: We only add a single extra method to
our List interface:
public Iterator<T> getIterator()
This will return an iterator built on top of the
current list, but with its own "state" so that
multiple iterators can be used on one list
Let's look at that method for the linked list
implementation:
255
Lecture 21: Iterators
public Iterator<T> getIterator()
{
   return new IteratorForLinkedList();
} // end getIterator
So this method is easy – the work is in creating
the new class IteratorForLinkedList
• This class will be built on the current list and will simply have the ability to go through all of the data in the list in an efficient way
• Since it is tailored to the linked list, we can make it a private (inner) class and it can directly access our linked list instance variables
• Let's look at the details in handout
• See Example12.java and LinkedListWithIterator.java
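As a rough picture of the structure (not the contents of LinkedListWithIterator.java), here is a stripped-down list with a private inner iterator class. TinyLinkedList and its add method are hypothetical names for this sketch; getIterator, IteratorForLinkedList and firstNode follow the names used on the slides.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class TinyLinkedList<T> {
    private Node firstNode;   // head of the chain
    private Node lastNode;    // tail, so add() is O(1)

    private class Node {
        T data;
        Node next;
        Node(T data) { this.data = data; }
    }

    public void add(T item) {
        Node n = new Node(item);
        if (firstNode == null) firstNode = n;
        else lastNode.next = n;
        lastNode = n;
    }

    public Iterator<T> getIterator() {
        return new IteratorForLinkedList();
    }

    // Inner class: it can read firstNode directly, and each object has its own state
    private class IteratorForLinkedList implements Iterator<T> {
        private Node nextNode = firstNode;   // sole instance variable

        public boolean hasNext() {
            return nextNode != null;
        }

        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            T result = nextNode.data;
            nextNode = nextNode.next;        // remember where we stopped
            return result;
        }

        public void remove() {
            throw new UnsupportedOperationException("remove() not supported");
        }
    }
}
```

Calling getIterator() twice gives two independent iterators over the same nodes, and each call to next() is a single step rather than a restart from the front.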
256
Lecture 21: Iterators
Let's now focus on the implementation
• Recall that we said the iterator could be tailored to the
underlying list
• The interface is the same, but the way it is done
depends on whether the list is implemented with an
array or a linked list
The LL implementation uses a Node reference as
the sole instance variable for the iterator
• It is initialized to firstNode when the iterator is created
• It progresses down the list with each call to next()
• Note that with a single Node reference, remove() is
not possible
– Why? Discuss
257
Lecture 21: Iterators
• So what if we wanted to allow remove()?
– We would need a second reference to keep track of the
previous node in the iteration
– This is what is done in the Standard Java LinkedList
iterator
So what would we need for the array
implementation?
• Discuss
• We need only an integer to store the index of the current value in the iteration
• It is incremented with each call to next()
• remove() can be implemented
– Must shift to fill in gap
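A sketch of the array version, including remove(), is given below. TinyArrayList, its add and size methods, and the canRemove flag are assumptions for this illustration and are not the author's AList code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

class TinyArrayList<T> {
    private Object[] items = new Object[10];
    private int size;

    public void add(T item) {
        if (size == items.length) items = Arrays.copyOf(items, 2 * size);
        items[size++] = item;
    }

    public int size() { return size; }

    public Iterator<T> getIterator() {
        return new IteratorForArrayList();
    }

    private class IteratorForArrayList implements Iterator<T> {
        private int nextIndex = 0;          // index of the next item to return
        private boolean canRemove = false;  // true only right after next()

        public boolean hasNext() {
            return nextIndex < size;
        }

        @SuppressWarnings("unchecked")
        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            canRemove = true;
            return (T) items[nextIndex++];  // incremented with each call
        }

        public void remove() {
            if (!canRemove) throw new IllegalStateException();
            int removed = nextIndex - 1;    // the item last returned by next()
            for (int i = removed; i < size - 1; i++) {
                items[i] = items[i + 1];    // shift to fill in the gap
            }
            items[--size] = null;
            nextIndex = removed;            // the following item slid into this slot
            canRemove = false;
        }
    }
}
```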
258
Lecture 21: ListIterator
• The Iterator interface can be used for any
Java Collection
This includes our List<T> interface, but also
others:
• Ex: Set<T>, SortedSet<T>
For a List, we can add more functionality to our
iterator
• Basically we can traverse in both directions rather than one direction only
• Does this have any implications on our implementations?
– Singly Linked List will not support a ListIterator!
259
Lecture 21: ListIterator
public interface ListIterator<T> extends Iterator<T>
{
boolean hasNext();
T next();
boolean hasPrevious();
T previous();
int nextIndex();
int previousIndex();
void remove();
void set(T o);
void add(T o);
}
Note that this iterator is bidirectional, and it allows
objects to be added or removed
260
Lecture 21: ListIterator
As we discussed previously for Iterator, the best
way to implement a ListIterator is to
• Implement it "externally", meaning that the methods
are not part of the class being iterated upon
– We build a ListIterator object on top of our list so we
can have multiple iterations at once
• Make the class that implements the ListIterator an
inner class so that it has access to the list details
– Allows us to tailor our ListIterator to the underlying
data structure in the most efficient way
However, we need a bit more logic to handle
traversal in both directions, as well as both set()
and remove()
261
Lecture 21: ListIterator
Regarding the logic
• It is explained in great detail in the text – read it over
• Sections 15.41-15.50
• Let’s look briefly at the standard Java ArrayList
• Note that we need to keep track of the current
"direction" to allow the remove() and set() methods to
work correctly
– Ex: In order to implement remove() in an array, we
need to keep track of the index of the last item that was
returned
– See code
262
Lecture 21: ListIterator
• Another interesting issue:
The structure of iterators allows for multiple
iterations on the same underlying list
However, if we start modifying the underlying
list, we can get into a lot of problems
• If one iterator modifies the list it will affect the other,
and it could lead to an exception
Because of this, the Standard Java iterators do
not allow "concurrent modification"
• If one iterator modifies the list, other current iterators
are invalidated, and will generate an exception if used
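This behavior can be observed with the standard java.util.ArrayList. The small demo below (class and method names are made up for the example) creates two iterators, removes an item through the first, and then tries to use the second.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class ConcurrentModificationDemo {
    /** Returns true if the second iterator fails after the first one removes an item. */
    static boolean secondIteratorFails() {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);

        Iterator<Integer> it1 = list.iterator();
        Iterator<Integer> it2 = list.iterator();

        it1.next();
        it1.remove();          // modifies the list through it1; it1 stays valid

        try {
            it2.next();        // it2 detects the modification it did not make
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second iterator invalidated: " + secondIteratorFails());
    }
}
```

The Java documentation describes this check as "fail-fast" and best-effort, so it is meant for catching bugs rather than as something a program should rely on.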
263
Lecture 21: Iterable Interface
With JDK 1.5, the Iterable interface was
introduced
This is simply:
public interface Iterable<T>
{
Iterator<T> iterator();
}
So any class with an iterator can also implement
Iterable
• See MyArrayIterable.java and Example3b.java
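One practical payoff of implementing Iterable is that objects of the class can be used in a for-each loop. The sketch below is not MyArrayIterable.java; ArrayIterableSketch and sum are hypothetical names, and the anonymous iterator relies on remove() having a default implementation (Java 8 or later).

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class ArrayIterableSketch<T> implements Iterable<T> {
    private final T[] items;

    ArrayIterableSketch(T[] items) {
        this.items = items;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int nextIndex = 0;

            @Override
            public boolean hasNext() {
                return nextIndex < items.length;
            }

            @Override
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return items[nextIndex++];
            }
        };
    }

    /** Works for any Iterable, not just this class. */
    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {   // the compiler calls iterator(), hasNext(), next()
            total += v;
        }
        return total;
    }
}
```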
264
Lecture 22: Intro to Trees
• Consider the primary data structures that
we have examined so far:
ArrayList, LinkedList
• also Stack and Queue which we know superficially
and will look at more soon
All of these have been LINEAR data structures
• Data is organized such that items have a single
predecessor and a single successor
– Except first (no predecessor) and last (no successor)
• We can draw a single "line" through all elements
These data structures have worked well, but…
• Can we benefit from organizing the data differently?
265
Lecture 22: Intro to Trees
• Tree structures
In a linked list, each node had a reference to
at most one previous and one next node
• What if we allowed nodes to have references to
more than one next node?
Root Node – has no parent node
Interior Node – has a
parent and at least one
child node
Leaf Node – has no children
266
Lecture 22: Intro to Trees
• A tree is a non-linear data structure, since we cannot
draw a single line through all of the elements
Some more definitions:
• For any node V, if P = parent(V), then V = aChild(P)
• For any node V, the descendants of V are all nodes
that can be reached from V
[Figure: tree diagram labeling Parent(V), the node V, the siblings of V (all have the same parent), and the descendants of V]
267
Lecture 22: Intro to Trees
• For any node V, the subtree rooted at V is V and all
of its descendants
– From V's point of view this is a tree in itself
• Now we can define a tree recursively:
T is a tree if
1) T is empty (no nodes) – base case, or
2) T is a node with 0 children (base case) or with 1 or more children that are all trees (recursive case)
– Do example on board
268
Lecture 22: Intro to Trees
• How do we represent an arbitrary tree?
1) We can have a node with data and a linked
list of children
• Draw on board
• Note that the number of children can be arbitrary
– List could be long if node has many children
2) We can have a node with data, and two
references,
• One to left child and one to right sibling
– Draw on board
• Number of children can still be arbitrary
• Now nodes are all the same
More on arbitrary trees in CS 1501
269
Lecture 22: Binary Trees
• In many applications, we can limit the
structure of our tree somewhat
BINARY TREE
• A tree such that all nodes have 0, 1, or 2 children
Recursive definition:
T is a binary tree if
1) T is empty (base case) or
2) T is a node with the following structure:
[ left | element | right ]
– where element is some data value
– where left and right are binary trees (recursively)
270
Lecture 22: Binary Tree Properties
• Consider a binary tree with n nodes:
Height of the tree is the maximum
number of nodes from the root to any leaf
• Tree to right has a height of 6
We can also think of heights of
subtrees of trees
• Subtree rooted at X has a height
of 3
Height is an important property
• Many binary tree algorithms have runtimes proportional to the tree height
• Let's establish some bounds on height
271
Lecture 22: Binary Tree Properties
Maximum Height:
• Given a binary tree with n nodes, what is the
maximum value it could have for its height
• How would the maximum height tree look?
– Discuss and see notes below
Minimum Height:
• Given a binary tree with n nodes, what is the
minimum value it could have for its height?
• Assume for simplicity that n = 2^k - 1 for some k
• How would this minimum height tree look?
• How can we justify its height value?
– Discuss
272
Lecture 22: Binary Tree Properties
• A minimum height tree will have the maximum branching at each node
• Given n = 2^k - 1, this tree will be a Full Tree
– All interior nodes have 2 children
– All leaves are on the same, last level of the tree
> Ex: Tree on bottom of this slide is a full tree of height 3, and it has 2^3 - 1 = 7 nodes
– So how can we relate n (7) to the height (3)?
– Note the number of nodes at each level of a full tree:
Level 1: 1 node = 2^0
Level 2: 2 nodes = 2^1
Level 3: 4 nodes = 2^2
…
Level i: 2^(i-1) nodes
– The total number of nodes is the sum of the nodes
at each level
273
Lecture 22: Binary Tree Properties
– Recall that n = 2^k - 1 for some k
– Recall (from the last slide) that
n = 2^0 + 2^1 + … + 2^(h-1) for some h
> Note that h is the height of the tree, so if we can solve
for h we are done
– Thus, we get
2^k - 1 = n = 2^0 + 2^1 + … + 2^(h-1) for some h
– Using math, we know that
2^0 + 2^1 + … + 2^(h-1) = 2^h - 1 (geometric sum)
– Now we have
2^k - 1 = 2^h - 1
– Adding 1 to both sides we get
2^k = 2^h
– Taking the log2 of both sides we get
h = k
274
Lecture 22: Binary Tree Properties
• But we want the height in terms of the number of
nodes, n:
2^k - 1 = n
2^k = n + 1
k = log2(n + 1)
– So the minimum height of a binary tree with n nodes =
h = k = log2(n + 1)
• Note this is for a tree with 2^k - 1 nodes
– Binary trees can have any number of nodes – will this
change the formula?
> Not significantly
– More generally we can say that the minimum height
for a tree with n nodes is O(log2 n)
– Now we also know that a Full Tree of height h has
2^h - 1 nodes
275
Lecture 23: Binary Tree Properties
• Note that most trees CANNOT be Full Trees, since all Full Trees have 2^i - 1 nodes (1, 3, 7, 15, etc)
• However, a tree with ANY number of nodes can be a Complete Binary Tree
– A complete tree is a full tree up to the second last
level with the last level of leaves being filled in from
left to right
> If the last level is completely filled in, the tree is Full
– A Complete Binary Tree of height h has between 2^(h-1)
and 2^h - 1 nodes
– A nice property of a complete binary tree is that its
data can be efficiently stored in an array or vector
> Do demo on board
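One common way to do this, assumed here since the board demo is not in the notes, is to store the nodes level by level starting at index 0. The parent and child positions can then be computed from the index alone. The class and method names below are made up for this sketch.

```java
class CompleteTreeArraySketch {
    // Level-order layout, root at index 0 (an assumed convention)
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int parent(int i)     { return (i - 1) / 2; }   // valid for i > 0

    /** Height (number of levels) of a complete binary tree with n nodes. */
    static int height(int n) {
        int h = 0;
        while (n > 0) {   // each halving removes one level
            h++;
            n /= 2;
        }
        return h;
    }

    public static void main(String[] args) {
        int[] tree = {10, 20, 30, 40, 50};   // a complete tree with 5 nodes
        System.out.println("left child of root:  " + tree[leftChild(0)]);
        System.out.println("right child of root: " + tree[rightChild(0)]);
        System.out.println("parent of index 4:   " + tree[parent(4)]);
        System.out.println("height: " + height(tree.length));
    }
}
```

No references are stored at all, and because the last level is filled from left to right there are no gaps in the array.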
276
Lecture 23: Height of a Binary Tree
So how do we find out the height for a given
tree?
We can define this recursively as well:
• Height(T)
If T is empty, return 0
else
Let LHeight = Height of left subtree
Let RHeight = Height of right subtree
Return (1 + Max(LHeight, RHeight))
• Let's look at an example
– Tree we looked at previously in Slide 271
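The recursive Height definition above translates almost line for line into Java. The sketch below uses a bare-bones node class invented for this example (HeightSketch.Node); the course's BinaryNode class is introduced on later slides.

```java
class HeightSketch {
    /** Minimal binary tree node, for illustration only. */
    static class Node {
        int data;
        Node left, right;

        Node(int data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    /** Height counted in nodes: an empty tree has height 0. */
    static int height(Node t) {
        if (t == null) {
            return 0;                                 // T is empty
        }
        int lHeight = height(t.left);                 // height of left subtree
        int rHeight = height(t.right);                // height of right subtree
        return 1 + Math.max(lHeight, rHeight);
    }
}
```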
277
Lecture 23: Height of a Binary Tree
Trace on board
with class
278
Lecture 23: Representing a Binary Tree
• We'd like to be able to do operations on
binary trees
Implement the height that we just discussed
Traverse the tree in various ways
Find other properties
• Max or min value
• Number of nodes
• Before we can do these we need to find a
good way to represent the tree in the
computer
279
Lecture 23: Representing a Binary Tree
We'll do this in an object-oriented way, as we
did with our lists
It is a bit complicated, so we need to pay
attention to all of the steps
public interface TreeInterface<T>
{
public T getRootData();
public int getHeight();
public int getNumberOfNodes();
public boolean isEmpty();
public void clear();
}
• Note that this interface is for general trees
• Let's make it more specific for binary trees
280
Lecture 23: Representing a Binary Tree
public interface BinaryTreeInterface<T> extends
       TreeInterface<T>, TreeIteratorInterface<T>
{
   public void setTree(T rootData);
   public void setTree(T rootData,
                       BinaryTreeInterface<T> leftTree,
                       BinaryTreeInterface<T> rightTree);
}
• This simply allows for an "easy" assignment of binary trees
• We'll look at TreeIteratorInterface<T> later
• Now we have the basic functionality of a binary tree
– but we need to get the basic structure
281
Lecture 23: Representing a Binary Tree
public interface BinaryNodeInterface<T>
{
   public T getData();
   public void setData(T newData);
   public BinaryNodeInterface<T> getLeftChild();
   public BinaryNodeInterface<T> getRightChild();
   public void setLeftChild(BinaryNodeInterface<T> leftChild);
   public void setRightChild(BinaryNodeInterface<T> rightChild);
   public boolean hasLeftChild();
   public boolean hasRightChild();
   public boolean isLeaf();
   public int getNumberOfNodes();
   public int getHeight();
   public BinaryNodeInterface<T> copy();
}
• Gives the basic functionality of a node
282
Lecture 23: Representing a Binary Tree
Summary so far:
• TreeInterface
• TreeIteratorInterface
– Give the basic functionality of a tree
• BinaryTreeInterface
– Adds a couple of methods for binary trees
• BinaryNodeInterface
– Gives the basic functionality of a node in the tree
Interfaces give us the ADTs
• Now we need some classes to implement these
interfaces
283
Lecture 23: Representing a Binary Tree
Let's look at the nodes first:
class BinaryNode<T> implements BinaryNodeInterface<T>,
java.io.Serializable
{
private T data;
private BinaryNode<T> left;
private BinaryNode<T> right;
// See .java file for methods
}
• Self-referential, just as linked list nodes
– However, can now branch in two directions
• Now we can easily define a binary tree
284
Lecture 23: Representing a Binary Tree
public class BinaryTree<T> implements
BinaryTreeInterface<T>, java.io.Serializable
{
private BinaryNodeInterface<T> root;
// See .java file for methods
}
• Idea:
A BinaryTree has one instance variable – a reference to a
BinaryNodeInterface (which is implemented by a BinaryNode)
A BinaryNode has 3 instance variables
• A reference to T to store data for that node
• Left and right references to subtree nodes
• Creation by Composition
– To manipulate a BinaryTree we must manipulate its underlying nodes
285
Lecture 23: Representing a Binary Tree
We will come back to the BinaryTree<T> class
later on
For now we will look at the BinaryNode<T>
class and see how some of the operations are
done
• Finding the height, traversals, etc.
• In fact we can implement a binary tree solely with
BinaryNode<T> if we choose
– You will do this in Assignment 5
• However, we can formalize our tree better using a
separate BinaryTree<T> class so we will do that later
For now consider the BinaryNode<T> class…
286
Lecture 23: Implementing Some Operations
Ok, let's first look at code that determines the
height:
private int getHeight(BinaryNode<T> node)
{
int height = 0;
if (node != null)
height = 1 + Math.max(getHeight(node.left),
getHeight(node.right));
return height;
}
• Note that actual code is not really different from
the pseudocode we looked at in Slide 277 and
that we already traced
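The height computation can be tried on its own with a minimal sketch. HeightSketch, its Node class and sample() are hypothetical stand-ins invented for this example; they are not the BinaryNode<T> class from the text.

```java
// Minimal sketch: recursive height of a binary tree.
// HeightSketch and Node are illustrative names, not the text's classes.
class HeightSketch {
    static class Node<T> {
        T data;
        Node<T> left, right;
        Node(T data, Node<T> left, Node<T> right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    // An empty tree has height 0; otherwise 1 + the taller subtree.
    static <T> int getHeight(Node<T> node) {
        int height = 0;
        if (node != null)
            height = 1 + Math.max(getHeight(node.left), getHeight(node.right));
        return height;
    }

    // Builds a 3-level tree: 10 has children 20 and 30, and 20 has left child 40.
    static Node<Integer> sample() {
        Node<Integer> forty = new Node<>(40, null, null);
        Node<Integer> twenty = new Node<>(20, forty, null);
        Node<Integer> thirty = new Node<>(30, null, null);
        return new Node<>(10, twenty, thirty);
    }

    public static void main(String[] args) {
        System.out.println(getHeight(sample())); // prints 3
    }
}
```

Every node is reached by exactly one call, so the work is proportional to the number of nodes.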
287
Lecture 23: Implementing Some Operations
How about copying a tree?
• Copying an array or linked list is fairly simple, due to
their linear natures
• However, it is not immediately obvious how to copy a
binary tree such that the nodes are structurally the
same as the original
• Luckily, recursion again comes to the rescue!
– If we view copying the tree as a recursive process, it
becomes simple!
– To copy tree T, we simply
> Make a new node for the root and copy its data
> Recursively copy the left subtree into the left child
> Recursively copy the right subtree into the right child
288
Lecture 23: Implementing Some Operations
Let's now look at code for copy():
public BinaryNodeInterface<T> copy()
{
BinaryNode<T> newRoot = new BinaryNode<T>(data);
if (left != null)
newRoot.left = (BinaryNode<T>)left.copy();
if (right != null)
newRoot.right = (BinaryNode<T>)right.copy();
return newRoot;
} // end copy
– Note the similarities (and differences) to the
code for getHeight()
– Both are essentially traversing the entire
tree, processing the nodes as they go
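A self-contained sketch of the same copy() idea follows. CopySketch, Node, sameTree() and sample() are hypothetical names used only for illustration. As in the code above, the copy shares the data objects with the original; only the nodes are new.

```java
// Minimal sketch: recursive structural copy of a binary tree.
// CopySketch and Node are illustrative names, not the text's classes.
class CopySketch {
    static class Node<T> {
        T data;
        Node<T> left, right;
        Node(T data) { this.data = data; }

        // Make the new root first, then copy each non-empty subtree recursively.
        Node<T> copy() {
            Node<T> newRoot = new Node<>(data);
            if (left != null)
                newRoot.left = left.copy();
            if (right != null)
                newRoot.right = right.copy();
            return newRoot;
        }
    }

    // True when both trees have the same shape and equal data in each position.
    static <T> boolean sameTree(Node<T> a, Node<T> b) {
        if (a == null || b == null) return a == b;
        return a.data.equals(b.data)
            && sameTree(a.left, b.left)
            && sameTree(a.right, b.right);
    }

    static Node<Integer> sample() {
        Node<Integer> t1 = new Node<>(10);
        t1.left = new Node<>(20);
        t1.right = new Node<>(30);
        t1.left.left = new Node<>(40);
        return t1;
    }

    public static void main(String[] args) {
        Node<Integer> t1 = sample();
        Node<Integer> t2 = t1.copy();
        System.out.println(sameTree(t1, t2));   // prints true
        System.out.println(t1.left != t2.left); // prints true (distinct nodes)
    }
}
```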
289
Lecture 23: Trace of copy() method
BinaryNode<Integer> T2 = (BinaryNode<Integer>) T1.copy()
[Animated figure: T1 is a tree holding 10, 20, 30, 40 and 50. Each recursive copy() call creates newRoot for the current node (this), then fills in newRoot.left and newRoot.right, producing T2. The animation is visible only in the ppt slideshow.]
290
Lecture 23: Binary Tree Traversals
• So what about traversing itself?
Again, unlike linear structures (array, linked list)
it is not obvious
However, if we think recursively, we can still do
it in a fairly easy way:
• Consider a tree node T: [ left | Data | right ]
– I can traverse the subtree rooted at T if I
Traverse T's left subtree recursively
Visit T itself (i.e. access its data in some way)
Traverse T's right subtree recursively
291
Lecture 23: Binary Tree Traversals
There are 3 common traversals used for binary
trees
• They are all similar – the only difference is where the current node is visited relative to the recursive calls
• PreOrder(T)
if (T is not empty)
Visit T.data
PreOrder(T.left)
PreOrder(T.right)
• InOrder(T)
if (T is not empty)
InOrder(T.left)
Visit T.data
InOrder(T.right)
292
Lecture 23: Binary Tree Traversals
• PostOrder(T)
if (T is not empty)
PostOrder(T.left)
PostOrder(T.right)
Visit T.data
• Let's look at an example
– We'll traverse a tree using all 3 to see how it proceeds
and what output it generates
[Figure: example binary tree with root 50, containing the values 5, 10, 20, 30, 40, 45, 50, 80, 85, 90 and 95]
293
Lecture 23: Binary Tree Traversals
Note that in the example shown, the InOrder
traversal produces the data IN ORDER
• This is NOT ALWAYS the case – it is only true when the
data is organized in a specific way
– If the tree is a Binary Search Tree – we will see this later
The actual code for these traversals is not any
more complicated than the pseudocode
• See BinaryNode.java and Example14.java
• It uses one tree that is NOT a BST and one that is
• Note how the work is done through the recursive calls
– The run-time stack "keeps track" of where we are
• Runtime of these traversals?
– Discuss and see note below
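The three traversals can be sketched in runnable form as below. This is not the code from BinaryNode.java or Example14.java. TraversalSketch and Node are hypothetical names, and the tree shape in example() is an assumption: it is the BST obtained by inserting 50, 30, 80, 10, 40, 90, 5, 20, 45, 85, 95 in that order.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the three recursive traversals.
// TraversalSketch and Node are illustrative names, not the text's classes.
class TraversalSketch {
    static class Node {
        int data;
        Node left, right;
        Node(int data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    static void preOrder(Node t, List<Integer> out) {
        if (t != null) {
            out.add(t.data);          // visit before both subtrees
            preOrder(t.left, out);
            preOrder(t.right, out);
        }
    }

    static void inOrder(Node t, List<Integer> out) {
        if (t != null) {
            inOrder(t.left, out);
            out.add(t.data);          // visit between the subtrees
            inOrder(t.right, out);
        }
    }

    static void postOrder(Node t, List<Integer> out) {
        if (t != null) {
            postOrder(t.left, out);
            postOrder(t.right, out);
            out.add(t.data);          // visit after both subtrees
        }
    }

    static Node leaf(int d) { return new Node(d, null, null); }

    // Assumed shape of the example tree (a BST rooted at 50).
    static Node example() {
        Node ten = new Node(10, leaf(5), leaf(20));
        Node forty = new Node(40, null, leaf(45));
        Node thirty = new Node(30, ten, forty);
        Node ninety = new Node(90, leaf(85), leaf(95));
        Node eighty = new Node(80, null, ninety);
        return new Node(50, thirty, eighty);
    }

    public static void main(String[] args) {
        List<Integer> pre = new ArrayList<>();
        List<Integer> in = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        preOrder(example(), pre);
        inOrder(example(), in);
        postOrder(example(), post);
        System.out.println(pre);  // [50, 30, 10, 5, 20, 40, 45, 80, 90, 85, 95]
        System.out.println(in);   // [5, 10, 20, 30, 40, 45, 50, 80, 85, 90, 95]
        System.out.println(post); // [5, 20, 10, 45, 40, 30, 85, 95, 90, 80, 50]
    }
}
```

Each traversal makes one call per node (plus one per null reference), so all three run in O(N) time for a tree of N nodes.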
294
Lecture 24: Binary Tree Traversals
Note again how the traversals, getHeight() and
copy() are all similar
• In fact all of these methods are traversing the tree
• They differ in the order (pre, in, post) and what is done at each node as it is visited
• For example:
– getHeight() can be thought of as a postorder traversal,
since we have to get the height of both subtrees before
we know the height of the root
– copy() is actually a combo of all 3 orderings
> The root node is created preorder
> The left child is assigned inorder
> The right child is assigned postorder
295
Lecture 24: Binary Tree Traversals
Can these traversals be done iteratively?
• Yes but now we need to "keep track" of where we
are ourselves
• We do this by using our own stack of references
– The idea is that the "top" BinaryNode reference on our
stack is the one we are currently accessing
• This works but it is MUCH MORE COMPLICATED than the recursive version
• The author uses the iterative versions of these traversals to implement iterators of binary trees
– We will see how much harder these are to do
iteratively
– However, we can't use the recursive version for an
iterator, since it needs to proceed incrementally
296
Lecture 24: Binary Search Trees
• Binary Trees are nice, but how can we use
them effectively as data structures?
One way is to organize the data in the tree in a
special way, to create a binary search tree (BST)
• A BST is a binary tree such that, for each node in the
tree
– All data in the left subtree of that node is less than the
data in that node
– All data in the right subtree of that node is greater than
the data in that node
> Note that this definition does not allow for duplicates. If
we want to allow duplicates we should add "or equal to"
to one of the above lines (but not both)
297
Lecture 24: Binary Search Trees
Naturally, we can also define BSTs
recursively:
• A binary tree, T, is a BST if either
1) T is empty (base case) or
2) T is a node with the following structure: [ left | data | right ]
> where all values in the tree rooted at left are less
than data
> where all values in the tree rooted at right are
greater than data
> where left and right are BSTs
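The recursive definition can be turned into a checker. The sketch below is one common way to do it, not code from the text: instead of rescanning each subtree, it passes down exclusive lower and upper bounds inherited from the ancestors. BstCheckSketch, Node and the bound parameters are hypothetical names.

```java
// Sketch: checking the BST property recursively (no duplicates allowed).
// BstCheckSketch and Node are illustrative names, not the text's classes.
class BstCheckSketch {
    static class Node {
        int data;
        Node left, right;
        Node(int data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isBST(Node t) {
        return isBST(t, null, null);
    }

    // low and high are exclusive bounds set by ancestors; null means "no bound".
    static boolean isBST(Node t, Integer low, Integer high) {
        if (t == null) return true;                      // base case: empty tree
        if (low != null && t.data <= low) return false;  // must exceed the lower bound
        if (high != null && t.data >= high) return false; // must stay below the upper bound
        // Everything on the left must be < t.data, everything on the right > t.data.
        return isBST(t.left, low, t.data) && isBST(t.right, t.data, high);
    }

    static Node leaf(int d) { return new Node(d, null, null); }

    public static void main(String[] args) {
        Node good = new Node(50, new Node(30, leaf(10), leaf(40)), leaf(80));
        // 60 is a legal right child of 30 locally, but it sits in the left subtree of 50.
        Node bad = new Node(50, new Node(30, leaf(10), leaf(60)), leaf(80));
        System.out.println(isBST(good)); // prints true
        System.out.println(isBST(bad));  // prints false
    }
}
```

The second tree shows why comparing a node only with its own children is not enough: the condition applies to all values in each subtree.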
298
Lecture 24: Binary Search Trees
[Figure: three example binary trees, two labeled BST and one labeled NOT A BST]
299
Lecture 24: BST Interface
• Let's back up a bit now
We haven't defined the BST ADT yet (i.e. the
methods that make up a BST):
Actually, the text defines a more general
SearchTreeInterface, which our BST will
implement:
public boolean contains(T entry)
– Is an entry in the tree or not?
public T getEntry(T entry)
– Find and return an entry that "equals" the param entry
> If the key matches return the object; otherwise return
null
300
Lecture 24: BST Interface
public T add(T newEntry)
– Add a new entry into the tree
> New object is put into its appropriate location, keeping
the search property of the tree intact
> If an object matching newEntry is already present in the
tree, replace it and return the old object
> What if we don't want to replace it? Implications?
public T remove(T entry)
– Remove entry from the tree and return it if it exists;
otherwise return null
public Iterator<T> getInorderIterator()
– Return an iterator that will allow us to go through the
items sequentially from smallest to largest
> Go back and look at Iterator<T> interface
301
Lecture 24: BST Search
Before we discuss the implementation details
• Let's get the feel for the structure by seeing how we
would do the getEntry(T entry) method
• Consider a recursive approach (naturally):
– What is our base case (or cases)?
> If tree is empty – not found
> else if key matches node value -- found
– What are our recursive cases?
> If key < node value, search left subtree
> else if key > node value, search right subtree
– How do we use our recursive results to determine our
overall results?
> Simply pass result from recursive call on
> Trace an example
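The cases listed above map directly onto a short recursive method. This is a sketch under stated assumptions, not the text's BinarySearchTree.java: BstSearchSketch, Node, findEntry() and sample() are hypothetical names, and the sample tree is the full BST holding 10, 30, 40, 50, 70, 80, 90.

```java
// Sketch: recursive search of a BST.
// BstSearchSketch and Node are illustrative names, not the text's classes.
class BstSearchSketch {
    static class Node<T> {
        T data;
        Node<T> left, right;
        Node(T data) { this.data = data; }
    }

    // Returns the stored entry that compares equal to key, or null if absent.
    static <T extends Comparable<? super T>> T findEntry(Node<T> node, T key) {
        if (node == null) return null;            // base case: empty tree, not found
        int cmp = key.compareTo(node.data);
        if (cmp == 0) return node.data;           // base case: found
        if (cmp < 0) return findEntry(node.left, key);  // key < node value
        return findEntry(node.right, key);              // key > node value
    }

    // Full BST: 50 at the root, 30 and 80 below it, then 10, 40, 70, 90.
    static Node<Integer> sample() {
        Node<Integer> root = new Node<>(50);
        root.left = new Node<>(30);
        root.right = new Node<>(80);
        root.left.left = new Node<>(10);
        root.left.right = new Node<>(40);
        root.right.left = new Node<>(70);
        root.right.right = new Node<>(90);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(findEntry(sample(), 40)); // prints 40
        System.out.println(findEntry(sample(), 45)); // prints null
    }
}
```

The result of each recursive call is passed straight back to the caller, so only one path from the root is ever followed.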
302
Lecture 24: BST Search vs. Sorted Array Search
• Notice the similarity between this algorithm and the
binary search of a sorted array
– This is NOT coincidental!
– In fact, if we have a full binary tree, and we have the
same data in an array, both data structures would
search for an item following the exact same steps
> Let's look for item 45 in both data structures:
Sorted array (steps 1, 2, 3 probe indexes 3, 1, 2):
index: 0 1 2 3 4 5 6
value: 10 30 40 50 70 80 90
Full BST (steps 1, 2, 3 visit 50, 30, 40):
50
30 80
10 40 70 90
303
Lecture 24: BST Search vs. Sorted Array Search
• In the case of the array, 45 is "not found" between 40 and 50, since there are no actual items between 40 and 50
• In the case of the BST, 45 is "not found" in the right child of 40, since the right child does not exist
• Both are base cases of a recursive algorithm
– Same runtimes since the height of a full tree is O(log2n)
Immediately, we see an advantage of the BST
over the LinkedList
• Although access to nodes requires references to be followed, the tree structure improves our search time from O(n) to O(log2n)
• Ok, now is a BST also an improvement over the array?
304
Lecture 24: BST Implementation
• To answer that question, we need to look at
some more operations
• Let's first look more at the BST structure
• BST Implementation
We will use the BinaryTree as the basis
We can implement it either recursively or
iteratively
• We'll look at both versions
public class
BinarySearchTree<T extends Comparable<? super T>>
extends BinaryTree<T> implements
SearchTreeInterface<T>, java.io.Serializable
305
Lecture 24: BST Implementation
We will concentrate on four things:
• getEntry() method
– contains() can be easily derived from getEntry()
• add() method
• remove() method
• getInorderIterator() method
These provide the basic functionality of a Binary
Search Tree:
• Finding an object within the tree
• Adding a new object to the tree
• Removing an object from the tree
• Traversing the tree to view all objects
306
Lecture 24: BST Implementation
getEntry()
• We already discussed the idea of this method in a
recursive way
• Now let's look at the actual code and trace it
• See recursive BinarySearchTree.java
• See iterative BinarySearchTree.java
– Note how iterations of the loop correspond to recursive
calls
• See how contains() is easily derived
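An iterative search might look like the sketch below, where each pass through the loop plays the role of one recursive call. It is an illustration only, not the iterative BinarySearchTree.java; BstIterativeSearchSketch, Node and sample() are hypothetical names.

```java
// Sketch: iterative BST search, with contains() derived from getEntry().
// BstIterativeSearchSketch and Node are illustrative names.
class BstIterativeSearchSketch {
    static class Node<T> {
        T data;
        Node<T> left, right;
        Node(T data) { this.data = data; }
    }

    static <T extends Comparable<? super T>> T getEntry(Node<T> root, T key) {
        Node<T> current = root;
        while (current != null) {                 // null plays the "empty tree" base case
            int cmp = key.compareTo(current.data);
            if (cmp == 0) return current.data;    // found
            current = (cmp < 0) ? current.left : current.right; // one step down
        }
        return null;                              // fell off the tree: not found
    }

    static <T extends Comparable<? super T>> boolean contains(Node<T> root, T key) {
        return getEntry(root, key) != null;
    }

    static Node<String> sample() {
        Node<String> root = new Node<>("m");
        root.left = new Node<>("d");
        root.right = new Node<>("t");
        root.left.right = new Node<>("h");
        return root;
    }

    public static void main(String[] args) {
        System.out.println(contains(sample(), "h")); // prints true
        System.out.println(contains(sample(), "z")); // prints false
    }
}
```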
307
Lecture 24: BST Implementation
add()
• This one is more complicated
• Special case if tree is empty, since we need to create
a root node
• Otherwise, we call addEntry(), which proceeds much
like getEntry()
– However, we have more to consider. Consider
possibilities at current node (call it temp):
1) New data is equal to temp.data
– Store old value, assign new value and return old value
2) New data is less than temp.data
– If temp has a left child, go to it
– else add a new node with the new data as the left child
of temp
308
Lecture 24: BST Implementation
3) New data is greater than temp.data
– If temp has a right child, go to it
– else add a new node with the new data as the right
child of temp
• Of course, the actual code is trickier than the pseudocode above
• Let's trace the recursive version to see how it works
• See recursive version of BinarySearchTree.java
• One interesting difference from getEntry()/findEntry()
– The base case for addEntry() must be at an actual node
– We cannot go all the way to a null reference, since we must link the new node to an existing node
– If we go to null we have nothing to link the new node to
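The three possibilities at the current node can be sketched as follows. This is a simplified illustration, not the text's BinarySearchTree.java: BstAddSketch, its private Node class and inorder() are hypothetical names, and the tree stores nodes directly rather than extending BinaryTree<T>. Note that the recursion stops at an actual node, never at null.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: recursive add() for a BST that replaces a matching entry.
// BstAddSketch and Node are illustrative names, not the text's classes.
class BstAddSketch<T extends Comparable<? super T>> {
    private static class Node<E> {
        E data;
        Node<E> left, right;
        Node(E data) { this.data = data; }
    }

    private Node<T> root;

    // Returns the entry that was replaced, or null if newEntry was not present.
    public T add(T newEntry) {
        if (root == null) {                  // special case: empty tree needs a root
            root = new Node<>(newEntry);
            return null;
        }
        return addEntry(root, newEntry);
    }

    private T addEntry(Node<T> node, T newEntry) {
        int cmp = newEntry.compareTo(node.data);
        if (cmp == 0) {                      // 1) equal: replace, return old value
            T old = node.data;
            node.data = newEntry;
            return old;
        }
        if (cmp < 0) {                       // 2) less: go left, or link a new left child
            if (node.left != null) return addEntry(node.left, newEntry);
            node.left = new Node<>(newEntry);
            return null;
        }
        if (node.right != null)              // 3) greater: go right, or link a new right child
            return addEntry(node.right, newEntry);
        node.right = new Node<>(newEntry);
        return null;
    }

    // Inorder listing, included only so the result can be inspected.
    public List<T> inorder() {
        List<T> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    private void inorder(Node<T> n, List<T> out) {
        if (n != null) {
            inorder(n.left, out);
            out.add(n.data);
            inorder(n.right, out);
        }
    }

    public static void main(String[] args) {
        BstAddSketch<Integer> tree = new BstAddSketch<>();
        for (int v : new int[] {50, 30, 80, 10, 40, 90, 5, 20, 45, 85, 95}) tree.add(v);
        System.out.println(tree.add(25)); // prints null (25 was not present)
        System.out.println(tree.add(25)); // prints 25 (old entry replaced)
    }
}
```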
309
Lecture 24: BST Recursive addEntry() Method
Adding 25 to the BST
[Animated trace; the run-time stack goes downward, one addEntry() call per node:]
rootNode = 50: 25<50, go left
rootNode = 30: 25<30, go left
rootNode = 10: 25>10, go right
rootNode = 20: 25>20, right null, so 25 is added as the right child of 20
310
Lecture 24: BST add() Method
This is elegant, but it still (obviously) requires
many calls of the method
• As we know, this adds overhead to the algorithm
If we do the process iteratively, this overhead
largely goes away
• See iterative version
• Trace
• As with findEntry(), since the recursive calls are
"either" "or" but not both, the iteration is very simple
and actually preferred over the recursion
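A loop-based add() could be sketched as below. It is not the iterative version from the text; BstIterativeAddSketch, Node, size() and height() are hypothetical names, with size() and height() included only to make the result observable.

```java
// Sketch: iterative add() for a BST that replaces a matching entry.
// BstIterativeAddSketch and Node are illustrative names.
class BstIterativeAddSketch<T extends Comparable<? super T>> {
    private static class Node<E> {
        E data;
        Node<E> left, right;
        Node(E data) { this.data = data; }
    }

    private Node<T> root;
    private int size;

    // Returns the entry that was replaced, or null if newEntry was not present.
    public T add(T newEntry) {
        if (root == null) {
            root = new Node<>(newEntry);
            size++;
            return null;
        }
        Node<T> temp = root;
        while (true) {                       // each pass replaces one recursive call
            int cmp = newEntry.compareTo(temp.data);
            if (cmp == 0) {
                T old = temp.data;
                temp.data = newEntry;
                return old;
            }
            if (cmp < 0) {
                if (temp.left == null) {     // stop at an actual node and link here
                    temp.left = new Node<>(newEntry);
                    size++;
                    return null;
                }
                temp = temp.left;
            } else {
                if (temp.right == null) {
                    temp.right = new Node<>(newEntry);
                    size++;
                    return null;
                }
                temp = temp.right;
            }
        }
    }

    public int size() { return size; }

    public int height() { return height(root); }

    private int height(Node<T> n) {
        return (n == null) ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    public static void main(String[] args) {
        BstIterativeAddSketch<Integer> tree = new BstIterativeAddSketch<>();
        tree.add(3);
        tree.add(1);
        tree.add(2);
        System.out.println(tree.size());   // prints 3
        System.out.println(tree.height()); // prints 3
    }
}
```

Because only one of the two branches is ever taken at each node, the loop needs no stack of its own.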
311
Lecture 24: BST remove() Method
remove()
– Idea is simple: 1) Find the node and, 2) Delete it
– However, it is much trickier than add – why?
– Unlike add(), which is always at a leaf, the remove()
operation could remove an arbitrary node
> Depending upon where that node is, this could be a
problem
> Let's look at 3 cases, and discuss the differences
between them
[Figure: the three cases – node is a leaf, node has 1 child, node has 2 children]
312
Lecture 24: BST remove() Method
1) Node is a leaf
• This one is easy – simply set its parent's appropriate
child reference to null (so we need a ref. to parent)
• Garbage collector takes care of the rest
2) Node has one child
• Still not so bad…in fact this looks a lot like what?
• Deleting a node from a linked list
– Set parent's child reference to node's child reference
3) Node has two children
• This one is tricky!
• Why -- only one reference coming in but two going
out
313
Lecture 24: BST remove() Method
• So to actually delete the node would require significant reorganization of the tree
• But do we really even need to delete the NODE?
– No, we need to delete the DATA
– Perhaps we can accomplish this while leaving the node
itself where it is
• How?
– Recall that what is important about a BST is the BST
Property (i.e. the ordering)
– The shape is irrelevant (except for efficiency concerns,
which we will discuss next)
– So perhaps we can move data from another node into
the node whose value we want to delete
> Perhaps the other node will be easier to delete
314
Lecture 25: BST remove() Method
• How do we choose this node?
– Consider an inorder traversal of the tree
– We could substitute the value directly before (inorder
predecessor) or the value directly after (inorder
successor)
• How to find this node?
– Consider inorder predecessor – it is the largest value
that is less than the current value
– So we go to the left one node, then right as far as we
can
• What if this node also has two children?
– Will not ever – since we know by how we found it that
it has no right child
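The three cases, including the inorder-predecessor substitution, can be sketched compactly. Note the swapped-in technique: this sketch is recursive and returns the (possibly new) root of each subtree, which avoids tracking a parent reference; it is not the iterative version the text uses. BstRemoveSketch, Node, insert(), build() and inorder() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: BST removal using the inorder predecessor for the two-child case.
// BstRemoveSketch and Node are illustrative names, not the text's classes.
class BstRemoveSketch {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Returns the root of the subtree after key has been removed (if present).
    static Node remove(Node node, int key) {
        if (node == null) return null;                    // key not in tree
        if (key < node.data) {
            node.left = remove(node.left, key);
        } else if (key > node.data) {
            node.right = remove(node.right, key);
        } else if (node.left == null) {
            return node.right;                            // leaf, or only a right child
        } else if (node.right == null) {
            return node.left;                             // only a left child
        } else {
            Node pred = node.left;                        // go left one node...
            while (pred.right != null) pred = pred.right; // ...then right as far as we can
            node.data = pred.data;                        // overwrite data, keep the node
            node.left = remove(node.left, pred.data);     // predecessor has no right child
        }
        return node;
    }

    static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.data) node.left = insert(node.left, key);
        else if (key > node.data) node.right = insert(node.right, key);
        return node;
    }

    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    static List<Integer> inorder(Node n, List<Integer> out) {
        if (n != null) {
            inorder(n.left, out);
            out.add(n.data);
            inorder(n.right, out);
        }
        return out;
    }

    public static void main(String[] args) {
        Node root = build(50, 30, 80, 10, 40, 90, 5, 20, 45, 85, 95, 25);
        root = remove(root, 30);            // 30 has two children; predecessor is 25
        System.out.println(root.left.data); // prints 25
        System.out.println(inorder(root, new ArrayList<>()));
        // prints [5, 10, 20, 25, 40, 45, 50, 80, 85, 90, 95]
    }
}
```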
315
Lecture 25: BST remove() Method
• Let's look at the code to see how this is done
– We'll look at the iterative version
– Recursive version works, but due to the same issues
we discussed for add(), we will prefer the iterative
• Note that the code looks fairly tricky, but in reality
we are just going down the tree one time, then
changing some references
– A lot of the complexity of the code is due to the
author's object-oriented focus
316
Lecture 25: Deleting a Node with 2 Children from a BST
[Animated figure: BST rooted at 50 containing 5, 10, 20, 25, 30, 40, 45, 80, 85, 90 and 95; node 30 is deleted]
• 30 is found
• It has two children
• Find inorder predecessor
– Go left
– Go right until null
• Overwrite current node with inorder predecessor (25)
• Delete inorder predecessor
317
Lecture 25: BST getInorderIterator() Method
getInorderIterator()
• As we discussed previously, this will be a step-by-step inorder traversal of the tree
• It is done iteratively so that we can pause
indefinitely after each item is returned
• Still the logic is much less clear than for the
recursive traversals
• This method is implemented in the BinaryTree
class, so we don't have to add anything for
BinarySearchTree
– See BinaryTree.java
318
Lecture 25: BST getInorderIterator() Method
• What data and methods do we need?
– Method simply returns an instance of private
InorderIterator object
– Recall the methods we need for an iterator()
> hasNext() – is there an item left in the iteration?
> next() – return the next item in the iteration
– We also need some instance variables
> To mimic the behavior of the run-time stack, we will use
our own Stack object
> Plus we need a BinaryNode to store the current node
• How will it work?
– Think about behavior of inorder traversal
– We need to duplicate this iteratively
319
Lecture 25: BST getInorderIterator() Method
• Initially (in the constructor), set the currentNode to the root
• For each call of next()
– Go left from currentNode (initially the root) as far as we can, pushing all nodes
onto the stack
– Top of the stack will be the next value in the iteration
(nextNode)
– Then set the currentNode to the right child of nextNode
> After nextNode we should traverse its right subtree
> That is what currentNode now represents
> It could be null – in this case the previous node had no
right subtree, and we backtrack
• Let's trace this execution
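The steps above can be sketched as a small iterator. This is not the InorderIterator from BinaryTree.java: InorderIteratorSketch, Node and example() are hypothetical names, java.util.ArrayDeque is used here as the stack (an assumption of this sketch), and the example tree shape is the BST built by inserting 50, 30, 80, 10, 40, 90, 5, 20, 45, 85, 95 in that order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch: iterative inorder iterator driven by an explicit stack.
// InorderIteratorSketch and Node are illustrative names, not the text's classes.
class InorderIteratorSketch {
    static class Node<T> {
        T data;
        Node<T> left, right;
        Node(T data, Node<T> left, Node<T> right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    static class InorderIterator<T> implements Iterator<T> {
        private final Deque<Node<T>> nodeStack = new ArrayDeque<>();
        private Node<T> currentNode;

        InorderIterator(Node<T> root) { currentNode = root; }

        public boolean hasNext() {
            return !nodeStack.isEmpty() || currentNode != null;
        }

        public T next() {
            // Go left as far as we can, pushing every node we pass.
            while (currentNode != null) {
                nodeStack.push(currentNode);
                currentNode = currentNode.left;
            }
            if (nodeStack.isEmpty()) throw new NoSuchElementException();
            Node<T> nextNode = nodeStack.pop();  // top of stack is next in the iteration
            currentNode = nextNode.right;        // may be null: we then backtrack via the stack
            return nextNode.data;
        }
    }

    static Node<Integer> leaf(int d) { return new Node<Integer>(d, null, null); }

    // Assumed shape of the example tree (a BST rooted at 50).
    static Node<Integer> example() {
        Node<Integer> ten = new Node<Integer>(10, leaf(5), leaf(20));
        Node<Integer> forty = new Node<Integer>(40, null, leaf(45));
        Node<Integer> thirty = new Node<Integer>(30, ten, forty);
        Node<Integer> ninety = new Node<Integer>(90, leaf(85), leaf(95));
        Node<Integer> eighty = new Node<Integer>(80, null, ninety);
        return new Node<Integer>(50, thirty, eighty);
    }

    public static void main(String[] args) {
        Iterator<Integer> it = new InorderIterator<Integer>(example());
        while (it.hasNext()) System.out.print(it.next() + " ");
        // prints 5 10 20 30 40 45 50 80 85 90 95
    }
}
```

Because all state lives in nodeStack and currentNode, the caller can pause for as long as it likes between calls to next().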
320
Lecture 25: BST getInorderIterator() Method
[Animated figure: trace of the inorder iterator on the BST rooted at 50, showing nodeStack, currentNode and nextNode. The trace is only partially shown (up to 40).]
321
Lecture 25: BST Run-times
• So how long will getEntry() (and contains()),
add() and remove() take to run?
It is clear that they are all proportional in run-time
to the height of the tree
So if the BST is balanced
• getEntry(), add() and remove() will all be O(log2N)
If the BST is very unbalanced
• getEntry(), add() and remove() will all be O(N)
Given normal use, the tree tends to stay balanced
• However, it could be unbalanced if the data is inserted
in a particular way
– Ex: If we do add()s of sorted data from a file
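The effect of insertion order on height is easy to observe. The sketch below uses a plain unbalanced BST insert; BstBalanceSketch, Node and heightAfterInserting() are hypothetical names chosen for this example.

```java
// Sketch: the same 7 keys give very different heights depending on insertion order.
// BstBalanceSketch and Node are illustrative names.
class BstBalanceSketch {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Plain BST insert with no rebalancing; duplicates are ignored.
    static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.data) node.left = insert(node.left, key);
        else if (key > node.data) node.right = insert(node.right, key);
        return node;
    }

    static int height(Node n) {
        return (n == null) ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int heightAfterInserting(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        // Sorted input: every key becomes a right child, so the tree is a chain.
        System.out.println(heightAfterInserting(1, 2, 3, 4, 5, 6, 7)); // prints 7
        // Middle-first input: the tree is full.
        System.out.println(heightAfterInserting(4, 2, 6, 1, 3, 5, 7)); // prints 3
    }
}
```

With sorted input the tree degenerates into what is effectively a linked list, which is where the O(N) run-times come from.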
322
Lecture 25: BST Run-times
Thus, in the AVERAGE CASE, a BST gives us
O(log2N) for Find, Insert and Delete
In the WORST CASE, BST gives us O(N) for
Find, Insert and Delete
• So how does a BST compare to a Sorted
array or ArrayList?
Recall that a sorted array gives us (average)
• O(log2N) to find an item using binary search
• O(N) to add or remove an item (due to shifting)
Thus, in the average case, BST is better for
Insert and Delete and about the same for Find
323
Lecture 25: Balanced BSTs
• "On average", a BST will remain balanced
But it is possible for it to become unbalanced,
yielding worst case run-times
• Can we guarantee that the tree remains
balanced?
Yes, for example the AVL Tree (Chapter 27)
• When Inserts or Deletes are done, nodes may be
"rotated" to ensure that the tree remains balanced
However, these rotations add overhead to the
operations
• If we time the operations, on average it is actually
slower than the regular BST
324
Lecture 25: Queues
• Queue
Data is added to the end and removed from
the front
Logically the items other than the front item
cannot be accessed
• Think of a bowling ball return lane
– Balls are put in at the end and removed from the front,
and you can only see / remove the front ball
Fundamental Operations
• enqueue an item to the end of the queue
• dequeue an item from the front of the queue
• front – look at the front item without disturbing it
325
Lecture 25: Queues
• A Queue organizes data by First In First Out, or FIFO (or LILO – Last In Last Out)
• Like a Stack, a Queue is a simple but
powerful data structure
Used extensively for simulations
• Many real life situations are organized in FIFO, and
Queues can be used to simulate these
• Allows problems to be developed and analyzed on the
computer, saving time and money
326
Lecture 25: Queues
Ex: A bank wants to determine how best to
set up its lines to the tellers:
• Option 1: Have a separate line for each teller
• Option 2: Have a single line, with the customer at the front going to the next available teller
• How can we determine which will have better results?
– We can try each one for a while and measure
> Obviously this will take time and may create some
upset customers
– We can simulate each one using reasonable data and
compare the results
Other (often more complex) problems can also
be solved through simulation
327
Lecture 25: Queues
• Queue Implementation?
We need a structure that has access to both the
front and the rear
We'd like both enqueue and dequeue to be O(1)
operations
We have two basic approaches:
• Use a linked-list based implementation
• Use an array based implementation
Let's consider each one
328
Lecture 26: Queues
• Queue using a Linked List
This implementation is fairly straightforward as
long as we have a doubly linked list or access
to the front and rear of the list
• enqueue simply adds a new object to the end of the list
• dequeue simply removes an object from the front of the list
• Other operations are also simple
• We can build our Queue from a LinkedList object, making the implementation even simpler
– This is more or less done in the JDK
– See Queue.java, LinkedList.java
329
Lecture 26: Queues
• Note that Queue is an interface
• The LinkedList class implements Queue (among other
things)
– Note that in one way this is a good use of interfaces as
ADTs
– Even though LinkedList can do a lot more than just the
Queue operations, if we use a Queue reference to the
object, we restrict it to the Queue operations
> Compare this to Stack, which was implemented as a class
The text author also uses an interface, but
implements the Queue from scratch
• See LinkedQueue.java from text
– Linked list with front and rear references is used
330
Lecture 26: Queues
• Are there other linked options?
Recall from Slide 133 when we looked at linked
lists, we considered a circular linked list
• The extra link gives us all the functionality we need
for a Queue
– enqueue?
newNode = new Node(newEntry, lastNode.next);
lastNode.next = newNode;
lastNode = newNode;
– dequeue?
frontNode = lastNode.next;
lastNode.next = frontNode.next;
return frontNode;
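A fuller sketch of the circular linked queue, including the empty and single-node cases that the fragments above leave out, might look like this. CircularLinkedQueueSketch and its Node class are hypothetical names; unlike the fragment, dequeue() here returns the data rather than the node.

```java
import java.util.NoSuchElementException;

// Sketch: queue built on a circular singly linked list with one reference.
// CircularLinkedQueueSketch and Node are illustrative names.
class CircularLinkedQueueSketch<T> {
    private static class Node<E> {
        E data;
        Node<E> next;
        Node(E data, Node<E> next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node<T> lastNode; // lastNode.next is the front of the queue

    public boolean isEmpty() { return lastNode == null; }

    public void enqueue(T newEntry) {
        if (lastNode == null) {
            lastNode = new Node<>(newEntry, null);
            lastNode.next = lastNode;               // a single node points to itself
        } else {
            Node<T> newNode = new Node<>(newEntry, lastNode.next); // new rear points to front
            lastNode.next = newNode;                // old rear links to the new rear
            lastNode = newNode;
        }
    }

    public T dequeue() {
        if (lastNode == null) throw new NoSuchElementException("queue is empty");
        Node<T> frontNode = lastNode.next;
        if (frontNode == lastNode) lastNode = null; // removed the only node
        else lastNode.next = frontNode.next;        // bypass the old front
        return frontNode.data;
    }

    public static void main(String[] args) {
        CircularLinkedQueueSketch<Integer> q = new CircularLinkedQueueSketch<>();
        q.enqueue(1);
        q.enqueue(2);
        q.enqueue(3);
        System.out.println(q.dequeue()); // prints 1
        System.out.println(q.dequeue()); // prints 2
    }
}
```

Both operations touch a fixed number of references, so each is O(1).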
331
Lecture 26: Queues
The text takes this notion one step further:
• Logically, enqueue and dequeue are as we expect
• However, when we dequeue, rather than removing
the node (and allowing it to be garbage collected), we
instead just "deallocate it" ourselves
– This way we save some overhead of creating new
nodes all the time
• We keep two references: queueNode and freeNode
– queueNode is the front of the queue
> This will be the next node dequeued
– freeNode is the rear of the queue
> This will be the next node enqueued – if none left we will
then create a new node
• This can be useful in a language such as C++ that
does not have garbage collection
332
Lecture 26: Queues
• Queue using an array
Arrays that we have seen so far can easily add at
the end, so enqueue is not a problem
• Can clearly be done in O(1) time
• We may have to resize, but we know how to do that too
However, removing from the front is trickier
• In ArrayList, removing from the front causes the
remaining objects to be shifted forward
– This gives a run-time of O(N), not O(1) as we want
• So we will not use an ArrayList
– Instead we will work directly with an array to implement
our Queue
333
Lecture 26: Queues
How can we make dequeue an O(1) operation?
• What if the front of the Queue could "move" – not
necessarily be at index 0?
– We would then keep a head index to tell us where
the front is (and a tail index to tell where the end is)
• Ok…so now we can enqueue at the rear by
incrementing the tail index and putting the new
object in that location and we can dequeue in the
front by simply returning the head value and
incrementing the head index
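[Figure: array-based queue holding 30, 80, 60, 40 and 70, with head index H marking the front and tail index T marking the rear]
334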
Lecture 26: Queues
This implementation will definitely work, but it
has an important drawback:
• Both enqueue and dequeue increment index values
• Once we increment front past a location, we never use that location again
• Thus, as the queue is used the data migrates toward the end of the array
• Clearly this is wasteful in terms of memory
What can we do to fix this problem?
• We need a way to reclaim the locations at the front
of the array without spending too much time
– So shifting is not a good idea
– Any ideas?
335
Lecture 26: Queues
How about proceeding down the array as we did
before, but when we get to the end, we wrap
around back to the beginning
We call this a circular queue, since we use the
array locations in a circular way
[Figure: circular queue before enqueue of 80 and after enqueue of 80, with head index H and tail index T marked]
336
Lecture 26: Queues
How can this be done?
Actually it is quite simple
• When we increment the front and rear index values we
do so mod the array length, or
backIndex = (backIndex + 1) % queue.length;
queue[backIndex] = newEntry;
• As long as backIndex+1 is less than queue.length, the result is a normal increment
• However, once backIndex+1 == queue.length, taking the mod will result in 0, returning us to the beginning of the array
One remaining question: how do we know if the
queue is empty or full?
337
Lecture 26: Queues
• Both indexes move throughout the array
– Show example on board
– front == (back+1) % queue.length when array is full or
empty
• One easy solution is to keep track of the size with an extra instance variable
• Text doesn't want to do that (even though the size of a queue is often needed)
– Rather, they keep one location in the array empty, even
if the queue is full
> Array is full when front == (back + 2) % queue.length
> Empty when front == (back + 1) % queue.length
– I don’t know why they do it this way!!!!!!
Let's look at some more code
• See ArrayQueue.java
338
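The "one unused location" convention can be sketched as follows. This is an illustration written for these notes, not the text's ArrayQueue.java: ArrayQueueSketch is a hypothetical name, the capacity is fixed (no resizing), and a full queue throws an exception.

```java
import java.util.NoSuchElementException;

// Sketch: fixed-capacity circular array queue that keeps one slot unused.
// ArrayQueueSketch is an illustrative name, not the text's ArrayQueue.
class ArrayQueueSketch<T> {
    private final T[] queue;
    private int frontIndex;
    private int backIndex;

    @SuppressWarnings("unchecked")
    ArrayQueueSketch(int capacity) {
        queue = (T[]) new Object[capacity + 1]; // one location always stays empty
        frontIndex = 0;
        backIndex = capacity;                   // so front == (back + 1) % length: empty
    }

    public boolean isEmpty() {
        return frontIndex == (backIndex + 1) % queue.length;
    }

    public boolean isFull() {
        return frontIndex == (backIndex + 2) % queue.length;
    }

    public void enqueue(T newEntry) {
        if (isFull()) throw new IllegalStateException("queue is full");
        backIndex = (backIndex + 1) % queue.length; // wraps to 0 past the last index
        queue[backIndex] = newEntry;
    }

    public T dequeue() {
        if (isEmpty()) throw new NoSuchElementException("queue is empty");
        T front = queue[frontIndex];
        queue[frontIndex] = null;                    // drop the reference
        frontIndex = (frontIndex + 1) % queue.length;
        return front;
    }

    public static void main(String[] args) {
        ArrayQueueSketch<String> q = new ArrayQueueSketch<>(3);
        q.enqueue("a");
        q.enqueue("b");
        q.enqueue("c");
        System.out.println(q.isFull());  // prints true
        System.out.println(q.dequeue()); // prints a
        q.enqueue("d");                  // reuses the array circularly
        System.out.println(q.dequeue()); // prints b
    }
}
```

Without the spare location, a completely full array and an empty one would both satisfy front == (back + 1) % queue.length, which is the ambiguity the convention removes.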
Lecture 26: Array vs. Linked List Implementations
• So far we have discussed both array and
linked list based data structures:
For List interface we have ArrayList (and Vector)
and LinkedList
For Stack we have subclass of Vector (as we
discussed) or of LinkedList
For Queue we have linked list version in text
(LinkedQueue.java) and also the circular array-based version (ArrayQueue.java)
• So which do we prefer?
It depends!
339
Lecture 26: Array vs. Linked List Implementations
Consider Stack and Queue
• As long as resizing is done in an intelligent way, the
array versions of these tend to be a bit faster than
the linked list versions
– Stack: push(), pop() are O(1) amortized time for both
implementations, but they are a constant factor faster
in normal use with the array version
– Queue: enqueue(), dequeue() are O(1) amortized time
for both implementations, but they are a constant
factor faster in normal use with the array version
• But notice that the Vector does not automatically
"downward" size when items are deleted, so that
Vector-based Stack will not either
– It could waste memory if it previously had many items
and now has few
340
Lecture 26: Array vs. Linked List Implementations
In general, you need to decide for a given
application which implementation is more
appropriate
In real life, however (especially now)
• Most of these data structures are predefined in a
library
– Java Collections Framework
> Stack is array-based (it extends Vector); Queue is an interface, with LinkedList as its LL-based implementation
– C++ Standard Template Library
• It's still good to understand how they are
implemented, but more often than not we just use the
standard version, due to convenience
341
Lecture 26: Priority Queues
• Queues organize data FIFO
• Sometimes we want to remove data by
other rules
"Those traveling with small children may board"
Tip the maitre d' to get a table
Your Java program is running out of memory so
the garbage collector needs to run
• This is the idea of a Priority Queue
Data is removed by priority order, rather than
FIFO.
342
Lecture 26: Priority Queues
• Methods:
Similar in nature to Queue
• add() an item to the PQ
– Similar to enqueue
• remove() and return the highest priority item
– Similar to dequeue
• peek() at the highest priority item
– Similar to getFront
The difference is the order of the removals
• See PriorityQueueInterface.java
343
Lecture 26: Priority Queues
• Implementation?
Consider unsorted array:
• add()?
• peek()?
• remove()?
• Run-times?
Consider sorted array:
• add()?
• peek()?
• remove()?
• Run-times?
[see notes on bottom of slide]
344
Lecture 27: Priority Queues
How about a linked-list?
• Unsorted will be similar to unsorted array
• Sorted does not buy us anything
– Why?
For any of the above implementations, consider
a sequence of N adds followed by N removes
• Let's figure out the total run-time and the amortized
time per operation
– Do on board
> [Also see Notes on the bottom of this slide]
• Can we do better?
– Yes, with a HEAP
345
Lecture 27: Heaps
• Idea of a heap:
Partial ordering of data in a logical complete
binary tree
For each node, T, in the tree:
• T.data has a higher priority than T.lchild.data
• T.data has a higher priority than T.rchild.data
• Note that NOTHING IS SAID about how T.lchild.data
and T.rchild.data compare to each other
– We do not care – could be either way
• This is why it is a partial ordering
– Compare to BST, which is a complete ordering
> In that case, we define a specific relationship between
siblings
346
Lecture 27: Heaps
• Higher priority here can mean either greater than or
less than in terms of value
– Min Heap: Highest priority value is the smallest
> Ex: Seedings in an event, rankings, etc.
– Max Heap: Highest priority value is the largest
> Ex: Salary, batting average, goals per game, etc.
– The logic is the same for both
> Text uses Max Heap
> Look at PriorityQueue.java and MaxHeapInterface.java
> We could very easily switch this to a Min Heap if needed
• Look at simple example on the board
347
Lecture 27: Heaps
Ok, how do we do our PQ / MaxHeap operations:
• peek() / getMax() is easy – ROOT of tree
How about add and remove?
• add() / add() is not as simple
• remove() / removeMax() is even trickier
• For both we are altering the tree, so we must ensure
that the HEAP PROPERTY is reestablished
– We need to carefully consider where / how to add and
remove to keep the tree valid but also not cost too
much work
348
Lecture 27: Heaps
Idea of add():
• Add new node at next available leaf
• Push the node "up" the tree until it reaches its
appropriate spot
– We'll call this upHeap
• See example on board
Idea of removeMax():
• We must be careful since root may have two children
– Similar problem exists when deleting from BST
– To delete that node will require a major reworking of
the tree
• Instead of deleting root node, we overwrite its value
with that of the last leaf
349
Lecture 27: Heaps
• Then we delete the last leaf -- easy to delete a leaf
– And we guarantee that the tree is still complete
• But now root value may not be the max
• Push the node "down" the tree until it reaches its
appropriate spot
– We'll call this downHeap
• See example on board
350
Lecture 27: Heaps
Run-time?
• Complete Binary Tree has height lgN
• upHeap or downHeap at most traverse height of the
tree
• Thus add() and removeMax() are always O(lgN) in the worst
case
• For N add + removeMax operations:
– N x lgN = O(NlgN)
• Amortized the operations are (clearly) O(lgN) each
• This is definitely superior to either the array or linked
list implementation
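The height bound can be worked out in a couple of lines. This is a sketch that assumes the root is at level 0 and that h denotes the deepest level of the tree (a convention chosen here; textbooks differ on whether height counts levels or edges).

```latex
% Level k of a binary tree holds at most 2^k nodes. In a complete binary tree
% with N nodes and deepest level h, levels 0..h-1 are full and level h holds
% at least one node:
\[
  2^{h} \;\le\; N \;\le\; 2^{h+1} - 1
  \quad\Longrightarrow\quad
  h = \lfloor \lg N \rfloor
\]
% upHeap and downHeap each move a value across at most h levels, so N
% operations on a heap that never holds more than N items cost at most
\[
  N \cdot \lfloor \lg N \rfloor = O(N \lg N)
\]
```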
Lecture 27: Implementing a Heap
• How to Implement a Heap?
We could use a linked binary tree, similar to that
used for BST
• Will work, but we have overhead associated with
dynamic memory allocation and access
– To go up and down we need child and parent
references
– Must keep track of "last leaf in tree" reference
But note that we are maintaining a complete
binary tree for our heap
It turns out that we can easily represent a
complete binary tree using an array
Lecture 27: Implementing a Heap
Idea:
• Number nodes row-wise starting at 1
• Use these numbers as index values in the array
• Now, for node at index i
Parent(i) = i/2 (integer division)
LChild(i) = 2i
RChild(i) = 2i+1
• See example on board
Now we have the benefit of a tree structure
with the speed of an array implementation
See MaxHeap.java
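The index formulas translate directly into code. The sketch below is illustrative only (it is not the course's MaxHeap.java, and the class and method names are hypothetical). It stores a small complete tree row by row starting at index 1 and prints each node's parent and children.

```java
// Hypothetical helper showing the index arithmetic for a 1-based array tree.
class HeapIndexDemo {
    static int parent(int i) { return i / 2; }          // integer division
    static int leftChild(int i) { return 2 * i; }
    static int rightChild(int i) { return 2 * i + 1; }

    public static void main(String[] args) {
        // Index 0 is unused; nodes are stored row by row starting at index 1.
        String[] tree = { null, "A", "B", "C", "D", "E", "F" };
        int size = tree.length - 1;
        for (int i = 1; i <= size; i++) {
            String p = i > 1 ? tree[parent(i)] : "-";
            String l = leftChild(i) <= size ? tree[leftChild(i)] : "-";
            String r = rightChild(i) <= size ? tree[rightChild(i)] : "-";
            System.out.println(tree[i] + ": parent=" + p + " left=" + l + " right=" + r);
        }
    }
}
```

A computed index greater than the current size means that child does not exist, and index 0 means the node is the root.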
Lecture 27: Mutable and Immutable Objects
• Many classes that we build contain mutator
methods
Methods that allow us to change the content of
an object
Objects that can be changed via mutators are
said to be mutable
Ex: StringBuilder
• append() method adds characters to the current
StringBuilder
Ex: Rectangle2D.Double
• setFrame() method changes size and location
Lecture 27: Mutable and Immutable Objects
Ex: ArrayList
• add(), remove() for example
• Some classes do not contain mutator
methods
Objects from these classes are said to be
immutable
Ex: String
• Cannot alter the string once the object is created
Ex: wrapper objects (Integer, Float, etc.)
• Allow accessors but no mutators
Lecture 28: Mutable and Immutable Objects
• Implications of Mutable vs. Immutable
Objects
Complications of being immutable
• Actions that would be simple as a mutation require
more work when a new object must be created
– Ex: Concatenating Strings
String S1 = "Hello ";
S1 = S1 + "there";
> We must create and assign a new object rather than just
append the string to the existing object
> If done repeatedly this can cause a lot of overhead
> Show on board
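A small sketch of the concatenation point, with hypothetical names (ConcatDemo, withString, withBuilder). It shows that the + on a String variable yields a different object, and contrasts a loop of String concatenations with a single mutable StringBuilder.

```java
// Hypothetical demo: immutable String concatenation vs. mutable StringBuilder.
class ConcatDemo {
    // Each pass through the loop creates a new String object.
    static String withString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + "x";
        }
        return s;
    }

    // One StringBuilder is mutated in place; one String is created at the end.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("x");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s1 = "Hello ";
        String before = s1;
        s1 = s1 + "there";
        System.out.println(before == s1); // false: s1 now refers to a new object
        System.out.println(before);       // the original "Hello " is unchanged
        System.out.println(withString(5).equals(withBuilder(5))); // true
    }
}
```

Both methods produce the same text. The difference is how many intermediate objects are created along the way.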
Lecture 27: Mutable and Immutable Objects
Complications of being mutable
• Consider collections of objects
• When we add an object to a collection, it doesn't
mean we give up outside access to the object
• If we subsequently alter the object "external" to the
collection, we could destroy a property of the
collection
– Ex: Consider a BST or a MaxHeap
– In either of these cases the data must meet a certain
requirement based on its value
– Altering an object within the BST or MaxHeap could
cause the collection to no longer satisfy the BST
property or the Heap property
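The problem can be reproduced with the standard library. The sketch below uses java.util.PriorityQueue with a reversed comparator as a stand-in max-heap (it is not the course's MaxHeap), and the Score class is a hypothetical mutable element type.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical demo: mutating an object from outside breaks the heap property.
class MutableKeyDemo {
    // Hypothetical mutable element type.
    static class Score {
        int value;
        Score(int value) { this.value = value; }
    }

    // Builds a max-heap of three scores, then changes the largest one from
    // outside the queue. Returns the value that peek() reports afterwards.
    static int peekAfterOutsideMutation() {
        PriorityQueue<Score> pq = new PriorityQueue<>(
                Comparator.comparingInt((Score s) -> s.value).reversed());
        Score top = new Score(10);
        pq.add(top);
        pq.add(new Score(5));
        pq.add(new Score(3));
        top.value = 1;            // the queue is never told about this change
        return pq.peek().value;   // root is still the mutated object
    }

    public static void main(String[] args) {
        // Prints 1 even though a Score with value 5 is still in the queue.
        System.out.println(peekAfterOutsideMutation());
    }
}
```

The queue only reorders elements during its own add and remove operations, so a change made through an outside reference goes unnoticed.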
Lecture 27: Cloning
• What to do?
We can make objects immutable
We can put clones of our original objects into
the collection
• However, we still must be careful not to mutate the
objects within the collection
– Since some access methods return references to the
objects within the collection
– To be very safe our accessors should themselves return
clones of the objects rather than references to the
originals
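One way to picture the "clone in, clone out" approach is a sketch like the following. Everything here is hypothetical (the Counter element type, the SafeList wrapper, and the demo method). It only illustrates the idea that neither the caller's original nor a returned object is ever the one stored inside the collection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of a collection that clones on the way in and on the way out.
class DefensiveCopyDemo {
    // Hypothetical mutable element type that supports cloning.
    static class Counter implements Cloneable {
        int count;
        Counter(int count) { this.count = count; }

        @Override
        public Counter clone() {
            try {
                return (Counter) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Counter is Cloneable
            }
        }
    }

    // Hypothetical wrapper: stores clones and hands out clones.
    static class SafeList {
        private final List<Counter> items = new ArrayList<>();
        void add(Counter c) { items.add(c.clone()); }
        Counter get(int i) { return items.get(i).clone(); }
    }

    static int demo() {
        SafeList list = new SafeList();
        Counter original = new Counter(7);
        list.add(original);
        original.count = 100;      // changes only the caller's object
        list.get(0).count = -1;    // changes only a temporary clone
        return list.get(0).count;  // the stored value is still 7
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 7
    }
}
```

The cost of this safety is an extra object per add and per access.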
What is cloning?
Lecture 27: Copying and Deep vs. Shallow Copy
• Java objects can be copied using the clone()
method
clone() is defined in class Object, so every Java
class inherits it
• However, you must override it (and implement the
Cloneable interface) for new classes to work properly
– It needs to know what data in the new class to copy
– This is somewhat tricky to do, especially for subclasses
– see Employee.java for syntax
clone() is already defined for Java arrays (and
some other classes), so we can use it for them
without overriding
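A sketch of what overriding clone() can look like, including a subclass. The Person and Worker classes are hypothetical and are not the course's Employee.java. The pattern shown (implement Cloneable, call super.clone(), then fix up fields that should not be shared) is one common approach.

```java
// Hypothetical demo of overriding clone() in a class and its subclass.
class CloneOverrideDemo {
    static class Person implements Cloneable {
        String name;
        Person(String name) { this.name = name; }

        @Override
        public Person clone() {
            try {
                // Object.clone() copies every field as-is (a shallow copy).
                return (Person) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Person is Cloneable
            }
        }
    }

    static class Worker extends Person {
        int[] hours;
        Worker(String name, int[] hours) {
            super(name);
            this.hours = hours;
        }

        @Override
        public Worker clone() {
            Worker copy = (Worker) super.clone(); // runtime type is still Worker
            copy.hours = hours.clone();           // arrays already support clone()
            return copy;
        }
    }

    public static void main(String[] args) {
        Worker w = new Worker("Ann", new int[] { 8, 8 });
        Worker c = w.clone();
        c.hours[0] = 0;
        System.out.println(w.hours[0]); // 8: the original array is not shared
    }
}
```

The subclass only has to handle the data it adds, because super.clone() has already copied the inherited fields.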
Lecture 27: Copying and Deep vs. Shallow Copy
• clone() is typically defined to do a shallow
copy of the data in an object
This means that when the object is copied,
objects that it refers to are NOT copied
• Ex: If cloning an array of StringBuilders, we get a new
array but NOT new StringBuilders
– Show on board
• This can cause data sharing/aliasing that you must be
aware of
• See Example15.java and Employee.java for example
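The StringBuilder array case can be sketched in a few lines. This is an illustrative example with hypothetical names, not the course's Example15.java.

```java
// Hypothetical demo: cloning an array copies the references, not the objects.
class ShallowCopyDemo {
    static String demo() {
        StringBuilder[] original = { new StringBuilder("one"), new StringBuilder("two") };
        StringBuilder[] copy = original.clone();  // new array, same StringBuilder objects

        copy[0].append("!");                      // visible through both arrays
        copy[1] = new StringBuilder("changed");   // replaces only the copy's slot

        return original[0] + "," + original[1];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // one!,two
    }
}
```

Mutating a shared object shows up in both arrays, while reassigning a slot affects only the array it was done on.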
Lecture 27: Copying and Deep vs. Shallow Copy
Generally speaking, (true) deep copying is more
difficult than shallow copying
• We need to follow all references in the original and
make copies for the clone()
– This could be several levels deep
• Ex: A Binary Search Tree
– The BST object has only one instance variable – a
reference to the root node
– A shallow copy would only copy this single reference
– A deep copy would have to traverse the entire tree,
copying each node AND copying the data in each node
AND …
> For a deeper copy we can use the copyNodes() method,
which calls the copy() method that we discussed previously
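A minimal sketch of a recursive node-by-node copy. The Node class and deepCopy method below are hypothetical and are not the course's copyNodes() or copy(). The data is assumed to be a StringBuilder so that copying the data itself (one level deep) is easy to show.

```java
// Hypothetical demo of copying every node in a binary tree, plus its data.
class TreeCopyDemo {
    static class Node {
        StringBuilder data;
        Node left;
        Node right;
        Node(StringBuilder data) { this.data = data; }
    }

    // Copies the subtree rooted at n: a new Node for every node, and a new
    // StringBuilder for every data item.
    static Node deepCopy(Node n) {
        if (n == null) {
            return null;
        }
        Node copy = new Node(new StringBuilder(n.data));
        copy.left = deepCopy(n.left);
        copy.right = deepCopy(n.right);
        return copy;
    }

    public static void main(String[] args) {
        Node root = new Node(new StringBuilder("m"));
        root.left = new Node(new StringBuilder("c"));
        root.right = new Node(new StringBuilder("t"));

        Node copy = deepCopy(root);
        copy.left.data.append("!");
        System.out.println(root.left.data); // c: the original tree is untouched
        System.out.println(copy.left.data); // c!
    }
}
```

If the data objects themselves held references to other mutable objects, those would need copying too, which is the "AND …" in the slide.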