Transcript Chabter 2

Chapter 2
The Big Picture
1
Overview
●
The big picture answers to several questions.

What are data structures?

What data structures do we study?

What are Abstract Data Types?

Why Object-Oriented Programming (OOP) and
Java for data structures?

How do I choose the right data structures?
2
2.1 What Are Data Structures
3
2.1 What Are Data Structures
●
A data structure is an aggregation of data components
that, together, constitute a meaningful whole.


The components themselves may be data structures.
Stops at some “atomic” unit.
Atomic or primitive type A data type whose elements
are single, non decomposable data items (can be
broken into parts)
● Composite type A data type whose elements are
composed of multiple data items
(ex: take tow integers (simple elements) x, y to form a
point (x, y)
●
4
2.2 What Data Structures Do We
Study?
●
An array is an aggregation of entries that are
arranged in contiguous fashion with the
provision of a single-step random access to any
entry.

There are numerous other situations where more
sophisticated data structures are required.

Data structure categories:
●
●

Linear
Non-linear
Category is based on how the data is conceptually
organized or aggregated.
5
2.2 What Data Structures Do We
Study?
●
Linear structures

List, Queue and Stack are linear collections, each
of them serves as a repository in which entries may
be added or removed at will.
●
Differ in how these entries may be accessed once they
are added.
6
2.2 What Data Structures Do We
Study?
●
List

The List is a linear collection of entries in which
entries may be added, removed, and searched for
without restrictions.
●
Two kinds of list:


Ordered List
Unordered List
7
2.2 What Data Structures Do We
Study?
●
Queue

Entries may only be removed in the order in which
they are added.
●
●
First in First out (FIFO) data structures
No search for an entry in the Queue
8
2.2 What Data Structures Do We
Study?
●
Stack

Entries may only be removed in the reverse order in
which they are added.
●
●
Last In, First Out (LIFO)
No search for an entry in the Stack.
9
2.2 What Data Structures Do We
Study?
●
Trees:

A tree is a nonlinear structure in which each node is
capable of having many successor nodes, called
children.

Trees are useful for representing hierarchical
relationships among data items.

Root The top node of a tree structure; a node with
no parent
10
Tree
11
2.2 What Data Structures Do We
Study?
●
Binary Tree

Binary tree A tree in which each node is capable
of having two child nodes, a left child node and a
right child node

Leaf A tree node that has no children
12
A Binary Tree
13
2.2 What Data Structures Do We
Study?
●
General Tree

Models a hierarchy such as the organizational
structure of a company, or a family tree.
●
A non-linear arrangement of entries, it is a generalization
of the binary tree structure, hence the name.
14
2.2 What Data Structures Do We
Study?
●
●
Binary Search Tree
A binary tree in which the key value in any node
is greater than the key value in its left child and
any of its descendants (the nodes in the left
subtree) and less than the key value in its right
child and any of its descendants (the nodes in
the right subtree)
15
2.2 What Data Structures Do We
Study?
●
AVL Tree

Height-balanced, binary search tree.
●
AVL Tree derives its importance from the fact that it
speeds up this search process to a remarkable degree.
16
2.2 What Data Structures Do We
Study?
●
Heap as a Priority Queue

A priority queue is a specialization of the FIFO
Queue.
●
●
●
Entries are assigned priorities.
The entry with the highest priority is the one to leave first.
The heap is a special type of binary tree.
17
2.2 What Data Structures Do We
Study?
●
Hash Table:

Hash Functions: A function used to manipulate the
key of an element in a list to identify its location in
the list

Hashing: The technique for ordering and accessing
elements in a list in a relatively constant amount of
time by manipulating the key to identify its location
in the list

Hash table: Term used to describe the data
structure used to store and retrieve elements using
hashing
18
Using a hash function to Determine the
Location of the Element in an Array
19
2.2 What Data Structures Do We
Study?
●
Graphs

A general tree is a special kind of graph, since a
hierarchy is a special system of relationships
among entities.
●
●
●
Graphs may be used to model systems of physical
connections such as computer networks, airline routes,
etc., as well as abstract relationships such as course prerequisite structures.
Standard graph algorithms answer certain questions we
may ask of the system.
Two kinds of graphs:


Directed Graph—asymmetric relationship
Undirected Graph—a symmetric relationship
20
Data Type
●
A data type can be formally defined by
describing:
1. the collection of elements that it can represent.
2. the operations that may be performed on those
elements
Simple data type examples: integers, real
numbers, characters
21
Data abstraction
Data abstraction The separation of a data type’s
logical properties from its implementation. (it
reduce complexity)
●
Ex: Integers are physically represented in
different ways on different computers. (a binarycoded decimal, a sign-and-magnitude binary,
two's-complement binary notation) Although you
may not be familiar with these terms, that hasn’t
stopped you from using integers.
22
2.3 What are Abstract Data Types?
Abstract data type (ADT) A data type whose
properties (domain and operations) are
specified independently of any particular
implementation

A data structure is an abstract data type.
●
i.e. The primitive data types built into a language integer,
real, character and boolean.

●
We do not at all worry about its internal representation.
We know we can perform certain operations on integers
that are guaranteed to work the way they are intended to.

“+”, “-”, “*”, “/” language designers and compiler writers were
responsible for designing the interface for these data types, as
well as implementing them.
23
2.3 What are Abstract Data Types?
●
Exactly how these operations are implemented
by the compiler in the machine code is of no
concern.

On a different machine, the behavior of an integer
does not change, even though it's internal
representations may change.

As the programmers were concerned, the primitive
data types were abstract entries.
24
2.3 What are Abstract Data Types?
●
A data structure consisting of several layers of
abstraction.

A Stack (of integers) may be built using a List (of
integers), which in turn may be built using an array
(of integers).
●
●
Integer and array are language-defined types.
Stack and List are user-defined.
25
2.3 What are Abstract Data Types?
26
2.3 What are Abstract Data Types?
●
Since an ADT makes a clean separation
between interface and implementation, the user
only sees the interface and therefore does not
need tamper with the implementation.

The responsibility of maintaining the implementation
is separated from the responsibility of maintaining
th code that uses an ADT.

This makes the code easier to maintain.

ADTs may be used many times in various contexts.
●
A List ADT may be used directly in application code, or
may be used to build another ADT, such as the Stack.
27
2.3 What are Abstract Data Types?
●
Object-oriented programming is paradigm that
addresses exactly these issues.
28
2.4 Why OOP and Java for Data
Structures?
●
OOP paradigm views the program as a system
of interacting objects.

Objects in a program may model physical objects,
or abstract entities.

An object encapsulates state and behavior.
●
●
For instance, the state of an integer, is its current value, it
behavior is the set of arithmetic operations that may be
applies to integers.
A class is analogous to the ADT, an object is
analogous to a variable of an ADT.

The user of an object is called its client.
29
2.4 Why OOP and Java for Data
Structures?
●
A stack may be built using a List ADT.
●
A stack interface defines four operations:

push

pop

get-Size

isEmpty
30
2.4 Why OOP and Java for Data
Structures?
●
The stack object contains a List object which
implements its state, and the behavior of the
Stack object is implemented in terms of the List
object's behavior.
31
2.4 Why OOP and Java for Data
Structures?
●
Reuse by inheritance.

Data structures may be built by inheriting from other
data structures; if data structure A inherits from data
structure B, then every A object is is-A B object.
●
AVL Tree can inherit from a Binary Search Tree, since
every AVL Tree is is-A special, height balanced, binary
search tree.

Every binary search tree is not an AVL tree.
32
2.4 Why OOP and Java for Data
Structures?
●
●
Inheriting means sharing the implementation
code.
A language that implements the OOP paradigm
automatically ensures:

Separation of interface from implementation by
encapsulation

Code reuse inheritance when permitted.
33
2.5 How Do I choose the Right Data
Structures?
●
The interface of operations that is supported by
a data structure is one factor to consider when
choosing between several available data
structures.

The efficiency of the data structures:
●
●
How much space does the data structure occupy?
What are the running times of the operation in its
interface?
34
2.5 How Do I choose the Right Data
Structures?
●
Example

Implementing a printer queue, requires a queue
data structure.
●
●

Maintains a collection of entries in no particular order.
An unordered list would be the appropriate data structure
in this case.
It is not too difficult to fit the requirements of the
application to the operations supported by a data
structure.
●
It is more difficult to choose from a set of candidate data
structures that all meet the operational requirements.
35
2.5 How Do I choose the Right Data
Structures?
●
Important, and sometimes contradictory factor
to consider:

The running time of each operation in the interface.
●
●
●
A data structure with the best interface with the best fit
may not necessarily be the best overall fit, if the running
times of its operations are not up to the mark.
When we have more than one data structure
implementation whose interfaces satisfy our
requirements, we may have to select one based on
comparing the running times of the interface operations.
Time is traded off for space,

i.e. more space is consumed to increase speed, or a reduction in
speed is traded for a reduction in the space consumption.
36
2.5 How Do I choose the Right Data
Structures?
●
Time-space tradeoff

We are looking to “buy” the best implementation of
a stack.
●
StackA. Does not provide a getSize operation.

●
●
i.e. there is not single operation that a client can use to get the
number of entries in StackA.
StackB. Provides a getSize operation, implemented in
the manner we discussed earlier, transferring entries
back and forth between two stacks.
StackC. Provides a getSize operation, implemented as
follows: a variable called size is maintained that is
incremented every time an entry is pushed, and
decremented every time an entry is popped.
37
2.5 How Do I choose the Right Data
Structures?
●
Three situations:

Need to maintain a large number stacks, with no
need to find the number of entries.

Need to maintain only one stack, with frequent need
to find the number of entries.

Need to maintain a large number of stacks. With
infrequent need to find the number of entries.
38
2.5 How Do I choose the Right Data
Structures?
●
Situation 1, StackA fits the bill.

Tempting to pick StackC, simply because we may
want to play conservative: what if we need getSize
in the future?
39
2.5 How Do I choose the Right Data
Structures?
●
Situation 2, StackB or Stack C.

Need to use getSize.

getSize in StackB is more time-consuming than that
in StackC.

We need only one stack, the additional size variable
used by StackC is not an issue.

Since we need to use getSize frequently, it is better
to with StackC.
40
2.5 How Do I choose the Right Data
Structures?
●
Situation 3 presents a choice between StackB
and StackC.

If getSize calls are infrequent, we may choose to go
with StackB and suffer a loss in speed.

The faster getSize delivered by StackC is at the
expense of an extra variable per stack, which may
add up to considerable space consumption since
we plan to maintain a number of stacks.
41
2.5 How Do I choose the Right Data
Structures?
●
getSize in StackB is more time-consuming that
that in StackC.

How can we quantify the time taken in either case?

For each data structure we study, we present the
running time of each operation in its interface.
42