Transcript ch02

CS 361 – Chapter 2
• 2.1 – 2.2 Linear data structures
– Desired operations
– Implement as an array or linked list
– Complexity of operations may depend on underlying
representation
• Later we’ll look at nonlinear d.s. (e.g. trees)
Linear
• There are several linear data structures
– Each has desired ADT operations
– Can be implemented in terms of (simpler) linear d.s.
– 2 most common implementations are array & linked
list
• Common linear d.s. we can create
–
–
–
–
–
–
Stack
Queue
Vector
List
Sequence
…
Note: terminology not universal
Implementation
• Array implementation
– Already defined in programming language
– Fast operations, easy to code
– Drawbacks?
• Linked list implementation
– We define a head and a tail node
– Each node has prev and next pointers, so there are no
orphans
– Space efficient, but trickier to implement
– Need to allocate/deallocate memory often, which may
have unpredictable execution time in practice
• Other implementations possible, but unusual
– Array for LL, queue for stack, etc.
Vector
• Each item in collection has a rank: how many items
come “before” me
• Rank is essentially an index, but an array
implementation is free to put items anywhere (e.g.
starting at 1 instead of 0)
• Some useful operations we’d like
(names may vary)
–
–
–
–
get(rank)
set(rank, item)
insert(rank, item)
remove(rank)
See p.61 for meaning of ins/del
Which of these operations require(s) a loop?
List ADT
• Not concerned with index/rank
• Position of item is at beginning or end of list, or
before/after some other item.
– The ketchup is next to the peanut butter, … But you first should
know where peanut butter is
• Some useful operations:
– getFirst() and getLast()
– prev(p) and next(p)
– replace(p, newItem)
– swap(p, q)
– Inserting an item at either end of list, or before/after existing one
– remove(p)
Which operations inherently require a loop?
Compare implementations
• We can compare array vs. LL implementations based on
an analysis of how they perform d.s. operations.
– Primarily determined by their representation
• “Sequence”
–
–
–
–
•
•
•
•
•
Combine functionality of vector and list
Again: terminology (vector vs. sequence) not universal.
Sometimes we want to exploit array feature or LL feature
Table on p. 67 compares operation complexity
Always O(1): size, prev, next, replace, swap
O(1) for array only: retrieve/replace at specific rank
O(1) for list only: insert/remove at a given node
Always O(n): insert/remove at rank
What about searching for an element?
Trees
• Read section 2.3
–
–
–
–
Terminology
Desired operations
How to traverse, find depth & height
Binary trees
– Binary tree properties
– Traversals
– Implementation
Definitions
• Tree = connected acyclic graph
• Rooted tree, as opposed to a free tree
– More useful for us
– Nodes arranged in a hierarchy, by level starting with
the root node
• Other terms related to rooted trees:
– Relationships between nodes much richer than a LL:
parent, child, sibling, subtree, ancestor, descendant
– 2 types of nodes:
• Internal
• External, a.k.a. Leaf
Definitions (2)
Continuing with rooted trees from now on…
• Ordered tree = children of a node are ranked 1st, 2nd, 3rd,
etc.
• Binary tree = each node has at most 2 children, called
the left and right child
– Not the same as an ordered tree with 2 children. If a
node has only 1 child, we still need to tell if it’s the left
or right child.
– (More on binary trees later)
• Different kinds of trees  difficult to implement a silverbullet tree d.s. for all occasions
Why trees?
• Many applications require information stored
hierarchically.
– Many classification systems
– Document structure
– File system
– Computer program
– Mathematical expression
– Others?
• We mean the data is hierarchical in a logical sense. The
low-level rep’n of the data may still be linear. That will
be the programmer’s secret.
Desired tree ops
• getRoot()
• findParent(v)
• findChildren(v) – returns list or iterator
– An iterator is an object of a special class having methods next()
and hasNext()
• isLeaf(v)
• isRoot(v)
And then some operations not so tree specific:
• swapValuesAt(v1, v2)
• getValueAt(v)
• setValueAt(v)
Desiderata (2)
• findDepth(v) – distance to
root
• findHeight() – max depth
of all nodes
• preorderTraversal(v)
– Initially call with root
– Recursive function
– Can be done as
iterator
• postorderTraversal(v)
– analogous
• Pseudocode:
preorder(v):
process v
for each child c of v:
preorder(c)
postorder(v):
for each child c of v:
postorder(c)
process v
See why they are called pre and
post? Try an example tree.
Binary trees
• Each node has  2 children.
• Very useful for CS applications
• Special cases
– Full binary tree = each node has 0 or 2 children. Suitable for
arithmetic expressions. Also called “proper” binary tree.
– Complete binary tree = All levels except the deepest have the
maximum nodes possible. Deepest level has all of its m nodes
in the m leftmost positions.
• Generalizations
– Positional tree (as opposed to ordered tree) = children have a
positional number. E.g. A node may have three children at
positions 1, 3 and 6.
– K-ary tree = positional tree where there is no child having
position higher than k
Binary tree ops
•
•
•
•
findLeftChild(v)
findRightChild(v)
findSibling(v) – how would this work?
preorder & postorder traversals can be simplified a little,
since we know we have  2 children
• A 3rd traversal! inorder
inorder(v):
inorder(v.left)
process v
inorder(v.right)
• For modeling a mathematical expression, these
traversals give rise to: prefix, infix and postfix notation!
Binary tree properties
• Suppose we have a full binary tree
• n = total number of nodes, h = height of tree
• Think about why these are true…
• h + 1  # leaves  2h
• h  # internal nodes  2h – 1
• log2 (n + 1) – 1  h  (n – 1) / 2
Expression as tree
• Arithmetic expression is inherently hierarchical
• We also have linear/text representations.
– Infix, prefix, postfix
– Note: prefix and postfix do not need grouping symbols
– Postfix expression can be easily evaluated using a stack
• Example: (25 – 5) * (6 + 7) + 9
into a tree
– Which is the last operator performed?  This is the root.
And we can deduce where left and right subtrees are.
– Next, for the subtree: (25 – 5) * (6 + 7), last op is the *, so this is
the “root” of this subtree.
– Notes:
• Resulting binary tree is “full.”
• Numbers are leaves; operators are internal. This is why the
tree drawing is straightforward.
Postfix eval
•
•
•
•
Our postfix expression is: 25 5 – 6 7 + * 9 +
When you see a number… push.
When you see an operator… pop 2, evaluate, push.
When no more input, pop answer.
25
25
5
5
25
–
20
6
7
+
*
9
+
6
20
7
6
20
13
9
20 260 260 269
Tree & traversal
• Given a (binary) tree, we can find its traversals. √
• How about the other way?
– Mathematical expression had enough context information that 1
traversal would be enough.
– But in general, we need 2 traversals, one of them being inorder.
• Example: Draw the binary tree having these traversals.
Postorder:
SCXHRJQT
Inorder:
SRCHXTJQ
– Hint: End of the postorder is the root of the tree. Find where the
root lies in the inorder. This will show you the 2 subtrees.
Continue with each subtree, finding its root and subtrees, etc.
• Exercise: Find 2 distinct binary trees t1 and t2 where
preorder(t1) = preorder(t2) and
postorder(t1) = postorder(t2).
Euler tour traversal
•
•
•
•
General way to encompass all 3 traversals.
Text p.88 shows “shrink wrap” image of tree
We visit each node on its left, underneath, and its right.
Pseudocode
eulerTour(v):
do v’s left side action
eulerTour(v.left)
do v’s under action
eulerTour(v.right)
do v’s right side action
//
//
//
//
//
west
southwest
south
southeast
east
Applications
•
•
•
•
Can adapt eulerTour( ):
Preorder traversal: “below” and “right” actions are null
Inorder traversal: “left” and “right” actions are null
Postorder traversal: “left” and “below” actions are null
• Elegant way to print a fully parenthesized expression:
– Left action: print “(“
– Under action: print node contents
– Right action: print “)”
Tree implementation
• Binary trees: internal representation may be array or
links
• General trees: array too unwieldy, just do links
• Array-based representation
–
–
–
–
–
Assign each node in the tree an index
Root = 1
If a node’s index is p, left child = 2p and right child = 2p + 1
Array operations are quick
Space inefficient. In worst case, n nodes would require index
values up to 2n–1. (How would this happen?) Exponential space
complexity is bad.
Implement as links
• For binary tree
– Each node needs:
• Contents
• Pointers to left child, right child, parent
– Tree overall needs a root node to start with.
• For general rooted tree
– Each node needs:
• Contents
• List of pointers to children; pointer to parent
– Tree overall needs a root node to start with.