
Recursive Data Structures and
Grammars
• Themes
– Recursive Description of Data Structures
– Grammars and Parsing
– Recursive Definitions of Properties of Data Structures
– Recursive Algorithms for Manipulating and Traversing Data Structures
• Examples
– Lists
– Trees
– Expressions and Expression Trees
Grammars
• Syntactic Categories (non-terminals)
– <number>
– <digit>
– <expr>
• Production Rules (replace the syntactic category on
the lhs by the rhs; “|” means or)
– <expr> → <expr> + <expr>
– <expr> → <number>
– <number> → <digit> <number> | <digit>
– <digit> → 0|1|2|3|4|5|6|7|8|9
Derivation
• Repeatedly replace a syntactic category by the rhs
of a rule whose lhs is equal to that syntactic
category
• <expr> ⇒ <expr>+<expr>
⇒ <expr>+<expr>+<expr>
⇒ <number>+<expr>+<expr>
⇒ <number>+<number>+<expr>
⇒ <number>+<number>+<number>
Derivation (e.g. 2)
• <number> ⇒ <digit><number>
⇒ <digit><digit><number>
⇒ <digit><digit><digit>
⇒ <digit><digit>3
⇒ <digit>23
⇒ 123
• When there are no more syntactic categories, the
process stops and the resulting string is said to be
derived from the initial syntactic category.
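The derivation process can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the name leftmost_derivation, and the convention of always rewriting the leftmost syntactic category are assumptions made here (the 123 example above rewrites digits from the right). It also assumes the base rule <number> → <digit>, which the 123 derivation uses.

```python
# Grammar as a dict: syntactic category -> list of alternatives (each a list of symbols).
GRAMMAR = {
    "<expr>": [["<expr>", "+", "<expr>"], ["<number>"]],
    "<number>": [["<digit>", "<number>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def leftmost_derivation(start, choices):
    """Apply one rule per entry in `choices` to the leftmost syntactic category.

    Each entry is the index of the alternative to use. Returns every
    intermediate form, starting with [start].
    """
    form = [start]
    steps = [form[:]]
    for choice in choices:
        # Position of the leftmost symbol that is still a syntactic category.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        steps.append(form[:])
    return steps


steps = leftmost_derivation("<number>", [0, 1, 0, 2, 1, 3])
print(" => ".join("".join(step) for step in steps))
```

The last form contains no syntactic categories, so the process stops with the derived string 123.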
Languages
• The language L(<S>) derivable from the
syntactic category <S> using the grammar
G is defined inductively.
• Initially, L(<S>) is empty.
• If <S> → X1 X2 … Xn is a production in G
and, for each i, either si = Xi where Xi is a
terminal or si ∈ L(Xi), then the concatenation
s1s2…sn is in L(<S>).
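The inductive definition can be read as a fixed-point computation: start with empty languages and keep applying productions until nothing new appears. The sketch below is one way to do that, not a standard routine; the name language and the max_len cutoff (needed because the languages here are infinite) are assumptions made for illustration.

```python
def language(grammar, start, max_len):
    """Strings of length <= max_len derivable from `start`.

    `grammar` maps each syntactic category to a list of alternatives,
    each alternative being a list of symbols.
    """
    lang = {nt: set() for nt in grammar}  # initially every L(<S>) is empty
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                # Build all concatenations s1 s2 ... sn for this production.
                partial = {""}
                for sym in rhs:
                    options = lang[sym] if sym in grammar else {sym}
                    partial = {p + s for p in partial for s in options
                               if len(p + s) <= max_len}
                new = partial - lang[nt]
                if new:
                    lang[nt] |= new
                    changed = True
    return lang[start]


parens = {"<B>": [["(", ")"], ["(", "<B>", ")"]]}
print(sorted(language(parens, "<B>", 6), key=len))
```

The loop terminates because only finitely many strings fit under the length cutoff.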
Language
• The number of strings of length n in the
language L(<number>) is 10^n, for n >= 1.
• Proof is by induction on n: a string of length
n+1 is one of 10 digits followed by a string of
length n.
Language
• <B> → () | (<B>)
• L(<B>) = strings of n left parens followed
by n right parens, for n >= 1.
Systematic Generation
• C statement example
Binary Trees
• A binary tree is either
– empty, or
– a node with 3 elements:
• value
• left, which is a tree
• right, which is a tree
Height of Binary Trees
• Height(T) = -1 if T is empty
• Height(T) = max(Height(T.left), Height(T.right)) + 1 otherwise
• Alternative: Max over all nodes of the level
of the node.
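Both definitions of height translate directly into recursive functions. The sketch below assumes a tree is represented by a small Node class with None standing for the empty tree; the class and the function names are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None   # None represents the empty tree
    right: Optional["Node"] = None


def height(t):
    """Recursive definition: -1 for the empty tree."""
    if t is None:
        return -1
    return max(height(t.left), height(t.right)) + 1


def levels(t, level=0):
    """Yield the level of every node; the root is at level 0."""
    if t is not None:
        yield level
        yield from levels(t.left, level + 1)
        yield from levels(t.right, level + 1)


def height_by_levels(t):
    """Alternative definition: maximum level over all nodes."""
    return max(levels(t), default=-1)


chain = Node(3, Node(2, Node(1)))
print(height(chain), height_by_levels(chain))
```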
Number of Nodes of a Binary
Tree
• Nnodes(T) = 0 if T is empty
• Nnodes(T) = Nnodes(T.left) + Nnodes(T.right) + 1 otherwise
Internal Path Length
• IPL(T) = 0 if T is empty
• IPL(T) = IPL(T.left) + IPL(T.right) +
Nnodes(T)-1 otherwise (each of the Nnodes(T)-1
non-root nodes is one level deeper in T than in
its subtree)
• Alternative: Sum over all nodes of the level
of the node.
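The node count and internal path length recurrences can be checked against the alternative "sum of levels" definition with a short sketch. The Node class (None for the empty tree) and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None   # None represents the empty tree
    right: Optional["Node"] = None


def nnodes(t):
    if t is None:
        return 0
    return nnodes(t.left) + nnodes(t.right) + 1


def ipl(t):
    """Internal path length via the recurrence."""
    if t is None:
        return 0
    return ipl(t.left) + ipl(t.right) + nnodes(t) - 1


def ipl_by_levels(t, level=0):
    """Alternative: sum of the levels of all nodes (root at level 0)."""
    if t is None:
        return 0
    return (level
            + ipl_by_levels(t.left, level + 1)
            + ipl_by_levels(t.right, level + 1))


chain = Node(3, Node(2, Node(1)))
print(nnodes(chain), ipl(chain), ipl_by_levels(chain))
```

Written this way, ipl recomputes nnodes at every node, so it can take quadratic time on a degenerate tree; the level-based version visits each node once.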
External Format for Binary Trees
• <bintree> → []
| [<value>,<bintree>,<bintree>]
• [],
• [1,[],[]],
• [2,[1,[],[]],[]], [2,[],[1,[],[]]]
• [3, [2,[1,[],[]],[]], []], [3, [2,[],[1,[],[]]], []],
[3, [1,[],[]], [1,[],[]]],
[3, [], [2,[1,[],[]],[]]], [3, [], [2,[],[1,[],[]]]]
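Because the external format is defined by a recursive grammar, reading and writing it is naturally recursive. The following is a minimal sketch that assumes <value> is a non-negative integer (the slides do not define <value>) and uses an illustrative Node class with None for the empty tree; parse_bintree and to_external are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None   # None represents the empty tree []
    right: Optional["Node"] = None


def to_external(t):
    """Write a tree in the bracketed external format."""
    if t is None:
        return "[]"
    return f"[{t.value},{to_external(t.left)},{to_external(t.right)}]"


def _expect(s, i, ch):
    if i >= len(s) or s[i] != ch:
        raise ValueError(f"expected {ch!r} at position {i}")
    return i + 1


def _parse(s, i):
    """Parse one <bintree> starting at index i; return (tree, next index)."""
    i = _expect(s, i, "[")
    if i < len(s) and s[i] == "]":
        return None, i + 1
    j = i
    while j < len(s) and s[j].isdigit():
        j += 1
    if j == i:
        raise ValueError(f"expected a value at position {i}")
    value = int(s[i:j])
    j = _expect(s, j, ",")
    left, j = _parse(s, j)
    j = _expect(s, j, ",")
    right, j = _parse(s, j)
    j = _expect(s, j, "]")
    return Node(value, left, right), j


def parse_bintree(text):
    s = text.replace(" ", "")       # spaces are ignored
    tree, end = _parse(s, 0)
    if end != len(s):
        raise ValueError("trailing characters after tree")
    return tree


print(to_external(parse_bintree("[3, [1,[],[]], [1,[],[]]]")))
```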
Recurrence for the Number of
Binary Trees
• Let T_n be the number of binary trees with n
nodes.
• T_0 = 1, T_1 = 1, T_2 = 2, T_3 = 5
• T_n = \sum_{k=0}^{n-1} T_k T_{n-k-1} for n >= 1
Binary Search Trees
• A binary tree T in which, at every node,
• all elements in T.left are <= T.value
• all elements in T.right are >= T.value
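The ordering property must hold for whole subtrees, not just for a node and its children. One common way to check it, sketched here under the assumption of a Node class with None for the empty tree, is to pass down the bounds that every value in a subtree must respect; is_bst is an illustrative name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None   # None represents the empty tree
    right: Optional["Node"] = None


def is_bst(t, lo=None, hi=None):
    """True if every value in t lies in [lo, hi] and t is a BST.

    Duplicates are allowed on either side, matching the <= and >=
    in the definition.
    """
    if t is None:
        return True
    if lo is not None and t.value < lo:
        return False
    if hi is not None and t.value > hi:
        return False
    return is_bst(t.left, lo, t.value) and is_bst(t.right, t.value, hi)


# 6 sits in the left subtree of 5, so this is not a BST even though
# each node is correctly ordered relative to its own children.
print(is_bst(Node(5, Node(3, None, Node(6)), None)))
```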
Inorder traversal
• Recursively visit nodes in T.left
• visit root
• Recursively visit nodes in T.right
• An inorder traversal of a BST lists the
elements in sorted order. Proof by
induction on the structure of the tree.
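The sorted-order claim can be tried out with a small sketch. The Node class, the insert routine (duplicates go left, which is one of the choices the BST definition allows), and the list-returning inorder function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None   # None represents the empty tree
    right: Optional["Node"] = None


def insert(t, value):
    """Insert value into the BST rooted at t and return the new root."""
    if t is None:
        return Node(value)
    if value <= t.value:
        t.left = insert(t.left, value)
    else:
        t.right = insert(t.right, value)
    return t


def inorder(t):
    """Visit T.left, then the root, then T.right."""
    if t is None:
        return []
    return inorder(t.left) + [t.value] + inorder(t.right)


root = None
for v in [5, 2, 8, 1, 9, 3, 5]:
    root = insert(root, v)
print(inorder(root))
```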
Parse Tree
• A derivation is conveniently stored in a tree,
where internal nodes correspond to
syntactic categories and the children of a
node correspond to the elements of the rhs of
the rule that was applied
Example Parse Tree
      <number>
      /      \
 <digit>    <number>
    |        /     \
    1   <digit>   <number>
           |         |
           2      <digit>
                     |
                     3
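A parse tree like the one above can be built by following the <number> rules recursively. In this sketch a tree node is a (label, children) tuple and a leaf is a plain string; that representation and the names parse_number and tree_yield are illustrative assumptions, as is the base rule <number> → <digit>.

```python
def parse_number(s):
    """Build the parse tree of a digit string under the <number> rules."""
    if not s or any(c not in "0123456789" for c in s):
        raise ValueError("not a string of digits")
    digit = ("<digit>", [s[0]])
    if len(s) == 1:
        return ("<number>", [digit])                    # <number> -> <digit>
    return ("<number>", [digit, parse_number(s[1:])])   # <number> -> <digit> <number>


def tree_yield(node):
    """Concatenate the leaves from left to right."""
    if isinstance(node, str):
        return node
    _label, children = node
    return "".join(tree_yield(child) for child in children)


tree = parse_number("123")
print(tree_yield(tree))
```

Reading the leaves left to right recovers the derived string.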
Recursive Descent Parser
• Balanced parentheses
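The slide gives no code, so the following is only a sketch of what a recursive descent parser for balanced parentheses might look like, assuming the grammar <B> → () | (<B>). There is one function for the syntactic category <B>; it consumes input from a given position and calls itself where the rule mentions <B>. The names parse_B and is_balanced are hypothetical.

```python
def parse_B(s, i=0):
    """Parse one <B> starting at index i and return the index just past it.

    Raises ValueError if no <B> starts at i.
    """
    if i >= len(s) or s[i] != "(":
        raise ValueError(f"expected '(' at position {i}")
    i += 1
    # One character of lookahead picks the alternative:
    # another '(' means <B> -> ( <B> ), otherwise <B> -> ( ).
    if i < len(s) and s[i] == "(":
        i = parse_B(s, i)
    if i >= len(s) or s[i] != ")":
        raise ValueError(f"expected ')' at position {i}")
    return i + 1


def is_balanced(s):
    """True if the whole string is in L(<B>)."""
    try:
        return parse_B(s) == len(s)
    except ValueError:
        return False


print(is_balanced("((()))"), is_balanced("(()"))
```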
Ambiguous Grammars
        <expr>                     <expr>
       /   |  \                   /  |   \
  <expr>   +   <expr>        <expr>  +   <expr>
  /  |  \                                /  |  \
<expr> + <expr>                     <expr>  +  <expr>
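The ambiguity can be made concrete by enumerating every parse tree that the rule <expr> → <expr> + <expr> allows for a given sum. In this sketch each tree is written as a fully parenthesized string; the name parse_trees and that output convention are assumptions made for illustration.

```python
def parse_trees(operands):
    """All ways to parse operands[0] + operands[1] + ... as nested sums."""
    if len(operands) == 1:
        return [operands[0]]
    trees = []
    # The top-level '+' can sit between any two adjacent operands.
    for split in range(1, len(operands)):
        for left in parse_trees(operands[:split]):
            for right in parse_trees(operands[split:]):
                trees.append("(" + left + "+" + right + ")")
    return trees


print(parse_trees(["1", "2", "3"]))
```

Three operands give two trees, matching the two parse trees drawn above.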