Transcript Lecture 32

Lecture 32
• Closure Properties for CFL’s
– Kleene Closure
• construction
• examples
• proof of correctness
– Others not covered in lecture
• union, concatenation
• CFL’s versus regular languages
– regular languages subset of CFL
1
Closure Properties for CFL’s
Kleene Closure
2
CFL closed under Kleene Closure
• Let L be an arbitrary CFL
• Let G1 be a CFG s.t. L(G1) = L
– G1 exists by definition of L1 in CFL
•
•
•
•
Construct CFG G2 from CFG G1
Argue L(G2) = L*
There exists CFG G2 s.t. L(G2) = L*
L* is a CFL
3
Visualization
• Let L be an arbitrary CFL
• Let G1 be a CFG s.t. L(G1) = L
L
L*
– G1 exists by definition of L1 in CFL
•
•
•
•
Construct CFG G2 from CFG G1
Argue L(G2) = L*
There exists CFG G2 s.t. L(G2) = L*
L* is a CFL
CFL
G1
G2
CFG’s
4
Algorithm Specification
• Input
– CFG G1
• Output
– CFG G2 such that L(G2) = (L(G1))*
CFG G1
A
CFG G2
5
Construction
• Input
– CFG G1 = (V1, S, S1, P1)
• Output
– CFG G2 = (V2, S, S2, P2)
• V2 = V1 union {T}
– T is a new symbol not in V1 or S
• S2 = T
• P2 = P1 union {T --> TS1 | l}
6
Closure Properties for CFL’s
Kleene Closure Examples
7
Example 1
• Input grammar:
–
–
–
–
V = {S}
S = {a,b}
S=S
P:
S --> aa | ab | ba | bb
V2 = V1 union {T}
T is a new symbol not in V1 or S
S2 = T
P2 = P1 union {T --> TS | l}
• Output grammar
–
–
–
–
V = {S, T}
S = {a,b}
Start symbol is T
P:
T --> TS | l
S --> aa | ab | ba | bb
8
Example 2
• Input grammar:
–
–
–
–
V = {S, T}
S = {a,b}
Start symbol is T
P:
T --> TS | l
S --> aa | ab | ba | bb
V2 = V1 union {T}
T is a new symbol not in V1 or S
S2 = T
P2 = P1 union {T --> TS | l}
• Output grammar
–
–
–
–
V = {S, T, U}
S = {a,b}
Start symbol is U
P:
U --> UT | l
T --> TS | l
S --> aa | ab | ba | bb
9
Closure Properties for CFL’s
Kleene Closure Proof of Correctness
10
Is our construction correct?
• How do we prove our construction is correct?
– Informal
• Test some strings
• Review logic behind construction
– Formal
• First, show every string derived by G2 belongs to (L(G1))*
– That is, show L(G2) is a subset of (L(G1))*
• Second, show every string in (L(G1))* can be derived by G2
– That is, show (L(G1))* is a subset of L(G2)
• Both proofs will be inductive proofs
– Inductive proofs and recursive algorithms go well together
11
L(G2) is a subset of (L(G1))*
• We want to prove the following
– If x in L(G2), then x is in (L(G1))*
• This is equivalent to the following
– If T ==>*G2 x, then x is in (L(G1))*
– The two statements are equivalent because
• x in L(G2) means that T ==>*G2 x
• We break the second statement down as follows:
– If T ==>1G2 x, then x is in (L(G1))*
– If T ==>2G2 x, then x is in (L(G1))*
– If T ==>3G2 x, then x is in (L(G1))*
– ...
12
L(G2) is a subset of (L(G1))*
• Statement to be proven:
– For all n >= 1, if T ==>nG2 x, then x is in (L(G1))*
– Prove this by induction on n
• Base Case:
– n=1
– The only string x such that T ==>1G2 x is the string l
• Follows from inspection of G2
– The string l belongs to (L(G1))*
• Follows from definition of Kleene Closure
13
Inductive Case
• Inductive Hypothesis:
– For 1 <= j <= n, if T ==>jG2 x, then x is in (L(G1))*
• Note, this is a “strong” induction hypothesis
• Statement to be Proven in Inductive Case:
– For n above, if T ==>n+1G2 x, then x is in (L(G1))*
• Proving this statement
– Let x be an arbitrary string such that T ==>n+1G2 x
– There are two possible first derivation steps
• Case 1: T ==>G2 l ==>nG2 x
• Case 2: T ==>G2 TS ==>nG2 x
– These 2 cases follow from looking at grammar G2
14
Case Analysis
• Case 1: T ==>G2 l is not possible
– n >= 1 which means n+1 >= 2
• Case 2: T ==>G2 TS ==>nG2 x
– This means x has the form uv where
• T ==><n u
• S ==><n v
– This follows because in n steps, we go from TS to x
– Thus, T can take at most n-1 steps to generate u
– Likewise for S generating v
– String u belongs to (L(G1))*
• Follows from the inductive hypothesis
15
Concluding Case 2:
T ==>G2 TS ==>nG2 x
– String v belongs to L(G1)
• Follows from S ==>* G2 v and
• Our construction insures that all strings derived from S in L(G2) are
also in L(G1)
– Conclude x belongs to (L(G1))*
• x = uv where u belongs to (L(G1))* and v belongs to L(G1)
• By definition of Kleene closure, this means x=uv belongs to (L(G1))*
• Wrapping up inductive case
– In all possible derivations of x, we have shown that x belongs to
(L(G1))*
– Thus, we have proven the inductive case
• Conclusion
– By the principle of mathematical induction, we have shown that
L(G2) is a subset of (L(G1))*
16
(L(G1))* is a subset of L(G2)
• We want to prove the following
– If x is in (L(G1))*, then x is in L(G2)
• This is equivalent to the following
– If x is in (L(G1))*, then T ==>*G2 x
– The two statements are equivalent because
• x in L(G2) means that T ==>*G2 x
• We break the second statement down as follows:
– If x is in (L(G1))0, then T ==>*G2 x
– If x is in (L(G1))1, then T ==>*G2 x
– If x is in (L(G1))2, then T ==>*G2 x
– ...
17
(L(G1))* is a subset of L(G2)
• Statement to be proven:
– For all n >= 0, if x is in (L(G1))n, then x is in L(G2)
– Prove this by induction on n
• Base Case:
– n=0
– The only string x such that x is in (L(G1))0 is the string l
• Follows from definition of (L(G1))0
– The string l belongs to L(G2)
• Follows from production T --> l
18
Inductive Case
• Inductive Hypothesis:
– For 0 <= j <= n, if x is in (L(G1))j, then T ==>*G2 x
• Note, this is a “strong” induction hypothesis
• Statement to be Proven in Inductive Case:
– For n above, if x is in (L(G1))n+1, then T ==>*G2 x
• Proving this statement
– Let x be an arbitrary string in (L(G1))n+1
– This means x = uv where
• u is a string in (L(G1))n
• v is a string in L(G1)
– This follows from definition of (L(G1))n+1 and that n>=0
19
Deriving x
– x = uv where
• u is a string in (L(G1))n
• v is a string in L(G1)
– T ==> G2 TS ==>* G2 uS ==>* G2 uv = x
• The first derivation step is possible from our grammar
• The second sequence of derivation steps follows from T ==>* G2 u
– This follows from applying the induction hypothesis
• Third sequence of derivation steps follows from S ==>* G2 v
– This follows from the fact that v is in L(G1) and
– Our construction includes all productions from G1
– Thus T ==>* G2 x
• The inductive case follows
• The result is proven by the principle of mathematical
induction
20
Construction for Set Union
• Input
– CFG G1 = (V1, S, S1, P1)
– CFG G2 = (V2, S, S2, P2)
• Output
– CFG G3 = (V3, S, S3, P3)
• V3 = V1 union V2 union {T}
– Variable renaming to insure no names shared between V1 and V2
– T is a new symbol not in V1 or V2 or S
• S3 = T
• P3 = P1 union P2 union {T --> S1 | S2}
21
Construction for Set
Concatenation
• Input
– CFG G1 = (V1, S, S1, P1)
– CFG G2 = (V2, S, S2, P2)
• Output
– CFG G3 = (V3, S, S3, P3)
• V3 = V1 union V2 union {T}
– Variable renaming to insure no names shared between V1 and V2
– T is a new symbol not in V1 or V2 or S
• S3 = T
• P3 = P1 union P2 union {T --> S1S2}
22
CFL’s and regular languages
23
CFL Closure Properties
• CFL’s are closed under Kleene closure
– Just proven, proof also in book
• CFL’s are closed under set union
– Proof in book
• CFL’s are closed under set concatenation
– Proof in book
• What can we conclude from these 3 results?
– It follows that regular languages are a subset of CFL’s
24
Regular languages subset of CFL
• Recursive definition of regular languages
– Base Case:
• {}, {l}, {a}, {b} are regular languages over {a,b}
• P={}, P={S --> l}, P={S --> a}, P={S --> b}
– Inductive Case:
• If L and L’ are are regular languages, then L*, LL’, L
union L’ are regular languages
• Use previous constructions to see that these
resulting languages are also context-free
25
Other CFL Closure Properties
• We will show that CFL’s are NOT closed under
many other set operations
• Examples include
– set complement
– set intersection
– set difference
26
Language class hierarchy
?
H
H
Equal
REG
CFL
REC
RE
All languages over alphabet S
27