Regular Expressions/Languages
Download
Report
Transcript Regular Expressions/Languages
Regular Expressions/Languages
• Regular languages
– Inductive definitions
– Regular expressions
• syntax
• semantics
• Not covered in lecture
1
Regular Languages
(Regular Expressions)
2
Regular Languages
• New language class
– Elements are languages
• We will show that this language class is identical
to LFSA
– Language class to be defined by Finite State Automata
(FSA)
– Once we have shown this, we will use the term “regular
languages” to refer to this language class
3
Inductive Definition of Regular Languages
• Base case definition
– Let S denote the alphabet
– {}, {l} are regular languages
– {a} is a regular language for any character a e
S
• Inductive case definition
– If L1 and L2 are regular languages, then
• L1 union L2 is a regular language
• L1 concatenate L2 is a regular language
• L1* is a regular language
• Completeness
– Only languages generated using above rules are regular languages
4
Inductive Definition of Integers
• Base case definition
– 0 is an integer
• Inductive case definition
– If x is an integer, then
• x+1 is an integer
• x-1 is an integer
• Completeness
– Only numbers generated using the above rules are
integers
5
Proving a language is regular
• Prove that {aa, bb} is a regular language
– {a} and {b} are regular languages
• basic definition
– {aa} = {a}{a} is a regular language
• concatenation rule
– {bb} = {b}{b} is a regular language
• concatenation rule
– {aa, bb} = {aa} union {bb} is a regular language
• union rule
• Typically, we will not go through this process to prove a language is
regular
6
Regular Expressions
• How do we describe a regular language?
– Use set notation
• {aa, bb, ab, ba}*
• {a}{a,b}*{b}
– Use regular expressions R
• Inductive def of regular languages and regular
expressions on page 72
• (aa+bb+ab+ba)*
• a(a+b)*b
7
R and L(R)
• How we interpret a regular expression
– What does a regular expression R mean to us?
• aaba represents the regular language {aaba}
• f represents the regular language {}
• aa+bb represents the regular language {aa, bb}
– We use L(R) to denote the regular language
represented by regular expression R.
8
Precedence rules
• What is L(ab+c*)?
– Possible answers:
•
•
•
•
{a}({b} union {c}*}
({a}{b,c})*
({ab} union {c})*
{ab} union {c}*
– Correct answer
– Must know precedence rules
• * first, then concatenation, then +
9
Precedence rules continued
• Precedence rules similar to those for
arithmetic expressions
– ab+c2
• (a times b) + (c times c)
• exponentiation first, then multiplication, then
addition
• Think of Kleene closure as exponentiation,
concatenation as multiplication, and union as
addition and the precedence rules are identical
10
Regular expressions are strings
• Let L be a regular language over the
alphabet S
– A regular expression R for L is just a string
over the alphabet S union {(, ), +, *, f, l}
which follows certain syntactic rules
• That is, the set of legal regular expressions is itself a
language over the alphabet S union {(, ), +, *}
– f, a*aba are strings in the language of legal reg. exp.
– )(, *a* are strings NOT in the language of legal reg. exp.
11
Semantics
• We give a regular expression R meaning when we
interpret it to represent L(R).
– aaba is just a string
– we interpret it to represent the language {aaba}.
• We do similar things with arithmetic expressions
– 10+72 is just a string
– We interpret this string to represent the number 59
12
Key fact
• A language L is a regular language iff there
exists a reg. exp. R such that L(R) = L
– When I ask for a proof that a language L is
regular, rather than going through the inductive
proof we saw earlier, I expect you to give me a
regular expression R s.t. L(R) = L
13
Examples
• Prove that {} is regular
– f
• Prove that {l} is regular
– l
• Prove that {a,b}* is regular
– (a+b)*
• Prove that the set of even length strings over {a,b}
is regular
– ((a+b)(a+b))* or
– (aa+ab+ba+bb)*
14
Summary
• Regular expressions are strings
– syntax for legal regular expressions
– semantics for interpreting regular expressions
• Regular languages are a new language class
– A language L is regular iff there exists a regular
expression R s.t. L(R) = L
• We will show that the regular languages are
identical to LFSA
15