ECE-548 Sequential Machine Theory

Download Report

Transcript ECE-548 Sequential Machine Theory

Regular Languages
Sequential Machine Theory
Prof. K. J. Hintz
Department of Electrical and Computer Engineering
Lecture 3
Comments, additions and modifications
by Marek Perkowski
Languages
• Informal Languages
– English
– Body
– Bureaucratize
• Formal Languages
– Rule-based
– Elements are decidable
– No deeper understanding required
Formal Language
• All the Rules of the Language Are Explicitly
Stated in Terms of the Allowed Strings of
Symbols, e.g.,
– Programming languages, e.g., C, Lisp, Ada
– Military communications (formal “informal” L)
– Digital network protocols
A to ...
Alphabet: a finite set of symbols, aka I, 
–
–
–
–
Roman:
Binary:
Greek:
Cyrillic:
{ a, b, c, ... , z }
{ 0, 1 }
{ a, c, e, g, i, k, l, ... }
{ Ж, Й, Њ, С, Р, ... }
String
String, word: a finite ordered sequence of
symbols from the alphabet, usually written
with no intervening punctuation
– x1 = “ t h e “
– x2 = “ 0 1 0 1 1 0 “
– x3 = “ c i g a  e
“
– x4 = “ Ж Й С Р “
a
String
• Reverse of String
– The sequence of symbols written backwards
 x1r " e h t "
• Reverse of Concatenation
– Strings themselves must be reversed
 x  y R  y R  x R
String
• Length or Size of String
– The number of symbols
x1  3
x2  6
x3  9
x4  4
x3  x4  13
Strings
• Null String, Empty String, e, 
– A string of length or size zero
– The symbols e or , meant to denote the null
string, are not allowed to be part of the
language
Substring
• A String, v, Is a Substring of a String, w, iff
There Are Strings x and y Such That
–
–
–
–
w=xvy
x is called the prefix
y is called the suffix
x and/or y could be 
Kleene Closure
• Set of All Strings, *, I*
– Order IS important
– Not the same as P  , the powerset of the
alphabet, since order is NOT important in the
powerset
Concatenation Operator
• If x, y  I*, then the concatenation of x and
y is written as
–z=x
– e.g., if
y
• x = “Red”
|x|= 3
• y = “skins”
|y|= 5
• z = x  y = “Redskins“ | z | = 8
Concatenation Operator
• Concatenation of Any String With the Null
String Results in the Original String
–xe=ex=x
–x=x=x
• Concatenation is Associative
– x = abc
y=def
z= ghi
–(xy)z=x(yz)
– abcdefghi = abcdefghi
Language
• Language, L: Any Subset of the Set of All
Strings of an Alphabet
L  *
L  I*
I*
L1
L2
Classes of Languages
• Enumerated Languages
– Defined by a List of All Words in the Language
• Le = { “quidditch”, “nimbus 2000”, }
• not very interesting
• Rule-based
– Defined by Properties or a Set of Rules
L r   w I * : w has the property P 
Rule Based Languages
• A Test to Determine Whether a String Is a
Member of a Language
• A Means of Constructing Strings That Are
in the Language
– Must be able to construct ALL strings in the
language
– Must be able to construct ONLY strings in the
language
Rule-Based Language Example
Let I = { a, b }
• A Language That Consists of All Two Letter
Strings
– L = { aa, ab, ba, bb }
–  is not an element of the language
Empty Language
• Null Language, Empty Language, : The
Language With No Words in It
– Not the same as 
–  can be made into a language with words
L    
– A language consisting only of  is still a
language
Kleene Star
If L  I * is a language, then
• L* Is the Set of All Strings Obtained by
Concatenating Zero or More Strings of L.
• Concatenation of Zero Strings Is 
• Concatenation of One String Is the String
Itself
• L+ = L* - { }
Kleene Closure Example
• L = { 0, 1}
L* = {
,
0, 00, 000, ... , 0*,
1, 11, 111, ... , 1*,
01, 001, 0001, ... , 0*1,
... }
Kleene Closure Examples
• L = { ab, f }
L* = { ,
ab, abf, fab, ffab, ffabf, ... }
• *
={ }
• if
L
={ }
then
L*
={ }
Kleene Closure Examples
Let I = { a, b }
• L = Language ( ( ab )* )
{, ab, abab, ababab, ... }
which is not the same as
• L = Language ( a* b* )
{, a, b, ab, aab, abb, ... }
The language of all strings of a’s and b’s in
which the a’s, if any, come before the b’s
Recursive Language Definition
• Variation of Rule-Based
• Three-step Process
1. Specify some basic elements of the set
2. Specify the rules for forming new elements
from old elements of the set
3. Specify that elements not in 1 or 2 above are
NOT elements of the set
Recursive Example
• Two Equivalent Recursive Definitions of
Rational Numbers
– Rational #1
1. Rat_1 = { -, ... -3, -2, -1, 1, 2, 3, ... ,  }
2. if p, q  Rat_1, then p/q  Rat_1
3. the only rational numbers are those generated by 1
and 2 above.
Recursive Example
– Rational #2
1. Rat_2 = { -1, +1 }
2. if p, q  Rat_2, p,q != 0, then (p+q)/p  Rat_2
3. the only rational numbers are those generated by 1
and 2 above.
e.g.,
n  1  1
11
2 1

2
3
n
1
1
1
generates all integers
Interest in Recursive Definitions
• Allows Us to Prove Some Statements About
What Is Computable.
• Leads to Proof by Induction
Principle of Mathematical
Induction*
Let A Be a Subset of the Natural Numbers
• 0  A, and
• for each natural number, n,
– if
– implies
– then
* Lewis & Papadimitriou, pg. 24
{ 0, 1, ..., n }  A ,
(n + 1)  A
A=N
Mathematical Induction
• In practice, mathematical induction is used
to prove assertions of the form
For all natural numbers, n,
property P is true
Mathematical Induction Practice
To prove statements of the form
A = { n : P is true of n }, three steps
1. Basis Step: show that 0  A,
i.e., P is true of n = 0
2. Induction Hypothesis: assume that for some
arbitrary, but fixed n > 0, P holds for each
natural number 0, 1, ... , n
Mathematical Induction Practice
3. Induction Step: use the induction hypothesis
(that P is true of n) to show that P is true of (n
+ 1)
• By the Induction Principle, Then A=N and
Hence, P Is True of Every Natural Number.
Induction Example*
Show that for any n  0,
 n 2  n
1 2    n  

 2 
1. Basis Step
 02  0
0

 2 
0 0
 true for n  0
* Lewis & Papadimitriou, pg. 25
Induction Example
2. Induction Hypothesis
Assume that for some n  0,
 m2  m
1 2    m  

 2 
when m  n
Induction Example
3. Induction Step
1  2   n   n  1  1  2   n  n  1
 n2  n
where 1  2   n is replaced by 
 from the
 2 
induction hypothesis
 n2  n

  n 1
 2 
n 2  n  2n  2

2
Induction Example



n 2  2n  1   n  1
2
n  1   n  1


2
which shows that the hypothesis is true since if it was true
for n  0, then it must be true for any n  1
2
Another Induction Example
• Define EVEN as
1. 0 is in EVEN
2. if x  EVEN then so is x + 2
3. The only elements of EVEN are those
produced by 1 & 2 above.
• Prove by induction that all of elements of
EVEN end in either 0, 2, 4, 6, or 8.
Induction Example (cont)
Proof
1. Basis Step
0  EVEN by definition, therefore the property is
true of the zero’th step since 0  { 0, 2, 4, 6, 8
}
2. Induction Hypothesis
Assume that the last digit of
(m+2)  { 0, 2, 4, 6, 8 } for 0 < m < n
Induction Example
3. Induction Step
n
0
1
2
3
4
 EVEN n
 EVEN
ends in {0,2,4,6,8}
2
...
by step 2
4
n
2n + 2
6
n+1
(2n+2)+2
8
n+1
2(n+1)+2
0+2=2, 2+2=4,4+2=6
10
6+2=8, 8+2=0  {0,2,4,6,8}
Regular Expressions
• Shorthand Notation for Concisely
Expressing Languages
• Defined Recursively
• Lead to a Definition of Regular Languages
• Provide Finite Representation of Possibly
Infinite Languages
• Lead to Lexical Analyzers
Regular Expressions Notation
Language a
a
ab
 a, b
a*
 a *
a
 a

with operator precedence being
highest
*

Kleene Star
Concatenation
lowest
 or +
Set Union
Regular Expressions Over I
•  and  are regular expressions
• a is a regular expression for each a  I
• If r and s are regular expressions, then so
are r  s, r  s, and r*
• No other sequences of symbols are regular
expressions
Regular Expressions Alternative
1. L(  ) = { }
L( a ) = { a }
If p and q are regular expressions, then
2. L( pq )
= L( p ) L( q )
3. L( p  q )
= L( p )  L( q )
4. L( p* )
= L( p )*
Regular Expressions Example
What is L3 ( ( a  b )* a ) ?
L 3  L  a  b * L a 
2
 L   a  b  * a
1
 L  a  b  * a
4
  L  a   L  b   * a
3
   a    b   * a
 1, 1 
  a, b  * a
 definition 
  w  a, b  *: w ends with a 
Regular Expressions
• Boolean OR Distributes over Concatenation
L  language   a  bc  c * b 
L  language  ac * b  bcc * b 
– which is the language of all strings beginning
with a, ending with b, and having none or more
c’s in the middle, and,
– all strings beginning and ending with b and
having at least one c in the middle
Regular Expressions
• The Boolean OR Operator Can Distribute
When It Is Inside a Kleene Starred
Expression, but Only in Certain Ways
L  language  a  bc * b
  a  bc a  bc a  bc  b
 a * b   bc * b
  ab  bcb *
Regular Expressions
• Useful String
– ( a + b )* = the set of all strings of a and b of
any length
– L = Language ( ( a + b )* )
– { , ab, abab, abaab, abbaab, babba, bbb, ... }
Regular Languages
• If L  I* is finite, then L is regular.
• If L1 and L2 are regular, so are
– L3= L1  L2
– L4= L1  L2 = {x1  x2 | x1  L1 , x2  L2 }
• If L is regular, then so is L*, where * is the
Kleene Star
Regular Languages
• If L Is a Finite Language, Then L Can Be
Defined by a Regular Expression.
• The Converse Is Not True. That Is, Not All
Regular Expressions Represent Finite
Languages.
• L = Language( ( a + b )* ) Is Infinite Yet
Regular