Transcript ppt
Question Session 2
Outline
• Regular Expression
• Backus-Naur form
• FIZ and FLIZ Programming
Regular Expressions: Tool for
Lexical Analyzer
• Regular expression: A notation to specify a pattern that
matches a set of strings
• Theoretical synyax:
• A single character is a regular expression
• (Concatenation) RS is a re, means R followed by S
• (Alternation) R|S is a re, means R or S
• (Kleene star) R* means zero or more copies of R
slide 3
RE Syntax in Lex/Flex
http://flex.sourceforge.net/manual/Patterns.html
• ‘x’
match the character 'x'
• ‘.’
• ‘[xyz]’
any character (byte) except newline
a character class; in this case, the pattern matches
either an 'x', a 'y', or a 'z'
• ‘[abj-oZ]’ a "character class" with a range in it; matches
an 'a', a 'b', any letter from 'j'
through 'o', or a 'Z'
• ‘[^A-Z]’
a "negated character class", i.e., any
character but
those in the
class. In this case, any character
EXCEPT an uppercase letter.
slide 4
RE Syntax in Lex/Flex
http://flex.sourceforge.net/manual/Patterns.html
• ‘[^A-Z\n]’
any character EXCEPT an uppercase letter
or a newline
• ‘[a-z]{-}[aeiou]’ the lowercase consonants
• ‘r*’
zero or more r's, where r is any regular
expression
• ‘r+’
• ‘r?’
• ‘r{2,5}’
• ‘r{2,}’
• ‘r{4}’
one or more r's
zero or one r's (that is, “an optional r”)
anywhere from two to five r's
two or more r's
exactly 4 r's
slide 5
RE Syntax in Lex/Flex
http://flex.sourceforge.net/manual/Patterns.html
• ‘{name}’
the expansion of the ‘name’ definition
• ‘"[xyz]\"foo"’
the literal string: ‘[xyz]"foo’
• ‘(r)’ match an ‘r’; parentheses are used to override precedence
• ‘rs’
the regular expression ‘r’ followed by the regular
expression ‘s’; called concatenation
• ‘r|s’
either an ‘r’ or an ‘s’
• ‘^r’
an ‘r’, but only at the beginning of a line
• ‘r$’
an ‘r’, but only at the end of a line
slide 6
Validating Credit Card Numbers
• All Visa card numbers start with a 4. New cards
have 16 digits. Old cards have 13.
• ^4[0-9]{12}([0-9]{3})?$
• Discover card numbers begin with 6011 or 65. All
have 16 digits.
• ^6(011|5[0-9]{2})[0-9]{12}$
• Overall:
• ^(4[0-9]{12}([0-9]{3})
# Visa
• | 6(011|5[0-9]{2})[0-9]{12})$ # Discover
Validating Email Address
• Example: [email protected] (only alphabet or integer)
• ^[A-Za-z0-9]+(.[A-Za-z0-9]+)*
•
@[A-Za-z0-9]+(.[A-Za-z0-9]+)*(.[A-Za-z]{2,})$;
Backus-Naur form
•
BNF or Backus-Naur form is used in CS to
describe context-free grammars. It is often used
to describe the syntax of programming
languages.consists of one or more of the
following:
<nonterminal> ::= __expression__
•
Where __expression__ consists of one or more
terminals and nonterminals or nothing (epsilon).
Parse an integer:
• How to parse an integer string?
• Regular expression:
• 0|-?[1-9][0-9]*
• ^(-?[1-9]+\d*)$|^0$
• BNF:
•
•
•
•
•
•
integer ::= 0 | sign non-zero <- positive, negative or 0
sign ::= empty | <- plus (empty) or minus
non-zero ::= lead-digit number <- first digit can’t be 0
lead-digit ::= 1|2|3|4|5|6|7|8|9
number::= digit | number digit
digit ::= 0|1|2|3|4|5|6|7|8|9
Consider the following BNF:
• A -> x | yA | yAzA
• where x,y,z are terminals.
• Consider yyxzx
• Construct a parse tree:
• A
• /\
• y A
•
/|\`\
• yAzA
•
| |
•
x x
A
/|\`\
yAzA
/\ \
y A x
|
x
<- two possible trees: ambiguity!
Ambiguity: a particular string can
have more than one parse tree
• Consider the following rules:
• statement = ...
• | selection-statement
• selection-statement = ...
• | IF ( expression ) statement
• | IF ( expression ) statement ELSE statement
• Construct an ambiguous string
• IF (a) IF (b) s; ELSE s2;
What is FIZ
• FIZ
•
F is for functional programming
•
I is for integer
(we only use integer
data type)
•
Z is for zero, denoting the simplicity of the
language
• In Functional Programming, one
•
defines functions
•
writes expressions
• Syntax: instead of writing f(a, b, c); we write (f a b c)
slide 13
Basic Features of FIZ
• non-negative integer
evaluates to its value
• (inc exp) evaluates to value of exp + 1
• (dec exp) evaluates to halt if value of exp is 0
•
otherwise evaluates to value of exp - 1
• (ifz cexp texp fexp)
•
evaluates to value of texp when cexp
evaluates to 0
•
and to value of fexp when cexp evaluates
to none 0
slide 14
Defining New Functions
• (define (name arguments) function-body )
• Define logaz (rounded to the floor, assume a z > 0)
• Log a z:
• Base: z < a: 0
• Rec: 1+loga(z/a)
• Div a b:
• Base: a < b: 0
• Rec: 1+(div (a-b) b)
• Sub a b:
• Base: b=0: a
• Rec: sub a-1 b-1 (a >=b is guaranteed)
• Less a b:
• Base: b=0: 1
• Rec: if a=0: 0; o/w: Less a-1 b-1
slide 15
logaz
• (define (less a b)
• (ifz b 1
• (ifz a 0
•
(less (dec a) (dec b)))))
• (define (sub a b)
• (ifz b a
•
(sub (dec a) (dec b))))
• (define (div a b)
• (ifz (less a b) 0
• (inc (div (sub a b) b))))
• (define (log a z)
• (ifz (less z a) 0
• (inc (log a (div z a)))))
The FLIZ Language
• There are the following constant expressions
• 123
; a number
• []
; the empty list
• [c c … c] ; a list of one of more constant expressions
•Examples:
•123
•[1 2 3]
•[]
; 123
; [ 1 2 3 ]
; [ ]
•[1 [2 3] [4 5]] ; [ 1 [ 2 3 ] [ 4 5] ]
•[1 [] 2 [[3 4]]]; [ 1 [ ] 2 [ [ 3 4 ] ] ]
slide 17
The FLIZ Language: Builtin
Functions
• (list exph expt)
a list obtained by inserting the value
of exph as first element to the value of expt
• (head exp)
first element of the list exp
• (tail exp) a list obatined by removing the first element
• (ifa cexp texp fexp) when cexp evaluates to a number,
evaluates to texp, otherwise evaluates to fexp
• (ifn cexp texp fexp)
•
evaluates to value of texp when cexp evaluates to []
•
and to value of fexp when cexp evaluates not to []
slide 18
Len l
• Count length of a list
• Base: l is []: 0
• Rec: 1+(tail l)
• (define (len l)
• (ifn l 0 (inc (len (tail l)))))
Member a l
• If integer a is member of l, return 0, else 1
• if l is []: 1
• o/w:
• if (head l) is number
• If a=(head l): 0
• o/w: evaluate (member (tail l))
• o/w:
• If (member (head l))=0: 0
• o/w: evaluate (member (tail l))
Member a l
• (define (member a l)
• (ifn l 1
• (ifa (head l)
•
(ifz (eq a (head l))
•
0
•
(member a (tail l)))
•
(ifz (member a (head l))
•
0
•
(member a (tail l))))))
Tips
• Tips of Lab2:
• Part 1: write down the base case and recursion first
• Supporting List: define new non-terminals, then
construct AST (multiple alternatives).
• A straightforward way is to use a single AST node, which points
to the value of the parsed list.
• Use a sub-tree (multiple AST node) to represent a list.
• Lab 2 DDLs
•
•
•
•
Part 1: 02/06/2017
Part 2: 02/06/2017
Part 3
Part 4: 02/13/2017