Languages and Compilers
(SProg og Oversættere)
Bent Thomsen
Department of Computer Science
Aalborg University
With acknowledgement to John Mitchell and Elsa Gunter, on whose slides this lecture is based.
1
Sequence control
• Implicit and explicit sequence control
– Expressions
• Precedence rules
• Associativity
– Statements
• Sequence
• Conditionals
• Iterations
– Subprograms
– Declarative programming
• Functional
• Logic programming
2
Expression Evaluation
• Determined by
– operator evaluation order
– operand evaluation order
• Operators:
– Most operators are either infix or prefix (some
languages have postfix)
– Order of evaluation determined by operator
precedence and associativity
3
Example
• What is the result for:
3 + 4 * 5 + 6
• Possible answers:
– 41 = ((3 + 4) * 5) + 6
– 47 = 3 + (4 * (5 + 6))
– 29 = (3 + (4 * 5)) + 6 = 3 + ((4 * 5) + 6)
– 77 = (3 + 4) * (5 + 6)
4
Example Again
• In most languages, 3 + 4 * 5 + 6 = 29
• … but it depends on the precedence of operators
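A quick check of the example, as a minimal sketch in Python (chosen here only for illustration; it is one of the languages where * binds tighter than +). The names default and groupings are made up for this sketch.

```python
# Python gives * higher precedence than +, so the unparenthesized
# expression groups as (3 + (4 * 5)) + 6.
default = 3 + 4 * 5 + 6

# The other groupings from the slide have to be forced with parentheses.
groupings = {
    "((3 + 4) * 5) + 6": ((3 + 4) * 5) + 6,
    "3 + (4 * (5 + 6))": 3 + (4 * (5 + 6)),
    "(3 + (4 * 5)) + 6": (3 + (4 * 5)) + 6,
    "(3 + 4) * (5 + 6)": (3 + 4) * (5 + 6),
}

print("default:", default)
for text, value in groupings.items():
    print(text, "=", value)
```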
5
Operator Precedence
• Operators of highest precedence evaluated first (bind more tightly).
• Precedence for operators usually given in a table, e.g.:

Precedence table for ADA
Level      Operator            Operation
Highest    ** abs not          Exp, abs, negation
           * / mod rem         Multiplying
           + -                 Unary
           + - &               Binary
           = /= < <= > >=      Relations
Lowest     and or xor          Boolean

• In APL, all infix operators have same precedence
6
C precedence levels
Precedence   Operators                 Operator names
17           tokens, a[k], f()         Literals, subscripting, function call
             ., ->                     Selection
16           ++, --                    Postfix increment/decrement
15*          ++, --                    Prefix inc/dec
             ~, -, sizeof              Unary operators, storage
             !, &, *                   Logical negation, indirection
14           typename                  Casts
13           *, /, %                   Multiplicative operators
12           +, -                      Additive operators
11           <<, >>                    Shift
10           <, >, <=, >=              Relational
9            ==, !=                    Equality
8            &                         Bitwise and
7            ^                         Bitwise xor
6            |                         Bitwise or
5            &&                        Logical and
4            ||                        Logical or
3            ?:                        Conditional
2            =, +=, -=, *=,            Assignment
             /=, %=, <<=, >>=,
             &=, ^=, |=
1            ,                         Sequential evaluation
Programming Language design and Implementation -4th Edition
Copyright©Prentice Hall, 2000
7
Associativity
• When we have sorted precedence we need to
sort associativity!
• What is the value of:
7 – 5 – 2
• Possible answers:
– In Pascal, C++, SML associate to the left
7 – 5 – 2 = (7 – 5) – 2 = 0
– In APL, associate to the right
7 – 5 – 2 = 7 – (5 – 2) = 4
8
Special Associativity
• In languages with built in support for infix
exponent operator, it is standard for it to
associate to the right:
2 ** 3 ** 4 = 2 ** (3 ** 4)
• In ADA, exponentiation is non-associative; must
use parentheses
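Both associativity rules can be observed in one language. The sketch below uses Python, where subtraction associates to the left and ** associates to the right; the exponent example uses 2 ** 3 ** 2 rather than the slide's numbers just to keep the values small.

```python
# Subtraction associates to the left in Python.
left = 7 - 5 - 2            # parsed as (7 - 5) - 2
right = 7 - (5 - 2)         # the APL-style grouping, forced with parentheses

# Exponentiation associates to the right in Python.
power = 2 ** 3 ** 2         # parsed as 2 ** (3 ** 2)
power_left = (2 ** 3) ** 2  # the other grouping, forced with parentheses

print(left, right, power, power_left)
```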
9
Operand Evaluation Order
• Example:
A := 5;
f(x) = {A := x+x; return x};
B := A + f(A);
• What is the value of B?
• 10 or 15?
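The two candidate answers can be reproduced with a small Python sketch. Python evaluates operands left to right, so the second answer is obtained here by swapping the operands; in a language that leaves the order open, the same source text could give either value. The names B_left_first and B_call_first are illustrative.

```python
A = 5

def f(x):
    """Return x, but overwrite the global A as a side effect."""
    global A
    A = x + x
    return x

# Left operand first: A is read (5) before f runs, giving 5 + 5.
A = 5
B_left_first = A + f(A)

# Call first: f sets A to 10 before A is read, giving 5 + 10.
A = 5
B_call_first = f(A) + A

print(B_left_first, B_call_first)
```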
10
Example
• If assignment returns the assigned value, what is the
result of
x = 5;
y = (x = 3) + x;
• Possible answers: 6 or 8
• Depends on language, and sometimes compiler
– C allows compiler to decide
– SML forces left-to-right evaluation
• Note assignment in SML returns a unit value
• .. but we could define a derived assignment operator in
SML as fn (x,v)=>(x:=v;v)
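A sketch of the same question in Python (3.8 or later), whose := operator is an assignment that returns the assigned value. Python fixes left-to-right operand evaluation, so each answer needs its own operand order; the variable names are illustrative.

```python
x = 5
y_assign_first = (x := 3) + x   # assignment runs first, then x is read: 3 + 3

x = 5
y_read_first = x + (x := 3)     # x is read first, then assigned: 5 + 3

print(y_assign_first, y_read_first)
```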
11
Solution to Operand Evaluation Order
• Disallow all side effects
– “Purely” functional languages try to do this –
Miranda, Haskell
– Problem: I/O, error conditions such as overflow are
inherently side-effecting
12
Solution to Operand Evaluation Order
• Disallow all side-effects in expressions but allow
in statements
– Problem: not applicable in languages with nesting of
expressions and statements
13
Solution to Operand Evaluation Order
• Fix order of evaluation
– SML does this – left to right
– Problem: makes some compiler optimizations hard to
impossible
• Leave it to the programmer to be sure the order
doesn’t matter
– Problem: error prone
14
Short-circuit Evaluation
• Boolean expressions:
• Example: x <> 0 andalso y/x > 1
• Problem: if andalso is ordinary operator and both
arguments must be evaluated, then y/x will raise
an error when x = 0
15
Boolean Expressions
• Most languages allow (some version of)
if…then…else, andalso, orelse not to
evaluate all the arguments
• if true then A else B
– doesn’t evaluate B
16
Boolean Expressions
• if false then A else B
– doesn’t evaluate A
• if b_exp then A else B
– Evaluates b_exp, then applies previous rules
17
Boolean Expressions
• Bexp1 andalso Bexp2
– If Bexp1 evaluates to false, doesn’t evaluate Bexp2
• Bexp1 orelse Bexp2
– If Bexp1 evaluates to true, doesn’t evaluate Bexp2
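A minimal Python sketch of the difference. Python's and behaves like andalso; eager_and is a hypothetical ordinary function, introduced here only to show what happens when both arguments are evaluated before the call.

```python
def safe_ratio_test(x, y):
    # 'and' short-circuits: y / x is evaluated only when x != 0 holds.
    return x != 0 and y / x > 1

def eager_and(a, b):
    # An ordinary function: both arguments are evaluated before the call.
    return a and b

print(safe_ratio_test(0, 10))
print(safe_ratio_test(2, 10))

# With an ordinary function the division is attempted even when x is 0.
try:
    x, y = 0, 10
    eager_and(x != 0, y / x > 1)
    eager_failed = False
except ZeroDivisionError:
    eager_failed = True

print("eager version raised an error:", eager_failed)
```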
18
Short-circuit Evaluation – Other Expressions
• Example: 0 * A = 0
• Do we need to evaluate A?
• In general, in f(x,y,…,z) are the arguments to f
evaluated before f is called and the values are passed, or
are the unevaluated expressions passed as arguments to f
allowing f to decide which arguments to evaluate and in
which order?
19
Eager Evaluation
• If language requires all arguments to be evaluated before
function is called, language does eager evaluation and
the arguments are passed using pass by value (also
called call by value) or pass by reference
20
Lazy Evaluation
• If language allows function to determine which
arguments to evaluate and in which order,
language does lazy evaluation and the arguments
are passed using pass by name (also called call
by name)
21
Lazy Evaluation
• Lazy evaluation mainly done in purely functional
languages
• Some languages support a mix
• Effect of lazy evaluation can be implemented in
functional language with eager evaluation
– Use thunking fn()=>exp and pass function instead of
exp
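The thunking idea can be sketched in Python, an eager language, by wrapping each argument in a parameterless lambda. The helper names expensive and lazy_if are assumptions made for this example.

```python
evaluated = []

def expensive(label, value):
    """Record that the expression was evaluated, then return its value."""
    evaluated.append(label)
    return value

def lazy_if(cond, then_thunk, else_thunk):
    # The callee decides which thunk to call; the other is never evaluated.
    return then_thunk() if cond else else_thunk()

result = lazy_if(True,
                 lambda: expensive("A", 1),
                 lambda: expensive("B", 2))

print(result, evaluated)
```

Passing expensive("A", 1) and expensive("B", 2) directly, without the lambdas, would evaluate both before lazy_if is entered.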
22
Infix and Prefix
• Infix notation: Operator appears between operands:
2 + 3 → 5
3 + 6 → 9
• Implied precedence: 2 + 3 * 4 → 2 + (3 * 4),
not (2 + 3) * 4
• Prefix notation: Operator precedes operands:
+ 2 3 → 5
+ 2 * 3 5 → (+ 2 (* 3 5)) → + 2 15 → 17
• Prefix notation is sometimes called Cambridge Polish
notation – used as basis for LISP
23
Polish Postfix
• Postfix notation: Operator follows operands:
2 3 + → 5
2 3 * 5 + → ((2 3 *) 5 +) → 6 5 + → 11
• Called Polish postfix since few could pronounce the name of the Polish
mathematician Łukasiewicz, who invented it.
• An interesting, but unimportant mathematical curiosity
when presented in 1920s. Only became important in
1950s when Burroughs rediscovered it for their ALGOL
compiler.
24
Evaluation of postfix
• 1. If argument is an operand, stack it.
• 2. If argument is an n-ary operator, then the n arguments
are already on the stack. Pop the n arguments from the
stack and replace by the value of the operator applied to
the arguments.
• Example: 2 3 4 + 5 * +
1. 2 - stack
2. 3 - stack
3. 4 - stack
4. + - replace 3 and 4 on stack by 7
5. 5 - stack
6. * - replace 5 and 7 on stack by 35
7. + - replace 35 and 2 on stack by 37
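The two rules above can be sketched as a short stack evaluator in Python. This is a minimal sketch that assumes space-separated integer tokens and binary operators only; the OPS table and the function name eval_postfix are illustrative choices.

```python
import operator

# Binary operators only; this table is an assumption of the sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    stack = []
    for tok in tokens:
        if tok in OPS:
            # Rule 2: the operands are already on the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            # Rule 1: an operand is stacked.
            stack.append(int(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

print(eval_postfix("2 3 4 + 5 * +".split()))
```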
25
Importance of Postfix to Compilers
• Code generation same as expression evaluation.
• To generate code for 2 3 4 + 5 * +, do:
1. stack L-value of 2
2. stack L-value of 3
3. stack L-value of 4
4. + - generate code to take R-value of top stack element (L-value of 4)
and add to R-value of next stack element (L-value of 3) and place
L-value of result on stack
5. stack L-value of 5
6. * - generate code to take R-value of top stack element (L-value of 5)
and multiply by R-value of next stack element (L-value of 7) and
place L-value of result on stack
7. + - generate code to take R-value of top stack element (L-value of 35)
and add to R-value of next stack element (L-value of 2) and place
L-value of result (37) on stack
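The same stack discipline can emit code instead of computing values. The sketch below is one possible rendering in Python, not the scheme of any particular compiler: the stack holds operand names, and each operator emits a three-address instruction into a fresh temporary (t1, t2, ... are invented names standing in for the L-values of results).

```python
OPERATORS = {"+", "-", "*", "/"}

def gen_code(tokens):
    """Translate postfix tokens into a list of three-address instructions."""
    stack = []        # names of operands: constants or temporaries
    code = []
    temp_count = 0
    for tok in tokens:
        if tok in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            temp_count += 1
            temp = f"t{temp_count}"
            code.append(f"{temp} = {left} {tok} {right}")
            stack.append(temp)    # the result's location goes back on the stack
        else:
            stack.append(tok)
    return code, stack.pop()

code, result = gen_code("2 3 4 + 5 * +".split())
for line in code:
    print(line)
print("result in", result)
```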
26
Control of Statement Execution
• Sequential
• Conditional Selection
• Looping Construct
• Must have all three to provide full power of a
Computing Machine
27
Basic sequential operations
• Skip
• Assignments
– Most languages treat assignment as a basic operation
– Some languages have derived assignment operators such as:
• += and *= in C
– In SML assignment is just another (infix) function
• := : 'a ref * 'a -> unit
• I/O
– Some languages treat I/O as basic operations
– Others, like C, SML and Java, treat I/O as functions/methods
• Sequencing
– C;C
• Blocks
– Begin …end
– {…}
28
Conditional Selection
• Design Considerations:
– What controls the selection
– What can be selected: modern languages allow
any kind of program block
– What is the meaning of nested selectors
29
Conditional Selection
• Single-way
– IF … THEN …
– Controlled by boolean expression
30
Conditional Selection
• Two-way
– IF … THEN … ELSE
– Controlled by boolean expression
– IF … THEN … usually treated as degenerate form of
IF … THEN … ELSE
– IF…THEN together with IF..THEN…ELSE require
disambiguating associativity
31
Multi-Way Conditional Selection
• CASE
– Typically controlled by scalar type
– Each selection has own block of statements it
executes
– What if no selection is given?
• Language gives default behavior
• Language forces total coverage, typically
with programmer-defined default case
32
Multi-Way Conditional Selection
• SWITCH
– Similar to Case
– One block of code for whole switch
– Selection specifies program point in block
– break used for early exit from block
• ELSEIF
– Equivalent to nested if…then…else…
33
Multi-Way Conditional Selection
• Non-deterministic Choice
– Syntax:
if <boolean guard> -> <statement>
[] <boolean guard> -> <statement>
...
[] <boolean guard> -> <statement>
fi
34
Multi-Way Conditional Selection
• Non-deterministic Choice
– Semantics:
• Randomly choose statement whose
guard is true
• If none
–Do nothing
–Cause runtime error
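A minimal Python sketch of these semantics, assuming guards and statements are passed as pairs of functions. The function name guarded_if is hypothetical, and this sketch picks the "cause runtime error" option when no guard is true.

```python
import random

def guarded_if(guards, rng=random):
    """guards: list of (guard_fn, statement_fn) pairs."""
    # Collect every statement whose guard currently holds.
    enabled = [stmt for guard, stmt in guards if guard()]
    if not enabled:
        raise RuntimeError("no guard is true")
    # Pick one enabled statement at random and execute it.
    return rng.choice(enabled)()

x, y = 3, 7
maximum = guarded_if([
    (lambda: x >= y, lambda: x),
    (lambda: y >= x, lambda: y),
])
print(maximum)
```

When x equals y both guards are true and either branch may run; the result is the same, which is the usual way such programs are written.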
35
Multi-Way Conditional Selection
• Pattern Matching in SML
datatype 'a tree = LF of 'a | ND of ('a tree)*('a tree)
- fun print_tree (LF x) = (print("Leaf ");print_a(x))
| print_tree (ND(x,y)) = (print("Node");
print_tree(x);
print_tree(y));
36
Multi-Way Conditional Selection
• Search in Logic Programming
– Clauses of form
<head> :- <body>
– Select clause whose head unifies with current goal
– Instantiate body variables with result of unification
– Body becomes new sequence of goals
37
Example
• APPEND in Prolog: append([a,b,c], [d,e], X)
→ X = [a,b,c,d,e]
• Definition:
append([ ], X, X).
append([H | T], Y, [H | Z]) :- append(T, Y, Z).
38
Loops
• Main types:
– Counter-controlled iterators (For-loops)
– Logical-test iterators
– Recursion
39
For-loops
• Controlled by loop variable of scalar
type with bounds and increment size
• Scope of loop variable?
– Extent beyond loop?
– Within loop?
40
For-loops
• When are loop parameters calculated?
– Once at start
– At beginning of each pass
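Both choices can be observed in Python, as a rough sketch: a for loop over range(n) fixes its bound once at the start, while a hand-written while loop re-reads the bound on every pass. The list names are illustrative.

```python
# Bound computed once: range(n) is built before the first pass.
n = 3
visited_once = []
for i in range(n):
    n = 10                    # changing n here does not extend the loop
    visited_once.append(i)

# Bound re-read at the beginning of each pass.
n = 3
visited_each = []
i = 0
while i < n:
    if i == 0:
        n = 5                 # this change does affect the loop
    visited_each.append(i)
    i += 1

print(visited_once, visited_each)
```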
41
Logic-Test Iterators
• While-loops
– Test performed before entry to loop
• repeat…until and do…while
– Test performed at end of loop
– Loop always executed at least once
42
Gotos
• Requires notion of program point
• Transfers execution to given program point
• Basic construct in machine language
• Implements loops
43
Gotos
• Makes programs hard to read and
reason about
• Hard to know how a program got to a
given point
• Generally thought to be a bad idea in a
high level language
44
Fortran Control Structure
10 IF (X .GT. 0.000001) GO TO 20
11 X = -X
IF (X .LT. 0.000001) GO TO 50
20 IF (X*Y .LT. 0.00001) GO TO 30
X = X-Y-Y
30 X = X+Y
...
50 CONTINUE
X =A
Y = B-A
GO TO 11
…
45
Historical Debate
• Dijkstra, Go To Statement Considered Harmful
– Letter to Editor, Communications of the ACM, March 1968
– Now on web: http://www.acm.org/classics/oct95/
• Knuth, Structured Prog. with go to Statements
– You can use goto, but do so in structured way …
• Continued discussion
– Welch, GOTO (Considered Harmful)^n, n is Odd
• General questions
– Do syntactic rules force good programming style?
– Can they help?
46
Control structures represented as flowcharts
47
48
Spaghetti code
49
Prime Programs
• Earlier discussion on control structures seemed
somewhat ad hoc.
• Is there a theory to describe control structures?
• Do we have the right control structures?
• Roy Maddux in 1975 developed the concept of a prime
program as a mechanism for answering these questions.
50
Proper programs
• A proper program is a flowchart with:
– 1 entry arc
– 1 exit arc
– There is a path from entry arc to any node to exit arc
• A prime program is a proper program which has no
embedded proper subprogram of greater than 1 node.
(i.e., cannot cut 2 arcs to extract a prime subprogram
within it).
• A composite program is a proper program that is not
prime.
51
Prime decomposition
• Every proper program can be decomposed into a hierarchical set
of prime subprograms. This decomposition is unique (except for
special case of linear sequences of function nodes).
52
Structured programming
• Issue in 1970s: Does this limit what programs can be written?
• Resolved by Structure Theorem of Böhm-Jacopini.
• Here is a graph version of theorem originally developed by
Harlan Mills:
53
Structured Programming
• Use of prime programs to define structured
programming:
• Concept first used by Dijkstra in 1968 as gotoless
programming.
• Called structured programming in early 1970s
• Program only with if, while and sequence control
structures.
54
Advance in Computer Science
• Standard constructs that structure jumps
if … then … else … end
while … do … end
for … { … }
case …
• Modern style
– Group code in logical blocks
– Avoid explicit jumps except for function return
– Cannot jump into middle of block or function body
• But there may be situations when “jumping” is the right
thing to do!
55
Exceptions: Structured Exit
• Terminate part of computation
– Jump out of construct
– Pass data as part of jump
– Return to most recent site set up to handle exception
– Unnecessary activation records may be deallocated
• May need to free heap space, other resources
• Two main language constructs
– Declaration to establish exception handler
– Statement or expression to raise or throw exception
Often used for unusual or exceptional condition, but not necessarily.
56
Exceptions
• Exception: caused by unusual event
– Detected by hardware
– Detected in program
• By compiler
• By explicit code in program
57
Exceptions
• Built-in only or also user defined
• Can built-in exceptions be raised
explicitly in code
• Carry value (such as a string) or only
label
58
Exceptions
• Exception handling: Control of
execution in presence of exception
• Can be simulated by programmer
explicitly testing for error conditions
and specifying actions
– But this is error prone and clutters
programs
59
Exception Handlers
• Is code separate unit from code that can
raise the exception
• How is an exception handler bound to an
exception
• What is the scope of a handler: must a
handler be local to the code unit that raises the exception
• After handler is finished, where does the
program continue, if at all
• If no handler is explicitly present, should
there be an implicit default handler
60
ML Example
exception Determinant; (* declare exception name *)
fun invert (M) =
(* function to invert matrix *)
…
if …
then raise Determinant (* exit if Det=0 *)
else …
end;
...
invert (myMatrix) handle Determinant => … ;
Value for expression if determinant of myMatrix is 0
61
C++ Example
Matrix invert(Matrix m) {
if … throw Determinant;
…
};
try { … invert(myMatrix); …
}
catch (Determinant) { …
// recover from error
}
62
C++ vs ML Exceptions
• C++ exceptions
– Can throw any type
– Stroustrup: “I prefer to define types with no other purpose than exception
handling. This minimizes confusion about their purpose. In particular, I
never use a built-in type, such as int, as an exception.”
-- The C++
Programming Language, 3rd ed.
• ML exceptions
– Exceptions are a different kind of entity than types.
– Declare exceptions before use
Similar, but ML requires the recommended C++ style.
63
ML Exceptions
• Declaration
exception name of type
gives name of exception and type of data passed when raised
• Raise
raise name parameters
expression form to raise an exception and pass data
• Handler
exp1 handle pattern => exp2
evaluate first expression
if exception that matches pattern is raised,
then evaluate second expression instead
General form allows multiple patterns.
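For comparison, the declare / raise / handle pattern can be sketched in Python. The exception name Determinant follows the earlier matrix example; the 2x2 inversion and the helper invert_or_none are assumptions added to make the sketch runnable.

```python
class Determinant(Exception):
    """Declared exception; raised when a matrix has determinant 0."""

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise Determinant("matrix is singular")   # raise, passing data
    return [[d / det, -b / det], [-c / det, a / det]]

def invert_or_none(m):
    # Handler: value for the whole expression if the exception is raised.
    try:
        return invert_2x2(m)
    except Determinant:
        return None

print(invert_or_none([[2, 0], [0, 2]]))
print(invert_or_none([[1, 2], [2, 4]]))
```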
64
Which handler is used?
exception Ovflw;
fun reciprocal(x) =
if x<min then raise Ovflw else 1/x;
(reciprocal(x) handle Ovflw=>0) / (reciprocal(y) handle Ovflw=>1);
• Dynamic scoping of handlers
– First call handles exception one way
– Second call handles exception another
– General dynamic scoping rule
Jump to most recently established handler on run-time stack
• Dynamic scoping is not an accident
– User knows how to handle error
– Author of library function does not
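A Python sketch of the same situation: one library function, two call sites, each installing its own handler. MIN is a hypothetical threshold standing in for min on the slide, and ratio is an invented wrapper for the two-handler expression.

```python
class Ovflw(Exception):
    pass

MIN = 1e-6   # hypothetical threshold, standing in for 'min'

def reciprocal(x):
    # The library author only knows that something went wrong.
    if x < MIN:
        raise Ovflw()
    return 1 / x

def ratio(x, y):
    # The caller decides what each failure should mean.
    try:
        num = reciprocal(x)
    except Ovflw:
        num = 0            # first call: treat failure as 0
    try:
        den = reciprocal(y)
    except Ovflw:
        den = 1            # second call: treat failure as 1
    return num / den

print(ratio(2, 4), ratio(0, 4), ratio(2, 0))
```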
65
Exception for Error Condition
- datatype 'a tree = LF of 'a | ND of ('a tree)*('a tree)
- exception No_Subtree;
- fun lsub (LF x) = raise No_Subtree
| lsub (ND(x,y)) = x;
> val lsub = fn : 'a tree -> 'a tree
– This function raises an exception when there is no reasonable
value to return
66
Exception for Efficiency
• Function to multiply values of tree leaves
fun prod(LF x) = x
| prod(ND(x,y)) = prod(x) * prod(y);
• Optimize using exception
fun prod(tree) =
let exception Zero
fun p(LF x) = if x=0 then (raise Zero) else x
| p(ND(x,y)) = p(x) * p(y)
in
p(tree) handle Zero=>0
end;
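The early-exit effect can be made visible with a Python sketch. The tree encoding is an assumption of this example: a leaf is an integer and a node is a (left, right) tuple. The visited list exists only to show which leaves were reached.

```python
class Zero(Exception):
    pass

visited = []   # records the leaves actually inspected

def prod(tree):
    def p(t):
        if isinstance(t, tuple):
            left, right = t
            return p(left) * p(right)
        visited.append(t)
        if t == 0:
            raise Zero()      # abandon all pending multiplications
        return t
    try:
        return p(tree)
    except Zero:
        return 0

result = prod(((2, 0), (3, 4)))
print(result, visited)
```

The leaves 3 and 4 are never visited, because the exception unwinds straight to the handler in prod.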
67
Dynamic Scope of Handler
exception X;
(let fun f(y) = raise X
and g(h) = h(1) handle X => 2
in
g(f) handle X => 4
end) handle X => 6;
Which handler is used?
68
Compare to static scope of variables
exception X;
(let fun f(y) = raise X
and g(h) = h(1)
handle X => 2
in
g(f) handle X => 4
end) handle X => 6;

val x=6;
(let fun f(y) = x
and g(h) = let val x=2 in
h(1) end
in
let val x=4 in g(f) end
end);
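The contrast can be checked with a Python transcription, offered as a sketch: Python also finds handlers dynamically (most recently established on the run-time stack) and resolves variables statically (by where the function was defined). The function names handler_demo and variable_demo are invented.

```python
class X(Exception):
    pass

def handler_demo():
    def f(y):
        raise X()
    def g(h):
        try:
            return h(1)
        except X:
            return 2          # most recently established handler
    try:
        try:
            return g(f)
        except X:
            return 4
    except X:
        return 6

def variable_demo():
    x = 6
    def f(y):
        return x              # the x visible where f was defined
    def g(h):
        x = 2                 # a different, local x; f never sees it
        return h(1)
    def call():
        x = 4                 # likewise invisible to f
        return g(f)
    return call()

print(handler_demo(), variable_demo())
```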
69
Exceptions and Resource Allocation
exception X;
(let
val x = ref [1,2,3]
in
let
val y = ref [4,5,6]
in
… raise X
end
end) handle X => ...;
• Resources may be allocated between handler and raise
• May be “garbage” after exception
• Examples
– Memory
– Lock on database
– Threads
– …
General problem: no obvious solution
70
Continuations
• General technique using higher-order functions
– Allows “jump” or “exit” by function call
• Used in compiler optimization
– Make control flow of program explicit
• General transformation to “tail recursive form”
• Idea:
– The continuation of an expression is “the remaining work to
be done after evaluating the expression”
– Continuation of e is a function applied to e
71
Example
• Expression
– 2*x + 3*y + 1/x + 2/y
• What is continuation of 1/x?
– Remaining computation after division
let val before = 2*x + 3*y
fun continue(d) = before + d + 2/y
in
continue (1/x)
end
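The same decomposition in Python, as a sketch. The inner function is called continue_ because continue is a reserved word in Python; direct is added only for comparison.

```python
def value_with_continuation(x, y):
    before = 2 * x + 3 * y
    def continue_(d):
        # Everything that remains to be done once 1/x is known.
        return before + d + 2 / y
    return continue_(1 / x)

def direct(x, y):
    return 2 * x + 3 * y + 1 / x + 2 / y

print(value_with_continuation(2, 4), direct(2, 4))
```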
72
Other uses for continuations
• Explicit control
– Normal termination -- call continuation
– Abnormal termination -- do something else
• Compilation techniques
– Call to continuation is functional form of “go to”
– Continuation-passing style makes control flow explicit
MacQueen: “Callcc is the closest thing to a
‘come-from’ statement I’ve ever seen.”
73
Capturing Current Continuation
• Language feature
– callcc : call a function with current continuation
– Can be used to abort subcomputation and go on
• Examples
– callcc (fn k => 1);
> val it = 1 : int
• Current continuation is “fn x => print x”
• Continuation is not used in expression.
– 1 + callcc(fn k => 5 + throw k 2);
> val it = 3 : int
• Current continuation is “fn x => print 1+x”
• Subexpression throw k 2 applies continuation to 2
74
More with callcc
• Example
1 + callcc(fn k1=> …
callcc(fn k2 => …
if … then (throw k1 0)
else (throw k2 "stuck")
))
• Intuition
– Callcc lets you mark a point in program that you can return to
– Throw lets you jump to that point and continue from there
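Python has no callcc, but the "abort subcomputation and go on" use can be approximated with exceptions. The sketch below is an escape-only approximation, not a full implementation: k may be invoked only while the function passed to callcc is still running. The class _Throw and the function nested are invented for this example, and nested throws an integer rather than a string to keep the addition well defined.

```python
class _Throw(Exception):
    def __init__(self, tag, value):
        self.tag = tag
        self.value = value

def callcc(f):
    """Escape-only approximation of callcc."""
    tag = object()            # identifies this particular continuation
    def k(value):
        raise _Throw(tag, value)
    try:
        return f(k)
    except _Throw as t:
        if t.tag is tag:
            return t.value    # the thrown value becomes callcc's result
        raise                 # belongs to an outer callcc

a = callcc(lambda k: 1)               # continuation not used
b = 1 + callcc(lambda k: 5 + k(2))    # the pending '5 +' is abandoned

def nested(use_outer):
    return 1 + callcc(lambda k1:
        callcc(lambda k2: k1(0) if use_outer else k2(10)))

print(a, b, nested(True), nested(False))
```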
75
Subprograms
1. A subprogram has a single entry point
2. The caller is suspended during execution of the
called subprogram
3. Control always returns to the caller when the called
subprogram’s execution terminates
Functions or Procedures?
• Procedures provide user-defined statements
• Functions provide user-defined operators
76
Subprograms
• Specification: name, signature, actions
• Signature: number and types of input
arguments, number and types of output
results
– Book calls it protocol
77
Subprograms
• Actions: direct function relating input
values to output values; side effects on
global state and subprogram internal
state
• May depend on implicit arguments in
form of non-local variables
78
Subprogram As Abstraction
• Subprograms encapsulate local variables,
specifics of algorithm applied
– Once compiled, programmer cannot access
these details in other programs
79
Subprogram As Abstraction
• Application of subprogram does not require
user to know details of input data layout
(just its type)
– Form of information hiding
80
Subprogram Parameters
• Formal parameters: names (and types) of
arguments to the subprogram used in
defining the subprogram body
• Actual parameters: arguments supplied for
formal parameters when subprogram is
called
81
Subprogram Parameters
• Parameters may be used to:
– Deliver a value to subprogram – in mode
– Return a result from subprogram – out mode
– Both – in out mode
• Most languages use only in mode
• Ada uses all
82
Parameter Passing (In Mode)
• Pass-by-value: Subprogram makes local copy of
value (r-value) given to input parameter
• Assignments to parameter not visible outside
subprogram
• Most common in current popular languages
83
Parameter Passing (In Mode)
• Pass-by-reference: Subprogram is given an
access path (l-value) to the input parameter
which is used directly
• Assignments to parameter directly affect non-local variables
• Same effect can be had in pass-by-value by
passing a pointer
84
Parameter Passing (In Mode)
• Pass-by-name: text for argument is passed
to subprogram and expanded in each
place parameter is used
– Roughly same as using macros
• Achieves late binding
85
Pass-by-name Example
integer INDEX= 1;
integer array ARRAY[1:2]
procedure UPDATE (PARAM);
integer PARAM
begin
PARAM := 3;
INDEX := INDEX + 1;
PARAM := 5;
end
UPDATE(ARRAY[INDEX]);
• The above code puts 3 in ARRAY[1] and 5 in ARRAY[2]
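Pass-by-name can be simulated in Python by passing a function that re-evaluates the argument text on every use. This is a sketch under assumptions: only a setter thunk is passed because UPDATE never reads PARAM, and a dict keyed 1..2 stands in for ARRAY[1:2].

```python
INDEX = 1
ARRAY = {1: 0, 2: 0}      # dict keyed 1..2 mimics ARRAY[1:2]

def update(set_param):
    global INDEX
    set_param(3)          # PARAM := 3, i.e. ARRAY[INDEX] := 3 with INDEX = 1
    INDEX = INDEX + 1
    set_param(5)          # PARAM := 5, i.e. ARRAY[INDEX] := 5 with INDEX = 2

def set_array_at_index(value):
    # The argument text ARRAY[INDEX] is re-evaluated on every use.
    ARRAY[INDEX] = value

update(set_array_at_index)
print(ARRAY, INDEX)
```

Under pass-by-reference the location ARRAY[1] would be fixed at the call, so both assignments would land in ARRAY[1].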
86
Subprogram Implementation
• Subprogram definition gives template for
its execution
– May be executed many times with different
arguments
87
Subprogram Implementation
• Template holds code to create storage for
program data, code to execute program
statements, code to delete program storage when
through, and storage for program constants
• Layout of all but code for program statements
called activation record
88
Subprogram Implementation
• Code segment
– Code to create activation record inst
– Program code
– Code to delete activation record inst
– const 1
– …
– const n
• Activation Record
– Return point
– Static link
– Dynamic link
– Result data
– Input parameters
– Local parameters
89
Activation Records
• Code segment generated once at compile time
• Activation record instances created at run time
• Fresh activation record instance created for each
call of subprogram
– Usually put in a stack (top inserted first)
90
Activation Records
• Static link – pointer to bottom of activation
record instance of static parent
– Used to find non-local variables
• Dynamic link – pointer to top of activation
record instance of the caller
– Used to delete subprogram activation record instance
at completion
91
Summary
• Expression
– Precedence and associativity
– Evaluation of formal arguments
• Eager, lazy or mixed
• Structured Programming
– Basic statements
– Conditionals
– Loops
– Go to considered harmful
• Exceptions
– “structured” jumps that may return a value
– dynamic scoping of exception handler
• Continuations
– Function representing the rest of the program
– Generalized form of tail recursion
– Used in Lisp, ML compilation
• Subprograms
92