L01_02_03_ADT - Monash University

Download Report

Transcript L01_02_03_ADT - Monash University

COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to
you by or on behalf of Monash University pursuant to
Part VB of the Copyright Act 1968 (the Act). The material
in this communication may be subject to copyright under
the Act. Any further reproduction or communication of
this material by you may be the subject of copyright
protection under the Act. Do not remove this notice.
www.monash.edu.au
1
Prepared by:
Bernd Meyer
February 2007
FIT2004
Algorithms & Data Structures
L1-2: ADTs, Proofs & Induction
www.monash.edu.au
Specification versus Implementation
Mathematics
Function
Algebra
Algorithm
Data Structure
Procedure, Method
Class, Type, Module…
Specification
Implementation
Algorithmics
Specification
Implementation
Programming
(from Güting, Datenstrukturen und Algorithmen, Teubner 1992)
www.monash.edu.au
3
Abstract Data Types ADTs
Algebra
Data Structure
ADT
Class, Type, Modul…
www.monash.edu.au
4
ADT: Integer (Syntax)
• ADT integer
sorts int, bool;
ops
+:
int x int
-:
int x int
*:
int x int
/:
int x int
mod: int x int
=:
>:
etc…
->
->
->
->
->
int;
int;
int;
int;
int;
/* ???? */
int x int -> bool;
int x int -> bool;
• Syntax only, no semantics
• cf interfaces in java
www.monash.edu.au
5
ADT: Set of Integer (Syntax)
• ADT integer set
sorts intset, int, bool;
ops
empty:
-> intset;
insert:
intset x int -> intset;
delete:
intset x int -> intset;
contains:
intset x int -> bool;
isempty:
intset -> bool;
union:
Signature
????????
etc…
www.monash.edu.au
6
ADT: Set of Integer (Semantic A, axiomatic)
•
ADT integer set
sorts intset, int, bool;
ops
…
axioms
isempty(empty)
isempty(insert(x,i))
insert(insert(x,i),i)
contains(insert(x,i),i)
contains(insert(x, j),i)
...

true

false

insert(x,i)

true
 contains(x,i) iff i  j
www.monash.edu.au
7
Axiomatic Semantic
• specifies the laws that the functions obey
• without actually giving rules of how to evaluate them
• Problem: axioms have to be “complete”.
– How can you be sure of this?
• Advantage: makes proving properties easier
• Advantage: less prescriptive
• Does not say anything about the implementation
• This is the true level of ADT description
• quickly becomes unwieldy in realistic cases
www.monash.edu.au
8
ADT: Set of Integer (Semantic B, algebraic)
•
ALGEBRA integer set
sorts intset, int, bool;
ops
empty:
-> intset;
insert:
intset x int -> intset;
delete:
intset x int -> intset;
contains:
intset x int -> bool;
isempty:
intset -> bool;
union: ????????
sets
bool
int
= {true, false };
=
Z;
intset = F(Z) =
•
S  Z
S is finite
First: sets have to be made concrete (“Carrier Sets”)
www.monash.edu.au
9
ADT: Set of Integer (Semantic B, algebraic)
•
ALGEBRA integer set
sorts intset, int, bool;
ops
…
sets
…
functions
empty
insert(S, e)
delete(S, e)
isempty(M )
contains(S, e)
•


 S  {e}
 S \ {e}
 (S  )
 (e S)
Second: ops have to be specified.
www.monash.edu.au
10
Algebraic Semantics
•
•
Sets and Functions have to be fully specified
intuitively often easier to understand
•
This provides a so-called MODEL of the ADT
•
can render it more difficult to prove properties
(than using ADT axioms)
•
•
closer to implementation
is more “restrictive” than we want to be
•
however, in practice often easier to use
www.monash.edu.au
11
Alternative algebraic semantics for intset
We can decide on a different representation for intset.
We will write a set {1, 3, 4, 5}
as a nested term cons(1, cons(3, cons(4, cons(5, nil))))
cons is a purely syntactical element not a function to be evaluated
sorts intset, int, bool;
…
sets
bool
= {true, false };
int
= Z;
intset = nil | cons(e:int, s:intset)
We use this notation to specify that inset contains the elements
{nil, cons(1, nil), cons(2, nil), …
cons(1, cons(1, nil)), cons(2, cons(1, nil))), …,
…}
www.monash.edu.au
12
Alternative algebraic semantics for intset
sorts intset, int, bool;
…
Functions
empty()
= nil
isempty(nil)
= true
isempty(cons(e,s))
= false
insert(e, s)
= s if contains(e,s)
insert(e, s)
= cons(e,s) otherwise
delete(e, cons(e,s))
= s
delete(e, cons(f, s)) = cons(f, delete(e, s))
if not (e=f)
delete(e, nil) = nil
contains(e, nil)
= false
contains(e, cons(e,s)) = true
contains(e, cons(f,s)) = contains(e,s) if not(e=f)
www.monash.edu.au
13
Implementing the inset ADT
• the above list set semantics can be directly implemented as a
functional program
• here: using SML
• SML is a functional programming language
• The core of programming in SML is writing and applying
functions
• To learn about SML (not required):
– have a look at http://www.smlnj.org
– read “Elements of ML Programming” J.D. Ullman, Prentice Hall 1998.
– enrol in CSE3082 “Programming Language Paradigms”
www.monash.edu.au
14
Aside: intset in SML/NJ
datatype intset = nil | cons of int * intset;
fun empty ()
fun isempty (nil)
| isempty (cons(x,y))
= nil;
= true
= false;
fun contains (e, nil)
= false
| contains (e, cons(f,s)) = if (e=f) then true
else contains(e,s);
fun insert (e, s)
= if contains(e,s) then s
else cons(e,s);
fun delete (e,nil)
| delete (e, cons(f,s))
= nil
= if not (e=f) then cons(f, delete(e,s))
else s;
This is a fully functional SML/NJ program
www.monash.edu.au
15
Aside: intset in SML/NJ (cont)
Standard ML of New Jersey v110.59 [built: Mon Jun 5 11:53:38 2006]
- use "intset.ml";
[opening intset.ml]
datatype intset = cons of int * intset | nil
val isempty = fn : intset -> bool
val contains = fn : int * intset -> bool
val insert = fn : int * intset -> intset
val delete = fn : int * intset -> intset
val it = () : unit
- val x = cons(4, cons( 2, nil));
val x = cons (4,cons (2,nil)) : intset
- isempty(x);
val it = false : bool
- contains(3, x);
val it = false : bool
- contains(4,x);
val it = true : bool
- val y = delete(4,x);
val y = cons (2,nil) : intset
- contains(4,y);
val it = false : bool
loading the SML
program defining intset
SML extracts and returns the
signature automatically
a process called type-inference.
Defining x to be the set {4, 2}
Using various list functions
www.monash.edu.au
16
Proving ADT properties from axioms
Example
•
ADT integer
sorts integer;
ops
+: integer x integer -> integer;
*: integer x integer -> integer;
axioms
x + y = y + x;
/* commutative */
x + ( y + z ) = (x + y) + z /* associative */
x * ( y + z ) = (x * y) + (x * z)
/* distributive */
x * y = y * x;
…
•
/* commutative */
Prove: x*a + x*b + y*b + y*a = (x+y)*(a+b)
www.monash.edu.au
17
ADTs are model-independent
ALGEBRA integer /* version 1 */
sorts integer;
ops
+:
integer x integer -> integer;
…
sets
integer = N;
functions
+(x,y) = evaluate(x+y)
for the purpose of illustration we assume the usual operations
on natural numbers to be available and accessible via evaluate(.)
This is purely syntactical to make clear that we are relying on a pre-defined
operation on natural numbers N.
www.monash.edu.au
18
ADTs are model-independent
ALGEBRA integer /* version 2: Peano arithmetic */
sorts integer;
ops
+:
integer x integer -> integer;
…
sets
integer = zero | s(x: integer);
/* read an integer is either zero or the successor of some integer */
functions
+(zero,x) = x;
+(s(x), y) = s(+(x,y));
…
These two versions are specifications of the same ADT integer!
The axioms (“laws”) of integer arithmetic are the same for both
Two algebras that fulfill the same axioms but with different carrrier sets are called
(different) models of the ADT.
www.monash.edu.au
19
Is ALGEBRA intset correct?
ALGEBRA intset
sorts intset, int, bool;
…
Functions
isempty(nil)
= true
isempty(cons(e,s))
= false
insert(e, s)
= s if contains(e,s)
insert(e, s)
= cons(e,s) otherwise
delete(e, cons(e,s))
= s
delete(e, cons(f, s)) = cons(f, delete(e, s)) if not (e=f)
delete(e, nil) = nil
contains(e, nil)
= false
contains(e, cons(e,s)) = true
contains(e, cons(f,s)) = contains(e,s) if not(e=f)
•
•
State some theorems that should hold for an inset.
Try to prove or falsify these.
it is usually more difficult to prove theorems for the algebraic spec (later!)
www.monash.edu.au
20
Equality for intset (Attempt 1)
ALGEBRA intset
sorts intset, int, bool;
…
functions
…
equal(nil, nil)
= true
equal(cons(e,s), nil) = false
equal(nil, cons(e,s)) = false
equal(cons(e,s1), cons(f, s2)) = equal(s1, s2) if (e=f)
equal(cons(e,s1), cons(f, s2)) = false otherwise
•
•
•
Is this correct?
Give some examples of equality and check.
What have we specified here?
www.monash.edu.au
21
Equality for intset (Attempt 2)
ALGEBRA intset
sorts intset, int, bool;
…
functions
…
equal(nil, nil)
= true
equal(cons(e,s), nil) = false
equal(nil, cons(e,s)) = false
equal(cons(e,s1), s2) = equal(s1, delete(e, s2))
•
•
•
Is this correct?
Give some examples of equality and check.
What have we specified here?
www.monash.edu.au
22
Equality for intset (finally correct)
ALGEBRA intset
sorts intset, int, bool;
…
functions
…
equal(nil, nil)
= true
equal(cons(e,s), nil) = false
equal(nil, cons(e,s)) = false
equal(cons(e,s1), s2) = equal(s1,delete(e, s2)) if contains(e,s2)
equal(cons(e,s1), s2) = false otherwise
•
What we had specified in Attempt 1 takes order into account!
(Attempt 2 was plain wrong)
•
this was really an algebra for lists!
www.monash.edu.au
23
ADT: Lists
•
ALGEBRA integer list
sorts intlist, int, bool;
ops
empty:
-> intlist;
insert:
intlist x int -> intlist;
delete:
intlist x int -> intlist;
contains:
intlist x int -> bool;
isempty:
intlist -> bool;
append:
intlist x intlist -> intlist
sets
bool
= {true, false };
int
= Z;
intlist = nil | cons(e:int, s:intlist)
…
Define the functions for append!
www.monash.edu.au
24
ADT: Lists
•
ALGEBRA integer list
sorts intlist, int, bool;
ops
…
sets
…
functions
append(nil, l)
= l
append(cons(e,l1),l2)
= cons(e, append(l1, l2))
…
Convince youself that this is correct using examples!
Theorem (append is associative):
append(l1, append(l2, l3)) = append(append(l1, l2), l3)
Try to prove this (induction!)
In an axiomatic definition this would have been one of the axioms.
www.monash.edu.au
25
Summary: why ADTs
• abstract level of specification (a bit like interface in java)
• implementation (language/machine) independent
• supports formal analysis of properties
• cannot (!) analyze time/space complexity: no algorithm
• Models provided by Algebra
• ADTs and Algebras in practice often unwieldy.
– usage often restricted to the level of signatures
– with semantics declared only informally (natural language)
or as an algorithm
www.monash.edu.au
26
Induction
• general mathematical technique
• used for many proofs
• also used for program design
• (in general you will find that proof techniques and
design techniques are often dual)
www.monash.edu.au
27
Induction
• general mathematical proof technique
• to prove an infinite number of theorems (or statements)
– construct a well-ordered sequence of the statements
– prove that the first staement is true
– prove that if a particular (n-th) statement is true,
so is the next (n+1-st) one.
• also used for program design
• (in general you will find that proof techniques and design
techniques are often dual)
www.monash.edu.au
28
Induction on Numbers
• To prove a property P(n) for all integers n, n>= k
– (Base Case) prove P(k)
– (Induction)
prove P(n+1) from P(n)
• Example:
• Base Case:
0 i 
n
n(n  1)
2
0 i  0  1 
1
1(1  1)
2
www.monash.edu.au
29
Induction on Natural Numbers
• To prove a property P(n) for all integers n, n>= k
– (Base Case) prove P(k)
– (induction)
prove P(n+1) from P(n)
• Induction:  0
n1

i
n1
0
n
(n  1)((n  1)  1)
n(n  1)
from  0 i 
2
2
i  (n  1) 

n
0
i
assume P(n)
n(n  1)
 (n  1) 
2
2(n  1)  n(n  1) n 2  3n  2 (n  1)(n  2) (n  1)((n  1)  1)




2
2
2
2
www.monash.edu.au
30
Pitfalls of Induction
• Make sure that you only use properties that are
true for all instances of the theorem! n must be arbitrary for
the induction to work.
• Often the first few cases are “special”.
Don’t start the induction “too early”
Theorem (wrong): “Given n lines no two of which are parallel, they all
intersect in a common point”
Proof (wrong)
– Base: Obviously true for n=2
– Induction: look at (n+1) lines. By induction the first n lines have a point x in
common. By induction the last n lines have a point y in common. x and y
must be identical because otherwise some lines would have more than 1
point in common. Thus they would be identical which contradicts the
assumption that we have (n+1) different lines.
• This is clearly wrong, but why??
www.monash.edu.au
31
Structural Induction
• We can generalize induction on numbers
to induction on structures.
• This works by “peeling off” layers of the
structure.
• Two typical examples in computer science are:
– Lists: induction over the length of the list
– Trees: induction over the height of the tree
• This is called “structural induction”
www.monash.edu.au
32
Using Structural Induction
ALGEBRA integer list
sorts intlist, int, bool;
ops
…
sets
…
functions
length(nil)
= 0
length(cons(e,l))
= 1+length(l)
Example (trivial): prove that length is correct
• Base (n=0): l=nil.
length(l)=0 by definition
• Induction (n+1): l=cons(e,l1)
n+1=length(l)=1+length(l1)=
(* by definition *)
=1+n
(* by induction *)
www.monash.edu.au
33
Using Structural Induction
•
ALGEBRA integer list
sorts intlist, int, bool;
ops
…
sets
…
functions
append(nil, l)
= l
append(cons(e,l1),l2)
= cons(e, append(l1, l2))
…
Convince youself that this is correct using examples!
Theorem (append is associative):
append(l1, append(l2, l3)) = append(append(l1, l2), l3)
www.monash.edu.au
34
Append is associative (base case)
Theorem (append is associative):
append(l1, append(l2, l3)) = append(append(l1, l2), l3)
Base Case (l1=nil)
append(l1, append(l2, l3))=
append(nil, append(l2, l3))=
append(l2, l3)=
append(append(nil,l2), l3)=
append(append(l1,l2), l3)
www.monash.edu.au
35
Append is associative (induction)
Theorem (append is associative):
append(l1, append(l2, l3)) = append(append(l1, l2), l3)
Induction (l1=cons(e,l0))
append(l1, append(l2, l3)))=
append(cons(e,l0), append(l2, l3)))=
cons(e, append(l0, append(l2, l3)))=
cons(e, append(append(l0, l2), l3))=
append(cons(e, append(l0, l2)), l3)=
append(append(cons(e,l0),l2), l3)=
append(append(l1,l2), l3)
(* by definition *)
(* by induction *)
www.monash.edu.au
36
Algorithm Design by Induction
•
Induction can also be used as the guiding principle of
algorithm design.
•
To solve a large problem
1. solve a (trivial) base case
2. extend the solution to stepwise to the larger problem
We will come back to this later
•
•
Quick Example: Sorting a list
1. a single element list is trivially sorted
2. to sort a list cons(e,l) where l is already sorted,
just insert e in the right spot in l
www.monash.edu.au
37
Algorithm Design by Induction
•
ALGEBRA integer list
sorts intlist, int, bool;
ops
…sort: intlist x intlist -> intlist
sets
…
functions
sort(nil)
= nil
sort(cons(e,l))
= insertSorted(e,sort(l))
insertSorted(e,nil)
= cons(e,nil)
insertSorted(e,cons(f,l))= cons(f,insertSorted(e,l)) if (e>f)
insertSorted(e,cons(f,l))= cons(e,cons(f,l)) otherwise
…
Convince yourself that this is correct!
Do you recognize this algorithm??
www.monash.edu.au
38
Proving (Imperative) Program Properties
•
•
This is hard!
We are (mainly) interested in
– Termination
– Correctness
•
Proving properties for a program
(particularly imperative program like C, Java, etc.)
is considerably harder than proving properties from a ADT or
Algebra (this is why we need such specifications!!)
•
It is easier in a functional language like SML because the program
can be understood as an algebraic specification and thus (to some
degree) be manipulated like one.
www.monash.edu.au
39
Proving (Imperative) Program Properties
• The basic “trick” to prove properties of imperative programs
is to derive the properties from simpler
• “assertions” about the program state
(statements that are always true at some spot in the
execution), particularly
• “invariants” assertions in loops that hold for all iterations
www.monash.edu.au
40
Selection Sort
i=0;
while(i<n) do {
// invariant 1: a[0]…a[i-1] is sorted, a[0]…a[i-1] <= a[i]…a[n]
// trivially clear for i=0
// if i>0 then by induction for a[0…i-2] and the smallest in a[i-1…n] was
// swapped into a[i-1] in the last iteration
j = smallestIndex(i,n);
swap(a[i], a[j]);
i++;
}
// assertion 1: i=n (from loop termination)
•
•
Termination: i is increased in every iteration, loop terminates when i=n
Correctness from invariant 1 and assertion 1:
a[0]…a[i-1] sorted && i=n && a[n] >= a[0…i-1] implies a[0…n] sorted.
This was very informal and “sloppy”
www.monash.edu.au
41
Hoare Logic
•
a formalized form of logical reasoning about programs is Hoare Logic. It
uses
– pre and postconditions to specify properties
– first order logic to reason about program fragments
•
First approaches to this kind of reasoning (on flowcharts) were
described by R.W. Floyd. The foundation of modern axiomatic
semantics were laid in C.A.R. Hoare. “An axiomatic basis for
computer programming”. Communications of the ACM
12(10):576-585, October 1969.
•
In practice this approach can become quite unwieldy
www.monash.edu.au
42
First Example of Using Hoare Logic
•We write {A}c{B} to indicate that if A holds before the execution of c
and that {B} will hold after c has been executed.
•We use first order logic to combine assertions and to arrive at a statement
{Q} P {R}
•where P is our entire program, Q are the known preconditions and R is
what we want to prove
www.monash.edu.au
43
Invariants
it is of course much trickier to use assertions in loops
while ( c ) do
{A }
…statements
{B }
end.
We are “passing” the same assertions {A} and {B} in each iteration,
yet we want them to describe what is changing.
Trick: parameterize them with a loop variable.
(*) is such an invariant.
www.monash.edu.au
44
Hoare Rules for a simple imperative Language
• Simple first order logic is not enough to reason
about programs because it cannot tell us what the
consequences of executing a program statement
are.
• Thus we need a special set of logical inference
rules that allows us to do this. These are the socalled Hoare Rules.
• The rules here are much simplified to give the
general “flavour” of Hoare Logic.They have to be
used in conjunction with the general inference
rules of first-order logic.
www.monash.edu.au
45
Hoare Rule for Sequence
This has to be read as
to prove that B holds after the sequence
c0; c1
is executed if A holds at the beginning
find some assertion C such that
– {A}c0{C} holds and
– {C}c1{B} holds.
www.monash.edu.au
46
Hoare Rule for If-Then-Else
This has to be read as
to prove that B holds after the statement
if b then c0 else c1
is executed provided A holds at the beginning
find some assertion C such that
– {A and b}c0{C} holds and
– {C}c1{B} holds.
www.monash.edu.au
47
Hoare Rule for While
This has to be read as
to prove that A still holds after the loop
while b do c
is executed provided A holds at the beginning
we need to show that c does not change A if the
loop is entered at all (if b is true)
{A and b}c{A} holds
(not b) is trivially true when the loop terminates
www.monash.edu.au
48
Hoare Rule for Assignment
This looks complicated. It has to be read as
to prove that B holds after the assignment
X := a
we need to show that B already hols before the
assignment with the additional assumption that
X=a.
To do so, we replace all occurrences of X in B by a
(denoted as B[a/X]) and prove this.
www.monash.edu.au
49
Example of Using Hoare Logic
•
•
What does the following program fragment compute provided
x=n and y=1 at the start of the loop
Let us prove this using Hoare Logic.
(This should make you appreciate how complex proving program properties really is!)
while ( x > 0 ) do
begin
Y := X * Y ;
X := X - 1 ;
end.
www.monash.edu.au
50
Example of Using Hoare Logic
We have to show that
We invoke the rule for while
and choose as the invariant A
We show that A is indeed an invariant by choosing b as
We now need to prove
www.monash.edu.au
51
Example of Using Hoare Logic
to prove
we invoke the sequence rule. This reduces the goal to
and
We invoke the assignment rule on the second part to obtain
with
www.monash.edu.au
52
Example of Using Hoare Logic
we now invoke assignment on the first part to obtain
where C is of course as before so that
Clearly, in normal first order logic we have
Thus we have established that A is an invariant {A} c {A}. Thus also
Using the while rule we can conclude
www.monash.edu.au
53
Words of Wisdom
“Beware of bugs in the above code;
I have only proved it correct, not tried it.''
Donald Knuth in a 5 page memo entitled ``Notes on the van
Emde Boas construction of priority deques: An instructive
use of recursion,'' March 29, 1977
www.monash.edu.au
54
Proving Program Properties Automatically
•
•
Automatically Proving Termination is in general impossible
– this also holds for all other non-trivial program properties
This is Turing’s famous “Halting Problem”
(A. Turing. On computable numbers, with an application to the Entscheidungsproblem, 1937)
–
and the Rice theorem
it will be fully explained in FIT2014 (Theory of Computation, Semester 2) for now the following has to suffice:
•
1.
2.
3.
Imagine a program K(n) that takes any program (listing) l as input and
computes whether this program terminates when it uses its own listing
as input. K returns true/false accordingly.
We write another program M(n) that first runs K(n) and then
– stops if K(n) returns false and
– loops infinitely if K(n) answers true
now we take the listing of M
and give it to M as input.
M applied to itself stops iff
K determines that M applied to itself doesn’t stop!
www.monash.edu.au
55
Summary
•
•
•
•
•
•
Properties of a specifications can be proven
from ADT axioms or using standard algebraic techniques
Induction is central to many proofs
Induction can also serve as a guideline for algorithm design
properties of (imperative) programs can be proven using Hoare Logic
or (pseudo-proven) with less formal assertion-based techniques
•
proving non-trivial program properties automatically is generally
impossible.
www.monash.edu.au
56