Introduction to Functional Programming Using Haskell

Dan Vasicek
2010-03-21
Functional Programming in Haskell
• Haskell Information Sources
• Fundamental concepts
– Functional Programming
– Sessions, modules, & scripts
– Polymorphic types
– Order of evaluation
– Patterns
– Lazy evaluation
– Side Effects
• Fundamental Data types
– Boolean
– Numbers
– Characters
• Compound data types
– Tuples
– Lists
– User Defined Types
– Enumerations
• Efficiency
– Evaluation order
– Lazy Evaluation
– Space
• Monads
• Examples
Sources of Information
• Book - “Introduction to Functional
Programming Using Haskell”, Richard Bird,
Pearson Education Limited, England, Prentice
Hall Europe 1998
• Tutorial http://www.cs.utah.edu/~hal/docs/daume02yaht.pdf
• For writing “real production code” see:
http://haskell.org/haskellwiki/How_to_write_a_Haskell_program
Three Haskell Systems
• HUGS – Haskell User's Gofer System
– Interpreter only
– http://www.haskell.org/hugs
– Faster to load and start than GHC (compiled GHC code usually runs faster)
• GHC – Glasgow Haskell Compiler
– Both interpreter and compiler
– Slower to compile, more complex, and bigger than Hugs and nhc
• NHC - Nearly a Haskell Compiler
– Compiler only
– http://www.haskell.org/nhc98/download.html
Functional Programming
• A functional program is a function that solves a problem
• That function may involve several subsidiary functions and
is described in a notation that obeys normal mathematical
principles
• The result of the function is the solution of the problem
and is disjoint from the input to the function
• As in mathematics, once a function is “proven” correct,
changes in the environment will not invalidate your “proof”
• Functions can be passed as arguments and are “first class”
• Functions do not change the global “state”
• Single assignment. Once a “variable” gets a value, it never
changes.
Functional Programming in “C”
• Prohibit the use of pointers?
– Not likely!
– “Careful” use of pointers
• No modification of input parameters
• All “output” is clearly separated from the
input
Output = function_name(input)
Subroutine_name (input; output)
Fundamental Concepts of Haskell
• Polymorphic Static types
– length list – The list can have elements of any type.
So, length is polymorphic. It can be applied to lists of
characters, numbers, tuples, lists, …
length [] = 0
length (x:xs) = 1 + length xs
• Where [] is a pattern that means the empty list
• And x:xs is a pattern that means x is the first element of the
input list and xs is the rest of the list (“:” is the cons
operator)
– Called “pattern matching”. And pattern matching is
an important component of Haskell (more later)
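The definition above can be tried out as a small script. This is a minimal sketch; the name len is an assumption chosen only to avoid a clash with the length already exported by the Prelude.

```haskell
-- A hand-written version of length, named len (hypothetical name) so that
-- it does not clash with the Prelude's length.
len :: [a] -> Int
len []     = 0           -- the empty list has length 0
len (_:xs) = 1 + len xs  -- one element plus the length of the rest
```

Because the type is [a] -> Int, the same function accepts a string such as "abc", a list of numbers, or a list of lists.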
Examples of Polymorphism
• head :: [a] -> a
head (x:xs) = x
• tail :: [a] -> [a]
tail (x:xs) = xs
• Both fail if presented with an empty list
• Both are polymorphic: they work for lists of anything, even lists of
empty lists
• Examples of the Hindley-Milner type system
Order of Evaluation
• Order of evaluation (simplification, or reduction)
is not specified in a functional program
• Define: sq x = x*x
• sq(3+4) could be simplified as
– sq(7) → 7*7 → 49
– (3+4)*(3+4) → 7*(3+4) → 7*7 → 49
• Both orders produce the same result
• The independence of the result from the order is
a characteristic feature of functional programs
• The language implementation (not the programmer) is free to choose the “best” order
Lazy Evaluation
• let three x = 3
• let infinity = infinity +1
• Now simplify the expression
– three infinity
– Simplification of infinity first gives
• three (infinity + 1 + 1 + 1 and so on)
• which does not terminate
– Simplification of three first,
• three infinity = 3
• the expression terminates in one step
• Some simplification orders may terminate while others do not
• In GHCi, three infinity evaluates to 3
• In general, some simplification orders will be more efficient than
others
Lazy Evaluation
• Guarantees termination whenever
termination is possible
• Allows the language implementation to choose an “efficient”
evaluation order
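The three and infinity example can be written as a script rather than typed into GHCi. This is a minimal sketch; firstSquares is a hypothetical extra helper, added only to show that laziness also allows taking a finite prefix of an infinite list.

```haskell
-- three ignores its argument, so a lazy implementation never evaluates it.
three :: Integer -> Integer
three _ = 3

-- infinity has no finite value: forcing it would loop forever.
infinity :: Integer
infinity = infinity + 1

-- Hypothetical helper: only the first n elements of the infinite list
-- [1 ..] are ever computed.
firstSquares :: Int -> [Integer]
firstSquares n = take n [x * x | x <- [1 ..]]
```

Evaluating three infinity terminates because the argument is never demanded, whereas evaluating infinity on its own does not terminate.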
Side Effects
• A side effect is essentially something that happens in the course of
executing a function that is not related to the output produced by
that function.
• A pure function simply returns a value
• A pure function has no internal state
• A pure function cannot modify the input data
• Given the same arguments a pure function will always produce the
same result
• In GHCi values may be displayed by the interactive environment
• Monadic programming allows functional programs to mimic
imperative programs
• Monads provide a way to execute “Commands” and display values
Monads
• Haskell uses monads to isolate all impure (not
functional) computations from the rest of the
program and perform them in a “safe” way
• The execution order of a functional program is
entirely determined by the language implementation.
And this applies to the order of execution of
I/O as well
• Thus, the order of I/O can not be preserved by
a functional program
Example of Scrambled I/O Order
• “Thus, the order of I/O can not be preserved
by a functional program”
• Suppose that your functional program wrote
the words in the following order:
• “be preserved a functional program the order
of I/O can not by Thus,”
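The IO monad is how Haskell avoids this scrambling: actions combined in a do block run in the order written. The following is a minimal sketch; the names sayInOrder and sayWords are hypothetical and used only for illustration.

```haskell
-- Each putStr is an IO action; the do block runs them from top to bottom.
sayInOrder :: IO ()
sayInOrder = do
  putStr "Thus, "
  putStr "the order of I/O "
  putStrLn "is preserved inside a do block."

-- mapM_ runs one action per list element, in list order.
sayWords :: [String] -> IO ()
sayWords ws = mapM_ putStrLn ws
```

The pure parts of a program can still be evaluated in any order; only the actions sequenced through the monad have a fixed order.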
Imperative Constructs are NOT Functional
• x = x+1 – is not allowed as an assignment! (As a definition, it describes a value that never terminates, like infinity earlier.)
• All ghci commands are imperative.
• The interactive environment is imperative
• http://www.haskell.org/haskellwiki/Functional_programming
• http://www.haskell.org/all_about_monads/html/class.html
Haskell Fundamental Data Types
• Bool: True or False
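• Char:
– 'a' is the letter a
– '\n' is a newline
– '\x05e0' is the character with hexadecimal code 05e0 (shown as α on the original slide; U+05E0 is the Hebrew letter nun)
– '\122' is z (decimal code 122)
• Number:
– 1
– 2.718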
Compound Data types
Tuples
• ('a', "Daniel", 3.14159) is valid
• (1, map) is a valid tuple. But it cannot be displayed,
because functions have no “Show” instance.
• The functions for extracting the first and
second element of a pair are defined in the
standard Haskell environment
– fst(x,y) = x
– snd(x,y) = y
• fst(1,2,3) is not defined in the standard
environment
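Since fst and snd only accept pairs, an accessor for triples has to be written by hand. This is a minimal sketch; pairExample and fst3 are hypothetical names, not part of the standard environment.

```haskell
-- fst and snd come from the Prelude and work only on pairs.
pairExample :: (Int, String)
pairExample = (1, "one")

-- Hypothetical helper: the Prelude has no accessor for triples,
-- so one is defined here by pattern matching.
fst3 :: (a, b, c) -> a
fst3 (x, _, _) = x
```

With these definitions, fst pairExample and fst3 (1,2,3) both type-check, while fst (1,2,3) is rejected by the type checker.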
Lists
• Lists – a list is enclosed in square brackets
• The empty list is []
• The cons operator is “:”
• 1:2:3:[] is [1,2,3]
• "Daniel" is 'D':'a':'n':'i':'e':'l':[] = ['D','a','n','i','e','l']
• 'D':"an" = "Dan"
• All elements of a list must be of the same type
• [[1,2],[1]] is a valid list
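The list facts above can be checked with a few definitions. This is a minimal sketch; the names numbers, name, and nested are hypothetical.

```haskell
-- Building a list with the cons operator is the same as bracket notation.
numbers :: [Int]
numbers = 1 : 2 : 3 : []

-- A String is a list of Char, so a Char can be consed onto a String.
name :: String
name = 'D' : "an"

-- A list of lists is fine as long as every element has the same type.
nested :: [[Int]]
nested = [[1, 2], [1]]
```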
Comments in Haskell Code
• Single line comments are preceded by “--”
and continue to the end of the line. For
example:
suc n = n + 1 -- this is a successor function
• Multiline and nested comments begin with {- and end with -}. Thus
{- can be used to inactivate a block of code -}
Literate Programming
• A “literate code file” is a file with suffix .lhs
instead of .hs (Literate Haskell)
• Two styles for literate code:
– LaTeX Style : \begin{code} …
\end{code}
– “Bird” Style: prefix code lines with the “>”
character
• Compiler flags allow for reconfiguration of the
literate style
Example: LaTeX Literate Style
Here is a simple example of a literate script
for defining the quicksort function:
\begin{code}
tsort [] = []
tsort (x:xs) = tsort [y | y<-xs, y<x] ++ [x] ++ tsort [y | y<-xs, y>=x]
\end{code}
Notice that this definition is very inefficient for a
sorted list.
Example: Richard Bird Literate Style
In Bird-style a blank line is required before the code

> fact :: Integer -> Integer
> fact 0 = 1
> fact n = n * fact (n-1)

And a blank line is required after the code as well
Emacs Supports a Multi Mode Display
• One style for LaTeX
• And a second style for Haskell
• http://www.haskell.org/haskellwiki/Literate_programming#Haskell_and_literate_programming
Literate Programming in VIM
• http://www.haskell.org/haskellwiki/Literate_programming/Vim
Quick Sort Algorithm
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
               qsort (filter (>= x) xs)
• Inefficient! Calls filter twice for xs
• Can use (length (x:xs))^2 memory
More Efficient quicksort
import Data.List (partition)
qsort [] = []
qsort (x:xs) = qsort ys ++ [x] ++ qsort zs
  where (ys, zs) = partition (< x) xs
• Avoids filtering xs twice
• Still can use n^2 memory!
Notice that the < is necessary in the comparison
to preserve the original order of identical
elements
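The remark about preserving the order of identical elements is easier to see when elements are compared by a key only. This is a minimal sketch; qsortOn is a hypothetical variant of the slide's qsort, not a library function.

```haskell
import Data.List (partition)

-- Hypothetical variant of qsort that compares elements by a key only,
-- so the relative order of elements with equal keys becomes observable.
qsortOn :: Ord k => (a -> k) -> [a] -> [a]
qsortOn _ [] = []
qsortOn key (x:xs) = qsortOn key ys ++ [x] ++ qsortOn key zs
  where
    -- Strictly smaller keys go to the left of the pivot x; equal keys
    -- stay to its right, which keeps them in their original order.
    (ys, zs) = partition (\y -> key y < key x) xs
```

Sorting [(2,'a'),(1,'b'),(2,'c'),(1,'d')] by the first component keeps 'b' before 'd' and 'a' before 'c'. Using <= in the partition test would move a later equal element in front of the pivot.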
User Defined Types
Enumerated Types
• data Typename = Constructor1 | Constructor2 | Constructor3
Example of Enumerated Type
module Color where
data Color = Red | Orange | Yellow | Green | Blue | Purple | White | Black
colorToRGB Red = (255,0,0)
colorToRGB Orange = (255,128,0)
colorToRGB Yellow = (255,255,0)
colorToRGB Green = (0,255,0)
colorToRGB Blue = (0,0,255)
colorToRGB Purple = (255,0,255)
colorToRGB White = (255,255,255)
colorToRGB Black = (0,0,0)
Example of Enumerated Types
• colorToRGB Red
• returns the value:
(255,0,0)
• Red == Blue fails because == is not defined
for type Color
• colorToRGB Red == colorToRGB Blue
• Returns the value
False
User Defined Types
• User defined data types are done via a “data”
declaration having the general form:
data T u1 ... un = C1 t11 ... t1k1
                 | ...
                 | Cn tn1 ... tnkn
• where T is a type constructor; the ui are type
variables; the Ci are (data) constructors; and the tij
are the constituent types (possibly containing some
ui). The presence of the ui implies that the type is
polymorphic --- it may be instantiated by substituting
specific types for the ui
User Defined Types
• data Bool = True | False
– Bool is the “type” constructor
– True and False are the “data” constructors
• data Color = Red | Green | Blue | Indigo
• data Point a = Pt a a
– “a” on the lhs is a “type” variable
• data Tree a = Branch (Tree a) (Tree a) | Leaf a
– “a” is a “constituent type” on the rhs
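Functions over a user defined type are written by pattern matching on its data constructors. This is a minimal sketch using the Tree type from the slide; the helpers leaves and depth are hypothetical names added for illustration.

```haskell
-- The polymorphic tree type from the slide.
data Tree a = Branch (Tree a) (Tree a) | Leaf a

-- Hypothetical helper: collect the leaf values from left to right.
leaves :: Tree a -> [a]
leaves (Leaf x)     = [x]
leaves (Branch l r) = leaves l ++ leaves r

-- Hypothetical helper: number of levels in the tree.
depth :: Tree a -> Int
depth (Leaf _)     = 1
depth (Branch l r) = 1 + max (depth l) (depth r)
```

Both helpers work for a Tree of any element type because the type variable a is never inspected.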
Type Synonyms
• General Definition Unknown (examples only)
• type String = [Char]
• type Person = (Name, Address)
• type Name = String
• data Address = None | Addr String
Pythagorean Triads
module PythagorianTriads where
triples :: Int -> [(Int, Int, Int)]
triples n = [(x, y, z) | x <- [1..n], y <- [1..n], z <- [1..n]]
pyth (x, y, z) = (x*x + y*y == z*z)
ptriads n = filter pyth (triples n)
ptriads 13 returns [(3,4,5), (4,3,5), (5,12,13), (6,8,10), (8,6,10), (12,5,13)]
More Efficient Version
module PythagorianTriads where
triples :: Int -> [(Int, Int, Int)]
triples n = [(x, y, z) | x <- [1..n], y <- [x..n], z <- [y..n]]
pyth (x, y, z) = (x*x + y*y == z*z)
ptriads n = filter pyth (triples n)
ptriads 13 returns [(3,4,5), (5,12,13), (6,8,10)]
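The saving from restricting the ranges can be measured by counting candidate triples. This is a minimal sketch; the names triplesAll and triplesOrdered are hypothetical and simply distinguish the two versions of triples.

```haskell
-- First version: every combination of x, y, z in 1..n (n^3 candidates).
triplesAll :: Int -> [(Int, Int, Int)]
triplesAll n = [(x, y, z) | x <- [1 .. n], y <- [1 .. n], z <- [1 .. n]]

-- Second version: only candidates with x <= y <= z.
triplesOrdered :: Int -> [(Int, Int, Int)]
triplesOrdered n = [(x, y, z) | x <- [1 .. n], y <- [x .. n], z <- [y .. n]]

-- Test for a Pythagorean triple.
pyth :: (Int, Int, Int) -> Bool
pyth (x, y, z) = x * x + y * y == z * z
```

For n = 13 the first version generates 2197 candidates and the second 455, and the second no longer reports mirrored duplicates such as (4,3,5).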
Overloading Operators
class Eq α where
(==) :: α -> α -> Bool
instance Eq Color where
(x == y) = ((colorToRGB x) == (colorToRGB y))
Unfortunately, this does not compile!
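Two likely reasons for the failure (an assumption, since the slides do not show the error message) are that Eq is already declared in the Prelude, so redeclaring it clashes, and that the left-hand side (x == y) is parenthesized. The following is a minimal sketch of a version that should compile; it repeats Color and colorToRGB so that it is self-contained.

```haskell
data Color = Red | Orange | Yellow | Green | Blue | Purple | White | Black

colorToRGB :: Color -> (Int, Int, Int)
colorToRGB Red    = (255, 0, 0)
colorToRGB Orange = (255, 128, 0)
colorToRGB Yellow = (255, 255, 0)
colorToRGB Green  = (0, 255, 0)
colorToRGB Blue   = (0, 0, 255)
colorToRGB Purple = (255, 0, 255)
colorToRGB White  = (255, 255, 255)
colorToRGB Black  = (0, 0, 0)

-- Eq is already declared in the Prelude, so only an instance is needed.
-- Two colors are equal when their RGB triples are equal.
instance Eq Color where
  x == y = colorToRGB x == colorToRGB y
```

With this instance, Red == Blue evaluates to False instead of being a type error. For a plain enumeration, appending deriving (Eq) to the data declaration is a shorter alternative.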
Unicode in Haskell
• Haskell 98 specification says that Haskell
supports Unicode
• http://blog.kfish.org/2007/10/survey-haskellunicode-support.html
• http://code.haskell.org/utf8-string/
Unicode table
Unicode Experiment
• Create a list of character codes for some Hebrew characters:
• hebrew = ['\n', '\x05d0', '\x05d1', '\x05d2', '\x05d3',
    '\x05d4', '\x05d5', '\x05d6', '\x05d7', '\x05d8',
    '\x05d9','\x5da','\x5db','\x5dc','\x5de','\x5df', '\x05e0',
    '\x05e1', '\x05e2', '\x05e3', '\x05e4', '\x05e5', '\x05e6',
    '\x05e7', '\x05e8', '\x05e9' , '\x05ea', '\x05eb', '\x05ec',
    '\x05ed', '\x05ee', '\x05ef' , '\n','\n']
• putStr hebrew
• Result on next slide
Result of “putStr hebrew”
Unicode Greek
The letters printed by my program are in the order
αβ Γ Π Σ σμτΦΘΩδ
And this does not agree with the order in the
above table. Therefore, my environment is not using
this table.
Encoding Problem
• Hexadecimal '\x05d0' = '\1488' decimal
• So, my coding is not the problem
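The code point arithmetic can be checked in Haskell itself, and the output handle can be switched to UTF-8 before printing. This is a minimal sketch, assuming a GHC whose System.IO exports hSetEncoding and utf8; the names alef, hebrewLetters, and printHebrew are hypothetical. Whether the glyphs display correctly still depends on the terminal and its fonts.

```haskell
import Data.Char (ord)
import Numeric (showHex)
import System.IO (hSetEncoding, stdout, utf8)

-- The first letter of the Hebrew block, written with a hexadecimal escape.
alef :: Char
alef = '\x05d0'

-- The assigned Hebrew letters run from U+05D0 to U+05EA.
hebrewLetters :: String
hebrewLetters = ['\x05d0' .. '\x05ea']

-- Ask the runtime to encode standard output as UTF-8, then print.
printHebrew :: IO ()
printHebrew = do
  hSetEncoding stdout utf8
  putStrLn hebrewLetters
```

Here ord alef gives the decimal code point and showHex (ord alef) "" gives it back in hexadecimal, which confirms that the two escapes denote the same character.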
Begin Appendix
• Details of available modules
• Comparison to other languages
• List of some Haskell functions
List of Packages
• http://hackage.haskell.org/packages/archive/pkg-list.html
Example: Algorithm package
• binary-search library: Binary and exponential searches
• Binpack library: Common bin-packing heuristics.
• DecisionTree library: A very simple implementation of decision
trees for discrete attributes.
• Diff library: O(ND) diff algorithm in haskell.
• dom-lt library: The Lengauer-Tarjan graph dominators algorithm.
• edit-distance library and programs: Levenshtein and restricted
Damerau-Levenshtein edit distances
• funsat library and program: A modern DPLL-style SAT solver
• garsia-wachs library: A Functional Implementation of the Garsia-Wachs Algorithm
• Graphalyze library: Graph-Theoretic Analysis library.
• GraphSCC library: Tarjan's algorithm for computing the strongly
connected components of a graph.
Default Packages – provided by the downloaded system (283 functions)
• ghc-prim
• integer - Arbitrary Precision Integer Arithmetic
• base – basic data types and functions
–31 data types
• rts
More Algorithms
• hgal library: library for computing the automorphism group and canonical labelling of a graph
• hmm library: Hidden Markov Model algorithms
• incremental-sat-solver library: Simple, Incremental SAT Solving as a Library
• infinite-search library: Exhaustively searchable infinite sets.
• iproute library: IP Routing Table
• kmeans library: K-means clustering algorithm
• ListTree library: Combinatoric search using ListT
• markov-chain library: Markov Chains for generating random sequences with a user definable behaviour.
• Munkres library: Munkres' assignment algorithm (hungarian method)
• natural-sort library: User-friendly text collation
• Numbers library: An assortment of number theoretic functions
• NumberSieves library: Number Theoretic Sieves: primes, factorization, and Euler's Totient
• palindromes library and program: Finding palindromes in strings
• pqueue-mtl library: Fully encapsulated monad transformers with queuelike functionality.
• presburger library: Cooper's decision procedure for Presburger arithmetic.
• primes library: Efficient, purely functional generation of prime numbers
• queuelike library: A library of queuelike data structures, both functional and stateful.
• rangemin library: Linear range-min algorithms.
• sat programs: CNF SATisfier
• sat-micro-hs program: A minimal SAT solver
• satchmo library: SAT encoding monad
• satchmo-examples programs: examples that show how to use satchmo
• satchmo-funsat library: funsat driver as backend for satchmo
• teams library: Graphical modeling tools for sequential teams
• TrieMap library: Automatic type inference of generalized tries.
• union-find library: Efficient union and equivalence testing of sets.
Modules in the Default Package
array
bytestring
Cabal
containers
directory
editline
filepath
haskell98
hpc
old-locale
old-time
packedstring
pretty
process
random
readline
syb
template-haskell
unix
Win32
Some Haskell Functions
(From the appendix of the book)
• (.) – Functional Composition
• (.) :: (β → γ) → (α → β) → (α → γ)
• (f . g) x = f (g x)
• (++) Concatenation of two lists
• (++) :: [α] → [α] → [α]
• [] ++ ys = ys
• (x:xs) ++ ys = x : (xs ++ ys)
More Functions
• (∧) Conjunction (written && in Haskell)
• (∧) :: Bool → Bool → Bool
• True ∧ x = x
• False ∧ x = False
• (∨) Disjunction (written || in Haskell)
• (∨) :: Bool → Bool → Bool
• True ∨ x = True
• False ∨ x = x
More Functions
• (!!) List indexing
• (!!) :: [a] → Int → a
• [] !! n = error "(!!): Index too large"
• (x:xs) !! 0 = x
• (x:xs) !! (n+1) = xs !! n
• and returns the conjunction of a list of booleans
• and :: [Bool] → Bool
• and = foldr (∧) True
More Functions
• concat concatenates a list of lists
• concat :: [[a]] → [a]
• concat = foldr (++) []
• const creates a constant valued function
• const :: a → b → a
• const x y = x
More Functions
• cross applies a pair of functions to the
corresponding elements of a pair
• cross :: (a → b, c → d) → (a, c) → (b, d)
• cross (f, g) = pair (f . fst, g . snd)
• curry converts a non-curried function into a
curried one
• curry :: ((a, b) → c) → (a → b → c)
• curry f x y = f (x, y)
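The definitions of cross and curry can be written in ordinary Haskell syntax. This is a minimal sketch; pair and cross are not in the Prelude, and the definition of pair used here is an assumption about what the book intends. The name curry' is hypothetical, chosen to avoid a clash with the Prelude's curry.

```haskell
-- Assumed definition of pair: apply two functions to the same argument.
pair :: (a -> b, a -> c) -> a -> (b, c)
pair (f, g) x = (f x, g x)

-- cross applies f to the first component and g to the second.
cross :: (a -> b, c -> d) -> (a, c) -> (b, d)
cross (f, g) = pair (f . fst, g . snd)

-- curry' turns a function on pairs into a function of two arguments.
curry' :: ((a, b) -> c) -> a -> b -> c
curry' f x y = f (x, y)
```

For example, cross ((+1), not) (1, True) applies (+1) to 1 and not to True.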
List Functions documented at:
• http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/ListDoc/
Comparison to other languages
• Haskell separates the definition of a type from the definition of the methods associated with that type. A class in C++ or Java usually defines both a data structure (the member variables) and the functions associated with the structure (the methods). In Haskell, these definitions are separated.
• The class methods defined by a Haskell class correspond to virtual functions in a C++ class. Each instance of a class provides its own definition for each method; class defaults correspond to default definitions for a virtual function in the base class.
• Haskell classes are roughly similar to a Java interface. Like an interface declaration, a Haskell class declaration defines a protocol for using an object rather than defining an object itself.
• Haskell does not support the C++ overloading style in which functions with different types share a common name.
• The type of a Haskell object cannot be implicitly coerced; there is no universal base class such as Object which values can be projected into or out of.
• C++ and Java attach identifying information (such as a VTable) to the runtime representation of an object. In Haskell, such information is attached logically instead of physically to values, through the type system.
• There is no access control (such as public or private class constituents) built into the Haskell class system. Instead, the module system must be used to hide or reveal components of a class.