Week 14: Notes

applicative functors
monads

You can read about these topics in Learn You a Haskell, chapters 11 and 12.

dynamic programming

In Haskell list elements (like all other values) are evaluated lazily. They may even refer to other elements of the same list:

> xs = [10, 20, xs !! 0 + xs !! 1]
> xs !! 0
10
> xs !! 2
30

Dependencies between list elements may be in any order:

> xs = [xs !! 2 + 2, 0, xs !! 1 + 10]
> xs !! 0
12

So we can build an entire list using a recursive calculation.

Building on this idea, we can implement dynamic programming in Haskell by defining a function and a list using mutual recursion. The list will hold a cache values computed by the function, so that no value will be computed twice.

For example, consider the following function that calculates the Nth Fibonacci number recursively. It is exponentially inefficient:

fib :: Int -> Int
fib 0 = 1
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

We can use cache each return value in a list:

fib :: Int -> Int
fib n = cache !! n where
  cache = map f [0 ..]
  f 0 = 1
  f 1 = 1
  f n = cache !! (n - 1) + cache !! (n - 2)

Now the function will run much more quickly. However, it will still take O(N²) time to compute the Nth Fibonacci number, since the lookup operator (!!) runs in O(N). To improve this running time we will need to use arrays, the subject of the next section.

Note that each time the function runs it will recompute the cache of Fibonacci values. We can make a slight change to the function to cause the cache to be shared between invocations:

fib :: Int -> Int
fib = (cache !!) where
  cache = map f [0 ..]
  f 0 = 1
  f 1 = 1
  f n = cache !! (n - 1) + cache !! (n - 2)

With this change the 'where' block is not nested inside an invocation of 'fib', so the values that the block computes will be persistent.

dynamic programming using an array

Let's revisit the function we saw above that computes the Nth Fibonacci number using dynamic programming:

fib :: Int -> Int
fib n = cache !! n where
  cache = map f [0 ..]
  f 0 = 1
  f 1 = 1
  f n = cache !! (n - 1) + cache !! (n - 2)

We can make the function more efficient using an array:

fib :: Int -> Int
fib n = cache ! n where
  cache = listArray (0, n) (map f [0 ..])
  f 0 = 1
  f 1 = 1
  f n = cache ! (n - 1) + cache ! (n - 2)

sets and dictionaries

How can we build a purely functional set or dictionary data structure? For a dictionary, first suppose that the keys form a fixed set, and are of a comparable type. Then we can build a balanced binary tree in which each node contains a (key, value) pair. Since the keys are fixed, the tree's shape will not change and it will always be balanced. For a set, if we want to store a dynamically changing subset of values from a known fixed set, we can similarly use a dictionary that maps values in that fixed set to a boolean.

For example, suppose that you are implementing an algorithm on a graph whose vertices are named by strings. Since the graph will not change as the algorithm runs, its vertex names are a fixed set and you could use this approach to store a visited set of vertices, or a map from vertices to integers.

In the general case, when keys may be added dynamically and are not all known at the beginning, you can implement a set or dictionary using a self-balancing binary tree structure such as a red-black or AVL tree.

Haskell's containers package includes modules Data.Set and Data.Map which provide set and map types using balanced binary trees. There are also modules Data.IntSet and Data.IntMap which are more efficient when keys are integers.

graph representations

Just as in other languages, we may represent a graph in Haskell in either adjacency list or adjacency matrix representation.

We can store an adjacency list as an association list mapping each vertex to a list of neighbors:

type Graph a = [(a, [a])]

If the graph vertices are indexed by integer ID, we can alternatively use an array. Then we can look up a vertex's neighbors in O(1):

type Graph = Array Int [Int]

If we want to use an adjacency matrix representation, probably an array will be the most natural choice:

type Graph = Array (Int, Int) Bool

In this class we will mostly use the association-list representation, since we will be using smaller graphs where efficiency doesn't matter too much. For example, here's an undirected graph and its representation:

my_graph :: Graph Char
my_graph = [ ('a', "ce"), ('b', "de"), ('c', "adf"), ('d', "bcef"),
          ('e', "bdf"), ('f', "cdeg"), ('g', "f") ]

Here's a helper function to find a vertex's neighbors:

neighbors :: Eq a => Graph a -> a -> [a]
neighbors graph v = fromJust (lookup v graph)

Here's a function that will find all paths between two vertices in a graph:

paths :: Eq a => Graph a -> a -> a -> [[a]]
paths g v w = f [v] v w
    where f vis v w
            | v == w = [[v]]
            | otherwise =
                [ v : path | n <- neighbors g v, notElem n vis,
                             path <- f (n : vis) n w ]

Let's run it on the graph above:

> paths my_graph 'a' 'g'
["acdbefg","acdefg","acdfg","acfg","aebdcfg","aebdfg","aedcfg","aedfg","aefg"]

This function uses a visited set to avoid walking in circles. However, notice that its visited set is local to each path that it explores. It only prevents a path from intersecting itself. This function may run in exponential time even when it finds only a single path between the start and end vertices, because it may explore an exponential number of paths before finding a path to the destination. For example, consider a graph with a start vertex that points to two subgraphs: the first is a complete subgraph with N vertices, and the second is a subgraph containing only the destination vertex. The function may explore all N! paths in the complete subgraph before discovering the path to the destination. (In fact this is worse than exponential, since N! grows faster than k^N for any k.)

If we want an efficient depth-first or breadth-first search that runs in (almost) O(V + E), we will need a global visited set that is shared among all branches of the exploration. That's the topic of our next section.

depth-first and breadth-first search

Just as in a procedural language, we may implement a depth-first search either (a) by using recursion to explore the graph, or (b) using a stack data structure.

In either case we will need a visited set. In the functions below, we'll use a list to store the visited set. Of course, if we want our code to run efficiently on larger graphs then a structure such as a Data.Set (a balanced binary tree) would be a better choice for the visited set.

To explore a graph recursively in Haskell, we can write a helper function that takes a visited set and a vertex v to explore. The function will return a new visited set, expanded with all the vertices discovered while exploring v and its neighbors.

The function will work as follows: if v has already been visited, it will return the same visited set, unchanged. Otherwise, it will add v to the visited set, and then fold the same function (itself!) over all of v's neighbors. That will explore each neighbor in turn, and accumulate the expansions of the visited set that occur as the neighbors are explored. Here is the code:

-- Perform a depth-first search of a graph, starting with a given vertex.
-- Return a list of all vertices that were visited, in the order in
-- which they were visited.

dfs :: Eq a => Graph a -> a -> [a]
dfs g v = reverse (f [] v)
    where f visited v
            | elem v visited = visited
            | otherwise = foldl f (v : visited) (neighbors g v)

Let's run it on the graph above:

> dfs my_graph 'a'
"acdbefg"

Alternatively, we may use a stack, represented using a Haskell list. Our recursive helper function will pop a vertex from the stack. If it's already in the visited set, it will ignore it and recurse. Otherwise, it will add the vertex to the visited set, and also push its neighbors onto the stack. Here's an implementation:

dfs :: Eq a => Graph a -> a -> [a]
dfs g v = f [] [v]
    where f visited [] = []
          f visited (v : stack)
            | elem v visited = f visited stack
            | otherwise = v : f (v : visited) (neighbors g v ++ stack)

It will return vertices in the same order as the previous function that used a fold:

> dfs my_graph 'a'
"acdbefg"

To perform a breadth-first search instead, we only need to make a tiny change. Let's rename the variable 'stack' to be called 'queue', and append neighbors to the list instead of prepending:

bfs :: Eq a => Graph a -> a -> [a]
bfs g v = f [] [v]
    where f visited [] = []
          f visited (v : queue)
            | elem v visited = f visited queue
            | otherwise = v : f (v : visited) (queue ++ neighbors g v)

Of course, on larger graphs it would be better to use a more efficient queue implementation, such as the functional double-stack queue that we saw a few weeks ago. Furthermore, it would also be better to use a more efficient implementation of the visited set.

As an exercise, you may wish to modify this function so that it returns a list of vertices on the shortest path from the start vertex to a given end vertex. Then each item on the queue will not just be a vertex v; instead it will be a path from the start vertex to v, in reverse order.

We may use a similar function to find shortest paths in any state space, as we did when we solved state space problems in Python or C#. To accomplish that, we can write a function 'neighbors' that returns the neighboring states of any given state.

parser combinators

Parsing is an important and useful task in writing many programs. Given a grammar, there are many ways that we can write a parser for it. For example, we can write a recursive-descent parser by hand, or can use a parser generator (such as the classic UNIX tool yacc) that reads a grammar and outputs code that will parse it.

One interesting way to write a parser is using parser combinators. With this approach, a parser is an object that can parse a particular grammatical item. A combinator is a function that takes one or more parsers as arguments, and returns a parser that combines or transforms the given parser(s) in some way. Using combinators, we can write a parser for a complex grammar by combining smaller parsers in a systematic way.

Haskell includes a parser combinator library called Parsec. In Parsec, the Parser type is a monad, so you can combine parsers using do blocks and operators such as <$>. (Recall that every monad is a functor, so functor operators such as <$> work on monads.) Parsec first appeared around 2000, and has been quite popular and influential - in fact, it has been ported to various other programming languages.

It takes some time to learn Parsec. However it is a powerful tool, and in fact parser combinators are generally my preferred approach for writing any kind of parser.

The official Parsec documentation can be a bit difficult to approach at first. Our quick reference for the Haskell library includes some useful Parsec functions, and may be an easier starting point.

In Parsec, the type Parser a is a parser that can parse and return a value of type a. Note that any parser may succeed or fail. If it sees the text it expects, it will succeed and return a value. If it does not, it will fail and won't return anything.

To get started with Parsec, let's write a parser that can parse a decimal integer. The built-in parser digit has type Parser Char, and will parse and return an ASCII digit. The built-in combinator many1 :: Parser a -> Parser [a] takes a parser p that produces a value of type a. (many1 p) produces a parser that will run p for as many times as it succeeds, and will return a list of all the returned values. p must succeed at least once; otherwise many1 will fail. So we might write

int :: Parser String
int = many1 digit

We can use the top-level function parse to run a parser. It has this type:

parse :: Parser a -> String -> String -> Either ParseError a

The first string argument to parse is a filename to use in error reporting; we can just use the empty string "". The second string argument is the text to parse. parse returns an Either, which is either a Left that holds an error, or a Right that holds the value of a successful parse. Let's run our parser int:

> parse int "" "33259"
Right "33259"

OK, that wasn't too exciting - the parser just produced the same string it was given! Let's modify our parser so that it will return an Int. We might write

int :: Parser Int
int = do
    s <- many1 digit
    return (read s)

That will work:

> parse int "" "33259"
Right 33259

Or instead of the do block above we may use <$>, which is equivalent:

int :: Parser Int
int = read <$> many1 digit

Now let's write a parser that can parse arithmetic expressions such as "2 * (3 + 4 * 5)". Our expressions will include integer constants plus the operators +, -, *, and /. We'd like * and / to have higher precedence than + and -. All operators will be left-associative. We'll use this datatype for representing expressions:

data Expr = Const Int | OpExpr Char Expr Expr
  deriving (Show)

As one possible approach, we could write a context-free grammar that expresses our desired operator precedence, with separate non-terminal symbols for each precedence level. Then we could translate that grammar into Parsec code. However, there is an easier way. Parsec includes a combinator called buildExpressionParser that takes a table of operators with precedences plus a parser for individual terms, and produces a parser that can parse entire expressions. So let's use that.

Parsec includes this type for operator associativities:

data Assoc = AssocNone | AssocLeft | AssocRight

Each element of our operator table will have this type:

data Operator a =
      Infix (Parser (a -> a -> a)) Assoc
    | Prefix (Parser (a -> a))
    | Postfix (Parser (a -> a))

(Note: This is a simplication; the actual Operator type constructor takes more arguments, which you can see in the official Parsec documentation. In our code here we will not give the operator table a type, which will let us avoid dealing with these extra arguments.)

Consider the operator *. In the operator table, we need to provide a parser that parses the character '*', and returns a function that can combine two expressions. We could write it using do:

parse_mul :: Parser (Expr -> Expr -> Expr)
parse_mul = do
    char '*'
    return (OpExpr '*')

However it will be easier to use the built-in operator $>, which is equivalent:

parse_mul :: Parser (Expr -> Expr -> Expr)
parse_mul = char '*' $> OpExpr '*'

Note that you'll need to import Data.Functor to get $>. This operator will work on any functor type (including any monad, of course). You can think of it as replacing the return value with something else:

> Just 4 $> 10
Just 10
> [2, 3, 4] $> 10
[10,10,10]

Instead of writing a version of parse_mul for each operator, we'll write a function infix_op that we can reuse. Here is our operator table:

infix_op c = Infix (sym c $> OpExpr c) AssocLeft

table = [
    [ infix_op '*', infix_op '/' ],
    [ infix_op '+', infix_op '-' ]
  ]

The table is a list of lists. Each sublist contains operators with the same precedence, and the sublists are in order of decreasing precedence.

Finally, let's present the complete parser. Here is the code:

num :: Parser Expr
num = Const . read <$> many1 digit

infix_op c = Infix (char c $> OpExpr c) AssocLeft

table = [
    [ infix_op '*', infix_op '/' ],
    [ infix_op '+', infix_op '-' ]
  ]

term :: Parser Expr
term = num <|> char '(' *> expr <* char ')'

expr :: Parser Expr
expr = buildExpressionParser table term

Above, the parser term uses the <|> operator, which represents a choice. Each term is either a constant integer (represented by num), or an expression surrounded by parentheses. Notice the operators *> and <*, which we've used to combine the parser expr with the parsers for the left and right parentheses. You can think of *> and <* as pointing to the parser whose return value we want; the other return values will be ignored. Note that

char '(' *> expr <* char ')'

is equivalent to

do
    char '('
    e <- expr
    char ')'
    return e

In fact *> and <* will work with any Applicative (including monads, of course):

> Just 3 *> Just 4
Just 4
> Nothing *> Just 4
Nothing

Our parser works:

> parse expr "" "(2+3)*(3-2)"
Right (OpExpr '*' (OpExpr '+' (Const 2) (Const 3)) (OpExpr '-' (Const 3) (Const 2)))

However notice that it will not accept whitespace:

> parse expr "" "(2 + 2)"
Left (line 1, column 3):
unexpected " "
expecting digit, operator or ")"

So let's extend the parser to ignore whitespace. In theory we could just first remove all whitespace from the input string before parsing, but for many grammars that won't work; for example, if we are parsing Haskell code then "f x" is certainly not the same as "fx". So let's use a more sophisticated approach. We'll modify our low-level character parsers so that they will ignore any leading spaces.

The built-in parser spaces will skip 0 or more whitespace characters, and returns (). So we might write

sym :: Char -> Parser Char
sym c = spaces *> char c

and then call e.g. sym '*' when we want to parse a '*' character.

Unfortunately that won't work. The problem is that this parser may consume some whitespace even when it fails. For example, suppose that the expression parser calls (sym '*') to try to parse a '*', but actually the next characters in the input stream are a '' and then a '+'. Inside sym, the call to spaces will consume the '', and then the call to char '*' will fail. So sym will also fail, in a state in which the space character ' ' has been consumed.

An important principle in Parsec is that the choice p <|> q will try q only if p failed without consuming any input. This may seem surprising as first, but actually it makes Parsec efficient because it means that by default Parsec will never backtrack after successfully matching any input character.

In fact, when parsing many programming languages we will never need to backtrack after successfully matching an input token (which may actually consist of several characters). Specifically, these are languages whose context-free grammars belong to the LL(1) or LR(1) classes, though this is a subject for a compilers class.

Furthermore, unrestricted backtracking may often lead to confusing error messages when parsing fails due to a syntax error in the input. So Parsec's approach generally improves the error messages that it produces.

Now, in some cases we may actually want Parsec to backtrack after a successful match. For example, if sym '*' matches some whitespace but then fails, we certainly want the expression parser to try other possible operators. To ask for this behavior, we can use the built-in combinator try. try p will succeed if p succeeds. If p fails, then try p will fail without consuming any input. In other words, it will undo any input consumption that occurred while processing p.

Using try in this way, here's an updated parser that will ignore whitespace in its input:

sym :: Char -> Parser Char
sym c = try (spaces *> char c)

num :: Parser Expr
num = Const . read <$> try (spaces *> many1 digit)

infix_op c = Infix (char c $> OpExpr c) AssocLeft

table = [
    [ infix_op '*', infix_op '/' ],
    [ infix_op '+', infix_op '-' ]
  ]

term :: Parser Expr
term = num <|> sym '(' *> expr <* sym ')'

expr :: Parser Expr
expr = buildExpressionParser table term

Let's run it:

> parse expr "" "(2 + 3) * (3 - 2)"
Right (OpExpr '*' (OpExpr '+' (Const 2) (Const 3)) (OpExpr '-' (Const 3) (Const 2)))