Week 13: Notes

monads

You can read about this topic in Learn You a Haskell, chapter 12 "A Fistful of Monads".

graph representations

Just as in other languages, we may represent a graph in Haskell in either adjacency list or adjacency matrix representation.

We can store an adjacency list as an association list mapping each vertex to a list of neighbors:

type Graph a = [(a, [a])]

If the graph vertices are indexed by integer ID, we can alternatively use an array. Then we can look up a vertex's neighbors in O(1):

type Graph = Array Int [Int]
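
For example, here is a sketch of a small made-up graph in this representation, a 4-cycle on vertices 0 through 3 (listArray and ! come from Data.Array):

import Data.Array

square :: Graph
square = listArray (0, 3) [[1, 2], [0, 3], [0, 3], [1, 2]]

> square ! 2
[0,3]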

If we want to use an adjacency matrix representation, an array is probably the most natural choice:

type Graph = Array (Int, Int) Bool
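
Here is a sketch of a small made-up example, a 4-cycle on vertices 0 through 3 stored as a matrix (square_m is an invented name; array comes from Data.Array):

import Data.Array

square_m :: Graph
square_m = array ((0, 0), (3, 3))
             [((i, j), (i, j) `elem` edges || (j, i) `elem` edges)
               | i <- [0 .. 3], j <- [0 .. 3]]
    where edges = [(0, 1), (1, 3), (3, 2), (2, 0)]

> square_m ! (0, 1)
True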

In this class we will mostly use the association-list representation, since we will be using smaller graphs where efficiency doesn't matter too much. For example, here's an undirected graph and its representation:

my_graph :: Graph Char
my_graph = [ ('a', "ce"), ('b', "de"), ('c', "adf"), ('d', "bcef"),
          ('e', "bdf"), ('f', "cdeg"), ('g', "f") ]

Here's a helper function to find a vertex's neighbors:

-- fromJust comes from Data.Maybe; we assume the vertex is present
-- in the graph.
neighbors :: Eq a => Graph a -> a -> [a]
neighbors graph v = fromJust (lookup v graph)
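
For example:

> neighbors my_graph 'd'
"bcef"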

Here's a function that will find all paths between two vertices in a graph:

paths :: Eq a => Graph a -> a -> a -> [[a]]
paths g v w = f [v] v w
    where f vis v w
            | v == w = [[v]]
            | otherwise =
                [ v : path | n <- neighbors g v, notElem n vis,
                             path <- f (n : vis) n w ]

Let's run it on the graph above:

> paths my_graph 'a' 'g'
["acdbefg","acdefg","acdfg","acfg","aebdcfg","aebdfg","aedcfg","aedfg","aefg"]

This function uses a visited set to avoid walking in circles. However, notice that its visited set is local to each path that it explores: it only prevents a path from intersecting itself. This function may run in exponential time even when it finds only a single path between the start and end vertices, because it may explore an exponential number of paths before finding a path to the destination. For example, consider a graph with a start vertex that points to two subgraphs: the first is a complete subgraph with N vertices, and the second is a subgraph containing only the destination vertex. The function may explore all N! paths in the complete subgraph before discovering the path to the destination. (In fact this is worse than exponential, since N! grows faster than k^N for any constant k.)

If we want an efficient depth-first or breadth-first search that runs in (almost) O(V + E), we will need a global visited set that is shared among all branches of the exploration. That's the topic of our next section.

depth-first and breadth-first search

Just as in a procedural language, we may implement a depth-first search either (a) by using recursion to explore the graph, or (b) by using an explicit stack data structure.

In either case we will need a visited set. In the functions below, we'll use a list to store the visited set. Of course, if we want our code to run efficiently on larger graphs, then a structure such as Data.Set (a balanced binary tree) would be a better choice for the visited set; we'll sketch that variant below.

To explore a graph recursively in Haskell, we can write a helper function that takes a visited set and a vertex v to explore. The function will return a new visited set, expanded with all the vertices discovered while exploring v and its neighbors.

The function will work as follows: if v has already been visited, it will return the same visited set, unchanged. Otherwise, it will add v to the visited set, and then fold the same function (itself!) over all of v's neighbors. That will explore each neighbor in turn, and accumulate the expansions of the visited set that occur as the neighbors are explored. Here is the code:

-- Perform a depth-first search of a graph, starting with a given vertex.
-- Return a list of all vertices that were visited, in the order in
-- which they were visited.

dfs :: Eq a => Graph a -> a -> [a]
dfs g v = reverse (f [] v)
    where f visited v
            | elem v visited = visited
            | otherwise = foldl f (v : visited) (neighbors g v)

Let's run it on the graph above:

> dfs my_graph 'a'
"acdbefg"

Alternatively, we may use a stack, represented using a Haskell list. Our recursive helper function will pop a vertex from the stack. If it's already in the visited set, it will ignore it and recurse. Otherwise, it will add the vertex to the visited set, and also push its neighbors onto the stack. Here's an implementation:

dfs :: Eq a => Graph a -> a -> [a]
dfs g v = f [] [v]
    where f visited [] = []
          f visited (v : stack)
            | elem v visited = f visited stack
            | otherwise = v : f (v : visited) (neighbors g v ++ stack)

It will return vertices in the same order as the previous function that used a fold:

> dfs my_graph 'a'
"acdbefg"

To perform a breadth-first search instead, we only need to make a tiny change: rename the variable 'stack' to 'queue', and append neighbors to the list instead of prepending them:

bfs :: Eq a => Graph a -> a -> [a]
bfs g v = f [] [v]
    where f visited [] = []
          f visited (v : queue)
            | elem v visited = f visited queue
            | otherwise = v : f (v : visited) (queue ++ neighbors g v)

Of course, on larger graphs it would be better to use a more efficient queue implementation, such as the functional double-stack queue that we saw a few weeks ago.

As an exercise, you may wish to modify this function so that it returns a list of vertices on the shortest path from the start vertex to a given end vertex. Then each item on the queue will not just be a vertex v; instead it will be a path from the start vertex to v, in reverse order.
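
Here is one possible solution sketch. The name bfs_path is ours, not part of the notes above; it returns Nothing if no path exists:

bfs_path :: Eq a => Graph a -> a -> a -> Maybe [a]
bfs_path g start end = f [] [[start]]
    where f _ [] = Nothing
          f visited (path@(v : _) : queue)
            | v == end = Just (reverse path)
            | elem v visited = f visited queue
            | otherwise =
                f (v : visited) (queue ++ [n : path | n <- neighbors g v])

> bfs_path my_graph 'a' 'g'
Just "acfg"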

We may use a similar function to find shortest paths in any state space, as we did when we solved state space problems in Python or C#. To accomplish that, we can write a neighbors function that returns the neighboring states of any given state.
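
For instance, here is a hypothetical state space in which a state is an integer, and from each state we may either add 1 or double the number:

-- Neighboring states of n in a made-up puzzle.
state_neighbors :: Int -> [Int]
state_neighbors n = [n + 1, 2 * n]

A breadth-first search that takes such a function as a parameter (with type (a -> [a]) -> a -> a -> Maybe [a], say) could then find the shortest sequence of moves between two states.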

leftist heaps

How can we implement a priority queue in Haskell? In Introduction to Algorithms we used a binary heap to implement a priority queue. However, a binary heap would be a poor choice in Haskell, since it depends on a mutable array. In theory we could simulate the array using a random-access list as we discussed in last week's lecture, but that would add an extra factor of O(log N) to the operations' running times.

There are better alternatives. One possibility would be to use a self-balancing binary search tree, such as an AVL tree. That would allow us to insert a value in O(log N), and also to remove the smallest value in O(log N).

However there is another, simpler alternative: a data structure known as a leftist heap. Here is an example, which we will refer to below:
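
            1
          /   \
         2     6
        / \   /
       3   5 7
      /
     4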

A leftist heap is a binary tree that satisfies two properties. The first is the heap property, which is the same as in a binary heap: every node's value is less than or equal to the values of its children (assuming that we want a min-heap).

For the second property, we must first define a node's rank, which is the length of the shortest path from the node to any empty child. The rank of the empty heap (with no nodes at all) is 0. If a node has any empty children, its rank is 1. The root node above has rank 2, since from it we can reach an empty child by descending to the right twice.

The second property of a leftist heap is that for any node, the rank of its left child is greater than or equal to the rank of its right child. (Note that either child might be empty.) For example, in the heap above, the root node's left child has rank 2, and its right child has rank 1.

Because the right child's rank is never larger, the shortest path from any node to an empty child always descends to the right. So we may equivalently define a heap's rank as 0 if the heap is empty, and otherwise as one plus the rank of its right child.

A leftist heap may be very unbalanced; however, any unbalanced part of the heap is strongly weighted to the left. To be more precise, if a non-empty heap has n nodes and rank r, then the following is always true:

r ≤ log2(n) + 1

To see this, let's prove the equivalent statement n ≥ 2^(r - 1) by induction. If r = 1, then the tree is non-empty and has at least one node, so the statement is trivially true. Otherwise, let r > 1, and assume that the induction hypothesis holds for all smaller ranks. Both children of the root have rank at least r - 1, so by the induction hypothesis each of the two subtrees has at least 2^(r - 2) nodes. Then the entire heap must have at least 2 · 2^(r - 2) + 1 > 2^(r - 1) nodes.

Leftist heaps have an interesting property: we can merge two heaps in O(log N), where N is the total number of nodes in the two heaps. This allows us to perform other operations efficiently. To remove the smallest element in the heap, remove the root node (which must contain the smallest element), then merge its two children. To insert a value into a heap, construct a single-element heap containing the value, then merge it with the existing heap.

Now let's see how to merge two non-empty heaps h1 and h2. Assume without loss of generality that h1 has the smaller root value. We will first recursively merge h2 with h1's right child. After that, we can swap h1's children if necessary to restore the rank property, and we are done.

The merge will run in O(R) in the worst case, where R is the sum of the ranks of heaps h1 and h2. We can show this by induction on R. If either heap is empty, there is nothing to do and the statement is trivial. Otherwise, the recursive merge of h2 with h1's right child has combined rank at most R - 1 (since the right child's rank is one less than h1's), so by the inductive hypothesis it takes O(R - 1) steps. After that, we need only do a constant amount of work to restore the rank property (assuming that we store each node's rank in the node itself so that we can retrieve it in O(1)). And since R is logarithmic in the total number of heap nodes, the merge will run in O(log N).

Here is an implementation in Haskell:

data LHeap a = Nil | Node (LHeap a) a (LHeap a)

-- The rank of a heap: the length of the rightmost path down to Nil.
rank :: LHeap a -> Int
rank Nil = 0
rank (Node _ _ right) = 1 + rank right

-- Merge two leftist heaps in O(log N).
merge :: Ord a => LHeap a -> LHeap a -> LHeap a
merge Nil h = h
merge h Nil = h
merge h@(Node left x right) h'@(Node _ x' _)
    | x' < x = merge h' h   -- swap so that the smaller root comes first
    | otherwise =
        -- Merge h' with h's right child, then put whichever child
        -- has higher rank on the left.
        let h1 = merge right h'
        in if rank left >= rank h1 then Node left x h1 else Node h1 x left

insert :: Ord a => a -> LHeap a -> LHeap a
insert x h = merge (Node Nil x Nil) h

-- Remove the smallest value. Precondition: the heap is non-empty.
remove_smallest :: Ord a => LHeap a -> (a, LHeap a)
remove_smallest (Node left x right) = (x, merge left right)
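
As a quick sanity check, we can use the heap to sort a list. These helper functions are just a sketch, not part of the implementation above:

to_list :: Ord a => LHeap a -> [a]
to_list Nil = []
to_list h = let (x, h') = remove_smallest h
            in x : to_list h'

hsort :: Ord a => [a] -> [a]
hsort xs = to_list (foldr insert Nil xs)

> hsort [5, 2, 8, 1, 9]
[1,2,5,8,9]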

In this implementation we compute ranks on the fly rather than storing them in each node. As suggested above, we could store each node's rank in the node itself; then we'd also need to update a node's rank after a merge into its right child. That might result in a slight efficiency gain, at the cost of using more memory.