Week 13: Notes

This week we discussed:

    amortized running times
    input and output
    monads
    the Reader monad

You can read about the last three topics above in the Learn You a Haskell book in chapters 9 ("Input and Output"), 12 ("A Fistful of Monads") and 13 ("For a Few Monads More").

However, you should know that the Reader monad described in the book is not the same as the Reader monad from the lecture. The book describes the function type (->) r, which is itself a monad in Haskell. In the lecture, we discussed the Reader type from the Control.Monad.Reader module (in the mtl package), which is a different type that is also a monad. The two can be used for similar purposes, but they are not the same thing.
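
To make the difference concrete, here is a small example (it assumes the mtl package, which provides Control.Monad.Reader) that writes the same computation against both types:

    import Control.Monad.Reader

    -- The function type (->) String is itself a monad: every action is a
    -- function that receives the same environment.
    greetF :: String -> String
    greetF = do
      name <- id                 -- 'id' reads the environment
      pure ("hello, " ++ name)

    -- The Reader type from Control.Monad.Reader packages the same idea,
    -- with 'ask' to read the environment and 'runReader' to run the
    -- computation.
    greetR :: Reader String String
    greetR = do
      name <- ask
      pure ("hello, " ++ name)

    main :: IO ()
    main = do
      putStrLn (greetF "world")             -- apply the function directly
      putStrLn (runReader greetR "world")   -- prints "hello, world"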

Here are some more notes:

amortized running times

When we implement a data structure in a purely functional language, we often want to show that each operation has a certain amortized running time. More specifically, we want to prove that the total cost of any series of operations is no greater than the total of the amortized costs of those operations.

Suppose that a data structure is initially empty. We will call its initial state x_0. And suppose that we perform a series of n operations on the structure, numbered 1 through n. The state of the structure after the i-th operation will be denoted x_i.

Each operation i has an actual cost C_i. Typically it's difficult to compute a precise actual cost in machine instructions or microseconds. So we may adopt a simplified cost model. For example, in Haskell code we might consider an operation's actual cost to be the number of cons operations that it performs, i.e. the number of times that it prepends a new element to an existing list. (Each of these operations is a small memory allocation.)
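
For instance, under this model the usual accumulator-based list reversal has an actual cost of exactly k cons operations on a k-element list, a fact the examples below rely on. A minimal illustration:

    -- Reversing a k-element list performs exactly k cons operations:
    -- the accumulator loop prepends each element once.
    reverse' :: [a] -> [a]
    reverse' = go []
      where
        go acc []       = acc
        go acc (x : xs) = go (x : acc) xs   -- one cons per element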

We will assign an amortized cost to each kind of operation. If operation i has amortized cost A_i, then we want to prove that the total actual cost of a series of n operations will always be no greater than its total amortized cost, i.e. that

C_1 + ... + C_n <= A_1 + ... + A_n

A useful technique for proving amortized costs is to invent a size (or potential) function S for the data structure. The size of an empty structure should be 0, and the size of any structure should always be non-negative. If we perform a series of n operations on an initially empty structure, then let S_i = S(x_i), i.e. the size of the structure after the i-th operation.

Then we would like to show that for every operation i,

C_i + (S_i - S_{i-1}) <= A_i

In other words, the actual cost of the operation plus the size increase that the operation induces is less than or equal to the operation's amortized cost.

If we can show that the inequality above holds for every operation, then summing it over i = 1 through n gives

C_1 + (S_1 - S_0) + C_2 + (S_2 - S_1) + ... + C_n + (S_n - S_{n-1}) <= A_1 + ... + A_n

or, since the size terms telescope,

C_1 + ... + C_n + (S_n - S_0) <= A_1 + ... + A_n

But S_0 = 0 and S_n >= 0, so we have

C_1 + ... + C_n <= A_1 + ... + A_n

This is what we wanted to show above, i.e. that the total actual cost is no greater than the total of the amortized costs of the operations.

example: functional queue

For example, consider the functional queue data structure that we described earlier in this course (and that you implemented in one homework assignment). We store the queue as two lists called 'front' and 'back'. The enqueue operation prepends a value to the 'back' list. The dequeue operation removes a value from the 'front' list if it's non-empty. Otherwise, it reverses the 'back' list and moves the result to the front, then removes a value from the 'front' list. Assume that the actual cost of each of these operations is the number of cons operations that it performs. We'd like to show that both operations run in constant time on average, i.e. that we can assign each of them a constant amortized cost.
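
Here is a minimal sketch of that structure (the names are mine; your homework version may have looked different):

    data Queue a = Queue [a] [a]   -- front list, back list

    emptyQueue :: Queue a
    emptyQueue = Queue [] []

    -- enqueue performs exactly one cons, onto the 'back' list.
    enqueue :: a -> Queue a -> Queue a
    enqueue x (Queue f b) = Queue f (x : b)

    -- dequeue performs no cons operations when 'front' is non-empty;
    -- otherwise it reverses 'back', at a cost of k conses for k elements.
    dequeue :: Queue a -> Maybe (a, Queue a)
    dequeue (Queue (x : f) b) = Just (x, Queue f b)
    dequeue (Queue [] b) =
      case reverse b of
        []      -> Nothing               -- the queue is empty
        (x : f) -> Just (x, Queue f [])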

Let the size S of the queue be the number of elements in the 'back' list. Then

    enqueue prepends one element to 'back', so its actual cost is 1, and it increases the size by 1, i.e. S_i - S_{i-1} = 1. Let its amortized cost be A_i = 2. Then C_i + (S_i - S_{i-1}) = 1 + 1 <= 2.

    dequeue performs no cons operations when 'front' is non-empty, and the size does not change, so C_i + (S_i - S_{i-1}) = 0. When 'front' is empty and 'back' holds k elements, reversing 'back' costs k cons operations and the size drops from k to 0, so C_i + (S_i - S_{i-1}) = k - k = 0. In both cases, let dequeue's amortized cost be A_i = 0.

For both operations, we've shown that

C_i + (S_i - S_{i-1}) <= A_i

which proves that the total cost of a series of operations will be no greater than those operations' total amortized cost. The specific amortized costs we chose (2 and 0) are not so important, since really we just want to know that each of the operations runs in constant time on average.
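
As a quick sanity check (not a proof, and my own addition rather than something from the lecture), we can simulate a sequence of operations, count the cons operations directly, and compare the total against the amortized budget of 2 per enqueue and 0 per dequeue:

    import Data.List (foldl')

    data Op = Enq | Deq

    -- One step of the two-list queue, returning the new state and the
    -- operation's actual cost under the cons-counting model.
    step :: ([Int], [Int]) -> Op -> (([Int], [Int]), Int)
    step (f, b)     Enq = ((f, 0 : b), 1)
    step (_ : f, b) Deq = ((f, b), 0)
    step ([], b)    Deq = case reverse b of
      []      -> (([], []), 0)
      (_ : f) -> ((f, []), length b)   -- reversing b costs (length b) conses

    -- Total actual cost of a sequence of operations on an empty queue.
    totalCost :: [Op] -> Int
    totalCost = snd . foldl' go (([], []), 0)
      where go (q, c) op = let (q', c') = step q op in (q', c + c')

    main :: IO ()
    main = do
      let ops       = concat (replicate 100 [Enq, Enq, Deq])
          amortized = 2 * length [() | Enq <- ops]
      print (totalCost ops, amortized)   -- the first number never exceeds the second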

example: symmetric list

Now consider a symmetric list, otherwise known as a double-ended queue. This data structure has operations enqueueFront, dequeueFront, enqueueBack and dequeueBack.

We can implement a symmetric list as a pair of lists 'front' and 'back', just as in the previous example. The enqueueFront operation prepends an element to the 'front' list. dequeueFront removes the first element of the 'front' list if it's non-empty. Otherwise, it splits the 'back' list into two equal-length halves A and B, reverses list B and moves it to the front, and keeps list A as the new 'back' list. The enqueueBack and dequeueBack operations work symmetrically. Writing this yourself is a useful exercise; a rough sketch of one possible implementation follows.
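
Here is one such sketch (the names, and the reading of which half stays in 'back', are my own):

    data Deque a = Deque [a] [a]   -- front list, back list (back is stored reversed)

    enqueueFront :: a -> Deque a -> Deque a
    enqueueFront x (Deque f b) = Deque (x : f) b

    enqueueBack :: a -> Deque a -> Deque a
    enqueueBack x (Deque f b) = Deque f (x : b)

    -- dequeueFront is free when 'front' is non-empty; otherwise it splits
    -- 'back' into halves A and B, reverses B into the new 'front', and
    -- keeps A as the new 'back'.
    dequeueFront :: Deque a -> Maybe (a, Deque a)
    dequeueFront (Deque (x : f) b) = Just (x, Deque f b)
    dequeueFront (Deque [] b) =
      case reverse halfB of
        []      -> Nothing                  -- the deque is empty
        (x : f) -> Just (x, Deque f halfA)
      where
        (halfA, halfB) = splitAt (length b `div` 2) b

    -- dequeueBack mirrors dequeueFront with the two lists swapped.
    dequeueBack :: Deque a -> Maybe (a, Deque a)
    dequeueBack (Deque f (x : b)) = Just (x, Deque f b)
    dequeueBack (Deque f []) =
      case reverse halfB of
        []      -> Nothing
        (x : b) -> Just (x, Deque halfA b)
      where
        (halfA, halfB) = splitAt (length f `div` 2) f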

Let's show that all of the operations on a symmetric list will run in amortized constant time. Let the size S of this structure be the absolute value of the difference in lengths between the 'front' and 'back' lists:

S = abs(length(front) - length(back))

Now consider the operations:

    enqueueFront performs one cons, so its actual cost is 1. The size S changes by at most 1 (it grows by 1 if 'front' was already at least as long as 'back', and shrinks by 1 otherwise), so C_i + (S_i - S_{i-1}) <= 2. Let the amortized cost A of this operation be 2.

    dequeueFront performs no cons operations when 'front' is non-empty, and again S changes by at most 1, so C_i + (S_i - S_{i-1}) <= 1. When 'front' is empty and 'back' holds N elements, we have S_{i-1} = N. Splitting 'back' into halves costs about N/2 cons operations (to build the half that is kept), and reversing the other half costs about N/2 more, so the actual cost is about N. Afterward 'front' holds N/2 - 1 elements (after the head is removed) and 'back' holds N/2, so S_i = 1 and C_i + (S_i - S_{i-1}) = N + (1 - N) = 1. Let the amortized cost A of this operation be 1.

    (Strictly speaking, we assumed here that N was even, so that after the split the 'front' and 'back' lists had the same number of elements. In this kind of analysis, a difference of a single element almost never affects the asymptotic outcome, so we will simply wave our hands and assert that if N is odd, the outcome is similar.)

The other operations (enqueueBack and dequeueBack) are symmetric.

In conclusion, we've shown that all operations on this data structure run in constant time on average.