This week we discussed sequence and mapM.
You can read about these topics in the Learn You a Haskell book in chapters 9 ("Input and Output"), 12 ("A Fistful of Monads"), and 13 ("For a Few Monads More").
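As a quick refresher, here is a small sketch of sequence and mapM specialized to the Maybe monad (the helper names sequenceDemo, failingDemo, and mapMDemo are my own):

```haskell
-- sequence turns a list of monadic actions into a single action that
-- produces a list; mapM f is equivalent to sequence . map f.
sequenceDemo :: Maybe [Int]
sequenceDemo = sequence [Just 1, Just 2, Just 3]   -- Just [1,2,3]

failingDemo :: Maybe [Int]
failingDemo = sequence [Just 1, Nothing, Just 3]   -- Nothing: one action failed

mapMDemo :: Maybe [Int]
mapMDemo = mapM (\x -> if x > 0 then Just x else Nothing) [1, 2, 3]

main :: IO ()
main = do
  print sequenceDemo   -- Just [1,2,3]
  print failingDemo    -- Nothing
  print mapMDemo       -- Just [1,2,3]
```

The same functions work in any monad; in IO, for example, mapM print prints each element of a list in turn.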
However, you should know that the Reader monad described in the book is not the same as the Reader monad from the lecture. The book describes the function type (->) r, which is a monad in Haskell. In the lecture, we discussed the Reader type from the Control.Monad.Reader module, which is also a monad. These types can be used for a similar purpose, but they are not the same thing.
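To see the two side by side, here is a sketch (the function names greetFn and greetRd are my own) of the same computation written once in the function monad (->) r and once with the Reader type from Control.Monad.Reader:

```haskell
import Control.Monad.Reader

-- Using the function monad (->) r: every step reads the same
-- shared environment, which here is a String.
greetFn :: String -> String
greetFn = do
  name <- id                  -- 'ask' in the function monad is just 'id'
  pure ("hello " ++ name)

-- The same computation using Control.Monad.Reader's Reader type.
greetRd :: Reader String String
greetRd = do
  name <- ask                 -- read the shared environment
  pure ("hello " ++ name)

main :: IO ()
main = do
  print (greetFn "world")            -- "hello world"
  print (runReader greetRd "world")  -- "hello world"
```

The function monad is applied directly to its environment, while a Reader value must be unwrapped with runReader; otherwise the two behave alike here.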
Here are some more notes:
When we implement a data structure in a purely functional language, we will often want to show that each operation has a certain amortized running time. More specifically, we want to prove that the total cost of any series of operations is at most the total of the amortized costs of those operations.
Suppose that a data structure is initially empty. We will call its initial state x0. And suppose that we perform a series of n operations on the structure, numbered 1 through n. The state of the structure after the i-th operation will be denoted as xi.
Each operation i has an actual cost Ci. Typically it's difficult to compute a precise actual cost in machine instructions or microseconds. So we may adopt a simplified cost model. For example, in Haskell code we might consider an operation's actual cost to be the number of cons operations that it performs, i.e. the number of times that it prepends a new element to an existing list. (Each of these operations is a small memory allocation.)
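To make this cost model concrete, here is a sketch (the names rev and go are my own) of reversing a list with an accumulator; the only allocations it performs are conses, one per element, so its actual cost on a list of length n is exactly n:

```haskell
-- Reverse a list using an accumulator. The only conses performed are
-- the (x : acc) allocations, one per element of the input list.
rev :: [a] -> [a]
rev = go []
  where
    go acc []       = acc
    go acc (x : xs) = go (x : acc) xs   -- one cons per element

main :: IO ()
main = print (rev [1 .. 5 :: Int])      -- [5,4,3,2,1]
```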
We will assign an amortized cost to each kind of operation. If operation i has amortized cost Ai, then we want to prove that the total cost of a series of n operations is at most its total amortized cost, i.e. that
C1 + ... + Cn <= A1 + ... + An
A useful technique for proving amortized costs is to invent a size (or potential) function S for a data structure. The size of an empty structure should be 0, and the size of any structure should always be non-negative. If we perform a series of n operations on an initially empty structure, then let Si = S(xi), i.e. the size of the structure after the i-th operation.
Then we would like to show that for every operation i,
Ci + (Si - S(i-1)) <= Ai
In other words, the actual cost of the operation plus the size increase that the operation induces is less than or equal to the operation's amortized cost.
If we can show that the inequality above always holds, then we will have
C1 + (S1 - S0) + C2 + (S2 - S1) + ... + Cn + (Sn - S(n-1)) <= A1 + ... + An
or
C1 + ... + Cn + (Sn - S0) <= A1 + ... + An
But S0 = 0 and Sn >= 0, so we have
C1 + ... + Cn <= A1 + ... + An
This is what we wanted to show above, i.e. that the total actual cost is no greater than the total of the amortized costs of the operations.
For example, consider the functional queue data structure that we described earlier in this course (and that you implemented in one homework assignment). We store the queue as two lists called 'front' and 'back'. The enqueue operation prepends a value to the 'back' list. The dequeue operation removes a value from the 'front' list if it's non-empty. Otherwise, it reverses the 'back' list and moves the result to the front, then removes a value from the 'front' list. Assume that the actual cost of each of these operations is the number of cons operations that it performs.
We'd like to show that both of these operations run in constant time
on average, i.e. that we can assign each of them a constant amortized
time.
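For reference, here is a minimal sketch of this two-list queue in Haskell (the names are my own; your homework version may differ in details):

```haskell
-- A queue stored as a 'front' list and a 'back' list.
data Queue a = Queue [a] [a]

emptyQueue :: Queue a
emptyQueue = Queue [] []

-- enqueue performs exactly one cons, prepending onto the back list.
enqueue :: a -> Queue a -> Queue a
enqueue x (Queue front back) = Queue front (x : back)

-- dequeue takes from the front list if it's non-empty (zero conses);
-- otherwise it reverses the back list (one cons per element) and
-- moves the result to the front before removing an element.
dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue (x : front) back) = Just (x, Queue front back)
dequeue (Queue [] back) =
  case reverse back of
    []           -> Nothing          -- the queue is empty
    (x : front') -> Just (x, Queue front' [])

main :: IO ()
main =
  case dequeue (enqueue 2 (enqueue (1 :: Int) emptyQueue)) of
    Just (x, _) -> print x           -- 1, the first element enqueued
    Nothing     -> putStrLn "empty"
```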
Let the size S of the queue be the number of elements in the 'back' list. Then
An enqueue operation has actual cost 1 since it prepends a single element. It increases the size S of the data structure by 1. Then we have
Ci + (Si - S(i-1)) = 1 + 1 = 2
So let the amortized cost A of this operation be 2.
A dequeue operation might simply remove an element from the 'front' list. In that case its actual cost is 0 and it produces a size change of 0. If it reverses the 'back' list and moves it to the front, then let N be the number of elements in the 'back' list. In this case the actual cost of the operation is N since it must perform N cons operations to reverse the list. This also reduces the size of the data structure by N. So we have
Ci + (Si - S(i-1)) = N - N = 0
So let the amortized cost A of this operation be 0.
For both operations, we've shown that
Ci + (Si - S(i-1)) <= Ai
which proves that the total cost of a series of operations will be no greater than those operations' total amortized cost. The specific amortized costs we chose (2 and 0) are not so important, since really we just want to know that each of the operations runs in constant time on average.
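To check this bound empirically, here is a sketch (the names Q, enq, deq, and totalCost are my own) of an instrumented queue whose operations also report their actual cost in conses:

```haskell
-- An instrumented queue: each operation reports the number of conses
-- it performs, so we can compare total actual cost against the
-- amortized bound of 2 per enqueue and 0 per dequeue.
data Q a = Q [a] [a]

enq :: a -> Q a -> (Int, Q a)
enq x (Q front back) = (1, Q front (x : back))

deq :: Q a -> (Int, Maybe (a, Q a))
deq (Q (x : front) back) = (0, Just (x, Q front back))
deq (Q [] back) =
  case reverse back of
    []           -> (0, Nothing)
    (x : front') -> (length back, Just (x, Q front' []))

-- Total actual cost of n enqueues followed by dequeues until empty.
totalCost :: Int -> Int
totalCost n = enqCost + deqCost queueAfter
  where
    (enqCost, queueAfter) = foldl step (0, Q [] []) [1 .. n]
    step (c, q) x = let (c', q') = enq x q in (c + c', q')
    deqCost q = case deq q of
      (c, Nothing)      -> c
      (c, Just (_, q')) -> c + deqCost q'

main :: IO ()
main = print (totalCost 5)   -- 10, matching the amortized bound 2*5
```

For n enqueues followed by dequeues until empty, the actual total is 2n conses (n for the enqueues, n for the single reversal), exactly the sum of the amortized costs.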
Now consider a symmetric list, otherwise known as a double-ended queue. This data structure has operations enqueueFront, dequeueFront, enqueueBack and dequeueBack.
We can implement a symmetric list as a pair of lists 'front' and 'back', just like in the previous example. The enqueueFront operation prepends an element to the 'front' list. dequeueFront will remove the first element of the 'front' list if it's non-empty. Otherwise, it will split the 'back' list into two equal-length halves A and B. It will reverse list B and move it to the front, and will keep list A as the new 'back' list. The enqueueBack and dequeueBack operations work symmetrically. I won't provide Haskell code here, but writing this yourself is a useful exercise.
Let's show that all of the operations on a symmetric list will run in amortized constant time. Let the size S of this structure be the absolute value of the difference in lengths between the 'front' and 'back' lists:
S = abs(length(front) - length(back))
Now consider the operations:
The enqueueFront operation has actual cost 1 since it prepends a single element. It will increase the size S by at most 1 (though it might sometimes even decrease S). So
Ci + (Si - S(i-1)) <= 1 + 1 = 2
Let the amortized cost A of this operation be 2.
The dequeueFront operation might simply remove an element from the 'front' list. In that case its actual cost is 0 and it produces a size increase of at most 1, so
Ci + (Si - S(i-1)) <= 0 + 1 = 1
Now suppose that the 'front' list is empty. Let N be the number of elements in the 'back' list. Then dequeueFront will split the back list into two, and will reverse one half of it and move it to the front. This will use N cons operations, since it will take (N / 2) conses to build each of the new 'front' and 'back' lists. After that, it will remove one element from the 'front' list. Before the operation, the size S is N; afterward it is 1 (since the 'front' list has one fewer element than the back). So the size change is (1 - N). We have
Ci + (Si - S(i-1)) <= N + (1 - N) = 1
Let the amortized cost A of this operation be 1.
(Strictly speaking, we assumed here that N was even, so that after splitting, the 'front' and 'back' lists had the same number of elements. In this kind of analysis, a difference of a single element will almost never affect the asymptotic outcome, so we will simply wave our hands and assert that if N is odd, the outcome will be similar.)
The other operations (enqueueBack and dequeueBack) are symmetric.
In conclusion, we've shown that all operations on this data structure run in constant time on average.