Here are notes about topics we discussed in lecture 11. For more details about iterators, see the Essential C# textbook or the C# reference pages. For more about dynamic programming, see Introduction to Algorithms, ch. 15 "Dynamic Programming".
Iterators are a powerful language feature that makes it very easy to write code that generates and manipulates sequences. (They are present in some other languages too; for example, they also exist in Python, where they are called generators.)
In C#, an iterator is a special kind of method that generates a sequence of values. Each time the caller requests the next value in the sequence, the iterator's code runs until it reaches a yield return statement, which yields the next value in the sequence. At that point the iterator is suspended until the caller requests the next value, at which point the code continues executing until the next yield return statement, and so on. When execution reaches the end of the iterator method body, the sequence is complete.
An iterator must have return type IEnumerable<T> (or IEnumerator<T>) for some (concrete or generic) type T. Here is a simple iterator:
static IEnumerable<int> range(int start, int end) {
    for (int i = start; i <= end; ++i)
        yield return i;
}
Now, for example, we can add the squares of the numbers from 1 to 10 like this:
using System.Linq;

int sum = range(1, 10).Select(i => i * i).Sum();
Here is an iterator representing the infinite Fibonacci sequence:
static IEnumerable<int> fibs() {
    int a = 1, b = 1;
    while (true) {
        yield return a;
        int c = a + b;
        a = b;
        b = c;
    }
}
Using this iterator, it is easy to compute, for example, the sum of the even Fibonacci numbers that are less than or equal to a given number n:
static int fibSum(int n) => fibs().TakeWhile(i => i <= n).Where(i => i % 2 == 0).Sum();
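For example, the even Fibonacci numbers up to 100 are 2, 8 and 34, so fibSum(100) should return 44. Here is a self-contained sketch that puts the pieces together (written as a top-level C# program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static IEnumerable<int> fibs() {
    int a = 1, b = 1;
    while (true) {
        yield return a;
        int c = a + b;
        a = b;
        b = c;
    }
}

static int fibSum(int n) => fibs().TakeWhile(i => i <= n).Where(i => i % 2 == 0).Sum();

Console.WriteLine(fibSum(100));   // 2 + 8 + 34 = 44
```

Note that even though fibs() is an infinite sequence, TakeWhile stops pulling values from it as soon as one exceeds n, so the computation terminates.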
We can use iterators to construct sequences of non-numeric values as well. For example, this iterator yields a sequence of strings representing all lines in a file:
static IEnumerable<string> lines(string filename) {
    // The using declaration ensures the reader is closed even if the
    // caller stops enumerating before reaching the end of the file.
    using StreamReader r = new StreamReader(filename);
    while (r.ReadLine() is string s)
        yield return s;
}
Now we can compute the length of the longest line in a file like this:
static int longestLength(string filename) => lines(filename).Select(s => s.Length).Max();
Note that this method will not read all lines of the file into memory at the same time. In other words, this method will work fine even on a file that is much larger than the computer's memory. This is a major advantage of using enumerations over collections such as arrays or Lists.
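To see this laziness in action, here is a sketch (the file name and contents are made up for the demonstration) that writes a small temporary file and then finds its longest line. Lines are pulled from the file one at a time as Max consumes the sequence:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Create a small sample file so the example is self-contained.
string path = Path.Combine(Path.GetTempPath(), "lines_demo.txt");
File.WriteAllLines(path, new[] { "alpha", "bc", "defg" });

static IEnumerable<string> lines(string filename) {
    using StreamReader r = new StreamReader(filename);
    while (r.ReadLine() is string s)
        yield return s;
}

// No I/O happens here; the iterator body has not started running yet.
IEnumerable<string> seq = lines(path);

// Lines are read one at a time as Max pulls them from the iterator.
int longest = seq.Select(s => s.Length).Max();
Console.WriteLine(longest);   // 5 ("alpha")

File.Delete(path);
```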
We can also use iterators to write methods that take sequences as arguments and return transformed sequences. In other words, we can easily write any of the built-in Linq methods using iterators. For example, here is an implementation of the Linq method Where, which filters a sequence by selecting only elements that pass a given condition:
static IEnumerable<T> where<T>(IEnumerable<T> seq, Predicate<T> cond) {
    foreach (T t in seq)
        if (cond(t))
            yield return t;
}
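For example, we might apply this custom where to an array of integers like this (a quick sketch as a top-level program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static IEnumerable<T> where<T>(IEnumerable<T> seq, Predicate<T> cond) {
    foreach (T t in seq)
        if (cond(t))
            yield return t;
}

int[] nums = { 1, 2, 3, 4, 5, 6 };
int[] evens = where(nums, i => i % 2 == 0).ToArray();
Console.WriteLine(string.Join(", ", evens));   // 2, 4, 6
```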
Dynamic programming is a general technique that we can use to solve a wide variety of problems. Many (but not all) of these problems involve optimization, i.e. finding the shortest/longest/best solution to a certain problem.
Problems solvable using dynamic programming have the following characteristics:
They have a recursive structure. In other words, the problem's solution can be expressed recursively as a function of the solutions to one or more subproblems. A subproblem is a smaller instance of the same problem.
They have overlapping subproblems. In other words, two subproblems Pi and Pj will often depend on the same subproblem Pk. This means that in a direct recursive implementation, the same subproblems will be solved over and over again.
For a dynamic programming problem, typically a direct recursive implementation will run in exponential time. But the actual number of subproblems that need to be solved is much less than exponential. So we can dramatically improve the running time by arranging so that each subproblem will be solved only once. There are two ways to do that. In a top-down implementation, we keep the same recursive code structure but add a cache of solved subproblems. This technique is called memoization. In a bottom-up implementation, we also use a data structure (typically an array) to hold subproblem solutions, but we build up these solutions iteratively.
To illustrate these concepts, let's look at a trivial example of a dynamic programming problem, namely computing the n-th Fibonacci number. Recall that the Fibonacci numbers are defined as
F(1) = 1
F(2) = 1
F(n) = F(n - 1) + F(n - 2)   (n ≥ 3)
yielding the sequence 1, 1, 2, 3, 5, 8, 13, 21, …
Here is a direct recursive implementation of a function to compute the n-th Fibonacci number:
static int fib(int n) => n < 3 ? 1 : fib(n - 1) + fib(n - 2);
What is this function's running time? The running time T(n) obeys the recurrence
T(n) = T(n - 1) + T(n - 2)
This is the recurrence that defines the Fibonacci numbers themselves! In other words,
T(n) = O(F(n))
The Fibonacci numbers themselves increase exponentially. It can be shown mathematically that

F(n) = O(φ^n)

where

φ = (1 + sqrt(5)) / 2
So fib runs in exponential time!
For this or any other dynamic programming problem, we can look at the subproblem graph, which shows how subproblem instances depend on each other. In the subproblem graph, each subproblem is a vertex, and there is an edge from V1 to V2 if subproblem V1 depends on V2. Here is the graph of subproblems we encounter in computing fib(5):
The fact that fib(5) and fib(4) both depend on fib(3) is an example of the overlapping substructure that is characteristic of dynamic programming problems. The number of subproblems of fib(n) is certainly not exponential in n, but fib runs in exponential time because it computes subproblems over and over as it runs.
To improve fib's efficiency, we need to ensure that each subproblem will be solved only once. First, here is a top-down implementation that caches the subproblem results. It uses a nested method that looks much like the direct recursive implementation we saw above. Again, this technique is called memoization:
static int fib(int n) {
    int[] a = new int[n + 1];   // a[i] == 0 means "not yet computed"; all Fibonacci numbers are positive
    int fiba(int i) {
        if (a[i] == 0)
            a[i] = i < 3 ? 1 : fiba(i - 1) + fiba(i - 2);
        return a[i];
    }
    return fiba(n);
}
This memoized version will run in linear time, because the line

a[i] = i < 3 ? 1 : fiba(i - 1) + fiba(i - 2);

runs only once for each value of i. Consequently fiba is called at most 2n times during the calculation of fib(n).
Next, here is a bottom-up implementation that iteratively computes subproblem solutions from smallest to largest:
static int fib3(int n) {
    int[] a = new int[n + 1];   // assumes n >= 2
    a[1] = 1;
    a[2] = 1;
    for (int i = 3; i <= n; ++i)
        a[i] = a[i - 1] + a[i - 2];
    return a[n];
}
Clearly this will also run in linear time.
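As a quick sanity check, the bottom-up version should agree with the direct recursive implementation. Here is a sketch (fibDirect is simply the exponential-time version from above under another name, to avoid a naming clash):

```csharp
using System;

// Direct (exponential-time) recursion, for comparison.
static int fibDirect(int n) => n < 3 ? 1 : fibDirect(n - 1) + fibDirect(n - 2);

// Bottom-up dynamic programming (linear time; assumes n >= 2).
static int fib3(int n) {
    int[] a = new int[n + 1];
    a[1] = 1;
    a[2] = 1;
    for (int i = 3; i <= n; ++i)
        a[i] = a[i - 1] + a[i - 2];
    return a[n];
}

Console.WriteLine(fib3(10));                    // 55
Console.WriteLine(fib3(20) == fibDirect(20));   // True
```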
I generally prefer a bottom-up implementation over a top-down memoized implementation for dynamic programming problems, because:

- A bottom-up implementation is generally more efficient.
- The running time of the bottom-up implementation is usually more obvious.
Computing Fibonacci numbers is so easy that the various methods above may seem trivial. Still, these same steps apply to any dynamic programming problem, so they are worth understanding.
The rod cutting problem is a classic dynamic programming problem, and is stated as follows. Suppose that we have a rod that is N cm long. We may cut it into any number of pieces that we like, but each piece's length must be an integer. We will sell all the pieces, and we have a table of prices that tells us how much we will receive for a piece of any given length. The problem is to determine how to cut the rod so as to maximize our profit.
As a concrete instance of this problem, suppose that we have the following table of prices:
length (cm) | price (Kč)
---|---
1 | 1
2 | 2
3 | 3
4 | 6
5 | 8
6 | 8
7 | 9
8 | 11
And suppose that we have a rod of length 8 cm. How can we cut the rod to receive the greatest profit from selling the pieces?
At first it might appear that a greedy strategy might work, in which we always choose the piece that has the highest ratio of price to length. Let's compute that ratio for the sizes above:
length (cm) | price / length (Kč/cm)
---|---
1 | 1
2 | 1
3 | 1
4 | 1.5
5 | 1.6
6 | 1.33
7 | 1.29
8 | 1.38
Using a greedy strategy, we would first cut a piece of size 5 cm, since that size has the highest price / length ratio. We could sell this piece for 8 Kč. We will receive 3 Kč for the remaining 3 cm (no matter how we cut it), so our total profit will be 11 Kč.
But we can do better: if we split the original rod into two pieces of size 4 cm each, we will receive 6 Kč + 6 Kč = 12 Kč. This shows that a greedy strategy will not work in general.
Fortunately it is not difficult to express an optimal solution recursively. Let price[k] be the price for selling a piece of size k, as listed in the table above. We want to compute rodProfit(N), which denotes the maximum profit that we can attain by chopping a rod of length N into pieces and selling them. We proceed as follows. Any partition of the rod will begin with a piece of size k cm for some value 1 ≤ k ≤ N. Selling that piece will yield a profit of price[k]. The maximum profit for dividing and selling the rest of the rod will be rodProfit(N – k). So for any k, the maximum profit for any partition that begins with a piece of size k is
price[k] + rodProfit(N – k)
So rodProfit(N), i.e. the maximum profit for all possible partitions, will equal the maximum of the above expression over all values of k in the range 1 ≤ k ≤ N.
Here is a recursive C# method to compute rodProfit(N) given a table of prices for individual rod sizes:
static int rodProfit(int[] prices, int n) {
    if (n == 0)
        return 0;
    int b = int.MinValue;
    for (int k = 1; k <= n; ++k)
        b = Max(b, prices[k] + rodProfit(prices, n - k));
    return b;
}
Alternatively, we can write this method more compactly if we first write a higher-order function that computes the maximum value of a function over a range of integers:
// Return the maximum value of f(i) over the range start <= i <= end.
public static int max(int start, int end, Func<int, int> f) =>
    Enumerable.Range(start, end - start + 1).Select(f).Max();
Now we may write rodProfit more easily as follows:
static int rodProfit(int[] prices, int n) =>
    n == 0 ? 0 : max(1, n, k => prices[k] + rodProfit(prices, n - k));
Either of these formulations of rodProfit will run in exponential time. That's because they will solve the same subproblems over and over again, just like our direct recursive implementation of the Fibonacci sequence.
The subproblem graph for rodProfit has a simple structure: each instance of rodProfit depends on all smaller instances. Here is the subproblem graph for N ≤ 5:
So we can use bottom-up dynamic programming, solving all instances in order from smallest to largest. Once again, we use the higher-order function max that we defined above:
static int rodProfit(int[] prices, int n) {
    int[] best = new int[n + 1];
    best[0] = 0;
    for (int k = 1; k <= n; ++k)
        best[k] = max(1, k, j => prices[j] + best[k - j]);
    return best[n];
}
How efficient is this? The call to max runs in time O(k), and we call max once for each value of k from 1 to n. We have seen many times that this pattern yields a running time of O(n^2). This is a dramatic improvement over the previous exponential-time solution.
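We can check the bottom-up method against the price table from the beginning of this section: for an 8 cm rod it should find the 12 Kč solution (two 4 cm pieces) rather than the greedy 11 Kč one. A self-contained sketch:

```csharp
using System;
using System.Linq;

// The maximum value of f(i) over start <= i <= end (the helper defined above).
static int max(int start, int end, Func<int, int> f) =>
    Enumerable.Range(start, end - start + 1).Select(f).Max();

static int rodProfit(int[] prices, int n) {
    int[] best = new int[n + 1];
    best[0] = 0;
    for (int k = 1; k <= n; ++k)
        best[k] = max(1, k, j => prices[j] + best[k - j]);
    return best[n];
}

// priceTable[k] = price of a piece of length k cm; index 0 is unused.
int[] priceTable = { 0, 1, 2, 3, 6, 8, 8, 9, 11 };
Console.WriteLine(rodProfit(priceTable, 8));   // 12
```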
Another classic dynamic programming problem is computing the longest common subsequence of two sequences. We will say that U is a subsequence of S if all the elements of U appear in S in the same order in which they appear in U (though not necessarily contiguously). Then the longest common subsequence of sequences S and T is the longest sequence U that is a subsequence of both S and T.
For example, consider the sequences
S = 2, 6, 3, 6, 7, 8, 2, 3, 5, 6
and
T = 5, 2, 3, 6, 4, 7, 8, 2, 6, 7
The longest common subsequence (lcs) of S and T is
U = 2, 3, 6, 7, 8, 2, 6
Note that the longest common subsequence is not necessarily unique. For example, if
S = 1, 2, 3, 4
T = 2, 1, 3, 4
then both
U1 = 1, 3, 4
and
U2 = 2, 3, 4
are subsequences of S and T. There is no longer common subsequence, so both U1 and U2 are longest common subsequences of S and T.
With a bit of thought we can come up with a clever recursive formulation for longest common subsequences. Suppose that
S = S1, S2, …, Sj
and
T = T1, T2, …, Tk
As a base case, if either j = 0 or k = 0 then either S or T is empty, so the lcs is also the empty sequence.
Otherwise there are two possible cases. First suppose that Sj = Tk. Let v = Sj = Tk. Now the longest common subsequence of S and T must end with v. (If it didn't, we could append v to it to obtain a longer common subsequence.) So lcs(S, T) must have the form U, v, where U is the longest common subsequence of S[1..j-1] and T[1..k-1].
Now suppose that Sj ≠ Tk. Consider any common subsequence U of S and T. If U ends in Sj, then it cannot end in Tk, so U must be a common subsequence of S[1..j] and T[1..k-1]. Otherwise U does not end in Sj, so it is a common subsequence of S[1..j-1] and T[1..k]. So the common subsequences of S and T are the union of the common subsequences of S[1..j] and T[1..k-1] and the common subsequences of S[1..j-1] and T[1..k]. This implies that the longest common subsequence lcs(S, T) is the longer of lcs(S[1..j], T[1..k-1]) and lcs(S[1..j-1], T[1..k]).
Note that in either of these cases we have reduced the problem to a smaller instance, since we have decreased j or k (or both) in the subproblem(s) that we reference.
We can take strings as a specific example of sequences: a string is a sequence of characters. Here is a recursive C# method that computes the length of the longest common subsequence of two strings, using the recursive formulation described above:
static int lcs_len(string s, int j, string t, int k) {
    if (j == 0 || k == 0)
        return 0;
    return s[j - 1] == t[k - 1]
        ? lcs_len(s, j - 1, t, k - 1) + 1
        : Max(lcs_len(s, j, t, k - 1), lcs_len(s, j - 1, t, k));
}

static int lcs_len(string s, string t) => lcs_len(s, s.Length, t, t.Length);
This method will run in exponential time since it will solve the same subproblems over and over, just like the direct recursive implementations of the other problems above.
We'd now like to use bottom-up dynamic programming to write a more efficient implementation. For this problem, the subproblem graph has a two-dimensional structure: each subproblem instance has two parameters j and k. We can visualize this graph using a two-dimensional grid, where rows are numbered from 0 through length(S) and columns are numbered from 0 through length(T). Let the value of each grid square L[j,k] be the length of lcs(S[1..j], T[1..k]). Then according to the recursive formulation above, L[0,k] = L[j,0] = 0 for any j and k. Furthermore, if j ≥ 1 and k ≥ 1 then L[j,k] depends only on the characters Sj and Tk plus the values of L[j-1,k-1], L[j,k-1] and L[j-1,k].

So our bottom-up implementation will fill in the values in a two-dimensional array. We must do so in an order that ensures that the dependent values L[j-1,k-1], L[j,k-1] and L[j-1,k] will be calculated before we calculate L[j,k]. Fortunately this is easy to do: we can fill in the table one row (or column) at a time. Here is an implementation in C#:
static int lcs_len(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);
    return best[s.Length, t.Length];
}
If s and t both have length N, then this method's running time is evidently O(N^2).
Now we'd like to modify our method so that it returns the actual longest common subsequence as a string, rather than merely returning its length. We could do so by changing the best[] array to be an array of strings rather than an array of integers. In the double loop above, instead of filling in the length of each lcs string (i.e. the length of lcs(S[1..j], T[1..k])) we could fill in the strings themselves. But that would not be very efficient: specifically it would require O(N^3) time and memory if s and t have length N.
Instead, we first fill in the best[] array with integers using the code above, and afterward we can use these values to reconstruct the lcs itself. To do this, we can use a nested method that computes the lcs recursively following the recursive formulation above. Our method will use the values in the best[] array to decide which way to go at each step (i.e. whether to decrement j or k by 1) rather than having to try both paths as in our original recursive method above. This nested method, though recursive, will run in linear time! Here is the updated implementation:
static string lcs(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);

    string lcsa(int j, int k) {
        if (j == 0 || k == 0)
            return "";
        if (s[j - 1] == t[k - 1])
            return lcsa(j - 1, k - 1) + s[j - 1];
        return best[j, k - 1] > best[j - 1, k] ? lcsa(j, k - 1) : lcsa(j - 1, k);
    }

    return lcsa(s.Length, t.Length);
}
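We can check this method on the small example from earlier in this section: the sequences 1, 2, 3, 4 and 2, 1, 3, 4 (written as strings here) have longest common subsequences of length 3. A self-contained sketch:

```csharp
using System;
using static System.Math;

static string lcs(string s, string t) {
    // Fill in the table of lcs lengths, exactly as in lcs_len.
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);

    // Reconstruct one longest common subsequence from the table.
    string lcsa(int j, int k) {
        if (j == 0 || k == 0)
            return "";
        if (s[j - 1] == t[k - 1])
            return lcsa(j - 1, k - 1) + s[j - 1];
        return best[j, k - 1] > best[j - 1, k] ? lcsa(j, k - 1) : lcsa(j - 1, k);
    }
    return lcsa(s.Length, t.Length);
}

Console.WriteLine(lcs("1234", "2134").Length);   // 3
Console.WriteLine(lcs("1234", "2134"));          // "134" (which of the two answers we get depends on tie-breaking)
```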
Alternatively we could use an iterative loop instead of the recursive nested method lcsa, but arguably the nested method is clearer, since it reflects the recursive problem structure that we have been using all along.
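For completeness, here is a sketch of that iterative alternative: after filling in best[] as before, we walk backward from best[s.Length, t.Length] toward (0, 0), prepending a character whenever the two current characters match. It makes the same moves as the recursive lcsa, so it produces the same string:

```csharp
using System;
using System.Text;
using static System.Math;

static string lcs(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j2 = 1; j2 <= s.Length; ++j2)
        for (int k2 = 1; k2 <= t.Length; ++k2)
            if (s[j2 - 1] == t[k2 - 1])
                best[j2, k2] = best[j2 - 1, k2 - 1] + 1;
            else
                best[j2, k2] = Max(best[j2 - 1, k2], best[j2, k2 - 1]);

    // Iterative reconstruction: walk from the bottom-right corner back to (0, 0).
    var sb = new StringBuilder();
    int j = s.Length, k = t.Length;
    while (j > 0 && k > 0) {
        if (s[j - 1] == t[k - 1]) {
            sb.Insert(0, s[j - 1]);   // this character is part of the lcs
            --j; --k;
        } else if (best[j, k - 1] > best[j - 1, k]) {
            --k;
        } else {
            --j;
        }
    }
    return sb.ToString();
}

Console.WriteLine(lcs("1234", "2134"));   // "134"
```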