Here are notes about topics we discussed in lecture 11. For more details about iterators, see the Essential C# textbook or the C# reference pages. For more about dynamic programming, see Introduction to Algorithms, ch. 15 "Dynamic Programming".
Iterators are a powerful language feature that makes it very easy to write code that generates and manipulates sequences. (They are present in some other languages too; for example, they also exist in Python, where they are called generators.)
In C#, an iterator is a special kind of method that generates a sequence of values. Each time the caller requests the next value in the sequence, the iterator's code runs until it reaches a yield return statement, which yields the next value in the sequence. At that point the iterator is suspended until the caller requests the next value, at which point the code continues executing until the next yield return statement, and so on. When execution reaches the end of the iterator method body, the sequence is complete.
An iterator must have return type IEnumerable<T> (or IEnumerator<T>) for some (concrete or generic) type T. Here is a simple iterator:
static IEnumerable<int> range(int start, int end) {
    for (int i = start; i <= end; ++i)
        yield return i;
}
Now, for example, we can add the squares of the numbers from 1 to 10 like this:
using System.Linq;

int sum = range(1, 10).Select(i => i * i).Sum();
Here is an iterator representing the infinite Fibonacci sequence:
static IEnumerable<int> fibs() {
    int a = 1, b = 1;
    while (true) {
        yield return a;
        int c = a + b;
        a = b;
        b = c;
    }
}
Using this iterator, it is easy to compute, for example, the sum of the even Fibonacci numbers that are less than or equal to a given number n:
static int fibSum(int n) => fibs().TakeWhile(i => i <= n).Where(i => i % 2 == 0).Sum();
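For example, the even Fibonacci numbers up to 100 are 2, 8 and 34, so fibSum(100) should return 44. Here is a self-contained sketch that puts the pieces together (written as a top-level C# program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static IEnumerable<int> fibs() {
    int a = 1, b = 1;
    while (true) {
        yield return a;
        int c = a + b;
        a = b;
        b = c;
    }
}

static int fibSum(int n) => fibs().TakeWhile(i => i <= n).Where(i => i % 2 == 0).Sum();

Console.WriteLine(fibSum(100));   // 2 + 8 + 34 = 44
```

Note that even though fibs() is an infinite sequence, TakeWhile stops pulling values from it as soon as one exceeds n, so the computation terminates.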
We can use iterators to construct sequences of non-numeric values as well. For example, this iterator yields a sequence of strings representing all lines in a file:
static IEnumerable<string> lines(string filename) {
    // The using declaration ensures the reader is closed even if the
    // caller stops enumerating before reaching the end of the file.
    using StreamReader r = new StreamReader(filename);
    while (r.ReadLine() is string s)
        yield return s;
}
Now we can compute the length of the longest line in a file like this:
static int longestLength(string filename) => lines(filename).Select(s => s.Length).Max();
Note that this method will not read all lines of the file into memory at the same time. In other words, this method will work fine even on a file that is much larger than the computer's memory. This is a major advantage of using enumerations over collections such as arrays or Lists.
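To see this laziness in action, here is a sketch (the file name and contents are made up for the demonstration) that writes a small temporary file and then finds its longest line. Lines are pulled from the file one at a time as Max consumes the sequence:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Create a small sample file so the example is self-contained.
string path = Path.Combine(Path.GetTempPath(), "lines_demo.txt");
File.WriteAllLines(path, new[] { "alpha", "bc", "defg" });

static IEnumerable<string> lines(string filename) {
    using StreamReader r = new StreamReader(filename);
    while (r.ReadLine() is string s)
        yield return s;
}

// No I/O happens here; the iterator body has not started running yet.
IEnumerable<string> seq = lines(path);

// Lines are read one at a time as Max pulls them from the iterator.
int longest = seq.Select(s => s.Length).Max();
Console.WriteLine(longest);   // 5 ("alpha")

File.Delete(path);
```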
We can also use iterators to write methods that take sequences as arguments and return transformed sequences. In other words, we can easily write any of the built-in Linq methods using iterators. For example, here is an implementation of the Linq method Where, which filters a sequence by selecting only elements that pass a given condition:
static IEnumerable<T> where<T>(IEnumerable<T> seq, Predicate<T> cond) {
    foreach (T t in seq)
        if (cond(t))
            yield return t;
}
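For example, we might apply this custom where to an array of integers like this (a quick sketch as a top-level program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static IEnumerable<T> where<T>(IEnumerable<T> seq, Predicate<T> cond) {
    foreach (T t in seq)
        if (cond(t))
            yield return t;
}

int[] nums = { 1, 2, 3, 4, 5, 6 };
int[] evens = where(nums, i => i % 2 == 0).ToArray();
Console.WriteLine(string.Join(", ", evens));   // 2, 4, 6
```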
Dynamic programming is a general technique that we can use to solve a wide variety of problems. Many (but not all) of these problems involve optimization, i.e. finding the shortest/longest/best solution to a certain problem.
Problems solvable using dynamic programming have the following characteristics:
They have a recursive structure. In other words, the problem's solution can be expressed recursively as a function of the solutions to one or more subproblems. A subproblem is a smaller instance of the same problem.
They have overlapping subproblems. In other words, two subproblems Pi and Pj will often depend on the same subproblem Pk. This means that in a direct recursive implementation, the same subproblems will be solved over and over again.
For a dynamic programming problem, typically a direct recursive implementation will run in exponential time. But the actual number of subproblems that need to be solved is much less than exponential. So we can dramatically improve the running time by arranging so that each subproblem will be solved only once. There are two ways to do that. In a top-down implementation, we keep the same recursive code structure but add a cache of solved subproblems. This technique is called memoization. In a bottom-up implementation, we also use a data structure (typically an array) to hold subproblem solutions, but we build up these solutions iteratively.
To illustrate these concepts, let's look at a trivial example of a dynamic programming problem, namely computing the n-th Fibonacci number. Recall that the Fibonacci numbers are defined as
F(1) = 1
F(2) = 1
F(n) = F(n - 1) + F(n - 2)   (n ≥ 3)
yielding the sequence 1, 1, 2, 3, 5, 8, 13, 21, …
Here is a direct recursive implementation of a function to compute the n-th Fibonacci number:
static int fib(int n) => n < 3 ? 1 : fib(n - 1) + fib(n - 2);
What is this function's running time? The running time T(n) obeys the recurrence
T(n) = T(n - 1) + T(n - 2)
This is the recurrence that defines the Fibonacci numbers themselves! In other words,
T(n) = O(F(n))
The Fibonacci numbers themselves increase exponentially. It can be shown mathematically that

F(n) = O(φ^n)

where

φ = (1 + sqrt(5)) / 2
So fib runs in exponential time!
For this or any other dynamic programming problem, we can look at the subproblem graph, which shows how subproblem instances depend on each other. In the subproblem graph, each subproblem is a vertex, and there is an edge from V1 to V2 if subproblem V1 depends on V2. Here is the graph of subproblems we encounter in computing fib(5):
The fact that fib(5) and fib(4) both depend on fib(3) is an example of the overlapping substructure that is characteristic of dynamic programming problems. The number of subproblems of fib(n) is certainly not exponential in n, but fib runs in exponential time because it computes subproblems over and over as it runs.
To improve fib's efficiency, we need to ensure that each subproblem will be solved only once. First, here is a top-down implementation that caches the subproblem results. It uses a nested method that looks much like the direct recursive implementation we saw above. Again, this technique is called memoization:
static int fib(int n) {
    int[] a = new int[n + 1];   // a[i] == 0 means "not yet computed"; all Fibonacci numbers are positive
    int fiba(int i) {
        if (a[i] == 0)
            a[i] = i < 3 ? 1 : fiba(i - 1) + fiba(i - 2);
        return a[i];
    }
    return fiba(n);
}
This memoized version will run in linear time, because the line

a[i] = i < 3 ? 1 : fiba(i - 1) + fiba(i - 2);

runs only once for each value of i. Consequently fiba is called at most 2n times during the calculation of fib(n).
Next, here is a bottom-up implementation that iteratively computes subproblem solutions from smallest to largest:
static int fib3(int n) {
    int[] a = new int[n + 1];   // assumes n >= 2
    a[1] = 1;
    a[2] = 1;
    for (int i = 3; i <= n; ++i)
        a[i] = a[i - 1] + a[i - 2];
    return a[n];
}
Clearly this will also run in linear time.
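As a quick sanity check, the bottom-up version should agree with the direct recursive implementation. Here is a sketch (fibDirect is simply the exponential-time version from above under another name, to avoid a naming clash):

```csharp
using System;

// Direct (exponential-time) recursion, for comparison.
static int fibDirect(int n) => n < 3 ? 1 : fibDirect(n - 1) + fibDirect(n - 2);

// Bottom-up dynamic programming (linear time; assumes n >= 2).
static int fib3(int n) {
    int[] a = new int[n + 1];
    a[1] = 1;
    a[2] = 1;
    for (int i = 3; i <= n; ++i)
        a[i] = a[i - 1] + a[i - 2];
    return a[n];
}

Console.WriteLine(fib3(10));                    // 55
Console.WriteLine(fib3(20) == fibDirect(20));   // True
```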
I generally prefer a bottom-up implementation over a top-down memoized implementation for dynamic programming problems, because:

- A bottom-up implementation is generally more efficient.
- The running time of the bottom-up implementation is usually more obvious.
Computing Fibonacci numbers is so easy that the various methods above may seem trivial. Still, these same steps apply to any dynamic programming problem, so they are worth understanding.
The rod cutting problem is a classic dynamic programming problem, and is stated as follows. Suppose that we have a rod that is N cm long. We may cut it into any number of pieces that we like, but each piece's length must be an integer. We will sell all the pieces, and we have a table of prices that tells us how much we will receive for a piece of any given length. The problem is to determine how to cut the rod so as to maximize our profit.
As a concrete instance of this problem, suppose that we have the following table of prices:
length (cm) | price (Kč)
---|---
1 | 1
2 | 2
3 | 3
4 | 6
5 | 8
6 | 8
7 | 9
8 | 11
And suppose that we have a rod of length 8 cm. How can we cut the rod to receive the greatest profit from selling the pieces?
At first it might appear that a greedy strategy might work, in which we always choose the piece that has the highest ratio of price to length. Let's compute that ratio for the sizes above:
length (cm) | price / length (Kč/cm)
---|---
1 | 1
2 | 1
3 | 1
4 | 1.5
5 | 1.6
6 | 1.33
7 | 1.29
8 | 1.38
Using a greedy strategy, we would first cut a piece of size 5 cm, since that size has the highest price / length ratio. We could sell this piece for 8 Kč. We will receive 3 Kč for the remaining 3 cm (no matter how we cut it), so our total profit will be 11 Kč.
But we can do better: if we split the original rod into two pieces of size 4 cm each, we will receive 6 Kč + 6 Kč = 12 Kč. This shows that a greedy strategy will not work in general.
Fortunately it is not difficult to express an optimal solution recursively. Let price[k] be the price for selling a piece of size k, as listed in the table above. We want to compute rodProfit(N), which denotes the maximum profit that we can attain by chopping a rod of length N into pieces and selling them. We proceed as follows. Any partition of the rod will begin with a piece of size k cm for some value 1 ≤ k ≤ N. Selling that piece will yield a profit of price[k]. The maximum profit for dividing and selling the rest of the rod will be rodProfit(N – k). So for any k, the maximum profit for any partition that begins with a piece of size k is
price[k] + rodProfit(N – k)
So rodProfit(N), i.e. the maximum profit for all possible partitions, will equal the maximum of the above expression over all values of k in the range 1 ≤ k ≤ N.
Here is a recursive C# method to compute rodProfit(N) given a table of prices for individual rod sizes:
static int rodProfit(int[] prices, int n) {
    if (n == 0)
        return 0;
    int b = int.MinValue;
    for (int k = 1; k <= n; ++k)
        b = Max(b, prices[k] + rodProfit(prices, n - k));
    return b;
}
Alternatively, we can write this method more compactly if we first write a higher-order function that computes the maximum value of a function over a range of integers:
// Return the maximum value of f(i) over the range start <= i <= end.
public static int max(int start, int end, Func<int, int> f) =>
    Enumerable.Range(start, end - start + 1).Select(f).Max();
Now we may write rodProfit more easily as follows:
static int rodProfit(int[] prices, int n) =>
    n == 0 ? 0 : max(1, n, k => prices[k] + rodProfit(prices, n - k));
Either of these formulations of rodProfit will run in exponential time. That's because they will solve the same subproblems over and over again, just like our direct recursive implementation of the Fibonacci sequence.
The subproblem graph for rodProfit has a simple structure: each instance of rodProfit depends on all smaller instances. Here is the subproblem graph for N ≤ 5:
So we can use bottom-up dynamic programming, solving all instances in order from smallest to largest. Once again, we use the higher-order function max that we defined above:
static int rodProfit(int[] prices, int n) {
    int[] best = new int[n + 1];
    best[0] = 0;
    for (int k = 1; k <= n; ++k)
        best[k] = max(1, k, j => prices[j] + best[k - j]);
    return best[n];
}
How efficient is this? The call to max runs in time O(k), and we call max once for each value of k from 1 to n. We have seen many times that this pattern yields a running time of O(n^2). This is a dramatic improvement over the previous exponential-time solution.
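We can check the bottom-up method against the price table from the beginning of this section: for an 8 cm rod it should find the 12 Kč solution (two 4 cm pieces) rather than the greedy 11 Kč one. A self-contained sketch:

```csharp
using System;
using System.Linq;

// The maximum value of f(i) over start <= i <= end (the helper defined above).
static int max(int start, int end, Func<int, int> f) =>
    Enumerable.Range(start, end - start + 1).Select(f).Max();

static int rodProfit(int[] prices, int n) {
    int[] best = new int[n + 1];
    best[0] = 0;
    for (int k = 1; k <= n; ++k)
        best[k] = max(1, k, j => prices[j] + best[k - j]);
    return best[n];
}

// priceTable[k] = price of a piece of length k cm; index 0 is unused.
int[] priceTable = { 0, 1, 2, 3, 6, 8, 8, 9, 11 };
Console.WriteLine(rodProfit(priceTable, 8));   // 12
```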
Another classic dynamic programming problem is computing the longest common subsequence of two sequences. We will say that U is a subsequence of S if all the elements of U appear in S in the same order in which they appear in U (though not necessarily contiguously). Then the longest common subsequence of sequences S and T is the longest sequence U that is a subsequence of both S and T.
For example, consider the sequences
S = 2, 6, 3, 6, 7, 8, 2, 3, 5, 6
and
T = 5, 2, 3, 6, 4, 7, 8, 2, 6, 7
The longest common subsequence (lcs) of S and T is
U = 2, 3, 6, 7, 8, 2, 6
Note that the longest common subsequence is not necessarily unique. For example, if
S = 1, 2, 3, 4
T = 2, 1, 3, 4
then both
U1 = 1, 3, 4
and
U2 = 2, 3, 4
are subsequences of S and T. There is no longer common subsequence, so both U1 and U2 are longest common subsequences of S and T.
With a bit of thought we can come up with a clever recursive formulation for longest common subsequences. Suppose that
S = S1, S2, …, Sj
and
T = T1, T2, …, Tk
As a base case, if either j = 0 or k = 0 then either S or T is empty, so the lcs is also the empty sequence.
Otherwise there are two possible cases. First suppose that Sj = Tk. Let v = Sj = Tk. Now the longest common subsequence of S and T must end with v. (If it didn't, we could append v to it to obtain a longer common subsequence.) So lcs(S, T) must have the form U, v, where U is the longest common subsequence of S[1..j-1] and T[1..k-1].
Now suppose that Sj ≠ Tk. Consider any common subsequence U of S and T. If U ends in Sj, then it cannot end in Tk, so U must be a common subsequence of S[1..j] and T[1..k-1]. Otherwise U does not end in Sj, so it is a common subsequence of S[1..j-1] and T[1..k]. So the common subsequences of S and T are the union of the common subsequences of S[1..j] and T[1..k-1] and the common subsequences of S[1..j-1] and T[1..k]. This implies that the longest common subsequence lcs(S, T) is the longer of lcs(S[1..j], T[1..k-1]) and lcs(S[1..j-1], T[1..k]).
Note that in either of these cases we have reduced the problem to a smaller instance, since we have decreased j or k (or both) in the subproblem(s) that we reference.
We can take strings as a specific example of sequences: a string is a sequence of characters. Here is a recursive C# method that computes the length of the longest common subsequence of two strings, using the recursive formulation described above:
static int lcs_len(string s, int j, string t, int k) {
    if (j == 0 || k == 0)
        return 0;
    return s[j - 1] == t[k - 1]
        ? lcs_len(s, j - 1, t, k - 1) + 1
        : Max(lcs_len(s, j, t, k - 1), lcs_len(s, j - 1, t, k));
}

static int lcs_len(string s, string t) => lcs_len(s, s.Length, t, t.Length);
This method will run in exponential time since it will solve the same subproblems over and over, just like the direct recursive implementations of the other problems above.
We'd now like to use bottom-up dynamic programming to write a more efficient implementation. For this problem, the subproblem graph has a two-dimensional structure: each subproblem instance has two parameters j and k. We can visualize this graph using a two-dimensional grid, where rows are numbered from 0 through length(S) and columns are numbered from 0 through length(T). Let the value of each grid square L[j,k] be the length of lcs(S[1..j], T[1..k]). Then according to the recursive formulation above, L[0,k] = L[j,0] = 0 for any j and k. Furthermore, if j ≥ 1 and k ≥ 1 then L[j,k] depends only on the characters Sj and Tk plus the values of L[j-1,k-1], L[j,k-1] and L[j-1,k].

So our bottom-up implementation will fill in the values in a two-dimensional array. We must do so in an order that ensures that the dependent values L[j-1,k-1], L[j,k-1] and L[j-1,k] will be calculated before we calculate L[j,k]. Fortunately this is easy to do: we can fill in the table one row (or column) at a time. Here is an implementation in C#:
static int lcs_len(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);
    return best[s.Length, t.Length];
}
If s and t both have length N, then this method's running time is evidently O(N^2).
Now we'd like to modify our method so that it returns the actual longest common subsequence as a string, rather than merely returning its length. We could do so by changing the best[] array to be an array of strings rather than an array of integers. In the double loop above, instead of filling in the length of each lcs string (i.e. the length of lcs(S[1..j], T[1..k])) we could fill in the strings themselves. But that would not be very efficient: specifically it would require O(N^3) time and memory if s and t have length N.
Instead, we first fill in the best[] array with integers using the code above, and afterward we can use these values to reconstruct the lcs itself. To do this, we can use a nested method that computes the lcs recursively following the recursive formulation above. Our method will use the values in the best[] array to decide which way to go at each step (i.e. whether to decrement j or k by 1) rather than having to try both paths as in our original recursive method above. This nested method, though recursive, will run in linear time! Here is the updated implementation:
static string lcs(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);

    string lcsa(int j, int k) {
        if (j == 0 || k == 0)
            return "";
        if (s[j - 1] == t[k - 1])
            return lcsa(j - 1, k - 1) + s[j - 1];
        return best[j, k - 1] > best[j - 1, k] ? lcsa(j, k - 1) : lcsa(j - 1, k);
    }

    return lcsa(s.Length, t.Length);
}
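We can check this method on the small example from earlier in this section: the sequences 1, 2, 3, 4 and 2, 1, 3, 4 (written as strings here) have longest common subsequences of length 3. A self-contained sketch:

```csharp
using System;
using static System.Math;

static string lcs(string s, string t) {
    // Fill in the table of lcs lengths, exactly as in lcs_len.
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j = 1; j <= s.Length; ++j)
        for (int k = 1; k <= t.Length; ++k)
            if (s[j - 1] == t[k - 1])
                best[j, k] = best[j - 1, k - 1] + 1;
            else
                best[j, k] = Max(best[j - 1, k], best[j, k - 1]);

    // Reconstruct one longest common subsequence from the table.
    string lcsa(int j, int k) {
        if (j == 0 || k == 0)
            return "";
        if (s[j - 1] == t[k - 1])
            return lcsa(j - 1, k - 1) + s[j - 1];
        return best[j, k - 1] > best[j - 1, k] ? lcsa(j, k - 1) : lcsa(j - 1, k);
    }
    return lcsa(s.Length, t.Length);
}

Console.WriteLine(lcs("1234", "2134").Length);   // 3
Console.WriteLine(lcs("1234", "2134"));          // "134" (which of the two answers we get depends on tie-breaking)
```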
Alternatively we could use an iterative loop instead of the recursive nested method lcsa, but arguably the nested method is clearer, since it reflects the recursive problem structure that we have been using all along.
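For completeness, here is a sketch of that iterative alternative: after filling in best[] as before, we walk backward from best[s.Length, t.Length] toward (0, 0), prepending a character whenever the two current characters match. It makes the same moves as the recursive lcsa, so it produces the same string:

```csharp
using System;
using System.Text;
using static System.Math;

static string lcs(string s, string t) {
    int[,] best = new int[s.Length + 1, t.Length + 1];
    for (int j2 = 1; j2 <= s.Length; ++j2)
        for (int k2 = 1; k2 <= t.Length; ++k2)
            if (s[j2 - 1] == t[k2 - 1])
                best[j2, k2] = best[j2 - 1, k2 - 1] + 1;
            else
                best[j2, k2] = Max(best[j2 - 1, k2], best[j2, k2 - 1]);

    // Iterative reconstruction: walk from the bottom-right corner back to (0, 0).
    var sb = new StringBuilder();
    int j = s.Length, k = t.Length;
    while (j > 0 && k > 0) {
        if (s[j - 1] == t[k - 1]) {
            sb.Insert(0, s[j - 1]);   // this character is part of the lcs
            --j; --k;
        } else if (best[j, k - 1] > best[j - 1, k]) {
            --k;
        } else {
            --j;
        }
    }
    return sb.ToString();
}

Console.WriteLine(lcs("1234", "2134"));   // "134"
```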