Week 11: Notes

There was no lecture or tutorial this week.

Our topics this week include searching using recursion and dynamic programming.

You can read about dynamic programming in chapter 15 "Dynamic Programming" of Introduction to Algorithms.

Here are some more notes on these topics:

searching using recursion

In Programming 1, we learned how to search a graph using a depth-first or breadth-first search. We also noted that we can use these algorithms to search through other spaces with a graph-like structure, such as a chessboard, which we can think of as a graph where each square is a vertex with up to 8 neighbors.

In fact, we can use a similar approach to solve an enormous number of other problems. In some cases we have a physical structure that looks like a graph. And in some cases we are searching through a state space, where there is a starting state and from each state there are a set of possible transitions that can take us to other states. For example, if we are trying to unscramble a Rubik's Cube, the start state is the intiial (scrambled) position of the cube, and each transition is a move that we can make that takes us to another state. The goal state is the position in which the cube is unscrambled.

For problems like this, we can use recursion to perform an exhaustive search of the state space. When we do this, we are really performing a depth-first search, since recursion naturally performs a depth-first search of a tree. Depending on the nature of the problem, we may sometimes want to keep a visited set, to prevent us from walking around in circles. For some problems, we may need to stop searching in a certain direction when we reach a contradiction or some other indication that no solution in that direction is possible. It is difficult to give a general set of rules for solving problems like this, since they vary in nature, and the best way to learn how to tackle them is to practice solving various examples. The key idea to keep in mind is that recursion is a powerful tool for exploring an exponential space. In some cases you may want to use a breadth-first search instead of recursion, such as when you need a shortest path through a state space, but in most cases recursion will give the most straightforward solution.

As an example, let's look at a classic problem of this nature. Can we place N queens on an N-by-N chessboard such that none of them attack any other? (Recall that in chess queens can move and attack horizontally, vertically and diagonally, and can move by any number of squares).

Suppose that N = 8. As a first, naive approach to the problem, we could generate all possible positions of 8 queens on an 8-by-8 chessboard, and check each such position to see whether any two queens attack each other. Since no two queens can occupy the same square, the number of possible positions is 64 choose 8, or (64!)/(56!)(8!). That number equals about 4.4 billion. That's a pretty big number, and especially for larger values of N this approach will be infeasible.

A far more efficient approach is possible. We know that each column can hold at most one queen. So we can first place a queen somewhere in column 1; there are 8 possible ways to do so. Then we can place a queen in column 2, and so on. With this approach, the number of ways to place queens is 8⁸, or about 16.8 million, which is a much smaller number than with the previous approach where we were placing queens everywhere.

However we certainly should not generate all 16.8 million of these possibilities, and check each one individually to see if any queens attack each other. Instead, as we place queens, we will cut off the search at any point where two queens attack each other. To put it differently, we will never consider any path in which a queen is placed in a position that attacks another queen. As a trivial example, if we place the first queen at (1, 1), i.e. row 1 and column 1, then we will not even consider placing the next queen at (1, 2). This search pruning technique dramatically decreases the effective search space, and is widely applicable to many problems.

Below is a C# program that finds and prints a solution to the 8 queens problem. Notice that the queens() method is recursive, and uses an array to keep track of all the positions where queens have been placed so far. Study the program to understand how it works. When we run the program, it prints

Q . . . . . . . 
. . . . . . Q . 
. . . . Q . . . 
. . . . . . . Q 
. Q . . . . . . 
. . . Q . . . . 
. . . . . Q . . 
. . Q . . . . . 

== queens.cs ==

using static System.Console;
using static System.Math;

class Queens {
    // Print the board.
    static void print(int[] a) {
        for (int r = 0 ; r < a.Length ; ++r) {
            for (int c = 0 ; c < a.Length ; ++c)
                Write(a[c] == r ? "Q " : ". ");
            WriteLine();
        }
    }
    
    // Return true if we can set a[k] = row without attacking any of the k queens
    // that have already been placed.
    static bool valid(int row, int k, int[] a) {
        for (int i = 0 ; i < k ; ++i)
            if (a[i] == row || Abs(a[i] - row) == k - i)
                return false;
        return true;
    }
    
    // Find a solution, given that k queens have already been placed.
    // a[i] holds the row number of the queen in column i.
    static bool queens(int k, int[] a) {
        if (k == a.Length) {
            print(a);    // found a solution
            return true;
        }
        for (int row = 0 ; row < a.Length ; ++row)
            if (valid(row, k, a)) {
                a[k] = row;
                if (queens(k + 1, a))
                    return true;
                }
        return false;
    }
    
    static void Main() {
        const int n = 8;
        
        queens(0, new int[n]);
    }
}

compositions, revisited

Last week we saw that we can use iterators in C# to recursively generate sequences of subsets, combinations, permutations, and so on. For example, here is a method we saw last week to generate all compositions of an integer. (Recall that a composition of an integer N is a sequences of positive integers whose sum is N.)

static IEnumerable<string> compositions(int n) {
    if (n == 0)
        yield return "";
    else
        for (int i = 1 ; i <= n ; ++i)
            foreach (string s in compositions(n - i))
                yield return i + " " + s;
}
static void printCompositions(int n) {
    foreach (string s in compositions(n))
        WriteLine(s);
}

In fact, we can generate and print any of these kinds of sequences without using iterators at all. Here is a similar recursive method that will print all compositions of an integer, without using an iterator:

  static void compositions(string prefix, int n) {
      if (n == 0)
          WriteLine(prefix.Trim());
      else
          for (int i = 1 ; i <= n ; ++i)
              compositions(prefix + " " + i, n - i);
  }
  
  static void printCompositions(int n) {
      compositions("", n);
  }

The method uses a string parameter "prefix" to accumulate all integers in a composition that it has seen so far. (This is somewhat similar to the parameter int[] a in the queens() method in the previous example.) When it reaches the end of a composition, it prints out the string. (If you like, you can think of this as a state space search, where in the initial state we have no integers in the composition, and at each step we choose an integer i to add to the composition until the composition is complete.)

Which approach is better - using an iterator, or a string prefix? The approach with an iterator is more general, since it returns a sequence of compositions and the caller can do anything they like with that sequence. (This is a more "functional" approach to the problem). But the string prefix approach is arguably simpler, since it gets the job done without using any fancy iterator machinery. Eiter approach might be appropriate depending on the problem you are trying to solve and on your programming style.

dynamic programming

You will learn about dynamic programming both in this class and also in Algorithms and Data Structures 1. As with some other topics, the discussion in ADS 1 will be a bit more theoretical, and in this class we will focus more on writing code to solve particular problems.

Dynamic programming is a general technique that we can use to solve a wide variety of problems. Many of these problems involve optimization, i.e. finding the shortest/longest/best solution to a certain problem.

Problems solvable using dynamic programming generally have the following characteristics:

They have a recursive structure. In other words, the problem's solution can be expressed recursively as a function of the solutions to one or more subproblems. A subproblem is a smaller instance of the same problem.
They have overlapping subproblems. A direct recursive implemention solves the same subproblems over and over again, leading to exponential inefficiency.

Usually we can dramatically improve the running time by arranging so that each subproblem will be solved only once. There are two ways to do that:

In a top-down implementation, we keep the same recursive code structure but add a cache of solved subproblems. This technique is called memoization.
In a bottom-up implementation, we also use a data structure (typically an array) to hold subproblem solutions, but we build up these solutions iteratively.

Generally we prefer a bottom-up solution, because

A bottom-up implementation is generally more efficient.
The running time of the bottom-up implementation is usually more obvious.

This week we will study one-dimensional dynamic programming problems, which have a straightforward recursive structure. Next week we will look at two-dimensional problems, which are a bit more complex.

Fibonacci numbers

Computing the Fibonacci number F_n is a trivial example of dynamic programming. Recall that the Fibonacci numbers are defined as

F₁ = 1
F₂ = 1
F_n = F_n-1 + F_n-2 (n ≥ 3)

yielding the sequence 1, 1, 2, 3, 5, 8, 13, 21, …

Here is a recursive function that naively computes the n-th Fibonacci number:

  static int fib(int n) => n < 3 ? 1 : fib(n - 1) + fib(n – 2);

What is this function's running time? The running time T(n) obeys the recurrence

T(n) = T(n – 1) + T(n – 2)

This is the recurrence that defines the Fibonacci numbers themselves! In other words,

T(n) = O(F_n)

The Fibonacci numbers themselves increase exponentially. It can be shown mathematically that

F_n = O(φⁿ)

where

φ = (1 + sqrt(5)) / 2

So fib runs in exponential time! That may come as a surprise, since the function looks so direct and simple. Fundamentally it is inefficient because it is solving the same subproblems repeatedly. For example, fib(10) will compute fib(9) + fib(8). The recursive call to fib(9) will compute fib(8) + fib(7), and so we see that we already have two independent calls to fib(8). Each of those calls in turn will solve smaller subproblems over and over again. In a problem with a recursive structure such as this one, the repeated work multiplies exponentially, so that the smallest subproblems (e.g. fib(3)) are computed an enormous number of times.

top-down dynamic programming (memoization)

As mentioned above, one way we can eliminate the repeated work is to use a cache that stores answers to subproblems we have already solved. This technique is called memoization. Here is a memoized implementation of fib, using a local function:

  static int fib(int n) {
        int[] cache = new int[n + 1];
        
        int f(int i) {
            if (cache[i] == 0)
                cache[i] = i < 3 ? 1 : f(i - 1) + f(i - 2);
            return cache[i];
        }
        
        return f(n);
    }

Above, the cache array holds all the Fibonacci numbers we have already computed. In other words, if we have already computed F_i, then cache[i] = F_i. Otherwise, cache[i] = 0.

This memoized version runs in linear time, because the line

  cache[i] = n < 3 ? 1 : fib1(n - 1) + fib1(n - 2);

runs only once for each value of i. This is a dramatic improvement!

bottom-up dynamic programming

In this particular recursive problem, the subproblem structure is quite straightforward: each Fibonacci number F_n depends only on the two Fibonacci numbers below it, i.e. F_n-1 and F_n-2. Thus, to compute all the Fibonacci numbers up to F_n we may simply start at the bottom, i.e. the values F₁ and F₂, and work our way upward, first computing F₃ using F₁ and F₂, then F₄ and so on. Here is the implementation:

  static int fib(int n) {
      int[] a = new int[n + 1];
      a[1] = a[2] = 1;
      for (int k = 3 ; k <= n ; ++n)
          a[k] = a[k - 1] + a[k - 2];
      return a[n];
  }

Clearly this will also run in linear time. Note that this implementation is not even a recursive function. This is typical: a bottom-up solution consists of one or more loops, without recursion.

This example may seem trivial. But the key idea is that we can efficiently solve a recursive problem by solving subproblem instances in a bottom-up fashion, and as we will see we can apply this idea to many other problems.

rod cutting

The rod cutting problem is a classic dynamic programming problem. Suppose that we have a rod that is n cm long. We may cut it into any number of pieces that we like, but each piece's length must be an integer. We will sell all the pieces, and we have a table of prices that tells us how much we will receive for a piece of any given length. The problem is to determine how to cut the rod so as to maximize our profit.

We can express an optimal solution recursively. Let prices[i] be the given price for selling a piece of size i. We want to compute profit(n), which is the maximum profit that we can attain by chopping a rod of length n into pieces and selling them. Any partition of the rod will begin with a piece of size i cm for some value 1 ≤ i ≤ n. Selling that piece will yield a profit of prices[i]. The maximum profit for dividing and selling the rest of the rod will be profit(n – i). So profit(n), i.e. the maximum profit for all possible partitions, will equal the maximum value for 1 ≤ i ≤ n of

prices[i] + profit(n – i)

naive recursive solution

Here is a naive recursive solution to the problem:

  // Return the best price for cutting a rod of length n, given a table
  // with the prices for pieces from lengths 1 .. n.
  static int profit(int n, int[] prices) {
      int best = 0;
      for (int i = 1 ; i <= n ; ++i)
          best = Max(best, prices[i] + profit(n - i, prices));
      return best;
  }

This solution runs in exponential time, because for each possible size k it will recompute profit(k, prices) many times.

bottom-up dynamic programming

This solution uses bottom-up dynamic programming:

  // Return the best price for cutting a rod of length n, given a table
  // with the prices for pieces from lengths 1 .. n.
  static int profit(int n, int[] prices) {
      int[] best = new int[n + 1];  // best possible price for each size

      for (int k = 1 ; k <= n ; ++k) {
          // Compute the best price best[k] for a rod of length k.
          best[k] = 0;
          for (int i = 1 ; i <= k ; ++i)
              best[k] = Max(best[k], prices[i] + best[k - i]);
      }

      return best[n];  // best price for a rod of length n
  }

Once again, this bottom-up solution is not a recursive function. Notice its double loop structure. Each iteration of the outer loop computes best[k], i.e. the solution to the subproblem of finding the best price for a rod of size k. To compute that, the inner loop must loop over all smaller subproblems, which have already been solved, looking for a maximum possible profit.

This version runs in O(N²).

bottom-up dynamic programming, extended

Often when we use dynamic programming to solve an optimization problem such as this one, we want not only the value of the optimal solution, but also the solution itself. For example, the function in the previous section tells us the best possible price that we can obtain for a rod of size n, but it doesn't tell us the sizes of the pieces in which the rod should be cut!

So we'd like to extend our solution to return that information. To do that, for each size k < n we must remember not only the best possible prices for a rod of size k, but also the size of the first piece that we should slice off from a rod of that size in order to obtain that price. With that information, we can reconstruct the solution at the end:

  // Return the best price for cutting a rod of length n, given a table
  // with the prices for pieces from lengths 1 .. n.
  static int profit(int n, int[] prices, out int[] sizes) {
      int[] best = new int[n + 1];   // best possible price for a given rod size
      int[] cut = new int[n + 1];    // size of first piece to slice from a given rod size

      for (int k = 1 ; k <= n ; ++k) {
          best[k] = 0;
          for (int i = 1 ; i <= k ; ++i) {      // for every size <= k
              int p = prices[i] + best[k - i];
              if (p > best[k]) {
                  best[k] = p;
                  cut[k] = i;  // remember the best size to cut
              }
          }
      }

      List<int> l = new List<int>();
      for (int s = n ; s > 0 ; s -= cut[s])
          l.Add(cut[s]);
      sizes = l.ToArray();
      return best[n];
  }

Study this function to understand how it works. Like the preceding version, it runs in O(N²).

smallest number of coins

Here is another classic dynamic programming problem. We would like to write a method int numCoins(int sum, int[] coins) that takes a integer and an array of coin denominations. The method should determine the smallest number of coins needed to form the given sum. A sum can contain an arbitrary number of coins of each denomination. For example, if sum = 34 and coins = { 1, 15, 25 }, the method will return 6, since

34 = 15 + 15 + 1 + 1 + 1 + 1

and it is not possible to form this sum with a smaller number of coins.

If we cannot form the given sum using the given denominations of coins, our method will return int.MaxValue.

By the way, you will recall that in a recent homework exercise we wrote a recursive program that can generate all possible ways of forming a sum from a set of coins. So one way to find the smallest number of coins would be to generate all those possible ways, and see which one uses the smallest number. However that will be hopelessly inefficient, because there may be an exponential number of ways to form the sum. Dynamic programming will give us a more efficient solution.

naive recursive solution

First, here is a naive recursive solution to the problem:

  // Return the smallest number of coins needed to form (sum).
  static int numCoins(int sum, int[] coins) {
      if (sum == 0) return 0;
    
      int min = int.MaxValue;
      foreach (int c in coins)
          if (c <= sum)
              min = Min(min, numCoins(sum - c, coins));

      return min == int.MaxValue ? int.MaxValue : min + 1;
  }

This will run in exponential time.

bottom-up dynamic programming

Here is a solution using bottom-up dynamic programming. Its structure is similar to the previous rod-cutting solution.

  // Return the smallest number of coins needed to form (sum).
  static int numCoins(int sum, int[] coins) {
      int[] min = new int[sum + 1];
    
      for (int k = 1 ; k <= sum ; ++k) {
          // Compute min[k], the minimum number of coins needed to add up to k.
          int m = int.MaxValue;
          foreach (int c in coins)
              if (c <= k)
                  m = Min(m, min[k - c]);
          min[k] = m == int.MaxValue ? int.MaxValue : m + 1;
      }
    
      return min[sum];
  }

This version runs in O(N²), where N = sum.

bottom-up dynamic programming, extended

This version returns an array with the values of the coins that form the sum.

  static int[] numCoins2(int sum, int[] coins) {
      int[] min = new int[sum + 1];
      int[] first = new int[sum + 1];
    
      for (int k = 1 ; k <= sum ; ++k) {
          // Compute min[k], the minimum number of coins needed to add up to k.
          // Also remember first[k], the first coin to use in forming the sum k.
          int m = int.MaxValue;
          foreach (int c in coins)
              if (c <= k && min[k - c] < m) {  // best so far
                  m = min[k - c];
                  first[k] = c;
              }
          min[k] = m == int.MaxValue ? int.MaxValue : m + 1;
      }
    
      List<int> l = new List<int>();
      for (int s = sum ; s > 0 ; s -= first[s])
          l.Add(first[s]);
      return l.ToArray();
  }