There was no lecture or tutorial this week.
Our topic this week is generating sequences using recursion.
In the last lecture of Introduction to Algorithms last semester, we briefly discussed how to use recursion to generate sequences such as permutations and combinations of a set. However we didn't have time to explore that topic much, and I did not include it on the Intro to Algorithms exam. In this course we will return to this important topic and will explore it more deeply.
As
a first example, suppose that we'd like to generate all subsets of
the set {1, 2, …, N}. As you know from studying discrete
mathematics, there are 2N
such
subsets. Let's actually try to write a slightly more general C#
method subsets
that
takes integers a and b and returns a sequence of subsets of {a, …,
b}. For now, we will represent each subset as a string, e.g. "1
3 4" for the set {1, 3, 4}.
When we are faced with any recursive problem like this, we need to find a way to first recursively solve a subproblem and then transform the subproblem solution into a solution to the entire problem. It may help to write down the entire solution and a subproblem solution for a specific example. For example, suppose that our goal is to generate subsets of {1, 2, 3}. There are 8 such subsets:
(empty set) 1 2 1 2 3 1 3 2 3 1 2 3
Suppose that we recursively generate subsets of {2, 3}. We will get 4 subsets:
(empty set) 2 3 2 3
We must find a way to transform these into the 8 subsets above. We can now make these key observations:
Every subset of {2, 3} is itself a subset of {1, 2, 3}.
In addition, we can prepend the original value (1) to each subset of {2, 3} to obtain another subset of {1, 2, 3}.
This is the recursive pattern we were seeking. With this understanding, we can now write the method subsets():
// Return all subsets of {a, a + 1, …, b}. static IEnumerable<string> subsets(int a, int b) { if (a > b) yield return ""; else { foreach (var s in subsets(a + 1, b)) { yield return s; yield return a + " " + s; } } }
We can call the method like this:
foreach (string s in subsets(1, 4)) WriteLine(s);
This will print
1 2 1 2 3 1 3 …
Let's study this method in detail. First, if a > b then the set {a, a + 1, …, b} is the empty set. How many subsets of the empty set exist? Be careful: there actually is one, namely the empty set itself. And so if a > b we return the empty string, representing the empty set. That is the base case.
For the recursive case, we recursively generate all subsets of {a + 1, …, b}. Following the recursive pattern we discovered above, for each such subset s we both return s, and also a subset that contains 'a' prepended to s:
yield return s; yield return a + " " + s;
Let's generalize the subsets() method above in two ways:
We like to be able to generate subsets of any set of values, not just the integers {a .. b}.
Instead of representing each subset as a string, we'd like to store it as some kind of set of values that we can manipulate programmatically.
We will write our generalized method using linked lists, which are a good choice of data structure here because we want to be able to prepend elements efficiently (i.e. in O(1) time). Here is our implementation:
class Node<T> { public T val; public Node<T> next; public Node(T val, Node<T> next) { this.val = val; this.next = next; } } // Return a sequence of subsets of the given list. static IEnumerable<Node<T>> subsets<T>(Node<T> list) { if (list == null) yield return null; // empty set else { foreach (Node<T> n in subsets(list.next)) { yield return n; yield return new Node<T>(list.val, n); } } }
Notice the similarities between this method and the subsets() method on integers above.
Consider the problem of generating all compositions of an integer. A composition of an integer N is a sequence of positive integers that add up to N. Order matters: two distinct compositions of N might contain the same set of integers in a different order. For example, the compositions of 4 are
4 3 + 1 2 + 2 2 + 1 + 1 1 + 3 1 + 2 + 1 1 + 1 + 2 1 + 1 + 1 + 1
An integer N has 2N – 1 compositions. To see this, consider a rod of length N. If we want to cut the rod into pieces with integral lengths, then there are (N – 1) places we can cut the rod. In each of those places we can either cut or not cut, which are 2 possibilities. So there are 2N – 1 possible sets of cuts, each of which corresponds to a unique composition.
Once again we must find a recursive pattern. Here it is: given any integer N, let's first choose some value K to be the first element of a composition that we want to generate. We can recursively generate all compositions of the what's left, i.e. of (N – K). For each such composition, we can prepend K to get a composition of N. We must repeat this process for every possible K.
To put it differently, we can rewrite the compositions of 4 above as follows:
4 3 + (compositions of 1) 2 + (compositions of 2) 1 + (compositions of 3)
With this pattern in mind, we can write the code. We represent each composition as a string:
static IEnumerable<string> compositions(int n) { if (n == 0) yield return ""; else for (int i = 1 ; i <= n ; ++i) foreach (string s in compositions(n - i)) yield return i + " " + s; }
Once again note the base case carefully. How many compositions exist of N = 0? There is one – the empty set. That's because the empty set is (vacuously) a set of positive integers that add up to 0.
Also note the strategy that we used to find the recursive pattern here. We chose a first element of the target sequence, subtracted it out, then recursively dealt with the rest. This strategy is often useful for this sort of problem.
Now consider the problem of generating partitions of an integer. A partition of an integer N is a set of positive integers that add up to N. Order does not matter: {2, 4} and {4, 2} are the same set, so they count as only a single partition of 6.
There are 11 partitions of the integer 6:
6 5 1 4 2 4 1 1 3 3 3 2 1 3 1 1 1 2 2 2 2 2 1 1 2 1 1 1 1 1 1 1 1 1 1
Unlike compositions, no closed formula is known for the number of partitions of a given integer N.
Let's try to write a function that will generate all partitions of an integer N. We will write the numbers in each partition in non-increasing order (so that each partition can be written in exactly one way). In theory, we could generate all compositions of N, and then select only those that are in non-increasing order. That would generate the partitions, but would be highly inefficient, so we would like to find a better approach.
A recursive pattern may not be immediately obvious. For example, it is certainly not true that we can prepend 2 to any non-increasing partition of 4 to yield a non-increasing partition of 6. For example, 3 1 is a non-increasing partition of 4, but "2 3 1" is not valid in our schema, since it is not non-increasing (and hence duplicates the partition "3 2 1").
In looking at the partitions of 6 above, we may notice the following pattern. In each partition that begins with 2, the rest of the numbers are a partition of 4 in which every number is at most 2.
This suggests that we need to generalize the problem in order to solve it recursively. Specifically, we will write a recursive function that takes integers N and K and generates the partitions of N that contain only values that are ≤ K, with all numbers in decreasing order.
static IEnumerable<string> partitions(int n, int k) { if (n == 0) yield return ""; else if (n > 0) for (int i = k ; i >= 1 ; --i) foreach (string s in partitions(n - i, i)) yield return i + " " + s; } static IEnumerable<string> partitions(int n) => partitions(n, n);
Note the following:
When the function calls itself recursively, it passes i as the next value for k. That is so all subpartitions that it generates will contain only values that are ≤ i, which ensures that even after i is prepended, the resulting partition will contain numbers in non-increasing order.
The
check if (n > 0)
is
very important. That's because at the moment the function calls
itself recursively, it's possible that i > n and so (n – i) is
negative, so in the recursive call we will have n < 0. In that
case the function must return no results. Without the check for (n >
0), it would recurse forever.
Now, alternatively you
could write the for loop like this:
for (int i = Math.Min(k, n) ; i >= 1 ; --i)
This will ensure that i <= n and so (i – n)
will be non-negative. With this change, the check if (n >
0)
is unnecessary. You may choose whichever version of the
code you prefer.
As a final note, note that the situation we saw here is common: many problems can be solved recursively only when generalized. When you cannot solve a problem using direct recursion, look for a way to generalize it that will let you solve a useful subproblem recursively.