Programming 1, 2020-1
Lecture 13: Notes

compositions of an integer

In the previous lecture we discussed using recursion to solve combinatorial problems, including generating all compositions of an integer. We wrote a top-down solution that accumulates numbers into a string and generates output like this:

>>> compositions(4)
1 1 1 1 
1 1 2 
1 2 1 
1 3 
2 1 1 
2 2 
3 1 
4 

It might be nice to print out '+' signs between the integers in each composition. We could insert '+' signs into the string as we build it up while descending the tree, but let's look at a different approach. As we descend the tree, we can accumulate the integers we have chosen on a stack, and join them into a string when we reach a leaf. Here is our solution:

def compositions(n):
    s = []              # stack of integers
    
    def choose(n):
        if n == 0:      # at a leaf
            print(' + '.join([str(i) for i in s]))      # join integers, separated by '+'
        else:
            for i in range(1, n + 1):
                s.append(i)     # push to stack
                choose(n - i)
                s.pop()
    
    choose(n)

Now the output looks nicer:

>>> compositions(4)
1 + 1 + 1 + 1
1 + 1 + 2
1 + 2 + 1
1 + 3
2 + 1 + 1
2 + 2
3 + 1
4

This approach of accumulating values on a stack can be useful in top-down solutions for many combinatorial problems.

partitions of an integer

A partition of an integer N is a collection of positive integers whose sum is N, in which order does not matter and values may repeat. Here are the partitions of 5:

5
4 + 1
3 + 2
3 + 1 + 1
2 + 2 + 1
2 + 1 + 1 + 1
1 + 1 + 1 + 1 + 1

In compositions, order matters: 1 + 3 and 3 + 1 are distinct compositions of 4. In partitions, order does not matter, so there is only one partition of 4 that contains the sum 3 + 1.

As we saw previously, any integer N has 2^(N – 1) compositions. By contrast, there is no simple formula for the number of partitions of an integer.
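
The count of compositions can be checked with a short recursive sketch. The function name count_compositions is not from the lecture; it is introduced here only for illustration, and it follows the same recursion we used to print compositions (choose a first value, then count the compositions of the remainder):

```python
def count_compositions(n):
    """Count the compositions of n (illustrative helper, not from the lecture)."""
    if n == 0:
        return 1    # one way to finish: choose nothing more
    # choose a first value i, then count compositions of the remainder n - i
    return sum(count_compositions(n - i) for i in range(1, n + 1))

# compare with the closed form 2^(n - 1) for small n
for n in range(1, 10):
    assert count_compositions(n) == 2 ** (n - 1)
```

This takes exponential time, so it is only suitable for small values of n.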

Let's write a function that can generate all partitions of an integer N. As a naive approach, we could generate all compositions and print only those whose values are in non-increasing order, so that each partition is printed only once. However, that is inefficient, because many compositions are generated only to be discarded. We'd like an approach that generates each partition exactly once.
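
For reference, the naive approach could be sketched as follows. The names compositions_list and naive_partitions are hypothetical, and the sketch returns lists instead of printing so that the results can be filtered:

```python
def compositions_list(n):
    """Return all compositions of n as lists of integers."""
    result = []
    s = []              # stack of integers chosen so far

    def choose(n):
        if n == 0:      # at a leaf
            result.append(s.copy())     # copy, because s keeps changing
        else:
            for i in range(1, n + 1):
                s.append(i)
                choose(n - i)
                s.pop()

    choose(n)
    return result

def naive_partitions(n):
    """Keep only the compositions whose values never increase."""
    return [c for c in compositions_list(n)
            if all(a >= b for a, b in zip(c, c[1:]))]

print(len(compositions_list(5)), len(naive_partitions(5)))
```

For n = 5 this builds 16 compositions but keeps only 7 of them.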

To write this function we will need to find the recursive structure of the problem. When we generated compositions, we saw that each composition of an integer N consists of some integer k ≤ N followed by a composition of (N – k). This gave us a straightforward solution.

Finding a recursive structure for partitions is not quite as easy. As we can see in the partitions of 5 above, if we are generating a partition of 5 and we first choose the number 2, the possibilities for the remaining values are not the same as the partitions of 5 – 2 = 3. In particular, 3 is a partition of 3, but 2 + 3 is not listed as a partition of 5, because we write each partition's values in non-increasing order (as 3 + 2).

However, we can observe that if we first choose the number 2, the remaining values must be a partition of 3 in which every value is at most 2. That means that we can solve this recursive problem if we generalize it so that instead of finding all partitions of N, we write a function that can find all partitions of N in which every value is at most M for given values of N and M. (This is not the first time we have seen that we need to generalize a problem to solve it recursively.)

Here is our solution:

# Generate all partitions of n.
def partitions(n):
    
    # Generate all partitions of n in which every value is <= m.
    def generate(n, m, s):
        if n == 0:
            print(s)
        else:
            for i in range(min(m, n), 0, -1):
                generate(n - i, i, s + str(i) + ' ')
    
    generate(n, n, '')

Notice that in the for loop above we begin with the value min(m, n). That's because the first value we choose must be less than or equal to both m and n.
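
The same recursion can also be combined with the stack technique from the compositions example. The following is only a sketch under that assumption; the name partitions_list is hypothetical, and it returns the partitions as lists so that they can be counted or formatted afterwards:

```python
def partitions_list(n):
    """Return all partitions of n as lists in non-increasing order."""
    result = []
    s = []              # stack of integers chosen so far

    # Generate all partitions of n in which every value is <= m.
    def generate(n, m):
        if n == 0:
            result.append(s.copy())     # copy, because s keeps changing
        else:
            for i in range(min(m, n), 0, -1):
                s.append(i)
                generate(n - i, i)
                s.pop()

    generate(n, n)
    return result

for p in partitions_list(5):
    print(' + '.join(str(i) for i in p))
```

With n = 5 this prints the seven partitions listed earlier, from 5 down to 1 + 1 + 1 + 1 + 1.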

An excellent exercise is to write this function in a bottom-up fashion as well.

3-coloring a graph

Given an undirected graph, can we assign one of 3 colors to every vertex of the graph such that no two adjacent vertices have the same color?

This is a constraint satisfaction problem, one of a large set of problems in which we want to choose a set of values such that a number of constraints are satisfied. (In this particular problem, each constraint says that two particular vertices may not have the same color.) In more advanced courses you may study techniques for solving constraint satisfaction problems somewhat efficiently.

However, even those techniques have their limits. This particular problem is NP-complete, which means that it belongs to a large class of problems for which no polynomial-time algorithm is known. In other words, as far as we know, any algorithm that solves these problems takes more than polynomial time in the worst case. In more advanced courses about computational complexity you will study the precise meaning of NP-completeness.

When faced with a problem such as this one, the most basic approach we can take is to perform an exhaustive search for a solution. In this particular problem this could take exponential time, because with N vertices there are 3^N possible colorings. However, we can do better than generating every possible coloring. As we search, we can prune (cut off) the search by not considering values that violate a constraint and hence make a solution impossible.

For example, suppose that we are given a graph in which vertices 1 and 2 are adjacent (and there are various other edges and vertices as well). And suppose that the three colors are red, green and blue. In our search, we may first assume that vertex 1 is red. We must next choose a color for vertex 2. We should exclude the color red from the set of possibilities we consider, since it would violate the constraint that vertices 1 and 2 must have different colors. By narrowing the search in this way, we will consider far fewer colorings than all 3^N that are possible (though we will still consider an exponential number in the worst case).
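
For comparison, an exhaustive search with no pruning at all might look like the sketch below. The function name three_color_brute_force is hypothetical, and the sketch assumes the graph g is a list in which g[v] holds the neighbors of vertex v, with vertices numbered from 0:

```python
from itertools import product

def three_color_brute_force(g):
    """Try every assignment of colors 0, 1, 2 to the vertices of g."""
    n = len(g)
    for colors in product(range(3), repeat=n):      # all 3^n assignments
        # accept the assignment only if every edge joins two different colors
        if all(colors[u] != colors[v] for u in range(n) for v in g[u]):
            return list(colors)
    return None

triangle = [[1, 2], [0, 2], [0, 1]]
print(three_color_brute_force(triangle))
```

This version checks the constraints only after a complete coloring has been built, so it cannot abandon a bad partial coloring early.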

Here is a function to search for a 3-coloring of a graph in adjacency list representation. If it finds a 3-coloring, it returns a list of vertex colors in which each color is represented by the number 0, 1, or 2. It pushes its choices onto a stack as it recurses:

def three_color(g):
    colors = []

    # return set of all possible colors for next vertex
    def next_color(colors):
        i = len(colors)
        return {0, 1, 2} - {colors[v] for v in g[i] if v < i}
        
    def search():
        if len(colors) == len(g):
            return True     # found solution
            
        for color in next_color(colors):
            colors.append(color)
            if search():
                return True
            colors.pop()
        
        return False
    
    if search():
        return colors
    else:
        return None

Again, as an exercise you may wish to rewrite this function using a bottom-up approach.
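
When experimenting with different versions of this function, it can help to verify the colorings they return. Here is a small checker, offered as a sketch; the name is_valid_coloring is hypothetical, and it assumes the same adjacency list representation, where g[v] is a list of the neighbors of vertex v:

```python
def is_valid_coloring(g, colors):
    """Return True if colors assigns different colors to adjacent vertices."""
    if colors is None or len(colors) != len(g):
        return False
    return all(colors[u] != colors[v] for u in range(len(g)) for v in g[u])

triangle = [[1, 2], [0, 2], [0, 1]]
assert is_valid_coloring(triangle, [0, 1, 2])
assert not is_valid_coloring(triangle, [0, 0, 1])
```

For a graph g that has a 3-coloring, is_valid_coloring(g, three_color(g)) should be True.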

writing good code

We now know enough Python that we are starting to write larger programs. Especially when writing a larger program, we want to write code in a way that is structured, clear, and maintainable. Functions and classes are essential tools for this purpose.

Here are three general rules to follow for writing good code:

  1. Don't repeat yourself. Beginning programmers often write code that looks like this:

if x > 0:
    … 20 lines of code …
else:
    … the same 20 lines of code, with a few small changes …

    This is bad code. It is hard to read: the differences between the 20-line blocks may be important, but hard to spot. And it is hard to maintain. Every time you change one of the parallel blocks of code, you must change the other. That is a chore, and it's also easy to forget to do that.

    In this situation you should factor out the 20 lines of code into a separate function. You can use function parameters to produce the differences in behavior between the two 20-line blocks. Then you can write

if x > 0:
    my_fun(… arguments)
else:
    my_fun(… different arguments)

  2. Every function should fit on a single screen. Practically speaking this means that functions should generally be limited to about 50 lines. Functions that are much longer than this quickly become hard to understand.

  3. Make variables as local as possible. In other words, avoid global variables when possible. (In many programming languages you can declare variables inside a loop body or other block of code inside a function, which is generally a good practice. However, Python has no block-level scope: a variable assigned inside a loop body remains visible throughout the enclosing function.)
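
As a tiny illustration of the first rule, here is a sketch of the factored form. The functions report and describe are hypothetical and stand in for much longer code; the point is that the parameter label carries the only difference between the two branches:

```python
def report(x, label):
    """Shared logic that would otherwise be duplicated in both branches."""
    return f'{x} is {label}'

def describe(x):
    if x > 0:
        return report(x, 'positive')
    else:
        return report(x, 'not positive')

print(describe(3))
print(describe(-1))
```

If the wording of the message ever needs to change, there is now only one place to change it.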