Lecture 12: Notes

Here are notes about topics from Lecture 12.

For more details about the minimax algorithm and alpha-beta pruning, one good source is Russell and Norvig, Artificial Intelligence: A Modern Approach, Third Edition, sections 5.1 – 5.3.

flatten and flatMap

Occasionally it is useful to flatten a sequence of sequences, i.e. to combine all of the elements of the subsequences into a single sequence.

For example, let's revisit the method from the last lecture that returns an enumeration of all lines in a file:

  static IEnumerable<string> lines(string filename) {
    StreamReader r = new StreamReader(filename);
    while (r.ReadLine() is string s)
      yield return s;
    r.Close();
  }

We can find all words in the file like this:

  var words = lines("abc.txt").Select(l => l.Split(' '));

This results in a sequence of sequences, more specifically an IEnumerable<string[]>.

Suppose that we'd like all the words to appear in a single top-level sequence. Using an iterator, we can easily write a method that will flatten a sequence of sequences:

  static IEnumerable<T> flatten<T>(IEnumerable<IEnumerable<T>> seq) {
    foreach (IEnumerable<T> e in seq)
      foreach (T t in e)
        yield return t;
  }

Can we pass an IEnumerable<string[]> to this method? Actually yes, because

  1. Any array type T[] implements IEnumerable<T>, so a string[] is an IEnumerable<string>.

  2. IEnumerable<T> is covariant (recall our discussion of covariance from a few lectures ago).

So the following method will return an enumeration of all words in a file:

  static IEnumerable<string> words(string filename) =>
    flatten(lines(filename).Select(l => l.Split(' ')));

Alternatively we can write this using SelectMany, which in most other systems is called "flatMap". SelectMany maps a function over a sequence and then flattens the result:

  static IEnumerable<string> words2(string filename) =>
    lines(filename).SelectMany(l => l.Split(' '));
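
To see the two approaches side by side without needing a file on disk, here is a small sketch that runs both on an in-memory sequence of lines. The class and method names (FlattenDemo, WordsViaFlatten, WordsViaSelectMany) are made up for this illustration, and flatten is repeated only so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FlattenDemo {
  // Same flatten method as above, repeated so this sketch is self-contained.
  public static IEnumerable<T> Flatten<T>(IEnumerable<IEnumerable<T>> seq) {
    foreach (IEnumerable<T> e in seq)
      foreach (T t in e)
        yield return t;
  }

  // Split each line into words with Select, then flatten the result.
  // The type argument is written out explicitly here for clarity.
  public static List<string> WordsViaFlatten(IEnumerable<string> lines) =>
    Flatten<string>(lines.Select(l => l.Split(' '))).ToList();

  // The same computation in one step with SelectMany.
  public static List<string> WordsViaSelectMany(IEnumerable<string> lines) =>
    lines.SelectMany(l => l.Split(' ')).ToList();
}
```

Given the lines "the quick" and "brown fox", both methods should produce the same four words in the same order.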

tuples

Tuple<T,U> is a standard library type that represents a pair of values, which may be of different types. Tuple includes properties Item1 and Item2 that return the first and second value in the pair, respectively.

A useful property of Tuples is that they implement IComparable and are ordered lexicographically: they are compared first by their first element, and then by their second. For example:

    IComparable p = new Tuple<int, int>(1, 2);
    IComparable q = new Tuple<int, int>(1, 3);
    IComparable r = new Tuple<int, int>(2, 1);
    WriteLine(p.CompareTo(q));  // writes -1, i.e. p < q
    WriteLine(q.CompareTo(r));  // writes -1, i.e. q < r

(Actually a Tuple is not limited to only two items: the standard library includes additional overloaded versions of Tuple that have 3 arguments (i.e. they are triples), 4 arguments and larger numbers of arguments as well.)

To illustrate one possible use of tuples, suppose that we'd like to find the longest word in a sequence of words. We could use an imperative loop:

  static string longest(IEnumerable<string> words) {
    int max = -1;
    string s = null;
    
    foreach (string t in words)
      if (t.Length > max) {
        max = t.Length;
        s = t;
      }
    return s;
  }

But it would be nice to write this in a more compact functional style. We can do so by writing a utility method maxBy:

  // Return the element t of e for which f(t) is largest
  static T maxBy<T>(IEnumerable<T> e, Func<T, int> f) =>
    e.Select(t => new Tuple<int, T>(f(t), t)).Max().Item2;

Here, the Max method chooses the tuple for which the first element, i.e. the value of f(t), is the largest.

Now it is very easy to find the longest word in a sequence:

  static string longest(IEnumerable<string> words) =>
    maxBy(words, w => w.Length);
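
One subtlety is worth noting. Because tuples are compared lexicographically, when two elements have the same value of f the tuple-based maxBy falls back to comparing the elements themselves, so it does not necessarily return the first of the tied elements as the imperative loop does. If that matters, a loop-based helper is one alternative. The following is only a sketch, and the name MaxByFirst is an assumption made for this example:

```csharp
using System;
using System.Collections.Generic;

static class MaxByDemo {
  // Hypothetical alternative to maxBy: returns the FIRST element of e for which
  // f(t) is largest. Only the int keys are compared, never the elements themselves.
  public static T MaxByFirst<T>(IEnumerable<T> e, Func<T, int> f) {
    bool found = false;
    int best = 0;
    T result = default(T);
    foreach (T t in e) {
      int key = f(t);
      if (!found || key > best) {   // strict > keeps the earliest element on ties
        found = true;
        best = key;
        result = t;
      }
    }
    if (!found)
      throw new InvalidOperationException("sequence is empty");
    return result;
  }
}
```

For example, MaxByFirst(new[] { "kiwi", "pear", "fig" }, w => w.Length) returns "kiwi", the first of the two four-letter words.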

iterators with recursion

By combining iterators with recursion we can generate some interesting sequences.

First, suppose that we have a binary search tree, with nodes represented using this class:

class Node {
  public int i;
  public Node left, right;
  
  public Node(int i, Node left, Node right) {
    this.i = i; this.left = left; this.right = right;
  }
}

Recall that we can print all values in the tree using a recursive method:

  static void print(Node n) {
    if (n != null) {
      print(n.left);
      WriteLine(n.i);
      print(n.right);
    }
  }

Suppose that we'd now like to generate an enumeration of all values in a tree. Here is a method that will do that:

  static IEnumerable<int> values(Node n) {
    if (n != null) {
      foreach (int i in values(n.left))
        yield return i;
        
      yield return n.i;
      
      foreach (int i in values(n.right))
        yield return i;
    }
  }

Or we can write this more compactly using the Linq methods Append, which appends a value to a sequence, and Concat, which concatenates two sequences:

  static IEnumerable<int> values(Node n) =>
    n == null ? Enumerable.Empty<int>() :
      values(n.left).Append(n.i).Concat(values(n.right));

As another example of iterators with recursion, consider the problem of generating an enumeration of all permutations of a given string. For example, the permutations of "abc" are

  abc  acb  bac  bca  cab  cba

Note the recursive structure of the problem. Every permutation of a string S begins with a character c ∈ S followed by a permutation of the string (S – c), i.e. the string S with the character c removed. With this in mind, we can write a recursive method:

  static IEnumerable<string> permutations(string s) {
    if (s == "")
      yield return "";
    else
      for (int i = 0 ; i < s.Length ; ++i)
        foreach (string t in permutations(s.Remove(i, 1)))
          yield return s[i] + t;
  }
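
Since the method is an iterator, permutations are produced lazily, one at a time. The sketch below repeats the method (renamed Permutations so the example is self-contained) and adds a hypothetical helper FirstFew that takes only the first few permutations of a longer string without generating the rest.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PermutationDemo {
  // Same method as above, repeated so the sketch is self-contained.
  public static IEnumerable<string> Permutations(string s) {
    if (s == "")
      yield return "";
    else
      for (int i = 0; i < s.Length; ++i)
        foreach (string t in Permutations(s.Remove(i, 1)))
          yield return s[i] + t;
  }

  // Because the sequence is lazy, Take pulls only as many permutations as requested.
  public static List<string> FirstFew(string s, int count) =>
    Permutations(s).Take(count).ToList();
}
```

Note that if the string contains repeated characters, the same permutation will appear more than once in the output; applying Distinct() to the result is one way to remove the duplicates.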

Finally, consider the problem of generating all compositions of an integer. A composition of an integer k is a sequence of positive integers that add up to k. Two compositions are considered to be distinct even if they contain the same integers in a different order. For example, the compositions of 4 are

  4
  3 + 1
  2 + 2
  2 + 1 + 1
  1 + 3
  1 + 2 + 1
  1 + 1 + 2
  1 + 1 + 1 + 1

We can represent a composition by a sequence of integers, i.e. an IEnumerable<int>. We'd like to write a method that takes an integer k and returns a sequence of compositions, i.e. an IEnumerable<IEnumerable<int>>.

To do this, note the recursive structure of the problem. As is evident with the example k = 4 above, all the compositions of an integer k have the form i + C, where i is an integer in the range 1 ≤ i ≤ k and C is any composition of (k – i). (The single composition of 0 is the empty sequence.) So we can write

  static IEnumerable<IEnumerable<int>> compositions(int i) {
    if (i == 0) yield return Enumerable.Empty<int>();
    else
      for (int j = i ; j >= 1 ; --j)
        foreach (IEnumerable<int> e in compositions(i - j))
          yield return e.Prepend(j);
  }

Now we can print all the compositions like this:

  static void printCompositions(int i) {
    foreach (IEnumerable<int> c in compositions(i))
      WriteLine(string.Join(" + ", c.Select(j => j.ToString())));
  }

game playing algorithms

We will now discuss how to write programs that can play games such as Tic Tac Toe, checkers (draughts) or even chess. All of these are 2-player abstract strategy games, which are games with the following characteristics:

30 years ago the best human players could still defeat the top computer programs in games such as chess and Go. But this is no longer true: in the 1990s computer programs were developed (notably IBM's Deep Blue) that could defeat the top human chess players. And just in the last few years computer programs (notably Google's AlphaGo) have become stronger than the top human players at Go.

The newest and most powerful game-playing programs are based on neural networks. In this course we will not discuss neural networks at all, and will instead focus on the classic minimax algorithm.

As a first example of a game to play, consider the following very simple game, which I call Stretch. The rules of Stretch are as follows. Stretch is played on a board consisting of a one-dimensional row of N squares, which are all initially empty. There are two players, X and O. X moves first. Each player in turn places their symbol in any empty square. The game ends when all squares are full. The winner is the player who has the longest contiguous sequence ("stretch") of their symbol anywhere on the board. If both players have a longest sequence of equal length, the game is a draw.

For example, here is one possible sequence of moves of a game of Stretch played on a 5-square board:

[Figure: a sequence of moves in a game of Stretch on a 5-square board]

X wins the game, since he ends up with 2 in a row and O's longest stretch consists of only a single symbol.
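
The winning rule is easy to express in code. Here is a minimal sketch that represents a full board as a string of 'X' and 'O' characters; that representation, and the names StretchScore, LongestRun and Winner, are assumptions made for this example only.

```csharp
using System;

static class StretchScore {
  // Length of the longest contiguous run of symbol c on the board.
  public static int LongestRun(string board, char c) {
    int best = 0, run = 0;
    foreach (char d in board) {
      run = (d == c) ? run + 1 : 0;   // extend the current run or start over
      best = Math.Max(best, run);
    }
    return best;
  }

  // Outcome of a full board: 'X', 'O', or '-' for a draw.
  public static char Winner(string board) {
    int x = LongestRun(board, 'X'), o = LongestRun(board, 'O');
    return x > o ? 'X' : o > x ? 'O' : '-';
  }
}
```

For instance, on the hypothetical final board "XXOXO" the longest stretch of X has length 2 and the longest stretch of O has length 1, so Winner returns 'X'.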

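We may now ask: when Stretch is played on a board with N squares, can the first player (or, perhaps, even the second player) always win? This is a very simple game, but the answer to this question in general may not be immediately obvious. But it is not difficult to analyze the game for small values of N. In particular:

  - If N = 1, X wins immediately, since X fills the only square.

  - If N = 2, the game is always a draw, since each player ends up with a single symbol.

  - If N = 3, X can force a win by playing first in the center square: whichever side O then takes, X takes the other and has two in a row.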

In fact we can see that O can always force a draw for any even value of N, simply by mirroring X's moves. Suppose that N is even, and that board squares are numbered from 0 through (N – 1). Each time that X plays in square i, O can then play in square (N – 1 – i). Then at the conclusion of the game the right half of the board will be symmetric to the left half around the board's center point, with X's replaced by O's and O's replaced by X's. This implies that the game will be a draw, because X's and O's longest sequences will have the same length.
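
The mirroring argument can be checked experimentally. The following sketch (with made-up names MirrorDemo and MirrorGameIsDraw) plays a game on an even-sized board in which X moves at random and O always answers in the mirror square, then reports whether the result is a draw. It is a simulation, not a proof, but it should return true for every even N and every random seed.

```csharp
using System;
using System.Linq;

static class MirrorDemo {
  // Longest contiguous run of symbol c on the board.
  static int LongestRun(char[] board, char c) {
    int best = 0, run = 0;
    foreach (char d in board) {
      run = (d == c) ? run + 1 : 0;
      best = Math.Max(best, run);
    }
    return best;
  }

  // Plays one game on a board of even size n: X moves at random, O mirrors.
  // Returns true if the game ends in a draw.
  public static bool MirrorGameIsDraw(int n, Random rand) {
    if (n % 2 != 0) throw new ArgumentException("n must be even");
    char[] board = Enumerable.Repeat('.', n).ToArray();
    for (int turn = 0; turn < n / 2; ++turn) {
      int[] empty = Enumerable.Range(0, n).Where(i => board[i] == '.').ToArray();
      int x = empty[rand.Next(empty.Length)];
      board[x] = 'X';
      board[n - 1 - x] = 'O';   // the mirror square is always still empty
    }
    return LongestRun(board, 'X') == LongestRun(board, 'O');
  }
}
```

The mirror square is always free because the set of empty squares stays symmetric after every pair of moves, and because n is even no square is its own mirror image.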

For odd values of N, however, there is no such immediately evident strategy either for X or O.

Let's now consider how to write a computer program that may play a game such as Stretch. For any abstract strategy game, a game tree represents all possible sequences of moves. For example, here is a game tree showing all possible games of Stretch with N = 3. (I have omitted board states that are redundant up to symmetry. For example, I have not included the state in which X first plays in the rightmost square, because it is symmetric to the state in which X's first move is in the leftmost square.)

[Figure: game tree for Stretch with N = 3]

Note that X is the winner in two of the outcomes depicted here, and the third outcome (X O X) is a draw.

scores and minimax values

We will assign a numerical score to each outcome in a game. For the game of Stretch, if X wins the score is +1; if O wins the score is -1; if the game is a draw, the score is 0. Thus X wishes to maximize the game score, and O wants to minimize it. In fact, for all games we consider we will adopt a similar convention: the first player (called X or "Max") wants to maximize the score and the second player (called O or "Min") wishes to minimize it.

Some games might have more than three possible scores. For example, we can imagine a game in which players can capture each other's pieces and the score at the end is the difference between the numbers of pieces that each player has remaining. In a game like this, a player wishes not only to win, but also to win by as much as possible.

Consider the following game tree for some abstract strategy game:

[Figure: two-move game tree with a start state, states a, b and c, and nine scored final states]

The game has two moves. First X plays, bringing the game to either state a, b, or c. Now O plays, bringing the game to one of the nine states in the lowermost row. Each of these states is labelled with a numeric score. Once again, X wishes to maximize the score and O wishes to minimize it. How should X and O choose their moves?

In each of the states a, b, and c it is O's turn to play, and O should choose the move that yields the lowest score. So we may assign a value to each of these states, namely the minimum of the scores of all successor states. This is the score of the outcome of the game, assuming that O plays perfectly, i.e. O chooses the move that will minimize the final score.

[Figure: the same game tree, with states a, b and c labelled with their values]

In the start state it is X's turn to play. X should choose the move that leads to a state of maximal value. So we may likewise assign a value to the start state, namely the maximum of the values of all successor states. This will be the score of the outcome of the game, assuming that both X and O play perfectly.

[Figure: the same game tree, with the start state labelled with its value]

We may apply the above analysis to a game tree of arbitrary depth, assigning each node its minimax value, obtained by minimizing whenever it is O's turn to play and maximizing when it is X's turn. The process of labelling nodes in this way is the minimax algorithm. The minimax value of the top node is the score of the final outcome, assuming perfect play by both X and O.
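
To make the labelling process concrete, here is a minimal sketch of minimax on an explicit tree. The GameNode class and the Example tree are assumptions made for this illustration; in particular, the leaf scores in Example are hypothetical values chosen for this sketch, not a transcription of any figure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A node is either a leaf with a score or an internal node with children.
class GameNode {
  public int score;                                   // used only for leaves
  public List<GameNode> children = new List<GameNode>();

  public static GameNode Leaf(int score) => new GameNode { score = score };
  public static GameNode Inner(params GameNode[] kids) =>
    new GameNode { children = kids.ToList() };
}

static class TreeMinimax {
  // Minimax value of node n; maximizing is true when it is X's (Max's) turn.
  public static int Value(GameNode n, bool maximizing) {
    if (n.children.Count == 0)
      return n.score;
    IEnumerable<int> vals = n.children.Select(c => Value(c, !maximizing));
    return maximizing ? vals.Max() : vals.Min();
  }

  // Hypothetical two-move tree: X moves first, then O. Leaf scores are made up.
  public static GameNode Example() =>
    GameNode.Inner(
      GameNode.Inner(GameNode.Leaf(14), GameNode.Leaf(4), GameNode.Leaf(6)),
      GameNode.Inner(GameNode.Leaf(3), GameNode.Leaf(9), GameNode.Leaf(-2)),
      GameNode.Inner(GameNode.Leaf(12), GameNode.Leaf(2), GameNode.Leaf(8)));
}
```

In this example the three second-level nodes have values min(14, 4, 6) = 4, min(3, 9, -2) = -2 and min(12, 2, 8) = 2, so TreeMinimax.Value(TreeMinimax.Example(), true) is max(4, -2, 2) = 4.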

A game tree for chess is far too large for us to draw (it has more nodes than there are atoms in the universe) and is likewise much too large for us to analyze by computer. But if we could draw such a tree and label its nodes with the minimax algorithm, then the value of the start node would be either +1 (indicating that with the right strategy White can always win), 0 (indicating that best play leads to a draw) or -1 (meaning that Black can always win). It is not known what this value would be, since we don't know whether White or even Black can always win at chess. (It certainly seems very unlikely that Black can always win, but it has not been proven that this is not the case.)

As a simple exercise, apply the minimax algorithm to the complete game tree depicted above for Stretch with N = 3. Label each final board position with either +1 (X wins) or 0 (a draw), then compute the values of the other nodes. This will prove the (rather obvious) fact that X can force a win.

implementing minimax in C#

Let us now consider how to implement the minimax algorithm in C#. Our first goal is to compute the minimax value of the start node of a game of Stretch, for a board of arbitrary size N. This will reveal whether the first player (X) can always win for any particular board size.

We can implement the minimax algorithm using a recursive method. This method minimax will take a game state as its input, and will make recursive calls to determine the minimax values of all successor states, i.e. the states following each possible move that the current player can make. When it is X's turn to play, minimax will compute the maximum of the values of all successors; when O is playing, minimax will compute the minimum.

minimax must call itself recursively, passing an updated board state in which some move has been made. minimax could copy the board state, but that would be computationally expensive, especially since we typically wish to examine thousands or millions of board states in as short a time as possible. So instead we will keep only a single copy of board state in memory. Before each recursive call, minimax will update the board state by making a possible move; after the call returns, minimax will undo the move, returning the board to the previous state.

Here is a program that implements this minimax method as just described for the game of Stretch described above:

The program computes and prints the minimax value of the game tree's start node for every value of N in the range 1 ≤ N ≤ 12. If you run this program, you will see that the first player (X) can win for every odd board size through N = 9, but when N = 11 the game will actually be a draw with perfect play!
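
As a rough illustration of the make-a-move, recurse, undo-the-move approach described above, here is a minimal sketch for Stretch. It is not necessarily the same as the program from the lecture: the class names StretchBoard and StretchMinimax and all member names are assumptions made for this example.

```csharp
using System;

// Sketch of a Stretch board; the rules live here, separate from the search.
class StretchBoard {
  public readonly char[] squares;
  public int filled = 0;

  public StretchBoard(int n) {
    squares = new char[n];
    for (int i = 0; i < n; ++i) squares[i] = '.';
  }

  public bool XToMove => filled % 2 == 0;   // X moves first
  public bool Full => filled == squares.Length;

  public void Move(int i) { squares[i] = XToMove ? 'X' : 'O'; ++filled; }
  public void Undo(int i) { squares[i] = '.'; --filled; }

  int LongestRun(char c) {
    int best = 0, run = 0;
    foreach (char d in squares) {
      run = (d == c) ? run + 1 : 0;
      best = Math.Max(best, run);
    }
    return best;
  }

  // +1 if X has the longer stretch, -1 if O does, 0 for a draw.
  public int Score() => Math.Sign(LongestRun('X') - LongestRun('O'));
}

static class StretchMinimax {
  // Minimax value of the current position, found by trying and undoing each move.
  public static int Minimax(StretchBoard b) {
    if (b.Full) return b.Score();
    bool max = b.XToMove;
    int best = max ? int.MinValue : int.MaxValue;
    for (int i = 0; i < b.squares.Length; ++i)
      if (b.squares[i] == '.') {
        b.Move(i);                 // make the move on the single shared board
        int v = Minimax(b);
        b.Undo(i);                 // restore the previous state
        best = max ? Math.Max(best, v) : Math.Min(best, v);
      }
    return best;
  }
}
```

Calling StretchMinimax.Minimax(new StretchBoard(n)) for small n gives the value of the start node. Without any pruning the running time grows very quickly with n, so this sketch is practical only for small boards.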

Notice that in this program the logic defining the rules of the game is in the Board class, and the minimax algorithm is implemented in a separate Minimax class. I generally recommend this kind of class separation in a game-playing program.

Now let's extend our program so that it can actually play the game, i.e. determine the best move to make at any particular time. Here is an extended implementation including a user interface that lets the user play as either X or O:

Notice that in this updated program we have made only minimal changes to the minimax method. It now keeps track of the best move it has seen in the current board position, and returns it.

Also note that in this program all user interface code is in a Game class, separate from both the Board and Minimax classes. Again, I recommend keeping separate functionality in separate classes in this way.

alpha-beta pruning

In general a game tree may be large, with millions or billions of nodes or more. We'd like to be able to implement the minimax algorithm as efficiently as we can so that we can explore as many nodes as possible per second.

We may often be able to prune (cut off) some branches of the game tree from our search because they are unnecessary to explore. For example, consider the following simple game tree. In this game the only possible outcomes are a win for X (score = +1), a win for O (score = -1) or a draw (score = 0). As usual X plays first. In this tree, leaf nodes contain the scores for each outcome and we have filled in the minimax values of other nodes:

[Figure: game tree with leaf scores and minimax values; the nodes that can be pruned are shaded]

Assuming that our minimax search explores child nodes from left to right, the shaded nodes above may be pruned from the search. When we see the -1 at the lower left, we know that the node above it will also have value -1, since we are minimizing and no smaller value is possible. So there is no need to examine the following two nodes (with values 1 and 0). Similarly, after we determine that the second node in the second row has value 1, there is no need to examine the node to its right (or any of its children), since we know that the top node will also have value 1, meaning that the game is a win for X.

Simply put, if we are searching for a move for either player and have already found a move that guarantees a win, we need look no further. This technique of extreme value pruning will work for any game that has only a bounded set of possible scores, i.e. there is a maximum possible score and a minimum possible score.

Alpha-beta pruning is another technique that can generally prune many more nodes from a game search tree. Consider the two-move search tree we saw above:

[Figure: the two-move game tree from above; the nodes that can be pruned are shaded]

As the minimax algorithm runs on this game tree, we first compute the value of node A, which is min(14, 4, 6) = 4. The start node is maximizing, so we now know that the start node's value will be at least 4. Now we descend into the second subtree and observe the leaf node whose score is 3. At this point we know that node B will have a value of at most 3, since that node is minimizing. So B cannot possibly affect the start node's value, which is, once again, at least 4. And so we need not examine the shaded nodes. Instead, we can return immediately, yielding a value of 3 for B. The correct minimax value for B is actually -2, as indicated in the tree above, but it does not matter that we've returned a value that is too large: it will be ignored anyway.

To implement this optimization in general, we keep track of two values α and β as we recursively descend the game tree. α is the best (i.e. highest) score found so far that player Max can guarantee on the current search path, and similarly β is the best (i.e. lowest) score that player Min can guarantee. In other words, α is a lower bound and β is an upper bound on the values that can still affect the result. Initially α is -∞ and β is ∞. In the example just described, we set α to 4 once we know the value of node A. Whenever we are minimizing, i.e. looking for the best move for player Min, if we see a value that is less than or equal to α then we can return immediately, since that value will never be used, as in the example above. Similarly, when we are maximizing, if we see a value that is greater than or equal to β then we can return immediately.
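
The following sketch shows one way the α and β bookkeeping might look on an explicit tree. It is an illustration only: the classes ABNode and AlphaBeta are made up for this example, int.MinValue and int.MaxValue stand in for -∞ and ∞, and the leaf scores in Example are hypothetical apart from the values 14, 4, 6, 3 and -2 mentioned in the text. A counter records how many leaves are actually examined.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ABNode {
  public int score;                                // used only for leaves
  public List<ABNode> children = new List<ABNode>();

  public static ABNode Leaf(int score) => new ABNode { score = score };
  public static ABNode Inner(params ABNode[] kids) =>
    new ABNode { children = kids.ToList() };
}

class AlphaBeta {
  public int leavesVisited = 0;

  // Value of node n, searched with the bounds alpha and beta.
  public int Value(ABNode n, bool maximizing, int alpha, int beta) {
    if (n.children.Count == 0) {
      ++leavesVisited;
      return n.score;
    }
    int best = maximizing ? int.MinValue : int.MaxValue;
    foreach (ABNode c in n.children) {
      int v = Value(c, !maximizing, alpha, beta);
      if (maximizing) {
        best = Math.Max(best, v);
        if (best >= beta) return best;    // Min would never allow this line of play
        alpha = Math.Max(alpha, best);
      } else {
        best = Math.Min(best, v);
        if (best <= alpha) return best;   // Max already has something at least as good
        beta = Math.Min(beta, best);
      }
    }
    return best;
  }

  // Hypothetical two-move tree; X (maximizing) moves first.
  public static ABNode Example() =>
    ABNode.Inner(
      ABNode.Inner(ABNode.Leaf(14), ABNode.Leaf(4), ABNode.Leaf(6)),
      ABNode.Inner(ABNode.Leaf(3), ABNode.Leaf(9), ABNode.Leaf(-2)),
      ABNode.Inner(ABNode.Leaf(12), ABNode.Leaf(2), ABNode.Leaf(8)));
}
```

On this example tree the search returns 4 for the start node after examining only 6 of the 9 leaves: all three leaves of the first subtree, one leaf (3) of the second, and two leaves (12 and 2) of the third.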

Here is another version of our program that plays Stretch, extended to perform both extreme value pruning and alpha-beta pruning:

depth-limited searching

Now let's consider using the minimax algorithm to write a program that plays a real game. Specifically, suppose that we want to write a program to play Connect Four. Although the real game of Connect Four uses colored circular pieces, for our discussion we will call the players X and O.

We cannot hope to explore an entire game tree for Connect Four to determine the minimax value of a given board position. That's because the game tree is simply too large. A Connect Four board has 6 x 7 = 42 squares, each of which may have any of 3 values (X, O, and empty). So there are something like 3^42 (about 10^20) possible board positions (actually somewhat fewer, since some positions can never be reached: game play stops after one player has won). The number of possible games is greater, since each board position can be reached in many possible ways. The total number of possible games is less than 7^42 (about 3 x 10^35), since there are at most 7 possible moves at each turn and a game will have at most 42 turns. These numbers are so large that an exhaustive tree search is not feasible.

We can still, however, use the minimax algorithm to construct a program that plays well. Of course, it will not play perfectly – the only way to achieve that would be to search the entire game tree. Our program will search the game tree only to some fixed depth D. Once it reaches a node of depth D, it will compute an estimated value of the current board position. A higher estimate indicates that we believe it is more likely that X will win; a lower estimate means that a win is more likely for O. In our program, a victory for X will have a score of +10,000, a victory for O will have a score of -10,000 and a draw will have a score of 0. All estimates will fall inside this range: for any estimate e, -10,000 < e < 10,000.
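
Here is a minimal sketch of what a depth-limited search might look like. Everything in it is an assumption made for illustration: the IGame interface and its member names are hypothetical, and to keep the example short the concrete game is Stretch rather than Connect Four, with a deliberately crude estimate (the difference between the players' longest stretches so far).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical game interface; the names are assumptions made for this sketch.
interface IGame {
  bool XToMove { get; }
  bool Done { get; }            // true once the game has ended
  int FinalScore();             // +10000, 0 or -10000; meaningful only when Done
  int Estimate();               // heuristic guess, strictly between the win scores
  IEnumerable<int> Moves();
  void Move(int m);
  void Undo(int m);
}

static class DepthLimited {
  public const int Win = 10000;

  // Minimax value of the current position, looking at most depth moves ahead.
  public static int Minimax(IGame g, int depth) {
    if (g.Done) return g.FinalScore();
    if (depth == 0) return g.Estimate();   // depth limit reached: estimate instead
    bool max = g.XToMove;
    int best = max ? int.MinValue : int.MaxValue;
    foreach (int m in g.Moves()) {
      g.Move(m);
      int v = Minimax(g, depth - 1);
      g.Undo(m);
      best = max ? Math.Max(best, v) : Math.Min(best, v);
    }
    return best;
  }
}

// Stretch used as a small stand-in game for the sketch.
class StretchGame : IGame {
  char[] sq;
  int filled = 0;
  public StretchGame(int n) { sq = new string('.', n).ToCharArray(); }

  public bool XToMove => filled % 2 == 0;
  public bool Done => filled == sq.Length;

  int Run(char c) {
    int best = 0, run = 0;
    foreach (char d in sq) { run = d == c ? run + 1 : 0; best = Math.Max(best, run); }
    return best;
  }

  public int FinalScore() => DepthLimited.Win * Math.Sign(Run('X') - Run('O'));
  public int Estimate() => Run('X') - Run('O');   // crude guess: who is ahead so far

  public IEnumerable<int> Moves() {
    var moves = new List<int>();                  // snapshot, safe to use while moving
    for (int i = 0; i < sq.Length; ++i) if (sq[i] == '.') moves.Add(i);
    return moves;
  }

  public void Move(int m) { sq[m] = XToMove ? 'X' : 'O'; ++filled; }
  public void Undo(int m) { sq[m] = '.'; --filled; }
}
```

With a depth at least as large as the number of remaining moves, the search reaches the end of every game and returns an exact score of +10000, 0 or -10000. With a smaller depth it returns an estimate whose quality depends entirely on the Estimate function.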