Assignment 4: Warlight

This assignment is worth 25 points and has two parts.

1. Search Algorithm (10 pts)

Implement either

First clone (or update your existing copy of) the ai_1 repository. Look in the src/minimax subdirectory, which has source files for the classes described in this section.

Write a general-purpose implementation that can search any game that implements the following interface:

// S = state type, A = action type
public interface AbstractGame<S, A> {
    S initialState(int seed);
    S clone(S state);
    int player(S state);         // which player moves next: 1 (maximizing) or 2 (minimizing)
    List<A> actions(S state);    // available moves in this state
    void apply(S state, A action);  // apply action to state
    boolean isDone(S state);     // true if game has finished
    double outcome(S state);     // 1000.0 = player 1 wins, 0.0 = draw, -1000.0 = player 2 wins

    double PLAYER_1_WIN = 1000.0, PLAYER_2_WIN = -1000.0, DRAW = 0.0;
}

Your minimax or MCTS class will implement this interface:

interface Strategy<S, A> {
    A action(S state);
}

A Strategy represents a strategy for playing a game. It is essentially a function which, given a state, decides what action the current player should take.

You must implement your algorithm so that it will work even if moves do not strictly alternate. For example, in some games player 1 might move twice, then player 2 might move twice, and so on. (Warlight is like this, since on each player's turn they have two moves: one to place armies and one to attack/transfer.)

minimax

For minimax, you will need this additional interface:

public interface HeuristicGame<S, A> extends AbstractGame<S, A> {
    double evaluate(S state);
}

evaluate is a heuristic evaluation function that can estimate the expected outcome of the game at any state. You will need to use this evaluation function when you cut off the search because it has reached its maximum depth.

You should implement the following class in Minimax.java:

class Minimax<S, A> implements Strategy<S, A> {

    public Minimax(HeuristicGame<S, A> game, int limit) {
        
    }

}

In the constructor, limit represents a search depth. For example, if limit is 2, your action method should consider all possible sequences of 2 actions from the given state. If limit is 0, there is no depth limit and you should search the entire game tree.

Your implementation should include alpha-beta pruning.

hints

Monte Carlo tree search

For Monte Carlo tree search, if you can find the fourth edition of our textbook (Russell & Norvig), then I recommend that you follow the description of MCTS in section 5.4. (This algorithm is not discussed in the third edition.) Otherwise, you may attempt to follow this survey article:

Cameron Browne et al, "A Survey of Monte Carlo Tree Search Methods" (2012)

The article is a bit dense, so if you have questions about it then let me know.

Implement the following class in Mcts.java:

// Monte Carlo tree search
class Mcts<S, A> implements Strategy<S, A> {

  public Mcts(AbstractGame<S, A> game, Strategy<S, A> base, int limit) {
    
  } 
}

In the Mcts constructor:

hints

        double o = game.outcome(s);

        // convert to 1.0 = player 1 win, 0.5 = draw, 0.0 = player 2 win
        double w = (o - AbstractGame.PLAYER_2_WIN) /
                   (AbstractGame.PLAYER_1_WIN – AbstractGame.PLAYER_2_WIN);

testing

The ai_1 repository contains several sample games that implement the interfaces above: TrivialGame, Tic-Tac-Toe, and Connect Four. It would be wise to test your minimax or MCTS implementation on these games before tackling Warlight in the next section.

TrivialGame is, well, trivial. In this game, there are only two moves. The first player chooses a number from 1 to 3, then the second player chooses a number from 1 to 3. Then the game ends; the player with the higher number wins. If the numbers are the same, the game is a draw. Although this game is extremely simple, it may be useful as a first test of your algorithm's functionality.

Of course you are already familiar with Tic-Tac-Toe, and probably with Connect Four as well.

All of the sample games implement the HeuristicGame interface, so they will work with minimax. However, for TrivialGame and TicTacToe the evaluate() function just returns a dummy value. For Connect Four, the evaluate() function computes a heuristic by scanning the board and looking for groups of adjacent stones that contain discs of the same color.

For each of the sample games, there is a command-line script (trivial, tictactoe, connect_four) that launches it. Run the script with no arguments to see a list of options and available strategies. Specify one strategy on the command line to play as a human against that strategy, or two strategies to have them play each other. Use the -sim option to run a series of simulated games. Using these options, you should be able to debug your implementation. You may also find it interesting to see how it does against various strategies as you vary its parameters.

The following strategies are available for all the sample games:

Of course, the 'minimax' and 'mcts' strategies will only work if you implement them, i.e. write the Minimax or Mcts classes. :)

For each game, there are one or more additional strategies available. For TrivialGame:

For TicTacToe:

For Connect Four:

Here are a few examples showing how to launch these sample games:

$ ./trivial minimax:0 random -sim 100

Run 100 games of TrivialGame in which minimax (with no depth limit) plays versus a random strategy.

$ ./tictactoe mcts:100/random basic -sim 100 -v

Run 100 games of Tic-Tac-Toe in which MCTS (with 100 iterations, and using 'random' as the base strategy) plays against 'basic'. Print verbose output, i.e. show the random seed and outcome of each game.

$ ./tictactoe mcts:100/random basic -seed 22

Watch MCTS (with 100 iterations, and using 'random' as the base strategy) play Tic-Tic-Toe interactively against 'basic', with seed 22.

$ ./connect_four heuristic

Play Connect Four interactively against 'heuristic'.

If your implementation of minimax or MCTS is correct, here is how you should do against these games:

When you submit to ReCodEx, it will test your Minimax or Mcts class by playing these sample games, and will expect roughly the win rates described above.

2. Warlight (15 pts)

Download the code from the Warlight repository on GitHub. Read the game rules on the GitHub page.

Write an agent that plays Warlight. You may use your minimax or MCTS implementation from the previous section, or another approach if you prefer.

Assuming you will use minimax or MCTS, you will need to do the following:

class WarlightGame implements HeuristicGame<Game, Move> {
    @Override
    public Game initialState(int seed) {
        return new Game(null);
    }

    ...
}

To receive full credit for this assignment, your agent will need to be able to win 75% of games versus Napoleon, who is the strongest of the built-in agents. Specifically, ReCodEx will run 50 games of your agent versus Napoleon, and you will need to win at least 38 of these games for full credit. If you win at least 33 games (i.e. over 65%), ReCodEx will award you partial credit for the assignment.

There is a time limit of 5 minutes for the 50 games on ReCodEx. That might sound like a lot, but a typical Warlight game might have 50 turns, in each of which your agent will make 2 moves, so in the 50 games you will probably make on the order of 5,000 moves. That means that your agent will probably need to make its moves in about 50 ms on average in order to not exceed the time limit. That should still be enough for several hundred iterations of Monte Carlo tree search if you have a fast base strategy.

hints

3. Tournament

All entries received before the deadline for this assignment will automatically entered into a tournament. There will be three tournament categories:

In the tournament games there will be no time limit per move. However, only agents that are fast enough to complete 50 games vs. Napoleon within the 5-minute time limit on ReCodEx are eligible for the tournament. To be eligible for tournament category 3, your agent must also pass the warlight_multi test on ReCodEx, which plays three 4-player games including your agent and must finish within a time limit of 60 seconds.