Week 5: Notes

There was no lecture or tutorial this week.

Our topics this week are binary trees, findall, breadth-first search, the cut and negation. They are discussed in our textbook Prolog Programming for Artificial Intelligence, Fourth Edition:

Chapter 5 "Controlling Backtracking"
Chapter 6 "Built-in Predicates": 6.6 bagof, setof and findall
Chapter 9 "Operations on Data Structures": 9.2 Representing sets by binary trees, 9.3 Insertion and deletion in a binary dictionary
Chapter 11 "Problem-Solving as Search": 11.3 Breadth-first search

All of these topics are also in the third edition of the book, where the section numbers are the same, except that findall is in section 7.6.

Here are some more notes:

breadth-first search

Last week we discussed several algorithms for searching in a graph in Prolog:

depth-first search
depth-first search with a visited set, to avoid walking around cycles forever
iterative deepening

As we briefly discussed last week, we may also use these algorithms for searching a state space, which includes a start state and a successor predicate succ(S, T) that succeeds once for every successor T of a state S. In a state space search problem, we are trying to find the shortest path from the start state to some goal state.

A finite graph in temporal representation is one example of a state space: its successor predicate is the predicate edge(V, W) that leads from each vertex V to its neighbors W. But a state space is a more general notion that we can use to solve many kinds of problems. For example, if we are trying to solve a Rubik's cube, then the start state may be a scrambled cube, the successor predicate leads from each position to all neighboring positions, and the goal state is an unscrambled cube. Some state spaces may even be infinite.

As we saw last week, iterative deepening is guaranteed to find a shortest path to a goal state, but may be very inefficient in some state spaces because it will consider all possible paths of length (N – 1) before finding any path of length N, and there may be an exponential number of these.

We can overcome this inefficiency by implementing a breadth-first search with a visited set. As we know from our study of algorithms, a breadth-first search of a finite graph with V vertices and E edges will run in time O(V + E) as long as queue operations are O(1). A breadth-first search explores states in order of increasing distance from the start, so even in an infinite state space it will complete in time O(N), where N is the number of states that are closer to the start than the goal, as long as the branching factor (the number of successor states from each state) is bounded by a constant.

To implement a breadth-first search in Prolog, we can use the built-in findall predicate, which can return a list of all solutions to an arbitrary goal. Normally Prolog returns solutions one at a time, but findall returns them all in a list:

?- member(X, [10, 11, 12]).
X = 10 ;
X = 11 ;
X = 12.

?- findall(X, member(X, [10, 11, 12]), L).
L = [10, 11, 12].

We should note that findall is not part of the pure core language: it cannot run backwards, and so any programs using findall is unlikely to work in all directions. But for the purpose of implementing a breadth-first search, that doesn't matter, since we really only want to run such a search in a forward direction.

Our textbook describes a couple of predicates that are similar to findall, namely bagof and setof. These predicates have a significant weakness: if a query has no results, they fail (return false) rather than returning an empty list. That behavior is often undesirable, and so I recommend using findall instead.

We may now implement a breadth-first search for state spaces. Actually we would not like to tie our breadth-first implementation to any particular successor predicate. Instead, we will write it as a higher-order predicate: it will take a parameter Succ, which can be an arbitrary predicate. That way we can write bfs just once and reuse it to search any space we like.

Here is the implementation:

% helper predicate
prepend(L, X, [X | L]).

% if goal is at the head of the queue, return it
bfs(_, [[Goal | Rest] | _], _, Goal, [Goal | Rest]).

% main recursive predicate: bfs(+Succ, +Queue, +Visited, +Goal, -Solution)
bfs(Succ, [[State | Path] | Queue], Visited, Goal, Solution) :-
    findall(X, call(Succ, State, X), Next),    % find all neighboring states
    subtract(Next, Visited, Next1),            % remove already-visited states
    maplist(prepend([State | Path]), Next1, Next2), % prepend each state to path
    append(Queue, Next2, Queue2),              % add all new states to queue
    append(Next1, Visited, Visited1),          % add all new states to visited set
    bfs(Succ, Queue2, Visited1, Goal, Solution).   % recurse

% top-level predicate: bfs(+Succ, Start, +Goal, -Solution)
bfs(Succ, Start, Goal, Solution) :-
    bfs(Succ, [[Start]], [Start], Goal, Solution1),

    reverse(Solution1, Solution).

Note that as the bfs predicate runs, each state S on the queue is actually stored as a list of states from the start state to S, but in reverse order, so that S appears first. This representation is actually quite efficient, because if we discover an edge from S to T, we can construct the vertex list for T by prepending T to the vertex list for S, and prepending is an O(1) operation. Once the goal state appears at the head of the queue, we can reverse its state list to form the solution, i.e. a list of states from the start to the goal.

This implementation will outperform an iterative deepening search in many state spaces, but is still not as efficient as we would like because it does not use an efficient queue implementation: the append operations above run in O(N), not O(1). It is possible to implement a queue efficiently in Prolog (i.e. with enqueue and dequeue operations that run in O(1)) using a slightly more advanced data structure called a difference list, but we will not study that this week.

By the way, the breadth-first search implementation in section 11.3 in the Bratko textbook has a significant weakness: it does not use a global visited set, and so in some spaces it may explore an exponential number of paths unnecessarily, just like the iterative deepening implementation we saw last week. For that reason I recommend that you use this implementation instead.

the cut and negation

You will read in our textbook about the cut operator !, which causes Prolog to terminate a query after it has found the first result. You may be tempted to use the cut to help your programs terminate, and it can sometimes be helpful for this purpose. However, I will point out that the cut does not belong to the language's pure logical core, and so using it has several consequences:

Any predicate using the cut will probably not run in both directions.
Any predicate using the cut will probably return incomplete results for some queries.
In any predicate using the cut, reordering goals may change the results of a query.

As described in the text, the not operator (\+) is defined using the cut, and using it has similar consequences.

By contrast, if you use only pure language features (including integer constraints), then

Any predicate will theoretically run in both directions (though it may sometimes fail to terminate).
Any query that terminates will always return results that are logically complete.
Reordering goals may affect termination, but will never change the results of a query.

So is using the cut, or the not operator, worth it? It is up to you, but you should understand what you are giving up by using these features. In this course, I have emphasized the pure logical features of Prolog since it seems to me that that is where Prolog shines the most – for example, I know of no other mainstream language that can run matrix multiplication backwards to yield matrix inversion.

At the same time, if a predicate will only be used in one direction, the cut does no harm. As a practical example, the iterative deepening predicate that we saw last week will first find the shortest path to the goal vertex, but then will run indefinitely looking for longer and longer solutions. It seems quite reasonable to use the cut to stop after it has found the first solution:

iter_deep(Start, Goal, Path) :- length(Path, _), dfs(Start, Goal, Path), !.