Week 5: Notes

accumulators

Earlier we wrote a predicate that can reverse a list:

rev([], []).
rev([X | L], N) :- append(M, [X], N), rev(L, M).

reverse(L, M) :- same_length(L, M), rev(L, M).

The call to same_length() allows the predicate to terminate in both directions.

Let's try to use this predicate to reverse a list with 10,000 elements:

?- numlist(1, 10_000, L), reverse(L, M).
ERROR: Stack limit (1.0Gb) exceeded

The predicate runs for a number of seconds, then runs out of stack space. Fundamentally the problem is that it runs in time O(N2) on a list with N elements. That's because appending an element to the end of a list takes O(N), so the total running time is O(1 + 2 + ... + N) = O(N2).

This inefficiency is disappointing. After all, in a procedural language we could easily write a function to reverse a linked list in time O(N), using a simple loop. On each iteration, we can remove an element from the head of the original list, and prepend it to the list that we are building.

To imitate this process in Prolog, we may rewrite the predicate using an accumulator:

rev_acc([], A, A).
rev_acc([X | L], A, R) :- rev_acc(L, [X | A], R).

rev_acc(L, R) :- same_length(L, R), rev_acc(L, [], R).

In this version, the top-level predicate rev_acc(L, R) calls the recursive helper predicate rev_acc(L, A, R), passing the empty list [] as the initial value for the accumulator argument A. Each time that rev_acc() recurses, it removes an element from the list L and prepends it to the accumulator A. When we reach the base of the recursion, we return the final value of the accumulator.

As usual, we must call same_length() to ensure that the predicate will terminate in either direction. This predicate will reverse a list of length N in time O(N), since each prepend operation takes constant time.

As another example, suppose that we'd like to write a predicate that computes the sum of all integers from 1 through N. Here's a direct recursive solution:

sum_n(0, 0).
sum_n(N, S) :- N #> 0, N1 #= N - 1, S #= S1 + N, sum_n(N1, S1).

Let's try it:

?- sum_n(1000, N).
N = 500500

It seems to work. Let's try to compute some large sums:

?- sum_n(1_000_000, N).
N = 500000500000

?- sum_n(5_000_000, N).
ERROR: Stack limit (1.0Gb) exceeded

This last call fails. The problem is that the predicate has recursed very deeply, exceeding available memory. This is unfortunate, since in an imperative language such as C we could easily compute this sum using a small constant amount of memory.

As an alternative, we may use an accumulator:

sum_acc(0, A, A).
sum_acc(N, A, S) :- N #>0, N1 #= N - 1, A1 #= A + N, sum_acc(N1, A1, S).

sum_acc(N, S) :- sum_acc(N, 0, S).

As in the previous example, the top-level predicate calls a recursive helper predicate, passing an initial value for the accumulator argument A. Each time that sum_acc() recurses, it computes a new accumulator value and passes it to the next recursive call. At the base of the recursion we return the final accumulator value.

The rewritten predicate runs more quickly, and can compute larger sums:

?- sum_acc(1_000_000, N).
N = 500000500000

?- sum_acc(5_000_000, N).
N = 12500002500000

In the original predicate sum_n() without the accumulator, the sum is computed from right to left. For example, sum_n(4, S) will compute

1 + (2 + (3 + 4))

On the other hand, the predicate sum_acc() computes the sum from left to right. For example, sum_acc(4, S) will compute

((1 + 2) + 3) + 4

Of course, the result will be the same in either case, because addition is associative: (a + b) + c = a + (b + c).

As we can see, using an accumulator may make a program more efficient. On the other hand, our predicates rev_acc() and sum_acc() are slightly more complicated than their non-accumulator versions. Furthermore, in some Prolog programs an accumulator may actually hurt performance. For this reason, I recommend that you use an accumulator only when it provides a significant performance gain.

foldl

The built-in higher-order predicate foldl can combine elements of a list using an arbitrary predicate. Let's see foldl in action:

add(I, J, K) :- I + J #= K.

?- foldl(add, [3, 4, 5], 0, S).
S = 12

The code above adds the members of the list [3, 4, 5], starting with the value 0. In other words, it computes ((0 + 3) + 4) + 5. This is called a left fold because it combines values from the left, just like the accumulators that we wrote in last week's lecture.

In general, foldl takes four arguments: a predicate of three arguments, a list, a starting value for the accumulator and a final value for the accumulator. If we write

foldl(P, [a, b, c], A, R)

then foldl will make a series of calls to P, with this effect:

P(a, A, A1)

P(b, A1, A2)

P(c, A2, R)

Here is an efficient implementation of reverse(), using foldl:

prepend(X, L, [X | L]).

reverse(L, M) :- same_length(L, M), foldl(prepend, L, [], M).

We can easily implement fold() using call(). The implementation will look similar to the accumulator pattern that we saw in the previous lecture:

foldl(_, [], A, A).

foldl(P, [X | L], A, R) :- call(P, X, A, A1), foldl(P, L, A1, R).

Another version of foldl takes five arguments: a predicate of four arguments, two lists, a starting value for the accumulator and a final value for the accumulator. If we write

foldl(P, [a, b, c], [d, e, f], A, R)

then foldl will make a series of calls to P, with this effect:

P(a, d, A, A1)

P(b, d, A1, A2)

P(c, f, A2, R)

In other words, each call to P receives two elements, one from each of the input lists, and uses them to compute the next accuulator value. For example, we can use this version of foldl to compute a dot product:

mul_add(X, Y, A, A1) :- A1 #= A + X * Y.

?- foldl(mul_add, [1, 2, 3], [4, 5, 6], 0, N).
N = 32

sorting

We may implement sorting algorithms such as insertion sort, merge sort or quicksort easily in Prolog.

For example, let's implement insertion sort. As a first step, here's a predicate that can insert a value into a sorted list:

insert(X, [], [X]).

insert(X, [Y | Z], [X, Y | Z]) :- X #=< Y.

insert(X, [Y | Z], [Y | L]) :- X #> Y, insert(X, Z, L).

Let's try it:

?- insert(15, [10, 12, 14, 16], L).
L = [10, 12, 14, 15, 16]

?- insert(A, [10, 12, 16], [10, 12, 14, 16]).
A = 14 

Now we can write insertion sort recursively:

isort([], []).

isort([X | L], M) :- same_length([X | L], M), isort(L, LS), insert(X, LS, M).

It works both forwards and backwards:

?- isort([5, 2, 8, 1, 10, 3], L).
L = [1, 2, 3, 5, 8, 10]

?- isort(L, [10, 15, 20]).
L = [10, 15, 20] ;
L = [15, 10, 20] ;
L = [20, 10, 15] ;
L = [10, 20, 15] ;
L = [15, 20, 10] ;
L = [20, 15, 10].

?- isort(L, [10, 20, 15]).
false.

How long will our predicate take to sort N numbers? In the best case, the input list will already be sorted. Then insert() will run in O(1) on each iteration, so the total time will be O(N). In the worst (and expected) case, insert() will take O(M) to insert into a sorted list of M numbers, so the total running time will be O(N + (N - 1) + ... + 1) = O(N2). These are the same best-case and worst-case times that we would expect from an implementation of insertion sort in a procedural language.

functional data structures

In Prolog we can build common data structures such as stacks, queues, and binary trees. However, since Prolog has no mutable state, the interfaces to these structures will look a bit different than in procedural languages. Every operation on a data structure S, such as pushing a value onto a stack or enqueuing into a queue, will return a new version of the data structure, since S itself cannot be modified.

As we will see, the implementations of functional data structures must also sometimes be different than in procedural languages.

stacks

A functional stack is trivial to implement in Prolog, since a Prolog list itself can act as a stack.

% S is an empty stack.
empty_stack([]). 

% We can push X onto S to make the stack [X | S].
push(X, S, [X | S]).

% We can pop X from [X | S] to make the stack S.
pop(X, [X | S], S).

We can push N numbers onto an empty stack using recursion:

% S is the empty stack with the numbers 1 .. N pushed onto it.
push_to_n(0, []).
push_to_n(N, S) :- N #> 0, N1 #= N - 1, push_to_n(N1, S1), push(N, S1, S).

Or using foldl:

push_to_n2(N, S) :- numlist(1, N, NL), foldl(push, NL, [], S).

queues

We may naively implement a functional queue using a list. When we enqueue values, we append them to the end fo the list. We dequeue values by removing them from the front:

% empty_queue(Q) - Q is an empty queue
empty_queue([]).

% enqueue(X, Q, R) - we can enqueue X onto Q to make R
enqueue(X, Q, R) :- append(Q, [X], R).

% dequeue(X, Q, R) - we can dequeue X from Q to make R
dequeue(X, [X | R], R).

We may use foldl to enqueue a series of values:

?- numlist(1, 10, _L), foldl(enqueue, _L, [], Q).
Q = [1, 2, 3, 4, 5, 6, 7, 8, 9|...].

In fact we may even use foldl to dequeue several values at once:

?- numlist(1, 10, _L), foldl(enqueue, _L, [], _Q),
   length(M, 3), foldl(dequeue, M, _Q, _).
M = [1, 2, 3].

However our naive implementation is inefficient. enqueue() runs in O(N) when there are N values in the queue. Queueing N values successively onto an empty queue will take O(N2).

Alternatively, we may implement a functional queue using two lists, which we will call the front and back lists. For example, here is one such queue:

[a, b, c] / [f, e, d]

To enqueue a value, we prepend it to the back list. To dequeue a value, we remove it from the head of the front list. For example, we may enqueue 'g' and then dequeue 'a' and b' from the queue above, yielding this queue:

[c] / [g, f, e, d]

When dequeueing, if the front list is empty then we reverse the back list and move it to the front. For example, if we dequeue 'c' from the queue above, then the front list will become empty. The next time we dequeue, we will reverse the list [g, f, e, d] to form [d, e, f, g]. We will dequeue 'd', and the remaining items will now be at the front:

[e, f, g] / []

Here is a Prolog implementation:

empty_queue([] / []).

enqueue(X, F / B, F / [X | B]).

dequeue(X, [X | F] / B, F / B).
dequeue(X, [] / B, F / []) :- reverse(B, [X | F]).

With this implementation, enqueue() runs in O(1).

How will dequeue() perform? Suppose that we start with an empty queue, and enqueue and dequeue N values via some series of enqueue and dequeue operations. There will be some number of flips, i.e. operations in which we move values from the back to the front. Each flip of K values will take O(K), since reverse() runs in linear time. The total number of values flipped will be N, since each enqueued value will move to the front exactly once. So the total cost of all the flip operations will be O(N). And so the total time for all N dequeue operations will be O(N) (for removing values from the front list) + O(N) (for all the flip operations) = O(N). And so dequeue() will run in O(1) on average.

binary trees

It is straightforward to represent a binary tree in Prolog. We can represent an empty binary tree as the atom nil, and a non-empty tree as t(L, X, R), where X is the value in the root node and L and R are left and right subtrees. For example, consider this tree:

Here is a predicate defining its representation:

my_tree(T) :- T = t(t(nil, 2, nil),
                    5,
                    t(t(nil, 8, nil), 10, t(nil, 12, nil))).

It's straightforward to write a recursive predicate to count the nodes in a tree:

count(nil, 0).
count(t(L, _, R), N) :- N #>= 1, LN #>= 0, RN #>= 0, N #= LN + RN + 1,
                        count(L, LN), count(R, RN).

?- my_tree(_T), count(_T, N).
N = 5.

We may even generate all trees with a given number of nodes:

?- count(T, 3).
T = t(nil, _, t(nil, _, t(nil, _, nil))) ;
T = t(nil, _, t(t(nil, _, nil), _, nil)) ;
T = t(t(nil, _, nil), _, t(nil, _, nil)) ;
T = t(t(nil, _, t(nil, _, nil)), _, nil) ;
T = t(t(t(nil, _, nil), _, nil), _, nil)

Let's write a predicate maptree(P, T) that succeeds if a predicate P is true for every value in a tree:

maptree(_, nil).
maptree(P, t(L, X, R)) :- call(P, X), maptree(P, L), maptree(P, R).

?- count(T, 2), maptree(=(5), T).
T = t(nil, 5, t(nil, 5, nil)) ;
T = t(t(nil, 5, nil), 5, nil)

the cut

Normally Prolog generates all possible solutions to any query. In some situations, however, we may only want a single solution. The cut operator ! causes Prolog to commit to all choices that it has already made in the current query or predicate. After a cut, Prolog will not backtrack to look for more possible solutions. For example:

?- member(X, [2, 4, 6, 8, 10]), X #> 5, !.
X = 6.

?- append(L, M, [a, b, c]), !.
L = [],
M = [a, b, c].

Here are some more examples:

?- (X = 3; X = 4, !; X = 5).
X = 3 ;
X = 4.

?- (Z = 1; Z = 2), (X = 3; X = 4, !; X = 5).
Z = 1,
X = 3 ;
Z = 1,
X = 4.

Prolog will, however, continue to explore choices that appear to the right of the cut:

?- (X = 3; X = 4, !; X = 5), (Y = 10; Y = 20).
X = 3,
Y = 10 ;
X = 3,
Y = 20 ;
X = 4,
Y = 10 ;
X = 4,
Y = 20.

If a predicate is written using multiple rules, then the cut will affect it just as if it were written as a single rule using the ';' operator. In other words, after a cut no further rules will be considered. For example:

foo(X) :- X = 3.
foo(X) :- X = 4, !.
foo(X) :- X = 5.

?- foo(X).
X = 3 ;
X = 4.

?-

the cut and logical purity

Almost all the Prolog features and predicates that we have seen until now are part of the pure logical core of the language. For example, dif(), member(), append(), select(), maplist(), foldl(), and new-style arithmetic are all logically pure.

As we've noted before, any program written only using these pure logical features has some nice properties: predicates run in multiple directions, query results are sound and complete, and reordering goals will not change the results of a query (though it may affect termination).

The cut can be useful in various situations, however it is not logically pure. This means that a program that uses the cut, or features derived from the cut, will probably not have the properties above.

Let's look at a couple of examples. Consider this predicate:

kind(N, small) :- N #< 10.
kind(N, medium) :- N #>= 10, N #< 100.
kind(N, large) :- N #>= 100.

We might attempt to simplify it using the cut as follows:

kind2(N, small) :- N #< 10, !.
kind2(N, medium) :- N #< 100, !.
kind2(_, large).

With this change, any query of the form kind2(N, K) will still produce a correct value K if N is already instantiated and K is not:

?- kind2(50, K).
K = medium.

?- kind2(200, K).
K = large.

However, if N is uninstantiated then we may now receive answers that are incorrect:

?- kind2(N, large).
true.

The preceding answer claims that all N are big, but that is not true. Similarly, if K is already instantiated then we may also receive incorrect answers:

?- kind2(5, medium).
true.

Furthermore, the results of our query may now depend on the order in which we ask questions:

?- N = 50, K = medium, kind2(N, K).
N = 50,
K = medium.

?- kind2(N, K), N = 50, K = medium.
false.

As another example, suppose that we use the cut to write a version of member(X, L) that will succeed only once, even if X is in the list L multiple times:

mem1(X, [X | _]) :- !.
mem1(X, [_ | L]) :- mem1(X, L).

In the forward direction, it works as intended:

?- mem1(3, [2, 3, 3, 5, 4, 3]).
true.

?-

However in the reverse direction it now returns results that are incomplete:

?- mem1(3, L).
L = [3|_].

?-

As a result, once again we may get different results if we reorder goals in a query:

?- L = [2, 3, 4], mem1(3, L).
L = [2, 3, 4].

?- mem1(3, L), L = [2, 3, 4].
false.

In summary, the cut works well in a forward direction, i.e. at a moment when all variables are already instantiated. So, for example, it's reasonable to use a cut at the top level of a program that is searching for solutions in a state space, at a moment when a first solution is known and you don't want to search for more solutions. However, in situations where code may need to run multidirectionally, the cut is potentially dangerous and requires you to think about your program in a more procedural way.

negation

We may use the cut to implement negation. For example, let's write a predicate not_mem(X, L) that is true if X is not a member of L:

not_mem(X, L) :- member(X, L), !, false; true.

This predicate works as follows. If X is a member of L, then the predicate is false, and the cut prevents it from considering the other branch. Otherwise, the other branch runs and the predicate is true:

?- not_mem(3, [4, 5, 6]).
true.

?- not_mem(3, [4, 3, 6]).
false.

We may generalize this idea by writing a higher-order predicate 'not':

not(P) :- call(P), !, false; true.

?- not(member(3, [4, 5, 6])).
true.

In fact this predicate is built into Prolog, and has the name \+:

?- \+ member(3, [4, 5, 6]).
true.

Because Prolog's negation is defined using the cut, it will really only work in a forward direction. If you apply negation to a term in which all variables are already instantiated, it will behave as you expect. If variables are not instantiated, you may receive results that are unsound or incomplete. For example, consider this query:

?- \+ member(X, [2, 3, 4]).
false

This result is claiming that there are no values X which are not members of the list [2, 3, 4]. But the claim is false, since certainly such X do exist:

?- X = 10, \+ member(X, [2, 3, 4]).
X = 10.

As another example, suppose that we have the following predicates about restaurants:

good_food(ferdinand).
good_food(beseda).

expensive(beseda).

cheap(R) :- \+ expensive(R).

I'm looking for a restaurant that has good food and is inexpensive. This query lets me find one:

?- good_food(R), cheap(R).
R = ferdinand 

However if I ask the questions in the other order, the query fails:

?- cheap(R), good_food(R).
false

if/then/else

We may also use the cut to implement an if/then/else operation. For example, here is a predicate max() which computes the maximum Z of two values X and Y:

max(X, Y, Z) :- X #>= Y, !, Z = X; Z = Y.

?- max(4, 10, Z).
Z = 10.

?- max(7, 3, Z).
Z = 7.

If the goal 'X #>= Y' succeeds, then Z = X and the cut prevents Prolog from considering the other choice. If 'X #>= Y' fails, then Z = Y.

Prolog includes an arrow operator '->' which is defined using the cut, and lets us write this if/then/else choice with an (arguably) nicer syntax:

max(X, Y, Z) :- X #>= Y -> Z = X; Z = Y.

Once again, because this operator is defined using the cut, it will really only work in a forward direction, and may give us unsound or incomplete results in a reverse direction. For example:

?- max(4, X, 10).
false

This result is claiming that there are no values X such that the maximum of 4 and X is 10. Of course, that is wrong.

graphs

There are various possible ways to represent a graph in Prolog. Consider this acylic directed graph:

We could represent the graph using a series of facts:

edge(a, b).
edge(a, e).
edge(b, c).
...

This is sometimes called a temporal representation, because a query such as edge(a, V) will return a series of solutions over time.

Alternatively, we could use an adjacency list representation:

graph1(G) :- G =
    [ a -> [b, e], b -> [c, f], c -> [],
      d -> [], e -> [d, f], f -> [c],
      g -> [d, h], h -> [f] ].

This is sometimes called a spatial representation, since the graph is represented using a data structure.

We can implement graph search algorithms on either sort of graph. Let's choose the adjacency-list representation. We can begin by writing a predicate that can tell whether an edge exists in a graph:

% True if G has an edge from V to W.
edge(G, V, W) :- member(V -> L, G), member(W, L).

depth-first search

Here is a predicate path() that can find a path between two vertices in a graph in adjacency-list representation.

% path(G, V, W, P): True if P is a path from V to W in graph G.

path(_, V, V, [V]).

path(G, V, W, [V | P]) :- edge(G, V, X), path(G, X, W, P).

This predicate is somewhat similar to the ancestor() predicate that we wrote in the first lecture. However, unlike that predicate it builds a path P from the start to the goal.

Let's use path() to find all paths from a to c in the graph above:

?- graph1(_G), path(_G, a, c, P).
P = [a, b, c] ;
P = [a, b, f, c] ;
P = [a, e, f, c]

The predicate will even work multidirectionally:

?- graph1(_G), path(_G, V, W, [a, e, f, c]).
V = a,
W = c

?- graph1(_G), path(_G, a, V, P).
V = a,
P = [a] ;
V = b,
P = [a, b] ;
V = c,
P = [a, b, c] ;
V = f,
P = [a, b, f]
...

Our undirected graph above was acyclic. Let's now add an edge from c to a to make the graph cyclic:

graph2(G) :- G =
    [ a -> [b, e], b -> [c, f], c -> [a],
      d -> [], e -> [d, f], f -> [c],
      g -> [d, h], h -> [f] ].

Let's try using path() to search this graph:

?- graph2(_G), path(_G, a, c, P).
P = [a, b, c] ;
P = [a, b, c, a, b, c] ;
P = [a, b, c, a, b, c, a, b, c]
...

?- path(_G, a, e, P)
<hang>

We see it fails to find a path from a to e. That's because the search walks around the cycle a-b-c in an infinite loop, so it will never consider the edge from a to e.

So how can we search in a graph (or state space) that may contain cycles? Let's look at three possible ways do that.

iterative deepening, revisited

As we saw week, iterative deepening performs a series of depth-first searches, each of which has a maximum depth. The first depth-first search has depth limit 1, the second has limit 2, and so on. Iterative deepening will always find a shortest path to a goal. It will work fine even in graphs with cycles. Although it will explore paths around cycles, that will not cause it to go into an infinite loop since its search depth is always limited. Our implementation looks like this:

% iter_deep(G, V, W, P): True if P is a path from V to W in G.
% Return the shortest path first.
iter_deep(G, V, W, P) :- length(P, _), path(G, V, W, P).

Unlike path(), this predicate returns the shortest path from b to f as its first result:

?- graph2(_G), iter_deep(_G, b, f, P).
P = [b, f] ;
P = [b, c, a, e, f]
<hang>

Iterative deepening is sometimes a useful technique. However, this implementation will never terminate, since it will keep searching for longer and longer paths. Of course, we can use the cut if we want only a single solution:

% iter_deep(G, V, W, P): True if P is the shortest path from V to W in G.
iter_deep(G, V, W, P) :- length(P, _), path(G, V, W, P), !.

However, the predicate will still fail to terminate if there is no path V to W.

Another problem is that this predicate considers all possible paths of length (N - 1) before any path of length N. And unfortunately there may be exponentially many of these.

avoiding self-intersecting paths

As another possible approach, we can amend our path() predicate so that it avoids selecting any vertex that is already in the path that it has built so far. It might look like this:

not_mem(X, L) :- maplist(dif(X), L).

% extend(G, P, Goal, R): true if P can be extended to reach Goal, producing path R

extend(_, P, Goal, R) :- P = [Goal | _], reverse(P, R).

extend(G, [X | P], Goal, R) :- dif(X, Goal), edge(G, X, Y), not_mem(Y, [X | P]),
    extend(G, [Y, X | P], Goal, R).

% path(G, V, W, Path): True if Path is a path from V to W in graph G.

path2(G, V, W, Path) :- extend(G, [V], W, Path).

path2() can search successfully even in a cyclic graph:

?- graph2(_G), path2(_G, a, c, P).
P = [a, b, c] ;
P = [a, b, f, c] ;
P = [a, e, f, c]

?- graph2(_G), path2(_G, a, e, P).
P = [a, e]

This is a step forward, but path2() still has a couple of limitations. First, since it is a depth-first search it is not guaranteed to find the shortest path between two vertices:

?- graph2(_G), path2(_G, b, f, P).
P = [b, c, a, e, f]

Also, this predicate may be inefficient. Unlike our implementation of depth-first search in Introduction to Algorithms last year, this predicate does not use a visited set shared across the entire search. Instead, it only avoids self-intersecting paths. In some graphs this may be much less efficient than using a visited set, since there may be (exponentially) many paths that do not intersect themselves, and path2() may explore many of these.

breadth-first search

A breadth-first search with a visited set will visit each state only once, so it may be far more efficient than iterative deepening or a search that only avoids self-intersecting paths.

A breadth-first search needs a queue. For the moment, we will use a naive queue representation, i.e. a list, where we enqueue by appending to the list. We also need a visited set. For the moment, we will also use a naive visited set representation, i.e. a list.

When the breadth-first search reaches the goal, we would like to return a list of all states on the path from the start to the goal. For this reason, each element on the queue will not be a simple vertex V. Instead, it will be a list of vertices on a path from the start to V, in reverse order. Because this list is in reverse order, we can efficiently extend it by prepending any neighbor of V to it.

Here's one possible solution:

prepend_to(L, X, [X | L]).

% bfs(+Graph, +Queue, +Visited, +Goal, -Path)
% True if Path is a path of states leading from the start to Goal.

bfs(_, [[Goal | P1] | _], _, Goal, Path) :- reverse([Goal | P1], Path).

bfs(Graph, [[V | P1] | Q], Visited, Goal, Path) :-
    member(V -> Neighbors, Graph),
    subtract(Neighbors, Visited, Neighbors1), append(Neighbors1, Visited, Visited2),
    maplist(prepend_to([V | P1]), Neighbors1, Neighbors2), append(Q, Neighbors2, Q2),
    bfs(Graph, Q2, Visited2, Goal, Path).

% top-level entry point
bfs(Graph, Start, Goal, Path) :- bfs(Graph, [[Start]], [Start], Goal, Path).

Of course, we could improve this predicate by using more efficient data structures for the queue and visited set. For the queue, we could use the front/back data structure that we saw in this lecture. For a visited set, SWI-Prolog includes libraries assoc and rbtree that implement efficient dictionaries using balanced binary trees. We could use one of these data structures, representing a set as a dictionary from keys to any fixed value.

breadth-first search in a state space

We can adapt the breath-first search code above to perform a state space search. However, we will need to some way to generate a list of all neighboring states for any state. The built-in predicate findall() can help. Given a variable X and a goal that includes X, findall() will find all possible solutions for X and return them in a list. For example:

?- findall(X, permutation([a, b, c], X), L).
L = [[a, b, c], [a, c, b], [b, a, c], [b, c, a], [c, a, b], [c, b, a]].

Now suppose that our state transition function is defined by a predicate move(S, T), which succeeds if T is a successor state of S. Then we can call findall(X, move(S, X), L) to produce a list of successor states of S.

Recall the coin flipping puzzle we saw last week, a simple example of a state space problem. Let's now generalize it to have any number of coins. There are N coins on a table, initially all heads. We want to flip any number of coins so that they all become tails. Obviously the problem is trivial, but a state space search that knows nothing about the structure of the problem will have to explore an exponential number of states to find a solution.

Here is a solution, using a breadth-first search:

flip(h, t).
flip(t, h).

% move(S, T) - true if we can move from state S to state T
move(S, T) :- select(X, S, Y, T), flip(X, Y).

prepend_to(L, X, [X | L]).

% bfs(+Queue, +Visited, +Goal, -Path)
% True if Path is a path of states leading from the start to Goal.

bfs([[Goal | P1] | _], _, Goal, Path) :- reverse([Goal | P1], Path).

bfs([[S | P1] | Q], Visited, Goal, Path) :-
    findall(X, move(S, X), Neighbors),
    subtract(Neighbors, Visited, Neighbors1), append(Neighbors1, Visited, Visited2),
    maplist(prepend_to([S | P1]), Neighbors1, Neighbors2),  append(Q, Neighbors2, Q2),
    bfs(Q2, Visited2, Goal, Path).

solve(N, P) :- length(Start, N), maplist(=(h), Start),
               length(Goal, N), maplist(=(t), Goal),
               bfs([[Start]], [Start], Goal, P).

Now we can find a solution reasonably quickly for N = 10, even though we have still used naive data structures for the queue and visited set:

?- solve(10, P).
P = [[h, h, h, h, h, h, h, h|...], [t, h, h, h, h, h, h|...], [t, t, h, h, h, h|...], [t, t, t, h, h|...], [t, t, t, t|...], [t, t, t|...], [t, t|...], [t|...], [...|...]|...]

For larger state space searches, we may need to use more efficient data structures as we noted above.