Week 5: Notes

foldl

The built-in higher-order predicate foldl can combine elements of a list using an arbitrary predicate. Let's see foldl in action:

add(I, J, K) :- I + J #= K.

?- foldl(add, [3, 4, 5], 0, S).
S = 12

The code above adds the members of the list [3, 4, 5], starting with the value 0. In other words, it computes ((0 + 3) + 4) + 5. This is called a left fold because it combines values from the left, just like the accumulators that we wrote in last week's lecture.

In general, foldl takes four arguments: a predicate of three arguments, a list, a starting value for the accumulator and a final value for the accumulator. If we write

foldl(P, [a, b, c], A, R)

then foldl will make a series of calls to P, with this effect:

P(a, A, A1)

P(b, A1, A2)

P(c, A2, R)

Here is an efficient implementation of reverse(), using foldl:

prepend(X, L, [X | L]).

reverse(L, M) :- same_length(L, M), foldl(prepend, L, [], M).

Another version of foldl takes five arguments: a predicate of four arguments, two lists, a starting value for the accumulator and a final value for the accumulator. If we write

foldl(P, [a, b, c], [d, e, f], A, R)

then foldl will make a series of calls to P, with this effect:

P(a, d, A, A1)

P(b, d, A1, A2)

P(c, f, A2, R)

In other words, each call to P receives two elements, one from each of the input lists, and uses them to compute the next accuulator value. For example, we can use this version of foldl to compute a dot product:

mul_add(X, Y, A, A1) :- A1 #= A + X * Y.

?- foldl(mul_add, [1, 2, 3], [4, 5, 6], 0, N).
N = 32

functional data structures

In Prolog we can build common data structures such as stacks, queues, and binary trees. However, since Prolog has no mutable state, the interfaces to these structures will look a bit different than in procedural languages. Every operation on a data structure S, such as pushing a value onto a stack or enqueuing into a queue, will return a new version of the data structure, since S itself cannot be modified.

As we will see, the implementations of functional data structures must also sometimes be different than in procedural languages.

stacks

A functional stack is trivial to implement in Prolog, since a Prolog list itself can act as a stack.

% S is an empty stack.
empty_stack([]). 

% We can push X onto S to make the stack [X | S].
push(X, S, [X | S]).

% We can pop X from [X | S] to make the stack S.
pop(X, [X | S], S).

We can push N numbers onto an empty stack using recursion:

% S is the empty stack with the numbers 1 .. N pushed onto it.
push_to_n(0, []).
push_to_n(N, S) :- N #> 0, N1 #= N - 1, push_to_n(N1, S1), push(N, S1, S).

Or using foldl:

push_to_n2(N, S) :- numlist(1, N, NL), foldl(push, NL, [], S).

queues

We may naively implement a functional queue using a list. When we enqueue values, we append them to the end fo the list. We dequeue values by removing them from the front:

% empty_queue(Q) - Q is an empty queue
empty_queue([]).

% enqueue(X, Q, R) - we can enqueue X onto Q to make R
enqueue(X, Q, R) :- append(Q, [X], R).

% dequeue(X, Q, R) - we can dequeue X from Q to make R
dequeue(X, [X | R], R).

We may use foldl to enqueue a series of values:

?- numlist(1, 10, _L), foldl(enqueue, _L, [], Q).
Q = [1, 2, 3, 4, 5, 6, 7, 8, 9|...].

In fact we may even use foldl to dequeue several values at once:

?- numlist(1, 10, _L), foldl(enqueue, _L, [], _Q),
   length(M, 3), foldl(dequeue, M, _Q, _).
M = [1, 2, 3].

However our naive implementation is inefficient. enqueue() runs in O(N) when there are N values in the queue. Queueing N values successively onto an empty queue will take O(N2).

Alternatively, we may implement a functional queue using two lists, which we will call the front and back lists. For example, here is one such queue:

[a, b, c] / [f, e, d]

To enqueue a value, we prepend it to the back list. To dequeue a value, we remove it from the head of the front list. For example, we may enqueue 'g' and then dequeue 'a' and b' from the queue above, yielding this queue:

[c] / [g, f, e, d]

When dequeueing, if the front list is empty then we reverse the back list and move it to the front. For example, if we dequeue 'c' from the queue above, then the front list will become empty. The next time we dequeue, we will reverse the list [g, f, e, d] to form [d, e, f, g]. We will dequeue 'd', and the remaining items will now be at the front:

[e, f, g] / []

Here is a Prolog implementation:

empty_queue([] / []).

enqueue(X, F / B, F / [X | B]).

dequeue(X, [X | F] / B, F / B).
dequeue(X, [] / B, F / []) :- reverse(B, [X | F]).

With this implementation, enqueue() runs in O(1).

How will dequeue() perform? Suppose that we start with an empty queue, and enqueue and dequeue N values via some series of enqueue and dequeue operations. There will be some number of flips, i.e. operations in which we move values from the back to the front. Each flip of K values will take O(K), since reverse() runs in linear time. The total number of values flipped will be N, since each enqueued value will move to the front exactly once. So the total cost of all the flip operations will be O(N). And so the total time for all N dequeue operations will be O(N) (for removing values from the front list) + O(N) (for all the flip operations) = O(N). And so dequeue() will run in O(1) on average.

binary trees

It is straightforward to represent a binary tree in Prolog. We can represent an empty binary tree as the atom 'nil', and a non-empty tree as t(L, X, R), where X is the value in the root node and L and R are left and right subtrees. For example, consider this tree:

Here is a predicate defining its representation:

my_tree(T) :- T = t(t(nil, 2, nil),
                    5,
                    t(t(nil, 8, nil), 10, t(nil, 12, nil))).

It's straightforward to write a recursive predicate to count the nodes in a tree:

count(nil, 0).
count(t(L, _, R), N) :- N #>= 1, LN #>= 0, RN #>= 0, N #= LN + RN + 1,
                        count(L, LN), count(R, RN).

?- my_tree(_T), count(_T, N).
N = 5.

We may even generate all trees with a given number of nodes:

?- count(T, 3).
T = t(nil, _, t(nil, _, t(nil, _, nil))) ;
T = t(nil, _, t(t(nil, _, nil), _, nil)) ;
T = t(t(nil, _, nil), _, t(nil, _, nil)) ;
T = t(t(nil, _, t(nil, _, nil)), _, nil) ;
T = t(t(t(nil, _, nil), _, nil), _, nil)

Let's write a predicate maptree(P, T) that succeeds if a predicate P is true for every value in a tree:

maptree(_, nil).
maptree(P, t(L, X, R)) :- call(P, X), maptree(P, L), maptree(P, R).

?- count(T, 2), maptree(=(5), T).
T = t(nil, 5, t(nil, 5, nil)) ;
T = t(t(nil, 5, nil), 5, nil)

Here is a predicate that flattens a tree into a list:

flatten(nil, []).
flatten(t(L, X, R), M) :- flatten(L, FL), flatten(R, FR),
                          append([FL, [X], FR], M).

Unfortunately it's inefficient. Suppose that the tree has N nodes, and looks like a linked list descending leftward:

Then we will make N calls to append(), each of which will append a single element to the list. The total time will be O(1 + 2 + 3 + … + N) = O(N2).

Alternatively, suppose that the tree is a complete binary tree with N nodes, and let T(N) be the predicate's running time. Then T(N) = 2 T(N / 2) + O(N), so T(N) = O(N log N).

We can write flatten() more efficiently using an accumulator:

% flatten(T, A, B): Flatten all values in the tree T and
% prepend them to A to produce B.

flatten(nil, A, A).

flatten(t(L, X, R), A, B) :- flatten(R, A, A2), flatten(L, [X | A2], B).

% top-level predicate

flatten(T, L) :- flatten(T, [], L).

This implementation will flatten any tree in linear time.

binary search trees

Let's write some predicates that operate on binary search trees. As we recall, in a binary search tree, if a node has value X, then all values in its left subtree are less than X, and all values in the right subtree are greater than X.

Here is a predicate to test membership in a tree:

% mem_tree(X, T) is true if X is contained in the binary search tree T.

mem_tree(X, t(_, X, _)).
mem_tree(X, t(L, Y, R)) :- X #< Y, mem_tree(X, L).
mem_tree(X, t(L, Y, R)) :- X #> Y, mem_tree(X, R).

Inserting into a binary search tree is straightforward:

% insert(X, T, U) is true if we can insert X into the binary search tree T
% to produce the tree U.  If X is already present in T, the predicate
% succeeds but does not insert a duplicate value.

insert(X, nil, t(nil, X, nil)).
insert(X, t(L, X, R), t(L, X, R)).
insert(X, t(L, Y, R), t(L2, Y, R)) :- X #< Y, insert(X, L, L2).
insert(X, t(L, Y, R), t(L, Y, R2)) :- X #> Y, insert(X, R, R2).

Deletion is not quite as easy, and we will not implement it here.

Of course, a disadvantage of a simple binary search tree is that it may become unbalanced. As an alternative, the Prolog standard library includes an implementation of association lists, which are dictionaries implemented using balanced binary trees.

the cut

Normally Prolog generates all possible solutions to any query. In some situations, however, we may only want a single solution. The cut operator ! causes Prolog to commit to all choices that it has already made in the current query or predicate. After a cut, Prolog will not backtrack to look for more possible solutions. For example:

?- member(X, [2, 4, 6, 8, 10]), X #> 5, !.
X = 6.

?- append(L, M, [a, b, c]), !.
L = [],
M = [a, b, c].

?-

Here are some more examples:

?- (X = 3; X = 4, !; X = 5).
X = 3 ;
X = 4.

?- (Z = 1; Z = 2), (X = 3; X = 4, !; X = 5).
Z = 1,
X = 3 ;
Z = 1,
X = 4.

Prolog will, however, continue to explore choices that appear to the right of the cut:

?- (X = 3; X = 4, !; X = 5), (Y = 10; Y = 20).
X = 3,
Y = 10 ;
X = 3,
Y = 20 ;
X = 4,
Y = 10 ;
X = 4,
Y = 20.

If a predicate is written using multiple rules, then the cut will affect it just as if it were written as a single rule using the ';' operator. In other words, after a cut no further rules will be considered. For example:

foo(X) :- X = 3.
foo(X) :- X = 4, !.
foo(X) :- X = 5.

?- foo(X).
X = 3 ;
X = 4.

?-

the cut and logical purity

Almost all of the Prolog features and predicates that we have seen until now are part of the pure logical core of the language. For example, dif(), member(), append(), select(), maplist(), foldl(), and new-style arithmetic are all logically pure.

Any program written only using these pure logical features has some very nice properties:

The cut is useful in various situations, however it is not logically pure. This means that a program that uses the cut, or features derived from the cut, will probably not have the properties above.

Let's look at a couple of examples. Consider this predicate:

kind(N, small) :- N #< 10.
kind(N, medium) :- N #>= 10, N #< 100.
kind(N, large) :- N #>= 100.

We might attempt to simplify it using the cut as follows:

kind2(N, small) :- N #< 10, !.
kind2(N, medium) :- N #< 100, !.
kind2(_, large).

With this change, any query of the form kind2(N, K) will still produce a correct value K if N is already instantiated and K is not:

?- kind2(50, K).
K = medium.

?- kind2(200, K).
K = large.

However, if N is uninstantiated then we may now receive answers that are incorrect:

?- kind2(N, large).
true.

The preceding answer claims that all N are big, but that is not true. Similarly, if K is already instantiated then we may also receive incorrect answers:

?- kind2(5, medium).
true.

Furthermore, the results of our query may now depend on the order in which we ask questions:

?- N = 50, K = medium, kind2(N, K).
N = 50,
K = medium.

?- kind2(N, K), N = 50, K = medium.
false.

As another example, suppose that we use the cut to write a version of member(X, L) that will succeed only once, even if X is in the list L multiple times:

mem1(X, [X | _]) :- !.
mem1(X, [_ | L]) :- mem1(X, L).

In the forward direction, it works as intended:

?- mem1(3, [2, 3, 3, 5, 4, 3]).
true.

?-

However in the reverse direction it now returns results that are incomplete:

?- mem1(3, L).
L = [3|_].

?-

As a result, once again we may get different results if we reorder goals in a query:

?- L = [2, 3, 4], mem1(3, L).
L = [2, 3, 4].

?- mem1(3, L), L = [2, 3, 4].
false.

In summary, the cut works well in a forward direction, i.e. at a moment when all variables are already instantiated. So, for example, it's reasonable to use a cut at the top level of a program that is searching for solutions in a state space, at a moment when a first solution is known and you don't want to search for more solutions. However, in situations where code may need to run multidirectionally, the cut is potentially dangerous and requires you to think about your program in a more procedural way.

negation

We may use the cut to implement negation. For example, let's write a predicate not_mem(X, L) that is true if X is not a member of L:

not_mem(X, L) :- member(X, L), !, false; true.

This predicate works as follows. If X is a member of L, then the predicate is false, and the cut prevents it from considering the other branch. Otherwise, the other branch runs and the predicate is true:

?- not_mem(3, [4, 5, 6]).
true.

?- not_mem(3, [4, 3, 6]).
false.

We may generalize this idea by writing a higher-order predicate 'not':

not(P) :- call(P), !, false; true.

?- not(member(3, [4, 5, 6])).
true.

In fact this predicate is built into Prolog, and has the name \+:

?- \+ member(3, [4, 5, 6]).
true.

Because Prolog's negation is defined using the cut, it will really only work in a forward direction. If you apply negation to a term in which all variables are already instantiated, it will behave as you expect. If variables are not instantiated, you may receive results that are unsound or incomplete. For example, consider this query:

?- \+ member(X, [2, 3, 4]).
false

This result is claiming that there are no values X which are not members of the list [2, 3, 4]. But the claim is unsound, since certainly such X do exist:

?- X = 10, \+ member(X, [2, 3, 4]).
X = 10.

As another example, suppose that we have the following predicates about restaurants:

good_food(ferdinand).
good_food(beseda).

expensive(beseda).

cheap(R) :- \+ expensive(R).

I'm looking for a restaurant that has good food and is inexpensive. This query lets me find one:

?- good_food(R), cheap(R).
R = ferdinand 

However if I ask the questions in the other order, the query fails:

?- cheap(R), good_food(R).
false

if/then/else

We may also use the cut to implement an if/then/else operation. For example, here is a predicate max() which computes the maximum Z of two values X and Y:

max(X, Y, Z) :- X #>= Y, !, Z = X; Z = Y.

?- max(4, 10, Z).
Z = 10.

?- max(7, 3, Z).
Z = 7.

If the goal 'X #>= Y' succeeds, then Z = X and the cut prevents Prolog from considering the other choice. If 'X #>= Y' fails, then Z = Y.

Prolog includes an arrow operator '->' which is defined using the cut, and lets us write this if/then/else choice with an (arguably) nicer syntax:

max(X, Y, Z) :- X #>= Y -> Z = X; Z = Y.

Once again, because this operator is defined using the cut, it will really only work in a forward direction, and may give us unsound or incomplete results in a reverse direction. For example:

?- max(4, X, 10).
false

This result is claiming that there are no values X such that the maximum of 4 and X is 10. Of course, that is wrong.

tutorial programs

We solved these exercises in the tutorial:

Write a predicate digits(L, B, N) that is true if L is a list of digits in base B whose value is N. For example, digits([3, 4, 5], 10, 345) is true. Use a left fold.

join(B, D, A, A1) :- 0 #=< D, D #< B, A1 #= A * B + D.

digits(L, B, N) :- foldl(join(B), L, 0, N).

Write the library function foldl(+P, ?L, ?A, ?R) using call().

foldl(_P, [], A, A). 

foldl(P, [X | L], A, R) :- call(P, X, A, A1), foldl(P, L, A1, R).

Write a function foldr(+P, ?L, ?A, ?R) that performs a right fold over the values in L.

foldr(_P, [], A, A). 

foldr(P, [X | L], A, R) :- call(P, X, A1, R), foldr(P, L, A, A1).

Write a function height(?T, ?H) that is true if H is the height of the binary tree T. Recall that the height of a tree is defined as the maximal distance from the root to any leaf.

height(nil, -1).

height(t(L, _, R), H) :- LH #>= -1, RH #>= -1, H #>= 0,
    H #= max(LH, RH) + 1, height(L, LH), height(R, RH).

Write a predicate foldr_tree(+P, ?T, ?A, ?R) that performs a right fold of a predicate over all values in a tree.

foldr_tree(_P, nil, A, A).

foldr_tree(P, t(L, X, R), A, Ret) :-
    foldr_tree(P, R, A, A1),
    call(P, X, A1, A2),
    foldr_tree(P, L, A2, Ret).

Write a predicate flatten(?T, ?L) that flattens a tree using foldr_tree().

cons(X, L, [X | L]).

flatten(T, L) :- foldr_tree(cons, T, [], L).

In this cryptarithmetic puzzle, every letter stands for a different digit:

  A P P L E
+ G R A P E
+   P L U M
=========== 
B A N A N A

Write a Prolog program that can find a solution to the puzzle.

digits(L, N) :- digits(L, 10, N).

solve(Apple, Grape, Plum, Banana) :-
    Apple = [A, P, P, L, E],
    Grape = [G, R, A, P, E],
    Plum = [P, L, U, M],
    Banana = [B, A, N, A, N, A],
    All = [A, B, E, G, L, M, N, P, R, U],
    all_distinct(All),
    digits(Apple, Ap), digits(Grape, Gr), digits(Plum, Pl),
    digits(Banana, Ba),
    Ap + Gr + Pl #= Ba,
    label(All).

Is it possible to place N queens on an N x N chessboard such that no two queens attack each other? Write a Prolog program that can solve this problem for any N.

no_attack(X1 / Y1, X2 / Y2) :-
    X1 #\= X2, Y1 #\= Y2,
    abs(X1 - X2) #\= abs(Y1 - Y2).

first(X, X / _).

% queens(K, N, Q):
% Q is a list of K positions of queens in the first K rows
% on an N x N board.

queens(0, _N, []).

queens(K, N, [X / K | Q1]) :- K #> 0,
    K1 #= K - 1, queens(K1, N, Q1),
    0 #< X, X #=< N,
    maplist(no_attack(X / K), Q1).

queens(N, Q) :- queens(N, N, Q), maplist(first, XS, Q), labeling([ff], XS).

Write a predicate bfs(+G, +V, +W, ?P) that finds the shortest path from vertex V to vertex W in an undirected graph. Use a breadth-first search.

We discussed this problem, but did not have time to write a predicate. Here's one possible solution:

prepend_to(L, X, [X | L]).

% bfs(+Graph, +Queue, +Visited, +Goal, -Path)
% True if Path is a path of states leading from the start to Goal.

bfs(_, [[Goal | P1] | _], _, Goal, Path) :- reverse([Goal | P1], Path).

bfs(Graph, [[V | P1] | Q], Visited, Goal, Path) :-
    member(V -> Neighbors, Graph),
    subtract(Neighbors, Visited, Neighbors1), append(Neighbors1, Visited, Visited2),
    maplist(prepend_to([V | P1]), Neighbors1, Neighbors2), append(Q, Neighbors2, Q2),
    bfs(Graph, Q2, Visited2, Goal, Path).

% top-level entry point
bfs(Graph, Start, Goal, Path) :- bfs(Graph, [[Start]], [Start], Goal, Path).