Some of this week's topics are covered in Problem Solving with Algorithms:
And in Introduction to Algorithms:
22. Elementary Graph Algorithms
Here are some additional notes.
In the last lecture we studied depth-first search, which is one way of exploring all vertices in a graph. We will now look at breadth-first search, which is another fundamental graph algorithm.
Starting from some vertex V, a breadth-first search first visits vertices adjacent to V, i.e. vertices of distance 1 from V. It then visits vertices of distance 2, and so on. In this way it can determine the shortest distance from V to every other vertex in the graph.
Let's look at two ways to implement a breadth-first search. As a first possibility, we may use a list of vertices that are at the same distance from the starting vertex. Initially this list will contain just the starting vertex, which is at distance 0. Then we'll build a list of neighbors of the starting vertex, which will be at distance 1. After that, we'll build a list of neighbors of those vertices, which will be at distance 2, and so on. During this process we'll use a visited set to remember the vertices that we've already seen, to avoid visiting the same vertex twice and thus walking in a loop.
Here is an implementation. The function bfs() takes a graph in adjacency-list representation, plus the id of a vertex at which to begin the search:
```python
def bfs(g, start):
    n = len(g)                 # number of vertices in the graph
    visited = [False] * n
    vs = [start]               # list of vertices at same distance
    visited[start] = True
    while vs != []:
        # build a list 'next' of all unvisited neighbors of vertices in vs
        next = []
        for v in vs:           # for all vertices v at this distance
            print(f'visiting {v}')
            for w in g[v]:     # for all neighbors w
                if not visited[w]:
                    visited[w] = True
                    next.append(w)
        vs = next
```
We mark each vertex as visited at the moment that we add it to the list 'next', which ensures that we will not add the same vertex twice.
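To see the round-by-round behavior concretely, here is the same list-based search run on a small example graph (the graph itself is our own invention, not from the text above); this version returns the visit order instead of printing it:

```python
def bfs_order(g, start):
    # list-based breadth-first search, returning vertices in visit order
    visited = [False] * len(g)
    order = []
    vs = [start]               # vertices at the current distance
    visited[start] = True
    while vs != []:
        next = []
        for v in vs:
            order.append(v)
        # collect all unvisited neighbors: these are at the next distance
            for w in g[v]:
                if not visited[w]:
                    visited[w] = True
                    next.append(w)
        vs = next
    return order

# a small example graph in adjacency-list form (our own example):
#   0 -- 1, 0 -- 2, 1 -- 3, 2 -- 3, 3 -- 4
g = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
print(bfs_order(g, 0))   # → [0, 1, 2, 3, 4]
```

Vertex 0 is visited first (distance 0), then 1 and 2 (distance 1), then 3 (distance 2), then 4 (distance 3).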
Alternatively, we can implement a breadth-first search using a queue. Let's suppose that we have a Queue class with operations enqueue() and dequeue(). (We can implement this class using a linked list or a circular array, as we've seen in earlier lectures.)
We begin by adding the start vertex to the queue and marking it as visited. In a loop, we repeatedly remove vertices from the queue. Each time we remove a vertex, we mark all of its adjacent unvisited vertices as visited and add them to the queue. The algorithm terminates once the queue is empty, at which point we will have visited all reachable vertices. Here's an implementation:
```python
# breadth-first search
def bfs(graph, start):
    visited = [False] * len(graph)
    q = Queue()
    visited[start] = True
    q.enqueue(start)
    while not q.is_empty():
        v = q.dequeue()
        print('visiting', v)
        for w in graph[v]:
            if not visited[w]:
                visited[w] = True
                q.enqueue(w)
```
This queue-based implementation of a breadth-first search is arguably a bit simpler than the list-based implementation above. However, it does depend on the availability of a queue class.
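For completeness, here is one possible Queue class with the interface the search relies on. A linked list or circular array, as in earlier lectures, would serve equally well; this sketch simply wraps Python's collections.deque, which supports adding and removing at either end in O(1):

```python
from collections import deque

class Queue:
    # a minimal queue: enqueue at the back, dequeue from the front
    def __init__(self):
        self.items = deque()

    def enqueue(self, x):
        self.items.append(x)        # add to the back

    def dequeue(self):
        return self.items.popleft() # remove from the front

    def is_empty(self):
        return len(self.items) == 0
```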
Note that we mark vertices as visited when we add them to the queue, not when we remove them. This ensures that our algorithm will not add the same vertex to the queue more than once.
As the algorithm runs, all vertices in the queue are at approximately the same distance from the start vertex. To be more precise, at every moment in time there is some value d such that all vertices in the queue are at distance d or (d + 1) from the start vertex.
Like a depth-first search, a breadth-first search does a constant amount of work for each vertex and edge, so it also runs in time O(V + E).
Just as with a depth-first search, we may imagine that at any time during a breadth-first search, each vertex is either undiscovered (white), on the frontier (gray) or explored (black). The queue represents the frontier, i.e. the gray vertices.
Let's revisit the graph of European countries that we saw in last week's lecture notes. Here is a queue-based breadth-first search in progress, starting from Austria:
If we like, as a breadth-first search runs we may record a breadth-first tree, which shows the edges by which the search discovered each vertex in the graph. Equivalently, it shows a shortest path from the start vertex to every other vertex. Here is the tree generated by a breadth-first search of the Europe graph:
We may reverse all the arrows in this tree to produce a graph indicating the shortest path from every vertex back to the start vertex:
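One common way to record the breadth-first tree is with a parent array. The sketch below is a variant of the queue-based search (using a plain Python list as the queue so that it is self-contained): parent[w] holds the vertex from which w was discovered, so following parent links from any vertex leads back to the start along a shortest path, exactly as in the reversed-arrow picture:

```python
def bfs_tree(g, start):
    n = len(g)
    parent = [None] * n     # parent[w] = vertex from which w was discovered
    visited = [False] * n
    visited[start] = True
    q = [start]             # a plain list standing in for a queue
    while q:
        v = q.pop(0)
        for w in g[v]:
            if not visited[w]:
                visited[w] = True
                parent[w] = v   # record the tree edge v -> w
                q.append(w)
    return parent
```

(Note that q.pop(0) takes O(n) time on a Python list; a real implementation would use a proper queue as above.)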
A common use of breadth-first search is to compute the distance between two vertices, i.e. the length of the shortest path between them.
Let's modify the last breadth-first search function we saw above so that the caller can specify two vertices 'start' and 'end'. The function will print the distance between these vertices, or 'no path' if there is no path between them. In this modified function, each queue element will be a pair that holds both a vertex ID and the distance from the start node to that vertex. Here is the implementation:
```python
def bfs2(g, start, end):
    n = len(g)
    visited = [False] * n
    q = Queue()
    q.enqueue( (start, 0) )
    visited[start] = True
    while not q.is_empty():
        v, dist = q.dequeue()
        if v == end:    # found destination vertex
            print(f'distance is {dist}')
            return
        for w in g[v]:
            if not visited[w]:
                visited[w] = True
                q.enqueue( (w, dist + 1) )
    print('no path')
```
Other variations of this are possible. For example, instead of storing distances in queue elements, we could have an array dist[] that holds the distance from the start vertex to each vertex, or None for undiscovered vertices. (Then we wouldn't actually need the visited[] array, since a vertex v would be visited if and only if dist[v] is not None.)
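That variation might look like this (a sketch, again using a plain list as a queue for self-containment):

```python
def bfs_dist(g, start):
    # dist[v] = shortest distance from start to v, or None if undiscovered.
    # dist doubles as the visited set, so no separate visited[] is needed.
    dist = [None] * len(g)
    dist[start] = 0
    q = [start]
    while q:
        v = q.pop(0)
        for w in g[v]:
            if dist[w] is None:       # w is undiscovered
                dist[w] = dist[v] + 1
                q.append(w)
    return dist
```

A further advantage is that when the search finishes we know the distance to every reachable vertex, not just to one destination.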
Depth-first search and breadth-first search are fundamental algorithms for exploring graphs. Actually these algorithms are more broadly applicable: we can use them to solve many problems in which we search through a state space. These problems have a graph-like structure. In each such problem, there is an initial state and a function that maps each state to a list of its possible successor states. We are searching for some path that leads from the initial state to a goal state.
For example, consider the problem of finding the smallest number of moves that a knight needs to travel from one square to another on a chessboard. Recall that a knight's move has an L shape. The following picture indicates the 8 different squares to which a knight at the given position can move:
In this problem, each state is an (x, y) position on the board. The initial state is the knight's starting square. Each state has up to 8 neighbors, which are the squares that the knight can move to next. The goal state is the position that the knight is trying to reach.
We may solve this problem using a breadth-first search. (A depth-first search would not work, since we are looking for a shortest path.) In fact, we need to modify our last breadth-first search function only slightly. Each queue element will be a triple (x, y, dist), where dist is the shortest distance from the starting square to the position (x, y). To compute the neighboring states of each state, we will loop over all 8 directions in which a knight may move. Here is an implementation:
```python
dirs = [(1, 2), (-1, 2), (1, -2), (-1, -2),
        (2, 1), (2, -1), (-2, 1), (-2, -1)]

# Print the smallest number of knight moves to get from
# (start_x, start_y) to (end_x, end_y) on a chessboard.
def knight_path(start_x, start_y, end_x, end_y):
    # visited set = 8 x 8 array of booleans
    # visited[x][y] is True if we have visited (x, y)
    visited = [[False] * 8 for i in range(8)]
    q = Queue()
    q.enqueue( (start_x, start_y, 0) )
    visited[start_x][start_y] = True
    while not q.is_empty():
        x, y, dist = q.dequeue()
        print(f'visiting {(x, y)}')
        if (x, y) == (end_x, end_y):
            print(f'shortest distance is {dist}')
            return
        for dx, dy in dirs:
            x1, y1 = x + dx, y + dy    # next position
            if 0 <= x1 < 8 and 0 <= y1 < 8 and not visited[x1][y1]:
                q.enqueue( (x1, y1, dist + 1) )
                visited[x1][y1] = True
    print('unreachable')
```
As an exercise, you could enhance this function so that it prints out a sequence of positions along the shortest path from the start to the end position.
Arithmetic expressions are composed from numbers and operators that act on those numbers. In this section we will use the operators +, -, * and /, the last of which denotes integer division.

In traditional mathematical notation, these operators are infix operators: they are written between the values that they act on (which are called operands). For example, we write 2 + 2, or 8 - 7. In this last expression, 8 and 7 are the operands.
Here are some arithmetic expressions written using infix notation:
(4 + 5) * (2 + 1)
(7 / 2) - 1
We may choose an alternative syntax that uses prefix notation, in which we write each operator before its operands. For example, we write + 2 4 to mean the sum of 2 and 4, or / 8 2 to mean 8 divided by 2. Here are the above expressions rewritten in prefix notation:
* + 4 5 + 2 1
- / 7 2 1
Or we may use postfix notation, in which operators come after both operands: we might write 4 5 + to mean the sum of 4 and 5. Here are the above expressions in postfix notation:
4 5 + 2 1 + *
7 2 / 1 -
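One appealing property of postfix notation is that it is easy to evaluate mechanically with a stack. The following evaluator is our own sketch, not from the text above; it assumes tokens are separated by spaces and that / means integer division:

```python
def eval_postfix(s):
    # evaluate a space-separated postfix expression, e.g. '4 5 + 2 1 + *'
    stack = []
    for token in s.split():
        if token in '+-*/':
            b = stack.pop()    # right operand is on top of the stack
            a = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            else:
                stack.append(a // b)   # integer division
        else:
            stack.append(int(token))
    return stack[0]    # the final value remains on the stack
```

For example, eval_postfix('4 5 + 2 1 + *') returns 27, and eval_postfix('7 2 / 1 -') returns 2.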
In infix notation we must write parentheses to distinguish expressions that would otherwise be ambiguous. For example, 4 + 5 * 9 could mean either (4 + 5) * 9 or 4 + (5 * 9). (Another way to disambiguate expressions is using operator precedence. For example, * is generally considered to have higher precedence than +. However, here we will assume that no such precedence exists.)
In prefix or postfix notation there is no need for either parentheses or operator precedence, because expressions are inherently unambiguous. For example, the prefix expression * + 4 5 9 is equivalent to ((4 + 5) * 9), and the prefix expression + 4 * 5 9 is equivalent to (4 + (5 * 9)). There is no danger of confusing these in prefix form, even without parentheses.