Programming I, 2018-9
Lecture 11 – Notes

nested functions

In Pascal a function or procedure may be nested inside another function or procedure:

// Determine how many elements of a are greater than x * x.
function countGreater(const a: array of integer; x:integer): integer;
var
  y: integer;
  j: integer;
  count: integer = 0;
  
  function isGreater(i: integer): boolean;
  begin
    exit(a[i] > y);
  end;
  
begin
  y := x * x;
  
  for j := 0 to high(a) do
    if isGreater(j) then
      count += 1;
      
  exit(count);
end;

In this example isGreater is a nested function. Notice that it can access both parameters (e.g. a) and local variables (e.g. y) of its containing function.

This example is trivial: there is no need to use a nested function here. But we will soon see that nested functions can be very useful in some circumstances.

maps

A map is an abstract data type representing a mapping from keys to values. For example, a map from integers to integers might look like this:

42 -> 9674
582 -> 1001
1002345 -> 77

A map may not contain duplicate keys.

An alternate term for a map is a dictionary.

The interface for a map looks like this:

type
  map = ...

procedure init(var m: map);

// Add a key-value pair to the map, replacing any previous value for this key.
procedure put(var m: map; key: integer; val: integer);

// Return true if this key is present in the map.
function contains(m: map; key: integer): boolean;

// Return this key's value, or -MaxInt if the key is not in the map.
function get(m: map; key: integer): integer;

// Remove this key and its value from the map if present.
procedure remove(var m: map; key: integer);

We can use a map like this:

var
  m: map;

begin
  init(m);
  put(m, 42, 9670);
  put(m, 582, 1001);
  put(m, 42, 9674);  // replace previous value for 42
  put(m, 1002345, 77);
  writeln(get(m, 42));  // writes 9674
end.

How can we implement a map? As with our other abstract data types, there are various possible ways. If we know that all the keys in a map will be in a small fixed integer range, we can simply use an array:

type
  // An array holding each key's value, or -MaxInt if the key is not present.
  map = array[0..999] of integer;

procedure init(var m: map);
var
  i: integer;
begin
  for i := 0 to 999 do
     m[i] := -MaxInt;
end;

procedure put(var m: map; key: integer; val: integer);
begin
  m[key] := val;
end;

function contains(m: map; key: integer): boolean;
begin
  exit(m[key] <> -MaxInt);
end;

function get(m: map; key: integer): integer;
begin
  exit(m[key]);
end;

procedure remove(var m: map; key: integer);
begin
  m[key] := -MaxInt;
end;

Usually, however, we will want keys that can be arbitrary values. An array indexed with all possible 32-bit integers would be impractically large and wasteful. And we will also want to build maps whose keys are not integers - for example, a map from strings to integers, or from reals to reals.

Fortunately we can implement a map from any type to any type in a straightforward manner. We have already seen how to implement sets in several ways: using an unordered array, an ordered array or a binary tree. We can easily transform any set implementation into a map implementation by storing a value along with each key.

For example, suppose that we'd like to use a binary tree to store a map from integers to strings. We can use this node type:

type
  node = record
    key: integer;
    val: string;
    left: ^node;
    right: ^node;
  end;

And now we can implement all map operations using the binary tree operations that we already know. For example, to put a key-value pair into this map, we descend the binary tree, looking for the given key. If we find it, we update its value; if we don't, we add a new node to the tree containing the new key-value pair. We can implement the get operation similarly.
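For instance, the put operation for this tree-based map might look like this. This is only a sketch: it assumes a pointer type pnode = ^node and that an empty tree is represented by nil.

procedure putNode(var t: pnode; key: integer; val: string);
begin
  if t = nil then
    begin
      // The key is not in the tree: add a new leaf node.
      new(t);
      t^.key := key;
      t^.val := val;
      t^.left := nil;
      t^.right := nil;
    end
  else if key < t^.key then
    putNode(t^.left, key, val)
  else if key > t^.key then
    putNode(t^.right, key, val)
  else
    t^.val := val;    // key already present: replace its value
end;

Because t is a var parameter, assigning to it in the nil case links the new node into its parent automatically.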

quicksort

In recent lectures we've seen several sorting algorithms (bubble sort, insertion sort, mergesort). Quicksort is another sorting algorithm. It is efficient (typically at least twice as fast as mergesort) and is very commonly used.

Quicksort sorts values in an array in place (unlike mergesort, which needs to use extra memory to perform merges). We will assume that array indices begin at 0, like in dynamic arrays in Pascal. Given an array a[0 .. n – 1] of values to sort, quicksort first divides the array into two non-empty partitions a[0 .. p] and a[p + 1 .. n – 1] for some integer p. It does this in such a way that every element in the first partition (a[0 .. p]) is less than or equal to every element in the second partition (a[p + 1 .. n – 1]). After partitioning, quicksort recursively calls itself on each of the two partitions.

The high-level structure of quicksort is similar to mergesort: both algorithms divide the array in two and call themselves recursively on both halves. With quicksort, however, there is no more work to do after these recursive calls. Sorting both partitions completes the work of sorting the array that contains them. Another difference is that merge sort always divides the array into two equal-sized pieces, but in quicksort the two partitions are generally not the same size, and sometimes one may be much larger than the other.

To partition an array, we first choose a random element in the array to use as the pivot. The pivot has some value v. Once the partition is complete, all elements in the first partition will have values less than or equal to v, and elements in the right partition will have values greater than or equal to v.

We define integer variables i and j representing array indexes. Initially i is positioned at the left end of the array and j is positioned at the right. We move i rightward, looking for a value that is greater than or equal to v (i.e. it can go into the second partition). And we move j leftward, looking for a value that is less than or equal to v. Once we have found these values, we swap a[i] and a[j]. After the swap, a[i] and a[j] now hold acceptable values for the left and right partitions, respectively. Now we continue the process, moving i to the right and j to the left. Eventually i and j meet. The point where they meet is the division point between the partitions, i.e. is the index p mentioned above.

Note that the pivot element itself may move to a different position during the partitioning process! There is nothing wrong with that. The pivot really just serves as an arbitrary value to use for separating the partitions.

For example, consider quicksort’s operation on this array:

[figure: the initial array of 8 elements]

The array has n = 8 elements. Suppose that we choose 3 as our pivot. We begin by setting i = -1 and j = 8. We move i rightward, looking for a value that is greater than or equal to 3. Immediately we find a[0] = 6 ≥ 3. We also move j leftward, looking for a value that is less than or equal to 3. We soon find a[6] = 2 ≤ 3. So we swap a[0] and a[6]:

[figure: the array after swapping a[0] and a[6]]

Now a[1] = 5 ≥ 3. j moves leftward until j = 3, since a[3] = 3 ≤ 3. So we swap a[1] and a[3]:

[figure: the array after swapping a[1] and a[3]]

Now i moves rightward until it encounters a value that is greater than or equal to 3; the first such value is a[3] = 5. Similarly, j moves leftward until it hits a value that is less than or equal to 3; the first such value is a[2] = 1. So i = 3 and j = 2, and j < i. i and j have crossed, so the partitioning is complete. The first partition is a[0..2] and the second partition is a[3..7]. Every element in the first partition is less than every element in the second partition. Quicksort will now sort both partitions, recursively.

Notice that in this example the pivot element itself moved during the partitioning process. As mentioned above, this is common.

Here is an implementation of the partition algorithm in Pascal:

procedure swap(var i, j: integer);
var
  k: integer;
begin
  k := i;
  i := j;
  j := k;
end;

// Partition an array of at least two elements.
// Return an index p < high(a) such that
//   every element of a[0 .. p] <= every element of a[p + 1 .. high(a)]
function partition(var a: array of integer): integer;
var
  v, i, j: integer;
begin
  v := a[random(high(a))];    // pivot value (never the last element in the array)
  i := -1;
  j := high(a) + 1;
  
  while true do
    begin
      repeat
        i := i + 1;
      until a[i] >= v;
      repeat
        j := j - 1;
      until a[j] <= v;
      
      if i >= j then break;
      swap(a[i], a[j]);
    end;
    
    partition := j;
end;

This partition function has these properties:

  1. All array accesses that happen as partition runs are within the array bounds.

  2. The returned value p is in the range 0 <= p < high(a) (and so both partitions contain at least one element).

  3. All elements in the left partition are less than or equal to all elements in the right partition.

Hopefully these properties make sense intuitively. Proving that these properties are always true is, however, a bit tricky since there are various cases to consider. (We will omit the proof here.)

Note that when we choose a random element to act as the pivot, we intentionally exclude the last element of the array. In other words, the expression a[random(high(a))] chooses a random pivot between a[0] and a[high(a) - 1], but never a[high(a)]. (If we did choose the last element of the array, then partition could return the value high(a), in which case the second partition would be empty, violating property (2) above.)
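As an informal check, we can run partition on many small random arrays and verify properties (2) and (3) directly. This is a sketch, assuming the partition function above:

var
  a: array of integer;
  i, t, p, maxLeft, minRight: integer;
begin
  randomize;
  for t := 1 to 1000 do
    begin
      setLength(a, 2 + random(20));    // at least two elements
      for i := 0 to high(a) do
        a[i] := random(10);            // small values, so duplicates are common
      p := partition(a);
      if (p < 0) or (p >= high(a)) then
        writeln('property 2 violated');
      maxLeft := a[0];                 // largest element of the left partition
      for i := 1 to p do
        if a[i] > maxLeft then maxLeft := a[i];
      minRight := a[p + 1];            // smallest element of the right partition
      for i := p + 2 to high(a) do
        if a[i] < minRight then minRight := a[i];
      if maxLeft > minRight then
        writeln('property 3 violated');
    end;
end.

Using small random values makes duplicate elements common, which exercises the trickiest cases of the partitioning code.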

With partition in place, we can easily write our top-level quicksort procedure:

procedure quicksort(var a: array of integer);
var
  p: integer;
begin
  if length(a) < 2 then exit;
  p := partition(a);
  quicksort(a[0 .. p]);
  quicksort(a[p + 1 .. high(a)]);
end;
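For example, we can sort a small array (assuming the quicksort and partition functions above):

var
  a: array of integer;
  i: integer;
begin
  randomize;    // partition chooses pivots randomly
  setLength(a, 6);
  a[0] := 5; a[1] := 2; a[2] := 8; a[3] := 1; a[4] := 9; a[5] := 3;
  quicksort(a);
  for i := 0 to high(a) do
    write(a[i], ' ');    // writes: 1 2 3 5 8 9
  writeln;
end.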

performance of quicksort

To analyze the performance of quicksort, we first observe that partition(a) runs in time O(n), where n = length(a). This is because i and j stay within bounds and move only toward each other, so together they advance at most n times; and since each swap follows a move of both indices, the number of swaps is at most n / 2.

In the best case, quicksort divides the input array into two equal-sized partitions at each step. Then we have

T(n) = 2 T(n / 2) + O(n)

This is the same recurrence we saw when we analyzed mergesort in a previous lecture. Its solution is

T(n) = O(n log n)

In the worst case, at each step one partition has a single element and the other partition has all remaining elements. Then

T(n) = T(n – 1) + O(n)

This yields

T(n) = O(n²)

So how will quicksort typically perform?

This depends critically on the choice of pivot elements. But if we choose pivot elements randomly as we have done here, then quicksort’s expected performance will be O(n log n) for any input array. We will not prove this fact in this class. Note, however, that it is somewhat similar to our (also here unproven) claim that if elements are inserted into a binary tree in a random order, then the expected height of the tree will be O(log n), where n is the number of elements in the tree.

representing graphs

A graph consists of a set of vertices (often called nodes in computer science) and edges. Each edge connects two vertices. In an undirected graph, edges have no direction: two vertices are either connected by an edge or they are not. In a directed graph, each edge has a direction: it points from one vertex to another.

Graphs are fundamental in computer science. Many problems can be expressed in terms of graphs, and we can answer lots of interesting questions using various graph algorithms.

We can represent graphs in a computer program in either of two ways. Suppose that a graph has V vertices numbered 0 through V - 1. We can represent the graph using adjacency-matrix representation as a two-dimensional matrix A of booleans, with dimensions V x V. In this representation, A[i, j] is true if and only if there is an edge from vertex i to vertex j. If the graph is undirected, then A[i, j] = A[j, i], and so we may optionally save space by storing only the matrix elements above the main diagonal, i.e. elements for which i < j.

Here is a Pascal data type for storing a graph in adjacency-matrix representation, where each vertex is named with a string:

type
  graph = record
    name: array of string;
    adj: array of array of boolean;
  end;

Or we can use adjacency-list representation, in which for each vertex u we store a list of its adjacent vertices. Here is a Pascal type for a graph in adjacency-list representation:

  node = record
    name: string;
    adj: array of integer;    // adjacency list
  end;

  graph = array of node;

When we store an undirected graph in adjacency-list representation, then if there is an edge from u to v we store u in v’s adjacency list and also store v in u’s adjacency list.

We can store any information that we like in a graph node. In the examples above we have used a string, but you can replace that with any fields that you like, or with no fields at all.

Adjacency-matrix representation is more compact if a graph is dense. It allows us to immediately tell whether two vertices are connected, but it may take O(V) time to enumerate a given vertex’s edges. On the other hand, adjacency-list representation is more compact for sparse graphs. With this representation, if a vertex has e edges then we can enumerate them in time O(e), but determining whether two vertices are connected may take time O(e), where e is the number of edges of either vertex. Thus, the running time of some algorithms may differ depending on which representation we use.

In this course we will usually represent graphs using adjacency-list representation. Each vertex will have a unique integer ID.

Here is a function readGraph that reads an undirected graph in which nodes are named by strings. It uses the adjacency-list graph type defined above, and assigns integer IDs to the nodes as it creates them. Each line of the input contains two words, e.g.

dog cat

This indicates that the nodes named "dog" and "cat" are neighbors, i.e. an edge should exist between them.

// Find a node by name, or create it if it doesn't already exist
function findNode(var g: graph; name: string): integer;
var
  i: integer;
begin
  for i := 0 to high(g) do
    if g[i].name = name then exit(i);
    
  setLength(g, length(g) + 1);
  g[high(g)].name := name;
  exit(high(g));
end;

// Create a directed edge between two graph nodes
procedure connect(var g: graph; i, j: integer);
begin
  setLength(g[i].adj, length(g[i].adj) + 1);
  g[i].adj[high(g[i].adj)] := j;
end;

function readGraph(filename: string): graph;
var
  input: text;
  g: graph;
  s: string;
  i, j, p: integer;
begin
  assign(input, filename);
  reset(input);
  setLength(g, 0);
  
  while not eof(input) do
    begin
      readln(input, s);
      p := pos(' ', s);
      i := findNode(g, copy(s, 1, p - 1));
      j := findNode(g, copy(s, p + 1, length(s) - p));
      connect(g, i, j);    // connect i -> j
      connect(g, j, i);    // connect j -> i
    end;

  close(input);
  exit(g);
end;

a graph of Europe

Here is an undirected graph where each node is a country in the European Union. There is an edge between every two neighboring countries, and also between any two countries connected by a car ferry (drawn as a dashed line). Thus, two countries are connected by an edge if it is possible to drive from one to the other. We will use this graph to illustrate various algorithms in the following sections.

[figure: map of Europe]


depth-first search

In this course we will study several algorithms that can search a graph, i.e. explore it by visiting its nodes in some order.

All of these algorithms will have some common elements. As they run, at each moment in time the graph’s vertices fall into three sets: undiscovered vertices, vertices on the frontier (also called the open set), and explored vertices (also called the closed set). At the beginning all vertices are undiscovered. A vertex joins the frontier when the algorithm first sees it. After the algorithm has followed all of the vertex's edges, the vertex becomes explored. By convention, when we draw pictures illustrating graph algorithms we draw vertices as either white (undiscovered), gray (on the frontier) or black (explored).

Note that these algorithms will not usually mark vertices as belonging to one of these three states explicitly. Instead, these states are concepts that help us understand how these algorithms work.

A depth-first search is like exploring a maze by walking through it. Each time we hit a dead end, we walk back to the previous intersection and try the next unexplored path from that intersection. If we have already explored all paths from the intersection, we walk back to the intersection before that, and so on.

It is easy to implement a depth-first search using a recursive function. In fact we have already done so in this course! For example, we wrote a function to add up all values in a binary tree, like this:

function sum(tree: pnode): integer;
begin
  if tree = nil then exit(0);    // base case: an empty tree has sum 0
  exit(sum(tree^.left) + tree^.i + sum(tree^.right));
end;

This function visits all tree nodes using a depth-first tree search.

For a depth-first search on arbitrary graphs, which may not be trees, we must avoid walking around a cycle in an infinite loop. To accomplish this, when we first visit a vertex we mark it as visited. In other words, all vertices in the frontier and explored sets are considered to be visited. Whenever we follow an edge, if it leads to a visited vertex then we ignore it.

Here is a picture of a depth-first search in progress on our Europe graph, starting from Austria:

[figure: a depth-first search in progress on the Europe graph]
As the depth-first search progresses it traverses a depth-first tree that spans the original graph. If we like, we may store this tree in memory as a byproduct of the depth-first search. Here is a depth-first tree for the Europe graph:

[figure: a depth-first tree for the Europe graph]
Here is Pascal code that implements a depth-first search using a nested recursive procedure. It determines whether a graph is connected, i.e. whether every vertex is reachable from every other vertex.

function isConnected(const g: graph): boolean;
var
  visited: array of boolean;
  b: boolean;
  
  procedure visit(i: integer);
  var
    j: integer;
  begin
    visited[i] := true;

    for j in g[i].adj do
      if not visited[j] then
         visit(j);
  end;
  
begin
  setLength(visited, length(g));  // sets all array elements to false
  visit(0);

  for b in visited do
    if not b then
      exit(false);
  exit(true);
end;

In this function we begin the depth-first search at node 0, an arbitrary choice. If the graph is connected, we can reach all vertices from every vertex.

In this example we used a depth-first search to determine whether a graph is connected. As we will see in later lectures and in Programming II, a depth-first search is also a useful building block for other graph algorithms: determining whether a graph is cyclic, topological sorting, discovering strongly connected components and so on.

A depth-first search does a constant amount of work for each vertex (making a recursive call) and for each edge (following the edge and checking whether the vertex it points to has been visited). So it runs in time O(V + E), where V and E are the numbers of vertices and edges in the graph.

breadth-first search

Starting from some node N, a breadth-first search first visits nodes adjacent to N, i.e. nodes of distance 1 from N. It then visits nodes of distance 2, and so on. In this way it can determine the shortest distance from N to every other node in the graph.

We can implement a breadth-first search using a queue. Just like with depth-first graph search, we must remember all nodes that we have visited to avoid walking in circles. We begin by adding the start node to the queue and marking it as visited. In a loop, we repeatedly remove nodes from the queue. Each time we remove an node, we mark all of its adjacent unvisited nodes as visited and add them to the queue. The algorithm terminates once the queue is empty, at which point we will have visited all reachable nodes.

The queue represents the frontier. When we remove a node from the queue, it moves to the explored set. Just like with depth-first graph search, the visited nodes are the frontier nodes and the nodes in the explored set.

As the algorithm runs, all nodes in the queue are at approximately the same distance from the start node. To be more precise, at every moment in time there is some value d such that all nodes in the queue are at distance d or (d + 1) from the start node.

Here is a breadth-first search in progress on our Europe graph, starting from Austria:

[figure: a breadth-first search in progress on the Europe graph]
Just like depth-first search, breadth-first search traverses a tree which spans the original graph. A breadth-first tree indicates a shortest path from the start node to every other node in the graph. (Note, however, that the shortest path between two graph nodes is not necessarily unique.) Here is a breadth-first tree for the Europe graph:

[figure: a breadth-first tree for the Europe graph]
We will sometimes store a breadth-first tree in memory as a directed graph with the arrows pointing in the other direction, i.e. toward the start node. This is sometimes called an in-tree. In this representation each node points toward its predecessor in the breadth-first tree. Here is the previous breadth-first tree as an in-tree:

[figure: the breadth-first tree represented as an in-tree]
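We can build such an in-tree by recording each node's predecessor at the moment the breadth-first search discovers it. Here is a sketch, assuming the queue operations used elsewhere in these notes; pred[i] = -1 means that i is the start node or has not been discovered yet:

procedure bfsTree(const g: graph; start: integer; var pred: array of integer);
var
  q: queue;
  i, j: integer;
begin
  for i := 0 to high(pred) do
    pred[i] := -1;
  init(q);
  enqueue(q, start);

  while not isEmpty(q) do
    begin
      i := dequeue(q);
      for j in g[i].adj do
        if (pred[j] = -1) and (j <> start) then
          begin
            pred[j] := i;    // i is j's predecessor in the breadth-first tree
            enqueue(q, j);
          end;
    end;
end;

Following the pred links from any node leads back to the start node along a shortest path (in reverse).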
Here is an implementation of isConnected using a breadth-first search:

function isConnected(const g: graph): boolean;
var
  visited: array of boolean;
  q: queue;
  i, j: integer;
  b: boolean;
begin
  setLength(visited, length(g));
  init(q);
  enqueue(q, 0);
  visited[0] := true;
  
  while not isEmpty(q) do
    begin
      i := dequeue(q);
      for j in g[i].adj do
        if not visited[j] then
          begin
            enqueue(q, j);
            visited[j] := true;
          end;
    end;

  for b in visited do
    if not b then
      exit(false);
  exit(true);
end;

Note that we must mark nodes as visited when we add them to the queue, not when we remove them. (If we marked them as visited only when removing them, then our algorithm could add the same node to the queue more than once.)

Like a depth-first search, a breadth-first search does a constant amount of work for each vertex and edge, so it also runs in time O(V + E).

non-recursive depth-first search

Suppose that we replace the queue in our preceding breadth-first search function with a stack:

function isConnected(const g: graph): boolean;
var
  visited: array of boolean;
  s: stack;
  i, j: integer;
  b: boolean;
begin
  setLength(visited, length(g));
  init(s);
  push(s, 0);
  visited[0] := true;
  
  while not isEmpty(s) do
    begin
      i := pop(s);
      for j in g[i].adj do
        if not visited[j] then
          begin
            push(s, j);
            visited[j] := true;
          end;
    end;

  for b in visited do
    if not b then
      exit(false);
  exit(true);
end;

The function now performs a depth-first search!

Specifically, this is a non-recursive depth-first search, or a depth-first search with an explicit stack.

This shows that there is a close relationship between stacks and depth-first search. Specifically, a stack is a LIFO (last in first out) data structure. And when we perform a depth-first search, the last frontier node we discover is the first that we will expand by following its edges. Similarly, a queue is a FIFO (first in first out) data structure, and in a breadth-first search the first frontier node we discover is the first that we will expand.

It is sometimes wise to implement a depth-first search non-recursively in a situation where the search may be very deep. This avoids running out of call stack space, which is limited on most operating systems.

example: shortest distance between nodes

Let's write a function that takes a graph and two vertex ids and returns an integer representing the length of the shortest path between the vertices. To do so, we will modify our breadth-first search implementation from above. Instead of the visited array, we will now keep an array of integers holding the distance from the start node to each node we encounter. If a node is undiscovered, its distance is MaxInt.

function distance(const g: graph; v, w: integer): integer;
var
  dist: array of integer;
  q: queue;
  i, j: integer;
begin
  setLength(dist, length(g));
  for i := 0 to high(g) do
    dist[i] := MaxInt;

  init(q);
  enqueue(q, v);
  dist[v] := 0;
  
  while not isEmpty(q) do
    begin
      i := dequeue(q);
      for j in g[i].adj do
        if dist[j] = MaxInt then
          begin
            enqueue(q, j);
            dist[j] := dist[i] + 1;
          end;
    end;

  exit(dist[w]);
end;
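Note that if w is unreachable from v, dist[w] is never updated and the function returns MaxInt. For example, we might use this together with readGraph from above (the filename europe.txt and the node names are hypothetical and depend on the input file):

var
  g: graph;
begin
  g := readGraph('europe.txt');
  writeln(distance(g, findNode(g, 'austria'), findNode(g, 'spain')));
end.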

example: maze solvability

We can use our graph search algorithms to solve many problems that have a graph-like structure, even if they are not explicitly problems about graphs.

For example, let's write a function to determine whether a maze is solvable. Specifically, we will write a function that takes an array of array of boolean representing a rectangular maze, where walls are represented by array elements that are true. The function will return a boolean indicating whether there is any path from the upper-left corner (0, 0) to the lower-right corner of the maze. We will assume that in each step we may move up, down, left, or right, but not diagonally.

We could solve this problem using either a depth-first or breadth-first search. We will use a depth-first search, implemented recursively.

We could convert the input array to a graph in adjacency-list representation, where each square in the maze is a separate graph vertex. But that is unnecessary and would be wastefully inefficient. Instead, we can implement a depth-first search directly on the maze itself. At each step, instead of iterating over an adjacency list we will iterate over the neighbors of the current maze square, i.e. the four squares that are above, below, to the left and to the right of the square. Notice how the constant arrays dx and dy let us easily loop over the four compass directions.

Also notice that we can conveniently implement valid as a nested function that can access the local variables width and height.

type
  maze = array of array of boolean;

function solvable(m: maze): boolean;

const
  dx: array[0..3] of integer = (1, -1, 0, 0);
  dy: array[0..3] of integer = (0, 0, 1, -1);
  
var
  width, height: integer;
  visited: array of array of boolean;
  
  function valid(x, y: integer): boolean;
  begin
    exit((0 <= x) and (x < width) and (0 <= y) and (y < height));
  end;
  
  procedure visit(x, y: integer);
  var
    dir, x1, y1: integer;
  begin
    visited[x, y] := true;
    for dir := 0 to 3 do
      begin
        x1 := x + dx[dir];
        y1 := y + dy[dir];
        if valid(x1, y1) and not m[x1, y1] and not visited[x1, y1] then
          visit(x1, y1);
      end;
  end;
  
begin
  width := length(m);
  height := length(m[0]);
  setLength(visited, width, height);
  if not m[0, 0] then
    visit(0, 0);
  exit(visited[width - 1, height - 1]);
end;
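For example, here is a small 3 x 3 maze (a sketch; it assumes that the search above skips wall squares):

var
  m: maze;
begin
  setLength(m, 3, 3);    // all squares open (false) initially
  m[1, 0] := true;       // wall
  m[1, 1] := true;       // wall
  writeln(solvable(m));  // TRUE: a path runs around through (1, 2)
  m[1, 2] := true;       // wall: the middle column is now fully blocked
  writeln(solvable(m));  // FALSE
end.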