Programming I, 2018-9
Lecture 8 – Notes

partial arrays

Here's a function that computes the sum of all elements in an array:

function sum(const a: array of integer): integer;
var
  s: integer = 0;
  v: integer;
begin
  for v in a do
    s += v;

  exit(s);
end;

Alternatively, we could write the function's signature (the line declaring the function name, argument types and return type) like this:

type
  intArray = array of integer;

function sum2(const a: intArray): integer;
…

The functions sum and sum2 do not accept the same arguments. In the declaration of sum, array of integer is an open array type, as we saw in an earlier lecture. We can pass either a static or a dynamic array to an open array parameter. By contrast, we can pass only a dynamic array to sum2, because its parameter has the dynamic array type intArray.
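To see the difference in practice, here is a small sketch of a complete program. The names staticArr and dynArr are made up for this illustration, and the loop bodies use s := s + v so that the program does not depend on the C-style operator setting.

```pascal
program openarraydemo;
{$mode objfpc}

type
  intArray = array of integer;

// Open array parameter: accepts both static and dynamic arrays.
function sum(const a: array of integer): integer;
var
  s: integer = 0;
  v: integer;
begin
  for v in a do
    s := s + v;
  exit(s);
end;

// Dynamic array parameter: accepts only values of type intArray.
function sum2(const a: intArray): integer;
var
  s: integer = 0;
  v: integer;
begin
  for v in a do
    s := s + v;
  exit(s);
end;

var
  staticArr: array[1..3] of integer = (1, 2, 3);  // hypothetical static array
  dynArr: intArray;                               // hypothetical dynamic array

begin
  setLength(dynArr, 2);
  dynArr[0] := 10;
  dynArr[1] := 20;

  writeln(sum(staticArr));   // static array -> open array parameter
  writeln(sum(dynArr));      // dynamic array -> open array parameter
  writeln(sum2(dynArr));     // dynamic array -> dynamic array parameter
  // sum2(staticArr) is rejected by the compiler: staticArr is not an intArray
end.
```

The three calls print 6, 30 and 30.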

Now suppose that we have an array with several integers:

var
  a: array[1..10] of integer = (2, 4, 6, 3, 5, 7, 2, 4, 6, 8);
  i: integer;

A call to sum(a) will return the sum of all array elements. If we want the sum of only some of the elements, we can make a call of this form:

i := sum(a[3..6]);   // now i = 6 + 3 + 5 + 7 = 21

Here, we are passing a partial array to the sum function. Any open array parameter can receive a partial array. We can construct a partial array from either a static array (as in this example) or a dynamic array.

Suppose that we add this line at the beginning of the sum function:

writeln('low = ', low(a), ', high = ', high(a));

When we call sum(a[3..6]), this will print

low = 0, high = 3

This shows that sum has its own view of the slice a[3..6] with different indices. Like dynamic arrays, open arrays are always indexed from 0.
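Putting the pieces above together, a complete program might look like the following sketch. It calls sum once on the whole array and once on the slice, storing each result in i before printing it so that the two writeln calls do not interleave.

```pascal
program partialdemo;
{$mode objfpc}

// The sum function from above, with the extra writeln that reports
// the index range that the function itself sees.
function sum(const a: array of integer): integer;
var
  s: integer = 0;
  v: integer;
begin
  writeln('low = ', low(a), ', high = ', high(a));
  for v in a do
    s := s + v;
  exit(s);
end;

var
  a: array[1..10] of integer = (2, 4, 6, 3, 5, 7, 2, 4, 6, 8);
  i: integer;

begin
  i := sum(a);          // whole array: the function sees indices 0..9
  writeln(i);           // 47

  i := sum(a[3..6]);    // partial array: the function sees indices 0..3
  writeln(i);           // 6 + 3 + 5 + 7 = 21
end.
```

Even though a is declared with indices 1..10, the function never sees those indices: both calls report low = 0.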

recursion with partial arrays

We can use partial arrays to simplify some of the recursive functions we saw in the last lecture. Specifically, we can pass a partial array instead of passing extra integer arguments representing a range of indices.

For example, here is a recursive function that computes the sum of integers in an array:

function sum(const a: array of integer): integer;
begin
  if length(a) = 1 then exit(a[0]);
  
  exit(a[0] + sum(a[1 .. high(a)]));
end;

You might ask whether we can instead use an array of length 0 as the base case. That is possible, provided that we write the recursive case as follows:

function sum(const a: array of integer): integer;
begin
  if length(a) = 0 then exit(0);
  
  exit(sum(a[0 .. high(a) - 1]) + a[high(a)]);
end;

Unfortunately, however, with this base case our original recursive case will fail with a runtime error:

exit(a[0] + sum(a[1 .. high(a)]));

That's because Pascal's range checking will not allow the first index in a partial array a to be greater than high(a). In other words, if a is an array of length 1, the expression a[1 .. 0] will fail with a runtime error. However, a[0 .. -1] is OK and will yield an empty partial array. In my opinion this inconsistency is a bug in the language design.
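If we want to keep the "first element plus the rest" recursive case and still accept an empty array, one possible workaround is to use both base cases, so that the slice a[1 .. 0] is never formed. This is only a sketch, not something from the lecture; the variable name empty is made up for the example.

```pascal
program sumdemo;
{$mode objfpc}

function sum(const a: array of integer): integer;
begin
  if length(a) = 0 then exit(0);      // empty array: nothing to add
  if length(a) = 1 then exit(a[0]);   // stop before a[1 .. 0] would be formed

  // here length(a) >= 2, so 1 <= high(a) and the slice is valid
  exit(a[0] + sum(a[1 .. high(a)]));
end;

var
  a: array[1..10] of integer = (2, 4, 6, 3, 5, 7, 2, 4, 6, 8);
  empty: array of integer;   // hypothetical empty dynamic array

begin
  setLength(empty, 0);
  writeln(sum(empty));     // 0
  writeln(sum(a[3..6]));   // 21
  writeln(sum(a));         // 47
end.
```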

sorting in any order

Consider a bubble sort on an array of strings:

procedure swap(var s, t: string);
var
  u: string;
begin
  u := s;
  s := t;
  t := u;
end;

procedure sort(var a: array of string);
var
  i, j: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if a[j] > a[j + 1] then
        swap(a[j], a[j + 1]);
end;

Suppose we have this array of strings to sort:

var
  a: array[1..6] of string = ('sky', 'Fly', 'high', 'Why', 'ply', 'Try');

The > operator compares strings case-sensitively: 's' and 'S' are considered to be different characters. If we sort this array using the > operator, the result will be

('Fly', 'Try', 'Why', 'high', 'ply', 'sky')

That is because capital letters precede lowercase letters in ASCII encoding.

Suppose that we instead want to compare case-insensitively, so that the sort will yield

('Fly', 'high', 'ply', 'sky', 'Try', 'Why')

We can change the if statement above to

if lowerCase(a[j]) > lowerCase(a[j + 1]) then

We can similarly achieve any ordering we like merely by changing this comparison test.
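For reference, here is a sketch of the case-insensitive sort as a complete program. The procedure names swapStrings and sortNoCase are made up for this example (the swap procedure is renamed only to avoid any confusion with the built-in swap function).

```pascal
program sortnocase;
{$mode objfpc}

procedure swapStrings(var s, t: string);
var
  u: string;
begin
  u := s;
  s := t;
  t := u;
end;

// Bubble sort that compares the lowercase forms of the strings,
// so 'sky' and 'Sky' are treated as equal for ordering purposes.
procedure sortNoCase(var a: array of string);
var
  i, j: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if lowerCase(a[j]) > lowerCase(a[j + 1]) then
        swapStrings(a[j], a[j + 1]);
end;

var
  a: array[1..6] of string = ('sky', 'Fly', 'high', 'Why', 'ply', 'Try');
  s: string;

begin
  sortNoCase(a);
  for s in a do
    write(s, ' ');
  writeln;
end.
```

This prints Fly high ply sky Try Why. Note that only the comparison uses the lowercase forms; the strings stored in the array keep their original capitalization.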

Let's now suppose that we want to sort an array of integers, putting the even integers in sorted order at the beginning, and the odd integers at the end. In other words, if we start with

(7, 6, 1, 5, 0, 2, 8, 3)

we would like the sort to yield

(0, 2, 6, 8, 1, 3, 5, 7)

We can achieve this by inventing a custom ordering of the integers that looks like this:

… -2, 0, 2, 4, 6, …, -3, -1, 1, 3, 5, …

Let's write a function greater that compares integers in this ordering:

// return true if i follows j in the ordering
//     … -2, 0, 2, 4, …, -1, 1, 3, 5, … 
function greater(i, j: integer): boolean;
begin
  // odd() also works for negative numbers (in Pascal, -3 mod 2 = -1, not 1)
  if odd(i) and not odd(j) then
    exit(true);  // i is odd, j is even

  if not odd(i) and odd(j) then
    exit(false); // i is even, j is odd
    
  exit(i > j);     // ordinary integer comparison
end;

And we can now sort into the desired order by simply using a bubble sort on integers that uses this comparison:

if greater(a[j], a[j + 1]) then
  …
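As a sketch, here is the whole thing as one program. The procedure name sortEvensFirst is made up for the example, and this version of greater tests parity with the built-in odd function, which classifies negative numbers correctly.

```pascal
program evenodd;
{$mode objfpc}

// return true if i follows j in the ordering
//     … -2, 0, 2, 4, …, -1, 1, 3, 5, …
function greater(i, j: integer): boolean;
begin
  if odd(i) and not odd(j) then
    exit(true);    // i is odd, j is even

  if not odd(i) and odd(j) then
    exit(false);   // i is even, j is odd

  exit(i > j);     // same parity: ordinary integer comparison
end;

// Bubble sort using greater as the comparison test.
procedure sortEvensFirst(var a: array of integer);
var
  i, j, t: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if greater(a[j], a[j + 1]) then
        begin
          t := a[j];
          a[j] := a[j + 1];
          a[j + 1] := t;
        end;
end;

var
  a: array[1..8] of integer = (7, 6, 1, 5, 0, 2, 8, 3);
  v: integer;

begin
  sortEvensFirst(a);
  for v in a do
    write(v, ' ');
  writeln;
end.
```

This prints 0 2 6 8 1 3 5 7, matching the desired result above.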

merging sorted arrays

Suppose that we have two arrays, each of which contains a sorted sequence of integers. For example:

a = (3, 5, 8, 10, 12)
b = (6, 7, 11, 15, 18)

And suppose that we'd like to merge the numbers in these arrays into a single array c containing all of the numbers in sorted order.

Fortunately this is not difficult. We can use integer variables i and j to point to members of a and b, respectively. Initially i = j = 0. At each step of the merge, we compare a[i] and b[j]. If a[i] < b[j], we copy a[i] into the destination array, and increment i. Otherwise we copy b[j] and increment j. The entire process will run in linear time, i.e. in O(N) where N = length(a) + length(b).

Let's write a procedure to accomplish this task:

procedure merge(a, b: array of integer; var m: array of integer);
var
  i, j, k: integer;
begin
  i := 0;
  j := 0;
  for k := 0 to high(m) do
    if (j > high(b)) or
       ((i <= high(a)) and (j <= high(b)) and (a[i] < b[j])) then
      begin
        m[k] := a[i];
        i += 1;
      end
    else
      begin
        m[k] := b[j];
        j += 1;
      end
end;

The trickiest part of this code is the if condition. At each step of the merge, there are three possibilities:

(a) j is out of bounds, i.e. j > high(b). We want to take a[i].
(b) i is out of bounds, i.e. i > high(a). We want to take b[j].
(c) i and j are both in bounds, i.e. i ≤ high(a) and j ≤ high(b). We want to take a[i] only if a[i] < b[j].

The if condition encompasses possibilities (a) and (c).

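Note that the expression 'j <= high(b)' in the if condition is actually redundant. That's because Free Pascal by default uses short-circuit evaluation of boolean expressions: if j > high(b), the condition is already known to be true, so Pascal will never evaluate the expressions after the 'or'. So we could actually rewrite the condition as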

if (j > high(b)) or
    ((i <= high(a)) and (a[i] < b[j])) then
    …

Now we can use merge to merge the arrays a and b mentioned above:

var
  a: array[1..5] of integer = (3, 5, 8, 10, 12);
  b: array[1..5] of integer = (6, 7, 11, 15, 18);
  c: array[1..10] of integer;

begin
  merge(a, b, c);
  …

We may also pass partial arrays to merge. Suppose that we have a single array a that includes two sorted segments:

var
  a: array[1..10] of integer = (3, 5, 8, 10, 12, 6, 7, 11, 15, 18);

We can merge a[1..5] and a[6..10] into c:

merge(a[1..5], a[6..10], c);

Can we even merge a[1..5] and a[6..10] back into the array a? At first it might appear that we cannot, because our merge algorithm does not work in place. For example, as we merge the two halves of the array above, it might seem that we will overwrite a[3] with the value 6 before we merge the value 8.

But actually our merge function will work even for merging back into the same array! That's because its two array parameters a and b are passed by value – they are not preceded with const or var. So merge will copy these arrays before its code begins to execute.

mergesort

We now have a function that merges two sorted arrays. We can use this as the basis for implementing a general-purpose sorting algorithm called mergesort.

Mergesort has a simple recursive structure. To sort an array of n elements, it divides the array in two and recursively mergesorts each half. It then merges the two sorted subarrays into a single sorted array. This problem-solving approach is called divide and conquer.

For example, given an array to sort, mergesort first splits the array into two halves. It then sorts each half, recursively. Finally, it merges these two sorted arrays back into a single sorted array.

Here's an implementation of mergesort, using our merge procedure from above:

procedure mergesort(var a: array of integer);
var
  n: integer;
begin
  if length(a) <= 1 then exit;
  
  n := high(a) div 2;
  mergesort(a[0 .. n]);
  mergesort(a[n + 1 .. high(a)]);
  merge(a[0 .. n], a[n + 1 .. high(a)], a);
end;

Here is the complete set of recursive calls that mergesort will make when initially invoked on an array of size 8:

mergesort(a[0..7])
    mergesort(a[0..3])
        mergesort(a[0..1])
            mergesort(a[0..0])
            mergesort(a[1..1])
            merge(a[0..0], a[1..1], a[0..1])
        mergesort(a[2..3])
            mergesort(a[2..2])
            mergesort(a[3..3])
            merge(a[2..2], a[3..3], a[2..3])
        merge(a[0..1], a[2..3], a[0..3])
    mergesort(a[4..7])
        mergesort(a[4..5])
            mergesort(a[4..4])
            mergesort(a[5..5])
            merge(a[4..4], a[5..5], a[4..5])
        mergesort(a[6..7])
            mergesort(a[6..6])
            mergesort(a[7..7])
            merge(a[6..6], a[7..7], a[6..7])
        merge(a[4..5], a[6..7], a[4..7])
    merge(a[0..3], a[4..7], a[0..7])
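To try this out, we can combine merge and mergesort into one program. The following is a sketch: the procedures are the ones given above (with the shorter form of the if condition), and the sample array is just an arbitrary array of size 8 chosen for illustration.

```pascal
program mergesortdemo;
{$mode objfpc}

// a and b are passed by value, so merge works on copies of them.
// That is what allows m to overlap the arrays being merged.
procedure merge(a, b: array of integer; var m: array of integer);
var
  i, j, k: integer;
begin
  i := 0;
  j := 0;
  for k := 0 to high(m) do
    if (j > high(b)) or
       ((i <= high(a)) and (a[i] < b[j])) then
      begin
        m[k] := a[i];
        i := i + 1;
      end
    else
      begin
        m[k] := b[j];
        j := j + 1;
      end;
end;

procedure mergesort(var a: array of integer);
var
  n: integer;
begin
  if length(a) <= 1 then exit;   // 0 or 1 elements: already sorted

  n := high(a) div 2;            // last index of the left half
  mergesort(a[0 .. n]);
  mergesort(a[n + 1 .. high(a)]);
  merge(a[0 .. n], a[n + 1 .. high(a)], a);
end;

var
  a: array[0..7] of integer = (7, 6, 1, 5, 0, 2, 8, 3);
  v: integer;

begin
  mergesort(a);
  for v in a do
    write(v, ' ');
  writeln;
end.
```

With the sample array above, the program should print 0 1 2 3 5 6 7 8, and the recursive calls it makes are exactly those in the call tree.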

What is the running time of mergesort? The helper procedure merge runs in time O(N), where N is the length of its output array m. So the running time of mergesort follows the recurrence

T(N) = 2 ⋅ T(N / 2) + O(N)

We have not seen this recurrence before. As we have mentioned before, in this class we will not formally study how to solve recurrences such as this one. But its solution is

T(N) = O(N log N)

Intuitively, why does mergesort run in O(N log N)? First, notice that the function will recurse to a depth of log₂(N). For example, mergesort(a[0..7]) calls mergesort(a[0..3]), which calls mergesort(a[0..1]), which calls mergesort(a[0..0]). At each recursive call, the size of the array drops in half, so mergesort is called on an array with a single element, which is the base case, at a depth of log₂(N).

Now consider the work that is done at each recursion level. As visible in the call tree above, at the third recursion level we merge a[0..0] and a[1..1], as well as a[2..2] and a[3..3], and so on. Each array element is merged exactly once. At the second recursion level we merge a[0..1] and a[2..3], as well as a[4..5] and a[6..7]. Again, each element is merged once. Because the merges run in linear time, the total merging work at each level is O(N). So the total run time is O(log N) levels times O(N), or O(N log N).
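The same argument can be written as an informal expansion of the recurrence. This is only a sketch, not a formal proof: it assumes that N is a power of 2 and that each merge of N elements costs at most cN for some constant c (the constant c is introduced here just for the illustration).

```latex
\begin{align*}
T(N) &= 2\,T(N/2) + cN \\
     &= 4\,T(N/4) + 2cN \\
     &= 8\,T(N/8) + 3cN \\
     &\;\;\vdots \\
     &= 2^k\,T(N/2^k) + k\,cN \\
     &= N\,T(1) + cN \log_2 N \qquad \text{(taking } k = \log_2 N\text{)} \\
     &= O(N \log N)
\end{align*}
```

Each line adds one more level of merging work cN, and there are log₂(N) levels before we reach arrays of size 1.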

For large N, a function in O(N log N) grows much more slowly than one in O(N²), so mergesort will be far faster than insertion sort or bubble sort. For example, suppose that we want to sort 1,000,000,000 numbers. And suppose (somewhat optimistically) that we can perform 1,000,000,000 operations per second. An insertion sort might take roughly N² = 1,000,000,000 * 1,000,000,000 operations, which will take 1,000,000,000 seconds, or about 32 years. A mergesort might take roughly N log N ≈ 30,000,000,000 operations, which will take 30 seconds. This is a dramatic difference. :)

stacks

We are now ready to begin our study of data structures. The first sort of structure we will study is called a stack.

A stack is a data structure supporting the push and pop operations. push pushes a value onto a stack, and pop removes the value that was most recently pushed. This is like a stack of sheets of paper on a desk, where sheets can be added or removed at the top.

In other words, a stack is a last-in, first-out (LIFO) data structure: the last element that was added is the first to be removed.

Here is an interface for a stack:

type stack = ...
procedure init(var s: stack);
procedure push(var s: stack; i: integer);
function pop(var s: stack): integer;
function isEmpty(s: stack): boolean;

This interface specifies a stack as an abstract data type. In other words, it specifies how a stack behaves, without specifying how it is implemented. And in fact several different implementations are possible, each with various performance characteristics.

Before we describe how to implement a stack, let's look at how one can be used. For example:

var
  s: stack;
  i: integer;

begin
  init(s);
  
  push(s, 4);
  push(s, 8);
  for i := 1 to 5 do
    push(s, i);
  
  while not isEmpty(s) do
    write(pop(s), ' ');
  writeln;
end;

This code will write

5 4 3 2 1 8 4

implementing a stack with an array

Here's a first attempt at implementing a stack with a dynamic array.

type
  stack = array of integer;

procedure init(var s: stack);
begin
  setLength(s, 0);
end;

procedure push(var s: stack; i: integer);
begin
  setLength(s, length(s) + 1);
  s[high(s)] := i;
end;

function pop(var s: stack): integer;
var
  k: integer;
begin
  k := s[high(s)];
  setLength(s, length(s) - 1);
  exit(k);
end;

function isEmpty(s: stack): boolean;
begin
  exit(length(s) = 0);
end;

The implementation is straightforward: the array contains all stack elements, with the top of the stack (i.e. the most recently pushed element) at the end of the array.

This stack will work fine. But now consider: what will be the running time of the following for loop, as a function of N?

var
  s: stack;
  i: integer;

begin
  init(s);
  for i := 1 to N do
    push(s, i);

The for loop will make N calls to push, which will in turn make N calls to setLength. We have not previously considered how long setLength might take to run. You might think it runs in constant time, but in fact setLength(a, n) runs in time O(n). In other words, setLength runs in time proportional to the length of the array that it is constructing. That is essentially because behind the scenes, a call to setLength will often create a new copy of the array that it is extending. (Just why that happens is related to memory allocation algorithms and is beyond the scope of this course.)

So, then: the for loop above will result in calls to

setLength(s, 1)
setLength(s, 2)
…
setLength(s, N)

This will take time

O(1 + 2 + 3 + … + N) = O(N²)
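In case the last step is not obvious, it follows from the usual formula for the sum of the first N positive integers:

```latex
\[
1 + 2 + 3 + \dots + N \;=\; \sum_{k=1}^{N} k \;=\; \frac{N(N+1)}{2} \;=\; \frac{N^2}{2} + \frac{N}{2} \;=\; O(N^2)
\]
```

For example, with N = 1000 the sum is 1000 ⋅ 1001 / 2 = 500,500, which is about half of N² = 1,000,000.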

That's not ideal. How can we do better?

Actually we can modify our array-based stack implementation to be more efficient. Instead of extending the dynamic array by a single element on each push, we will double the size of the dynamic array when we need to increase it. In this new implementation we will represent a stack as follows:

type
  stack = record
    a: array of integer;
    count: integer;
  end;

In this record, a is a dynamic array. count is the number of elements of the array that are currently in use, i.e. currently hold stack elements. In other words, count is the current number of items on the stack, which are in the array elements a[0 .. (count - 1)]. All following elements in a are free. The push procedure will fill in a free array element if there are any; otherwise the array is full, and it will double the array size.

Here is the complete implementation:

procedure init(var s: stack);
begin
  setLength(s.a, 1);
  s.count := 0;
end;

procedure push(var s: stack; i: integer);
begin
  if length(s.a) = s.count then       // array is full
    setLength(s.a, length(s.a) * 2);  // so expand it

  s.a[s.count] := i;
  s.count += 1;
end;

function pop(var s: stack): integer;
var
  n: integer;
begin
  n := s.a[s.count - 1];
  s.count -= 1;
  exit(n);
end;

function isEmpty(s: stack): boolean;
begin
  exit(s.count = 0);
end;

In this updated implementation, how long will this loop take to run, as a function of N?

init(s);
for i := 1 to N do
  push(s, i);

Suppose that N is a power of 2. As the loop runs, we will make the following calls to setLength:

setLength(s.a, 1)  // during init()
setLength(s.a, 2)
setLength(s.a, 4)
setLength(s.a, 8)
…
setLength(s.a, N)

The total running time will be

O(1 + 2 + 4 + 8 + … + N)

How large is this? Let N = 2^b. Then we can use the formula for the sum of a geometric series. Recall that if the first term of a geometric series is a₁, the series has n terms and each term is r times the previous term, then its sum is

a₁(1 – rⁿ) / (1 – r)

So we have

1 + 2 + 4 + 8 + … + 2^b = 1 ⋅ (1 – 2^{b + 1}) / (1 – 2) = 2^{b + 1} – 1 = 2N – 1 = O(N)

In other words,

1 + 2 + 4 + 8 + … + N = O(N)

You should remember this important fact.

So with our new stack implementation we can push N values in O(N) time. That's a dramatic improvement over our previous implementation. Notice, however, that some pushes will take longer than others. In particular, a single push takes O(N) in the worst case, where N is the current number of items on the stack. However, push operations take O(1) on average (this is called amortized O(1) time), since we can perform N of them in O(N).
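To make the comparison concrete, here is a sketch of a program that does not build any stack at all. It only adds up the lengths passed to setLength under each strategy, assuming (as above) that setLength(a, n) costs n units. The function names costGrowByOne and costDoubling are made up for this illustration.

```pascal
program resizecost;
{$mode objfpc}

// First implementation: pushing n items calls
// setLength(s, 1), setLength(s, 2), ..., setLength(s, n).
function costGrowByOne(n: integer): int64;
var
  len: integer;
  total: int64 = 0;
begin
  for len := 1 to n do
    total := total + len;
  exit(total);
end;

// Doubling implementation: mirrors init and push of the record-based stack.
function costDoubling(n: integer): int64;
var
  i, len, count: integer;
  total: int64;
begin
  len := 1;
  total := 1;                    // setLength(s.a, 1) in init
  count := 0;
  for i := 1 to n do
    begin
      if len = count then        // array is full
        begin
          len := len * 2;        // so expand it
          total := total + len;
        end;
      count := count + 1;
    end;
  exit(total);
end;

begin
  writeln(costGrowByOne(1024));  // 1 + 2 + ... + 1024 = 524800
  writeln(costDoubling(1024));   // 1 + 2 + 4 + ... + 1024 = 2047
end.
```

For N = 1024 the first strategy costs N(N + 1)/2 = 524800 units, while doubling costs only 2N – 1 = 2047.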

So now we might ask: can we implement a stack in some other way that lets us push in O(1) even in the worst case? The answer is yes, but to do that we will need to use pointers and dynamic memory allocation, which we will cover in the next lecture.
