Here's a function that computes the sum of all elements in an array:
function sum(const a: array of integer): integer;
var
  s: integer = 0;
  v: integer;
begin
  for v in a do
    s += v;
  exit(s);
end;
Alternatively, we could write the function's signature (the line declaring the function name, argument types and return type) like this:
type
  intArray = array of integer;
function sum2(const a: intArray): integer;
…
The functions sum and sum2 are not the same. In the declaration of sum, array of integer is an open array type, as we have seen in an earlier lecture. We can pass either a static or a dynamic array to an open array parameter. By contrast, we can pass only a dynamic array to sum2.
Now suppose that we have an array with several integers:
var
  a: array[1..10] of integer = (2, 4, 6, 3, 5, 7, 2, 4, 6, 8);
  i: integer;
A call to sum(a) will return the sum of all array elements. If we want the sum of only some of the elements, we can make a call of this form:
i := sum(a[3..6]); // now i = 6 + 3 + 5 + 7 = 21
Here, we are passing a partial array to the sum function. Any open array parameter can receive a partial array. We can construct a partial array from either a static array (as in this example) or a dynamic array.
Suppose that we add this line at the beginning of the sum function:
writeln('low = ', low(a), ', high = ', high(a));
When we call sum(a[3..6]), this will print
low = 0, high = 3
This shows that sum has its own view of the slice a[3..6] with different indices. Like dynamic arrays, open arrays are always indexed from 0.
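To see the re-indexing concretely, here is a minimal sketch of a complete program. It assumes Free Pascal in objfpc mode; the name sumSlice is hypothetical and simply combines the sum function with the writeln line shown above.

```pascal
{$mode objfpc}
program SliceDemo;

// Hypothetical helper: prints the index range that the open array
// parameter sees, then returns the sum of its elements.
function sumSlice(const a: array of integer): integer;
var
  v: integer;
begin
  writeln('low = ', low(a), ', high = ', high(a));
  result := 0;
  for v in a do
    result := result + v;
end;

var
  a: array[1..10] of integer = (2, 4, 6, 3, 5, 7, 2, 4, 6, 8);
  i: integer;
begin
  i := sumSlice(a);         // prints low = 0, high = 9
  writeln(i);               // prints 47
  i := sumSlice(a[3..6]);   // prints low = 0, high = 3
  writeln(i);               // prints 21
end.
```

Even though a is declared with indices 1..10, the function always sees indices starting at 0, both for the whole array and for the slice.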
We can use partial arrays to simplify some of the recursive functions we saw in the last lecture. Specifically, we can pass a partial array instead of passing extra integer arguments representing a range of indices.
For example, here is a recursive function that computes the sum of integers in an array:
function sum(const a: array of integer): integer;
begin
  if length(a) = 1 then
    exit(a[0]);
  exit(a[0] + sum(a[1 .. high(a)]));
end;
You might ask whether we can instead use an array of length 0 as the base case. That is possible, provided that we write the recursive case as follows:
function sum(const a: array of integer): integer;
begin
  if length(a) = 0 then
    exit(0);
  exit(sum(a[0 .. high(a) - 1]) + a[high(a)]);
end;
Unfortunately, however, with this base case our original recursive case will fail with a runtime error:
exit(a[0] + sum(a[1 .. high(a)]));
That's because Pascal's range checking will not allow the first index in a partial array a to be greater than high(a). In other words, if a is an array of length 1, the expression a[1 .. 0] will fail with a runtime error. However, a[0 .. -1] is OK and will yield an empty partial array. In my opinion this inconsistency is a bug in the language design.
Consider a bubble sort on an array of strings:
procedure swap(var s, t: string);
var
  u: string;
begin
  u := s;
  s := t;
  t := u;
end;
procedure sort(var a: array of string);
var
  i, j: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if a[j] > a[j + 1] then
        swap(a[j], a[j + 1]);
end;
Suppose we have this array of strings to sort:
var a: array[1..6] of string = ('sky', 'Fly', 'high', 'Why', 'ply', 'Try');
The > operator compares strings case-sensitively: 's' and 'S' are considered to be different characters. If we sort this array using the > operator, the result will be
('Fly', 'Try', 'Why', 'high', 'ply', 'sky')
That is because capital letters precede lowercase letters in ASCII encoding.
Suppose that we instead want to compare case-insensitively, so that the sort will yield
('Fly', 'high', 'ply', 'sky', 'Try', 'Why')
We can change the if statement above to
if lowerCase(a[j]) > lowerCase(a[j + 1]) then
We can similarly achieve any ordering we like merely by changing this comparison test.
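As a quick check of the case-insensitive ordering, here is a minimal sketch of a complete program, assuming Free Pascal in objfpc mode. The names sortNoCase and swapStrings are hypothetical; the swap helper is renamed only to avoid any confusion with the built-in swap function.

```pascal
{$mode objfpc}
program CaseSort;

// Hypothetical helper: exchanges two strings.
procedure swapStrings(var s, t: string);
var
  u: string;
begin
  u := s;
  s := t;
  t := u;
end;

// Bubble sort that compares lowercased copies of the strings,
// so that 'sky' and 'Sky' are treated as equal.
procedure sortNoCase(var a: array of string);
var
  i, j: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if lowerCase(a[j]) > lowerCase(a[j + 1]) then
        swapStrings(a[j], a[j + 1]);
end;

var
  a: array[1..6] of string = ('sky', 'Fly', 'high', 'Why', 'ply', 'Try');
  s: string;
begin
  sortNoCase(a);
  for s in a do
    write(s, ' ');   // prints: Fly high ply sky Try Why
  writeln;
end.
```

Only the comparison changes; the strings themselves keep their original capitalization.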
Let's now suppose that we want to sort an array of integers, putting the even integers in sorted order at the beginning, and the odd integers at the end. In other words, if we start with
(7, 6, 1, 5, 0, 2, 8, 3)
we would like the sort to yield
(0, 2, 6, 8, 1, 3, 5, 7)
We can achieve this by inventing a custom ordering of the integers that looks like this:
… -2, 0, 2, 4, 6, …, -3, -1, 1, 3, 5, …
Let's write a function greater that compares integers in this ordering:
// return true if i follows j in the ordering
// … -2, 0, 2, 4, …, -1, 1, 3, 5, …
function greater(i, j: integer): boolean;
begin
  // we test parity with odd() rather than "mod 2 = 1",
  // because -3 mod 2 is -1 in Pascal
  if odd(i) and not odd(j) then
    exit(true);    // i is odd, j is even
  if not odd(i) and odd(j) then
    exit(false);   // i is even, j is odd
  exit(i > j);     // ordinary integer comparison
end;
And we can now sort into the desired order by simply using a bubble sort on integers that uses this comparison:
if greater(a[j], a[j + 1]) then …
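Putting the pieces together, here is a minimal sketch of a complete program, assuming Free Pascal in objfpc mode. The procedure name sortEvensFirst and the sample data (which includes negative numbers) are made up for illustration. This version of greater tests parity with the built-in odd function, which classifies negative odd numbers correctly.

```pascal
{$mode objfpc}
program EvensFirst;

// True if i follows j in the ordering: all evens ascending, then all odds ascending.
function greater(i, j: integer): boolean;
begin
  if odd(i) and not odd(j) then
    exit(true);    // i is odd, j is even
  if not odd(i) and odd(j) then
    exit(false);   // i is even, j is odd
  exit(i > j);     // same parity: ordinary comparison
end;

// Bubble sort using the custom comparison above.
procedure sortEvensFirst(var a: array of integer);
var
  i, j, t: integer;
begin
  for i := high(a) - 1 downto 0 do
    for j := 0 to i do
      if greater(a[j], a[j + 1]) then
      begin
        t := a[j];
        a[j] := a[j + 1];
        a[j + 1] := t;
      end;
end;

var
  a: array[1..8] of integer = (7, 6, -3, 5, 0, 2, -8, 3);
  v: integer;
begin
  sortEvensFirst(a);
  for v in a do
    write(v, ' ');   // prints: -8 0 2 6 -3 3 5 7
  writeln;
end.
```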
Suppose that we have two arrays, each of which contains a sorted sequence of integers. For example:
a = (3, 5, 8, 10, 12) b = (6, 7, 11, 15, 18)
And suppose that we'd like to merge the numbers in these arrays into a single array c containing all of the numbers in sorted order.
Fortunately this is not difficult. We can use integer variables i and j to point to members of a and b, respectively. Initially i = j = 0. At each step of the merge, we compare a[i] and b[j]. If a[i] < b[j], we copy a[i] into the destination array, and increment i. Otherwise we copy b[j] and increment j. The entire process will run in linear time, i.e. in O(N) where N = length(a) + length(b).
Let's write a procedure to accomplish this task:
procedure merge(a, b: array of integer; var m: array of integer);
var
  i, j, k: integer;
begin
  i := 0;
  j := 0;
  for k := 0 to high(m) do
    if (j > high(b)) or
       ((i <= high(a)) and (j <= high(b)) and (a[i] < b[j])) then
      begin
        m[k] := a[i];
        i += 1;
      end
    else
      begin
        m[k] := b[j];
        j += 1;
      end
end;
The trickiest part of this code is the if condition. At each step of the merge, there are three possibilities:
(a) j is out of bounds, i.e. j > high(b). We want to take a[i].
(b) i is out of bounds, i.e. i > high(a). We want to take b[j].
(c) i and j are both in bounds, i.e. i ≤ high(a) and j ≤ high(b). We want to take a[i] only if a[i] < b[j].
The if condition encompasses possibilities (a) and (c).
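Note that the expression 'j <= high(b)' in the if condition is actually redundant. That's because boolean expressions are evaluated with short-circuiting (the default in Free Pascal and Delphi): if j > high(b), Pascal will never evaluate the expressions after the 'or'. So whenever those expressions are evaluated, we already know that j <= high(b), and we could actually rewrite the condition as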
if (j > high(b)) or ((i <= high(a)) and (a[i] < b[j])) then …
Now we can use merge to merge the arrays a and b mentioned above:
var
  a: array[1..5] of integer = (3, 5, 8, 10, 12);
  b: array[1..5] of integer = (6, 7, 11, 15, 18);
  c: array[1..10] of integer;
begin
  merge(a, b, c);
  …
We may also pass partial arrays to merge. Suppose that we have a single array a that includes two sorted segments:
var a: array[1..10] of integer = (3, 5, 8, 10, 12, 6, 7, 11, 15, 18);
We can merge a[1..5] and a[6..10] into c:
merge(a[1..5], a[6..10], c);
Can we even merge a[1..5] and a[6..10] back into the array a? At first it might appear that we cannot, because our merge algorithm does not work in place. For example, as we merge the two halves of the array above, it might seem that we will overwrite a[3] with the value 6 before we merge the value 8.
But actually our merge procedure will work even for merging back into the same array! That's because its two array parameters a and b are passed by value: they are not preceded by const or var. So merge will copy these arrays before its code begins to execute.
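The following minimal sketch tries this out. It assumes Free Pascal in objfpc mode with its default short-circuit boolean evaluation, and it uses the shortened if condition discussed earlier.

```pascal
{$mode objfpc}
program MergeInPlace;

// a and b are value parameters, so merge works on private copies of them;
// writing into m cannot disturb the data being read.
procedure merge(a, b: array of integer; var m: array of integer);
var
  i, j, k: integer;
begin
  i := 0;
  j := 0;
  for k := 0 to high(m) do
    if (j > high(b)) or ((i <= high(a)) and (a[i] < b[j])) then
    begin
      m[k] := a[i];
      i := i + 1;
    end
    else
    begin
      m[k] := b[j];
      j := j + 1;
    end;
end;

var
  a: array[1..10] of integer = (3, 5, 8, 10, 12, 6, 7, 11, 15, 18);
  v: integer;
begin
  merge(a[1..5], a[6..10], a);   // source slices and destination overlap
  for v in a do
    write(v, ' ');               // prints: 3 5 6 7 8 10 11 12 15 18
  writeln;
end.
```

If the first two parameters were declared const or var instead, the procedure would read from the same memory it is overwriting, and the result would generally be wrong.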
We now have a function that merges two sorted arrays. We can use this as the basis for implementing a general-purpose sorting algorithm called mergesort.
Mergesort has a simple recursive structure. To sort an array of n elements, it divides the array in two and recursively mergesorts each half. It then merges the two sorted subarrays into a single sorted array. This problem-solving approach is called divide and conquer.
For example, given an unsorted array, mergesort first splits it into two halves. It then sorts each half, recursively. Finally, it merges these two sorted arrays back into a single sorted array.
Here's an implementation of mergesort, using our merge procedure from above:
procedure mergesort(var a: array of integer);
var
  n: integer;
begin
  if length(a) <= 1 then
    exit;
  n := high(a) div 2;
  mergesort(a[0 .. n]);
  mergesort(a[n + 1 .. high(a)]);
  merge(a[0 .. n], a[n + 1 .. high(a)], a);
end;
Here is the complete set of recursive calls that mergesort will make when initially invoked on an array of size 8:
mergesort(a[0..7])
  mergesort(a[0..3])
    mergesort(a[0..1])
      mergesort(a[0..0])
      mergesort(a[1..1])
      merge(a[0..0], a[1..1], a[0..1])
    mergesort(a[2..3])
      mergesort(a[2..2])
      mergesort(a[3..3])
      merge(a[2..2], a[3..3], a[2..3])
    merge(a[0..1], a[2..3], a[0..3])
  mergesort(a[4..7])
    mergesort(a[4..5])
      mergesort(a[4..4])
      mergesort(a[5..5])
      merge(a[4..4], a[5..5], a[4..5])
    mergesort(a[6..7])
      mergesort(a[6..6])
      mergesort(a[7..7])
      merge(a[6..6], a[7..7], a[6..7])
    merge(a[4..5], a[6..7], a[4..7])
  merge(a[0..3], a[4..7], a[0..7])
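For readers who want to run the algorithm, here is a minimal sketch of a complete program. It assumes Free Pascal in objfpc mode, which (as in the procedures above) allows partial arrays to be passed to open array parameters. The 8-element sample array is made up for illustration.

```pascal
{$mode objfpc}
program MergesortDemo;

// Merges sorted arrays a and b into m. Because a and b are value
// parameters (copies), m may overlap the arrays they were taken from.
procedure merge(a, b: array of integer; var m: array of integer);
var
  i, j, k: integer;
begin
  i := 0;
  j := 0;
  for k := 0 to high(m) do
    if (j > high(b)) or ((i <= high(a)) and (a[i] < b[j])) then
    begin
      m[k] := a[i];
      i := i + 1;
    end
    else
    begin
      m[k] := b[j];
      j := j + 1;
    end;
end;

procedure mergesort(var a: array of integer);
var
  n: integer;
begin
  if length(a) <= 1 then
    exit;                                    // base case: already sorted
  n := high(a) div 2;
  mergesort(a[0 .. n]);                      // sort the left half in place
  mergesort(a[n + 1 .. high(a)]);            // sort the right half in place
  merge(a[0 .. n], a[n + 1 .. high(a)], a);  // merge the halves back into a
end;

var
  a: array[1..8] of integer = (7, 6, 1, 5, 0, 2, 8, 3);
  v: integer;
begin
  mergesort(a);
  for v in a do
    write(v, ' ');   // prints: 0 1 2 3 5 6 7 8
  writeln;
end.
```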
What is the running time of mergesort? The helper procedure merge runs in time O(N), where N is the length of the output array m. So the running time of mergesort follows the recurrence
T(N) = 2 ⋅ T(N / 2) + O(N)
We have not seen this recurrence before. As we have mentioned before, in this class we will not formally study how to solve recurrences such as this one. But its solution is
T(N) = O(N log N)
Intuitively, why does mergesort run in O(N log N)? First, notice that the function will recurse to a depth of log_2(N). For example, mergesort(a[0..7]) calls mergesort(a[0..3]), which calls mergesort(a[0..1]), which calls mergesort(a[0..0]). At each recursive call, the size of the array drops in half, so mergesort is called on an array with a single element, which is the base case, at a depth of log_2(N).
Now consider the work that is done at each recursion level. As visible in the call tree above, at the third recursion level we merge a[0..0] and a[1..1], as well as a[2..2] and a[3..3], and so on. Each array element is merged exactly once. At the second recursion level we merge a[0..1] and a[2..3], as well as a[4..5] and a[6..7]. Again, each element is merged once. Because the merges run in linear time, the total merging work at each level is O(N). So the total run time is O(log N) levels times O(N), or O(N log N).
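The same level-by-level argument can be written as an informal unrolling of the recurrence. This is only a sketch, not a formal proof: it assumes that N is a power of 2 and that merging N elements costs exactly cN for some constant c (both c and k are symbols introduced here for illustration).

```latex
\begin{align*}
T(N) &= 2\,T(N/2) + cN \\
     &= 4\,T(N/4) + 2cN \\
     &= 8\,T(N/8) + 3cN \\
     &\;\;\vdots \\
     &= 2^k\,T(N/2^k) + k\,cN \\
     &= N\,T(1) + cN \log_2 N \qquad \text{(taking } k = \log_2 N\text{)} \\
     &= O(N \log N)
\end{align*}
```

Each line adds one more level of merging work cN, and there are log_2 N levels before the subarrays shrink to a single element.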
For large N, O(N log N) grows much more slowly than O(N^2), so mergesort will be far faster than insertion sort or bubble sort. For example, suppose that we want to sort 1,000,000,000 numbers. And suppose (somewhat optimistically) that we can perform 1,000,000,000 operations per second. An insertion sort might take roughly N^2 = 1,000,000,000 * 1,000,000,000 operations, which will take 1,000,000,000 seconds, or about 32 years. A mergesort might take roughly N log_2 N ≈ 30,000,000,000 operations, which will take 30 seconds. This is a dramatic difference. :)
We are now ready to begin our study of data structures. The first sort of structure we will study is called a stack.
A stack is a data structure supporting the push and pop operations. push pushes a value onto a stack, and pop removes the value that was most recently pushed. This is like a stack of sheets of paper on a desk, where sheets can be added or removed at the top.
In other words, a stack is a last in first out data structure: the last element that was added is the first to be removed.
Here is an interface for a stack:
type stack = ...
procedure init(var s: stack);
procedure push(var s: stack; i: integer);
function pop(var s: stack): integer;
function isEmpty(s: stack): boolean;
This interface specifies a stack as an abstract data type. In other words, it specifies how a stack behaves, without specifying how it is implemented. And in fact several different implementations are possible, each with various performance characteristics.
Before we describe how to implement a stack, let's look at how one can be used. For example:
var
  s: stack;
  i: integer;
begin
  init(s);
  push(s, 4);
  push(s, 8);
  for i := 1 to 5 do
    push(s, i);
  while not isEmpty(s) do
    write(pop(s), ' ');
  writeln;
end;
This code will write
5 4 3 2 1 8 4
Here's a first attempt at implementing a stack with a dynamic array.
type
  stack = array of integer;
procedure init(var s: stack);
begin
  setLength(s, 0);
end;
procedure push(var s: stack; i: integer);
begin
  setLength(s, length(s) + 1);
  s[high(s)] := i;
end;
function pop(var s: stack): integer;
var
  k: integer;
begin
  k := s[high(s)];
  setLength(s, length(s) - 1);
  exit(k);
end;
function isEmpty(s: stack): boolean;
begin
  exit(length(s) = 0);
end;
The implementation is straightforward: the array contains all stack elements, with the top of the stack (i.e. the most recently pushed element) at the end of the array.
This stack will work fine. But now consider: what will be the running time of the following for loop, as a function of N?
var
  s: stack;
  i: integer;
begin
  init(s);
  for i := 1 to N do
    push(s, i);
The for loop will make N calls to push, which will in turn make N calls to setLength. We have not previously considered how long setLength might take to run. You might think it runs in constant time, but in fact setLength(a, n) runs in time O(n). In other words, setLength runs in time proportional to the length of the array that it is constructing. That is essentially because behind the scenes, a call to setLength will often create a new copy of the array that it is extending. (Just why that happens is related to memory allocation algorithms and is beyond the scope of this course.)
So, then: the for loop above will result in calls to
setLength(s, 1)
setLength(s, 2)
…
setLength(s, N)
This will take time
O(1 + 2 + 3 + … + N) = O(N^2)
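To see why this sum is quadratic, we can use the standard formula for the sum of the first N positive integers (the summation index k is introduced here just for the formula):

```latex
1 + 2 + 3 + \dots + N \;=\; \sum_{k=1}^{N} k \;=\; \frac{N(N+1)}{2} \;=\; \frac{N^2}{2} + \frac{N}{2} \;=\; O(N^2)
```

For example, with N = 1000 the sum is 1000 · 1001 / 2 = 500,500, which is about half of N^2 = 1,000,000.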
That's not ideal. How can we do better?
Actually we can modify our array-based stack implementation to be more efficient. Instead of extending the dynamic array by a single element on each push, we will double the size of the dynamic array when we need to increase it. In this new implementation we will represent a stack as follows:
type
  stack = record
    a: array of integer;
    count: integer;
  end;
In this record, a is a dynamic array. count is the number of elements of the array that are currently in use, i.e. currently hold stack elements. In other words, count is the current number of items on the stack, which are in the array elements a[0 .. (count - 1)]. All following elements in a are free. The push procedure will fill in a free array element if there are any; otherwise the array is full, and it will double the array size.
Here is the complete implementation:
procedure init(var s: stack);
begin
  setLength(s.a, 1);
  s.count := 0;
end;
procedure push(var s: stack; i: integer);
begin
  if length(s.a) = s.count then         // array is full
    setLength(s.a, length(s.a) * 2);    // so expand it
  s.a[s.count] := i;
  s.count += 1;
end;
function pop(var s: stack): integer;
var
  n: integer;
begin
  n := s.a[s.count - 1];
  s.count -= 1;
  exit(n);
end;
function isEmpty(s: stack): boolean;
begin
  exit(s.count = 0);
end;
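To watch the array grow, here is a minimal sketch of a complete program, assuming Free Pascal in objfpc mode. It repeats the record and the procedures in condensed form and prints count together with length(s.a) (called "capacity" in the output, a label chosen here for illustration) after each push.

```pascal
{$mode objfpc}
program StackDemo;

type
  stack = record
    a: array of integer;   // storage; may be longer than count
    count: integer;        // number of elements currently on the stack
  end;

procedure init(var s: stack);
begin
  setLength(s.a, 1);
  s.count := 0;
end;

procedure push(var s: stack; i: integer);
begin
  if length(s.a) = s.count then          // array is full
    setLength(s.a, length(s.a) * 2);     // so double it
  s.a[s.count] := i;
  s.count := s.count + 1;
end;

function pop(var s: stack): integer;
begin
  s.count := s.count - 1;
  result := s.a[s.count];
end;

var
  s: stack;
  i: integer;
begin
  init(s);
  for i := 1 to 5 do
  begin
    push(s, i);
    // capacities printed in turn: 1, 2, 4, 4, 8
    writeln('count = ', s.count, ', capacity = ', length(s.a));
  end;
  while s.count > 0 do
    write(pop(s), ' ');   // prints: 5 4 3 2 1
  writeln;
end.
```

Only the second, third and fifth pushes trigger a call to setLength; the other pushes simply fill a free element.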
In this updated implementation, how long will this loop take to run, as a function of N?
init(s);
for i := 1 to N do
  push(s, i);
Suppose that N is a power of 2. As the loop runs, we will make the following calls to setLength:
setLength(s.a, 1)   // during init()
setLength(s.a, 2)
setLength(s.a, 4)
setLength(s.a, 8)
…
setLength(s.a, N)
The total running time will be
O(1 + 2 + 4 + 8 + … + N)
How large is this? Let N = 2^b. Then we can use the formula for the sum of a geometric series. Recall that if the first term of a geometric series is a_1, the series has n terms and each term is r times the previous term, then its sum is
a_1 (1 - r^n) / (1 - r)
So we have
1 + 2 + 4 + 8 + … + 2^b = 1 (1 - 2^(b + 1)) / (1 - 2) = 2^(b + 1) - 1 = 2N - 1 = O(N)
In other words,
1 + 2 + 4 + 8 + … + N = O(N)
You should remember this important fact.
So with our new stack implementation we can push N values in O(N) time. That's a dramatic improvement over our previous implementation. Notice, however, that some pushes will take longer than others. In particular, a single push takes O(N) in the worst case, where N is the current number of items on the stack, because it may need to double the array. However, push operations take O(1) on average, since we can perform N of them in O(N). This kind of averaged bound is called amortized O(1) time.
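To compare the two growth strategies numerically, here is a minimal sketch, assuming Free Pascal in objfpc mode. It does not allocate any arrays. It only adds up the lengths passed to setLength, under the cost model used above in which setLength(a, n) costs about n steps. The function names are hypothetical.

```pascal
{$mode objfpc}
program GrowthCost;

// Total cost of pushing n items when the array grows by one element
// per push: setLength is called with lengths 1, 2, ..., n.
function costGrowByOne(n: integer): int64;
var
  len: integer;
begin
  result := 0;
  for len := 1 to n do
    result := result + len;
end;

// Total cost of pushing n items when the array doubles whenever it is
// full: setLength is called with lengths 1 (in init), 2, 4, 8, ...
function costDoubling(n: integer): int64;
var
  cap: integer;
begin
  cap := 1;
  result := cap;
  while cap < n do
  begin
    cap := cap * 2;
    result := result + cap;
  end;
end;

begin
  writeln(costGrowByOne(1024));   // prints 524800, i.e. 1024 * 1025 / 2
  writeln(costDoubling(1024));    // prints 2047, i.e. 2 * 1024 - 1
end.
```

For N = 1024 the doubling strategy does about 2N units of copying work in total, compared with about N^2 / 2 for the grow-by-one strategy.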
So now we might ask: can we implement a stack in some other way that lets us push in O(1) even in the worst case? The answer is yes, but to do that we will need to use pointers and dynamic memory allocation, which we will cover in the next lecture.