When you declare a variable or field of any integer type, you can specify a range of values that it may hold:
    type
      time = record
        hours: 0 .. 23;
        minutes: 0 .. 59;
        seconds: 0 .. 59;
      end;

    var
      t: time;
      i: 10 .. 15;
If you have enabled range checking with the {$r+} directive, then Pascal will check that a variable's value actually falls within the given range. For example:
    begin
      t.hours := 30;  // RUNTIME ERROR
      …
The assignment above results in a runtime error since 30 is out of range.
Ranges work for characters, too:
var c: 'a' .. 'z';
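As a minimal sketch of how range types behave in a complete program, the example below fills a time record from a number of seconds since midnight. The helper fromSeconds is hypothetical (it does not appear in the notes), and it assumes a non-negative argument, so the div/mod arithmetic keeps every field inside its declared range and no range check fails.

```pascal
program RangeDemo;
{$R+}  // enable range checking so out-of-range assignments are detected

type
  time = record
    hours: 0 .. 23;
    minutes: 0 .. 59;
    seconds: 0 .. 59;
  end;

// Hypothetical helper: convert seconds since midnight (total >= 0) to a time.
function fromSeconds(total: integer): time;
var
  t: time;
begin
  t.hours := (total div 3600) mod 24;   // always in 0 .. 23
  t.minutes := (total mod 3600) div 60; // always in 0 .. 59
  t.seconds := total mod 60;            // always in 0 .. 59
  exit(t);
end;

var
  t: time;
begin
  t := fromSeconds(3725);
  writeln(t.hours, ':', t.minutes, ':', t.seconds);   // prints 1:2:5
end.
```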
In Pascal you may define enumerated types, which have a fixed number of constant values:
    type
      suit = (clubs, diamonds, hearts, spades);  // an enumerated type

      card = record
        r: 2 .. 14;  // 11 = Jack, 12 = Queen, 13 = King, 14 = Ace
        s: suit;
      end;

    var
      c: card;

    begin
      c.r := 8;
      c.s := diamonds;
      …
Notice that each of an enumerated type's values (e.g. diamonds) is available as a constant anywhere in the program.
In this example, we could alternatively use a string (e.g. 'diamonds') or an integer to represent a suit. But when there are only a small number of possible values, an enumerated type is better than a string because it lets the compiler check that values are correct. For example, if we used strings to represent suits and we mistakenly typed
c.s := 'diammonds' // misspelled
then the compiler would not complain, and we'd have a bug in our program. Conversely, with the enumerated type, the line
c.s := diammonds // misspelled
will give a compiler error since the name 'diammonds' is unknown.
If we used an arbitrary integer to represent a suit (e.g. 1 = clubs, 2 = diamonds) then our code would be hard to read: in a line such as
c.s := 4;
it would not be obvious which suit is represented by the number 4. Arbitrary constants embedded in code such as these are called magic numbers and are generally a poor coding practice. Again, an enumerated type is often a better choice here.
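Because an enumerated type is an ordinal type, its values also have a fixed order. The sketch below uses the suit type from above and relies on standard Pascal features that these notes have not discussed (a for loop over an enumerated type, the ord function, and comparison of enumerated values), so treat it as an illustration rather than part of the lecture material.

```pascal
program SuitDemo;

type
  suit = (clubs, diamonds, hearts, spades);

var
  s: suit;
  n: integer;
begin
  // A for loop can visit every value of the type in declaration order.
  n := 0;
  for s := clubs to spades do
    n := n + 1;
  writeln(n);                // prints 4

  // ord gives the position of a value within its type, counting from 0.
  writeln(ord(diamonds));    // prints 1

  // Values are ordered by their position in the declaration.
  writeln(clubs < spades);   // prints TRUE
end.
```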
A case statement uses a given value to choose which of a set of statements to execute:
    var
      c: integer;
    begin
      ...
      readln(c);
      case c of
        2..10: writeln(c);
        11: writeln('jack');
        12: writeln('queen');
        13: writeln('king');
        14:
          begin
            writeln('ace');
            aces := aces + 1;
          end
        else
          writeln('unknown card');
      end;  // end case
Notice that each label is either a single value (e.g. 12) or a range of values (e.g. 2..10).
The else clause in a case statement is optional. If it is not present, and the supplied value does not match any case labels, then the entire case statement is skipped.
You must use begin/end to wrap multiple statements in any case group except the else group, in which begin/end are optional.
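case statements work with values of any ordinal type, e.g. integers, characters or enumerated types. In standard Pascal they do not work with strings (unfortunately), although some compilers, such as recent versions of Free Pascal, accept string labels as an extension. They also cannot be used with reals.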
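The following sketch shows case statements over two other ordinal types: characters (with range labels) and the enumerated type suit. The helper names classify and isRed are made up for this illustration. Note that isRed has no else clause, so for clubs and spades the case statement is skipped and the initial value false is returned.

```pascal
program CaseDemo;

type
  suit = (clubs, diamonds, hearts, spades);

// Hypothetical helper: classify a character using range labels.
function classify(ch: char): string;
begin
  case ch of
    'a' .. 'z': exit('lowercase letter');
    '0' .. '9': exit('digit');
    else exit('other');
  end;
end;

// Hypothetical helper: is the suit red? With no else clause, any value
// that matches no label simply skips the whole case statement.
function isRed(s: suit): boolean;
begin
  isRed := false;
  case s of
    diamonds: isRed := true;
    hearts: isRed := true;
  end;
end;

begin
  writeln(classify('q'));   // prints lowercase letter
  writeln(classify('7'));   // prints digit
  writeln(classify('?'));   // prints other
  writeln(isRed(hearts));   // prints TRUE
  writeln(isRed(clubs));    // prints FALSE
end.
```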
For integers a and b, the greatest common divisor of a and b, also written as gcd(a, b), is the largest integer that evenly divides both a and b. For example:
gcd(20, 6) = 2
gcd(12, 18) = 6
gcd(35, 15) = 5
The greatest common divisor is useful in various situations, such as simplifying fractions. For example, suppose that we want to simplify 35/15. gcd(35, 15) = 5, so we can divide both the numerator and denominator by 5, yielding 7/3.
Naively, we can find the gcd of two values by trial division:
    // Note: min comes from the Math unit, so the program needs "uses math".
    function gcd(a, b: integer): integer;
    var
      i: integer;
    begin
      for i := min(a, b) downto 2 do
        if (a mod i = 0) and (b mod i = 0) then
          exit(i);
      exit(1);
    end;
But this may take time O(N), where N = min(a, b). We'd like to be more efficient, especially since in some situations (e.g. cryptography) we may want to take the gcd of numbers that are very large.
Another way to find the greatest common divisor of two numbers is to factor the numbers into primes, then look for common primes. For example, consider finding gcd(252, 90). 252 = 9 * 28 = 2^2 * 3^2 * 7^1. And 90 = 9 * 10 = 2^1 * 3^2 * 5^1. So the common primes are 2^1 * 3^2 = 18, and we have gcd(252, 90) = 18.
Euclid’s algorithm is a much more efficient way to find the gcd of two numbers than prime factorization. It is based on the fact that for all positive integers a and b, gcd(a, b) = gcd(b, a mod b). (We will not prove this here, though the proof is not difficult.) So, for example,
gcd(252, 90)
= gcd(90, 72)
= gcd(72, 18)
= gcd(18, 0)
= 18
We can implement Euclid's algorithm in Pascal like this:
    function gcd(a, b: integer): integer;
    var
      c: integer;
    begin
      while b <> 0 do
        begin
          c := a mod b;
          a := b;
          b := c;
        end;
      exit(a);
    end;
In all the examples of gcd(a, b) above we had a > b. But note that this function works even when a < b! For example, if we call gcd(5, 15), then on the first iteration of the while loop we have c = 5 mod 15 = 5. So then we assign a := 15 and b := 5, and further iterations continue as if we had called gcd(15, 5).
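To try this out, here is a small self-contained program built around the same loop. It checks the gcd(252, 90) example, shows the a < b case, and simplifies the fraction 35/15 discussed earlier. The variable names num, den and g are chosen only for this sketch.

```pascal
program GcdDemo;

// Euclid's algorithm, written iteratively as in the text.
function gcd(a, b: integer): integer;
var
  c: integer;
begin
  while b <> 0 do
    begin
      c := a mod b;
      a := b;
      b := c;
    end;
  exit(a);
end;

var
  num, den, g: integer;
begin
  writeln(gcd(252, 90));   // prints 18
  writeln(gcd(5, 15));     // prints 5: the first iteration swaps the arguments

  // Simplify 35/15 by dividing numerator and denominator by their gcd.
  num := 35;
  den := 15;
  g := gcd(num, den);
  writeln(num div g, '/', den div g);   // prints 7/3
end.
```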
Suppose that we call gcd(x, y), with x > y. What is the running time of our gcd function as a function of N = x? To answer this, first note that for any integers a, b with a > b > 0, we have (a mod b) < a / 2. For if 0 < b ≤ a / 2, then (a mod b) ≤ b – 1 < a / 2. Or if a / 2 < b < a, then a div b = 1, so (a mod b) = a – b < a – a / 2 = a / 2.
Now suppose that we call gcd(x, y) and the function runs for at least a few iterations. Then
before iteration 1: a = x, b = y
before iteration 2: a = y, b = (x mod y)
before iteration 3: a = (x mod y), b = …
In two iterations the first argument has dropped from x to (x mod y) < x / 2. So after every two iterations of the algorithm, the first argument will be reduced by at least a factor of 2. This shows that the algorithm cannot iterate more than 2 · log₂(x) times. In other words, it runs in time O(log N), since N = x.
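One way to get a feel for this bound is to count loop iterations directly. The sketch below is not from the notes: gcdSteps and bound are hypothetical helpers, and the bound 2 · log₂(x) is computed from natural logarithms with the standard ln function. For the two sample calls the loop runs 3 and 9 times, well under the bound in each case.

```pascal
program GcdSteps;

// Hypothetical helper: run Euclid's algorithm and return the number of
// loop iterations instead of the gcd itself.
function gcdSteps(a, b: integer): integer;
var
  c, steps: integer;
begin
  steps := 0;
  while b <> 0 do
    begin
      c := a mod b;
      a := b;
      b := c;
      steps := steps + 1;
    end;
  exit(steps);
end;

// The bound from the text, 2 * log2(x), computed via natural logarithms.
function bound(x: integer): real;
begin
  exit(2 * ln(x) / ln(2));
end;

begin
  // 252, 90 -> 90, 72 -> 72, 18 -> 18, 0: three iterations
  writeln(gcdSteps(252, 90), ' iterations, bound ', bound(252):0:2);
  // 89 and 55 take nine iterations
  writeln(gcdSteps(89, 55), ' iterations, bound ', bound(89):0:2);
end.
```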
We can rewrite our implementation of Euclid's algorithm more compactly using recursion:
    function gcd(a, b: integer): integer;
    begin
      if b = 0 then exit(a);     // base case
      exit(gcd(b, a mod b));     // recursive case
    end;
A recursive function calls itself. This is the first time we have seen recursion in this course. Recursion is a powerful technique that can help us solve many problems. We will use it often in this course and in Programming II.
Whenever we write a recursive function, there is a base case and a recursive case.
The base case is an instance that we can solve immediately. In the function above, the base case is when b = 0. A recursive function must always have a base case – otherwise it would loop forever since it would always call itself.
In the recursive case, a function calls itself recursively, passing it a smaller instance of the given problem. Then it uses the return value from the recursive call to construct a value that it itself can return. In the recursive case in this example, we call gcd(b, a mod b) and return the value that it returns.
We have seen that we can write Euclid's algorithm either iteratively (i.e. using loops) or recursively. In theory, any function can be written either iteratively or recursively. We will see that for some problems a recursive solution is easy and an iterative solution would be quite difficult. Conversely, some problems are easier to solve iteratively. Pascal lets us write functions either way. (By the way, in purely functional languages such as Haskell there are no loops or iteration, so you must always use recursion. But that is a topic for another course.)
Broadly speaking, we will see that "easy" recursive functions such as gcd call themselves only once, and it would be straightforward to write them either iteratively or recursively. For today we will become familiar with recursion by considering only these "easy" functions. In the next few lectures we will see recursive functions that call themselves two or more times. Those functions will let us solve more difficult tasks that we could not easily solve iteratively.
For now, here is another example, a recursive procedure:
    procedure hi(x: integer);
    begin
      if x = 0 then
        begin
          writeln('hi');
          exit;
        end;
      writeln('start ', x);
      hi(x - 1);
      writeln('done ', x);
    end;
If we call hi(3), the output will be
    start 3
    start 2
    start 1
    hi
    done 1
    done 2
    done 3
Be sure you understand why the lines beginning with 'done' are printed. hi(3) calls hi(2), which calls hi(1), which calls hi(0). At the moment that hi(0) runs, all of these function invocations are active and are present in memory on the call stack:
hi(3) → hi(2) → hi(1) → hi(0)
Each function invocation has its own value of the parameter x. (If this procedure had local variables, each invocation would have a separate set of variable values as well.)
When hi(0) returns, it does not exit from this entire set of calls. It returns to its caller, i.e. hi(1). hi(1) now resumes execution and writes 'done 1'. Then it returns to hi(2), which writes 'done 2', and so on.
Here is another recursive function:
    function sum(n: integer): integer;
    begin
      if n = 0 then exit(0);
      exit(n + sum(n - 1));
    end;
What does this function do? Suppose that we call sum(3). It will call sum(2), which calls sum(1), which calls sum(0). The call stack now looks like this:
sum(3) → sum(2) → sum(1) → sum(0)
Now
sum(0) returns 0 to its caller sum(1).
sum(1) takes this value 0, adds n = 1 to it and returns 1 to sum(2).
sum(2) takes this value 1, adds n = 2 to it and returns 3 to sum(3).
sum(3) takes this value 3, adds n = 3 to it and returns 6.
We see that given any n, the function returns the sum 1 + 2 + 3 + … + n.
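As a quick check of this claim, the sketch below prints sum(n) next to the well-known closed form n(n + 1)/2 for a few small values. It assumes n ≥ 0; with a negative argument the recursion would never reach the base case.

```pascal
program SumDemo;

function sum(n: integer): integer;
begin
  if n = 0 then exit(0);       // base case
  exit(n + sum(n - 1));        // recursive case
end;

var
  n: integer;
begin
  // Both columns should agree on every line; the last line shows 15 twice.
  for n := 0 to 5 do
    writeln('sum(', n, ') = ', sum(n), ', n(n+1)/2 = ', n * (n + 1) div 2);
end.
```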
We were given this function and had to figure out what it does. But more often we will go in the other direction: given some problem, we'd like to write a recursive function to solve it. How can we do that?
Here is some general advice. To write any recursive function, first look for base case(s) where the function can return immediately. (As we will soon see, a function may sometimes have more than one base case.) Now you need to write the recursive case, where the function calls itself. At this point you may wish to pretend that the function "already works". Write the recursive call and believe that it will return a correct solution to a subproblem, i.e. a smaller instance of the problem. Now you must somehow transform that subproblem solution into a solution to the entire problem, and return it. This is really the key step: understanding the recursive structure of the problem, i.e. how a solution can be derived from a subproblem solution.
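To illustrate this advice on a small problem, here is a hypothetical example that is not part of the lecture: summing the decimal digits of a non-negative integer. The base case is n = 0. In the recursive case we trust that digitSum(n div 10) correctly handles n without its last digit, and then add that last digit.

```pascal
program DigitSumDemo;

// Hypothetical example: the sum of the decimal digits of n, for n >= 0.
function digitSum(n: integer): integer;
begin
  if n = 0 then exit(0);                  // base case: no digits left
  exit(digitSum(n div 10) + n mod 10);    // subproblem: n without its last digit
end;

begin
  writeln(digitSum(1234));   // prints 10
  writeln(digitSum(7));      // prints 7
  writeln(digitSum(0));      // prints 0
end.
```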
Let's go through some more examples of recursive functions.
We will write a recursive function to return the sum of all elements in an array of integers:
function sum(const a: array of integer): integer;
We can't write this function directly recursively, since it can't call itself and pass a subarray of the array a. (Actually there is a way to do that in Pascal, though we haven't learned about that yet.) So instead we must generalize the problem into a form that we can solve recursively. (This is a very common step in constructing a recursive solution.) Here is a more general function:
    // Return the sum of a[l … r], i.e. a[l] + a[l + 1] + … + a[r]
    function sum1(const a: array of integer; l, r: integer): integer;
Certainly if we can write sum1 then we can trivially write sum by calling sum1:
    function sum(const a: array of integer): integer;
    begin
      exit(sum1(a, 0, high(a)));
    end;
So now it remains only to write sum1. We can do so recursively:
    // Return the sum of a[l … r], i.e. a[l] + a[l + 1] + … + a[r]
    function sum1(const a: array of integer; l, r: integer): integer;
    begin
      if r < l then exit(0);  // base case: empty set of elements
      exit(a[l] + sum1(a, l + 1, r));
    end;
The base case here is when r < l, i.e. the set a[l … r] is empty because there are no values i with l ≤ i ≤ r. Alternatively, we could use this base case:
if l = r then exit(a[l]); // base case: only 1 element
However in recursion it is usually better to use a subproblem of size 0 as a base case when possible. This is both because subproblems of size 0 do actually occur (e.g. an empty array, the empty string), and also because a subproblem of size 0 often has an especially trivial solution.
In the recursive case in this function, we first invoke sum1(a, l + 1, r). When writing the function, we pretend that the recursive function "already works" and will return the correct sum of the elements a[(l + 1) … r]. We then need only add the value a[l] to obtain the entire sum we want.
We could alternatively write the recursive case as
    exit(sum1(a, l, r - 1) + a[r]);
This would compute the same result, but adding the elements in the opposite direction.
Finally, note that we didn't need to make sum1 quite this general. We could have added only the parameter r, for example:
    // Return the sum of a[0 … r], i.e. a[0] + a[1] + … + a[r]
    function sum1(const a: array of integer; r: integer): integer;
    begin
      if r < 0 then exit(0);  // base case: empty set of elements
      exit(sum1(a, r - 1) + a[r]);
    end;
Similarly, we could have added only the parameter l, and computed the sum of a[l … high(a)].
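Putting the pieces together, a complete program using the two-index version might look like the sketch below. The array data and its contents are invented for the example. Inside sum and sum1 the open array parameter is indexed from 0 to high(a), whatever bounds the actual array was declared with.

```pascal
program ArraySum;

// Return the sum of a[l .. r], i.e. a[l] + a[l + 1] + ... + a[r].
function sum1(const a: array of integer; l, r: integer): integer;
begin
  if r < l then exit(0);              // base case: empty set of elements
  exit(a[l] + sum1(a, l + 1, r));     // a[l] plus the sum of the rest
end;

// Return the sum of all elements of a.
function sum(const a: array of integer): integer;
begin
  exit(sum1(a, 0, high(a)));
end;

const
  data: array[0 .. 4] of integer = (3, 1, 4, 1, 5);   // sample data for this sketch
begin
  writeln(sum(data));          // prints 14
  writeln(sum1(data, 1, 3));   // prints 6, i.e. 1 + 4 + 1
  writeln(sum1(data, 2, 1));   // prints 0: the range is empty
end.
```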
Here is the first example from today's lab session: a recursive procedure that prints the word 'orange' n times.
    procedure orange(n: integer);
    begin
      if n = 0 then exit;  // base case
      writeln('orange');
      orange(n - 1);
    end;
Here's a recursive function that returns true if its argument n is a power of 2.
    function isPowOfTwo(n: integer): boolean;
    begin
      if n = 1 then exit(true);  // base case
      // another base case; the n < 1 test prevents endless recursion when n = 0
      if (n < 1) or (n mod 2 <> 0) then exit(false);
      exit(isPowOfTwo(n div 2));
    end;
Note the two base cases here.
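A short test program for this function is sketched below. The version here includes an explicit n < 1 test in the second base case, an addition made for this sketch: without it, calling the function with 0 would recurse forever, since 0 mod 2 = 0 and 0 div 2 = 0.

```pascal
program PowOfTwoDemo;

function isPowOfTwo(n: integer): boolean;
begin
  if n = 1 then exit(true);                        // base case: 1 = 2^0
  if (n < 1) or (n mod 2 <> 0) then exit(false);   // base case: not positive, or odd
  exit(isPowOfTwo(n div 2));                       // n is even: halve it and retry
end;

begin
  writeln(isPowOfTwo(1));    // prints TRUE
  writeln(isPowOfTwo(8));    // prints TRUE:  8 -> 4 -> 2 -> 1
  writeln(isPowOfTwo(12));   // prints FALSE: 12 -> 6 -> 3, and 3 is odd
  writeln(isPowOfTwo(0));    // prints FALSE
end.
```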
Let's write a recursive function that counts the number of times that an integer k occurs in a given array:
function count(const a: array of integer; k: integer): integer;
As in a previous example, we need to generalize this to be able to write it recursively:
    // Return the number of occurrences of k in the elements a[l … r].
    function count1(const a: array of integer; k: integer; l, r: integer): integer;
    var
      c: integer;
    begin
      if r < l then exit(0);
      c := count1(a, k, l + 1, r);
      if a[l] = k then exit(c + 1) else exit(c);
    end;
And now count can simply call count1:
    function count(const a: array of integer; k: integer): integer;
    begin
      exit(count1(a, k, 0, high(a)));
    end;
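For completeness, here is a sketch of both functions in a runnable program. The sample array data is invented for the example.

```pascal
program CountDemo;

// Return the number of occurrences of k in the elements a[l .. r].
function count1(const a: array of integer; k: integer; l, r: integer): integer;
var
  c: integer;
begin
  if r < l then exit(0);           // base case: empty range
  c := count1(a, k, l + 1, r);     // occurrences in a[l + 1 .. r]
  if a[l] = k then exit(c + 1) else exit(c);
end;

// Return the number of occurrences of k in the whole array.
function count(const a: array of integer; k: integer): integer;
begin
  exit(count1(a, k, 0, high(a)));
end;

const
  data: array[0 .. 5] of integer = (2, 7, 2, 9, 2, 7);   // sample data for this sketch
begin
  writeln(count(data, 2));   // prints 3
  writeln(count(data, 7));   // prints 2
  writeln(count(data, 5));   // prints 0
end.
```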