Programming I, 2018-9
Lecture 9 – Notes

units

In Free Pascal, a unit is a reusable module of code. After you have written a unit, you can use it easily from any other source file that you write.

For example, last week we saw how to implement a stack using an array. Let's put our array-based stack type into a unit. We create a file array_stack.pas that looks like this:

unit array_stack;   // unit name must match filename!

interface

type
  stack = array of integer;

procedure init(var s: stack);
procedure push(var s: stack; i: integer);
function pop(var s: stack): integer;
function isEmpty(s: stack): boolean;

implementation

procedure init(var s: stack);
begin
  setLength(s, 0);
end;

procedure push(var s: stack; i: integer);
begin
  setLength(s, length(s) + 1);
  s[high(s)] := i;
end;

function pop(var s: stack): integer;
var
  k: integer;
begin
  k := s[high(s)];
  setLength(s, length(s) - 1);
  exit(k);
end;

function isEmpty(s: stack): boolean;
begin
  exit(length(s) = 0);
end;

end.

As you can see above, a unit begins with a unit declaration at the top of the file specifying the name of the unit. It must be the same as the source file name without the '.pas' extension.

The interface section declares types, procedures and functions that the unit will export. Procedures and functions declared in this section must be implemented in the following implementation section.

A unit ends with the end keyword followed by a period.

Now suppose that we are writing a program abc.pas. It can use the array_stack unit we just wrote:

// abc.pas

uses array_stack;

var
  s: stack;
  i: integer;

begin
  init(s);
  for i := 1 to 10 do 
    push(s, i);
  while not isEmpty(s) do
    writeln(pop(s));
end.

For this to work, you must place abc.pas and array_stack.pas in the same directory.

A program that uses a unit may call only the procedures and functions declared in the interface section. Any other procedures and functions in the implementation section are private to the unit.

As we learn about more data structures in the remainder of this course, you may wish to create units that implement these structures. Then you can easily use those structures in other programs that you write. Note that when you submit a program to ReCodEx you may upload multiple source files. So you can even use units in your ReCodEx programs; simply upload the necessary units along with your top-level program.

pointers

In Pascal and many other languages, a pointer is a special kind of value that points to another value in memory. In other words, a pointer is an indirect reference to a value.

We will use pointers to build various kinds of linked data structures: linked lists, binary trees, expression trees and so on. These structures will be useful for many purposes.

You can make a pointer to any kind of value, but in this course we will only use pointers to records. Here is a record type pos and a pointer variable:

type
  pos = record
    x, y: integer;
  end;
  
var
  p: ^pos;

The type ^pos means a pointer to a pos. By the way, we usually pronounce ^ as "hat" since the symbol ^ looks something like a hat.

Initially the value of p is undefined. We can call the built-in procedure new to dynamically allocate a pos that p will point to:

begin
  new(p);

Now p points to a record of type pos. The values x and y inside that record have undefined values since we haven't stored anything there yet. Let's set those values now:

p^.x := 4;
p^.y := 5;

The expression p^ means the value that p points to. p^ is the record that we dynamically allocated. p^.x is the field x inside that record.

Suppose that we have a second pointer variable q:

var
  q: ^pos;

We can now assign

q := p;

Now q points to the same record that p points to. An assignment between two pointers always makes them point to the same value.

We can set the x field through the pointer q:

q^.x := 6;
writeln(p^.x);  // writes 6

The change to x is also visible through the pointer p. That's because p and q are pointing to the same record.
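As a minimal sketch of the difference between copying a pointer and copying a record, the following complete program combines the steps above. The record variable r is an addition for illustration and does not appear in the notes: assigning p^ to r copies the fields, whereas assigning p to q only copies the pointer.

```pascal
program alias_demo;

type
  pos = record
    x, y: integer;
  end;

var
  p, q: ^pos;
  r: pos;      // an ordinary record variable, not a pointer (illustrative)

begin
  new(p);
  p^.x := 4;
  p^.y := 5;

  q := p;      // pointer assignment: q and p now share one record
  r := p^;     // record assignment: r receives a copy of the fields

  q^.x := 6;
  writeln(p^.x);   // writes 6, since p and q point to the same record
  writeln(r.x);    // writes 4, since r is an independent copy

  dispose(p);      // release the dynamically allocated record
end.
```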

When we are finished using a dynamically allocated value, we can free it using the dispose procedure:

dispose(p);

This releases the record's memory so that it can be reused by later allocations. You must never use a value after you have disposed it:

dispose(p);
p^.x := 7;    // BAD – may crash or have unpredictable effects

If two pointers point to the same value and you invoke dispose on one of them, a subsequent access through the other pointer is also invalid:

q := p;
dispose(p);
q^.x := 7;    // BAD – may crash or have unpredictable effects

That's because dispose frees the single object that both pointers point to.

You may give a pointer the special value nil:

p := nil;

nil is a pointer to nothing. This is actually a useful concept (sort of like the empty set, or the number 0), and we will use nil frequently in building linked data structures.

Any reference through nil will crash the program:

p := nil;
writeln(p^.x);  // CRASH

In this situation the program will die with a runtime error. (In many languages this is the dreaded null pointer exception.) When you write code that uses pointers, you must take care to ensure that a nil pointer runtime error can never occur.
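One common way to avoid such errors is to test a pointer against nil before dereferencing it, and to set a pointer to nil once its record has been disposed. The program below is only a sketch of that habit, not a rule that the notes prescribe:

```pascal
program nil_demo;

type
  pos = record
    x, y: integer;
  end;

var
  p: ^pos;

begin
  new(p);
  p^.x := 4;
  p^.y := 5;

  if p <> nil then
    writeln(p^.x);       // writes 4

  dispose(p);
  p := nil;              // p no longer refers to the freed record

  if p <> nil then
    writeln(p^.x)
  else
    writeln('p is nil'); // this branch runs
end.
```

Note that setting p to nil does not help if some other pointer still refers to the disposed record.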

calling functions with pointers

You can pass a pointer to a function, and a function can return a pointer. To do this, however, you must declare a name for the pointer type. This will not compile:

procedure abc(p: ^pos);  // COMPILER ERROR - ^ is not allowed in signature 
 
function xyz(i: integer): ^pos;  // COMPILER ERROR - ^ is not allowed in signature 

Instead, you need to do this:

type
  ppos = ^pos;  // a pointer to a pos

procedure abc(p: ppos);

function xyz(i: integer): ppos;

Here's a procedure that takes a pointer to a pos and increments each of the record's components:

procedure incr(p: ppos);
begin
  p^.x += 1;
  p^.y += 1;
end;

Here's a function that takes an integer i and returns a pointer to a dynamically allocated pos whose fields are both i:

function make(i: integer): ppos;

var
  p: ^pos;
begin
  new(p);
  p^.x := i;
  p^.y := i;
  exit(p);
end;

We could call this function like this:

var
  p: ^pos;
begin
  p := make(4);
  writeln(p^.x);  // writes 4

Like other function parameters, a parameter of pointer type is passed by value by default. That means that the function receives a local copy of the pointer. A change to the local copy will not be visible in the caller. For example:

procedure abc(p: ppos);
begin
  p^.x := 4;
  p := nil;
end;

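var
  p: ^pos;

begin
  new(p);
  p^.x := 3;
  p^.y := 3;
  abc(p);
  writeln(p^.x);  // writes 4

In the code above, the assignment 'p := nil' only affects the local copy of p inside abc, and not the outer variable p in the main begin/end block. The assignment 'p^.x := 4', on the other hand, changes the record that both copies of the pointer refer to, so the caller does see that change.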

If you precede a parameter with the var keyword, it will be passed by reference, and a change to its value in the function will be seen in the caller. For example, let's modify the declaration of procedure abc above as follows:

procedure abc(var p: ppos);

Now the assignment 'p := nil' will affect the outer p. And so the sequence

  abc(p);
  writeln(p^.x);

will now crash, because when the procedure returns p is nil, and a reference through a nil pointer yields a runtime error.
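The two parameter-passing modes can be compared side by side. In this sketch the procedure names byValue and byReference and the variable saved are made up for illustration. Both procedures modify the record through the pointer, but only the second one can change the caller's pointer itself.

```pascal
program param_demo;

type
  pos = record
    x, y: integer;
  end;
  ppos = ^pos;

procedure byValue(p: ppos);
begin
  p^.x := 4;   // changes the shared record
  p := nil;    // changes only the local copy of the pointer
end;

procedure byReference(var p: ppos);
begin
  p^.x := 5;   // changes the shared record
  p := nil;    // changes the caller's pointer
end;

var
  p, saved: ppos;

begin
  new(p);
  p^.x := 3;
  p^.y := 3;
  saved := p;            // keep a second pointer so we can free the record later

  byValue(p);
  writeln(p^.x);         // writes 4
  writeln(p = nil);      // writes FALSE

  byReference(p);
  writeln(p = nil);      // writes TRUE
  writeln(saved^.x);     // writes 5

  dispose(saved);
end.
```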

linked lists

We can use pointers to build a useful data structure called a linked list, which looks like this:

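[Figure: a linked list of three nodes holding the values 2, 4 and 7; each node points to the next one, and the last node points to nil]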

Like an array, a linked list can hold a sequence of elements (integers in this case). But it performs quite differently from an array. We can access the jth element of an array in constant time for any j, but inserting or deleting an element at the beginning of an array or in the middle takes time O(N), where N is the length of the array. Conversely, accessing the jth element of a linked list takes time O(j), but inserting or deleting a node takes only O(1) once we have a pointer to the place where the change is made.

An element of a linked list is called a node. A node contains one or more values, plus a pointer to the next node in the list. The first node of a linked list is called its head. The last node of a linked list is its tail. The tail's next pointer is always nil.

By the way, we will sometimes illustrate a linked list more compactly:

2 → 4 → 7 → nil

The two pictures above denote the same structure; the first is simply more detailed.

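Here is a node type for a linked list that holds integers, together with a named pointer type pnode that we will use in function signatures:

type
  pnode = ^node;  // a pointer to a node

  node = record
    i: integer;
    next: ^node;
  end;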

We can build the 3-element linked list pictured above as follows:

var
  p, q, r: ^node;
  
begin
  new(r);
  r^.i := 7;
  r^.next := nil;
  
  new(q);
  q^.i := 4;
  q^.next := r;
  
  new(p);
  p^.i := 2;
  p^.next := q;

Now p points to the head of the list. In general we refer to a linked list using a pointer to its head.

iterating over a list

We often want to iterate over all elements in a list. To do this, we start with a pointer p to the head of the list, and advance at each step like this:

p := p^.next;

Here is a function that iterates over a linked list of integers and computes the sum of all elements:

function sum(list: pnode): integer;
var
  p: ^node;
  s: integer = 0;
begin
  p := list;
  while p <> nil do
    begin
      s += p^.i;
      p := p^.next;
    end;

  exit(s);
end;

If p points to the 3-element list that we built above, then we can now call

writeln(sum(p));   // writes 13

Or we can call

writeln(sum(nil));    // writes 0

This last call works because nil is a linked list. It is the empty list, i.e. a list with 0 elements.
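For reference, here is a sketch of a complete program that puts the pieces above together. It assumes that pnode is a named type for a pointer to a node, declared in the same type block as node.

```pascal
program sum_demo;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Iterates over the list and adds up the values in its nodes.
function sum(list: pnode): integer;
var
  p: pnode;
  s: integer;
begin
  s := 0;
  p := list;
  while p <> nil do
    begin
      s := s + p^.i;
      p := p^.next;
    end;
  exit(s);
end;

var
  p, q, r: pnode;

begin
  // build the list 2 -> 4 -> 7 -> nil, starting from the tail
  new(r); r^.i := 7; r^.next := nil;
  new(q); q^.i := 4; q^.next := r;
  new(p); p^.i := 2; p^.next := q;

  writeln(sum(p));     // writes 13
  writeln(sum(nil));   // writes 0
end.
```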

building a list by prepending

Above we saw code that builds a fixed 3-element list. Of course, we usually want to build a list using a loop that works with any number of nodes.

One way to build a list is by prepending nodes. Recall that to prepend means to add at the beginning. For example, if we prepend the character 'p' to 'ear' the result is 'pear'.

Suppose that we want to build a list with the numbers 1 through 10 in order. We start with nil, which is the empty list. We allocate a node with the value 10 and prepend it to nil, yielding a list with one node. Now we allocate a node with value 9 and prepend it, yielding a list with the values 9 and 10. And so on.

Here is a function that builds a linked list of the integers 1 through n by prepending:

function sequence(n: integer): pnode;
var
  head, p: ^node;
  i: integer;
begin
  head := nil;
  
  for i := n downto 1 do
    begin
      new(p);
      p^.i := i;
      p^.next := head;  // prepend p to the list
      head := p;        // now p is the head of the list
    end;
  
  exit(head);
end;
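The prepending step can also be factored out into a procedure of its own. The sketch below uses two hypothetical helpers that are not part of the notes: prepend, which takes the head pointer by reference so that it can update it, and printList, which writes a list in the compact arrow notation.

```pascal
program prepend_demo;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Hypothetical helper: adds a node holding i at the front of the list.
procedure prepend(var head: pnode; i: integer);
var
  p: pnode;
begin
  new(p);
  p^.i := i;
  p^.next := head;   // the new node points to the old head
  head := p;         // the new node becomes the head
end;

// Hypothetical helper: writes the list in the form 1 -> 2 -> nil.
procedure printList(p: pnode);
begin
  while p <> nil do
    begin
      write(p^.i, ' -> ');
      p := p^.next;
    end;
  writeln('nil');
end;

var
  head: pnode;
  i: integer;

begin
  head := nil;
  for i := 5 downto 1 do
    prepend(head, i);
  printList(head);   // writes 1 -> 2 -> 3 -> 4 -> 5 -> nil
  printList(nil);    // writes nil
end.
```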

building a list by appending

Alternatively we can build a list by appending, which is only slightly harder. (Recall that to append means to add at the end.) To do this we need to keep two pointers: one to the head of the list and one to the current tail.

Here is a function that builds a linked list of the integers 1 through n by appending:

function sequence(n: integer): pnode;
var
  head, tail, p: ^node;
  i: integer;
begin
  head := nil;
  tail := nil;  
  
  for i := 1 to n do
    begin
      new(p);
      p^.i := i;
      p^.next := nil;

      if head = nil then  // list is empty
        begin
          head := p;
          tail := p;
        end
      else
        begin
          tail^.next := p;  // append after tail
          tail := p;
        end;
    end;
  
  exit(head);
end;

We can use the same technique to build a linked list of values from other sources. For example, here is a very similar function that builds a linked list of values read from standard input until EOF:

function readList: pnode;
var
  head, tail, p: ^node;
  i: integer;
begin
  head := nil;
  tail := nil;  
  
  while not seekEof do
    begin
      new(p);
      read(p^.i);
      p^.next := nil;

      if head = nil then  // list is empty
        begin
          head := p;
          tail := p;
        end
      else
        begin
          tail^.next := p;  // append after tail
          tail := p;
        end;
    end;
  
  exit(head);
end;
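Since the two functions above share the same appending logic, that logic could be moved into a single procedure. Here is one possible sketch. The names append and printList are hypothetical, and both head and tail are passed by reference because the procedure may need to update either of them.

```pascal
program append_demo;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Hypothetical helper: adds a node holding i at the end of the list.
// head and tail must both be nil when the list is empty.
procedure append(var head, tail: pnode; i: integer);
var
  p: pnode;
begin
  new(p);
  p^.i := i;
  p^.next := nil;

  if head = nil then     // list is empty
    head := p
  else
    tail^.next := p;     // append after the current tail
  tail := p;
end;

// Hypothetical helper: writes the list in the form 1 -> 2 -> nil.
procedure printList(p: pnode);
begin
  while p <> nil do
    begin
      write(p^.i, ' -> ');
      p := p^.next;
    end;
  writeln('nil');
end;

var
  head, tail: pnode;
  i: integer;

begin
  head := nil;
  tail := nil;
  for i := 1 to 3 do
    append(head, tail, i * i);
  printList(head);   // writes 1 -> 4 -> 9 -> nil
end.
```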

example: checking for identical adjacent elements

Let's write a function that takes a linked list and returns true if any two adjacent elements in the list are identical, i.e. have the same value.

function adjacentIdentical(p: pnode): boolean;
begin
  if p = nil then exit(false);
  
  while p^.next <> nil do   // while p doesn't point to the last node
    begin
      if p^.i = p^.next^.i then
        exit(true);
      p := p^.next;
    end;
  
  exit(false);
end;

Note the comparison

if p^.i = p^.next^.i then

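This compares the value in the node that p points to with the value in the following node. We can analyze the last term p^.next^.i as follows: p is a pointer to a node, p^ is that node, p^.next is a pointer to the following node, p^.next^ is the following node itself, and p^.next^.i is the integer stored in it.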

Note also the while condition

while p^.next <> nil do

This stops as soon as p points to the last node in the list. At that point we must stop, because if we attempt to access the following node's value via

p^.next^.i

we will get a runtime error.

Finally note the initial check for the empty list:

  if p = nil then exit(false);

If this check were absent and the function were invoked on the empty list, then the following check

while p^.next <> nil do

would crash.
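As an alternative sketch, the empty-list check and the loop condition can be merged into one test. This variant relies on short-circuit boolean evaluation, which is Free Pascal's default and is requested explicitly here with the {$B-} directive: p^.next is only examined when p <> nil holds. The helper prepend is hypothetical and serves only to build test lists.

```pascal
program adjacent_demo;

{$B-}   // short-circuit boolean evaluation (the default setting)

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Hypothetical helper: adds a node holding i at the front of the list.
procedure prepend(var head: pnode; i: integer);
var
  p: pnode;
begin
  new(p);
  p^.i := i;
  p^.next := head;
  head := p;
end;

// Variant of adjacentIdentical with a single combined loop condition.
function adjacentIdentical(p: pnode): boolean;
begin
  while (p <> nil) and (p^.next <> nil) do
    begin
      if p^.i = p^.next^.i then
        exit(true);
      p := p^.next;
    end;
  exit(false);
end;

var
  a, b: pnode;

begin
  a := nil;   // a becomes 2 -> 4 -> 4 -> 7
  prepend(a, 7); prepend(a, 4); prepend(a, 4); prepend(a, 2);

  b := nil;   // b becomes 2 -> 4 -> 7
  prepend(b, 7); prepend(b, 4); prepend(b, 2);

  writeln(adjacentIdentical(a));     // writes TRUE
  writeln(adjacentIdentical(b));     // writes FALSE
  writeln(adjacentIdentical(nil));   // writes FALSE
end.
```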

recursing over a list

Many functions on linked lists can be written easily using recursion. Here is a recursive function to add all elements in a list:

function sum(p: pnode): integer;
begin
  if p = nil then exit(0);
  exit(p^.i + sum(p^.next));
end;
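Recursion is especially convenient for tasks that are awkward to write with a loop. As an illustration, here is a sketch of a procedure that writes the elements of a list in reverse order. The names printReversed and prepend are hypothetical and not taken from the notes.

```pascal
program reverse_demo;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Hypothetical helper: adds a node holding i at the front of the list.
procedure prepend(var head: pnode; i: integer);
var
  p: pnode;
begin
  new(p);
  p^.i := i;
  p^.next := head;
  head := p;
end;

// Writes the rest of the list first, then the value in the current node.
procedure printReversed(p: pnode);
begin
  if p = nil then exit;      // the empty list has nothing to print
  printReversed(p^.next);
  writeln(p^.i);
end;

var
  head: pnode;

begin
  head := nil;   // head becomes 2 -> 4 -> 7
  prepend(head, 7);
  prepend(head, 4);
  prepend(head, 2);

  printReversed(head);   // writes 7, then 4, then 2, each on its own line
end.
```

Keep in mind that a recursive function like this makes one nested call per node, so the recursion depth grows with the length of the list.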

implementing a stack using a linked list

In the last lecture we learned about stacks, which are an abstract data type with these operations:

type stack = ...
procedure init(var s: stack);
procedure push(var s: stack; i: integer);
function pop(var s: stack): integer;
function isEmpty(s: stack): boolean;

We also saw how to implement a stack using an array.

Alternatively we can implement a stack using a linked list. To accomplish this, the type stack will simply be a pointer to a node:

type stack = ^node;

Now our stack operations are quite straightforward:

procedure init(var s: stack);
begin
  s := nil;
end;

procedure push(var s: stack; i: integer);
var 
  n: ^node;
begin
  new(n);
  n^.i := i;
  n^.next := s;
  s := n;
end;

function pop(var s: stack): integer;
var 
  i: integer;
  n: ^node;
begin
  i := s^.i;
  n := s;
  s := s^.next;
  dispose(n);
  exit(i);
end;

function isEmpty(s: stack): boolean;
begin
  isEmpty := (s = nil);
end;

Consider this code that uses a stack:

var
  s: stack;
  i: integer;
begin
  init(s);
  for i := 1 to 10 do
    push(s, i);
  while not isEmpty(s) do
    writeln(pop(s));

Notice that this code will produce the same result whether the stack is implemented as an array or as a linked list. This is a general feature of abstract data types: code that uses them will work correctly no matter how the type is implemented.

Different implementations of an abstract type may, however, have different performance characteristics. In the previous lecture we saw that if we implement a stack using a dynamic array, we can improve its performance by modifying our implementation to double the dynamic array's size each time we need to grow it. Even then, however, we found that the push operation took O(N) in the worst case, where N is the current stack size. Our linked list-based implementation performs differently: push always runs in O(1).
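The code from the previous lecture is not reproduced in these notes, so the following is only a sketch of what a doubling array-based stack might look like. The record layout (a field a for the storage and a field n for the number of elements in use) is an assumption. Most calls to push run in O(1); only a call that triggers setLength may take O(N).

```pascal
program doubling_stack;

type
  stack = record
    a: array of integer;   // storage; length(a) is the capacity
    n: integer;            // number of elements currently on the stack
  end;

procedure init(var s: stack);
begin
  setLength(s.a, 1);
  s.n := 0;
end;

procedure push(var s: stack; i: integer);
begin
  if s.n = length(s.a) then              // storage is full
    setLength(s.a, 2 * length(s.a));     // double the capacity
  s.a[s.n] := i;
  s.n := s.n + 1;
end;

// Like the versions in the notes, pop assumes the stack is not empty.
function pop(var s: stack): integer;
begin
  s.n := s.n - 1;
  exit(s.a[s.n]);
end;

function isEmpty(s: stack): boolean;
begin
  exit(s.n = 0);
end;

var
  s: stack;
  i: integer;

begin
  init(s);
  for i := 1 to 10 do
    push(s, i);
  writeln(length(s.a));      // writes 16: the capacity grew 1, 2, 4, 8, 16
  while not isEmpty(s) do
    writeln(pop(s));         // writes 10 down to 1
end.
```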

least common multiple

In a recent lecture we discussed how to compute the greatest common divisor of two integers, and learned about Euclid's algorithm, which can perform this computation efficiently. A related concept is the least common multiple of two positive integers p and q, which is the smallest positive integer that is divisible by both p and q. For example,

lcm(60, 90) = 180

How can we compute a least common multiple efficiently? Here is a useful fact: for all positive integers a and b,

a · b = gcd(a, b) · lcm(a, b)

And so

lcm(a, b) = a · b / gcd(a, b)

We can compute the gcd efficiently using Euclid's algorithm, so the formula above gives us an efficient way to compute the lcm. Here is a Pascal function to do that:

function lcm(a, b: integer): integer;
begin
  exit(a div gcd(a, b) * b);
end;

Alternatively we could write

  exit(a * b div gcd(a, b));

but the first version is better, since it avoids the risk of integer overflow if (a * b) will not fit in an integer.
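To try this out, here is a sketch of a complete program. The gcd function below is one common iterative form of Euclid's algorithm and may differ in detail from the version shown in the earlier lecture. Both functions assume positive arguments.

```pascal
program lcm_demo;

// Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b).
function gcd(a, b: integer): integer;
var
  t: integer;
begin
  while b <> 0 do
    begin
      t := a mod b;
      a := b;
      b := t;
    end;
  exit(a);
end;

// Divides before multiplying, which keeps the intermediate value small.
function lcm(a, b: integer): integer;
begin
  exit(a div gcd(a, b) * b);
end;

begin
  writeln(gcd(60, 90));   // writes 30
  writeln(lcm(60, 90));   // writes 180
  writeln(lcm(4, 6));     // writes 12
end.
```

The division a div gcd(a, b) is always exact, because gcd(a, b) divides a.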