Lecture 6: Notes

Some of the material in this lecture (pointers and linked lists) is also covered in chapter 7 of the Pascal Made Simple text.

pointers

A pointer is an indirect reference to a value in memory. The type ^T means a pointer to type T. For example:

var
  p: ^integer;   // a pointer to an integer

The @ (“address of”) operator yields a pointer to a variable:

var
  i: integer;
begin

  i := 7;
  p := @i;      // now p points to i

The ^ (“circumflex” or “hat”) operator yields the value that a pointer points to:

  writeln(p^);  // writes 7
  p^ := 8;
  writeln(i);   // writes 8

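You can use the new procedure to allocate memory dynamically. new takes a variable of any pointer type, allocates memory for a value of the type it points to, and makes the variable point to that memory. For example: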

type
  suit = (clubs, hearts, diamonds, spades);
  card = record
    rank: 2 .. 14;
    suit: suit;
  end;

var
  p: ^card;
  q: ^card;

begin
  new(p);   // now p points to a new dynamically allocated card
  p^.rank := random(13) + 2;
  p^.suit := suit(random(4));

You can have many pointers to the same value. The statement

  q := p;

makes q point to the same value as p. Now changes made through either pointer will be visible through the other, since they point to the same data:

  q^.rank := 10;
  writeln(p^.rank);   // writes 10

The dispose procedure frees memory that was allocated with new:

  dispose(p);    // free the card

The special pointer value nil points to nothing. If you attempt to access p^ where p is nil, your program will typically crash with a runtime error.

Be aware that new and dispose do not allocate and delete pointers – rather, they allocate and delete data, which you can access indirectly through a pointer. Pointers in a Pascal program are like integers: you can create them on the fly and pass them around easily, and computationally they are extremely cheap (since pointers are just represented as integers internally).
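
As a minimal sketch of the points above, the following complete program allocates one integer, accesses it through two pointers, and frees it. It assumes Free Pascal in Delphi mode; the type name pint is just an illustrative choice.

```pascal
{$mode delphi}

type
  pint = ^integer;

var
  p, q: pint;

begin
  new(p);            // allocate one integer on the heap
  p^ := 42;
  q := p;            // q is a second pointer to the same integer

  q^ := q^ + 1;
  writeln(p^);       // writes 43: both pointers see the change

  dispose(p);        // frees the integer; p and q now both point to freed memory
  p := nil;          // dispose does not reset the pointers, so we do it by hand
  q := nil;

  if p = nil then
    writeln('p is nil');
end.
```

Note that there is only one call to new and one call to dispose, even though two pointers are involved: the data is allocated and freed once, while the pointers are ordinary variables.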

If you want to pass a pointer to a function or procedure, or return a pointer from a function, you must declare a named pointer type:

type
  pint = ^integer;    // pointer to integer

function foo(a: ^integer): ^integer;   // will not compile!

function foo(a: pint): pint;           // this is fine
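
To illustrate, here is a small sketch of a function that takes and returns pointers through a named type. The function larger and the variables x and y are hypothetical names chosen for this example; the program assumes Free Pascal in Delphi mode.

```pascal
{$mode delphi}

type
  pint = ^integer;    // named pointer type, usable in function signatures

// Return a pointer to whichever of the two integers is larger.
function larger(a, b: pint): pint;
begin
  if a^ >= b^ then exit(a) else exit(b);
end;

var
  x, y: integer;
  p: pint;

begin
  x := 3;
  y := 8;
  p := larger(@x, @y);   // p now points to y
  p^ := 0;               // overwrites y through the pointer
  writeln(x, ' ', y);    // writes 3 0
end.
```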

linked lists

A linked list is a useful data structure constructed by chaining together records with pointers. It is an alternative to an array for storing a series of values, with different performance characteristics.

An element of a linked list is called a node. A node contains one or more values, plus a pointer to the next node in the list. The first node of a linked list is called its head. The last node of a linked list is its tail. The tail's next pointer is always nil.

Here is a node type for a linked list that holds integers:

type
  node = record
    i: integer;
    next: ^node;
  end;
  pnode = ^node;  // type for function parameters and return values

Here is a picture of a linked list containing the values 2, 4 and 7:

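[Figure: a linked list of three nodes holding 2, 4 and 7; each node points to the next, and the last node's next pointer is nil]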

We can build this linked list like this:

var
  a, b, c: ^node;   // pointers to nodes, since we allocate each node with new
  
begin
  new(c);
  c^.i := 7;
  c^.next := nil;
  
  new(b);
  b^.i := 4;
  b^.next := c;
  
  new(a);
  a^.i := 2;
  a^.next := b;

Now a points to the head of the list. In general we refer to a linked list using a pointer to its head.

building a list by appending

The code above builds a fixed 3-element list. Of course, we usually want to build a list using a loop that works with any number of nodes.

To build a list by appending, we need two pointers: one to the head of the list and one to the current tail. We begin by allocating the first node, and setting both head and tail to point to it. Each time we want to append a node, we allocate it, then link the tail to the new node. Then we update tail to point to the new node, which is now the current tail.

Here is a function that builds a linked list of the integers 1 through k by appending:

function sequence(k: integer): pnode;
var
  head, tail, n: ^node;
  i: integer;
begin
  if k < 1 then exit(nil);   // empty list (also covers negative k)
  
  new(head);
  head^.i := 1;
  tail := head;
  
  for i := 2 to k do
    begin
      new(n);
      n^.i := i;
      tail^.next := n;
      tail := n;
    end;
  
  tail^.next := nil;
  exit(head);
end;

We can use the same technique to build a linked list of values from other sources. For example, here is a similar function that builds a linked list of values read from standard input until EOF:

function readList(): pnode;
var
  head, tail, n: ^node;
  
begin
  if seekEof then exit(nil);
  
  new(head);
  read(head^.i);
  tail := head;
  
  while not seekEof do
    begin
      new(n);
      read(n^.i);
      tail^.next := n;
      tail := n;
    end;

  tail^.next := nil;
  exit(head);
end;

building a list by prepending

Another way to build a list is by prepending nodes. This is a bit easier, since we only need to keep a pointer to the head of the list, which is where nodes are prepended.

Here is a function that builds a linked list of the integers 1 through k by prepending:

function sequence(k: integer): pnode;
var
  head, n: ^node;
  i: integer;
begin
  head := nil;
  
  for i := k downto 1 do
    begin
      new(n);
      n^.i := i;
      n^.next := head;
      head := n;
    end;
  
  exit(head);
end;

building a list by recursion

It is just as easy to build a linked list using a recursive function. Here is a recursive function to build a list of the integers j through k:

function sequence(j, k: integer): pnode;
var
  n: ^node;
begin
  if k < j then exit(nil);  // empty
  
  new(n);
  n^.i := j;
  n^.next := sequence(j + 1, k);
  exit(n);
end;

Of course, if we want the integers 1 through k (as in the previous examples) we can simply call

  sequence(1, k);

building a list by tail recursion

As a recursive program calls itself, information about each nested recursive call is stored on the call stack. The call stack has a fixed size, so if a program recurses too deeply it may fail with a stack overflow error. On a modern desktop computer, the call stack is large enough to hold hundreds of thousands of nested call frames if they are small. But if a function stores lots of data in local variables (e.g. in an array) then you may run out of stack space with many fewer nested calls.

The function sequence in the previous section is recursive, but it is not tail-recursive. A tail-recursive procedure calls itself as the last statement in its body; a tail-recursive function returns a value that it gets by calling itself. Free Pascal (and many other compilers) can optimize tail recursive procedures and functions so that internally they use iteration rather than recursion, in which case there is no danger of running out of stack space. In Free Pascal, you can enable tail recursion optimization using the {$optimization tailrec} directive.

Here is a recursive procedure that builds a list of the integers j through k, this time using tail recursion:

procedure sequence(j, k: integer; out list: pnode);
var
  n: ^node;
begin
  if k < j then list := nil
  else begin
    new(n);
    n^.i := j;
    list := n;   // hand the new node back to the caller before recursing
    sequence(j + 1, k, n^.next);
  end;
end;

Unfortunately, Free Pascal will not actually perform tail recursion optimization on this procedure or any other procedure with a var or out parameter – that’s a limitation of the Free Pascal compiler at the moment. Perhaps this limitation will be removed in the future.

iterating over a list

We very often want to iterate over all elements in a list. To do this, we start with a pointer p to the head of the list, and advance at each step like this:

  p := p^.next;

Here is a function that iterates over a linked list of integers and computes the sum of all elements:

function add(list: pnode): integer;
var
  p: pnode;
  sum: integer = 0;
begin
  p := list;
  while p <> nil do
    begin
      sum := sum + p^.i;
      p := p^.next;
    end;

  add := sum;
end;
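
The same loop pattern works for any traversal. As a sketch, the complete program below builds a list by prepending and then walks it twice, once to print it and once to count its nodes. The names printList and listLength are hypothetical helpers, not part of the lecture's code, and the node type is declared with pnode first so that the record can use it for its next field.

```pascal
{$mode delphi}

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Build the list 1, 2, ..., k by prepending.
function sequence(k: integer): pnode;
var
  head, n: pnode;
  i: integer;
begin
  head := nil;
  for i := k downto 1 do
    begin
      new(n);
      n^.i := i;
      n^.next := head;
      head := n;
    end;
  exit(head);
end;

// Write all values in the list on one line.
procedure printList(list: pnode);
var
  p: pnode;
begin
  p := list;
  while p <> nil do
    begin
      write(p^.i, ' ');
      p := p^.next;
    end;
  writeln;
end;

// Count the nodes in the list.
function listLength(list: pnode): integer;
var
  p: pnode;
  count: integer;
begin
  count := 0;
  p := list;
  while p <> nil do
    begin
      count := count + 1;
      p := p^.next;
    end;
  exit(count);
end;

var
  nums: pnode;

begin
  nums := sequence(5);
  printList(nums);               // writes 1 2 3 4 5
  writeln(listLength(nums));     // writes 5
end.
```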

recursing over a list

We may also use recursion to visit all elements of a linked list. Here is a recursive function to add all elements in a list:

function add(n: pnode): integer;
begin
  if n = nil then exit(0);
  exit(n^.i + add(n^.next));
end;

example: destroying a linked list

Here is a procedure that takes a pointer to a linked list of integers and calls dispose to destroy every node in the list.

procedure deleteList(p: pnode);
var 
  q: ^node;
begin
  while p <> nil do
    begin
      q := p^.next;
      dispose(p);
      p := q;
    end;
end;
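
One caveat: because p is passed by value, the caller's pointer still holds the address of the freed head node after deleteList returns. A possible variant, sketched here under the hypothetical name deleteAndClear, takes the pointer as a var parameter so that the caller's variable ends up as nil.

```pascal
{$mode delphi}

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Free every node and leave the caller's pointer set to nil.
procedure deleteAndClear(var p: pnode);
var
  q: pnode;
begin
  while p <> nil do
    begin
      q := p^.next;
      dispose(p);
      p := q;
    end;
  // The loop exits only when p = nil, so the caller's variable is now nil.
end;

var
  list, n: pnode;
  i: integer;

begin
  // Build the list 1, 2, 3 by prepending.
  list := nil;
  for i := 3 downto 1 do
    begin
      new(n);
      n^.i := i;
      n^.next := list;
      list := n;
    end;

  deleteAndClear(list);
  if list = nil then
    writeln('list is empty');   // writes list is empty
end.
```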

example: deleting all odd numbers from a linked list

Here are two versions of a procedure that deletes all odd numbers from a linked list. Our first version is iterative, and has separate loops to delete odd numbers at the beginning and to delete odd numbers in other positions:

procedure deleteOdd(var p: pnode);
var 
  q, r: ^node;
begin
  // Delete odd nodes at the beginning
  while (p <> nil) and odd(p^.i) do
    begin
      q := p;
      p := p^.next;
      dispose(q);
    end;
  if p = nil then exit;
  
  // Delete odd nodes in other positions.
  // Invariant: q points to a node that is staying in the list.
  q := p;
  while q^.next <> nil do
    if odd(q^.next^.i) then
      begin
        r := q^.next;
        q^.next := r^.next;   // unlink r, then examine q's new successor
        dispose(r);
      end
    else
      q := q^.next;           // successor is even: keep it and advance
end;

We can simplify this by using recursion. Here is a tail-recursive implementation:

procedure deleteOdd2(var p: pnode);
var
  q: ^node;
begin
  if p = nil then exit;
  if odd(p^.i) then   // odd() also handles negative values, where mod 2 yields -1
    begin
      q := p;
      p := p^.next;
      dispose(q);
      deleteOdd2(p);
    end
  else deleteOdd2(p^.next);
end;
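
To try this out, a small driver might look like the following sketch. It is self-contained, so it repeats the recursive approach under the hypothetical name removeOdd, builds the list 1 through 10 by prepending, and prints what is left.

```pascal
{$mode delphi}

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Recursively unlink and free every node that holds an odd value.
procedure removeOdd(var p: pnode);
var
  q: pnode;
begin
  if p = nil then exit;
  if odd(p^.i) then
    begin
      q := p;
      p := p^.next;      // the caller's link now skips the odd node
      dispose(q);
      removeOdd(p);
    end
  else removeOdd(p^.next);
end;

var
  list, n, p: pnode;
  i: integer;

begin
  // Build the list 1, 2, ..., 10 by prepending.
  list := nil;
  for i := 10 downto 1 do
    begin
      new(n);
      n^.i := i;
      n^.next := list;
      list := n;
    end;

  removeOdd(list);

  p := list;
  while p <> nil do
    begin
      write(p^.i, ' ');
      p := p^.next;
    end;
  writeln;               // the line written is: 2 4 6 8 10
end.
```

The var parameter is what makes this work: each call receives a reference to the link that leads to the current node (either the caller's head variable or the previous node's next field), so it can redirect that link in place.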

tricks with pointers

You can pass a variable by reference using a pointer as an alternative to var:

type
  pint = ^integer;

procedure add(p: pint; i: integer);
begin
  p^ := p^ + i;
end;

It is usually clearer just to use var, however.
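
For comparison, here is a short sketch that calls both styles side by side. The procedure names addViaPointer and addViaVar are hypothetical, chosen only to make the contrast visible.

```pascal
{$mode delphi}

type
  pint = ^integer;

// Pass by reference using an explicit pointer.
procedure addViaPointer(p: pint; i: integer);
begin
  p^ := p^ + i;
end;

// Pass by reference using a var parameter.
procedure addViaVar(var n: integer; i: integer);
begin
  n := n + i;
end;

var
  x: integer;

begin
  x := 10;
  addViaPointer(@x, 5);   // the caller must take the address explicitly
  writeln(x);             // writes 15
  addViaVar(x, 5);        // the caller just names the variable
  writeln(x);             // writes 20
end.
```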

A function must never return a pointer to one of its own local variables, because the variable ceases to exist when the function returns. The following program makes this mistake:

function danger(i: integer): pint;
var
  j: integer;
begin
  j := i;
  danger := @j;
end;

var
  p: ^integer;

begin
  p := danger(5);
  writeln(p^);
end.

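After danger returns, j no longer exists, so p is a dangling pointer: it holds the address of memory that may be reused for something else at any time. Attempting to read or write through this pointer may crash the program or yield unpredictable results. Be careful with pointers.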

sieve of Eratosthenes

We've seen in previous lectures that we can determine whether an integer n is prime using trial division, in which we attempt to divide n by successive integers. Because we need only check integers up to `sqrt(n)`, this primality test runs in time `O(sqrt n)`.

Sometimes we may wish to generate all prime numbers up to some limit N. If we use trial division on each candidate, then we can find all these primes in time `O(N sqrt N)`. But there is a faster way, using a classic algorithm called the Sieve of Eratosthenes.

It's not hard to carry out the Sieve of Eratosthenes using pencil and paper. It works as follows. First we write down all the integers from 2 to N in succession. We then mark the first integer on the left (2) as prime. Then we cross out all the multiples of 2. Now we mark the next unmarked integer on the left (3) as prime and cross out all the multiples of 3. We can now mark the next unmarked integer (5) as prime and cross out its multiples, and so on. Just as with trial division, we may stop when we reach an integer that is larger than `sqrt N`. Every integer that is still not crossed out at that point is prime.

Here is the result of using the Sieve of Eratosthenes to generate all primes between 2 and 30:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The integers that are never crossed out, and are therefore prime, are 2 3 5 7 11 13 17 19 23 29.

We can easily implement the Sieve of Eratosthenes in Pascal:

{$mode delphi}

var
  n, i, j: integer;
  prime: array of boolean;

begin
  write('Enter n: ');
  readln(n);
  setLength(prime, n + 1);
  for i := 2 to n do
    prime[i] := true;
  
  for i := 2 to trunc(sqrt(n)) do
    if prime[i] then
      begin
        j := 2 * i;
        while j <= n do
          begin
            prime[j] := false;
            j := j + i;
          end;
      end;
  
  for i := 2 to n do
    if prime[i] then write(i, ' ');
  writeln;
end.

How long does the Sieve of Eratosthenes take to run?

The inner loop, in which we set prime[j] := false, runs N/2 times when we cross out all the multiples of 2, then N/3 times when we cross out multiples of 3, and so on. So its total number of iterations will be

`N(1/2 + 1/3 + 1/5 + ... + 1/p)`

where p <= `sqrt N`.

The series

`sum_(p prime) 1/p = 1/2 + 1/3 + 1/5 + …`

is called the prime harmonic series. How can we approximate its sum through a given element 1/p?

First, we may note that the elements of the prime harmonic series are a subset of elements of the ordinary harmonic series, which sums the reciprocals of all positive integers:

`1/1 + 1/2 + 1/3 + … `

The ordinary harmonic series diverges, and the sum of all its terms through 1/n is close to ln n. In fact the difference between this sum and ln n converges to a constant:

`lim_(n->oo) [sum_(k=1)^n 1/k - ln n] = 0.577...`

So since p <= `sqrt N` we have

`N(1/2 + 1/3 + 1/5 + ... + 1/p) < N(1/2 + 1/3 + 1/4 + ... + 1/p) = N * O(log sqrt N) = N * O(log N) = O(N log N)`

We can derive a tighter bound using properties of the prime harmonic series itself. Surprisingly, the prime harmonic series diverges. This was first demonstrated by Euler. It grows extremely slowly: the sum of 1/p over all primes p <= n is close to ln (ln n), and the difference converges to a constant:

`lim_(n->oo) [sum_(p<=n) 1/p - ln (ln n)] = 0.261...`

This shows in turn that

`N(1/2 + 1/3 + 1/5 + ... + 1/p) = N * O(log log (sqrt N)) = N * O(log log N) = O(N log log N)`

This is very close to O(N). (In fact more advanced algorithms can generate all primes through N in time O(N).)
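
As a rough empirical check of this analysis, the sketch below runs the same sieve while counting how many times the inner loop body executes, and prints that count next to `N ln (ln N)` for several values of N. The function name sieveSteps and the choice of test sizes are assumptions made for this illustration.

```pascal
{$mode delphi}

// Run the sieve up to n and return the number of inner-loop iterations.
function sieveSteps(n: integer): int64;
var
  prime: array of boolean;
  i, j: integer;
  steps: int64;
begin
  setLength(prime, n + 1);
  for i := 2 to n do
    prime[i] := true;

  steps := 0;
  for i := 2 to trunc(sqrt(n)) do
    if prime[i] then
      begin
        j := 2 * i;
        while j <= n do
          begin
            prime[j] := false;
            steps := steps + 1;   // one crossing-out operation
            j := j + i;
          end;
      end;
  exit(steps);
end;

var
  n: integer;

begin
  // Columns: N, measured inner-loop iterations, N * ln(ln N)
  n := 100;
  while n <= 1000000 do
    begin
      writeln(n:8, sieveSteps(n):10, n * ln(ln(n)):14:1);
      n := n * 10;
    end;
end.
```

For N = 100 the measured count is 113, namely 49 + 32 + 19 + 13 for the primes 2, 3, 5 and 7. Each term is one less than N/p rounded down, because the loop starts at 2p rather than at p. The measured counts should stay within a small constant factor of the last column as N grows, which is what the `O(N log log N)` bound predicts.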