Lecture 6: Notes

Some of the material in this lecture (pointers and linked lists) is also covered in chapter 7 of the Pascal Made Simple text.

pointers

A pointer is an indirect reference to a variable. The type ^T means a pointer to type T. For example:

var
  p: ^integer;   // a pointer to an integer

The @ (“address of”) operator yields a pointer to a variable:

var
  i: integer;
begin

  i := 7;
  p := @i;      // now p points to i

The ^ (“circumflex” or “hat”) operator yields the value that a pointer points to:

  writeln(p^);  // writes 7
  p^ := 8;
  writeln(i);   // writes 8

You can use the new function to allocate memory dynamically. new takes an argument of any pointer type. For example:

type
  suit = (clubs, hearts, diamonds, spades);
  card = record
    rank: 2 .. 14;
    suit: suit;
  end;

var
  p: ^card;

begin
  new(p);   // now p points to a new dynamically allocated card
  p^.rank := random(13) + 2;
  p^.suit := suit(random(4));

The dispose procedure frees memory that was allocated with new:

  dispose(p);    // free the card

The special pointer value nil points to nothing. If you attempt to access p^ where p is nil, your program will crash.
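To avoid such a crash, a program can compare a pointer with nil before dereferencing it. Here is a minimal sketch; the helper valueOr and the type name pint are made up for this illustration.

```pascal
program NilCheck;

type
  pint = ^integer;   // named pointer type, required for the parameter below

// Returns the value p points to, or fallback if p is nil.
// (valueOr is a hypothetical helper, not a built-in routine.)
function valueOr(p: pint; fallback: integer): integer;
begin
  if p = nil then
    valueOr := fallback
  else
    valueOr := p^;
end;

var
  p: pint;

begin
  p := nil;
  writeln(valueOr(p, -1));   // p is nil, so the fallback -1 is written
  new(p);
  p^ := 42;
  writeln(valueOr(p, -1));   // writes 42
  dispose(p);
  p := nil;                  // p is invalid after dispose; reset it to nil
end.
```

Resetting p to nil after dispose keeps later nil checks meaningful, since a disposed pointer is no longer safe to dereference.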

If you want to pass a pointer to a function or procedure, or return a pointer from a function, you must declare a named pointer type:

type
  pint = ^integer;    // pointer to integer

function foo(a: ^integer): ^integer;   // will not compile!

function foo(a: pint): pint;           // this is fine

tricks with pointers

You can pass a variable by reference using a pointer as an alternative to var:

type
  pint = ^integer;

procedure add(p: pint; i: integer);
begin
  p^ := p^ + i;
end;

It is usually clearer just to use var, however.
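For comparison, here is a small sketch that places the two approaches side by side. The names addPtr and addVar are chosen only for this example.

```pascal
program AddByReference;

type
  pint = ^integer;

// Pointer version: the caller passes an address explicitly.
procedure addPtr(p: pint; i: integer);
begin
  p^ := p^ + i;
end;

// var version: the compiler passes the address behind the scenes.
procedure addVar(var x: integer; i: integer);
begin
  x := x + i;
end;

var
  a, b: integer;

begin
  a := 10;
  b := 10;
  addPtr(@a, 5);        // the call site must take the address with @
  addVar(b, 5);         // the call site looks like an ordinary call
  writeln(a, ' ', b);   // writes 15 15
end.
```

The var version cannot be handed nil by mistake, which is one reason it is usually the clearer choice.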

A function must never return a pointer to one of its own local variables, because the variable ceases to exist when the function returns. For example:

function danger(i: integer): pint;
var
  j: integer;
begin
  j := i;
  danger := @j;
end;

var
  p: ^integer;

begin
  p := danger(5);
  writeln(p^);
end.

After this call p is a dangling pointer: it refers to stack memory that no longer belongs to j. Attempting to read or write through this pointer may crash the program or yield unpredictable results. Be careful with pointers.
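One safe alternative is to allocate the value dynamically with new, since dynamically allocated memory stays valid until it is disposed. The following is only a sketch; the function name boxed is hypothetical.

```pascal
program SafeReturn;

type
  pint = ^integer;

// Allocates the result dynamically, so it outlives the call.
// The caller owns the memory and must dispose of it.
function boxed(i: integer): pint;
var
  p: pint;
begin
  new(p);
  p^ := i;
  boxed := p;
end;

var
  p: pint;

begin
  p := boxed(5);
  writeln(p^);   // writes 5
  dispose(p);    // the caller frees the integer when done
end.
```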

linked lists

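A linked list is a useful data structure constructed by chaining together records with pointers. It is an alternative to an array for storing a series of values, with different performance characteristics: inserting a value at the front of a linked list takes constant time, but reaching the k-th element requires following k pointers, whereas an array can access any element directly.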

An element of a linked list is called a node. Here is a node type for a linked list that holds integers:

type node = record
  i: integer;
  next: ^node;
end;
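To see how nodes chain together, here is a sketch that builds a three-node list by hand and walks through it. It declares the pointer type pnode before the record, which is an alternative way to write the same node type; the function listLength is made up for this example.

```pascal
program TinyList;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Counts the nodes by following next pointers until nil is reached.
function listLength(p: pnode): integer;
var
  count: integer;
begin
  count := 0;
  while p <> nil do
    begin
      count := count + 1;
      p := p^.next;
    end;
  listLength := count;
end;

var
  a, b, c: pnode;

begin
  // Build the list 10 -> 20 -> 30 by hand.
  new(a); new(b); new(c);
  a^.i := 10;  a^.next := b;
  b^.i := 20;  b^.next := c;
  c^.i := 30;  c^.next := nil;   // nil marks the end of the list

  writeln(listLength(a));        // writes 3
  dispose(a); dispose(b); dispose(c);
end.
```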

This program reads a series of integers and inserts them into a linked list of nodes as defined above. It then adds up the numbers in the list and prints their sum.

type
  pnode = ^node;

function readList(): pnode;
var
  first: ^node = nil;
  n: ^node;
begin
  while not seekEof do
    begin
      new(n);
      read(n^.i);

      // prepend n to the list
      n^.next := first;
      first := n;
    end;
  readList := first;
end;

function add(p: pnode): integer;
var
  sum: integer = 0;
begin
  while p <> nil do
    begin
      sum := sum + p^.i;
      p := p^.next;
    end;
  add := sum;
end;

var
  first: ^node;

begin
  first := readList();
  writeln(add(first));
end.

constructing a list by appending

The function readList above prepends each newly created node to the list, so numbers end up in reverse order. For some applications this might be fine, but often we would like a list to contain values in the same order in which they are read.

To do this, we must append each node to the list as it is created. One way to do this is to keep separate pointers to the first and last nodes in the list:

function readList2(): pnode;
var
  first: ^node = nil;
  last: ^node = nil;
  n: ^node;

begin
  while not seekEof do
    begin
      new(n);
      read(n^.i);
      n^.next := nil;
      
      if first = nil then
        begin
          first := n;
          last := n;
        end
      else
        begin
          last^.next := n;
          last := n;
        end;
    end;
  readList2 := first;
end;

Alternatively we can keep a dummy node at the beginning of the list. This makes the solution slightly more compact by eliminating the nil case:

function readList3(): pnode;
var
  dummy: node;
  last, n: ^node;
begin
  last := @dummy;
  while not seekEof do
    begin
      new(n);
      read(n^.i);
      last^.next := n;
      last := n;
    end;
  last^.next := nil;
  readList3 := dummy.next;
end;

If we'd like the solution to be even simpler, we can use recursion:

function readList4(): pnode;
var
  n: ^node;
begin
  if seekEof then exit(nil);
  new(n);
  read(n^.i);
  n^.next := readList4();
  readList4 := n;
end;

tail recursion

As a recursive program calls itself, information about each nested recursive call is stored on the call stack. The call stack has a fixed size, so if a program recurses too deeply it may fail with a stack overflow error. On a modern desktop computer, the call stack is large enough to hold as many as hundreds of thousands of nested call frames if they are small, but if a function stores lots of data in local variables (e.g. in an array) then you may run out of stack space with many fewer nested calls.

The function readList4 above is recursive, but it is not tail-recursive. A tail-recursive procedure calls itself as the last statement in its body; a tail-recursive function returns a value that it gets by calling itself. Free Pascal (and many other compilers) can optimize tail-recursive procedures and functions so that internally they use iteration rather than recursion, in which case there is no danger of running out of stack space. In Free Pascal, you can enable tail recursion optimization using the {$optimization tailrec} directive.

Here is another version of the readList function, rewritten to use tail recursion:

{$optimization tailrec}

procedure readList5(var list: pnode);
begin
  if seekEof then list := nil
  else
    begin
      new(list);
      read(list^.i);
      readList5(list^.next);
    end;
end;
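The same idea applies to functions. A common way to make a function tail-recursive is to carry the partial result in an extra parameter, so that the recursive call's value can be returned unchanged. Here is a sketch that sums a list this way; the names sumFrom, acc and push are invented for this example.

```pascal
{$optimization tailrec}
program TailSum;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Tail-recursive sum: the recursive call is the last thing the function
// does, and its result is returned unchanged. acc holds the running total.
function sumFrom(p: pnode; acc: integer): integer;
begin
  if p = nil then
    sumFrom := acc
  else
    sumFrom := sumFrom(p^.next, acc + p^.i);
end;

// Helper: prepends value to the list whose head is first.
procedure push(var first: pnode; value: integer);
var
  n: pnode;
begin
  new(n);
  n^.i := value;
  n^.next := first;
  first := n;
end;

var
  list: pnode;

begin
  list := nil;
  push(list, 3);
  push(list, 2);
  push(list, 1);               // list is now 1 2 3
  writeln(sumFrom(list, 0));   // writes 6
end.
```

By contrast, a version that computes p^.i + sum(p^.next) is not tail-recursive, because the addition still has to happen after the recursive call returns.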

example: constructing a list of odd numbers

Here is a function that takes an integer k and returns a linked list that contains all positive odd integers that are less than or equal to k, in ascending order.

We saw above that when building a list it's easier to prepend than append. That's because when prepending we only need to remember the head of the list, whereas when appending we have to keep track of both the head and tail. So in this example we iterate through numbers in reverse order so that we can prepend.

function oddList(k: integer): pnode;

var 
  first, n: ^node;
begin
  first := nil;

  if k mod 2 = 0 then k := k - 1;   // make k odd
  while k >= 1 do
    begin
      new(n);
      n^.i := k;
      n^.next := first;
      first := n;
      k := k - 2;
    end;
   oddList := first;
end;

example: deleting a linked list

Here is a procedure that takes a pointer to a linked list of integers and calls dispose to destroy every node in the list.

procedure deleteList(p: pnode);
var 
  q: ^node;
begin
  while p <> nil do
    begin
      q := p^.next;
      dispose(p);
      p := q;
    end;
end;

example: deleting all odd numbers from a linked list

Here are two versions of a procedure that deletes all odd numbers from a linked list. Our first version is iterative, and has separate loops to delete odd numbers at the beginning and to delete odd numbers in other positions:

procedure deleteOdd(var p: pnode);

var 
  q, r: ^node;
begin
  // Delete odd nodes at the beginning
  while (p <> nil) and (p^.i mod 2 <> 0) do   // <> 0 also catches negative odd numbers
    begin
      q := p;
      p := p^.next;
      dispose(q);
    end;
  if p = nil then exit;
  
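  // Delete odd nodes in other positions.
  // Here q always points to an even node that stays in the list.
  q := p;
  while q^.next <> nil do
    if q^.next^.i mod 2 <> 0 then
      begin
        r := q^.next;
        q^.next := r^.next;
        dispose(r);
      end
    else
      q := q^.next;   // the next node is even, so step past it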
end;

We can simplify this either by inserting a dummy node at the head of the list (as in the function readList3 above) or by using recursion. Here is a tail-recursive implementation:

procedure deleteOdd2(var p: pnode);
var
  q: ^node;
begin
  if p = nil then exit;
  if p^.i mod 2 <> 0 then   // <> 0 also catches negative odd numbers
    begin
      q := p;
      p := p^.next;
      dispose(q);
      deleteOdd2(p);
    end
  else deleteOdd2(p^.next);
end;
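The dummy-node alternative mentioned above might look like the following sketch. The procedure name deleteOdd3 and the helper push are made up for this example; the dummy node is a local variable that sits in front of the list only while the procedure runs.

```pascal
program DeleteOddDummy;

type
  pnode = ^node;
  node = record
    i: integer;
    next: pnode;
  end;

// Deletes all odd numbers, using a temporary dummy node in front of the
// list so that the first real node needs no special treatment.
procedure deleteOdd3(var p: pnode);
var
  dummy: node;
  q, r: pnode;
begin
  dummy.next := p;
  q := @dummy;
  while q^.next <> nil do
    if odd(q^.next^.i) then
      begin
        r := q^.next;          // unlink and free the odd node
        q^.next := r^.next;
        dispose(r);
      end
    else
      q := q^.next;            // keep the even node and move on
  p := dummy.next;             // the head may have changed
end;

// Helper: prepends value to the list whose head is first.
procedure push(var first: pnode; value: integer);
var
  n: pnode;
begin
  new(n);
  n^.i := value;
  n^.next := first;
  first := n;
end;

var
  list, p: pnode;
  k: integer;

begin
  list := nil;
  for k := 6 downto 1 do
    push(list, k);             // list is 1 2 3 4 5 6
  deleteOdd3(list);
  p := list;
  while p <> nil do            // writes 2 4 6
    begin
      write(p^.i, ' ');
      p := p^.next;
    end;
  writeln;
end.
```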

sieve of Eratosthenes

We've seen in previous lectures that we can determine whether an integer n is prime using trial division, in which we attempt to divide n by successive integers. Because we need only check integers up to `sqrt(n)`, this primality test runs in time `O(sqrt n)`.

Sometimes we may wish to generate all prime numbers up to some limit N. If we use trial division on each candidate, then we can find all these primes in time `O(N sqrt N)`. But there is a faster way, using a classic algorithm called the Sieve of Eratosthenes.

It's not hard to carry out the Sieve of Eratosthenes using pencil and paper. It works as follows. First we write down all the integers from 2 to N in succession. We then mark the first integer on the left (2) as prime. Then we cross out all the multiples of 2. Now we mark the next unmarked integer on the left (3) as prime and cross out all the multiples of 3. We can now mark the next unmarked integer (5) as prime and cross out its multiples, and so on. Just as with trial division, we may stop once we reach an integer that is larger than `sqrt N`. Every integer that has not been crossed out by then is prime.

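Here is the result of using the Sieve of Eratosthenes to generate all primes between 2 and 30. Numbers that were crossed out are shown in parentheses; the rest are prime:

2 3 (4) 5 (6) 7 (8) (9) (10) 11 (12) 13 (14) (15) (16) 17 (18) 19 (20) (21) (22) 23 (24) (25) (26) (27) (28) 29 (30)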

We can easily implement the Sieve of Eratosthenes in Pascal:

{$mode delphi}

var
  n, i, j: integer;
  prime: array of boolean;

begin
  write('Enter n: ');
  readln(n);
  setLength(prime, n + 1);
  for i := 2 to n do
    prime[i] := true;
  
  for i := 2 to trunc(sqrt(n)) do
    if prime[i] then
      begin
        j := 2 * i;
        while j <= n do
          begin
            prime[j] := false;
            j := j + i;
          end;
      end;
  
  for i := 2 to n do
    if prime[i] then write(i, ' ');
  writeln;
end.

How long does the Sieve of Eratosthenes take to run?

The inner loop, in which we set prime[j] := false, runs about N/2 times when we cross out all the multiples of 2, then about N/3 times when we cross out multiples of 3, and so on. So its total number of iterations will be approximately

`N(1/2 + 1/3 + 1/5 + ... + 1/p)`

where p <= `sqrt N`.
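We can check this estimate by counting inner loop iterations directly. The sketch below repeats the sieve from above with a counter added; the function name sieveSteps is made up for this example. For N = 30 the loop runs 14 + 9 + 5 = 28 times, slightly below 30(1/2 + 1/3 + 1/5) = 31 because each pass starts at 2p rather than at p.

```pascal
{$mode objfpc}
program SieveCount;

// Runs the sieve up to n and returns how many times the inner loop body
// (prime[j] := false) was executed.
function sieveSteps(n: longint): int64;
var
  prime: array of boolean;
  i, j: longint;
  steps: int64;
begin
  setLength(prime, n + 1);
  for i := 2 to n do
    prime[i] := true;
  steps := 0;
  for i := 2 to trunc(sqrt(n)) do
    if prime[i] then
      begin
        j := 2 * i;
        while j <= n do
          begin
            prime[j] := false;
            steps := steps + 1;
            j := j + i;
          end;
      end;
  sieveSteps := steps;
end;

var
  n: longint;
  steps: int64;

begin
  // Print n, the step count, and the ratio steps / n for growing n.
  n := 1000;
  while n <= 1000000 do
    begin
      steps := sieveSteps(n);
      writeln(n:8, steps:10, steps / n:8:3);
      n := n * 10;
    end;
end.
```

The last column is the number of steps per element, which should grow only very slowly as n increases.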

The series

`sum_(p prime) 1/p = 1/2 + 1/3 + 1/5 + …`

is called the prime harmonic series. How can we approximate its sum through a given element 1/p?

First, we may note that the elements of the prime harmonic series are a subset of elements of the ordinary harmonic series, which sums the reciprocals of all positive integers:

`1/1 + 1/2 + 1/3 + … `

The ordinary harmonic series diverges, and the sum of all its terms through 1/n is close to ln n. In fact the difference between this sum and ln n converges to a constant:

`lim_(n->oo) [sum_(k=1)^n 1/k - ln n] = 0.577...`

So since p <= `sqrt N` we have

`N(1/2 + 1/3 + 1/5 + ... + 1/p) <= N(1/2 + 1/3 + 1/4 + ... + 1/p) = N * O(log sqrt N) = N * O(log N) = O(N log N)`

We can derive a tighter bound using properties of the prime harmonic series itself. Surprisingly, the prime harmonic series diverges. This was first demonstrated by Euler. It grows extremely slowly, and its partial sum over all primes p <= n is close to ln (ln n):

`lim_(n->oo) [sum_(p<=n) 1/p - ln (ln n)] = 0.261...`

This shows in turn that

`N(1/2 + 1/3 + 1/5 + ... + 1/p) = N * O(log log (sqrt N)) = N * O(log log N) = O(N log log N)`

This is very close to O(N). (In fact more advanced algorithms can generate all primes through N in time O(N).)
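The two limits quoted above can be checked numerically. The following sketch computes both differences for a given n; the function names harmonicGap and primeGap are invented for this example, and the primes are found with a simple sieve.

```pascal
{$mode objfpc}
program LimitChecks;

// Returns (1/1 + 1/2 + ... + 1/n) - ln n.
function harmonicGap(n: longint): double;
var
  k: longint;
  sum: double;
begin
  sum := 0;
  for k := 1 to n do
    sum := sum + 1 / k;
  harmonicGap := sum - ln(n);
end;

// Returns (sum of 1/p over all primes p <= n) - ln(ln n).
function primeGap(n: longint): double;
var
  composite: array of boolean;
  i, j: longint;
  sum: double;
begin
  setLength(composite, n + 1);   // new elements start out false
  sum := 0;
  for i := 2 to n do
    if not composite[i] then     // i is prime
      begin
        sum := sum + 1 / i;
        j := 2 * i;
        while j <= n do
          begin
            composite[j] := true;
            j := j + i;
          end;
      end;
  primeGap := sum - ln(ln(n));
end;

var
  n: longint;

begin
  // Print n and both differences for growing n.
  n := 100;
  while n <= 1000000 do
    begin
      writeln(n:8, harmonicGap(n):10:5, primeGap(n):10:5);
      n := n * 10;
    end;
end.
```

As n grows, the second column should approach 0.577... and the third 0.261...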