Programming I, 2018-9
Lecture 6 – Notes

+= and friends

The operators +=, -=, *= and /= let us write some assignment statements more compactly. For example,

x += 7;

is equivalent to

x := x + 7;

Similarly,

a[i] *= 2;

is equivalent to

a[i] := a[i] * 2;

records

A record is a compound value that contains a set of fields. Each record type defines the names and types of the fields that it contains. For example:

type
  book = record
    title: string;
    author: string;
    pages: integer;
  end;

You can access a record's fields using the '.' operator. For example:

var
  b: book;

begin
  b.title := 'war and peace';
  b.author := 'leo tolstoy';
  b.pages := 1440;
  
  writeln('title = ', b.title);
end.
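
Records can also be passed to functions and returned from them like any other value. The following is a minimal sketch, assuming Free Pascal; the helpers makeBook and longer and the second book's values are made up for illustration and do not come from the lecture.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program books;

type
  book = record
    title: string;
    author: string;
    pages: integer;
  end;

// Hypothetical helper: builds a book record from its three field values.
function makeBook(title, author: string; pages: integer): book;
var
  b: book;
begin
  b.title := title;
  b.author := author;
  b.pages := pages;
  exit(b);
end;

// Hypothetical helper: returns whichever of two books has more pages.
function longer(a, b: book): book;
begin
  if a.pages >= b.pages then
    exit(a)
  else
    exit(b);
end;

var
  b1, b2, r: book;
begin
  b1 := makeBook('war and peace', 'leo tolstoy', 1440);
  b2 := makeBook('a short book', 'some author', 90);   // made-up values
  r := longer(b1, b2);
  writeln(r.title);   // writes: war and peace
end.
```

Note that a record is copied when it is passed by value or returned, so changing b inside makeBook has no effect on any variable in the caller.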

closest pair of points

We can conveniently use a record type to store a point (x, y) in 2-dimensional space:

type
  point = record
    x, y: real;
  end;

Let's write a program that generates 100 random points in the square whose corners are at (0.0, 0.0) and (100.0, 100.0). For example, one random point might be (32.76852, 71.03879). Our program will determine the distance between the two closest points in this random set of 100.

We first write a function to determine the Euclidean distance between any two points:

uses math;

function distance(p, q: point): real;
begin
  exit(sqrt((p.x - q.x) ** 2 + (p.y - q.y) ** 2));
end;

Now we can generate and compare the random points as follows:

var
  a: array[1..100] of point;
  i, j: integer;
  d: real;
  min: real = Infinity;
begin
  randomize;
  
  for i := 1 to 100 do
    begin
      a[i].x := 100 * random;
      a[i].y := 100 * random;
    end;
  for i := 1 to 100 do
    for j := (i + 1) to 100 do
      begin
        d := distance(a[i], a[j]);
        if d < min then
          min := d;
      end;
  writeln(min:0:3);  // write the smallest distance found, not the last one computed
end.

Note carefully the form of the double for loop above:

  for i := 1 to 100 do
    for j := (i + 1) to 100 do

In this loop we always have i < j, and each pair of points is compared exactly once. If we had written

  for i := 1 to 100 do
    for j := 1 to 100 do

then the program would always (incorrectly) generate the answer 0, since if i = j then distance(a[i], a[j]) is the distance from a point to itself, i.e. 0. Furthermore, this loop would examine each pair of points twice. For example, when i = 10 and j = 20 it would compute distance(a[10], a[20]), and then when i = 20 and j = 10 it would compute distance(a[20], a[10]), which is the same.

If we let the number of points be N instead of 100, then how long will our program take to run? When i = 1, j iterates from 2 to N, a total of (N – 1) values. When i = 2, j iterates from 3 to N, a total of (N – 2) values. And so on. The total number of calls to the distance function is

(N – 1) + (N – 2) + … + 2 + 1 = N(N – 1) / 2 = O(N²)

This is the running time of our program.
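
We can check this count directly. The sketch below (assuming Free Pascal; pairCount is a name chosen here for illustration) runs the same double loop but only counts iterations, then compares the result with the closed form N(N – 1) / 2 of the arithmetic series.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program countPairs;

// Counts how many (i, j) pairs with 1 <= i < j <= n the double loop visits.
function pairCount(n: integer): integer;
var
  i, j: integer;
  count: integer = 0;
begin
  for i := 1 to n do
    for j := i + 1 to n do
      count := count + 1;
  exit(count);
end;

begin
  writeln(pairCount(100));          // writes 4950
  writeln(100 * (100 - 1) div 2);   // closed form N(N - 1) / 2, also 4950
end.
```

So for 100 points the program calls distance 4950 times, roughly half of the 10000 calls that the naive 1..100 by 1..100 loop would make.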

(By the way, there are more advanced algorithms that can find the closest pair in O(N log N) time, though we will not study them in this class.)

binary search for a boundary

Last week we learned how to use a binary search to look for a value in a sorted array. Now we will look at another application of binary search.

Suppose that we know that a given array consists of a series of values with some property P followed by some series of values that do not have that property. Then we can use a binary search to find the boundary point dividing the values with property P from those without it.

For example, suppose that we know that an array contains a series of even numbers followed by a series of odd numbers. The values in the array might be

8 4 22 6 84 4 10 7 9 9 3 17

We want to find the index of the first odd value in the array. Here is a function to do that using a binary search:

function firstOdd(const a: array of integer): integer;
var
  lo, hi, mid: integer;
begin
  lo := -1;
  hi := length(a);
  while hi - lo > 1 do
    begin
      mid := (lo + hi) div 2;
      if a[mid] mod 2 = 0 then  // even
        lo := mid
      else
        hi := mid;
    end;
  exit(hi);
end;

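As the function runs, every element at an index <= lo is known to be even, every element at an index >= hi is known to be odd, and the elements strictly between lo and hi are still unknown. Each iteration tests the middle unknown element and moves either lo or hi to it, so the unknown region shrinks by about half.

When the while loop finishes, hi = lo + 1. So there are no unknown elements: everything up to a[lo] is even and everything from a[hi] onward is odd. hi is the index of the first odd value. If the array contains no odd values at all, then hi never moves and the function returns length(a).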

We could use the same technique to find, for example, the first element in a sorted array whose value is >= 100.
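
Here is a sketch of that variation, assuming Free Pascal. The name firstAtLeast and the parameter k are chosen here for illustration. The property P is now "the value is less than k", which holds for a prefix of any sorted array.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program boundary;

// Returns the index of the first element of the sorted array a whose value
// is >= k, or length(a) if there is no such element.
function firstAtLeast(const a: array of integer; k: integer): integer;
var
  lo, hi, mid: integer;
begin
  lo := -1;           // every element at an index <= lo is known to be < k
  hi := length(a);    // every element at an index >= hi is known to be >= k
  while hi - lo > 1 do
    begin
      mid := (lo + hi) div 2;
      if a[mid] < k then
        lo := mid
      else
        hi := mid;
    end;
  exit(hi);
end;

begin
  writeln(firstAtLeast([3, 20, 57, 100, 100, 250], 100));   // writes 3
  writeln(firstAtLeast([3, 20, 57], 100));                  // writes 3 (= length, none found)
end.
```

Only the test inside the loop differs from firstOdd; the handling of lo and hi is unchanged.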

insertion sort

We recently learned how to sort an array using bubble sort. Bubble sort is the easiest sorting algorithm to write, but is also relatively inefficient. Let's now learn a more efficient algorithm called insertion sort.

Insertion sort is similar to how you might sort a deck of cards by hand. The sort loops through the array elements from left to right. Assume that elements are indexed from 0 as in a Pascal dynamic array. For each element a[i], we find all elements to the left of a[i] that are greater than a[i] and shift them rightward one position. This makes room so that we can insert a[i] at a position to the left of those elements. And now the subarray a[0] … a[i] is in sorted order. After we repeat this for all i, the entire array is sorted.

For example, consider an insertion sort on this array:

6 5 3 1 8 7 …

We shift 6 rightward and insert 5 to its left. Now a[0..1] is sorted:

5 6 3 1 8 7 …

Now we shift 5 and 6 rightward, and insert 3 before them. Now a[0..2] is sorted:

3 5 6 1 8 7 …

Now we shift 3, 5, and 6 to the right, and insert 1 before them:

1 3 5 6 8 7 …

All elements to the left of 8 are less than it, so we can leave it in its place for the moment:

1 3 5 6 8 7 …

Now we shift 8 rightward and insert 7:

1 3 5 6 7 8 …

And so on. Here is an animation of insertion sort in action on the above array.

More concretely, to insert a[i] into the sorted subarray a[0 .. (i - 1)], insertion sort first saves the value of a[i] in a variable v. It then walks backwards through the subarray, shifting elements forward by one position as it goes. When it sees an element that is less than or equal to v, it stops, and inserts v to the right of that element. At this point the entire subarray a[0 .. i] is sorted.
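
The description above can be sketched in Pascal as follows. This is one possible implementation, assuming Free Pascal; the test array holds only the first six values from the walk-through.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program insertion;

// Sorts a in place into ascending order using insertion sort.
procedure insertionSort(var a: array of integer);
var
  i, j, v: integer;
begin
  for i := 1 to high(a) do
    begin
      v := a[i];       // value to insert into the sorted subarray a[0 .. i - 1]
      j := i - 1;
      // Walk backwards, shifting every element greater than v one place right.
      // The test j >= 0 comes first so that a[-1] is never read
      // (this relies on short-circuit evaluation, the compiler's default).
      while (j >= 0) and (a[j] > v) do
        begin
          a[j + 1] := a[j];
          j := j - 1;
        end;
      a[j + 1] := v;   // v lands just to the right of the first element <= v
    end;
end;

var
  a: array[0..5] of integer = (6, 5, 3, 1, 8, 7);
  i: integer;
begin
  insertionSort(a);
  for i := 0 to high(a) do
    write(a[i], ' ');
  writeln;             // the line written is: 1 3 5 6 7 8
end.
```

Because the inner loop stops at the first element that is less than or equal to v, equal values keep their original relative order.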

What is the running time of insertion sort? In the best case, the input array is already sorted. Then no elements are shifted or modified at all and the algorithm runs in time O(n). The worst case is when the input array is in reverse order. Then to insert each value we must shift all the elements to its left, so the total number of shifts is 1 + 2 + … + (n – 1) = O(n²). If the input array is ordered randomly, then on average we will shift half of the subarray elements on each iteration. Then the time is still O(n²).

Insertion sort has the same worst-case asymptotic running time as bubble sort, i.e. O(n²). But it generally outperforms bubble sort by a factor of 3 or more. It is a reasonable choice for a simple sorting algorithm when n is not large.

joining digits from right to left

We have already seen how to join a series of digits from left to right into a number. To review, here is a function that joins the digits from 1 to k into a number from left to right:

function join(k: integer): integer;
var
  i: integer;
  n: integer = 0;
begin
  for i := 1 to k do
    n := 10 * n + i;
  exit(n);
end;

For example, join(5) = 12345.

Let's now see how to join digits in the other direction, from right to left. That is not much more difficult. Here's a function that joins the digits from 1 to k in that direction:

function join2(k: integer): integer;
var
  i: integer;
  d: integer = 1;
  n: integer = 0;
begin
  for i := 1 to k do
    begin
      n := n + i * d;
      d *= 10;
    end;
  exit(n);
end;

For example, join2(5) = 54321.

non-decimal bases

We would now like to write programs that can work with numbers written in non-decimal bases, i.e. bases other than 10.

Base 2 (binary) is especially common in computer programming. We looked at numbers in base 2 in the first lecture of this class. We also often use base 16 (hexadecimal), in which we have extra digits a = 10, b = 11, c = 12, d = 13, e = 14, f = 15. For example,

ff₁₆ = 255₁₀ because 15 · 16¹ + 15 · 16⁰ = 240 + 15 = 255

Also,

ff₁₆ = 11111111₂

Note that you can convert any hexadecimal number to binary by simply converting each hexadecimal digit to its four-bit binary equivalent (e.g. f₁₆ = 1111₂) and concatenating the results.
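
Here is a sketch of that digit-by-digit conversion, assuming Free Pascal. The function names hexValue and hexToBinary are chosen here for illustration, and the code assumes its input contains only the digits 0–9 and lowercase a–f.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program hex2bin;

// Converts one hexadecimal digit ('0'..'9', 'a'..'f') to its value 0..15.
// Assumes the digit is valid and lowercase.
function hexValue(c: char): integer;
begin
  if c <= '9' then
    exit(ord(c) - ord('0'))
  else
    exit(ord(c) - ord('a') + 10);
end;

// Converts a hexadecimal string to binary, four bits per hex digit.
function hexToBinary(hexDigits: string): string;
var
  c: char;
  v, w: integer;
  s: string = '';
begin
  for c in hexDigits do
    begin
      v := hexValue(c);
      // Emit the four bits of v, most significant first: weights 8, 4, 2, 1.
      w := 8;
      while w > 0 do
        begin
          if v >= w then
            begin
              s := s + '1';
              v := v - w;
            end
          else
            s := s + '0';
          w := w div 2;
        end;
    end;
  exit(s);
end;

begin
  writeln(hexToBinary('ff'));   // writes 11111111
  writeln(hexToBinary('2a'));   // writes 00101010
end.
```

Every hex digit must produce exactly four bits, including leading zeros (2₁₆ becomes 0010₂), or the concatenation would give the wrong number.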

Let's now write a function toBinary that can generate the binary representation of any number. We store this representation, a series of binary digits, as a string. For example, toBinary(27) = '11011'.

This will not be difficult to write, because we already know how to do this in base 10. We have already seen how to visit all the base-10 digits of a number. To generate the binary digits, we write the same code, but just use the number 2 as our modulus:

function toBinary(n: integer): string;
var
  d: integer;
  s: string = '';
begin
  if n = 0 then
    exit('0');  // zero has no digits to visit, so handle it separately
  while n > 0 do
    begin
      d := n mod 2;
      s := chr(ord('0') + d) + s;
      n := n div 2;
    end;
  exit(s);
end;
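
The same loop works for any base if we make the modulus a parameter. The following generalization is a sketch rather than part of the lecture: the name toBase and the digit table are assumptions, and it supports only n >= 0 and bases 2 through 16.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program bases;

// Returns the representation of n (n >= 0) in the given base, 2 <= base <= 16.
function toBase(n: integer; base: integer): string;
const
  digits: string = '0123456789abcdef';
var
  s: string = '';
begin
  if n = 0 then
    exit('0');
  while n > 0 do
    begin
      // Strings are indexed from 1, so digit value d is found at digits[d + 1].
      s := digits[(n mod base) + 1] + s;
      n := n div base;
    end;
  exit(s);
end;

begin
  writeln(toBase(27, 2));     // writes 11011
  writeln(toBase(255, 16));   // writes ff
  writeln(toBase(10, 6));     // writes 14
end.
```

Looking the digit up in a table replaces the chr(ord('0') + d) trick, which only works for digit values 0 through 9.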

Let's now write a program that reads a number in base 7 and writes it in base 10. To do this, we will need to combine a series of base-7 digits into a number. Again, we have already done this in base 10! To accomplish the same task in base 7, we simply change our multiplicative constant to 7:

var
  s: string;
  c: char;
  n: integer = 0;
begin
  readln(s);
  for c in s do
    n := 7 * n + (ord(c) - ord('0'));
  writeln(n);
end.
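
The program above trusts its input. As a sketch of how the same loop might be packaged as a reusable function that also checks each digit, consider the following (assuming Free Pascal). The name fromBase and the convention of returning -1 for invalid input are choices made here for illustration.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program parse;

// Interprets s as a number written in the given base (2 <= base <= 10)
// and returns its value. Returns -1 if s contains a character that is
// not a valid digit in that base.
function fromBase(s: string; base: integer): integer;
var
  c: char;
  d: integer;
  n: integer = 0;
begin
  for c in s do
    begin
      d := ord(c) - ord('0');
      if (d < 0) or (d >= base) then
        exit(-1);
      n := base * n + d;
    end;
  exit(n);
end;

begin
  writeln(fromBase('123', 7));   // writes 66, since 1 * 49 + 2 * 7 + 3 = 66
  writeln(fromBase('128', 7));   // writes -1, since 8 is not a base-7 digit
end.
```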

exercise: finding substrings

Let's write a function contains that takes two strings S and T, and returns true if S contains T. For example, contains('key lime pie', 'lime') should return true.

function contains(s, t: string): boolean;
var
  i, j: integer;
  match: boolean;
begin
  for i := 1 to length(s) - length(t) + 1 do
    begin
      match := true;
      for j := 1 to length(t) do
        if s[i + j - 1] <> t[j] then
          begin
            match := false;
            break;
          end;
      if match then
        exit(true);
    end;
  exit(false);
end;

The function runs in time O(M · N) in the worst case, where M = length(S) and N = length(T). (By the way, there exist other string matching algorithms that are more efficient than this.)
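
For comparison, here is a shorter variant that lets the built-in copy function extract each candidate substring, along with the built-in pos function, which returns the index of the first occurrence of one string in another (or 0 if there is none). This is a sketch assuming Free Pascal; contains2 is a name chosen here, and it does the same amount of comparison work as the original.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program substrings;

// Variant of contains that compares whole substrings using copy.
function contains2(s, t: string): boolean;
var
  i: integer;
begin
  for i := 1 to length(s) - length(t) + 1 do
    if copy(s, i, length(t)) = t then
      exit(true);
  exit(false);
end;

begin
  writeln(contains2('key lime pie', 'lime'));    // writes TRUE
  writeln(contains2('key lime pie', 'lemon'));   // writes FALSE
  writeln(pos('lime', 'key lime pie') > 0);      // built-in check, writes TRUE
  writeln(pos('lime', 'key lime pie'));          // writes 5, the starting index
end.
```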

exercise: highest sum in any base

Let's write a program that reads a decimal number N and computes the sum of the digits of N in every base from 2 through 10. The program will write out the base in which the digit sum is greatest. For example, if N = 10 the program will write 6. That's because 10 = 14₆ (that is, 14 in base 6), in which the digit sum is 1 + 4 = 5, and no other base would give a higher digit sum.

function digitSumInBase(n: integer; base: integer): integer;
var
  sum: integer = 0;
begin
  while n > 0 do
    begin
      sum += (n mod base);
      n := n div base;
    end;
  exit(sum);
end;

var
  n, b, s: integer;
  maxSum: integer = - MaxInt;
  maxBase: integer = 0;

begin
  readln(n);
  for b := 2 to 10 do
    begin
      s := digitSumInBase(n, b);
      if s > maxSum then
        begin
          maxSum := s;
          maxBase := b;
        end;
    end;
  writeln(maxBase);
end.
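
To check the example N = 10 by hand, we can write out the digit sum in every base. The following sketch (assuming Free Pascal) reuses digitSumInBase and simply prints one line per base.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program digitSums;

function digitSumInBase(n: integer; base: integer): integer;
var
  sum: integer = 0;
begin
  while n > 0 do
    begin
      sum := sum + (n mod base);
      n := n div base;
    end;
  exit(sum);
end;

var
  b: integer;
begin
  // Digit sums of 10 in bases 2 .. 10.
  // The sums written are 2, 2, 4, 2, 5, 4, 3, 2, 1, so base 6 wins with 5.
  for b := 2 to 10 do
    writeln('base ', b, ': ', digitSumInBase(10, b));
end.
```

Because the main program only replaces maxBase when s > maxSum, a tie between two bases is resolved in favor of the smaller base.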

exercise: time addition

This record type represents a time of day:

type
  time = record
    hours: integer;     // 0 .. 23
    minutes: integer;   // 0 .. 59
  end;

We will write a function

function addMinutes(t: time; n: integer): time;

that adds a positive or negative number of minutes to a time, wrapping around as necessary. For example, adding 30 minutes to 23:45 should yield 0:15, and adding –20 minutes to 0:10 should yield 23:50.

The easiest way to solve this problem is to convert the input time into another representation: a single integer representing a number of minutes since midnight. This will range from 0 (meaning midnight) to 1439 (meaning 23:59). Note that there are 24 * 60 = 1440 minutes in a day.

In this representation, the problem is easy to solve since we can simply add n and then use the mod operator to map the result back into the range 0..1439. This will handle wraparound in exactly the way we want. We have seen in a previous lecture that mod can yield a negative value if its first argument is negative, but in this case we can simply add the modulus (1440) to send the result back into positive territory.

Once we have the answer as a number of minutes since midnight, we can easily use div and mod to map it back into hours and minutes. Here is the complete function:

function addMinutes(t: time; n: integer): time;
var
  m: integer;  // minutes since midnight
  u: time;
begin
  m := 60 * t.hours + t.minutes;
  m := (m + n) mod (24 * 60);
  if m < 0 then
    m += 24 * 60;
  u.hours := m div 60;
  u.minutes := m mod 60;
  exit(u);
end;
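
The wraparound step can also be written without an if statement by applying mod twice. This is an alternative sketch, not the method used above; it assumes Free Pascal, and the name wrapMinutes is chosen here for illustration.

```pascal
{$mode delphi}   // assumption: Free Pascal in Delphi mode
program wrap;

// Maps any integer m into the range 0 .. 1439 (minutes since midnight).
function wrapMinutes(m: integer): integer;
const
  minutesPerDay = 24 * 60;
begin
  // The inner mod gives a value in -1439 .. 1439; adding 1440 makes it
  // positive, and the outer mod brings it back below 1440.
  exit(((m mod minutesPerDay) + minutesPerDay) mod minutesPerDay);
end;

begin
  writeln(wrapMinutes(1425 + 30));   // 23:45 plus 30 minutes: writes 15 (0:15)
  writeln(wrapMinutes(10 - 20));     // 0:10 minus 20 minutes: writes 1430 (23:50)
  writeln(wrapMinutes(-1440));       // exactly one day back: writes 0
end.
```

Both versions give the same results. The version with the if statement does one mod fewer, while this one fits in a single expression.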