Programming I, 2018-9
Lecture 2 – Notes

int64 type

Consider this program that attempts to add the integers from 1 to 200,000:

{$mode delphi}

var
  i: integer;
  sum: integer = 0;

begin
  for i := 1 to 200 * 1000 do
    sum := sum + i;
  writeln(sum);
end.

The program prints:

-1474736480

Clearly this is wrong.

The problem is that the computation has overflowed, i.e. exceeded the range of a signed 32-bit integer. Actually we should have expected this. From algebra we know that the sum of the integers from 1 to N is (N)(N + 1) / 2, which is more than N² / 2. If N = 200,000, then N² / 2 = (200,000)(200,000) / 2 = 20,000,000,000. This is much greater than the largest possible 32-bit integer (which is about 2 billion).

Fortunately Pascal includes a type int64 representing a signed 64-bit integer. A value of type int64 can be any integer in the range from –2⁶³ to 2⁶³ – 1, i.e. from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. In the program above, if we change the sum variable to have type int64 then it will print the correct answer:

20000100000
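
For reference, here is a sketch of the corrected program. The only change from the version above is the declared type of sum; the loop variable i can stay an integer, since 200,000 easily fits in 32 bits:

```pascal
{$mode delphi}

var
  i: integer;
  sum: int64 = 0;   // 64-bit accumulator, so the running total cannot overflow here

begin
  for i := 1 to 200 * 1000 do
    sum := sum + i;   // i is converted to int64 automatically
  writeln(sum);       // prints 20000100000
end.
```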

type conversions

Pascal will implicitly (automatically) convert between certain types:

char → string
integer, int64 → real
integer → int64
int64 → integer

For example:

var
  c: char;
  s: string;
  i: integer;
  r: real;

begin
  readln(c);
  readln(i);
  s := c;  // char converted to string
  r := i;  // integer converted to real
end.

boolean operators

Pascal includes the three familiar Boolean operators and, or and not. Most often we use these in if statements:

if (x > 3) and (y > 5) then …

Note that the parentheses in the preceding statement are required! If you attempt to write

if x > 3 and y > 5 then … // ERROR

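you will get a compiler error. That's because and has higher precedence than the comparison operators in Pascal, so the compiler reads the condition as x > (3 and y) > 5.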

not is a unary operator, i.e. an operator that takes a single argument:

if not ((a >= 0) and (a <= 10)) then …

You may use these operators in any expression, not just in if statements. Each of them returns a Boolean value:

var
  x, y: integer;
  b, c: boolean;

begin
  readln(x, y);
  b := (x > 3) and (y < 3);
  c := not b;
  …

nested loops

In programming it is common to use a nested loop, i.e. a loop inside a loop. For example, here's a program that prints out a rectangle of asterisks:

var
  x, y: integer;
  i, j: integer;

begin
  write('Enter dimensions: ');
  readln(x, y);
  
  for i := 1 to y do
    begin
      for j := 1 to x do
        write('*');
      writeln;
    end;
end.

The output looks like this:

Enter dimensions: 7 3
*******
*******
*******

Note that the inner loop ("for j := 1 to x") runs in its entirety on every iteration of the outer loop ("for i := 1 to y"). So the total number of asterisks printed is x ⋅ y.

It is also possible (and common) to change the bounds of the inner loop on each iteration of the outer loop. For example, here is a program to print a triangle of asterisks:

var
  n: integer;
  i, j: integer;

begin
  write('Enter size: ');
  readln(n);
  
  for i := 1 to n do
    begin
      for j := 1 to i do
        write('*');
      writeln;
    end;
end.

The output looks like this:

Enter size: 5
*
**
***
****
*****

In each of these examples we've used a doubly nested loop, i.e. a loop inside a loop. Of course, loops may be triply or even arbitrarily nested.

formatting numeric output

Here's a program that uses a nested loop to print a multiplication table:

var
  i, j: integer;

begin
  for i := 1 to 10 do
    begin
      for j := 1 to 10 do
        write(i * j, ' ');
      writeln;
    end;
end.

Here is the output:

1 2 3 4 5 6 7 8 9 10 
2 4 6 8 10 12 14 16 18 20 
3 6 9 12 15 18 21 24 27 30 
4 8 12 16 20 24 28 32 36 40 
5 10 15 20 25 30 35 40 45 50 
6 12 18 24 30 36 42 48 54 60 
7 14 21 28 35 42 49 56 63 70 
8 16 24 32 40 48 56 64 72 80 
9 18 27 36 45 54 63 72 81 90 
10 20 30 40 50 60 70 80 90 100

This looks poor because the columns don't line up. That's because the numbers in the table have varying numbers of digits.

We can fix this by specifying a field width as we output each value:

write(i * j : 3, ' ');

In this statement the field width is 3. The field width is the minimum number of characters to output as the value is written. If the value is shorter than the field width, the output will be padded at the left with space characters. (The default field width is 0.)

With this change, the output now looks like this:

  1   2   3   4   5   6   7   8   9  10 
  2   4   6   8  10  12  14  16  18  20 
  3   6   9  12  15  18  21  24  27  30 
  4   8  12  16  20  24  28  32  36  40 
  5  10  15  20  25  30  35  40  45  50 
  6  12  18  24  30  36  42  48  54  60 
  7  14  21  28  35  42  49  56  63  70 
  8  16  24  32  40  48  56  64  72  80 
  9  18  27  36  45  54  63  72  81  90 
 10  20  30  40  50  60  70  80  90 100

Much nicer.
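
As a variation (just a sketch, not the only way to do it), we can drop the separate ' ' and let the field width alone provide the spacing. The width of 4 here is an arbitrary choice; it leaves at least one space in front of even the three-digit value 100:

```pascal
var
  i, j: integer;

begin
  for i := 1 to 10 do
    begin
      for j := 1 to 10 do
        write(i * j : 4);   // right-align each product in a 4-character field
      writeln;              // end the current row
    end;
end.
```

With this version there are no trailing spaces at the end of each row, because the padding comes before each number rather than after it.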

We can additionally specify a decimal width when we use write() to output a real value. Consider this program to print the average of three numbers:

var
  x, y, z: real;
  avg: real;

begin
  readln(x, y, z);
  avg := (x + y + z) / 3;
  writeln(avg);
end.

When we run it with the input "4.2 4.3 4.4", the program prints

 4.2999999999999998E+000

This is ugly. There are a couple of problems. First, the number is printed in scientific notation with lots of digits plus an exponent ("E+000") at the end. Often we don't want or need to see so much information. Second, the result is inexact: it is 4.2999999999999998 rather than 4.3000000000000000 due to floating-point rounding error, a common occurrence (whose exact causes are beyond the scope of this course).

We can fix both these problems by specifying a decimal width:

writeln(avg : 0 : 2);

In this expression 0 is the field width, which was described above. (0 is the default value, meaning that no extra spaces will be written.) 2 is the decimal width, which is the number of digits to print after the decimal point. Now the program prints

4.30

Notice that the output value was rounded to two decimal places, yielding 4.30 rather than 4.29.
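
Here is a small sketch that combines a field width with a decimal width. The value 2 / 3 and the widths are arbitrary choices made for illustration:

```pascal
var
  x: real;

begin
  x := 2 / 3;
  writeln(x : 0 : 2);   // prints 0.67 (rounded to two decimal places)
  writeln(x : 0 : 4);   // prints 0.6667
  writeln(x : 8 : 2);   // prints 0.67 preceded by four spaces, 8 characters in all
end.
```

The last line shows that the two widths work together: the number is first rounded to two decimal places, and the result is then padded on the left to fill the field.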

while

In the last lecture we saw how to write loops using the for statement. for works well when we know in advance how many times we want to loop. But often we want to loop for as long as a certain condition remains true, without knowing in advance how many times that will be. The while statement lets us do this.

Here's a program that uses while to remove all zeros at the end of a decimal number:

var
  n: integer;

begin
  write('Enter n: ');
  readln(n);
  
  while n mod 10 = 0 do
    n := n div 10;
    
  writeln(n);
end.

For example:

Enter n: 18000
18

If the condition in a while loop is false at the beginning, the loop body is skipped and no iterations run:

Enter n: 1862
1862

Just as with for, you can use begin...end if you want the loop body to contain more than one statement:

while n mod 10 = 0 do
  begin
    writeln('Dividing by ten…');
    n := n div 10;
  end;
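
One input deserves care: 0. Since 0 mod 10 = 0 and 0 div 10 = 0, the condition n mod 10 = 0 never becomes false when n is 0, so the loop would run forever. A minimal sketch of one way to guard against this (the extra test n <> 0 is our addition, not part of the program above):

```pascal
var
  n: integer;

begin
  write('Enter n: ');
  readln(n);

  // Stop immediately when n is 0; otherwise strip trailing zeros as before.
  while (n <> 0) and (n mod 10 = 0) do
    n := n div 10;

  writeln(n);
end.
```

Note the parentheses around both comparisons, which are required for the reason described in the section on Boolean operators.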

reading an indefinite number of values

We can use while to write programs that read an indefinite number of values.

To accomplish this, the user must be able to signal that input from the keyboard is complete. On Linux and macOS, you can type Ctrl+D to indicate the end of keyboard input. On Windows, Ctrl+Z does the same thing.

You can call the built-in function seekEof to check whether the input is at its end. seekEof does two things. First, it advances past all leading whitespace characters in the input (spaces, newlines, and so on); this is the meaning of "seek". Then it returns a boolean value which is true if the input has reached the end. ("eof" means "end of file", a term which is used even when input comes from the keyboard in a terminal window.)

Here is a program that reads an arbitrary number of integers, one per line, and then prints their sum:

var
  sum: integer = 0;
  i: integer;

begin
  while not seekEof do
    begin
      readln(i);
      sum := sum + i;
    end;
  
  writeln(sum);
end.

If we run the program, type some integers one per line, and then press Ctrl+D, it will print their sum.

(Again, on Windows you must type Ctrl+Z rather than Ctrl+D here.)
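
The same pattern extends naturally to other computations over the input. As a sketch, here is a variation that also counts the values and prints their average. The variable count, the 'no input' message and the choice of two decimal places are our own assumptions for illustration:

```pascal
{$mode delphi}

var
  sum: int64 = 0;       // int64 in case the total is large
  count: integer = 0;   // how many values we have read so far
  i: integer;

begin
  while not seekEof do
    begin
      readln(i);
      sum := sum + i;
      count := count + 1;
    end;

  if count > 0 then
    writeln(sum / count : 0 : 2)   // average, two digits after the decimal point
  else
    writeln('no input');           // avoid dividing by zero
end.
```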

calling library functions

We've already learned how to call several procedures and one function in Pascal's standard library: write, writeln, read, readln, and seekEof. We will now learn how to call many more.

In Pascal, a function takes zero or more parameters (also called arguments) and returns a value. A procedure is like a function, but returns no value. (We often use the term "functions" to refer to functions and procedures generally, since it would be too awkward to say "functions and procedures" all the time.)

The Pascal Run-Time Library contains thousands of built-in functions that you can call. I've chosen some of the most useful functions in the library and listed them on the page Run-Time Library Quick Reference. You will want to refer to this often when writing Pascal code for this class.

In this library reference, a function signature (i.e. a line of text indicating the function's name, parameter types and return type) looks like this:

function stringOfChar(c: char; n: integer) : string

This means that the function stringOfChar takes two parameters, namely a char and an integer. It returns a string. The names “c” and “n” appear in the function documentation here, but you don't specify them when calling the function. You can call this function like this:

var
  s: string;

…

  s := stringOfChar('m', 4);           // now s = 'mmmm'

Similarly, a procedure's signature looks like this:

procedure gotoXY(x: integer; y: integer)

A procedure has no return type since it returns no value.

The functions in the standard library are grouped into units as specified in the documentation. To use the functions in a particular unit, you must specify it in a uses declaration at the top of your program. For example, the max function is in the math unit, so we can write

uses math;

begin
  writeln(max(5, 10));
end.

If you forget the uses declaration in this program, the compiler will complain:

Error: Identifier not found "max"

By the way, we will learn to write our own functions and procedures later in this course.

converting between characters and integers

As mentioned in the previous lecture, computers generally use the ASCII character set to represent text consisting of uppercase and lowercase Latin letters, decimal digits, and basic punctuation characters. (More exotic characters require the Unicode character set.) Here is a table of ASCII characters:

Dec  = Decimal Value
Char = Character


Dec  Char                           Dec  Char     Dec  Char     Dec  Char
---------                           ---------     ---------     ----------
  0  NUL (null)                      32  SPACE     64  @         96  `
  1  SOH (start of heading)          33  !         65  A         97  a
  2  STX (start of text)             34  "         66  B         98  b
  3  ETX (end of text)               35  #         67  C         99  c
  4  EOT (end of transmission)       36  $         68  D        100  d
  5  ENQ (enquiry)                   37  %         69  E        101  e
  6  ACK (acknowledge)               38  &         70  F        102  f
  7  BEL (bell)                      39  '         71  G        103  g
  8  BS  (backspace)                 40  (         72  H        104  h
  9  TAB (horizontal tab)            41  )         73  I        105  i
 10  LF  (NL line feed, new line)    42  *         74  J        106  j
 11  VT  (vertical tab)              43  +         75  K        107  k
 12  FF  (NP form feed, new page)    44  ,         76  L        108  l
 13  CR  (carriage return)           45  -         77  M        109  m
 14  SO  (shift out)                 46  .         78  N        110  n
 15  SI  (shift in)                  47  /         79  O        111  o
 16  DLE (data link escape)          48  0         80  P        112  p
 17  DC1 (device control 1)          49  1         81  Q        113  q
 18  DC2 (device control 2)          50  2         82  R        114  r
 19  DC3 (device control 3)          51  3         83  S        115  s
 20  DC4 (device control 4)          52  4         84  T        116  t
 21  NAK (negative acknowledge)      53  5         85  U        117  u
 22  SYN (synchronous idle)          54  6         86  V        118  v
 23  ETB (end of trans. block)       55  7         87  W        119  w
 24  CAN (cancel)                    56  8         88  X        120  x
 25  EM  (end of medium)             57  9         89  Y        121  y
 26  SUB (substitute)                58  :         90  Z        122  z
 27  ESC (escape)                    59  ;         91  [        123  {
 28  FS  (file separator)            60  <         92  \        124  |
 29  GS  (group separator)           61  =         93  ]        125  }
 30  RS  (record separator)          62  >         94  ^        126  ~
 31  US  (unit separator)            63  ?         95  _        127  DEL

Sometimes we want to convert a char in Pascal into its integer ASCII code, or vice versa. For example, suppose that we want to write a program that reads a character and prints the following character in the alphabet. We cannot do this:

var
  c, d: char;

begin
  read(c);
  d := c + 1;  // ERROR

That's because Pascal (unlike some languages) does not allow us to add an integer to a character.

Instead, we can convert the character to an integer, add one to it, then convert back to a character. The ord and chr functions let us perform these conversions:

var
  c, d: char;
  i, j: integer;

begin
  read(c);
  i := ord(c);  // char -> integer
  j := i + 1;
  d := chr(j);  // integer -> char
  writeln(d);
end.

If we run this program and type the character 'a', then i will be 97 (according to the ASCII table above). Then j will be 98, and the program will print 'b'.

Be warned: the ASCII code for a digit such as '7' is not the corresponding integer! If we run the program above and type '7', then i will be 55, not 7, as you can see in the table above.
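
To get the numeric value of a digit character, we can subtract the ASCII code of '0', since the digits '0' through '9' have consecutive codes 48 through 57 in the table above. A short sketch (the variable name digit is just for illustration):

```pascal
var
  c: char;
  digit: integer;

begin
  c := '7';
  writeln(ord(c));               // prints 55, the ASCII code of '7'

  digit := ord(c) - ord('0');    // 55 - 48
  writeln(digit);                // prints 7

  writeln(chr(ord('a') + 1));    // prints b, the character after 'a'
end.
```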

repeat

We have seen how to loop using for and while. A third statement for looping in Pascal is repeat, which repeatedly executes a block of statements until a condition becomes true.

Here's a program that uses repeat to read an integer repeatedly until the input is valid:

var
  i: integer;

begin
  repeat
    readln(i);
    if i < 0 then
      writeln('Error!  i cannot be negative');
  until i >= 0;
  ...
end.

Note that in a repeat loop the test condition is at the end of the loop, not at the beginning as in a while loop. This means that in a repeat loop the body will always execute at least once.

Also note that repeat...until can enclose several statements without wrapping them all in begin...end. This is another difference between repeat and while.

a guessing game

We now have the tools we need to write a simple game. In this game, the computer chooses a random number between 1 and 1000 and the user has to guess it:

I'm thinking of a number from 1 to 1000.
Your guess: 800
Too low!
Your guess: 900
Too high!
Your guess: 850
Too low!
Your guess: 860
Too low!
Your guess: 864
You got it!

Here is the program:

var
  n, guess: integer;

begin
  randomize;
  n := random(1000) + 1;
  writeln('I''m thinking of a number from 1 to 1000.');
  
  repeat
    write('Your guess: ');
    readln(guess);
    
    if guess < n then
      writeln('Too low!')
    else if guess > n then
      writeln('Too high!');
  until guess = n;
  
  writeln('You got it!');
end.

Notice that we must call randomize at the beginning of the program – otherwise the program will choose the same random number on every run.

Also notice the line

  n := random(1000) + 1;

As described in the library reference, random(N) returns a random number from 0 to (N – 1), so we must add 1 to obtain a random number in the range from 1 to N.

If we are the user playing this game, what is our best strategy to minimize the number of guesses we must make in the worst case? And how many guesses might be required?

As we play the game, at every moment we know that the target number falls in the range A...B for some integers A and B. As you might imagine, our best strategy is as follows. At each step, we guess a number G that divides this interval in half, i.e. G = (A + B) / 2. If G is too high, then we now know that the target number is in the range A … (G – 1). If it is too low, we now know that it's in the range (G + 1) … B.

This guessing strategy is called a binary search, and it is an important algorithm that we'll see again later in this course. With this strategy, at each step the size of the interval containing the target value drops by a factor of 2. So after K guesses its size has dropped by a factor of 2^K. In particular, in this case the interval originally contains 1000 numbers. After the first guess, its size is at most 1000 / 2 = 500. After the second guess, its size is at most 1000 / 2² = 250. And so on. After 10 guesses, the interval has dropped by a factor of 2¹⁰ = 1024, and must now contain only a single number. In other words, 10 guesses are always sufficient to determine the number that the computer has chosen.

In the general case, if the computer chooses a random number from 1 to N, then we need about log₂ N guesses in the worst case (more precisely, ⌊log₂ N⌋ + 1).
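
We can check the claim about 10 guesses experimentally. The following sketch plays the halving strategy against every possible target from 1 to 1000 and records the largest number of guesses ever needed. The variable names are our own, and the midpoint is rounded down with div (an assumption; the text above does not say how to round):

```pascal
var
  target, a, b, g, guesses, worst: integer;

begin
  worst := 0;

  for target := 1 to 1000 do
    begin
      a := 1;          // the target is known to lie in a..b
      b := 1000;
      guesses := 0;

      repeat
        g := (a + b) div 2;        // guess the middle of the interval
        guesses := guesses + 1;
        if g < target then
          a := g + 1               // too low: keep the upper part
        else if g > target then
          b := g - 1;              // too high: keep the lower part
      until g = target;

      if guesses > worst then
        worst := guesses;
    end;

  writeln(worst);   // prints 10
end.
```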

break

Sometimes we may wish to exit a loop early, before it would otherwise terminate. The break statement is used for this purpose. It breaks out of the nearest enclosing for, while, or repeat loop.

For example, here's a program that reads a series of numbers from its input and prints a message 'contains an odd number' if any of them are odd. As soon as it sees an odd number, it does not read any more numbers:

var
  odd: boolean;
  i: integer;

begin
  odd := false;

  while not seekEof do
    begin
      readln(i);
      if i mod 2 <> 0 then  // <> 0 also catches negative odd numbers, since -3 mod 2 = -1
        begin
          odd := true;
          break;
        end;
    end;

  if odd then
    writeln('contains an odd number')
  else
    writeln('no odd numbers');
end.

We could alternately write this program without break, as follows:

var
  odd: boolean;
  i: integer;

begin
  odd := false;

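  while not odd and not seekEof do  // test odd first, so we stop without looking for more input
    begin
      readln(i);
      if i mod 2 <> 0 then  // <> 0 also catches negative odd numbers
        odd := true;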
    end;

  if odd then
    writeln('contains an odd number')
  else
    writeln('no odd numbers');
end.

Study these programs to understand why they are equivalent.

In this program it does not matter much whether we use break. But sometimes break is the clearest or most convenient alternative.

exit

The exit statement is another way to jump out of the current code block. The exit statement will exit the main begin/end block, i.e. the entire program, immediately!

(Later, when we learn to write our own functions, we'll see that exit actually exits the current function. But the code we are writing today does not include any functions, so exit exits the entire program.)

We can rewrite the above program using exit as follows. Notice that no boolean variable is now needed.

var
  i: integer;

begin
  while not seekEof do
    begin
      readln(i);
      if i mod 2 <> 0 then  // <> 0 also catches negative odd numbers
        begin
          writeln('contains an odd number');
          exit;
        end;
    end;
  writeln('no odd numbers');
end.

primality testing

As we know from mathematics, a prime number is an integer greater than 1 whose only factors are 1 and itself. For example, 2, 7, 47 and 101 are all prime.

We would now like to write a program that tests whether a given number is prime. To do this, we will use a simple algorithm called trial division, which means dividing by each possible factor in turn. By the way, there also exist more efficient (and complex) algorithms for primality testing; you may encounter these in more advanced courses.

Here is a naive implementation of trial division:

var
  n, i: integer;
  
begin
  write('Enter n: ');
  readln(n);
  
  if n < 2 then              // primes are greater than 1
    begin
      writeln('not prime');
      exit;
    end;
  
  for i := 2 to n - 1 do
    if n mod i = 0 then
      begin
        writeln('not prime');
        exit;
      end;
      
  writeln('prime');
end.

This works fine, but is inefficient because it must test all integers from 2 up to (n – 1). When n is large, this can take a long time.

Actually for a given n we need test only the values up to sqrt(n). To see this, consider the following fact. If ab = n for integers a and b, then either a ≤ sqrt(n) or b ≤ sqrt(n). Proof: Suppose that a > sqrt(n) and b > sqrt(n). Then ab > sqrt(n) ⋅ sqrt(n) = n, a contradiction. So either a ≤ sqrt(n) or b ≤ sqrt(n).

It follows that if we have tested all the values from 2 through sqrt(n) and none of them divide n, then if ab = n we must have a = 1 or b = 1. And so n is prime.

So we can make our program much more efficient by replacing the statement

for i := 2 to n - 1 do

with

for i := 2 to trunc(sqrt(n)) do

Note that the call to trunc rounds down to the nearest integer. Without this call, the code will not compile, since sqrt returns a real but for expects the upper bound to be an integer.

With this change, our primality testing algorithm is complete. It is simple, but it is a significant algorithm, the first of many that we will see in this course. Like all algorithms, it is language-independent. Our presentation is in Pascal, but you could easily code the algorithm above in any programming language.
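
Putting the pieces together, the complete program might look like the following sketch. The check for n < 2 is our own addition, made so that inputs such as 0 and 1 are not reported as prime; everything else comes from the discussion above:

```pascal
var
  n, i: integer;

begin
  write('Enter n: ');
  readln(n);

  if n < 2 then                      // by definition, primes are greater than 1
    begin
      writeln('not prime');
      exit;
    end;

  // Only candidate factors up to sqrt(n) need to be tried.
  for i := 2 to trunc(sqrt(n)) do
    if n mod i = 0 then
      begin
        writeln('not prime');        // found a factor, so stop at once
        exit;
      end;

  writeln('prime');
end.
```

For n = 2 and n = 3 the upper bound trunc(sqrt(n)) is 1, so the loop body never runs and the program correctly prints 'prime'.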

execution speed

As an experiment, let's write a program to add the numbers from 1 to 1,000,000,000 (a billion). We will use an int64 to hold the result, since it will certainly not fit in a 32-bit integer:

{$mode delphi}

var
  i: integer;
  sum: int64 = 0;

begin
  for i := 1 to 1000 * 1000 * 1000 do
    sum := sum + i;
  
  writeln(sum);
end.

On my laptop, this program runs in about 1.7 seconds. This shows us that each iteration of the loop above runs in about 1.7 nanoseconds. Our usual units for measuring time intervals smaller than 1 second are

1 ms = 1 millisecond = a thousandth of a second
1 μs = 1 microsecond = a millionth of a second
1 ns = 1 nanosecond = a billionth of a second

This is typical: a modern desktop or laptop computer can execute about a billion low-level operations (such as adding two numbers or writing a number from the CPU to main memory) per second. This is incredibly fast. A billion is a very large number. To put things in perspective, a billion seconds is about 31.7 years, so it would take you that long to accomplish this summation task if you could add one number per second. :)

Let's now turn this into a double loop:

{$mode delphi}

var
  i, j: integer;
  sum: int64 = 0;

begin
  for i := 1 to 1000 * 1000 * 1000 do
    for j := 1 to 1000 * 1000 * 1000 do
      sum := sum + j;
  
  writeln(sum);
end.

The number we are computing will no longer fit even in an int64 so it will overflow, but that's not the point: we're interested in how long this program will take to run. Each complete run of the inner loop will take (at least on my laptop) 1.7 sec. The outer loop will run the inner loop 1,000,000,000 times. So the total running time will be 1.7 billion sec, or about 53.9 years.

Finally, let's make a triple loop:

{$mode delphi}

var
  i, j, k: integer;
  sum: int64 = 0;

begin
  for i := 1 to 1000 * 1000 * 1000 do
    for j := 1 to 1000 * 1000 * 1000 do
      for k := 1 to 1000 * 1000 * 1000 do
        sum := sum + k;
  
  writeln(sum);
end.

This will take one billion times longer to run, so (on my laptop at least) this program will take 53.9 billion years to run. According to physicists, the age of the universe is only about 13.8 billion years. So even if your computer is a bit faster than mine, you're going to have to be pretty patient if you want to wait around for this program to complete. :)
