Introduction to Algorithms, 2021-2
Week 4: Notes

Some of the topics we discussed today are covered in these sections of Problem Solving with Algorithms:

Here are some additional notes.

Euclid's algorithm, continued

In the last lecture we learned about Euclid's algorithm for finding the greatest common divisor of two integers. We said that it works because of this mathematical fact: for all integers a and b,

gcd(a, b) = gcd(b, a mod b)

Let's now prove that. First, let's introduce the notion of a linear combination of two integers a and b, which is any integer of the form ia + jb for some integers i and j. For example, 26 is a linear combination of 3 and 5, because 2 · 3 + 4 · 5 = 26. To put it differently, a linear combination of a and b is any number that we can form by adding up some number of a's plus some number of b's.

Recall that d | n means that d divides n, i.e. n is a multiple of d. Notice that if d | a and d | b, then d also divides any linear combination of a and b. For example, any linear combination of 6 and 10 must be divisible by 2, because 2 | 6 and 2 | 10.

We'd like to show that the sets {a, b} and {b, a mod b} have the same common divisors. Suppose that d | b and d | a mod b. We have already seen this equation that relates a, b, a div b and a mod b:

a = b · (a div b) + (a mod b)

We see that a is a linear combination of b and (a mod b). So we know that d | a. And so if d is a divisor of both b and (a mod b), then it's a divisor of both a and b.

Now suppose that d | a and d | b. We can rearrange the equation above to

a mod b = a – b · (a div b)

We see that (a mod b) is a linear combination of a and b. So we know that d | (a mod b). And so if d is a divisor of both a and b, then it's a divisor of both b and (a mod b).

So the sets {a, b} and {b, a mod b} have the same common divisors. Therefore the sets have the same greatest common divisors, and so gcd(a, b) = gcd(b, a mod b).
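
This identity is exactly what Euclid's algorithm exploits. As a reminder, an iterative version might look like this (a minimal sketch; the code from the last lecture may differ in details):

def gcd(a, b):
    while b != 0:
        a, b = b, a % b   # replace (a, b) with (b, a mod b)
    return a

print(gcd(60, 90))   # prints 30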

We'd now like to consider the running time of Euclid's algorithm. First note that for any integers a, b with a > b > 0, we have

(a mod b) < a / 2

For if 0 < b ≤ a / 2, then (a mod b) ≤ b – 1 < a / 2. Or if a / 2 < b < a, then a div b = 1, so (a mod b) = a – b < a – a / 2 = a / 2.

Now suppose that we call gcd(x, y) and the function runs for at least a few iterations. Then the sequence of calls looks like

gcd(x, y) → gcd(y, x mod y) → gcd(x mod y, y mod (x mod y)) → …

In two iterations the first argument has dropped from x to (x mod y), which is less than x / 2. So after every two iterations of the algorithm, the first argument is reduced by at least a factor of 2. This shows that the algorithm cannot iterate more than 2 log2(x) times. In other words, it runs in time O(log N) in the worst case, where N = x.

least common multiple

We have seen how to compute the greatest common divisor of two integers. A related concept is the least common multiple of two integers p and q, which is the smallest integer that is divisible by both p and q. For example,

lcm(60, 90) = 180

As with the gcd, we can find the least common multiple of two numbers by computing their prime factorizations. For every prime p, if the prime factorization of a includes p^i, and the prime factorization of b includes p^j, then the prime factorization of lcm(a, b) will include p^max(i, j). (In other words, the prime factorization of lcm(a, b) is the union of the prime factorizations of a and b, just as the prime factorization of gcd(a, b) is their intersection.) For example, 60 = 2^2 · 3 · 5 and 90 = 2 · 3^2 · 5, so lcm(60, 90) = 2^2 · 3^2 · 5 = 180.
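
To make this concrete, here is a sketch that computes the lcm by factoring both numbers and taking the larger exponent for each prime (the function names are my own, and this is for illustration only, since factoring large numbers is slow):

from collections import Counter

def factorize(n):
    # Return a Counter mapping each prime factor of n to its exponent.
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:             # whatever remains is itself prime
        factors[n] += 1
    return factors

def lcm_by_factoring(a, b):
    fa, fb = factorize(a), factorize(b)
    result = 1
    for p in set(fa) | set(fb):           # union of the prime factorizations
        result *= p ** max(fa[p], fb[p])  # take the larger exponent
    return result

print(lcm_by_factoring(60, 90))   # prints 180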

How can we compute a least common multiple efficiently? Here is a useful fact: for all positive integers a and b,

gcd(a, b) · lcm(a, b) = a · b

Why is this true? First observe that for any integers i and j, trivially

min(i, j) + max(i, j) = i + j

Now, given integers a and b, consider any prime p. Suppose that the prime factorization of a contains p^i, and the prime factorization of b contains p^j. Then we have already seen that the prime factorization of gcd(a, b) contains p^min(i, j), and the prime factorization of lcm(a, b) contains p^max(i, j). Therefore the prime factorization of the product gcd(a, b) · lcm(a, b) will contain p^(min(i, j) + max(i, j)) = p^(i + j), which is also present in the prime factorization of the product a · b. Since this is true for all primes in the factorizations, we have gcd(a, b) · lcm(a, b) = a · b. For example, gcd(60, 90) · lcm(60, 90) = 30 · 180 = 5400 = 60 · 90.

Now we can rearrange the formula above to give us lcm(a, b):

lcm(a, b) = a · b / gcd(a, b)

We can compute the gcd efficiently using Euclid's algorithm, so this formula gives us an efficient way to compute the lcm.
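
Using the gcd function sketched earlier, this becomes a one-liner. (Note that we multiply first and then use integer division; the built-in math module also provides gcd, and lcm in Python 3.9 and later.)

def lcm(a, b):
    return a * b // gcd(a, b)   # uses the gcd function defined above

print(lcm(60, 90))   # prints 180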

sieve of Eratosthenes

We've seen in previous lectures that we can determine whether an integer n is prime using trial division, in which we attempt to divide n by successive integers. Because we need only check divisors up to sqrt(n), this primality test runs in time O(sqrt(n)).
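
As a reminder, a trial-division primality test might look like this (a minimal sketch; the version from the earlier lectures may differ in details):

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:     # we need only check divisors up to sqrt(n)
        if n % i == 0:
            return False
        i += 1
    return True

print(is_prime(97))   # prints True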

Sometimes we may wish to generate all prime numbers up to some limit N. If we use trial division on each candidate, then we can find all these primes in time O(N sqrt(N)). But there is a faster way, using a classic algorithm called the Sieve of Eratosthenes.

It's not hard to carry out the Sieve of Eratosthenes using pencil and paper. It works as follows. First we write down all the integers from 2 to N in succession. We then mark the first integer on the left (2) as prime, and cross out all the multiples of 2. Now we mark the next uncrossed integer on the left (3) as prime and cross out all the multiples of 3. We can now mark the next uncrossed integer (5) as prime and cross out its multiples, and so on. Just as with trial division, we may stop once we reach an integer greater than sqrt(N): every number that remains uncrossed at that point must be prime.

Here is the result of using the Sieve of Eratosthenes to generate all primes between 2 and 30. After all the composite numbers have been crossed out, these numbers remain:

2 3 5 7 11 13 17 19 23 29

We can easily implement the Sieve of Eratosthenes in Python:

# Generate all prime numbers up to (but not including) n.

isPrime = n * [True]              # isPrime[k] will be True iff k is prime
isPrime[0] = isPrime[1] = False   # 0 and 1 are not prime

i = 2
while i * i <= n:                 # only primes up to sqrt(n) need to be sieved
    if isPrime[i]:
        for j in range(2 * i, n, i):   # cross out all multiples of i
            isPrime[j] = False
    i += 1
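
To check the result, we can wrap the sieve in a function and list the primes below 30 (a small usage sketch; the function name is my own):

def sieve(n):
    # Return a list of all primes below n, using the code above.
    isPrime = n * [True]
    isPrime[0] = isPrime[1] = False
    i = 2
    while i * i <= n:
        if isPrime[i]:
            for j in range(2 * i, n, i):
                isPrime[j] = False
        i += 1
    return [p for p in range(n) if isPrime[p]]

print(sieve(30))   # prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]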

How long does the Sieve of Eratosthenes take to run?

The inner loop, in which we set isPrime[j] = False, runs N/2 times when we cross out all the multiples of 2, then N/3 times when we cross out multiples of 3, and so on. So its total number of iterations will be

N(1/2 + 1/3 + 1/5 + ... + 1/p)

where p is the largest prime not exceeding sqrt(N).

The series of reciprocals of primes

1/2 + 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + ...

is called the prime harmonic series. How can we approximate its sum through a given element 1/p?

Euler was the first to show that the prime harmonic series diverges: its sum grows to infinity, though extremely slowly. Its partial sum through 1/p is actually close to ln(ln p). This means that the running time of the Sieve of Eratosthenes is

N(1/2 + 1/3 + 1/5 + ... + 1/p) = N · O(log log (sqrt N)) = N · O(log log N) = O(N log log N)

This is very close to O(N). (In fact more advanced algorithms can generate all primes through N in time O(N).)
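
We can check this numerically: sum the reciprocals of the primes below some bound N and compare with ln(ln N) (a quick sketch using the sieve function above; the difference between the two values approaches a constant, the Meissel–Mertens constant ≈ 0.26):

import math

N = 1_000_000
print(sum(1 / p for p in sieve(N)))   # about 2.89
print(math.log(math.log(N)))          # about 2.63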

binary search

We may wish to search for a value (sometimes called the “key”) in a sorted array. We can use an algorithm called binary search to find the key efficiently.

Binary search is related to the number-guessing game that we saw earlier. In that game, one player thinks of a number N, say between 1 and 1,000,000. If we are trying to guess it, we can first ask "Is N greater than, less than, or equal to 500,000?" If we learn that N is greater, then we now know that N is between 500,001 and 1,000,000. In our next guess we can ask to compare N to 750,000. By always guessing at the midpoint of the unknown interval, we can divide the interval in half at each step and find N in a small number of guesses.

Similarly, suppose that we are searching for the key in a sorted array of 1,000,000 elements from a[0] through a[999,999]. We can first compare the key to a[500,000]. If it is not equal, then we know which half of the array contains the key. We can repeat this process until the key is found.

To implement this in Python, we will use two integer variables lo and hi that keep track of the current unknown range. Initially lo = 0 and hi = len(a) - 1. At all times, we know that all array elements with indices < lo are less than the key, and all elements with indices > hi are greater than the key. That means that the key, if it exists in the array, must be in the range a[lo] … a[hi]. As the binary search progresses, lo and hi move toward each other. If eventually hi < lo, then the unknown range is empty, meaning that the key does not exist in the array.

Here is a binary search in Python. (Recall that a Python list is actually an array.)

# Search for value k in the sorted list a.

lo = 0
hi = len(a) - 1

while lo <= hi:
    mid = (lo + hi) // 2      # midpoint of the unknown range
    if a[mid] == k:
        print('found', k, 'at index', mid)
        break
    elif a[mid] < k:
        lo = mid + 1          # key can only be in the upper half
    else:  # a[mid] > k
        hi = mid - 1          # key can only be in the lower half
else:
    print(k, 'is not in the list')   # unknown range became empty

Let's now consider the big-O running time of binary search. Suppose that the input array has length N. After the first iteration, the unknown interval (from lo to hi) has approximately N / 2 elements. After two iterations, it has N / 4 elements, and so on. After k iterations it has N / 2^k elements. We will reach the end of the loop when N / 2^k = 1, i.e. when k = log2(N). This means that in the worst case (e.g. if the key is not present in the array) binary search will run in O(log N) time. In the best case, the algorithm runs in O(1), since it might find the desired key immediately, even if N is large.
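
To see the logarithm concretely, we can count the iterations performed by a worst-case search (a quick experimental sketch; the helper function is my own):

def binary_search_steps(a, k):
    # Count the loop iterations used to search for k in the sorted list a.
    lo, hi, steps = 0, len(a) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if a[mid] == k:
            break
        elif a[mid] < k:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

a = list(range(1_000_000))
print(binary_search_steps(a, -1))   # about 20, since log2(1,000,000) ≈ 19.9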