An algorithm is a precise sequence of steps to achieve some goal. You already know some algorithms from daily life, such as algorithms for adding/subtracting/multiplying decimal numbers which you learned in grade school. As another example, if you're playing a card game and want to sort the cards in your hand, you will use some algorithm to do that (even though you may not be conscious of the exact algorithm that you are using).
As we'll see in this course, different algorithms that perform the same task may differ in efficiency. For example, suppose that we want to sort a list of numbers. If there are only a few numbers, it won't matter much which algorithm we use. However, now suppose that there are a billion numbers in the list. In this case, our choice of algorithm could have a dramatic effect – in fact one algorithm might be more than a million times as efficient as another!
In this class we'll be very interested in the relationship between a problem's input size and the running time of an algorithm on the problem. For example, if we want to sort N numbers, then the input size is N. For some algorithms, the running time increases only modestly with the input size, e.g. linearly. But for other algorithms, the running time may increase more dramatically, even exponentially. In this class, we'll learn how to analyze an algorithm's running time and how to express the relationship between input size and running time in a mathematically precise way.
Algorithms are a major area of research; some computer scientists spend their entire careers working on them. For many problems, certain algorithms are known but it is not currently known whether any faster algorithm exists.
Perhaps surprisingly, some problems cannot be solved by any algorithm! We will not discuss such problems in this class, but you'll learn about them in more advanced courses.
You should be familiar with these identities, which we will use often:
au + v = au · av
au – v = au / av
auv = (au)v
Powers of 10 are often used in engineering and computer science:
10-9 = 1 / 1,000,000,000, prefix = nano (e.g. nanosecond)
10-6 = 1 / 1,000,000, prefix = micro (e.g. microsecond)
10-3 = 1 / 1,000, prefix = milli (e.g. millisecond)
100 = 1
103 = 1,000 = one thousand, prefix = kilo (e.g. kilogram = 1000 g)
10⁶ = 1,000,000 = one million, prefix = mega, (e.g. megawatt)
10⁹ = 1,000,000,000 = one billion, prefix = giga (e.g. gigawatt)
1012 = 1,000,000,000,000 = one trillion, prefix = tera (e.g. terabyte)
1015 = 1,000,000,000,000,000 = one quadrillion, prefix = peta
You should know the powers of 2 by heart at least up to 210:
20 = 1
21 = 2
22 = 4
23 = 8
24 = 16
25 = 32
26 = 64
27 = 128
28 = 256
29 = 512
210 = 1,024
It is a convenient mathematical coincidence that 210 = 1,024 ≈ 1,000. As a consequence, we also have 220 ≈ 1,000,000 and 230 ≈ 1,000,000,000. These facts allow you to convert between powers of 2 and 10 easily, at least approximately. For example, 226 = 26 · 220 ≈ 64 · 1,000,000 = 64,000,000.
You should be familiar with these identities, which we will use often:
logb(uv) = logb u + logb v
logb(u / v) = logb u - logb v
logb(up) = p logb u
loga b · logb c = loga c
The mathematical study of the integers is called number theory.
Let a, b, c ∈ ℤ. Here is some terminology: if a = bc then a is divisible by (or a multiple of) b and c. b and c are factors of a.
For example: the integer 6 has 8 distinct factors: 1, 2, 3, 6, -1, -2, -3, -6. However, often we will only be interested in an integer's positive factors.
Division Theorem. For any a, b ∈ ℤ with b > 0, there exist unique q, r ∈ℤ such that
a = q · b + r and 0 ≤ r < b
We write
q
= the quotient
of
a and b, denoted by a
div b
r
= the remainder,
denoted by a
mod b
For example, if a = 17 and b = 5, then
q
= 17 div 5 = 3
r = 17 mod 5 = 2
17 = 3 · 5 + 2
Or if a = -17 and b = 5, then
q
= -17 div 5 = -4
r = -17 mod 5 = 3
-17 = -4 · 5 + 3
For any a ∈ ℤ, a mod 2 = 0 if and only if a is even. a mod 2 = 1 if and only if a is odd. Also, a div 1 = a, and a mod 1 = 0.
As we saw in Programming 1, the Python operators // and % compute the div and mod:
>>> 17 // 5 3 >>> 17 % 5 2
By longstanding convention, pretty much every society on Earth writes numbers in base 10, otherwise known as decimal. When we write a number such as 1985, we actually mean a sum of powers of 10:
1985 = 1 · 103 + 9 · 102 + 8 · 101 + 5 · 100
In the number above, 1 is the most significant digit, i.e. the one that represents the greatest power of 10. 5 is the least significant digit.
We use base 10 by convention, but of course the number 10 is arbitrary. We would now like to work with numbers written in non-decimal bases, i.e. bases other than 10.
For example, consider base 5. In the base 5 system, every digit has a value from 0 to 4. Consider the number
20315
Here, the subscript 5 means that the number is written in base 5.
The decimal value of the base-5 number above is
2 · 53 + 0 · 52 + 3 · 51 + 1 · 50 = 26610
Every positive integer has a unique representation in any base.
Base 2 (binary) is especially common in computer programming. Here are the first natural numbers in base 2:
02 = 010
12 = 110
102 = 210
112 = 310
1002 = 410
1012 = 510
1102 = 610
1112 = 710
10002 = 810
We also often use base 16 (hexadecimal), in which we have extra digits a = 10, b = 11, c = 12, d = 13, e = 14, f = 15. For example,
ff16 = 25510 because 15 · 161 + 15 · 160 = 240 + 15 = 255
Suppose that we have a series of digits in some base b. We'd like to combine the digits into a number, i.e. compute the integer that the digits represent in that base. We first notice that we can append a digit d to an integer n in base b using this formula:
n' = b · n + d
In base 10, this should be pretty obvious. For example, if n = 198 and d = 5, then n' = 10 · 198 + 5 = 1985.
However, this will work in any base. For example, consider 10223, which equals 3510 since
1 · 33 + 0 · 32 + 2 · 31 + 2 · 30 = 35
Suppose that we'd like to append the digit 1 to this base-3 number. Then we can compute 3 · 35 + 1 = 106, which equals 102213.
So now suppose that we have the base-3 digits 1, 0, 2, 2, 1 and we'd like to combine them into an integer. We can use an iterative process, starting from zero. At every step we multiply by the base (3), then add the next digit:
0 · 3 + 1 = 1
1 · 3 + 0 = 3
3 · 3 + 2 = 11
11 · 3 + 2 = 35
35 · 3 + 1 = 106
This shows that 102213 = 10610, as we saw above.
Let's implement this algorithm in Python:
s = input('Enter digits: ') b = int(input('Enter base: ')) n = 0 for c in s: n = b * n + int(c) print(n)
Notice that we need to call the conversion function int() to convert c (a character such as '4') into an integer (4) that we can add.
It works:
$ py combine.py Enter digits: 10221 Enter base: 3 106
However, this program will not handle bases larger than 10. We'll see how to fix that soon (perhaps in this week's tutorial).
Now consider the inverse problem: we're given an integer n, and we want to figure out how to write it in base b, i.e. extract a series of digits in base b. We first notice that for any integer i,
i mod b is the last digit of i (in base b)
i div b is the number formed by all digits of i except the last (in base b)
These facts should be obvious in base 10; for example, 2765 mod 10 = 5, and 2765 div 10 = 276. But they work in other bases too. For example, again consider 10223, which we saw above equals 3510:
1 · 33 + 0 · 32 + 2 · 31 + 2 · 30 = 35
Notice that 35 mod 3 = 2 (which is the last digit of 10223), and 35 div 3 = 11 = 1023 (all digits except the last).
And so we can use an iterative process to extract all digits from a number. At each step, we use the mod operation to extract a digit, then divide by the base:
35 mod 3 = 2
35 div 3 = 11
11 mod 3 = 2
11 div 3 = 3
3 mod 3 = 0
3 div 3 = 1
1 mod 3 = 1
1 div 3 = 0
This gives us the digits 2, 2, 0, 1 in reverse order, so we have produced the number 10223.
We can implement this algorithm in Python:
n = int(input('Enter n: ')) b = int(input('Enter base: ')) s = '' while n > 0: d = n % b s = str(d) + s # prepend d to s n = n // b print(s)
For example:
$ py extract.py Enter n: 35 Enter base: 3 1022
Because the digits are generated in reverse order, we must prepend to the string at each step to end up with the digits in the correct order. (To append means to add at the end; to prepend means to add at the beginning.)
Also notice that we call the conversion function str() to convert d (a number such as 4) into a character (e.g. '4') that we can prepend.
Again, the program above will work correctly only for bases 10 and lower. Soon, we'll see how to generalize it.
To convert from base B to base C, you can perform these two steps in succession:
Combine the base B digits into an integer.
Extract digits in base C from the integer.
For example, suppose that we want to convert 3247 into base 5. We can run the two programs we wrote above:
$ py combine.py Enter digits: 324 Enter base: 7 165 $ py extract.py Enter n: 165 Enter base: 5 1130
We see that 3247 = 16510 = 11305. Of course, you could join these two programs into a single program that performs the entire conversion all at once.