Programming 1, 2022-3
Week 8: Notes

list comprehensions

A list comprehension is an expression that loops over a sequence and collects a series of computed values into a list. List comprehensions are powerful and convenient, and we will use them often.

For example, consider this loop that builds a list of all perfect squares from 1 to 400:

squares = []
for i in range(1, 21):
    squares.append(i * i)

We may replace the three lines above by a single line with a list comprehension:

squares = [i * i for i in range(1, 21)]

In general, a list comprehension may have the form

[ <expression> for <var> in <sequence> ]

A comprehension of this form will loop over the given <sequence>. On each loop iteration, it sets <var> to the value of an element of the sequence, then evaluates the given <expression>. All results are collected into a list.

Here are some more examples of list comprehensions. This comprehension builds a list of numbers from 1 to 20 and their squares:

>>> [(i, i * i) for i in range(1, 11)]
[(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]

We may add 1 to each element in a list:

>>> l = [2, 5, 7, 10]
>>> [i + 1 for i in l]
[3, 6, 8, 11]

Let's write a program that will read a single input line containing a number of integers, separated by spaces. The program will print the sum of the integers. Using a list comprehension, we can write this in a single line:

print(sum([int(w) for w in input().split()]))

Consider Project Euler's Problem 6 (find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum). Here's a solution using a list comprehension:

sum_of_squares = sum([x * x for x in range(1, 101)])
square_of_sum = sum(range(1, 101)) ** 2
answer = square_of_sum - sum_of_squares

Finally, here's a function that builds a nested list representing a matrix of zeroes:

def empty(rows, cols):
    return [cols * [0] for r in range(rows)]

if clauses in list comprehensions

A list comprehension may have an if clause containing an arbitrary condition. Only list elements that satisfy the condition are included in the generated list.

For example, this comprehension collects all characters in the string 'watermelon' that are in the second half of the alphabet:

>>> [c for c in 'watermelon' if c >= 'n']
['w', 't', 'r', 'o', 'n']

Here's a 1-line solution to Project Euler's Problem 1 (find the sum of all the multiples of 3 or 5 below 1000):

>>> sum([i for i in range(1000) if i % 3 == 0 or i % 5 == 0])
233168

multiple for clauses in list comprehensions

A list comprehension may have more than one 'for' clause. The 'for' clauses represent a nested loop. All values generated by the inner loop are collected into the resulting list.

For example, this comprehension generates all pairs of values (x, y), where 0 ≤ x, y < 3:

>>> [(x, y) for x in range(3) for y in range(3)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

The following comprehension is similar, but in it 'y' only iterates up to the value 'x', so it only generates pairs where y < x:

>>> [(x, y) for x in range(3) for y in range(x)]
[(1, 0), (2, 0), (2, 1)]

The comprehension above is equivalent to the following:

l = []
for x in range(3):
    for y in range(x):
        l.append( (x, y) )

Notice that when there are multiple 'for' clauses in a single comprehension:

Let's write a function to flatten a 2-dimensional matrix, i.e. return a 1-dimensional list of its elements:

def flatten(m):
    return [ x for row in m for x in row ]

For example:

>>> mat = [ [2, 4, 6],
...         [1, 3, 7],
...         [8, 9, 1] ]
>>> flatten(mat)
[2, 4, 6, 1, 3, 7, 8, 9, 1]

Let's now write a program that will read any number of lines of input, each containing any number of integers separated by whitespace. The program will print the sum of all the numbers on all the lines:

import sys

nums = [int(w) for line in sys.stdin for w in line.split()]
print(sum(nums))

It's possible to mix 'for' and 'if' clauses in a comprehension. For example, here's code to generate all pairs (x, y) where 0 ≤ x, y < 3, but skipping those with x = 1:

>>> [(x, y) for x in range(3) if x != 1 for y in range(3)]
[(0, 0), (0, 1), (0, 2), (2, 0), (2, 1), (2, 2)]

As a larger example, here's a comprehension that finds all triples of integers (a, b, c) such that a2 + b2 = c2, with 0 ≤ a < b < c ≤ 20:

>>> [(a, b, c) for c in range(21)
               for b in range(c)
               for a in range(b)
               if a * a + b * b == c * c]
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20)]

nested list comprehensions

It's possible to nest one list comprehension inside another. Unlike the examples we saw above, this will produce a 2-dimensional result, i.e. a list of lists.

For example, suppose that we'd like to generate an N x N matrix, where each element is the sum of its row and column numbers (where rows and columns are numbered from 0). If N is 4, then the matrix will look like this:

0 1 2 3
1 2 3 4
2 3 4 5
3 4 5 6

Here is a function that takes n as a parameter and produces this matrix:

def num_matrix(n):
    return [ [ i + j for j in range(n) ] for i in range(n) ]

Notice that in this nested comprehension, the for loop on the right represents the outer loop. The inner comprehension will be evaluated once on each iteration of that loop.

As another example, here is code that will read a matrix from standard input, where each input line contains the numbers in one row of the matrix:

import sys

m = [ [ int(w) for w in line.split() ] for line in sys.stdin ]

set comprehensions

A set comprehension is similar to a list comprehension, but collects values into a set, not a list.

For example, suppose that we have a set s of integers. We'd like to add one to each integer, and collect the resulting values into a new set t. We can perform this easily using a set comprehension:

>>> s = {7, 9, 100}
>>> t = {x + 1 for x in s}
>>> t
{8, 10, 101}

Suppose that we'd like to know how many distinct integers are products of two integers from 1 to 10. We can solve this in a single line using a set comprehension:

>>> len({ a * b for a in range(1, 11) for b in range(1, 11) })
42

Note that a list comprehension would not produce the same result, since the generated list would contain duplicate elements. With a set comprehension, the duplicates are automatically eliminated.

dictionary comprehension

A dictionary comprehension is similar to a list or set comprehension, but collects values into a dictionary. For example:

>>> { x : x * x for x in range(15) }
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100, 11: 121, 12: 144, 13: 169, 14: 196}

command-line arguments

When we run a program from the command, we may specify command-line arguments. For example:

$ python3 prog.py hello one two three

In a Python program, sys.argv holds a list of all command-line arguments. Actually the first element in the list is the name of the program itself, and subsequent elements hold the arguments. For example, suppose that prog.py holds this program:

import sys

print(sys.argv)

Let's run it:

$ python3 prog.py hello one two three
['prog.py', 'hello', 'one', 'two', 'three']

file input

To read from a file in Python, we can call the open() method and pass a filename. open() will return a file object that we can use to read data from the file. We can use a 'for' loop to iterate over lines in the file, just as we can read lines of standard input by iterating over sys.stdin. When we are finishing reading from the file, we should call the close() method to close the file object.

Here's a program that accepts a filename on the command line, and reports the number of characters, words, and lines in the file. (This is like the classic UNIX utility wc, which stands for "word count".)

import sys

if len(sys.argv) < 2:
    print(f'usage: py {sys.argv[0]} <filename>')
    exit(1)

f = open(sys.argv[1], 'r')

chars = words = lines = 0
for line in f:
    chars += len(line)
    words += len(line.split())
    lines += 1

print(f'{chars} chars, {words} words, {lines} lines')

file output

To write to a file in Python, we can call the open() method and pass a filename plus an extra argument 'w', indicating that we want to write. If the file does not exist, it will be created. If the file already exists, it will be truncated: all existing data in the file will be lost! We may then write lines to the file by calling the write() method on the file object once for each output line. When we are finished writing data, we should call close() to close the file.

Here's a program that accepts two filenames on the command line. It reads lines from the first file and writes them the second file, duplicating every line as it writes:

import sys

if len(sys.argv) < 3:
    print(f'usage: ${sys.argv[0]} <from-file> <to-file>')
    quit()

from_file = sys.argv[1]
to_file = sys.argv[2]

input = open(from_file, 'r')    # open for reading ('r' is default mode)
output = open(to_file, 'w')     # open for writing

for line in input:
    output.write(line)      # write line to output
    output.write(line)      # write it again

input.close()
output.close()

You can read about more methods for input and output in our Python quick library reference.

debugging

Most programmers spend a fair amount of time debugging. Debugging is something like being a detective, trying to solve the mystery of why a program doesn't behave the way you think it should. It can require a lot of patience, but ultimately it's satisfying when you finally figure out why a program is misbehaving, just like the end of a mystery novel or film. :)

A basic tool for debugging is inserting print statements in your code to reveal selected elements of a program's state as it runs.

For some debugging problems, a debugger is a valuable tool. Debuggers are available for all major languages including Python.

Specifically, Visual Studio Code includes a nice interface to a Python debugger. When you have a Python source file open in the editor, look for the triangle icon in the upper right. To start debugging, open the dropdown menu to the right of the triangle and choose "Debug Python File". Your program will start running under the debugger. If an exception occurs, execution will stop and you'll be able to examine the values of local and global variables in the 'Variables' pane in the upper left.

Even before you start running your program under the debugger, you may wish to create one or more breakpoints. To create a breakpoint on a line, either click anywhere in the line and press F9, or click in the editor margin to the left of the line. A red dot will appear to the left of the line, indicating that there is a breakpoint there. When execution reaches any line with a breakpoint, it will stop. After that, you can step through your program's execution line by line using the Step Into (F11) and Step Over (F10) commands. If the execution point is currently at a function call, Step Into will travel into the function call and stop on the first line of the function. By contrast, Step Over will execute the function without stopping and will then stop on the first line after the function call.

Most debuggers (even for other programming languages) have similar commands and even keyboard shortcuts, so if you become familiar with Python's debugger you should be able to switch to other debuggers easily.

Here is a buggy insertion sort function that we tried to debug at the end of the lecture:

# Insertion sort, with bugs!
def insertion_sort(a):
    n = len(a)
    for i in range(1, n - 1):
        t = a[i]                # lift up a[i]
        j = i
        while j >= 0 and a[j - 1] > t:
            a[j] = a[j - 1]     # shift element over
            j -= 1
        a[j] = t                # put [a] in its place

However, we ran out of time before we could find all the bugs in this code. Can you find and fix the bugs to make the function correct?