Week 11: Notes

raising and catching exceptions

You have surely noticed that Python's built-in operators and library functions sometimes report errors. For example, the index() method returns the index of the first occurrence of a value in a sequence, but produces a ValueError if the value is not present:

>>> [3, 4, 5, 6].index(5)
2
>>> [3, 4, 5, 6].index(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 7 is not in list

Similarly, the open() function produces a FileNotFoundError if a file does not exist:

>>> open('non_existent_file')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'non_existent_file'

These errors are actually exceptions, which are a mechanism supported by Python and many other languages. Any code that wants to report an error can raise (= throw) an exception. In the examples above, index() raised a ValueError exception, and open() raised a FileNotFoundError exception.

By default, an exception will terminate the program. However, Python's tryexcept statement can catch an exception and handle it in some other way. For example, consider the following calculator program. The user can type expressions such as "3 + 4" to be evaluated. If an expression is invalid, we will print an error message and continue executing.

def evaluate(s):
    w = s.split()
    n = int(w[0])
    for i in range(1, len(w), 2):
        op = w[i]
        x = int(w[i + 1])
        if op == '+':
            n += x
        elif op == '-':
            n -= x
        elif op == '*':
            n *= x
        else:
            assert False, 'bad operation'
    return n

def calc():
    while True:
        try:
            s = input('> ')
        except EOFError:
            break

        try:
            print(evaluate(s))
        except IndexError:
            print('missing value')
        except ValueError:
            print('not an integer')
        except AssertionError as e:
            print(e)

Let's try it:

>>> calc()
> 3 + 4
7
> 5 * 2
10
> 4 * x
not an integer
> 4 +
missing value
> 4 / 10
bad operation

If evaluate() runs without error, the code in the except block in calc() will not run. If evaluate() raises an AssertionError, IndexError or ValueError, then the except block will catch the exception and run the corresponding code, and then execution will continue normally after the try statement since the exception has been handled. If evaluate() raises some other kind of exception, then the except block will not catch it, and the program will terminate.

Note that an exception is actually an object, i.e. an instance of the built-in class Exception or one of its subclasses. FileNotFoundError, IndexError, ValueError and so on are classes that inherit from Exception. Each type of exception may have attributes that describe the error that occurred. For example, a FileNotFoundError has an attribute filename containing the file that was not found. In a tryexcept statement, you can give a name to the exception that was caught and can examine its attributes:

name = input('Enter filename: ')

try:
    f = open(name)
    line = f.readline()
except FileNotFoundError as e:
    print(f'file not found: {e.filename}')

In the calculator program above, we use this technique to print out an AssertionError that was caught.

You may define your own classes of exceptions. For example, let's modify the calculator program above to use its own exception class EvalException:

class EvalException(Exception):
    pass

def evaluate(s):
    w = s.split()
    if len(w) < 1:
        raise EvalException('missing value')
    if not w[0].isdigit():
        raise EvalException('not an integer')
    n = int(w[0])
    for i in range(1, len(w), 2):
        op = w[i]
        if len(w) < i + 2:
            raise EvalException('missing value')
        if not w[i + 1].isdigit():
            raise EvalException('not an integer')
        x = int(w[i + 1])
        if op == '+':
            n += x
        elif op == '-':
            n -= x
        elif op == '*':
            n *= x
        else:
            raise EvalException('bad operation')
    return n

def calc():
    while True:
        try:
            s = input('> ')
        except EOFError:
            break

        try:
            print(evaluate(s))
        except EvalException as e:
            print(e)

This version of the program is arguably better. In the original program we caught system-defined exceptions such as IndexError and ValueError. The problem with that is that if we expand evaluate() to perform a more complex calculation, then if we have a bug Python might throw these exceptions in some place that we don't expect. If our own exception handler catches these exceptions, that may be confusing and will make the program difficult to debug.

Note that an exception raised by a function f need not be caught by the immediate caller of f. Consider this example:

def a():
    f = open('poem')
    print('successful open')
    


def b():
    a()

def c():
    try:
        b()
    except FileNotFoundError:
        print('file not found')

In this code, the call to open() in a() might raise a FileNotFoundError. There is no tryexcept statement in a(), or in its caller b(). However, c() contains a tryexcept statement that can catch a FileNotFoundError. If a FileNotFoundError is raised, Python will unwind the call stack, aborting the execution of a() and then b() until it arrives at the tryexcept statement in c(), which will catch the exception.

We see that a raise statement is a form of non-local exit that causes execution to jump to some outer point. In fact we've already seen two other statements in Python that can also jump out from the current execution point. Namely, break immediately exist the current loop iteration, and return immediately exits the current function call. raise is more powerful in that it can immediately exit a series of nested function calls extending from a trycatch statement down to the function that raises the exception.

Here's one more point about exceptions. In a tryexcept statement, you can choose to specify no exception type at all, in which case the statement will catch any exception at all:

try:
    foo()
except:
    print('some error occurred')

However I don't generally recommend using this form of tryexcept. A tryexcept statement is easier to read when it indicates the type of exception that it anticipates. Furthermore, if some sort of error occurs other than the one that you expected to handle, then this form of try … except will catch it, which may lead to behavior that is surprising and difficult to debug. (As we noted above, this can also be a problem if you catch any system-defined exception in any non-trivial block of code.)

As a final example, let's write a function that searches through a binary tree for any value that is greater than 90. If it finds one, it will throw an exception with the value that was found. The tree is not necessarily a binary search tree, so we must search everywhere in the tree using recursion. Here's the code:

class Found(Exception):
    def __init__(self, x):
        self.x = x

class Node:
    def __init__(self, val, left, right):
        self.val = val
        self.left, self.right = left, right

def find(n):
    if n == None:
        return
    if n.val > 90:
        raise Found(n.val)
    find(n.left)
    find(n.right)

def find_large(tree):
    try:
        find(tree)
        return None
    except Found as e:
        return e.x

Notice that the find_large() function catches a Found exception and returns the value in it. This shows that we can use an exception to escape a deeply nested set of recursive calls, and then catch the exception and return normally. However, I would not generally recommend this as a programming technique. It would probably be clearer and more efficient to return a value without using exceptions, as in this recursive function:

def find_large(n):
    if n == None:
        return None
    if n.val > 90:
        return n.val
    v = find_large(n.left)
    if v != None:
        return v
    return find_large(n.right)

Nevertheless this is an interesting example.

nested functions

Python allows us to write nested functions, i.e. functions that are defined inside other functions or methods.

As a first example, consider the freq() function that we wrote above to find the most frequent character in a string. Instead of a lambda expression, we could use a nested function:

# Return the character in s which occurs most frequently.
def freq(s):
    d = {}
    for c in s:
        if c in d:
            d[c] += 1
        else:
            d[c] = 1

    def keyval(k):
        return d[k]

    return max(d.keys(), key = keyval)

Notiec that the function keyval() is nested inside the function freq(), and has access to the local variable d defined inside freq().

As another example, suppose that we'd like to write a function replace_with_max() that takes a square matrix m and returns a matrix n in which each value in m is replaced with the maximum of its neighbors in all 4 (horizontal or vertical) directions. For example, if m is

2 4
5 9

then replace_with_max(m) will return

5 9
9 5

As a first attempt, we might write

def replace_with_max(m):
    size = len(m)
    
    # Make a matrix of dimensions (size x size) filled with zeroes
    n = [ size * [ 0 ] for _ in range(size) ]
    
    for i in range(size):
        for j in range(size):
            n[i][j] = max(m[i  1][j], m[i + 1][j],
                          m[i][j  1], m[i][j + 1])
                          
    return n

However, we have a problem: if a square (i, j) is at the edge of the matrix, then an array reference such as m[i][j + 1] might go out of bounds.

To solve this problem, let's write a nested helper function get(i, j) that returns an array element if the position (i, j) is inside the matrix, otherwise (- math.inf), i.e. -∞. Here is the improved function:

def replace_with_max(m):
    def get(i, j):
        if 0 <= i < size and 0 <= j < size:
            return m[i][j]
        else:
            return - math.inf
    
    size = len(m)
    
    # Make a matrix of dimensions (size x size) filled with zeroes.
    n = [ size * [ 0 ] for _ in range(size) ]
    
    for i in range(size):
        for j in range(size):
            n[i][j] = max(get(i - 1, j), get(i + 1, j),
                          get(i, j - 1), get(i, j + 1))
                          
    return n

Notice that the nested function get() can refer to the parameter m, and also to the local variable size that is defined in its containing function replace_with_max().

Nested functions are often convenient for writing recursive helper functions. For example, suppose that we have a class TreeSet that holds values in a binary search tree. We'd like to write a method contains(x) that returns True if the value x is present in the tree. We could write contains() iteratively (we did this in our algorithms lecture), but let's write it recursively here. We'll need a recursive function that takes a tree node as a parameter; in the recursive case it will call itself, passing either the node's left or right child. It would be a bit awkward to make this a method. We could write the function outside the TreeSet class, however it's convenient to nest it inside contains():

class Node:
    def __init__(self, val, left, right):
        self.val = val
        self.left = left
        self.right = right

class TreeSet:
    def __init__(self):
        self.root = None

    def contains(self, x):
        def f(node):
            if node == None:
                return False
            if x == node.val:
                return True
            
            return f(node.left if x < node.val else node.right)
        
        return f(self.root)
    

Notice that the nested function f() can access the parameter x in its parent function contains(), which is convenient. If we wrote the function outside the class, it would need to take x as a parameter.

The 'nonlocal' keyword

In the replace_with_max() function we wrote in the previous section, we saw that the nested function get() can read the values of the variables m and size in the containing function. What if get() wants to update the value of such a variable? For example, suppose that we want to count the number of calls to get() made inside a single call to replace_with_max(). We could attempt to write

def replace_with_max(m):
    g = 0     # number of calls to get()

    def get(i, j):
        g += 1
        if 0 <= i < size and 0 <= j < size:
            return m[i][j]
        else:
            return -math.inf
    

However, that won't work because as we have seen before, any variable that is updated inside a function is local by default in Python. And so in the code above, Python will think that g is a local variable inside get(), and will report an error when we first attempt to increment it.

One possible solution would be to make g global, and use a declaration global g inside get(). However, that's ugly since g doesn't really need to be global. A better way is to declare g as nonlocal:

def replace_with_max(m):
    g = 0     # number of calls to get()

    def get(i, j):
        nonlocal g
        g += 1
        if 0 <= i < size and 0 <= j < size:
            return m[i][j]
        else:
            return -math.inf
    

Now the code will work. The nonlocal statement is somewhat like the global statement in that it declares that a variable is not local. The difference is that global declares that a variable is to found in the global (i.e. top-level) scope, whereas nonlocal declares that a variable is a local variable in an enclosing function.

debugging

Most programmers spend a fair amount of time debugging. Debugging is something like being a detective, trying to solve the mystery of why a program doesn't behave the way you think it should. It can require a lot of patience, but ultimately it's satisfying when you finally figure out why a program is misbehaving, just like the end of a mystery novel or film. :)

A basic tool for debugging is inserting print statements in your code to reveal selected elements of a program's state as it runs.

For some debugging problems, a debugger is a valuable tool. Debuggers are available for all major languages including Python.

Specifically, Visual Studio Code includes a nice interface to a Python debugger. When you have a Python source file open in the editor, look for the triangle icon in the upper right. To start debugging, open the dropdown menu to the right of the triangle and choose "Debug Python File". Your program will start running under the debugger. If an exception occurs, execution will stop and you'll be able to examine the values of local and global variables in the 'Variables' pane in the upper left.

Even before you start running your program under the debugger, you will probably want to create one or more breakpoints. To create a breakpoint on a line, either click anywhere in the line and press F9, or click in the editor margin to the left of the line. A red dot will appear to the left of the line, indicating that there is a breakpoint there. When execution reaches any line with a breakpoint, it will stop. After that, you can step through your program's execution line by line using the Step Into (F11) and Step Over (F10) commands. If the execution point is currently at a function call, Step Into will travel into the function call and stop on the first line of the function. By contrast, Step Over will execute the function without stopping and will then stop on the first line after the function call.

Most debuggers (even for other programming languages) have similar commands and even keyboard shortcuts, so if you become familiar with Python's debugger you should be able to switch to other debuggers easily.

Here is a buggy insertion sort function that we debugged in the lecture:

# Insertion sort, with bugs!
def insertion_sort(a):
    n = len(a)
    for i in range(1, n - 1):
        t = a[i]                # lift up a[i]
        j = i
        while j >= 0 and a[j - 1] > t:
            a[j] = a[j - 1]     # shift element over
            j -= 1
        a[j] = t                # put [a] in its place

As an exercise, you may wish to debug it again.

testing software

Testing is the process of trying to discover bugs in a program and, as far as is possible, ensure that it is bug-free.

The most basic way to test your program is manually, i.e. by running it and manually entering input or using it interactively. This generally doesn't scale well to larger programs. It's also difficult to test a program thoroughly in this way.

In recent decades there has been a major trend toward automated testing. One common form of automated testing is unit tests, which test individual pieces of functionality such as individual modules, classes, methods and/or functions inside a program.

In Python, we can easily write simple unit tests using the 'assert' statement. For example, suppose that we've written a class PriorityQueue with methods add(x), remove_smallest() and is_empty(). Here is a naive implementation of this class:

# in file priority_queue.py

class PriorityQueue:
    def __init__(self):
        self.a = []

    def is_empty(self):
        return len(self.a) == 0

    def add(self, x):
        self.a.append(x)

    def remove_smallest(self):
        x = min(self.a)
        i = self.a.index(x)
        return self.a.pop(i)

Here is a suite of several unit tests for the PriorityQueue class:

# in file test_priority_queue.py

import random
from priority_queue import *

def test1():
    q = PriorityQueue()
    assert q.is_empty()

def test2():
    q = PriorityQueue()

    for x in [4, 2, 6, 5]:
        q.add(x)

    for x in [2, 4, 5, 6]:
        assert not q.is_empty()
        assert q.remove_smallest() == x

def test3():
    q = PriorityQueue()
    
    random.seed(0)
    nums = [random.randrange(1000) for _ in range(100)]
    for x in nums:
        q.add(x)
    
    for x in sorted(nums):
        assert q.remove_smallest() == x

    assert q.is_empty()

def test():
    test1()
    test2()
    test3()
    print('All tests passed')

If any unit test in this function fails, the program will terminate with a stack trace revealing which test caused the problem.

Notice that test3() calls the add() method with a series of random values. Random data of this sort can be very useful for testing. The function also calls random.seed(0), which ensures that random values will be the same on each run. That means that test results will be reproducible, which is a good thing.

It's often a good idea to run all unit tests automatically each time a program is run. That ensures that you'll notice immediately if a change to the program causes a unit test to fail.

We wrote the unit tests above in pure Python, without using any libraries. To make writing tests easier, many testing frameworks are available for Python and other languages. The Python standard library contains the unittest and doctest modules, which are frameworks that you can use for writing tests. However, I generally recommend the pytest library, which you can install using pip. To use pytest, you can write tests using assert statements, as I've done in the PriorityQueue tests above. pytest will automatically discover tests implemented as functions with names beginning with "test", in source files whose names begin or end with "test". So if we are using pytest, we can delete the top-level function test() above, and simply run pytest:

$ pytest
============================= test session starts ==============================
platform linux -- Python 3.11.6, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/adam/Desktop/test
plugins: anyio-4.0.0, xonsh-0.14.0
collected 3 items                                                              

test_priority_queue.py ...                                               [100%]

============================== 3 passed in 0.01s ===============================
$

All the tests passed. Now let's intentionally introduce a bug. We'll change the last line of remove_smallest() to look like this:

return self.a.pop(i) + 1

Now let's run pytest again. Here is part of the output:

$ pytest
=========================================================== test session starts ===========================================================
platform linux -- Python 3.11.6, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/adam/Desktop/test
plugins: anyio-4.0.0, xonsh-0.14.0
collected 3 items                                                                                                                         

test_priority_queue.py .FF                                                                                                          [100%]

================================================================ FAILURES =================================================================
__________________________________________________________________ test2 __________________________________________________________________

    def test2():
        q = PriorityQueue()
    
        for x in [4, 2, 6, 5]:
            q.add(x)
    
        for x in [2, 4, 5, 6]:
            assert not q.is_empty()
>           assert q.remove_smallest() == x
E           assert 3 == 2
E            +  where 3 = <bound method PriorityQueue.remove_smallest of <priority_queue.PriorityQueue object at 0x7f886cb61110>>()
E            +    where <bound method PriorityQueue.remove_smallest of <priority_queue.PriorityQueue object at 0x7f886cb61110>> = <priority_queue.PriorityQueue object at 0x7f886cb61110>.remove_smallest

test_priority_queue.py:16: AssertionError
__________________________________________________________________ test3 __________________________________________________________________

    def test3():
        q = PriorityQueue()
    
        random.seed(0)
        nums = [random.randrange(1000) for _ in range(100)]
        for x in nums:
            q.add(x)
    
        for x in sorted(nums):
>           assert q.remove_smallest() == x
E           assert 2 == 1
E            +  where 2 = <bound method PriorityQueue.remove_smallest of <priority_queue.PriorityQueue object at 0x7f886cb254d0>>()
E            +    where <bound method PriorityQueue.remove_smallest of <priority_queue.PriorityQueue object at 0x7f886cb254d0>> = <priority_queue.PriorityQueue object at 0x7f886cb254d0>.remove_smallest

test_priority_queue.py:27: AssertionError
========================================================= short test summary info =========================================================
FAILED test_priority_queue.py::test2 - assert 3 == 2
FAILED test_priority_queue.py::test3 - assert 2 == 1
======================================================= 2 failed, 1 passed in 0.01s =======================================================
$

We see that one of our tests (i.e. test1) passed, and the other two (test2 and test3) failed. Without pytest, we could still run these tests using the top-level test() function that we wrote before, however the program would stop after the first failure. pytest also helpfully shows the source code of the test that failed.

In contrast to unit tests, integration tests test the functionality of groups of modules or the entire program, e.g. by sending input to the program and checking its output.

When we find and fix a bug in a program, we may write a regression test that tests that the bug is gone, and add it to our standard suite of tests for the program. That way we will find out immediately if we ever accidentally reintroduce the same bug in the future.

If a program has a graphical interface, it might not be so easy to test it in an automated way. However, if you write your program using a model-view architecture then you should be able to write automated unit tests for the model portion of your program at least.

Companies that develop software generally spend a lot of time and energy on testing, and often encourage programmers to write lots of tests. Some people have even adopted the practice of test-driven development, in which programmers write unit tests for a function or class before they implement it.

Ideally our tests will check our program's entire set of functionality. There are tools for many languages that will measure test coverage, i.e. the percentage of a program's lines that execute as the suite of tests run. For example, pytest-cov is a plugin that will produce a coverage report for tests run using pytest.

Suppose that our test suite for a program has 80% coverage. Then if there is a bug in the 20% of the lines that were not covered, the tests cannot possibly reveal it. So ideally a program's test will have 100% coverage. However, 100% coverage is often difficult to achieve, since many lines in a program may test obscure edge cases or error conditions, which may be difficult to test automatically.

Furthermore, even 100% coverage does not guarantee that tests will find every possible bug in a program. For example, consider an implementation of quicksort. We might write a single unit test that checks that sorting the array [4, 2, 6] results in the array [2, 4, 6]. This test alone is likely to achieve 100% coverage, since all of the lines of the quicksort function will run as the array is sorted. However, it is certainly possible that the implementation may have bugs which that single test case does not reveal.

To be very sure that software (or hardware) has no bugs, it is possible to write a mathematical proof of correctness of a piece of software (or hardware), and to check the proof in an automated way. However this is relatively expensive and difficult, so it's not done so often in practice. Making this easier is an active area of research.