Week 10: Notes

raising and catching exceptions

You have surely noticed that Python's built-in operators and library functions sometimes report errors. For example, the index() method returns the index of the first occurrence of a value in a sequence, but produces a ValueError if the value is not present:

>>> [3, 4, 5, 6].index(5)
2
>>> [3, 4, 5, 6].index(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 7 is not in list

Similarly, the open() function produces a FileNotFoundError if a file does not exist:

>>> open('non_existent_file')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'non_existent_file'

These errors are actually exceptions, which are a mechanism supported by Python and many other languages. Any code that wants to report an error can raise (= throw) an exception. In the examples above, index() raised a ValueError exception, and open() raised a FileNotFoundError exception.

By default, an exception will terminate the program. However, Python's try … except statement can catch an exception and handle it in some other way. For example, suppose that we want to open a file and read its contents if the file exists, but still continue executing if it does not. We might write

try:
    f = open('poem')
    text = list(f)   # read all file lines into a list
except FileNotFoundError:
    print('warning: poem not found')
    text = []
print(len(text))

If open() runs without error, the code will read the file and the code in the 'except' block will not run. If open() raises a FileNotFoundError, then the code in the 'except' block will run, and then execution will continue normally after the 'try' statement since the exception has been handled. If open() raises some other kind of exception, then the except: block will not run, and the program will terminate (unless some enclosing code catches the exception that has been raised).
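
The same pattern applies to the index() example from the start of this section. Here is a minimal sketch; the helper name find_or_default and its default parameter are made up for illustration:

```python
def find_or_default(seq, value, default=-1):
    """Return the index of value in seq, or default if it is absent."""
    try:
        return seq.index(value)
    except ValueError:
        # index() raised ValueError: the value is not present
        return default

print(find_or_default([3, 4, 5, 6], 5))   # 2
print(find_or_default([3, 4, 5, 6], 7))   # -1
```

An exception of a different type is not caught by this handler. For example, find_or_default(None, 5) raises an AttributeError, which propagates to the caller.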

Note that an exception is actually an object, i.e. an instance of the built-in class Exception or one of its subclasses. ValueError and FileNotFoundError are classes that inherit from Exception. Each type of exception may have attributes that describe the error that occurred. For example, a FileNotFoundError has an attribute 'filename' containing the file that was not found. In a try … except statement, you can give a name to the exception that was caught and can examine its attributes:

name = input('Enter filename: ')

try:
    f = open(name)
    line = f.readline()
except FileNotFoundError as e:
    print(f'file not found: {e.filename}')
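
To check that the caught exception really is an ordinary object, we can inspect it inside the handler. The sketch below is only an illustration: the function name first_line is hypothetical, and the temporary directory is used solely to obtain a path that is certain not to exist.

```python
import os
import tempfile

def first_line(name):
    """Return the first line of the file, or None if it does not exist."""
    try:
        with open(name) as f:
            return f.readline()
    except FileNotFoundError as e:
        # e is the exception object that was raised by open()
        print(f'file not found: {e.filename}')
        print(isinstance(e, Exception))   # True
        return None

# A path inside a fresh temporary directory cannot already exist.
with tempfile.TemporaryDirectory() as d:
    missing = os.path.join(d, 'no_such_file')
    result = first_line(missing)
```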

You may define your own classes of exceptions. For example, suppose that we're writing a stack class and we'd like to report an error if the caller attempts to pop a value from an empty stack. We may write

class EmptyStackException(Exception):
    pass

(As we've seen before, the 'pass' statement does nothing, and we can use it when writing a class with no methods.)

In the class definition above, EmptyStackException is a subclass of Exception, i.e. a special kind of Exception. We'll discuss subclassing more soon (see the notes below).

Now, in our stack class, we might write

def pop(self):
    if self.is_empty():
        raise EmptyStackException()
    …

The raise statement raises an exception. If the caller does not catch the exception, the program will be terminated.

In this example, an EmptyStackException has no attributes. If we like, we could give the EmptyStackException class an __init__() initializer that stores attributes in an instance, and then they would be available to a caller who catches this exception in a try … except statement.
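
As a rough sketch of that idea, here is one possible version in which the exception records which operation failed. The attribute name 'operation' and the list-based Stack class are assumptions made for this illustration only.

```python
class EmptyStackException(Exception):
    def __init__(self, operation):
        self.operation = operation    # hypothetical attribute, for illustration

class Stack:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return len(self.items) == 0

    def push(self, x):
        self.items.append(x)

    def pop(self):
        if self.is_empty():
            raise EmptyStackException('pop')
        return self.items.pop()

s = Stack()
try:
    s.pop()
except EmptyStackException as e:
    # The attribute stored by __init__ is available to the handler.
    print(f'failed operation: {e.operation}')   # failed operation: pop
```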

Note that an exception raised by a function f need not be caught by the immediate caller of f. Consider this example:

def a():
    f = open('poem')
    print('successful open')
    …


def b():
    a()

def c():
    try:
        b()
    except FileNotFoundError:
        print('file not found')

In this code, the call to open() in a() might raise a FileNotFoundError. There is no try … except statement in a(), or in its caller b(). However, c() contains a try … except statement that can catch a FileNotFoundError. If a FileNotFoundError is raised, Python will unwind the call stack, aborting the execution of a() and then b() until it arrives at the try … except statement in c(), which will catch the exception.
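
We can observe the unwinding by recording what happens along the way. This is a sketch of the same a/b/c structure with two additions of my own: a path parameter, and a global list 'events' that logs each step. The temporary directory is there only to guarantee that the file is missing.

```python
import os
import tempfile

events = []     # records the order in which things happen

def a(path):
    events.append('a: before open')
    f = open(path)                  # raises FileNotFoundError
    events.append('a: after open')  # skipped when open() fails
    f.close()

def b(path):
    a(path)
    events.append('b: after a')     # skipped as well

def c(path):
    try:
        b(path)
    except FileNotFoundError:
        events.append('c: caught')
    events.append('c: continuing')

with tempfile.TemporaryDirectory() as d:
    c(os.path.join(d, 'poem'))      # this path does not exist

print(events)
# ['a: before open', 'c: caught', 'c: continuing']
```

The remaining statements in a() and b() never run: control passes directly to the handler in c().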

We see that a raise statement is a form of non-local exit that causes execution to jump to some outer point. In fact we've already seen two other statements in Python that can also jump out from the current execution point. Namely, 'break' immediately exits the current loop, and 'return' immediately exits the current function call. 'raise' is more powerful in that it can immediately exit a series of nested function calls extending from a try … except statement down to the function that raises the exception.

Here's one more point about exceptions. In a try … except statement, you can choose to specify no exception type at all, in which case the statement will catch any exception at all:

try:
    foo()
except:
    print('some error occurred')

However, I don't generally recommend using this form of try … except. A try … except statement is easier to read when it indicates the type of exception that it anticipates. Furthermore, if some sort of error occurs other than the one that you expected to handle, then this form of try … except will catch it too, which may lead to behavior that is surprising and difficult to debug.
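
Here is a small sketch of how this can go wrong. Both functions below contain the same deliberate typo (itn instead of int); the function names are invented for the example.

```python
def parse_loose(s):
    try:
        return itn(s)        # typo: should be int(s)
    except:
        return None          # the NameError from the typo is swallowed here

def parse_strict(s):
    try:
        return itn(s)        # same typo
    except ValueError:
        return None          # NameError is not a ValueError, so it propagates

print(parse_loose('42'))     # None, with no hint that the code is broken

try:
    parse_strict('42')
except NameError as e:
    print(f'bug exposed: {e}')
```

With the bare except, the function quietly returns None for every input. With except ValueError, the typo surfaces immediately as a NameError.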

As a final example, let's write a function that searches through a binary tree for any value that is greater than 90. If it finds one, it will raise an exception containing the value that was found. The tree is not necessarily a binary search tree, so we must search everywhere in the tree using recursion. Here's the code:

class Found(Exception):
    def __init__(self, x):
        self.x = x

class Node:
    def __init__(self, val, left, right):
        self.val = val
        self.left, self.right = left, right

def find(n):
    if n is None:
        return
    if n.val > 90:
        raise Found(n.val)
    find(n.left)
    find(n.right)

def find_large(tree):
    try:
        find(tree)
        return None
    except Found as e:
        return e.x

Notice that the find_large() function catches a Found exception and returns the value in it. This shows that we can use an exception to escape a deeply nested set of recursive calls, and then catch the exception and return normally. However, I would not generally recommend this as a programming technique. It would probably be clearer and more efficient to return a value without using exceptions, as in this recursive function:

def find_large(n):
    if n is None:
        return None
    if n.val > 90:
        return n.val
    v = find_large(n.left)
    if v is not None:
        return v
    return find_large(n.right)

Nevertheless this is an interesting example.
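
As a quick check, we can run both versions on a small tree. In this sketch the two functions are renamed find_large_exc and find_large_rec so that they can coexist in one file, and the sample trees are invented for the test.

```python
class Found(Exception):
    def __init__(self, x):
        self.x = x

class Node:
    def __init__(self, val, left, right):
        self.val = val
        self.left, self.right = left, right

def find(n):
    if n is None:
        return
    if n.val > 90:
        raise Found(n.val)
    find(n.left)
    find(n.right)

def find_large_exc(tree):       # exception-based version
    try:
        find(tree)
        return None
    except Found as e:
        return e.x

def find_large_rec(n):          # plain recursive version
    if n is None:
        return None
    if n.val > 90:
        return n.val
    v = find_large_rec(n.left)
    if v is not None:
        return v
    return find_large_rec(n.right)

#        50
#       /  \
#     20    70
#          /  \
#        95    60
tree = Node(50, Node(20, None, None),
                Node(70, Node(95, None, None), Node(60, None, None)))
small = Node(1, Node(2, None, None), None)

print(find_large_exc(tree), find_large_rec(tree))     # 95 95
print(find_large_exc(small), find_large_rec(small))   # None None
```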

class objects and class attributes

Consider the Point class that we saw in an earlier lecture:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'P({self.x}, {self.y})'

A class is actually an object in Python:

>>> Point
<class '__main__.Point'>

Because a class is an object, we can assign attributes to it. For example:

>>> Point.abc = 7
>>> Point.abc + 1
8

Class attributes are distinct from instance attributes. Each instance of the Point class has its own values of x and y, but there is only one value of the abc attribute, shared by all Point instances.
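
A short sketch makes the difference visible. The points p and q are arbitrary sample values.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

Point.abc = 7

p = Point(1, 2)
q = Point(3, 4)

# A class attribute can also be read through any instance.
print(p.abc, q.abc)      # 7 7

# Changing it on the class is visible through every instance.
Point.abc = 8
print(p.abc, q.abc)      # 8 8

# Instance attributes remain separate.
p.x = 100
print(p.x, q.x)          # 100 3
```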

We might use a class attribute to store a constant instance of a class, for example:

>>> Point.origin = Point(0, 0)

As another example, suppose that we have a Student class, and each student has their own integer ID. We'd like to have a variable next_id that remembers the next ID to be assigned. We could make this a global variable, but it might be nicer to place it in the Student class. So we can use a class attribute:

class Student:
    next_id = 0

    def __init__(self, name):
        self.name = name
        self.id = Student.next_id
        Student.next_id += 1

Notice that we can initialize a class attribute inside a class definition. Python will initialize this attribute only once, when it executes the class definition, not every time it creates a new instance of the class.
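
For instance, creating two students shows the counter advancing. This is a small sketch using the class above; the student names are arbitrary.

```python
class Student:
    next_id = 0

    def __init__(self, name):
        self.name = name
        self.id = Student.next_id
        Student.next_id += 1

alice = Student('Alice')
bob = Student('Bob')
print(alice.id, bob.id)      # 0 1
print(Student.next_id)       # 2
```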

class methods

In the Student class above, suppose that we'd like to add a method that resets next_id to zero (e.g. when a new semester begins and a new set of students arrives). We could attempt to write the method like this:

def reset(self):
    Student.next_id = 0

However, with this syntax we'll need to invoke the method on some particular instance of the Student class. That seems wrong, since the method has nothing to do with any particular student.

Instead, we can make this a class method in the Student class:

    @classmethod
    def reset(cls):
        cls.next_id = 0

A class method is invoked on the class object itself, not on an instance of the class:

>>> Student.reset()

The first parameter of a class method is traditionally called cls, and refers to the class object itself. In this example, cls will be the Student class, so cls.next_id is just the same as Student.next_id.
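
Putting the pieces together, a complete sketch of the Student class with its class method might look like this (the student names are arbitrary):

```python
class Student:
    next_id = 0

    def __init__(self, name):
        self.name = name
        self.id = Student.next_id
        Student.next_id += 1

    @classmethod
    def reset(cls):
        cls.next_id = 0

a = Student('Alice')
b = Student('Bob')
Student.reset()              # called on the class, not on a or b
c = Student('Carol')
print(a.id, b.id, c.id)      # 0 1 0
```

Resetting the counter does not change the IDs of students that already exist; it only affects the IDs assigned afterward.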

As another example of a class method, consider the Vec class that we saw a few lectures ago representing a vector with any number of dimensions:

class Vec:
    def __init__(self, *coords):
        self.a = coords
    ...

Suppose that we'd like to write a function that builds a Vec from a string such as "[3.0 4.0 5.0]". We could use a global function, but it's nicer to place this in the Vec class itself. We can use a class method:

    @classmethod
    def from_str(cls, s):
        assert s[0] == '[' and s[-1] == ']'
        a = [float(w) for w in s[1:-1].split()]
        return cls(*a)

Let's call it:

>>> v = Vec.from_str('[3.0 4.0 5.0]')
>>> v.a
(3.0, 4.0, 5.0)

In the line return cls(*a) above, cls is the Vec class itself, so this is equivalent to return Vec(*a).

inheritance

Like other object-oriented languages, Python supports inheritance, a mechanism that allows a class to extend another class and to change its behavior.

Suppose that we're writing software for a school. We might have a class Person, representing any person at the school. This class might have attributes such as name, address, year of birth, and so on. Some people are students, so we could write a subclass Student that inherits from the Person class. A Student has all the attributes of a Person, and might have additional attributes that represent the courses the student is taking, how many years they have been studying, their expected degree, and so on. In this situation, we say that Person is a superclass or base class or parent class.

Similarly, we could have another subclass Teacher that also inherits from Person, and has other attributes such as their salary, the number of years they have been teaching, and so on.

The world is full of category relationships such as these. If we're writing a program that manages events, we could have a parent class Event and subclasses such as Concert, Film, Play and so on. Or, for a program that manages businesses in a city, we could have a parent class Business and subclasses such as Restaurant, Bank, and Shop.

A subclass automatically inherits its parent's attributes and methods. It may add additional attributes and/or methods, and may also override any of its parent's methods by providing an alternate implementation of them. When overriding a method, the subclass may choose to call the parent's version of the same method as part of its activity.

To make these ideas more concrete, let's look at an example. Consider a class that implements a stack using a linked list. We saw this in a recent algorithms lecture:

class Node:
    def __init__(self, val, next):
        self.val = val
        self.next = next

class LinkedStack:
    def __init__(self):
        self.head = None
    
    def push(self, x):
        n = Node(x, self.head)
        self.head = n
  
    def pop(self):
        assert self.head is not None, 'stack is empty'
        x = self.head.val
        self.head = self.head.next
        return x

    def is_empty(self):
        return self.head is None

We'd now like to write a class StatStack that is like LinkedStack, but has an additional attribute count containing the number of values that are currently on the stack, plus a method avg() that returns their average. We would like avg() to run in O(1). To achieve this, StatStack will remember both the number of values currently on the stack and also their sum.

We can write StatStack using inheritance:

class StatStack(LinkedStack):
    def __init__(self):
        super().__init__()   # call the __init__ method in the superclass
        self.total = 0
        self.count = 0

    def push(self, x):
        super().push(x)
        self.total += x
        self.count += 1

    def pop(self):
        x = super().pop()
        self.total -= x
        self.count -= 1
        return x

    def sum(self):
        return self.total

    def avg(self):
        return self.total / self.count

Above, the notation class StatStack(LinkedStack) means that the class StatStack inherits from LinkedStack.

StatStack has an initializer __init__() that first calls the base class initializer:

super().__init__()   # call the __init__ method in the superclass

The special function super() returns a proxy for the object that this method was invoked on (the same object as 'self'), but looks up methods starting in the parent class, so that super().__init__() will call the __init__ method in the parent class of StatStack. After that call returns, __init__ (in the StatStack class) initializes the 'total' and 'count' attributes to 0.

StatStack overrides the push() and pop() methods from its parent class, meaning that StatStack provides its own implementation of these methods. In the push() method, StatStack calls super().push(x) to call the same-named method in the base class. It then runs self.total += x to update the running total. pop() is similar.

Let's try it:

>>> s = StatStack()
>>> s.push(5)
>>> s.push(10)
>>> s.push(45)
>>> s.avg()
20.0
>>> s.pop()
45
>>> s.avg()
7.5
>>> s.is_empty()
False

Our calls to push(), avg() and pop() invoke the implementations inside the StatStack class. StatStack has no implementation of is_empty(), so when we call is_empty() it invokes the implementation inside the parent class LinkedStack.

A subclass may itself have subclasses. For example, we could have classes like these:

class Business:
    ...

class Restaurant(Business):    # A restaurant is a type of business
    ...

class Cafe(Restaurant):        # A cafe is a kind of restaurant
    ...

In general, when we call any method of an object o, Python will use the most derived implementation, i.e. the one defined in o's class itself or otherwise in the nearest superclass that has a definition of the method. For example, suppose that we have an object c of type Cafe and we call a method c.hours(). If hours() has a definition in the Cafe class, it will be called. Otherwise, Python will look for a definition in the Restaurant class, and then in the top-level Business class, and will call the first definition that it finds.
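
The lookup order can be seen in a small sketch. The method bodies and the extra method kind() are invented for the illustration; only the class names come from the example above.

```python
class Business:
    def hours(self):
        return '9-17'

    def kind(self):
        return 'business'

class Restaurant(Business):
    def hours(self):            # overrides Business.hours
        return '11-22'

class Cafe(Restaurant):
    def kind(self):             # overrides Business.kind
        return 'cafe'

c = Cafe()
print(c.hours())    # 11-22   (found in Restaurant, the nearest superclass)
print(c.kind())     # cafe    (found in Cafe itself)
```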

multiple inheritance

In some languages including Python and C++, a class may also have multiple superclasses. This complicates matters somewhat. Suppose that a class A derives from both B and C, and we create an instance 'a' of A and then call a.foo(). If A has no definition of foo() but both B and C do, then which superclass implementation will be invoked? Languages with multiple inheritance (including Python) have rules for resolving situations such as this one, which may be somewhat complex. However, we will not discuss multiple inheritance further in this course.