Programming 1, 2021-2
Week 9: Notes

Some of today's topics are covered in these sections of Think Python:

Here are some more notes.

Magic methods for operator overloading

Python's magic methods let us specify custom behavior for our classes. We have already seen two magic methods in Python: __init__ for initializing an object, and __repr__ for converting an object to a string.

Additional magic methods let us customize the behavior of the +, -, *, and / operators: __add__, __sub__, __mul__, and __truediv__, respectively.

More similar operators exist; you can read about them in the official Python reference. But the ones listed here are sufficient for our purposes for now.

By the way, providing custom behavior for operators such as these is called operator overloading, and is possible in many (but not all) programming languages.

To see how these are used, let's write a class Vector that can represent a vector with an arbitrary number of elements. We'll include a __repr__ method so that vectors print out nicely:

class Vector:
    def __init__(self, *args):
        self.a = args

    # Generate a string representation such as [3 5 10].
    def __repr__(self):
        l = [str(x) for x in self.a]
        return '[' + ' '.join(l) + ']'

We've already seen that a parameter such as "*args" allows a function or method to accept an arbitrary number of arguments, which are gathered into a single tuple. So our initializer sets the attribute 'a' to hold a tuple of values in the vector:

>>> v = Vector(2.0, 4.0, 5.0)
>>> v.a
(2.0, 4.0, 5.0)

We will now implement a magic method __add__ that allows the operator + to combine instances of our Vector class:

def __add__(self, w):
    assert len(self.a) == len(w.a), 'vectors must have same dimension'
    sum = []
        
    for i in range(len(self.a)):
        sum.append(self.a[i] + w.a[i])
        
    return Vector(*sum)

Notice that when we call the Vector constructor, we must use the '*' operator to explode the values from 'sum' into separate arguments, because the initializer function expects each coordinate to be a separate argument. (The initializer will gather all these arguments back into a tuple.)

Now we can add Vector objects using +:

$ py -i vector.py 
>>> v = Vector(2.0, 4.0, 5.0)
>>> w = Vector(1.0, 2.0, 3.0)
>>> z = v + w
>>> z
[3.0 6.0 8.0]

Behind the scenes, the '+' operator just calls the magic method, and in fact we can call it directly if we like:

>>> v.__add__(w)
[3.0 6.0 8.0]

We could similarly implement a method __sub__ that will allow the - operator to subtract two Vectors, or a method __mul__ that causes the * operator to compute the dot product of two Vectors.
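As a minimal sketch of that idea, the class below repeats the Vector definition from above and adds __sub__ and __mul__ in the same style as __add__. Having * return a plain number follows the dot-product description above; the local names 'diff' and 'total' are only illustrative.

```python
class Vector:
    def __init__(self, *args):
        self.a = args

    # Generate a string representation such as [3 5 10].
    def __repr__(self):
        l = [str(x) for x in self.a]
        return '[' + ' '.join(l) + ']'

    # v - w: subtract corresponding elements, producing a new Vector.
    def __sub__(self, w):
        assert len(self.a) == len(w.a), 'vectors must have same dimension'
        diff = []
        for i in range(len(self.a)):
            diff.append(self.a[i] - w.a[i])
        return Vector(*diff)

    # v * w: the dot product, which is a number rather than a Vector.
    def __mul__(self, w):
        assert len(self.a) == len(w.a), 'vectors must have same dimension'
        total = 0
        for i in range(len(self.a)):
            total += self.a[i] * w.a[i]
        return total

v = Vector(2.0, 4.0, 5.0)
w = Vector(1.0, 2.0, 3.0)
print(v - w)    # prints [1.0 2.0 2.0]
print(v * w)    # prints 25.0
```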

class objects

In Python, every class is itself an object. For example, let's revisit the Point class from the last lecture:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

>>> Point
<class '__main__.Point'>

This is how Python prints out a class. It is an object like any other. For example, we can put it into a variable:

>>> xyz = Point
>>> xyz
<class '__main__.Point'>
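Because a class is an ordinary object, a variable such as xyz can be used anywhere that Point can. Here is a small sketch; the helper function make_pair is hypothetical and exists only for illustration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Hypothetical helper: takes any class whose initializer accepts two
# arguments, and builds an instance of it.
def make_pair(cls, a, b):
    return cls(a, b)

xyz = Point                  # xyz refers to the same class object as Point
p = xyz(3, 4)                # calling xyz creates a Point
q = make_pair(Point, 5, 6)   # a class can be passed as an argument

print(xyz is Point)    # prints True
print(p.x, p.y)        # prints 3 4
print(q.x, q.y)        # prints 5 6
```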

testing an object's type

Let's look at an instance of the Point class:


>>> p = Point(3, 4)
>>> p
<__main__.Point object at 0x7fc34f4107d0>
>>> type(p)
<class '__main__.Point'>

The type() function returns the type of any object. In Python, all types are classes, so type() reveals an object's class. We can confirm that type(p) is the class object itself:

>>> type(p) is Point
True
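The same test works for built-in types, since they are classes too. The following sketch also looks at the type of the class object itself, which Python reports as the built-in class 'type'.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)

print(type(p) is Point)       # prints True
print(type(3) is int)         # prints True
print(type('hello') is str)   # prints True

# A class is an object, so it has a type as well.
print(type(Point))            # prints <class 'type'>
```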

class attributes

Because a class is an object, we can assign attributes to it. For example:

>>> Point.abc = 7
>>> Point.abc + 1
8

Class attributes are distinct from instance attributes. Each instance of the Point class has its own values of x and y, but there is only one value of the abc attribute, shared by all Point instances.

We might use a class attribute to store a constant instance of a class, for example:

>>> Point.origin = Point(0, 0)
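To illustrate the difference between instance attributes and class attributes described above, here is a short sketch. It relies on the fact that a class attribute can also be read through any instance.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

Point.abc = 7
Point.origin = Point(0, 0)

p = Point(3, 4)
q = Point(5, 6)

# Instance attributes: each object has its own x and y.
print(p.x, q.x)          # prints 3 5

# Class attribute: stored once on the class, visible through every instance.
print(p.abc, q.abc)      # prints 7 7

# Changing it on the class changes what every instance sees.
Point.abc = 8
print(p.abc, q.abc)      # prints 8 8

print(Point.origin.x, Point.origin.y)   # prints 0 0
```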

As another example, suppose that we have a Student class, and each student has their own integer ID. We could use a class attribute to store the next ID to be assigned:

class Student:
    next_id = 0

    def __init__(self, name):
        self.name = name
        self.id = Student.next_id
        Student.next_id += 1

Notice that we can initialize a class attribute inside a class definition. Python will initialize this attribute only once, as it reads the class definition - not every time it creates a new instance of a class.
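A quick sketch of the class in use (the student names are made up for illustration) shows that each new instance receives the next ID, and that the counter lives on the class:

```python
class Student:
    next_id = 0    # initialized once, when the class definition is read

    def __init__(self, name):
        self.name = name
        self.id = Student.next_id
        Student.next_id += 1

a = Student('Anna')
b = Student('Boris')

print(a.id, b.id)        # prints 0 1
print(Student.next_id)   # prints 2
```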

inheritance

Like other object-oriented languages, Python supports inheritance, a mechanism that allows a class to extend another class and to change its behavior.

Suppose that we're writing software for a school. We might have a class Person, representing any person at the school. This class might have attributes such as name, address, year of birth, and so on. Some people are students, so we could write a subclass Student that inherits from the Person class. A Student has all the attributes of a Person, and might have additional attributes that represent the courses the student is taking, how many years they have been studying, their expected degree, and so on. In this situation, we say that Person is a superclass or base class or parent class.

Similarly, we could have another subclass Teacher that also inherits from Person, and has other attributes such as their salary, the number of years they have been teaching, and so on.
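The school example might be sketched as follows. All attribute names, the describe() method, and the sample values are hypothetical, chosen only to illustrate the idea. The call super().__init__(...) runs the parent's initializer; this notation is explained later in these notes.

```python
class Person:
    def __init__(self, name, birth_year):
        self.name = name
        self.birth_year = birth_year

    def describe(self):
        return f'{self.name} (born {self.birth_year})'

# A Student is a Person with some additional attributes.
class Student(Person):
    def __init__(self, name, birth_year, degree):
        super().__init__(name, birth_year)
        self.degree = degree
        self.courses = []

# A Teacher is also a Person, with different additional attributes.
class Teacher(Person):
    def __init__(self, name, birth_year, salary):
        super().__init__(name, birth_year)
        self.salary = salary

s = Student('Eva', 2002, 'Bc.')
t = Teacher('Jan', 1975, 50000)

# describe() is defined only in Person, but both subclasses inherit it.
print(s.describe())   # prints Eva (born 2002)
print(t.describe())   # prints Jan (born 1975)
```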

The world is full of category relationships such as these. If we're writing a program that manages events, we could have a parent class Event and subclasses such as Concert, Film, Play and so on. Or, for a program that manages businesses in a city, we could have a parent class Business and subclasses such as Restaurant, Bank, and Shop.

A subclass automatically inherits its parent's attributes and methods. It may add additional attributes and/or methods, and may also override any of its parent's methods by providing an alternate implementation of them. When overriding a method, the subclass may choose to call the parent's version of the same method as part of its activity.

To make these ideas more concrete, let's look at an example. Consider a class that implements a stack using a linked list. We saw this in the last algorithms lecture:

class Node:
    def __init__(self, val, next):
        self.val = val
        self.next = next

class LinkedStack:
    def __init__(self):
        self.head = None
    
    def push(self, x):
        n = Node(x, self.head)
        self.head = n
  
    def pop(self):
        assert self.head is not None, 'stack is empty'
        x = self.head.val
        self.head = self.head.next
        return x

We'd now like to write a class StatStack that is like LinkedStack, but additionally reports the number of values currently on the stack (in an attribute count), their sum (via a method sum()), and their average (via a method avg()). We would like all of these to run in O(1). To achieve that, StatStack will keep track of the number of values currently on the stack, as well as their sum.

We can write StatStack using inheritance:

class StatStack(LinkedStack):
    def __init__(self):
        super().__init__()   # call the __init__ method in the superclass
        self.total = 0       # total of all values currently on the stack
        self.count = 0       # number of values on the stack        

    def push(self, x):
        super().push(x)
        self.total += x
        self.count += 1

    def pop(self):
        x = super().pop()
        self.total -= x
        self.count -= 1
        return x

    def sum(self):
        return self.total

    def avg(self):
        return self.total / self.count

Above, the notation class StatStack(LinkedStack) means that the class StatStack inherits from LinkedStack.

StatStack has an initializer __init__() that first calls the base class initializer:

        super().__init__()   # call the __init__ method in the superclass

The special function super() returns a proxy for the object that this method was invoked on (the same object as 'self'), but looks up methods starting from the parent class, so that super().__init__() calls the __init__ method defined in the parent class. After that call returns, __init__ (in the StatStack class) initializes the 'total' and 'count' attributes to 0.

StatStack overrides the push() and pop() methods from its parent class, meaning that StatStack provides its own implementation of these methods. In the push() method, StatStack calls super().push(x) to call the same-named method in the base class. It then runs self.total += x to update the running total. pop() is similar.
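To see the overridden methods at work, here is a self-contained sketch that repeats the classes from above and then exercises a StatStack. The pushed values are arbitrary.

```python
class Node:
    def __init__(self, val, next):
        self.val = val
        self.next = next

class LinkedStack:
    def __init__(self):
        self.head = None

    def push(self, x):
        self.head = Node(x, self.head)

    def pop(self):
        assert self.head is not None, 'stack is empty'
        x = self.head.val
        self.head = self.head.next
        return x

class StatStack(LinkedStack):
    def __init__(self):
        super().__init__()   # call the __init__ method in the superclass
        self.total = 0       # total of all values currently on the stack
        self.count = 0       # number of values on the stack

    def push(self, x):
        super().push(x)      # let LinkedStack store the value
        self.total += x
        self.count += 1

    def pop(self):
        x = super().pop()    # let LinkedStack remove the value
        self.total -= x
        self.count -= 1
        return x

    def sum(self):
        return self.total

    def avg(self):
        return self.total / self.count

s = StatStack()
for x in [4, 6, 8]:
    s.push(x)

print(s.count)   # prints 3
print(s.sum())   # prints 18
print(s.avg())   # prints 6.0
print(s.pop())   # prints 8
print(s.avg())   # prints 5.0
```

Note that avg() divides by self.count, so in this sketch calling it on an empty stack raises ZeroDivisionError.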

iterating with zip()

The handy zip() function lets you iterate over two (or more) sequences simultaneously.

For example, let's use zip() to zip together two lists, and iterate over the result:

>>> l = [2, 4, 6, 8, 10]
>>> m = [20, 40, 60, 80, 100]
>>> for p in zip(l, m):
...     print(p)
(2, 20)
(4, 40)
(6, 60)
(8, 80)
(10, 100)

Notice that on each iteration we receive a pair of values, one from each of the lists that we zipped. (Think of a zipper on a jacket that pulls together two edges as it moves upward.)

Also, recall that in a 'for' loop in Python we may use pattern matching (tuple unpacking) to bind multiple variables on each iteration. So we may write, for instance:

>>> for x, y in zip(l, m):
...     print(f'x = {x}, y = {y}')
x = 2, y = 20
x = 4, y = 40
x = 6, y = 60
x = 8, y = 80
x = 10, y = 100

Now we are iterating over two lists, matching x to items from the first list and y to items from the second.

Note that zip() will stop as soon as it reaches the end of the shortest sequence:

>>> l = [2, 4, 6]
>>> m = [20, 40, 60, 80, 100]
>>> list(zip(l, m))
[(2, 20), (4, 40), (6, 60)]
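As mentioned above, zip() also accepts more than two sequences. In this small sketch (the list contents are arbitrary), each iteration yields a triple, which we unpack into three variables:

```python
names = ['x', 'y', 'z']
v = [2, 4, 6]
w = [20, 40, 60]

for name, a, b in zip(names, v, w):
    print(f'{name}: {a} + {b} = {a + b}')
# prints:
# x: 2 + 20 = 22
# y: 4 + 40 = 44
# z: 6 + 60 = 66

triples = list(zip(names, v, w))
print(triples)   # prints [('x', 2, 20), ('y', 4, 40), ('z', 6, 60)]
```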

We can use zip() to simplify some loops. For example, consider the method for adding two Vector objects that we saw above:

# add self and another vector w, and return the vector sum
def __add__(self, w):
    assert len(self.a) == len(w.a)

    sum = []
    for i in range(len(self.a)):
        sum.append(self.a[i] + w.a[i])

    return Vector(*sum)

Instead of iterating over indices, let's use zip():

def __add__(self, w):
    assert len(self.a) == len(w.a)

    sum = []
        
    for x, y in zip(self.a, w.a):
        sum.append(x + y)

    return Vector(*sum)

We may shorten the code even more by using zip() together with a list comprehension:

def __add__(self, w):
    assert len(self.a) == len(w.a)

    sum = [x + y for x, y in zip(self.a, w.a)]

    return Vector(*sum)
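The same pattern works for the dot product mentioned earlier. The sketch below is a standalone function on plain tuples (the name dot is hypothetical) that combines zip() with the built-in sum() function. Note that it deliberately has no local variable called 'sum', since such a variable would hide the built-in inside the function.

```python
def dot(a, b):
    assert len(a) == len(b), 'sequences must have same length'
    # Multiply corresponding elements and add up the products.
    return sum(x * y for x, y in zip(a, b))

print(dot((2.0, 4.0, 5.0), (1.0, 2.0, 3.0)))   # prints 25.0
```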