Programming 1
Week 10: Notes

Some of this week's topics are covered in Introducing Python:

Here are some additional notes.

writing good code

We now know enough Python that we are starting to write larger programs. Especially when writing a larger program, we want to write code in a way that is structured, clear, and maintainable. Functions and classes are essential tools for this purpose.

Here are three general rules to follow for writing good code:

  1. Don't repeat yourself. Beginning programmers often write code that looks like this:

if x > 0:
    … 20 lines of code …
else:
    … the same 20 lines of code, with a few small changes …

    This is bad code. It is hard to read: the differences between the 20-line blocks may be important, but hard to spot. And it is hard to maintain. Every time you change one of the parallel blocks of code, you must change the other. That is a chore, and it's also easy to forget to do that.

    In this situation you should factor out the 20 lines of code into a separate function. You can use function parameters to produce the differences in behavior between the two 20-line blocks. Then you can write

if x > 0:
    my_fun(… arguments)
else:
    my_fun(… different arguments)

  2. Every function should fit on a single screen. Practically speaking this means that functions should generally be limited to about 50 lines. Functions that are much longer than this quickly become hard to understand.

  3. Make variables as local as possible. In other words, avoid global variables when possible. (In many programming languages you can declare a variable inside a loop body or other block within a function, so that it is visible only inside that block. That is generally a good practice. Unfortunately Python has no block scope: a variable assigned inside a block stays visible until the end of the enclosing function.)
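As a small sketch of the first rule, here is how two nearly identical branches might collapse into calls to one function. The names report and describe, and the strings they build, are hypothetical and chosen only for illustration:

```python
# Hypothetical example: report() and describe() are illustrative names.
def report(label, value, sign):
    """Build one summary line. The parameters capture what differed between the branches."""
    magnitude = abs(value)
    return f"{label}: {sign}{magnitude}"

def describe(x):
    # Both branches call the same function; only the arguments differ.
    if x > 0:
        return report("gain", x, "+")
    else:
        return report("loss", x, "-")

print(describe(5))    # gain: +5
print(describe(-3))   # loss: -3
```

Any later change to the shared logic now happens in one place, inside report.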

iterables and sequences

We have now learned about various kinds of sequences and other iterable objects in Python. Let's look at these again to clarify the difference between them.

A sequence is an object s containing a series of elements that you can access using the syntax s[i]. Strings, lists, tuples, and ranges are all sequences.

Some operations that we can perform on all sequences include
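For instance, len, indexing, slicing, and the in operator work on every sequence type named above. The following is a short sketch; the helper probe and the example values are assumptions made for illustration:

```python
def probe(s):
    """Apply operations shared by all sequences. Assumes s is non-empty."""
    # length, first element, last element, a slice, and a membership test
    return len(s), s[0], s[-1], list(s[1:3]), s[0] in s

# A string, a list, a tuple, and a range all support the same operations.
for s in ["hello", [5, 10, 15, 20], (1, 2, 3), range(10, 15)]:
    print(probe(s))
```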

An iterable object is anything that you can loop over using the for statement. To put it differently, an iterable object can produce a stream of values that you can visit one at a time.

All sequences are iterable, but not all iterables are sequences. In addition to the sequence types listed above, iterables in Python include (for example) sets, dictionaries, and file objects. A generator comprehension also produces an iterable.

Some operations that we can perform on any iterable it include
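For instance, built-in functions such as list, sorted, sum, max, and any accept any iterable, including ones that are not sequences. The example values below are assumptions chosen for illustration:

```python
numbers = {3, 1, 2}              # a set: iterable, but not a sequence
ages = {"ann": 30, "bob": 25}    # iterating over a dictionary yields its keys

print(sorted(numbers))                  # [1, 2, 3]
print(sum(numbers))                     # 6
print(max(numbers))                     # 3
print(list(ages))                       # ['ann', 'bob']
print(any(n > 2 for n in numbers))      # True
```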

As mentioned above, we can use for to loop over any iterable. Another way to visit all of an iterable's elements is to use an iterator. The built-in function iter takes an iterable and returns an iterator. You can call the next function to retrieve each element in turn from an iterator. When there are no more elements, next raises a StopIteration exception:

>>> i = iter([5, 10, 15])
>>> next(i)
5
>>> next(i)
10
>>> next(i)
15
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

An iterator can make only a single pass over an iterable's elements. If you want to visit all the elements again, you must construct another iterator for that purpose.
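A brief sketch of this single-pass behavior, using illustrative variable names:

```python
values = [5, 10, 15]

it = iter(values)
first_pass = list(it)     # consumes every element of the iterator
second_pass = list(it)    # the same iterator has nothing left to produce

fresh = list(iter(values))  # a new iterator starts again from the beginning

print(first_pass)    # [5, 10, 15]
print(second_pass)   # []
print(fresh)         # [5, 10, 15]
```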

The iter and next functions are a lower-level iteration mechanism than the for statement. (Internally, for relies on the magic methods __iter__ and __next__, which are also what iter and next call.)
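To make the relationship concrete, here is a rough sketch of what a for loop does, written with iter and next directly. The function name manual_for is hypothetical, and the real for statement is implemented inside the interpreter rather than in Python code like this:

```python
def manual_for(iterable):
    """Collect elements in the order a for loop would visit them."""
    result = []
    it = iter(iterable)          # obtain an iterator once, up front
    while True:
        try:
            x = next(it)         # fetch the next element
        except StopIteration:    # no more elements: leave the loop
            break
        result.append(x)         # this line plays the role of the loop body
    return result

print(manual_for("abc"))      # ['a', 'b', 'c']
print(manual_for(range(3)))   # [0, 1, 2]
```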

first-class functions

In Python, functions are first-class values. That means that we can work with functions just like with other values such as integers and strings: we can refer to functions with variables, pass them as arguments, return them from other functions, and so on.
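As a quick sketch of these possibilities, the following stores functions in a list and returns one from another function. The names square, cube, and choose are hypothetical:

```python
def square(x):
    return x * x

def cube(x):
    return x * x * x

def choose(power):
    # Return a function object; note that it is not called here.
    return square if power == 2 else cube

table = [square, cube]                 # functions stored in a list
results = [f(3) for f in table]        # call each stored function
print(results)                         # [9, 27]

g = choose(2)                          # g now refers to square
print(g(5))                            # 25
```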

Here is a Python function that adds the numbers from 1 to 1,000,000:

def big_sum():
    total = 0
    for i in range(1, 1_000_001):
        total += i
    return total

We can put this function into a variable f:

>>> f = big_sum

And now we can call f just like the original function big_sum:

>>> f()
500000500000

Let's write a function time_it that takes a function as an argument:

import time

def time_it(f):
    start = time.time()
    x = f()                 # run the function being timed
    end = time.time()
    print(f'function ran in {end - start:.2f} seconds')
    return x

Given any function f, time_it runs f and measures the time that elapses while f is running. It prints this elapsed time, and then returns whatever f returned:

>>> time_it(big_sum)
function ran in 0.04 seconds
500000500000

This is a first example illustrating that it can be useful to pass functions to functions. As we will see, there are many other reasons why we might want to do this.

As another example, here is a function max_by that finds the maximum value in an input sequence, while applying a function f to each element before comparing values:

def max_by(seq, f):
    max_elem = None
    max_val = None
    for x in seq:
        v = f(x)
        if max_elem is None or v > max_val:
            max_elem = x
            max_val = v
    return max_elem

We can use max_by to find the longest list in a list of lists:

>>> max_by([[1, 7], [3, 4, 5], [2]], len)
[3, 4, 5]

Or we can use it to find the list whose last element is greatest:

def last(s):
    return s[-1]

>>> max_by([[1, 7], [3, 4, 5], [2]], last)
[1, 7]

This capability is so useful that it's built into the standard library. The standard function max can take a keyword argument key holding a function that works exactly like the second argument to max_by:

>>> max([[1, 7], [3, 4, 5], [2]], key = len)
[3, 4, 5]

The built-in function sorted and the sort() method take a similar key argument, so that you can sort by any attribute you like. For example:

>>> l = [[2, 7], [1, 3, 5, 2], [3, 10, 6], [8]]
>>> l.sort(key = len)
>>> l
[[8], [2, 7], [3, 10, 6], [1, 3, 5, 2]]
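Unlike the sort() method, sorted returns a new list and leaves its argument unchanged. A short sketch with an illustrative list of words, also using the reverse keyword argument that sorted accepts:

```python
words = ["pear", "fig", "banana", "kiwi"]

by_length = sorted(words, key=len)
longest_first = sorted(words, key=len, reverse=True)

print(by_length)       # ['fig', 'pear', 'kiwi', 'banana']
print(longest_first)   # ['banana', 'pear', 'kiwi', 'fig']
print(words)           # ['pear', 'fig', 'banana', 'kiwi'] (unchanged)
```

Words of equal length keep their original relative order in both results, because Python's sort is stable.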

lambda expressions

Let's return to the previous example where we were given a list of lists, and found the list whose last element is greatest:

def last(s):
    return s[-1]

>>> max_by([[1, 7], [3, 4, 5], [2]], last)
[1, 7]

It's a bit of a nuisance to have to define a separate function last here. Instead, we can use a lambda expression:

>>> max_by([[1, 7], [3, 4, 5], [2]], lambda l: l[-1])
[1, 7]

A lambda expression creates a function "on the fly", without giving it a name. In other words, a lambda expression creates an anonymous function.

A function created by a lambda expression is no different from any other function: we can call it, pass it as an argument, and so forth. Even though the function is initially anonymous, we can certainly put it into a variable:

>>> abc = lambda x, y: 2 * x + y
>>> abc(10, 3)
23

The assignment to abc above is basically equivalent to

def abc(x, y):
    return 2 * x + y

which is how we would more typically define this function.
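Lambda expressions tend to be most convenient as short key arguments. As a sketch, here they are with the built-in functions sorted and max, applied to the same list of lists used earlier:

```python
lists = [[1, 7], [3, 4, 5], [2]]

# Sort by each list's last element.
by_last = sorted(lists, key=lambda l: l[-1])
print(by_last)        # [[2], [3, 4, 5], [1, 7]]

# Find the list whose elements have the greatest sum.
biggest = max(lists, key=lambda l: sum(l))
print(biggest)        # [3, 4, 5]
```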