Programming 1, 2021-2
Week 12: Notes

Some of today's topics are covered in these sections of Think Python:

Here are some more notes.

modules

A module is a collection of definitions that Python code can import and use. Actually we've been using modules in Python's standard library for many weeks now. For example, the line

import math

lets us use functions in the math module, which is built into the standard library. This statement loads the module into memory and actually makes a variable called 'math' that points to it. Like everything else in Python, a module is an object:

>>> import math
>>> math
<module 'math' (built-in)>

After that, we can access any name defined by the module by prefixing it with the module name:

>>> math.sin(0)
0.0

Alternatively, we may wish to import some of the module's names directly into our namespace, so that we can access them without a prefix. We can do that using a 'from…import' statement:

>>> from math import sin, cos
>>> sin(0) + cos(0)
1.0

In fact we may want to import all of a module's names, which we can do using the '*' wildcard character:

>>> from math import *
>>> log(1) + sqrt(1)
1.0

Finally, the 'import...as' statement will import a module using an alternative name of our choice:

>>> import math as m
>>> m.ceil(4.5) + m.floor(4.5)
9

writing modules

Writing modules in Python is easy. In fact, every Python source file is a module!

Let's make a source file with a couple of statistics functions. We'll call it 'stats.py':

# stats.py

def avg(nums):
    return sum(nums) / len(nums)

def variance(nums):
    a = avg(nums)
    d = avg([(n - a) ** 2 for n in nums])
    return d

Python considers this source file to be a module with the name 'stats'. Let's write another source file 'top.py' that imports this module and calls one of its functions:

# top.py
import sys
import stats

nums = [float(line) for line in sys.stdin]
a = stats.avg(nums)
v = stats.variance(nums)
print(f'avg = {a:.2f}, variance = {v:.2f}')

In this program, the first statement 'import sys' imports a built-in module, and the second statement 'import stats' imports a module defined by a Python file in the same directory. In general, 'import' looks for modules in every directory in Python's built-in search path. If you're curious, you can see the search path in the variable sys.path. On my system it looks like this:

>>> import sys
>>> sys.path
['', '/home/adam/lib/python', '/usr/lib/python39.zip', '/usr/lib/python3.9', '/usr/lib/python3.9/lib-dynload', '/home/adam/.local/lib/python3.9/site-packages', '/home/adam/.local/lib/python3.9/site-packages/ofxstatement_cz_komercni-0.0.1-py3.9.egg', '/home/adam/.local/lib/python3.9/site-packages/ofxstatement_us_first_republic-0.0.1-py3.9.egg', '/usr/local/lib/python3.9/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.9/dist-packages']
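If you want to check which file a module was actually loaded from, one option is to look at its '__file__' attribute. Here is a minimal sketch. The helper name 'module_origin' is made up for this example, and it assumes that modules compiled into the interpreter (such as 'sys') have no '__file__' attribute, which is the case in standard CPython:

```python
import importlib
import sys


def module_origin(name):
    """Return the file a module was loaded from, or '(built-in)' if it has none."""
    module = importlib.import_module(name)   # same effect as 'import <name>'
    return getattr(module, '__file__', None) or '(built-in)'


# Print each directory that 'import' searches, in order.
for directory in sys.path:
    print(repr(directory))

print(module_origin('random'))   # a file named random.py in the standard library
print(module_origin('sys'))      # sys is compiled into the interpreter itself
```

The exact paths printed will differ from system to system.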

packages

A package is a special kind of module that can contain both top-level definitions and other modules.

Creating a package is easy, since Python will actually consider any directory to be a package. For example, let's take the file 'stats.py' from the previous section and place it in a subdirectory called 'utils'. Now 'utils' is a package that contains the module 'stats'. We could place other modules into 'utils' as well. Back in the parent directory, we can make a file 'top2.py' that imports the 'stats' module from this package:

# top2.py
from utils import stats

print(stats.avg([10, 20, 30]))
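If you'd like to try this without rearranging your own files, the following sketch builds a throwaway package in a temporary directory and imports from it. The package name 'demo_utils' is made up for this example (chosen so that it won't clash with any 'utils' directory you already have), and the temporary directory is added to sys.path only for the duration of the import:

```python
import os
import sys
import tempfile

# Source text for a tiny 'stats' module (the same avg function as before).
STATS_SOURCE = '''
def avg(nums):
    return sum(nums) / len(nums)
'''

with tempfile.TemporaryDirectory() as parent:
    # Create the subdirectory 'demo_utils' and write stats.py inside it.
    package_dir = os.path.join(parent, 'demo_utils')
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, 'stats.py'), 'w') as f:
        f.write(STATS_SOURCE)

    # Make the parent directory searchable, then import from the package.
    sys.path.insert(0, parent)
    try:
        from demo_utils import stats
        result = stats.avg([10, 20, 30])
    finally:
        sys.path.remove(parent)

print(result)   # 20.0
```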

installing packages

Packages are also a unit of software distribution. In other words, it's possible (and fairly easy) to write a package of Python code and then make it available for others to install and use on their systems.

To this end, Python includes a package manager called pip that can install and remove packages on your system. As a first experiment, you can run 'pip list' to see a list of packages that are currently installed. When I run this command, I see output that begins like this:

$ pip list
Package                         Version
------------------------------- --------------
appdirs                         1.4.4
apturl                          0.5.2
attrs                           20.3.0
Babel                           2.9.1
banking.statements.nordea       1.3.0
banking.statements.osuuspankki  1.3.4.dev0
bcrypt                          3.1.7
…

I did not explicitly install most of these packages; instead, they came as part of the base Python installation or were installed by various Python-based programs on my system.

We can easily install additional packages. Suppose that I'm looking for an implementation of a calendar widget for Tkinter, since Tkinter itself contains no such widget. By default, pip finds packages to install in an enormous repository called PyPI (the Python Package Index), which currently lists over 300,000 packages that have been contributed by thousands of users. If I go to the PyPI web site and search for 'tkinter calendar', the top result is a package called 'tkcalendar'. Let's install it:

$ pip install tkcalendar
Collecting tkcalendar
  Downloading tkcalendar-1.6.1-py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB 3.2 MB/s 
Collecting babel
  Downloading Babel-2.9.1-py2.py3-none-any.whl (8.8 MB)
     |████████████████████████████████| 8.8 MB 5.2 MB/s 
Requirement already satisfied: pytz>=2015.7 in /usr/lib/python3/dist-packages (from babel->tkcalendar) (2021.1)
Installing collected packages: babel, tkcalendar
Successfully installed babel-2.9.1 tkcalendar-1.6.1

pip installed the 'tkcalendar' package, as well as a second package 'babel' that is a dependency of 'tkcalendar'.
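We can also ask from inside Python whether a package is installed. The sketch below uses the standard 'importlib.metadata' module (available in Python 3.8 and later). The helper name 'installed_version' is made up for this example:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package as a string, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The result depends on what is installed on your system.
print(installed_version('tkcalendar'))
print(installed_version('no-such-package-hopefully-12345'))
```

After the 'pip install' command above, the first call should report the version that pip installed. The second name is deliberately nonsensical, so it should report None.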

Now we can import 'tkcalendar' and use functions that it provides. The following commands will display a calendar widget:

>>> import tkcalendar
>>> cal = tkcalendar.Calendar(None)
>>> cal.grid()

debugging

Most programmers spend a fair amount of time debugging. Debugging is something like being a detective, trying to solve the mystery of why a program doesn't behave the way you think it should. It can require a lot of patience, but ultimately it's satisfying when you finally figure out why a program is misbehaving, just like the end of a mystery novel or film. :)

A basic tool for debugging is inserting print statements in your code to reveal selected elements of a program's state as it runs.
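As a small illustration, here is a function with a temporary print statement that shows the loop state on every iteration. The function 'find_max' is just an example made up for these notes:

```python
def find_max(nums):
    """Return the largest number in a non-empty list."""
    best = nums[0]
    for i, n in enumerate(nums):
        # Temporary debugging output: show the state on every iteration.
        print(f'debug: i = {i}, n = {n}, best = {best}')
        if n > best:
            best = n
    return best


print(find_max([3, 8, 5]))
```

Once the bug is found, remember to delete the debugging output (or comment it out).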

For some debugging problems, a debugger is a valuable tool. Debuggers are available for all major languages including Python.

Specifically, Visual Studio Code includes a nice interface to a Python debugger. When you have a Python source file open in the editor, look for the triangle icon in the upper right. To start debugging, open the dropdown menu to the right of the triangle and choose "Debug Python File". Your program will start running under the debugger. If an exception occurs, execution will stop and you'll be able to examine the values of local and global variables in the 'Variables' pane in the upper left.

Even before you start running your program under the debugger, you may wish to create one or more breakpoints. To create a breakpoint on a line, either click anywhere in the line and press F9, or click in the editor margin to the left of the line. A red dot will appear to the left of the line, indicating that there is a breakpoint there. When execution reaches any line with a breakpoint, it will stop. After that, you can step through your program's execution line by line using the Step Into (F11) and Step Over (F10) commands. If the execution point is currently at a function call, Step Into will travel into the function call and stop on the first line of the function. By contrast, Step Over will execute the function without stopping and will then stop on the first line after the function call.

Most debuggers (even for other programming languages) have similar commands and even keyboard shortcuts, so if you become familiar with Python's debugger you should be able to switch to other debuggers easily.

file paths and directories

Previously we saw how to open, read, and write files in Python. Python's standard library also contains functions for working with file paths and directories. For example, you can list all files in a directory, or test whether a particular name refers to a file or directory. See the functions in the 'os' and 'os.path' modules in our quick library reference.
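For instance, here is a sketch that lists a directory and separates its entries into files and subdirectories, using os.listdir(), os.path.isfile() and os.path.isdir(). The helper name 'classify' and the names 'images' and 'notes.txt' are made up for this example, which runs in a temporary directory so that it doesn't depend on your own files:

```python
import os
import tempfile


def classify(directory):
    """Return (files, subdirs): sorted names of the entries in a directory."""
    files, subdirs = [], []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            files.append(name)
        elif os.path.isdir(path):
            subdirs.append(name)
    return sorted(files), sorted(subdirs)


# Try it on a temporary directory containing one file and one subdirectory.
with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, 'images'))
    with open(os.path.join(d, 'notes.txt'), 'w') as f:
        f.write('hello\n')
    result = classify(d)

print(result)   # (['notes.txt'], ['images'])
```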

If you are working with filenames and directories, you should be aware of some cross-platform differences:

When possible, try to write code that will work in a cross-platform way. Probably it's best never to write a path containing separator characters (such as "images/barn.jpg") in your code. Instead, use the os.path.join() function to combine path components, e.g.

name = os.path.join("images", "barn.jpg")

menus

Menus are useful in many graphical applications. In the past, almost every desktop application with a graphical interface contained a top-level menu with various submenus. In recent years, applications without a top-level menu have become more common (perhaps due to the influence of tablets and phones, where such menus are rare). Nevertheless, you may still find that a menu will be helpful in a graphical program that you are writing.

See our Tkinter quick reference to read about the Menu widget in Tkinter.

writing good code

We now know enough Python that we are starting to write larger programs. Especially when writing a larger program, we want to write code in a way that is structured, clear, and maintainable. Functions and classes are essential tools for this purpose.

Here are three general rules to follow for writing good code:

  1. Don't repeat yourself.

    Beginning programmers often write code that looks like this:

if x > 0:
    … 20 lines of code …
else:
    … the same 20 lines of code, with a few small changes …

    Or like this:

if key == 'w':   # up
    … 12 lines of code …
elif key == 'x':  # down
    … 12 very similar lines of code …
elif key == 'a':  # left
    … 12 very similar lines of code …
elif key == 'r':  # right
    … 12 very similar lines of code …

    This code is poor. It is hard to read: the differences between the parallel blocks of code may be important, but hard to spot. And it is hard to maintain. Every time you change one of the parallel blocks of code, you must change the other(s) in the same way. That is a chore, and it's also easy to forget to do that (or to do it incorrectly).

    In some cases, you can eliminate this redundancy by writing code in a more general way, e.g. by looping over four possible direction vectors rather than having four blocks of code for the cases up, left, right, and down. Another useful tool is functions. You can often factor out parallel blocks of code into a function with parameters representing the (possibly small) differences in behavior between the parallel blocks. Then in place of the first code example above you can write

if x > 0:
    my_fun(… arguments)
else:
    my_fun(… different arguments)

  2. Make variables as local as possible.

    Generally speaking, a local variable is better than an attribute, and an attribute is better than a global variable.

    The scope of a variable is the portion of a program's text in which the variable may be read or written. A local variable's scope is the function in which it appears. An attribute's scope is the class in which the attribute appears, at least if you only set the attribute in code in that class (which many languages will enforce, though not Python). A global variable's scope is the entire program. Another way of phrasing this rule is that a variable should have as small a scope as possible, which makes a program easier to understand since there are fewer places where a variable's value might change.

    This rule does not apply to constants, i.e. read-only data. A global constant is unproblematic; it will never change, so its meaning is completely clear.

  3. Every line and function should fit on a single screen.

    Lines should be limited to about 100 characters. Lines that are longer than this are hard to read, especially since they may wrap in an editor window. I recommend that you configure your editor to display a vertical gray line at the 100th column position, so that you can see if you've written a line that is too long. In Visual Studio Code, you can do that via the "editor.rulers" setting. Specifically, you can add this line to your settings.json file:

    "editor.rulers": [100]

    Functions should generally be limited to about 50 lines. Functions that are much longer than this quickly become hard to understand, and are hard to read since you may have to scroll up and down to see them.
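To make the first rule more concrete, here is one possible way to replace four parallel blocks of movement code with a single table of direction vectors. This is only a sketch: the names 'DIRECTIONS' and 'move' are made up for this example, the keys are the same ones used in the example under rule 1, and it assumes that y grows downward, as on most screens. Notice that 'DIRECTIONS' is a global constant, which the second rule allows.

```python
# Map each key to a direction vector (dx, dy).
DIRECTIONS = {
    'w': (0, -1),   # up
    'x': (0, 1),    # down
    'a': (-1, 0),   # left
    'r': (1, 0),    # right
}


def move(pos, key):
    """Return the position reached from pos = (x, y) after pressing key."""
    if key not in DIRECTIONS:
        return pos               # unknown key: stay in place
    dx, dy = DIRECTIONS[key]
    x, y = pos
    return (x + dx, y + dy)


pos = (5, 5)
for key in 'wwa':                # up, up, left
    pos = move(pos, key)
print(pos)   # (4, 3)
```

Any change to the movement logic now needs to be made in only one place, the body of 'move'.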