Week 2: Notes

break

The 'break' statement immediately exits the innermost loop that contains it, whether that is a 'while' loop or a 'for' loop. For example:

n = 1

while n <= 100:
    print(n)
    if n == 8:
        break
    n += 1

produces the output

$ py hello.py
1
2
3
4
5
6
7
8
$

When n is 8, the loop immediately exits.
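Since 'break' works in 'for' loops just as it does in 'while' loops, here is a small sketch (the number 91 and the variable names are chosen only for illustration) that looks for the smallest divisor of a number greater than 1 and stops searching as soon as it finds one:

```python
n = 91
smallest = n          # if no smaller divisor is found, n itself is the answer

for d in range(2, n):
    if n % d == 0:    # d divides n evenly
        smallest = d
        break         # stop at the first divisor found

print(smallest)       # prints 7, since 91 = 7 * 13
```

Without the 'break', the loop would keep going and 'smallest' would end up holding the last divisor found (13) rather than the first.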

continue

A related statement is 'continue', which aborts the current iteration of a loop and continues with the next iteration.

For example, the following loop computes the sum of the numbers from 1 to 100, and also the sum of the squares of those numbers. However, it uses the 'continue' statement to skip the number 28:

sum = 0
sum_squares = 0

for i in range(1, 101):
    if i == 28:
        continue

    sum += i
    sum_squares += i * i

print(sum)
print(sum_squares)

In this particular example, we could easily replace 'continue' with a comparison using '!=':

for i in range(1, 101):
    if i != 28:
        sum += i
        sum_squares += i * i

However, 'continue' is sometimes convenient when more code follows it, because it avoids moving all of that code into a nested block.
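As a sketch of that situation (the numbers and names here are made up for illustration), the loop below skips multiples of 3 and then performs several further steps for every other number. Without 'continue', all of those steps would have to be indented inside an 'if' block:

```python
total = 0
count = 0

for i in range(1, 11):
    if i % 3 == 0:
        continue          # skip 3, 6 and 9

    # Everything below runs only for numbers that were not skipped.
    square = i * i
    total += square
    count += 1

print(count)   # prints 7
print(total)   # prints 259
```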

ASCII and Unicode

Computers generally store text using a coded character set, which assigns a unique number called a code point to each character. Two character sets are used in virtually all software systems today.

First, the ASCII character set includes only 128 characters; its code points range from 0 to 127. For example, in ASCII the character 'A' has the number 65, and 'B' has the number 66. ASCII includes all the characters you see on a standard English-language keyboard: the uppercase and lowercase letters A-Z/a-z of the Latin alphabet, the digits 0-9 and various punctuation marks such as $, % and &. ASCII does not include accented characters such as č or ř.

You can see a table of all ASCII characters at asciitable.com.

Note that ASCII includes various whitespace characters, which are not visible on the printed page. We will encounter these three from time to time:

A tab character (ASCII code 9) moves the output position to the next tab stop.
A newline character (ASCII code 10) moves to the next line. In text files on Linux and macOS, each line ends with an instance of this character.
A space (ASCII code 32) is used throughout text to separate words.

The newer Unicode character set extends ASCII to cover characters from virtually all of the world's written languages, including accented characters and also ideographic characters in Asian languages such as 日. The first 128 Unicode code points are the same as in ASCII. Code points in Unicode range from 0 to 1,114,111.

The site unicode-table.com has a large table showing all the Unicode characters that exist.

Python is fully compatible with Unicode. You can write

s = 'Řehoř'

s = '人'

and these strings will work just like strings of ASCII characters.
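As a quick illustration (a sketch only; it uses len() and indexing, which these notes have not covered yet), each non-ASCII character counts as a single character, exactly like an ASCII letter would:

```python
s = 'Řehoř'

print(len(s))        # prints 5: each accented letter is one character
print(s[0])          # prints Ř
print('人' + '口')    # prints 人口: concatenation works as usual
```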

chr() and ord()

Python includes two functions that can map between characters and their integer code points.

Given a Unicode character c, ord(c) returns its code point. For example, ord('A') is 65, and ord('B') is 66. ord('ř') is 345 (a value outside the ASCII range).

chr() works inversely: it maps a code point to a character. For example, chr(65) is 'A', and chr(345) is 'ř'.
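The following sketch shows that the two functions undo each other, and checks the code points mentioned earlier in these notes. (The escape sequences '\t' and '\n' are simply Python's way of writing a tab and a newline inside a string.)

```python
print(ord('A'))        # prints 65
print(chr(65))         # prints A
print(chr(ord('ř')))   # prints ř: chr() undoes ord()

# The whitespace characters from the ASCII section:
print(ord('\t'), ord('\n'), ord(' '))   # prints 9 10 32

# The largest valid Unicode code point:
largest = chr(1114111)
print(ord(largest))    # prints 1114111
```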

These functions are sometimes useful when we wish to manipulate characters. For example, here is a program that reads a lowercase letter, and prints the next letter in the alphabet:

c = input('enter letter: ')
i = ord(c) - ord('a')
if 0 <= i < 26:
    i = (i + 1) % 26
    print('next letter is', chr(ord('a') + i))
else:
    print('not a lowercase letter')

The program uses ord() to convert a character (such as 'd') to a number (such as 3) representing its position in the lowercase alphabet. It then adds 1 (mod 26), and uses chr() to map the result back into a lowercase letter.

for, revisited

Last week we learned about the 'for' statement, and learned how to loop over ranges of integers.

In fact, 'for' can loop over many other kinds of sequences. For example, it can loop over all characters in a string:

for c in 'hello':
    print(c)

This produces the output

$ py hello.py
h
e
l
l
o
$
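Looping over a string combines naturally with ord() and chr(). As a sketch (the word 'xyz' and the variable names are only for illustration, and it assumes the word contains nothing but the lowercase letters a-z), this applies the next-letter computation from the previous section to every character of a word:

```python
word = 'xyz'
result = ''

for c in word:
    i = ord(c) - ord('a')            # position in the alphabet, 0 to 25
    i = (i + 1) % 26                 # move forward one letter, wrapping z to a
    result += chr(ord('a') + i)      # append the shifted letter

print(result)    # prints yza
```

The '% 26' is what makes 'z' wrap around to 'a': for 'z' we have i = 25, and (25 + 1) % 26 is 0.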

type conversions

So far we've seen four basic types in Python: integers, floats, booleans, and strings. Each of these types has an associated type conversion function:

int(x) converts x to an integer. Any fractional part is discarded; for example, int(45.228) will produce 45. True will become 1, and False will become 0.
float(x) converts x to a floating-point number. (If x is an enormous integer, it may not fit, in which case an error will occur.)
bool(x) converts x to a boolean value. Any non-zero number will become True, and zero will become False. If x is a string, the resulting value will be True if the string is non-empty, or False if it is empty.
str(x) converts x to a string.
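Here is a short sketch that tries each conversion function on a few sample values (the values themselves are arbitrary):

```python
print(int(45.228))     # prints 45
print(int(-45.228))    # prints -45: the fraction is discarded, not rounded down
print(int(True))       # prints 1
print(int('42') + 1)   # prints 43: the string '42' became a number

print(float(7))        # prints 7.0

print(bool(0), bool(-3))       # prints False True
print(bool(''), bool('no'))    # prints False True

print(str(45) + str(2.5))      # prints 452.5: these are strings, so + joins them
```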

math functions

For our first peek into Python's enormous standard library, we will see how to use Python's built-in math functions. To get access to these, write this at the top of your program:

import math

These functions include, for example:

fabs(x) - absolute value
sqrt(x) - square root
exp(x) - return e^x
log(x) - return log_e(x)
sin(x), cos(x), tan(x) - trigonometric functions

Our Python Library Quick Reference lists these functions and others. Also, you can see a full list in the Python library documentation.

To use any of these functions, write "math." followed by the name of the function. For example:

import math

print(math.sqrt(2))

prints

1.4142135623730951
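A few more calls, as a sketch. Besides the functions listed above, it uses the constants math.pi and math.e, which the math module also provides. Results are rounded here because floating-point values are not always exact:

```python
import math

print(math.fabs(-3.5))                  # prints 3.5
print(math.sqrt(16))                    # prints 4.0
print(math.exp(0))                      # prints 1.0
print(round(math.log(math.e), 6))       # prints 1.0
print(round(math.cos(math.pi), 6))      # prints -1.0
```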

random numbers

Our quick library reference also lists various functions that generate random numbers, which we will use often. To use them, you must first write 'import random'.
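As a sketch, here are three functions from Python's random module. I am assuming these are among the ones in the quick reference; all three do exist in the standard library.

```python
import random

die = random.randint(1, 6)     # a random integer from 1 to 6, both included
r = random.random()            # a random float with 0.0 <= r < 1.0
t = random.uniform(-1, 1)      # a random float between -1 and 1

print(die, r, t)               # the output differs on every run
```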

example: estimating π

Here's a program that uses random numbers to estimate π. It works by throwing darts randomly at a square with corners at (-1, -1) and (1, 1). The square has area 4. For each dart, the program checks whether it landed inside the unit circle (the circle of radius 1 centered at the origin), which has area π. The fraction f of darts that land in the circle will be approximately π/4, so π is approximately 4f.

import random

tries = int(input('How many darts? '))
hits = 0

for i in range(tries):
    x = random.uniform(-1, 1)
    y = random.uniform(-1, 1)
    if x * x + y * y < 1:
        hits += 1

f = hits / tries
print('pi is approximately', 4 * f)

string formatting

Python includes f-strings, which are formatted strings that can contain interpolated values. For example:

color1 = 'blue'
color2 = 'green'

print(f'The sky is {color1} and the field is {color2}')

Write the character f immediately before a string to indicate that it is a formatted string. Interpolated values can be arbitrary expressions:

x = 14
y = 22

print(f'The sum is {x + y}.')

You may optionally specify a format code after each interpolated value to indicate how it should be rendered as a string. Some common format codes include

b – an integer in binary (base 2)
d – an integer in decimal (base 10)
x – an integer in hexadecimal (base 16)
f – a floating-point number in fixed-point (i.e. not exponential) notation. This format code may optionally be preceded by a period ('.') followed by an integer precision, which indicates how many digits should appear after the decimal point.

For example:

m = 127
e = 2.718281828459045

print(f'm in hex is {m:x}')   # prints 'm in hex is 7f'
print(f'e is {e:.4f}')        # prints 'e is 2.7183'
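The remaining codes from the list work the same way. A small sketch (the values are arbitrary):

```python
m = 127

print(f'm in decimal is {m:d}')   # prints 'm in decimal is 127'
print(f'm in binary is {m:b}')    # prints 'm in binary is 1111111'
print(f'{1 / 3:.2f}')             # prints '0.33'
print(f'{2 / 3:.3f}')             # prints '0.667': the last digit is rounded
```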

You can specify a comma (',') before a 'd' format code to specify that digits should be printed in groups of 3, with a comma between groups. (The comma cannot be combined with 'x'; Python reports an error if you try.)

>>> x = 2 ** 100
>>> f'{x:,d}'
'1,267,650,600,228,229,401,496,703,205,376'

An integer preceding a format code indicates a field width. If the value's width in characters is less than the field width, it will be padded with spaces on the left:

>>> f'{23:9d}'
'       23'
>>> f'{723:9d}'
'      723'
>>> f'{72377645:9d}'
' 72377645'

If the field width is preceded with a '0', then the output will be padded with zeroes instead:

>>> f'{23:09d}'
'000000023'
>>> f'{723:09d}'
'000000723'
>>> f'{72377645:09d}'
'072377645'
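A field width can also be combined with the 'f' format code and a precision, which is handy for lining up columns of numbers. In this sketch (the prices are invented), '8.2f' means a field 8 characters wide with 2 digits after the decimal point:

```python
price1 = 3.5
price2 = 120.25

line1 = f'{price1:8.2f}'
line2 = f'{price2:8.2f}'

print(line1)    # prints '    3.50'
print(line2)    # prints '  120.25'
```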

There are many more format codes that you can use to control output formatting in more detail. See the Python library documentation for a complete description of these.

reading lines of standard input

On all major operating systems, as a program runs it can read from its standard input. (Usually standard input comes from the terminal, but it is also possible to redirect it to come from a file instead.)

In Python, we will often want to read lines from standard input. The sys.stdin object is a sequence of lines, and so we can loop over it using for. For example, here is a program that reads numbers from standard input, one per line, and computes their sum:

import sys
  
sum = 0
for line in sys.stdin:
    n = int(line)    # convert string to integer
    sum += n
  
print('The sum is', sum)

When we run the program and enter its input from a terminal, we need some way to signal that the input is complete. On Linux or macOS, we can do this by typing Ctrl+D. On Windows, type Ctrl+Z followed by Enter.

When we run the program, we see this:

3
4
5
The sum is 12

Above, we typed Ctrl+D or Ctrl+Z after the number 5 (though that was not visible in the terminal output).

Note that when you loop over sys.stdin in this way, each line will be a string that contains a newline character at the end of it. By contrast, when you read a string using input() it will not have a newline at the end. The example above works because the int() function ignores whitespace (such as a newline) at the beginning and end of a string.
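To see the trailing newline for yourself, here is a sketch that uses io.StringIO as a stand-in for sys.stdin, so that it runs without typing any input. (The name fake_stdin is made up for this example. The strip() method returns a copy of a string with whitespace removed from both ends.)

```python
import io

# Stand-in for sys.stdin: three lines, each ending in a newline.
fake_stdin = io.StringIO('3\n4\n5\n')

total = 0
for line in fake_stdin:
    stripped = line.strip()             # the line without its newline
    print(len(line), len(stripped))     # prints 2 1 for each of the three lines
    total += int(line)                  # int() ignores the newline

print('The sum is', total)              # prints 'The sum is 12'
```

If you need to compare a line against some text or print it without an extra blank line, calling strip() first is the usual remedy.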

redirecting standard input and output

Text-mode programs commonly read from standard input and write to standard output. By default, standard input comes from the keyboard, and standard output appears in a terminal window. However, when you run a program you may redirect its standard input to come from a file, and may also redirect standard output to a file. (Sources and destinations other than files are also possible.)

On any operating system, use the < character to redirect input, and the > character to redirect output. For example, use a text editor to create a file test.in with these contents:

4
5

Now run the Python program we just wrote, redirecting its input from test.in and redirecting its output to a file called 'test.out':

$ python3 sum.py < test.in > test.out
$

The program ran, but produced no output on the terminal (since its output was redirected). Let's look at the output it produced:

$ cat test.out
The sum is 9
$

Each of our homework assignments in our ReCodEx system will contain sample input and sample output for the program that you're supposed to write. To test your program, I recommend placing each sample input in a text file, and running your program with its input redirected from that file.

Note that input redirection on Windows will work in the default terminal named 'cmd', but unfortunately not in PowerShell.