Python includes both functions and methods in its standard library.
A function takes zero or more arguments and optionally returns a value. Some of Python's built-in functions that we've already seen in this course are len(), chr(), ord(), input() and print(). To call a function, we simply write its name followed by the arguments in parentheses:
name = input('Enter your name: ')
A method is like a function, but is invoked on a particular object. For example:
s = 'yoyo'
b = s.startswith('yo')    # method call
In the second line above, we are invoking (or calling) the startswith() method on the string s. We pass the string 'yo' to the method. The method returns a value, which is True in this case since the string 'yoyo' does start with 'yo'.
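As a small illustrative sketch (the variable names here are only for illustration), the following contrasts a function call, where the string is passed as an argument, with method calls, which are invoked on the string itself:

```python
s = 'yoyo'

# Function call: the string is passed as an argument.
n = len(s)

# Method calls: each method is invoked on the string object s.
starts = s.startswith('yo')
count = s.count('yo')

print(n, starts, count)    # prints: 4 True 2
```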
Above, we said that a method is invoked on an object. In Python any value is an object, so (for example) 3, False, and 'yoyo' are all objects. (In some other languages, there is a technical distinction between objects and other kinds of values.) Methods (and a related feature, classes, which we'll discuss later) are fundamental building blocks in object-oriented programming. A language that has methods and classes, such as Python, C++, Java, or C#, is called object-oriented.
Not all programming languages have both functions and methods. For example, C has only functions, and classic Java has only methods. Python is a bit of a hybrid since it has both functions and methods. This arguably makes the language more flexible and convenient (at the cost of some complexity).
In this course we will soon learn how to write our own functions, and before too long we'll learn how to write our own methods (and classes) as well.
Python's library contains many more useful methods on strings. For example, .lower() converts all characters in a string to lowercase:
>>> s = 'YUMMY PIE'
>>> s.lower()
'yummy pie'
>>> s
'YUMMY PIE'
Notice that the call to lower() above returned a new string that was like s, but in which all characters are lowercase. It did not modify the original string s, which still contains uppercase characters. In fact, it could not possibly modify s, since Python strings are immutable. Many string methods are similar to lower() in that they return a new string derived by modifying a given string in some way.
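To make the point about immutability concrete, here is a short sketch. It uses try/except, which may not have been covered yet, only to show that an in-place assignment to a string character fails; the variable names are illustrative.

```python
s = 'YUMMY PIE'

# Methods such as lower() and replace() return new strings.
t = s.lower().replace('pie', 'cake')
print(t)    # prints: yummy cake
print(s)    # prints: YUMMY PIE (the original is unchanged)

# Attempting to change a character in place raises a TypeError.
try:
    s[0] = 'y'
    modified = True
except TypeError:
    modified = False
print(modified)    # prints: False
```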
Our quick reference lists more string methods and operators. Note that strings are iterable, since you can loop over them with 'for'. They are also sequences, since you can access string elements using the syntax s[i]. Soon we'll see other kinds of iterables and sequences. (In Python every sequence is iterable, but some iterables such as sys.stdin are not sequences). So if you're looking in the quick reference for operations that work on strings, you can find them in three places: in the section on iterables, in the section on sequences, and in the section specifically about strings.
We've already seen Python's input() and print() functions. input() reads a line from standard input. print() writes a line to standard output. By default standard input and output are the terminal, but we'll soon see that we can redirect them from or to a file.
At some point the input may reach its end. When the input comes from a terminal, we can signal the end of the input by typing Ctrl+D (on Linux or macOS) or Ctrl+Z followed by Enter (on Windows).
Let's write a program that reads numbers from standard input, one per line, until it ends. The program will print the sum of the numbers. We could try to use input() to read the numbers, but when the input ends input() will raise an error (EOFError), which is inconvenient. As another approach, we can use the object sys.stdin, which represents a program's standard input. We can loop over sys.stdin using a 'for' statement. On each iteration, we'll receive the next input line as a string:
import sys

sum = 0
for line in sys.stdin:    # for each line of standard input
    n = int(line)         # convert string to integer
    sum += n
print('The sum is', sum)
Let's run the program, and give it some numbers as input:
$ python sum.py
3
4
5
^D
At the end we typed ^D (control D), meaning end of input. The program now prints
The sum is 12
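The same loop works on any object that yields lines, not only sys.stdin. The sketch below is a variation, not the program above: it wraps the loop in a hypothetical helper named sum_lines, skips blank lines (an added assumption), and uses io.StringIO as a stand-in for standard input so that it can be run without typing anything.

```python
import io

def sum_lines(lines):
    """Return the sum of the integers found one per line in lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if line == '':        # skip blank lines (an assumption of this sketch)
            continue
        total += int(line)
    return total

# io.StringIO stands in for sys.stdin here.
fake_input = io.StringIO('3\n4\n\n5\n')
result = sum_lines(fake_input)
print('The sum is', result)    # prints: The sum is 12
```

In a real program we would pass sys.stdin to the helper instead of the StringIO object.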
When you run a program you may redirect its standard input to come from a file, and may also redirect standard output to a file.
Use the '<' character to redirect the input. For example, let's use a text editor to create a file test.in with these contents:
4
5
6
Now let's run the Python program from the previous section, redirecting its input from test.in:
$ python sum.py < test.in
The sum is 15
Similarly, you can use the > character to redirect a program's output. Let's run the program again, redirecting both the input and output:
$ python sum.py < test.in > test.out
$
The program ran, but produced no output on the terminal since its output was redirected. Let's look at the output it produced. You can view it in an editor. Alternatively, the 'cat' command (on Linux or macOS) will display the contents of a file:
$ cat test.out
The sum is 15
$
Many of our homework assignments in our ReCodEx system will contain sample input(s) and sample output(s) for the program that you're supposed to write. You may want to place each sample input in a text file. Then you can run your program with its input redirected from each file in turn. That will be much more convenient than manually entering input data each time you run your program.
Earlier, we saw that we can loop over sys.stdin to read lines from standard input. You should be aware that when you do this, each string you receive will end with a newline character. Consider this program print.py, which reads all lines from standard input and copies them to standard output:
import sys

for line in sys.stdin:
    print(line)
Suppose that we have a text file story.txt with three lines:
the beginning
the middle
the end
Let's run the program above and redirect its input from this file:
$ python print.py < story.txt
the beginning

the middle

the end

$
Notice the extra blank line after each output line. As mentioned above, each string generated by the 'for' loop will end with a newline character. For example, the first line read from the file will be 'the beginning\n'. (As we saw in an earlier section, on Windows the file will actually contain '\r\n' at the end of the line, but Python will convert this sequence to '\n'.) When we invoke print() on this string, it prints the newline in the string, and then prints a second newline because print() normally prints a newline after any output string you give it.
If we don't want the extra lines, we can call the strip() method to remove the newlines returned by 'for'. strip() removes all whitespace at the beginning and end of a string. Whitespace includes unprintable characters such as spaces and newlines:
>>> ' one two three '.strip()
'one two three'
>>> 'down the street\n'.strip()
'down the street'
Let's modify the program print.py above so that it strips each line read from standard input:
import sys

for line in sys.stdin:
    line = line.strip()
    print(line)
Now it won't print extra blank lines:
$ py print.py < story.txt
the beginning
the middle
the end
$
Alternatively, if we want to remove only the newline character at the end of the line but leave all other whitespace intact, then instead of calling strip() we could write
line = line[:-1]
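One caveat is that slicing removes the last character whatever it is, and the final line of a file may not end with a newline. As a hedged alternative, rstrip('\n') removes only trailing newline characters. The sample strings below are made up for illustration:

```python
last_line = 'the end'              # a final line with no trailing newline
normal_line = '  the middle \n'    # a typical line as read from input

# Slicing always drops the last character, whatever it is.
sliced = last_line[:-1]
print(repr(sliced))        # prints: 'the en'

# rstrip('\n') removes trailing newlines only, and only if present.
kept = last_line.rstrip('\n')
trimmed = normal_line.rstrip('\n')
print(repr(kept))          # prints: 'the end'
print(repr(trimmed))       # prints: '  the middle '
```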
Python includes f-strings, which are formatted strings that can contain interpolated values. For example:
>>> color1 = 'blue'
>>> color2 = 'green'
>>> f'The sky is {color1} and the field is {color2}'
'The sky is blue and the field is green'
Write the character f immediately before a string to indicate that it is an f-string.
Interpolated values can be arbitrary expressions. For example, consider a program that reads two values and prints their sum. Without an f-string, we might write
x = int(input('Enter x: '))
y = int(input('Enter y: '))
print('The sum of', x, 'and', y, 'is', x + y)
Using an f-string, we may instead write the last line like this:
print(f'The sum of {x} and {y} is {x + y}')
In my opinion, this is easier to read and write.
You may optionally specify a format code after each interpolated value to indicate how it should be rendered as a string. Some common format codes include
b – an integer in binary (base 2)
d – an integer in decimal (base 10)
x – an integer in hexadecimal (base 16)
f – a floating-point number. This format code may optionally be preceded by a period ('.') followed by an integer precision, which indicates how many digits should appear after the decimal point.
For example:
>>> m = 127
>>> f'hex value is {m:x}'
'hex value is 7f'
>>> import math
>>> math.pi
3.141592653589793
>>> f'pi is {math.pi:.3f}'
'pi is 3.142'
Notice that Python rounds (rather than truncates) a floating-point number to a given number of digits.
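You can specify a comma (',') before a 'd' or 'f' format code to specify that digits should be printed in groups of 3, with a comma between groups. (Python does not accept a comma with the 'b' or 'x' codes; those accept an underscore instead, which groups digits in fours.)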
>>> x = 2 ** 100
>>> f'{x:,d}'
'1,267,650,600,228,229,401,496,703,205,376'
An integer preceding a format code indicates a field width. If the value's width in characters is less than the field width, it will be padded with spaces on the left:
>>> f'{23:9d}'
'       23'
>>> f'{723:9d}'
'      723'
>>> f'{72377645:9d}'
' 72377645'
If the field width is preceded with a '0', then the output will be padded with zeroes instead:
>>> f'{23:09d}'
'000000023'
>>> f'{723:09d}'
'000000723'
>>> f'{72377645:09d}'
'072377645'
There are many more format codes that you can use to control output formatting in more detail. See the Python library documentation for a complete description of these.
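As a brief sketch of a few of those additional options, the example below combines a thousands separator with a precision, and uses the alignment characters '>', '<' and '^', which were not covered above. The values and variable names are made up for illustration.

```python
price = 1234.5
name = 'pie'

a = f'{price:,.2f}'    # thousands separator plus 2 digits after the point
b = f'{name:>8}|'      # right-aligned in a field of width 8
c = f'{name:<8}|'      # left-aligned in a field of width 8
d = f'{name:^9}|'      # centered in a field of width 9
e = f'{255:08b}'       # binary, zero-padded to width 8

print(a)    # prints: 1,234.50
print(b)    # prints:      pie|
print(c)    # prints: pie     |
print(d)    # prints:    pie   |
print(e)    # prints: 11111111
```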
Lists are a fundamental type in Python. We can make a list by specifying a series of values surrounded by square brackets:
l = [3, 5, 9, 11, 15]
A list may contain values of various types:
l = ['horse', 789, False, -22.3]
It may contain any number of values, or may even be empty:
l = []
The len function returns the number of elements in a list:
len(['potato', 'tomato', 'tornado']) # returns 3
We can access elements of a list by index. The first element has index 0, and the last element has index len(l) - 1:
>>> l = [3, 5, 9, 11, 15]
>>> l[0]
3
>>> l[4]
15
Just like with strings, we can use negative indices to count from the end of the list:
>>> l = [3, 5, 9, 11, 15]
>>> l[-1]
15
>>> l[-2]
11
Slice syntax works with lists, just like with strings:
>>> l = [3, 5, 9, 11, 15]
>>> l[1:3]
[5, 9]
>>> l[3:]
[11, 15]
The 'in' operator tests whether a list contains a given value:
>>> 77 in [2, 8, 77, 3, 1]
True
Note that this is a bit different from 'in' on strings, which tests for a substring. On lists, the 'in' operator does not test whether a sublist is present in a list:
>>> [8, 77] in [2, 8, 77, 3, 1]
False
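If we do want to test for a contiguous sublist, we can compare slices ourselves. The following is one possible sketch; the helper name contains_sublist is hypothetical and is not part of Python's library.

```python
def contains_sublist(big, small):
    """Return True if small occurs as a contiguous run inside big."""
    n = len(small)
    for i in range(len(big) - n + 1):
        if big[i:i + n] == small:    # compare a slice of big with small
            return True
    return False

found = contains_sublist([2, 8, 77, 3, 1], [8, 77])
missing = contains_sublist([2, 8, 77, 3, 1], [8, 3])
print(found)      # prints: True
print(missing)    # prints: False

# By contrast, 'in' succeeds when the list [8, 77] is itself an element.
nested = [8, 77] in [2, [8, 77], 3]
print(nested)     # prints: True
```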
Unlike strings, lists in Python are mutable. We can set values by index:
>>> l = [3, 5, 9, 11, 15]
>>> l[0] = 77
>>> l[3] = 99
>>> l
[77, 5, 9, 99, 15]
A list's length may change over time. The append() method adds a single element to a list:
>>> l = [3, 5, 9, 11, 15]
>>> l.append(20)
>>> l.append(30)
>>> l
[3, 5, 9, 11, 15, 20, 30]
We'll often use append() to build up a list in a loop. For example, we can build a list of the squares of all numbers from 1 to 10:
l = []
for i in range(1, 11):    # 1 .. 10
    l.append(i * i)
The extend() method adds a series of elements to a list. The += operator is a synonym for extend():
>>> l = [2, 4, 6]
>>> l.extend([8, 10])
>>> l
[2, 4, 6, 8, 10]
>>> l += [12, 14]
>>> l
[2, 4, 6, 8, 10, 12, 14]
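It is easy to confuse append() and extend(). As a short sketch with made-up variable names, notice that passing a list to append() adds that list as a single nested element, whereas extend() adds each of its elements:

```python
a = [2, 4, 6]
a.append([8, 10])     # adds ONE element, which is itself a list
print(a)              # prints: [2, 4, 6, [8, 10]]
print(len(a))         # prints: 4

b = [2, 4, 6]
b.extend([8, 10])     # adds each element of the given list
print(b)              # prints: [2, 4, 6, 8, 10]

c = [2, 4, 6]
c.extend('hi')        # extend() accepts any iterable, including a string
print(c)              # prints: [2, 4, 6, 'h', 'i']
```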
The insert() method inserts an element into a list at a given position:
>>> l = [3, 5, 9, 11, 15]
>>> l.insert(2, 88)
>>> l
[3, 5, 88, 9, 11, 15]
The del statement can delete one or more elements of a list by index:
>>> l = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'grape']
>>> del l[1]
>>> l
['orange', 'pear', 'banana', 'kiwi', 'grape']
>>> del l[2:4]
>>> l
['orange', 'pear', 'grape']
We may even assign to a slice in a list, replacing that slice with an arbitrary sequence of values:
>>> l = [2, 4, 6, 8, 10]
>>> l[1:3]
[4, 6]
>>> l[1:3] = [100, 200, 300]
>>> l
[2, 100, 200, 300, 8, 10]
The list() function converts any iterable, such as a string or a range, to a list:
>>> list('watermelon')
['w', 'a', 't', 'e', 'r', 'm', 'e', 'l', 'o', 'n']
>>> list(range(120, 130))
[120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
Like strings, lists are iterable, so you can loop over them using 'for'. Lists are also sequences, since you can access their elements using the syntax l[i]. So you can find list operations in three sections in our quick reference guide, namely the sections about iterables, sequences, and specifically about lists.
A Python list is actually an array, meaning a sequence of elements stored in contiguous memory locations. In fact it is a dynamic array, since it can expand over time. (In many programming languages, arrays have a fixed size.)
For this reason, we can retrieve or update any element of a list by index extremely quickly, in constant time. Even if a list l has 1,000,000,000 elements, accessing e.g. l[927_774_282] will be extremely fast, just as fast as accessing the first element of a short list.
In our algorithms class, we will usually use the term "array" to describe this kind of data structure.
The split() method is convenient for breaking strings into words. By default, it will consider words to be separated by sequences of whitespace characters, which include spaces, tabs, newlines, and other unprintable characters. It returns a list of strings:
>>> 'a big red truck'.split()
['a', 'big', 'red', 'truck']
>>> ' a big red truck '.split()
['a', 'big', 'red', 'truck']
Here's a program that reads a series of lines from standard input, each containing a series of integers separated by one or more spaces. It will print the sum of all the integers on all the input lines:
import sys

sum = 0
for line in sys.stdin:
    for word in line.split():
        sum += int(word)
print(sum)
You may optionally pass a string to split() specifying a delimiter that separates words instead of whitespace:
>>> 'big_red_truck'.split('_')
['big', 'red', 'truck']
The method join() has the opposite effect: it joins a list of strings into a single string, inserting a given separator string between each pair of strings:
>>> ' '.join(['tall', 'green', 'tree'])
'tall green tree'
>>> '_'.join(['tall', 'green', 'tree'])
'tall_green_tree'
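Note that join() expects the list elements to be strings; passing a list of integers raises a TypeError. The sketch below, with illustrative variable names, converts numbers to strings before joining them and then uses split() to go back the other way:

```python
nums = [3, 5, 9]

# Convert each number to a string before joining.
words = []
for n in nums:
    words.append(str(n))
line = ' '.join(words)
print(line)    # prints: 3 5 9

# split() plus int() recovers the original numbers.
back = []
for w in line.split():
    back.append(int(w))
print(back)    # prints: [3, 5, 9]
```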
Here's a program that reads a single line, breaks it into words, reverses the words, then prints them back out:
words = input().split()    # break input into words
words = words[::-1]        # reverse them
print(' '.join(words))
Let's run it:
$ py rev.py
one fine day
day fine one
$
Suppose that we write the following assignments:
>>> l = [3, 5, 7, 9]
>>> m = l
Now the variables l and m refer to the same list. If we change l[0], then the change will be visible in m:
>>> l[0] = 33
>>> m[0]
33
This works because in Python every variable is in fact a pointer to an object. So two variables can point to the same object, such as the list above. An assignment "m = l" does not copy a list. It runs in constant time, and is extremely fast.
Alternatively, we may make a copy of the list l. There are several possible ways to do that, all with the same effect:
>>> l = [3, 5, 7, 9]
>>> n = l.copy()    # technique 1: call the copy() method
>>> n = list(l)     # technique 2: call the list() function
>>> n = l[:]        # technique 3: use slice syntax
Now the list n has the same values as l, but it is a different list. Changes in one list will not be visible in the other:
>>> l[1] = 575
>>> l
[3, 575, 7, 9]
>>> n
[3, 5, 7, 9]
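One caveat is that all three techniques make a shallow copy: only the outer list is copied, so any sublists are still shared. As a sketch with illustrative variable names, the standard library function copy.deepcopy() copies the nested lists as well:

```python
import copy

m = [[1, 2], [3, 4]]
shallow = m.copy()          # copies the outer list only
deep = copy.deepcopy(m)     # copies the sublists too

m[0][0] = 99                # modify a shared sublist
print(shallow)    # prints: [[99, 2], [3, 4]]
print(deep)       # prints: [[1, 2], [3, 4]]

m.append([5, 6])            # changes to the outer list are not shared
print(len(shallow))    # prints: 2
```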
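Python provides two different operators for testing equality. To illustrate them, suppose for example that x and y refer to the same list and z is a separate copy of it:
>>> x = [3, 5, 7, 9]
>>> y = x
>>> z = x.copy()
The first is the == operator:
>>> x == y
True
>>> x == z
True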
This operator tests for structural equality. In other words, given two lists, it compares them element by element to see if they are equal. (It will even descend into sublists to compare elements there as well.)
The second equality operator is the 'is' operator:
>>> x is y
True
>>> x is z
False
This operator tests for reference equality. In other words, it returns true only if its arguments actually refer to the same object. (Reference equality is also called physical equality.)
You may want to use each of these operators in various situations. Note that 'is' returns instantly (it runs in constant time), whereas == may traverse a list in its entirety, so it may be significantly slower.
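The following sketch puts the two operators side by side on lists that contain a sublist. The variable names are illustrative; here y is an alias for x, while z is a separately constructed list with equal contents.

```python
x = [1, [2, 3]]
y = x               # y refers to the same list as x
z = [1, [2, 3]]     # z is a different list with equal contents

same_value = (x == z)     # structural equality, descends into the sublist
same_object = (x is z)    # reference equality
alias = (x is y)
print(same_value, same_object, alias)    # prints: True False True

x[1].append(4)            # change x's sublist
after_z = (x == z)
after_y = (x == y)
print(after_z, after_y)   # prints: False True
```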
A list may contain any type of elements, including sublists:
>>> m = [[1], [2, 3, 4, 5], [6]]
A list of lists is a natural way to represent a matrix in Python. Consider this matrix with dimensions 3 x 3:
 5 11 12
 2  8  7
14  2  6
If we want to store it in Python as a list of lists, normally we will use row-major order, in which each sublist holds a row of the matrix:
m = [ [5, 11, 12], [2, 8, 7], [14, 2, 6] ]
Alternatively we could use column-major order, in which each sublist is a matrix column; then the first sublist would be [5, 2, 14]. The choice is arbitrary, but by convention we will generally use row-major order.
With this ordering, we can use the syntax m[i][j] to access the matrix element at row i, column j. Do not forget that rows and columns are numbered from 0:
>>> m = [ [5, 11, 12], [2, 8, 7], [14, 2, 6] ]
>>> m[1][0]    # row 1, column 0
2
Of course, we may use the index -1 to reference the last row or the last column:
>>> m[-1][-1] = 100
>>> m
[[5, 11, 12], [2, 8, 7], [14, 2, 100]]
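As an illustrative sketch of the m[i][j] syntax, the nested loops below compute the sum of each row and of each column. It uses the original 3 x 3 matrix (before the last element was changed to 100), and the variable names are made up for this example.

```python
m = [[5, 11, 12],
     [2, 8, 7],
     [14, 2, 6]]

rows = len(m)       # number of rows
cols = len(m[0])    # number of columns (all rows have the same length)

row_sums = []
for i in range(rows):
    total = 0
    for j in range(cols):
        total += m[i][j]
    row_sums.append(total)

col_sums = []
for j in range(cols):
    total = 0
    for i in range(rows):
        total += m[i][j]
    col_sums.append(total)

print(row_sums)    # prints: [28, 17, 22]
print(col_sums)    # prints: [21, 21, 25]
```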
Here's a program that will read a matrix from the input, with one row of numbers per input line:
# Read a matrix from the input, e.g.
#
# 2 3 4
# 5 1 8
# 0 2 9

import sys

m = []
for line in sys.stdin:
    # build a row of the matrix
    row = []
    for word in line.split():
        row.append(int(word))
    m.append(row)
print(m)
Now suppose that we want to build a zero matrix of a given size, i.e. a matrix whose elements are all 0. Recall that we may use the * operator to build a list of a given length by repeating a given element:
>>> 3 * [0]
[0, 0, 0]
So you might think that we can build e.g. a 3 x 3 matrix of zeros by
>>> m = 3 * [3 * [0]]
>>> m
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
WARNING: This code looks correct but is not. Let's see what happens if we attempt to set the upper-left element of the matrix:
>>> m[0][0] = 7
>>> m
[[7, 0, 0], [7, 0, 0], [7, 0, 0]]
It appears that several matrix elements have changed!
Here's what's going on: all three of the sublists are actually pointers to the same list. Here is a similar example:
>>> a = [1, 2, 3]
>>> m = [a, a]
>>> m
[[1, 2, 3], [1, 2, 3]]
>>> a[0] = 7
>>> m
[[7, 2, 3], [7, 2, 3]]
The line m = [a, a] creates a list with two elements, each of which is a pointer to the list a. When a changes, the change is visible in m.
With that understanding, let's revisit our attempt to create a 3 x 3 matrix of zeros:
>>> m = 3 * [3 * [0]]
>>> m
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
This code constructs a single list with three zeroes (3 * [0]), and then repeats it 3 times. However, the repetition does not copy the list; it merely makes three pointers to the same list. And so an update to any matrix element will actually be visible in three places in the matrix.
Here's a correct way to make a 3 x 3 matrix of zeroes:
m = []
for i in range(3):
    m.append(3 * [0])
This may seem like more work, though later in this course we'll see how we can write even this form in a single line.
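As a preview, one common single-line form uses a list comprehension, a feature not yet covered here; this may or may not be the form the course introduces later. The sketch also uses 'is' to check that the rows are distinct lists, in contrast with the incorrect version built by repetition. The names m and bad are illustrative.

```python
# Each iteration evaluates 3 * [0] again, so every row is a new list.
m = [3 * [0] for _ in range(3)]
m[0][0] = 7
print(m)                  # prints: [[7, 0, 0], [0, 0, 0], [0, 0, 0]]
print(m[0] is m[1])       # prints: False

# The incorrect version repeats a pointer to one and the same row.
bad = 3 * [3 * [0]]
print(bad[0] is bad[1])   # prints: True
```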