Some of the topics we discussed today are covered in these sections of Problem Solving with Algorithms:
6.7 The Bubble Sort
6.11 The Merge Sort
Here are some more notes:
Sorting is a fundamental task in computer science. Sorting an array means rearranging its elements so that they are in order from left to right.
We will study a variety of sorting algorithms in this class. Most of these algorithms will work on sequences of any ordered data type: integer, real, string and so on.
Our first sorting algorithm is bubble sort. Bubble sort is not terribly efficient, but it is simple to write.
Bubble sort works by making a number of passes over the input. On each pass, it compares pairs of elements: first elements 1 and 2, then elements 2 and 3, and so on. After each comparison, it swaps the elements if they are out of order. Here is an animation of bubble sort in action.
Suppose that an array has N elements. Then the first pass of a bubble sort makes N – 1 comparisons, and always brings the largest element into the last position. So the second pass does not need to go as far: it makes only N – 2 comparisons, and brings the second-largest element into the second-to-last position. And so on. After N – 1 passes, the sort is complete and the array is in order.
Here's an implementation of bubble sort in Python:
def bubble_sort(a):
    n = len(a)
    for i in range(n - 1, 0, -1):   # n - 1 .. 1
        for j in range(i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
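For example, we might call it like this (the sample array is just an illustration):

a = [12, 3, 10, 5, 8]
bubble_sort(a)
print(a)   # [3, 5, 8, 10, 12]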
What's the running time of bubble sort? Suppose that the input array has N elements. As mentioned above, on the first pass we make (N – 1) comparisons; on the second pass we make (N – 2) comparisons, and so on. The total number of comparisons is
(N – 1) + (N – 2) + … + 2 + 1 = N(N – 1) / 2 = O(N²)
The number of element swaps may also be O(N²), though it will be as low as 0 if the array is already sorted. In any case, the total running time is O(N²) in the best and worst case due to the cost of the comparisons.
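To make these counts concrete, here is a sketch of an instrumented bubble sort that tallies comparisons and swaps (the counters are our own additions, not part of the algorithm):

def bubble_sort_counting(a):
    n = len(a)
    comparisons = swaps = 0
    for i in range(n - 1, 0, -1):
        for j in range(i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return comparisons, swaps

print(bubble_sort_counting([1, 2, 3, 4, 5]))   # (10, 0): sorted input, no swaps
print(bubble_sort_counting([5, 4, 3, 2, 1]))   # (10, 10): reverse order, every comparison swaps

On a 5-element array the comparison count is always 4 + 3 + 2 + 1 = 10, while the swap count ranges from 0 to 10 depending on the input.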
Our next sorting algorithm is selection sort. In a selection sort, we start by finding the smallest value in the array and exchanging it with the first element. We then find the second smallest value and exchange it with the second element, and so on. Here is an animation of selection sort in action.
Here's an implementation in Python:
def selection_sort(a):
    n = len(a)
    for i in range(n - 1):
        min_index = i
        min_val = a[i]
        for j in range(i + 1, n):
            if a[j] < min_val:
                min_val = a[j]
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]
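A quick check (again, the sample array is arbitrary):

a = [10, 3, 12, 5, 8]
selection_sort(a)
print(a)   # [3, 5, 8, 10, 12]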
We can analyze selection sort's running time similarly to bubble sort. Just as in bubble sort, the total number of comparisons is
(N – 1) + (N – 2) + … + 2 + 1 = N(N – 1) / 2 = O(N²)
and the running time is O(N²) in the best and worst case. However, selection sort will usually outperform bubble sort by as much as a factor of 3. That's because it performs far fewer swaps as it rearranges the data: at most N – 1 in total, whereas bubble sort may perform one swap per comparison.
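As a rough illustration, here is a sketch of a timing comparison on random data (the array size and the use of time.perf_counter are our own choices; absolute numbers will vary by machine):

import random
import time

def benchmark(sort_fn, data):
    a = list(data)                  # sort a fresh copy each time
    start = time.perf_counter()
    sort_fn(a)
    return time.perf_counter() - start

data = [random.randint(0, 10**6) for _ in range(2000)]
print("bubble sort:   ", benchmark(bubble_sort, data))
print("selection sort:", benchmark(selection_sort, data))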
Insertion sort is another fundamental sorting algorithm. Insertion sort loops through the array elements from left to right. For each element a[i], we first "lift" the element out of the array by saving it in a temporary variable t. We then walk leftward from position i; as we go, we shift each element that is greater than t one position to the right. This makes room so that we can insert t to the left of those elements. Now the subarray a[0..i] is in sorted order. After we repeat this for each i in turn, the entire array is sorted.
Here is an animation of insertion sort in action.
Here is a Python implementation of insertion sort:
def insertion_sort(a):
    n = len(a)
    for i in range(n):
        t = a[i]                  # lift a[i] out of the array
        j = i - 1
        while j >= 0 and a[j] > t:
            a[j + 1] = a[j]       # shift larger elements rightward
            j -= 1
        a[j + 1] = t              # insert t into its place
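And one more quick check:

a = [8, 3, 12, 5, 10]
insertion_sort(a)
print(a)   # [3, 5, 8, 10, 12]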
What is the running time of insertion sort? In the best case, the input array is already sorted. Then no elements are shifted or modified at all and the algorithm runs in time O(n).
The worst case is when the input array is in reverse order. Then to insert each value we must shift all the elements to its left, so the total number of shifts is 1 + 2 + … + (n – 1) = O(n²). If the input array is ordered randomly, then on average we will shift half of the subarray elements on each iteration, so the time is still O(n²).
Insertion sort has the same worst-case asymptotic running time as bubble sort and selection sort, i.e. O(n²). Like selection sort, insertion sort generally runs several times more quickly than bubble sort. Furthermore, insertion sort has a major advantage over selection sort in that it is adaptive: it runs especially quickly when the input array is already sorted or nearly sorted. It is a reasonable choice for a simple sorting algorithm when n is not large.
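To see this adaptivity directly, here is a sketch that instruments insertion sort to count shifts (the counter is our own addition):

def insertion_sort_shifts(a):
    shifts = 0
    for i in range(len(a)):
        t = a[i]
        j = i - 1
        while j >= 0 and a[j] > t:
            a[j + 1] = a[j]
            shifts += 1
            j -= 1
        a[j + 1] = t
    return shifts

print(insertion_sort_shifts([1, 2, 3, 4, 5]))   # 0: already sorted
print(insertion_sort_shifts([2, 1, 3, 5, 4]))   # 2: nearly sorted
print(insertion_sort_shifts([5, 4, 3, 2, 1]))   # 10: reverse order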
Suppose that we have two arrays, each of which contains a sorted sequence of integers. For example:
a = [3, 5, 8, 10, 12]
b = [6, 7, 11, 15, 18]
And suppose that we'd like to merge the numbers in these arrays into a single array c containing all of the numbers in sorted order. Fortunately this is not difficult. We can use integer variables i and j to point to members of a and b, respectively. Initially i = j = 0. At each step of the merge, we compare a[i] and b[j]. If a[i] < b[j], we copy a[i] into the destination array and increment i. Otherwise we copy b[j] and increment j. The entire process runs in linear time, i.e. in O(N) where N = len(a) + len(b).
Let's write a function to accomplish this task:
# Merge sorted arrays a and b onto c, given that
# len(a) + len(b) = len(c).
def merge(a, b, c):
    i = j = 0                 # index into a, b
    for k in range(len(c)):
        if j == len(b):       # j is out of bounds
            c[k] = a[i]       # so we take a[i]
            i += 1
        elif i == len(a):     # i is out of bounds
            c[k] = b[j]       # so we take b[j]
            j += 1
        elif a[i] < b[j]:     # both i and j are in bounds
            c[k] = a[i]
            i += 1
        else:
            c[k] = b[j]
            j += 1
Now we can use merge to merge the arrays a and b mentioned above:
a = [3, 5, 8, 10, 12]
b = [6, 7, 11, 15, 18]
c = [0] * (len(a) + len(b))
merge(a, b, c)
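After this call, c holds all ten numbers in sorted order:

print(c)   # [3, 5, 6, 7, 8, 10, 11, 12, 15, 18]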
We now have a function that merges two sorted arrays. We can use this as the basis for implementing a general-purpose sorting algorithm called mergesort.
Mergesort has a simple recursive structure. To sort an array of n elements, it divides the array in two and recursively mergesorts each half. It then merges the two sorted subarrays into a single sorted array.
For example, given an input array, mergesort first splits the array into two halves, then sorts each half recursively, and finally merges the two sorted halves back into a single sorted array. Here is an animation of mergesort in action.
Here's an implementation of mergesort, using our merge function from the previous section:
def merge_sort(a):
    if len(a) < 2:
        return
    mid = len(a) // 2
    left = a[:mid]    # copy of left half of array
    right = a[mid:]   # copy of right half
    merge_sort(left)
    merge_sort(right)
    merge(left, right, a)
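A quick check that the pieces fit together (sample data is arbitrary):

a = [5, 2, 8, 3, 6, 1]
merge_sort(a)
print(a)   # [1, 2, 3, 5, 6, 8]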
What is the running time of merge sort? In the code above, the helper function merge runs in time O(N), where N is the length of the array c. The array slice operations a[:mid] and a[mid:] also take O(N). So if T(N) is the time to run merge_sort on an array with N elements, then we have
T(N) = 2 ⋅ T(N / 2) + O(N)
We have not seen this recurrence before. In this class we will not formally study how to solve recurrences such as this one. But its solution is
T(N) = O(N log N)
Intuitively, the recursion is about log₂ N levels deep, and the total merging work at each level is O(N). For large N, O(N log N) is much faster than O(N²), so mergesort will be far faster than insertion sort or bubble sort.
For example, suppose that we want to sort 1,000,000,000 numbers. And suppose (somewhat optimistically) that we can perform 1,000,000,000 operations per second. An insertion sort might take roughly N² = 1,000,000,000 ⋅ 1,000,000,000 operations, which will take 1,000,000,000 seconds, or about 32 years. A mergesort might take roughly N log N ≈ 30,000,000,000 operations, which will take 30 seconds. This is a dramatic difference. :)
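These estimates are easy to reproduce in a few lines of Python (the operations-per-second figure is the same optimistic assumption as above):

import math

N = 10**9
ops_per_second = 10**9                           # optimistic assumption from above
seconds_quadratic = N * N / ops_per_second       # ~insertion sort
seconds_mergesort = N * math.log2(N) / ops_per_second
print(seconds_quadratic / (365 * 24 * 3600), "years")   # roughly 31.7
print(seconds_mergesort, "seconds")                     # roughly 29.9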