Introduction to Algorithms, 2022-3
Week 9: Exercises

1. Hash Removal

In the lecture we saw this hash table implementation:

class HashSet:
    def __init__(self, num_buckets):
        # each array element is the head of a linked list of Nodes
        self.a = num_buckets * [None]

    def contains(self, x):
        i = hash(x) % len(self.a)     # hash bucket index
        p = self.a[i]
        while p != None:
            if p.val == x:
                return True
            p = p.next
        return False

    def add(self, x):
        if not self.contains(x):
            i = hash(x) % len(self.a)
            self.a[i] = Node(x, self.a[i])   # prepend to hash chain

Add a method remove(self, x) that will delete a value x from a hash table if it is present.

2. Hash Dictionary

Write a class HashDictionary that implements a dictionary using a hash table with a fixed number of buckets. The number of buckets should be a constructor argument. The class should support these methods:

d.set(key, value)
Add a new (key, value) pair, or update an existing key if present.
d.remove_key(key)
Remove a key and its associated value.
d.lookup(key)
Look up a key and return its associated value, or None if absent.

3. Resizable Hashing

Modify the class HashSet from the lecture so that the number of buckets expands dynamically, as we discussed in the lecture. The constructor should take an initial number of buckets plus a load factor limit. Whenever the load factor exceeds the limit, the add() method should expand the table, doubling the number of buckets.

4. Bucket Arrays

In a hash table, we could store the contents of each hash bucket using a dynamic array (i.e. a Python list) rather than a linked list. Write a class HashArraySet that is like the class HashSet from the lecture, but uses dynamic arrays in this fashion. Now compare the performance of HashSet and HashArraySet: with each class, create a hash table with 1000 buckets, insert 100,000 random values, then look up each of the random values in turn to check that it is in the table.

5. Hash Chain Lengths

Write a program that reads an integer N and simulates inserting (100 · N) values into a hash table with N buckets using separate chaining, assuming that hash values will be distributed randomly. The program should print the lengths of the longest and shortest chains. (You do not need to use an actual hash table implementation in solving this exercise.)

6. Empty Buckets

Suppose that we insert N values into a hash table with N buckets. What fraction of the buckets do we expect will be empty? Assume that N is a large number.

7. Random Queue

We'd like to implement a data structure RandomQueue with the following interface:

All operations should be fast, i.e. not run in linear time. How can we implement this structure?