Consider the hash function we saw in the lecture:
# Generate a hash code in the range 0 .. 2^32 - 1.
def my_hash(s):
h = 0
for c in s:
d = ord(c)
h = (B * h + d) % (2 ** 32)
return ha) If B = 0, what hash function will result? Will this be a good hash function?
b) If B = 1, what hash function will result? Will this be a good hash function?
c) Suppose that we want to place all unique words of War and Peace into 10,000 hash buckets. If we use the hash function above, some values of B will be better than others, because they will result in a more even distribution of hash values. Design an experiment to compare the performance of these choices: B = 2; B = 256; B = 257; B = 65,537; B = 1,000,003.
Write a class HashDictionary that implements a dictionary using a hash table with a fixed number of buckets. The number of buckets should be a constructor argument. The class should support these methods:
Modify the class HashSet from the lecture so that the number of buckets expands dynamically, as we discussed in the lecture. The constructor should take an initial number of buckets plus a load factor limit. Whenever the load factor exceeds the limit, the add() method should expand the table, doubling the number of buckets.
Write a program that reads an integer N and simulates inserting (100 ยท N) values into a hash table with N buckets, assuming that hash values will be distributed randomly. The program should print the number of values in the smallest and largest buckets. (You do not need to use an actual hash table implementation in solving this exercise.)
Suppose that we insert N values into a hash table with N buckets. What fraction of the buckets do we expect will be empty? Assume that N is a large number.