You can read about these topics in Learn You a Haskell, chapters 7, 9, and 11.
Note that the System.Random module is included in
the random
package, which is not
included in a default Haskell installation. You can install it using
the Cabal package manager:
$ cabal install --lib random
Haskell modules are contained in installable units called packages. The default installation of Haskell includes a number of packages including
base
(which contains the fundamental modules Prelude, Data.Int, Data.List
and many others)
array
(which contains Data.Array)
containers
(which contains various collection types such as Data.Set and
Data.Map)
and many others.
You can install additional packages using Cabal,
a package manager that comes with Haskell. Cabal is somewhat similar
to Python's package manager pip
: it can
download packages from the internet and install them on your system.
For example, as mentioned above you can use Cabal to install the
random
package:
$ cabal install --lib random
The command ghc-pkg list
will list
packages that are installed on your system. Its output looks like
this:
$ ghc-pkg list /home/adam/.ghcup/ghc/9.2.7/lib/ghc-9.2.7/package.conf.d Cabal-3.6.3.0 array-0.5.4.0 base-4.16.4.0 binary-0.8.9.0 bytestring-0.11.4.0 containers-0.6.5.1 deepseq-1.4.6.1 ...
However this list is not complete: it may not include packages that you installed using Cabal. Those will be listed in a package environment file on your machine. When you run ghci, it prints the path of this file:
$ ghci Loaded package environment from /home/adam/.ghc/x86_64-linux-9.2.7/environments/default GHCi, version 9.2.7: https://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/adam/Dropbox/ubuntu/ghci >
On my machine, this package environment file looks like this:
$ cat /home/adam/.ghc/x86_64-linux-9.2.7/environments/default clear-package-db global-package-db package-db /home/adam/.cabal/store/ghc-9.2.7/package.db package-id ghc-9.2.7 package-id bytestring-0.11.4.0 package-id unix-2.7.2.2 ... package-id directory-1.3.6.2 package-id text-1.2.5.0 package-id random-1.2.1.1-14950f0d57c55eed8fd0701e27e87527c3c7ef42acca95164df8337d2d526ad3 $
Notice the random
package at the end of
the listing.
In a procedural language, we can read or write the array element at a given index in O(1). Haskell's lists are linked lists, so retrieving or replacing an element at the Nth index of a list will take O(N). Last week we learned that Haskell has immutable arrays. We can retrieve an element of a Haskell array in O(1), but if we want to update a single value we must create a new array, which will take O(N) if the array has N elements.
In some situations we may want a sequence of elements in which we can update the element at a given index more quickly. Let's call this a random-access list. If the size of our sequence is fixed, we may implement a random-access list using a binary tree. We can construct the tree to be balanced, so we will be able to read or update elements in O(log N). As one possibility, each tree node could contain a pair (i, v), where i is an index and v is a value. However that is wasteful: the indices form a continuous range of integers, so we don't actually need to store them all in the tree. Instead, we may build a binary tree in which the internal nodes only contain pointers to children, and all values are stored in leaves. As we descend the tree to look for a value at a particular index, we can decide whether to go left or right at each step based on the index we are seeking.
For example, suppose that our sequence contains 7 elements, namely the characters of the string "sundial". We can store them in a tree that looks like this:
The tree has an asymmetrical shape because 7 is not a power of 2, but that is fine. In the above tree, for any node at the root of a subtree with n elements, the left subtree holds (n div 2) elements and the right subtree holds the remaining elements. For example, at the root node the left subtree has (7 div 2) = 3 elements, and the right subtree has the remaining 4 elements. So if we are looking for an element with an index that is less than 3, we can go left, otherwise right.
In the tutorial we wrote a partial implementation of this structure:
data Tree a = Leaf a | Node (Tree a) (Tree a) type IMap a = (Int, Tree a) mk_tree :: [a] -> Tree a mk_tree [x] = Leaf x mk_tree xs = let n = length xs `div` 2 (xs1, xs2) = splitAt n xs in Node (mk_tree xs1) (mk_tree xs2) -- Build a tree containing the given elements. mk_imap :: [a] -> IMap a mk_imap xs = (length xs, mk_tree xs) -- Retrieve an element by index. iget :: IMap a -> Int -> a iget (1, Leaf x) 0 = x iget (n, Node left right) k | k < n_left = iget (n_left, left) k | otherwise = iget (n - n_left, right) (k - n_left) where n_left = n `div` 2 -- # of elements in the left subtree -- Set an element by index. iset :: IMap a -> Int -> a -> IMap a ...
Haskell's Data.Sequence
module (in the containers
package)
contains a type Seq a
which is a more
powerful random-access list. This type allows you to read or update
elements by index in O(log N), concatenate or split sequences in
O(log N), and perform various other operations efficiently as well.
How can we build a purely functional set or dictionary data structure? For a dictionary, first suppose that the keys form a fixed set, and are of a comparable type. Then we can build a balanced binary tree in which each node contains a (key, value) pair. Since the keys are fixed, the tree's shape will not change and it will always be balanced. For a set, if we want to store a dynamically changing subset of values from a known fixed set, we can similarly use a dictionary that maps values in that fixed set to a boolean.
For example, suppose that you are implementing an algorithm on a graph whose vertices are named by strings. Since the graph will not change as the algorithm runs, its vertex names are a fixed set and you could use this approach to store a visited set of vertices, or a map from vertices to integers.
In the general case, when keys may be added dynamically and are not all known at the beginning, you can implement a set or dictionary using a self-balancing binary tree structure such as a red-black or AVL tree.
Haskell's containers
package includes modules Data.Set
and Data.Map
which provide set and map types using balanced binary trees. There
are also modules Data.IntSet
and Data.IntMap
which are more efficient when keys are integers.