Here are notes about the topics we covered in Lecture 8. For more details, see the Essential C# textbook or the C# reference pages.
The top-level object class contains a method Equals():
virtual bool Equals (object obj);
This method is distinct from the == operator. The default behavior of Equals and == is as follows:
for classes: both Equals and == test reference equality, i.e. they return true only if two objects are actually the same object
for structs:
    Equals tests structural equality: it returns true if two objects have the same type and their corresponding fields are equal
    == is not defined; an attempt to use it will result in a compiler error
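The default behavior described above can be checked with a small experiment. This is only a sketch: the types PointClass and PointStruct are hypothetical names invented for the illustration.

```csharp
using System;

// Hypothetical class: Equals and == both test reference equality by default
class PointClass {
    public int x, y;
    public PointClass(int x, int y) { this.x = x; this.y = y; }
}

// Hypothetical struct: Equals tests structural equality by default
struct PointStruct {
    public int x, y;
    public PointStruct(int x, int y) { this.x = x; this.y = y; }
}

static class EqualityDemo {
    static void Main() {
        PointClass p = new PointClass(1, 2), q = new PointClass(1, 2);
        PointClass r = p;                  // r refers to the same object as p
        Console.WriteLine(p.Equals(q));    // False: two distinct objects
        Console.WriteLine(p == q);         // False
        Console.WriteLine(p == r);         // True: same object

        PointStruct s = new PointStruct(1, 2), t = new PointStruct(1, 2);
        Console.WriteLine(s.Equals(t));    // True: same type, equal fields
        // Console.WriteLine(s == t);      // would not compile: == is not defined
    }
}
```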
When you write a class or struct, you can override the Equals method for your type and can also provide an overloaded == operator. In theory these could have different behavior, which is potentially confusing. I recommend that if you provide a custom implementation of Equals for your type, you should also customize == to behave in the same way, and vice versa. (In fact, if you customize ==, the compiler will require you to define a matching != operator, and will warn you if you do not write a custom version of Equals as well.)
object also contains a method GetHashCode():
virtual int GetHashCode ();
If you override Equals for your type, you should also override GetHashCode, ensuring that two equal values will always have the same hash code. This will ensure that your type will work correctly as a hash table key. (In fact, if you override Equals the compiler will warn you if you do not override GetHashCode as well.)
Here is a partial implementation of a big number class with its own implementation of Equals, == and GetHashCode:
class BigNum {
    int[] digits;

    public static bool operator == (BigNum b, BigNum c) {
        // assuming no leading zeroes
        if (b.digits.Length != c.digits.Length)
            return false;
        for (int i = 0 ; i < b.digits.Length ; ++i)
            if (b.digits[i] != c.digits[i])
                return false;
        return true;
    }

    public static bool operator != (BigNum b, BigNum c) => !(b == c);

    public override bool Equals(object o) => (o is BigNum n) && (this == n);

    // calculate (this mod 2^32)
    public override int GetHashCode() {
        int h = 0;
        foreach (int d in digits)
            h = 10 * h + d;
        return h;
    }
}
A method may be generic: it may take one or more type parameters. For example:
public static void swap<T>(ref T a, ref T b) {
    T t = a;
    a = b;
    b = t;
}

public static void fill<T>(T[] a, T t) {
    for (int i = 0 ; i < a.Length ; ++i)
        a[i] = t;
}
A class or interface may also be generic. Here's a generic version of our dynamic array class:
class DynArray<T> {
    T[] a = new T[10];
    int count;

    public int length { get => count; }

    public void add(T t) {
        if (count == a.Length) {   // the array is full: double its capacity
            T[] b = new T[count * 2];
            for (int j = 0 ; j < count ; ++j)
                b[j] = a[j];
            a = b;
        }
        a[count++] = t;
    }

    public T this[int index] {
        // return the default value if the index is past the last element
        get => index < count ? a[index] : default(T);
        set => a[index] = value;
    }

    public bool contains(T t) {
        // examine only the slots that are actually in use
        for (int i = 0 ; i < count ; ++i)
            if (a[i].Equals(t))
                return true;
        return false;
    }
}
Note that the contains method above uses the Equals method to compare two values. It cannot use ==, since the == operator is not defined for every type T.
Since DynArray is generic, we can instantiate it with any type we want. For example:
DynArray<double> a = new DynArray<double>();
a.add(3.0);
a.add(4.0);
DynArray<string> b = new DynArray<string>();
b.add("yo");
The default operator returns the default value for a type:
WriteLine(default(int)); // writes 0
default is most useful inside a generic class, where it can act on a type parameter. In the DynArray class above, the indexer uses default to return a type's default value if the index is out of bounds.
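As a brief illustration, the following sketch prints the default values of a few types and uses default on a type parameter. The method firstOrDefault is a hypothetical helper written for this example, not a library method.

```csharp
using System;

static class DefaultDemo {
    // Hypothetical helper: the first element of a, or T's default value if a is empty
    public static T firstOrDefault<T>(T[] a) => a.Length > 0 ? a[0] : default(T);

    static void Main() {
        Console.WriteLine(default(int));                     // 0
        Console.WriteLine(default(bool));                    // False
        Console.WriteLine(default(string) == null);          // True: null for reference types
        Console.WriteLine(firstOrDefault(new double[0]));    // 0
        Console.WriteLine(firstOrDefault(new[] { "yo" }));   // yo
    }
}
```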
A generic method, class or interface may have multiple type parameters. Here is an interface type for a map from any type to any type:
interface Map<K, V> { V this[K key] { get; set; } }
Here is a class that implements the Map<K, V> interface naively using a dynamic array of key/value pairs:
class ArrayMap<K, V> : Map<K, V> {
    struct Pair {
        public readonly K key;
        public readonly V val;

        public Pair(K key, V val) { this.key = key; this.val = val; }
    }

    DynArray<Pair> a = new DynArray<Pair>();

    // return the index of the given key, or null if it is absent
    int? find(K key) {
        for (int i = 0 ; i < a.length ; ++i)
            if (a[i].key.Equals(key))
                return i;
        return null;
    }

    public V this[K key] {
        get {
            if (find(key) is int i)
                return a[i].val;
            throw new KeyNotFoundException();
        }
        set {
            if (find(key) is int i)
                a[i] = new Pair(key, value);
            else
                a.add(new Pair(key, value));
        }
    }
}
A generic type parameter may include constraints. Here is a method that copies values from one array to another, using a constraint to ensure that the arrays have compatible types:
public static void copy<T, U>(T[] a, U[] b) where T : U {
    for (int i = 0 ; i < a.Length ; ++i)
        b[i] = a[i];
}
Each type constraint can have one of the following forms:
T : type – T must be a subtype of the given type
T : struct – T must be a value type
T : class – T must be a reference type
Commonly we use a constraint to ensure that a generic type has a built-in ordering, i.e. that it implements the built-in IComparable<T> interface. This interface has a single method:
int CompareTo (T val);
The method returns
a negative value if this object precedes val in the built-in ordering
0 if this object equals val
a positive value if this object follows val in the built-in ordering
The built-in types int, double and string all implement IComparable<T>, for example.
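A user-defined type can implement IComparable<T> in the same way. Here is a minimal sketch; the Fraction class is a hypothetical type invented for this example, and it assumes that denominators are positive and that the products do not overflow.

```csharp
using System;

// Hypothetical type: a fraction ordered by its numeric value
class Fraction : IComparable<Fraction> {
    public readonly int num, denom;   // assumption: denom > 0

    public Fraction(int num, int denom) { this.num = num; this.denom = denom; }

    // a/b precedes c/d exactly when a*d < c*b (valid because b, d > 0)
    public int CompareTo(Fraction other) =>
        (num * other.denom).CompareTo(other.num * denom);
}

static class CompareDemo {
    static void Main() {
        Console.WriteLine(3.CompareTo(5) < 0);              // True: 3 precedes 5
        Console.WriteLine("pear".CompareTo("apple") > 0);   // True: "pear" follows "apple"

        Fraction half = new Fraction(1, 2), third = new Fraction(1, 3);
        Console.WriteLine(half.CompareTo(third) > 0);       // True: 1/2 follows 1/3
        Console.WriteLine(half.CompareTo(new Fraction(2, 4)) == 0);   // True: equal values
    }
}
```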
Here is a generic method that returns the largest value in an array of any type, using that type's built-in ordering:
public static T max<T>(T[] a) where T : IComparable<T> {
    T m = a[0];
    for (int i = 1 ; i < a.Length ; ++i)
        if (a[i].CompareTo(m) > 0)
            m = a[i];
    return m;
}
Here is a class that can accomplish the same thing. After it receives a series of values via the add method, the max property will contain the largest of the values.
class Maximizer<T> where T : IComparable<T> {
    T _max;
    bool empty = true;   // true until the first value arrives

    public void add(T t) {
        if (empty || t.CompareTo(_max) > 0)
            _max = t;
        empty = false;
    }

    public T max { get => _max; }
}
Sometimes we'd like to compare objects using an ordering that is different from their type's built-in ordering. For example, we might like to compare strings not by lexicographic order, but by length.
A comparer is an object that can compare two values of a given type. It implements the built-in IComparer<T> interface, which has a single method:
int Compare (T x, T y);
The method returns
a negative value if x is less than y
zero if x equals y
a positive value if x is greater than y
Here is the Maximizer class from above, rewritten to use a comparer. Note that it no longer has a generic type constraint:
class Maximizer2<T> {
    IComparer<T> comparer;
    T _max;
    bool empty = true;   // true until the first value arrives

    public Maximizer2(IComparer<T> comparer) { this.comparer = comparer; }

    public void add(T t) {
        if (empty || comparer.Compare(t, _max) > 0)
            _max = t;
        empty = false;
    }

    public T max { get => _max; }
}
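For instance, a comparer that orders strings by length, as suggested earlier, might look like the following sketch. LengthComparer is a hypothetical name. To keep the example self-contained it passes the comparer to the built-in Array.Sort method rather than to Maximizer2.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders strings by length rather than alphabetically
class LengthComparer : IComparer<string> {
    public int Compare(string x, string y) => x.Length.CompareTo(y.Length);
}

static class ComparerDemo {
    static void Main() {
        string[] words = { "kiwi", "fig", "banana" };

        Array.Sort(words);                         // built-in ordering of strings
        Console.WriteLine(string.Join(" ", words));   // banana fig kiwi

        Array.Sort(words, new LengthComparer());   // ordering by length
        Console.WriteLine(string.Join(" ", words));   // fig kiwi banana
    }
}
```

An instance of the same comparer could equally be passed to the Maximizer2<string> constructor, in which case max would hold the longest string added so far.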
An enumerator implements the built-in IEnumerator<T> interface, which represents a stream of objects of type T. It is like the IntStream interface we saw a few lectures ago, but uses a generic type.
IEnumerator<T> has several methods and properties. The most important are
T Current { get; }
Return the current value in the enumeration. You must call MoveNext once before retrieving the first value!
bool MoveNext ();
Advance to the next value in the enumeration. Returns false if there are no more elements.
An enumerable object implements the built-in interface IEnumerable<T>, which represents any object that can provide an enumerator. This interface has a couple of methods; the important one is
IEnumerator<T> GetEnumerator ();
Return an IEnumerator<T> that can traverse all elements in this IEnumerable<T>.
You can only traverse an enumerator once. An enumerable object, however, can be traversed many times; each time, the caller will call GetEnumerator to retrieve an IEnumerator<T> for the traversal.
Enumerable objects are important because
the built-in foreach statement can iterate over any enumerable object
all built-in collection classes are enumerable
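To make the protocol concrete, here is a sketch that traverses an enumerable object by hand using GetEnumerator, MoveNext and Current, which is roughly what a foreach loop does on your behalf. The method names sum and sumForeach are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorDemo {
    // Sum the elements by driving an enumerator directly
    public static int sum(IEnumerable<int> e) {
        int total = 0;
        using (IEnumerator<int> en = e.GetEnumerator()) {   // a fresh enumerator per traversal
            while (en.MoveNext())      // advance first, then read Current
                total += en.Current;
        }
        return total;
    }

    // The same traversal written with foreach
    public static int sumForeach(IEnumerable<int> e) {
        int total = 0;
        foreach (int i in e)
            total += i;
        return total;
    }

    static void Main() {
        var list = new List<int> { 2, 4, 6 };
        Console.WriteLine(sum(list));          // 12
        Console.WriteLine(sum(list));          // 12: an enumerable can be traversed again
        Console.WriteLine(sumForeach(list));   // 12
    }
}
```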
Unfortunately it's a bit of a bother to implement an enumerable object, since you must implement a fair number of methods. For completeness, here's an implementation of a class Range that represents a range of integers and is enumerable:
using System;
using System.Collections;           // non-generic IEnumerator and IEnumerable
using System.Collections.Generic;

class RangeEnumerator : IEnumerator<int> {
    int i, end;

    public RangeEnumerator(int start, int end) {
        this.i = start - 1;   // positioned before the first value
        this.end = end;
    }

    public int Current { get => i; }

    object IEnumerator.Current { get => Current; }

    public bool MoveNext() => ++i <= end;

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}

class Range : IEnumerable<int> {
    int start, end;

    public Range(int start, int end) { this.start = start; this.end = end; }

    public IEnumerator<int> GetEnumerator() => new RangeEnumerator(start, end);

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
The standard C# library contains a number of built-in collection classes in the System.Collections.Generic namespace. Here is a picture of their type hierarchy:
IEnumerable<T>
    ICollection<T>
        IDictionary<K, V>
            Dictionary<K, V> (hash table)
            SortedDictionary<K, V> (balanced binary tree)
            SortedList<K, V> (sorted array)
        IList<T>
            List<T> (array)
        ISet<T>
            HashSet<T> (hash table)
            SortedSet<T> (balanced binary tree)
    Queue<T> (circular array)
    Stack<T> (array)
For details about these interfaces and classes, see the C# library quick reference.
These classes are very useful, and we will be using them often in this course. The List<T> class is especially useful: it is a dynamic array, similar to the DynArray<T> class we wrote above.
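As a quick illustration of how List<T> resembles DynArray<T>, here is a short sketch that uses a few of its standard members (Add, Insert, Remove, Contains, Count and the indexer):

```csharp
using System;
using System.Collections.Generic;

static class ListDemo {
    static void Main() {
        var list = new List<string>();
        list.Add("fig");                 // append at the end
        list.Add("pear");
        list.Insert(0, "apple");         // insert at a given index

        Console.WriteLine(list.Count);              // 3
        Console.WriteLine(list[1]);                 // fig
        Console.WriteLine(list.Contains("pear"));   // True

        list.Remove("fig");              // remove the first occurrence
        foreach (string s in list)       // List<T> is enumerable
            Console.WriteLine(s);        // apple, then pear
    }
}
```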
Of course, a major goal of this course is not only to be able to use collection classes like these, but also to understand how they are implemented and the performance tradeoffs between them.
Here is a generic method that can invert a dictionary: given a dictionary that maps keys to values, it constructs an inverse dictionary that maps the values to the keys. (This assumes that all values are unique.)
static IDictionary<V, K> invert<K, V>(IDictionary<K, V> d) {
    var e = new Dictionary<V, K>();
    foreach (K key in d.Keys)
        e[d[key]] = key;
    return e;
}
Here's an alternate implementation that iterates directly over the key/value pairs in the source dictionary, which is perhaps slightly clearer:
static IDictionary<V, K> invert<K, V>(IDictionary<K, V> d) {
    var e = new Dictionary<V, K>();
    foreach (KeyValuePair<K, V> pair in d)
        e[pair.Value] = pair.Key;
    return e;
}
Below is a review of binary search trees, which may be helpful for this week's homework assignment. For more details about binary search trees, see e.g. Introduction to Algorithms, ch. 12.
A binary tree holds a set of values. A binary tree has zero or more nodes, each of which contains a single value. The tree with no nodes is called the empty tree. Any non-empty tree consists of a root node plus its left and right subtrees, which are also (possibly empty) binary trees.
Here is a picture of a binary tree:
In this tree, a is the root node. Node b is the parent of nodes d and e. Node d is the left child of b, and node e is b's right child. Node e has a left child but no right child. Node c has a right child but no left child.
The subtree rooted at b is the left subtree of node a.
The nodes d, f, h and i are leaves: they have no children. Nodes a, b, c, e and g are internal nodes, which are nodes that are not leaves.
A binary search tree is a tree of ordered values such as integers or strings in which, for any node N with value v,
all values in N's left subtree are less than v
all values in N's right subtree are greater than v
Here is a binary search tree of integers:
Finding a value in a binary search tree is straightforward. To find the value v, we begin at the root. Let r be the root node's value. If v = r, we are done. Otherwise, if v < r then we recursively search for v in the root's left subtree; if v > r then we search in the right subtree.
Inserting a value into a binary search tree is also straightforward. Beginning at the root, we look for an insertion position, proceeding down the tree just as in the above algorithm for finding a node. When we reach an empty left or right child, we create a node there.
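The find and insert algorithms above might be sketched as follows for a tree of integers. The Node class and the method names are assumptions made for this example. Here insert returns the root of the updated subtree and silently ignores a value that is already present, which is one possible design choice among several.

```csharp
using System;

// Hypothetical node type for a binary search tree of integers
class Node {
    public int val;
    public Node left, right;
    public Node(int val) { this.val = val; }
}

static class Bst {
    // Is v present in the subtree rooted at n?  (n == null is the empty tree)
    public static bool find(Node n, int v) {
        if (n == null) return false;
        if (v == n.val) return true;
        return v < n.val ? find(n.left, v) : find(n.right, v);
    }

    // Insert v into the subtree rooted at n and return the subtree's root
    public static Node insert(Node n, int v) {
        if (n == null) return new Node(v);   // reached an empty position
        if (v < n.val)
            n.left = insert(n.left, v);
        else if (v > n.val)
            n.right = insert(n.right, v);
        return n;                            // v == n.val: already present
    }

    static void Main() {
        Node root = null;
        foreach (int v in new[] { 10, 5, 20, 15, 18 })
            root = insert(root, v);
        Console.WriteLine(find(root, 15));   // True
        Console.WriteLine(find(root, 7));    // False
    }
}
```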
Deleting a value from a binary search tree is a little trickier. It's not hard to find the node to delete: we just walk down the tree, just like when searching or inserting. Once we've found the node N we want to delete, there are several cases.
(a) If N is a leaf (it has no children), we can just remove it from the tree.
(b) If N has only a single child, we replace N with its child. For example, we can delete node 15 in the binary search tree above by replacing it with 18.
(c) If N has two children, then we must replace it by the next highest node in the tree. To do this, we start at N's right child and follow left child pointers for as long as we can. This will take us to the smallest node in N's right subtree, which must be the next highest node in the tree after N. Call this node M. We must remove M from the right subtree, and fortunately this is easy: M has no left child, so we can remove it following either case (a) or (b) above. Now we update the node N, setting its value to the value that was in M.
As a concrete example, suppose that we want to delete the root node (with value 10) in the tree above. This node has two children. We start at its right child (20) and follow its left child pointer to 15. That's as far as we can go in following left child pointers, since 15 has no left child. So now we remove 15 (following case (b) above), and then replace the value 10 with 15 at the root.
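Here is one way the deletion cases might be coded, as a sketch rather than a definitive implementation. The Node class and the method names are assumptions. The sample tree contains the values 10, 20, 15 and 18 mentioned above, plus a left child 5 that is invented for the example.

```csharp
using System;

// Hypothetical node type for a binary search tree of integers
class Node {
    public int val;
    public Node left, right;
    public Node(int val) { this.val = val; }
}

static class BstDelete {
    // Helper used only to build the sample tree
    public static Node insert(Node n, int v) {
        if (n == null) return new Node(v);
        if (v < n.val) n.left = insert(n.left, v);
        else if (v > n.val) n.right = insert(n.right, v);
        return n;
    }

    // Remove v from the subtree rooted at n and return the subtree's new root
    public static Node remove(Node n, int v) {
        if (n == null) return null;                 // v is not in the tree
        if (v < n.val)
            n.left = remove(n.left, v);
        else if (v > n.val)
            n.right = remove(n.right, v);
        else if (n.left == null)
            return n.right;                         // case (a), or (b) with a right child
        else if (n.right == null)
            return n.left;                          // case (b) with a left child
        else {                                      // case (c): two children
            Node m = n.right;
            while (m.left != null)                  // smallest node in the right subtree
                m = m.left;
            n.val = m.val;                          // copy M's value into N
            n.right = remove(n.right, m.val);       // M has no left child: case (a) or (b)
        }
        return n;
    }

    // Print the values in sorted order
    public static void inorder(Node n) {
        if (n == null) return;
        inorder(n.left);
        Console.Write(n.val + " ");
        inorder(n.right);
    }

    static void Main() {
        Node root = null;
        foreach (int v in new[] { 10, 5, 20, 15, 18 })
            root = insert(root, v);
        root = remove(root, 10);
        Console.WriteLine(root.val);   // 15
        inorder(root);                 // 5 15 18 20
        Console.WriteLine();
    }
}
```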