Week 9: Notes

There was no lecture or tutorial this week.

Our topics this week are covariance and contravariance and iterators.

You can read about our C# topics in Essential C# 7.0:

Chapter 12 "Generics": Covariance and Contravariance
Chapter 17 "Building Custom Collections": Iterators

They are also covered in Programming C# 8.0:

Chapter 5 "Collections": Implementing IEnumerable<T> with Iterators
Chapter 6 "Inheritance": Covariance and Contravariance

Here are some more notes on these topics.

covariance

In C# we have studied both generics and inheritance. Now we will examine the phenomenon of covariance, which arises from the combination of generic and inheritance in C# (and also in related languages such as Java).

The fundamental question in covariance is this: Suppose that Foo<T> is a generic class, and that B is a subclass of A. Should Foo<B> be considered a subtype of Foo<A>? You might think that the answer is obviously yes, but the question is not so simple.

For example, suppose that we've written a generic class DynArray representing a dynamic array:

class DynArray<T> {
  T[] a = new T[1];
  int count;
  
  public int length { get { … } set { … } }
  
  public void add(T t) { … }
  
  public T this[int index] { get { … } set { … } }
  
  …
}

And suppose that we have an abstract class Shape with subclasses Rectangle, Circle and Triangle:

abstract class Shape {
  public abstract double area { get; }
}

class Rectangle : Shape {
  public double width, height;
  public Rectangle(double width, double height) { … }
  
  public override double area { get => width * height; }
}

class Circle : Shape { … }

class Triangle : Shape { … }

Finally, suppose that we have a method areaSum that adds the area of all Shapes in a dynamic array:

public static double areaSum(DynArray<Shape> a) {
  double sum = 0;
  for (int i = 0 ; i < a.length ; ++i)
    sum += a[i].area;
  return sum;
}

We might like to use the areaSum method to add up the areas of a set of rectangles:

DynArray<Rectangle> r = new DynArray<Rectangle>();
r.add(new Rectangle(10, 2));
r.add(new Rectangle(20, 4));
double a = areaSum(r);   // is this allowed?

Is the call to areaSum() above legal? That is to say, is DynArray<Rectangle> a subtype of DynArray<Shape>, so that we can implicitly convert it to that type?

Actually this call to areaSum will not compile because DynArray<Rectangle> is not a subtype of DynArray<Shape>. To put it differently, the class DynArray<T> is not covariant.

Again, this might surprise you. Why does the compiler disallow this? Suppose that DynArray<Rectangle> were convertible to DynArray<Shape>. Then we could make the following call:

public static void addCircle(DynArray<Shape> a) {
  a.add(new Circle(5.0));
}

DynArray<Rectangle> r = new DynArray<Rectangle>();
addCircle(r);  // ???

This call would add a circle to a dynamic array of rectangles, which makes no sense. In other words, the compiler must disallow this conversion because it would be unsafe.

In fact in C# class types such as DynArray<T> can never be covariant. However, C# allows generic interface types to be covariant. When you declare a generic interface, you can mark a type parameter T using the keyword out to indicate that the interface is covariant in T. For example, here is an interface with an covariant type parameter:

interface ReadOnlyArray<out T> {
  int length { get; }
  T this[int index] { get; }
}

An interface that is covariant in T only produces (never receives) values of type T. (It may receive or produce values of other types as well; for example, the length property above has type int.) The compiler will check this. This constraint allows covariant type conversions to be safe.

Suppose that the DynArray class above implements this interface:

class DynArray<T> : ReadOnlyArray<T> { … }

And suppose that we modify areaSum to take a ReadOnlyArray<Shape> as its argument:

public static double areaSum(ReadOnlyArray<Shape> a) {
  double sum = 0;
  for (int i = 0 ; i < a.length ; ++i)
    sum += a[i].area;
  return sum;
}

Now we may pass a DynArray<Rectangle> to areaSum:

DynArray<Rectangle> r = new DynArray<Rectangle>();
r.add(new Rectangle(10, 2));
r.add(new Rectangle(20, 4));
double a = areaSum(r);  // will now compile

This works because we can convert a DynArray<Rectangle> to a ReadOnlyArray<Shape>, which works because ReadOnlyArray<T> is covariant.

You will get a compile-time error if you attempt to mark the following interface's type parameter T as covariant, since the interface's methods and properties receive values of type T:

interface Arr<T> {
  int length { get; set; }
  void add (T t);
  T this[int index] { get; set; }
}

contravariance

Contravariance is a complement to covariance. You can mark an interface's type parameter T as contravariant using the in keyword, which means that the interface only receives (never produces) values of type T. (It may receive or produce values of other types as well.)

Covariance and contravariance work in opposite directions. Suppose that Foo<T> is a generic interface, and that B is a subtype (e.g. a subclass) of A. If Foo is covariant in T, then a Foo<B> is also a subtype of Foo<A>, so we can implicitly convert a Foo<B> to a Foo<A>. If Foo is contravariant in T, then this goes the other way: Foo<A> is a subtype of Foo<B>!

That might seem pretty abstract, so let's look at an example of a contravariant interface. Here is an interface type for a dictionary that maps values of type K to type V. The interface is contravariant in the type parameter K:

interface Dictionary<in K, V> {
    V get(K key);
    void set(K key, V val);
}

Continuing the example from above, suppose that I have a Dictionary<Shape, Color> that maps a bunch of Shape objects to their color. Because Dictionary is contravariant in K, I can pass this Dictionary to a method squareColors() that expects a Dictionary<Square, Color>. That is safe, because squareColors() will only pass Square objects as keys, which will work in this Dictionary. The type conversion is allowed because the interface is contravariant and Square is a subtype of Shape. Again, notice that this is the opposite direction from a covariance conversion.

By the way, the interface IDictionary<K, V> in the standard C# class library is not actually contravariant in K. That's because it has methods such as Keys() that allow the caller to receive values values of type K from a IDictionary.

array covariance

Surprisingly, arrays in C# are covariant. For example, the following code will compile:

Rectangle[] a = new Rectangle[5];
Shape[] b = a;   // Rectangle[] is convertible to Shape[]
b[0] = new Circle(4.0);

But the last statement above will fail at run time since b is actually a Rectangle[] and a Circle cannot be added to a Rectangle[].

I (and many other people) believe that this array covariance is a design flaw in the C# language. It exists for historical reasons (largely because Java arrays are also covariant, and the early design of C# imitated Java). Array covariance has the following negative consequences:

Every assignment to an array of a reference type (such as a Shape[] above) must check at run time that the value being assigned is compatible with the destination array. This may have a significant performance cost.
Code that assigns an incompatible element to an array may fail at run time rather than at compile time. In other words, array covariance is not type safe: it allows code to compile that may yield a run-time type error.

iterators

C# supports a special type of method called iterators. An iterator produces a sequence of values by returning them one at a time. If that sounds familiar, that's because we have seen the same concept before, namely in Python, where these are called generators!

Here is an iterator method in C# that produces a sequence of squares of integers:

static IEnumerable<int> squares(int start, int end) {
for (int i = start ; i <= end ; ++i)
    yield return i * i;
}

Notice the yield return statement that returns a single value in the sequence. (In Python this statement was called yield.)

The method returns an IEnumerable<int>, representing a sequence of integers. We can iterate over this sequence with foreach:

static void Main() {
     foreach (int i in squares(1, 5))
         WriteLine(i);   // writes 1, 4, 9, 16, 25
}

The sequence of values returned by an iterator is lazy: they are computed only on demand. Each time the caller requests the next value in the sequence, an iterator's code runs until it calls the yield return statement, which yields the next value in the sequence. At that point the iterator's execution is suspended until the caller requests the next value, at which point its code continues executing until the next yield return statement, and so on. When execution reaches the end of the iterator method body, the sequence is complete.

An iterator function should have return type IEnumerable<T> for some (concrete or generic) type T.

We can use iterators to construct sequences of non-numeric values as well. For example, this iterator yields a sequence of strings representing all lines in a file:

  static IEnumerable<string> lines(string filename) {
      using (StreamReader r = new StreamReader(filename)) {
          while (r.ReadLine() is string s)
              yield return s;
      }
  }

Note that an iterator may even return an infinite sequence of values! Here is an iterator that returns the infinite Fibonacci sequence:

static IEnumerable<int> fibs() {
   int a = 1, b = 1;
  
   while (true) {
       yield return a;
       (a, b) = (b, a + b);
   }
}

Let's add up the first 10 Fibonacci numbers:

static void Main() {
    int count = 0, sum = 0;
    foreach (int k in fibs()) {
        sum += k;
        count += 1;
        if (count >= 10)
            break;
    }
    WriteLine(sum);
}

Making classes enumerable

Using iterators, we can easily make any class enumerable, so that the foreach statement will work on instances of the class. All we have to do is implement a method with the special name GetEnumerator(). The method should return a sequence of type IEnumerator<T>, and the easiest way to write it is using an iterator. (Note that IEnumerator<T> is not the same interface as IEnumerable<T>, which we used in the previous section. These interfaces have slightly different purposes, which we will not discuss in this class; if you want to know more, read the C# library documentation.)

For example, here is an enumerable linked list class:

class LinkedList<T> {
    class Node {
        public T val;
        public Node next;
        
        public Node(T val, Node next) {
            this.val = val; this.next = next;
        }
    }

    Node head;
    
    void prepend(T val) {
        head = new Node(val, head);
    }
    
    public IEnumerator<T> GetEnumerator() {
        for (Node n = head ; n != null ; n = n.next)
            yield return n.val;
    }
}