Week 5: Notes

inheritance

C# supports inheritance, a mechanism that allows a class to extend another class and to change its behavior. We have already seen inheritance in Programming 1, where we discussed inheritance between classes in Python. However we will emphasize inheritance more in Programming 2. Inheritance plays a major role in the C# collection class library and in other popular class libraries such as graphical interface toolkits. Furthermore, in Programming 2 we'll start to write larger programs in which we may have our own uses for inheritance.

Consider a simple C# class that implements a fixed-size stack of values of type int:

class IntStack {
    int[] a;
    int n;

    public IntStack(int limit) {
        a = new int[limit];
    }

    public int count => n;

    public void push(int i) {
        a[n++] = i;
    }

    public int pop() => a[--n];

    public int this[int i] => a[n - 1 - i];

    public static bool operator ! (IntStack s) =>
        s.count == 0;
}

Notice that our class contains

We'd now like to write a class AvgStack that is like IntStack, but has an additional property avg that can report the average value of the numbers that are currently on the stack. We'd like avg to run in O(1). To achieve that, AvgStack will keep track of the current total of all numbers on the stack.

We can write AvgStack using inheritance. Here's an implementation:

class AvgStack : IntStack {
    int sum;

    public AvgStack(int limit) : base(limit) { }

    public override void push(int i) {
        base.push(i);
        sum += i;
    }

    public override int pop() {
        int i = base.pop();
        sum -= i;
        return i;
    }

    public double avg => (double) sum / n;
}

The notation class AvgStack : IntStack means that the class AvgStack inherits from IntStack. In this situation, we say that IntStack is the superclass (or parent class or base class), and AvgStack is a subclass (or derived class).

AvgStack has a public constructor that takes a single argument limit, just like the parent class. The notation : base(limit) means that the AvgStack constructor chains to the base class constructor – in other words, it calls the base class constructor, passing it the same value of limit that it itself received. (We previously saw similar syntax for chaining to a different constructor in the same class. When chaining to a constructor in the same class, we write : this(); when chaining to a constructor in the base class, we write : base().)

The AvgStack constructor does not need to initialize the sum field to 0, since 0 is the default value for fields of type int.

AvgStack overrides the push and pop methods from its parent class. That means that AvgStack provides its own implementation of these methods. In the push() method, AvgStack calls base.push(d) to call the same-named method in the base class. It then runs sum += i to update the running total. pop() is similar.

Note that if you attempt to build the IntStack and AvgStack classes as written above, the code will not compile! We need to make two changes to the base class IntStack:

1. We must modify the push() and pop() methods to be virtual:

    public virtual void push(int i) {
        a[n++] = i;
    }

    public virtual int pop() => a[--n];

In C#, a subclass may only override base class methods if they are virtual.

2. We must modify the field n in the base class to be protected:

    protected int n;    // in IntStack class

That is because the avg property in AvgStack accesses n:

    public double avg => (double) sum / n;   // in AvgStack class

In C#, the default access level for any member (e.g. field, method) is private. So if we use the simple declaration "int count;", then count will be private, which means that it will accessible only within its own class. A protected member, by constract, is accessible within its own class and within any subclasses. Finally, as we have seen before, a member marked as public is accessible from anywhere.

using base and derived types

We can create instances of each of the IntStack and AvgStack classes as follows:

IntStack s = new IntStack(100);   // create an IntStack
AvgStack t = new AvgStack(100);   // create an AvgStack

How about the following – will this work?

AvgStack t = new IntStack(100);    // does not compile

We are creating a IntStack, and attempting to store it in a variable of type AvgStack. The code will not compile. If t has type AvgStack, then it cannot hold this object because an IntStack is not an AvgStack. In general, an instance of a superclass (e.g. IntStack) is not an instance of a subclass (e.g. AvgStack) because it doesn't have the extended capabilities that the subclass provides.

On the other hand, consider this:

IntStack s = new AvgStack(100);   // works OK

This code will compile: a variable of type IntStack can hold an AvgStack, because an AvgStack is a kind of IntStack. In general, an instance of a subclass counts as an instance of its superclass (or of any ancestor class).

What happens if we invoke a method through a variable whose type is the superclass? In other words, how will the following code behave?

IntStack s = new AvgStack(100);
for (int i = 0 ; i < 10 ; ++i)
    s.push(i);   // which push() will this call?

Which implementation of push() will be invoked by the code above – the implementation in IntStack, or in the subclass AvgStack?

The code will call the implementation in AvgStack. Even though the object is held in a variable of type IntStack, it is actually an AvgStack. A call to a virtual method is resolved based on an object's actual class – the type of the variable holding the object is irrelevant.

The assigment IntStack s = new AvgStack(100) works because the type AvgStack is implicitly convertible to IntStack. Because of this phenomenon, we may write a method that works on IntStack objects and may pass it instances of the AvgStack subclass. For example:

static void empty(IntStack s) {
    while (!s.isEmpty)
        Console.WriteLine(s.pop());
}

We may pass an AvgStack to this method:

AvgStack s = new AvgStack(100);
for (int i = 0 ; i < 10 ; ++i)
    s.push(i);
empty(s);

We have seen that C# will implicitly convert from a subclass to a superclass type. In the other direction, C# provides an explicit conversion that we can access via a type cast. For example:

IntStack s = new AvgStack(100);
AvgStack t = (AvgStack) s;    // explicit conversion

We can use an explicit conversion to a subclass type when we are sure that a variable holds a object of that subclass. On the other hand, if the object does not actually belong to the subclass then the program will exit with an exception:

IntStack s = new IntStack(100);
AvgStack t = (AvgStack) s;  // compiles, but will throw an exception when run

is

The is operator returns true if a value is not null and belongs to a given type. It works with both reference types and nullable types:

IntStack s = getStack();
if (s is AvgStack) {
    AvgStack a = (AvgStack) s;
    WriteLine($"average = {a.avg}");
}

It's also possible to declare a local variable inside an is expression. The lines above may be rewritten as follows:

IntStack s = getStack();
if (s is AvgStack a)
    WriteLine($"average = {a.avg}");

Note that is works even with value types such as int:

int? i = abc();
if (i is int)  // true if i != null
    WriteLine(i);

The is operator can be convenient when you want to read all lines of standard input. The Console.ReadLine() method returns null on end of input, so you can use is in a while loop to capture all input lines. For example, the following loop will read all lines from standard input and echo them back out to standard output:

while (ReadLine() is string line)
    WriteLine(line);

as

The as operator checks whether a value belongs to a type. If so, it returns the value; otherwise it returns null:

IntStack s = getStack();
AvgStack ls = s as AvgStack;  // if s was not an AvgStack, ls will be null

abstract classes and methods

In Introduction to Algorithms we learned how to store an arithmetic expression using an expression tree. Here is a set of classes we might use to store expression trees in C#:

class Expr {
}

class OpExpr : Expr {
    char op;
    Expr left, right;

    public OpExpr(char op, Expr left, Expr right) {
        this.op = op; this.left = left; this.right = right;
    }
}

class IntExpr : Expr {
    int i;

    public IntExpr(int i) {
        this.i = i;
    }
}

An interior node of the tree is an OpExpr, which contains an operator (such as '+' or '-') plus pointers to children. Each child may be either another interior node, or a leaf node stored in the IntExpr class.

For example, the expression 2 + (3 * 4) could be stored as follows:

    // 2 + (3 * 4)
    Expr e = new OpExpr('+',
        new IntExpr(2),
        new OpExpr('*', new IntExpr(3), new IntExpr(4)));

(Of course, really it would be nice to write a parser function that can parse any arithmetic expression and generate a tree for it.)

Now suppose that we'd like to add an eval() method that will evaluate an expression. For example, e.eval() should return 14. We will have implementations of eval() in the OpExpr and IntExpr classes. Because these are implementations of the same method, it will need to be defined in the base class Expr as well. In theory we could define eval() there with a dummy implementation that will never be called:

    public virtual int eval() => 0;

However there is a better alternative. We can define the base class Expr as abstract, and define eval() as an abstract method in that class. An abstract class can never be instantiated directly – it exists only so that subclasses can extend it. Similarly, an abstract method can exist only in an abstract class, and has no implementation – it exists only so that subclasses can implement it.

Here is the complete implementation:

abstract class Expr {
    public abstract int eval();
}

class OpExpr : Expr {
    char op;
    Expr left, right;

    public OpExpr(char op, Expr left, Expr right) {
        this.op = op; this.left = left; this.right = right;
    }

    public override int eval() {
        int l = left.eval();
        int r = right.eval();
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            // ... other operator here ...
            default: throw new Exception("unknown operator");
        }
    }
}

class IntExpr : Expr {
    int i;

    public IntExpr(int i) {
        this.i = i;
    }

    public override int eval() => i;
}

Note that an abstract class may even contain fields and non-abstract methods, though we don't have any such members in this simple example.

Object class

All C# class automatically inherit from a top-level class called System.Object which contains several methods that are shared by all objects. The short name object is a synonym for System.Object.

One method in System.Object is ToString():

public virtual string ToString ();

C# automatically calls this method when it needs to convert any object to a string, for example when writing an object to the console:

Window w = new Window();

Console.WriteLine(w);   // will automatically call w.ToString()

The default implementation of ToString simply returns the name of the class, so the above code will write

Window

You can override ToString() in your own classes if you'd like to change their printed representation. (ToString() is like the __repr magic method in Python.)

System.Object also contains a couple of other methods (Equals, GetHashCode) that we will consider a bit later.

interfaces

In Introduction to Algorithms we learned about various abstract data types, such as stacks and queues. For each abstract data type, we described the methods that it provides. For example, a stack has methods push(), pop() and isEmpty(). We saw that we can implement each abstract data type in various ways. For example, we can implement a stack either using a linked list, or an array. Different implementations of an abstract data type will have equivalent functionality, though they might have different performance characteristics.

C# interfaces formalize this concept. An interface defines a set of methods or other members in an abstract data type. For example, here is an interface for a stack of integers:

interface IStack {
    bool isEmpty();
    void push(int i);
    int pop();
}

There is no implementation yet - just a set of methods to be implemented. We can write a concrete class LinkedStack that implements a stack using a linked list. It might begin like this:

class LinkedStack : IStack {
    Node? head;

    public bool isEmpty() => head == null;
  
    public void push(int i) {
        head = new Node(i, head);
    }
  
    public int pop() { … }
}

Above, "class LinkedStack : IStack" means that we are defining a class LinkedStack that implements the IStack interface. The compiler will check that our class provides an implementation of every method that was declared in the interface. If any implementations are missing, we will get a compiler error.

We might then write a second class ArrayStack that implements a stack using a fixed-size array. It could begin like this:

class ArrayStack : IStack {
    int[] a;
    int num;  // number of elements currently on the stack

    public ArrayStack(int limit) {
        a = new int[limit];
    }

    public bool isEmpty() { return num == 0; }

    
}
  

We now have two different concrete implementations of an IStack. We might create two IStack objects as follows:

IStack s = new LinkedStack();
IStack t = new ArrayStack(100);

Notice that a variable can have an interface type, such as the variables s and t above. A variable with an interface type can hold any object that implements that interface.

Method parameters can also have an interface type. For example, here is a method that will empty a Stack, printing all values as it does so:

static void empty(IStack s) {
    while (!s.isEmpty()) {
        int i = s.pop();
        WriteLine(i);
    }
}

We may pass either a LinkedStack or an ArrayStack to this method. For example:

ArrayStack as = new ArrayStack(100);
for (int i = 0 ; i < 5 ; ++i)
    as.push(i);
empty(as);

Notice that an interface may not contain a constructor. Any class that implements an interface may have its own constructor(s), but those are not part of the interface.

An interface may, however, contain properties and indexers. For example, let's modify the IStack interface so that isEmpty is a read-only property, not a method:

interface IStack {
    bool isEmpty { get; }
    void push(int i);
    int pop();
}

All members in an interface are implicitly public, so you should not write the keyword public in an interface declaration. However, when you write a class that implements an interface, all implementations of the interface's members must be declared public (as you can see in the examples above).

Note that a class may implement more than one interface. For example, let's write an interface ICountable for any type that has a current element count:

interface ICountable {
    int count { get; }
}

Now we can modify our ArrayStack class so that it implements both IStack and ICountable:

class ArrayStack : IStack, ICountable {
    int[] a;
    int num;  // number of elements currently on the stack

    public ArrayStack(int limit) {
        a = new int[limit];
    }

    public bool isEmpty => num == 0;  // implementation of property from Stack

    public int count => num;  // implementation of property from Countable

    

}

interface inheritance

Up to now we have only discussed inheritance between classes. C# also allows an interface to inherit from another interface. An inheriting interface has all the members of its parent interface, and can add additional members of its own. Here is a somewhat abstract example:

interface IShape {
    double perimeter();
    double area();
}

interface IPolygon : IShape {
    int num_vertices { get; }
}

class Circle : IShape {  }

class Rectangle : IPolygon {  }

class Triangle : IPolygon {  }

In this example, an IShape represents a shape in two dimensions and a Polygon is an IShape that has a finite number of vertices. Because the Triangle class implements Polygon, it must provide an implementation of num_vertices as well as of the perimeter() and area() methods.

single and multiple inheritance

C# provides

This particular combination was popularized by Java. Many object-oriented languages (including Python) have a more general mechanism: they have multiple class inheritance, so that one class may have several immediate superclasses. That is more powerful, but creates various complexities that C# avoids through its simpler model.

You have have noticed that an interface is somewhat similar to an abstract class, since it defines functionality that must be implemented by another class. Here are some important differences:

generic classes

We have now learned enough C# that we can implement data structures such as stacks, queues, and binary trees. However, in the subset of C# that we know so far, each of our implementations will be tied to some specific type. For example, we can write a stack class that holds integers:

class IntStack {
    int[] a;
    int num;  // number of elements currently on the stack

    public IntStack(int maxSize) {
        a = new int[maxSize];
    }
}

Similarly, we can write a stack that holds strings:

class StringStack {
    string[] a;
    int num;  // number of elements currently on the stack

    public StringStack(int maxSize) {
        a = new string[maxSize];
    }
}

But this is a significant limitation - we certainly don't want to have to rewrite our data structure classes once for every possible element type!

In Python last semester, we had no such issue because Python is dynamically typed. When we wrote a stack class in Python, we could add anything at all to it:

# Python code
s = new Stack()

s.push(14)
s.push('hello')
s.push([1, 2, 3])

C# is different: it's a statically typed language, meaning that every variable has a type. And so C# provides a different mechanism called generic classes. Generic classes let us write a container class only once and reuse its implementation for every possible element type.

A generic class has one or more type parameters. For example, we can implement a class called FixedStack<T>. This means "a fixed-size stack of objects of type T". Our class might look like this:

class FixedStack<T> {
    T[] a;
    int num;  // number of elements currently on the stack

    public FixedStack(int maxSize) {
        a = new T[maxSize];
    }

    public void push(T val) {
        a[num] = val;
        num += 1;
    }

    public T pop() {
        num -= 1;
        T ret = a[num];
        return ret;
    }
}

In the class above, T is a type parameter. When the caller creates a FixedStack, they will specify the type T:

FixedStack<int> s = new FixedStack<int>(100);  // create a stack of ints
s.push(15);
s.push(25);

FixedStack<string> t = new FixedStack<string>(100);  // create a stack of strings
t.push("hello");
t.push("goodbye");

Notice that inside the class FixedStack<T>, the type T can be used like any other type: we can create an array of objects of type T, or use T as a method parameter type, a method return type or a variable type.

Observe that a FixedStack is not quite like our stack implementation in Python, where a single stack could hold objects of various types. In C#, by contrast, each FixedStack is tied to some particular element type. That is a good thing: if we create a FixedStack<int>, intending to hold ints, and then attempt to push a string to it by mistake, we will get a compile-time error.

By the way, the word T above is arbitrary: a type parameter can have any name you like. It is traditional, however, in C# (and other languages with generics) to use "T" for classes that take a single type parameter.

multiple type parameters

A single class may take multiple type parameters. For example, suppose that we want to write a Dictionary class that maps keys to values using a hash table. It might look like this:

class Dictionary<K, V> {
    

    public void add(K key, V val) {  }

    public bool contains(K key) {  }
}

Here, K and V are two type parameters. When the caller creates a Dictionary, they will specify types for K and V:

Dictionary<int, string> d = new Dictionary<int, string>();   // maps int → string
d.add(10, "sky");
d.add(20, "purple");

Dictionary<string, double> e = new Dictionary<string, double>();  // maps string → double
e.add("purple", 55.2);

Let's consider how we could implement such a class. As we learned in Intro to Algorithms, a hash table contains an array of hash chains, each of which consists of a linked list of nodes. So we will need a Node class that is itself generic:

class Node<K, V> {
    public K key;
    public V val;
    public Node<K, V>? next;

    public Node(K key, V val) {
        this.key = key; this.val = val;
    }
}

Notice that inside a generic class such as Node, we must specify the type parameters <K, V> when we refer to the class itself (e.g in the field declaration Node<K, V> next), but the constructor declaration does not include the type parameters.

Now, inside the Dictionary class we could have a field that holds an array of hash chains of appropriate type:

class Dictionary<K, V> {
    Node<K, V>? a[];
}

generic interfaces

We discussed generic classes above. An interface may also be generic. For example, let's revisit the IStack interface we wrote before, which described a stack of ints:

interface IStack {
    bool isEmpty { get; }
    void push(int i);
    int pop();
}

We will now modify it to be a generic interface IStack<T>, which is a stack of elements of type T:

interface IStack<T> {
    bool isEmpty { get; }
    void push(T i);
    T pop();
}

Above, we wrote a class FixedStack<T>, implementing a fixed-size stack of elements of type T. We can now modify that class so that it implements the IStack<T> interface:

class FixedStack<T> : IStack<T> {
    T[] a;
    int num;  // number of elements currently on the stack

}

Once again, a variable may have an interface type:

IStack<double> s = new FixedStack<double>(100);
s.push(2.0);
s.push(4.0);

IComparable

Many types are naturally ordered: for any two values, we can say whether one is greater or less than the other. For example, we can compare integers or doubles in the obvious fashion, or can compare strings alphabetically. For other types, an ordering may not be so obvious: for example, does it make sense to say that one stack is greater or less than another? Perhaps not.

The C# standard library contains a generic interface IComparable<T> that is implemented by all ordered built-in types. IComparable<T> means "comparable with objects of type T". For example, the built-in type int implements IComparable<int>, since integers are comparable with integers. Similarly, string implements IComparable<string>.

IComparable<T> is defined like this:

interface IComparable<T> {

    int CompareTo (T other);

}

The CompareTo method returns

For example:

WriteLine(4.CompareTo(7));  // writes -1, since 4 < 7

IComparable<T> is useful because it enables us to write generic classes that compare their elements, as we will see in the next section.

generic class constraints

The generic class FixedStack<T> that we wrote above treats its elements as opaque values: it stores them, but does not know anything about their behavior.

Now suppose that we want to write a class MaxStack<T> which holds a stack of values of type T, and also provides a method T max() that returns the largest value currently on the stack. You might think that we could implement the class like this:

class MaxStack<T> {
    T[] a;
    int num;

    

    T max() {
        T v = a[0];

        // Scan the stack, looking for the largest element.
        for (int i = 1 ; i < num ; ++i)
            if (a[i] > v)  // we found a larger element
                v = a[i];  // remember it
        return v;
    }
}

This code will not compile. The problem is that we can't use the > operator to compare two values of type T, because T might be some type that is not ordered – for example, T might be an array type, and arrays cannot be compared in C#.

So we will need to modify MaxStack so that it's only possible to create a MaxStack<T> if T is an ordered type. C# provides constraints for this purpose. As we saw in the previous section, any ordered type implements the IComparable<T> interface. So we can add a constraint to MaxStack<T> that says that T must implement IComparable<T>:

class MaxStack<T> where T : IComparable<T> {

}

And now in the class we can implement the max() method like this:

  T max() {
      T v = a[0];

      // Scan the stack, looking for the largest element.
      for (int i = 1 ; i < num ; ++i)
          if (a[i].CompareTo(v) > 0)  // we found a larger element
              v = a[i];  // remember it
    return v;
  }

Notice that even if T is constrained to implement IComparable<T>, we still cannot use the < operator to compare two elements of type T – instead, we must call CompareTo().

In fact, in C# you may not ever use comparison operators to compare values of generic types, even the == operator! However, if T is known to implement IComparable<T> then you can test whether two values of type T are identical by calling CompareTo() and checking for the return value 0. It would be nice if C# allowed comparison operators in this context (and some other languages do), but it does not.

The C# collection classes

LIke many object-oriented languages, C# contains a set of collection classes in its standard library. These classes implement common data structures such as stacks, queues and hash tables. They are generic and make use of inheritance.

Here is a picture of the class hierarchy of some of the major collection classes and interfaces:

%3

All of these classes and interfaces live in the namespace System.Collections.Generic.

In the picture, an arrow from A to B means that B implements or inherits from A.

All entities above whose names begin with the letter I (e..g ICollection, IDictionary) are interfaces. This is a standard naming convention in the C# standard library.

The picture above includes the following classes:

You can read more details about these classes and interfaces in our Quick Library Reference. Note that you will need to follow the class hierarchy in order to find all the members available in each of these classes. For example, the methods available in List<T> include not only the ones listed under List<T> in the library reference, but also those listed under IList<T> and ICollection<T>, since List<T> implements those interfaces.

You can read even more details about the collection classes (including various classes not depicted above) in the official .NET API documentation.

Notice that all collection classes ultimately inherit from an interface IEnumerable<T>. This interface contains various methods that allow the foreach statement to iterate over a collection's values. We will not discuss these methods at this point, but you should certainly be aware that you can use foreach to iterate over any collection.

You will want to become fluent in using these collection classes. With them, you finally have easy access to the same useful data structures that we often used in Python, i.e. lists, sets and dictionaries.

Dictionaries in particular are quite useful, so let's look at the Dictionary<K, V> class a bit more. As you can see in the documentation, Dictionary<K, V> implements the interface IDictionary<K, V>, which in turn implements the interface ICollection<KeyValuePair<K, V>>. This means that a dictionary is a collection of key-value pairs. You can use foreach to iterate over a dictionary, and each element you get back will be a KeyValuePair<K, V>. Each KeyValuePair in turn has properties Key and Value. For example:

    Dictionary<string, int> d = new Dictionary<string, int>();
    d["yellow"] = 10;
    d["red"] = 15;
    d["blue"] = 20;
    
    foreach (KeyValuePair<string, int> p in d)
        WriteLine($"key = {p.Key}, value = {p.Value}");