Week 6: Notes

The 'var' keyword

When you declare a local variable, you can use the var keyword to tell the compiler to infer its type. For example:

var s = "distant hill";

This is equivalent to

string s = "distant hill";

var does not allow a variable to hold values of any type at all! C# is statically typed, so any assignment to a variable declared with var must match the type that the compiler inferred. For example:

var s = "distant hill";
s = 14;    // error: cannot implicitly convert 'int' to 'string'

I personally don't use var much. I think code is generally easier to read when variable types are explicit.

exceptions

Like Python and most other modern languages, C# has exceptions. Code may throw an exception to indicate that an exceptional situation has occurred, typically some sort of error. In an addition, the C# language itself will throw an exception in some situations, such as if code attempts to access an array element that's out of bounds, or attempts to access a property of null.

When an exception is thrown, it will pass up the call stack, aborting the execution of any methods in progress until some block of code catches the exception and continues execution. If the exception is not caught, the program will print an error message and terminate.

An exception in C# is an object, specifically any object belonging to the System.Exception class or any of its subclasses. A large number of exception classes are built into the standard library including IndexOutOfRangeException, NullReferenceException, FormatException, and others.

throwing an exception

The throw statement throws an exception, either of a built-in or user-defined exception class. (This is like raise in Python.) For example:

class OpException : Exception {
    char op;

    public OpException(char op) {
        this.op = op;
    }
}

int compute(char op, int a, int b) =>
    op switch {
        '+' => a + b,
        '-' => a - b,
        '*' => a * b,
        _ => throw new OpException(op)
    };

catching exceptions

The try statement attempts to execute a block of code. It may have one or more catch clauses, each of which will execute if a certain type of exception is thrown. For example:

static void Main() {
    StreamReader reader;
    try {
        reader = new StreamReader("numbers");
    } catch (FileNotFoundException e) {
        WriteLine("can't find input file: " + e.FileName);
        return;
    } catch (DirectoryNotFoundException) {
        WriteLine("invalid path");
        return;
    }
    …

When an exception is caught, the catch block (called an exception handler) executes. As you can see in the code above, a catch block may optionally specify a variable to receive the exception object.

A catch block may rethrow the exception it caught, or even a different exception. If a catch block does not throw an exception, execution resumes below the try statement.

generic method constraints

In the examples we saw above, a generic method or class took a type parameter T that could be any type at all. Sometimes we will want to require that the type T has certain capabilities, for example by requiring that it implements a certain interface. We can accomplish that via a generic constraint.

For example, consider a function that computes the sum of all integers in an array:

int sum(int[] a) {
    int s = 0;

    foreach (int i in a)
        s += i;
    return s;
}

We could write a similar function that computes the sum of an array of doubles, but it would be nicer to have a single function that we can use for int, double, or any other numeric type. We can't write the function using an unconstrained generic:

T sum<T>(T[] a) {
    T s = 0;            // ERROR: Cannot implicitly convert type 'int' to 'T'

    foreach (T i in a)
        s += i;         // ERROR: Operator '+=' cannot be applied to operands of type 'T'
    return s;
}

The type T must have a zero element and must support addition. In the library there's a generic interface INumber<T> for numeric types that support these and other numeric operations. The standard types int, float, and double implement INumber<T>. So we may add a constraint saying that T must implement that interface:

using System.Numerics;

T sum<T>(T[] a) where T : INumber<T> {
    T s = T.Zero;

    foreach (T i in a)
        s += i;
    return s;
}

Now the code will work fine. Note that we must write T.Zero to retrieve the zero element of type T. (Actually Zero is defined in INumberBase<T>, a parent interface of INumber<T>.)

generic class constraints

Suppose that we want to write a class TreeSet<T> that holds values of type T in a binary tree. As a first attempt, we might write

class Node<T> {
    public T val;
    public Node<T>? left, right;

    public Node(T val) { this.val = val; }
}

class TreeSet<T> {
    Node<T>? root;

    public bool contains(T x) {
        Node<T>? p = root;
        while (p != null) {
            if (x == p.val)      // ERROR: Operator '==' cannot be applied to operands of type 'T'
                return true;
            else if (x < p.val)  // ERROR: Operator '<' cannot be applied to operands of type 'T'
                p = p.left;
            else
                p = p.right;
        }
                
        return false;
    }

    // more methods here: insert(), delete(), ...
}

This code will not compile. The problem is that we can't use the > operator to compare two values of type T, because T might be some type that is not ordered – for example, T might be an array type, and arrays cannot be compared in C#.

The C# standard library contains a generic interface IComparable<T> that is implemented by all ordered built-in types. IComparable<T> means "comparable with objects of type T". For example, the built-in type int implements IComparable<int>, since integers are comparable with integers. IComparable<T> is defined like this:

interface IComparable<T> {

    int CompareTo (T other);

}

The CompareTo() method returns

a negative number if this object is less than other
0 if this object equals other
a positive number if this object is greater than other

For example:

WriteLine(4.CompareTo(7));  // writes -1, since 4 < 7

Let's add a constraint to TreeSet<T> that says that T must implement IComparable<T>.The code will now look like this:

class Node<T> {
    public T val;
    public Node<T>? left, right;

    public Node(T val) { this.val = val; }
}

class TreeSet<T> where T: IComparable<T> {
    Node<T>? root;

    public bool contains(T x) {
        Node<T>? p = root;
        while (p != null) {
            int c = x.CompareTo(p.val);
            if (c == 0)
                return true;
            else if (c < 0)
                p = p.left;
            else
                p = p.right;
        }
                
        return false;
    }

    // more methods here: insert(), delete(), ...
}

Notice that even if T is constrained to implement IComparable<T>, we still cannot use the < or == operators to compare two elements of type T – instead, we must call CompareTo(). This is inconvenient, and is arguably a weakness in C#. Above, we saw that an interface can provide operators: any class implementing the INumeric<T> interface must include + and other numeric operators. This is a relatively new feature (it first appeared in C# 11, released in 2022). Unfortunately IComparable<T> does not include operators such as < and == (probably for reasons of backward compatibility).

nested classes

In the TreeSet<T> example in the previous section, it was a bit inconvenient that we had to make the Node class generic and write Node<T> everywhere we wanted to use it. As an alternative, we can make Node be a nested class inside TreeSet<T>. Then it won't need to be declared as a generic class Node<T>, but will still be able to use the type variable T declared by its containing class TreeSet<T>. The code will look like this:

class TreeSet<T> where T: IComparable<T> {
    class Node {
        public T val;
        public Node? left, right;

        public Node(T val) { this.val = val; }
    }

    Node? root;

    public bool contains(T x) {
        Node? p = root;
        while (p != null) {
            int c = x.CompareTo(p.val);
            if (c == 0)
                return true;
            else if (c < 0)
                p = p.left;
            else
                p = p.right;
        }
                
        return false;
    }

    // more methods here: insert(), delete(), ...
}

In my opinion this is nicer. Node is just a helper class for TreeSet<T> anyway, so it makes sense for it to be nested.

As another example, suppose that we want to write a Dictionary class that maps keys to values using a hash table. It might look like this:

class Dictionary<K, V> where K : IComparable<K> {
    …

    public void add(K key, V val) { … }

    public bool contains(K key) { … }
}

Here, K and V are two type parameters. When the caller creates a Dictionary, they will specify types for K and V:

Dictionary<int, string> d = new();   // maps int → string
d.add(10, "sky");
d.add(20, "purple");

Dictionary<string, double> e = new();  // maps string → double
e.add("purple", 55.2);

As we learned in Introduction to Algorithms, a hash table contains an array of hash chains, each containing a linked list of nodes. So we will need a Node class that holds a key of type K and a value of type V. We could declare it as a generic class Node<K, V>, but it will be easier to nest it inside the class Dictionary<K, V>, and then it will be able to use the types K and V directly. Our code might look like this:

class Dictionary<K, V> where K : IComparable<K> {
    class Node {
        public K key;
        public V val;
        public Node? next;

        public Node(K key, V val) {
            this.key = key; this.val = val;
        }
    }

    Node?[] a;  // array of hash chains
    
    ...
}

In fact we've already seen that the standard library contains a class Dictionary<K, V> that works similarly.

delegates

A delegate is a value that represents a function or method. It's similar to to a function object in Python, or a function pointer in languages such as C.

The delegate keyword declares a new delegate type. For example:

delegate bool IntCondition(int i);

With this declaration, an IntCondition is a type of delegate that takes an integer argument and returns a boolean.

We can now declare a variable of type IntCondition, and use it to refer to a function or method of corresponding type:

bool isOdd(int i) => i % 2 == 1;

IntCondition c = isOdd;
…

We can invoke the delegate using function call syntax:

WriteLine(c(4));    //  writes False

In the example above, the delegate c refers to a static method odd(). A delegate may also refer to an instance method, in which case it actually references a particular object on which the method will be invoked. For example:

class Interval {
    public int low, high;
    public Interval(int low, int high) { this.low = low; this.high = high; }

    public bool contains(int i) {
        return low <= i && i <= high;
    }
}

IntCondition c = new Interval(1, 5).contains;
IntCondition d = new Interval(3, 7).contains;
WriteLine(c(2));  // writes True
WriteLine(d(2));  // writes False

Here is a function that counts how many elements in an array of integers satisfy an arbitrary condition:

int count(int[] a, IntCondition c) {
    int n = 0;
    foreach (int i in a)
        if (c(i))
            ++n;
    return n;
}

We can invoke this function as follows:

bool isEven(int i) => i % 2 == 0;
  
int[] a = { 3, 4, 5, 6, 7 };
WriteLine(count(a, isEven));  // writes 2

Delegates may be generic:

delegate bool Condition<T>(T t);  // maps type T to bool

Here is the count() function from above, rewritten to work on an array of any type T. Notice that the function itself must also be generic (indicated by the "<T>" after the method name).

int count<T>(T[] a, Condition<T> c) {
    int n = 0;
    foreach (T x in a)
        if (c(x))
            ++n;
    return n;
}

The standard library contains several useful generic delegate types. The built-in type Predicate<T> is exactly equivalent to the type Condition<T> that we just defined:

delegate bool Predicate<T>(T obj);

Additionally, the built-in type Func<T, U> represents an arbitrary function from type T to type U:

delegate U Func<T, U>(T arg);

lambda expressions

A lambda expression is an anonymous function that can appear inside another expression. (It's similar to an lambda expression in Python, which we saw last semester).

For example, here's a generic function map() that applies a Func<T, U> to every element of an array of type T[], returning a new array of type U[]:

U[] map<T, U>(T[] a, Func<T, U> f) {
    U[] b = new U[a.Length];
    for (int i = 0; i < a.Length ; ++i)
        b[i] = f(a[i]);
    return b;
}

We can define a function and pass it to map():

int plus2(int i) => i + 2;

int[] a = { 100, 200, 300 };
int[] b = map(a, plus2);

Alternatively, we can invoke map() using a lambda expression:

int[] b = map(a, i => i + 2);

Here, i => i + 2 is a lambda expression. It's an anonymous function that takes an integer parameter i and returns the value i + 2.

A lambda expression may refer to parameters or local variables in its containing function or method. For example, suppose we want to write a method that adds a given value k to each element in an array. We could write a local method and pass it to map():

int[] add_k(int[] a, int k) {
    int f(int i) {
        return i + k;
    }

    return map(a, f);
}

Or we can use a lambda expression that adds k directly:

int[] add_k(int[] a, int k) {
    return map(a, i => i + k);
}

Linq methods

The System.Linq namespace contains many methods that operate on sequences, i.e. objects that implement IEnumerable<T>. These are extension methods, meaning that they are implemented in the System.Linq namespace rather than in IEnumerable<T> itself. However for the purpose of calling these methods that makes no difference, and you can invoke them directly on any enumerable object.

Useful methods include Select(), which maps a function over a sequence, and Where(), which filters a sequence, keeping only the elements for which a given condition is true. Most of these methods return a new sequence, i.e. another IEnumerable<T>. You can call the ToArray() or ToList() methods to gather a sequence's elements into an array or list.

For example, suppose that we want to keep only the odd values in an array, and add 10 to every remaining value. We might write

int[] a = { 15, 16, 17, 18, 19 };
int[] b = a.Where(i => i % 2 == 1).Select(i => i + 10).ToArray();

There are many other useful methods such as Concat(), OrderBy(), and Reverse(). You can read about these in our quick reference documentation.

Note that the System.Linq namespace is imported automatically by C#'s implicit usings feature. As we've seen before, this feature is not enabled on ReCodEx, so you'll need to import System.Linq explicitly there.

Linq syntax

C# includes special syntax for calling some of the Linq methods. For example, instead of

int[] b = a.Where(i => i > 10).Select(i => i + 10).ToArray();

we can write

int[] b = (from i in a where i > 10 select i + 10).ToArray();

This is the closest approximation in C# to a list comprehension in Python, where we could write

# Python code

b = [i + 10 for i in a if i > 10]

As usual, the C# equivalent is more verbose.