» CS 330 Lecture 16 – Parametric Polymorphism

CS 330 Lecture 16 – Parametric Polymorphism

March 3, 2017 by Chris Johnson. Filed under cs330, lectures, spring 2017.

Dear students,

Don’t get me started on arrays. What sort of world is this where an array is fixed in size, but yet it doesn’t know what that size is? We must fix this. Let’s create a growable array:

class Flexray {
  public:
    Flexray() :
      capacity(10),
      nitems(0),
      elements(new int[capacity]) {
    }

    ~Flexray() {
      delete[] elements;
    }

    void operator<<(int item) {
      if (nitems == capacity) {
        grow();
      }
      elements[nitems] = item;
      ++nitems;
    }

    int size() const {
      return nitems;
    }

    int &operator[](int i) {
      // TODO bounds check
      return elements[i];
    }

    const int &operator[](int i) const {
      // TODO bounds check
      return elements[i];
    }

  private:
    void grow() {
      int new_capacity = 2 * capacity;
      int *new_elements = new int[new_capacity];
      memcpy(new_elements, elements, sizeof(int) * capacity); 
      delete[] elements;
      capacity = new_capacity;
      elements = new_elements;
    }

    int capacity;
    int nitems;
    int *elements;
}

Okay, great. Now the boss tells us we need a growable array of strings. Then a growable array of stringstreams. Then a growable array of std::out_of_range. What are we going to do? Copy and paste and tweak types? No.

There is actually more than one answer to this question. We’ve already seen one vehicle for reusing code: subtype polymorphism. We could just have our collection appeal to the supertype. This is how Java’s collections worked when I first learned it. Collections were written in terms of Object, like this:

public class Flexray {
  private int capacity;
  private int nitems;
  private Object[] elements;

  public Flexray() {
    capacity = 10;
    nitems = 0;
    elements = new Object[capacity];
  }

  public void add(Integer item) {
    if (nitems == capacity) {
      grow();
    }
    elements[nitems] = item;
    ++nitems;
  }

  public int size() {
    return nitems;
  }

  public Object get(int i) {
    return elements[i];
  }

  private void grow() {
    int newCapacity = 2 * capacity;
    Object[] newElements = new Object[newCapacity];
    System.arraycopy(elements, 0, newElements, 0, nitems);
    capacity = newCapacity;
    elements = newElements;
  }
}

Now we can make a list that works for any type, right? Because everything’s an Object. Sadly, a generalizing polymorphic solution suffers from a couple of drawbacks:

When we retrieve an element using get, we must downcast the result to the actual type.
When we add an element using add, we can throw anything in. It doesn’t have to be of the type we want our list to hold.
The collection can’t hold primitives, as they are not in the Object hierarchy.

A better fix is to define Flexray once but let Flexray‘s type come in as a parameter. This rounds out our four forms of polymorphism:

Coercion, in which we have a value of type X and an operation that expects type Y, but there’s a known path for implicitly converting Xs into Ys. The function is polymorphic in the sense that it can operate on Ys and any values of types that can be converted to Y.
Ad hoc polymorphism is the name we give to the overloading of methods. Several versions of a single function are written, each catering to a different type. We don’t really save on effort with ad hoc polymorphism, but we at least save on names. Each version of that method has the same name.
Subtype polymorphism is when a piece of code is targeted to handle an umbrella type, a supertype, but somehow it calls the overridden methods in the subtype.
Parametric polymorphism is when we pass a type as a parameter to a function, class, or some other construct. The compiler uses that type to produce seemingly many different versions of the construct.

In C++, we decorate our class with a template and replace all references to int elements to T elements:

template<typename T>
class Flexray {
  public:
    Flexray() :
      capacity(10),
      nitems(0),
      elements(new T[capacity]) {
    }

    ~Flexray() {
      delete[] elements;
    }

    void operator<<(T item) {
      if (nitems == capacity) {
        grow();
      }
      elements[nitems] = item;
      ++nitems;
    }

    int size() const {
      return nitems;
    }

    T &operator[](int i) {
      // TODO bounds check
      return elements[i];
    }

    const T &operator[](int i) const {
      // TODO bounds check
      return elements[i];
    }

  private:
    void grow() {
      int new_capacity = 2 * capacity;
      T *new_elements = new T[new_capacity];
      memcpy(new_elements, elements, sizeof(T) * capacity); 
      delete[] elements;
      capacity = new_capacity;
      elements = new_elements;
    }

    int capacity;
    int nitems;
    T *elements;
}

A templated class or function isn’t actually compilable code. It’s just a pattern. Only when the compiler sees a reference to the templated construct with the type value does it go and create a specialization of that pattern. If it sees another reference with a different type, it generates a completely different specialization for just that type. So, we might have 20 different versions of Flexray, all catering to elements of different types. The source code will have just one definition, but the machine code will have 20 different representations.

Java handles templates differently. The source code looks mostly the same:

public class Flexray<T> {
  private int capacity;
  private int nitems;
  private T[] elements;

  public Flexray() {
    capacity = 10;
    nitems = 0;
    elements = (T[]) new Object[capacity];
  }

  public void add(T item) {
    if (nitems == capacity) {
      grow();
    }
    elements[nitems] = item;
    ++nitems;
  }

  public int size() {
    return nitems;
  }

  public T get(int i) {
    return elements[i];
  }

  private void grow() {
    int newCapacity = 2 * capacity;
    T[] newElements = (T[]) new Object[newCapacity];
    System.arraycopy(elements, 0, newElements, 0, nitems);
    capacity = newCapacity;
    elements = newElements;
  }
}

The designers decided that they didn’t want 20 different machine code representations. They decided that they would use the generic parameters only for type checking and not for code generation. The generic type is erased completely from the single machine code representation and is replaced with a general supertype—either Object or the generic type’s upper bound. Their design suffers from a couple of issues:

The generic type is simply not known at runtime. This means we can’t say things like new T[10].
The collection can’t hold primitives, as they are not in the Object hierarchy.

Hey, at least we fixed the type checking issues!

The most common application of templates is to create reusable collections. But in C++ we can also pass the compiler a value to help it specialize the templated pattern. For example, suppose we want lightning fast multiplication. Let’s take a first stab, which is anything but lightning fast:

int multiply(int a, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += a;
  }
  return sum;
}

Compare this to a templated version of multiply. Here one of the coefficients comes in as a template parameter:

template<int n>
int multiply(int a) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += a;
  }
  return sum;
}

Now consider this function that invokes the two multiplys:

void foo() {
  multiply(2, 7);
  multiply<2>(7);
}

What will the compiler do in the first case? It will generate very general machine code that invokes the multiplication hardware through a multiplication instruction, which tends to consume more cycles than addition—though they are close on modern architectures.

In the second case, the compiler is equipped with more information. It knows one of the operands. That means it can tailor the machine code it generates to the situation at hand. It may be able to avoid the multiplication unit! Let’s verify this using Matt Godbolt’s Compiler Explorer.

What else can we do with templates? We can eliminate memory leaks. But that’s a story for another day.

Sincerely,

P.S. It’s Haiku Friday!

First it was New York
Now you can ♥ anything
‘Cuz they are T-shirts