New, abstract, virtual, override – isn’t virtual good enough?

A virtual function is a central concept in object-oriented programming (OOP) languages and the main mechanism for polymorphism. In short, virtual functions are the glue that allows the behaviour of a method to change depending on the runtime type of a specific object instance, not just the static type known or declared at compile time.
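As a minimal sketch of that idea, the Java snippet below calls a method through a variable whose static type is the base class and lets the runtime type pick the implementation. The names Shape, Circle, Square and DispatchDemo are made up for illustration.

```java
// Minimal sketch of dynamic dispatch; all class names here are hypothetical.
abstract class Shape {
    // Java instance methods are virtual by default.
    abstract String name();

    // name() is called through the static type Shape,
    // but the runtime type of the object decides which body runs.
    String describe() {
        return "I am a " + name();
    }
}

class Circle extends Shape {
    @Override
    String name() { return "circle"; }
}

class Square extends Shape {
    @Override
    String name() { return "square"; }
}

class DispatchDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(), new Square() };
        for (Shape s : shapes) {
            // Same static type for every element, different behaviour per instance.
            System.out.println(s.describe());
        }
    }
}
```

Every element of the array is declared as Shape, yet each one answers with its own name.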

As virtual functions require an additional level of indirection to be invoked, languages that focus on uncompromising performance, such as C++, default to non-virtual member functions and require you to opt in with the keyword virtual. When Java was released, its primary focus was reliability and ease of use over raw performance, so virtual was made the default, and methods have to be explicitly marked as non-overridable with the keyword final. The reasoning behind this decision was, presumably, that since virtual functions are more powerful than non-virtual ones, they must also be “better” and should be used everywhere.
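The Java side of this can be sketched as follows. Account, SavingsAccount and FinalDemo are hypothetical names, and the reflection helper is only there so the difference between the two methods can be printed.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical classes used only for illustration.
class Account {
    // Overridable: Java instance methods are virtual unless marked otherwise.
    String label() { return "account"; }

    // Opted out of overriding with final.
    final String id() { return "ACC-1"; }
}

class SavingsAccount extends Account {
    @Override
    String label() { return "savings account"; }

    // Uncommenting the next line is a compile error: id() is final in Account.
    // String id() { return "other"; }
}

class FinalDemo {
    // Reports whether the named no-argument method declared in Account is final.
    static boolean isFinal(String methodName) {
        try {
            Method m = Account.class.getDeclaredMethod(methodName);
            return Modifier.isFinal(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Account a = new SavingsAccount();
        System.out.println(a.label());        // overridden in SavingsAccount
        System.out.println(a.id());           // always the Account version
        System.out.println(isFinal("label"));
        System.out.println(isFinal("id"));
    }
}
```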

C# on the other hand decided to follow the C++ line rather than the Java one and make non-virtual the default. Not only that, Microsoft also came up with a whole family of keywords (new, abstract, virtual, override, and sealed) to do the job of the one keyword virtual in C++. It turns out that the language gurus at Microsoft did not do this just to be different from C++ and Java. In fact, the new keywords all address specific problems with the previous approaches.

OOP constructs are sometimes used by frameworks to allow for customisation and extensibility. The most complex examples in the .NET Framework are the type hierarchies used by Windows Forms and Windows Presentation Foundation. Although there are many advantages to using OOP this way, it also poses great challenges when it comes to versioning.

As an extreme example, if a class in such a framework introduces a new abstract function, then all existing subclasses will be incompatible with the new version of the framework until they are modified. A more likely scenario is the addition of a new function with the same name as an existing one but accepting different parameter types. If such a function is called with argument types that do not match the parameter types exactly, such as passing an int to a method accepting long, adding a new overload may change which method is selected or make the call ambiguous, breaking backwards compatibility. You would think, however, that adding a new function with a name never previously used in the framework cannot break compatibility. As we will see, that may not always be the case.
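The overload scenario can be sketched in Java like this. LoggerV1 and LoggerV2 are hypothetical stand-ins for two versions of the same framework class; in a real upgrade they would share one name, and the change would only take effect once the caller is recompiled.

```java
// Hypothetical framework class, version 1: a single overload taking long.
class LoggerV1 {
    String log(long value) { return "long:" + value; }
}

// Version 2 of the same class adds an overload taking int.
class LoggerV2 {
    String log(long value) { return "long:" + value; }
    String log(int value)  { return "int:" + value; }
}

class OverloadDemo {
    public static void main(String[] args) {
        // 42 is an int literal. Against version 1 it is widened to long.
        System.out.println(new LoggerV1().log(42));

        // The same source line compiled against version 2 selects the new,
        // more specific int overload instead.
        System.out.println(new LoggerV2().log(42));

        // An outright ambiguity needs overloads where neither is more specific,
        // for example put(long, int) and put(int, long) called as put(1, 1),
        // which the Java compiler rejects.
    }
}
```

Here the call does not become ambiguous, but it silently starts running different code, which can be just as much of a compatibility break.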

To illustrate the problem, we will use a Java snippet with some extremely poorly chosen method names. We will use a simplified version of a Stream class that supports reading from and writing to a resource such as a file or network connection. We have a base class Stream provided in a framework, and a special subclass TopSecretStream for reading from and writing to a top secret storage mechanism.

abstract class Stream
{
    public abstract String read();
    public abstract void write(String str);

    // Other useful functions implemented using the above abstract ones ...
}

class TopSecretStream extends Stream
{
    public String read() {
        // Reads from top secret storage.
    }

    public void write(final String str) {
        // Writes to top secret storage.
    }

    public void foo()
    {
        // Deletes the files and triggers the explosives.
    }
}

The TopSecretStream acts as a Stream, reading and writing text, but also has the very badly named method foo with some nasty side effects.

Now, what happens if in a future version of the framework someone decides to add an extra method to Stream:

abstract class Stream
{
    public abstract String read();
    public abstract void write(String str);
    public void foo() { /* ... */ } // Flush pending changes, called every 2 seconds

    // Other useful functions implemented using the above abstract ones ...
}

Disaster! Just by running the application against a new version of the framework, we have now accidentally deleted all the files and triggered the explosives! Although TopSecretStream never meant for Stream to call foo(), the unfortunate naming caused Java to confuse the two methods, interpreting the version in TopSecretStream as an override of the one in Stream. Surely, this is not how it is meant to work?
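To see the accidental override in action, here is a small runnable sketch of the upgraded framework. The tick() method and the destroyed flag are assumptions added for the demonstration: tick() stands in for the framework's two-second timer, and the flag replaces the explosives.

```java
// Sketch of the new framework version of Stream.
abstract class Stream {
    public abstract String read();
    public abstract void write(String str);

    // Added in the new framework version: flush pending changes.
    public void foo() { /* nothing to flush in this sketch */ }

    // Hypothetical stand-in for the framework's periodic timer.
    public void tick() { foo(); }
}

class TopSecretStream extends Stream {
    // Stand-in for the nasty side effects.
    boolean destroyed = false;

    public String read() { return "secret"; }

    public void write(final String str) { /* writes to top secret storage */ }

    // Written before Stream.foo() existed, yet Java now treats it as an override.
    public void foo() { destroyed = true; }
}

class AccidentalOverrideDemo {
    public static void main(String[] args) {
        TopSecretStream s = new TopSecretStream();
        s.tick(); // the framework believes it is merely flushing
        System.out.println(s.destroyed);
    }
}
```

Nothing in TopSecretStream changed between framework versions, and the framework's call still lands in its foo().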

C# is different, and allows for much more control over how methods are selected. Given classes Foo, Bar and Baz where Foo is the base class, Bar subclasses Foo and Baz subclasses Bar – each method declaration in Bar needs to specify both whether the method should be called when an instance of Bar is statically typed as Foo, and whether it should be possible for a method in Baz to be called instead of the one declared in Bar when an instance of Baz is statically typed as Bar. Let’s take a look at a few combinations of the C# keywords and what they do:

  • no modifiers – The method is not called when an instance of Bar is typed as Foo, and Baz cannot change which method gets called when an instance of Baz is typed as Bar.
  • new – As above, but explicitly stated. In addition, the compiler will not complain if Foo already has a method of the same name.
  • virtual – The method is not called when an instance of Bar is typed as Foo, but Baz can change which method gets called when an instance of Baz is typed as Bar.
  • abstract – As virtual, but Baz, or a subclass of Baz, must provide an overriding implementation of the function.
  • override – Foo (or a base class of Foo) must already declare a method with the same name and signature, marked as abstract, virtual or override but not sealed. This method will get called when an instance of Bar is typed as Foo. Baz can change which method gets called when an instance of Baz is typed as Bar.
  • override sealed – As override, but Baz cannot change which method gets called when an instance of Baz is typed as Bar.

In addition, the new keyword can be prepended to virtual and abstract to explicitly state that a declaration in a base class should be ignored, and silence any compiler warnings complaining about a name clash.
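Java has no new modifier, so it cannot express these combinations directly. As a rough analogy only, the sketch below contrasts a Java instance method, which is selected by runtime type much like a C# override, with a Java static method, which is selected by static type much like a C# method declared with new. The method names virtualName and hiddenName are made up for the example.

```java
class Foo {
    // Instance method: selected by runtime type, comparable to C# virtual.
    String virtualName() { return "Foo"; }

    // Static method: selected by static type. This is an analogy for a C# method
    // that is hidden rather than overridden, not an exact equivalent.
    static String hiddenName() { return "Foo"; }
}

class Bar extends Foo {
    // Overrides Foo.virtualName(), comparable to C# override.
    @Override
    String virtualName() { return "Bar"; }

    // Hides Foo.hiddenName(), comparable to C# new.
    static String hiddenName() { return "Bar"; }
}

class HidingDemo {
    public static void main(String[] args) {
        Foo typedAsFoo = new Bar();

        // Runtime type wins: the version in Bar runs.
        System.out.println(typedAsFoo.virtualName());

        // Declared type wins: the version in Foo runs. Calling a static method
        // through an instance compiles, although compilers may warn about it.
        System.out.println(typedAsFoo.hiddenName());

        // Naming Bar explicitly reaches the hiding method.
        System.out.println(Bar.hiddenName());
    }
}
```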

Armed with our newfound understanding of the above keywords, what would our example above have looked like in C#?

abstract class Stream
{
    public abstract String Read();
    public abstract void Write(String str);

    // Other useful functions implemented using the above abstract ones ...
}

class TopSecretStream : Stream
{
    public override String Read()
    {
        // Reads from top secret storage.
    }

    public override void Write(String str)
    {
        // Writes to top secret storage.
    }

    public void Foo()
    {
        // Deletes the files and triggers the explosives.
    }
}

… and after adding the method to Stream:

abstract class Stream
{
    public abstract String Read();
    public abstract void Write(String str);
    public virtual void Foo() { /* ... */ } // Flush pending changes, called every 2 seconds

    // Other useful functions implemented using the above abstract ones ...
}

As Foo in TopSecretStream is not marked with override, .NET knows not to confuse it with Foo in Stream. If we recompile our application against the new version of the framework, the compiler will warn about a name clash, and we can choose to either accept it by adding the new keyword to Foo in TopSecretStream or, even better, rename it to something more sensible!
