Productive Rage

Dan's techie ramblings

A static type system is a wonderful message to the present and future

Last week, I read the article "My time with Rails is up" (by Piotr Solnica) which resulted in me reading some of DHH's latest posts and re-reading some of his older ones.

Some people write articles that I enjoy reading because their ideas and feelings about development are similar to mine, and they manage to express them in a new or particularly articulate way that helps me clarify them in my own head or think about whether I really do still agree with the principle. Some people write articles that come from a completely different point of view and experience to mine, and these can also have a lot of benefit, in that they make me reconsider where I stand on things or inspire me to try something different to see how it feels. DHH is, almost without fail, interesting to read and I like his passion and conviction.. but he's definitely not in that first category of author. There was one thing in particular, though, that really stuck out for me in his post "Provide sharp knives" -

Ruby includes a lot of sharp knives in its drawer of features.. The most famous is monkey patching: The power to change existing classes and methods. .. it offered a different and radical perspective on the role of the programmer: That they could be trusted with sharp knives. .. That’s an incredibly aspirational idea, and one that runs counter to a lot of programmer’s intuition about other programmers.

Because it’s always about other programmers when the value of sharp knives is contested. I’ve yet to hear a single programmer put up their hand and say "I can’t trust myself with this power, please take it away from me!". It’s always "I think other programmers would abuse this"

The part of the quote that I disagree with most is the claim that no programmer ever says "I can't trust myself with this power, please take it away from me!" - because I absolutely do want to be able to write code in a way that limits how I (as well as others) may use it.

And I strongly disagree with it because..

The harsh reality is that all code is created according to a particular set of limitations and compromises that are present at the time of writing. The more complex the task that the code must perform, the more likely that there will be important assumptions that are "baked into" the code and that it would be beneficial for someone using the code to be aware of. A good type system can be an excellent way to communicate some of these assumptions. And, unlike documentation, convention or code review, a good type system can allow these assumptions to be enforced by the computer, so that a principle that should be treated as unbreakable can't simply be ignored. Computers are excellent at verifying simple sets of rules, which allows them to help identify common mistakes (or miscomprehensions).

At the very simplest level, specifying types for a method's arguments protects me when I refactor. It makes it much less likely that I'll swap two of the arguments, miss one of the call sites and not find out that something now fails until runtime (the example sounds contrived but, unfortunately, it is something that I've done from time to time). Another simple example is that having descriptive classes reminds me precisely what the minimum requirements are for a method without having to poke around inside the method. If there is an argument named "employeeSummaries" in a language without type annotations, I can presume that it's some sort of collection type.. but should each value in the collection include just the key and name of each employee, or should it be key, name and some other information that the method requires such as, say, a list of the employees that the employee is responsible for managing? With a type system, if the argument is IEnumerable<EmployeeSummary> then I can see what information I have to provide by looking at the EmployeeSummary class.
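
As a rough sketch of both points, here is how they might look in TypeScript (used here instead of the C# that IEnumerable<EmployeeSummary> hints at). The shape of EmployeeSummary, the reportKeys property and the two functions are assumptions made up for illustration, not code from a real project.

```typescript
// Hypothetical type: the fields are guesses at what such a summary might hold.
interface EmployeeSummary {
  readonly key: number;
  readonly name: string;
  // Keys of the employees that this employee is responsible for managing.
  readonly reportKeys: readonly number[];
}

// The signature states the minimum that the function needs: every summary
// must carry a key, a name and a list of reports.
function countManagers(employeeSummaries: Iterable<EmployeeSummary>): number {
  let managers = 0;
  for (const summary of employeeSummaries) {
    if (summary.reportKeys.length > 0) {
      managers++;
    }
  }
  return managers;
}

// The two arguments have different types, so swapping them is a compile error.
function describeEmployee(key: number, name: string): string {
  return `#${key}: ${name}`;
}

const summaries: EmployeeSummary[] = [
  { key: 1, name: "Alice", reportKeys: [2, 3] },
  { key: 2, name: "Bob", reportKeys: [] },
  { key: 3, name: "Carol", reportKeys: [] },
];

const managerCount = countManagers(summaries);
const description = describeEmployee(1, "Alice");

// Neither of these lines would compile, so the mistakes surface before runtime:
//   describeEmployee("Alice", 1);               // arguments swapped
//   countManagers([{ key: 4, name: "Dave" }]);  // reportKeys is missing
```

Note that the swapped-argument check only helps when the two arguments have different types; two strings in the wrong order would still compile.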

A more complex example might involve data that is shared across multiple threads, whether for parallel processing or just for caching. The simplest way to write this sort of code reliably is to prevent mutation of the data from occurring on multiple threads and one way to achieve that is for the data to be represented by immutable data types. If the multi-threaded code requires that the data passed in be immutable then it's hugely beneficial for the type system to be able to specify that immutable types be used, so that the internal code may be written in the simplest way - based on the requirement that the data not be mutable.

I want to reinforce here that this is not just about me trying to stop other people from messing up when they use my code - it is just as much about me. Being able to represent these sorts of key decisions in the type system means that I can actually be a little bit less obsessive with how much I worry about them, easing the mental burden. This, in turn, leaves me more mental space to concentrate on solving the real problem at hand. I won't be able to forget to pass data in an immutable form to methods that require it in an immutable form, because the compiler won't let me do so.
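
A minimal sketch of that idea, again in TypeScript and again built from made-up names: ImmutableList and totalAcrossWorkers are hypothetical, and the "workers" are only implied, since the point is the signature rather than the threading.

```typescript
// Hypothetical immutable wrapper. The private field means that a plain array
// (or any look-alike object) is not assignable to this type.
class ImmutableList<T> {
  private readonly items: readonly T[];

  private constructor(items: readonly T[]) {
    this.items = items;
  }

  // Copies and freezes the input, so later changes to the source are not seen here.
  static from<T>(source: Iterable<T>): ImmutableList<T> {
    return new ImmutableList<T>(Object.freeze([...source]));
  }

  get length(): number {
    return this.items.length;
  }

  sum(selector: (item: T) => number): number {
    return this.items.reduce((total, item) => total + selector(item), 0);
  }
}

// Stand-in for code that would share its input between workers. Because it
// only accepts ImmutableList, it never has to defend against mutation.
function totalAcrossWorkers(values: ImmutableList<number>): number {
  return values.sum((value) => value);
}

const source = [1, 2, 3];
const shared = ImmutableList.from(source);
source.push(4); // changes the original array only; shared still holds three items

const total = totalAcrossWorkers(shared);

// This line would not compile, because a mutable array is not an ImmutableList:
//   totalAcrossWorkers(source);
```

The freeze here is shallow, which is enough for a list of numbers; a list of objects would need element types that are themselves immutable.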

Isn't this what automated tests are for?

An obvious rebuttal is that these sorts of errors (particularly the mixing-up-the-method-arguments example) can (and should) be caught by unit tests.

In my opinion: no.

I believe that unit tests are required to test logic in an application, and it is possible to write unit tests that show how methods work when given the correct data and how they fail when given invalid data. But it's difficult (and arduous) to prove, using automated tests, that the guarantees that a type system could enforce are not being broken anywhere in your code. The only-allow-immutable-data-types-to-be-passed-into-this-thread-safe-method example is a good one here, since multi-threaded code will often appear to work fine when only executed within a single thread, meaning that errors will only surface when multiple threads are working with it simultaneously. Writing unit tests to try to detect race conditions is not fun. You could have 100% code coverage and still not pick up on all of the horrible things that can happen when multiple threads deal with mutable data. If the data passed around within those code paths is immutable, though (which may be enforced through the types passed around), then these potential races are prevented.

Good use of static typing means that an entire class of unit tests is not required.

The fact that static typing is not enough to confirm that your code is correct, and that unit tests should be written as well, does not mean that only unit tests should be used.

I've kept this post deliberately short because I would love for it to have some impact and experience has taught me that it's much more difficult for that to be the case with a long format post. I've expanded on this further at "A static type system is a wonderful message to the present and future - Supplementary". There's more about the benefits, more about the limitations, more examples of me saying "I don't want the power to try to do something that this code has been explicitly written not to have to deal with" and no more mentions of multi-threading, because static typing's benefits are not restricted to especially complicated problem domains; applications may benefit regardless of their complexity.

Posted at 21:33