Nirav's Contemplations: Programming

I've been pondering about trivialities again, I am observing how most programmers have strong vested interest in defending static v/s dynamic typing as if they are mutually exclusive.

There're those who think static typing is the ultimate engineering solution to build reliable software and dynamic typing is some kind of a sick joke to put their lazyness to test. Then there are those who think types are for primitive minds and variable declaration is an insulting feature.

At this point in time, I can relate to both groups as I've had strong opinion on this subject. Having opinion on typing is OK, I think. (having no opinion on this topic basically means google didn't work for you and you should head back).

Most programmers go through, what I call, a typing evolution cycle. I started with dynamically typed language (name removed because of possible copyright violation). It was really great to stuff everything in to a variant (actually I never even bothered to declare variables!) and not worrying about compilation or runtime errors, it mostly ran. Then I was introduced to C (and later C++) and I remember how I hated it because I was now forced to declare variables and had to think about what I wanted computer to do upfront. It was alien to me that compiler will point out my mistakes instead of doing my bidding. During that time I was a supporter of dynamic typing because I was naive.

As the time went by I got more used to static typing with interesting IDE features such as IntelliSense of Visual Studio. Writing correct program was much easier now and it always worked (mostly correctly). Then came the Java programming, everything was object (except primitives). Large projects and Eclipse JDT made me appreciate value of static typing. When I start writing test I don't think about type errors because they are taken care of, I could easily refactor code without nightmares that something could be broken. Scala's type inference and simpler type system actually boosted my faith in static typing even further. At this point in time I supported static typing because I was naive.

This was my experience, may be it is reverse for others (starting from static typing and going back and forth).
My views have changed on typing over the time. Depending on what I'm working on, I don't mind hand-waving type system (coupled with esoteric tests) for a quick isolated stab that brings a lot of benefits v/s a bloat of API to do the same in few years. While still having the confidence in my software being immune to my bad keyboard-fu and assurance that stupid mistakes will not make it all the way to production. Probably, I'm still naive.

Most programmers have suffered null pointer one way or other - usually a core-dump followed by a segmentation fault on development machine or on a production box with application in smokes. NullPointerException results in a visible embarrassment of not thinking about "that something *could be* null".

Tracking Null Pointer ranges from loading core-dump in gdb and tracing dereferenced pointer to stack traces pointing to exact location in source. However, ease of tracking nulls opens up the doors to ignore them in practice and throwing null-checks just becomes as common as throwing one more div to fix IE's layout problems which is bad.

Problems with Null:
I hate having to ignore nulls as it is not always enough just to add one more null check. The reason why I am writing this blog is because I have several problems with Nulls:

All and Every reference can be a null in languages like Java. This covers everything: method parameters, return values, fields etc. There's no precise way for a programmer to know that some method might return null or accept null parameters. You absolutely have to resort to actual source code or documentation to see if it can possibly return null (and you are going to need good luck with that). All of it adds extra work when you really want to be focusing on fixing the real problem.
The problem with NullPointerException is that they point to causal eventuality and not usually the actual cause. So what you see in stack traces are usually the code paths where damage is not really initiated but done when we are normally interested in case where damage is initiated.
Null is actually very ambiguous. Is it the uninitialized value or absence of value or is it used to indicate an error? The paradigm of null fits well in database but not in programming model.
Having Nulls in your code has major implications in code quality and complexity. For example, it is not unusual to see code branches with null checks breeding like rabbits when an API "may" return null which in turn results in extremely defensive code. This significantly taxes readability.
Null makes Java's type system dumber when a method is overridden and you want to call it. Writing code like methodDoingStuff((ActualType)null, otherArgs) isn't exactly a pretty sight. This results in subtle errors when arguments are non-generic.

In many ways Nulls are necessary evil. For those of us who care about readability and safety we can't ignore them yet we shouldn't just let it overtake safety and readability.

I have come to know several techniques to tackle nulls. First, there is Null Object pattern which is not entirely as ridiculous as the name implies but it's not practical in real life software having hundreds of class hierarchies and thousands of classes, and so, I will not talk about it. Then there are languages like Haskell and Scala with library classes that try to treat nulls in, IMO, a better way. Haskell has MayBe and Scala has Options. After using options in Scala for a while in a side project, I found that I was no longer fighting with nulls. I knew exactly when I had to make a decision that a value is really optional and I must do alternate processing.

The central idea behind Haskell's MayBe and Scala's Option is to introduce a definitive agreement on a value's eligibility to be either null or not-null enforced with the help of type system. I will talk about Scala's Option since I have worked with it, but the concept remains same. I will also introduce how to implement and use Options in Java since this is much more of a functional way of thinking about handling nulls and it doesn't (almost) take Scala's neat language features to implement it.

Treating nulls the better way:

Usual course of action when you are not sure what value to return from a method is:

Most of the times it results in the last option because you don't have to fear about breaking everything (well, mostly) and everyone passes the buck like this.

We can do better with Scala's Option classes. We can wrap any reference in to Some or None and handle it with pattern matching or "for comprehension". For example:

Some(x) represents a wrapper with x as actual value; None represents absence of value. Some and None are subclasses of Option. Option has all the interesting methods you can use. When a variable in question is null we can: fall back to default value, evaluate and return a function's computed value, filter and so on.

Options in Java:
Implementing Options in Java is surprisingly a trivial task. However, it is not as pleasant as Scala's options. Implementation boils down to a wrapper class Option with two children: Some and None. None represents a null but with a type (None[T]) and Some represents non-null type.

To make Option interesting we make Option extend List, so we can iterate on it to mimic poor man's "for comprehension". We will also go as far as tagging both types with Enums so we can do poor man's pattern matching with a switch. You can find example implementation of Options in Java with a test case demonstrating use of Options. Here's the small snippet which covers the essense:

As you can see, Option opens up several doors to fix the null situation. You now have choice to compute a value, use default value or do arbitrary stuff when you encounter nulls.

How are my null problem solved with Options:

Using Options for optional/null-able references I have at least avoided "all things could be null" problem in my code. When an API is returning a Option, I don't have to wonder if it can return null. Intention is pretty clear.
When I am forced to handle null right at the time of using an API, I have to handle it right there: do alternate processing or use default. No surprises.
Option is a very clear way of saying a variable represents possibly an absent value.
Option doesn't really solve this problem completely. For example, method signatures with wrapper Option type can get really long (e.g. def method1(): Map[String, Option[List[Option[String]]] = {}). However, compared to null checks, I would prefer long method signature any day. Other benefits out-weight this limitation.
Clearly, Option[Integer] always means only Option[Integer] and not Option[Integer], Option[String], Option[Character], Option[Date] and so on. Compiler can infer exact method call from generic types.

As good as the concept behind optional values is, it doesn't and will not always save you from Null. You will still have to deal with existing libraries which return nulls and cause all these problems and more. However, most of the time null is problematic in your own code.

Where to use Options:

Here are the common places where I think using Options makes more sense:

APIs: Make your API as specific and as readable as possible; all optional parameters and return values should be Option.
Use in your domain model: You already have fair understanding on null-able columns, use Option for null-able fields in your table. It is not hard to integrate using Options if you are using an ORM with interceptable DB fetch; you can initialize fields to None if database contains null and so on.

In the interest of keeping this post relevant and on topic, I have completely avoided heavy theoretical baggage (monads et. al.) that's inevitable when theoretical functionalists (functional programmers) talk about Options. I really hope this post generates some interest in this topic. If you disagree or would like to share more on this topic, please leave a comment.

Nirav's Contemplations

Monday, March 05, 2012

On Static v/s Dynamic typing

Monday, December 13, 2010

Tackling nulls the functional way

Open source contributions

Recent Comments