I recently began writing some unit-of-measure code in Scala as an exercise. Some months ago I saw a headline in my newsreader about a Scala library for handling units of measure, and I made a point NOT to read it. It sounded like an interesting problem, so I wanted to take a stab at it myself first, then compare my solution to the one from the article, and maybe even write my own article about it.

In the course of trying out a couple of designs I encountered a situation I wasn’t sure how to handle. I wanted an abstract class called Dimension which would encapsulate a measurement in some unit and I wanted to create several subtypes extending Dimension such as Length, Time, Mass, Temperature, etc. All Dimensions should be able to sum together, but only with their own kind. For example, 1 meter + 2 feet should give 1.6096 meters and 1 kilogram + 500 grams should give 1.5 kilograms. However, it makes no sense to add 30 seconds and 45 degrees Celsius. I wanted to arrange the types in such a way that a user of the library would not have the option of adding dimensions of two different types.

Here’s my initial code:

  abstract class Dimension(val value: Double, val name: String, val coef: Double) {
    def +(x: Dimension): Dimension
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
  Dimension(value, name, coef) {
    def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
  }

  class Length(value: Double, name: String, coef: Double) extends 
  Dimension(value, name, coef) {
    def +(x: Dimension): Length = new Length(value + coef * x.value / x.coef, name, coef)
  }

You can see the Dimension class declares an addition operator to satisfy our requirement that all Dimensions must be additive. No surprises so far.

The Time and Length classes extend Dimension and are concrete, so they must implement the addition operator. The operators have the same signature as the one from Dimension except for the return type. When creating a subtype, we are allowed to narrow the return types, so I made them more specific. Time.+ returns not merely a Dimension as in the supertype but a Time. Parameter types, on the other hand, can never be narrowed in a subtype. This is because the return type is a covariant position and the parameter type is a contravariant position. Scala is in fact stricter still: an overriding method must keep its parameter types unchanged (a method with a different parameter type is an overload, not an override), so they remain Dimension. If you don’t know what variance is, I have an article on it.
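
To make the return-type rule concrete, here is a minimal sketch. The names Animal, Cat and VarianceSketch are made up for illustration and have nothing to do with the units code.

```scala
// Hypothetical classes, used only to illustrate return-type narrowing.
abstract class Animal {
  def offspring(): Animal
}

class Cat extends Animal {
  // Legal override: the return type is narrowed from Animal to Cat.
  override def offspring(): Cat = new Cat
}

object VarianceSketch {
  // No cast needed: the compiler knows that a Cat's offspring is a Cat.
  val kitten: Cat = (new Cat).offspring()
}
```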

This code has two main weaknesses. First, a developer subclassing Dimension is trusted to return the same type as the class. That is to say, a developer could write:

  class Length(value: Double, name: String, coef: Double) extends 
  Dimension(value, name, coef) {
    def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
  }

The developer could mix the types! This would return an unexpected and nonsensical result. The second and much more serious weakness is that a user of the library doesn’t have to pass a Length to Length.+. The user could write:

  val sum = new Length(10.0, "meters", 1.0) + new Time(10.0, "seconds", 1.0)

Nothing is stopping him from doing this.
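
To see how quietly this goes wrong, here is a self-contained sketch of the first design. It is the same code as above, except that toString uses string interpolation and the MixedSumSketch object is a name I made up to hold the example.

```scala
// The first design again; only toString is written differently.
abstract class Dimension(val value: Double, val name: String, val coef: Double) {
  def +(x: Dimension): Dimension
  override def toString(): String = s"$value $name"
}

class Time(value: Double, name: String, coef: Double)
    extends Dimension(value, name, coef) {
  def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
}

class Length(value: Double, name: String, coef: Double)
    extends Dimension(value, name, coef) {
  def +(x: Dimension): Length = new Length(value + coef * x.value / x.coef, name, coef)
}

object MixedSumSketch {
  // Compiles without complaint, although adding seconds to meters is meaningless.
  val sum: Length = new Length(10.0, "meters", 1.0) + new Time(10.0, "seconds", 1.0)
}
```

The sum comes out as a Length of 20.0 "meters", with no warning at compile time or at run time.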

What I wanted was a Dimension class that enforced some additional rules on its descendants. Dimension should dictate not only that all its subtypes must implement the addition operator but also that the operator should only accept a parameter of the same type as the class in which it is defined and that the operator should also return that same type. In short, I wanted to force all subtypes of Dimension to look like this:

  class X(...) extends Dimension(...) {
    def +(x: X): X = ...
  }

Learning to Accept Yourself (As a Parameter)

As I considered the problem I pretty quickly noticed a similarity to a basic Scala trait: scala.Ordered. Ordered[A] is used primarily in sorting. By mixing in Ordered, you can define an ordering for any class. The reason I thought of Ordered is that it includes the abstract method “compare (that : A) : Int”. This method compares “this” to “that”. In other words, it takes a parameter that it is able to compare with itself, which is *usually* a parameter of the same type. Here’s a typical use of Ordered[A]:

class Student(val lastName: String, val firstName: String) 
  extends Ordered[Student]
{
  def compare(that: Student) = 
    (lastName + "," + firstName).compare(that.lastName + "," + that.firstName)
}

That’s close to what I want, but not quite. Ordered[A] allows you to implement a class that can be compared to itself, but it doesn’t require it. You could implement “class Apples extends Ordered[Oranges]”, literally comparing apples and oranges. So this arrangement (a parameterized class or trait in which the subtypes specify themselves as the type parameter) allows, but does not enforce, the structure that I want. So Ordered[A] provides a clue, but not the complete solution.
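
Here is a small sketch of the apples-and-oranges case. The classes Apples and Oranges, the weight field and the OrderedSketch object are hypothetical; the only library piece used is scala.math.Ordered itself.

```scala
// Hypothetical classes showing that Ordered[A] does not force A to be the class itself.
class Oranges(val weight: Int)

class Apples(val weight: Int) extends Ordered[Oranges] {
  // Compares an apple with an orange; the compiler accepts this happily.
  def compare(that: Oranges): Int = java.lang.Integer.compare(weight, that.weight)
}

object OrderedSketch {
  // Ordered supplies <, >, <= and >= in terms of compare.
  val appleIsLighter: Boolean = new Apples(3) < new Oranges(5)
}
```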

Becoming Self Aware

The missing piece is a little-known Scala construct called the explicit self type. It is a way of specifying what type the “this” reference must have. The Scala-lang website has an article explaining another situation in which explicit self types are useful: specifying that within a class “this” should refer to an abstract variable type.

Here’s a very simple example of how explicit self types work. Here’s a base trait called TraitA and two traits that make use of TraitA. TraitB1 uses an explicit self type to denote that “this” must be of type TraitA, and TraitB2 uses extends to inherit from TraitA.

trait TraitA {
  def t1(): String
}

trait TraitB1 {
  self: TraitA =>
  def t2(): String = "TraitB1.t2 !" + t1() + "!"
}

trait TraitB2 extends TraitA {
  def t2(): String = "TraitB2.t2!" + t1() + "!"
}

TraitB1 and TraitB2 are exactly alike except for the way they gain access to the t1() method. Here’s an interpreter session in which we create some classes that extend these traits.

scala> class Class1 extends TraitB1 {
     |   def t1() = "Class1.t1"
     |   override def toString() = "Class1: " + t1() + " " + t2()
     | }
<console>:6: error: illegal inheritance;
 self-type Class1 does not conform to TraitB1's selftype TraitB1 with TraitA
       class Class1 extends TraitB1 {
                            ^

scala> class Class2 extends TraitB1 with TraitA {
     |   def t1() = "Class2.t1"
     |   override def toString() = "Class2: " + t1() + " " + t2()
     | }
defined class Class2

scala> new Class2
res0: Class2 = Class2: Class2.t1 TraitB1.t2 !Class2.t1!

scala> class Class3 extends TraitB2 {
     |   def t1() = "Class3.t1"
     |   override def toString() = "Class3: " + t1() + " " + t2()
     | }
defined class Class3

scala> new Class3
res1: Class3 = Class3: Class3.t1 TraitB2.t2!Class3.t1!

Class1 extends TraitB1, the trait that used the explicit self type. The class defines t1(), so all the necessary implementation is there, but compilation fails anyway. The explicit self type says that “this” must be of type TraitA but neither Class1 nor TraitB1 extends TraitA, so even though the t1() method is supplied, “this” cannot have type TraitA for Class1 because Class1 does not inherit from TraitA.

Class2 and Class3 compile and run just fine. Class2 is identical to Class1 except that it mixes in TraitA. Since it is declared “with TraitA” the explicit self type is satisfied because “this” can have type TraitA.

Class3 is identical to the others except that it extends TraitB2. TraitB2 is declared as extending TraitA, so Class3 compiles because it indirectly inherits from TraitA.

Use explicit self types with care! They can be a little dangerous if you use them to subvert Scala’s compile-time type checking. For example:

trait StringMaker {
  def makeString(): String
}

class DoesntCompile extends StringMaker {
  override def toString() = "DoesntCompile " + makeString
}

class Compiles {
  self: StringMaker =>
  override def toString() = "Compiles " + makeString
}

The first class is declared as extending StringMaker, but it doesn’t implement the makeString method. This class fails to compile, and rightly so. The compiler does its job and warns you that the class won’t work.

The second class includes an explicit self type. The class named Compiles says that it must be a StringMaker. Now, it says this internally, not externally. The class declaration says nothing about StringMaker. Any code that used the Compiles class wouldn’t know that it’s supposed to be a StringMaker. The class compiles but when you try to instantiate it you get an exception. Not only is it an exception, it’s a NullPointerException which crashes my interpreter (!!!) which makes me think this may be a bug.
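
One way to use such a class safely is to mix the trait in at the point of instantiation. The sketch below repeats StringMaker and Compiles and adds a hypothetical SelfTypeSketch object; it only shows that the self type can be satisfied this way, not that this is the recommended pattern.

```scala
trait StringMaker {
  def makeString(): String
}

class Compiles {
  self: StringMaker =>
  override def toString(): String = "Compiles " + makeString()
}

object SelfTypeSketch {
  // Mixing in StringMaker at the instantiation site satisfies the self type
  // and supplies the missing method.
  val working = new Compiles with StringMaker {
    def makeString(): String = "a string"
  }
}
```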

Time to Self Actualize

My solution to the problem was a combination of the “class X extends Ordered[X]” idiom and the explicit self type. Here it is:

  abstract class Dimension[T](val value: Double, val name: String, val coef: Double) {
    self: T =>
    protected def create(value: Double, name: String, coef: Double): T
    def +(x: Dimension[T]): T = create(value + coef * x.value / x.coef, name, coef)
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
        Dimension[Time](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Time(a, b, c)
  }

  class Length(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

  class Mass(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

This compiles just fine except for the last class, which I included to demonstrate that the compiler enforces conformance to the explicit self type. Every class that extends Dimension[X] must itself be an X. That enforces the rule I wanted. Here’s the compiler error for the last class.

$ scalac Units.scala
Units.scala:19: error: illegal inheritance;
 self-type Mass does not conform to Dimension[Length]'s selftype Dimension[Length] with Length
        Dimension[Length](value, name, coef) {
        ^
one error found

That basically says if you want to extend Dimension[Length] then you better be a Length yourself.

Considering that when I had investigated this problem for about 10 minutes I was nearly ready to call it impossible, this is a surprisingly simple and not too cryptic solution. Plus, it’s a usage of explicit self types that I hadn’t seen before. I wonder, in fact, why the Ordered[A] trait itself doesn’t use this trick.

As a bonus, here’s a sneak peek at part of my Units library so far.

  abstract class Dimension[T](val value: Double, val name: String, val coef: Double) {
    self: T =>
    protected def create(value: Double, name: String, coef: Double): T
    def +(x: Dimension[T]): T = create(value + coef * x.value / x.coef, name, coef)
    def -(x: Dimension[T]): T = create(value - coef * x.value / x.coef, name, coef)
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
        Dimension[Time](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Time(a, b, c)
  }

  class Length(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

  abstract class TimeUnit(name: String, coef: Double) {
    def apply(value: Double) = new Time(value, name, coef)
    def apply(orig: Time) = new Time(0, name, coef) + orig
  }

  object Second   extends TimeUnit("seconds",    1.0)
  object Minute   extends TimeUnit("minutes",    1.0 / 60)
  object Hour     extends TimeUnit("hours",      1.0 / 3600)

  abstract class LengthUnit(name: String, coef: Double) {
    def apply(value: Double) = new Length(value, name, coef)
    def apply(orig: Length) = new Length(0, name, coef) + orig
  }

  object Meter      extends LengthUnit("meters",      1.0)
  object Inch       extends LengthUnit("inches",      1.0 / .0254)
  object Foot       extends LengthUnit("feet",        1.0 / .0254 / 12)

And here’s what it looks like in the interpreter:

scala> val length1 = Meter(3)
length1: Length = 3.0 meters

scala> val length2 = Foot(4.5)
length2: Length = 4.5 feet

scala> length1 + length2
res0: Length = 4.3716 meters

scala> length2 + length1
res1: Length = 14.34251968503937 feet

scala> Inch(length1 + length2)
res2: Length = 172.11023622047244 inches

scala> val time1 = Second(90)
time1: Time = 90.0 seconds

scala> val time2 = Hour(.75)
time2: Time = 0.75 hours

scala> Minute(time2)
res3: Time = 45.0 minutes

scala> time1 + length1
<console>:14: error: type mismatch;
 found   : Length
 required: Dimension[Time]
       time1 + length1
               ^
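
The numbers in this session can be checked by hand with the formula from Dimension.+. The sketch below is a stand-alone restatement of that arithmetic; ConversionSketch and the add helper are names I made up for the check and are not part of the library.

```scala
object ConversionSketch {
  // coef is the number of units per meter, as in the LengthUnit objects.
  val meterCoef: Double = 1.0
  val footCoef: Double = 1.0 / 0.0254 / 12

  // The same formula Dimension.+ uses: convert the right operand into the
  // left operand's unit, then add.
  def add(value: Double, coef: Double, otherValue: Double, otherCoef: Double): Double =
    value + coef * otherValue / otherCoef

  // 3 meters + 4.5 feet, expressed in meters (about 4.3716).
  val metersPlusFeet: Double = add(3.0, meterCoef, 4.5, footCoef)

  // 4.5 feet + 3 meters, expressed in feet (about 14.3425).
  val feetPlusMeters: Double = add(4.5, footCoef, 3.0, meterCoef)
}
```

The result always takes the unit of the left operand, which is why the two sums print differently even though they describe the same length.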

Don’t forget to subscribe to my RSS feed, or follow this blog on Twitter.
Copyright © 2009 Matthew Jason Malone

I noticed some strange behavior in some Scala code recently. It was rather a mystery. I looked for my error and googled for a solution for the longest time with no success. Eventually I got my answer from the Scala mailing list / Nabble forum. Here’s the class that was causing the trouble.

class ArrayWrapper[A](length: Int) {
  private val array = new Array[A](length)
  def apply(x: Int) = array(x)
  def update(x: Int, value: A) = array(x) = value
  override def toString(): String = array.toString
}

The first thing you’ll notice about this class is that it is extremely simple! There aren’t a lot of moving parts. It’s a simple wrapper that exposes 3 basic array behaviors: apply (a ‘getter’), update (a ‘putter’), and good ol’ toString. Arrays in Scala take a type parameter, and to ensure that this class could wrap an array of any type I used a type parameter, too. Have a good look at the class and make sure you understand how it works. It won’t take long.

How do you expect this class to behave? Let’s play a little fill-in-the-blanks. Here is a Scala interpreter session with some results blanked out.

scala> class ArrayWrapper[A](length: Int) {
     |   private val array = new Array[A](length)
     |   def apply(x: Int) = array(x)
     |   def update(x: Int, value: A) = array(x) = value
     |   override def toString(): String = array.toString
     | }
defined class ArrayWrapper

scala> val a = new ArrayWrapper[Int](5)
??????????????

scala> val x = a(0)
??????????????

scala> x.toString
??????????????

scala> a(0).toString
??????????????

scala> a(0) = 0

scala> a.toString
??????????????

scala> a(0).toString
??????????????

scala>

There are 6 blanks. What do you expect to see in each of those? Well, the first blank follows the creation of a new ArrayWrapper[Int] and its assignment to a val ‘a’. So, according to our overriding definition of toString, it is simply the result of the underlying Array’s toString method. I know from experience how a brand new Array of Ints looks. It looks like this:

scala> new Array[Int](5)
res1: Array[Int] = Array(0, 0, 0, 0, 0)

So that’s what I expect to see here. Anyone expect something different? Here’s what I actually saw in the first blank:

scala> val a = new ArrayWrapper[Int](5)
a: ArrayWrapper[Int] = Array(null, null, null, null, null)

Hmm. That’s not what I expected. Did you predict this? Why is this array full of nulls when a new Array[Int] is usually full of zeros? I was stumped. The array is parameterized, I reasoned, so maybe type erasure was involved. That doesn’t make sense, though. No types should be erased at this point.

Let’s look at the next two commands. I called a(0) (the apply method) and assigned the result to x. I then called the toString method on x. What do you expect from these two commands? I would have expected a(0) to return 0 and x.toString to return “0”, but my conviction is shaken by that last result. Will a(0) return null? Will x.toString throw a NullPointerException? Decide what you predict will happen. Here’s the actual result:

scala> val x = a(0)
x: Int = 0

scala> x.toString
res0: java.lang.String = 0

Each line behaves in the “correct” way even though we saw all those nulls in the underlying array. That’s good news, I suppose. Maybe the problem is limited to Array’s toString method. It should be smooth sailing now. Let’s now look at the next command, in which we call a(0).toString. It’s just combining the operations (apply and toString) from the previous two commands without storing the intermediate result in ‘x’. I expected that to return String “0”. You can probably guess by now that what I expected is not what I got. Make your own prediction before you read the actual result below. What will happen when we call a(0).toString?

scala> a(0).toString
java.lang.NullPointerException
       at .<init>(<console>:7)
       at .<clinit>(<console>)
       at RequestResult$.<init>(<console>:3)
       at RequestResult$.<clinit>(<console>)
       at RequestResult$result(<console>)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.i...

Ouch! NullPointerException! This is an unpleasant surprise. The call a(0) returned a zero earlier, and calling toString on that zero returned a String “0”. But now we get this disaster. I’m getting more and more confused. Do you have an explanation for this crazy behavior yet?

Moving along, the next command assigns 0 to a(0). Remember that a(0) returned 0 earlier. By the way, behind the scenes the line “a(0) = 0” doesn’t call the apply method, but the ‘update’ method. It succeeds. After that we call a.toString and a(0).toString. What will happen in each case? At this point, it’s anybody’s guess. The behavior has been so wacky I can’t even make a sensible prediction. Make a guess of your own, if you dare, and observe the actual result below:

scala> a(0) = 0

scala> a.toString
res3: String = Array(0, null, null, null, null)

scala> a(0).toString
res4: java.lang.String = 0

The underlying Array now appears to contain a zero in addition to the nulls. Also, the a(0).toString, which was throwing a NullPointerException earlier, is now succeeding.

As I say, I puzzled over this problem for some time. I wanted to blame the issue on type erasure in the parameterized type, but that explanation didn’t make sense. I posted a question to the Scala forum on Nabble and got a response back in short order from Daniel Sobral.

The culprit? Drumroll…

Boxing. Well, boxing, unboxing, and a peculiarity of parameterized types. To review, here is our ArrayWrapper class:

class ArrayWrapper[A](length: Int) {
  private val array = new Array[A](length)
  def apply(x: Int) = array(x)
  def update(x: Int, value: A) = array(x) = value
  override def toString(): String = array.toString
}

We declared ‘array’ to be an Array[A], which is to say an Array of who-knows-what. When the Array is defined in this way, with a type parameter of unknown type, the Array must be an array of object references! It cannot be an array of Java int primitives. That’s the peculiarity of parameterized types. That’s why the default values for the members of the array were null instead of 0. The underlying array is actually an array of java.lang.Integer objects.

When we ran ‘val x = a(0)’, Scala retrieved the value at index 0 which was null. The apply method has Int return type in our example, and null is not a legal value of an Int. Int is Scala’s version of the Java int primitive type. So the null was converted (unboxed) to Int value 0. Then it could be stored in val x, etc. Once it’s safely unboxed, it behaves like a normal Scala Int value.

So, why did a(0).toString not work? Shouldn’t the null returned from a(0) be unboxed to Int 0, then re-boxed for the toString call? Apparently it doesn’t work that way. The unboxing hasn’t happened at the time the toString call is executed, so that toString is called on the null, giving us the NullPointerException. I don’t know whether this behavior is imposed by the JVM or the Scala language. Either way, it seems to me like a violation of the Principle of Least Astonishment and an opportunity for improvement.
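
The two behaviors can be seen side by side without any array at all. This is a minimal sketch, assuming a JVM-based Scala; UnboxingSketch and its members are hypothetical names, and it does not reproduce the ArrayWrapper itself, only the difference between unboxing a null and calling a method on it.

```scala
import scala.util.Try

object UnboxingSketch {
  // A null reference whose static type says nothing about Int.
  val ref: Any = null

  // Unboxing a null reference to Int yields 0 rather than failing.
  val asInt: Int = ref.asInstanceOf[Int]

  // Calling a method on the reference itself dereferences null and throws.
  val toStringFails: Boolean = Try(ref.toString).isFailure
}
```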

Once we call a(0) = 0, then the underlying array is populated with a boxed version of 0, which is to say an instance of java.lang.Integer. After it’s populated with a non-null it works normally.

Again, this only happens for Arrays with parameterized types. If we make ArrayWrapper non-parameterized and declare ‘array’ as an Array[Int] then the problem goes away.

scala> class ArrayWrapper(length: Int) {
     |   private val array = new Array[Int](length)
     |   def apply(x: Int) = array(x)
     |   def update(x: Int, value: Int) = array(x) = value
     |   override def toString(): String = array.toString
     | }
defined class ArrayWrapper

scala> val a = new ArrayWrapper(5)
a: ArrayWrapper = Array(0, 0, 0, 0, 0)

scala> a(0).toString
res0: java.lang.String = 0
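
If the wrapper has to stay generic, another option is to give the compiler enough information to build the right kind of array. The sketch below assumes Scala 2.10 or later, where scala.reflect.ClassTag is available; it is a variation on my class rather than anything suggested on the mailing list. ClassTagSketch is a made-up name, and toString uses mkString so the output does not depend on how Array prints itself.

```scala
import scala.reflect.ClassTag

// Assumes Scala 2.10 or later, where scala.reflect.ClassTag is available.
class ArrayWrapper[A: ClassTag](length: Int) {
  // With a ClassTag in scope, new Array[A] can build a primitive array when A is Int.
  private val array = new Array[A](length)
  def apply(x: Int): A = array(x)
  def update(x: Int, value: A): Unit = { array(x) = value }
  override def toString(): String = array.mkString("Array(", ", ", ")")
}

object ClassTagSketch {
  val a = new ArrayWrapper[Int](5)
  // The elements start out as zeros, not nulls.
  val before: String = a.toString
  // Reading an untouched element and calling toString on it is safe.
  val first: String = a(0).toString
}
```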

There are a few lessons in all this for the Scala developer:

  • Be vigilant about Array initialization. Initialize them explicitly, especially when dealing with primitives like Int, Long, Float, Double, Byte and Char. Don’t trust the default values.
  • Beware parameterized Arrays. They are flawed. Consider specifying their type or using another collection instead, such as a List or Map which can’t contain un-initialized values.
  • Unit test all your code, even those parts that look too simple to screw up.
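
As a sketch of the first two lessons, here is a hypothetical variant of the wrapper that makes the caller supply the initial value and stores the elements in an ArrayBuffer instead of an Array. FilledWrapper and FilledWrapperSketch are names I invented for the example.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical variant of ArrayWrapper: the caller must supply the initial value.
class FilledWrapper[A](length: Int, initial: A) {
  // Every slot is explicitly initialized, so there are no default values to trust.
  private val items = ArrayBuffer.fill(length)(initial)
  def apply(x: Int): A = items(x)
  def update(x: Int, value: A): Unit = { items(x) = value }
  override def toString(): String = items.mkString("FilledWrapper(", ", ", ")")
}

object FilledWrapperSketch {
  val w = new FilledWrapper[Int](3, 0)
  val initially: String = w.toString
}
```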
