As I’ve marveled before, the Scala type system is extremely expressive.  In this post, I want to look at one particular aspect of the type system.  I want to look at a few of the ways to pass behaviors into functions.  Using classes, traits, function types, and a construct that looks like an ad hoc interface (and probably other ways I haven’t learned yet) you can exercise precise control over what units of behavior your functions will accept and use.

First, let’s define a zoo of classes that we will experiment on:

trait TestTrait {
  def func(s1: String): Unit
}

class TestClass1 {
  def func(s1: String): Unit = println("TestClass1: "+s1)
}

class TestClass2 {
  def func(s1: String): Int = { println("TestClass2: "+s1); 999; }
}

class TestClass3 extends TestTrait {
  def func(s1: String) = println("TestClass3: "+s1)
}

class TestClass4 {
  def func(a1: Any) = println("TestClass4: "+a1)
}

class TestClass5 {
  def otherFunc(s1: String) = println("TestClass5: "+s1)
}

class TestClass6 {
  def wrongFunc(list: List[Int]) = println("TestClass6: "+list)
}

There.  These are very simple, but I’ll just point out the salient points of each:

  • TestTrait – Basically, an interface with one method.
  • TestClass1 – One method, func, takes a String and returns Unit (void, to Java coders).
  • TestClass2 – Same as TestClass1, but returns Int.
  • TestClass3 – Same as TestClass1, but also extends TestTrait.
  • TestClass4 – Same as TestClass1, but takes an Any parameter.
  • TestClass5 – Same as TestClass1, but its function has a different name
  • TestClass6 – One method, different name, different parameter type, does notextend TestTrait.

Each of these (except TestTrait) defines some behavior that we might want to pass into a function.  That behavior is trivial in these classes, but I’m trying to keep the examples short.  Let’s now define several functions that might be able to use some of this behavior.

def test1(fn : (String) => Unit) = { fn("test1"); }

def test2(fn : (String) => Int)  = { fn("test2"); }

def test3(obj: { def func(x: String): Unit }) = { obj.func("test3") }

def test4(obj: TestClass1) = { obj.func("test4") }

def test5(obj: TestTrait)  = { obj.func("test5") }

def test6(obj: { def func(x: Any): Unit }) = { obj.func("test6") }

Now we just need to instantiate each one class and start passing parameters!  Create an instance of each class, TestClass1 to TestClass6, and name them tc1 to tc6.  The pass each instance’s function into test1 and test2, and pass the instance itself into test3 through test6.  Let’s do this in the Scala interpreter, rather than compiling.  Like so:

val tc1 = new TestClass1
val tc2 = new TestClass2
val tc3 = new TestClass3
val tc4 = new TestClass4
val tc5 = new TestClass5
val tc6 = new TestClass6

test1(tc1.func)
test1(tc2.func)
test1(tc3.func)
test1(tc4.func)
test1(tc5.otherFunc)
test1(tc6.wrongFunc)

test2(tc1.func)
test2(tc2.func)
test2(tc3.func)
test2(tc4.func)
test2(tc5.otherFunc)
test2(tc6.wrongFunc)

test3(tc1)
test3(tc2)
test3(tc3)
test3(tc4)
test3(tc5)
test3(tc6)

test4(tc1)
test4(tc2)
test4(tc3)
test4(tc4)
test4(tc5)
test4(tc6)

test5(tc1)
test5(tc2)
test5(tc3)
test5(tc4)
test5(tc5)
test5(tc6)

test6(tc1)
test6(tc2)
test6(tc3)
test6(tc4)
test6(tc5)
test6(tc6)

Did you run that?  It didn’t work, did it?  At least not all of it.  Now we we have something to discuss.  Here’s a chart of what did and didn’t work:

test1 test2 test3 test4 test5 test6
tc1 OK FAIL OK OK FAIL FAIL
tc2 OK OK FAIL FAIL FAIL FAIL
tc3 OK FAIL OK FAIL OK FAIL
tc4 OK FAIL FAIL FAIL FAIL OK
tc5 OK FAIL FAIL FAIL FAIL FAIL
tc6 FAIL FAIL FAIL FAIL FAIL FAIL

Taking these results one test function at a time:

test1(fn : (String) => Unit)

All but one of the test objects has at least one member function that can be passed in to this function.  It’s easy to see why this works for tc1, tc3, and tc5.  They each have a function that matches the type of parameter fn exactly.  It’s also easy to see why it doesn’t work for tc6.  TestClass6.wrongFunc takes a List[Int] parameter, which is unrelated to fn’s expected String parameter.

But what about tc2 and tc4?  TestClass2.func takes a String parameter, just like the test1′s fn parameter, but it returns an Int instead of Unit.  But it still works.  Instead of requiring a Unit return type for fn, Scala allows functions that return values.  It simply throws away any returned value and treats the passed in function as though it returned Unit.  Scala’s typesafety requirements in this case are very permissive!  This can be convenient, but be aware that you have very little control of what functions can be passed in when you use this kind of parameter type.

TestClass4.func returns Unit, but it takes a parameter of type Any.  Since test1 knows parameter fn as a function with a String parameter, test1 will only ever pass Strings into fn.  A String “is-a” Any so Scala says it’s ok to pass Strings into TestClass4.func.

This all goes back to variance.  In Scala, functions are contravariant with respect to parameters and covariant with respect to return type.  If you’re not familiar with variance, this means that if function f2′s parameters are simple supertypes (simple, meaning a supertype or the same type) of function f1′s parameters, and if f2′s return type is a simple subtype of f1′s return type then f2 is a subtype of f1 and can be used anywhere f1′s type is required.  The makes sense.  If a function takes type X, then you can pass in any value of type X or a subtype of X.  If a function returns type Y, then the result can be treated as Y or any supertype of Y.  Because Unit is a subtype of Int (and of all reference types) and String is a simple supertype of String, the function type “(String) => Int” is a subtype of “(String) => Unit” and can be used wherever a “(String) => Unit” is required.

test2(fn : (String) => Int)

This one only works for tc2.  TestClass2.func matches the type of parameter fn exactly.  But none of the other functions can be used here.  This is because none of the other functions take a parameter that is a simple subtype of String and return a simple supertype of Int, none of the others will work here.

A quick note about the use of functions in this way:  If you have an instance and pass a member function of that instance as a parameter, the function still has access to that instance’s data.  Moreover, if the passed function changes the state of the object of which it is a member, then function that receives the function as a parameter can mutate the object.  I don’t think I’m being clear.  Here’s some code to demonstrate what I mean.

class TestClass(name: String) {
  var memberStr: String = ""
  def func(str: String) = { memberStr += str }
  override def toString = name + ": " + memberStr
}

val testA = new TestClass("A")
val testB = new TestClass("B")

def test(fn : (String) => Unit, num: Int): Unit = {
  if (num > 0) { fn("X"); test(fn,num-1); }
}

test(testA.func, 5)
test(testB.func, 2)

println(testA + ", " + testB)

See?  Function test just takes a function as a parameter (and an Int).  It does NOT take a TestClass as a parameter, but it’s still able to alter those two instances of TestClass by way of the mutator member function TestClass.func.  So be aware that when you declare a function parameter like this, you never know whether you’re getting a bare function, or a member function.

test3(obj: { def func(x: String): Unit })

Here’s a construct that’s much more strict than the function types used in test1 and test2.  The test3 function take an ad-hoc interface as a parameter.  It’s not a named interface (a Trait), so the parameter’s type doesn’t have to be declared as extending anything.  The parameter is accepted if it has a member function of the given name and type.

Now, in this case, remember, the parameter isn’t the function, it’s the object.  So the permissive behavior that Scala allows in test1 and test2 because of the variance behavior of function types does not apply here.  An object of type A has a function funcA, and object of type B has a function funcB whose type is a subtype of funcA, that does NOT make B a subtype of A.  The names and types of the members declared in the ad-hoc interface must match the parameter exactly.  The parameter can have additional members, but it must at least provide all the members listed in the ad-hoc interface.

In this case, TestClass1 and TestClass3 are the only classes that have functions named func that take a String parameter and return Unit.  The other classes have either the wrong function name, parameter type, or return type.

test4(obj: TestClass1)

This is an easy one.  You know what’s happening here.  Function test4 states explicitly that it requires an object of type TestClass1.  Only a TestClass1 or a subtype of TestClass1 will do.

test5(obj: TestTrait)

Function test5 expects a TestTrait parameter.  Any object that has the trait TestTrait can be passed in.  In this case, TestClass3 is the only class that extends TestTrait.  We could have declared TestCase1 as extending TestTrait, but we didn’t.  Therefor, it doesn’t matter that TestClass1 has a function with the right name, the right parameter types, and the right return type.  It isn’t explicitly defined as extending TestTrait, so it isn’t allowed.

test6(obj: { def func(x: Any): Unit })

This is really just a more extreme example of the phenomenon demonstrated in test3.  This is another ad-hoc interface.  That member function is the supertype of ALL functions that have one parameter and return a reference type.  ALL of them.  You see why?  Any is the superclass of every type and Unit is a subclass of every reference type.  Function types are contravariate with respect to their parameters and covariant with respect to their return types.  But, of course, only our instance of TestClass4 is accepted because this parameter doesn’t have a function type, but an ad-hoc interface type.  It is not the case that all types are covariant with respect to the types of their member functions.  TestClass4 is the only one that has a function with the right name, parameter type, and return type.

Variance in Java

If you’re a Java developer then you probably know a thing or two about subtyping. B is a subtype of A if B extends or implements A (I’ll use this convention throughout this post). A is the supertype, B is the subtype. But what about arrays? Or generic collections? Is an array of B a subtype of an array of A? Is List<B> a subtype of List<A>? Let’s do some experiments:

Object testObj = null;

String[] arrayB = { "a", "b", "c" };
Object[] arrayA = arrayB;
testObj = arrayA[0];

List<String> listB = new ArrayList<String>();
listB.add("123");
List<Object> listA = (List)listB;
testObj = listA.get(0);

List<Double> listC = new ArrayList<Double>();
listC.add(new Double(10.0));
listB = (List)listC;
System.out.println(listB.get(0));

I’ve left out the boilerplate for brevity. This code compiles and it runs fine, mostly. This tells us three things. First, Java treats an array of B as a subtype of an array of A. We know this because we can use a reference to an array of Objects to refer to an array of Strings. This means an array of Strings “is-a” array of Objects.

Second, we know that an ArrayList<String> is a subtype of a List<String>. This makes sense, because ArrayList implements interface List. An ArrayList<String> “is-a” List<String>. We can even finagle the List<String> into a List<Object>, but we have to make an explicit cast.

Third, we see a ClassCastException during the call to listA.get(0). We use the explicit cast again to assign a List<Double> to a reference to List<String>. The compiler allows this! This kind of sloppy typing is one of the main complaints of Java generics’ detractors (this link includes an example of an erroneous assignment that doesn’t even require an explicit cast!). Now, let’s look at some of the things Java doesn’t allow.

Object[] arrayA = { new Object() };
String[] arrayB = arrayA;

List<String> listB = new ArrayList<String>();
List<Object> listA = listB;

This causes two compiler errors. The first occurs when we try to assign an array of Object to a reference to an array of String. This is sensible. An array of Objects could contain any object: Strings, Doubles, Threads, anything. We can’t treat such an array as an array of Strings.

The second error occurs when we try to assign a list of Strings to a reference to a List of Objects. This is the same code as in the previous snippet, but without the explicit cast. The compiler doesn’t allow this. Since lists are mutable (we can add and remove items) we could add non-String members to listA, which means those non-String members would also be in listB. But if we don’t add any items to listA, the assignment is perfectly safe. Java can’t figure out in all cases when it’s safe to do an assignment and when it is not.

Why do Java generics work the way they do? I think it’s mainly due to two factors: backward compatibility, promoting adoption of the feature. When generics were introduced, there were already millions of lines of code out there that depend on regular, non-generic, mutable collections. To make the new code compatible with legacy code, the type parameters are erased during compilation, and allowing those dangerous casts lets developers work around the fact that List<B> is not a subtype of List<A>. To make it easier to use generics in new code and convert to non-generics for integration with old code, the compiler rules were made fairly permissive.

Variance Terminology

If you’re not familiar with the term “variance”, here’s what it means with respect to Java. Java arrays are covariant. That means that an array of B is a subtype of an array of A, provided that B is a subtype of A. The type-subtype relationship of the arrays follows the relationship of the contents. Lists in Java are invariant (some say nonvariant). A List of Strings has no relationship to a List of Objects. You can explicitly cast a List of Strings to a List of Objects but you can force the conversion the opposite direction, too, (Yuck.) so that doesn’t count. Now consider some hypothetical generic class X such that X<A> is a subtype of X<B> if B is a subtype of A. That’s the opposite of the way arrays work. The hypothetical generic X is contravariant.

One more detail of terminology: A type class could theoretically be covariant with respect to one type parameter, and contravariant with respect to another (not in Java, just theoretically). So, say you have a generic class X that is covariant with respect to its first type parameter and contravariant with respect to its second. So X<B,I> is a subtype of X<A,J> only if B is a subtype of A and J is a subtype of I. Weird, huh? This actually happens in Scala.

Variance in Scala

In Scala, variance is not left to chance. There are very strict rules. Variance with respect to type parameters is spelled out explicitly for each class (or trait). The same conventions are used for Arrays, Lists, or any generic class! The variance system (indeed, the whole type system) in Scala is more complicated and has a steeper learning curve than in Java, but it affords you the ability to write very expressive code that behaves in a more intuitive fashion. Here are three generic classes that use each of the variance types.

class InVar[T]     { override def toString = "InVar" }
class CoVar[+T]     { override def toString = "CoVar" }
class ContraVar[-T] { override def toString = "ContraVar" }
/************ Regular Assignment ************/
val test1: InVar[String] = new InVar[String]
val test2: CoVar[String] = new CoVar[String]
val test3: ContraVar[String] = new ContraVar[String]

The ‘+’ denotes covariance with respect to the type parameter, and ‘-’ denotes contravariance. The class is invariant with respect to type parameters without a plus or minus. If you run this code you can see that when the type parameters are the same on both sides, the assignments work fine. Now, let’s see what happens when we test assignment for different type parameters.

scala> /************ Invariant Subtyping ************/

scala> val test1: InVar[String] = new InVar[AnyRef]
<console>:5: error: type mismatch;
 found   : InVar[AnyRef]
 required: InVar[String]
       val test1: InVar[String] = new InVar[AnyRef]
                                   ^

scala> val test2: InVar[AnyRef] = new InVar[String]
<console>:5: error: type mismatch;
 found   : InVar[String]
 required: InVar[AnyRef]
       val test2: InVar[AnyRef] = new InVar[String]
                                   ^

scala> /************ Covariant Subtyping ************/

scala> val test3: CoVar[String] = new CoVar[AnyRef]
<console>:5: error: type mismatch;
 found   : CoVar[AnyRef]
 required: CoVar[String]
       val test3: CoVar[String] = new CoVar[AnyRef]
                                  ^

scala> val test4: CoVar[AnyRef] = new CoVar[String]
test4: CoVar[AnyRef] = CoVar

scala> /************ Contravariant Subtyping ************/

scala> val test5: ContraVar[String] = new ContraVar[AnyRef]
test5: ContraVar[String] = ContraVar

scala> val test6: ContraVar[AnyRef] = new ContraVar[String]
<console>:5: error: type mismatch;
 found   : ContraVar[String]
 required: ContraVar[AnyRef]
       val test6: ContraVar[AnyRef] = new ContraVar[String]
                                      ^

Now you can see the difference in the three classes. The invariant class doesn’t allow assignment in either direction, regardless of whether their type parameters have a subtype relationship. The covariant class allows an assignment from subtype to supertype. String is a subtype of AnyRef, so CoVar[String] is a subtype of CoVar[AnyRef]. The contravariant class allows an assignment from supertype to subtype. String, again, is a subtype of AnyRef, so ContraVar[AnyRef] is a subtype of ContraVar[String].

So, it’s as simple as that, right? Sorry. There’s a little more to it. Once you’ve declared a type parameter as covariant or contravariant there are some restrictions on where this type can be used. Why? Scala is not a purely functional langage in that it allows objects to alter their internal state. It allows mutability. Mutability throws a monkey wrench into variance. Say, for example you had a Scala implementation of a linked list like so:

class LinkedList[+A] {
  private var next: LinkedList[A] = null
  def add(item: A): Unit = { ... }
  def get(index: Int): A = { ... }
}

val strList = new LinkedList[String]
strList.add("str1")
val anyList: LinkedList[Any] = strList
anyList.add(new Double(1.0))
val str: String = strList.get(1)

This code won’t compile. Do you see the problem? This code, if it worked, would allow us to create a list of Strings and then add a Double to that list! If we allow that then there’s a disaster when we get to the last line. A LinkedList of Strings returns a Double. That’s no good. That’s the same problem as we saw in Java. Scala nips this sort of code in the bud by disallowing covariant types in certain places including member function parameter types. Places where covariant types are allowed and contravariant types are forbidden are called covariant positions. And the reverse is true, too. If contravariant types are allowed and covariant types are forbidden in some position, this is called a contravariant position.

The above code causes a compiler error for the add method. You can use covariant types in most other places including constructor parameter types, member val types, and method return types. You can also use covariant types as type parameters, but only where covariant types themselves are allowed. This means, in this example, you could add a method that returns a Set[A], but you could not add a method that takes a Set[A] as a parameter, because you could use that passed-in Set[A] to alter the state.

Contravariant types can be used as constructor parameter types, member function parameters types, and as type parameters in each of those positions.

Here’s a fun exercise for the reader. Experiment with using type parameters of the different kinds of variances in different positions. Below is some example code to get you started. For each usage that fails compilation, why is it not allowed? Can you think of a way that such a usage could cause an inconsistency (such as allowing a Double in a String collection, for example)?

class InVar[T](param1: T) {
  def method1(param2: T) = { }
  def method2: T = { param1 }
  def method3: List[T] = { List[T](param1) }
  def method4[U >: T]: List[U] = { List[U](param1) }
  val val1: T = method2
  val val2: Any = param1
  var var1: T = method2
  var var2: Any = param1
}

class CoVar[+T](param1: T) {
  def method1(param2: T) = { }
  def method2: T = { param1 }
  def method3: List[T] = { List[T](param1) }
  def method4[U >: T]: List[U] = { List[U](param1) }
  val val1: T = method2
  val val2: Any = param1
  var var1: T = method2
  var var2: Any = param1
}

class ContraVar[-T](param1: T) {
  def method1(param2: T) = { }
  def method2: T = { param1 }
  def method3: List[T] = { List[T](param1) }
  def method4[U >: T]: List[U] = { List[U](param1) }
  val val1: T = method2
  val val2: Any = param1
  var var1: T = method2
  var var2: Any = param1
}

Maybe in a future post, I’ll do a more thorough analysis of all the places type parameters of the different variances can be used. If you’d be interested in reading such a thing, please do leave a comment.

Conclusion

What was the point of all that? We still don’t get the mutable, covariant collections we were hoping for. But we do get two things we don’t get in Java. We get invariant, typesafe, mutable collections that won’t wind up holding objects of the wrong type, and we get covariant, typesafe, immutable collections. If you’re new to functional programming (like I am), your first thought might be, “What good is an immutable list? Or an immutable array? If it’s immutable, I can’t add any items to it, right?”

Take the Scala List as an example. It’s declared as a “class List[+A]” and it has a method for adding new items that’s declared “def + [B >: A](x : B) : List[B]“. So List is covariant with respect to its type parameter A. That’s great! So a List[BigInt] “is-a” List[Number] and it “is-a” List[Object]. To add items to a List use the ‘+’ function. It doesn’t change the List it was called on, but it does return a new List.

Also, you can add anything to a List. What you add affects what gets returned. Look at the function definition again: “def + [B >: A](x : B) : List[B]“. It take a parameter of type B, where B is any supertype of A (or A itself), and it returns a List[B]. Let’s consider the simple case, and then something more complex.

scala> var strList = List[String]("abc")
strList: List[String] = List(abc)

scala> strList = strList + "xyz"
strList: List[String] = List(abc, xyz)

scala> var objList = strList + new Object()
objList: List[java.lang.Object] = List(abc, xyz, java.lang.Object@156ee8e)

scala> var anyList = strList + 3.1416
anyList: List[Any] = List(abc, xyz, 3.1416)

First we create a List[String] called strList containing one item. Then we add a second String and store the resulting List back in the strList variable. Then we call the ‘+’ function with an Object parameter. This is allowed because the parameter must have type B where B is a supertype of A, and Object is indeed a superclass of String. The call to ‘+’ returns a List[Object] which contains both the Strings and the Object. Simple.

Then we call the ‘+’ method on testList again. Remember, strList still just contains the two Strings because it’s immutable and we didn’t assign the last result back to strList. We couldn’t have. The result was a List[Object], not a List[String], and List is covariant, not contravariant. This time we supply a parameter of type Double. But Double isn’t even a supertype of String! That’s ok. Scala determines the nearest common ancestor, the type Any, and uses that as B. The ‘+’ function returms a List[Any] that contains the Strings and the Double. That’s handy. And covariant and typesafe.

The way Scala deals with functions is pretty interesting. If you want to use them as you would use Java functions then they’re not that complicated. You have to learn the syntax, a little about the Scala type system, and bada-bing, you’re in business. But if you start exploring the way types are implemented in Scala you find some interesting stuff.

First, we’ll briefly describe the basics.  Here are a few very simple Scala function definitions and invocations in the scala interpreter:

scala> def method1() = { println("method1") }
method1: ()Unit

scala> def method2(str: String) = { println("method2: " + str) }
method2: (String)Unit

scala> def method3(str: String): Int = {
     |   println("method3: " + str); str.length;
     | }
method3: (String)Int

scala> def method4(f: (String) => Int) = {
     |   printf("method4: " + f("method4"))
     | }
method4: ((String) => Int)Unit

scala> method1
method1

scala> method2("abc")
method2: abc

scala> method3("abcdefg")
method3: abcdefg
res13: Int = 7

scala> method4(method3)
method3: method4
method4: 7

Very basic:  method1 takes no parameters and returns nothing, method2 takes a single parameter of type String and returns nothing, method3 takes a String parameter and returns an Int, and method4 takes a parameter of type “function that takes a String parameter and returns Int” and returns nothing.

Why are we able to declare functions like this in Scala?  Didn’t I read somewhere that Scala is very object oriented?  Didn’t I read that everything is an object?  Why do we have these bare naked functions defined outside of objects?  The reason is that in Scala, everything really is an object, even functions!  That method1 we defined?  That’s an object.  When we type “def method1() = {…}” we actually declared an instance of a special class.  I’ll declare method1 again, but with the underlying object exposed:

scala> val method1 = new Function0[Unit] {
     |   def apply: Unit = { println("method1") }
     | }
method1: java.lang.Object with () => Unit = <function>

scala> method1
res1: java.lang.Object with () => Unit = <function>

scala> method1.apply
method1

scala> method1()
method1

We instantiate an instance of trait Function0[Unit] and implement its one abstract method, called apply, and assign it to a val named method1.  Now you can see method1 is actually just a plain old Scala object.  When we type in “method1″ and hit enter, the interpreter just tells us the resulting value of the statement which is an Object with trait Function0.  Hmm, that didn’t work.  Next we try calling the apply method on the object.  That works!  But it’s just a regular call to a member method.  But when we type “method1()” then Scala knows that we want to use this object as a function, and that we’re not refering to the object itself.  When you declare a function using “def” Scala assumes that when you refer to the method you want the apply method invoked, and that you don’t want to return the function object.  Neat.

That Function0[Unit], by the way, defines a function that takes 0 parameters and returns Unit (which is to say nothing as in Java void (not to be confused with Nothing)).  If you want a function that takes two parameters, an Int and a String, and returns a List of Doubles, you would use Function2[Int, String, List[Double]].  So class FunctionX takes (X+1) type parameters, the first X of which define the function parameter types, and the last of which defines the return type.

So what if we go the other way?  What if we declare a method and then store it in a val?  In this case, Scala gets very picky.  Watch this:

scala> def method2 = { println("method2") }
method2: Unit

scala> val m2: () => Unit = method2
<console>:5: error: type mismatch;
 found   : Unit
 required: () => Unit
       val m2: () => Unit = method2
                            ^

scala> def method2() = { println("method2") }
method2: ()Unit

scala> val m2: () => Unit = method2
m2: () => Unit = <function>

scala> def method2 = { println("method2") }
method2: Unit

scala> val m2: () => Unit = method2 _
m2: () => Unit = <function>

Some strange stuff happens here.  First we just define a function called method2.  Nothing fancy.  Then we try to assign it to a val of type () => Unit.  It fails.  See the error message?  Found : Unit.  It parses it all wrong.  Scala thinks we’re trying to call method2 and assign the result to m2.  How can we set things straight?  Well, one way is to slightly change the way we define method2.  The only difference in the first and second definition is the addition of an empty parameter list, that empty pair parentheses.  For some reason, when we define the method in this apparently equivalent fashion, Scala rightly interprets our intentions and allows us to assign to m2.  There is another way, though.  In the third definition of method2, we’ve again removed the parentheses.  But this time we assign it successfully to val m2 by following method2 with an underscore.  The underscore just causes Scala to treat method2 as a Function0 object, rather than attempting to invoke it.

So a function is an object.  Who cares?  We’re still just calling functions.  Ah, but use your imagination.  You can do all kinds of tricks once you realize that a function is just an object.  For example:

scala> class TestClass {
     |   def f1(): Unit = { println("f1!!!"); func = f2 }
     |   def f2(): Unit = { println("f2!!!"); func = f3 }
     |   def f3(): Unit = { println("f3!!!"); func = f1 }
     |
     |   var func: () => Unit = f1
     |
     |   def test = { func() }
     | }
defined class TestClass

scala> val tc = new TestClass
tc: TestClass = TestClass@1eff71e

scala> tc.test
f1!!!

scala> tc.test
f2!!!

scala> tc.test
f3!!!

scala> tc.test
f1!!!

See what’s happening here?  We can store a reference to a function object, call the function it refers to, and re-assign it.  So the method “test” actually calls a different function each time.

Can you guess why I added the test method instead of just calling func directly?  func is declared with the var keyword, so if I entered “tc.func” instead of “tc.func()” then the interpreter would think I was refering to the function object.  Just so there’s no confusion, I wrapped the call “func()” inside a regular def-defined function called test.

Let’s see, what other neat tricks can we do?  Here’s something interesting:

scala> def printAll(str1: String, str2: String, str3: String): Unit = {
     |   println( str1 + ":" + str2 + ":" + str3 )
     | }
printAll: (String,String,String)Unit

scala> def fillInStr1(func: (String,String,String) => Unit, str1: String): (String,String) => Unit = {
     |   new Function2[String,String,Unit] {
     |     def apply(str2: String, str3: String) = {
     |       func(str1, str2, str3)
     |     }
     |   }
     | }
fillInStr1: ((String, String, String) => Unit,String)(String, String) => Unit

scala> val newPrint = fillInStr1(printAll _, "test123")
newPrint: (String, String) => Unit = <function>

scala> newPrint("abc","xyz")
test123:abc:xyz

scala> newPrint("123","456")
test123:123:456

First, we define printAll.  It’s just a function that prints out its 3 string parameters.  The next method, fillInStr1, is the interesting one.  The method signature is kind of complex.  It takes 2 parameters, func and str1.  func is a function taking 3 String parameters and returning nothing.  str1 is just a String.  fillInStr1 returns a function taking 2 String parameters and returning nothing.

Inside fillInStr, it just creates an instance of Function2, a function that takes 2 parameters.  This function object is defined so that the apply method calls the func function and passes str1 as the first parameter.  The other two parameters are the parameters of the Function2′s apply method.  Do you see what it’s doing?  It’s taking a function on 3 strings, and transforming it into a function on only 2 strings.  We can call fillInStr1 by passing in printAll (note the underscore), and a string.  What we get back is a function that behaves just like printAll, except with the first parameter already filled in.  Neat trick!

In fact, this trick is so neat that it has a name and is actually built into the language.  This little demonstration is a very simple, non-generalized application of a concept called currying.  The Code Commit blog has an excellent article on function currying in Scala if you’d like to know more about it.

This, of course, isn’t all there is to functions.  There’s a lot more!  But now you know enough to go out there and start experimenting.  See what tricks you can do, what problems you can solve with Scala’s versatile and powerful function objects.

I like to include a copyright notice on my posts, a small one down on the lower right, and a couple of links for RSS and Twitter.  The problem is that I always go back and add it as an afterthough.  So I write a post, proof it, convince myself that it’s perfect, post it, and then I notice that I didn’t put that little footer at the end.

I use wordpress.com to host my site.  I looked around the admin console for something pertaining to post templates but didn’t find anything.  I googled a little to see whether such a feature exists, but I didn’t find anything.

Then it occurred to me.  Matt Malone, you handsome devil, aren’t you a professional software developer?  And don’t you have the gall to write a blog proclaiming yourself such?  Why don’t you write something yourself?  So I did.  I wrote a very simple greasemonkey script.  Here it is below.  Feel free to install the script if you use wordpress.com and think it would be useful.

// ==UserScript==
// @name           WordPress Post Template
// @namespace      oldfashionedsoftware.com
// @description    Inserts some template text in new blog posts
// @include        http://matthewmalone.wordpress.com/wp-admin/post-new.php
// ==/UserScript==
var postTextAreaList, postTextArea;
postTextAreaList = document.evaluate( "//textarea[@name='content']",
    document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
postTextArea = postTextAreaList.snapshotItem(0);
postTextArea.value = "Your template text here";

One of the main complaints you hear about the Scala language is that it’s too complicated compared to Java. The average developer will never be able to achieve a sufficient understanding of the type system, the functional programming idioms, etc. That’s the argument. To support this position, you’ll often hear it pointed out that Scala includes several notions of nothingness (Null, null, Nil, Nothing, None, and Unit) and that you have to know which one to use in each situation. I’ve read an argument like this more than once.

It’s not as bad as all that. Yes, each of those things is part of Scala, and yes, you have to use the right one in the right situation. But the situations are so wildly different it’s not hard to figure out once you know what each of these things mean.

Null and null

First, let’s tackle Null and null. Null is a trait, which (if you’re not familiar with traits) is sort of like an abstract class in Java. There exists exactly one instance of Null, and that is null. Not so hard. The literal null serves the same purpose as it does in Java. It is the value of a reference that is not refering to any object. So if you write a method that takes a parameter of type Null, you can only pass in two things: null itself or a reference of type Null. Observe:

scala> def tryit(thing: Null): Unit = { println("That worked!"); }
tryit: (Null)Unit

scala> tryit("hey")
<console>:6: error: type mismatch;
 found   : java.lang.String("hey")
 required: Null
       tryit("hey")
             ^

scala> val someRef: String = null
someRef: String = null

scala> tryit(someRef)
<console>:7: error: type mismatch;
 found   : String
 required: Null
       tryit(someRef)
             ^

scala> tryit(null)
That worked!

scala> val nullRef: Null = null
nullRef: Null = null

scala> tryit(nullRef)
That worked!

In line 4 we try to pass in a String, and of course that doesn’t work. Then in line 14 we try to pass in a null reference, but that doesn’t work either! Why? It’s a null reference to a String. It may be null at run-time, but compile-time type checking says this is a no-no.

But look at line 21. We can pass in the literal null. And in line 27 we pass in another null reference, but this one is actually of type Null. Notice that we initialized nullRef to null. That’s the only value to which we could have initialized it, because null is the sole instance of Null.

Nil

Nil is an easy one. Nil is an object that extends List[Nothing] (we’ll talk about Nothing next). It’s an empty list. Here’s some example code using Nil:

scala> Nil
res4: Nil.type = List()

scala> Nil.length
res5: Int = 0

scala> Nil + "ABC"
res6: List[java.lang.String] = List(ABC)

scala> Nil + Nil
res7: List[object Nil] = List(List())

See? It’s basically a constant encapsulating an empty list of anything. It’s has zero length. It doesn’t really represent ‘nothingness’ at all. It’s a thing, a List. There are just no contents.

Nothing

If any of these is a little difficult to get, it’s Nothing. Nothing is another trait. It extends class Any. Any is the root type of the entire Scala type system. An Any can refer to object types as well as values such as plain old integers or doubles. There are no instances of Nothing, but (here’s the tricky bit) Nothing is a subtype of everything. Nothing is a subtype of List, it’s a subtype of String, it’s a subtype of Int, it’s a subtype of YourOwnCustomClass.

Remember Nil? It’s a List[Nothing] and it’s empty. Since Nothing is a subtype of everything, Nil can be used as an empty List of Strings, an empty List of Ints, an empty List of Any. So Nothing is useful for defining base cases for collections or other classes that take type parameters. Here’s a snippet of a scala session:

scala> val emptyStringList: List[String] = List[Nothing]()
emptyStringList: List[String] = List()

scala> val emptyIntList: List[Int] = List[Nothing]()
emptyIntList: List[Int] = List()

scala> val emptyStringList: List[String] = List[Nothing]("abc")
<console>:4: error: type mismatch;
 found   : java.lang.String("abc")
 required: Nothing
       val emptyStringList: List[String] = List[Nothing]("abc")

On line 1 we assign a List[Nothing] to a reference to List[String]. A Nothing is a String, so this works. On line 4 we assign a List[Nothing] to a reference to List[Int]. A Nothing is also an Int, so this works too. A Nothing is a subtype of everything. But both of these List[Nothing] instances contain no members. What happens when we try to create a List[Nothing] containing a String and assign that List to a List[String] reference? It fails because although Nothing is a subtype of everything, it isn’t a superclass of anything and there are no instances of Nothing, including String “abc”. So any collection of Nothing must necessarily be empty.

One other use of Nothing is as a return type for methods that never return. It makes sense if you think about it. If a method’s return type is Nothing, and there exists absolutely no instance of Nothing, then such a method must never return.

None

When you’re writing a function in Java and run into a situation where you don’t have a useful value to return, what do you do? There are a few ways to handle it. You could return null, but this causes problems. If the caller isn’t expecting to get a null, he could be faced with a NullPointerException when he tries to use it, or else the caller must check for null. Some functions will definitely never return null, but some may. As a caller, you don’t know. There is a way to declare in the function signature that you might not be able to return a good value, the throws keyword. But there is a cost associated with try/catch blocks, and you usually want to reserve the use of exceptions for truly exceptional situations, not just to signify an ordinary no-result situation.

Scala has a built-in solution to this problem. If you want to return a String, for example, but you know that you may not be able to return a sensible value you can return an Option[String]. Here’s a simple example.

scala> def getAStringMaybe(num: Int): Option[String] = {
     |   if ( num >= 0 ) Some("A positive number!")
     |   else None // A number less than 0?  Impossible!
     | }

getAStringMaybe: (Int)Option[String]

scala> def printResult(num: Int) = {
     |   getAStringMaybe(num) match {
     |     case Some(str) => println(str)
     |     case None => println("No string!")
     |   }
     | }
printResult: (Int)Unit

scala> printResult(100)
A positive number!

scala> printResult(-50)
No string!

The method getAStringMaybe returns Option[String]. Option is an abstract class with exactly two subclasses, class Some and object None. Those are the only two ways to instantiate an Option. So getAStringMaybe returns either a Some[String] or None. Some and None are case classes, so you can use the handy match/case construct to handle the result. None is object that signifies no result from the method.

The purpose of an Option[T] return type is to tell callers that the method might return a T in the form of a Some[T], or it might return None to signify no result. This way, the caller supposedly knows when he does and does not need to check for a good return value.

On the other hand, just because a method is declared as returning some non-Option type doesn’t mean it can’t return null. Moreover, a method declared as returning Option can, in fact, return a null. So the technique isn’t perfect.

This is a neat trick, but can you imagine a codebase peppered with Option[This] and Option[That] all over the place, and all those ensuing match blocks? I say use Option sparingly.

Unit

This is another easy one. Unit is the type of a method that doesn’t return a value of any sort. Sound familiar? It’s like a void return type in Java. Here’s an example:

scala> def doThreeTimes(fn: (Int) => Unit) = {
     |   fn(1); fn(2); fn(3);
     | }
doThreeTimes: ((Int) => Unit)Unit

scala> doThreeTimes(println)
1
2
3

scala> def specialPrint(num: Int) = {
     |    println(">>>" + num + "<<<")
     | }
specialPrint: (Int)Unit

scala> doThreeTimes(specialPrint)
>>>1<<<
>>>2<<<
>>>3<<<

In the definition of doThreeTimes we specify that the method takes a parameter called fn, which has a type of (Int) => Unit. This means that fn is a method that takes a single parameter of type Int and a return type of Unit, which is to say fn isn’t supposed to return a value at all just like a Java void function.

That’s it. Those are the ‘nothingness’ items in Scala. If you know of any more, please leave a comment! There is admitedly a lot to learn when you’re taking up Scala, but in return you get an incredibly expressive and succinct language.

If you enjoyed my earlier post on parsing in Scala Stephan Zeiger has a 3-part series at a more technical level.

Yesterday I made a post called Easy Parsing in Scala about using the Scala parsing libraries. I’ve made a couple changes to the code since then.

First, I noticed that the regex method takes a Regex object as its only parameter. Why, I thought to myself, didn’t they just make the method take a String so I don’t have to keep typing “new Regex”. Duh. Big duh. They’re giving me the opportunity to reuse Regex objects instead of stupidly recreating them over and over. So I added three private constant regular expressions that I could reuse: spaceRegex, numberRegex, and wordRegex. I couldn’t make a constant Regex for the one that matches a given number of characters, of course.

Second, I eliminated some repetition by adding a regexAndSpace method that matches a regular expression and then throws away the following whitespace. That’s a job that’s repeated 3 times, so I thought it made sense to factor it out. Without further ado, here’s the updated code:

import scala.util.parsing.combinator._
import scala.util.matching.Regex

object SvnParser extends RegexParsers {
  private val spaceRegex  = new Regex("[ \\n]+");
  private val numberRegex = new Regex("[0-9]+");
  private val wordRegex   = new Regex("[a-zA-Z][a-zA-Z0-9-]*");

  private def space  = regex(spaceRegex)
  private def regexAndSpace(re: Regex) = regex(re) <~ space

  override def skipWhitespace = false

  def number = regexAndSpace(numberRegex)
  def word   = regexAndSpace(wordRegex)
  def string = regex(numberRegex) >> { len => ":" ~> regexAndSpace(new Regex(".{" + len + "}")) }
  def list: Parser[List[Any]] = "(" ~> space ~> ( item + ) <~ ")" <~ space

  def item = ( number | word | string | list )

  def parseItem(str: String) = parse(item, str)
}

SvnParser.parseItem("( 5:abcde 3:abc  \n   20:three separate words     (  abc def     \n\n\n   123 ) ) ") match {
  case SvnParser.Success(result, _) => println(result.toString)
  case _ => println("Could not parse the input string.")
}

« Previous PageNext Page »