Don’t Be So (Case) Sensitive

In Scala, as in Java, C and many other languages, identifiers may contain a mix of lower and upper case characters. These identifiers are treated in a case sensitive manner. For example “index”, “Index” and “INDEX” would be treated as three separate identifiers. You can define all three in the same scope. That goes for Scala, Java, and most if not all descendants of the C language. In most of these languages, although case is significant in distinguishing identifiers, and although various capitalization schemes are used by convention, case does not alter functionality. Whether you name a variable “index”, “Index” or “INDEX”, as long as you don’t hide an identifier from an enclosing scope, the code will function in exactly the same way.

Scala, though, diverges slightly from this tradition. Here’s an example. Say we have a Pair of two Ints. Say we also have two plain Int values and we want to know whether those two Ints are equal to the values inside the Pair. In the case that they do match, we also want to know in which order they appear in the Pair.

Operations on tuples (such as a Pair) can often be implemented neatly by pattern matching. Here’s one solution to this problem:

def matchPair(x: (Int,Int), A: Int, b: Int): String = 
x match {
  case (A, b) => "Matches (A, b)"
  case (b, A) => "Matches (b, A)"
  case _      => "Matches neither"
}

This is completely unsurprising code except for one little detail. One of the function parameters is upper case while the other two are lower case. Other than that, there’s nothing unusual so far. So let’s try out this code in the Scala interpreter.

scala> def matchPair(x: (Int,Int), A: Int, b: Int): String = 
     | x match {
     |   case (A, b) => "Matches (A, b)"
     |   case (b, A) => "Matches (b, A)"
     |   case _      => "Matches neither"
     | }
matchPair: ((Int, Int),Int,Int)String

scala> val pair = (5, 10)
pair: (Int, Int) = (5,10)

scala> matchPair( pair,  5, 10 )
res1: String = Matches (A, b)

scala> matchPair( pair, 10,  5 )
res2: String = Matches (b, A)

scala> matchPair( pair, 99, 99 )
res3: String = Matches neither

So far so good! It returns the expected value when the values match in order, in reverse order, and when both values don’t match. Is this sufficient unit testing? What other tests would you run?

As you may well guess, no, this isn’t sufficient unit testing. Let’s try the case where one Int matches but not the other:

scala> matchPair( pair,  5, 99 )
res4: String = Matches (A, b)

scala> matchPair( pair, 99, 10 )
res5: String = Matches (b, A)

That didn’t work right. Is the matchPair function telling us that ‘pair’ (which is (5, 10) ) matches (5, 99) or (99 10)? That’s what it look like, but no. Scala does something a little bit surprising here. Do you know why?

As I said before, you can have variable and constants in Scala with upper or lower case names. Both are legal, just as they are in Java. But Scala makes some distinctions that Java doesn’t. Within a pattern (the part between ‘case’ and ‘=>’) Scala treats simple lower case identifiers differently. It uses them as new variables into which matched data is stored, but this is not the case for identifiers that begin with an upper case letter!

If you want to capture results of a pattern match in Scala you must use a lower case identifier and that identifier will hide any identifiers with the same name from an enclosing scope. So in our example function, “case (A, b)” matches a Pair. The first element of the pair is “A” which start with an upper case letter, so pattern matching results can’t be stored in it. It is used the way we intended, i.e. the pattern is matched if x._1 equals A.

The “b” in “case (A, b)”, though, begins with a lower case letter so it is assigned the value of x._2 (assuming x._1 equals A). It is as if you had typed “val b = x._2” in the function body. Within the case line, the “b” from the pattern hides the parameter named “b”.

So how can we make this function work the way we want? Here’s one way:

def matchPair(x: (Int,Int), A: Int, B: Int): String = 
x match {
  case (A, B) => "Matches (A, B)"
  case (B, A) => "Matches (B, A)"
  case _      => "Matches neither"
}

Now both the the Int parameters start with an upper case letter and are therefore tested against x._1 and x._2. This code passes our tests. Note that the code behaves differently simply based on the parameter names we choose. There’s another way to prevent Scala from using the identifiers for storing pattern results.

def matchPair(x: (Int,Int), a: Int, b: Int): String = 
x match {
  case (`a`, `b`) => "Matches (a, b)"
  case (`b`, `a`) => "Matches (b, a)"
  case _          => "Matches neither"
}

You can use the more traditional lower case parameter names if you quote them using the backquote character. That’s the key to the left of the “1” on my keyboard. This matchPair is equivalent to the one that used capital “A” and “B”.

Another quick example:

scala> val pair = (5, 10)
pair: (Int, Int) = (5,10)

scala> val (a,b) = pair
a: Int = 10
b: Int = 5

scala> val (X,Y) = pair
:5: error: not found: value X
       val (X,Y) = pair
            ^
:5: error: not found: value Y
       val (X,Y) = pair
              ^

You know that you can use the construct from line 4 above to declare and initialize multiple vals or vars using a tuple, right? And you know how that magic is done? Patterns, so the same principle applies here. The capitalized identifiers “X” and “Y” are taken to refer to existing identifiers because they can’t be used to store pattern match results. Since no such identifiers had been defined, you get an error.

If we define these values beforehand then Scala tries to match their values:

scala> val pair = (5,10)
pair: (Int, Int) = (5,10)

scala> val I = 5
I: Int = 5

scala> val J = 10
J: Int = 10

scala> val (I,q) = pair
q: Int = 10

scala> val (J,r) = pair
scala.MatchError: (5,10)
        at .<init>(<console>:6)
        at .<clinit>(<console>)
        at RequestResult$.<init>(<console>:3)
        at RequestResult$.<clinit>(<console>)
        at RequestResult$result(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(...

scala> val (I,J) = pair

In line 10, the value I has been declared and initialized to 5 so it matches the first part of the Pair. The identifier q becomes a new val initialized to the value in the second part of the Pair.

In line 13, we do the same thing but we try to match J to the first part of the Pair. This won’t work since pair._1 is 5 and J is 10. A MatchError is thrown.

In line 24, we use both of the capitalized identifiers. They match, but there are no lower case identifiers to make new values out of, so the line does nothing except to confirm (by not throwing an Error) that I equals pair._1 and J equals pair._2.

Now you know how to match when you want to match and assign results when you want to assign results. I hope that being able to match against already-defined identifiers will make your matching code more powerful.

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Matt Malone's Old-Fashioned Software Development Blog

No Responses to “ Don’t Be So (Case) Sensitive ”

Archived Entry

Search:

Industry News

Categories