Scala Boggles The Mind

I was playing a game on my iPhone called Scramble the other day. It’s a great game. You are presented with a 4×4 grid of letters, and your job is to find words by chaining together adjacent letters. It bears a passing similarity to Boggle. I was playing online and I noticed that many other players play much better than I do. Well, a software developer doesn’t take a thing like this lying down. I decided it was time to write a Scramble (or Boggle) solver!

And since I’m trying to pick up the Scala language whenever I have an opportunity, I wrote the whole thing in Scala. I hope it will be, for the reader, an interesting study of a complete and useful (though simple) example Scala program.

First, I had to decide on a strategy. If we try every path on the game board, that’s inefficient. For example, if we try a path XQ then we can stop right there. XQ is not a word and no word starts with the letters XQ. This suggests using some sort of spell-checker-like logic. So let’s first create a dictionary data structure that allow us to look up words and eliminate dead ends. What we need is a tree. Here’s the data structure I came up with.

import scala.collection.mutable.HashMap

class LetterTree {
    private val nodes: HashMap[Char,LetterTree] = new HashMap[Char,LetterTree]
    var terminal: Boolean = false
    def addWord(word: String): Unit = addWord(word.toList)
    def addWord(word: List[Char]): Unit = word match {
        case Nil          => terminal = true
        case head :: tail => nodes.getOrElseUpdate(head, new LetterTree).addWord(tail)
    }
    def getSubTree(letter: Char): Option[LetterTree] =
        if (nodes.contains(letter)) Some(nodes(letter)) else None
}

It’s a mutable class. I toyed with some ideas for an immutable class, but it just complicated things more than I wanted to cope with. Our dictionary will be a tree in which LetterTree is the node class. You can see that a LetterTree has two member data: a hashmap of child LetterTrees indexed by Char, and a flag called terminal. So a LetterTree is a tree node whose child nodes are indexed by letter, and which can be marked (via the terminal flag) as ending a word. This is a pretty efficient way to store words.

There are two addWord methods for populating the tree. One takes a list of Chars. The other takes a String and is included for convenience. It converts its String parameter to a list of Chars and calls the other addWord method. If the addWord method is passed an empty list (Nil), then it has reached the end of a word and sets the terminal flag. Otherwise, it takes the first Char in the list and looks up (or else creates) the LetterTree mapped to that Char. The method then adds the remainder of the Char list to that mapped LetterTree.

Finally, there’s the getSubTree method. This will be useful when we start using the tree to look up potential word matches.

So imagine that we create a LetterTree and add the following words: item, its, it. We get a structure like this:

Figure 1

Each of the boxes represents a LetterTree instance. Each arrow, labeled with letter, represents an entry in the ‘nodes’ HashMap. The top level LetterTree does not itself represent a letter. It’s just a starting point. Its ‘nodes’ member has just one mapping: letter ‘i’ maps to a second LetterTree. That second LetterTree doesn’t have its ‘terminal’ flag set, so we haven’t made a word yet. Its ‘nodes’ map has a single entry mapping ‘t’ to a third LetterTree. This third LetterTree is a terminal node, so we know that the sequence ‘it’ forms a word. The third LetterTree’s ‘nodes’ map has entries for ‘s’ and ‘e’. The LetterTree mapped to ‘s’ is terminal, and the one mapped to ‘e’ is not. The ‘e’ node has a child node ‘m’ that is a terminal.

This is a pretty efficient way to store words. We got to use the sequence ‘it’ 3 times! Also, note that there is exactly one terminal node for each word. If we add a word, we will either append a new leaf node which will be terminal, or we will make one of the existing nodes terminal.

Now, all we have to do is create a LetterTree and call the addWord method for every word we can think of. That could be annoying. Let’s create an improved LetterTree that will read a list of words from a file.

import java.io.File
import scala.io.Source

class FileLetterTree(path: String) extends LetterTree {
    val file = new File(path)
     for (line <- Source.fromFile(file).getLines) addWord(line.trim)
}

Can you believe how easy that was?! We just extend LetterTree, take a file path as a constructor parameter, use the scala.io.Source class to get all lines, and add each line as a word. Now just find a text file containing all English words. You should be able to google it.

You might want to ensure the file contains only words 3 letters or longer (as required by the rules of the game), only lower case, and only the 26 english letters. Since Scramble has no ‘Q’ piece (neither does Boggle) but only a ‘Qu’ piece, you might want to do a global replace on your word file, replacing all instance of ‘qu’ with ‘q’.

Ok, now we have a data structure for looking up words. Now we need to create some code that represents the game board. We’ll assume a 4-by-4 board.

class GameBoard(lettersStr: String) {
    private val ltrStr = lettersStr.toLowerCase()
    if (!ltrStr.matches("^[a-z]{16}$"))
    throw new Exception("Exactly 16 letters a-z are required.")

    override def toString: String =
        ltrStr.substring(0,4)  + "\n" + ltrStr.substring(4,8) + "\n" +
        ltrStr.substring(8,12) + "\n" + ltrStr.substring(12,16)

    case class Letter(letter: Char) {
        var neighbors = List[Letter]()
        def addNeighbor(nbr: Letter) = { neighbors = nbr :: neighbors }
        override def toString = letter.toString
    }

    val letters = new Array[Array[Letter]](4,4)
    for (idx <- 0 until ltrStr.length)
        letters(idx/4)(idx % 4) = Letter(ltrStr(idx))

    for ( idx <- 0 to 3; jdx <- 0 to 3; iOff <- -1 to 1; jOff <- -1 to 1;
          if (iOff != 0 || jOff != 0) &&
          idx + iOff >= 0 && idx + iOff < 4 &&
          jdx + jOff >= 0 && jdx + jOff < 4 )
        letters(idx)(jdx).addNeighbor(letters(idx + iOff)(jdx + jOff))
}

That’s a lot of new code, but it is actually pretty simple if we break it down:

Lines 2-4 just ensure that we have good input. The code converts the constructor parameter to lower case, and then confirms that it is composed of exactly 16 letters from a to z.

The next section is a toString method. It just returns the 16-letter string in 4-letter chunks separated by newlines.

Next, there is an inner class called Letter. It encapsulates one letter on the game board, and includes a way to keep track of neighboring Letters (think of it as a graph node).

Line 16 creates the game board, a 4-by-4 array of Letters. The for-loop that follows initializes each of the 16 Letter objects using the corresponding letters from the 16-letter string.

All that’s left is to tell each Letter who his neighbors are. This is done with the somewhat complex for-loop on line 20. This structure loops through two indices, idx and jdx, as well as two offsets, iOff and jOff. It’s like four nested loops combined into one. It loops through idx and jdx values from 0 to 3, so it runs for each of the 16 Letters in the grid. It also loops through iOff and jOff from -1 to 1, so it looks at each neighbor of each Letter. See? For each of the 16 Letters in the grid, it looks at each of the 9 Letters around that Letter (including the Letter itself).

The last section of the for-loop header (lines 21-23) defines which combinations of idx, jdx, iOff, and jOff are valid and should be processed in the loop body. You see why, right? First of all, there’s no need for a Letter to add itself to its list of neighbors. It’s against the game rules to use the same grid position twice in the same word, so any iterations in which iOff and jOff are both 0 are eliminated at line 21.

Also, there are grid positions at the edges and corners. Those positions will have fewer neighbors. So for the Letter at position idx=0, jdx=0, offsets of iOff=-1 OR jOff=-1 make no sense. There are no neighboring Letters in the -1 direction at that position. These restrictions are made by lines 22 and 23.

Finally, if the indices and offsets pass the tests, the Letter at location (idx, jdx) is assigned a new neighbor, the Letter at (idx+iOff, jdx+jOff).

Of course, this isn’t the only way to write a program like this. We could have used a simple array of characters and put the neighbor-finding logic in the solver code, but I chose to divide up the responsibilities like this.

We’ve finished setting up the game board. We have a dictionary of legal words. Now we’re ready to play. Let’s add a function called findWords to the GameBoard class.

def findWords(tree: LetterTree): List[String] = {
    def findWords(tree: LetterTree, letter: Letter, sofar: List[Letter]): List[String] = {
        tree.getSubTree(letter.letter) match {
          case Some(subTree) =>
            var words: List[String] = Nil
            if (subTree.terminal) words = (letter :: sofar).foldLeft("")((c,n) => n+c) :: words
            for (nextLetter <- letter.neighbors if !sofar.contains(nextLetter))
            words = findWords(subTree, nextLetter, letter :: sofar) ::: words
            words
          case None => Nil
        }
    }
    var words: List[String] = Nil
    for (idx <- 0 to 3; jdx <- 0 to 3)
        words = words ++ findWords(tree, letters(idx)(jdx), Nil)
    words
}

This is the heart of our little program. This is the code that does the real work. The first thing that happens in this function is that we define an inner function. We’ll look at that in a second. First, look at code below the inner function. We create an empty List of Strings, then for each Letter in the 4-by-4 grid we call the inner findWords function. We pass in the dictionary tree (at the top tree level), the current Letter, and an empty list. That’s the list of letters used so far. At this point, no letters have been used yet, so it’s empty (Nil).

The result of the inner function call is a list of all valid words that start with the given Letter. That list is added to the list of all valid words. We do this for each of the 16 Letters.

Now, for that inner function. Let’s first examine the parameter list. First is tree of type LetterTree. This is the dictionary class we built earlier. On the first call, we pass in the whole dictionary, the top level tree. As we search for words, though, we will pass sub-trees, sub-sub-trees and so forth. The second parameter is a letter of type Letter. That’s just the current letter that we’re evaluating.

The last parameter is called sofar and it has type List[Letter]. Why is it called sofar? Because it’s the ordered list of connected Letters from the game board that we know begin words in the dictionary. The sofar list might contain the letters (c, o, m, p) because this is a prefix to words like compute, computer, computing, comparison, etc. We would never be passed a sofar list containing (c, o, m, p, x) because although this sequence could show up on the game board, this is not a prefix to any word in the dictionary. This parameter will grow longer with each recursive call to the inner function. This is why on the first call to the inner function we pass the empty list, Nil, as the sofar parameter.

We first look up the sub-tree beginning with the letter value of the Letter parameter. If we find that there is no sub-tree for this letter (the result of getSubTree matches None) then we know that there are no words beginning with the letter. It’s a dead end, so we return an empty list and we do not recurse any further.

If we do find a sub-tree, though, that means that there are words that begin with the sofar list followed by this letter. On line 6, we check to see if the node for the current letter is a terminal node in our dictionary. If it is, then we have a legal word. We append the current letter to the sofar list, convert the list to a string, and put it in the local words list.

In the next line, we loop through all of the current Letter’s neighbors, excluding any that are already in the sofar list. For each unused neighbor, we make a recursive call. This time, the parameters are the dictionary sub-tree, the neighbor letter generated by the for loop, and the sofar list with the current Letter appended. The list of words returned by each recursive call is combined with the local word list. After all the neighboring letters have been checked, the word list is returned from the function.

This is basically a graphs problem. The game board is an undirected graph. The dictionary is a tree, which is a type of graph. We are taking circuit-free paths along the game board graph and mapping them to paths on the tree, accumulating a list of matching paths for which the end is marked terminal on the tree. There are Java (and maybe Scala) libraries for dealing with graphs, and when I get a chance I’d like to see if I can get a more tidy implementation using one of these libraries.

Now, we’ll just pull all this together in a neat little package:

class PuzzleSolver(dictionaryPath: String) {
    val tree = new FileLetterTree(dictionaryPath)
    def solve(letters: String) = {
        val board = new GameBoard(letters)
        val wordSet = new HashSet[String]() ++ board.findWords(tree)
        val sortedWords = wordSet.toList.sort{ (a,b) =>
            a.length > b.length || (a.length == b.length && a > b)
        }
        println(sortedWords)
    }
}

Now we can create an instance of the PuzzleSolver class for a dictionary file that we specify. Then we can call the solve function for game board configurations. This class finds all the legal words contained in the game board, sorts them by length first and alphabetically second, and prints them out.

Here’s a sample session in the Scala interpreter:

scala> val solver = new PuzzleSolver("./words.txt")
solver: PuzzleSolver = PuzzleSolver@1876e5d

scala> solver.solve("TestingTheSolver")
List(storeen, torsel, tinsel, tensor, seeing, nestor, inseer, 
ingest, verst, verso, torse, tinge, store, soree, soget, snite, 
sneer, rotse, rotge, rosel, reest, orsel, inset, insee, ingot, 
hinge, gorse, geest, vlei, vest, vent, vein, veer, veen, tore, 
togs, ting, tine, tien, teng, stog, sore, snee, sero, sent, 
seit, sego, seer, seen, seel, rose, rest, rees, reen, reel, 
ogee, neti, nest, neer, lest, lent, lens, lehi, lees, leer, 
iten, hint, hing, hest, hent, hein, heer, gore, goes, goer, 
gest, gent, gens, gein, eros, vei, vee, tor, tog, toe, tin, 
tie, ten, teg, sot, sog, soe, set, ser, sen, seg, see, rot, 
rog, roe, rev, ree, ose, ore, oes, oer, nit, net, nei, nee, 
lev, les, len, lei, leg, lee, ing, hit, hin, hie, hen, hei, 
got, gos, gor, get, ges, gen, gel, gee, ers, ens, ego, eer, 
eel)

scala>

It works! Here’s the complete source code:

import java.io.File  
import scala.io.Source  
import scala.collection.mutable.HashMap  
import scala.collection.immutable.HashSet  
  
class LetterTree {  
    private val nodes: HashMap[Char,LetterTree] = new HashMap[Char,LetterTree]  
    var terminal: Boolean = false  
    def addWord(word: String): Unit = addWord(word.toList)  
    def addWord(word: List[Char]): Unit = word match {  
        case Nil          => terminal = true  
        case head :: tail => nodes.getOrElseUpdate(head, new LetterTree).addWord(tail)  
    }  
    def getSubTree(letter: Char): Option[LetterTree] =  
        if (nodes.contains(letter)) Some(nodes(letter)) else None  
}  

class FileLetterTree(path: String) extends LetterTree {  
    val file = new File(path)  
    for (line <- Source.fromFile(file).getLines) addWord(line.trim)  
}  


class GameBoard(lettersStr: String) {  
    private val ltrStr = lettersStr.toLowerCase()  
    if (!ltrStr.matches("^[a-z]{16}$"))  
    throw new Exception("Exactly 16 letters a-z are required.")  
  
    override def toString: String =  
        ltrStr.substring(0,4)  + "\n" + ltrStr.substring(4,8) + "\n" +  
        ltrStr.substring(8,12) + "\n" + ltrStr.substring(12,16)  
  
    case class Letter(letter: Char) {  
        var neighbors = List[Letter]()  
        def addNeighbor(nbr: Letter) = { neighbors = nbr :: neighbors }  
        override def toString = letter.toString  
    }  
  
    val letters = new Array[Array[Letter]](4,4)  
    for (idx <- 0 until ltrStr.length)  
        letters(idx/4)(idx % 4) = Letter(ltrStr(idx))  
  
    for ( idx <- 0 to 3; jdx <- 0 to 3; iOff <- -1 to 1; jOff <- -1 to 1;  
          if (iOff != 0 || jOff != 0) &&  
          idx + iOff >= 0 && idx + iOff < 4 &&  
          jdx + jOff >= 0 && jdx + jOff < 4 )  
        letters(idx)(jdx).addNeighbor(letters(idx + iOff)(jdx + jOff))  

  def findWords(tree: LetterTree): List[String] = {  
      def findWords(tree: LetterTree, letter: Letter, sofar: List[Letter]): List[String] = {  
          tree.getSubTree(letter.letter) match {  
            case Some(subTree) =>  
              var words: List[String] = Nil  
              if (subTree.terminal) words = (letter :: sofar).foldLeft("")((c,n) => n+c) :: words  
              for (nextLetter <- letter.neighbors if !sofar.contains(nextLetter))  
              words = findWords(subTree, nextLetter, letter :: sofar) ::: words  
              words  
            case None => Nil  
          }  
      }  
      var words: List[String] = Nil  
      for (idx <- 0 to 3; jdx <- 0 to 3)  
          words = words ++ findWords(tree, letters(idx)(jdx), Nil)  
      words  
  }  
}  

class PuzzleSolver(dictionaryPath: String) {  
    val tree = new FileLetterTree(dictionaryPath)  
    def solve(letters: String) = {  
        val board = new GameBoard(letters)  
        val wordSet = new HashSet[String]() ++ board.findWords(tree)  
        val sortedWords = wordSet.toList.sort{ (a,b) =>  
            a.length > b.length || (a.length == b.length && a > b)  
        }  
        println(sortedWords)  
    }  
}

One last thing: To be clear, no I don’t actually use this to cheat at online games. Just knowing that I could is satisfying enough for me.

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Matt Malone's Old-Fashioned Software Development Blog

No Responses to “ Scala Boggles The Mind ”

Archived Entry

Search:

Industry News

Categories