General – Matt Malone's Old-Fashioned Software Development Blog

School district in the U.S. surrenders to ransomware

Matt — Wed, 30 Sep 2020 10:04:27 +0000

The operators of blackmail viruses have been actively looking for new niches to target over the past few years. The sad truth is that entities like educational institutions, healthcare organizations and even law enforcement agencies are low-hanging fruit for these attackers. One of the latest onslaughts demonstrates this unsettling susceptibility. In mid-April 2018, an unidentified ransomware strain hit the computer network of the Leominster Public School District in Massachusetts. While the details of the specific attack vector remain undisclosed at the time of writing, the most likely entry point was a phishing email opened by one of the unsuspecting staffers. Ultimately, district officials admitted to paying $10,000 worth of Bitcoin to regain access to the proprietary records.

According to the local police, who are investigating the incident, the school didn’t maintain an offsite data backup. That’s very poor security hygiene, the kind that makes users and companies incur serious losses in security incidents and data breaches. As a result, part of the target’s network was locked down as the malicious code applied a strong cipher to encrypt the most common types of files found on the host servers.

According to some unconfirmed reports, the troublemaking program might be the infamous WannaCry ransomware, which broke out worldwide in May 2017 and crippled numerous computer networks, including government-related ones and those belonging to industry giants. Some organizations had to rebuild entire segments of their infrastructure from scratch to recover from this massive attack. The UK’s National Health Service exemplifies the harsh impact as about 70,000 of its devices were affected.

The involvement of WannaCry in the Leominster case is mere speculation, though. If it holds true, the attack probably took place via unpatched software exploited in a furtive way. One way or another, although the FBI and security professionals advise against paying ransoms in scenarios like this, the school district elected the lesser of two evils. The officials followed the crooks’ demands and coughed up the negotiated amount of cryptocurrency.

In summary, crypto ransomware continues to be a serious concern, and organizations are much better off keeping file backups to avoid the damage.

Social network phishing attacks are an escalating peril

Matt — Fri, 25 Sep 2020 10:04:14 +0000

Con artists are homing in on users’ social network accounts via phishing messages disguised as verification requests or copyright infringement alerts.

Social networks such as Facebook, Instagram, Twitter, and TikTok boast huge user audiences and therefore increasingly lure online scammers. By obtaining credentials for numerous accounts, threat actors can abuse the unauthorized access to promote fraudulent initial coin offerings (ICOs) or spread controversial propaganda. In some cases, crooks simply sell these details on hacker forums.

Unsurprisingly, social network accounts have become a pricey asset in the cybercriminal circles and need to be safeguarded accordingly. Since early August 2020, security researchers have been observing a spike in phishing stratagems that zero in on social network users.

The two dominant forms of these scams are described below. Go over the information to identify these hoaxes if they happen to hit you.

Bogus account verification pages

A massive phishing wave that has been gaining traction recently involves false claims about giving a user a verified badge on Twitter, Instagram, and TikTok. The first two social networking platforms are being targeted the most. The essence of the scam is to instruct would-be victims to enter their username and password in a page camouflaged as an official verification form, with all the branding elements being in their place.

The increasingly popular TikTok service is at the epicenter of a similar campaign, where malefactors promise video bloggers a nifty verification badge in exchange for providing sensitive information on a rogue site.

Regardless of the social network being impersonated, the distinguishing hallmark of the phishing pages’ URLs is that they include the string “badge” or “verified”. This pattern should give users a heads-up if encountered. Obviously, the credentials instantly go to scammers if typed in a pseudo-verification form.

Spoofed copyright infringement alerts

One more common phishing scam with a flavor of social networking masquerades as a copyright violation notice for a user’s recent post. The bogus warning pages claim that the Twitter or Instagram account will be suspended within 24 hours unless its owner signs in and provides the appropriate arguments on the matter.

A clever move of the fraudsters is that fake Instagram copyright violation notices include the target’s real profile image. This quirk makes the scam look more trustworthy. In some cases, bad actors also try to wheedle out the email account password. If given away, this access allows hackers to expand the attack surface and compromise other personal accounts.

A clue that indicates a likely scam is that the URLs of the landing pages include the words “violation” or “copyright”. If noticed, this red flag should discourage you from entering personally identifiable information on that resource.
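The URL clues from both scam types can be turned into a quick automated hint. Below is a minimal sketch, assuming a plain case-insensitive substring match; the object name PhishingUrlHints and the method hints are made up for illustration, and the example.com address in the usage note is a placeholder rather than a real phishing page.

```scala
object PhishingUrlHints {
  // Keywords reported in the phishing URLs described above.
  val suspiciousWords: Set[String] = Set("badge", "verified", "violation", "copyright")

  // Returns every keyword that occurs anywhere in the URL, ignoring case.
  // A match is only a hint: legitimate URLs can contain these words too.
  def hints(url: String): Set[String] = {
    val lower = url.toLowerCase
    suspiciousWords.filter(word => lower.contains(word))
  }
}
```

For example, PhishingUrlHints.hints("https://example.com/instagram-verified-badge/login") flags both "verified" and "badge". A hit is a reason to look more closely at the page, not proof of fraud.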

How to avoid these phishing frauds

An important precaution is to turn on two-factor authentication (2FA) on your social network accounts. This way, crooks cannot log in unless they also have access to your smartphone, which receives a secret code for confirmation. If you aren’t lucky and fall victim to one of these scams, be sure to change your password immediately.

Ransomware Dark Net Economy Is Flourishing

Matt — Thu, 02 Nov 2017 13:05:32 +0000

A recent report by IT security company Carbon Black highlights a 2,500% increase in the ransomware Dark Net industry compared with the previous year.

The study supports numerous forecasts expressed by the majority of info-security specialists a year ago who said ransomware would likely have an essential role in all types of cyber-crime and get the biggest market share.

To collect information for this report, experts scanned the Dark Net for communities and sites offering and advertising all ransomware related products and services.

Researchers found approximately 6,200 spots where criminals had offered their services with the help of more than 44,000 ads.

Rates vary greatly, from $1 to $4,000. The price variance is determined by the different economic models crooks choose for selling their goods. Some charge on a per-sample basis while others prefer monthly subscription plans.

Comparing 2016 and 2017, the ransomware economy has exploded from $250,000 to $6,230,000, a rate of 2,500%, researchers note in their report. These extortion schemes bring in enormous ransom payouts, which totaled $1B in 2016. Earlier, in 2015, the figure was $24M.
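As a rough sanity check on those numbers, here is a small sketch that recomputes the growth from the two rounded dollar figures quoted above. The object name GrowthCheck and its field names are illustrative. The rounded inputs give a ratio of about 25 to 1 and an increase of roughly 2,400%, which is in the same ballpark as the quoted 2,500%; the gap presumably comes from rounding in the dollar amounts, though that is an assumption on my part.

```scala
object GrowthCheck {
  val sales2016 = 250000.0   // rounded figure quoted above, in USD
  val sales2017 = 6230000.0  // rounded figure quoted above, in USD

  // How many times larger the 2017 figure is than the 2016 figure.
  val ratio: Double = sales2017 / sales2016

  // Year-over-year increase expressed as a percentage of the 2016 figure.
  val percentIncrease: Double = (sales2017 - sales2016) / sales2016 * 100
}
```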

Ransomware-as-a-Service (RaaS) is the main driving force of the ransomware economy. Big and small RaaS portals started to appear in early 2017. These RaaS sites are all different and each works in its own price niche. For instance, you can find RaaS portals offering all-in-one solutions. Some portals offer only a minimal number of services. Finally, there are individual sellers who offer just the ransomware code.

Multi-function RaaS services provide the ransomware executable file itself and also offer delivery mediums like botnets and exploit kits. In addition, you can get a payment portal to manage ransoms. On top of that, you can rent a customer support team. All of this is available from a convenient web-based admin panel.

Reduced service RaaS sites supply the ransomware file, and just a couple of the services above, typically at more affordable prices.

Finally, there are private sellers who are virus writers. They sell just the ransomware file and allow clients to manage the rest. Some ransomware writers earn more than $150,000 a year. That is much more than the standard salary of a legal software developer.

Stupid Scala Tricks

Matt — Thu, 25 Oct 2012 05:02:32 +0000

After an embarrassingly long hiatus, I’ve started fooling with Scala some more and would like to report how I whiled away a few pleasant hours this weekend. You may have heard of the Stupid Pet Tricks and Stupid Human Tricks segments that David Letterman featured on his television show some years ago. The idea was that the pets and people would perform tricks which, while amusing, were not useful for anything. They were just stupid tricks. In this post I will be describing a stupid Scala trick.

If I may continue to reminisce, I remember years ago seeing a Martin Gardner-esque mathematical puzzle called the four bug problem (although I don’t remember whether what I read was actually written by Gardner). I see that Wolfram Alpha refers to it as the “Mice Problem”. In short, imagine four bugs (or mice) positioned at the four corners of a square. Each one begins walking toward his clockwise neighbor. Since they’re all walking, they’re all following moving targets. Each bug’s path is affected by his neighbor, so they form a sort of feedback loop. The problem is to describe the path that each bug will take. It’s an interesting problem.
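To make the classic version concrete before moving on to thousands of bugs, here is a minimal sketch of the four-bug pursuit on a unit square. It is a rough discrete-step approximation, not a solution of the puzzle, and the names Pt, FourBugs, step and simulate are invented for this illustration.

```scala
// A point in the plane with just enough behavior for the pursuit.
final case class Pt(x: Double, y: Double) {
  def dist(o: Pt): Double = math.hypot(o.x - x, o.y - y)

  // Move `d` units toward `o`, stopping at `o` if it is closer than `d`.
  def toward(o: Pt, d: Double): Pt = {
    val total = dist(o)
    if (total <= d) o
    else Pt(x + d * (o.x - x) / total, y + d * (o.y - y) / total)
  }
}

object FourBugs {
  // Corners of the unit square listed clockwise, so bug i chases bug (i + 1) % 4.
  val start: Vector[Pt] = Vector(Pt(0, 1), Pt(1, 1), Pt(1, 0), Pt(0, 0))

  // Every bug reads the old positions, then all of them move together.
  def step(bugs: Vector[Pt], d: Double): Vector[Pt] =
    bugs.indices.map(i => bugs(i).toward(bugs((i + 1) % bugs.size), d)).toVector

  // Positions after `steps` steps of length `d`.
  def simulate(steps: Int, d: Double): Vector[Pt] =
    Iterator.iterate(start)(step(_, d)).drop(steps).next()
}
```

Running FourBugs.simulate with a small step size and enough steps leaves all four bugs bunched up near the center of the square, which is what you would expect from bugs spiraling in toward one another.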

You can also describe the path of 5 bugs on a pentagon or 6 on a hexagon, and so forth. But I wondered what might happen if you had 5,000 bugs positioned not on a 5,000-agon but just everywhere, each following one of his randomly chosen fellows. Chaos, no doubt. This is the story of a little Scala app that answers the question. I’ll present the parts of my solution individually as well as the complete source code which you can compile and experiment with.

Domain classes

First, I need to describe the location of a bug. I could use something built-in like java.awt.Point but I didn’t like the look of it because its interface mixes doubles and ints whereas I wanted floating point coordinates. So I started by creating a class called Position that stores x and y as Doubles. And since the whole idea is to move from position to position, I included a move function. Like so:

class Position(val x: Double, val y: Double) {
  def move(xOffset: Double, yOffset: Double) = new Position(x + xOffset, y + yOffset)
}

That’s a good start, but ultimately we want to take one bug’s position and move it toward another bug’s position. So I added a moveToward function (passing a negative distance moves away from the target instead) as well as a function to calculate the distance between two positions. I won’t describe this code in excruciating detail. It’s pretty apparent what’s going on. Here it is:

class Position(val x: Double, val y: Double) {
  def move(xOffset: Double, yOffset: Double) = new Position(x+xOffset, y+yOffset)
  private def distNumbers(that: Position) = {
    val xDist = that.x - this.x
    val yDist = that.y - this.y
    val dist = scala.math.sqrt(xDist*xDist + yDist*yDist)
    (xDist, yDist, dist)
  }
  def dist(that: Position) = distNumbers(that)._3
  def moveToward(that: Position, dist: Double) = {
    val (xDist, yDist, totalDist) = distNumbers(that)
    def offset(num: Double) = {
      val result = dist * num / totalDist
      if (result.isNaN) 0 else result
    }
    move(offset(xDist), offset(yDist))
  }
}

The private function is there to prevent duplication of logic. This class isn’t super fancy, but it has the following features: It is immutable, it consistently uses Doubles for coordinates and offsets, it allows us to move a position, to find the distance between two positions and to move one position some distance toward or away from a second position.

One improvement I added after a few tests was to check for NaN (not-a-number) in the xOffset and yOffset values. You can see that on line 14 above (the isNaN check inside offset). Since these bugs are going to walk toward each other they will often actually meet, which makes totalDist zero and turns the division into 0/0. This would cause the bugs to go haywire and shoot off into the corner of the screen after a little while. Line 14 prevents this.
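Here is a small sketch of why the check matters. With Doubles, dividing zero by zero does not throw; it quietly produces NaN, which then poisons every later calculation. The functions below repeat the arithmetic of the nested offset helper as standalone functions; the names NaNGuard, rawOffset and safeOffset are made up for the illustration.

```scala
object NaNGuard {
  // Same arithmetic as the offset helper in moveToward, without the guard.
  def rawOffset(dist: Double, num: Double, totalDist: Double): Double =
    dist * num / totalDist

  // With the guard: a NaN result (two bugs on the same spot) becomes "don't move".
  def safeOffset(dist: Double, num: Double, totalDist: Double): Double = {
    val result = rawOffset(dist, num, totalDist)
    if (result.isNaN) 0.0 else result
  }
}
```

When two bugs share a position, num and totalDist are both zero, so rawOffset(1.0, 0.0, 0.0) is NaN while safeOffset(1.0, 0.0, 0.0) is 0.0.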

That’s it for the Position class. Now to create the Bug class.

class Bug(xCoord: Double, yCoord: Double) {
  private var pos = new Position(xCoord,yCoord)
  private var nextPos = new Position(xCoord,yCoord)

  def getPosition = pos
  def x: Double = pos.x
  def y: Double = pos.y

  def prepareMove(func : (Position) => Position) = {
    nextPos = func(pos)
  }

  def paint(g: Graphics2D): Unit = {
    g.drawLine(pos.x.toInt, pos.y.toInt, nextPos.x.toInt, nextPos.y.toInt)
    pos = nextPos
  }
}

This is mostly self-explanatory. A Bug is little more than a position, a mechanism for transitioning from one position to the next, and a way for it to draw itself on the screen. The only thing that might need explanation is the reason for the nextPos member and the prepareMove function. I included pos and nextPos because I wanted to ensure that all the bugs first decided where to move based on the other bugs’ current positions, and then moved. Say bug B is following bug A. First, bug A decides his move, based on whoever he happens to be following. When bug B makes his move, I want his move to be based on A’s original position, not the new position. That is to say, I want the bugs to all decide where to move at once based on their fellows’ current position, and then to move all at once to the positions they had decided on. The next position isn’t copied into current position until just after the bug draws itself.

The prepareMove function takes a function parameter. You’ll notice that the Bug class contains no logic about how to follow another bug. I wanted this logic to reside outside Bug itself and get passed in.
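The decide-first, move-later idea is easier to see in one dimension. The sketch below is an illustration only: Walker, prepare, commit and TwoPhaseDemo are invented names, and the commit step is explicit here, whereas Bug copies nextPos into pos at the end of paint.

```scala
// A walker on a number line with a current and a prepared position.
final class Walker(start: Double) {
  private var pos = start
  private var nextPos = start

  def position: Double = pos

  // Phase 1: decide the next position from the current one, without moving yet.
  def prepare(f: Double => Double): Unit = {
    nextPos = f(pos)
  }

  // Phase 2: make the prepared position current.
  def commit(): Unit = {
    pos = nextPos
  }
}

object TwoPhaseDemo {
  // Each walker steps halfway toward the other's *current* position.
  def tick(a: Walker, b: Walker): Unit = {
    a.prepare(p => p + (b.position - p) / 2)
    b.prepare(p => p + (a.position - p) / 2)
    a.commit()
    b.commit()
  }
}
```

Starting at 0.0 and 10.0, one tick puts both walkers at 5.0. If the first walker had moved before the second one decided, the second would have ended up at 7.5 instead, so the result would depend on the order in which the walkers were processed.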

Setup

Now, to instantiate a bunch of bugs and assign each of them someone to follow:

  val initWinSize = new Dimension(800, 800)
  val bugs = {
    def random = Random.nextInt(2000) - 1000
    List.fill(5000)(
      new Bug(random + initWinSize.width/2,
              random + initWinSize.height/2))
  }
  var targets = Random.shuffle(bugs)

I chose to distribute the locations across a range from -1000 to +1000 in both the x and y directions, and to create 5000 bugs. These numbers are arbitrarily chosen. I also added an offset so that the bugs would be distributed about the center of the program window, whose dimensions I have decided to set at 800 by 800. The scala.util.Random class makes it trivially easy to create a randomly shuffled list of target bugs. The targets list contains the same Bug instances as the bugs list, only rearranged. So the first item in the bugs list will follow the first item in the targets list, the second bug follows the second target, and so forth.
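The pairing trick is easier to see on a tiny list. This is a sketch with strings standing in for bugs; PairingDemo and its fields are illustrative names, and I use a seeded Random here only so that repeated runs give the same shuffle, whereas the code above uses the global Random.

```scala
import scala.util.Random

object PairingDemo {
  val names: List[String] = List("a", "b", "c", "d", "e")

  // Same elements as `names`, in a shuffled order.
  val targets: List[String] = new Random(42).shuffle(names)

  // Element i of `names` follows element i of `targets`.
  val pairs: List[(String, String)] = names.zip(targets)
}
```

One thing to be aware of: nothing in a plain shuffle prevents an element from landing in its original slot, so a bug can occasionally be assigned itself as a target.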

Bug logic

Now we have our Bug class defined, a population of bugs created, and we’ve assigned them someone to follow. Below is a function called performMoves which will be called over and over in a loop running in a Thread.

      def performMoves = {
        for ((bug,target) <- bugs.zip(targets)) {
          bug.prepareMove{(pos) =>
            pos.moveToward(target.getPosition, 1.0)
          }
        }
      }

This code loops through all the bug/target pairs, and moves the bug one unit toward the target. This is the behavior we were looking for, and it works! When I ran the complete program using the above logic, the bugs started out in a tangled mess of intersecting paths, but they gradually coalesced into a few smooth curved lines. It’s neat to see.

The complex curves of single-file marching bugs begin to smooth out and the bugs gradually get closer and closer to one another. When one bug is very close to another bug and moves one unit towards it, the direction he moves starts to get erratic. This makes the smooth line jittery and jagged. To combat this tendency, I made an enhancement to the movement logic. In my improved version of performMoves, bugs behave in the normal way when they are 1.0 units or farther from their target bug. When a bug gets closer than 1.0 unit, it moves away from the center of gravity of the bug system. I experimented with different distances for the bugs to flee the center of gravity, but I got the most pleasing results when they flee at a rate of 1.0 over the square root of the bug’s distance from the center of gravity.

This causes the bugs to spread out a bit when they start getting too close to each other and keeps the paths relatively smooth. It also has the effect of making the whole system more dynamic and interesting to watch. When the bugs start to get close to each other there is an outward impulse which causes little waves in the lines. Also, whenever you get a small circuit of bugs (a loop of a dozen bugs, say) they very quickly close into a tight formation and are propelled away from the center, sometimes in a flattened figure eight. Anyway, here’s the improved bug movement logic:

      def performMoves = {
        centerOfGravity = {
          val big = bugs.foldLeft(new Position(0,0))((a,c) => a.move(c.x, c.y))
          new Position(big.x / bugs.length, big.y / bugs.length)
        }
        for ((cur,target) <- bugs.zip(targets)) {
          val centerDist = centerOfGravity.dist(cur.getPosition)
          cur.prepareMove{(bug) =>
            if (bug.dist(target.getPosition) < 1.0)
              bug.moveToward(centerOfGravity, -1.0 / math.sqrt(centerDist))
            else
              bug.moveToward(target.getPosition, 1.0)
          }
        }
      }
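The two calculations in that function can be checked in isolation. Here is a small sketch that uses plain coordinate pairs instead of Position; the names CentroidDemo, centroid and fleeRate are made up for the illustration.

```scala
object CentroidDemo {
  type P = (Double, Double)

  // Average of all points: sum the coordinates, then divide by the count.
  def centroid(ps: Seq[P]): P = {
    val (sx, sy) = ps.foldLeft((0.0, 0.0)) { case ((ax, ay), (x, y)) => (ax + x, ay + y) }
    (sx / ps.length, sy / ps.length)
  }

  // Flee speed for a bug that is `centerDist` away from the center of gravity.
  def fleeRate(centerDist: Double): Double = 1.0 / math.sqrt(centerDist)
}
```

For the corners of a 4 by 4 square the centroid is (2.0, 2.0). The flee rate is 0.5 at a distance of 4 and 0.1 at a distance of 100, so bugs far from the center are pushed outward more gently than bugs near it.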

That’s all of the interesting parts of the code. In addition to this stuff, we need to add the user interface bits, mouse operations, drawing, etc. before we have a working program.

Finally, the complete source

I’ve kept you in suspense long enough. Without further ado, below is the complete source code, followed by a section-by-section description in case you really are interested in the UI. To run it, just call BugsGame.main(null).

import swing._
import event._
import util.Random

class Position(val x: Double, val y: Double) {
  def move(xOffset: Double, yOffset: Double) = new Position(x+xOffset, y+yOffset)
  private def distNumbers(that: Position) = {
    val xDist = that.x - this.x
    val yDist = that.y - this.y
    val dist = scala.math.sqrt(xDist*xDist + yDist*yDist)
    (xDist, yDist, dist)
  }
  def dist(that: Position) = distNumbers(that)._3
  def moveToward(that: Position, dist: Double) = {
    val (xDist, yDist, totalDist) = distNumbers(that)
    def offset(num: Double) = {
      val result = dist * num / totalDist
      if (result.isNaN) 0 else result
    }
    move(offset(xDist), offset(yDist))
  }
}

class Bug(xCoord: Double, yCoord: Double) {
  private var pos = new Position(xCoord,yCoord)
  private var nextPos = new Position(xCoord,yCoord)
 
  def getPosition = pos
  def x: Double = pos.x
  def y: Double = pos.y
 
  def prepareMove(func : (Position) => Position) = {
    nextPos = func(pos)
  }
 
  def paint(g: Graphics2D): Unit = {
    g.drawLine(pos.x.toInt, pos.y.toInt, nextPos.x.toInt, nextPos.y.toInt)
    pos = nextPos
  }
}

object BugsGame extends SimpleSwingApplication {
  import java.awt.{Dimension, Graphics2D, Color => AWTColor}

  override def top = frame

  var center = new Point(0,0)
  var origCenter = new Point(0,0)
  var clickPt: Point = new Point(0,0)

  var centerOfGravity = new Position(0,0)

  val initWinSize = new Dimension(800, 800)

  val bugs = {
    def random = Random.nextInt(2000) - 1000
    List.fill(5000)(
      new Bug(random + initWinSize.width/2,
              random + initWinSize.height/2))
  }
  var targets = Random.shuffle(bugs)

  val frame = new MainFrame {
    title = "Scala Bugs"
    contents = mainPanel
    lazy val mainPanel = new Panel() {
      focusable = true
      background = AWTColor.white
      preferredSize = initWinSize

      override def paint(g: Graphics2D): Unit = {
        g.setColor(AWTColor.white)
        g.fillRect(0, 0, size.width, size.height)
        onPaint(g)
      }
    }

    listenTo(mainPanel.mouse.clicks, mainPanel.mouse.moves)
    reactions += {
      case MousePressed(src, point, i1, i2, b) => {
        clickPt = point
        origCenter = center
      }
      case MouseDragged(src, point, i1) => {
        center = new Point(origCenter.x + point.x - clickPt.x,
                           origCenter.y + point.y - clickPt.y)
      }
      case MouseReleased(src, point, i1, i2, b) => {
        center = new Point(origCenter.x + point.x - clickPt.x,
                           origCenter.y + point.y - clickPt.y)
      }
      case MouseClicked(src, point, i1, i2, b) => {
        targets = scala.util.Random.shuffle(bugs)
      }
    }

    val runner = new Thread(new Runnable {
      def run() = {
        while (true) {
          performMoves
          repaint()
          //Thread.sleep(10);
        }
      }

      def performMoves = {
        centerOfGravity = {
          val big = bugs.foldLeft(new Position(0,0))((a,c) => a.move(c.x, c.y))
          new Position(big.x / bugs.length, big.y / bugs.length)
        }
        for ((cur,target) <- bugs.zip(targets)) {
          val centerDist = centerOfGravity.dist(cur.getPosition)
          cur.prepareMove{(bug) =>
            if (bug.dist(target.getPosition) < 1.0)
              bug.moveToward(centerOfGravity, -1.0 / math.sqrt(centerDist))
            else
              bug.moveToward(target.getPosition, 1.0)
          }
        }
      }
    })
    runner.start
  }

  def onPaint(g: Graphics2D): Unit = {
    g.translate(center.x, center.y)

    for ((cur,target) <- bugs.zip(targets)) {
      g.setColor(AWTColor.lightGray)
      g.drawLine(cur.x.toInt,    cur.y.toInt,
        target.x.toInt, target.y.toInt)

      g.setColor(AWTColor.black)
      cur.paint(g)
    }

    g.setColor(AWTColor.red)
    g.drawOval(centerOfGravity.x.toInt - 3, centerOfGravity.y.toInt - 3, 6, 6)
  }

}

I’ll skip over the lines that I think are trivial:

Lines 5-22: The Position class as described above

Lines 24-40: The Bug class as described above

Line 42: I used a SimpleSwingApplication as my base class. I’m no Swing expert, so this may be a rather naive Swing app.

Lines 47-49: Points (in screen coordinates) that I’ll use in my mouse operations.

Lines 53-61: The bug list initialization I described earlier.

Lines 63-76: Setting up the frame and main panel.

Lines 78-95: The mouse operations. You can drag the view around the window to see different areas in a large system, or to follow the bugs if they drift out of frame. You can also click in the window to re-randomize the targets list. This is a fun feature.

Lines 97-122: The main thread. It contains an infinite loop that moves the bugs and repaints, over and over. This section contains the performMoves function that I described earlier. This is the section that you’ll want to experiment with to see what kind of interesting behaviors you can get. Also, line 102 is a good place to add a Thread.sleep if you want to slow things down.

Lines 125-139: The onPaint function. First, it shifts the whole picture to the position selected using the mouse (line 89). Then for each bug it draws a light grey line from that bug to the bug it’s following, and calls on the bug to draw itself. Finally, it draws a small red circle to indicate the center of gravity of the bug population.

Self Help

Matt — Thu, 10 Dec 2009 05:54:52 +0000

I recently began writing, as an exercise, some unit-of-measure code in Scala. Some months ago I saw a headline in my newsreader about a Scala library for handling units of measure, and I made a point NOT to read it. It sounded like an interesting problem, and I wanted to take a stab at it myself first, then compare my solution to the one from the article and maybe even write my own article on it.

In the course of trying out a couple of designs I encountered a situation I wasn’t sure how to handle. I wanted an abstract class called Dimension which would encapsulate a measurement in some unit and I wanted to create several subtypes extending Dimension such as Length, Time, Mass, Temperature, etc. All Dimensions should be able to sum together, but only with their own kind. For example, 1 meter + 2 feet should give 1.6096 meters and 1 kilogram + 500 grams should give 1.5 kilograms. However, it makes no sense to add 30 seconds and 45 degrees Celsius. I wanted to arrange the types in such a way that a user of the library would not have the option of adding dimensions of two different types.

Here’s my initial code:

  abstract class Dimension(val value: Double, val name: String, val coef: Double) {
    def +(x: Dimension): Dimension
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
  Dimension(value, name, coef) {
    def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
  }

  class Length(value: Double, name: String, coef: Double) extends 
  Dimension(value, name, coef) {
    def +(x: Dimension): Length = new Length(value + coef * x.value / x.coef, name, coef)
  }

You can see the Dimension class declares an addition operator to satisfy our requirement that all Dimensions must be additive. No surprises so far.

The Time and Length classes extend Dimension and are concrete, so they must implement the addition operator. The operators have the same signature as the one from Dimension except for the return type. When creating a subtype, we are allowed to narrow the return types, so I made them more specific. Time.+ returns not merely a Dimension as in the supertype but a Time. Parameters are a different story: narrowing a parameter type would be unsafe, and in Scala an overriding method has to keep the parameter types of the method it overrides (a different parameter type would declare an overload instead), so they remain Dimension. This is because the return type is a covariant position and the parameter type is a contravariant position. If you don’t know what variance is, I have an article on it.

This code has two main weaknesses. First, a developer subclassing Dimension is trusted to return the same type as the class. That is to say, a developer could write:

  class Length(value: Double, name: String, coef: Double) extends 
  Dimension(value, name, coef) {
    def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
  }

The developer could mix the types! This would return an unexpected and nonsensical result. The second and much more serious weakness is that a user of the library doesn’t have to pass a Length to Length.+. The user could write:

  val sum = new Length(10.0, "meters", 1.0) + new Time(10.0, "seconds", 1.0)

Nothing is stopping him from doing this.
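To see the second weakness in action, here is a self-contained sketch that repeats the initial design and performs the nonsensical addition. The only changes from the code above are the string interpolation in toString and the wrapper object MixedUnits, which is an illustrative name.

```scala
abstract class Dimension(val value: Double, val name: String, val coef: Double) {
  def +(x: Dimension): Dimension
  override def toString(): String = s"$value $name"
}

class Time(value: Double, name: String, coef: Double)
    extends Dimension(value, name, coef) {
  def +(x: Dimension): Time = new Time(value + coef * x.value / x.coef, name, coef)
}

class Length(value: Double, name: String, coef: Double)
    extends Dimension(value, name, coef) {
  def +(x: Dimension): Length = new Length(value + coef * x.value / x.coef, name, coef)
}

object MixedUnits {
  // Compiles without complaint, even though adding seconds to meters is meaningless.
  val sum: Length = new Length(10.0, "meters", 1.0) + new Time(10.0, "seconds", 1.0)
}
```

The result prints as "20.0 meters": ten seconds have silently been treated as ten meters.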

What I wanted was a Dimension class that enforced some additional rules on its descendants. Dimension should dictate not only that all its subtypes must implement the addition operator but also that the operator should only accept a parameter of the same type as the class in which it is defined and that the operator should also return that same type. In short, I wanted to force all subtypes of Dimension to look like this:

  class X(...) extends Dimension(...) {
    def +(x: X): X = ...
  }

Learning to Accept Yourself (As a Parameter)

As I considered the problem I pretty quickly noticed a similarity to a basic Scala trait: scala.Ordered. Ordered[A] is used primarily in sorting. By mixing in Ordered, you can define an ordering for any class. The reason I thought of Ordered is that it includes the abstract method “compare (that : A) : Int”. This method compares “this” to “that”. In other words, it takes a parameter that it is able to compare with itself, which is *usually* a parameter of the same type. Here’s a typical use of Ordered[A]:

class Student(val lastName: String, val firstName: String) 
  extends Ordered[Student]
{
  def compare(that: Student) = 
    (lastName + "," + firstName).compare(that.lastName + "," + that.firstName)
}

That’s close to what I want, but not quite. Ordered[A] allows you to implement a class that can be compared to itself, but it doesn’t require it. You could implement “class Apples extends Ordered[Oranges]”, literally comparing apples and oranges. So this arrangement (a parameterized class or trait in which the subtypes specify themselves as the type parameter) allows, but does not enforce, the structure that I want. So Ordered[A] provides a clue, but not the complete solution.
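Here is that apples-and-oranges case as a minimal sketch. The weight field is invented purely so there is something to compare; the point is only that the compiler accepts a class whose Ordered type parameter is some unrelated class.

```scala
class Oranges(val weight: Int)

// Nothing forces the type parameter of Ordered to be Apples itself.
class Apples(val weight: Int) extends Ordered[Oranges] {
  def compare(that: Oranges): Int = Integer.compare(weight, that.weight)
}
```

With these definitions, new Apples(3) < new Oranges(5) compiles and evaluates to true, because Ordered supplies the comparison operators in terms of compare.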

Becoming Self Aware

The missing piece is a little-known Scala construct called the explicit self type. It is a way of specifying what type the “this” reference must have. The Scala-lang website has an article explaining another situation in which explicit self types are useful: specifying that within a class “this” should refer to an abstract variable type.

Here’s a very simple example of how explicit self types work. Here’s a base trait called TraitA and two traits that make use of TraitA. TraitB1 uses an explicit self type to denote that “this” must be of type TraitA, and TraitB2 uses extends to inherit from TraitA.

trait TraitA {
  def t1(): String
}

trait TraitB1 {
  self: TraitA =>
  def t2(): String = "TraitB1.t2 !" + t1() + "!"
}

trait TraitB2 extends TraitA {
  def t2(): String = "TraitB2.t2!" + t1() + "!"
}

TraitB1 and TraitB2 are exactly alike except for the way they gain access to the t1() method. Here’s an interpreter session in which we create some classes that extend these traits.

scala> class Class1 extends TraitB1 {
     |   def t1() = "Class1.t1"
     |   override def toString() = "Class1: " + t1() + " " + t2()
     | }
<console>:6: error: illegal inheritance;
 self-type Class1 does not conform to TraitB1's selftype TraitB1 with TraitA
       class Class1 extends TraitB1 {
                            ^

scala> class Class2 extends TraitB1 with TraitA {
     |   def t1() = "Class2.t1"
     |   override def toString() = "Class2: " + t1() + " " + t2()
     | }
defined class Class2

scala> new Class2
res0: Class2 = Class2: Class2.t1 TraitB1.t2 !Class2.t1!

scala> class Class3 extends TraitB2 {
     |   def t1() = "Class3.t1"
     |   override def toString() = "Class3: " + t1() + " " + t2()
     | }
defined class Class3

scala> new Class3
res1: Class3 = Class3: Class3.t1 TraitB2.t2!Class3.t1!

Class1 extends TraitB1, the trait that uses the explicit self type. The class defines t1(), so all the necessary implementation is there, but compilation fails anyway. The explicit self type says that “this” must be of type TraitA, but neither Class1 nor TraitB1 extends TraitA. So even though the t1() method is supplied, “this” cannot have type TraitA for Class1 because Class1 does not inherit from TraitA.

Class2 and Class3 compile and run just fine. Class2 is identical to Class1 except that it mixes in TraitA. Since it is declared “with TraitA” the explicit self type is satisfied because “this” can have type TraitA.

Class3 is identical to the others except that it extends TraitB2. TraitB2 is declared as extending TraitA, so Class3 compiles because it indirectly inherits from TraitA.

Use explicit self types with care! They can be a little dangerous if you use them to subvert Scala’s compile-time type checking. For example:

trait StringMaker {
  def makeString(): String
}

class DoesntCompile extends StringMaker {
  override def toString() = "DoesntCompile " + makeString
}

class Compiles {
  self: StringMaker =>
  override def toString() = "Compiles " + makeString
}

The first class is declared as extending StringMaker, but it doesn’t implement the makeString method. This class fails to compile, and rightly so. The compiler does its job and warns you that the class won’t work.

The second class includes an explicit self type. The class named Compiles says that it must be a StringMaker. Now, it says this internally, not externally. The class declaration says nothing about StringMaker. Any code that used the Compiles class wouldn’t know that it’s supposed to be a StringMaker. The class compiles but when you try to instantiate it you get an exception. Not only is it an exception, it’s a NullPointerException which crashes my interpreter (!!!) which makes me think this may be a bug.

Time to Self Actualize

My solution to the problem was a combination of the “class X extends Ordered[X]” idiom and the explicit self type. Here it is:

  abstract class Dimension[T](val value: Double, val name: String, val coef: Double) {
    self: T =>
    protected def create(value: Double, name: String, coef: Double): T
    def +(x: Dimension[T]): T = create(value + coef * x.value / x.coef, name, coef)
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
        Dimension[Time](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Time(a, b, c)
  }

  class Length(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

  class Mass(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

This compiles just fine except for the last class, which I included to demonstrate that the compiler enforces conformance to the explicit self type. Every class that extends Dimension[X] must itself be an X. That enforces the rule I wanted. Here’s the compiler error for the last class.

$ scalac Units.scala
Units.scala:19: error: illegal inheritance;
 self-type Mass does not conform to Dimension[Length]'s selftype Dimension[Length] with Length
        Dimension[Length](value, name, coef) {
        ^
one error found

That basically says if you want to extend Dimension[Length] then you better be a Length yourself.

Considering that when I had investigated this problem for about 10 minutes I was nearly ready to call it impossible, this is a surprisingly simple and not too cryptic solution. Plus, it’s a usage of explicit self types that I hadn’t seen before. I wonder, in fact, why the Ordered[A] trait itself doesn’t use this trick.

As a bonus, here’s a sneak peek at part of my Units library so far.

  abstract class Dimension[T](val value: Double, val name: String, val coef: Double) {
    self: T =>
    protected def create(value: Double, name: String, coef: Double): T
    def +(x: Dimension[T]): T = create(value + coef * x.value / x.coef, name, coef)
    def -(x: Dimension[T]): T = create(value - coef * x.value / x.coef, name, coef)
    override def toString(): String = value + " " + name
  }

  class Time(value: Double, name: String, coef: Double) extends
        Dimension[Time](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Time(a, b, c)
  }

  class Length(value: Double, name: String, coef: Double) extends
        Dimension[Length](value, name, coef) {
    protected def create(a: Double, b: String, c: Double) = new Length(a, b, c)
  }

  abstract class TimeUnit(name: String, coef: Double) {
    def apply(value: Double) = new Time(value, name, coef)
    def apply(orig: Time) = new Time(0, name, coef) + orig
  }

  object Second   extends TimeUnit("seconds",    1.0)
  object Minute   extends TimeUnit("minutes",    1.0 / 60)
  object Hour     extends TimeUnit("hours",      1.0 / 3600)

  abstract class LengthUnit(name: String, coef: Double) {
    def apply(value: Double) = new Length(value, name, coef)
    def apply(orig: Length) = new Length(0, name, coef) + orig
  }

  object Meter      extends LengthUnit("meters",      1.0)
  object Inch       extends LengthUnit("inches",      1.0 / .0254)
  object Foot       extends LengthUnit("feet",        1.0 / .0254 / 12)

And here’s what it looks like in the interpreter:

scala> val length1 = Meter(3)
length1: Length = 3.0 meters

scala> val length2 = Foot(4.5)
length2: Length = 4.5 feet

scala> length1 + length2
res0: Length = 4.3716 meters

scala> length2 + length1
res1: Length = 14.34251968503937 feet

scala> Inch(length1 + length2)
res2: Length = 172.11023622047244 inches

scala> val time1 = Second(90)
time1: Time = 90.0 seconds

scala> val time2 = Hour(.75)
time2: Time = 0.75 hours

scala> Minute(time2)
res3: Time = 45.0 minutes

scala> time1 + length1
<console>:14: error: type mismatch;
 found   : Length
 required: Dimension[Time]
       time1 + length1
               ^
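The sums in the session above all come from the one formula inside Dimension's + method, so it may help to see that arithmetic on its own. This is only a sketch: the object ConversionSketch and its add function are hypothetical names, and I'm reading coef as "how many of this unit fit in one base unit", which is what the Meter, Inch and Foot definitions suggest.

```scala
object ConversionSketch {
  // Coefficients copied from the LengthUnit objects above.
  val metersCoef: Double = 1.0
  val feetCoef: Double   = 1.0 / .0254 / 12

  // Same expression as in Dimension's + method:
  // convert the other value to the base unit (divide by its coef),
  // then into our unit (multiply by our coef), then add.
  def add(value: Double, coef: Double, otherValue: Double, otherCoef: Double): Double =
    value + coef * otherValue / otherCoef
}
```

With these numbers, add(3.0, metersCoef, 4.5, feetCoef) works out to 3 + 4.5 * 0.3048, which is the 4.3716 meters shown in the session.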

Don’t Be So (Case) Sensitive

Matt — Sat, 15 Aug 2009 07:14:50 +0000

In Scala, as in Java, C and many other languages, identifiers may contain a mix of lower and upper case characters. These identifiers are treated in a case sensitive manner. For example “index”, “Index” and “INDEX” would be treated as three separate identifiers. You can define all three in the same scope. That goes for Scala, Java, and most if not all descendants of the C language. In most of these languages, although case is significant in distinguishing identifiers, and although various capitalization schemes are used by convention, case does not alter functionality. Whether you name a variable “index”, “Index” or “INDEX”, as long as you don’t hide an identifier from an enclosing scope, the code will function in exactly the same way.

Scala, though, diverges slightly from this tradition. Here’s an example. Say we have a Pair of two Ints. Say we also have two plain Int values and we want to know whether those two Ints are equal to the values inside the Pair. In the case that they do match, we also want to know in which order they appear in the Pair.

Operations on tuples (such as a Pair) can often be implemented neatly by pattern matching. Here’s one solution to this problem:

def matchPair(x: (Int,Int), A: Int, b: Int): String = 
x match {
  case (A, b) => "Matches (A, b)"
  case (b, A) => "Matches (b, A)"
  case _      => "Matches neither"
}

This is completely unsurprising code except for one little detail. One of the function parameters is upper case while the other two are lower case. Other than that, there’s nothing unusual so far. So let’s try out this code in the Scala interpreter.

scala> def matchPair(x: (Int,Int), A: Int, b: Int): String = 
     | x match {
     |   case (A, b) => "Matches (A, b)"
     |   case (b, A) => "Matches (b, A)"
     |   case _      => "Matches neither"
     | }
matchPair: ((Int, Int),Int,Int)String

scala> val pair = (5, 10)
pair: (Int, Int) = (5,10)

scala> matchPair( pair,  5, 10 )
res1: String = Matches (A, b)

scala> matchPair( pair, 10,  5 )
res2: String = Matches (b, A)

scala> matchPair( pair, 99, 99 )
res3: String = Matches neither

So far so good! It returns the expected value when the values match in order, in reverse order, and when both values don’t match. Is this sufficient unit testing? What other tests would you run?

As you may well guess, no, this isn’t sufficient unit testing. Let’s try the case where one Int matches but not the other:

scala> matchPair( pair,  5, 99 )
res4: String = Matches (A, b)

scala> matchPair( pair, 10, 99 )
res5: String = Matches (b, A)

That didn’t work right. Is the matchPair function telling us that ‘pair’ (which is (5, 10)) matches (5, 99) or (99, 10)? That’s what it looks like, but no. Scala does something a little bit surprising here. Do you know why?

As I said before, you can have variables and constants in Scala with upper or lower case names. Both are legal, just as they are in Java. But Scala makes some distinctions that Java doesn’t. Within a pattern (the part between ‘case’ and ‘=>’) Scala treats simple lower case identifiers differently. It uses them as new variables into which matched data is stored, but this is not the case for identifiers that begin with an upper case letter!

If you want to capture results of a pattern match in Scala you must use a lower case identifier, and that identifier will hide any identifiers with the same name from an enclosing scope. So in our example function, “case (A, b)” matches a Pair. The first element of the pattern is “A”, which starts with an upper case letter, so pattern matching results can’t be stored in it. It is used the way we intended, i.e. the pattern is matched if x._1 equals A.

The “b” in “case (A, b)”, though, begins with a lower case letter so it is assigned the value of x._2 (assuming x._1 equals A). It is as if you had typed “val b = x._2” in the function body. Within the case line, the “b” from the pattern hides the parameter named “b”.
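To make the shadowing concrete, here is roughly what the first matchPair behaves like when the patterns are written out by hand. This is a sketch of the observable behavior, not of what the compiler actually generates, and the names ShadowSketch and matchPairExpanded are made up for illustration.

```scala
object ShadowSketch {
  // Hand-expanded version of the original matchPair:
  // A is compared against the tuple, b is freshly bound each time.
  def matchPairExpanded(x: (Int, Int), A: Int, b: Int): String =
    if (x._1 == A) {
      val b = x._2 // new variable that hides the parameter b; never compared
      "Matches (A, b)"
    } else if (x._2 == A) {
      val b = x._1 // again a new b, unrelated to the parameter
      "Matches (b, A)"
    } else "Matches neither"
}
```

Written this way it is easier to see that the parameter b never takes part in any comparison, so only A decides which branch is taken.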

So how can we make this function work the way we want? Here’s one way:

def matchPair(x: (Int,Int), A: Int, B: Int): String = 
x match {
  case (A, B) => "Matches (A, B)"
  case (B, A) => "Matches (B, A)"
  case _      => "Matches neither"
}

Now both of the Int parameters start with an upper case letter and are therefore tested against x._1 and x._2. This code passes our tests. Note that the code behaves differently simply based on the parameter names we choose. There’s another way to prevent Scala from using the identifiers for storing pattern results.

def matchPair(x: (Int,Int), a: Int, b: Int): String = 
x match {
  case (`a`, `b`) => "Matches (a, b)"
  case (`b`, `a`) => "Matches (b, a)"
  case _          => "Matches neither"
}

You can use the more traditional lower case parameter names if you quote them using the backquote character. That’s the key to the left of the “1” on my keyboard. This matchPair is equivalent to the one that used capital “A” and “B”.

Another quick example:

scala> val pair = (5, 10)
pair: (Int, Int) = (5,10)

scala> val (a,b) = pair
a: Int = 5
b: Int = 10

scala> val (X,Y) = pair
<console>:5: error: not found: value X
       val (X,Y) = pair
            ^
<console>:5: error: not found: value Y
       val (X,Y) = pair
              ^

You know that you can use the construct from line 4 above (val (a,b) = pair) to declare and initialize multiple vals or vars using a tuple, right? And you know how that magic is done? With patterns, so the same principle applies here. The capitalized identifiers “X” and “Y” are taken to refer to existing identifiers because they can’t be used to store pattern match results. Since no such identifiers had been defined, you get an error.

If we define these values beforehand then Scala tries to match their values:

scala> val pair = (5,10)
pair: (Int, Int) = (5,10)

scala> val I = 5
I: Int = 5

scala> val J = 10
J: Int = 10

scala> val (I,q) = pair
q: Int = 10

scala> val (J,r) = pair
scala.MatchError: (5,10)
        at .<init>(<console>:6)
        at .<clinit>(<console>)
        at RequestResult$.<init>(<console>:3)
        at RequestResult$.<clinit>(<console>)
        at RequestResult$result(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(...

scala> val (I,J) = pair

In line 10 (val (I,q) = pair), the value I has already been declared and initialized to 5, so it matches the first part of the Pair. The identifier q becomes a new val initialized to the value in the second part of the Pair.

In line 13 (val (J,r) = pair), we do the same thing but we try to match J to the first part of the Pair. This won’t work since pair._1 is 5 and J is 10. A MatchError is thrown.

In line 24 (val (I,J) = pair), we use both of the capitalized identifiers. They match, but there are no lower case identifiers to make new values out of, so the line does nothing except to confirm (by not throwing a MatchError) that I equals pair._1 and J equals pair._2.

Now you know how to match when you want to match and assign results when you want to assign results. I hope that being able to match against already-defined identifiers will make your matching code more powerful.
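Here is one more small sketch that puts all three kinds of identifier side by side in a single match: a capitalized one, a backquoted lower case one, and a plain lower case one. The object StableIdSketch, the value Origin and the function describe are hypothetical names chosen just for this example.

```scala
object StableIdSketch {
  val Origin = 0 // capitalized: compared against, never bound

  def describe(n: Int, limit: Int): String = n match {
    case Origin  => "at the origin"            // matches only when n == Origin
    case `limit` => "at the limit"             // backquotes: compare with the parameter
    case other   => "somewhere else: " + other // lower case: binds whatever is left
  }
}
```

So describe(0, 10) and describe(10, 10) hit the first two cases, while describe(3, 10) falls through to the last one and captures 3 in other.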

Lots And Lots Of foldLeft Examples

Matt — Thu, 30 Jul 2009 07:09:06 +0000

In my last post I reviewed the implementation of scala.List’s foldLeft and foldRight methods. That post included a couple of simple examples, but today I’d like to give you a whole lot more. The foldLeft method is extremely versatile. It can do thousands of jobs. Of course, it’s not the best tool for EVERY job, but when working on a list problem it’s a good idea to stop and think, “Should I be using foldLeft?”

Below, I’ll present a list of problem descriptions and solutions. I thought about listing all the problems first, and then the solutions, so the reader could work on his own solution and then scroll down to compare. But this would be very annoying for those who refuse, against my strenuous urging, to start up a Scala interpreter and try to write their own solution to each problem before reading my solution.

Sum

Write a function called ‘sum’ which takes a List[Int] and returns the sum of the Ints in the list. Don’t forget to use foldLeft.

def sum(list: List[Int]): Int = list.foldLeft(0)((r,c) => r+c)
def sum(list: List[Int]): Int = list.foldLeft(0)(_+_)

I’ll explain this first example in a bit more depth than the others, just to make sure we all know how foldLeft works.

These two definitions above are equivalent. Let’s examine the first one. The foldLeft method is called on the list parameter. The first parameter is 0. This is the starting value, and the value that will be returned if list is empty. The second parameter is a function literal. It takes parameters ‘r’ (for result) and ‘c’ (for current) and returns the sum of these two values. Scala is smart enough to figure out that since the first parameter (0) is an Int, the ‘r’ parameter must also be an Int. The initial value is always the same type as ‘r’. Scala can also tell that since ‘list’ is a List[Int] the ‘c’ parameter must also be an Int, so we don’t have to specify their types in the parameter list.

The foldLeft method takes that initial value, 0, and the function literal, and it begins to apply the function on each member of the list (parameter ‘c’), updating the result value (parameter ‘r’) each time. That result value that we call ‘r’ is sometimes called the accumulator, since it accumulates the results of the function calls.

In the first definition, foldLeft’s second parameter (a function literal) uses explicitly named parameters. Notice that ‘r’ and ‘c’ are each referred to exactly once in the function literal, and in the same order as the parameter list. When function literal parameters are used in this way (once each, same order) you can use the shorthand demonstrated in the second definition. The first ‘_’ stands for ‘r’, and the second one stands for ‘c’.
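If the accumulator idea still feels abstract, it may help to see the same walk over the list written as a plain loop. This is a hypothetical stand-in, not the library's actual implementation; the names FoldTraceSketch and myFoldLeft are mine.

```scala
object FoldTraceSketch {
  // A loop that does what the text describes: start with the initial
  // value, then feed each element and the running result to f.
  def myFoldLeft[A, B](list: List[A], init: B)(f: (B, A) => B): B = {
    var acc  = init // the accumulator, 'r' in the function literal
    var rest = list
    while (rest.nonEmpty) {
      acc = f(acc, rest.head) // rest.head plays the role of 'c'
      rest = rest.tail
    }
    acc // for an empty list this is still the initial value
  }

  def sum(list: List[Int]): Int = myFoldLeft(list, 0)(_ + _)
}
```

For List(1, 2, 3, 4) the accumulator goes 0, 1, 3, 6, 10, and the final 10 is what gets returned.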

Product

Now that you’ve got the idea, try this one. Write a function that takes a List[Int], and returns the product (of multiplication) of all the Ints in the list. It will be similar to the ‘sum’ function, but with a couple of differences.

def product(list: List[Int]): Int = list.foldLeft(1)(_*_)

Did you get it? It’s the same as ‘sum’ with two exceptions. The initial value is now 1, and the function literal’s parameters are multiplied instead of added. If the initial value were 0, as in ‘sum’, then the function would always return 0.

Count

This one’s a little different. Write a function that takes a List[Any] and returns the number of items in the list. Don’t just call list.length()! Implement it using foldLeft.

def count(list: List[Any]): Int =
  list.foldLeft(0)((sum,_) => sum + 1)

First, we pick our initial value. Remember that this is the value that will be returned for an empty list. An empty list has 0 elements, so we use 0. What function do we want to apply for every item in the list? We just want to increase the result value by one. We call that parameter ‘sum’ in this solution. We don’t care about the actual value of each list element, so we call the second parameter ‘_’, which means it should be discarded.

Average

Here’s a fun one. Write a function that takes a List[Double] and returns the average of the list’s values. There are two ways to go about this one. You could combine two of the previous solutions, using two foldLeft calls, or you could combine them into a single foldLeft. Try to find both solutions.

def average(list: List[Double]): Double =
  list.foldLeft(0.0)(_+_) / list.foldLeft(0.0)((r,c) => r+1)

def average(list: List[Double]): Double = list match {
  case head :: tail => tail.foldLeft( (head,1.0) )((r,c) =>
    ((r._1 + (c/r._2)) * r._2 / (r._2+1), r._2+1) )._1
  case Nil => Double.NaN
}

The first solution is pretty easy and combines the ‘sum’ and ‘count’ solutions. In real life, of course, you wouldn’t use foldLeft to find the length of the list. You’d just use the length() method. Other than that, though, this is a perfectly sensible solution.

The second solution is more complex. First, the list is matched against two patterns. It is either interpreted as a head item followed by a tail, or as an empty list (Nil). If it’s empty, the function returns the same thing as the first solution, NaN (Not a Number) because you can’t divide by 0.

If the list is not empty, we use a Pair as our initial value. A Pair is just an ordered pair of values. It’s a convenient way to bundle values together. We use it when we need to keep track of more than one accumulator value. In this case, we want to keep track of the average “so far” and also the number of values that the average represents. If the function literal were just passed the average so far, it wouldn’t know how to weight the next value. Members of a Pair are accessed using special methods called ‘_1’ and ‘_2’. You can have groupings longer than 2, also. These are named Tuple3, Tuple4, and so on. In fact, Pair is just an alias of Tuple2. Notice that we didn’t use the word Pair or Tuple2 anywhere in the code. If you enclose a comma-delimited series of values in parentheses, Scala converts that series into the appropriate TupleX.

After we have built up the result, it is a Pair containing the average and the number of items in the list. We only want to return the average so we call ‘_1’ on the result of foldLeft.
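A simpler single-fold variant is also possible if you carry the running sum and the count in the Pair, and divide only once at the end. This is a sketch of that alternative rather than a replacement for the solutions above; the wrapper object AverageSketch is a made-up name.

```scala
object AverageSketch {
  def average(list: List[Double]): Double = {
    // Accumulator: (sum so far, number of items so far).
    val (total, count) =
      list.foldLeft((0.0, 0)) { (r, c) => (r._1 + c, r._2 + 1) }
    // Keep the same convention as above: an empty list averages to NaN.
    if (count == 0) Double.NaN else total / count
  }
}
```

The trade-off is that the running average version never holds a number bigger than the values themselves, while this one holds the full sum, which only matters for very long lists or very large values.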

Last

Whew! That one was a little tough. Here’s an easier one. Given a List[A] return the last value in the list. Again, no using List’s last() method.

def last[A](list: List[A]): A =
  list.foldLeft[A](list.head)((_, c) => c)

Easy! Mostly. You’ll notice that we’re using a type parameter, A, in this one. If you’re not familiar with type parameters, too bad. I can’t explain them here. Suffice it to say that our use of A here allows us to take a list of any type of contents, and return a result of just that type. So Scala knows that when this is called on a List[Int], it will return an Int. When it’s called on a List[String], it returns a String.

First, we pick an initial value. For the empty list the concept of a last item doesn’t make any sense, so forget that. We can use any value, so long as it’s of type A. list.head is convenient, so that’s our initial value. The function literal is the simplest we’ve seen. For each item in the list, it just returns that item itself. So when it gets to the end of the list, the accumulator holds the last item. We don’t use the accumulator value in the function literal, so it gets parameter name ‘_’.
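If you would rather not blow up on an empty list, one option is to let the accumulator be an Option, so that "no last item" has a value of its own. This is only a sketch; lastOption here is my own function inside a hypothetical LastSketch object, not a call to any library method.

```scala
object LastSketch {
  // The accumulator starts as None and becomes Some(item) for every item seen,
  // so after the fold it holds the last item, or None if there were no items.
  def lastOption[A](list: List[A]): Option[A] =
    list.foldLeft(Option.empty[A])((_, c) => Some(c))
}
```

Now the initial value really is the right answer for the empty list, which is the usual rule of thumb for picking one.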

Penultimate

Write a function called ‘penultimate’ that takes a List[A] and returns the penultimate item (i.e. the next to last item) in the list. Hint: Use a tuple.

def penultimate[A](list: List[A]): A =
  list.foldLeft( (list.head, list.tail.head) )((r, c) => (r._2, c) )._1

This one is very much like the function ‘last’, but instead of keeping just the current item it keeps a Pair containing the previous and current items. When foldLeft completes, its result is a Pair containing the next-to-last and last items. The “_1” method returns just the penultimate item.

Contains

Write a function called ‘contains’ that takes a List[A] and an item of type A, and returns true if the item is one of the members of the list, and false if it isn’t.

def contains[A](list: List[A], item: A): Boolean =
  list.foldLeft(false)(_ || _==item)

We choose an initial value of false. That is, we’ll assume the item is not in the list until we can prove otherwise. We use each of the two parameters exactly once and in the proper order, so we can use the ‘_’ shorthand in our function literal. That function literal returns the result so far (a Boolean) ORed with a comparison of the current item and the target value. If the target is ever found, the accumulator becomes true and stays true as foldLeft continues.

Get

Write a function called ‘get’ that takes a List[A] and an index Int, and returns the list value at the index position. Throw an exception if the index is out of bounds.

def get[A](list: List[A], idx: Int): A =
  list.tail.foldLeft((list.head,0)) {
    (r,c) => if (r._2 == idx) r else (c,r._2+1)
  } match {
    case (result, index) if (idx == index) => result
    case _ => throw new Exception("Bad index")
  }

This one has two parts. First there’s the foldLeft, and then its result is pattern matched. The foldLeft is pretty easy to follow. The accumulator is a Pair containing the current item and the current index. The current item keeps updating and the current index keeps incrementing until the current index equals the passed in idx. Once the correct index is found the same accumulator is returned over and over. This works fine if the idx parameter is in bounds. If it’s out of bounds, though, the foldLeft just returns a Pair containing the last item and the last index. That’s where the pattern match comes in. If the Pair contains the right index then we use the result item. Otherwise, we throw an exception.
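One thing the solution above doesn't cover is the empty list, since list.tail and list.head fail before the fold even starts. Here is a sketch of a variant that folds over the whole list and keeps an Option in the accumulator instead. The names GetSketch and the choice of IndexOutOfBoundsException are my own assumptions, not part of the original solution.

```scala
object GetSketch {
  def get[A](list: List[A], idx: Int): A =
    // Accumulator: (item found so far, index of the next item to look at).
    list.foldLeft((Option.empty[A], 0)) { (r, c) =>
      val (found, i) = r
      (if (i == idx) Some(c) else found, i + 1)
    }._1.getOrElse(throw new IndexOutOfBoundsException("Bad index: " + idx))
}
```

An empty list, a negative index and an index past the end all leave the Option empty, so they all end up in the same exception.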

MimicToString

Write a function called ‘mimicToString’ that mimics List’s own toString method. That is, it should return a String containing a comma-delimited series of string representations of the list contents with “List(” on the left and “)” on the right.

def mimicToString[A](list: List[A]): String = list match {
  case head :: tail => tail.foldLeft("List(" + head)(_ + ", " + _) + ")"
  case Nil => "List()"
}

This one also uses a pattern match, but this time the match happens first. The pattern match just treats the empty list as a special case. For the general case (a non-empty list) we use, of course, foldLeft. The accumulator starts out as “List(” + the head item. Then each remaining item (notice foldLeft is called on tail) is appended with a leading “, ” and a final “)” is added to the result of foldLeft.

Reverse

This one’s kind of fun. Make sure to try it before you look at my solution. Write a function called ‘reverse’ that takes a List and returns the same list in reverse order.

def reverse[A](list: List[A]): List[A] =
  list.foldLeft(List[A]())((r,c) => c :: r)

A very simple solution! The initial value of the accumulator is just an empty list. We don’t use Nil, but instead spell out the List type so that Scala will know what type to make ‘r’. As I say, we start with the empty list which is sensible because the reverse of an empty list is an empty list. Then, as we go through the list, we place each item at the front of the accumulator. So the item at the front of list becomes the last item in the accumulator. This goes on until we reach the end of list, and that last member of list goes onto the front of the accumulator. It’s a really neat and tidy solution.
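To watch the accumulator grow, you can swap foldLeft for scanLeft, which returns every intermediate accumulator instead of just the last one. I'm assuming a standard library recent enough to have scanLeft on List; the names ReverseTraceSketch and steps are made up for this sketch.

```scala
object ReverseTraceSketch {
  // Same function literal as in reverse, but scanLeft keeps
  // the accumulator after every step, starting with the initial value.
  def steps[A](list: List[A]): List[List[A]] =
    list.scanLeft(List.empty[A])((r, c) => c :: r)
}
```

For List(1, 2, 3) the steps are List(), List(1), List(2, 1) and List(3, 2, 1), and the last of those is exactly what reverse returns.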

Unique

Write a function called ‘unique’ that takes a List and returns the same List, but with duplicated items removed.

def unique[A](list: List[A]): List[A] =
  list.foldLeft(List[A]()) { (r,c) =>
    if (r.contains(c)) r else c :: r
  }.reverse

As usual, we start with an empty list. foldLeft looks at each list item and if it’s already contained in the accumulator then the accumulator stays as it is. If it’s not in the accumulator then it’s prepended. This code bears a striking similarity to the ‘reverse’ function we wrote earlier except for the “if (r.contains(c)) r” part. Because of this, the foldLeft result is actually the original list with duplicates removed, but in reverse order. To keep the output in the same order as the input, we add the call to reverse. We could also have chained on the foldLeft from the ‘reverse’ function, like so:

def unique[A](list: List[A]): List[A] =
  list.foldLeft(List[A]()) { (r,c) =>
    if (r.contains(c)) r else c :: r
  }.foldLeft(List[A]())((r,c) => c :: r)
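Both versions above call r.contains(c) for every item, which walks the accumulator list each time. If that bothers you, one possible tweak is to carry a Set of already-seen items alongside the list. This is a sketch of that idea, with UniqueSketch as a made-up wrapper name.

```scala
object UniqueSketch {
  def unique[A](list: List[A]): List[A] =
    // Accumulator: (items kept so far, in reverse order; set of items seen so far).
    list.foldLeft((List.empty[A], Set.empty[A])) { (r, c) =>
      val (kept, seen) = r
      if (seen(c)) r else (c :: kept, seen + c)
    }._1.reverse
}
```

It is the same Pair-as-accumulator trick as in ‘average’: the Set answers "have I seen this?" and the List remembers the order.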

ToSet

Write a function called ‘toSet’ that takes a List and returns a Set containing the unique elements of the list.

def toSet[A](list: List[A]): Set[A] =
  list.foldLeft(Set[A]())( (r,c) => r + c)

Super easy one. You just start out with an empty Set, which would be the right answer for an empty List. Then you just add each list item to the accumulator. Since the accumulator is a Set, it takes care of eliminating duplicates for you.

Double

Write a function called ‘double’ that takes a List and returns a new List in which each item appears twice in a row. For example double(List(1, 2, 3)) should return List(1, 1, 2, 2, 3, 3).

def double[A](list: List[A]): List[A] =
  list.foldLeft(List[A]())((r,c) => c :: c :: r).reverse

Again, pretty easy. Are you starting to see a pattern? When you use foldLeft to transform one list into another, you usually end up with the reverse of what you really want.

Alternately, you could have used the foldRight method instead. This does the same thing as foldLeft, except it accumulates its result from back to front instead of front to back. I can’t recommend using it, though, due to problems I point out in my other post on foldLeft and foldRight. But here’s what it would look like:

def double[A](list: List[A]): List[A] =
  list.foldRight(List[A]())((c,r) => c :: c :: r)

InsertionSort

This one takes some thinking. Write a function called ‘insertionSort’ that uses foldLeft to sort the input List using the insertion sort algorithm. Try it on your own before you look at the solution.

Need a hint? Use List’s ‘span’ method.

Did you find a solution? Here’s mine:

def insertionSort[A <% Ordered[A]](list: List[A]): List[A] =
  list.foldLeft(List[A]()) { (r,c) =>
    val (front, back) = r.span(_ < c)
    front ::: c :: back
  }

First, the type parameter ensures that we have elements that can be arranged in order. We start, predictably, with an empty list as our initial accumulator. Then, for each item we assume the accumulator is in order (which it always will be), and use span to split it into two sub-lists: all already-sorted items less than the current item, and all already-sorted items greater than or equal to the current item. We put the current item in between these two and the accumulator remains sorted. This is, of course, not the fastest way to sort a list. But it’s a neat foldLeft trick.
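As far as I know, the <% view bound used above is no longer accepted by newer Scala compilers, so here is a sketch of the same fold written against an implicit Ordering instead. The algorithm is unchanged; the wrapper object InsertionSortSketch and the parameter name ord are my own choices.

```scala
object InsertionSortSketch {
  // Same insertion sort, but "can be ordered" is expressed
  // by asking for an Ordering[A] instead of a view to Ordered[A].
  def insertionSort[A](list: List[A])(implicit ord: Ordering[A]): List[A] =
    list.foldLeft(List.empty[A]) { (r, c) =>
      // front: sorted items less than c; back: sorted items >= c.
      val (front, back) = r.span(ord.lt(_, c))
      front ::: c :: back
    }
}
```

Orderings for Int, String and friends are supplied by the standard library, so a call like insertionSort(List(3, 1, 2)) needs no extra ceremony.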

Pivot

Speaking of sorting, you can implement part of quicksort with foldLeft, the pivot. Write a function called ‘pivot’ that takes a List, and returns a Tuple3 containing: (1) a list of all elements less than the original list’s first element, (2) the first element, and (3) a List of all elements greater than or equal to the first element.

def pivot[A <% Ordered[A]](list: List[A]): (List[A],A,List[A]) =
  list.tail.foldLeft[(List[A],A,List[A])]( (Nil, list.head, Nil) ) {
    (result, item) =>
    val (r1, pivot, r2) = result
    if (item < pivot) (item :: r1, pivot, r2) else (r1, pivot, item :: r2)
  }

We’re using the first element, head, as the pivot value, so we skip the head and call foldLeft on list.tail. We initialize the accumulator to a Tuple3 containing the head element with an empty list on either side. Then for each item in the list we just pick which of the two lists to add to based on a comparison with the pivot value.

If you take the additional step of turning this into a recursive call, you can implement a quicksort algorithm. It probably won’t be a very efficient one because it will involve a lot of building and rebuilding lists. Give it a try if you like, and then look at my solution:

def quicksort[A <% Ordered[A]](list: List[A]): List[A] = list match {
  case head :: _ :: _ =>
    list.foldLeft[(List[A],List[A],List[A])]( (Nil, Nil, Nil) ) {
      (result, item) =>
      val (r1, r2, r3) = result
      if      (item < head) (item :: r1, r2, r3)
      else if (item > head) (r1, r2, item :: r3)
      else                  (r1, item :: r2, r3)
    } match {
      case (list1, list2, list3) =>
        quicksort(list1) ::: list2  ::: quicksort(list3)
    }
  case _ => list
}

Basically, for all lists that have more than 1 element the function chooses the head element as the pivot value, uses foldLeft to divide the list into three (less than, equal to, and greater than the pivot), recursively sorts the less-than and greater-than lists, and knits the three together.

Encode

Ok, we got a little into the weeds with that last one. Here’s a simpler one. Write a function called ‘encode’ that takes a List and returns a list of Pairs containing the original values and the number of times they are repeated. So passing List(1, 2, 2, 2, 2, 2, 3, 2, 2) to encode will return List((1, 1), (2, 5), (3, 1), (2, 2)).

def encode[A](list: List[A]): List[(A,Int)] =
  list.foldLeft(List[(A,Int)]()) { (r,c) =>
    r match {
      case (value, count) :: tail =>
        if (value == c) (c, count+1) :: tail
        else            (c, 1) :: r
      case Nil =>
        (c, 1) :: r
    }
  }.reverse

Decode

You knew this was coming. Write a function called ‘decode’ that does the opposite of encode. Calling ‘decode(encode(list))’ should return the original list.

def decode[A](list: List[(A,Int)]): List[A] =
  list.foldLeft(List[A]()) { (r,c) =>
    var result = r
    for (_ <- 1 to c._2) result = c._1 :: result
    result
  }.reverse

Encode and decode could both have been written by using foldRight and dropping the call to reverse.
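For the curious, here is a sketch of what those foldRight versions might look like. Note that the function literal's parameters swap places (current item first, accumulator second), and that I lean on List.fill for decode, which assumes a reasonably recent standard library. FoldRightSketch is a hypothetical wrapper name.

```scala
object FoldRightSketch {
  // foldRight walks from the back, so each pair is consed onto the
  // front of the result and no final reverse is needed.
  def encode[A](list: List[A]): List[(A, Int)] =
    list.foldRight(List.empty[(A, Int)]) { (c, r) =>
      r match {
        case (value, count) :: tail if value == c => (c, count + 1) :: tail
        case _                                    => (c, 1) :: r
      }
    }

  // Each (value, count) pair expands to count copies, placed in front of the rest.
  def decode[A](list: List[(A, Int)]): List[A] =
    list.foldRight(List.empty[A]) { (c, r) => List.fill(c._2)(c._1) ::: r }
}
```

The same caveat about foldRight from the ‘double’ example applies here, so treat this as a curiosity rather than a recommendation.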

Group

One last example. Write a function called ‘group’ that takes a List and an Int size and groups the elements into sublists of the specified size. So calling “group( List(1, 2, 3, 4, 5, 6, 7), 3)” should return List(List(1, 2, 3), List(4, 5, 6), List(7)). Don’t forget to make sure list items are in the right order. Try it yourself before you look at the solution below.

def group[A](list: List[A], size: Int): List[List[A]] =
  list.foldLeft( (List[List[A]](),0) ) { (r,c) => r match {
    case (head :: tail, num) =>
      if (num < size)  ( (c :: head) :: tail , num + 1 )
      else             ( List(c) :: head :: tail , 1 )
    case (Nil, num) => (List(List(c)), 1)
    }
  }._1.foldLeft(List[List[A]]())( (r,c) => c.reverse :: r)

This code uses the first foldLeft to group the items in a way that’s convenient to list operations, and that last foldLeft to fix the order, which would otherwise be wrong in both the outer and inner lists.
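If you want something to check your own answer against, later versions of the standard library have a grouped method that, as far as I can tell, produces the same grouping. This sketch assumes such a version is available; GroupedSketch is a made-up wrapper name, and this is of course not a foldLeft solution.

```scala
object GroupedSketch {
  // grouped hands back an iterator of chunks; toList collects them.
  def group[A](list: List[A], size: Int): List[List[A]] =
    list.grouped(size).toList
}
```

It makes a handy reference implementation for testing the foldLeft version on a few inputs, including a list whose length isn't a multiple of size.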

The End!

That’s all for now. If you know of any neat foldLeft tricks, please do leave a comment. I’d be interested to hear about it.

Building A Simple Scala List From Scratch

Matt — Wed, 22 Jul 2009 07:11:27 +0000

In Java you don’t see a lot of linked lists, and if you do it’s almost always java.util.LinkedList. People never write their own lists. They don’t really need to, I suppose. The one from java.util is fine. Plenty of people are leading fulfilling software careers never having implemented their own linked list. But it’s kind of a shame. Knowing how your data structures work makes you a better programmer.

It’s even rarer for a person to implement his own linked list in Scala. Scala’s scala.List is one of the most used classes in the language, so it’s packed with functionality. It’s abstract, covariant, it has helper objects such as List and Nil and the little-known ‘::’ class, it inherits from Product, Seq, Collection, Iterable, and PartialFunction. The machinery of List pulls in Array, ListBuffer, and more. It can be hard to take it all in.

So let’s build our own linked list. We’ll start out with something very basic and un-Scala-like. Then we’ll improve it gradually until we have something a little closer to scala.List. I encourage you to fire up your Scala interpreter and follow along.

Back to Basics

First, a short review of linked lists. A linked list is a chain of nodes, each referring to exactly one other node until you get to the end of the chain. You refer to the list by its first item and you follow the chain of references to reach the other nodes.

What are the requirements for our first try? Our node should be able to hold a piece of data, and refer to the next node. It should also be able to report its length, and provide a toString method so we can visualize the list. Here we go.

class MyList(val head: Any, val tail: MyList) {
  def isEmpty = (head == null && tail == null)
  def length: Int = if (isEmpty) 0 else 1 + tail.length
  override def toString: String = if (isEmpty) "" else head + " " + tail
}

The value ‘head’ holds the data for the node, ‘tail’ refers to the next element in the chain. The ‘isEmpty’ method is true if the head and tail are both null. The length and toString methods are both defined using a similar pattern: if (isEmpty) [base result] else [data for current node + result of same method on tail].

Here’s what it looks like when we use this class:

scala> var list = new MyList(null, null)
list: MyList =

scala> list.length
res0: Int = 0

scala> list.isEmpty
res1: Boolean = true

scala> list = new MyList("ABC", list)
list: MyList = ABC

scala> list.length
res3: Int = 1

scala> list.isEmpty
res4: Boolean = false

scala> list = new MyList("XYZ", list)
list: MyList = XYZ ABC

scala> list = new MyList("123", list)
list: MyList = 123 XYZ ABC

scala> list.tail.head
res7: Any = XYZ

Not bad. It gets the job done. But it has some problems. First is the use of ‘null’. Use of null references is sloppy and increases the odds of a null pointer exception so, ideally, we don’t want to see that. It has other problems, too. It’s too verbose. It’s not typesafe. But for now let’s concentrate on getting rid of the nulls.

No Nulls Is Good Nulls

How can we do it? We’re using the null as a special value, a marker to tell us when a node is at the end of a list. So we’ll just use something else as that marker instead. What can we use? We’ll create a special object for the empty list. It will be recognized as empty just based on its identity, not on null values. So let’s try it:

class MyList(val head: Any, val tail: MyList) {
  def isEmpty = false
  def length: Int = if (isEmpty) 0 else 1 + tail.length
  override def toString: String = if (isEmpty) "" else head + " " + tail
}

object MyListNil extends MyList("arbitrary value", null) {
  override def isEmpty = true
}

That’s better. (The observant reader will note the similarity of MyListNil to scala.List’s Nil object.) We got rid of the nulls in the isEmpty method, but we still have to put something in the head and tail parameters of the MyList constructor. We put an arbitrary non-null value in head, but what do we put for tail? Either null or create a new MyList. And how can that MyList be instantiated? It also needs a tail. Vicious circle. So this solution leaves us still stuck with a null.

Earlier, the null was there to mark a special node. We factored out that usage. Now it’s there to allow us to create the MyListNil. How can we factor that out? MyListNil is required to call its parent’s constructor. What if it had no parent? Then it wouldn’t be a MyList anymore. What if it had an abstract parent? Now you’re talking. Let’s see what that would look like.

abstract class MyList {
  def head: Any
  def tail: MyList
  def isEmpty: Boolean
  def length: Int
}

class MyListImpl(val head: Any, val tail: MyList) extends MyList {
  def isEmpty = false
  def length: Int = 1 + tail.length
  override def toString: String = head + " " + tail
}

object MyListNil extends MyList {
  def head: Any = throw new Exception("head of empty list")
  def tail: MyList = throw new Exception("tail of empty list")
  def isEmpty = true
  def length = 0
  override def toString =  ""
}

It’s a little more code, but much neater. There are no nulls anywhere. Here’s how it looks when we use this new MyList:

scala> var list: MyList = MyListNil
list: MyList =

scala> list = new MyListImpl("ABC", list)
list: MyList = ABC

scala> list = new MyListImpl("XYZ", list)
list: MyList = XYZ ABC

scala> list = new MyListImpl("123", list)
list: MyList = 123 XYZ ABC

scala> list.length
res3: Int = 3

scala> list.tail.head
res4: Any = XYZ

scala> list.tail.tail.tail.head
java.lang.Exception: head of empty list
        at ...

Pretty neat. The equivalent of MyListImpl in Scala’s real List implementation is a class called ‘::’, which has that funny name, by the way, because it looks nice in pattern matching code. Sometimes ‘::’ is referred to as cons. With nulls finally eliminated, we can concentrate on other issues.
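To see why the name reads well in patterns, here is a small sketch using the standard scala.List rather than our MyList. The function name ‘describe’ is made up for illustration.

```scala
object ConsPatternDemo {
  // Each case takes the list apart with Nil or the '::' (cons) pattern.
  def describe(xs: List[Int]): String = xs match {
    case Nil          => "empty"
    case head :: Nil  => s"one element: $head"
    case head :: tail => s"$head followed by ${tail.length} more"
  }
}
```

The pattern ‘head :: tail’ mirrors the expression you would write to build the list, which is the whole appeal.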

Brevity Is The Heart Of List

The thing that I notice at this point is that a lot of typing (on the keyboard) is required to use this list. We have to type out “list = new MyListImpl(…, list)” every time we add an item. We can improve this with a new method.

abstract class MyList {
  [...]
  def add(item: Any): MyList = new MyListImpl(item, this)
}

Now we have classes referring to each other. MyList creates new MyListImpls, and MyListImpl extends MyList. So you’ll need to put these classes in a .scala file and compile them instead of just typing them into the Scala interpreter. But, wow! Look how much easier it is to use MyList now:

scala> var list = MyListNil add("ABC") add("XYZ") add("123")
list: MyList = 123 XYZ ABC

scala> list.length
res1: Int = 3

So much easier! One thing I notice, though, is that the order of items in the code is different from the order produced by toString. We can change our ‘add’ method so that it is right-associative instead of left-associative by using a method name that ends in ‘:’ (colon). We’ll use ‘::’ as the method name since that’s what scala.List uses.

abstract class MyList {
  [...]
  def ::(item: Any): MyList = new MyListImpl(item, this)
}

scala> var list = "ABC" :: "XYZ" :: "123" :: MyListNil
list: MyList = ABC XYZ 123
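If the right-associative business seems like magic, here is a small sketch using the standard scala.List (ConsDemo is just a name I made up). An operator whose name ends in a colon is invoked on its right-hand operand, so the two values below are built by exactly the same calls.

```scala
object ConsDemo {
  // Infix form: reads left to right, but groups from the right.
  val sugared: List[String] = "ABC" :: "XYZ" :: "123" :: Nil

  // The same thing written as ordinary method calls, starting from Nil.
  val desugared: List[String] = Nil.::("123").::("XYZ").::("ABC")
}
```

Both values hold the items in the order ABC, XYZ, 123.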

Now we’re really getting somewhere. This is starting to look more like scala.List. One other thing that the standard list implementation gives you is a shortcut for initializing lists. It looks like “List(1, 2, 3, 4)”. Notice there’s no ‘new’ keyword. This is done using the scala.List helper object and its ‘apply’ method. Below is our own MyList helper object.

object MyList {
  def apply(items: Any*): MyList = {
    var list: MyList = MyListNil
    for (idx <- 0 until items.length reverse)
      list = items(idx) :: list
    list
  }
}

scala> var list = MyList("ABC", "XYZ", "123")
list: MyList = ABC XYZ 123

scala> list = "Cool" :: list
list: MyList = Cool ABC XYZ 123
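As an aside, the index loop in ‘apply’ is not the only way to do it. Here is a sketch of the same helper written with foldRight, which walks the arguments from the right and conses each one onto the accumulated list. The classes are repeated so the snippet compiles on its own; swapping in foldRight and string interpolation in toString are my substitutions, not part of the code above.

```scala
abstract class MyList {
  def head: Any
  def tail: MyList
  def isEmpty: Boolean
  def length: Int
  def ::(item: Any): MyList = new MyListImpl(item, this)
}

class MyListImpl(val head: Any, val tail: MyList) extends MyList {
  def isEmpty = false
  def length: Int = 1 + tail.length
  override def toString: String = s"$head $tail"
}

object MyListNil extends MyList {
  def head: Any = throw new Exception("head of empty list")
  def tail: MyList = throw new Exception("tail of empty list")
  def isEmpty = true
  def length = 0
  override def toString = ""
}

object MyList {
  // Start from the empty list and prepend the items right to left,
  // so the finished list keeps the argument order.
  def apply(items: Any*): MyList =
    items.foldRight(MyListNil: MyList)((item, acc) => item :: acc)
}
```

It behaves the same as the loop version, with no mutable variable and no index arithmetic.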

Better Type-Safe Than Sorry

Better. Our code looks much neater now when we use MyList. I’ll introduce just one more improvement to MyList. It still has a rather glaring problem. It provides no type information. It keeps all of its data using a reference to Any. If you don’t see why this is a problem, let’s see what happens when we want to get the length of some items in a MyList:

scala> var list = MyList("ABC", 12345, "WXYZ")
list: MyList = ABC 12345 WXYZ

scala> list.head.length
<console>:6: error: value length is not a member of Any
       list.head.length
                 ^

scala> list.head.asInstanceOf[String].length
res10: Int = 3

scala> list.tail.head.asInstanceOf[String].length
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
        at .<init>(<console>:6)

Ouch! First, when we try to call method ‘length’ on list.head Scala complains that list.head is a reference to Any. Any doesn’t have a length method. This isn’t a dynamically typed language like, say, Ruby. An object has to have the right type before we can start calling methods. What to do? You could implement a MyStringList where the head has type String. But then you’ll need a MyIntList, MyDoubleList, etc. What we need is a way to specify the type of data in the list when we create the MyList instance. What we need is a type parameter.

Here’s the complete MyList code using a type parameter, and a little demonstration code:

abstract class MyList[A] {
  def head: A
  def tail: MyList[A]
  def isEmpty: Boolean
  def length: Int
  def ::(item: A): MyList[A] = new MyListImpl[A](item, this)
}

class MyListImpl[A](val head: A, val tail: MyList[A]) extends MyList[A] {
  def isEmpty = false
  def length: Int = 1 + tail.length
  override def toString: String = head + " " + tail
}

object MyListNil extends MyList[Nothing] {
  def head: Nothing = throw new Exception("head of empty list")
  def tail: MyList[Nothing] = throw new Exception("tail of empty list")
  def isEmpty = true
  def length = 0
  override def toString =  ""
}

object MyList {
  def apply[A](items: A*): MyList[A] = {
    var list: MyList[A] = MyListNil.asInstanceOf[MyList[A]]
    for (idx <- 0 until items.length reverse)
      list = items(idx) :: list
    list
  }
}

scala> var list = MyList("ABC", "WXYZ", "123")
list: MyList[java.lang.String] = ABC WXYZ 123

scala> list.head.length
res0: Int = 3

scala> 3.14159 :: list
<console>:6: error: type mismatch;
 found   : Double
 required: java.lang.String
       3.14159 :: list
               ^

scala> var list = MyList("ABC", 123, 3.14159)
list: MyList[Any] = ABC 123 3.14159

Look at the first command in the demonstration. The “MyList(…)” call returns a MyList[String]. Scala figures out from the parameters what type to use. In the second command, “list.head.length”, you can see how much easier it is to use the list contents when you know the type at compile time.

If you try to mix types, as in the last command, Scala determines the nearest common ancestor of the types (Any, in this case) and uses that. However, if the type parameter is already determined, as in the “3.14159 :: list” command, it won’t change when you try to add data of a different type. To make that command work, we can make a small change to the ‘::’ method:

abstract class MyList[A] {
  [...]
  def ::[B >: A](item: B): MyList[B] = 
    new MyListImpl(item, this.asInstanceOf[MyList[B]])
}

scala> var list = MyList("ABC", "XYZ")
list: MyList[java.lang.String] = ABC XYZ

scala> 3.14159 :: list
res0: MyList[Any] = 3.14159 ABC XYZ

This says that ‘::’ takes a parameter of type B, which is either A or a superclass of A, and returns a MyList[B]. So if you have a MyList[String] and you call ‘::’ on it with a Double parameter, Scala figures out that although Double is not a superclass of String, String and Double are both descendants of Any, and it returns a MyList[Any].
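The asInstanceOf casts are a bit of a wart. As far as I can tell, scala.List avoids them by declaring its type parameter covariant (that is the ‘covariant’ I mentioned at the start). Here is a sketch of what that might look like for our list. The names CoList, CoCons, CoNil, and CoListDemo are made up so they don’t clash with the code above.

```scala
// +A means a CoList[String] can be used wherever a CoList[Any] is expected.
sealed abstract class CoList[+A] {
  def head: A
  def tail: CoList[A]
  def isEmpty: Boolean
  def length: Int
  // No cast needed: 'this' is already a valid CoList[B] for any B >: A.
  def ::[B >: A](item: B): CoList[B] = new CoCons[B](item, this)
}

final class CoCons[+A](val head: A, val tail: CoList[A]) extends CoList[A] {
  def isEmpty = false
  def length: Int = 1 + tail.length
  override def toString: String = s"$head $tail"
}

// CoList[Nothing] is a subtype of every CoList[A], so one object serves all.
object CoNil extends CoList[Nothing] {
  def head: Nothing = throw new Exception("head of empty list")
  def tail: CoList[Nothing] = throw new Exception("tail of empty list")
  def isEmpty = true
  def length = 0
  override def toString = ""
}

object CoListDemo {
  val strings: CoList[String] = "ABC" :: "XYZ" :: CoNil
  val mixed: CoList[Any] = 3.14159 :: strings
}
```

With covariance the empty list needs no cast either, since CoNil is usable directly as the tail of any list.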

Conclusion

That’s a good stopping point for now. Obviously you can take the MyList class a lot further and add a lot more methods, but we’ve created some code that approximates the basics provided by scala.List. In fact, you could take several of the scala.List methods (foldLeft, for example) and basically drop them right into MyList and they’d work fine.
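To back up that claim a little, here is a sketch of a foldLeft added to the type-parameterized MyList. It is a simple loop of my own, not a copy of the scala.List source, and the classes are repeated (with ‘apply’ shortened using foldRight) so the snippet compiles on its own.

```scala
abstract class MyList[A] {
  def head: A
  def tail: MyList[A]
  def isEmpty: Boolean
  def ::(item: A): MyList[A] = new MyListImpl[A](item, this)

  // Walk the chain front to back, feeding each item into the accumulator.
  def foldLeft[B](z: B)(f: (B, A) => B): B = {
    var acc = z
    var rest: MyList[A] = this
    while (!rest.isEmpty) {
      acc = f(acc, rest.head)
      rest = rest.tail
    }
    acc
  }

  // length falls out of foldLeft for free.
  def length: Int = foldLeft(0)((n, _) => n + 1)
}

class MyListImpl[A](val head: A, val tail: MyList[A]) extends MyList[A] {
  def isEmpty = false
}

object MyListNil extends MyList[Nothing] {
  def head: Nothing = throw new Exception("head of empty list")
  def tail: MyList[Nothing] = throw new Exception("tail of empty list")
  def isEmpty = true
}

object MyList {
  def apply[A](items: A*): MyList[A] =
    items.foldRight(MyListNil.asInstanceOf[MyList[A]])((item, acc) => item :: acc)
}
```

Once foldLeft is in place, things like sums, string concatenation, and length need no further recursion of their own.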

Scala Boggles The Mind

Matt — Mon, 06 Jul 2009 12:56:25 +0000

I was playing a game on my iPhone called Scramble the other day. It’s a great game. You are presented with a 4×4 grid of letters, and your job is to find words by chaining together adjacent letters. It bears a passing similarity to Boggle. I was playing online and I noticed that many other players play much better than I do. Well, a software developer doesn’t take a thing like this lying down. I decided it was time to write a Scramble (or Boggle) solver!

And since I’m trying to pick up the Scala language whenever I have an opportunity, I wrote the whole thing in Scala. I hope it will be, for the reader, an interesting study of a complete and useful (though simple) example Scala program.

First, I had to decide on a strategy. If we try every path on the game board, that’s inefficient. For example, if we try a path XQ then we can stop right there. XQ is not a word and no word starts with the letters XQ. This suggests using some sort of spell-checker-like logic. So let’s first create a dictionary data structure that allows us to look up words and eliminate dead ends. What we need is a tree. Here’s the data structure I came up with.

import scala.collection.mutable.HashMap

class LetterTree {
    private val nodes: HashMap[Char,LetterTree] = new HashMap[Char,LetterTree]
    var terminal: Boolean = false
    def addWord(word: String): Unit = addWord(word.toList)
    def addWord(word: List[Char]): Unit = word match {
        case Nil          => terminal = true
        case head :: tail => nodes.getOrElseUpdate(head, new LetterTree).addWord(tail)
    }
    def getSubTree(letter: Char): Option[LetterTree] =
        if (nodes.contains(letter)) Some(nodes(letter)) else None
}

It’s a mutable class. I toyed with some ideas for an immutable class, but it just complicated things more than I wanted to cope with. Our dictionary will be a tree in which LetterTree is the node class. You can see that a LetterTree has two member data: a hashmap of child LetterTrees indexed by Char, and a flag called terminal. So a LetterTree is a tree node whose child nodes are indexed by letter, and which can be marked (via the terminal flag) as ending a word. This is a pretty efficient way to store words.

There are two addWord methods for populating the tree. One takes a list of Chars. The other takes a String and is included for convenience. It converts its String parameter to a list of Chars and calls the other addWord method. If the addWord method is passed an empty list (Nil), then it has reached the end of a word and sets the terminal flag. Otherwise, it takes the first Char in the list and looks up (or else creates) the LetterTree mapped to that Char. The method then adds the remainder of the Char list to that mapped LetterTree.

Finally, there’s the getSubTree method. This will be useful when we start using the tree to look up potential word matches.

So imagine that we create a LetterTree and add the following words: item, its, it. We get a structure like this:

Figure 1

Each of the boxes represents a LetterTree instance. Each arrow, labeled with a letter, represents an entry in the ‘nodes’ HashMap. The top level LetterTree does not itself represent a letter. It’s just a starting point. Its ‘nodes’ member has just one mapping: letter ‘i’ maps to a second LetterTree. That second LetterTree doesn’t have its ‘terminal’ flag set, so we haven’t made a word yet. Its ‘nodes’ map has a single entry mapping ‘t’ to a third LetterTree. This third LetterTree is a terminal node, so we know that the sequence ‘it’ forms a word. The third LetterTree’s ‘nodes’ map has entries for ‘s’ and ‘e’. The LetterTree mapped to ‘s’ is terminal, and the one mapped to ‘e’ is not. The ‘e’ node has a child node ‘m’ that is a terminal.

This is a pretty efficient way to store words. We got to use the sequence ‘it’ 3 times! Also, note that there is exactly one terminal node for each word. If we add a word, we will either append a new leaf node which will be terminal, or we will make one of the existing nodes terminal.
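To make the walk through the tree concrete, here is a sketch that builds the item/its/it tree and follows the arrows one letter at a time using getSubTree. The helpers ‘walk’, ‘isPrefix’, and ‘isWord’ are hypothetical names of my own and are not used by the solver itself. LetterTree is repeated so the snippet stands alone, with getSubTree shortened to nodes.get, which returns the same Option.

```scala
import scala.collection.mutable.HashMap

class LetterTree {
  private val nodes = new HashMap[Char, LetterTree]
  var terminal: Boolean = false
  def addWord(word: String): Unit = addWord(word.toList)
  def addWord(word: List[Char]): Unit = word match {
    case Nil          => terminal = true
    case head :: tail => nodes.getOrElseUpdate(head, new LetterTree).addWord(tail)
  }
  def getSubTree(letter: Char): Option[LetterTree] = nodes.get(letter)
}

object LetterTreeDemo {
  // Follow one arrow per character; None means we hit a dead end.
  def walk(tree: LetterTree, chars: String): Option[LetterTree] =
    chars.foldLeft(Option(tree))((node, ch) => node.flatMap(_.getSubTree(ch)))

  // A prefix only needs a node to exist; a word needs that node to be terminal.
  def isPrefix(tree: LetterTree, chars: String): Boolean = walk(tree, chars).isDefined
  def isWord(tree: LetterTree, chars: String): Boolean = walk(tree, chars).exists(_.terminal)

  val tree = new LetterTree
  List("item", "its", "it").foreach(w => tree.addWord(w))
}
```

So ‘ite’ is a live prefix but not a word, ‘it’ is both, and anything starting with ‘ix’ is a dead end that the solver can abandon immediately.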

Now, all we have to do is create a LetterTree and call the addWord method for every word we can think of. That could be annoying. Let’s create an improved LetterTree that will read a list of words from a file.

import java.io.File
import scala.io.Source

class FileLetterTree(path: String) extends LetterTree {
    val file = new File(path)
    for (line <- Source.fromFile(file).getLines) addWord(line.trim)
}

Can you believe how easy that was?! We just extend LetterTree, take a file path as a constructor parameter, use the scala.io.Source class to get all lines, and add each line as a word. Now just find a text file containing all English words. You should be able to google it.
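One thing the short version glosses over is that the file is never closed. On Scala 2.13 or later, scala.util.Using can take care of that. This is only a sketch, and ‘WordFile.loadWords’ is a hypothetical helper rather than part of the solver. It reads the words into a list that could then be fed to addWord.

```scala
import scala.io.Source
import scala.util.Using

object WordFile {
  // Using.resource closes the Source when the block finishes,
  // whether it completes normally or throws.
  def loadWords(path: String): List[String] =
    Using.resource(Source.fromFile(path)) { src =>
      // toList forces the lazy line iterator before the file is closed.
      src.getLines().map(_.trim).filter(_.nonEmpty).toList
    }
}
```

For a one-shot program that exits right away, the original one-liner is fine. This matters more if the code ends up inside something long-running.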

You might want to ensure the file contains only words 3 letters or longer (as required by the rules of the game), only lower case, and only the 26 English letters. Since Scramble has no ‘Q’ piece (neither does Boggle) but only a ‘Qu’ piece, you might want to do a global replace on your word file, replacing all instances of ‘qu’ with ‘q’.
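If you would rather do that cleanup in code than in a text editor, here is a sketch of the filter. ‘WordFilter.clean’ is a hypothetical helper. It assumes the length rule applies to the word as spelled, before ‘qu’ is collapsed, and that words with capitals or punctuation should simply be dropped.

```scala
object WordFilter {
  // Returns the cleaned word, or None if the word should be left out.
  def clean(raw: String): Option[String] = {
    val word = raw.trim
    // Keep only words of 3+ characters made entirely of lower case a-z.
    if (word.length >= 3 && word.matches("[a-z]+"))
      Some(word.replace("qu", "q")) // the board has a single 'Qu' piece
    else
      None
  }
}
```

You could run every line of the word file through this before calling addWord, skipping the lines that come back as None.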

Ok, now we have a data structure for looking up words. Now we need to create some code that represents the game board. We’ll assume a 4-by-4 board.

class GameBoard(lettersStr: String) {
    private val ltrStr = lettersStr.toLowerCase()
    if (!ltrStr.matches("^[a-z]{16}$"))
        throw new Exception("Exactly 16 letters a-z are required.")

    override def toString: String =
        ltrStr.substring(0,4)  + "\n" + ltrStr.substring(4,8) + "\n" +
        ltrStr.substring(8,12) + "\n" + ltrStr.substring(12,16)

    case class Letter(letter: Char) {
        var neighbors = List[Letter]()
        def addNeighbor(nbr: Letter) = { neighbors = nbr :: neighbors }
        override def toString = letter.toString
    }

    val letters = new Array[Array[Letter]](4,4)
    for (idx <- 0 until ltrStr.length)
        letters(idx/4)(idx % 4) = Letter(ltrStr(idx))

    for ( idx <- 0 to 3; jdx <- 0 to 3; iOff <- -1 to 1; jOff <- -1 to 1;
          if (iOff != 0 || jOff != 0) &&
          idx + iOff >= 0 && idx + iOff < 4 &&
          jdx + jOff >= 0 && jdx + jOff < 4 )
        letters(idx)(jdx).addNeighbor(letters(idx + iOff)(jdx + jOff))
}

That’s a lot of new code, but it is actually pretty simple if we break it down:

The first three lines of the class body just ensure that we have good input. The code converts the constructor parameter to lower case, and then confirms that it is composed of exactly 16 letters from a to z.

The next section is a toString method. It just returns the 16-letter string in 4-letter chunks separated by newlines.

Next, there is an inner class called Letter. It encapsulates one letter on the game board, and includes a way to keep track of neighboring Letters (think of it as a graph node).

The ‘letters’ declaration creates the game board, a 4-by-4 array of Letters. The for-loop that follows initializes each of the 16 Letter objects using the corresponding letters from the 16-letter string.

All that’s left is to tell each Letter who his neighbors are. This is done with the somewhat complex for-loop at the end of the class. This structure loops through two indices, idx and jdx, as well as two offsets, iOff and jOff. It’s like four nested loops combined into one. It loops through idx and jdx values from 0 to 3, so it runs for each of the 16 Letters in the grid. It also loops through iOff and jOff from -1 to 1, so it looks at each neighbor of each Letter. See? For each of the 16 Letters in the grid, it looks at each of the 9 Letters around that Letter (including the Letter itself).

The last section of the for-loop header (the ‘if’ guard) defines which combinations of idx, jdx, iOff, and jOff are valid and should be processed in the loop body. You see why, right? First of all, there’s no need for a Letter to add itself to its list of neighbors. It’s against the game rules to use the same grid position twice in the same word, so any iterations in which iOff and jOff are both 0 are eliminated by the first condition of the guard.

Also, there are grid positions at the edges and corners. Those positions will have fewer neighbors. So for the Letter at position idx=0, jdx=0, offsets of iOff=-1 OR jOff=-1 make no sense. There are no neighboring Letters in the -1 direction at that position. These restrictions are made by the bounds checks in the remaining two lines of the guard.

Finally, if the indices and offsets pass the tests, the Letter at location (idx, jdx) is assigned a new neighbor, the Letter at (idx+iOff, jdx+jOff).
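If the combined loop is hard to picture, here is a sketch that computes just the neighbor coordinates for one grid position, using the same two conditions. ‘NeighborDemo.neighbors’ is a hypothetical helper that works on plain index pairs instead of Letter objects.

```scala
object NeighborDemo {
  // All in-bounds positions one step away from (idx, jdx), diagonals included.
  def neighbors(idx: Int, jdx: Int, size: Int = 4): List[(Int, Int)] =
    for {
      iOff <- List(-1, 0, 1)
      jOff <- List(-1, 0, 1)
      if iOff != 0 || jOff != 0 // skip the position itself
      i = idx + iOff
      j = jdx + jOff
      if i >= 0 && i < size && j >= 0 && j < size // stay on the board
    } yield (i, j)
}
```

A corner such as (0, 0) ends up with 3 neighbors, an edge position such as (0, 1) with 5, and an interior position such as (1, 1) with all 8.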

Of course, this isn’t the only way to write a program like this. We could have used a simple array of characters and put the neighbor-finding logic in the solver code, but I chose to divide up the responsibilities like this.

We’ve finished setting up the game board. We have a dictionary of legal words. Now we’re ready to play. Let’s add a function called findWords to the GameBoard class.

def findWords(tree: LetterTree): List[String] = {
    def findWords(tree: LetterTree, letter: Letter, sofar: List[Letter]): List[String] = {
        tree.getSubTree(letter.letter) match {
          case Some(subTree) =>
            var words: List[String] = Nil
            if (subTree.terminal) words = (letter :: sofar).foldLeft("")((c,n) => n+c) :: words
            for (nextLetter <- letter.neighbors if !sofar.contains(nextLetter))
                words = findWords(subTree, nextLetter, letter :: sofar) ::: words
            words
          case None => Nil
        }
    }
    var words: List[String] = Nil
    for (idx <- 0 to 3; jdx <- 0 to 3)
        words = words ++ findWords(tree, letters(idx)(jdx), Nil)
    words
}

This is the heart of our little program. This is the code that does the real work. The first thing that happens in this function is that we define an inner function. We’ll look at that in a second. First, look at code below the inner function. We create an empty List of Strings, then for each Letter in the 4-by-4 grid we call the inner findWords function. We pass in the dictionary tree (at the top tree level), the current Letter, and an empty list. That’s the list of letters used so far. At this point, no letters have been used yet, so it’s empty (Nil).

The result of the inner function call is a list of all valid words that start with the given Letter. That list is added to the list of all valid words. We do this for each of the 16 Letters.

Now, for that inner function. Let’s first examine the parameter list. First is tree of type LetterTree. This is the dictionary class we built earlier. On the first call, we pass in the whole dictionary, the top level tree. As we search for words, though, we will pass sub-trees, sub-sub-trees and so forth. The second parameter is a letter of type Letter. That’s just the current letter that we’re evaluating.

The last parameter is called sofar and it has type List[Letter]. Why is it called sofar? Because it’s the ordered list of connected Letters from the game board that we know begin words in the dictionary. The sofar list might contain the letters (c, o, m, p) because this is a prefix to words like compute, computer, computing, comparison, etc. We would never be passed a sofar list containing (c, o, m, p, x) because although this sequence could show up on the game board, this is not a prefix to any word in the dictionary. This parameter will grow longer with each recursive call to the inner function. This is why on the first call to the inner function we pass the empty list, Nil, as the sofar parameter.

We first look up the sub-tree beginning with the letter value of the Letter parameter. If we find that there is no sub-tree for this letter (the result of getSubTree matches None) then we know that there are no words beginning with the letter. It’s a dead end, so we return an empty list and we do not recurse any further.

If we do find a sub-tree, though, that means that there are words that begin with the sofar list followed by this letter. In the line that tests subTree.terminal, we check to see if the node for the current letter is a terminal node in our dictionary. If it is, then we have a legal word. We add the current letter to the sofar list, convert the list to a string, and put it in the local words list.

In the next line, we loop through all of the current Letter’s neighbors, excluding any that are already in the sofar list. For each unused neighbor, we make a recursive call. This time, the parameters are the dictionary sub-tree, the neighbor letter generated by the for loop, and the sofar list with the current Letter appended. The list of words returned by each recursive call is combined with the local word list. After all the neighboring letters have been checked, the word list is returned from the function.
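One detail worth spelling out: since ‘letter :: sofar’ puts the newest letter at the front, the sofar list actually holds the path most-recent-first, and the foldLeft with ‘n+c’ is what puts it back into reading order. Here is a sketch with plain Chars standing in for Letter objects (PathDemo and its values are made up for illustration):

```scala
object PathDemo {
  // The path c -> o -> m, stored most-recent-first as findWords does.
  val sofar: List[Char] = List('m', 'o', 'c')
  val current: Char = 'p'

  // Each element n is placed in front of the string built so far,
  // so the front-of-list letter ends up last.
  val word: String = (current :: sofar).foldLeft("")((c, n) => n.toString + c)

  // An equivalent spelling: reverse the list, then join it.
  val same: String = (current :: sofar).reverse.mkString
}
```

Both values come out as "comp".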

This is basically a graph problem. The game board is an undirected graph. The dictionary is a tree, which is a type of graph. We are taking circuit-free paths along the game board graph and mapping them to paths on the tree, accumulating a list of matching paths for which the end is marked terminal on the tree. There are Java (and maybe Scala) libraries for dealing with graphs, and when I get a chance I’d like to see if I can get a tidier implementation using one of these libraries.

Now, we’ll just pull all this together in a neat little package:

class PuzzleSolver(dictionaryPath: String) {
    val tree = new FileLetterTree(dictionaryPath)
    def solve(letters: String) = {
        val board = new GameBoard(letters)
        val wordSet = new HashSet[String]() ++ board.findWords(tree)
        val sortedWords = wordSet.toList.sort{ (a,b) =>
            a.length > b.length || (a.length == b.length && a > b)
        }
        println(sortedWords)
    }
}

Now we can create an instance of the PuzzleSolver class for a dictionary file that we specify. Then we can call the solve function for game board configurations. This class finds all the legal words contained in the game board, removes duplicates by collecting them in a set, sorts them by length (longest first) and then in reverse alphabetical order within each length, and prints them out.
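A note for anyone on a newer Scala: the ‘sort’ method on List went away after the 2.7 series, and ‘sortWith’ takes the same kind of comparison function. Here is a sketch of just the de-duplicate-and-sort step in that style (SortDemo.rank is a hypothetical helper, not part of PuzzleSolver):

```scala
object SortDemo {
  // Drop duplicates, then order longest-first; ties go in reverse alphabetical order.
  def rank(words: Iterable[String]): List[String] =
    words.toSet.toList.sortWith { (a, b) =>
      a.length > b.length || (a.length == b.length && a > b)
    }
}
```

Given "vest", "tinsel", "vent", "tensor", and a second "vest", this returns tinsel, tensor, vest, vent.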

Here’s a sample session in the Scala interpreter:

scala> val solver = new PuzzleSolver("./words.txt")
solver: PuzzleSolver = PuzzleSolver@1876e5d

scala> solver.solve("TestingTheSolver")
List(storeen, torsel, tinsel, tensor, seeing, nestor, inseer, 
ingest, verst, verso, torse, tinge, store, soree, soget, snite, 
sneer, rotse, rotge, rosel, reest, orsel, inset, insee, ingot, 
hinge, gorse, geest, vlei, vest, vent, vein, veer, veen, tore, 
togs, ting, tine, tien, teng, stog, sore, snee, sero, sent, 
seit, sego, seer, seen, seel, rose, rest, rees, reen, reel, 
ogee, neti, nest, neer, lest, lent, lens, lehi, lees, leer, 
iten, hint, hing, hest, hent, hein, heer, gore, goes, goer, 
gest, gent, gens, gein, eros, vei, vee, tor, tog, toe, tin, 
tie, ten, teg, sot, sog, soe, set, ser, sen, seg, see, rot, 
rog, roe, rev, ree, ose, ore, oes, oer, nit, net, nei, nee, 
lev, les, len, lei, leg, lee, ing, hit, hin, hie, hen, hei, 
got, gos, gor, get, ges, gen, gel, gee, ers, ens, ego, eer, 
eel)

scala>

It works! Here’s the complete source code:

import java.io.File  
import scala.io.Source  
import scala.collection.mutable.HashMap  
import scala.collection.immutable.HashSet  
  
class LetterTree {  
    private val nodes: HashMap[Char,LetterTree] = new HashMap[Char,LetterTree]  
    var terminal: Boolean = false  
    def addWord(word: String): Unit = addWord(word.toList)  
    def addWord(word: List[Char]): Unit = word match {  
        case Nil          => terminal = true  
        case head :: tail => nodes.getOrElseUpdate(head, new LetterTree).addWord(tail)  
    }  
    def getSubTree(letter: Char): Option[LetterTree] =  
        if (nodes.contains(letter)) Some(nodes(letter)) else None  
}  

class FileLetterTree(path: String) extends LetterTree {  
    val file = new File(path)  
    for (line <- Source.fromFile(file).getLines) addWord(line.trim)  
}  


class GameBoard(lettersStr: String) {  
    private val ltrStr = lettersStr.toLowerCase()  
    if (!ltrStr.matches("^[a-z]{16}$"))
        throw new Exception("Exactly 16 letters a-z are required.")
  
    override def toString: String =  
        ltrStr.substring(0,4)  + "\n" + ltrStr.substring(4,8) + "\n" +  
        ltrStr.substring(8,12) + "\n" + ltrStr.substring(12,16)  
  
    case class Letter(letter: Char) {  
        var neighbors = List[Letter]()  
        def addNeighbor(nbr: Letter) = { neighbors = nbr :: neighbors }  
        override def toString = letter.toString  
    }  
  
    val letters = new Array[Array[Letter]](4,4)  
    for (idx <- 0 until ltrStr.length)  
        letters(idx/4)(idx % 4) = Letter(ltrStr(idx))  
  
    for ( idx <- 0 to 3; jdx <- 0 to 3; iOff <- -1 to 1; jOff <- -1 to 1;  
          if (iOff != 0 || jOff != 0) &&  
          idx + iOff >= 0 && idx + iOff < 4 &&  
          jdx + jOff >= 0 && jdx + jOff < 4 )  
        letters(idx)(jdx).addNeighbor(letters(idx + iOff)(jdx + jOff))  

  def findWords(tree: LetterTree): List[String] = {  
      def findWords(tree: LetterTree, letter: Letter, sofar: List[Letter]): List[String] = {  
          tree.getSubTree(letter.letter) match {  
            case Some(subTree) =>  
              var words: List[String] = Nil  
              if (subTree.terminal) words = (letter :: sofar).foldLeft("")((c,n) => n+c) :: words  
              for (nextLetter <- letter.neighbors if !sofar.contains(nextLetter))
                words = findWords(subTree, nextLetter, letter :: sofar) ::: words
              words  
            case None => Nil  
          }  
      }  
      var words: List[String] = Nil  
      for (idx <- 0 to 3; jdx <- 0 to 3)  
          words = words ++ findWords(tree, letters(idx)(jdx), Nil)  
      words  
  }  
}  

class PuzzleSolver(dictionaryPath: String) {  
    val tree = new FileLetterTree(dictionaryPath)  
    def solve(letters: String) = {  
        val board = new GameBoard(letters)  
        val wordSet = new HashSet[String]() ++ board.findWords(tree)  
        val sortedWords = wordSet.toList.sort{ (a,b) =>  
            a.length > b.length || (a.length == b.length && a > b)  
        }  
        println(sortedWords)  
    }  
}

One last thing: To be clear, no I don’t actually use this to cheat at online games. Just knowing that I could is satisfying enough for me.

A New Post Template for WordPress

Matt — Wed, 20 Aug 2008 14:20:49 +0000

I like to include a copyright notice on my posts, a small one down on the lower right, and a couple of links for RSS and Twitter. The problem is that I always go back and add it as an afterthought. So I write a post, proof it, convince myself that it’s perfect, post it, and then I notice that I didn’t put that little footer at the end.

I use wordpress.com to host my site. I looked around the admin console for something pertaining to post templates but didn’t find anything. I googled a little to see whether such a feature exists, but I didn’t find anything.

Then it occurred to me. Matt Malone, you handsome devil, aren’t you a professional software developer? And don’t you have the gall to write a blog proclaiming yourself such? Why don’t you write something yourself? So I did. I wrote a very simple greasemonkey script. Here it is below. Feel free to install the script if you use wordpress.com and think it would be useful.

// ==UserScript==
// @name           WordPress Post Template
// @namespace      oldfashionedsoftware.com
// @description    Inserts some template text in new blog posts
// @include        http://matthewmalone.wordpress.com/wp-admin/post-new.php
// ==/UserScript==
var postTextAreaList, postTextArea;
postTextAreaList = document.evaluate( "//textarea[@name='content']",
    document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
postTextArea = postTextAreaList.snapshotItem(0);
postTextArea.value = "Your template text here";