Sun 17 Aug 2008
Easy Parsing in Scala: Addendum
Posted by Matt under Scala
Yesterday I made a post called Easy Parsing in Scala about using the Scala parsing libraries. I’ve made a couple of changes to the code since then.
First, I noticed that the regex method takes a Regex object as its only parameter. Why, I thought to myself, didn’t they just make the method take a String so I don’t have to keep typing “new Regex”? Duh. Big duh. They’re giving me the opportunity to reuse Regex objects instead of stupidly recreating them over and over. So I added three private constant regular expressions that I could reuse: spaceRegex, numberRegex, and wordRegex. I couldn’t make a constant Regex for the one that matches a given number of characters, of course, because its pattern depends on the length that was just parsed.
Second, I eliminated some repetition by adding a regexAndSpace method that matches a regular expression and then throws away the following whitespace. That’s a job that’s repeated 3 times, so I thought it made sense to factor it out. Without further ado, here’s the updated code:
import scala.util.parsing.combinator._
import scala.util.matching.Regex

object SvnParser extends RegexParsers {
  // Compiled once and reused by every parse.
  private val spaceRegex  = new Regex("[ \\n]+")
  private val numberRegex = new Regex("[0-9]+")
  private val wordRegex   = new Regex("[a-zA-Z][a-zA-Z0-9-]*")

  private def space = regex(spaceRegex)

  // Match a regular expression, then throw away the whitespace that follows it.
  private def regexAndSpace(re: Regex) = regex(re) <~ space

  // Whitespace is handled explicitly by the rules below.
  override def skipWhitespace = false

  def number = regexAndSpace(numberRegex)
  def word   = regexAndSpace(wordRegex)

  // A string is "<len>:<chars>"; this Regex depends on len, so it can't be a constant.
  def string = regex(numberRegex) >> { len => ":" ~> regexAndSpace(new Regex(".{" + len + "}")) }

  def list: Parser[List[Any]] = "(" ~> space ~> ( item + ) <~ ")" <~ space
  def item = ( number | word | string | list )

  def parseItem(str: String) = parse(item, str)
}

SvnParser.parseItem("( 5:abcde 3:abc \n 20:three separate words ( abc def \n\n\n 123 ) ) ") match {
  case SvnParser.Success(result, _) => println(result.toString)
  case _ => println("Could not parse the input string.")
}
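For anyone puzzling over why the string rule can't use a constant Regex, here is a rough sketch of the same length-prefixed idea done by hand, using only scala.util.matching.Regex. The LengthPrefixed object and its readString helper are names I made up for illustration; they aren't part of SvnParser or the parsing library. One difference to keep in mind: this version takes the next len characters whatever they are, while the ".{len}" pattern above won't match across a newline.

```scala
import scala.util.matching.Regex

object LengthPrefixed {
  // Compiled once and reused, in the same spirit as numberRegex in the post.
  // The length is capped at 9 digits (an assumption) so that toInt cannot overflow.
  private val lengthPrefix: Regex = new Regex("([0-9]{1,9}):")

  // Hypothetical helper: reads one "<len>:<chars>" string from the start of the
  // input and returns the payload together with whatever input is left over.
  def readString(input: String): Option[(String, String)] =
    lengthPrefix.findPrefixMatchOf(input).flatMap { m =>
      val len = m.group(1).toInt
      val start = m.end
      if (start + len <= input.length)
        Some((input.substring(start, start + len), input.substring(start + len)))
      else
        None // the declared length runs past the end of the input
    }
}
```

Calling LengthPrefixed.readString("5:abcde 3:abc") should give back the payload "abcde" plus the remainder " 3:abc". The length has to be read before the payload can be matched, which is exactly the job the >> combinator does in the string rule.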