Monday, May 4, 2009

File utilities and the beauty of closures

Arg! I made the same mistake again. You know: the one where you made a bad svn commit, and there are those pesky .svn folders everywhere. Well, it happened to me twice in the same night, and I'm too lazy to delete them by hand. I decided a simple Scala script would do the trick.

A simple little recursive function would do the trick. I fired up the scala interpreter, and whipped out the following function in a minute or two:

scala> def recurse(folder: {
  | if(folder.isDirectory) {
  |   folder.listFiles.foreach{recurse(_)}
  | } 
  | if (folder.getName.contain(".svn")) folder.delete
recurse: (

scala> recurse(new File("."))

Bam! That did the trick! I started to think about the function more (even though it was already stupid simple). There's a way to make this recursive function more general. Say I wanted a function that allows me to traverse a directory and do whatever I want, instead of traverse a directory and only delete .svn directories. A small change to the recursive function would allow that:

def recurse(folder: File) (action: File => Unit) {
  if(folder.isDirectory) {
    folder.listFiles.foreach(f => recurse(f)(action))

The keen Scalafied eye would quickly see what I did there. For the rest of us, I'm going to begin explaining now:

The function recurse now takes 2 parameters: a, and a function that takes in a and returns Unit (which in the Scala world is equivalent to Java's void). The function parameter is named action. For each file, and directory, action will be called on it. So, now I can call it like so:

recurse(new File(".")) { f=> 
  if (f.getName.contains(".svn")) f.delete

That's just doing the same as before, but now, let's say I wanted to find every modified .scala from now to a week ago:

// Get the date from a week ago
val cal = Calendar.getInstance
cal.add(Calendar.DATE, -7)
recurse(new File(".")) { f=>
  if (f.getName.endsWith(".scala") && f.lastModified >= cal.getTime) println(f.getName)

Even better still: with Scala's closure support and partial functions, you can do some nifty actions with our very basic utility function. I want to show you the latter first:

// This is partial function syntax
// Regardless of the parent directory, I want to delete svn's
val deleteSvnIn = recurse(_: File) { f=>
  if (f.getName.contains(".svn")) f.delete

// Calling the partial looks very natural
deleteSvnIn(new File("."))
deleteSvnIn(new File("/some/very/bad/dir"))

// Or vice versa
val inCurrentDir(new File(".")) _

inCurrentDir { f =>
  if(f.getName.endsWith(".scala")) println(f.getAbsolutePath)

What I've shown you thus far is nothing spectacular. What I mean by that, is simply the same could be accomplished with any language that allows anonymous functions. Now, I'll show you closures. Allow me to give you a scenario first, so what I say makes sense: You were giving this library to work with, except you actually need the recurse function to return a value not just arbitrarily do something. You need to persist some data whether it be a count, a list, whatever. Given that you can't change
the horribly coded utility function, your options are:

  • Write another function that does what you need.
  • Rely on Scala's closure support for your help, rewrite the function if time permits, and submit a patch :)

Let's rely on closures for the time being.

// I want to count all the svn directories
var count = 0
recurse(new File(".")) { f=>
  if (f.getName.contains(".svn")) count += 1


// I want to store all scala files
var ls = List[File]()
recurse(new File(".")) { f=>
  if (f.getName.endsWith(".scala")) ls += f

And there you have it! Even though we're solving problems outside the scope of this function's original goal, you're able to bend the rules with the power of closures. The functional programmer would have noticed the use of var's and scowled. I merely wanted to prove a point, not start a flame war.

-- Philip

No comments:

Post a Comment