Morning Doughnuts 4

  • According to Reuters, surgeons who play video games are more skilled. Remind me to ask the doctor whether s/he owns an Xbox 360 the next time I am being operated on.
  • I have reached the National Championship game in dynasty mode of NCAA Football 2007. The opponent of my BYU Cougars…why, that would be Harry’s alma mater, the USC Trojans. Funny how that worked out.
  • Nicholas Allen writes in his blog about when you should use Indigo to write a channel and, more importantly, when you should not. As most of you know, Harry and I are doing quite a bit of work with WCF, so we are interested in this type of advice.
  • Our team has been thinking about how to manage a large number of services in an automated fashion. This would include deploying new services, monitoring them, automatically handling scaling, service discovery, and automated provisioning, to name a few possible capabilities. I almost think of it as the next version of UDDI, especially when it comes to provisioning. As systems become more distributed, the ability to manage them automatically is going to be key to their success. I know that people far smarter than I have already put thought into this area, but as I consider how to operate an infrastructure with thousands of services, it is apparent that there is an opportunity for us to design and implement a system management framework that automates the majority of these tasks. I need to spend some time considering how the framework would work and documenting its capabilities.

Internal DSLs in PowerShell

(Harry is on a secret mission in uncharted space this week, so instead of the daily Morning Coffee post, you get a series of autoposted essays. This post combines some leftover learnings about Ruby from Harry’s Web 2.0 days with his recent obsession with PowerShell.)

My first introduction to the idea of internal DSLs was an article on Ruby Rake by Martin Fowler. Rake is Ruby’s make/build utility. Like other build tools such as Ant and MSBuild, Rake is a dependency management system. Unlike Ant and MSBuild, Rake doesn’t use an XML-based language. It uses Ruby itself, which has huge benefits when you start writing custom tasks. In Ant or MSBuild, building a custom task requires you to use an external environment (batch file, script file or custom compiled task object). In Rake, since it’s just a Ruby file, you can start writing imperative Ruby code in place.

Here’s the simple Rake sample from Fowler’s article:

task :codeGen do  
  # do the code generation
end

task :compile => :codeGen do  
  # do the compilation
end

task :dataLoad => :codeGen do  
  # load the test data
end

task :test => [:compile, :dataLoad] do  
  # run the tests
end

The task method (it looks like a keyword, but it is an ordinary Ruby method) takes three pieces of information: the task name, the task’s dependencies and a block containing the code to execute to complete the task. Ruby’s flexible syntax allows you to specify a task without any dependencies (:codeGen), with a single dependency (:compile => :codeGen), and with multiple dependencies (:test => [:compile, :dataLoad]).

So what would this look like if you used PowerShell instead of Ruby? How about this:

task codeGen {  
  # do the code generation
}

task compile codeGen {
  # do the compilation
}

task dataLoad codeGen {  
  # load the test data
}

task test compile,dataLoad {
  # run the tests
}

Not much different. PS uses curly braces for script blocks while Ruby uses do / end, but that’s just syntax. Since it lacks Ruby’s concept of symbols (names that start with a colon), PS has to use strings instead. Otherwise, it’s almost identical. They even both use the # symbol to start a line comment.

There is one significant difference. For tasks with dependencies, Rake uses a hash table to package the task name and its dependencies. The => syntax in Ruby creates a hash table entry. Since the hash table is the last argument passed to task, you can leave off the surrounding curly braces. The key of this single-item hash table is the task name, while the value is the task name (or array of task names) this task depends on. Again, the syntax is flexible, so if you have only a single dependency, you don’t need to surround it in square brackets.

In PowerShell, the hash table syntax isn’t quite so flexible: you have to surround it with @{ } and quote the values. So using Rake’s syntax directly would result in something that looked like “task @{test = 'compile','dataLoad'} {…}”, which is fairly ugly. You don’t need square brackets around the array, but having to add the @{ } is a non-starter, especially since you wouldn’t have it on a task with no dependencies.
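For comparison, here is a minimal sketch of what the rejected hash table shape might look like in practice. Show-TaskSpec is a hypothetical helper invented for this illustration, not part of the task function developed in this post; PowerShell hash table literals are written @{ key = value }.

```powershell
# Hypothetical helper (not from the post): accepts a Rake-style
# "name => dependencies" hash table and describes each entry.
function Show-TaskSpec([hashtable] $spec) {
  foreach ($name in $spec.Keys) {
    "$name depends on $($spec[$name] -join ', ')"
  }
}

# The literal needs the @{ } wrapper, and the values have to be quoted strings.
Show-TaskSpec @{ test = 'compile','dataLoad' }
```

The call works, but the extra punctuation is exactly the noise the positional form avoids.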

So instead, I thought a better approach would be to use PS’s variable parameter support. Since all tasks have a name, the task function is defined simply as “function task ([string] $name)”. This basically says there’s a function called task with at least one parameter called $name. (All variables in PS start with a dollar sign.) Any parameters that are passed into the function that aren’t specified in the function signature are passed into the function in the $args variable.

This approach does mean having to write logic in the function to validate the $args parameters. Originally, I specified all the parameters, so that it looked like this: “function global:task([string] $name, [string[]] $depends, [scriptblock] $taskDef)”. That didn’t work for tasks with no dependencies, since it tried to pass the script block in as the $depends parameter.
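To make the binding rules concrete, here is a small sketch. Show-Binding is a hypothetical function written only for this illustration; it declares just $name and reports whatever ends up in $args.

```powershell
# Hypothetical helper (not from the post): shows how arguments are bound.
function Show-Binding([string] $name) {
  # Anything beyond the declared $name parameter lands in $args.
  $kinds = @()
  foreach ($arg in $args) { $kinds += $arg.GetType().Name }
  "name=$name extra=$($args.Count) kinds=$($kinds -join ',')"
}

# No dependencies: the script block is the only extra argument.
Show-Binding codeGen { }

# With dependencies: an array of names first, then the script block.
Show-Binding test compile,dataLoad { }
```

The first call should report a single ScriptBlock in $args, while the second should report an array followed by a ScriptBlock. That difference is what the validation logic has to sort out.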

Here’s a sample implementation of the task function used above. It validates the $args input and builds a custom object that represents the task. (Note: the various PS* objects are in the System.Management.Automation namespace. I omitted the namespaces to make the code readable.)

function task([string] $name) {
  if (($args.length -gt 2) -or ([string]::isnullorempty($name))) {
    throw "task syntax: task name [<dependencies>] [<scriptblock>]"
  }
  if ($args[0] -is [scriptblock]) {
    $taskDef = $args[0]
  }
  elseif ($args[1] -is [scriptblock]) {
    $depends = [object[]]$args[0]
    $taskDef = $args[1]
  }
  else {
    $depends = [object[]]$args[0]
    #if a script block isn't passed in, use an empty one
    $taskDef = {}
  }

  $task = new-object PSObject
  $nameProp = new-object PSNoteProperty Name,$name
  $task.psobject.members.add($nameProp)
  $dependsProp = new-object PSNoteProperty Dependencies,$depends
  $task.psobject.members.add($dependsProp)
  $taskMethod = new-object PSScriptMethod ExecuteTask,$taskDef
  $task.psobject.members.add($taskMethod)
  $task
}

Of course, you would need much more than this if you were going to build a real build system like Rake in PowerShell. For example, you’d need code to collect the tasks, order them in the correct dependency order, execute them, etc. Furthermore, Rake supports other types of operations, like file tasks and utilities that you’d need to build.
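As a rough idea of the collecting and ordering part, here is a minimal sketch. Everything in it is an assumption made for illustration: the $tasks and $done tables and the Add-Task and Invoke-Task functions are hypothetical names, the registration function takes explicit parameters rather than the DSL syntax above, and there is no handling of circular dependencies.

```powershell
# Hypothetical sketch, not a real build system.
$tasks = @{}   # task name -> @{ Dependencies; Action }
$done  = @{}   # names of tasks that have already run

function Add-Task([string] $name, [string[]] $depends = @(), [scriptblock] $action = {}) {
  $tasks[$name] = @{ Dependencies = $depends; Action = $action }
}

function Invoke-Task([string] $name) {
  if ($done.ContainsKey($name)) { return }   # run each task at most once
  if (-not $tasks.ContainsKey($name)) { throw "unknown task: $name" }
  # Run every dependency (and, recursively, its dependencies) first.
  foreach ($dep in $tasks[$name].Dependencies) { Invoke-Task $dep }
  $action = $tasks[$name].Action
  & $action
  $done[$name] = $true
}

# Record the execution order so it can be inspected afterwards.
$log = New-Object System.Collections.ArrayList
Add-Task codeGen  @()              { [void]$log.Add('codeGen') }
Add-Task compile  codeGen          { [void]$log.Add('compile') }
Add-Task dataLoad codeGen          { [void]$log.Add('dataLoad') }
Add-Task test     compile,dataLoad { [void]$log.Add('test') }

Invoke-Task test
$log.ToArray() -join ' -> '
```

Asking for test runs codeGen once, then compile and dataLoad, then test itself. A real tool would also need cycle detection and error handling.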

However, the point of this post isn’t to rebuild Rake in PS, but to show how PS rivals Ruby as a language for building internal DSLs. On that front, I think PowerShell performs beautifully.

I’m looking forward to using PowerShell’s metaprogramming capabilities often in the future.