Fixing PowerShell’s Busted Resolve-Path Cmdlet

Usually, my PowerShell posts are effusive in their praise. However, whoever thought up this “feature” gets no praise from me:

PS» Resolve-Path ~\missing.file
Resolve-Path : Cannot find path 'C:\Users\hpierson\missing.file' because it does not exist.

In my opinion, this is a bad design. Resolve-Path assumes that if the filename being resolved doesn’t exist, then it must be an error. But in the script I’m building, I’m resolving the path of a file that I’m going to create. In other words, I know a priori that the file doesn’t exist. Yet Resolve-Path insists on throwing an error. I would have expected there to be some switch you could pass to Resolve-Path telling it to skip path validation, but there’s not.

And the worst thing is, I can see that Resolve-Path came up with the “right” answer – it’s right there in the error message!

Searching around, I found a thread where someone else was having the same problem. Jeffrey Snover – aka Distinguished Engineer, inventor of PowerShell and target of Erik Meijer’s Lang.NET coin throwing stunt – suggested using –ErrorAction and –ErrorVariable to ignore the error and retrieve the resolved path from the TargetObject property of the error variable. Like Maximilian from the thread, I find this approach fragile and frankly kinda messy, but I needed a solution. So I wrote the following function that wraps up access to the error variable so at least I don’t have fragile, messy code sprinkled throughout my script.

# Resolve a path even if the file doesn't exist yet. When Resolve-Path
# fails, the resolved path is available on the error's TargetObject.
function force-resolve-path($filename)
{
  $filename = Resolve-Path $filename -ErrorAction SilentlyContinue `
                                     -ErrorVariable _frperror
  if (!$filename)
  {
    return $_frperror[0].TargetObject
  }
  return $filename
}
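Once the function is dot-sourced into your session, a quick sanity check shows it handing back the path from the error message above:

PS» force-resolve-path ~\missing.file
C:\Users\hpierson\missing.file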

The function is pretty straightforward. –ErrorAction SilentlyContinue is PowerShell’s version of On Error Resume Next in Visual Basic. If the cmdlet encounters an error, the error gets stashed away in the variable specified by –ErrorVariable (it’s also added to $Error, so you can still retrieve the error object if –ErrorVariable isn’t specified) and processing continues. Then I manually check to see if Resolve-Path succeeded – i.e. did it return a value – and return the TargetObject of the error object if it didn’t.

As I said, fragile and kinda messy. But it works.

HawkCodeBox

Last month, I lamented the lack of extensibility of the WPF text box. While there are several vendors selling syntax highlighting text boxes, and at least one open source one, it still really bothers me how inextensible the basic WPF text box is. I just want to build a simple colorizing REPL – why is that so hard?

So instead of using any of those syntax highlighting text boxes, I decided to build my own using the approach Ken Johnson wrote about on Code Project. As I wrote before, it’s a hack – you set the text box’s foreground and background brushes to transparent so that you can override OnRender – but it works.

The big change I made from Ken’s code was to use the DLR’s TokenCategorizer instead of regular expressions to tokenize the code. TokenCategorizer is a service provided by the DLR hosting API that will tokenize a given script source for you. Here’s the code that colorizes the text in the text box:

// Tokenize the text box's current contents with the DLR's tokenizer service
var source = Engine.CreateScriptSourceFromString(this.Text);
var tokenizer = Engine.GetService<TokenCategorizer>();
tokenizer.Initialize(null, source, SourceLocation.MinValue);

// Walk the token stream, colorizing any token category in the syntax map
// (ft is the FormattedText instance created earlier in the method)
var t = tokenizer.ReadToken();
while (t.Category != TokenCategory.EndOfStream)
{
    if (_syntaxMap.ContainsKey(t.Category))
    {
        ft.SetForegroundBrush(_syntaxMap[t.Category],
             t.SourceSpan.Start.Index, t.SourceSpan.Length);
    }

    t = tokenizer.ReadToken();
}

As you can see, I ask the engine for a TokenCategorizer, initialize it with the text box’s current contents, then iterate through the tokens, looking for ones in my syntax map. If the token category is in the map, we change the foreground brush for that span of formatted text (ft is a WPF FormattedText instance I created earlier in the method).
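For reference, the syntax map itself is nothing special – just a dictionary from token categories to brushes. A minimal sketch (the specific categories and colors here are my own picks for illustration, not anything the DLR mandates):

// Map each DLR token category you care about to a WPF brush
private readonly Dictionary<TokenCategory, Brush> _syntaxMap =
    new Dictionary<TokenCategory, Brush>
    {
        { TokenCategory.Keyword,        Brushes.Blue },
        { TokenCategory.StringLiteral,  Brushes.Brown },
        { TokenCategory.NumericLiteral, Brushes.DarkCyan },
        { TokenCategory.Comment,        Brushes.Green },
    };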

Of course, this approach isn’t very efficient – it re-colorizes the entire file on every change. It turns out that some DLR TokenCategorizers are restartable, so you can cache the tokenizer state at any point and then come back later with a new TokenCategorizer instance and pick up tokenizing where you left off. With this approach, you could, say, tokenize a line at a time, so you’d only need to retokenize the line where the change occurred rather than the entire file. But only IronPython supports tokenizer restarting today, so I decided to take the easy way out and simply re-colorize on every change.
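If you did want to go down that road with IronPython, the hosting API appears to expose the necessary pieces – TokenCategorizer has IsRestartable and CurrentState members. A rough, untested sketch of the idea (the changedLineSource and changedLineStart variables are hypothetical; the per-line bookkeeping is left out):

// Snapshot the tokenizer state after scanning each line...
object lineState = tokenizer.IsRestartable ? tokenizer.CurrentState : null;

// ...then on a change, resume from the changed line with a fresh
// categorizer, passing the cached state to Initialize instead of null.
var resumed = Engine.GetService<TokenCategorizer>();
resumed.Initialize(lineState, changedLineSource, changedLineStart);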

I named the project HawkCodeBox and I’ve published the source up on GitHub. It’s fairly simple, but of course the goal wasn’t to build the be-all-end-all text editor – other people in the VS team already have that job.

Functions that Create Functions in PowerShell

Since I started using PowerShell, I’ve become very picky about what I let on my path. I feel it’s much cleaner to create aliases or functions than to let all kinds of crud creep into my path.

Recently, I installed the latest IronRuby release and discovered there’s a whole bunch of little batch file wrappers around common Ruby commands like gem and rake. While being able to simply type “igem” or “irake” is much easier than typing ir "C:\Program Files\ironruby-0.6.0\bin\igem", I didn’t want to pollute my path – even with a product from my team. Instead, I wanted to create a PowerShell function for each of those IronRuby-fied commands. Furthermore, I wanted to avoid manually creating a function for each Ruby command – these batch files are literally identical except for their name, so I figured it would be possible to automate the function creation in PowerShell. Here’s what I came up with:

# Find the ir alias; bail if IronRuby isn't set up in this session
$iralias = get-alias ir -EA SilentlyContinue
if ($iralias -eq $null) {return}

# The Ruby command wrappers live alongside ir.exe
$irbindir = split-path $iralias.Definition

function make-rubyfunction($cmd)
{
  $cmdpath = join-path $irbindir $cmd
  # Create a global function that invokes the command via ir;
  # GetNewClosure captures the current value of $cmdpath
  set-item function:global:$cmd -Value {ir $cmdpath $args}.GetNewClosure()
  write-host "Added IronRuby $cmd command"
}

("igem","iirb","irackup","irails","irake","irdoc","iri") |
  %{make-rubyfunction $_}

I start by getting the ir alias, which I’m setting in my traditional fashion. The Ruby command files are in the same directory as ir.exe, which is what ir is aliased to. If the ir alias isn’t set, I quit out of the script without setting anything.

The make-rubyfunction function is the primary workhorse of this script. You pass in a command name as a string, and it uses set-item on the function provider to create a new function. Note that I had to explicitly create the function in the global scope, since I’m running the set-item cmdlet inside a script.
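You can poke at the function: drive directly from the prompt to see the same mechanism at work – this one-off example (not part of the script) creates a trivial global function:

PS» set-item function:global:hi -Value { write-host "hello" }
PS» hi
hello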

Getting the value for the function took a bit of head banging to figure out. I’m used to Python, which automatically closes over variables, so my first attempt was to set the function value to something like { ir $cmdpath $args }. But PowerShell doesn’t close over variables automatically, so that fails: $cmdpath isn’t defined when the generated function later runs. I asked around on the internal PowerShell alias, and someone pointed me to the new GetNewClosure function in PowerShell v2. In other words, PowerShell only supports manual closures, which is kind of wonky, but works OK for this scenario. I create a script block that references the in-scope variable $cmdpath, and GetNewClosure creates a new script block with that value captured and embedded. More info on GetNewClosure in the docs.
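Here’s a tiny demonstration of the difference – not from the script, just to illustrate what GetNewClosure captures:

$x = 1
$plain  = { $x }
$closed = { $x }.GetNewClosure()
$x = 2
& $plain    # returns 2 - $x is resolved in the caller's scope at invocation
& $closed   # returns 1 - the closure captured $x's value when it was created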

Now, I’m using Win7 exclusively at this point, so depending on a v2 feature didn’t bother me. However, if you’re using PowerShell v1, you can still accomplish something similar using text substitution. Here’s my original (i.e. pre-GetNewClosure) version of make-rubyfunction:

function make-rubyfunction($cmd)
{
  $cmdpath = join-path $irbindir $cmd
  # Build the function body as a string; `$args is escaped so it is
  # substituted at invocation time, while $cmdpath is expanded right now
  $p = "ir `"$cmdpath`" `$args"
  set-item function:global:$cmd -Value $p
  write-host "Added IronRuby $cmd command"
}

I’m using PowerShell’s standard text substitution mechanism to create the function value as a string. Note that I’m escaping the dollar sign in $args so it doesn’t get substituted the way $cmdpath does. GetNewClosure feels cleaner, so that’s how I ended up doing it, but both ways seem to work fine.
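So, assuming the default install path from earlier, the generated body for igem would read:

ir "C:\Program Files\ironruby-0.6.0\bin\igem" $args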

Finally, I pass an array of IronRuby commands down the pipe to make-rubyfunction. I love the pipeline, though it feels strange to write the list with parentheses instead of square brackets like Python and F#!

Anyway, the script – as usual – is up on my SkyDrive. At some point, I want to do something similar for common IronPython scripts like pyc and ipydbg. Until then, hopefully someone out there will find it useful (like maybe the IronRuby team?).

The Texas Dependency Injection Massacre

Since I think I’ve beaten the “I think what most people call architecture is really engineering” meme to death, let’s move on to something else. Eric Smith of The Limber Lambda blog (love that name!) commented:

I’m a little concerned with the intimation that use of interfaces, respect for visibility of type members and use of dependency injection equates to “over-engineering”. As with everything, it depends on what you’re trying to achieve, and generalisations in this regard, especially when junior people who may not understand what’s at stake are reading, can be damaging.

I find it an uphill battle to engender a constructive mindset in developers who have established bad habits and whose pride lies in the way of addressing those habits.

Anti-“process” talk by Joel Spolsky and the “pragmatism brigade” makes it harder. A while ago I had a new developer refuse to write unit tests despite it being an established practice in our team because “… Jeff and Joel said they were bad in the StackOverflow podcast …”. Yikes.

Let me be very clear: I never suggested that techniques such as interfaces and dependency injection are over-engineering. These are good engineering practices, and every software engineer should understand them. And if Joel and Jeff really said unit tests were bad, well, that would be about the dumbest thing I’d have ever heard either of those two say. Yikes indeed.

But as Eric writes, “it depends on what you’re trying to achieve”. Engineering techniques like dependency injection, polymorphism, and encapsulation are tools, and there are many good reasons to use them. But like many tools, they can also be used for evil.

In other words, the tools themselves are always innocent – you have to look at how and why they are being used by the people who are using them.

Let’s take dependency injection as an example. Externalizing a software component’s dependencies enables you to test it in isolation from the rest of your system. For example, it’s very common to inject a dependency that writes to a durable store, such as a logger or a data access component. In your unit tests, you inject a mock durable store instead of the real dependency. The mock will be faster (no need to actually write to disk), cleaner (no need to clean up files on disk between test runs) and will behave exactly to spec (bugs in the dependency component won’t create false failures in the component you’re testing). Those are all good engineering arguments for using DI, full stop.
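Here’s a minimal sketch of what that looks like in practice – the ILogger, OrderProcessor and FakeLogger names are mine, purely for illustration:

using System.Collections.Generic;

public interface ILogger { void Write(string message); }

// The component under test takes its logger as a constructor dependency
// instead of newing up a file-based logger itself.
public class OrderProcessor
{
    private readonly ILogger _logger;
    public OrderProcessor(ILogger logger) { _logger = logger; }

    public void Process(string orderId)
    {
        // ... real work would happen here ...
        _logger.Write("Processed order " + orderId);
    }
}

// In a unit test, inject a fake that records messages in memory -
// faster, cleaner, and perfectly predictable.
public class FakeLogger : ILogger
{
    public readonly List<string> Messages = new List<string>();
    public void Write(string message) { Messages.Add(message); }
}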

Furthermore, DI helps insulate a software component against changes in its dependencies. I may not be able to predict specific changes with any precision, but it’s probably safe to assume that a given component’s dependencies aren’t going to remain completely static. DI doesn’t insulate you 100% from possible changes – in particular, it doesn’t help if the dependency’s interface changes.

But I would argue that you can go too far with DI. Let’s go back to the logger component example I described above. Maybe, the over-engineer thinks, we’ll want the logger to write to the database instead of the file system in the future. Or maybe we’ll want the logger to write to a different database. And if it’s supporting a different database, then maybe it should support multiple back-end databases. Or maybe, Or Maybe, OR MAYBE…

We’ve taken a simple component that logs to the file system and turned it into an engineering monstrosity with multiple points of variability and extensibility. When you start saying “maybe we should” or “this could change in the future” or stuff like that, that’s when you start over-engineering something.

Unfortunately, there’s only one way to know when you’ve started over-engineering: experience. Sorry, Eric, I can’t help you with your junior engineers. As David Lee Roth once sang, experience is the “worst teacher goin’”. But if there’s a better way to learn, I don’t know it. In the meantime, I suggest code reviews and pair programming.

Syntax Highlighting TextBoxes in WPF – A Sad Story

One of the big new features in VS 2010 is the WPF-based editor. With it, you can build all sorts of cool stuff: control the visualization of XML doc comments, change how IntelliSense looks, even scale the size of text based on the location of the caret. Huzzah for the WPF Visual Studio editor!

However, as wicked awesome as the new editor is, AFAIK it’s not going to be released as a separate component. So while the PowerShell, Intellipad and other teams inside Microsoft can reuse the VS editor bits, nobody else can. So if you want to do something like embed a colorizing REPL in your WPF app, you’ll have to use something else.

I’ve thought about putting a WPF-based UI on top of ipydbg (though now I’d probably use the new lightweight debugger instead), so I downloaded John’s repl-lib code to see how he was doing it. Turns out his REPL control is essentially a wrapper around WPF’s RichTextBox control. It works, but it seems kinda kludgy. For example, the RichTextBox supports bold, italics and underline hotkeys, so John’s REPL does too. Though it is possible to turn off these formatting commands, I decided to take a look at modifying how the plain-old TextBox renders. After all, WPF controls are supposed to be lookless, right?

Well, apparently not all WPF controls are lookless. In particular, for the purposes of this post, the TextBox is definitely NOT lookless. It looks like the text editing capabilities of TextBox are provided by the Sys.Win.Documents.TextEditor class, while the text rendering is provided by the Sys.Win.Controls.TextBoxView class. Both of those classes are internal, so don’t even think about trying to customize or reuse them.

The best (and I use that term loosely) way I found to customize the TextBox rendering was a couple of articles on CodeProject by Ken Johnson. Ken’s CodeBox control inherits from TextBox and sets the Foreground and Background to transparent (to hide the output of TextBoxView), then overrides OnRender to render the text with colorization. Rendering the text twice – once transparently and once correctly – seems like a better solution than using the RichTextBox, but it’s still pretty kludgy. (Note, I’m calling the TextBox design kludgy – Ken’s code is a pretty good workaround.)

So if you want a colorized text box in WPF, your choices are:

  • Build your own class that inherits from RichTextBox, disabling all the formatting commands and handling the TextChanged event to do colorization.
  • Build your own class that inherits from TextBox, setting the Foreground and Background colors to transparent and overriding OnRender to do the visible text rendering (sketched below).
  • Use a 3rd party control. The only one I found was the AqiStar TextBox. No idea how good it is, but they claim to be a true lookless control. Any other syntax highlighting WPF controls around that I don’t know about?
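For the second option, here’s a rough sketch of the shape of the thing, following Ken’s approach – assuming a single brush and ignoring scrolling offsets and caret details, which a real control has to account for:

using System.Globalization;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media;

public class ColorizedTextBox : TextBox
{
    public ColorizedTextBox()
    {
        // Make TextBoxView's output invisible; editing still works,
        // but we take over drawing the text ourselves in OnRender.
        Foreground = Brushes.Transparent;
        Background = Brushes.Transparent;
    }

    protected override void OnRender(DrawingContext drawingContext)
    {
        base.OnRender(drawingContext);
        var ft = new FormattedText(Text,
            CultureInfo.CurrentCulture, FlowDirection.LeftToRight,
            new Typeface(FontFamily, FontStyle, FontWeight, FontStretch),
            FontSize, Brushes.Black);
        // Colorization - e.g. ft.SetForegroundBrush per token - goes here
        drawingContext.DrawText(ft, new Point(2, 2));
    }
}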