Blog Posts from July 2009 (page 1 of 2)

Functions that Create Functions in Powershell

Since I started using Powershell, I’m very picky about what I let on my path. I feel it’s much cleaner to create aliases or functions rather than letting all kinds of crud creep into my path.

Recently, I installed the latest IronRuby release and discovered there’s a whole bunch of little batch file wrappers around common Ruby commands like gem and rake. While being able to simply type “igem” or “irake” is much easier than typing ir "C:\Program Files\ironruby-0.6.0\bin\igem", I didn’t want to pollute my path – even with a product from my team. Instead, I wanted to create a Powershell function for each of those IronRuby-fied commands. Furthermore, I wanted to avoid manually creating a function for each Ruby command – these batchfiles are literally identical except for their name, so I figured it would be possible automate the function creation in Powershell. Here’s what I came up with:

$iralias = get-alias ir -EA SilentlyContinue
if ($iralias -eq $null) {return}

$irbindir = split-path $iralias.Definition

function make-rubyfunction($cmd)
  $cmdpath = join-path $irbindir $cmd
  set-item function:global:$cmd -Value {ir $cmdpath $args}.GetNewClosure()
  write-host "Added IronRuby $_ command"

("igem","iirb","irackup","irails","irake","irdoc","iri") |
  %{make-rubyfunction $_}

I start by getting the ir alias, which I’m setting in my traditional fashion. The Ruby command files are in the same directory as ir.exe, which is what ir is aliased to. If the ir alias isn’t set, I quit out of the script without setting anything.

The make-rubyfunction function is the primary workhorse of this script. You pass in a command name as a string, and it uses set-item on the function provider to create a new function. Note, I had to explicitly create this function in the global scope since I’m running the set-item cmdlet inside a script.

Getting the value for the function took a bit of head banging to figure out. I’m used to Python, which automatically closes over variables, so my first attempt was to set the function value to something like { ir $cmdpath $args }. But Powershell doesn’t close automatically, so that fails since $cmd isn’t defined inside the function. I asked around on the internal Powershell alias, and someone pointed me to the new GetNewClosure function in Powershell v2. In other words, Powershell only supports manual closures, which is kind of wonky, but works OK for this scenario. I create a new script block that references in-scope variable $cmdpath and GetNewClosure automatically creates a new script block where that value is captured and embedded. More info on GetNewClosure in the docs.

Now, I’m using Win7 exclusively at this point, so depending on a v2 feature didn’t bother me. However, if you’re using Powershell v1, you could still accomplish something similar using text substitution. Here’s my original (i.e. pre-GetNewClosure) version of make-rubyfunction

function make-rubyfunction($cmd)
  $cmdpath = join-path $irbindir $cmd
  $p = "ir `"$cmdpath`" `$args"
  set-item function:global:$cmd -Value $p
  write-host "Added IronRuby $_ command"

I’m using Powershell’s standard text substitution mechanism to create the function value as a string. Note that I’m escaping the dollar sign in $args, so that does not get substituted the way $cmdpath does. GetNewClosure feels cleaner, so that’s how I ended up doing it, but both ways seem to work fine.

Finally, I pass an array of IronRuby commands down the pipe to make-rubyfunction. I love the pipe command, though it feels strange to use parentheses instead of square brackets for list comprehensions like Python and F#!

Anyway, the script – as usual – is up on my SkyDrive. At some point, I want to do something similar for common IronPython scripts like pyc and ipydbg. Until then, hopefully someone out there will find it useful (like maybe the IronRuby team?).

The Texas Dependency Injection Massacre

Since I think I’ve beaten the “I think what most people call architecture is really engineering” meme to death, let’s move on to something else. Eric Smith of The Limber Lambda blog (love that name!) commented:

I’m a little concerned with the intimation that use of interfaces, respect for visibility of type members and use of dependency injection equates to “over-engineering”. As with everything, it depends on what you’re trying to achieve, and generalisations in this regard, especially when junior people who may not understand what’s at stake are reading, can be damaging.

I find it an uphill battle to engender a constructive mindset in developers who have established bad habits and whose pride lies in the way of addressing those habits.

Anti-”process” talk by Joel Spolsky and the “pragmatism brigade” makes it harder. A while ago I had a new developer refuse to write unit tests despite it being an established practice in our team because “… Jeff and Joel said they were bad in the StackOverflow podcast …”. Yikes.

Let me be very clear. I never suggested that techniques such as interfaces and dependency injection are over engineering. These are good engineering practices, and every software engineer should understand them. And if Joel and Jeff really said unit tests were bad, well that would be about the dumbest thing I’d have every heard either of those two say. Yikes indeed.

But as Eric writes, “it depends on what you’re trying to achieve”. Engineering techniques like dependency injection, polymorphism, encapsulation are tools, and there are many good reasons to use them. But like many tools, they can also be used for evil.

In other words, the tools themselves are always innocent – you have to look at how and why they are being used by the people who are using them.

Let’s take dependency injection as an example. Externalizing a software component’s dependencies enables you to test it isolation from the rest of your system. For example, it’s very common to inject a dependency that writes to a durable store, such as a logger or a data access component. In your unit tests, you inject a mock durable store instead of the real dependency. The mock will be faster (no need to actually write to disk), cleaner (no need to clean up the files on disk between test runs) and will behave exactly to the spec (bugs in the dependency component won’t create false failures in the component you’re testing). Those are all good engineering arguments for using DI, full stop.

Furthermore, DI helps insulate a software component against changes in its dependencies. I may not be able to predict specific changes with any precision, but it’s probably safe to assume that there a given component’s dependencies aren’t going to remain completely static. DI doesn’t insulate you 100% from possible changes – in particular, it doesn’t help if the dependency’s interface changes.

But I would argue that you can go too far with DI. Let’s go back to the logger component example I described above. Maybe, the over engineer thinks, we’ll want the logger to write to the database instead of the file system in the future. Or maybe we’ll want the logger to write to a different database. And if it’s supporting a different database, then maybe the logger should support different back end databases. Or maybe, Or Maybe, OR MAYBE…

We’ve gone from a simple component that logs to the file system and turned it into a engineering monstrosity with multiple points of variability and extensibility. When you start saying “maybe we should” or “this could change in the future” or stuff like that, that’s when you start over engineering something.

Unfortunately, there’s only one way to know when you’ve started over-engineering: Experience. Sorry Eric, I can’t help you with your junior engineers. As David Lee Roth once sang, Experience is the “worst teacher goin’”. But if there’s a better way to learn, I don’t know it. In the meantime, I suggest code reviews and pair programming.

Architecture Astronauts and Over Engineers

Since it’s apparently Architecture Weektm [1] here at DevHawk, here’s another of my favorite Dilbert cartoons of all time – relevant to the discussion at hand.

Two interesting comments on yesterday’s post:

Architectural thinking is a necessary (and very important) part of software development – but beyond the systems level (which is systems administration and not software architecture) I have a hard time seeing divorcing architectural thinking from the actual development as anything but a terrible thing. Although I see that your definition of architecture (at the functional level) does not match my caricature of the ‘architecture astronauts’ which I do think can be endemic in languages that encourage additional layers of architecture.\ [Michael Foord]

So based on the definition of architecture I’m reading into your post, you wouldn’t consider the choice of object-oriented versus functional programming styles from an architectural perspective? I’m trying to understand what level of architecture you mean here. Like Michael, I usually think of architecture even down into the implementation patterns level (hence the architecture astronauts), but that seems to be included in what you might be calling an engineering concern.
[Ryan Riley]

Let me be very clear. Using my definition, there is no such thing “architecture even down into the implementation patterns level”. I’d argue that the implementation patterns level is engineering, not architecture. From what I’ve seen, the terms “architecture” and “engineering” tend to be used interchangeably in the software industry, and frankly I think that’s a mistake. I said as much in yet another post I wrote four years ago:

Architecture is the intersection between business and IT.

If a decision doesn’t effect a business person, it’s not an architecture decision. I’m not saying it’s not important – I think the role of the software engineer is critical in large-scale enterprise system design and construction. And I will readily admit that often a single person is responsible for both architecture and engineering. But that doesn’t make them the same activity. As long as we continue to confuse the two disciplines, we hold them both back.

Michael and Ryan (or anyone else for that matter) are welcome to disagree with my definition of architecture. I often joke that if you asked ten architects to define “architecture”, you’d get twelve answers. But that’s my definition and I’m sticking to it.

But what of the Architecture Astronauts? Both Michael and Ryan mentioned them. Unsurprisingly, I think that term is used too broadly as well. If you go back and read Joel’s original post of Architecture Astronauts, there wasn’t much reference, if any, to the implementation layer at all.

The Architecture Astronauts will say things like: “Can you imagine a program like Napster where you can download anything, not just songs?” Then they’ll build applications like Groove that they think are more general than Napster, but which seem to have neglected that wee little feature that lets you type the name of a song and then listen to it — the feature we wanted in the first place. Talk about missing the point. If Napster wasn’t peer-to-peer but it did let you type the name of a song and then listen to it, it would have been just as popular.
[Joel on Software, Don’t Let Architecture Astronauts Scare You]

I feel that my definition fits very well with the way Joel writes about architecture in this paragraph. The Architect Astronaut is trying to solve a real business problem – people need access to information besides music. But the mistake they make is thinking they can solve multiple problems with a single solution. So they abstract higher and higher until they’ve lost sight of the original problem and can only focus on the abstractions. If you look at what Joel has to say about technologies like Hailstorm and Jini, you see the same pattern emerge.

This isn’t to say that similar problems of over-abstraction don’t happen at the implementation layer – they do. But they happen for very different reasons. Astronaut Architects are trying to solve multiple problems with a single solution. But when over-abstraction happens at the implementation level, it because someone thought they could predict the future.

We’ve all seen our fair share of over-engineered systems that introduce significant unneeded complexity on the off chance that the development team can successfully predict the kind of change likely to come in the next version of the product. Invariably, the team’s precognitive abilities are revealed to be as poor as everyone else’s, so they’re left with a bunch of extra layers of software cruft that has to be maintained but provides zero additional value to the system. I’ve blogged about that problem before as well: Kitchen Sink Variability.

Since I’m big on keeping the terminology of architecture and engineering separate, then I’d argue that we need a different term than Architecture Astronaut for people who want to introduce additional layers of abstraction at the implementation layer on the off chance that they don’t suck at precognition. Since we call such systems over-engineered, wouldn’t that make the people who build them “Over Engineers”?

  1. It’s like Shark Week, but with white boards and even more terrifying. ↩︎

Dynamic Languages in Architecture

In the comments from yesterday’s post, IronPython MVP and author extraordinaire Michael Foord asked:

Has your view on architecture as a discipline separate from coding changed since working with dynamic languages?

In a word:“No” (though as always, I reserve the right to be wrong and/or convinced otherwise.)

When I was an architect, I tried very hard to treat it as a “discipline separate from coding”. To use my last post as an example, building a central repository of system audit information is an architectural decision. A bad one IMHO – at least the way Dilbert’s PHB described it – but an architectural decision all the same. It was a decision about what kind of system to build, part of an overall application portfolio, as opposed to a decision about how to build the system.

I’ve held this opinion of architecture for a long time. Four years (and three jobs) ago, I wrote the following:

IMO, building a system that has a set of functional requirements (track customers, process orders, etc) and non-functional constraints (sub-second response time, support 10,000 concurrent users, use Microsoft Windows platform, etc) is an engineering problem. Coming up with the lists of functional requirements and non-functional constraints is the architecture problem.

Working with dynamic languages has dramatically changed my view of engineering and design of individual systems. But from the pure architecture perspective, I want to be able to treat individual systems as black boxes as much as possible. That means the programming language is an implementation detail that shouldn’t matter to the architect.

Note the significant bet-hedging language in the paragraph above. I’m using phrases like “shouldn’t matter” and “as much as possible” because we all know that there’s no such thing as a “pure architecture perspective”. Unlike building architecture, software architecture is in constant flux at every level. At the enterprise level, there are always new regulatory obligations, new competitors and new partners to consider. At the end-to-end process level, there are always new systems or new version of existing systems coming on line. And at the individual system level, there are always new – or at least new versions – of tools, frameworks and languages being released.

Once you introduce time into your architecture perspective, individual system engineering will affect the overall architecture, since system engineering affects the rate of change. Language choice will certainly have some engineering impact. However, in my experience language choice is rarely high on the list of concerns relative to things like project scope and team experience.

So my “No” answer to Michael’s question is predicated on the following:

  • As an architect, I want to consider individual systems as black boxes where implementation details like language choice are completely irrelevant.
  • As a practical architect, I realize that some system implementation details are relevant – especially over time – but in my experience language choice isn’t one of them.

On the other hand, most IT shops try to standardize on one programming language – certainly MS IT did – so maybe language choice would be more architecturally relevant in a mixed language shop. I’d love to hear from folks who have multiple standard languages in their IT shop – especially if you have both static and dynamic languages on your standards list.

Probably Wrong Info Is Worse Than No Info At All

Like many geeks, I love Dilbert. However, I rarely identify with it as well as I did Sunday.

I kid you not, I’ve had almost exactly this conversation back when I worked in MS IT. They have this big repository of information about deployed applications. Technically, you’re not supposed to deploy an application without listing it in the application repository. Like Dilbert, I never really understood what people were going to do with this information, but the projects I was on dutifully collected the relevant information and put it into the repository.

And never thought of it again. Ever.

And therein lies the problem. Populating the application repository was an artificial step on the critical path of the deployment process. Writing the software, acquiring the physical hardware to run it on, stuff like that really is on the critical path. Populating the application repository was extra busy work legislated by someone (I forget if it was the central architecture team or management) that didn’t benefit the project in the slightest. As such, it was given the minimal about of attention and effort, meaning there was little quality or consistency in the data. Worse yet, when the application changed or was decommissioned , updating the application repository just didn’t happen. I mean, it was supposed to, but rarely did.

So you ended up with a repository of information that was worse than useless. I had a colleague who insisted that the repository had some value because “not all of the data was wrong”. Of course, he couldn’t tell me with any consistency which data was accurate and therefore valuable and which was not. Hence, my argument that it was “worse than useless”.

The only way an application repository is going to be of any value at all is if you can collect the data automatically. My old teammate Buzz coined a phrase we used often: “The Truth Is On The Edge”. You should always regard any central repository of information with a very critical eye since it’s rarely going to be the truth.

(Ed. Note – Man, it’s been a long time since I’ve written about Architecture. My last Architecture post was almost a year ago. I don’t miss the job but I do miss my old teammates – in particular Buzz, Rick, Dale and of course Nick Malik.)