Passion * Technology * Ruthless Competence

Tuesday, July 28, 2009

Functions that Create Functions in Powershell

Ever since I started using Powershell, I’ve been very picky about what I let on my path. I feel it’s much cleaner to create aliases or functions rather than letting all kinds of crud creep into my path.

Recently, I installed the latest IronRuby release and discovered there’s a whole bunch of little batch file wrappers around common Ruby commands like gem and rake. While being able to simply type “igem” or “irake” is much easier than typing “ir "C:\Program Files\ironruby-0.6.0\bin\igem"”, I didn’t want to pollute my path - even with a product from my team. Instead, I wanted to create a Powershell function for each of those IronRuby-fied commands. Furthermore, I wanted to avoid manually creating a function for each Ruby command – these batch files are literally identical except for their name, so I figured it would be possible to automate the function creation in Powershell. Here’s what I came up with:

$iralias = get-alias ir -EA SilentlyContinue
if ($iralias -eq $null) {return}

$irbindir = split-path $iralias.Definition

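function make-rubyfunction($cmd)
{
  $cmdpath = join-path $irbindir $cmd
  # GetNewClosure() captures the current value of $cmdpath in the script block
  set-item function:global:$cmd -Value {ir $cmdpath $args}.GetNewClosure()
  # Report the parameter itself rather than relying on the caller's $_
  write-host "Added IronRuby $cmd command"
}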

("igem","iirb","irackup","irails","irake","irdoc","iri") | 
  %{make-rubyfunction $_} 

I start by getting the ir alias, which I’m setting in my traditional fashion. The Ruby command files are in the same directory as ir.exe, which is what ir is aliased to. If the ir alias isn’t set, I quit out of the script without setting anything.

The make-rubyfunction function is the primary workhorse of this script. You pass in a command name as a string, and it uses set-item on the function provider to create a new function. Note, I had to explicitly create this function in the global scope since I’m running the set-item cmdlet inside a script.

Getting the value for the function took a bit of head banging to figure out. I’m used to Python, which automatically closes over variables, so my first attempt was to set the function value to something like { ir $cmdpath $args }. But Powershell doesn’t close automatically, so that fails since $cmdpath isn’t defined by the time the generated function actually runs. I asked around on the internal Powershell alias, and someone pointed me to the new GetNewClosure method in Powershell v2. In other words, Powershell only supports manual closures, which is kind of wonky, but works OK for this scenario. I create a script block that references the in-scope variable $cmdpath, and GetNewClosure returns a new script block in which that value is captured and embedded. More info on GetNewClosure in the docs.
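For comparison, here is a minimal Python sketch of the same function-factory idea. It is not part of the Powershell script: make_ruby_function, the install directory and the returned command list are hypothetical stand-ins, and nothing is actually launched. The point is only that Python captures cmd_path automatically, which is the behavior the bare script block was expected to have.

```python
import ntpath  # Windows-style path joining that behaves the same on any OS


def make_ruby_function(bin_dir, cmd):
    """Return a function that builds the 'ir <script> <args...>' command line."""
    cmd_path = ntpath.join(bin_dir, cmd)

    def run(*args):
        # cmd_path is captured from the enclosing scope automatically;
        # no GetNewClosure-style call is needed.
        return ["ir", cmd_path, *args]

    return run


# Hypothetical install directory, used only for illustration.
commands = {
    name: make_ruby_function(r"C:\ironruby\bin", name)
    for name in ("igem", "irake")
}
```

Calling commands["igem"]("list") builds the argument list for that one command; each generated function keeps its own cmd_path.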

Now, I’m using Win7 exclusively at this point, so depending on a v2 feature didn’t bother me. However, if you’re using Powershell v1, you could still accomplish something similar using text substitution. Here’s my original (i.e. pre-GetNewClosure) version of make-rubyfunction

function make-rubyfunction($cmd)
{
  $cmdpath = join-path $irbindir $cmd
  # $cmdpath is expanded now; the backtick keeps $args literal for later
  $p = "ir `"$cmdpath`" `$args"
  set-item function:global:$cmd -Value $p
  write-host "Added IronRuby $cmd command"
}

I’m using Powershell’s standard text substitution mechanism to create the function value as a string. Note that I’m escaping the dollar sign in $args, so that does not get substituted the way $cmdpath does. GetNewClosure feels cleaner, so that’s how I ended up doing it, but both ways seem to work fine.
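The same expand-one-variable, escape-the-other trick shows up in most templating systems. As a rough analogy only, Python's string.Template substitutes $cmdpath while $$ yields a literal dollar sign; the path below is a made-up example, not the real install location.

```python
from string import Template

# "$cmdpath" is substituted now; "$$args" is escaped, so a literal "$args"
# survives into the generated function body.
template = Template('ir "$cmdpath" $$args')

# Hypothetical path, used only for illustration.
body = template.substitute(cmdpath=r"C:\ironruby\bin\igem")
print(body)
```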

Finally, I pass an array of IronRuby commands down the pipe to make-rubyfunction. I love the pipe command, though it feels strange to write a list literal with parentheses instead of the square brackets that Python and F# use!

Anyway, the script – as usual – is up on my SkyDrive. At some point, I want to do something similar for common IronPython scripts like pyc and ipydbg. Until then, hopefully someone out there will find it useful (like maybe the IronRuby team?).

Posted By Harry Pierson at 4:59 PM Pacific Daylight Time

Friday, July 17, 2009

The Texas Dependency Injection Massacre

Since I think I’ve beaten the “I think what most people call architecture is really engineering” meme to death, let’s move on to something else. Eric Smith of The Limber Lambda blog (love that name!) commented:

I'm a little concerned with the intimation that use of interfaces, respect for visibility of type members and use of dependency injection equates to "over-engineering". As with everything, it depends on what you're trying to achieve, and generalisations in this regard, especially when junior people who may not understand what's at stake are reading, can be damaging.

I find it an uphill battle to engender a constructive mindset in developers who have established bad habits and whose pride lies in the way of addressing those habits.

Anti-"process" talk by Joel Spolsky and the "pragmatism brigade" makes it harder. A while ago I had a new developer refuse to write unit tests despite it being an established practice in our team because "... Jeff and Joel said they were bad in the StackOverflow podcast ...". Yikes.

Let me be very clear. I never suggested that techniques such as interfaces and dependency injection are over engineering. These are good engineering practices, and every software engineer should understand them. And if Joel and Jeff really said unit tests were bad, well that would be about the dumbest thing I’d have ever heard either of those two say. Yikes indeed.

But as Eric writes, “it depends on what you're trying to achieve”. Engineering techniques like dependency injection, polymorphism, encapsulation are tools, and there are many good reasons to use them. But like many tools, they can also be used for evil.

In other words, the tools themselves are always innocent – you have to look at how and why they are being used by the people who are using them.

Let’s take dependency injection as an example. Externalizing a software component’s dependencies enables you to test it in isolation from the rest of your system. For example, it’s very common to inject a dependency that writes to a durable store, such as a logger or a data access component. In your unit tests, you inject a mock durable store instead of the real dependency. The mock will be faster (no need to actually write to disk), cleaner (no need to clean up the files on disk between test runs) and will behave exactly to the spec (bugs in the dependency component won’t create false failures in the component you’re testing). Those are all good engineering arguments for using DI, full stop.
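A minimal sketch of that logger example in Python, assuming a made-up OrderProcessor component; all three class names are hypothetical and exist only to show the injection.

```python
class FileLogger:
    """The 'real' dependency: appends each message to a file on disk."""

    def __init__(self, path):
        self.path = path

    def write(self, message):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(message + "\n")


class MockLogger:
    """Test double: keeps messages in memory, so there is nothing to clean up."""

    def __init__(self):
        self.messages = []

    def write(self, message):
        self.messages.append(message)


class OrderProcessor:
    """Component under test; its logger is injected rather than created here."""

    def __init__(self, logger):
        self.logger = logger

    def process(self, order_id):
        self.logger.write(f"processed order {order_id}")
        return True


# In a unit test, inject the mock instead of FileLogger.
mock = MockLogger()
processor = OrderProcessor(mock)
result = processor.process(42)
```

The test can then inspect mock.messages directly, with no disk access and no dependence on FileLogger being bug-free.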

Furthermore, DI helps insulate a software component against changes in its dependencies. I may not be able to predict specific changes with any precision, but it’s probably safe to assume that a given component’s dependencies aren’t going to remain completely static. DI doesn’t insulate you 100% from possible changes – in particular, it doesn’t help if the dependency’s interface changes.

But I would argue that you can go too far with DI. Let’s go back to the logger component example I described above. Maybe, the over engineer thinks, we’ll want the logger to write to the database instead of the file system in the future. Or maybe we’ll want the logger to write to a different database. And if it’s supporting a different database, then maybe the logger should support different back end databases. Or maybe, Or Maybe, OR MAYBE...

We’ve taken a simple component that logs to the file system and turned it into an engineering monstrosity with multiple points of variability and extensibility. When you start saying “maybe we should” or “this could change in the future” or stuff like that, that’s when you start over engineering something.

Unfortunately, there’s only one way to know when you’ve started over-engineering: Experience. Sorry Eric, I can’t help you with your junior engineers. As David Lee Roth once sang, Experience is the “worst teacher goin’”. But if there’s a better way to learn, I don’t know it. In the meantime, I suggest code reviews and pair programming.

Posted By Harry Pierson at 4:55 PM Pacific Daylight Time

Wednesday, July 15, 2009

Architecture Astronauts and Over Engineers

Since it’s apparently Architecture Week™ [1] here at DevHawk, here’s another of my favorite Dilbert cartoons of all time – relevant to the discussion at hand.

Dilbert.com

Two interesting comments on yesterday’s post:

Architectural thinking is a necessary (and very important) part of software development - but beyond the systems level (which is systems administration and not software architecture) I have a hard time seeing divorcing architectural thinking from the actual development as anything but a terrible thing. Although I see that your definition of architecture (at the functional level) does not match my caricature of the 'architecture astronauts' which I do think can be endemic in languages that encourage additional layers of architecture. [Michael Foord]

So based on the definition of architecture I'm reading into your post, you wouldn't consider the choice of object-oriented versus functional programming styles from an architectural perspective? I'm trying to understand what level of architecture you mean here. Like Michael, I usually think of architecture even down into the implementation patterns level (hence the architecture astronauts), but that seems to be included in what you might be calling an engineering concern. [Ryan Riley]

Let me be very clear. Using my definition, there is no such thing as “architecture even down into the implementation patterns level”. I’d argue that the implementation patterns level is engineering, not architecture. From what I’ve seen, the terms “architecture” and “engineering” tend to be used interchangeably in the software industry, and frankly I think that’s a mistake. I said as much in yet another post I wrote four years ago:

Architecture is the intersection between business and IT.

If a decision doesn't affect a business person, it's not an architecture decision. I'm not saying it's not important - I think the role of the software engineer is critical in large-scale enterprise system design and construction. And I will readily admit that often a single person is responsible for both architecture and engineering. But that doesn't make them the same activity. As long as we continue to confuse the two disciplines, we hold them both back.

Michael and Ryan (or anyone else for that matter) are welcome to disagree with my definition of architecture. I often joke that if you asked ten architects to define “architecture”, you’d get twelve answers. But that’s my definition and I’m sticking to it.

But what of the Architecture Astronauts? Both Michael and Ryan mentioned them. Unsurprisingly, I think that term is used too broadly as well. If you go back and read Joel’s original post on Architecture Astronauts, there wasn’t much reference, if any, to the implementation layer at all.

The Architecture Astronauts will say things like: "Can you imagine a program like Napster where you can download anything, not just songs?" Then they'll build applications like Groove that they think are more general than Napster, but which seem to have neglected that wee little feature that lets you type the name of a song and then listen to it -- the feature we wanted in the first place. Talk about missing the point. If Napster wasn't peer-to-peer but it did let you type the name of a song and then listen to it, it would have been just as popular.

[Joel on Software, Don't Let Architecture Astronauts Scare You]

I feel that my definition fits very well with the way Joel writes about architecture in this paragraph. The Architecture Astronaut is trying to solve a real business problem - people need access to information besides music. But the mistake they make is thinking they can solve multiple problems with a single solution. So they abstract higher and higher until they’ve lost sight of the original problem and can only focus on the abstractions. If you look at what Joel has to say about technologies like Hailstorm and Jini, you see the same pattern emerge.

This isn’t to say that similar problems of over-abstraction don’t happen at the implementation layer – they do. But they happen for very different reasons. Architecture Astronauts are trying to solve multiple problems with a single solution. But when over-abstraction happens at the implementation level, it’s because someone thought they could predict the future.

We’ve all seen our fair share of over-engineered systems that introduce significant unneeded complexity on the off chance that the development team can successfully predict the kind of change likely to come in the next version of the product. Invariably, the team’s precognitive abilities are revealed to be as poor as everyone else's, so they’re left with a bunch of extra layers of software cruft that has to be maintained but provides zero additional value to the system. I’ve blogged about that problem before as well: Kitchen Sink Variability.

Since I’m big on keeping the terminology of architecture and engineering separate, I’d argue that we need a different term than Architecture Astronaut for people who want to introduce additional layers of abstraction at the implementation layer on the off chance that they don’t suck at precognition. Since we call such systems over-engineered, wouldn’t that make the people who build them “Over Engineers”?


[1] It’s like Shark Week, but with white boards and even more terrifying.

Posted By Harry Pierson at 5:12 PM Pacific Daylight Time

Tuesday, July 14, 2009

Dynamic Languages in Architecture

In the comments from yesterday’s post, IronPython MVP and author extraordinaire Michael Foord asked:

Has your view on architecture as a discipline separate from coding changed since working with dynamic languages?

In a word: “No” (though as always, I reserve the right to be wrong and/or convinced otherwise.)

When I was an architect, I tried very hard to treat it as a “discipline separate from coding”. To use my last post as an example, building a central repository of system audit information is an architectural decision. A bad one IMHO - at least the way Dilbert’s PHB described it - but an architectural decision all the same. It was a decision about what kind of system to build, part of an overall application portfolio, as opposed to a decision about how to build the system.

I’ve held this opinion of architecture for a long time. Four years (and three jobs) ago, I wrote the following:

IMO, building a system that has a set of functional requirements (track customers, process orders, etc) and non-functional constraints (sub-second response time, support 10,000 concurrent users, use Microsoft Windows platform, etc) is an engineering problem. Coming up with the lists of functional requirements and non-functional constraints is the architecture problem.

Working with dynamic languages has dramatically changed my view of engineering and design of individual systems. But from the pure architecture perspective, I want to be able to treat individual systems as black boxes as much as possible. That means the programming language is an implementation detail that shouldn’t matter to the architect.

Note the significant bet-hedging language in the paragraph above. I’m using phrases like “shouldn’t matter” and “as much as possible” because we all know that there’s no such thing as a “pure architecture perspective”. Unlike building architecture, software architecture is in constant flux at every level. At the enterprise level, there are always new regulatory obligations, new competitors and new partners to consider. At the end-to-end process level, there are always new systems or new versions of existing systems coming on line. And at the individual system level, there are always new tools, frameworks and languages – or at least new versions of them – being released.

Once you introduce time into your architecture perspective, individual system engineering will affect the overall architecture, since system engineering affects the rate of change. Language choice will certainly have some engineering impact. However, in my experience language choice is rarely high on the list of concerns relative to things like project scope and team experience.

So my “No” answer to Michael’s question is predicated on the following:

  • As an architect, I want to consider individual systems as black boxes where implementation details like language choice are completely irrelevant.
  • As a practical architect, I realize that some system implementation details are relevant – especially over time - but in my experience language choice isn’t one of them.

On the other hand, most IT shops try to standardize on one programming language – certainly MS IT did – so maybe language choice would be more architecturally relevant in a mixed language shop. I’d love to hear from folks who have multiple standard languages in their IT shop – especially if you have both static and dynamic languages on your standards list.

Posted By Harry Pierson at 11:28 AM Pacific Daylight Time

Monday, July 13, 2009

Probably Wrong Info Is Worse Than No Info At All

Like many geeks, I love Dilbert. However, I rarely identify with it as strongly as I did on Sunday.

Dilbert.com

I kid you not, I’ve had almost exactly this conversation back when I worked in MS IT. They have this big repository of information about deployed applications. Technically, you’re not supposed to deploy an application without listing it in the application repository. Like Dilbert, I never really understood what people were going to do with this information, but the projects I was on dutifully collected the relevant information and put it into the repository.

And never thought of it again. Ever.

And therein lies the problem. Populating the application repository was an artificial step on the critical path of the deployment process. Writing the software, acquiring the physical hardware to run it on, stuff like that really is on the critical path. Populating the application repository was extra busy work legislated by someone (I forget if it was the central architecture team or management) that didn’t benefit the project in the slightest. As such, it was given the minimal amount of attention and effort, meaning there was little quality or consistency in the data. Worse yet, when the application changed or was decommissioned, updating the application repository just didn’t happen. I mean, it was supposed to, but rarely did.

So you ended up with a repository of information that was worse than useless. I had a colleague who insisted that the repository had some value because “not all of the data was wrong”. Of course, he couldn’t tell me with any consistency which data was accurate and therefore valuable and which was not. Hence, my argument that it was “worse than useless”.

The only way an application repository is going to be of any value at all is if you can collect the data automatically. My old teammate Buzz coined a phrase we used often: “The Truth Is On The Edge”. You should always regard any central repository of information with a very critical eye since it’s rarely going to be the truth.

(Ed. Note – Man, it’s been a long time since I’ve written about Architecture. My last Architecture post was almost a year ago. I don’t miss the job but I do miss my old teammates – in particular Buzz, Rick, Dale and of course Nick Malik.)

Posted By Harry Pierson at 10:37 AM Pacific Daylight Time

Thursday, July 09, 2009

Syntax Highlighting TextBoxes in WPF – A Sad Story

One of the big new features in VS 2010 is the WPF based editor. With it, you can build all sorts of cool stuff like control the visualization of XML doc comments, change how intellisense looks, even scale the size of text based on the location of the caret. Huzzah for the WPF Visual Studio editor!

However, as wicked awesome as the new editor is, AFAIK it’s not going to be released as a separate component. So while the PowerShell, Intellipad and other teams inside Microsoft can reuse the VS editor bits, nobody else can. So if you want to do something like embed a colorizing REPL in your WPF app, you’ll have to use something else.

I’ve thought about putting a WPF based UI on top of ipydbg (though now I’d probably use the new lightweight debugger instead). So I downloaded John’s repl-lib code to see how he was doing it. Turns out his REPL control is essentially a wrapper around WPF’s RichTextBox control. It works, but it seems kinda kludgy. For example, the RichTextBox supports bold, italics and underline hotkeys, so John’s REPL does too. Though it is possible to turn off these formatting commands, I decided to take a look at modifying how the plain-old TextBox renders. After all, WPF controls are supposed to be lookless, right?

Well, apparently not all the WPF controls are lookless. Most relevant to this post, the TextBox is definitely NOT lookless. It looks like the text editing capabilities of TextBox are provided by the System.Windows.Documents.TextEditor class while the text rendering is provided by the System.Windows.Controls.TextBoxView class. Both of those classes are internal, so don’t even think about trying to customize or reuse them.

The best (and I use that term loosely) way I found for customizing the TextBox rendering came from a couple of articles on CodeProject by Ken Johnson. Ken’s CodeBox control inherits from TextBox and sets the Foreground and Background to transparent (to hide the result of TextBoxView) and then overrides OnRender to render the text with colorization. Rendering the text twice – once transparently and once correctly – seems like a better solution than using the RichTextBox, but it’s still pretty kludgy. (Note, I’m calling the TextBox design kludgy – Ken’s code is a pretty good workaround).

So if you want a colorized text box in WPF, your choices are:

  • Build your own class that inherits from RichTextBox, disabling all the formatting commands and handling the TextChanged event to do colorization
  • Build your own class that inherits from TextBox, but set Foreground and Background colors to transparent and override OnRender to do the visible text rendering.
  • Use a 3rd party control. The only one I found was the AqiStar TextBox. No idea how good it is, but they claim to be a true lookless control. Any other syntax highlighting WPF controls around that I don’t know about?

Posted By Harry Pierson at 3:18 PM Pacific Daylight Time

Wednesday, July 08, 2009

Microsoft.Scripting.Debugging

If you’ve compiled IronPython from source recently, you may have noticed a new DLL: Microsoft.Scripting.Debugging. This DLL contains a lightweight, non-blocking debugger for DLR based languages that is going to enable both new scenarios as well as better compatibility with CPython. Needless to say, we’re very excited about it.

When I was actively working on my ipydbg series, I got several emails asking about using it in an embedded scripting scenario. Unfortunately, the ipydbg approach doesn’t work very well in the embedded scripting scenario. ipydbg uses ICorDebug and friends, which completely blocks the application being debugged. This means your debugger has to run in a separate process. So either you run your debugger in your host app process and your scripts in a separate process or you run your debugger in a separate process debugging both the scripts and the host app. Neither option is very appealing.

Now with the DLR Debugger, you can run all three components in the same process. I think of the DLR Debugger as a “cooperative” debugger in much the same way that Windows 3.x supported cooperative multitasking. It’s also known as trace or traceback debugging. Code being debugged yields to the debugger at set points during its execution. The debugger then does whatever it wants, including showing UI and/or letting the developer inspect or modify program state. When the debugger returns, execution of the original code continues until the next set point, at which point the process repeats itself.

The primary point of entry for the DLR Debugger is the DebugContext class. Notable there is the TransformLambda method, which takes a normal DLR LambdaExpression and transforms it into a cooperatively debugged LambdaExpression. LambdaExpressions can contain DebugInfoExpressions – typically we insert them at the start of every Python code line as well as one at the end of the function. When we run IronPython in debug mode (i.e. –D), those get turned into sequence points as we saw back when I was working on ipydbg. When using the DLR Debugger, those DebugInfoExpressions are transformed into calls out to IDebugCallback.OnDebugEvent. The DLR Debugger implements the IDebugCallback interface on the TracePipeline class which also implements ITracePipeline. In OnDebugEvent, TracePipeline calls out to an ITraceCallback instance you provide. The extra layer of indirection means you can change your traceback handler without having to regenerate the debuggable version of your functions.
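The value of that extra indirection can be sketched in a few lines of Python. This is only an analogy: the class and function names below loosely mirror the roles described above and are not the actual Microsoft.Scripting.Debugging API.

```python
class TracePipeline:
    """Stable target that instrumented code calls; forwards to a swappable callback."""

    def __init__(self):
        self.trace_callback = None  # may be replaced at any time

    def on_debug_event(self, func_name, line):
        if self.trace_callback is not None:
            self.trace_callback(func_name, line)


def make_debuggable(pipeline):
    """Stand-in for a transformed function: it yields to the pipeline at set points."""

    def add(a, b):
        pipeline.on_debug_event("add", 1)
        total = a + b
        pipeline.on_debug_event("add", 2)
        return total

    return add


pipeline = TracePipeline()
add = make_debuggable(pipeline)  # "transformed" exactly once

first, second = [], []
pipeline.trace_callback = lambda name, line: first.append((name, line))
add(1, 2)

# Swap the handler; the instrumented function is not regenerated.
pipeline.trace_callback = lambda name, line: second.append((name, line))
add(3, 4)
```

Because the instrumented function only knows about the pipeline, both handlers see the same events without add being rebuilt.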

Of course, we hide all this DLR Debugger goo from you in IronPython. Python already has a mechanism for doing traceback debugging – sys.settrace. Our ITraceCallback, PythonTracebackListener, wraps the DLR Debugger API to expose the sys.settrace API. That makes this feature a twofer – new capability for IronPython + better compatibility with CPython. Instead of needing a custom tool (i.e. ipydbg) you can now use PDB from the standard Python library (modulo bugs in our implementation). I haven’t been working on ipydbg recently since you’ll be able to use PDB soon enough.
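For readers who haven't used it, here is a minimal sketch of the standard sys.settrace API as it behaves in CPython. The tracer and add functions are made up for illustration, and whether a given IronPython build behaves identically is an assumption.

```python
import sys

executed = []  # (function name, line offset within the function) per line event


def tracer(frame, event, arg):
    # Invoked for each 'call' event; returning the tracer enables
    # per-line tracing inside that frame.
    if event == "line" and frame.f_code.co_name == "add":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        executed.append((frame.f_code.co_name, offset))
    return tracer


def add(a, b):
    total = a + b
    return total


previous = sys.gettrace()
sys.settrace(tracer)
try:
    result = add(1, 2)
finally:
    sys.settrace(previous)  # always restore whatever tracer was installed before
```

The traced code hands control to tracer at each line and resumes when it returns, which is the cooperative model described above; PDB is built on the same hook.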

For those hosting IronPython, we also have a couple of static extension methods in our hosting API (look for the SetTrace functions in IronPython\Hosting\Python.cs). These are simply wrappers around sys.settrace, so you get the same API regardless of whether you access it from inside IronPython or from the hosting API. But if you’re hosting IronPython in a C# application, those extension methods are very convenient to use.

This debugger will be in our regular releases of IronPython as of 2.6 beta 2 which is scheduled to drop at the end of this month. For those who just can’t wait, it’s available as source code starting with yesterday’s changeset. Please let us know what you think!

Posted By Harry Pierson at 2:42 PM Pacific Daylight Time
Disclaimer: The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my opinion. Inappropriate comments will be deleted at the author's discretion.