DataReaders, LINQ to XML and Range Generation

I’m doing a bunch of database / XML stuff @ work, so I decided to use to VS08 beta 2 so I can use LINQ. For reasons I don’t want to get into, I needed a way to convert arbitrary database rows, read using a SqlDataReader, into XML. LINQ to SQL was out, since the code has to work against arbitrary tables (i.e. I have no compile time schema knowledge). But XLinq LINQ to XML helped me out a ton. Check out this example:

const string ns = "{http://some.sample.namespace.schema}";

while (dr.Read())
{
    XElement rowXml = new XElement(ns + tableName,
        from i in GetRange(0, dr.FieldCount)
        select new XElement(ns + dr.GetName(i), dr.GetValue(i)));
}

That’s pretty cool. The only strange thing in there is the GetRange method. I needed an easy way to build a range of integers from zero to the number of fields in the data reader. I wasn’t sure of any standard way, so I wrote this little two line function:

IEnumerable<int> GetRange(int min, int max)
{
    for (int i = min; i < max; i++)
        yield return i;
}

It’s simple enough, but I found it strange that I couldn’t find a standard way to generate a range with a more elegant syntax. Ruby has standard range syntax that looks like (1..10), but I couldn’t find the equivalent C#. Did I miss something, or am I really on my own to write a GetRange function?

Update: As expected, I missed something. John Lewicki pointed me to the static Enumerable.Range method that does exactly what I needed.

Fantasy, Free Code and the SharePoint Model

It might sounds like a fantasy, but Nick Malik really wants free code. He started talking about it a few months ago when he was getting raked over the coals by debating Mort with the agile .NET community:

Rather than look at “making code maintainable,” what if we look at making code free.  Why do we need to maintain code?  Because code is expensive to write.  Therefore, it is currently cheaper to fix it than rewrite it.  On the other hand, what if code were cheap, or free?  What if it were cheaper to write it than maintain it?  Then we would never maintain it.  We’d write it from scratch every time.

Then about a week ago, he laid out the reasons why free code would be a good thing. At a high level (Nick’s blog has the details), those reasons are:

  1. Lower the cost of IT through reduced skill requirements.
  2. The speed of development goes up.
  3. Projects become more agile.
  4. Solution quality goes up.

Talking about the benefits of free code is sorta like talking about talking about the benefits of dating a movie star. The benefits are actually pretty obvious, but talking about them doesn’t really help you get there from here.

Actually, Nick isn’t suggesting that all code can be free. He’s focused on separating out business process modeling/development from the rest of software development. In essence, he’s describing a new class of developer (should we call the persona Nick as an homage?) who needs their own class of tools and for the IT department to “formally” allow them to “easily develop, test, and deploy [aka host] their processes.” For the most part, these BP developers wouldn’t be traditional developers. They’d be more like software analysts who do the work directly instead of writing specs for developers.

I call this separation of business and IT concerns the SharePoint Model. SharePoint, IMO, does an amazing job of separating the concerns and needs of business and IT users when it comes to running intranet web sites. Only the truly geeky stuff that requires specialized access, knowledge or equipment – installing the SharePoint farm in the first place, keeping it backed up, installing service packs, etc. – is done by IT. Everything else is done by the business folks. Need a new site? Provision it yourself. Need to give your v-team members access to it? Do it yourself. I see similarities in the free BP code approach Nick’s suggesting. I’d even argue that SharePoint is the natural “host” for such business processes. It already supports WF and can provide access to back-end enterprise data via the Business Data Catalog.

On the other hand, some of what Nick suggests seems fairly impractical. For example, he thinks IT should “formally and officially take control of managing the common enterprise information model and the business event ontology.” First off, who officially manages this today? Does such an official information model or event ontology even exist? I’m guessing not. That means you’ve got to start by getting the business people to agree on one. That’s usually a sucker’s bet. Nick also suggests we “reduce the leaky abstractions” in our services. To suggest this is even possible seems incredibly naive.

The good news is the things that will work (evolving BP into its own development discipline, building custom tools for BP development, getting IT into the BP hosting business) don’t depend in any way on the things that wont work (getting lots of folks to agree on anything, breaking the laws of physics, overcoming the law of leaky abstractions). I’m not sure it will result it truly free code, but it sure would bring the costs down dramatically. Thus, I think most of Nick’s free code vision is quite practical and not a fantasy at all.

As for dating a movie star, you’re on your own.

*Now* How Much Would You Pay For This Code?

Via Larkware and InfoQ, I discovered a great post on code reuse by Dennis Forbes: Internal Code Reuse Considered Dangerous. I’ve written about reuse before, albeit in the context of services. But where I wrote about the impact of context on reuse (high context == low or no reuse), Dennis focused on the idea of accidental reuse. Here’s the money quote from Dennis:

Code reuse doesn’t happen by accident, or as an incidental – reusable code is designed and developed to be generalized reusable code. Code reuse as a by-product of project development, usually the way organizations attempt to pursue it, is almost always detrimental to both the project and anyone tasked with reusing the code in the future. [Emphasis in original]

I’ve seen many initiatives of varying “officialness” to identify and produce reusable code assets over the years, both inside and outside Microsoft. Dennis’ point that code has to be specifically designed to be reusable is right on the money. Accidental code (or service) reuse just doesn’t happen. Dennis goes so far as to describe such efforts as “almost always detrimental to both the project and anyone tasked with reusing the code in the future”.

One of the more recent code reuse efforts I’ve seen went so far as to identify a reusable asset lifecycle model. While it was fairly detailed at the lifecycle steps that came after said asset came into existence, it was maddeningly vague as to how these reusable assets got built in the first place. The lifecycle said that a reusable asset “comes into existence during the planning phases”. That’s EA-speak for “and then a miracle happens”.

Obviously, the hard part about reusable assets is designing and building them in the first place. So the fact that they skimped on this part of the lifecycle made it very clear they had no chance of success with the project. I shot back the following questions, but never got a response. If you are attempting such a reuse effort, I’d strongly advise answering these questions first:

  • How does a project know a given asset is reusable?
  • How does a project design a given asset to be reusable?
  • How do you incent (incentivize?) a project to invest the extra resources (time, people, money) in takes to generalize an asset to be reusable?

And to steal one from Dennis:

  • What, realistically, would competitors and new entrants in the field offer for a given reusable asset?

Carl Lewis wonders Is your code worthless? As a reusable asset, probably yes.

Thoughts on C# Fluent Interfaces

Martin Fowler points to a couple of articles by Anders Norås on building internal / embedded domain specific languages in C#. Anders has built a DSL for creating calendar events and tasks, like you might expect to do in Outlook. Here’s an example:

ToDoComponent planningTask =
   Plan.ToDo("Plan project X").
      StartingNow.
      MustBeCompletedBy("2007.08.17").
      ClassifyAs("Public");
planningTask.Save();

EventComponent planningMeeting =
   Plan.Event("Project planning meeting").
      RelatedTo(planningTask).
      WithPriority(1).
      At("Head office").
      OrganizedBy("jane@megacorp.com", "Jane Doe").
      StartingAt("12:00").Lasting(45).Minutes.
      Attendants(
         "peter@megacorp.com",
         "paul@megacorp.com",
         "mary@contractor.com").AreRequired.
      Attendant("john@megacorp.com").IsOptional.
      Resource("Projector").IsRequired.
      ClassifyAs("Public").
      CategorizeAs("Businees", "Development").
      Recurring.Until(2008).EverySingle.Week.On(Day.Thursday).
      Except.Each.Year.In(Month.July | Month.August);
planningMeeting.SendInvitations();

It may not be as clean as a say a Ruby version might be, but even with all the parens and periods it’s still pretty readable. Fowler calls this a fluent interface, a term I like better than “internal DSL”.

Two things jumped out at me reading Anders’ entry on how he built this fluent interface. First, there’s a lot of code to make this work. Anders didn’t publish the code, but he did admit:

“Believe me, there will be a lot of code when you’re done. I’m almost there with this DSL, and at the time of writing it consists of 58 classes not including the API and tests.”

That’s 58 classes just to implement the fluent interface, not counting the underlying EventComponent API. That’s a lot of non-business logic code to write. How many projects are willing to invest that kind of time and effort to build a fluent interface? (I would guess “not many”)

However, I bet there’s a lot of template-izable code in Anders fluent interface. When he writes about keeping the language consistent by “creating branches within our grammar using different descriptor objects”, I can help but think about parser development with YACC and the like. These tools typically use a DSL like BNF. Maybe we could build a DSL for building fluent interfaces?

Second, Anders makes a very interesting point about the structure of the fluent interface code:

Writing DSLs is a little different from the regular object oriented programming style. You might have noticed that the Plan class has a verb for its name rather than the usual noun. This allows us to have a natural starting point for writing out the “sentence” explaining our intention.

Where have you seen this verb based approach before? Powershell cmdlets.

Windows PowerShell uses a verb-noun pair format for the names of cmdlets and their derived .NET classes. For example, the Get-Command cmdlet provided by Windows PowerShell is used to retrieve all commands registered in the Windows PowerShell shell. The verb part of the name identifies the action that the cmdlet performs. The noun part of the name identifies the entity on which the action is performed.
[Cmdlet Verb Names, MSDN Library]

I’ve written about this aspect of PowerShell before:

In OO, most of the focus is on objects, naturally. However, administrators (i.e. the target audience of PS) tend to be much more task or action focused than object focused. Most OO languages don’t have actions as a first class citizens within the language. C# and Java don’t even allow stand alone functions – they always have to be at least static members of a class.

I’m fairly sure there are many reasons why strongly typed OO languages aren’t popular among administrators. I’m not going to go down the static/dynamic typing rat hole here, but I would guess the object/action language tradeoff is almost as important as the typing tradeoff. What’s nice about PowerShell is that while it has strong object support, it also has strong action support as well. In PS, actions are called Cmdlets. While I’m not a big fan of the name, having first class support for them in PS is one of the things I find most interesting.
[Perusing Powershell Part 1: Get-SQLServer, DevHawk]

While there is no first-class support for verbs or actions in C#, it looks like Anders has essentially rolled his own. For example, his Plan.Event() method returns a new EventDescriptor object. Subsequent calls on this object (RelatedTo, WithPriority, OrganizedBy) change the internal state of this EventDescriptor object. When you reach the end of the chain of calls, EventDescriptor has an implicit EventComponent cast operator that creates a new EventComponent with all the data that’s been collected along the chain by the EventDescriptor.

Again, I can help but think a significant amount of code in this approach can be generalized and the creation automated. Also, I wonder if any of the new C# 3.0 capabilities could be used to improve the implementation. For example, would Extension Methods make it easier to build the fluent interface? Maybe / Maybe not. Regardless, Anders has given me a lot to noodle on.

Morning Coffee 98

  • Morning Coffee was canceled on Thursday and Friday on account of a kidney stone. So not fun. Luckily, it was a little one and it was alone, but I will be listening very closely to my doctor’s advice to avoid another.
  • Took the kids to see Ratatouille last Tuesday and saw Transformers yesterday with my wife due to fluke babysitter luck. I liked Ratatouille, but I’m not sure it’s the 51st best movie of all time. On the other hand, major props for making a kid movie with a significant lack of toy tie-ins. Ratatouille is a better movie that Cars, but I don’t see my four year old boy trading in is Lightning McQueen toy car for a Remy the Rat. Transformers on the other hand obviously did not forgo the toy tie-ins! Still, it wasn’t bad. Kinda reminded me of The Rock with a bigger budget.
  • Micahville listed DevHawk on it’s list of 69 Tech Blogs That Don’t Suck. Thanks!
  • David Ing boldly writes that C# is getting fat. Or maybe it’s just big-boned. My take: no question that integrated query is a big feature that covers a lot of surface area. But given the prevalence of databases and other queriable stores, it’s critical to improving programmer productivity. Go read Todd Proebsting’s talk on Disruptive Programming Language Technologies. Two of his candidates for disruptive language technologies were Database Integration and Manipulating XML. LINQ neatly covers both.
  • According to John Shewchuck, the new BizTalk Services release is available. However, when I click on the “what’s new” page, it tells me they’re experiencing technical difficulties. (Their error page is Oops.aspx. Funny!)
  • Scott Hanselman has Programming Personas 2.0. Who are you? I thought I was and “Order n” Architect (the quote “Where’s the whiteboard” is spot on) but my CS background isn’t as strong as the persona’s.
  • Sam Gentile is starting to dig into Concurrency and he has a great list of links that have influenced his design.