Language Features I Wish C# Had – Tuples

Several languages, such as Python, have the concept of a Tuple built into the language. One of the things it’s used for in Python is multiple return values. So you can write “return x,y” to return two values. Of course, C# can only return one. If you need to return more values, you have to define out parameters.
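To make the Python side concrete, here is a minimal sketch of a function that returns two values. The function name divide_with_remainder is made up for illustration (Python already ships divmod for this particular job); the point is only that “return x,y” hands back a single tuple that the caller can unpack.

```python
def divide_with_remainder(dividend, divisor):
    """Return the quotient and the remainder together."""
    quotient = dividend // divisor
    remainder = dividend % divisor
    # "return a, b" packs both values into one tuple object.
    return quotient, remainder


# The caller can keep the tuple whole...
result = divide_with_remainder(17, 5)

# ...or unpack it into separate names in one statement.
q, r = divide_with_remainder(17, 5)
```

The C# equivalent in this era would need an out parameter or a hand-written result type for the second value.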

LINQ / C# 3.0 / VB 9 support the idea of anonymous types, which is similar to a tuple. The big difference is that, because they’re anonymous, they can’t leave the scope they’re defined in. In other words, they’re great within a function, but if you want to pass them out of your function type-safely, you have to define a non-anonymous type for them.

Interestingly enough, F# supports tuples, though it’s a bit of a hack. Since the CLR doesn’t support tuples, F# basically defines different Tuple classes for up to seven tuple parameters (i.e. Tuple<t1,t2,t3,t4,t5,t6,t7>). For .NET 1.x, it’s even worse – they have to define different type names (Tuple2, Tuple3, etc). Ugh.

Update: Robert Pickering pointed out that F#’s tuple implementation is entirely transparent inside of F#. He’s right – I was writing from the perspective of a C# developer using F#’s implementation of tuples. Maybe I need to be looking closer at F#?

Business Processes Are Services Too

I’ve been having a conversation with Piyush Pant over on his blog that started as a comment he left on my Services Aren’t Stateless post. He thinks that I’m “missing the crucial point here by implicitly conflating business process and service state”. While Piyush hasn’t really defined what he means by these terms, I think I understand what he’s getting at. Yes, process and service state are different in many ways, but they are also similar in that they are both service private data.

Pat Helland (side note – I wish Pat would start blogging again) wrote an article some time ago titled Data on the Outside vs. Data on the Inside where he talked about the differences between service private data and data in the space between the services. For example, data on the outside is immutable, requires an open schema for interop, doesn’t need encapsulation and is representable in XML. Service private data is not immutable, doesn’t need an open schema for interop, requires encapsulation and is typically stored in a SQL RDBMS. So on this front, process and service state are both service private data so conflating them makes some sense.

However, what’s not in the article is the idea of Resource and Activity data. Not sure why Pat didn’t include this in the article, but he was talking about it as far back as PDC 2003. Stu Charlton described the difference between resource and activity data in his Autonomous Services article:

Activity Data – This is “work in progress” data for any long-running business operation, and is usually encapsulated by business logic. A classic example is a shopping cart in any e-commerce system. This data is mutable, but typically has low concurrency conflicts, as it is not widely shared. Typically activity data retires after a long running operation completes, and may be archived in a decision support system for later analysis.

Resource Data – This is “state of the business” data, which represents the resources of an organization, and is usually encapsulated by business logic. Examples are: room availability in a hotel, inventory levels in a warehouse, account statuses, employee and customer information. Some resources have a small life span, others may last a very long time (years). Resource data is usually volatile with potential for high concurrency conflicts.

So I’m fairly sure that when Piyush says “process state” I should hear “activity data”. Similarly “service state” is “resource data”. The differences between activity and resource data lead to some interesting implementation artifacts, which I assume he’s getting at when he says I’m conflating the two. For example, since activity data like a shopping cart has low or no concurrency issues, using an optimistic concurrency scheme is entirely appropriate, which you would never use for highly volatile resource data like warehouse inventory levels. In fact, since activity data doesn’t have concurrency issues, you could even store it inside an instance of a workflow or orchestration, which gets serialized to a persistent store when it’s in an idle state.
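For readers who haven’t run into the term, here is a rough sketch of what an optimistic concurrency scheme looks like. It is an in-memory toy, not any particular product’s API: OptimisticStore and VersionConflict are hypothetical names, and a real service would keep the version number in a database column. The idea is that nobody takes a lock; each write states which version it read, and the write is rejected if someone else got there first.

```python
class VersionConflict(Exception):
    """Raised when a record changed after the caller read it."""


class OptimisticStore:
    """Toy key/value store with a version number per record."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys read as version 0 with no value.
        return self._rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._rows.get(key, (0, None))
        if current_version != expected_version:
            # Someone else wrote since we read: reject, caller must retry.
            raise VersionConflict(key)
        self._rows[key] = (current_version + 1, value)
        return current_version + 1


store = OptimisticStore()
version, _ = store.read("cart:42")
new_version = store.write("cart:42", version, ["book"])
```

With a shopping cart the conflict branch almost never fires, because only one shopper touches the cart. With warehouse inventory, many writers would keep colliding and retrying, which is why the scheme fits activity data far better than volatile resource data.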

However, the fact that activity and resource data are handled differently doesn’t mean that most services won’t have activity data. When Thomas Erl says that stateless services are a “common principle of service orientation”, essentially what I think he’s saying is that services should only have resource data. And as I said before, this seems wrong to me. Sure, some services will be stateless. But all services? Services implement business capabilities. Most business capabilities are long running processes. Doesn’t that imply that most services in the enterprise will need to be long running workflows or orchestrations?

So for the most part, Piyush and I just seem to have different names for the same concepts. The one issue I have with Piyush’s description of process and service state is that he seems to implicitly assume that processes aren’t services. Why not? Again, not all services will be processes, but if you’re not exposing processes as services, how exactly are you exposing them?

Modular Compilers

During Lang.NET, I ended up sitting next to Hua Ming, who’s been working on the .NET Classbox project I wrote about previously. .NET Classbox introduces a new syntax for “using” to C# – basically, you can use individual classes as well as whole namespaces, and you can extend the individual classes you use. Obviously, that meant having a custom compiler that was 99% vanilla C# + the extra classbox syntax. Rather than building a C# compiler from scratch, the Classbox project extended the Mono Project C# compiler. Hua described the process as taking a “huge amount of time” and he described the compiler as “a monster”. Now, I’m not trying to knock Mono here, I imagine our C# compiler is just as hard to work with. SSCLI’s C# compiler directory is 5.5MB of source code alone spread across 126 .h and 68 .cpp files.

Is it just me, or does it seem crazy to have to muck about with such a large code base in order to add a relatively simple language feature? What I’d like to see is a more modular way of building compilers, so that integrating a small language feature like classbox would be a small amount of effort.

Of course, there is some work that’s been done in this space. MS Research had a Research C# compiler paper, but it’s three years old and one of the two authors has moved on to a cool product group job. I also discovered SUIF and the National Compiler Infrastructure Project, but these don’t look like they’ve been updated in a while.

I like the model that the Research C# compiler proposes. Basically, it looks like this:

  1. Specify the grammar in a modular way. In the paper, the grammar is specified in an Excel file, and you can use multiple files in a modular fashion. i.e. have one file for the core language and another for the extensions.
  2. Late bind a grammar production to an action. Typically, in a lex/yacc style scenario, you embed the action code for a given production directly into the grammar, which makes it extremely hard to extend the existing syntax. In the paper, each production is linked with an instance of a type, so swapping out a new type would seem to be possible.
  3. Generate an abstract syntax tree that gets processed by multiple visitors. From the paper, the compiler has broken the “traditional” compiler steps – bind, typecheck, rewrite and generate binary (in this case IL) – into separate visitors. That makes adding extra steps or changing existing steps fairly straightforward.
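The third point can be sketched in a few lines. This is not the Research C# compiler’s code. It is a toy with three hypothetical node types (Num, Var, Add), one rewrite visitor and one code-generating visitor that targets an imaginary stack machine rather than IL. What it tries to show is that each pass is its own visitor, so adding a pass means adding an item to a list rather than editing the existing passes.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class Add:
    left: object
    right: object


class Visitor:
    """Dispatches to a visit_<NodeClassName> method."""

    def visit(self, node):
        return getattr(self, "visit_" + type(node).__name__)(node)


class FoldConstants(Visitor):
    """Rewrite pass: replaces an Add of two literals with one literal."""

    def visit_Num(self, node):
        return node

    def visit_Var(self, node):
        return node

    def visit_Add(self, node):
        left, right = self.visit(node.left), self.visit(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(left.value + right.value)
        return Add(left, right)


class Emit(Visitor):
    """Generate pass: emits instructions for a toy stack machine."""

    def visit_Num(self, node):
        return [("push", node.value)]

    def visit_Var(self, node):
        return [("load", node.name)]

    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right) + [("add",)]


def compile_tree(tree, rewrite_passes):
    # Each rewrite pass takes a tree and returns a tree, so passes compose.
    for rewrite in rewrite_passes:
        tree = rewrite.visit(tree)
    return Emit().visit(tree)


tree = Add(Var("x"), Add(Num(1), Num(2)))
plain = compile_tree(tree, [])
folded = compile_tree(tree, [FoldConstants()])
```

Here plain carries five instructions and folded carries three, and the only difference between the two calls is the contents of the pass list.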

The only thing I don’t like about this specific approach is their Excel file based parser generator. It’s a huge step beyond the LEX/YACC approach as it is scanner-less (having separate scanner and parser steps kills any chance of modularity) but it still has to deal with ambiguous grammars. Personally, I’ve been looking at Parsing Expression Grammars in part because they aren’t ambiguous. For programming languages, supporting ambiguity in the grammar is a bug, not a feature.
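The reason a Parsing Expression Grammar can’t be ambiguous is its ordered choice operator: alternatives are tried left to right and the first one that matches wins, so any input has at most one parse. Here is a minimal sketch of that idea using hand-rolled parser functions. The helper names literal, sequence and ordered_choice are my own, not from any PEG library, and each parser simply returns the position after the match or None on failure.

```python
def literal(text):
    """Parser that matches an exact string at the current position."""
    def parse(source, pos):
        if source.startswith(text, pos):
            return pos + len(text)
        return None
    return parse


def sequence(*parsers):
    """Parser that requires every sub-parser to match, one after another."""
    def parse(source, pos):
        for parser in parsers:
            pos = parser(source, pos)
            if pos is None:
                return None
        return pos
    return parse


def ordered_choice(*parsers):
    """PEG choice: commit to the first alternative that matches."""
    def parse(source, pos):
        for parser in parsers:
            end = parser(source, pos)
            if end is not None:
                return end
        return None
    return parse


# The same two alternatives in different orders give different,
# but always single, results.
longer_first = ordered_choice(literal("<="), literal("<"))
shorter_first = ordered_choice(literal("<"), literal("<="))

end_longer = longer_first("<=", 0)    # consumes both characters
end_shorter = shorter_first("<=", 0)  # stops after "<"
```

The flip side is that the grammar author has to get the order right: in shorter_first the second alternative can never match, because the first one always wins.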

Extending WL Writer

So I downloaded the SDK for WL Writer and took a quick look. Basically, there are two types of extensions you can build:

  • App Launcher – so you can add a “Blog It” button to some other app to remotely launch WL Writer. I assume this is how the WL Toolbar integration works.
  • Content Source – so you can add some type of custom content to a post. Typical examples would be Technorati tags or Currently Listening To info.

Given that they are trying to support “every blogging service out there”, I’m surprised there’s not a way to build a pluggable blogging service. WL Writer only allows you to customize the content of the post via plugins. Customizing the metadata (i.e. categories) is right out. I realize it’s the hip thing to put Technorati tags right in your post content, but Technorati also picks up category information which dasBlog already has great support for. What I’d really like is something that acts like del.icio.us’ new post form, where you can free type in your categories, it highlights words as you type and it shows you a list of all your tags so you can click on them.

One other minor note – WL Writer does a good job of inserting hyperlinks. When you select a word, often the whitespace that follows it is also selected. Some HTML editors will insert the hyperlink over the whole selection – including the whitespace, which makes no sense. WL Writer gets it right and excludes any trailing whitespace from the hyperlink. Cool!
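I have no idea how WL Writer implements this, but the behavior itself is easy to sketch. The function link_selection below is hypothetical and works on plain text with character offsets: it trims trailing whitespace from the selection before wrapping it, so the space stays outside the anchor tag.

```python
def link_selection(text, start, end, url):
    """Wrap text[start:end] in a link, leaving trailing whitespace outside."""
    selected = text[start:end]
    trimmed = selected.rstrip()
    # The link ends where the trimmed selection ends, not where the
    # original selection ended.
    link_end = start + len(trimmed)
    return text[:start] + '<a href="' + url + '">' + trimmed + "</a>" + text[link_end:]


# The selection covers "docs " including the space after the word.
linked = link_selection("Read the docs today", 9, 14, "http://example.com")
```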

A Few Short Takes

I did say I was going to go a little dark when I took the new job didn’t I? Things have been hectic – my brother’s getting married in just under two weeks and I’m working on getting my part of my new project’s Business Requirements Document (otherwise known as the BRD) done before I leave on vacation. The BRD process is fairly odd for this project – for one, the project team is writing it instead of the business unit. Given that we’re building infrastructure, many of the “business” elements of the BRD are not particularly appropriate. But we’re muddling thru. In a meeting with my boss’s boss’s boss last week, he stressed the need for delivering incremental value. In other words, the need for using an agile process which is cool as far as I’m concerned.

I have a couple of longer posts coming, but here are a few short takes for a Monday morning:

Windows Live Writer

Everyone seems gaga over the new tool, so I downloaded it. Pretty cool. I’m writing this post using it. Typically, I write my posts in FrontPage (now SharePoint Designer) and paste them into the dasBlog web editing interface – I’m pretty particular about the HTML that ends up on my blog. So far, Writer seems up for the job. And I love the Web Layout editing mode. It does have some bugs and missing features. For example, it has spell check, but not background spell check. And as Scott pointed out, the category list is totally broken when you have a lot of categories. Writer has an SDK, and one of the examples they suggest building with it is “Tags from tagging services”. I’d like to have a simple text box where I could enter categories as tags, and have it automatically create any categories that aren’t already on my site. I’ve already got a side coding project going, but I’m almost done so maybe I’ll take that up next.

XNA Game Studio

I was researching some Xbox stuff for a customer several months ago and got wind of this plan. I can’t wait to see it running. I recently picked up Frank Luna’s Intro to 3d Game Programming: A Shader Approach based on Dave’s recommendation. I figure most, if not all, the source code will be obsolete in the XNA Framework world, but the concepts are spot on so it’s been a good read.

One aspect of this announcement that I haven’t seen talked about yet is the impact on the mod community. Many games today ship with an SDK – here are examples for Dungeon Siege, Half-Life 2 and Doom 3. Of course, the idea is that modders get a popular game and industrial-strength game engine to build on for almost no cost and the game publisher expands the value of their game – any mods require the original game to play. Wouldn’t it be cool if you could mod Halo 3? And combined with Live Anywhere, the possibilities are enormous. I can’t wait to see how this evolves.

New Machine & Vista

For the first time in my nearly 8 year MSFT career, I have a desktop machine. And it’s a nice one – a Dell Precision 690 workstation. 2x dual Xeon CPU, 2x 160GB SCSI Hard Drives, dual link DVI outputs for driving twin widescreen monitors – dual is very big on this machine. Pretty much the only skimpy part of this machine is the RAM – only 2GB. But I’m not running x64 (yet) so that’s not a huge deal (yet).

Of course, such a screaming machine runs the latest Vista build. I’m also running it on my laptop – with Aero Glass even, thanks to this driver. The combo of latest Vista build + latest Office build is pretty sweet.

With new machines and new operating systems, I’ve been spending significant time installing. The Dell box turned out to be a real pain as it only has the SCSI drives which are not standard on the WinXP install disk. I’m dual booting XP/Vista on both machines, but I had to create a custom slipstreamed XP install disk to get my Dell workstation up and running (Vista installed without any extra work). But now I’ve got the baseline install imaged – thanks to BootIt NG which I’ve spoken highly of before – I shouldn’t ever have to do that again.