Monadic Philosophy Part 2 – The LINQ Monad

If you don’t come from a math or philosophy background (and I don’t) “monad” sounds like a made-up word. Of course, understanding OO’s use of terms like “class” and “object” can be hard to grok at first too. But at least those terms have some grounding in real-world concepts that non-math geeks come across. Because I couldn’t draw an analogy of monads to anything at first, it made grasping the concept of monads very hard for me.

It’s such a unfamiliar word that the F# team doesn’t like it either:

“[W]hen the designers of F# talked with the designers of Haskell about this, they agreed that the word monad is a bit obscure and sounds a little daunting and that using other names might be wise.”
[F# Workflows and Haskell Monads, Expert F#, p232]

The F# team thought about calling them workflows, but settled on computation expression. Frankly, I don’t like these names much better. Workflow is too easily confused with WF and if the term computation expression is way to generic. Isn’t everything in programming a computation expression? I think I’ll just stick with monad.

Of course, if there was a short, pithy way of describing a monad, I’m sure that’s what we’d call them. It’s a kinda complicated idea, so there’s no simple two or three word phrase that accurately describes it. “Sequential computation with context flow” is the best I could come up with. It’s a crappy description, but here’s an elegant example that most .NET programmers are probably already familiar with.

var Orders = new List<Order>()

//code to populate orders omitted

var q = Orders
    .Where(x => x.OrderDate < DateTime.Now)
    .OrderBy(x => x.OrderDate)
    .Select(x => new {ID = x.OrderID, Date = x.OrderDate})

Yes it’s true: LINQ is a monad. The two basic concepts about a monad from my description above is that it’s a) a sequence of operations and b) there’s a context that flows from operation to operation. We see both here in this simple LINQ query. I realize I’m using what looks like a LINQ to SQL query here, but for the sake of argument let’s assume that this is all happening in memory.

The query is a sequence of three operations: Where, OrderBy and Select. LINQ has a set of standard query operators that you can mix and match in whatever order you need to. Part of the monad’s job is to enforce the sequence of actions. For C#, that’s not really a big deal, since it has explicit sequencing already. However, other languages like Haskell use lazy evaluation, meaning there is no explicit order of execution. Many lazy evaluation languages use monads in areas, such as I/O, where order of execution matters.

While C# doesn’t need any help to enforce execution order, monads are very useful in the way they flow context between the operations. In the case of LINQ, all the standard query operators take an IEnumerable<T> as their first parameter and return an IEnumerable<T>. Since they have the same inputs and outputs, they can be plugged together in whatever order is required. Yet, you don’t see any reference to GetEnumerator or the enumerator objects they return in the LINQ code above. All that code is hidden inside the LINQ query operators so the LINQ developer doesn’t have to look at it.

If you squint hard enough, IEnumerable kinda looks like a functional construct. It exposes a single method (GetEnumerator) and can be passed around much the same way functional languages like F# pass around first-order functions. Furthermore, the result of calling GetEnumerator is an IEnumerator object that likewise exposes one main function (MoveNext). In other words, you can think of IEnumerable sort of like a function that returns a function that you call to iterate the collection.

So to sum up, a monad is a sequence of operations in a specific order that automatically flows context from one operation to the next. In the LINQ example, C# has built-in constructs – IEnumerable<T>, foreach and yield return – that makes the monad seem less foreign (which is why I used it as my first example!) However, as we’ll see, the concepts of sequence and context flow in a monad still hold even if we’re not using built in features of C# to implement them.

Monadic Philosophy

(Since I accidentally published part one of this series a few minutes ago, I figured I might as well start publishing the series.)

If you start learning functional programming, eventually you’ll come across the idea of a monad. Coming from the object/imperative world of languages like C#, I’ve had a hard time wrapping my head around this concept. There’s no shortage of monad tutorials out there, but most use Haskell’s IO as the prototypical example of a monad. Given that I don’t know Haskell very well, I found it hard to separate the Haskell stuff from monad stuff. So I set monads on the back burner and decided not to worry about them.

However, all that changed when Stephan Tolksdorf alerted me to his very cool monadic parser combinator library FParsec. I found the FParsec parsers much easier to read my F# parser efforts, so I became very interested in monadic parser combinators. As you might guess, a “monadic parser combinator library” makes heavy use of monads. Time to switch burners.

The problem with learning monads with FParsec is that it’s really designed for production use. I needed to break monads down to first principles, so I rolled my own monadic parser library. Make no mistake, if I were looking to build a production parser in F# right now, I’d use with FParsec. My monadic parser library might “get there” eventually, but right now it’s a toy.

Over a series of posts, I’m going to describe what I know about monads. I didn’t set out to write a tutorial on monads – as I said, there are plenty of them out there. However, I found most of the the many monad tutorials I read lacking because the did a good job explaining the “how”, but not such a good job on the “why”. Coming from an imperative world, I wanted to understand the philosophy better. That being said, there’s a lot of tutorial in and around the philosophy. Hopefully, you’ll find both useful.

Programming Languages @ PDC08

The PDC folks pushed out a bunch of new sessions yesterday, including a bunch from my part of DevDiv. You can see the full list of sessions that have been published (so far) at the PDC site.

An Introduction to F#
Learn about Microsoft’s new language, F#, a typed functional programming language for the .NET Framework. F# combines functional programming with the runtime support, libraries, tools, and object model of .Net. Understand how F# asynchronous workflows help tame the complexity of parallel and asynchronous I/O programming and how to use F# in conjunction with tools such as Parallel Extensions for .NET.

Deep Dive: Dynamic Languages in .NET
The CLR has great support for dynamic languages like IronPython. Learn how the new Dynamic Language Runtime (DLR) adds a shared dynamic type system, a standard hosting model, and support for generating fast dynamic code. Hear how these features enable languages that use the DLR to share code with other dynamic and static languages like VB.NET and C#.

Future Directions for Visual Basic
Come learn about the new capabilities in the next version of the language, including: extensions to LINQ, syntax simplifications, and improvements to the IDE. We’ll provide insight into the direction of the language, including dynamic binding, meta-programming, and scripting.

The Future of C#
In this talk Microsoft Technical fellow and C# Chief Architect Anders Hejlsberg outlines the future of C#. He will describe the many forces that influence and shape the future of programming languages and explain how they fit into C#.

Visual C++: 10 is the New 6
Get more done. The next version of Visual C++ is all about improving developer productivity for large-scale applications. Learn about the IntelliSense and browsing experiences, changes to the project and build system, project-less browsing, collaboration through remote symbol indexing, and custom visualization of symbolic information.

Morning Coffee 144

  • I finished Mass Effect last night. I definitely need to play thru that one again, though I’ll probably wait until the new Bring Down the Sky DLC ships next month.
  • Caps won again last night, improving to 20-10-4 since changing coaches at Thanksgiving. They’re now at 57 points, taking the lead in the SE division with a full game on Carolina, Atlanta and Florida. Still a ways to go – 27 games left in the regular season – and things are far from “sewn up” but we’re a damn sight better off than we were in November.
  • Speaking of a horserace, looks like Clinton and Obama are in one after Super Tuesday. Their estimated delegate counts are basically tied. On the other side of the aisle, McCain opened up what is probably insurmountable lead – even though he has the right-wing media stars and Christian leaders against him. Money quote of the day:

“The real story of the night, when you look at their rallies and their turn-out numbers, is that the Dems have two strong candidates either of whom could lead a united party to victory. Forget the gaseous platitudes: in Dem terms, their choice on Super Duper Tuesday was deciding which candidate was Super Duper and which was merely Super. Over on the GOP side, it was a choice between Weak & Divisive or Weaker & Unacceptable. Doesn’t bode well for November.”\

  • Charlie Calvert is starting a new series on the future of C#. First up: Dynamic Lookup. Probably most interesting is the news that the DLR “will be the infrastructure on which the C# team implements dynamic lookup”. Does this mean C# will target the DLR? Sure sounds like it. I think it’s a good addition, but I’m not a fan of the proposed syntax. (via Bitter Coder)
  • Brian McNamara saw me present @ LangNET and sent me a link to his blog. He’s building up a monadic parser combinator library in C# 3.0. This is basically the same concept that FParsec implements, though C#’s syntax is much less attractive than F#’s for this kind of code. However, Brian does a very good job explaining why monadic parser combinators are useful and making the idea accessible to the C# programmer (i.e. you don’t have to learn F# or Haskell to understand what he’s talking about). He also points to Luke Hoban’s C# 3.0 monadic parser implementation.

Another InitImportantThing Approach

I thought of another approach to the InitImportantThing problem that I blogged about yesterday. I think it’s a bit harder to code, but it’s certainly explicit and avoids the magic method that Jon dislikes so much.

The crux of the problem is that ServiceHostBase needs a valid ServiceDescription in order to operate. The WCF team chose to provide said description to ServiceHostBase via the abstract CreateDescription method. But as we saw, ServiceHostBase can’t call CreateDescription from it’s own constructor. So instead, derived classes are forced to call InitializeDescription in their own constructor. Since that call isn’t enforced by the compiler, it’s easy to forget to include it. Since the exception that gets thrown doesn’t really tell you what went wrong, it’s easy to spend hours trying to figure it out.

So here’s a better approach: since the ServiceHostBase needs a valid ServiceDescription in order to operate, why not pass it in as a constructor parameter?

ServiceHostBase has a protected constructor with no parameters. But since it needs you to call InitializeDescription in your derived class constructor, it really needs the ServiceDescription, a collection of ContractDescriptions (also returned from CreateDescription) and a collection of base addresses (passed into InitalizeDescription). If these were parameters on ServiceHostBase’s constructor, it could validate that information directly, without needing abstract or magic methods.

The one problem with this approach is that the creation of a ServiceDescription is non-trivial. ServiceHost’s implementation of CreateDescription generates the ServiceDescription by reflecting over the service type. You still need that code, but now you would call it from the base constructor initializer instead. That means it has to be a static method, but otherwise it would work just fine. Here’s yesterday’s code, updated for this approach:

public abstract class Base
{
    public Base(string importantThing)
    {
        if (string.IsNullOrEmpty(importantThing))
            throw new Exception();

        _importantThing = importantThing;

    }

    private string _importantThing;

    public string ImportantThing  
    {  
        get { return _importantThing; }  
    }
}

public class Derived : Base
{
    private object _data;

    public Derived(DateTime dt) : base(CreateImportantThing(dt))
    {
        _data = dt;
    }

    private static string CreateImportantThing(DateTime dt)
    {
        //this is obviously trivial, but could be much
        //more complicated if need be
        return dt.ToLongDateString();
    }
}

This seems like the best approach to me. You remove the un-obvious magic method call requirement when deriving your own service host while still enforcing the data consistency check in the base class during construction. Best of both worlds, right?

So I wonder why the WCF team didn’t do it this way?