Morning Eggnog 132

  • My parents are coming into town tomorrow so I’m off for the remaining week or so of the year. Blogging will likely be non-existent, unless I blog something I come up with while geeking out with my dad.
  • Juergen van Gael demonstrates how to use TPL from F#. He wrote this once before using F#’s async workflows feature. I like the TPL version, though the new Action<int>(RowTask) is a little wordy. I’m guessing the eventual F# syntax will probably become something compact like action RowTask. (via Don Syme)
  • Andrew Peter ported RoR’s Haml view engine to ASP.NET MVC, calling the result NHaml. I haven’t played around with the new MVC stuff much, but I’m guessing ASP.NET’s control-based approach doesn’t work well when you separate out the controller code. If I’m manually authoring view templates, I’d much rather type NHaml’s syntax than the standard ASP.NET <% …%> syntax. On the other hand, there aren’t any design tools out there today handle the NHaml syntax. Also, I wonder if Andrew is working on a Sass port. (via DNK)

F# PEG Parser Next Steps

There are still a couple of posts to go in my Practical Parsing in F# series. But with Christmas and my parents on their way, I’m taking the rest of the year off.

I’ve stuck the code as it currently stands up on my SkyDrive. Conveniently enough, released their RC1 build yesterday, which includes supports for static test methods. I’ve included the RC1 build in the zip file on SkyDrive, as well as simple batch file so you can run the tests yourself.

Taking a break from this project will give me a good opportunity to figure out where to take it next. As the code stands, it’s not very useful – it simply builds a PEG AST from a PEG grammar. That’s just the first phase of a typical compiler. Without those other phases (you know, like “generate binary code”) this is just an interesting sample.

Since I’m in the “future pondering” phase, now’s the time to make your opinion known. What do you, dear reader, think I should do with this code? Bonus points for wanting to get involved.

Practical F# Parsing: Semantic Productions (2)

Now that I’ve explained the AST, there are several more semantic productions to go. I’m not going to describe them all in detail, just hit a few important highlights.

Many of the semantic productions return lists of other productions. Class returns a list of Ranges, Literal and Identifier returns lists of characters, etc. As you would expect, these multiples are encoded in the grammar. For example, here’s the implementation of Literal:

///Literal <- ['] (!['] Char)* ['] Spacing
///         / ["] (!["] Char)* ["] Spacing
let (|Literal|_|) input =

    let rec (|LitChars|_|) delimiter chars input =
        match input with
        | TOKEN delimiter (_) -> Some(L2S chars, input)
        | Char (c, input) ->  
            (|LitChars|_|) delimiter (chars @ [c]) input
        | _ -> None

    match input with
    | TOKEN "'"  (LitChars "'"  [] (lit, TOKEN "'"  (Spacing(input)))) ->  
        Some(lit, input)
    | TOKEN """ (LitChars """ [] (lit, TOKEN """ (Spacing(input)))) ->  
        Some(lit, input)
    | _ -> None

I’m using a local recursive function LitChars to retrieve the characters between the quote delimiters. The quote parameter – i.e. single or double quote – is passed in as a parameter. I also pass in an empty list of chars as a parameter. Remember that functional programs keep their data on the stack, a list parameter is a common way to keep state in a recursive function. When I match a single non-delimiter character, I add it to the list with the chars @ [c] expression. [c] converts a single value c into a list of one element while the @ operator concatenates to lists. I’m not sure adding the value to he end like that is a good idea perf wise. Programming Erlang recommends only adding items to the head then reversing the list when you’re done matching. But F# isn’t Erlang, so I’m not sure what the guidance is here.

Another thing you find in productions is the backtracking syntactic predicates. We saw an example of them in the implementation of Comment. Often, their used to indicate the end of a list of other productions, such as Literal, above. However, sometimes, they’re used to ensure the correct production is matched. For example, a Primary can be an Identifier, as long as it’s not followed by a left arrow. An identifier followed by a left arrow indicates a Definition.

///Primary <- Identifier !LEFTARROW
///         / OPEN Expression CLOSE
///         / Literal / Class / DOT
let rec (|Primary|_|) input =

    let (|NotLEFTARROW|_|) input =
        match input with
        | LEFTARROW (_) -> None
        | _ -> Some(input)

    match input with
    | Identifier (id, NotLEFTARROW (input)) ->  
        Some(Primary.Identifier(id), input)
    | OPEN ( Expression (exp, CLOSE (input))) ->
        Some(Primary.Expression(exp), input)
    | Literal (lit, input) -> Some(Primary.Literal(lit), input)
    | Class (cls, input) -> Some(Primary.Class(cls), input)
    | DOT (input) -> Some(Primary.Dot, input)
    | _ -> None

Here, I need a way to match the absence of LEFTARROW, so I’ve build a simple local function called NotLEFTARROW. This isn’t very clean IMO – I’d rather have a used a custom operator like !!! and &&& for my backtracking predicates. But I haven’t figured out how to use custom operators as Active Patterns. I was able to write a standard non-operator AP function, but then I have to use the full AP function name. Here’s a version of Primary written that way:

///Backtracking failure predicate
let (|NotPred|_|) f input =
    match f input with
    | Some (_) -> None
    | _ -> Some(input)

let rec (|Primary|_|) input =
    match input with
    | Identifier (id, NotPred (|LEFTARROW|_|) (input)) ->  
        Some(Primary.Identifier(id), input)
    //Other matches omited

Frankly, I don’t think that’s very readable, so I didn’t implement it that way. If I can figure out how to use custom operators and pass around AP functions without using their full ugly name, I’ll change it.

Finally, there are a few things about F#’s scoping rules that you need to understand. F# uses linear scoping, which is to say there’s no way to use a type or function that hasn’t been declared, sort of like C/C++. The difference is that while C/C++ have a way to declare a type or function separately from its implementation, F# has no such capacity. This becomes an issue when you have circular references. For example, Primary can be an Expression, which is a list of SequenceItems, each of which is a Primary with an optional prefix and suffix. In order to declare those in F#, you have to use a special “and” syntax to link the types/functions together.

//ToString and Exp2Str omitted for clarity
type Primary =
| Identifier of string
| Expression of Expression
| Literal of string
| Class of Range list
| Dot  

//ToString omitted for clarity
and SequenceItem =
        primaryItem: Primary;
        itemPrefix: Prefix option;
        itemSuffix: Suffix option;

and Sequence = SequenceItem list

and Expression = Sequence list

Likewise, the AP functions to recognize Primary, SequenceItem, Sequence and Expression are anded together. For me, this is one of the hardest things to get used to about F#. But as you can see from the expressiveness of the code, it’s well worth the trouble