- In writers strike news, the WGA has made side deals with Worldwide Pants (aka Dave Letterman’s company), United Artists (aka Tom Cruise’s company) and The Weinstein Company (previously known as Miramax). The WGA strategy of divide and conquer seems to be making slow progress. Update: The Weinstein Company was founded by Miramax’s founders Harvey and Bob Weinstein after they left Miramax. But Miramax is still around. Thanks to GrantC for the correction.
- They’re still two games under .500, but the Caps completed a season sweep of the Eastern Conference-leading Ottawa Senators last night. They’re only 3 games out of the top spot in the (admittedly very weak) Southeast division.
- Big tech news today isn’t coming from MSFT-land. Sun is buying MySQL and Oracle is (finally) buying BEA. Both deals seem like pretty significant culture clashes, though Sun/MySQL seems like the better fit of the two.
- There’s a new draft of Service Modeling Language 1.1 available. If you’ll recall, this used to be called the System Definition Model, part of the Dynamic Systems Initiative. Hadn’t heard anything from those folks in a while, good to see they’re making progress.
- Stephan Tolksdorf dropped me a line to tell me he was able to “vastly simplify” FParsec, and as a result it now runs on the current version of F#. Awesome!
- Speaking of F#, Scott Hanselman has a new F# podcast, this time interviewing Dustin Campbell. Check out all of Dustin’s F# posts.
- I didn’t know about the “Copy as Path” feature in Vista. Why is it hidden?
- I was a big fan of the WDS deskbar shortcut feature – a feature that is missing in Vista. Enter Start++ by Brandon Paddock, which adds shortcuts to Vista’s search box. It also supports “iPhone apps” and scripting. But JScript? Where’s the PowerShell love, Brandon?
- EA released the source code to the original SimCity under the GPL. Bil Simser is digging into the code and it looks like he’s going to port it to XNA. (via Ozymandias)
- Wes Haggard has published the source code to CodeHTMLer on CodePlex. He took two updates from me: the F# language definition as well as the ability to choose the font when not using PRE tags.
Morning Coffee 138
Morning Coffee 135
- Bill Gates does his last CES Keynote, and we announce a PC that looks like a purse?
- News that Warner Brothers is going exclusively Blu-Ray is disappointing. However, I’m convinced that neither side will win this format war but that online downloads will trump both. Obviously, XBLM is a significant player in this space, but the market is crowding up quickly. Netflix apparently will unveil a new set-top box @ CES to let you watch HD movies via the Internet.
- Don Syme has a roundup of posts by John Liao about F#. Mostly, WPF + F# with a couple of ASP.NET 2.0 posts and one on XML.
- Speaking of F#, Stephan Tolksdorf has been working on an F# port of MS Research’s Parsec library called FParsec. Parsec is a “monadic parser combinator library”, something I have little experience with, so I’ve gone back to some source research on the topic, which I hope to blog at length about soon.
- Steve Vinoski talks about serendipitous reuse in his latest Internet Computing article. I’m not a believer in reuse in the enterprise, serendipitous or otherwise, but I liked the conclusion to Steve’s article when he wrote “It’s highly ironic that many enterprise architects seek to impose centralized control over their distributed organizations. In many cases, such centralization is a sure recipe for failure.” Also, his point that “control without controlling” works sounds vaguely familiar.
- Update: This is really Morning Coffee 136, but I don’t want to change the title since it’s part of the URL.
Morning Eggnog 132
- My parents are coming into town tomorrow so I’m off for the remaining week or so of the year. Blogging will likely be non-existent, unless I blog something I come up with while geeking out with my dad.
- Juergen van Gael demonstrates how to use TPL from F#. He wrote this once before using F#’s async workflows feature. I like the TPL version, though the new Action<int>(RowTask) is a little wordy. I’m guessing the eventual F# syntax will probably become something compact like action RowTask. (via Don Syme)
- Andrew Peter ported RoR’s Haml view engine to ASP.NET MVC, calling the result NHaml. I haven’t played around with the new MVC stuff much, but I’m guessing ASP.NET’s control-based approach doesn’t work well when you separate out the controller code. If I’m manually authoring view templates, I’d much rather type NHaml’s syntax than the standard ASP.NET <% …%> syntax. On the other hand, there aren’t any design tools out there today that handle the NHaml syntax. Also, I wonder if Andrew is working on a Sass port. (via DNK)
F# PEG Parser Next Steps
There are still a couple of posts to go in my Practical Parsing in F# series. But with Christmas and my parents on their way, I’m taking the rest of the year off.
I’ve stuck the code as it currently stands up on my SkyDrive. Conveniently enough, xUnit.net released their RC1 build yesterday, which includes support for static test methods. I’ve included the RC1 build in the zip file on SkyDrive, as well as a simple batch file so you can run the tests yourself.
Taking a break from this project will give me a good opportunity to figure out where to take it next. As the code stands, it’s not very useful – it simply builds a PEG AST from a PEG grammar. That’s just the first phase of a typical compiler. Without those other phases (you know, like “generate binary code”) this is just an interesting sample.
Since I’m in the “future pondering” phase, now’s the time to make your opinion known. What do you, dear reader, think I should do with this code? Bonus points for wanting to get involved.
Practical F# Parsing: Semantic Productions (2)
Now that I’ve explained the AST, there are several more semantic productions to go. I’m not going to describe them all in detail, just hit a few important highlights.
Many of the semantic productions return lists of other productions. Class returns a list of Ranges, Literal and Identifier return lists of characters, etc. As you would expect, these multiples are encoded in the grammar. For example, here’s the implementation of Literal:
    ///Literal <- ['] (!['] Char)* ['] Spacing
    ///         / ["] (!["] Char)* ["] Spacing
    let (|Literal|_|) input =
        let rec (|LitChars|_|) delimiter chars input =
            match input with
            | TOKEN delimiter (_) -> Some(L2S chars, input)
            | Char (c, input) -> (|LitChars|_|) delimiter (chars @ [c]) input
            | _ -> None
        match input with
        | TOKEN "'" (LitChars "'" [] (lit, TOKEN "'" (Spacing(input)))) ->
            Some(lit, input)
        | TOKEN "\"" (LitChars "\"" [] (lit, TOKEN "\"" (Spacing(input)))) ->
            Some(lit, input)
        | _ -> None
I’m using a local recursive function LitChars to retrieve the characters between the quote delimiters. The delimiter (single or double quote) is passed in as a parameter, along with an initially empty list of chars. Remember that functional programs keep their data on the stack, so a list parameter is a common way to keep state in a recursive function. When I match a single non-delimiter character, I add it to the list with the chars @ [c] expression. [c] converts a single value c into a list of one element, while the @ operator concatenates two lists. I’m not sure adding the value to the end like that is a good idea perf-wise. Programming Erlang recommends only adding items to the head, then reversing the list when you’re done matching. But F# isn’t Erlang, so I’m not sure what the guidance is here.
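For comparison, here is a quick sketch of the two accumulation styles. F# lists are immutable singly linked lists, so @ has to copy its left operand on every call, while :: just adds a new head cell. The function names collectAppend and collectPrepend are made up for illustration and are not part of the parser code.

```fsharp
// Hypothetical helpers, not from the parser: both copy a char list
// one element at a time so the two accumulation styles can be compared.

// Append style: acc @ [c] copies all of acc on every step.
let collectAppend (items: char list) =
    let rec loop acc rest =
        match rest with
        | [] -> acc
        | c :: tail -> loop (acc @ [c]) tail
    loop [] items

// Prepend style: c :: acc is constant time, with one reversal at the end.
let collectPrepend (items: char list) =
    let rec loop acc rest =
        match rest with
        | [] -> List.rev acc
        | c :: tail -> loop (c :: acc) tail
    loop [] items
```

Both return the characters in their original order. For short literals the difference is unlikely to matter, but the append version does work proportional to the square of the literal length.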
Another thing you find in productions is backtracking syntactic predicates. We saw an example of them in the implementation of Comment. Often, they’re used to indicate the end of a list of other productions, such as Literal, above. However, sometimes they’re used to ensure the correct production is matched. For example, a Primary can be an Identifier, as long as it’s not followed by a left arrow. An identifier followed by a left arrow indicates a Definition.
    ///Primary <- Identifier !LEFTARROW
    ///         / OPEN Expression CLOSE
    ///         / Literal / Class / DOT
    let rec (|Primary|_|) input =
        let (|NotLEFTARROW|_|) input =
            match input with
            | LEFTARROW (_) -> None
            | _ -> Some(input)
        match input with
        | Identifier (id, NotLEFTARROW (input)) -> Some(Primary.Identifier(id), input)
        | OPEN ( Expression (exp, CLOSE (input))) -> Some(Primary.Expression(exp), input)
        | Literal (lit, input) -> Some(Primary.Literal(lit), input)
        | Class (cls, input) -> Some(Primary.Class(cls), input)
        | DOT (input) -> Some(Primary.Dot, input)
        | _ -> None
Here, I need a way to match the absence of LEFTARROW, so I’ve built a simple local function called NotLEFTARROW. This isn’t very clean IMO – I’d rather have used custom operators like !!! and &&& for my backtracking predicates. But I haven’t figured out how to use custom operators as Active Patterns. I was able to write a standard non-operator AP function, but then I have to use the full AP function name. Here’s a version of Primary written that way:
    ///Backtracking failure predicate
    let (|NotPred|_|) f input =
        match f input with
        | Some (_) -> None
        | _ -> Some(input)

    let rec (|Primary|_|) input =
        match input with
        | Identifier (id, NotPred (|LEFTARROW|_|) (input)) ->
            Some(Primary.Identifier(id), input)
        //Other matches omitted
Frankly, I don’t think that’s very readable, so I didn’t implement it that way. If I can figure out how to use custom operators and pass around AP functions without using their full ugly name, I’ll change it.
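One possible workaround for the ugly name, offered only as a sketch: an AP function is an ordinary function value, so it can be bound to a plain identifier and that identifier passed to NotPred. The TOKEN and LEFTARROW below are simplified stand-ins that work on char lists so the example runs on its own. They are assumptions for illustration, not the versions from this series, and leftArrow and describe are hypothetical names.

```fsharp
// Simplified stand-in: succeeds when input starts with token, returning the rest.
let (|TOKEN|_|) (token: string) (input: char list) =
    let rec strip expected remaining =
        match expected, remaining with
        | [], rest -> Some rest
        | e :: es, r :: rs when e = r -> strip es rs
        | _ -> None
    strip (List.ofSeq token) input

// Simplified stand-in for the LEFTARROW production.
let (|LEFTARROW|_|) input =
    match input with
    | TOKEN "<-" rest -> Some rest
    | _ -> None

// Backtracking failure predicate: succeeds, consuming nothing, when f fails.
let (|NotPred|_|) f input =
    match f input with
    | Some _ -> None
    | None -> Some input

// Bind the AP function to a plain name once...
let leftArrow = (|LEFTARROW|_|)

// ...then use the short name as the argument to NotPred.
let describe (input: char list) =
    match input with
    | NotPred leftArrow _ -> "no arrow ahead"
    | _ -> "arrow ahead"
```

This doesn’t give operator syntax like !!!, but it does keep the banana-clip name out of the match expression.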
Finally, there are a few things about F#’s scoping rules that you need to understand. F# uses linear scoping, which is to say there’s no way to use a type or function that hasn’t been declared, sort of like C/C++. The difference is that while C/C++ have a way to declare a type or function separately from its implementation, F# has no such capacity. This becomes an issue when you have circular references. For example, Primary can be an Expression, which is a list of SequenceItems, each of which is a Primary with an optional prefix and suffix. In order to declare those in F#, you have to use a special “and” syntax to link the types/functions together.
    //ToString and Exp2Str omitted for clarity
    type Primary =
    | Identifier of string
    | Expression of Expression
    | Literal of string
    | Class of Range list
    | Dot
    //ToString omitted for clarity
    and SequenceItem =
        { primaryItem: Primary;
          itemPrefix: Prefix option;
          itemSuffix: Suffix option; }
    and Sequence = SequenceItem list
    and Expression = Sequence list
Likewise, the AP functions to recognize Primary, SequenceItem, Sequence and Expression are anded together. For me, this is one of the hardest things to get used to about F#. But as you can see from the expressiveness of the code, it’s well worth the trouble.
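To see the “and” syntax in isolation, here is a small self-contained sketch with both mutually recursive types and mutually recursive functions. Expr, Term, evalExpr and evalTerm are toy names invented for this example and have nothing to do with the PEG AST.

```fsharp
// Toy types: Expr refers to Term before Term is declared,
// so the two declarations are linked with "and".
type Expr =
    | Num of int
    | Group of Term list
and Term =
    { factor: Expr; negate: bool }

// The functions are linked the same way: "let rec ... and ...".
let rec evalExpr (expr: Expr) : int =
    match expr with
    | Num n -> n
    | Group terms -> terms |> List.sumBy evalTerm
and evalTerm (term: Term) : int =
    let value = evalExpr term.factor
    if term.negate then -value else value
```

Without the “and”, evalExpr could not call evalTerm, because evalTerm would not yet be in scope at that point in the file.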