I’m interested in parsing because I’m interested in Domain Specific Languages. F# is pretty good for internal DSLs, but internal DSLs are obviously limited by the syntax of the host language. If you want complete control over the language, you’ve got to build your own parser.
The de facto standard for parser development is Yet Another Compiler Compiler, or yacc. There’s a version of yacc for .NET as well as one specifically for F#. However, I’m not a fan of yacc. Yacc parsers are specified using a context-free grammar (aka CFG). But CFGs can be ambiguous; in practice, it’s very hard to build an unambiguous CFG for a real language. Personally, I’m a big fan of Parsing Expression Grammars (or PEGs), which, among other advantages, make it impossible to develop ambiguous grammars: choice in a PEG is ordered, so the first alternative that matches always wins. Furthermore, PEGs don’t require a separate lexical analyzer like lex, so I think they’re more suitable for building modular compilers.
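To make the ambiguity point concrete, here is a minimal sketch of PEG-style ordered choice in F#. It is not code from the parser this series describes; the names `Parser`, `literal` and `orElse` are made up for illustration. It only shows that a prioritized choice can never produce two parses for the same input.

```fsharp
// Illustrative only: these names are assumptions, not the series' actual code.
// A parser takes the input and a position, and returns the parsed value plus
// the new position, or None if it fails.
type Parser<'a> = string -> int -> ('a * int) option

// Match a literal string at the current position.
let literal (s: string) : Parser<string> =
    fun input pos ->
        if pos + s.Length <= input.Length && input.Substring(pos, s.Length) = s
        then Some (s, pos + s.Length)
        else None

// PEG ordered choice (written "/" in PEG notation):
// try p1 first; only if it fails, backtrack to pos and try p2.
let orElse (p1: Parser<'a>) (p2: Parser<'a>) : Parser<'a> =
    fun input pos ->
        match p1 input pos with
        | Some result -> Some result
        | None -> p2 input pos

// A <- "ab" / "a"
let longFirst = orElse (literal "ab") (literal "a")
// A <- "a" / "ab"
let shortFirst = orElse (literal "a") (literal "ab")

printfn "%A" (longFirst "ab" 0)   // Some ("ab", 2)
printfn "%A" (shortFirst "ab" 0)  // Some ("a", 1)
```

Both rules accept the input "ab", but each one yields exactly one result, decided by the order of the alternatives. With an unordered CFG choice, the grammar writer has to prove that two alternatives never overlap; here the question doesn't arise, though the order you pick now matters.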
Since I like PEGs and F# so much, I developed a parser for the PEG grammar from the original PEG whitepaper using F#. The grammar is much simpler than a language like C#, but with twenty-nine grammar productions it’s certainly not trivial. The F# implementation is a fairly straightforward backtracking recursive descent parser, which makes it easy to understand even if you’re not a parser guru. It’s also small: around 400 lines of code including comments. But I think the code illustrates both the general value of Functional Programming as well as the specific value of F#. Here’s how the series is shaping up (though this is subject to change):
- The Parse Buffer
- Unit Testing
- Syntactical Productions (1)
- Active Patterns
- Syntactical Productions (2)
- Semantic Productions (1)
- The Abstract Syntax Tree
- Semantic Productions (2)
- Recursion and Predicate Functions
- Caching and Tracing
- C# Interop
I was originally planning to post the code for the parser itself with this post. However, I find that I’m revising the code as I write the articles in this series, so I’m going to hold off for now. If you’re really desperate, drop me a line and I’ll see what I can do.
Update: Almost forgot: if you’re going to follow along at home, I’m using the latest version of F#, v220.127.116.11. Note that the F# Downloads page on the MS Research site is woefully out of date, so go to the MS Research Downloads page instead, where it’s currently the most recent release. It snaps into VS 2005 and 2008 and also includes command line tools. If you’re a VS Express user, Douglas Stockwell explained how to roll your own F# Express.
Much Later Update: The code is now available on my Skydrive.