Functional Understanding

I was showing some of my cool (well, I think it’s cool) F# parsing code to some folks @ DevTeach. I realized very quickly that a) most mainstream developers are fairly unaware of functional programming and b) I suck at explaining why functional programming matters. So I decided to take another stab at it. I probably should have posted this before my recent series on F#, but better late than never I suppose.

Right off the bat, the term “functional” is confusing. When you say “function” to a mainstream developer, they hear “subroutine”. But when you say “function” to a mathematician, they hear “calculation”. Functions in functional programming (aka FP) are closer to the mathematical concept. If you think about math functions, they’re very different from subroutines. In particular, math functions have no intrinsic mutable data. If you have a math function like f(x) = x^3, f(7) always equals 343, no matter how many times you call it. This is very different from a member like String.Length, where the value returned depends on the value of the string.

Another interesting aspect of math-style functions is that they have no side-effects. When you call StringBuilder.Append(), you’re changing the internal state of the StringBuilder object. But FP functions don’t work like that. Providing the same input always provides the same output (i.e. the same independent variable always yields the same dependent value).
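
Here’s a minimal sketch of the difference in F#. The names (cube, sb and friends) are made up for illustration; only StringBuilder comes from the BCL.

```fsharp
open System.Text

// A math-style function: the result depends only on the argument.
let cube (x: int) = x * x * x

// The same input always yields the same output.
let first = cube 7
let second = cube 7

// By contrast, StringBuilder.Append mutates the object it is called on,
// so repeating the same call leaves the builder in a different state.
let sb = StringBuilder()
sb.Append("ab") |> ignore
let lengthAfterOne = sb.Length
sb.Append("ab") |> ignore
let lengthAfterTwo = sb.Length
```

Both first and second are 343, while the builder’s length goes from 2 to 4 even though the Append call is identical.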

If you’re a .NET developer, this may sound strange, but you’re probably very familiar with the String class, which works exactly the same way.

A String object is a sequential collection of System.Char objects that represent a string. The value of the String object is the content of the sequential collection, and that value is immutable.

A String object is called immutable (read-only) because its value cannot be modified once it has been created. Methods that appear to modify a String object actually return a new String object that contains the modification.

In other words, all variables in FP are a lot like .NET Strings. In fact, in many FP languages, variables are actually called “values” because they don’t, in fact, vary.
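
A small sketch of what that looks like in practice, using made-up names (original, shouted):

```fsharp
let original = "functional"

// ToUpper does not change `original`; it returns a new string.
let shouted = original.ToUpper()

// `let` binds a name to a value. Uncommenting the next line is a compile
// error, because `original` was not declared with `mutable`.
// original <- "imperative"
```

After this runs, original is still "functional" and shouted is a separate string.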

It turns out that this approach to programming has significant upside for unit testing and concurrency. Unit tests typically spend significant effort getting the objects they’re testing into the right state to invoke the function under test. In FP, the result of a function is purely dependent on the values passed into it, which makes unit testing very straightforward. For concurrency, since functions don’t share mutable state, there’s no need to do complicated locking across multiple processors.
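
To make both points concrete, here’s a rough sketch. The square function and the input array are invented for the example; Array.Parallel.map is part of the F# core library.

```fsharp
// A pure function: no setup is needed before calling it in a test.
let square (x: int) = x * x

// The whole "test" is a comparison of actual output to expected output.
let testPassed = (square 12 = 144)

// Because `square` touches no shared mutable state, it can be mapped
// over an array in parallel without any locks.
let inputs = [| 1 .. 1000 |]
let serialResult = Array.map square inputs
let parallelResult = Array.Parallel.map square inputs
```

The serial and parallel runs produce the same array, and nothing in square had to change to make that safe.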

But if values don’t vary, how do we manage application state? FP apps typically maintain their state on the stack. For example, my F# parser starts with a string input and returns an abstract syntax tree. All the data is passed between functions on the stack. However, for most user-oriented non-console applications, keeping all state on the stack isn’t realistic. As Simon Peyton Jones points out, “The ultimate purpose of running a program is invariably to cause some side effect: a changed file, some new pixels on the screen, a message sent, or whatever.” So all FP languages provide some mechanism for purposefully implementing side effects, some (like Haskell) stricter in their syntax than others.
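
As a sketch of state living in function arguments instead of in a mutable variable, here’s a hypothetical word counter (countWords and loop are illustrative names, not part of the parser):

```fsharp
// Counts the words in a string. The running state (the count so far and
// whether we are inside a word) lives only in the arguments of `loop`.
let countWords (text: string) =
    let rec loop (chars: char list) (inWord: bool) (count: int) =
        match chars with
        | [] -> count
        | c :: rest when System.Char.IsWhiteSpace c -> loop rest false count
        | _ :: rest when inWord -> loop rest true count
        | _ :: rest -> loop rest true (count + 1)
    loop (List.ofSeq text) false 0
```

Each recursive call gets the new state as parameters, so nothing is ever updated in place.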

One of the nice things about F#’s multi-paradigm nature is that side effects are a breeze. Of course, that’s both a blessing and a curse, since much of the aforementioned upside comes from purposefully building side-effect free functions. But the more I work with F#, the more I appreciate the ability to do both functional as well as imperative object-oriented operations in the same language. For example, my parsing code so far is purely functional – it takes in a string to be parsed and returns an AST. But the logical next step would be to generate output based on that AST. Since F# supports non-functional code – not to mention the rich Base Class Library – generating output should be straightforward.
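
A minimal sketch of that mix, assuming a tiny made-up expression type. Expr, eval and render are hypothetical and are not the parser’s actual AST or output code.

```fsharp
open System.Text

// Hypothetical AST, for illustration only.
type Expr =
    | Number of int
    | Add of Expr * Expr

// Purely functional: evaluates the tree without touching outside state.
let rec eval (expr: Expr) =
    match expr with
    | Number n -> n
    | Add (l, r) -> eval l + eval r

// Imperative: builds the output by appending to a mutable StringBuilder.
let render (expr: Expr) =
    let sb = StringBuilder()
    let rec write e =
        match e with
        | Number n -> sb.Append(string n) |> ignore
        | Add (l, r) ->
            sb.Append("(") |> ignore
            write l
            sb.Append(" + ") |> ignore
            write r
            sb.Append(")") |> ignore
    write expr
    sb.ToString()
```

Note that the mutation stays local to render: callers still see a function that takes a tree and returns a string.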

Studio Busting

A week ago, I wrote that the ongoing writers’ strike might accelerate the transition to Media 2.0. Several other folks think the same way and explain why much better than I have. Marc Andreessen (aka co-creator of Mosaic) has a fantastic post that not only lays out this transition, it also helped me understand my views on unions in general.

In the post, he describes two economic models – the Hollywood model and the Silicon Valley model. The Hollywood model is highly centralized, with a small number of huge companies (aka “big media”) owning practically everything. In contrast, the Silicon Valley model is highly decentralized, where pretty much anyone can create a company or bring a product to market. Marc believes that the entertainment industry at large is transitioning to the decentralized model. I agree 110% – the general decentralization trend is one I highlight in my “Moving Beyond Industrial Software” presentation that I’ve been delivering recently.

Unions are a response to the dramatic power differential between an employer and individual employees. By pooling (aka centralizing) their bargaining power, the union provides a counter-balance to the power wielded by the employer(s). But in a decentralized model, unions aren’t really necessary. Marc describes the “alignment of interests between creators and financiers” as “near-perfect”. Near-perfect might be a bit on the rosy side, but it’s a model I’m much more comfortable with than mega-corporations & unions.

Some believe that the AMPTP (aka the studios) is trying to break the entertainment unions. But what if those unions decided to break the studios? I gotta think that while there are lots of quality writers out there, the best in the business are members of the Writers Guild. What if they just decided to stop writing for the studios and go into business for themselves? Patrick Goldstein of the LA Times wonders the exact same thing.

Morning Coffee 127

F# Hawkeye : Assorted Not-So-Goodness

(Harry is @ DevTeach in Vancouver with his family this week. He was hoping to still do Morning Coffee posts, but that’s turned out to be infeasible. So instead, you get a series of pre-written posts about F#.)

It’s not all puppy dogs and ice cream with F#. Here are a few things that I didn’t like about the language.

Linear Scoping

In C#, a given piece of code is able to call any function it wants to (limited by CAS and visibility of course). If I define two functions, the first can call the second even though the compiler has never seen the second function when it’s parsing the first.

F# has linear scoping like C++ does. You can’t call functions that haven’t already been defined in the file (or a previous file that’s already been fed to the compiler). This makes writing mutually recursive functions (A calls B, B calls C, C calls A) fairly annoying. Typically, in F# we declare functions with “let”. But in the Additive function above, we’re declaring the function with “and”. By using “and”, we’re basically chaining together function declarations into a logical unit. Then, you mark the first function declaration in the chain as recursive with “rec”, and now those functions are enabled for mutual recursion. Not exactly intuitive.
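
For anyone who hasn’t seen the syntax, here’s a minimal sketch of a “let rec … and” chain. The isEven and isOdd functions are a stock illustration, not the parser code, and they assume a non-negative argument.

```fsharp
// `rec` on the first declaration plus `and` on the rest lets each function
// see the others, even though isOdd is defined after isEven uses it.
// Both functions assume n >= 0.
let rec isEven (n: int) =
    if n = 0 then true else isOdd (n - 1)
and isOdd (n: int) =
    if n = 0 then false else isEven (n - 1)
```

Swap the “and” for a second “let” and the compiler rejects the call to isOdd inside isEven, because isOdd isn’t in scope yet.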

Frankly, I like C#’s ability to bind to methods that haven’t been declared yet. I wonder if this is an intrinsic issue with FP or F# scoping rules, or if it’s something they could fix if they took the time.

No Method Overloading

In my CheckForToken method, I use a string type to hold the token I’m looking for, since tokens can often be multi-character. However, for one-character tokens, this is overkill. Not just in terms of creating a string object to contain just one character, but also in how I pattern match the token against the input string. If we’re only looking for one character, we can skip recursion entirely. Yet there’s no way to define two functions called Token that have different signatures.

Given type inference, this isn’t surprising, but it’s still a little annoying for folks coming from C# land.
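
Two possible workarounds, sketched with hypothetical names (checkForChar, checkForString and the Token type are mine for this example, and they take a plain string as input rather than whatever the parser really uses). The restriction applies to let-bound functions; methods on a type can still be overloaded.

```fsharp
// Let-bound functions cannot be overloaded: a second `let` with the same
// name does not create an overload. One workaround is distinct names.
let checkForChar (token: char) (input: string) =
    input.Length > 0 && input.[0] = token

let checkForString (token: string) (input: string) =
    input.StartsWith(token, System.StringComparison.Ordinal)

// Methods on a type, however, can be overloaded the way C# methods can.
type Token =
    static member Check(token: char, input: string) = checkForChar token input
    static member Check(token: string, input: string) = checkForString token input
```

With the type in place, Token.Check('+', "+1") and Token.Check("let", "let x") resolve to different overloads based on the argument type.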

Limited VS support

More evidence of a rotted brain. F#’s integration into VS is limited at best. It does syntax highlighting and debugging, but that’s about it. The problem I keep running into is that I have two projects – my main project and my test project. Even though I’ve defined the test project as being dependent on the main project, it doesn’t automatically compile the test project when I change the main project. So I keep doing something like fix a bug, recompile, then run NUnit to see if the light turned green. It doesn’t, because I haven’t rebuilt the test project and it’s still referencing the old version of the main project. Now that it’s a “real” product, I’m hoping to see better integration into VS08. Maybe even an F# Express that leverages the new VS08 shell?

F# Hawkeye : Assorted Goodness

Significant Whitespace

If you’re a Python programmer, you already know this one. Instead of delineating code blocks explicitly with curly braces or begin/end keywords, F# uses whitespace. Code blocks are indented relative to their parent. This enforces readability standards as well as conciseness. You can see that in the Additive function above. Technically, this is optional in F# (you turn it on by specifying #light), but pretty much all the docs and books assume it by default.
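
A tiny sketch of indentation-delimited blocks, using a made-up classify function:

```fsharp
// The body of `classify` is everything indented beneath it, and each branch
// of the `if` is delimited by indentation alone: no braces, no begin/end.
let classify (n: int) =
    if n < 0 then
        "negative"
    elif n = 0 then
        "zero"
    else
        "positive"
```

Dedenting back to column zero is what ends the function, much like Python.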

Custom Operators

This is minor, but cool nonetheless. Many languages let you overload existing operators like + and *. However, F# goes a step further and also lets you create custom operators. You just pick a combination of symbols that isn’t already being used and define a function for it. For example, in my parsing code I wanted a simple way to adorn my input parse strings in my tests so that I could later easily change their type if I changed the type of NextChar and CheckForToken as described above. I defined the “double bang” operator !!. Currently, double bang converts a string into a character list, but originally it simply returned the string since I had written my Char and Token classes in terms of string.
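
Roughly, the current version of such an operator could look like the sketch below. This is a reconstruction from the description, not the exact code from the parser, and the input name is made up.

```fsharp
// Prefix operator that turns a string into a list of characters.
let (!!) (s: string) : char list = List.ofSeq s

// If the parser's input type changes, only the definition above has to
// change; tests written as `!! "1+2"` stay the same.
let input = !! "1+2"
```

Here input ends up as the three-element list of '1', '+' and '2'.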