Is Serendipity the Heart of the WS-*/REST Debate?

Thanks to Technorati, I found this post by John Heintz. He’s checking out John Evdemon’s e-book on SOA and has a problem with this overview:

SOA is an architectural approach to creating systems built from autonomous services. With SOA, integration becomes forethought rather than afterthought. This book introduces a set of architectural capabilities, and explores them in subsequent chapters.

To which John H. responds:

I, for one, would rather build on an architecture that promotes integration as an afterthought, so I don’t have to think about it before hand!!!

Yeah, I’d rather not have to think about integration beforehand either. On the other hand, I want integration that actually works. It sounds like John H. is suggesting here that REST somehow eliminates the need to consider integration up front. It doesn’t. Consider this: if you’re building a Web 2.0 site, then you are expected to expose everything in your site via APP, RSS and/or RESTful POX services. In other words, the Web 2.0 community expects you to have the forethought to enable integration. If you don’t, Marc Canter will call you out in front of Bill Gates and Tim O’Reilly.

This integration-by-afterthought approach seems to be big among RESTifarians. John H. links to a REST discussion post by Nick Gall advocating the principle of generality, “unexpected reuse” and “design for serendipity”. Money quote:

The Internet and the Web are paradigms of Serendipity-Oriented Architectures. Why? Largely because of their simple generality. It is my belief that generality is one of the major enablers of serendipity. So here I immodestly offer Gall’s General Principle of Serendipity: “Just as generality of knowledge is the key to serendipitous discovery, generality of purpose is the key to serendipitous (re)use.”

Serendipity means “the accidental discovery of something pleasant, valuable, or useful”. “Serendipitous reuse” sounds an awful lot like accidental reuse. Most enterprises have been there, done that and have nothing to show for their efforts or $$$ except the team t-shirt. Yet Tim Berners-Lee believes “Unexpected reuse is the value of the web” and Roy Fielding tells us to “Engineer for serendipity”. What gives?

First off, enterprises aren’t interested in unexpected or serendipitous reuse. They want their reuse to be systematic and predictable. The likelihood of serendipitous reuse is directly related to the number of potential reusers. But the number of potential reusers inside the enterprise is dramatically smaller than out on the public Internet. That brings the chance for serendipitous reuse inside the enterprise to nearly zero.

Second, enterprise systems aren’t exactly known for their “simple generality”. If Nick’s right that “generality of purpose is the key to serendipitous (re)use”, then enterprises might as well give up on serendipitous reuse right now. As I said last year, it’s a question of context. Context is specifics, the opposite of generality. Different business units have different business practices, different geographies have different laws, different markets have different competitors, etc. If an enterprise operates in multiple contexts (and most do), enterprise solutions have to take those contexts into account. Those different contexts prevent you from building usable – much less reusable – general solutions.

Finally, I think the amount of serendipitous reuse in REST is overstated. If you build an app on the Facebook Platform, can you use it on MySpace? Nope. If you build an app that uses the Flickr services, will it work with Picasa Web Albums? Nope. Of course, there are exceptions – pretty much everyone supports the MetaWeblog API – but those exceptions seem few and far between to me. Furthermore, the bits that are getting reused – identifiers, formats and protocols – are infrastructure capabilities that are better suited to reuse anyway. Serendipitously reusing infrastructure capabilities is much easier than serendipitously reusing business capabilities, REST or not.

The problems that stand in the way of reuse aren’t technology problems. Furthermore, the reuse problems faced by enterprises are very different from the ones faced by Web 2.0 companies. REST is a great approach, but it isn’t a one-size-fits-all technology solution that magically relegates integration and reuse to “afterthought” status. Serendipity is nice when it happens. However, by definition it’s not something you can depend on.

Service Factory Customization Workshop Day One

No morning coffee posts for the first half of this week, because I’m in training through Wednesday. Day one was mostly an overview of GAT and DSL, which was review for me. Today we’re starting to dig into some of the new stuff they’ve built for the new version of WSSF, so I’m paying much closer attention.

This isn’t your typical workshop in that the content is sort of being generated on the fly. As I type, we’re voting on what we’re going to cover for the next two days. Most classes I’ve been in are pre-programmed; the teacher doesn’t ask the class what topics should be covered or in what order. There isn’t even one “teacher” – there are five folks from p&p, including the architect, dev lead and PM of WSSF, who are tag-teaming. Even the hands-on labs aren’t completely ironed out – they’re evolving the lab directions as we do the labs. It’s atypical, but it works.

Afternoon Coffee 106

Lots of meetings today, so my coffee post is late…

  • The Big News™: Visual Studio 2008 and .NET Framework 3.5 Beta 2 are available for download. Soma and Scott have more. Silverlight 1.0 RC and the Silverlight Add-in for VS08 will apparently be available in a couple of days. Finally, there’s a go-live license for the framework, so you can get a head start deploying apps before VS08 and NETFX 3.5 RTM. Time to build out a new VPC image.
  • Next week, I’m attending the p&p Service Factory v3 Customization Workshop. I’m looking forward to playing with the new Service Factory drop, but I’m really interested in learning more about building factories. I wonder if they’re going to discuss their VS08 plans.
  • Nick Malik recently wrote about making “middle out SOA” work. I hate the term “middle-out”. It feels like we’re pinning our hopes on middle-out because we know top-down and bottom-up don’t work. My old boss John DeVadoss (who assures me he’ll be blogging regularly again “soon”) prefers to talk about big vs. little SOA, with big SOA being “dead”. I like the term “little SOA” better than “middle-out SOA”, but just because big SOA is a big failure doesn’t mean little SOA will make any headway.
  • There’s a new F# drop available. Don Syme has the details. Looks like they’ve got some interesting new constructs for async and parallel programming.
  • ABC announced yesterday that they are streaming HD on their website, so you can check out the season finale of Lost in HD for free. They embed commercials, so it’s not really “for free”, but you don’t have to pay $3 an episode like you do on XBLM. I wonder if XBLM might offer this capability in the future? It certainly would increase my use of XBLM (as would an all-you-can-eat pricing scheme).

The Durable Messaging Debate Continues

Last week, Nick Malik responded to Libor Soucek’s advice to avoid durable messaging. Nick points out that while both durable and non-durable messaging require some type of compensation logic (nothing is 100% foolproof because fools are so ingenious), the durable messaging compensation logic is significantly simpler.

This led to a very long conversation over on Libor’s blog. Libor started by clarifying his original point, and then the two of them went back and forth chatting in the comments. It’s been very respectful; Libor calls both Nick and me “clever and influential”, though he also thinks we’re wrong on this durable messaging thing. In my private emails with Libor, he’s been equally respectful and his opinion is very well thought out, though obviously I think he’s the one who’s wrong. 😄

I’m not sure how much is clear from Libor’s public posts, but it looks like most of his recent experience comes from building trading exchanges. According to his about page, he’s been building electronic trading systems since 2002. While I have very little experience in that domain, I can see very clearly how the highly redundant, reliable multicast approach he describes would be a very good, if not the best, solution there.

But there is no system inside Microsoft IT that looks even remotely like a trading exchange. Furthermore, I don’t think approaches for building a trading exchange generalize well. So Nick and I have very different priorities than Libor does, which seems to show up as a significant amount of talking past each other. As much as I respect Libor, I can’t shake the feeling that he doesn’t “get” my priorities, and I wouldn’t be at all surprised if he felt the same way about me.

The biggest problem with his highly redundant approach is the sheer cost when you consider the large number of systems involved. According to Nick, MSIT has “over 2700 applications in 82 solution domains”. When you consider the cost for taking a highly redundant approach across that many applications, the cost gets out of control very quickly. Nick estimates that the support staff cost alone for tripling our hardware infrastructure to make it highly redundant would be around half a billion dollars a year. And that doesn’t include hardware acquisition costs, electricity costs, real-estate costs (gotta put all those servers somewhere) or any other costs. The impact to Microsoft’s bottom line would be enormous, for what Nick calls “negligible or invisible” benefit.

There’s no question that high availability costs big money. I just asked Dale about it, and he said that in his opinion going above 99.9% availability increases costs “nearly exponentially”. He estimates that just going from 99% to 99.9% doubles the cost. 99% availability is almost 15 minutes of downtime per day (on average); 99.9% is about 90 seconds of downtime per day (again, on average).
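For what it’s worth, here’s the arithmetic behind those figures (a quick sketch in Python; the availability targets are just the ones quoted above):

```python
# Allowed downtime per day for a given availability target.
# Quick sanity check on the numbers above (illustrative, not from Dale).
DAY_SECONDS = 24 * 60 * 60

for availability in (0.99, 0.999, 0.9999):
    downtime_s = DAY_SECONDS * (1 - availability)
    print(f"{availability:.2%} -> {downtime_s / 60:4.1f} min/day ({downtime_s:5.1f} s)")

# 99.00% -> 14.4 min/day (864.0 s)
# 99.90% ->  1.4 min/day ( 86.4 s)
# 99.99% ->  0.1 min/day (  8.6 s)
```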

How much is that 13 extra minutes of uptime per day worth? I would say “depends on the application”. How many of the 2700 applications Nick mentions need even 99% availability? Certainly some do, but I would guess that less than 10% of those systems need better than 99% availability. What pretty much all of them actually need is high reliability, which is to say they need to work even in the face of “hostile or unexpected circumstances” (like system failures and downtime).

High availability implies high reliability. However, the reverse is not true. You can build systems to gracefully handle failures without the cost overhead of highly redundant infrastructure intended to avoid failures. Personally, I think the best way to build such highly reliable yet not highly available systems is to use durable messaging, though I’m sure there are other ways.

This is probably the biggest difference between Libor and me. I am actively looking to trade away availability (not reliability) in return for lowering the cost of building and running a system. To someone who builds electronic trading systems like Libor, that probably sounds completely wrongheaded. But an electronic trading system falls into the minority of systems that need high availability (ultra-high, five-nines availability in this case). For the systems that actually do need high availability, you have to invest in redundancy to get it. But for the rest, there’s a less costly way to get the reliability you need: Durable Messaging.
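To make that concrete, here’s a minimal sketch of the durable messaging idea: the sender persists messages to a local store, so the receiver can be down without anything getting lost. SQLite is standing in here for a real durable queue (MSMQ, SQL Service Broker, whatever); the table and function names are purely illustrative.

```python
import json
import sqlite3
import time

def open_outbox(path="outbox.db"):
    # One durable table acts as the message queue.
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS outbox (
                      id        INTEGER PRIMARY KEY AUTOINCREMENT,
                      body      TEXT    NOT NULL,
                      processed INTEGER NOT NULL DEFAULT 0)""")
    db.commit()
    return db

def send(db, message):
    # Enqueue durably; returns as soon as the message is on disk,
    # whether or not the receiving system is currently up.
    db.execute("INSERT INTO outbox (body) VALUES (?)", (json.dumps(message),))
    db.commit()

def process_pending(db, handler):
    # Run whenever the receiver is available; anything queued while it
    # was down gets handled now instead of being lost.
    rows = db.execute(
        "SELECT id, body FROM outbox WHERE processed = 0 ORDER BY id").fetchall()
    for row_id, body in rows:
        handler(json.loads(body))
        db.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (row_id,))
        db.commit()

if __name__ == "__main__":
    db = open_outbox()
    send(db, {"order_id": 42, "submitted": time.time()})
    process_pending(db, lambda msg: print("handled", msg))
```

The point isn’t the code, it’s the failure mode: if the receiver is down for an hour, the messages just sit in the table until it comes back. That’s the reliability-without-high-availability trade I’m arguing for.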

*Now* How Much Would You Pay For This Code?

Via Larkware and InfoQ, I discovered a great post on code reuse by Dennis Forbes: Internal Code Reuse Considered Dangerous. I’ve written about reuse before, albeit in the context of services. But where I wrote about the impact of context on reuse (high context == low or no reuse), Dennis focused on the idea of accidental reuse. Here’s the money quote from Dennis:

Code reuse doesn’t happen by accident, or as an incidental – reusable code is designed and developed to be generalized reusable code. Code reuse as a by-product of project development, usually the way organizations attempt to pursue it, is almost always detrimental to both the project and anyone tasked with reusing the code in the future. [Emphasis in original]

I’ve seen many initiatives of varying “officialness” to identify and produce reusable code assets over the years, both inside and outside Microsoft. Dennis’ point that code has to be specifically designed to be reusable is right on the money. Accidental code (or service) reuse just doesn’t happen. Dennis goes so far as to describe such efforts as “almost always detrimental to both the project and anyone tasked with reusing the code in the future”.

One of the more recent code reuse efforts I’ve seen went so far as to identify a reusable asset lifecycle model. While it was fairly detailed about the lifecycle steps that came after said asset came into existence, it was maddeningly vague as to how these reusable assets got built in the first place. The lifecycle said that a reusable asset “comes into existence during the planning phases”. That’s EA-speak for “and then a miracle happens”.

Obviously, the hard part about reusable assets is designing and building them in the first place. So the fact that they skimped on this part of the lifecycle made it very clear they had no chance of success with the project. I shot back the following questions, but never got a response. If you are attempting such a reuse effort, I’d strongly advise answering these questions first:

  • How does a project know a given asset is reusable?
  • How does a project design a given asset to be reusable?
  • How do you incent (incentivize?) a project to invest the extra resources (time, people, money) it takes to generalize an asset to be reusable?

And to steal one from Dennis:

  • What, realistically, would competitors and new entrants in the field offer for a given reusable asset?

Carl Lewis wonders “Is your code worthless?” As a reusable asset, probably yes.