This Isn’t The Droid I’m Looking For

Since David Ing responded to my REST/CRUD/CRAP entry on his blog, I guess that means I respond to his response here, right?

Actually, this is going to be very short^[Wow, this post wasn’t that short at all. Can you imagine how long this would have been if I had mostly disagreed with David?] because I mostly agree with what David wrote. For example:

If we say stuff like ‘REST shall/must replace all foolish WS-* SOAP systems (insert Nelson-style HaHa here)’ then we are just repeating the same lessons as before – One size won’t fit all and there is no One Ring to Bind Them illusion in REST as well. If you have something complex that fits the WS-* style then maybe REST isn’t the droid you are looking for.

Of course, this begs the question “what is complex?” David described complex as “long running transaction updating multiple distributed stores”. Personally, I’d replace “stores” with “systems” and I hate term “long running transaction” so I would have written “long running operation” or “business process”, but grammatical nitpicking aside that certainly seems like a fair description. Certainly, most of the scenarios I’m looking at in my day job could be described as being long running and updating multiple systems. David says later that he thinks “a lot of apps don’t need to be modeled that way”, but I’m not aware of the alternatives so I’ll let him expand on this on his own blog. I’ll try to get to writing about my thoughts around protocol state next week.

I did disagree with this one point:

We may say REST is really about a protocol state and not CRUD, but unless the rest of the world gravitates to that view then I’m afraid it just won’t be. If, say, through some ongoing groundswell of common usage, people start modelling entities as dereferenceable URI’s and using POST to do Create and Updates, then REST will be about CRUD by default. This is all very unacademic and unjust, but thems the breaks peachy. The lesson from SOAP is not to fight it by trying to re-educate the masses after they get a perception in their head.

I think it doesn’t really matter how the rest of the world gravitates, it only matters how you as the the service provider choose to expose your service. If you’re doing something simple like ATOM publishing, you can probably get away with REST as CRUD. (Would that be hi or lo REST?) If you’re doing something more complex, either in terms of being long running or needing multiple updates in one atomic operation like Tim’s airline example, you’ll probably need to gravitate towards REST as protocol state. But can’t the two models can co-exist nicely, even in the same app? No re-education required.

Update: – From Fielding’s Dissertation: “REST relies instead on the author choosing a resource identifier that best fits the nature of the concept being identified”. My point exactly.

REST is neither CRUD nor CRAP

In the wake of my praise for CRUD is CRAP, David Ingasked “how do you reconcile something like REST (astoria etc) with being CRUD-adverse? Is there a happy place where the two can go for coffee?” Sure there is: I hear Tim Ewald’s XML Nation serves great coffee and the scones are pretty good. ^[Actually, I have no idea if Tim even likes coffee or scones. FYI, DevHawk Nation would not feature great coffee or pretty good scones. We would, however, have Arrogant Bastard Ale on tap.]

Seriously, the key observation that Tim recently made is that REST != CRUD. Sure, it can be used that way and for simple scenarios it works fine. (I’ll define “simple scenarios” in a second.) But I don’t believe CRUD style REST works in the large. Tim said you can’t build with just CRUD because it’s “to simplistic to be useful”. I’ll go even more fundamental, using REST for CRUD means having giving up transactions entirely. I’ve already accepted that building loosely coupled services means giving up distributed transactions. But the idea of giving up transactions entirely is just crazy talk.

So when I said “simple scenarios” above, I meant “scenarios that don’t need transactions”. (I take it as a given that RESTifarians aren’t hot for WS-AtomicTransaction.) ATOM Publishing is a simple scenario because the web resource authoring scenario doesn’t need transactions to protect updates to multiple resources at a time. If it did, I don’t believe the REST as CRUD approach they use would work.

As you might guess then, I’m not a fan of Astoria. I believe the sweet spot for so called “data services” will be read only (because they don’t need transactions, natch). I’m sure there are some read/write scenarios Astoria will be useful for, but I think they will be limited – at least within the enterprise.

If REST != CRUD, then what is it? Let’s go back to Tim’s post:

Every communication protocol has a state machine. For some protocols they are very simple, for others they are more complex. When you implement a protocol via RPC, you build methods that modify the state of the communication. That state is maintained as a black box at the endpoint. Because the protocol state is hidden, it is easy to get things wrong. For instance, you might call Process before calling Init. People have been looking for ways to avoid these problems by annotating interface type information for a long time, but I’m not aware of any mainstream solutions. The fact that the state of the protocol is encapsulated behind method invocations that modify that state in non-obvious ways also makes versioning interesting.

The essence of REST is to make the states of the protocol explicit and addressable by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you’re moving to, making that your new state. A state’s representation includes the links (arcs in the graph) to the other states that you can move to from the current state. This is exactly how browser based apps work, and there is no reason that your app’s protocol can’t work that way too. (The ATOM Publishing protocol is the canonical example, though its easy to think that its about entities, not a state machine.)

While I disagree with Tim’s disagreement of ATOM (i.e. I believe APP is about entities, but it works because it doesn’t need transactions), I agree 100% that REST is about protocol state. Tim lays this very clear in his airline reservation sample. Thus, I can spurn CRUD and still embrace REST if I want to.

Further, Tim’s points on the opaque nature of RPC style interactions (which web services appear to have fallen into despite the best of intentions) are spot on. If you’re doing simple request/response services, the protocol state is trivial, so that works fine. However, in the scenarios I face, long running services are the norm and managing the protocol state is critical. I’ve got some ideas on how to do that, but that’s a future blog post.

How I Learned to Stop Worrying and Love WCF

Regular readers of DevHawk are likely aware of my obsession interest in SQL Service Broker (aka SSB). I’ve also been doing a lot of WCF work lately. While there are parts of WCF that I think rock, overall I’ve found WCF lacking due to it’s lack of support for long running services, which SSB excels at.

So it was with great interest that I read this recent article on Integrating WF and WCF. WF is expressly designed for long running systems, so I wanted to see how the article dealt with the WCF’s lack of support for such scenarios. Unfortunately, the article basically sidesteps the issue. While it has lots of great info about hosting WF inside a WCF service, the article uses duplex channels for communication between the service and its clients. As I have pointed out before, this approach is impractical because it requires that both the service and its consumer remain alive in memory until the WF end.

Remember this quote from Essential WF?

“It is wishful thinking to assume that the operating system process (or CLR application domain) in which the program begins execution will survive for the required duration.”

So basically this WCF/WF sample is wishful thinking. Fine for a demo, but given the severe lack of information out there on integrating these two technologies, I’m worried that many people will read this article as best practice guidance, which in my opinion would be a mistake.

But instead of firing up my blog (that is, like last time) to write a scathing post about how broken this sample is, I emailed Paul which led to a concall with Shy to discuss WCF’s lack of support for long running services. Imagine my surprise when Shy agreed with me completely, furthermore saying that support for long running services had been “out of scope” for v1 of WCF. I thought that the whole point of duplex channels was for long running services. But apparently I was wrong.

Shy said to think of the duplex channel in terms of sockets, rather than long running conversations. And just like that, WCF made a ton more sense to me. I had been directly comparing the SSB and WCF communication models, but that’s apples and oranges. It would be like comparing SSB to TCP.

If you think about it, vanilla HTTP works a lot more like UDP, even though it’s layered on top of TCP. Both UDP and HTTP support connectionless operations and neither UDP nor HTTP are reliable or provide message ordering. The comparison isn’t perfect: for example, UDP isn’t limited to a single response for an incoming request. But by and large, HTTP is a very UDP style protocol.

If HTTP is basically UDP, then WS-* is trying to be TCP. Frankly, I never understood the point of WS-ReliableMessaging. I always thought reliability == durability == SSB or MSMQ. But when you realize that HTTP lacks TCP-like reliability and ordering capabilities, suddenly this WS spec makes sense. In fact, Shy made this exact point almost a year ago. At the time, I didn’t get it because I didn’t understand the duplex channel as sockets analogy. Now, I see the value of adding these capabilities to HTTP.

What Shy said was clear and to the point but unfortunately completely missing in the official WCF documentation. For example, the docs on Duplex Services say this:

A duplex service contract is a message exchange pattern in which both endpoints can send messages to the other independently. A duplex service, therefore, can send messages back to the client endpoint, providing event-like behavior. Duplex communication occurs when a client connects to a service and provides the service with a channel on which the service can send messages back to the client.

The docs make no mention that the “event-like behavior” of duplex services only works within a session. And I’m not the only one who mistakenly believed that duplex services could be used for long running services (here’s an article in DDJ that makes the same mistake). Shy used the term “episodic” to describe services that span session boundaries. I’d like to see the docs updated to include that concept.

Taking the TCP/UDP analogy even further, I think it demonstrates how pointless the REST vs. SOAP debate is. As UDP is a thin layer on top of IP, REST is a thin layer on top of HTTP. But nobody argues much about UDP vs. TCP these days. I was in grade school when UDP and TCP were standardized, so maybe there were big TCP vs UDP flame wars at the time. But twenty five years later, it’s pretty clear that TCP vs UDP is not an either-or proposition. Some protocols are better built on UDP while others are better built on TCP. I’m guessing we’ll see a similar evolution with SOAP and REST.

Personally, I would expect that message exchanges between services will become more complex over time. Complex message exchanges would seem to favor stateful SOAP over stateless REST for the same reason complex network protocols favor connection-oriented TCP over connectionless UDP. But SOAP could never displace REST any more than TCP could ever displace UDP. Furthermore, as Larry O’Brien recently wrote “the onus is on the WS-* advocates to prove the need”. TCP standardization only lagged a year behind UDP standardization where WS-* has lagged at least six years behind REST. I wonder if UDP would be more prevalent today if it had gotten a six year head start on TCP.

Finally, this “SOAP as Sockets” flash of understanding has also helped me understand how SSB / WCF can evolve together in the future. Some folks have suggested an SSB transport for WCF and I’ve personally looked into such an approach. But given since SSB is at a higher level of abstraction than WCF, it makes much more sense to layer SSB on top of WCF instead of the other way around. Today, SSB uses two protocol layers: the top level Dialog Protocol, which is built on top of the lower-level Adjacent Broker Protocol (ABP), which in turn is built on TCP. I’d like to see a version of ABP that was built on top of WCF instead of directly on top of TCP. SSB’s Dialog Protocol would tie together the WCF duplex sessions into a long-running conversation the same way that it ties together TCP sessions today.

Eventually, I would love to see something that has the programming semantics of SSB and the interoperability of WCF. That would be like the the Reese’s Peanut Butter Cup of service messaging.

Is WCF “Straightforward” for Long Running Tasks?

My father sent me a link to this article on SOA scalability. He thought it was pretty good until he got to this paragraph:

Long-running tasks become more complex. You cannot assume that your client can maintain a consistent connection to your web service throughout the life of a task that takes 15 minutes, much less one hour or two days. In this case, you need to implement a solution that follows a full-duplex pattern (where your client is also a service and gets notified when the task is completed) or a polling scheme (where your client checks back later to get the results). Both of these solutions require stateful services. This full-duplex pattern becomes straightforward to implement using the Windows Communications Foundation (Indigo) included with .NET 3.0.

When I first saw duplex channels in WCF, I figured you can use them for long running tasks also. Turns out that of the nine standard WCF bindings, only four support duplex contracts. Of those four, one is designed for peer-to-peer scenarios and one uses named pipes so it doesn’t work across the network, so they’re obviously not usable in the article’s scenario. NetTcp can only provide duplex contracts within the scope of a consistent connection, which the author has already ruled out as a solution. That leaves wsDualHttp, which is implemented much as the author describes, where both client and the service are listening on the network for messages. There’s even a standard binding element – Composite Duplex – which ties two one-way messaging channels into a duplex channel.

Alas, the wsDualHttp solution has a few flaws that render it – in my opinion at least – unusable for exactly these sorts of long running scenarios. On the client side, while you can specify the ClientBaseAddress, you can’t specify the entire ListenUri. Instead, wsDualHttp generates a random guid and tacks it on the end of your ClientBaseAddress, effectively creating a random url every time you run the client app. So if you shut down and restart your client app, you’re now listening on a different url than the one the service is going to send messages to and the connection is broken. Oops.

The issues don’t end there. On the service side of a duplex contract, you get an object you can use to call back to the client via OperationContext.Current.GetCallbackChannel. This works fine, as long as you don’t have to shut down your service. There’s no way to persist the callback channel information to disk and later recreate it. So if you shut down and restart your service, there’s no way to reconnect with the client, even if they haven’t changed the url they’re listening on. Oops.

So in other words, WCF can do long running services using the wsDualHttp binding, as long as you don’t restart the client or service during the conversation. Because that would never ever happen, right?

This is part of the reason why I’m sold on Service Broker. From where I sit, it looks like WCF can’t handle long running operations at all – at least, not with any of the built in transports and bindings. You may be able to build something custom that would work for long running services, I’m not a deep enough expert on WCF to know. From reading what Nicholas Allen has to say about CompositeDuplex, I’m fairly sure you could work around the client url issue if you built a custom binding element to set the ListenUriBaseAddress. But I have no idea how to deal with the service callback channel issue. It doesn’t appear that the* *necessary plumbing is there at all to persist and rehydrate the callback channel. If you can’t do that, I don’t see how you can reliably support long running services.

Felipe Cabrera on Amazon’s Mechanical Turk

Felipe’s a good guy (I knew him when he was at MSFT) but this session wasn’t anything exciting because it’s all old news. There are some things humans are better at than computers, typically things involving judgment such as “which is the best picture of this store?” Yes, I saw that when Amazon first released Mechanical Turk.

They did have a partner on stage, a company called Casting Words that offers podcast transcription services for 42 cents a minute. But how is that a business? I’m not sure what kind of percentage Casting Words is making out of that 42 cents a minute, but couldn’t I go directly to Mechanical Turk and ask for transcription services myself? There are no Casting Words tasks currently on the site as I type this, but I imagine if I watch a while I’ll see a Casting Words task. Then I could simply use a site like HIT Builder to farm out my own transcription tasks. What’s my incentive to use Casting Words at all?

Furthermore, there’s not really a business model behind Mechanical Turk itself. If Microsoft launched its own version, there would be plenty of takers for that work as well – the workers will gravitate to where the best paying and most interesting work they can do is. There’s no incentive to provide your artificial artificial intelligence services exclusively to one company. So Mechanical Turk wouldn’t work as a stand alone business. But as a feature of Amazon it works great. In fact, when the service first launched the only tasks came from A9. I’m guessing it would be worth it to Amazon to run the service even if they were the only ones using it.