Morning Coffee 134

  • Bill de Hora responds to a few of my Durable and RESTful ideas. He points out that relying on a client-generated ID can be troublesome, and recommends using multiple identifiers – one created by the sender, one by the receiver and one representing the message exchange itself. However, the sender ID is vulnerable to client bugs & tampering as Bill points out, and neither the receiver ID nor the exchange ID can be used to determine if a given message is a duplicate. If you don’t trust the sender, is it even possible to determine if a given message is a duplicate?
  • Pablo Castro confirms that there are “practical limits” to what ADO.NET Data Services can do with respect to idempotence. Nothing in his post was surprising, though I hope it will be more explicitly called out in the final docs. Developers used to the comforting protection of a transaction may be in for a rude awakening.
  • Dare Obasanjo has a great post comparing the new features in C# 3.0 to dynamic languages like IronPython. I believe many of the productivity aspects of dynamic languages have little to do with being dynamic.
  • Pat Helland noodles on durability and messaging, two topics near and dear to my heart (probably from working with him for a couple of years). I’m not sure where he’s going with this – his conclusion that “Basically, big, complex, and distributed systems are big, complex, and distributed” isn’t exactly ground-breaking. But his point that “durable” isn’t a binary concept is worth more consideration. Also, his description of IMS only looking at the effects of a committed transaction is very similar to how web sites work, though obviously HTTP isn’t durable so you can’t make event horizon optimizations like IMS did.
  • Tangentially related, Werner Vogels discusses the idea of eventually consistent distributed databases. Today, that’s a problem mostly only Internet-scale sites like Amazon deal with. In the near future of continued data explosion + manycore, we’ll all have to deal with it.
  • Nick Malik argues that categorizing enterprise applications by lifecycle is much less useful than categorization based on organizational impact. He might also need a new chair.
  • Jesus Rodriguez digs into one of SSB’s new features in SQL 2008: conversation priorities.
  • Arnon Rotem-Gal-Oz and Sam Gentile are mixing it up over the definition of SOA. Sam thinks SOA has to include business drivers and Arnon doesn’t. I’m with Sam on this – defining “SOA” independently from “Applying SOA” seems pointless. Then again, rigorously defining SOA – much less arguing about said definition – seems like a waste of time in the first place IMHO.
  • Wow, this guy Zed is mad at the Ruby community.
  • Andrew Baron has 8 Reasons Why The TV Studios Will Die. Personally, I think reason #2 – Expendable Middle-Person – is the most important. If content producers can reach consumers directly, what value-add will the networks provide? (via United Hollywood)

ADO.NET Data Services and Idempotence

I was reading thru the ADO.NET Data Services Quickstart, because I wanted to understand how it supports data updates. The quickstart uses the Customers table from the Northwind sample database, which unlike most of the other tables uses an nchar(5) value as the primary key. Categories, Employees, Orders, Products, Shippers and Suppliers all use an auto-increment integer field for their primary keys.

I only point this out because inserting into Customers is idempotent, while inserting into those other listed tables is not. Is it a coincidence that the ADO.NET Data Services team chose the Customers table for their quickstart? Maybe, but I doubt it. Regardless, making a non-idempotent insert call from the browser is a bad idea if you care about Exactly Once.
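
To make the difference concrete, here’s a minimal C# sketch using plain ADO.NET rather than the Data Services client – the Northwind table and column names are real, but the customer values and connection handling are just illustrative. Retrying the Customers insert leaves the database in the same state (the duplicate key gets rejected); retrying the Orders insert quietly creates a second order.

    using System.Data.SqlClient;

    class InsertIdempotenceDemo
    {
        // The client supplies the nchar(5) primary key. Run this twice and the
        // second attempt fails with a duplicate-key error instead of adding a
        // second row - the database ends up in the same state either way.
        static void InsertCustomer(SqlConnection conn)
        {
            var cmd = new SqlCommand(
                "INSERT INTO Customers (CustomerID, CompanyName) VALUES (@id, @name)",
                conn);
            cmd.Parameters.AddWithValue("@id", "EXMPL");
            cmd.Parameters.AddWithValue("@name", "Example Company");
            cmd.ExecuteNonQuery();
        }

        // OrderID is an identity column, so SQL Server generates a new key on
        // every call. Run this twice and you get two orders - the insert is
        // not idempotent, and a blind retry duplicates work.
        static void InsertOrder(SqlConnection conn)
        {
            var cmd = new SqlCommand(
                "INSERT INTO Orders (CustomerID, OrderDate) VALUES (@id, GETDATE())",
                conn);
            cmd.Parameters.AddWithValue("@id", "EXMPL");
            cmd.ExecuteNonQuery();
        }
    }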

The Importance of Idempotence

Every organization has some operations or processes that have to happen Exactly Once. Your employer needs to make sure they issue your paycheck exactly once. Your bank needs to make sure that paycheck is deposited in your account exactly once. Exactly Once isn’t something that just “traditional” enterprises like banks care about. Google needs to make sure your AdSense check is issued exactly once. Amazon needs to make sure your credit card is charged exactly once. Especially when there’s money involved, the company wants to make sure it gets handled correctly – Exactly Once.

In application (aka siloed) development, transactions are often used to ensure stuff happens Exactly Once, to good effect. But how do we guarantee Exactly Once now that we’re connecting systems together? Given how well transactions work inside applications, it’s not surprising that early attempts to guarantee Exactly Once between systems relied on distributed transactions, this time to not-so-good effect. Pat Helland summarized the problems with distributed transactions this way:

“The two-phase commit protocol will ensure perfect consistency given infinite time. I say that because it will wait and wait and wait until the transaction is resolved and then provide perfect consistency. Of course, while partitioned and waiting, arbitrary swaths of the application’s database may be locked up rendering the application unusable. For this reason, I’ve frequently referred to the two phase commit protocol as the “Anti-Availability Protocol”.”
Pat Helland, SOA and Newton’s Universe

So now we’re faced with a dilemma. Transactions are, for all practical purposes, unusable to ensure Exactly Once processing between connected systems. And yet, the business requirement to ensure Exactly Once hasn’t gone away. We need another way.

The first fallacy of distributed computing is that the network is reliable. It usually works, but “usually” isn’t a guarantee. If I send a message to a remote system but don’t get an acknowledgement, which got lost: the original message or the ack? There’s no way to know, so I have to send the message again. But if I send it again and it’s the ack that got lost, then the target system will receive the message multiple times.
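
Here’s a rough sketch of that resend loop from the sender’s side. The transport interface is entirely hypothetical – the point is the shape of the logic, not a particular API: keep sending until an ack arrives, and accept that the receiver may end up seeing the same message more than once.

    using System;

    // Hypothetical transport abstraction - just enough to show the retry loop.
    interface IUnreliableTransport
    {
        void Send(string messageId, byte[] body);             // the message may be lost
        bool TryReceiveAck(string messageId, TimeSpan wait);   // true if an ack arrived in time
    }

    static class AtLeastOnceSender
    {
        public static void Send(IUnreliableTransport transport, string messageId, byte[] body)
        {
            while (true)
            {
                transport.Send(messageId, body);

                if (transport.TryReceiveAck(messageId, TimeSpan.FromSeconds(30)))
                    return; // ack received - we're done

                // No ack. Either the message or the ack was lost, and we can't
                // tell which, so we send again. If it was the ack, the target
                // now sees a duplicate - which is why it has to be idempotent.
            }
        }
    }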

Since the network is not reliable, there’s no way to guarantee that a message will be delivered exactly once. The best we can do is ensure a message will be delivered at least once. However, that implies the target system will receive some messages multiple times. If we need to ensure Exactly Once, we need to make sure the target system won’t duplicate the work if it receives duplicate messages. In other words, we need the target system to be idempotent.

“In computer science, the term idempotent is used to describe method or subroutine calls which can safely be called multiple times, as invoking the procedure a single time or multiple times results in the system maintaining the same state i.e. after the method call all variables have the same value as they did before.

Example: Looking up some customer’s name and address are typically idempotent, since the system will not change state based on this. However, placing an order for a car for the customer is not, since running the method/call several times will lead to several orders being placed, and therefore the state of the system being changed to reflect this.”
Wikipedia, Idempotence (Computer Science)

Or more succinctly:

“Idempotent Means It’s OK to Arrive Multiple Times”
Pat Helland (again)

I can’t overstate the importance of designing your cross-system communication to be idempotent. If you care about ensuring Exactly Once, each step of your process has to be either transactional or idempotent, or you’ll be screwed. It’s interesting to note that each step has to be transactional *OR* idempotent – it doesn’t need to be both. You can chain together multiple steps in a long business process, across multiple disparate systems, but as long as each step is either transactional or idempotent, you can guarantee Exactly Once across the entire process. In other words:

Transactional/Exactly Once == Idempotent/At Least Once

This implies that you can substitute an idempotent operation for a transactional operation, and still ensure Exactly Once.

Let’s look at an example. Typically you ensure Exactly Once processing with MSMQ by receiving messages within the scope of a transaction along with whatever other work you’re doing. But what if you can’t use a transactional receive, say because it’s a remote queue? What would an idempotent equivalent for transactional receive look like?

How about:

  1. Peek a message from the remote queue
  2. Insert the message into the target system database, using the unique MSMQ Message ID as the primary key
  3. Remove the message from the queue by ID

Each of those steps is idempotent. Peek is a read, which is naturally idempotent. Inserting the message into the database is idempotent, since we use the message ID as the primary key. As long as that ID is unique, we can never insert it into the database more than once. Finally, removing a message based on its unique ID is also naturally idempotent. Once the message is in the target system database, we can use traditional transactions to ensure it gets processed Exactly Once.
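
In code, the pattern looks roughly like this – a sketch using System.Messaging and plain ADO.NET, where the queue path, connection string and InboundMessages table are my own assumptions, and error handling is pared down to the two failures that matter for idempotence:

    using System;
    using System.Data.SqlClient;
    using System.Messaging;

    class IdempotentReceive
    {
        const string QueuePath = @"FormatName:DIRECT=OS:someserver\private$\somequeue"; // assumed
        const string ConnString = "...target system connection string...";              // assumed

        static void ProcessOneMessage()
        {
            using (var queue = new MessageQueue(QueuePath))
            {
                queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });

                // 1. Peek the next message - a read, so naturally idempotent.
                Message msg = queue.Peek();

                // 2. Insert it into the target database, keyed on the MSMQ message ID.
                //    A duplicate-key error just means a previous attempt already
                //    stored it, so we swallow that one error and move on.
                using (var conn = new SqlConnection(ConnString))
                {
                    conn.Open();
                    var cmd = new SqlCommand(
                        "INSERT INTO InboundMessages (MessageId, Body) VALUES (@id, @body)",
                        conn);
                    cmd.Parameters.AddWithValue("@id", msg.Id);
                    cmd.Parameters.AddWithValue("@body", (string)msg.Body);
                    try
                    {
                        cmd.ExecuteNonQuery();
                    }
                    catch (SqlException ex)
                    {
                        if (ex.Number != 2627) throw; // 2627 = primary key violation: already stored
                    }
                }

                // 3. Remove the message from the queue by its unique ID. If a
                //    previous attempt already removed it, ReceiveById throws -
                //    which also means there's nothing left to do.
                try
                {
                    queue.ReceiveById(msg.Id);
                }
                catch (InvalidOperationException)
                {
                    // Message already gone - fine.
                }
            }
        }
    }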

So we took a single transactional operation and turned it into a series of idempotent steps. Both ensure each message is processed Exactly Once. Given the choice, I’d rather write the transactional operation – it’s much less code since we can use existing infrastructure – aka the distributed transaction coordinator. But if the transactional infrastructure isn’t available, I’d rather write multiple idempotent steps and ensure Exactly Once rather than risk losing or duplicating messages.

I’ve got more on this topic, but in the meantime think about this: How do you think durable messaging infrastructure like MSMQ ensures exactly once delivery? You can use that pattern, even if you’re not using durable messaging infrastructure.