Feasible Service Reuse

Yesterday, I posted about services and reuse. More to the point, I posted why I don’t believe that business services will be reusable, any more than business objects were reusable. However, “can’t reuse business services” isn’t the whole story, because I believe in different kinds of reuse.

The kind of reuse I was writing about yesterday is typically referred to as “black box reuse”. The idea being that I can reuse the item (object or service) with little or no understanding of how it works. Thomas Beck wrote about colored boxes on his blog yesterday. Context impacts reuse – the environments in which you plan to reuse an item must be compatible with what the item expects. However, those contextual requirements aren’t written down anywhere – at least, they’re not encoded anywhere in the item’s interface. Those contextual requirements are buried in the code – the code you’re not supposed to look at because we’re striving for black box reuse. Opaque Requirements == No Possibility of Reuse.

As I wrote yesterday, David Chappell tears this type of reuse apart fairly adeptly. Money quote: “Creating services that can be reused requires predicting the future”. But black box reuse this isn’t the only kind of reuse. It’s attractive, since it’s simple. At least it would be, if it actually worked. So what kind of reuse doesn’t require predicting the future?

Refactoring.

I realize most people probably don’t consider refactoring to be reuse. But let’s take a look at the official definition from refactoring.com:

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations. Each transformation (called a ‘refactoring’) does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it’s less likely to go wrong. The system is also kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring

Two things about this definition imply reuse. First, refactoring is “restructuring an existing body of code”. It’s not rewriting that existing body of code. You may be making changes to the code – this certainly isn’t black box reuse – but you’re not scrapping the code completely and starting over. Second, refactoring is making changes to the code “without changing its external behavior”. You care about the code’s external behavior because somewhere, some other code is calling the code you’re refactoring. Some other existing piece of code that you don’t want to change – i.e. that you want to reuse.

When you refactor, you still reuse a significant amount of the code, but you’re not having to predict the future to do it. Refactoring**is the kind of reuse I believe in.

In his article, David talks about types of reuse such as business agility, adaptability and easily changeable orchestration. These look a lot more like refactoring than black box reuse to me. Unfortunately, David waves these away, saying  “Still, isn’t this just another form of reuse?”. Reconfiguration hardly qualifies as “predict the future” style reuse that he spends the rest of the article arguing against. It’s just one paragraph in an otherwise splendid article, so I’ll give him a pass this time. (I’m sure he’s relieved.)

A Question of Context

A couple of weeks ago, David Chappell posted a great article on SOA and the Reality of Reuse. When someone mentions the idea of using SOA for reuse, I cringe. David does a great job blowing the “SOA for Reuse” argument out of the water. In the future, I will just send that link rather than spending the time arguing out the point.

But something nagged at the back of my brain about that post. David starts by talking about object reuse before making the parallel to services. The problem with that comparison is that object reuse hasn’t been a failure. When was the last time you wrote a String class? A Linked List? A Button? There’s been support for Buttons in Windows since at least the 3.x days (probably longer, but that’s before my time). Whatever your OO language of choice, there’s a big collection of reusable objects to go with it.

Given his position as “famous technology author”, I’m assuming David is well aware of successes of object reuse. Furthermore, I doubt it was an accident that in his article he writes that “reuse of business objects failed” (emphasis added). While there has been success around object reuse, essentially none of those successes have been in a business scenario. In fact, there have been some high profile projects such as Microsoft’s Business Framework and IBM’s San Francisco Project that have crashed and burned been significantly less than successful.

So here’s the question: given that general object reuse has seen some success, what’s so different about business objects that causes reuse to fail utterly? Since we’re really interested in service reuse, knowing why some object reuse succeeds and other reuse fails will help us understand which services are likely to be reusable and which wont. I would say that success of object reuse hinges on context.

Wikipedia gives this definition of context: “The context of an event, word, paradigm, change or other reality includes the circumstances and conditions which surround it.” (emphasis in original) For example, the word “order” is ambiguous. If you’re using a procurement system for the military, you could conceivably be given an order to place an order. (OK, that’s silly. But you get the idea.) The word “order” has two different meanings. However, the words that surround the ambiguous term make the meaning clear. An order that you place is different that an order that you give. That’s context.

A string or a linked list or a button has very little in the way of contextual needs. That is to say you can use it the exact same way in a wide variety of environments. A business object on the other hand has significant contextual requirements, which makes reuse difficult or impossible. For example, a Purchase Order object from the above-mentioned military procurement system sounds like it might be reusable. At least until you take into account the differences between branches of the military, between ordering tanks and ordering uniforms, between active units and reserve units, etc. Once the generic Purchase Order has been specialized to the point of practical usability for a given scenario, it’s no longer reusable in the general case.

Taking this back to the service realm, likewise I figure the reusable services will be the ones with little or no contextual needs. A good example is the identity and directory services such as Active Directory and its competitors. Sure, you use LDAP not SOAP to access it, but AD is certainly both reusable and a service plus it’s in wide usage. Other candidates for reusable services my team is looking at are service directory, management and operations, business activity monitoring and provisioning.

I actually think there will be less reuse in services than there was with objects. The value of reuse of services has to exceed not only the contextual issues but also the overhead of distributed access. Calling across the network is an expensive operation – whatever’s on the other side better be worth the drive. I’m guessing for services, more often than not, reuse won’t be worth the trip (if it’s possible at all).

Update: David pointed out to me that the last paragraph of his article begins:

Object reuse has been successful in some areas. The large class libraries in J2EE and the .NET Framework are perhaps the most important examples of this.

Doh! I guess my “assumption” that David is aware of successful object reuse was correct.

SOA Sample Scenario

So now that I’m back, we’re beginning work in earnest on my project. For those not following along at home, my project is to deliver shared service oriented infrastructure for Microsoft’s internal IT department. We’ve spent the time since I moved over working on our business justification, and now we’re moving into specifications and prototyping. That is, I get to starting getting my hands dirty.

As part of the prototyping efforts, we’re looking to build some sample business services on top of our prototype infrastructure. The idea being to both illustrate what we’re building as well as have a playground where we can experiment with new ideas. During the prototyping, I’ll be pretty involved with the development. But once we start writing production code, the dev team will take that over but I will continue to own the service playground. I’ve been kicking a playground idea like this around for several years, so I’m pretty excited about it.

The question is, what kind of business scenario should we build? I want the sample business services to be something interesting and real-world-esque. But of course, it can’t be too complex since I wasn’t hired to build a playground as my primary job function.

So far, we have two primary ideas:

  • Enterprise Management System: AdventureWorks is the primary sample database that ships with SQL Server. They have business scenarios around Sales & Marketing, Product Management, Purchasing and Manufacturing. This sounds suspiciously like an ERP/CRM/SFA type enterprise system. On the plus side, MS is an enterprise so things like ERP/CRM/SFA are the types of solutions we need/use/buy/build internally. On the negative side, it’s complex to do real-world and teams that actually do ERP/CRM/SFA inside Microsoft might dismiss the infrastructure if the playground isn’t real-world enough.
  • Prediction Market: If you’ve ever seen Hollywood Stock Exchange (HSX) or Tradesports, those are prediction markets. The basic idea is that you trade on predictions, rather than companies like you would in a stock market. Using HSX as an example, you get 2 million “Hollywood Dollars” (i.e. play money) to invest in upcoming movies and / or movie stars. Those movies and stars pay out based on the money they make at the box office. They also have derivatives for opening weekend, blockbusters and the Oscars. HSX even sells forecasting and prediction services based on the HSX market. Of course, I can’t build something quite so extensive, but we could get pretty far with this idea. The upside is that it’s relatively simple (compared to enterprise management systems) and there’s little conflict with existing systems inside Microsoft. The downside also is that it’s relatively simple and not like anything we’re building inside Microsoft.

I’m leaning towards the prediction market, as it sounds more fun to build and experiment with. What do you think?

Cut Out The Middle Man

Nick Malik is an architect in MSIT’s Enterprise Architecture group. He’s been blogging a while, though I only discovered his blog a couple of weeks ago. Yesterday he posted about OMG’s SOA SIG’s Draft RFI on EDA and it’s relationship to SOA and BPM. That’s a veritable alphabet soup of acronyms! To translate, the Object Management Group’sSpecial Interest Group on Service Oriented Architecture has posted a draft Request for Information on Event Driven Architecture and it’s relationship to Service Oriented Architecture and Business Process Management. Here’s the summary from the actual document:

The EDA Sub-group of the OMG SOA SIG seeks information from members of the EDA, BPM and SOA community as well as anyone interested in promoting standards in this area. Requested information will be evaluated by the EDA Sub-group, resulting in the development of Requests for Proposal(s) (RFP) for standardization of Event definition, relationship between EDA, BPM and SOA that will ultimately allow development of standards for Complete Life Cycle of Events -Ontology of Events, Sense and Respond Services, Events Metrics and processing of complex events. Please note that it is our intent to develop modeling standards for the EDA/SOA and EDA-Business Process interaction and provide standards for the implementation of that interaction as well.

First off, I’m a bit wary about this part: “it is our intent to develop modeling standards”. Of course, OMG is responsible for UML and long time readers should be well aware of my opinion of UML. I don’t want to set the bozo bit on an entire organization, but I am skeptical that any new modeling “standards” from OMG will be any more effective than the Unwanted Modeling Language.

Secondly, EDA seems to be vaguely defined, if at all. Wikipedia has this to say about EDA:

An event-driven architecture (EDA) defines a methodology for designing and implementing applications and systems in which events transmit between loosely coupled software components and services. An event-driven system is typically comprised of event consumers and event producers. Event consumers subscribe to an intermediary event manager, and event producers publish to this manager. When the event manager receives an event from a producer, the manager forwards the event to the consumer. If the consumer is unavailable, the manager can store the event and try to forward it later. This method of event transmission is referred to in message-based systems as store and forward.
[emphasis in original]

Assuming events are encoded as messages, then you can rewrite “event consumers / event producers” as “message receivers / message senders” and you really blur the line with SOA?

The big difference in EDA seems to be the use of an “intermediary event manager”. The problem I have with this approach is that the “intermediary event manager” works fine if you have a small number of endpoints, but how will it scale to handle hundreds of systems? Thousands? Tens of thousands? I don’t see how the centralized event manager approach can possibly scale to handle tens of millions of events delivered between tens of thousands of systems. The management of such a system would be a nightmare? If a business process went south, you would obviously look in the central event manager as the source of the problem, but I would think that would be like finding a needle in a haystack. You could federate the event managers, instead of attempting to scale out the center. But a federated event manager approach would seem to defeat much of the purpose of an EDA in the first place. If you’re going to federate your event managers, why not cut out the middle man and make each event producer it’s own event manager as well? What is the benefit of separating these capabilities?

I guess fleshing out EDA isn’t a bad idea, but it seems more like a solution looking for a problem to me.

Business Processes Are Services Too

I’ve been having a conversation with Piyush Pant over on his blog that started as a comment he left on my Services Aren’t Stateless post. He thinks that I’m “missing the crucial point here by implicitly conflating business process and service state”. While Piyush hasn’t really defined what he means by these terms, I think I understand what he’s getting at. Yes, process and service state are different in many ways, but they are also similar in that they are both service private data.

Pat Helland (side note – I wish Pat would start blogging again) wrote an article some time ago titled Data on the Outside vs. Data on the Inside where he talked about the differences between service private data and data in the space between the services. For example, data on the outside is immutable, requires an open schema for interop, doesn’t need encapsulation and is representable in XML. Service private data is not immutable, doesn’t need an open schema for interop, requires encapsulation and is typically stored in a SQL RDBMS. So on this front, process and service state are both service private data so conflating them makes some sense.

However, what’s not in the article is the idea of Resource and Activity data. Not sure why Pat didn’t include this in the article, but he was talking about it as far back as PDC 2003. Stu Charlton described the difference between resource and activity data in his Autonomous Services article:

Activity Data – This is “work in progress” data for any long-running business operation, and is usually encapsulated by business logic. A classic example is a shopping cart in any e-commerce system. This data is mutable, but typically has low concurrency conflicts, as it is not widely shared. Typically activity data retires after a long running operation completes, and may be archived in a decision support system for later analysis.

Resource Data – This is “state of the business” data, which represents the resources of an organization, and is usually encapsulated by business logic. Examples are: room availability in a hotel, inventory levels in a warehouse, account statuses, employee and customer information. Some resources have a small life span, others may last a very long time (years). Resource data is usually volatile with potential for high concurrency conflicts.

So I’m fairly sure that when Piyush says “process state” I should hear “activity data”. Similarly “service state” is “resource data”. The differences between activity and resource data lead to some interesting implementation artifacts, which I assume he getting at when he says I’m conflating the two. For example, since activity data like shopping cart has low or no concurrency issues, using an optimistic concurrency scheme is entirely appropriate, which you would never use for highly volatile resource data like warehouse inventory levels. In fact, since activity data doesn’t have concurrency issues, you could even store it inside an instance of workflow or orchestration, which gets serialized to a persistent store when it’s in an idle state.

However, the fact that activity and resource data is handled differently doesn’t mean that most services won’t have activity data. When Thomas Erl says that that stateless services is a “common principle of service orientation”, essentially what I think he’s saying that services should only have resource data. And as I said before, this seems wrong to me. Sure, some services will be stateless. But all services? Services implement business capabilities. Most business capabilities are long running processes. Doesn’t that imply that most services in the enterprise will need to be long running workflows or orchestrations?

So for the most part, Piyush and I just seem to have different names for the same concepts. The one issue I have with Piyush’s descricription of process and service state is that he seems to implicitly assume that processes aren’t services. Why not? Again, not all services will be processes, but if you’re not exposing processes as services, how exactly are you exposing them?