SOA Sample Scenario

So now that I’m back, we’re beginning work in earnest on my project. For those not following along at home, my project is to deliver shared service oriented infrastructure for Microsoft’s internal IT department. We’ve spent the time since I moved over working on our business justification, and now we’re moving into specifications and prototyping. That is, I get to start getting my hands dirty.

As part of the prototyping efforts, we’re looking to build some sample business services on top of our prototype infrastructure. The idea is both to illustrate what we’re building and to have a playground where we can experiment with new ideas. During the prototyping, I’ll be pretty involved with the development. Once we start writing production code, the dev team will take that over, but I will continue to own the service playground. I’ve been kicking a playground idea like this around for several years, so I’m pretty excited about it.

The question is, what kind of business scenario should we build? I want the sample business services to be something interesting and real-world-esque. But of course, it can’t be too complex since I wasn’t hired to build a playground as my primary job function.

So far, we have two primary ideas:

  • Enterprise Management System: AdventureWorks is the primary sample database that ships with SQL Server. It has business scenarios around Sales & Marketing, Product Management, Purchasing and Manufacturing. This sounds suspiciously like an ERP/CRM/SFA type enterprise system. On the plus side, MS is an enterprise, so things like ERP/CRM/SFA are the types of solutions we need/use/buy/build internally. On the negative side, it’s complex to build realistically, and teams that actually do ERP/CRM/SFA inside Microsoft might dismiss the infrastructure if the playground isn’t real-world enough.
  • Prediction Market: If you’ve ever seen Hollywood Stock Exchange (HSX) or Tradesports, those are prediction markets. The basic idea is that you trade on predictions, rather than on companies like you would in a stock market. Using HSX as an example, you get 2 million “Hollywood Dollars” (i.e. play money) to invest in upcoming movies and/or movie stars. Those movies and stars pay out based on the money they make at the box office. They also have derivatives for opening weekend, blockbusters and the Oscars. HSX even sells forecasting and prediction services based on the HSX market. Of course, I can’t build something quite so extensive, but we could get pretty far with this idea. The upside is that it’s relatively simple (compared to enterprise management systems) and there’s little conflict with existing systems inside Microsoft. The downside is also that it’s relatively simple, and not like anything we’re building inside Microsoft.

I’m leaning towards the prediction market, as it sounds more fun to build and experiment with. What do you think?

Cut Out The Middle Man

Nick Malik is an architect in MSIT’s Enterprise Architecture group. He’s been blogging a while, though I only discovered his blog a couple of weeks ago. Yesterday he posted about OMG’s SOA SIG’s Draft RFI on EDA and its relationship to SOA and BPM. That’s a veritable alphabet soup of acronyms! To translate, the Object Management Group’s Special Interest Group on Service Oriented Architecture has posted a draft Request for Information on Event Driven Architecture and its relationship to Service Oriented Architecture and Business Process Management. Here’s the summary from the actual document:

The EDA Sub-group of the OMG SOA SIG seeks information from members of the EDA, BPM and SOA community as well as anyone interested in promoting standards in this area. Requested information will be evaluated by the EDA Sub-group, resulting in the development of Requests for Proposal(s) (RFP) for standardization of Event definition, relationship between EDA, BPM and SOA that will ultimately allow development of standards for Complete Life Cycle of Events -Ontology of Events, Sense and Respond Services, Events Metrics and processing of complex events. Please note that it is our intent to develop modeling standards for the EDA/SOA and EDA-Business Process interaction and provide standards for the implementation of that interaction as well.

First off, I’m a bit wary about this part: “it is our intent to develop modeling standards”. Of course, OMG is responsible for UML and long time readers should be well aware of my opinion of UML. I don’t want to set the bozo bit on an entire organization, but I am skeptical that any new modeling “standards” from OMG will be any more effective than the Unwanted Modeling Language.

Secondly, EDA seems to be vaguely defined, if at all. Wikipedia has this to say about EDA:

An event-driven architecture (EDA) defines a methodology for designing and implementing applications and systems in which events transmit between loosely coupled software components and services. An event-driven system is typically comprised of event consumers and event producers. Event consumers subscribe to an intermediary event manager, and event producers publish to this manager. When the event manager receives an event from a producer, the manager forwards the event to the consumer. If the consumer is unavailable, the manager can store the event and try to forward it later. This method of event transmission is referred to in message-based systems as store and forward.
[emphasis in original]

Assuming events are encoded as messages, you can rewrite “event consumers / event producers” as “message receivers / message senders”, and you really blur the line with SOA.
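To make the quoted description a bit more concrete, here is a rough sketch of an intermediary event manager with store and forward. None of this comes from the OMG draft or from any real product: `EventManager`, `Consumer`, the `receive` method and the topic name are all made up for illustration, and everything lives in memory.

```python
from collections import defaultdict, deque


class Consumer:
    """Hypothetical event consumer that can be switched offline."""

    def __init__(self):
        self.available = True
        self.received = []

    def receive(self, event):
        # Return False to signal "unavailable, try again later".
        if not self.available:
            return False
        self.received.append(event)
        return True


class EventManager:
    """Hypothetical intermediary between producers and consumers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> consumers
        self._pending = defaultdict(deque)     # consumer -> undelivered events

    def subscribe(self, topic, consumer):
        self._subscribers[topic].append(consumer)

    def publish(self, topic, event):
        # Store first, then try to forward.
        for consumer in self._subscribers[topic]:
            self._pending[consumer].append(event)
            self.flush(consumer)

    def flush(self, consumer):
        # Forward stored events in order; stop at the first failed delivery.
        queue = self._pending[consumer]
        while queue:
            if not consumer.receive(queue[0]):
                break
            queue.popleft()


manager = EventManager()
billing = Consumer()
manager.subscribe("order.placed", billing)

billing.available = False
manager.publish("order.placed", {"order_id": 1})  # stored, not delivered

billing.available = True
manager.publish("order.placed", {"order_id": 2})  # both events forwarded in order
```

Note that the producer side is just a call to `publish`. Swap the word "event" for "message" and this looks a lot like any queued messaging setup.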

The big difference in EDA seems to be the use of an “intermediary event manager”. The problem I have with this approach is that the “intermediary event manager” works fine if you have a small number of endpoints, but how will it scale to handle hundreds of systems? Thousands? Tens of thousands? I don’t see how the centralized event manager approach can possibly scale to handle tens of millions of events delivered between tens of thousands of systems. The management of such a system would be a nightmare. If a business process went south, you would obviously look in the central event manager as the source of the problem, but I would think that would be like finding a needle in a haystack. You could federate the event managers, instead of attempting to scale out the center. But a federated event manager approach would seem to defeat much of the purpose of an EDA in the first place. If you’re going to federate your event managers, why not cut out the middle man and make each event producer its own event manager as well? What is the benefit of separating these capabilities?
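As a rough sketch of what "cutting out the middle man" could look like, here is a producer that keeps its own subscriber list and notifies consumers directly. `OrderService`, `place_order` and the event shape are hypothetical names invented for this example, and delivery is a plain in-process callback rather than a real message on the wire.

```python
class OrderService:
    """Hypothetical producer that acts as its own event manager."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        # Consumers register with the producer itself, not with a central hub.
        self._subscribers.append(callback)

    def place_order(self, order_id):
        event = {"type": "order.placed", "order_id": order_id}
        for notify in self._subscribers:
            notify(event)
        return event


orders = OrderService()
shipping_inbox = []
orders.subscribe(shipping_inbox.append)
orders.place_order(42)
```

When something goes wrong with an order event here, the only place to look is the service that produced it.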

I guess fleshing out EDA isn’t a bad idea, but it seems more like a solution looking for a problem to me.

Business Processes Are Services Too

I’ve been having a conversation with Piyush Pant over on his blog that started as a comment he left on my Services Aren’t Stateless post. He thinks that I’m “missing the crucial point here by implicitly conflating business process and service state”. While Piyush hasn’t really defined what he means by these terms, I think I understand what he’s getting at. Yes, process and service state are different in many ways, but they are also similar in that they are both service private data.

Pat Helland (side note – I wish Pat would start blogging again) wrote an article some time ago titled Data on the Outside vs. Data on the Inside where he talked about the differences between service private data and data in the space between the services. For example, data on the outside is immutable, requires an open schema for interop, doesn’t need encapsulation and is representable in XML. Service private data is not immutable, doesn’t need an open schema for interop, requires encapsulation and is typically stored in a SQL RDBMS. So on this front, process and service state are both service private data so conflating them makes some sense.

However, what’s not in the article is the idea of Resource and Activity data. Not sure why Pat didn’t include this in the article, but he was talking about it as far back as PDC 2003. Stu Charlton described the difference between resource and activity data in his Autonomous Services article:

Activity Data – This is “work in progress” data for any long-running business operation, and is usually encapsulated by business logic. A classic example is a shopping cart in any e-commerce system. This data is mutable, but typically has low concurrency conflicts, as it is not widely shared. Typically activity data retires after a long running operation completes, and may be archived in a decision support system for later analysis.

Resource Data – This is “state of the business” data, which represents the resources of an organization, and is usually encapsulated by business logic. Examples are: room availability in a hotel, inventory levels in a warehouse, account statuses, employee and customer information. Some resources have a small life span, others may last a very long time (years). Resource data is usually volatile with potential for high concurrency conflicts.

So I’m fairly sure that when Piyush says “process state” I should hear “activity data”. Similarly “service state” is “resource data”. The differences between activity and resource data lead to some interesting implementation artifacts, which I assume he’s getting at when he says I’m conflating the two. For example, since activity data like a shopping cart has low or no concurrency issues, using an optimistic concurrency scheme is entirely appropriate, which you would never use for highly volatile resource data like warehouse inventory levels. In fact, since activity data doesn’t have concurrency issues, you could even store it inside an instance of workflow or orchestration, which gets serialized to a persistent store when it’s in an idle state.
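For anyone who hasn’t run into optimistic concurrency, here is a minimal sketch of the idea applied to a shopping cart. It assumes a simple version number per cart. `CartStore`, `ConcurrencyConflict` and the in-memory dictionary are made up for illustration; a real service would more likely keep the version in a database column.

```python
class ConcurrencyConflict(Exception):
    """Raised when someone else saved the cart since we loaded it."""


class CartStore:
    """Hypothetical in-memory store for shopping cart (activity) data."""

    def __init__(self):
        self._rows = {}  # cart_id -> (version, items)

    def load(self, cart_id):
        version, items = self._rows.get(cart_id, (0, ()))
        return version, list(items)

    def save(self, cart_id, expected_version, items):
        # No lock is held between load and save. We only check, at write
        # time, that nobody else has written in the meantime.
        current_version, _ = self._rows.get(cart_id, (0, ()))
        if current_version != expected_version:
            raise ConcurrencyConflict(cart_id)
        new_version = current_version + 1
        self._rows[cart_id] = (new_version, tuple(items))
        return new_version


store = CartStore()
version, items = store.load("cart-1")
items.append("book")
version = store.save("cart-1", version, items)
```

Since a cart is rarely touched by more than one writer at a time, the conflict branch almost never fires and the caller can simply reload and retry when it does.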

However, the fact that activity and resource data are handled differently doesn’t mean that most services won’t have activity data. When Thomas Erl says that stateless services are a “common principle of service orientation”, essentially what I think he’s saying is that services should only have resource data. And as I said before, this seems wrong to me. Sure, some services will be stateless. But all services? Services implement business capabilities. Most business capabilities are long running processes. Doesn’t that imply that most services in the enterprise will need to be long running workflows or orchestrations?

So for the most part, Piyush and I just seem to have different names for the same concepts. The one issue I have with Piyush’s description of process and service state is that he seems to implicitly assume that processes aren’t services. Why not? Again, not all services will be processes, but if you’re not exposing processes as services, how exactly are you exposing them?

deVadoss Down on SOA

My old boss’s boss seems like he was in a downer mood yesterday. First, he blogged about the “Myth of Reuse in SOA”, then the “Achilles Heel of SOA”. Actually, truth be told, I agree with him on both counts.

I slam the door on the reuse argument every time it comes up in my new job. Actually, I slam the door on what I call “Naive Reuse”. When John talks about factoring for agility, he’s talking about a form of reuse, similar to how you “reuse” code when you refactor. What does it mean to refactor a service? How about refactoring your enterprise?

As for the Achilles Heel “data problem”, I think that’s an artifact of the prevailing stateless request/response mindset most people have about services that I touched on yesterday. I think Pat Helland described a very good approach for dealing with data in an SOA, but I haven’t seen it implemented broadly. Rest assured, many of the concepts Pat described are at the forefront of my thinking as my new project takes shape.

WCF Karma

Last fall, I was presenting to a group of architects about SOA. The previous speaker – Rich Turner – was running way late. As I walked in, he was doing a WCF demo and wanted to show how easy it was to change transport by changing the config file. He wanted to change it to run over named pipes, but he couldn’t remember the name of the binding. He asked me, and I confessed that I didn’t know either. So he gave up on demoing named pipes, finished his presentation and went on his way.

After he left, I confessed to the assembled architects that I knew *nothing* about WCF beyond the high-level concepts. I hadn’t spent any time working with it at all. In fact, the only reason I had it installed was because it got installed automatically when you installed WPF, which I was working with at the time. My reasoning, as I explained to them, was that WCF is a low-level abstraction. That is to say, WCF is nearer the bottom of the .NET Abstraction Pile than the top. I figured I’d let the people building the next generation of service-oriented infrastructure worry about WCF.

Fast forward eight months, and my new job is about building service-oriented infrastructure. You know, the type that builds on WCF. Maybe it’s karma, but I’m having to learn a lot about WCF right quick.

So as I get back into the blogging saddle, expect to see a bunch of stuff about WCF.

BTW, major thanks to Sam Gentile, who’s taken the time on email and the phone (on his vacation no less) to help talk some things thru. He suggested the WCF Hands On book, which is pretty good.