Software Factories at OOPSLA

Keith blogged about his hectic days since he returned from vacation. Getting a book published, preparing for a BillG review, you know all the usual stuff. 😄

I'm still waiting for my hardcopy of Software Factories, and Keith wrote that it should hit the shelves on Sept. 15th. This means it will be available in time for OOPSLA 04. I've blogged this before, but it's worth repeating that there is going to be quite an MS presence at OOPSLA this year:

You know I’m not going to miss all that. We’ll have coverage on Architecture Center during the event, and I will endeavor to get as much of this content as possible onto Architecture Center after the event. I think the tutorials in particular will be very interesting to the audience at large.

Anyone else going to OOPSLA?

Update: Added the three panel discussions. I’m guessing the J2EE/.NET shootout will be standing room only.

Presenting Software Factories

I presented Software Factories for the first time today and I think I did a pretty good job. Some architects from a new managed SI partner were in town and wanted to discuss modeling. They are (were?) an IBM partner, so they’re a big WebSphere shop. They’re also XDE users, so I laid out the Software Factories concept as well as the modeling tools that are coming in VS2005. They seemed pretty impressed. Of course, they’re having what I expect is a typical experience with UML tools – they use XDE for documentation and communication only (i.e. UmlAsSketch). They don’t even try to generate code from the models anymore.

To help explain the Factories concept, I used Steve Maine’s Efficiency/Precision/Generality modeling approach (and gave him credit for it, of course. Steve, I will find a way to properly thank you for that brain.save) as well as my own ideas about VB as a Software Factory. Both worked out very well to help communicate the goals of the Factories approach, though I could refine the delivery quite a bit. I also talked about the Evolving Frameworks Pattern Language, which Jack outlines in the JOURNAL factories article this way:

  • After developing a number of systems in a given problem domain, we identify a set of reusable abstractions for that domain, and then we document a set of patterns for using those abstractions.
  • We then develop a runtime, such as a framework or server, to codify the abstractions and patterns. This lets us build systems in the domain by instantiating, adapting, configuring, and assembling components defined by the runtime.
  • We then define a language and build tools that support the language, such as editors, compilers, and debuggers, to automate the assembly process. This helps us respond faster to changing requirements, since part of the implementation is generated, and can be easily changed.

The problem is that each of these steps is much, much harder than the one before it. Identifying problem domain abstractions and patterns is something most organizations already do, even if they don’t do it explicitly. Codifying those abstractions in a reusable framework is much harder, primarily because it’s hard to think through all the usage variations a framework may experience. Still, many companies have the skills to develop reusable frameworks. However, few companies have the language and tool development experience to make investing in custom tools cost-effective. For example, even though Gregor and Bobby have defined a language for integration patterns, the only tooling available is a Visio stencil.
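To make that middle step a little more concrete, here's a minimal, purely hypothetical sketch of what codifying a couple of pipes-and-filters style abstractions in a small framework might look like. Every name here is invented, and I'm using modern Java just to keep the sketches in this post in one language; the point is the shape of the thing, not the platform.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Function;
    import java.util.function.Predicate;

    // A toy "pipes and filters" framework: the kind of reusable abstraction
    // (filters, routing) an organization might codify after building several
    // integration systems in the same problem domain.
    public class PipelineSketch {

        // A filter transforms a message as it flows through the pipeline.
        interface Filter extends Function<String, String> {}

        static class Pipeline {
            private final List<Filter> steps = new ArrayList<>();

            // Fluent "assembly" of components defined by the runtime.
            Pipeline then(Filter step) {
                steps.add(step);
                return this;
            }

            // A content-based router expressed as configuration.
            Pipeline when(Predicate<String> test, Filter ifTrue, Filter ifFalse) {
                steps.add(msg -> test.test(msg) ? ifTrue.apply(msg) : ifFalse.apply(msg));
                return this;
            }

            String send(String message) {
                String current = message;
                for (Filter step : steps) {
                    current = step.apply(current);
                }
                return current;
            }
        }

        public static void main(String[] args) {
            // Instantiating, configuring, and assembling runtime components.
            Pipeline orders = new Pipeline()
                    .then(msg -> msg.trim())
                    .when(msg -> msg.startsWith("PRIORITY"),
                          msg -> "[expedite] " + msg,
                          msg -> "[standard] " + msg);

            System.out.println(orders.send("  PRIORITY order #42  "));
        }
    }

Getting this far is step two. Everything step three calls for (editors, compilers, and debuggers for a language of the domain) is still missing, and that's the part almost nobody builds.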

For Software Factories to work, we need to make it much easier to build domain specific languages and modeling environments. This is one of the big hurdles to adopting the factories approach today.

Zooming In From High Levels of Abstraction

Denny Figuerres left the following comment on my last post:

[O]ne of my “wish” items would be a kind of editor that would be a merge of flow-chart and text editor so that I could view my function as text or as a kind of zooming diagram with more details as I drilldown. like when you use a map program, first you see an area and major routes, rivers, lakes etc… and as you zoom-in you see a smaller area with more detail. at some point you are down to a single line of code that is an expression of some kind.

I think this is a great idea, and is completely in line with Software Factories. Denny, what you’re talking about is working at higher levels of abstraction. If you read the Software Factories article on TheServerSide.NET, Jack writes:

How…do we work at higher levels of abstraction? We use more abstract models, and move the platform closer to the models with either frameworks or transformations, as illustrated in Figure 4.

  • We can use a framework to implement higher level abstractions that appear in the models, and use the models to generate snippets of code at framework extension points. Conversely, the models help users complete the framework by visualizing framework concepts, and exposing its extension points in intuitive ways. A pattern language can be used instead of a framework, as described by Hohpe and Woolf. This requires the tool to generate the pattern implementations, in addition to the completion code. This approach is illustrated in Figure 4 (a).
  • Instead of a framework or pattern language, we can generate to a lower level DSL. We can also use more than two DSLs to span a wide gap, leading to progressive transformation, where models written using the highest level DSLs are transformed into executables through a series of refinements, as shown in Figure 4 (b).

I think we’ll see a combination of these two approaches. Obviously, as an industry, we have lots of experience building frameworks and I don’t see those going away anytime soon. The second approach, however, is much more fascinating as you get that “zooming” effect that Denny describes.
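To make the first of those two approaches a bit more tangible, here's a small hypothetical sketch of a framework extension point and the kind of completion code a modeling tool might generate into it, in the spirit of Figure 4 (a). The names are invented and it's plain Java; this is an illustration of the idea, not anything the VS2005 designers actually emit.

    // A hypothetical framework with an explicit extension point. In the
    // Figure 4 (a) style, a modeling tool would visualize the OrderProcess
    // concept and generate the completion code at the extension point.
    abstract class OrderProcess {

        // The framework owns the overall flow...
        public final String run(String order) {
            String validated = validate(order);
            return ship(validated);
        }

        private String validate(String order) {
            return order.trim();
        }

        // ...and the extension point is where model-driven completion code lands.
        protected abstract String ship(String validatedOrder);
    }

    // What a tool might generate from a model element like "ship via ground
    // freight". Entirely invented for illustration.
    class GeneratedGroundShipping extends OrderProcess {
        @Override
        protected String ship(String validatedOrder) {
            return "ground-freight:" + validatedOrder;
        }
    }

    public class ExtensionPointDemo {
        public static void main(String[] args) {
            OrderProcess process = new GeneratedGroundShipping();
            System.out.println(process.run("  order #7  "));
        }
    }

The generated subclass is the "snippet of code at a framework extension point" from the quote above; the model is what visualizes the framework's concepts and exposes that extension point in an intuitive way.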

One thing to note about this zooming approach of using higher level abstractions – different systems use different abstractions. The abstractions you use to build an ERP system are not the same as the ones you would use to build a telephone billing system. So the view of the top-level abstractions will be very different, even if they both end up implemented on the same platform using the same programming language.
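A trivial, invented illustration of that point: the two classes below are what "top level" abstractions from an ERP-ish domain and a billing-ish domain might boil down to. The views a factory would present for them are completely different, yet both are ordinary classes on the same platform.

    // Two invented, domain-specific abstractions. The ERP-flavored one thinks
    // in purchase orders and approval limits; the billing-flavored one thinks
    // in calls and rating rules. Both are just plain classes underneath.
    class PurchaseOrderApproval {
        boolean approve(double amount, double managerLimit) {
            return amount <= managerLimit;
        }
    }

    class CallRatingRule {
        double rate(int seconds, double perMinute) {
            return Math.ceil(seconds / 60.0) * perMinute;
        }
    }

    public class DomainAbstractions {
        public static void main(String[] args) {
            System.out.println(new PurchaseOrderApproval().approve(900.0, 1000.0));
            System.out.println(new CallRatingRule().rate(95, 0.10));
        }
    }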

The Most Popular Modeling Environment Ever (So Far)

Steve’s post on “the modeling problem” hits the nail on the head. We’re all familiar with the concept of “fast, good, cheap – pick two”. Steve breaks down modeling into “general, precise, efficient – pick two (and favor one)”. Furthermore, you can’t have a language that is both general and precise. UML takes what Steve calls the “Favor efficiency, accept generality and compromise precision” approach:

The UML metamodel is flexible enough to allow it to describe virtually any system out there. However, from a formal semantic perspective, the resultant model is gooey and formless which makes it very difficult to compile into anything useful. At best, we can get some approximation of the underlying system via codegen, but even the best UML tools only generate a fraction of the code required to fully realize the model. The lack of precision within the model itself requires operating in both the model domain and the system domain, and implies that some facility exist to synchronize the two. Thus, the imprecision of UML forces us to solve the round-tripping/decompilation problem with 100% fidelity, which is generally difficult to do.

Software Factories, on the other hand, takes what he calls the “Favor efficiency, accept precision, and compromise generality” approach:

This, I think, is the sweet spot for Microsoft’s vision of Software Factories. Here’s why: the classic problem faced by modeling languages is Turing equivalency. How do you model a language that is Turing-complete in one that’s not without sacrificing something? The answer is: you don’t. You can either make the modeling language itself Turing-complete (which sacrifices efficiency) or you can limit the scope of the problem by confining yourself to modeling only a specific subset of the things that can be expressed in the underlying system domain. Within that subset, it might be possible to model things extremely precisely, but that precision can only be gained by first throwing out the idea that you’re going to be able to efficiently and precisely model everything.
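Steve's point about limiting scope is easier to see with a toy example. The sketch below treats a tiny, deliberately non-Turing-complete "model" (just a list of state transitions) as the modeling language. Because the model can express so little, a generator can produce a complete, precise implementation from it, with no round-tripping against hand-written code. The names are invented and the Java is modern; it's a sketch of the trade-off, not a real tool.

    import java.util.List;

    // A deliberately narrow, non-Turing-complete "model": a list of state
    // transitions. Because the language is so constrained, a tool can generate
    // a complete implementation from it.
    public class TransitionModelDemo {

        record Transition(String from, String event, String to) {}

        // Generate a complete Java method from the model.
        static String generate(List<Transition> model) {
            StringBuilder code = new StringBuilder("String next(String state, String event) {\n");
            for (Transition t : model) {
                code.append("    if (state.equals(\"").append(t.from())
                    .append("\") && event.equals(\"").append(t.event())
                    .append("\")) return \"").append(t.to()).append("\";\n");
            }
            code.append("    throw new IllegalStateException(state + \"/\" + event);\n}\n");
            return code.toString();
        }

        public static void main(String[] args) {
            List<Transition> orderLifecycle = List.of(
                    new Transition("Draft", "submit", "PendingApproval"),
                    new Transition("PendingApproval", "approve", "Approved"),
                    new Transition("PendingApproval", "reject", "Draft"));

            System.out.print(generate(orderLifecycle));
        }
    }

Try to express a loop or an arbitrary computation in that transition list and you can't; that's the generality being given up in exchange for precision and efficiency.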

When describing Software Factories, I have two analogies that I use to explain the idea. The first is the “houses in my neighborhood” example I blogged before. That does a good job describing economies of scope, but doesn’t really cover the modeling aspect of software factories. Talking about how you model cars or skyscrapers doesn’t really capture the essence of software modeling – you don’t generate the construction plans from a scale model of a skyscraper. However, it turns out that all developers have at least a passing familiarity with my second analogy: Visual Basic, the most popular DSL and modeling tool of all time (so far).

The original Visual Basic was a rudimentary software factory for building “form-based Windows apps”. (Today, VB.NET has been generalized to support more problem domains.) Like the factory approach that Steve describes, VB was very efficient and sufficiently precise, yet not particularly general (especially in the early years). There were entire domains of problems that you couldn’t build VB apps to solve. Yet, within its target problem domains, VB was massively productive, because it provided both a domain-specific language (DSL) and a modeling environment for that domain.

A DSL incorporates higher-order abstractions from a specific problem domain. In the case of VB, abstractions such as Form, Control, and Event were incorporated directly into the language. This allowed developers to directly manipulate the relevant abstractions of the problem domain. Abstractions extraneous to the problem domain, such as pointers and objects in this case, were excluded, simplifying the language immensely. Both of these led directly to productivity improvements while limiting the scope of the DSL to a particular problem domain.

In his post, Steve makes the point that it’s pointless to distinguish between modeling and programming languages. VB certainly blurred that line to the point of indistinguishability. Regardless, graphical languages are typically more compelling and productive than textual ones. It’s hard to argue with the productivity that the VB form designer brought to the industry. Dragging and dropping controls to position them, double-clicking on them to associate event handlers, changing properties in drop-down boxes – these idioms have been so widely implemented that essentially all UI platforms now provide a drag-and-drop modeler. It’s such a great design that 10 years later, UI modelers are essentially unchanged.
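For anyone who never lived through the VB days, here's a rough analogue of those abstractions in Swing (plain Java again, nothing VB-specific): a Form, a couple of Controls, and an Event handler. In VB the designer generated and hid most of this plumbing; dragging a control onto the form and double-clicking it stood in for the code below.

    import javax.swing.JButton;
    import javax.swing.JFrame;
    import javax.swing.JLabel;
    import javax.swing.SwingUtilities;

    public class FormAnalogy {
        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> {
                JFrame form = new JFrame("Form1");         // the "Form" abstraction
                JLabel label = new JLabel("Ready");        // a "Control"
                JButton button = new JButton("Click Me");  // another "Control"

                // The "Event" abstraction: what double-clicking the button
                // in a designer would scaffold for you.
                button.addActionListener(e -> label.setText("Button1_Click fired"));

                form.getContentPane().add(label, java.awt.BorderLayout.NORTH);
                form.getContentPane().add(button, java.awt.BorderLayout.SOUTH);
                form.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                form.pack();
                form.setVisible(true);
            });
        }
    }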

Once you realize that VB’s DSL and modeling environment was a rudimentary software factory, you realize that the Software Factories methodology is about generalizing what VB accomplished – building tools that achieve large gains in efficiency by limiting generality. Since each of these tools focuses on a limited problem domain, you need different tools for different problem domains. The problem is that while building apps with VB may be easy, building VB itself was not. Most enterprises have the expertise to develop abstractions in their domain of expertise and to codify those abstractions in frameworks, but very few can develop tools and DSLs for manipulating those frameworks. One of the goals of Software Factories (and VSTS Architect, for that matter) is to make it easier to build tools that are really good at building a narrow range of applications.

It’s important to note that the term “narrow range” is relative. Darrell seems to think narrow range only means vertical-market applications that don’t “solve new and interesting problems”. It’s true that the narrower the range, the more productive the tool can be. But VB shows us that you can achieve large productivity gains while solving new and interesting problems, even in broad problem domains.

SDM Whitepaper

Now that I’m back from vacation, I’ve got my VS2005 Beta 1 VPCs up and running. I have two – one for Express (VC#/VB/VWD) and one for Enterprise Architect. I installed Express because I wanted to see how realistic the Express SKUs are as day-to-day dev tools. So far, I’m pretty impressed. I’m going to use VC# and VWD to build a distributed genealogy data management system with my cousin Dave and my dad. Genealogy is a pretty interesting problem domain that touches on many SOA data management issues, such as ownership, publication of reference data, and federated identity, so I think it will be pretty cool.

Of course, I also installed the full-blown VS2005, primarily to get access to the new modeling tools (i.e. Whitehorse). BTW, there’s a relatively new whitepaper available on the System Definition Model (i.e. the meta-model that the Whitehorse application, data center, and deployment designers are based on). It was published in late April, when I was too busy with TechEd prep to notice. The whitepaper describes the SDM meta-model, the SDM core models, and partnership opportunities.