Abstraction Gap Leapfrog

One of the cool things about having a blogging conversation with someone on the other side of the world is that while you sleep they are thinking of a good response to your post. The only downside? Having to deal with rampant misspelling like “artefacts”. 😄

Anyway, Gareth responds to my post:

Until we get models that are perfectly aligned with our business domains, we’ll have people who want to create models but who get them slightly wrong from a precision point of view – usually in the places where the imperfect models interact with other aspects of the system across or down the abstraction stack.

With code, you’d likely not want to have people check in sources that don’t even compile and then hand them off to other folks who do make them compile, but I think that’s exactly the type of process we’ll see emerging in modelling for a while. I feel this way because I don’t foresee us getting modelling languages of pure business intent 100% right for some time yet – we’re simply not close enough to formal enough descriptions of systems as intensely human as a business yet. However, I hope we won’t want to try and keep modelling as locked away with the techies as traditional development has been. (Hope I’m not talking myself out of a job here…)
[Pseudomodels and intent]

I keep saying incomplete and Gareth keeps saying imprecise, but I think we can both agree on the term “imperfect”. There’s a massive difference between a precise language that is imperfect and a language that is inherently imprecise, like UML.

However, I think the primary disconnect here has to do with Gareth’s and my views on how more highly abstracted languages will evolve. Gareth’s comments about modeling “pure business intent”, having “models that are perfectly aligned with our business domains” and not “keep[ing] modelling as locked away with the techies” imply to me that Gareth wants to work down from the high level business abstractions into implementable technical abstractions. Frankly, I don’t think that’s very likely. Leapfrogging a few levels of abstraction hasn’t worked well in the past (CASE and UML/MDA) and I don’t think it will work well now.

I find it much more likely that we will build higher level abstractions directly on top of existing abstractions. Again, this is similar to the way C++ built on C, which in turn built on ASM. Sure, that could keep modeling “locked away with the techies” for a while, but we’re already beginning to see the light at the end of that tunnel. Windows Workflow Foundation is a significant leap in abstraction while also being something that non-techies can use. There are reports about SharePoint “12” embedding the WF engine and FrontPage “12” providing a Workflow Designer for building SharePoint workflows. While I imagine these WF tools are targeting the “power user” (I haven’t used any of the new Office “12” suite, so this is pure conjecture), they certainly aren’t only for developers.

Believe me, I would love to be wrong about this. I would much rather work down from business user intent than up from the technical foundation. I just don’t think it’s feasible. The process Gareth describes breaks the “Model Transformation must be Deterministic” tenet of Code is Model, though the word “must” may be too strong to allow for language evolution.

Imprecise vs. Incomplete

Gareth responds to the first tenet of Code is Model:

[A]lthough as an industry we desperately need to drag models kicking and screaming from the far left of pretty-picturedom a good long way to the right in the direction of precision, I don’t want to throw the baby out with the bathwater.

I’m going to take it as a given that folks believe that precise models are valuable development artefacts. Why do I think imprecise models are also valuable? Here are three things that tools for imprecise models help you to do:

  1. Communicate with people about design
  2. Think out loud in a way that’s more shareable than your whiteboard
  3. Start with an imprecise model and progress gradually toward precise models

Hopefully the first of these is obvious – there is value in model as communication device – it’s just not enormous.  I’ve talked before about the value of the second – I draw pictures on my whiteboard and when I’m on a conference call to Redmond, they’re effectively useless.

The third is something I’ve only recently become a convert to. I’m happy to have models which are not precise so long as I can still reason about them programmatically. This allows me to have development processes that are about a quantitative process of iterative refinement.

Here’s an example – in some infrastructure modelling tool, I have a node type which specifies a logical machine group. One of its properties is the number of actual machines required to suit the proposed scale of the application to be deployed. I’d like to be able to put “Depends on outcome of Fred’s scalability investigation” into that numeric field, or perhaps “4->8”. I can still generate a pretty good report from this model, but I can’t really provision a set of physical servers from it.

But here’s the kicker – it’s vital that I can write tools that programmatically assess this model and tell me what work needs to be done in order to make it precise.  I want to know exactly what must be done on this model before, for example, it is suitable for feeding into some kind of provisioning tool.  You might say it needs to be precisely imprecise; I prefer to think of it as quantifiable imprecision.

[Imprecise Models and Killing Hippies]

The only thing I disagree with Gareth about is terminology. Like the term architecture, model has become a catch-all for things that aren’t code. Regular readers of this blog know I like to be more precise than that. As such, I think items #1 and #2 from Gareth’s list aren’t actually models at all. I think of them as pseudomodels, similar to the concept of pseudocode. Actually, I like the name pseudomodel – it also applies well to Grady’s scaffolding. Like pseudocode, pseudomodels have tons of value in communicating and reasoning about a problem, but they can’t be used as development artifacts.

As for the third, I think what Gareth is describing is an incomplete model, rather than an imprecise one. If the model is imprecise, there’s no way to programmatically reason about it. But if we look to code as an example, obviously, there are many cases where code is incomplete. Every compiler error you’ve ever seen is an example of incomplete code. And because the language is precisely specified, the compiler can tell you what needs to be done in order to make it precise, exactly as Gareth requested. I don’t think of writing code as “progressing gradually towards precision” and I doubt anyone else does either. And while I do see development as an “iterative process”, I don’t think of it as “iterative refinement”. Modeling shouldn’t be any different.
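To make the compiler analogy concrete, here is a minimal sketch of Gareth’s machine-group example treated as an incomplete model in a precisely specified language. Everything in it is an assumption made for illustration: the MachineGroup type, the machine_count field and the outstanding_work function are hypothetical names, not part of any real modelling tool. The point is only that a tool can list exactly which fields still need settling before the model could be handed to a provisioning step.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class MachineGroup:
    """Hypothetical node type: a logical machine group in an infrastructure model."""
    name: str
    # Either a settled machine count, or a free-text placeholder such as "4->8".
    machine_count: Union[int, str]


def outstanding_work(groups):
    """Return one message per field that must be settled before provisioning."""
    problems = []
    for group in groups:
        count = group.machine_count
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int):
            problems.append(f"{group.name}: machine_count is unresolved ({count!r})")
        elif count < 1:
            problems.append(f"{group.name}: machine_count must be at least 1")
    return problems


model = [
    MachineGroup("web", 4),
    MachineGroup("db", "4->8"),
    MachineGroup("cache", "Depends on outcome of Fred's scalability investigation"),
]

# Much like compiler errors, each message names one thing left to finish.
for message in outstanding_work(model):
    print(message)
```

An empty result plays the role of a clean compile: only then would the model be complete enough to feed to a provisioning tool.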

One area where I do see refinement being critical is in the development of the modeling language itself. Traditionally, the language stays stable while the program written with it changes. But with the introduction of DSLs, it becomes possible for both to vary independently. I would assume that a DSL would evolve over time to have better “coverage” of a given domain. For example, if I was building a CAB DSL, I would implement support for WorkItem right off the bat, but supporting WorkItemExtension would be much lower on the priority list. This represents language refinement, but I would argue it’s a refinement of coverage not a refinement of precision.

Scaffolding Isn’t a Model

Grady Booch at the OOPSLA 05 Structured Design and Modern Software Practices Panel (according to the transcript):

The most important part is the executable code. I typically throw my models away, but I always save my source code. I design because I need abstractions to help me reason out my projects.

Grady Booch ranting on his blog:

It’s sad how one can be misquoted and then for that misquote to be picked up by someone else with both then making a spin of the events to support their position. How silly is that.

Juha-Pekka Tovanen quoted me from my OOPSLA panel as saying that “when the project gets closer to the delivery you normally throw away UML models.”

<snip>

Let me be excruciatingly clear: Over the years I have been consistent in saying that in a) the most important artifact of any software development organization is executable code and yet b) modeling is essential in constructing such executables. This is because c) models help us reason about, specify, construct, and document software-intensive systems at levels of abstraction that transcend source code (and the UML is the accepted open standard for doing so). That being said, it is a pragmatic reality that d) some models are essential (and should be retained) while others are simply scaffolding (and should be discarded). I have never said and do not say now that one should throw away all models, as Juha-Pekka then Harry then Steve imply.

First off, Grady claiming to be misquoted is pretty disingenuous. Juha-Pekka’s account of what Grady said on the panel is pretty spot on. Furthermore, Grady claiming that I implied he said “throw away all models” is also disingenuous. I specifically wrote that I thought Grady was being taken out of context:

I’ve gotta believe that this comment was somehow taken out of context and that the Grand Poobah of the Common Semantic Model doesn’t actually believe that tossing the model at the end of the project is a good thing.

But he-said/she-said nitpicking aside, where’s the guidance on which models are essential and which are “simply scaffolding”? Last time I checked the UML spec, none of the models are labeled as disposable. How about Rational Rose? Any dialog boxes that pop up reading: “Don’t worry about keeping this model up to date, it’s just scaffolding”? I don’t think so.

Obviously, working at a higher level of abstraction helps you reason about a project. But reasoning at high levels of abstraction doesn’t mean you’re modeling. When a building architect sits down with a prospective customer and a sketchpad, they may be working out great ideas, but no one is going to call the result a blueprint. Grady’s scaffolding “models” break nearly every tenet of Code is Model. Scaffolding isn’t precise or deterministic. And if it ends up in the recycle bin, I guess it’s not intrinsic to the development process.

Actually, it’s good that scaffolding isn’t a model. It means Grady is specifically not suggesting to throw away models. He just needs to get his terminology right.

As for Grady’s request for “just one DSM out there in production”, Don lists a few: Workflow languages (XLANG and WF), Business Rules Engine Languages and Build languages (MSBuild, Ant and NAnt). Juha-Pekka pointed to this list of DSM case studies. I’d also add UI Designers such as the Windows Forms and ASP.NET designers in Visual Studio. In the Windows Forms case, the code is stored in a separate file (yay partial classes) and is specifically marked: “do not modify the contents of this method with the code editor.” In the ASP.NET case, the code for an ASPX file isn’t even generated until runtime. And how about HTML itself? I’m thinking HTML qualifies as “in production”?

JavaScript Utilities

Can you tell it’s a slow day in the office? 😄

Speaking of raising the level of abstraction as well as browser based applications, check out Nikhil’s JavaScript Utilities project:

The project introduces the notion of .jsx (extended JavaScript) and .jsa (JavaScript assembly) files. JSX files provide the ability to include conditional code via familiar preprocessor directives such as #if, #else, #endif and so on…The tool processes these directives in order to produce a standard .js file. JSA files build on top of .jsx files. While they can include normal JavaScript and preprocessor directives, they are primarily there for including individual .jsx and .js files via #include directives. This allows you to split your script into more manageable individual chunks.

Now, that’s not raising the level of abstraction much, but here’s another example of working in a more highly abstracted environment (jsx and jsa) and then compiling down to something the underlying platform can execute (js). Nikhil provides three ways of doing this compilation:

  1. A set of standalone tools that output standard .js files that you can then deploy to your web site. Command line parameters are used to control the behavior of the tools.
  2. A couple of IHttpHandler implementations that allow you to dynamically convert your code into standard .js files. This is the mode where you can benefit from implementing per-browser code in a conditional manner. AppSetting values in configuration are used to control the conversion behavior.
  3. As a script project in VS using an msbuild project along with an msbuild task that comes along with the project. MSBuild project properties are used to control the conversion behavior.
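To give a feel for what the conditional-code pass involves, here is a rough sketch of #if / #else / #endif handling. It is not Nikhil’s implementation and makes no claim about how his tools behave: the preprocess function, the single-symbol form of #if and the IE symbol are all assumptions made for illustration, and #include handling for .jsa files is left out entirely.

```python
def preprocess(source, defined):
    """Resolve #if / #else / #endif lines against a set of defined symbols.

    Assumes `#if NAME` tests a single symbol. Nested blocks are supported.
    """
    output = []
    stack = []  # one [parent_active, condition_true, in_else] entry per open #if

    def active():
        """True when lines at the current nesting depth should be emitted."""
        if not stack:
            return True
        parent_active, condition_true, in_else = stack[-1]
        return parent_active and (condition_true != in_else)

    for line in source.splitlines():
        words = line.strip().split()
        directive = words[0] if words else ""
        if directive == "#if":
            if len(words) != 2:
                raise SyntaxError("#if expects exactly one symbol")
            stack.append([active(), words[1] in defined, False])
        elif directive == "#else":
            if not stack or stack[-1][2]:
                raise SyntaxError("#else without a matching #if")
            stack[-1][2] = True
        elif directive == "#endif":
            if not stack:
                raise SyntaxError("#endif without a matching #if")
            stack.pop()
        elif active():
            output.append(line)
    if stack:
        raise SyntaxError("missing #endif")
    return "\n".join(output)


# Hypothetical .jsx input with per-browser code.
jsx = """\
function attach(el, handler) {
#if IE
    el.attachEvent("onclick", handler);
#else
    el.addEventListener("click", handler, false);
#endif
}"""

print(preprocess(jsx, {"IE"}))
```

Calling preprocess(jsx, set()) instead keeps the addEventListener branch, which is the per-browser switch the IHttpHandler mode is described as enabling.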

If you’re going to raise the level of abstraction to implement a preprocessor, you could also go all the way and implement an entirely new language that gets compiled down to JavaScript for execution in the browser. For example, I’m not as familiar or comfortable with JavaScript’s prototype based approach. But I could imagine a more class based language that compiles to JavaScript. That’s the same way early C++ compilers worked – they were a preprocessor pass that converted the C++ into C, which could then be compiled with traditional C compilers.

I wonder what JavaScript++ would look like.
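As a very rough guess at the back end of such a language, here is a sketch that takes a class description and emits the prototype based JavaScript a browser can already run. Nothing here is a real tool: emit_class, its parameters and the Point example are hypothetical, and a real “JavaScript++” would need a parser and far more than this one translation step.

```python
def emit_class(name, fields, methods):
    """Emit prototype based JavaScript for a hypothetical class description.

    fields  -- constructor parameter names, each stored on `this`
    methods -- mapping of method name to a one-line JavaScript function body
    """
    lines = [f"function {name}({', '.join(fields)}) {{"]
    lines += [f"    this.{field} = {field};" for field in fields]
    lines.append("}")
    for method, body in methods.items():
        # Each method becomes a function hung off the constructor's prototype.
        lines.append(f"{name}.prototype.{method} = function () {{")
        lines.append(f"    {body}")
        lines.append("};")
    return "\n".join(lines)


print(emit_class(
    "Point",
    ["x", "y"],
    {"norm": "return Math.sqrt(this.x * this.x + this.y * this.y);"},
))
```

The author of the class never touches the prototype chain directly, the same way an early C++ programmer never had to read the generated C.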

As Simple as Possible, But No Simpler

Chris Bilson left the following comment on my Thoughts on CAB post:

I hesitate to agree that raising the abstraction level of tools is a good idea. That’s just hiding the complexity that’s already there inside of more complexity. If you ever need to look under the hood, it’s even harder to grok. I think it would be better to go the other way. Try removing stuff to combat complexity.

Given that the software industry has been raising the level of abstraction of tools since the start, I found this comment surprising. Assuming Chris doesn’t write in assembly code, he’s leveraging something at a higher level of abstraction that’s “just hiding the complexity that’s already there”. As I wrote in Code is Model:

The only code that the CPU can understand is machine code. But nobody wants to write and debug all their code using 0’s and 1’s. So we move up a level of abstraction and use a language that humans can read more easily and that can be automatically translated (i.e. compiled) into machine code the CPU can execute. The simplest step above machine code is assembly language. But ASM isn’t particularly productive to work with, so the industry has continuously raised the level of abstraction in the languages they use. C is a higher level of abstraction than ASM, adding concepts like types and functions. C++ is a higher level of abstraction than C, adding concepts like classes and inheritance. Each of these levels of abstraction presents a new model of the execution environment with new features that make programming more productive (and sometimes more portable).
[emphasis added]

I feel like Chris has mischaracterized what I wrote. Here it is again:

If you can’t lower the complexity of your framework, it’s time to raise the abstraction of your tools.

Note I’m not advocating raising the level of abstraction for abstraction’s sake. Believe me, I hate overly complex code. The project I’ve been on (and can blog about soon, I hope) had a ton of overly complex code that wasn’t particularly germane to solving the problem. I yanked all that code and have reduced the framework to a quarter of its original size while adding functionality. But the reality is, simplification can’t always be achieved by removing code. To paraphrase Albert Einstein, solutions to problems should be as simple as possible, but no simpler.

CAB is the simplest solution to the problem it addresses, and no simpler. Since we can’t make CAB simpler and still solve the problem, the only alternative we have is to have the tools hide that complexity. Given how well this has worked in the past, I see no reason why it can’t work again in the future.