Evolving Language Fidelity

I haven’t seen much in the way of response to my Hi-Fi Models post, but I did come across this great article by Ted Neward on the history of the tumultuous marriage of objects and relational databases, primarily in the context of LINQ. In the context of Code is Model, the following passage from the summary was the most interesting:

While Project LINQ doesn’t purport to be the “final answer” to all of the world’s object-relational mismatch problems, it does represent a significant shift in direction to solving the problem; instead of taking an approach that centers around code generation, or automated mapping based around metadata and type inference, both of which are exercises in slaving the relational model to an object-oriented one, Project LINQ instead chooses to elevate relations and queries as a first-class concept within language semantics and library-based extensions.
[Comparing LINQ and Its Contemporaries, Ted Neward]

When Ted says relations and queries are elevated to “first-class concepts” within the language, it makes me think of Stuart’s comment about language fidelity. I’m not sure I would say C# 3.0 is at a higher level of abstraction than 2.0, but I would say that the inclusion of these new abstractions does improve the language’s fidelity. This fidelity improvement does come at the cost of complexity (TANSTAAFL), but compared to the current alternatives, I’m willing to pay that price.
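To make that concrete, here’s a minimal sketch of what queries as language constructs look like in C# 3.0 syntax (the Customer class and data are invented purely for illustration):

```csharp
using System;
using System.Linq;

class Customer
{
    public string Name;
    public string City;
}

class LinqSketch
{
    static void Main()
    {
        Customer[] customers =
        {
            new Customer { Name = "Ada",   City = "Seattle"  },
            new Customer { Name = "Grace", City = "Portland" }
        };

        // The query is an expression the compiler parses and type-checks,
        // not a string handed off to a database driver at runtime.
        var inSeattle = from c in customers
                        where c.City == "Seattle"
                        select c.Name;

        foreach (string name in inSeattle)
            Console.WriteLine(name);
    }
}
```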

The problem with increasing the language fidelity like this is dealing with the outdated code it leaves behind. You see this today with the addition of generics in the 2.0 CLR. How many hand-coded or generated strongly typed collections are floating around out there from the 1.x days? Lots. (As if 1.x was so long ago!) How much database access code is floating around out there today? An astronomical amount. Every app that touches a database or processes XML will be outdated with the arrival of C# 3.0 and VB 9.0. But converting this outdated code to use the new abstractions probably won’t be worth the time or the risk. That means you’re left maintaining the outdated code while writing any new functionality with the new language features.
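To make the “outdated code” point concrete, here’s a rough sketch (all names invented) of the sort of strongly typed collection that was hand-written or generated by the thousands against the 1.x CLR, next to its one-line generic replacement:

```csharp
using System.Collections;
using System.Collections.Generic;

public class Customer
{
    public string Name;
}

// The kind of hand-rolled (or code-generated) strongly typed collection
// that 1.x codebases are full of: a wrapper written solely to get
// compile-time type checking over an untyped collection.
public class CustomerCollection : CollectionBase
{
    public void Add(Customer customer) { List.Add(customer); }
    public Customer this[int index]
    {
        get { return (Customer)List[index]; }
    }
}

public class Example
{
    public static void Main()
    {
        // Post-2.0, the wrapper above is dead weight; the generic collection
        // expresses the same intent with no hand-written (or generated) code.
        List<Customer> customers = new List<Customer>();
        Customer c = new Customer();
        c.Name = "Ada";
        customers.Add(c);
    }
}
```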

I wonder how DSLs will be affected by this evolving language fidelity issue. On the one hand, the nature of DSLs is that they have much narrower usage (i.e. one domain) than something like generics or LINQ. On the other hand, I expect DSLs to evolve faster than general-purpose mainstream languages like C# can. So I’m thinking the impact will be about the same.

Hi-Fi Models

I’m slowly but surely working through my holiday backlog of email and weblogs. Slowly being the operative word here.

Anyway, Stuart has a great post on the process by which we build models. And he’s not talking theoretically here; he’s working on a model for the designer definition file in the DSL toolkit. (Which is good news in and of itself, as hand-writing the XML dsldd file is a pain in the butt. Though until then there’s the great Dm2Dd tool from Modelisoft.) The iterative process he describes certainly looks a lot like development, in the same way that C# development looks like C development: similar steps taken on different concepts. Additionally, he’s working bottom up – the output of the model will eventually be a working program (a designer in this case) – which is the point I made in Abstraction Gap Leapfrog. There are existing abstractions that work now (i.e. the code generated from the existing dsldd file) and he’s trying to build something one level up from there.

I also like Stuart’s use of “fidelity” instead of my use of “complete”. Stuart uses it as an indication of how correct a given model is. That’s what I was implying when I said “complete”, but “fidelity” captures the idea much better. I could imagine both lo-fidelity and hi-fidelity models for a given domain, though you would presumably always want to use the highest-fidelity model available. The difference would be the amount of custom code you have to write – the higher the model fidelity, the less code you write by hand. And I would imagine the model’s fidelity would evolve over time, which introduces interesting questions regarding language evolution as well as the evolution of projects built with those languages.

I hope Stuart keeps blogging about this project.

Abstraction Gap Leapfrog

One of the cool things about having a blogging conversation with someone on the other side of the world is that while you sleep they are thinking of a good response to your post. The only downside? Having to deal with rampant misspelling like “artefacts”. 😄

Anyway, Gareth responds to my post:

Until we get models that are perfectly aligned with our business domains, we’ll have people who want to create models but who get them slightly wrong from a precision point of view – usually in the places where the imperfect models interact with other aspects of the system across or down the abstraction stack.

With code, you’d likely not want to have people check in sources that don’t even compile and then hand them off to other folks who do make them compile, but I think that’s exactly the type of process we’ll see emerging in modelling for a while. I feel this way because I don’t foresee us getting modelling languages of pure business intent 100% right for some time yet – we’re simply not close enough to formal enough descriptions of systems as intensely human as a business yet. However, I hope we won’t want to try and keep modelling as locked away with the techies as traditional development has been. (Hope I’m not talking myself out of a job here…)
[Pseudomodels and intent]

I keep saying incomplete and Gareth keeps saying imprecise, but I think we can both agree on the term “imperfect”. There’s a massive difference between a precise language that is imperfect and a language that is inherently imprecise, like UML.

However, I think the primary disconnect here has to do with how Gareth and I each expect higher-level abstraction languages to evolve. Gareth’s comments about modeling “pure business intent”, having “models that are perfectly aligned with our business domains” and not “keep[ing] modelling as locked away with the techies” imply to me that Gareth wants to work down from high-level business abstractions into implementable technical abstractions. Frankly, I don’t think that’s very likely. Leapfrogging a few levels of abstraction hasn’t worked well in the past (CASE and UML/MDA) and I don’t think it will work well now.

I find it much more likely that we will build higher-level abstractions directly on top of existing abstractions. Again, this is similar to the way C++ built on C, which in turn built on ASM. Sure, that could keep modeling “locked away with the techies” for a while, but we’re already beginning to see the light at the end of that tunnel. Windows Workflow Foundation is a significant leap in abstraction while also being something that non-techies can use. Reports abound about SharePoint “12” embedding the WF engine and FrontPage “12” providing a Workflow Designer for building SharePoint workflows. While I imagine (and I haven’t used any of the new Office “12” suite, so this is pure conjecture) these WF tools are targeting the “power user”, they certainly aren’t only for developers.

Believe me, I would love to be wrong about this. I would much rather work down from business user intent than up from the technical foundation. I just don’t think it’s feasible. The process Gareth describes breaks the “Model Transformation must be Deterministic” tenet of Code is Model, though the word “must” may be too strong to allow for language evolution.

Imprecise vs. Incomplete

Gareth responds to the first tenet of Code is Model:

[A]lthough as an industry we desperately need to drag models kicking and screaming from the far left of pretty-picturedom a good long way to the right in the direction of precision, I don’t want to throw the baby out with the bathwater.

I’m going to take it as a given that folks believe that precise models are valuable development artefacts. Why do I think imprecise models are also valuable? Here are three things that tools for imprecise models help you to do:

  1. Communicate with people about design
  2. Think out loud in a way that’s more shareable than your whiteboard
  3. Start with an imprecise model and progress gradually toward precise models

Hopefully the first of these is obvious – there is value in model as communication device – it’s just not enormous.  I’ve talked before about the value of the second – I draw pictures on my whiteboard and when I’m on a conference call to Redmond, they’re effectively useless.

The third is something I’ve only recently become a convert to. I’m happy to have models which are not precise so long as I can still reason about them programmatically. This allows me to have development processes that are about a quantitative process of iterative refinement.

Here’s an example – in some infrastructure modelling tool, I have a node type which specifies a logical machine group. One of its properties is the number of actual machines required to suit the proposed scale of the application to be deployed. I’d like to be able to put “Depends on outcome of Fred’s scalability investigation” into that numeric field, or perhaps “4->8”. I can still generate a pretty good report from this model, but I can’t really provision a set of physical servers from it.

But here’s the kicker – it’s vital that I can write tools that programmatically assess this model and tell me what work needs to be done in order to make it precise.  I want to know exactly what must be done on this model before, for example, it is suitable for feeding into some kind of provisioning tool.  You might say it needs to be precisely imprecise; I prefer to think of it as quantifiable imprecision.

[Imprecise Models and Killing Hippies]

The only thing I disagree with Gareth about is terminology. Like the term architecture, model has become a catch-all for things that aren’t code. Regular readers of this blog know I like to be more precise than that. As such, I think items #1 and #2 from Gareth’s list aren’t actually models at all. I think of them as pseudomodels, similar to the concept of pseudocode. Actually, I like the name pseudomodel – it also applies well to Grady’s scaffolding. Like pseudocode, pseudomodels have tons of value in communication and reasoning about a problem, but they can’t be used as development artifacts.

As for the third, I think what Gareth is describing is an incomplete model, rather than an imprecise one. If the model is imprecise, there’s no way to programmatically reason about it. But if we look to code as an example, obviously, there are many cases where code is incomplete. Every compiler error you’ve ever seen is an example of incomplete code. And because the language is precisely specified, the compiler can tell you exactly what needs to be done to make it complete – just the kind of programmatic assessment Gareth is asking for. I don’t think of writing code as “progressing gradually towards precision” and I doubt anyone else does either. And while I do see development as an “iterative process”, I don’t think of it as “iterative refinement”. Modeling shouldn’t be any different.
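A contrived example of what I mean (the exact diagnostics depend on your compiler version, but C#’s CS0103 is representative):

```csharp
public class Order
{
    public decimal Total()
    {
        // Incomplete, not imprecise: 'subtotal' and 'Tax' haven't been written yet,
        // and the compiler enumerates exactly what's missing, e.g.
        //   error CS0103: The name 'subtotal' does not exist in the current context
        //   error CS0103: The name 'Tax' does not exist in the current context
        return subtotal + Tax();
    }
}
```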

One area where I do see refinement being critical is in the development of the modeling language itself. Traditionally, the language stays stable while the programs written with it change. But with the introduction of DSLs, it becomes possible for both to vary independently. I would assume that a DSL would evolve over time to have better “coverage” of a given domain. For example, if I were building a CAB DSL, I would implement support for WorkItem right off the bat, but supporting WorkItemExtension would be much lower on the priority list. This represents language refinement, but I would argue it’s a refinement of coverage, not a refinement of precision.

Scaffolding Isn’t a Model

Grady Booch @ the OOPSLA 05 Structured Design and Modern Software Practices Panel (according to the transcript):

The most important part is the executable code. I typically throw my models away, but I always save my source code. I design because I need abstractions to help me reason out my projects.

Grady Booch ranting on his blog:

It’s sad how one can be misquoted and then for that misquote to be picked up by someone else with both then making a spin of the events to support their position. How silly is that.

Juha-Pekka Tolvanen quoted me from my OOPSLA panel as saying that “when the project gets closer to the delivery you normally throw away UML models.”

<snip>

Let me be excruciatingly clear: Over the years I have been consistent in saying that in a) the most important artifact of any software development organization is executable code and yet b) modeling is essential in constructing such executables. This is because c) models help us reason about, specify, construct, and document software-intensive systems at levels of abstraction that transcend source code (and the UML is the accepted open standard for doing so). That being said, it is a pragmatic reality that d) some models are essential (and should be retained) while others are simply scaffolding (and should be discarded). I have never said and do not say now that one should throw away all models, as Juha-Pekka then Harry then Steve imply.

First off, Grady claiming to be misquoted is pretty disingenuous. Juha-Pekka’s account of what Grady said on the panel is pretty spot on. Furthermore, Grady claiming that I implied he said “throw away all models” is also disingenuous. I specifically wrote that I thought Grady was being taken out of context:

I’ve gotta believe that this comment was somehow taken out of context and that the Grand Poobah of the Common Semantic Model doesn’t actually believe that tossing the model at the end of the project is a good thing.

But he-said/she-said nitpicking aside, where’s the guidance on which models are essential and which are “simply scaffolding”? Last time I checked the UML spec, none of the models are labeled as disposable. How about Rational Rose? Any dialog boxes that pop up reading: “Don’t worry about keeping this model up to date, it’s just scaffolding”? I don’t think so.

Obviously, working at a higher level of abstraction helps you reason about a project. But reasoning at high levels of abstraction doesn’t mean you’re modeling. When a building architect sits down with a prospective customer and a sketchpad, they may be working out great ideas, but no one is going to call the result a blueprint. Grady’s scaffolding “models” break nearly every tenet of Code is Model. Scaffolding isn’t precise or deterministic. And if it ends up in the recycle bin, I guess it’s not intrinsic to the development process.

Actually, it’s good that scaffolding isn’t a model. It means Grady is specifically not suggesting to throw away models. He just needs to get his terminology right.

As for Grady’s request for “just one DSM out there in production”, Don lists a few: workflow languages (XLANG and WF), business rules engine languages and build languages (MSBuild, Ant and NAnt). Juha-Pekka pointed to this list of DSM case studies. I’d also add UI designers such as the Windows Forms and ASP.NET designers in Visual Studio. In the Windows Forms case, the generated code is stored in a separate file (yay partial classes) and is specifically marked: “do not modify the contents of this method with the code editor.” In the ASP.NET case, the code for an ASPX file isn’t even generated until runtime. And how about HTML itself? I’m thinking HTML qualifies as “in production”.
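For anyone who hasn’t peeked inside one of those generated files, the designer-owned half of the Windows Forms partial class looks roughly like this (reconstructed from memory, with invented control names; the exact boilerplate varies by Visual Studio version):

```csharp
// Form1.Designer.cs - generated and maintained by the Windows Forms designer
partial class Form1
{
    private System.Windows.Forms.Button saveButton;

    /// <summary>
    /// Required method for Designer support - do not modify
    /// the contents of this method with the code editor.
    /// </summary>
    private void InitializeComponent()
    {
        this.saveButton = new System.Windows.Forms.Button();
        this.saveButton.Text = "Save";
        this.Controls.Add(this.saveButton);
    }
}
```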