Probably Wrong Info Is Worse Than No Info At All

Like many geeks, I love Dilbert. However, I rarely identify with it as well as I did Sunday.

I kid you not, I had almost exactly this conversation back when I worked in MS IT. They have this big repository of information about deployed applications. Technically, you’re not supposed to deploy an application without listing it in the application repository. Like Dilbert, I never really understood what people were going to do with this information, but the projects I was on dutifully collected the relevant information and put it into the repository.

And never thought of it again. Ever.

And therein lies the problem. Populating the application repository was an artificial step on the critical path of the deployment process. Writing the software, acquiring the physical hardware to run it on, stuff like that really is on the critical path. Populating the application repository was extra busy work legislated by someone (I forget if it was the central architecture team or management) that didn’t benefit the project in the slightest. As such, it was given the minimal amount of attention and effort, meaning there was little quality or consistency in the data. Worse yet, when the application changed or was decommissioned, updating the application repository just didn’t happen. I mean, it was supposed to, but rarely did.

So you ended up with a repository of information that was worse than useless. I had a colleague who insisted that the repository had some value because “not all of the data was wrong”. Of course, he couldn’t tell me with any consistency which data was accurate and therefore valuable and which was not. Hence, my argument that it was “worse than useless”.

The only way an application repository is going to be of any value at all is if you can collect the data automatically. My old teammate Buzz coined a phrase we used often: “The Truth Is On The Edge”. You should always regard any central repository of information with a very critical eye since it’s rarely going to be the truth.
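
To make that concrete, here is a minimal Python sketch of checking a hand-maintained repository against what is actually observed on the machines. Everything in it is hypothetical: the `AppRecord` fields, the `reconcile` function, and the sample host and application names are made up for illustration and have nothing to do with the real MS IT repository. It also assumes you already have some automated way to list what is running on each host, which is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRecord:
    """One deployed application (hypothetical shape, for illustration only)."""
    name: str
    host: str
    version: str


def reconcile(registry, observed):
    """Compare hand-entered registry records with records observed on the hosts."""
    reg = {(r.name, r.host): r for r in registry}
    obs = {(r.name, r.host): r for r in observed}
    return {
        # Listed centrally, but nothing like it is running on the edge.
        "stale": sorted(k for k in reg if k not in obs),
        # Running on the edge, but never entered in the repository.
        "unlisted": sorted(k for k in obs if k not in reg),
        # Present in both places, but the recorded version is out of date.
        "mismatched": sorted(
            k for k in reg if k in obs and reg[k].version != obs[k].version
        ),
    }


# Made-up sample data: what people typed in versus what the hosts report.
registry = [
    AppRecord("billing", "srv01", "1.0"),
    AppRecord("hr-portal", "srv02", "2.3"),
    AppRecord("old-reports", "srv03", "0.9"),
]
observed = [
    AppRecord("billing", "srv01", "1.4"),
    AppRecord("hr-portal", "srv02", "2.3"),
    AppRecord("wiki", "srv04", "5.1"),
]

report = reconcile(registry, observed)
for problem, keys in report.items():
    print(problem, keys)
```

In this toy data only one of the three repository entries survives the comparison untouched, and without the observed side you would have no way of knowing which one.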

(Ed. Note – Man, it’s been a long time since I’ve written about Architecture. My last Architecture post was almost a year ago. I don’t miss the job but I do miss my old teammates – in particular Buzz, Rick, Dale and of course Nick Malik.)


Has your view on architecture as a discipline separate from coding changed since working with dynamic languages?
The Truth Is On The Edge, so true. On the other hand, an automatically collected central repository of information at large scale can be very valuable, e.g. Ohloh.
Hi Harry, Great Dilbert. It is funny partly because it is painfully true. We do have that repository, and it does contain "partly valid" information, and it is expensive and painful to update.
Traditionally, organizations that want to perform an unpopular activity put that activity into a small team of people who are responsible for doing it over and over, in different places. This is a "business support function" and occurs variously in everything from financial audits to investigations for sexual harassment. Unfortunately, in IT, we didn't develop a single team that had to maintain the repository... we distributed the function, figuring it would be so much more efficient. As a result, the pain is distributed. That means that it never reaches the level of annoyance at which someone will invest to fix the pain. It is also distributed to people who pay the costs of doing the work but, as you noted, reap few of the benefits of the data. As a result, data quality suffers.
There are uses for data such as this, if the data is accurate. As Dilbert points out, if the person who has to collect the data is NOT the person who benefits from its collection, then you will get mistakes, quality issues, and delays in delivery. This is not a data issue. It is an issue of IT not running like a business. We do a good job of security in IT, because we have people directly accountable for security who own the teams responsible for delivering security.
Automated data collection is a potential solution to a poorly described problem. Unfortunately, without the team responsibility in place to make the problem visible, and to demonstrate the value of fixing it, no one will invest in that automated data collection. I cannot even tell you if it is a GOOD solution to the problem, because there is no person who can provide a consistent view on what the actual problem is (yet... this is changing). I believe that MS IT is typical in many of these problems.
We are investing to replace that particularly onerous solution with something a bit better, with better support for the actual value-add activities that occur downstream, and with organizational alignment to ensure that someone can answer the question "what are you going to do with the data." That said, your observation is salient. Manual data collection is rarely a good long term solution. Good to chat with you again. --- Nick