News Aggregation

I’ve been using FeedReader for aggregating news feeds, but I just downloaded and configured Syndirella. I think I’m going to switch permanently since Syndirella handles some of the RSS 2.0 feeds that FeedReader can’t. (I want to try News Gator, but I’m not running a compatible version of Outlook.) Additionally, Syndirella is a .NET app, which I like even if it has little bearing on the app’s functionality.

Tracking Referrers

So now I’ve started tracking referrers. I’m using the referrer server variable, and running into canonicalization issues. For example, as I write this entry, there are 8 referrals from ASP.NET Weblog. But it shows up on my list as 3 separate referrers since I’m not sure how to canonicalize the URL. For example, Radio users mostly come from the Radio.Weblogs.com server, so for those users, the specific virtual folder is important. But for other sites, like aspnetweblog.com, it’s not. Anyone have a good suggestion?
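One possible approach, sketched here in Python rather than in the site's own code: keep a small table of hosts where the path matters and collapse every other referrer down to its host name. The table contents, the `canonicalize_referrer` name, and the one-folder depth for Radio are all assumptions made for illustration, and the sketch assumes the referrer is an absolute URL.

```python
from urllib.parse import urlsplit

# Hypothetical rule table: for these hosts, keep this many leading path
# segments (the user's virtual folder). Any host not listed here is
# collapsed to the host name alone.
PATH_SIGNIFICANT_HOSTS = {"radio.weblogs.com": 1}


def canonicalize_referrer(url: str) -> str:
    """Reduce a referrer URL to a key suitable for grouping hit counts."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as one site
    depth = PATH_SIGNIFICANT_HOSTS.get(host, 0)
    # Drop empty segments (leading/trailing slashes), then keep `depth` of them.
    segments = [s for s in parts.path.split("/") if s][:depth]
    return "/".join([host] + segments)
```

With this, different pages and query strings on aspnetweblog.com all group under one key, while each Radio user's folder stays distinct. New special cases become entries in the table instead of new code.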

WSDL soapAction Issues

Why does WSDL expressly forbid the use of the soapAction attribute with protocols other than HTTP? How else are you supposed to map incoming messages across non-HTTP protocols to SOAP endpoints? WS-Routing provides a simple mechanism for including action information inside any SOAP message – in the message rather than some out-of-band mechanism like HTTP headers.

The “action” element is used to indicate the intent of the WS-Routing message in a manner similar to the SOAPAction HTTP header field defined for SOAP (see [15], section 6.1.1). The value is a URI identifying the intent. Similar to the SOAPAction header field, WS-Routing places no restrictions on the format or specificity of the URI or requires that it can be dereferenced. There is no mechanism for computing the value based on the message and there is no default value. [WS-Routing spec, section 5.1.1]

WSDL provides a soap:operation binding element that maps the HTTP SOAPAction header to a specific operation, providing a simple way to route an incoming message to a SOAP endpoint. However, WSDL expressly forbids specifying the soapAction attribute when you’re not using HTTP!

The soapAction attribute specifies the value of the SOAPAction header for this operation. This URI value should be used directly as the value for the SOAPAction header; no attempt should be made to make a relative URI value absolute when making the request. For the HTTP protocol binding of SOAP, this is value required (it has no default value). For other SOAP protocol bindings, it MUST NOT be specified, and the soap:operation element MAY be omitted. [WSDL 1.1 spec, section 3.4]

Argh! I see three possible ways of resolving this:

  • Ignore the WSDL spec and use soapAction anyway.
  • Use a naming scheme based on the body root element QName or the WS-Routing action to map each message to a SOAP endpoint from the message and operation names.
  • Include the equivalent of soapAction in a routing-oriented WSDL extension as a child of wsdl:operation (i.e. duplicate soap:operation@soapAction without the HTTP-only limitation).

Additionally, you might be able to use WS-Policy to specify this. The question is: Which is the best? Go against the spec, use an opaque name mapping scheme, or extend the spec? And if you’re going to extend the spec, which is better to extend? WSDL or WS-Policy?
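To make the second option concrete, here is a rough Python sketch of transport-independent dispatch: look for a WS-Routing action in the SOAP header first, and fall back to the QName of the first child of the body. The `resolve_operation` function and both lookup tables are hypothetical, and the WS-Routing namespace URI is written from memory and should be checked against the spec.

```python
import xml.etree.ElementTree as ET

# Namespace URIs. The SOAP 1.1 envelope namespace is standard; the
# WS-Routing one is an assumption to verify against the spec.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "rp": "http://schemas.xmlsoap.org/rp/",
}


def resolve_operation(envelope_xml, by_action, by_body_qname):
    """Return the operation name for a SOAP message, or None if unmapped.

    by_action maps action URIs to operation names; by_body_qname maps
    '{namespace}localname' keys to operation names. Both are hypothetical
    tables that a routing-oriented WSDL extension might populate.
    """
    envelope = ET.fromstring(envelope_xml)

    # 1. In-message action, independent of the transport protocol.
    action = envelope.findtext("soap:Header/rp:path/rp:action", namespaces=NS)
    if action and action.strip() in by_action:
        return by_action[action.strip()]

    # 2. Fall back to the QName of the first element inside the body.
    body = envelope.find("soap:Body", NS)
    if body is not None and len(body) > 0:
        return by_body_qname.get(body[0].tag)
    return None
```

Neither lookup touches HTTP headers, so the same table would work over any transport. The open question from above remains where that table should be declared.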

Disruptive Programming Language Technologies

At one of my first programming jobs out of college, I was working at a company that shall remain nameless. I was part of a small four-person dev staff building a client/server app that ran with a variety of back end databases – abstraction of said databases provided primarily by ODBC. However, ODBC was a leaky abstraction, and different DB vendors had varying levels of support. In particular, at the time Informix supported multiple left outer joins in one query and Oracle did not. I know this because one of my coworkers (who was later fired for drug use and attempted workers comp fraud) wrote some of the worst code I had ever seen. He duplicated large portions of the code – once using multiple left outer joins and once using multiple queries. If the multiple join query failed, he used the multiple query version.

Besides the obvious questions of code modularity, readability, reusability, etc., one massive question stuck out to me like a sore thumb: “Since it always works, why not just use the multi-query version all the time?” The answer: “Performance”. He wanted to squeeze every ounce of performance out of the app, so he wanted to avoid multiple database roundtrips. The fact that he made the code essentially unworkable in the process was of little concern. This was the biggest example of the “only performance matters” mentality, but his code was littered with such “optimizations”. I inherited the code when he got the boot, and we eventually scrapped the app completely because it was so bad.

At the time, I thought that optimizing for performance above all else was a bad policy. Things like development time and cost count too. (Now, I assert they count more.) But I never really had a name for it until someone pointed out “Proebsting’s Law” to me:

“Moore’s Law” asserts that advances in hardware double computing power every 18 months. I (immodestly) assert “Proebsting’s Law”, which says that advances in compiler optimizations double computing power every 18 years. [Todd Proebsting’s Home Page]

Todd is a Senior Researcher at Microsoft Research. His law basically means that in the face of Moore’s Law, optimizing compilers (and by extension, applications) is mostly irrelevant and that compiler (and application) developers’ time would be better spent focused on other types of optimizations – primarily developer productivity. In a talk that he calls Disruptive Programming Language Technologies (ppt and video), he points out that recently adopted languages – such as Visual Basic, Java and Perl – were all very slow compared to the dominant language at the time: C/C++. They were also devoid of “academic” innovation. Yet each of these languages provided solutions to real world problems that C/C++ couldn’t match. And today, they are in wide adoption, with VB easily outpacing C++ in terms of number of developers. Todd goes on to list a series of disruptive technologies that he predicts will be incorporated into future languages. These include: Application Crash Analysis, Checkpoints/Undo, Database Access, Parsing, XML Manipulation, Constraint Solving and Distributed Programming.
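To put rough numbers on the two laws, here is a back-of-the-envelope calculation in Python. It takes both doubling periods at face value as quoted above, which is a simplification, and the `speedup` helper is just an illustrative name.

```python
def speedup(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years` if power doubles once per period."""
    return 2 ** (years / doubling_period_years)


# Over one 18-year span:
hardware = speedup(18, 1.5)  # twelve doublings from Moore's Law
compiler = speedup(18, 18)   # a single doubling from Proebsting's Law

print(f"hardware: {hardware:.0f}x, compiler optimizations: {compiler:.0f}x")
```

Twelve doublings against one is roughly a 4096x gain against a 2x gain, which is why the law suggests spending effort elsewhere.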

There are a few conclusions I draw from this:

  • If there is going to be future language innovation that will make my life easier as a developer, I’m going to want to use a platform that is designed to support multiple languages. Obviously, I’m thinking CLR here – JVM’s “me too” approach to multiple language support just doesn’t cut it.

  • These language innovations will probably not include “typical” programming language elements such as if/then and for/next loops. We have great languages such as C#, Visual Basic and Java that already include all that stuff. With CLR’s true language interoperability, there’s no point in duplicating those elements in each new language. I can build a disruptive technology into a language that exposes classes to the CLR. Then I can use C#, VB or even J# to provide the glue logic. This makes the language design and compiler building much easier, meaning the “barrier to entry” for innovative language design has dropped significantly.

  • Code generation will be replaced with disruptive programming languages. There used to be code generation wizards in VC++ for building event handlers. In VB, you didn’t need them. Technologies where code generation is used extensively (such as database access) are ripe for a disruptive programming language.

  • I want to learn more about language design and compiler development. These books are a good start, plus there’s the Coco/R toolkit for building parsers in C#.

  • Performance is almost irrelevant. I mean, you can’t ignore it completely. However, in the face of other factors – such as time and money – performance is low on the priority list. I’d rather optimize for developer productivity than performance. After all, I can get more hardware cheaper and easier than I can get more developers.

Using ASHX Files

At the Phoenix MSDN User Group Tuesday night, I mentioned ASHX files in passing, but most of the audience hadn’t encountered them. Based on the emails I got after the event, I decided to post an article to my site about using ASHX files to retrieve images stored in a database. I also decided to add an “Articles” section for this and other articles like it (I moved my Add #Region macro page into the articles section so the ASHX article wouldn’t be alone).