Apparently, Microsoft Delivered on Enterprise 2.0 Three Years Ago

In the past few weeks, there’s been a major uptick in discussion about Web 2.0 / Enterprise convergence. Andrew McAfee has a new article on what he calls Enterprise 2.0. Dion’s got an entire blog on the subject, though he thinks it should be called Enterprise Web 2.0. Nicholas Carr is skeptical. Seems to me all this discussion about what might happen in this space is pretty silly since it’s happened already.

Unfortunately, Andrew’s Enterprise 2.0 isn’t freely available (you can buy a copy of the PDF for $6.50), but it primarily focuses on the growing frustration with email and the rise of collaborative Web 2.0 technologies such as blogs and wikis inside the enterprise. No big shock here – for collaboration, blogs and wikis are to email what word processors are to typewriters. Andrew also introduces a model he calls SLATES for describing the aspects of these technologies: Search, Links, Authorship, Tags, Extensions and Signals. So far, all good stuff.

The problem with the article is that he talks about these technologies in the future tense. For example, he writes: “As technologists build Enterprise 2.0 technologies that incorporate the SLATES components”, which implies that these are coming down the pipe rather than here right now. Not only are they here right now, they’ve been available for going on three years. I’m talking about SharePoint 2003. 2003 as in “a year before Tim O’Reilly coined the term Web 2.0”.

SharePoint (I’m talking primarily about the free feature pack for Windows Server 2003, though about the portal server as well) supports Search, Links, Authorship and Signals – four of the six components of Andrew’s Enterprise 2.0 stack. (And frankly, I’m not sure where Andrew is going with his Extensions aspect, so four out of five is probably more accurate.) More importantly, it’s specifically designed to support what Dion called the Democratization of Content. As of December 2004, Microsoft’s internal IT department was supporting “more than 60,000 users, 250 group and division portals, 50,000 team sites, and manages more than 3 terabytes of information.” Personally, I use the corporate enterprise intranet portal, my division portal, a handful of team sites and my personal site on a pretty much daily basis. Only the enterprise and division portals are centrally managed. Given the explosion of SharePoint sites inside Microsoft, I’m obviously not alone.

Creating a new SharePoint team site inside Microsoft is totally self service and takes literally a few seconds. Once you have a site, you can configure it as you like, creating lists and setting permissions as you see fit. Again, it’s totally self service. Plus, it’s totally public unless you specifically lock it down (well, public inside the firewall at any rate). Of course, it could be easier and better, and that’s what next versions are for. SharePoint 2007 will have direct support for blogs, wikis and RSS. Check out the C9 video for more info.

Given the market momentum to date and the impending release of a new version, I find it very surprising to find Dion, Andrew and Nicholas discussing the potential ramifications of these technologies without even mentioning SharePoint. If these guys want to see Enterprise 2.0 technology in action, all they need to do is install SharePoint.

Ruby.NET Project

Having written about Ruby in the scope of the Compiler Dev Lab and the Dual Schema Problem, I was interested to come across the Ruby.NET project from Queensland University of Technology. From the Ruby.NET home page:

Our goal is to create a compiler for the Ruby language that targets the .NET CLR. We aim to support 100% of Ruby language semantics, including all dynamic constructs such as closures and continuations. We plan to generate 100% managed and verifiable CIL code.

Sweet!

The Dual Schema Problem

A few months ago, Ted Neward wrote a great article about the history of the Object Relational Impedance Mismatch problem and how LINQ is addressing it in a new way. Basically, LINQ is introducing new language abstractions and complementary libraries to enable queries as a first class concept within the language. However, I don’t believe that the O/R Impedance Mismatch is the whole problem. More specifically, the impedance mismatch is a follow-on problem to what I would call the Dual Schema problem.

In a nutshell, the Dual Schema problem is that you have to design and implement two separate versions of your persistent entities. There’s the in memory version, typically written in an OO language like C# or Java. Then there’s the on disk version, typically written in SQL. Regardless of the difficulties translating between the two versions (i.e. the aforementioned impedance mismatch), you have to first deal with the complexity of keeping the two versions in sync. While LINQ does a great job eliminating much of the friction translating between on disk and in memory formats, it could go much farther by eliminating the need for translation in the first place.
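
To make the problem concrete, here’s the same hypothetical Person entity defined twice – once as a class, once as a SQL DDL string (sketched in Ruby to match the Rails example below; the names are mine, not from any real system). Add a field to one and forget the other, and nothing complains until runtime.

```ruby
# The same Person entity, defined twice. Any change (say, adding a
# middle_name) has to be made in both places, by hand, forever.

# In-memory version (what the application code works with)
class Person
  attr_accessor :name, :address, :salary
end

# On-disk version (what the database works with)
PERSON_DDL = <<~SQL
  CREATE TABLE People (
    name    VARCHAR(100),
    address VARCHAR(200),
    salary  BIGINT
  )
SQL
```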

A variety of solutions to the Dual Schema problem have evolved, primarily outside the hallowed halls of enterprise vendors (i.e. MS and others like us). One such solution is Ruby on Rails. In a Rails environment, I simply declare the existence of a given persistent entity:

class Person < ActiveRecord::Base
end

The ActiveRecord base class (a standard part of Rails) will dynamically create methods and attributes on the Person object at runtime, based on the schema of the People table in the database. (Rails is smart enough to understand English plurals, hence the automatic connection of Person and People.) So technically there are still two schemas, but the in-memory version is automatically derived from the on-disk version.
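
Under the hood this is straightforward Ruby metaprogramming: when the class is defined, ActiveRecord queries the database for the table’s columns and defines matching accessors on the fly. Here’s a toy sketch of the trick with the database stubbed out as a hard-coded column list (ToyRecord and everything in it is hypothetical – the real ActiveRecord is far more sophisticated, including an Inflector that knows Person maps to People):

```ruby
# A toy version of ActiveRecord's trick: when a subclass is defined,
# look up its table's columns and define matching accessors at runtime.
class ToyRecord
  # Stand-in for a live schema query against the database.
  COLUMNS = { "Persons" => %w[name address salary] }

  def self.inherited(subclass)
    table = subclass.name + "s" # crude pluralizer; Rails knows "People"
    (COLUMNS[table] || []).each do |col|
      subclass.send(:attr_accessor, col) # defines col and col= methods
    end
  end
end

class Person < ToyRecord
end

person = Person.new
person.salary = 75_000 # salary= exists, derived from the "schema"
```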

(Note, DLinq provides a conceptually similar tool – SqlMetal – that can generate the static types from a given database schema. However, as static types they have to be defined at compile time. So while SqlMetal reduces the effort to keep schemas in sync, it doesn’t eliminate it the way Rails does.)

By slaving the object schema to the database schema, Rails essentially solves the Dual Schema problem. The problem with the Rails approach is that defining a database schema requires a significant amount of skill and effort, while defining classes is typically trivial in comparison. The fact that Rails allows you to implement a persistent entity with almost no code doesn’t help you much if you have to write and maintain a ton of SQL code to define your database schema.

I believe the Rails model is actually backwards. It would be much better for the developer if they could define their persistent entity in code and slave the database schema to the object model instead of the other way around.
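
A hypothetical sketch of what that might look like (Entity, field and to_ddl are all made up for illustration): declare the fields once, in code, and generate the CREATE TABLE statement from the class.

```ruby
# Hypothetical "Rails in reverse": the class declares its fields and
# their types, and the database schema is generated from the class.
class Entity
  def self.field(name, type)
    (@fields ||= {})[name] = type
    attr_accessor name # in-memory accessors come for free
  end

  def self.to_ddl
    sql_types = { string: "VARCHAR(255)", integer: "BIGINT" }
    cols = @fields.map { |n, t| "#{n} #{sql_types[t]}" }.join(", ")
    "CREATE TABLE #{name}s (#{cols})" # crude pluralizer
  end
end

class Person < Entity
  field :name,    :string
  field :address, :string
  field :salary,  :integer
end

puts Person.to_ddl
# => CREATE TABLE Persons (name VARCHAR(255), address VARCHAR(255), salary BIGINT)
```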

Of course, this approach isn’t exactly news. In his article, Ted writes of the rise and fall of OO database management systems, which were supposed to solve the Dual Schema and Impedance Mismatch problems. I’m certainly not suggesting a return to the heyday of OODBMS. However, one of the reasons Ted points out OODBMS failed was that big companies were already wedded to RDBMS. But those big companies are the short head. As you move down the long tail of software, the relational database as the primary storage paradigm makes less and less sense. For the vast majority of applications, relational databases are overkill.

Ted’s other point about OODBMS is that loose coupling between the data store and the in-memory representation is a feature, not a flaw. He’s totally right. But can’t we advance the state of the art in database typing to the level of modern OO languages? How about eliminating anachronisms like fixed-length strings? What if we derived the database schema from the object model – Rails in reverse, if you will – but kept it loosely coupled enough to allow for schema evolution?

An example of this code-centric model for data storage is Consus. It’s written by Konstantin Knizhnik, who has written a bunch of open source, object-oriented and object-relational databases across a wide variety of languages and execution environments, including the CLR. Consus is actually written in Java, but he provides a version compiled for .NET using Visual J#. Consus lets you define your data either as tables or as objects. So you can do this:

Statement st = db.createStatement();
st.executeUpdate("create table Person (name string, address string, salary bigint)");
st.executeUpdate("insert into Person values ('John Smith', '1 Guildhall St.', 75000)");
ResultSet rs = st.executeQuery(
    "select name, address, salary from Person where salary > 100000");

Or you can do this:

class Person {
    String name;
    String address;
    long salary;
    Person(String aName, long aSalary, String aAddress) {
        name = aName;
        salary = aSalary;
        address = aAddress;
    }
};

Person p = new Person("John Smith", 75000, "1 Guildhall St.");
ConsusStatement st = db.createStatement();
st.insert(p);
ConsusResultSet cursor = (ConsusResultSet)st.executeQuery(
    "select from Person where salary > 100000");

Consus also handles OO concepts like derivation and containment. Of course, the embedded queries are ugly, but you could imagine DLinq-style support for Consus. In fact, one of the primary issues with Consus is that it supports both object and tuple style queries. When you explicitly request columns (i.e. “select name, address, salary from Person”), you’ve got a tuple-style query. When you don’t (i.e. “select from Person”), you’ve got an object-style query. Of course, the issues with tuple-style queries are well documented in Ted’s article, and they’re exactly the problem LINQ is designed to solve.

(Konstantin, if you’re reading this, drop me a line and I’ll look into getting you hooked up with the LINQ folks if you’re interested in adding LINQ support to Consus.NET.)

The tradeoff between the Rails approach and the Consus approach is one of performance. I have a ton of respect for Konstantin and the work he’s done on Consus and the other OO and OR databases available from his site. However, I’m sure the combined developer forces at major database vendors like Microsoft mean SQL Server (and the like) will outperform Consus by a significant margin, especially on large-scale databases. So if execution performance is your primary criterion, the Ruby on Rails approach is better (leaving aside discussion of the Ruby runtime itself). In the long run, however, execution performance is much less important than developer productivity. So for all the current interest in Rails, I think a Consus-style model will become dominant.

ETech Day Three Quick Thoughts

After my marathon blogging session last night and taking notes all day, I’m a bit burnt out on writing. But here are a few quick thoughts. More details to follow.

  • I’m digging the Live.com home page and the integrated Live Search. Since I’m on a rented laptop, Live Toolbar will have to wait. Coolest new feature IMO is the Search Macros, though it’s a tight race with the new image search interface.
  • Jon Udell and Michael Goldhaber spoke about attention economy today. I still don’t get it, though Jon had some interesting ideas about metadata. I’ll believe that attention is a currency when I can buy a car with it.
  • I liked the session on the Yahoo! Design Patterns, though the title and abstract of the session were awful. The title was “The Language of Attention: A Pattern Approach“. The inclusion of attention just confused the issue. Why couldn’t they just call it “A Pattern Language for User Experience”? Because it doesn’t have the concept of attention shoehorned into it.
  • I really like Eventful, even though I’m on record as thinking their business model doesn’t work. Their new demand feature is pretty cool, though it doesn’t really help their business model any.
  • George Dyson’s session on “Turing’s Cathedral” was fascinating, though he tried to cover too much ground in the time allotted.
  • I’m not sure what the point of Joel Spolsky’s Blue Chip Report Card was. Apparently the alien from Reddit is cute and Motorola’s newer cell phones (RAZR and PEBL) are taking Joel’s advice on becoming “blue chip”. This is somewhat related to points the folks from Adobe (previously Macromedia) made, except much more obtuse.
  • I have no idea what the point or business model of Plum is, even though it was featured as a keynote (a last minute promotion it appears from the conference guide). Seems too complex and centralized to actually work.
  • I wrote last night that Casting Words isn’t really a business because nothing stops me from going directly to Mechanical Turk and getting the transcription services myself. Today, I found a Casting Words task on Mechanical Turk so I decided to figure out how much they’re making. The task I found was to transcribe a roughly 28-minute podcast, and they were offering $5.41 to anyone willing to do it. That’s about 19.3 cents per minute. Tacking on Amazon’s 10% charge brings the total to around 21.3 cents a minute that Casting Words is paying for transcription services. Given that they’re charging 42 cents a minute, that’s about a 49% profit margin. Exactly what are they doing to earn that profit? What’s their value add, and is it really worth a nearly 100% markup?
  • Anyone want to start “Cheap Casting Words” with me? We’ll pay 22 cents a minute (about 14% more than Casting Words pays) and charge 36 cents a minute (14% less than Casting Words charges), keeping roughly 12 cents a minute after Amazon’s fee (about a 33% margin). 😄
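
The Casting Words arithmetic above is easy to check. A quick back-of-the-envelope script, using only the numbers from the task I found:

```ruby
# Back-of-the-envelope check of the Casting Words numbers.
payout_per_task = 5.41   # dollars offered on Mechanical Turk
minutes         = 28     # approximate length of the podcast
amazon_fee      = 0.10   # Amazon's 10% surcharge on the payout
price_per_min   = 0.42   # what Casting Words charges customers

cost_per_min = (payout_per_task / minutes) * (1 + amazon_fee)
margin       = (price_per_min - cost_per_min) / price_per_min
markup       = (price_per_min - cost_per_min) / cost_per_min

puts "cost:   %.1f cents/min" % (cost_per_min * 100) # ~21.3
puts "margin: %.0f%%" % (margin * 100)               # ~49%
puts "markup: %.0f%%" % (markup * 100)               # ~98%
```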

Update: Added Quick Thoughts on Yahoo!, Eventful, George Dyson, Joel Spolsky and Plum. Added more detail about the attention economy sessions from today.

Microformats Panel

I still haven’t seen a good general session on microformats. I’m thinking it’s because any one given microformat is so simple that you can’t really fill more than about ten minutes talking about it. So this panel was about six or seven different microformats. The format of the panel stunk – I lost track of what was being discussed pretty quickly so I spent the time surfing the microformats website.

The idea of microformats is to adorn visual markup (i.e. xhtml) with semantic information about the data underneath. Probably the best example of this is hCard, the microformat version of vCard. Here’s the markup for my hCard (as produced by the hCard Creator):

<div class="vcard">
    <a class="url fn" href="http://devhawk.net">Harry Pierson</a>
    <div class="org">Microsoft</div>
    <a class="email" href="mailto:hpierson@microsoft.com">hpierson@microsoft.com</a>
    <div class="adr">
        <div class="street-address">One Microsoft Way, 18/2194</div>
        <span class="locality">Redmond</span>,
        <span class="region">WA</span>
        <span class="postal-code">98052</span>
    </div>
    <div class="tel">425/705-6045</div>
</div>

See how the class attributes provide the semantics for the underlying text? Cool.

I’m beginning to get microformats. At first, I was bothered because I thought they were hijacking the semantics of the class attribute – but I hadn’t realized the class attribute is allowed to be used for “general purpose processing by user agents”. And the link microformats like XFN and rel-tag are even simpler than hCard.
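
For example, rel-tag just adds rel="tag" to an ordinary link whose last path segment is the tag, and XFN uses rel values to describe your relationship to the person you’re linking to (the URLs below are made up for illustration):

<!-- rel-tag: the tag is the last segment of the href -->
<a href="http://technorati.com/tag/microformats" rel="tag">microformats</a>

<!-- XFN: rel values describe the relationship to the linked person -->
<a href="http://example.com/jane" rel="friend met">Jane</a>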

So again, bad session but cool concept. I really see potential for mashing up Ray Ozzie’s Live Clipboard with microformats.