Friday, March 13, 2009

Why ORM? Why Cloud Computing?

I think it was Fred Brooks who once wrote that someone will inevitably look at any software system and ask, "How did we get here?"  And the answer is invariably, "One logical choice at a time."  Rather than answer the questions "Why ORM?" and "Why Cloud Computing?" directly, I will endeavor to elucidate those logical choices.

The monolithic database model (MDM) has served the IT sector very well for about fifty years.  I will explicitly define it as a model of software development where myriad applications are integrated through a shared database, or--more accurately--where more than one user interface is provided to a database.  In this model the database itself is where business processes and domain modeling live.  Enterprises often expanded these databases to include all aspects of the business, or perhaps maintained a few databases integrated via ETL, until the number of objects exceeded any one person's capacity to understand the system.  EDI, data warehousing, and data marts are all extensions of the MDM for dealing with M&A, e-biz partners, etc.  Given a large number of well-trained database professionals with rigorous communication and documentation requirements, the monolithic database model is efficacious and expedient.

As object-oriented programming techniques became better known, developers began modeling the domain and business processes using advanced general-purpose programming languages like Lisp, Smalltalk, C++, etc.  They used object databases to persist these models and all was good and productive.  Unfortunately, few developers grokked those technologies, and even fewer had mastered the techniques needed to use them well, so they didn't catch on.  Perhaps the biggest impediment to adoption was the success of the MDM.  The MDM was working and understood.  Its deficiencies were not serious enough to move IT development en masse to the object crowd.

Still, a lot of software was being written outside of IT.  Embedded developers loved object programming and had no need to store loads of data.  So, the hardcore programmers--those who relish writing life support systems, flight control systems, nuclear power plant control systems, etc.--were exclusively "object guys".  Meanwhile, Sun Microsystems was building a runtime for embedded systems to simplify programming across hardware--Java.  Java had all the object-oriented bells and whistles, and Sun had written runtimes for Unix, Solaris, and even Windows!  These runtimes came bundled with AWT, allowing developers to nominally "write once, run anywhere."

Suddenly there was the WWW.  I say suddenly because even Bill Gates once purportedly remarked that it was irrelevant.  As IT understood the implications of Metcalfe's Law for the Internet, it became the elephant in every boardroom.  No single company in the world wanted to be a loser in the new, connected economy.  All IT decisions were tempered by the Internet revolution.

Now the cohesive, insular world of the MDM was really looking problematic.  Still, that's where the data was, and ultimately the Internet was about access to information.  So, every IT vendor that wanted to compete was pitching platforms that claimed to do one central task better than anyone else: turn relational data into web pages.  These platforms were backed by either scripting languages or 4GLs, and DBI, JDBC, ODBC, ADO, etc., in all their versions and iterations, were simply libraries used by the object guys to get at the data.

Concurrently, universities had been turning out software developers who started writing C++ or Java in their freshman year, graduating to "real" object languages in subsequent years.  The worldwide demand for IT developers skyrocketed, and technical colleges met the challenge with programs designed to get people into the market with the minimal set of skills needed to create web applications.  At this point the technologies for building web applications were immature relative to RDBMS technologies.  The result was a lot of really bad developers developing really bad applications, slowly.

The good developers, the object guys with strong Lisp kung fu, really resented having to interface with databases--to go from their object model to a data model.  IT programmers oft complained of the inordinate amount of time they spent moving data in and out of databases.  Architects started designing systems built around messaging again, calling them web services this time, and the canon said that these messages were serializations of object state.  So smart people started writing libraries that handled the database interface implicitly, attempting to abstract away the "implementation details" of data storage and retrieval: object-relational mappers (ORM).  The "real" model of business entities and processes, after all, was their objects.

Now, the old IT systems in DB2 and Cobol were "legacy" systems to be tolerated until they could eventually be re-written.  New IT initiatives would set out in their own silo (usually an application server), with their own persistence engines that utilized ORM, and integrate with the legacy systems via message-oriented middleware (MOM): BizTalk, SOA, ESB, etc.  Vendors were happy because MOM was expensive; object guys were happy because their jobs now matched up with their education; and business owners were cautiously optimistic because mainframe vendors all operate on an annuity model, and the owners' hope was to escape from a world of multi-million dollar support contracts.

Lo, the MDM guys sneered, since the whole enterprise--the business itself--was still built upon their systems, and every attempt to re-write their systems was an incredibly expensive failure.

Business owners were getting pretty angry, too.  All these application servers needed lots of new hardware, and the new servers needed to be partitioned by firewalls, and they needed new database servers too, since RDBMS was still the persistence engine of choice, and of course all of this needed fail-over and clustering.  Meanwhile, what the developers were producing wasn't sufficient to run the entire business on, so the business now had to maintain two systems in parallel, putting increased strain on the middleware servers.  Further, the developers' velocity was slow, since everything had to be integrated, and years and years of process refinements implicit in the MDM had to be re-imagined and re-discovered.  This integration stuff was expensive.

Again, though, not all software was being written in IT shops.  There were companies building applications hosted on web servers that were available on demand, so-called Application Service Providers (ASP).  These companies didn't have any legacy systems and drew the smartest object guys.  Many of them were wildly successful, but when the dot-com bubble burst, the idea that a company you relied upon for mission-critical systems could just disappear caused many IT shops to reject the idea out of hand.  The ASPs that survived were those that offered services that were important but not "mission-critical" (as email, production data management, etc. are), such as sales force automation or marketing support.  Another class of ASPs that won were those that offered hosted services to SME customers who could not or chose not to afford the sunk costs of providing their own infrastructure.

In the U.S., capital was king; the financial markets were providing great returns, and because of foreign investment and interest rate cuts, the cost of capital was low.  Businesses were now faced with enormous capital expenditures from their IT operations during a time when capital investments were the most prudent choice, but no one was willing to give up the very real productivity gains that had been made steadily over the last decade.  The solution to the dilemma was to defer expenditures, to lease.

At the same time, business owners' frustration with the lack of agility of their IT operations peaked.  The costs were high, the pace of development was slow, the requirements gathering and definition process was painstaking, and the failure rate was abominable.  The maintenance costs alone for all the manifold servers, languages, and platforms were choking the life out of new business opportunities before they could be explored.

So, the market itself demanded cloud computing.  Rather than buying the infrastructure and paying to provision it, businesses wanted to "pay as you go."  With owned infrastructure, elastic demand on IT systems meant provisioning for the worst possible scenario; cloud computing meant scaling up (and down) on demand.  Rather than having to pay for and maintain application servers, database servers, load-balancers, etc., they could just write the app and see if it sticks.
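To make the provisioning argument concrete, here is a rough sketch in Python.  The demand figures are entirely hypothetical, and the comparison counts only server-hours; it ignores the fact that a rented server-hour and an owned server-hour are priced differently, so treat it as an illustration of utilization rather than a cost model.

```python
def peak_provisioned_hours(hourly_demand):
    """Server-hours for a fixed fleet sized for the worst hour and left running."""
    return max(hourly_demand) * len(hourly_demand)


def elastic_hours(hourly_demand):
    """Server-hours for a fleet that scales up and down with demand each hour."""
    return sum(hourly_demand)


# Hypothetical number of servers needed in each hour of one business day.
demand = [2, 2, 2, 2, 2, 3, 5, 8, 12, 15, 15, 14,
          13, 14, 15, 12, 9, 6, 4, 3, 2, 2, 2, 2]

fixed = peak_provisioned_hours(demand)
elastic = elastic_hours(demand)
print(f"fixed fleet: {fixed} server-hours, elastic fleet: {elastic} server-hours")
print(f"idle capacity in the fixed fleet: {1 - elastic / fixed:.0%}")
```

The gap between the two numbers is the capacity a business pays for but does not use when it must provision for its worst hour.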

Yea, though I walk through the valley of the shadow of integration, I shall fear no SQL.

The problem of integration remains.  No cloud platform fits any enterprise's IT needs completely, so now the challenge is how to integrate in a world where the web application server platform is itself considered legacy.  Microsoft's solution is to position all ASP.NET development as "cloud" development, making it easy to keep the application on-premise or move it to the cloud (Azure).  Ironically, the major impediment to this approach is the persistence engine: the database.

Moving a database that fully exploits the RDBMS platform to the cloud is non-trivial.  In other words, developers who wrote stored procedures, utilized event brokers or extended stored procedures, or created ETL jobs made it unlikely that their applications can be moved to Azure.  If they had instead used only pure ORM (LINQ to SQL), the move could be done mechanically.

This property of an unviolated ORM abstraction, namely the ease with which the backing data store can be mechanically replaced, is the key reason why ORM advocates promulgate the principle that the application should never accept a dependency on the data store.  In other words, no stored procedures allowed.  The Entity Data Model is an attempt to abstract the universality of the data model--arguably the best feature of MDM--away from the persistent storage engine(s), in order to ensure the portability of the data model in the enterprise and beyond, i.e., to the cloud.
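As a minimal sketch of what "no dependency on the data store" can look like, consider the following Python example.  It is not LINQ to SQL or the Entity Data Model; every name in it (Customer, CustomerStore, rename_customer, and both store classes) is hypothetical and chosen only for illustration.  The business logic talks to an abstract interface, so a relational backend can be swapped for a non-relational one without touching the application.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class Customer:
    """Hypothetical domain object; the application works only with this."""
    id: int
    name: str


class CustomerStore(ABC):
    """The only persistence interface the application is allowed to see."""

    @abstractmethod
    def save(self, customer: Customer) -> None: ...

    @abstractmethod
    def find(self, customer_id: int) -> Optional[Customer]: ...


class SqliteCustomerStore(CustomerStore):
    """Relational backend: a plain table, no stored procedures."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def save(self, customer: Customer) -> None:
        # SQLite-specific upsert; the dialect detail stays inside this class.
        self._conn.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (customer.id, customer.name),
        )
        self._conn.commit()

    def find(self, customer_id: int) -> Optional[Customer]:
        row = self._conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


class InMemoryCustomerStore(CustomerStore):
    """Stand-in for a non-relational backend, e.g. a cloud key-value store."""

    def __init__(self) -> None:
        self._rows: Dict[int, Customer] = {}

    def save(self, customer: Customer) -> None:
        self._rows[customer.id] = replace(customer)  # store a copy

    def find(self, customer_id: int) -> Optional[Customer]:
        found = self._rows.get(customer_id)
        return replace(found) if found else None


def rename_customer(store: CustomerStore, customer_id: int, new_name: str) -> Customer:
    """Business logic lives in the application, not in the database."""
    customer = store.find(customer_id)
    if customer is None:
        raise KeyError(customer_id)
    customer.name = new_name
    store.save(customer)
    return customer


# The same application code runs unchanged against either backend.
for store in (SqliteCustomerStore(), InMemoryCustomerStore()):
    store.save(Customer(1, "Acme"))
    print(type(store).__name__, rename_customer(store, 1, "Acme Ltd"))
```

Had rename_customer been written as a stored procedure, the second backend would have had nothing to run it on; that is the trade the ORM advocates are pointing at.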