Monday, September 6, 2010

Real Software Engineering & The Agile Value Graph

Glenn Vanderburg (@glv) recently reprised his talk "Real Software Engineering" as a keynote at Ruby Hoedown 2010. In the words of Jim Weirich, "every software developer should hear this"... Motivated to see what all the buzz was about, I watched the recording of the same talk from earlier this year (http://j.mp/9dqw9Q). The thesis is simple--Agile IS the state of the art in Software Engineering, not some sort of anti-engineering or accidental engineering.

This, of course, is not news to most practitioners, but his approach to defending this thesis is, I think, pretty novel. He begins by quoting a paper from the proceedings of a NATO computing conference in 1969:
"A software system can best be designed if the testing is interlaced with the designing instead of being used after the design."

Pretty compelling huh?

He then frames the inception and growth of the Waterfall methodology as stemming from two errors: 1) the misinterpretation of a 1970 paper by Dr. Winston Royce by pointy-haired bosses (a la http://www.paulgraham.com/icad.html), and 2) a hidden bias in Barry Boehm's "cost of errors" graph, which was measured from Waterfall projects only.

I hope you are interested enough to check out the talk. I couldn't do it justice here. And you will find he draws several other valuable conclusions. Personally, I was compelled to read another of Vanderburg's papers referenced at the end of this talk, "Extreme Programming Annealed" (http://vanderburg.org/Writing/xpannealed.pdf), published in ACM SIGPLAN proceedings 2005.

In this paper he attempts an exposition of the coupling among the 12 (+1) XP practices, to understand how to cope when one of the practices is disallowed. I won't recapitulate that work here, but he did present an interesting arrangement of the practices in relation to time. To me there was an immediate correlation between the cost-of-errors graph and this arrangement.

As providence would have it, I had recently been thinking about visual depictions of the "value" of Agile/XP. The pointy-haired bosses need something simple(-ish) to understand why these Agile practices are important. Ideally, they'd like to see how they affect the bottom-line; everything in business is a trade-off, after all...

After a few iterations, I came up with something that I hope Edward Tufte wouldn't snarl at and Kent Beck might nod at approvingly. I present to you the "Value of Agile Methods" graph.




Some notes:

  • P(Error) is the probability of introducing an error. I make the facile assumption that there is a linear relationship between the frequency with which an activity is performed and the frequency of errors introduced by that activity.

  • P(ErrorAgile) is the result of flattening P(Error) with four particular Agile methods (see downward arrows); in Vanderburg's paper these are called noise-filters.

  • the "x-axis" is a logarithmic arrangement of time; in the case of the probability of error curves, this is how frequently an activity is performed; in the case of the cost of errors curve, it is the length of an interval between when an error is introduced and when it is discovered

  • the groupings of the time axis represent the "epochs" of different practices: engineering (code, test, vet req's etc.); process (define requirements, team meetings, project management, etc.); and strategy (direction set from executives, market research, etc.). There is obviously some overlap and flexibility in what activities correspond to which "epochs"



Inferences


I hope the graph would compel the following inferences:

  • Agile practices reduce the frequency of errors in engineering activities while reducing the time between when an error is introduced and when it is identified in both engineering and process epochs.

  • The cumulative effect of these practices is to reduce the probability of errors and limit the overall cost of errors.

  • When applying Agile practices, engineering errors remain frequent but cheap to fix, in stark contrast to errors in strategic decisions, which are infrequent but very, very expensive.



Call to Action


If you think this graph is interesting, flawed, has potential, etc., I encourage you to iterate on it and share it.
Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Saturday, September 4, 2010

Hipster Programming: Coding in F# on Mac OS X

The vitriol towards Microsoft has a long and graceless tradition, with a lexicon sporting such inventive neologisms as the venerable "M$FT" and the puerile "Total Failure System" in reference to Team Foundation Server (TFS). Oh, how these countless little slights must wound the souls of those folks toiling away under the yoke of stodgy corporate masters! Their inner light has been dimmed and filtered by a soulless, crushing capitalism, so much so that they unconsciously embrace their sameness, adopting a uniform of piqué polo shirts and khaki pants.


The iconoclasts, the individualists, the thinkers--these bear the noms de guerre "Rubyist", "Rails aficionado", "Python hacker", "open source developer". You will know them not by their similarity but by their diversity, their vestiture definitively anti-establishment, their mocking of any tool that you don't have to download from github and compile from source.


As in all things, there is a middle road, whose pedestrians are accused by one side of compromising with the Devil and by the other of "just trying to be different." Yet on this road we find many sagacious, experienced craftspeople who have spent time in both camps. With the reader's indulgence, I shall call them hipster programmers.


For what it's worth, I consider myself a hipster programmer, as evinced by how I spent my morning. Having been developing in .NET for nearly a decade and having in recent years been drawn strongly to functional programming, I am of course learning F#. But I'm also experimenting with Erlang, Ruby, and more recently Clojure. As far as platforms go, you've got a lot of choice these days: Windows, Linux, Mac OS X. With Windows you certainly need Cygwin and a git port. Linux is probably the easiest to get going on but lacks some of the day-to-day niceties (e.g. iTunes). Mac OS X can't be (legally) virtualized, so it makes an obvious choice as the base system.


My work machine is a MacBook Pro running VMware Fusion. My recent explorations of Clojure have kept me exclusively in the Java tools domain--Maven, Ant, NetBeans, etc.--but the Chicago Clojure user group recently invited Dr. David Miller to come talk about clojure-clr, a C# implementation of Clojure for the CLR. Very hipster! What could be cooler than that?


Well, porting clojure-clr to F# of course!


Sure, I could fire up Fusion and VS 2010 and start rocking, but I like to begin with the end in mind, namely moving seamlessly between writing Clojure that targets the CLR and Clojure that targets the JVM. I'd like to use similar tool chains. And I'd like to present to a group of the cool kids in their native tongue.


The first step is getting an environment set up. Robert Pickering has a great article on getting Mono set up on Mac OS X. Don't skip the 'sudo install-mono.sh' step. At the time of this writing there is a strange error in his instructions regarding the mono.snk file. The install-mono.sh file makes this pretty clear, so just ignore that part. Basically, you have to give the F# assemblies a proper strong name to add them to the GAC.


Now, I love vim--it was my first code editor--but the Ruby kids have been using TextMate, so let's use that, since someone kindly shared an F# bundle. That bundle uses iTerm to evaluate functions in F# Interactive, so we'll need that too. Finally, we need to alias "fsi" to run F# Interactive under Mono so the TextMate bundle will run it properly. Edit ~/.profile, adding this alias:



alias fsi='mono ~/path/to/fsi.exe'

So we've got the beginnings of a build environment going; what's next?


  • Get a branch going on github, referencing the clojure-clr project

  • Get a build system stack set up (Maven? Rake?)

  • Find a unit testing suite for F#/Mono/Mac OS X

  • Start hacking


Monday, August 9, 2010

REST: The Uniform Interface

The central feature that distinguishes the REST architectural style from other network based styles is its emphasis on a uniform interface between components. (Fielding’s Dissertation, Sec. 5.1.5)

Street RESTers will immediately submit that the uniform interface is simply having a URL that reads: “/controller/method/id”.  While this maps nicely to their server-side MVC application, it does not represent the necessary constraints to satisfy the uniform interface of REST.

In the dissertation Fielding defines four interface constraints for REST:

  1. Identification of resources
  2. Manipulation of resources through representations
  3. Self-descriptive messages
  4. Hypermedia as the engine of application state

The benefits realized by the uniform interface are best understood in terms of the effects upon the architecture when applying these constraints.

Identification of Resources

To understand this constraint we must first have a solid definition of a resource.  A resource is some named entity that is provided by our application.  On the web we generally use URLs to name these entities.  The normal notion of a URL is “a link to a web page”.  A URL is a special kind of Uniform Resource Identifier.  REST constrains the URIs we use in a few ways:

  • The semantics of the mapping of a URI to a resource must not change.  So, while the contents of example.com/Top10 can change over time, the thing that it names—e.g. the top 10 examples of the day—cannot.
  • A resource’s identity is independent of its value. So, two resources could point to the same value at some point in time, but they are not the same resource.
  • The provider of a resource is solely responsible for maintaining the semantic validity of the URI.  This just means that we should choose good URLs that are easy to maintain.
  • A URI should not contain any reference to the media type used to represent the resource; example.com/Top10/json is verboten.

This seems like a simple constraint, and it is, but its significant benefits include:

  • There is just one way to get at a particular resource.
  • The value of a resource at a point in time (its representation) can be served up in any appropriate media type at the time it is requested, based on the characteristics of the request (the Accept header); see the sketch after this list.
  • Since the semantics of resource identifiers are static, and the media type of the representation is determined at the time of the request, clients dependent on a resource do not have to change any identifiers in order for the content type to change.
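
To make the second and third benefits above concrete, here is a minimal sketch (in C#, as used elsewhere on this blog) of a client asking for the same resource in two different media types via the Accept header; the example.com/Top10 identifier is borrowed from the discussion above, and whether a given server actually offers both media types is an assumption of mine.

Code Snippet
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class ContentNegotiationSketch
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // One resource identifier; the media type of the representation is chosen per request.
        var uri = "http://example.com/Top10";

        // Ask for a JSON representation of the resource...
        var jsonRequest = new HttpRequestMessage(HttpMethod.Get, uri);
        jsonRequest.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        var jsonResponse = await client.SendAsync(jsonRequest);
        Console.WriteLine("JSON: " + await jsonResponse.Content.ReadAsStringAsync());

        // ...and then for an XML representation of the very same resource.
        var xmlRequest = new HttpRequestMessage(HttpMethod.Get, uri);
        xmlRequest.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/xml"));
        var xmlResponse = await client.SendAsync(xmlRequest);
        Console.WriteLine("XML: " + await xmlResponse.Content.ReadAsStringAsync());
    }
}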

Manipulation of Resources through Representations

The abstract notion of a resource named by a URI is reified by a representation in a media type selected based upon the nature of the request for that resource.  A representation contains the data and metadata describing the data, such as the media type of the data.  All of the perceived work done by a server in a REST architecture is initiated by a client either: a) requesting a resource, whereupon the server returns a representation of that resource; or b) sending a representation of a resource, whereupon the server nominally mutates or creates the resource.

There are additional details to the semantics of representation-based interactions, but the salient point is that URLs are not akin to call sites in a program.  These interactions are more akin to message passing.
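
Here is a complementary sketch of the second kind of interaction, sending a representation to the server; the URI and the JSON body are hypothetical, and whether the server creates, mutates, or refuses the resource is entirely its own concern.

Code Snippet
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class RepresentationTransferSketch
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // A representation: data plus the metadata (here, the media type) that describes it.
        var representation = new StringContent(
            "{ \"examples\": [ \"first\", \"second\" ] }",
            Encoding.UTF8,
            "application/json");

        // Send the representation to the resource identifier. Nothing here resembles a
        // remote method call; it is a message carrying the desired state of the resource.
        var response = await client.PutAsync("http://example.com/Top10", representation);
        Console.WriteLine((int)response.StatusCode + " " + response.ReasonPhrase);
    }
}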

An important and often overlooked consequence of this constraint is that there is no distributed consistency, and so the notion of a “transactional REST” is anathema:

REST just says that there is no consistency -- only representations that indicate state at some point in the past and an implicit grant of use for some time into the future. –Fielding on rest-discuss

Besides decoupling the resource from a particular representation, the benefits of this approach are seen in the application of the next two constraints.

Self-Descriptive Messages

All of the details required to route, interpret, and process a message must be in the message itself.  This enables the communication between components to be stateless and allows messages to be cached appropriately.

REST concentrates all of the control state into the representations received in response to interactions. The goal is to improve server scalability by eliminating any need for the server to maintain an awareness of the client state beyond the current request. (dissertation)

Hypermedia as the Engine of Application State

The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations. (dissertation)

This is probably the least applied of the REST constraints, especially with the proliferation of AJAX clients. The core idea is that the representation given to the client will have embedded hyperlinks that completely disambiguate what actions are available to interact with the server.
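
Here is a minimal sketch of what that means for a client; the link relations and the document shape are my own invention rather than any particular media type, but the point stands: the client’s next move is a choice among the links it was handed, not a URL it computed from out-of-band knowledge.

Code Snippet
using System;
using System.Collections.Generic;
using System.Linq;

// A toy model of a representation that embeds its available state transitions.
class Link
{
    public string Rel { get; set; }   // what the transition means, e.g. "payment", "cancel"
    public string Href { get; set; }  // where it leads; the client never constructs this itself
}

class Representation
{
    public string Body { get; set; }
    public List<Link> Links { get; set; } = new List<Link>();
}

class HypermediaClientSketch
{
    static void Main()
    {
        // Imagine this came back from the single well-known entry point of the API.
        var order = new Representation
        {
            Body = "order #42, state = submitted",
            Links =
            {
                new Link { Rel = "self",    Href = "http://example.com/orders/42" },
                new Link { Rel = "payment", Href = "http://example.com/orders/42/payment" },
                new Link { Rel = "cancel",  Href = "http://example.com/orders/42/cancellation" }
            }
        };

        // The next application state is selected from the transitions offered in the representation.
        var next = order.Links.FirstOrDefault(l => l.Rel == "payment");
        Console.WriteLine(next != null
            ? "Following link: " + next.Href
            : "That transition is not currently available.");
    }
}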

Dr. Fielding wrote an extended exposition of the hypermedia constraint wherein he makes explicit some rules that apply to a RESTful API.  He also says this,

A truly RESTful API looks like hypertext. Every addressable unit of information carries an address, either explicitly (e.g., link and id attributes) or implicitly (e.g., derived from the media type definition and representation structure). Query results are represented by a list of links with summary information, not by arrays of object representations (query is not a substitute for identification of resources).

Lest you think you can apply REST to your architecture sans this “hypermedia constraint”, Dr. Fielding further clarifies.

ROA is supposed to be a kind of design method for RESTful services, apparently, but most folks who use the term are talking about REST without the hypertext constraint. In other words, not RESTful at all. REST without the hypertext constraint is like pipe-and-filter without the pipes: completely useless because it no longer induces any interesting properties. (blog)

Towards a RESTful Client

Taken together, the constraints that engender the uniform interface of the REST architectural style force certain characteristics upon clients of a RESTful API.  Among these client characteristics are:

  • Ability to process one or more media types available for a resource’s representation.
  • Ability to maintain and manage state by selecting among hyperlinks.
  • Awareness of potential inconsistency of representations.
  • Entry into the interaction only from a known entry point.

If we are building RESTful web applications, we have to supply such a client that can run in a browser.  For example, the user agent can navigate to the URL that identifies the client. This client is assembled in the browser via normal mechanisms for displaying a web page.  The various JavaScript libraries loaded then augment the capabilities of the browser to exhibit the characteristics mentioned above while interacting with the RESTful API on the server.

Sunday, July 25, 2010

Dynamic Dispatch (Multimethods) in C#?

I’ve recently become enamored with the multimethods system of Clojure, as well as its approach to polymorphism and “type” hierarchies in general.  Having never heard the term, I consulted Wikipedia about multimethods, hoping to have its origins elucidated, though it appears it is simply a synonym of multiple dispatch.

Polymorphism, “the ability of one type to appear as and be used like another type”, is limited to inheritance with simple overloading semantics (and generics) in C#.  The ability to be “used as” another type is implemented by allowing subclasses to override any virtual methods defined on their respective superclasses.  The implementation that actually runs is then chosen at run time, based on the runtime type of the object on which the method is invoked (late binding), while overload resolution among methods of the same name happens at compile time, based on the static types of the arguments (early binding).

Let’s look at it from a more mechanical perspective.  In most OO languages, when we write obj.Foo() we are implicitly writing (Foo obj).  In other words, obj is the first argument in the invocation of Foo, an argument called this in many languages.  (See JavaScript’s call/apply.) So, in our inheritance example, only the runtime type of this first argument obj (the target of the invocation) is considered when determining which Foo implementation to call. This is called single dispatch.

What about the other arguments to the function?  Can we vary which function is called based on arguments other than the target (i.e. the first, implicit argument)?  Well, of course, we can have method overloads that take a different number and/or type of arguments.  However, unlike the first argument, there is no mechanism in C# to dispatch on the runtime (derived) types of these other arguments; there is no multiple dispatch in C#.
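
A small illustration of that limitation (my own example, not from the article below): the implicit receiver is dispatched on its runtime type, but overloads on ordinary arguments are resolved from their compile-time types.

Code Snippet
using System;

class Animal
{
    public virtual string Name { get { return "some animal"; } }
}

class Dog : Animal
{
    public override string Name { get { return "a dog"; } }
}

class Greeter
{
    public static string Describe(Animal a) { return "described as an animal"; }
    public static string Describe(Dog d)    { return "described as a dog"; }
}

class SingleDispatchDemo
{
    static void Main()
    {
        Animal pet = new Dog();

        // The implicit first argument (the receiver) is dispatched on its runtime type...
        Console.WriteLine(pet.Name);              // prints "a dog"

        // ...but the overload is chosen from the compile-time type of the argument.
        Console.WriteLine(Greeter.Describe(pet)); // prints "described as an animal"
    }
}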

This blog article, “The Visitor Pattern and Multiple Dispatch”, usefully explains the problem in terms of the Visitor pattern.  As a means of implementing double dispatch the Visitor pattern has some shortcomings.  First, the targets must be aware of and receive the visitor, violating the single responsibility principle.  Second, the visitor itself must evaluate the type of the target and invoke the correct method. Some suggest using reflection to invoke the correct method, but reflecting on the runtime type of the object and dynamically invoking the right method through its type can be expensive.

We can simplify that approach by using the dynamic keyword to effect dynamic dispatch.  By “boxing” the arguments to the target method (Foo above) as dynamic, we let the runtime do the reflection for us. In the code below, I build on the Visitor example in “The Visitor Pattern and Multiple Dispatch”; see line 28.

Code Snippet
  1. class Program
  2. {
  3.     abstract class Expression { }
  4.  
  5.     class ConstantExpression : Expression
  6.     {
  7.         public int constant;
  8.     }
  9.  
  10.     class SumExpression : Expression
  11.     {
  12.         public Expression left, right;
  13.     }
  14.  
  15.     class EvaluateVisitor
  16.     {
  17.         public int Visit(Expression e)
  18.         {
  19.             throw new Exception("Unsupported type of expression"); // or whatever
  20.         }
  21.  
  22.         public int Visit(ConstantExpression e)
  23.         {
  24.             return e.constant;
  25.         }
  26.         public int Visit(SumExpression e)
  27.         {
  28.             return Visit(e.left as dynamic) + Visit(e.right as dynamic);
  29.         }
  30.     }
  31.     
  32.     static void Main(string[] args)
  33.     {
  34.         var one = new ConstantExpression { constant = 1 };
  35.         var two = new ConstantExpression { constant = 2 };
  36.         var sum = new SumExpression { left = one, right = two };
  37.         var visitor = new EvaluateVisitor();
  38.         Console.WriteLine("Visit result {0}", visitor.Visit(sum));
  39.         Console.ReadKey();
  40.     }
  41. }

 

This technique has some utility but should be used wisely.  Obviously there will be some cost for “dynamic” dispatch.  It’s important to note that this isn’t a generalized system for multiple dispatch, just a great spot welding technique to make the Visitor pattern more palatable.

In contrast, Clojure’s multimethods allow you to define a dispatch function over the arguments whose result is evaluated and cached, while each “overload” declares the dispatch value it corresponds to.  In this way, dispatch on multimethods in Clojure can consider not only the “types” and values of all the arguments, but really any sort of inspection or evaluation you choose.
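
To make the contrast concrete, here is a rough C# sketch of that idea (a toy of my own, not how Clojure implements it): the dispatch function runs over all of the arguments, and its result selects the implementation.

Code Snippet
using System;
using System.Collections.Generic;

// A toy multimethod: a dispatch function computed over *all* of the arguments,
// plus a table of implementations keyed by the dispatch value.
class MultiMethod<TArg1, TArg2, TResult>
{
    private readonly Func<TArg1, TArg2, object> dispatch;
    private readonly Dictionary<object, Func<TArg1, TArg2, TResult>> methods =
        new Dictionary<object, Func<TArg1, TArg2, TResult>>();

    public MultiMethod(Func<TArg1, TArg2, object> dispatch) { this.dispatch = dispatch; }

    public void DefMethod(object dispatchValue, Func<TArg1, TArg2, TResult> impl)
    {
        methods[dispatchValue] = impl;
    }

    public TResult Invoke(TArg1 a, TArg2 b)
    {
        // Evaluate the dispatch function, then call the implementation registered for its result.
        return methods[dispatch(a, b)](a, b);
    }
}

class MultiMethodDemo
{
    static void Main()
    {
        // Dispatch on something other than types: the relative size of the two arguments.
        var compare = new MultiMethod<int, int, string>((a, b) => Math.Sign(a - b));
        compare.DefMethod(-1, (a, b) => a + " is smaller than " + b);
        compare.DefMethod(0,  (a, b) => "both are " + a);
        compare.DefMethod(1,  (a, b) => a + " is larger than " + b);

        Console.WriteLine(compare.Invoke(2, 5)); // 2 is smaller than 5
        Console.WriteLine(compare.Invoke(7, 7)); // both are 7
    }
}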

Tuesday, July 20, 2010

Using the Web to Create the Web

Wikis do this, as do blogs.

Fast JavaScript in browsers is enabling a new generation of programmers to develop applications completely in their browser.  While the most obvious commercial example is Force.com, there are many other ideas out there.  In no particular order:

The common connection between these frameworks is the notion of bootstrapping the web; that is, using the web to create the web.

If you’ll forgive the inchoate thoughts, let me attempt to connect some mental dots.

Dr. Alan Kay has of late been discussing the Smalltalk architecture of real objects (computers) all the way down and how this might improve the nature of software on the Internet.

In September 2009 in an interview, Dr. Kay said,

The ARPA/PARC research community tried to do as many things ‘no center’ as possible and this included Internet […] and the Smalltalk system which was ‘objects all the way down’ and used no OS at all. This could be done much better these days, but very few people are interested in it (we are). We’ve got some nice things to show not quite half way through our project. Lots more can be said on this subject.

This month in an interview with ComputerWorld Australia, Dr. Kay expounded,

To me, one of the nice things about the semantics of real objects is that they are “real computers all the way down (RCATWD)” – this always retains the full ability to represent anything. The old way quickly gets to two things that aren’t computers – data and procedures – and all of a sudden the ability to defer optimizations and particular decisions in favour of behaviours has been lost.

In other words, always having real objects always retains the ability to simulate anything you want, and to send it around the planet. If you send data 1000 miles you have to send a manual and/or a programmer to make use of it. If you send the needed programs that can deal with the data, then you are sending an object (even if the design is poor).

And RCATWD also provides perfect protection in both directions. We can see this in the hardware model of the Internet (possibly the only real object-oriented system in working order).

You get language extensibility almost for free by simply agreeing on conventions for the message forms.

My thought in the 70s was that the Internet we were all working on alongside personal computing was a really good scalable design, and that we should make a virtual internet of virtual machines that could be cached by the hardware machines. It’s really too bad that this didn’t happen.

Is OOP the wrong path? What is this RCATWD concept really about?  Doesn’t the stateless communication constraint of REST force us to think of web applications in the browser as true peers of server applications?  Should we store our stateful browser-based JavaScript applications in a cloud object database, in keeping with the Code-On-Demand constraint of REST?  Can we make them “real objects” per Dr. Kay?  Are RESTful server applications just functional programs?  If so, shouldn’t we be writing them in functional languages?

I definitely believe we can gain many benefits from adopting a more message-passing oriented programming style.  I would go so far as to say that OO classes should only export functions, never methods.  (They can use methods privately of course, to keep things DRY.)

I’ve written extensively in a never published paper about related topics: single-page applications, not writing new applications to build and deliver applications for every web site, intent-driven design, event sourcing, and others.  Hopefully I’ll find the time to return to that effort and incorporate some of this thinking.

RavenDB: In the Code, Part 1—MEF

If you’ve not heard of RavenDB, it’s essentially a .NET-from-the-ground-up document database taking its design cues from CouchDB (and MongoDB to a lesser degree). Rather than go into the details about its design and motivations, I’ll let Ayende speak for himself.

Instead, I would like to document some of the great things I’ve found in the codebase of RavenDB, as I read it to become a better developer.  This series of articles discusses RavenDB’s use of the following .NET 4 features.

  • Managed Extensibility Framework (MEF)
  • New Concurrency Primitives in .NET 4.0
  • The new dynamic keyword in C# 4

While discussing RavenDB’s use of these features, I hope to provide a gentle introduction to these technologies.  In this, the first post of the series, we discuss MEF.  For a very brief introduction to MEF and its core concepts, see the Overview in the wiki.

Managed Extensibility Framework

MEF was originally in the Patterns & Practices team and has since moved into the BCL as the System.ComponentModel.Composition namespace.  Glenn Block has nominated it as a plug-in framework and an application-partitioning framework, and has given many reasons why you may not want to attempt to use it as your inversion-of-control container (especially if you listen to Uncle Bob’s advice). RavenDB uses MEF to handle extensibility for its RequestResponder classes.

RavenDB’s communication architecture is essentially an HTTP server with a number of registered request handlers, not unlike the front-controller model of ASP.NET MVC.  Akin to MVC’s routes, each RequestResponder provides a UrlPattern and SupportedVerbs to identify the requests it will handle. A given RequestResponder will vary its work depending on the HTTP verb, headers, and body of the request.  It is in this sense that RavenDB can be considered RESTful (even if it isn’t; see street REST).

Code Snippet
  1. public class HttpServer : IDisposable
  2.     {
  3.         [ImportMany]
  4.         public IEnumerable<RequestResponder> RequestResponders { get; set; }

This HttpServer class dispatches requests to one of the items in RequestResponders. That collection is populated by MEF because of the ImportManyAttribute: MEF looks in its catalogs and finds that the RequestResponder class is exported, as are all of its subclasses; see below.

Code Snippet
  1. [InheritedExport]
  2. public abstract class RequestResponder

The InheritedExportAttribute ensures that MEF treats all subclasses of the attributed class as exports themselves.  So, if your class inherits from RequestResponder and MEF can see your class, it will automatically be considered for each incoming request.
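
A tiny, free-standing illustration of that behavior (toy types of my own, not RavenDB’s): the base class carries [InheritedExport], and MEF discovers every subclass without any further attributes.

Code Snippet
using System;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;

[InheritedExport]
public abstract class Responder
{
    public abstract string UrlPattern { get; }
}

public class DocsResponder : Responder
{
    public override string UrlPattern { get { return "^/docs"; } }
}

public class StatusResponder : Responder
{
    public override string UrlPattern { get { return "^/status"; } }
}

class InheritedExportDemo
{
    static void Main()
    {
        // A TypeCatalog keeps the example short; RavenDB uses assembly and directory catalogs.
        var catalog = new TypeCatalog(typeof(DocsResponder), typeof(StatusResponder));
        using (var container = new CompositionContainer(catalog))
        {
            // Neither subclass carries an [Export] of its own, yet both are exported as Responder.
            foreach (var responder in container.GetExportedValues<Responder>())
                Console.WriteLine(responder.GetType().Name + " handles " + responder.UrlPattern);
        }
    }
}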

How does MEF “see your class”? Out of the box, MEF lets you define what is discoverable in a number of useful ways. RavenDB makes use of these by providing its own MEF CompositionContainer.

Code Snippet
  1. public HttpServer(RavenConfiguration configuration, DocumentDatabase database)
  2. {
  3.     Configuration = configuration;
  4.  
  5.     configuration.Container.SatisfyImportsOnce(this);

Above, in the constructor of the HttpServer class, we see the characteristic call to SatisfyImportsOnce on the CompositionContainer. This instructs the container to satisfy all the imports for the HttpServer, namely the RequestResponders.  The configuration.Container property is below:

Code Snippet
  1. public CompositionContainer Container
  2. {
  3.     get { return container ?? (container = new CompositionContainer(Catalog)); }

And the Catalog property is initialized in the configuration class’ constructor like this:

Code Snippet
  1. Catalog = new AggregateCatalog(
  2.     new AssemblyCatalog(typeof (DocumentDatabase).Assembly)
  3.     );

So the container is created with a single AggregateCatalog that can contain multiple catalogs.  That AggregateCatalog is initialized with an AssemblyCatalog which pulls in all the MEF parts (classes with Import and Export attributes) in the assembly containing the DocumentDatabase class (more on that later).

That takes care of the built-in RequestResponders, because those are in the same assembly as the DocumentDatabase class.  If that smells like it violates orthogonality, you are not alone. But, I digress; what about extensibility? How does Raven get MEF to see RequestResponder plugins?

The configuration class also has a PluginsDirectory property; in the setter, is the following code.

Code Snippet
  1. if(Directory.Exists(pluginsDirectory))
  2. {
  3.     Catalog.Catalogs.Add(new DirectoryCatalog(pluginsDirectory));
  4. }

So, in Raven’s configuration you can specify a directory where MEF will look for parts.  That’s the raison d'être of MEF’s DirectoryCatalog, since a plugins folder is such a common deployment/extensibility pattern.  You can learn more about the various MEF catalogs in the CodePlex wiki.
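
Outside of RavenDB, the same “built-in assembly plus optional plugins folder” arrangement can be wired up roughly like this (reusing the toy Responder type from the earlier sketch; the folder name is my own choice):

Code Snippet
using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
using System.IO;
using System.Linq;
using System.Reflection;

public class Server
{
    // MEF fills this with every Responder it can find, built-in or plug-in.
    [ImportMany]
    public IEnumerable<Responder> Responders { get; set; }
}

class PluginCatalogDemo
{
    static void Main()
    {
        // Start with the parts in our own assembly, as RavenDB does with the
        // assembly containing DocumentDatabase...
        var catalog = new AggregateCatalog(
            new AssemblyCatalog(Assembly.GetExecutingAssembly()));

        // ...and add a plugins folder only if it exists, just like the setter above.
        var pluginsDirectory = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Plugins");
        if (Directory.Exists(pluginsDirectory))
            catalog.Catalogs.Add(new DirectoryCatalog(pluginsDirectory));

        using (var container = new CompositionContainer(catalog))
        {
            var server = new Server();
            container.SatisfyImportsOnce(server);
            Console.WriteLine("Responders discovered: " + server.Responders.Count());
        }
    }
}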

Now, the real extensibility story for RavenDB is its triggers.

RavenDB Triggers

The previously mentioned DocumentDatabase class is responsible for the high-level orchestration of the actual database work.  It maintains four groups of triggers.

Code Snippet
  1. [ImportMany]
  2. public IEnumerable<AbstractPutTrigger> PutTriggers { get; set; }
  3.  
  4. [ImportMany]
  5. public IEnumerable<AbstractDeleteTrigger> DeleteTriggers { get; set; }
  6.  
  7. [ImportMany]
  8. public IEnumerable<AbstractIndexUpdateTrigger> IndexUpdateTriggers { get; set; }
  9.  
  10. [ImportMany]
  11. public IEnumerable<AbstractReadTrigger> ReadTriggers { get; set; }

Following the same pattern as RequestResponders, the DocumentDatabase calls configuration.Container.SatisfyImportsOnce(this). So, the imports are satisfied in the same way, i.e. from DocumentDatabase’s assembly and from a configured plug-ins directory.

In RavenDB, triggers are the way to perform some custom action when documents are “put” (i.e. upserted), read, or deleted.  RavenDB triggers also provide a way to block any of these actions from happening.

Raven also allows for custom actions to be performed when the database spins up using the IStartupTask interface.

Startup Tasks

When the DocumentDatabase class is constructed, it executes the following method after initializing itself.

Code Snippet
  1. private void ExecuteStartupTasks()
  2. {
  3.     foreach (var task in Configuration.Container.GetExportedValues<IStartupTask>())
  4.     {
  5.         task.Execute(this);
  6.     }
  7. }

This method highlights the use of the CompositionContainer’s GetExportedValues<T>() method, which returns all of the IStartupTask implementations found in the catalogs created by the configuration object.
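
So a plug-in can hook database start-up simply by exporting an implementation of IStartupTask. A rough sketch follows; it would only compile against RavenDB’s assemblies, the namespaces are assumptions on my part, and the Execute(DocumentDatabase) shape is inferred from the task.Execute(this) call above.

Code Snippet
using System;
using System.ComponentModel.Composition;
// using Raven.Database;          // assumed: wherever DocumentDatabase lives
// using Raven.Database.Plugins;  // assumed: wherever IStartupTask lives

// A hypothetical start-up task. Whether the explicit [Export] is required depends on
// how IStartupTask itself is decorated; it is shown here to make the contract obvious.
[Export(typeof(IStartupTask))]
public class WarmUpTask : IStartupTask
{
    public void Execute(DocumentDatabase database)
    {
        // Runs once, right after the database has finished initializing itself.
        Console.WriteLine("Database is up: " + database);
    }
}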

Conclusion

We’ve seen three important extensibility points in RavenDB supported by MEF: RequestResponders, triggers, and startup tasks.  Next time, we’ll look at two more—view generators and dynamic compilation extensions—while learning more about RavenDB indices.

Monday, June 21, 2010

iPad: The InterPersonal Computer

There’s no shortage of information on the “how” of the iPad.  Apple’s reification of Alan Kay’s Dynabook makes no sacrifices in terms of processing, communications, display—even the audio is surprisingly good.  But what do the A4 system-on-a-chip, the IPS display, and WiFi/Bluetooth/3G add up to in terms of experience?

Having spent a week with the iPad, I feel compelled to write down my answers to that question. The iPad is nothing short of a joy in my home.  It’s the device we didn’t know we needed: the fourth screen.  It’s the morning paper, the evening magazine, and the after dinner board game.  It’s the vacation photo album, the argument settler, and the cookbook.  The iPad is the first interpersonal computer (iPC); the PC has artfully been disguised as an intelligent, portable screen that facilitates rather than stymies interpersonal interaction.

Why did we need this device? Surely I could use Wikipanion on my phone to settle the debate on the national language of Côte d’Ivoire during the game. But, I couldn’t show you the map of the region from across the room.  I definitely could have turned on my PC and connected my TV via DLNA to show our vacation photos. But I’d rather just hand you the album to scan at your leisure.  We could get all the tiles out, flip them over, mix them up, and arrange them, but board games are much more fun (and more apt to be played) when you don’t have to set them up or put them away. I could have done an internet search for recipes that included lemon balm and printed one out, but it’s nice to just go straight from searching to cooking. If we had kids the raison d'être of this latter-day Dynabook would be handsomely fulfilled by an interactive periodic table, a sketchbook, musical toys, and a huge library of books.  For now we’ll just have to settle for loving these apps as grown-ups.

This is a device for kids of all ages, to be sure.  Each app acts as a mask, transforming the iPad to a device well-suited to the task at hand. Who wouldn’t find something to enjoy?

I’d be remiss if I didn’t offer some lament or prognostication on future enhancements.  This device would be truly magical if I didn’t have to plug it in and synch with iTunes.  If my photos, videos, and music were just wirelessly transported from the cloud on demand and cached locally, I wouldn’t have to wait forever to synch or chew up a bunch of space with things I rarely want. 

I would love it if the apps knew me. Maybe a fingerprint scanner could be added to help applications identify me; when I launch a game or Twitter client, if I swiped my finger it could load my saved game or timeline.  An iPC should be like a family friend, a unique relationship with each of us, but impartial and accessible to all.

AT&T+Apple could score quite the coup if the 3G was free up to a certain level of usage.  How many snowbirds would buy an iPad to stay in touch with the family back home?  How many more  business travelers would be able to keep up with their inbox without having to be nickel-and-dimed all the time?  More importantly, they would have a device that would be complete out-of-the-box; thank you for purchasing this magical screen that is connected to everyone, everywhere, anywhere you are—right now.

Monday, May 17, 2010

The Law of Demeter and Command-Query Separation

My first encounter with the Law of Demeter (LoD) was in the Meilir Page-Jones book, Fundamentals of Object-Oriented Design in UML. It is also referenced in Clean Code by Robert C. Martin.  Basically, the law states that methods on an object should only invoke methods on objects in their immediate context: locally created objects, objects held as instance members, and arguments to the method itself.  This limits what Page-Jones calls the direct encumbrance of a class: the total set of types a class directly depends upon to do its work.  Martin points out that if an object effectively encapsulates its internal state, we should not be able to “navigate through it.”  Not to put too fine a point on it, but the kind of code we are talking about here is:

Code Snippet
  1. static void Main(string[] args)
  2. {
  3.     var a = new A();
  4.     a.Foo().Bar();
  5. }
  6.  
  7. class A
  8. {
  9.     public B Foo()
  10.     {
  11.         return new B(); // do some work and yield B
  12.     }
  13. }
  14.  
  15. class B
  16. {
  17.     public void Bar()
  18.     {
  19.  
  20.     }
  21. }

Martin calls line 4 above a “train wreck” due to its resemblance to a series of train cars.  Our Main program has a direct encumbrance of types A & B. We “navigate through” A and invoke a method of B.  Whatever Foo() does, A is not effectively encapsulating it; we could not transparently swap in an implementation that uses some other type C.

LoD is a heuristic that uses the encumbrance of a type as a gauge of code quality. We observe that effective encapsulation directly constrains encumbrance, so we can say that the Law of Demeter is a partial corollary of an already well-known OOP principle: encapsulation.  Another such principle is Command-Query Separation (CQS), as identified by Bertrand Meyer in his work on the Eiffel programming language.

CQS simply states that methods should be either commands or queries: they should either mutate state or return state without side effects.  Queries must be referentially transparent; that is, you can replace every query call site with the value returned by the query without changing the meaning of the program. Commands must perform some action but not yield a value.  Martin illustrates this principle quite succinctly in Clean Code.

Referring to our snippet above again, we can see that if CQS had been observed, Foo() would return void and our main program would never have received an instance of B.  CQS thus reinforces LoD, and both manifest as specializations of OO encapsulation.  Following these principles forces us to change the semantics of our interfaces, creating contracts that are much more declarative.
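
To make that concrete, here is one way the earlier snippet might look after such a refactoring (a sketch, not the only possible factoring): A exposes a command that does the work, B becomes an internal detail, and anything we need to know afterwards is exposed as a query.

Code Snippet
using System;

class Program
{
    static void Main(string[] args)
    {
        var a = new A();
        a.FooBar();                       // a command: do the work, return nothing
        Console.WriteLine(a.LastResult);  // a query: report state, change nothing
    }
}

class A
{
    private readonly B b = new B();       // B is now an encapsulated detail of A

    public string LastResult { get; private set; }

    // Command: performs the work that previously required navigating through A to reach B.
    public void FooBar()
    {
        b.Bar();
        LastResult = "done";
    }
}

class B
{
    public void Bar()
    {
        // do some work
    }
}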

CQS has many implications.  Fowler observes that CQS allows consumers of classes to utilize query methods with a sense of “confidence, introducing them anywhere, changing their order.”  In other words, CQS allows for arbitrary composition of queries; this is a very important concept in functional programming.  Queries in CQS are also necessarily idempotent, by definition; this is extremely important in caching.

In Martin’s discussion of LoD, he notes that if B above were simply a data structure—if all of its operations were Queries—we would not have a violation of LoD, in principle.  This is because LoD is really only concerned with the proper abstraction of Commands. From the C2 wiki,

In the discussion on the LawOfDemeter, MichaelFeathers offers the analogy, "If you want your dog to run, do you talk [to] your dog or to each leg? Further, should you be able to manipulate the dog's leg without it knowing about it? What if your dog wants to move its leg and it doesn't know how you left it? You can really confuse your dog."

To extend the analogy, when you are walking your dog, you don’t command its legs, you command the dog.  But, if you want a smooth walk, you’ll stop and wait if one of the dog’s legs is raised. It is in this sense that we can restate the Law of Demeter in terms of Command-Query Separation.

  • Whereas a Query of an object must:
    1. never alter the observable state of the object, i.e. the results of any queries;
    2. and return only objects entirely composed of queries, no commands.
  • A Command of an object must:
    1. constrain its actions to be dependent upon the observable state, i.e. Queries, of only those objects in its immediate context (as defined above);
    2. and hide from observers the internal state changes that result from its actions.

This restating is helpful in that it implies we refactor to CQS before applying LoD. In the first phase we clearly separate commands from queries. In the second phase we alter the semantics of our commands and queries to comply with LoD.  While the first phase is rather mechanical, it gives us a good starting point to reconsider the semantics of our objects as we bring them into compliance with LoD.

Thursday, April 22, 2010

Usability Apothegms

A common saying in computing is that “security is inversely proportional to usability”… or something like that.  As we critically examine the security of our systems, we realize we need to put measures in place that make the system harder to access and thus harder to use.  Good interaction design can help mitigate the usability issues, but at the end of the day a system that doesn’t require me to memorize a password or log in is easier to use than a system that does.

We can say definitively that security and usability exist in tension.

As software architects we seek simplicity in our designs in the name of maintainability, if not intelligibility.  We also seek modularity in the name of reusability. I submit that simplicity and modularity exist in tension.

Accepting a priori that simplicity is the absence of complexity, we can gauge the simplicity of a program by measuring its complexity.  A field of computer science called algorithmic information theory defines the complexity of something as the length of the shortest program that can produce it.  We might infer from this that a monolithic program (no components, no objects, no abstractions, etc.) is simpler than our common object-oriented code.  In general we can say that modularity implies no small increase in the use of abstractions to enable that modularity.

In object-oriented* systems, an increase in modularity results in a proportional increase in complexity.

I limit this to object-oriented systems purposefully.  In my experience, modularity is de rigueur in functional programming languages.

Monday, April 12, 2010

Converting Oranges to Apple’s: Meta-competition in the Platform Wars

Updates to the developer agreement in the new iPhone SDK restrict developers to using C, C++, or Objective-C to create their native iPhone applications.  That is, they must have been “originally written” in one of those languages.

This move has made its rounds among the pundits and bloggers.  Rather than rehashing any of those points, I’ll give my own opinions on why Apple made this move.

Make no mistake, this is about stewardship of the iPhone experience.  Steve Jobs has responded to the criticism (emphasis my own),

[…] intermediate layers between the platform and the developer ultimately produces sub-standard apps and hinders the progress of the platform.

Before you go saying they’re really just trying to kill Flash, let’s address that up front.  Flash does not run in Safari on the iPhone for the same reason that the developer agreement prohibits you from hosting a virtual machine in your iPhone application.  As the steward of the platform, Apple has to try to ensure that users have a consistently good experience.  Trying to squeeze a bloated SWF player into a memory- and processor-constrained device is just not workable currently.  As any iPhone user can attest, websites that are not optimized for mobile browsers are horribly slow to load.  Flash content would just make that worse.  And, guess what, my mom doesn’t know that your website or Flash animation is a pig; she just thinks her phone is slow.

Let’s look at it from another perspective.  The latest incarnation of Microsoft’s flagship development suite, Visual Studio 2010, has been re-written on their WPF platform.  Well, everything except the splash screen.  You see, it takes too long to load WPF and .NET to achieve the desired effect of a splash screen—giving you something to look at while the main application loads.  The iPhone developer guide makes it abundantly clear that you should load something as quickly as possible to give the user the sense that your app is responsive.  Try explaining to your mom that, yes, her phone has registered her tap on the application icon, but that it has to spin up a virtual machine or load some large libraries that abstract away Cocoa.

Next, try to explain to her why the extra cycles required by the VM and/or interoperability libraries drain her battery.  She’ll just be annoyed that she’s constantly plugging the darn thing in.

Apologies to all the technical moms out there.

So, Flash [1] is not the target here, at least not in the sense that Apple cares whether Flash is around or perceives it as a threat.  And, assuredly, we cannot say that stewardship of the performance of the phones is the only reason.  That’s part of Steve’s “sub-standard apps” argument, and it is not entirely convincing given that it becomes an optimization problem, and developers are notoriously good at those when they set their minds to it.  As a steward, though, you really cannot afford to wait for them to figure it out.  Nevertheless, I think the original ban on virtual machines and this subsequent tightening to a whitelist of languages come down to stewardship of a different kind: keeping out the riff-raff.

Keeping up with AppStore submissions is already pretty hard and very expensive.  Developers are often frustrated by the time it takes to get their updates into the store, especially developers used to patching websites with no downtime. Don’t imagine for a second that your $99 covers the cost of the program.  This is Apple’s investment in the ecosystem.  Imagine the deluge of useless, crappy submissions they will get if every Flash developer and every VB developer can just tick a box to target the iPhone.

But, it’s not just about raising the bar for entry to keep out sub-standard developers and their apps.  They have to look after all those developers who have committed to their platform.  They aren’t going to invest significant resources to enabling C#, VB, Python, Ruby, or Flash developers to compete against their own. 

Apple is not interested in seeing your app run on a Droid, Windows Mobile 7, and iPhone.  They are actively controverting homogenization of the mobile application marketplace.  They have an insurmountable lead[2] in the mobile applications marketplace, and the iPad, another device on the iPhone OS platform, marks the next step in their overall competitive strategy for leveraging that lead.  At the center of all of this is the AppStore, perhaps the single most valuable asset in computing today.  Whereas no one controls the Internet, Apple owns the AppStore outright with all the rights and responsibilities that entails; think of Salesforce.com and their AppExchange.

Microsoft has long understood that developers are what make their ecosystem work.  Steve Ballmer’s famous rallying cry, “Developers! Developers! Developers!”, is a poignant illustration of this reality.  Apple understands that to usurp Microsoft’s position in the PC and enterprise markets, and to continue their domination of consumer segments, they need a large population of developers who understand their platform.  What they don’t need are more ways for developers to “write once, run everywhere.”  In the platform wars, he who has the developers wins.

 

[1] Anyway, it isn’t just about Flash.  Look no further than Microsoft’s Windows Mobile 7 announcements to see that.  Silverlight, a Flash competitor to say the least, will be the native platform of those devices.  Where Flash is, Silverlight will surely follow.

[2] I believe that Google will continue to have limited success as a web-native device.  There is a significant segment of the marketplace that just wants a phone, email, and web-browsing device.  RIM has an Apple-style loyalty thing going for it.  Microsoft?  Well, they still have such a big footprint that they could fail and still make a huge impact.  I think the move to Silverlight helps shield their effort from their internal Windows-Office power structure enough that it actually has a chance of competing, but they are way too far behind to win.  There’s not a huge population of Silverlight developers in the world, after all.