Monday, May 17, 2010

The Law of Demeter and Command-Query Separation

My first encounter with the Law of Demeter (LoD) was in the Meilir Page-Jones book, Fundamentals of Object-Oriented Design in UML. It is also referenced in Clean Code by Robert C. Martin.  Basically, the law states that methods on an object should only invoke methods on objects in their immediate context: locally created objects, instance members of the containing object, and arguments to the method itself.  This limits what Page-Jones calls the direct encumbrance of a class: the total set of types a class directly depends upon to do its work.  Martin points out that if an object effectively encapsulates its internal state, we should not be able to “navigate through it.”  Not to put too fine a point on it, but the kind of code we are talking about here is:

Code Snippet
  1. static void Main(string[] args)
  2. {
  3.     var a = new A();
  4.     a.Foo().Bar();
  5. }
  6.  
  7. class A
  8. {
  9.     public B Foo()
  10.     {
  11.         return new B(); // do some work and yield B
  12.     }
  13. }
  14.  
  15. class B
  16. {
  17.     public void Bar()
  18.     {
  19.  
  20.     }
  21. }

Martin calls line 4 above a “train wreck” due to its resemblance to a series of coupled train cars.  Our Main program has a direct encumbrance of types A and B. We “navigate through” A and invoke a method of B.  Whatever A does inside Foo(), it is not effectively encapsulating its use of B; we cannot transparently change A to an implementation that uses some other type C instead.
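One way to remove the train wreck is sketched below, assuming callers only ever need the combined effect of Foo() followed by Bar(). The names FooBar, BarCalls, and Calls are hypothetical; they are not part of the original snippet and exist only so the effect can be observed.

```csharp
class B
{
    // Query: how many times Bar() has run (hypothetical, for illustration).
    public int Calls { get; private set; }

    // Command: performs the work and returns nothing.
    public void Bar()
    {
        Calls++;
    }
}

class A
{
    // B is an instance member, so it is in A's immediate context
    // and is now purely an implementation detail of A.
    private readonly B b = new B();

    // Command: does the work Foo() used to do, then delegates to B.
    // Callers never receive the B instance.
    public void FooBar()
    {
        b.Bar();
    }

    // Query: exposes a plain value rather than the collaborator itself.
    public int BarCalls
    {
        get { return b.Calls; }
    }
}
```

The call site shrinks to a.FooBar(), so Main is encumbered by A alone, and A is free to swap B for a C without any caller noticing.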

LoD is a heuristic that leverages the encumbrance of a type to determine code quality. We observe that effective encapsulation directly constrains encumbrance, so we can say that the Law of Demeter is a partial corollary to an already well-known OOP principle: encapsulation.  Another such principle is Command-Query Separation (CQS), as identified by Bertrand Meyer in his work on the Eiffel programming language.

CQS simply states that methods should be either commands or queries: they should either mutate state or return state without side effects.  Queries must be referentially transparent; that is, you can replace every query call site with the value returned by the query without changing the meaning of the program. Commands must perform some action but not yield a value.  Martin illustrates this principle quite succinctly in Clean Code.
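To make the distinction concrete, here is a minimal sketch using a hypothetical Account class. It is my own illustration, not an example taken from Meyer or Martin.

```csharp
using System;

class Account
{
    private decimal balance;

    // Query: returns state and changes nothing, so asking twice
    // in a row always gives the same answer.
    public decimal Balance
    {
        get { return balance; }
    }

    // Query: answers a question about the current state.
    public bool CanWithdraw(decimal amount)
    {
        return amount > 0 && amount <= balance;
    }

    // Command: mutates state and yields no value.
    public void Deposit(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        balance += amount;
    }

    // Command: mutates state and yields no value. A caller that wants to
    // know whether it will succeed asks CanWithdraw() first.
    public void Withdraw(decimal amount)
    {
        if (!CanWithdraw(amount)) throw new InvalidOperationException("Insufficient funds.");
        balance -= amount;
    }
}
```

A method such as bool TryWithdraw(decimal amount), which both changes the balance and reports a result, would be the kind of mixed command and query that CQS asks us to split.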

Referring to our snippet above again, we can see that if CQS had been observed, Foo(), which does work and is therefore a command, would return void, and our main program would never have received an instance of B.  CQS thus reinforces LoD, and both manifest as specializations of OO encapsulation.  Following these principles forces us to change the semantics of our interfaces, creating contracts that are much more declarative.

CQS has many implications.  Fowler observes that CQS allows consumers of classes to utilize query methods with a sense of “confidence, introducing them anywhere, changing their order.”  In other words, CQS allows for arbitrary composition of queries; this is a very important concept in functional programming.  Queries in CQS are also necessarily idempotent, by definition; this is extremely important in caching.
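The caching point can be sketched with a small memoizing wrapper. QueryCache and Memoize are hypothetical names of my own, and the sketch assumes the wrapped query's result depends only on its argument; if a command can change the state behind the query, the cache would have to be invalidated.

```csharp
using System;
using System.Collections.Generic;

static class QueryCache
{
    // Wraps a side-effect-free query so that repeated calls with the same
    // argument reuse the first result. This is safe only because the query
    // obeys CQS: skipping the second call cannot change program meaning.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> query)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            TResult result;
            if (!cache.TryGetValue(arg, out result))
            {
                result = query(arg);   // first call: run the real query
                cache[arg] = result;   // later calls: served from the cache
            }
            return result;
        };
    }
}
```

Wrapping a command this way would silently drop its second and later invocations, which is exactly why the guarantee has to come from the query side of the separation.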

In Martin’s discussion of LoD, he notes that if B above were simply a data structure—if all of its operations were Queries—we would not have a violation of LoD, in principle.  This is because LoD is really only concerned with the proper abstraction of Commands. From the C2 wiki,

In the discussion on the LawOfDemeter, MichaelFeathers offers the analogy, "If you want your dog to run, do you talk [to] your dog or to each leg? Further, should you be able to manipulate the dog's leg without it knowing about it? What if your dog wants to move its leg and it doesn't know how you left it? You can really confuse your dog."

To extend the analogy, when you are walking your dog, you don’t command its legs, you command the dog.  But if you want to have a smooth walk, you’ll stop and wait if one of the dog’s legs is raised. It is in this sense that we can restate the Law of Demeter in terms of Command-Query Separation.

  • Whereas a Query of an object must:
    1. never alter the observable state of the object, i.e. the results of any queries;
    2. and return only objects entirely composed of queries, no commands.
  • A Command of an object must:
    1. constrain its actions to be dependent upon the observable state, i.e. Queries, of only those objects in its immediate context (as defined above);
    2. and hide the internal state changes that result from its actions from observers.
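The rules above can be sketched with the dog analogy. Everything here (ILegView, Dog, Walker, Scratch, StopScratching, StepsTaken) is a hypothetical illustration of my own, not code from the C2 wiki or from Martin.

```csharp
using System.Collections.Generic;
using System.Linq;

// A query-only view of a leg: it offers no commands (Query rule 2).
public interface ILegView
{
    bool IsRaised { get; }
}

public class Dog
{
    // Private nested type: outsiders cannot name it, so they cannot
    // cast an ILegView back to something with commands on it.
    private class Leg : ILegView
    {
        public bool IsRaised { get; private set; }
        public void Raise() { IsRaised = true; }
        public void Lower() { IsRaised = false; }
    }

    private readonly Leg[] legs = { new Leg(), new Leg(), new Leg(), new Leg() };
    private int steps;

    // Query: does not alter observable state (Query rule 1).
    public int StepsTaken
    {
        get { return steps; }
    }

    // Query: hands out only query-only objects (Query rule 2).
    public IEnumerable<ILegView> Legs
    {
        get { foreach (Leg leg in legs) yield return leg; }
    }

    // Commands: only the dog moves its own legs; observers see the
    // outcome solely through the queries above (Command rule 2).
    public void Scratch() { legs[0].Raise(); }
    public void StopScratching() { legs[0].Lower(); }
    public void Step() { steps++; }
}

public class Walker
{
    // Command: decides what to do from queries on its argument only,
    // and issues commands only to the dog itself (Command rule 1).
    public void Walk(Dog dog)
    {
        if (dog.Legs.Any(leg => leg.IsRaised))
        {
            return; // stop and wait while a leg is raised
        }
        dog.Step();
    }
}
```

Note that Walker does navigate through dog.Legs to reach leg.IsRaised, but only along queries; under the restated law that is permitted, whereas a call like leg.Raise() from Walker is not even expressible.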

This restatement is helpful in that it implies that we refactor to CQS before applying LoD. In the first phase we clearly separate commands from queries. In the second phase we alter the semantics of our commands and queries to comply with LoD.  While the first phase is rather mechanical, it gives us a good starting point to reconsider the semantics of our objects as we bring them into compliance with LoD.