Monday, July 2, 2007

Replace enum Constructs with Classes

Anyone who knows me knows I'm perfectly happy writing C# code. But I think it's important to keep tabs on what the rest of the industry is doing, especially things that come out of the Java space. Witness the omniscient debugger, Guice, and db4o. Along those lines, I've read a bit from a book entitled Effective Java, written by Joshua Bloch, and I'm happy to report that it has made me a better programmer. My favorite gem, so far, is something that I attempted to do about five years ago: replace enums with classes.



First, some background on my initial motivation. What I kept running into were string values that corresponded to various codes kept in the database. You know what I mean: you've got addresses and address types (e.g. home & work, or mailing & billing, etc.). So, you store a string that indicates what type of address it is and expose that as a property of your address object. The result has a distinct code smell:



if (addr.AddressType == "M")


Of course, this looks like a magic number, so to speak... and comes with all the attendant difficulties in maintainability of the code: what is "M", what does "M" mean, what happens when we phase out "M"? We could probably make an accurate guess given something as prosaic as addresses, but what about exception policies or authentication modes?



What I wanted at the time was a "typesafe string enum". Put in more concrete terms, I wanted to do the following:


if(addr.AddressType == AddressTypes.Billing)


Now, Java had no built-in enum support when Bloch wrote his book (the enum keyword only arrived later, in Java 5), whereas .NET supports named integral enumerations as a first-class type. So, we could implement the above with the following declaration:



public enum AddressTypes
{
    Mailing,
    Billing,
    Residence
}

This would work fine, except that we need some fixed mechanism for identifying address types outside of our code (i.e. in the database or in an XML wire-serialization). So, we explicitly assign a value to each constant:
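
A minimal sketch of such a declaration follows. Only Billing = 2 is implied by the examples later in this post; the values for Mailing and Residence are assumptions chosen for illustration.

```csharp
// Explicit values pin each constant to the code stored outside the program
// (database column, wire format, configuration file).
// Only Billing = 2 is implied by the surrounding text; 1 and 3 are assumed.
public enum AddressTypes
{
    Mailing = 1,
    Billing = 2,
    Residence = 3
}
```

With the values fixed this way, reordering or inserting constants in the source no longer changes what gets persisted.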





Okay, we're golden! Or, are we? Let's look at this design choice in practice. Consider what a customer address record might look like:



CustomerID  AddressType  AddressLine1    AddressLine2  ...
243843      2            1032 North St.  NULL

Hmmm, that's not so bad. We could use a lookup table and a view to make reporting and querying easier. What about XML serialization of a customer object:




Still, that's not terrible, but it could be better. There's a very good reason why XML is human-readable: it makes consuming data easier for systems other than the one where the data originated. As it stands, when the marketing department hires someone to come in and integrate their CRM system with your customer database, that person has to figure out that "2" means "Billing".



There's that magic number again! What we have implicitly done with the enum serialization is leak an implementation detail, namely how we represent Billing addresses in our system. We have violated the encapsulation principle. Cue the warning music (dum-dum DUM)!



It's not just other systems that will have some difficulty with these magic numbers. Consider how you would store a default address type in a configuration file, for example defaultAddressType=2.



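There are other difficulties with enums. Probably the most pertinent is related to how enums are implemented. Specifically, any integral value can be cast to an enum type, whether or not a named constant exists for it. (Only the literal 0 converts implicitly; everything else needs an explicit cast, but the cast itself is never checked.) In other words:


if (addr.AddressType == (AddressTypes)2) //perfectly valid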

Well, that's fine, right? I mean, we're storing 2 in the database as well, so we can't go changing the enum values. The problem is we've effectively lost the raison d'être of our enumeration, i.e. type safety. Imagine data coming in to the system from that marketing consultant. How would you perform a validity check on AddressType?
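
One way to sketch such a check is with Enum.IsDefined from the .NET base class library. The AddressTypeChecks.IsValid helper, the explicit enum values, and the sample input 6 are assumptions made for illustration, not part of any existing system.

```csharp
using System;

// Assumed values; only Billing = 2 is implied by the surrounding text.
public enum AddressTypes
{
    Mailing = 1,
    Billing = 2,
    Residence = 3
}

public static class AddressTypeChecks
{
    // Hypothetical helper: true only when the number matches a declared constant.
    public static bool IsValid(int rawValue)
    {
        return Enum.IsDefined(typeof(AddressTypes), rawValue);
    }
}

public static class Program
{
    public static void Main()
    {
        // The cast compiles and runs without complaint,
        // even though no constant has the value 6.
        AddressTypes bogus = (AddressTypes)6;
        Console.WriteLine(bogus);                         // prints 6

        // The check has to be made by hand at every entry point.
        Console.WriteLine(AddressTypeChecks.IsValid(6));  // prints False
        Console.WriteLine(AddressTypeChecks.IsValid(2));  // prints True
    }
}
```

Nothing in the type system forces callers to run this check, which is the weakness discussed here: the enum type alone does not guarantee a valid value.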



There are lots of problems with this situation, and they all revolve around how enums are implemented. For example, nothing stops you from casting 6 to an AddressTypes instance... until you try to persist that value to your database and you get a referential integrity violation.


Obviously, we need a better solution. And that is where the concept of a strongly-typed string enum comes into play. Bloch's basic concept of the "typesafe enum pattern" is outlined in Item 21 of his aforementioned book. More important is his discussion of the different ways the pattern can be used, e.g. extensible implementations (inheritance) and comparable implementations (sorting). Most of what he says applies to what we are going to build as well, despite being focused on Java.



We are going to implement what I would call a typical usage of the pattern, based on my experience. Along the way we will leverage some of .NET's strengths and find ourselves with a much more expressive and extensible way to represent short sets of values in our systems. For now, I hope I've made the case for why we need the typesafe enum pattern and that you'll join me next time for an exposition of my .NET implementation of the pattern.