Tuesday, January 18, 2011

Chasing abstractions

We Java enterprise developers love abstractions in code. I suspect the typical software geek comes up with new abstractions more often than he changes his underwear. I admit to writing code with premature abstractions, such as using a strategy pattern where I end up never using a different strategy. We code defensively, afraid that a lack of flexibility will hinder future changes. But beyond the code that we actually write, Java enterprise developers also love abstractions in third-party libraries. We chase abstractions in search of that fantasy land where, if we adopt this or that framework, we need not change our code to accommodate a hypothetical change in the underlying implementation. The result is comic levels of abstractions upon abstractions.
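To make the premature strategy pattern concrete, here is a minimal sketch. Every name in it (PriceFormatter, DollarFormatter, the two label methods) is hypothetical and invented for illustration; it is not taken from any real project.

```java
import java.util.Locale;

public class PrematureStrategy {
    // Hypothetical abstraction introduced "just in case".
    interface PriceFormatter {
        String format(double amount);
    }

    // The only implementation that ever gets written.
    static final class DollarFormatter implements PriceFormatter {
        @Override
        public String format(double amount) {
            return String.format(Locale.US, "$%.2f", amount);
        }
    }

    // Strategy version: callers must construct and pass a formatter.
    static String labelWithStrategy(PriceFormatter formatter, double amount) {
        return "Total: " + formatter.format(amount);
    }

    // Direct version: same behavior, one less type to maintain.
    static String labelDirect(double amount) {
        return "Total: " + String.format(Locale.US, "$%.2f", amount);
    }

    public static void main(String[] args) {
        System.out.println(labelWithStrategy(new DollarFormatter(), 12.5));
        System.out.println(labelDirect(12.5));
    }
}
```

Both calls print the same label. Until a second formatter actually shows up, the interface buys nothing except an extra indirection to read through.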



Take logging, for example. A logging framework is basically an abstraction over "System.out.println". The Java world had log4j, and it was pretty good. Then the geniuses at Sun decided to build an inferior, incompatible logging facility into the JDK, and the Java world decided it now had to support both. So we got Apache's commons-logging, which abstracts over log4j and JDK logging so you can switch between the two without changing your logging calls. Some people did not like commons-logging and wrote SLF4J, which is yet another API. If you were worried before about having a choice of two logging frameworks, you now have a choice of two abstractions over two logging frameworks.

Fortunately, we are saved from needing another abstraction over these abstractions by the fact that SLF4J has all this plumbing that lets you redirect commons-logging/JDK logging/log4j/SLF4J calls to commons-logging/JDK logging/log4j/SLF4J. In other words, it can be a bridge instead of a facade. On the question of logging APIs, I like SLF4J myself, but I think if longtime log4j users had ignored all those wrappers they would still be doing fine. After all, why would they need to drop log4j?
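The difference between a bridge and a facade is easier to see in code. Below is a rough sketch of the bridge idea using only the JDK's own java.util.logging: a Handler that forwards records somewhere else while the call sites stay untouched. The BACKEND list and the BridgeHandler and bridged names are hypothetical stand-ins; SLF4J's real bridges are more involved than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class BridgeSketch {
    // Hypothetical stand-in for the backend you actually want to log to.
    static final List<String> BACKEND = new ArrayList<>();

    // A JDK logging Handler that forwards every record it receives to the backend.
    static final class BridgeHandler extends Handler {
        @Override
        public void publish(LogRecord record) {
            BACKEND.add(record.getLevel() + " " + record.getLoggerName()
                    + " - " + record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Installs the bridge on a named logger and stops records from
    // also going to the parent (console) handlers.
    static Logger bridged(String name) {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false);
        logger.addHandler(new BridgeHandler());
        return logger;
    }

    public static void main(String[] args) {
        Logger log = bridged("legacy.module");
        // Existing call site, unchanged: it still uses the JDK logging API.
        log.info("order received");
        System.out.println(BACKEND);
    }
}
```

The point of the sketch is that the redirection is installed once, at setup time. The code that calls log.info never learns that its output ends up somewhere else.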

Messaging is another example. JMS is an API abstraction over asynchronous message queues, and I think it is one of the better things to come out of J2EE. Thanks to JMS, we have a universal API over diverse message queue implementations that have their own native APIs. JMS has worked pretty well in a lot of enterprise apps, so naturally the Java ecosystem must come up with an abstraction over JMS. The abstractions du jour here are the EIP (Enterprise Integration Patterns) products like Apache Camel and Spring Integration, or even an ESB, that abstract over heterogeneous endpoints. The key here is that these are integration frameworks designed to connect heterogeneous endpoints, yet I have met people who are eager to use them on greenfield projects. True, Camel's DSL is seductively elegant, but on a single project or a brand new project, where you can choose all the technologies and control all the code, it seems to be a solution in search of a problem.

There are a number of reasons I am uncomfortable with this eagerness to slap layers of libraries on top of perfectly useful libraries:
  • More moving parts: more stuff running means more things to maintain and more things that can go wrong. Once you have a Camel engine running you have another item that needs to scale and cluster. There is a price to pay for each additional piece of technology. You have to profile it, follow the forums, configure it, migrate when a new version appears, resolve library conflicts, file bug reports, Google for solutions to odd behaviors, and so on.
  • Abstractions hide: they can obscure key functionality under the covers. If you used commons-logging, you would not have been able to use log4j's MDC feature through it. If you are sending messages with an enterprise integration framework, are you sure that key JMS features like message priority, grouping, selectors and headers/properties are exposed and faithfully translated?
  • You may pick a loser. Choosing an abstraction does not necessarily isolate you from the burden of choice. You merely end up with choices at a different level: Camel vs Spring Integration, or commons-logging vs SLF4J.
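The "abstractions hide" point can be sketched in a few lines. Everything here is hypothetical: RichLogger stands in for a backend with a diagnostic context feature (loosely in the spirit of log4j's MDC), and SimpleLog stands in for a lowest-common-denominator facade. Neither is a real library API.

```java
import java.util.HashMap;
import java.util.Map;

public class LeakyFacade {
    // Hypothetical rich backend: it can attach a context map to its output.
    static final class RichLogger {
        private final Map<String, String> context = new HashMap<>();

        void put(String key, String value) {
            context.put(key, value);
        }

        String log(String message) {
            return context.isEmpty() ? message : message + " " + context;
        }
    }

    // Hypothetical facade exposing only what every backend has in common.
    interface SimpleLog {
        String log(String message);
    }

    // Wraps the backend; the context feature is not reachable through the result.
    static SimpleLog facadeOver(RichLogger backend) {
        return backend::log;
    }

    public static void main(String[] args) {
        RichLogger backend = new RichLogger();
        SimpleLog facade = facadeOver(backend);

        // Through the facade alone there is no way to attach context.
        System.out.println(facade.log("login"));

        // Reaching around the facade works, but the calling code now
        // depends on the backend again, which defeats the facade.
        backend.put("user", "alice");
        System.out.println(facade.log("login"));
    }
}
```

Code written purely against SimpleLog can never use the context feature. The only way to get it is to depend on the backend directly, at which point the facade has stopped paying for itself.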

That painting above is part of Thomas Cole's four-painting series, "The Voyage of Life". I saw a copy of that painting in a CS professor's office in college. He was a descendant of Thomas Cole, a connection I did not make despite addressing him as "Professor Cole" for months. He picked that painting because it was the most cheerful. That was "Youth". The next painting in the series is "Manhood":


Oops. The boy in "Youth" was so intent on chasing his castle in the sky that he did not notice the rough waters round the bend.

The problem with these abstractions is that they only abstract over solutions to problems we know about. The EIP frameworks actually solve more problems than messaging, but if we merely adopt one to abstract over messaging, it does not necessarily add value. That is, it does not solve any additional problems that we have. The result is that we accumulate useless code that weighs us down, like the treasure in the man's boat. When the real, unexpected problems arise, we have to solve those problems with a codebase that is unnecessarily complex and hard to change.

My own inclination is to choose the most appropriate solution for the problem at hand. The solution will be lighter, simpler and easier to get started with. It is possible that I might need these abstractions in the future, but these can be added retroactively. Both SLF4J and enterprise integration frameworks can serve as bridges rather than facades, requiring no change in the original code. Or I might never need them, in which case I would be saved a lot of unnecessary complexity.
