Saturday, September 29, 2012

Multithreaded programming is easy!

Multithreaded programming is supposed to be perilous. Even for experienced practitioners who are familiar with the concepts of multithreading, the cognitive load of this work is supposed to be too taxing for the human mind to handle reliably. Yet your typical server-side enterprise Java application is inherently multithreaded. Whether deployed on a full Java EE application server or something lighter like Tomcat, your typical application runs in a shared process space with hundreds or thousands of threads. Somehow, we manage to release and keep in production real working applications. And truth be told, in all my years as a programmer, very few of the bugs I have encountered actually had anything to do with multithreading logic.

At this point, some of you might expect me to offer some sort of secret to my "success": functional programming or some new framework or methodology, perhaps, or maybe my own special snake oil? But the more experienced among you are probably yawning, if you even made it this far. The fact is, enterprise Java programming at the application level is normally free of multithreading issues. The reason is a phrase I sometimes make fun of, but which is quite pertinent here: "best practices".

The real pain of multithreading comes from:

Shared. Mutable. State.

If you don't have state, there is no pain. If you have state but it's immutable, there is no pain. If you have mutable state but it's not shared, there is no pain. It is only if you have state that is both shared and mutable that threads can step on each other and you end up in a world of hurt. Now, an application that never changes is pretty much useless, so you must have state. And in Java, immutability is pretty much impractical. Even if you make all your object references final, the actual objects they refer to tend to be mutable. But we do have control over the sharing of this state. Here, best practices tend to make the sharing of state Someone Else's Problem.
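To make the "mutable but not shared" case concrete, here is a minimal sketch. The class and method names are made up for illustration. Each worker keeps its running total in a local variable that no other thread can see, and the results are only combined on the calling thread, so the code needs no locks of its own. The one shared piece, the thread pool, comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the names are illustrative only.
class ConfinedStateDemo {

    // Sums the integers in [from, to). The running total is mutable,
    // but it lives on this call's stack, so no other thread can touch it.
    static long sumRange(int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += i;
        }
        return total;
    }

    // Splits 0..n-1 into chunks, sums each chunk on a worker thread,
    // and adds up the partial results on the caller's thread.
    static long parallelSum(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(n, w * chunk);
                final int to = Math.min(n, from + chunk);
                Callable<Long> task = () -> sumRange(from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that worker to finish
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1_000_000, 8));
    }
}
```

The threads here never step on each other because there is nothing for them to step on: every mutable variable has exactly one owner.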

How does shared state become Someone Else's Problem? Think of a typical application's activity. A request comes in over the web, your servlet reads its inputs, reads stuff off the network or database, futzes with this stuff, and writes the result to the database. Or a message might come in off a message queue, or a scheduled job fires up. Same thing: wake up, read stuff from the network or database, fiddle with that stuff, and write the result out to the database. If you follow best practices such as keeping instance data out of servlets (which are shared by threads), you don't really keep shared application state in any area your code controls. So where's the shared state? It's in places like these:

  • The database
  • The cache (distributed or local)
  • The connection pool(s)
  • The message queues
  • The transaction manager

These all have something in common: they are all usually popular infrastructure code written by third parties. Third party infrastructure code -- such as a message broker -- really does need to deal with multithreading issues, and those issues look really nasty. Here be dragons. But being popular third party code, it has been thoroughly tested and proven in production before you even download the jar. Lots of smart people test, maintain and worry about this code so you don't have to. It's Someone Else's Problem.

If you are programming at the application level, you generally treat data and state like a hot potato. You pick it up, hold it as briefly as possible, and pass it on. Shared state is pushed to the periphery of your application, in the land of Someone Else's Problem. Consequently, you can write your code in a manner seemingly oblivious to threading issues. We can build complex, large-scale applications without worrying about hard-to-debug multithreading conflicts. Conversely, I get a sinking feeling in my stomach when I see consciously thread-aware code with lots of multithreading primitives: synchronized this or that, notify/wait etc. Here be dragons, and I'd much rather that these dragons remain in the land of Someone Else's Problem.
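As a rough illustration of the hot potato style, here is a sketch of a request handler with no mutable instance data. It is not tied to the servlet API. GreetingHandler and NameStore are hypothetical names, and NameStore stands in for whatever infrastructure (database, cache) actually owns the shared state.

```java
// Hypothetical handler, used the way a servlet is: one instance is shared
// by every request thread, so it carries no mutable fields.
class GreetingHandler {

    // Hypothetical stand-in for Someone Else's Problem (database, cache, ...).
    interface NameStore {
        String lookupName(int userId);
    }

    // Assigned once in the constructor and never changed afterwards.
    private final NameStore store;

    GreetingHandler(NameStore store) {
        this.store = store;
    }

    // Everything touched while handling a request lives in parameters
    // and local variables, which belong to the calling thread alone.
    String handle(int userId) {
        String name = store.lookupName(userId);     // pick it up
        StringBuilder reply = new StringBuilder();  // mutable, but local
        reply.append("Hello, ").append(name).append('!');
        return reply.toString();                    // pass it on
    }

    public static void main(String[] args) {
        GreetingHandler handler = new GreetingHandler(id -> "user" + id);
        System.out.println(handler.handle(7));
    }
}
```

Any number of threads can call handle on the same instance at once. Whether the lookup itself is safe is up to the store, which is exactly where that worry belongs.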

You may have heard the story of the man who interviewed drivers by asking the same question: "how close to the edge of a cliff can you drive?" He got a number of answers in inches or feet, but ultimately hired the guy who answered, "I don't know: I try to stay as far away as possible from the edge!" The same holds true for application developers: an application is more robust and easier to debug if developers have a little humility and don't try to be too clever. I'm sure you know all about multithreading and can argue the flaws of double-checked locking and such but please: I'm happy to see all that stuff in a job interview, but not in production code. Even if you are as clever as you think, the next person who has to work with your code may not be, and this stuff is hard to get right.
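For the lazy initialization that double-checked locking is usually aimed at, one way to stay away from the edge is the well-known initialization-on-demand holder idiom, sketched below. The class name and the "expensive" setup are invented for illustration. The point is that the JVM's class initialization rules do the locking, so there is no synchronized keyword or volatile field in the application code.

```java
// Hypothetical settings class; the name and contents are illustrative only.
class AppSettings {

    private final String region;

    private AppSettings() {
        this.region = "default"; // stand-in for some expensive setup
    }

    // The JVM initializes Holder the first time it is referenced, and it
    // guarantees that this happens exactly once, safely across threads.
    private static final class Holder {
        static final AppSettings INSTANCE = new AppSettings();
    }

    // No explicit locking here: class initialization is the JVM's problem.
    static AppSettings get() {
        return Holder.INSTANCE;
    }

    String region() {
        return region;
    }

    public static void main(String[] args) {
        System.out.println(AppSettings.get().region());
    }
}
```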

There are always exceptions, of course. You might be building infrastructure code such as the third party code I referenced earlier. Or the application does not fit the database-backed request/read/write pattern. Or … or … well, you can be the judge. But if you feel like gratuitously rolling your own multithreaded logic, please reconsider.

4 comments:

  1. There are more applications in the world that require multithreading than simple web applications. And even these often need tuning, e.g. by setting the correct isolation level, database locking or optimistic locking.

    Once you need to deal with concurrency in a more complex way than crud, it quickly starts to become complicated.

    Replies
    1. As I said, there are always exceptions. But even complex enterprise applications rarely need to use multithreading primitives. If you think it's more common, maybe you can name examples.

  2. Not sure how you define "Enterprise applications". I work in iBank and all trading and risk management systems are multi-threaded, i.e. the system cannot process just one trade at a time.

    Replies
    1. As I said, "your typical server-side enterprise Java application is inherently multithreaded". My point is that it's possible to safely write multithreaded Java applications without explicit multithreading constructs and the perils those imply.
