This was emphasized to me a couple of days ago when I attended the Boston session of the FogBugz and Kiln World Tour, presented by Joel Spolsky and friends. Now, Joel is a big booster of DVCS, and Kiln is built on top of Mercurial. But during the "DVCS University" presentation, what struck me about Kiln is that it requires a central, authoritative repository. In other words, they took the D out of DVCS. I asked them about the missing D, and the presenter made it clear that the centralized repository is the "right" way to do it. But if you have a centralized repository, you no longer have a distributed VCS. In effect, Fog Creek uses Mercurial as a "better Subversion". There is no D there.
Questioning the D
Like just about all DVCS advocates, Joel's crew emphasized the ease of merging as the main selling point of DVCS over centralized VCS. We know that merging with Subversion stinks, but why is good merging synonymous with using DVCS? Some also emphasize offline commits, but again, why is that essential to DVCS? I suggest that the attractiveness of Git or Mercurial in most circumstances has nothing to do with distributed version control.
Keep in mind, folks, how DVCS became popular. You may have heard of this guy named Linus Torvalds. Running the entire Linux kernel development project is a major undertaking, yet for a while Linus would not use a version control system. None suited his needs at the time, until BitKeeper came along. This DVCS worked well for him, but controversy over the commercial license forced him to give it up. In true Linus fashion, he went ahead and wrote his own, and Git was born. Mercurial too was written separately to replace BitKeeper. My point is that modern DVCS was born to address the needs of open source Linux development. Linux development is hierarchical: Linus gets changesets from trusted subsystem maintainers, who in turn receive code from other maintainers. A hierarchy of varying trust and authority implies a hierarchy of independent repositories. DVCS fits this use case perfectly.
But distributed, independent source repositories are not how most commercial development works. You want backups and security. You don't want team members to "go dark" with their work for extended periods only to push mega-changes onto your team's code, breaking everything in sight. Early, frequent integration means you want to keep your code unified. So in practice, your source repository will be centralized. When people long for Git or Mercurial, it's because they want better merging in their VCS, and maybe offline commits. They are not clamoring to maintain peer source repositories. There is no D there.
Is merging that bad?
As an aside, I already acknowledged that Subversion stinks at merging. Yet I frequently do merges on Subversion, and almost always painlessly. Subversion merging is not as bad, and DVCS merging is not as effortless, as the DVCS boosters make it sound. Subversion bashers don't always acknowledge that modern Subversion (since version 1.5) tracks merged changes in the svn:mergeinfo property. Also, Git users can turn their zeal into self-fulfilling prophecy by using git-svn to do their SVN merging: this tool doesn't handle mergeinfo, subverting Subversion's merge metadata and ensuring future grief. In my experience, the real pain of merging is essential rather than accidental: there are real code conflicts, and a real human needs to reconcile these conflicts line by line. I have read a lot of hand-waving, but I have never seen a concrete example of how a DVCS better handles essential conflicts in merging.
Back to the future
What if you could get the benefits of better merging and offline commits without distributed versioning? Well, check out Subversion's development roadmap. Future enhancements include:
- Commit shelving: you can "shelve" current work offline and revert your working copy for other work, then "unshelve" that work later to resume your original work.
- Checkpoint: a form of offline commit that lets you "commit" a revertible stack of changes offline.
- Better merging from a revamped working copy (WC) metadata library
- Rename tracking (Finally! Will help merging)
- Improved tree conflict handling
If a Subversion user can get the primary benefits of DVCS without switching to a DVCS, why switch? If "DVCS" for you means getting better merging and offline commits, and you are already using Subversion, then you do not need to seek DVCS. DVCS will come to you. There are those who will genuinely need the distributed repository nature of a DVCS, but I suspect that number is far smaller than those who merely want better merging.
If "Future Subversion" is all you need, should you still go with a DVCS? DVCS advocates argue that a DVCS offers a superset of traditional VCS. The problem with that argument is that any unnecessary superset adds unnecessary conceptual complexity. If I only need an "svn commit", I should not need to execute both an "hg commit" and an "hg push". Consider: this is what a Subversion revision number looks like:
Of course, "Future Subversion" is not here yet. We'll see if the Subversion project can deliver within its promised timeframes. And I will also say that DVCS offers more practical advantages than just better merging. All I wanted to point out in this post, if you have not already fallen asleep reading this far, is that:
- The primary motivations driving DVCS adoption have little to do with the concept of DVCS
- If Subversion improves to the point of eliminating those advantages, would the idea of DVCS still be as compelling?