Thoughts on Darcs and Merging

One of the harder aspects of version control is dealing with merging issues. Normal development is straightforward: all you’re essentially doing is providing an annotated “undo” feature. darcs manages that, IMO, perfectly. And to be honest, that’s probably 80% of what I want from a version control system. But dealing with merging different lines of development is important too; it’s probably 80% of the remaining 20% :)

darcs doesn’t actually do too badly there — when you’re working on separate parts of the code, darcs will do a merge for you automatically quite happily. Where it falls apart is when the changes affect the same bit of code, and can’t be resolved automatically; I find myself really disliking darcs’ behaviour there, even independent of the performance issues.

It doesn’t help that I don’t remotely understand what darcs thinks “merging” means in the event of conflicts. But at least I’m beginning to understand what I mean by merging, or rather, what I expect a version control system to do for me when I try merging disparate branches. And that’s this:

First, it should produce a tree that includes the changes from both branches — even one that doesn’t compile — so that I don’t have to worry about looking through diffs myself to make sure I’ve incorporated all the appropriate changes. darcs kind-of manages this in some circumstances. Second, it should make it easy for me to merge later changes between both branches — if two branches add two different functions by the same name, and when I merge them, I resolve that conflict by renaming one, I want future merges that only deal with the contents of either function to happen completely automatically. darcs doesn’t manage that at all.

The reason it doesn’t is that it uses an incremental approach to resolving mergers. If you’re merging from a repository whose history is “X, A, B” into a repository whose history is “X, L, A’, M”, where merging A resulted in a conflict with L that M resolved, then darcs will construct “X, A, L’, M”, then merge “X, A, B” with “X, A, L’” to get “X, A, L’, B’”, and then merge that with “X, A, L’, M”. But if A and L conflicted, and B depends on A, then B and L’ will conflict as well, and there’s simply no way for M to resolve that second conflict, since M was written to reconcile only A and L.
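
To make that concrete for myself, here is a toy model in Python. It is only a sketch and certainly not how darcs represents patches: `Replace`, `apply_all` and the one-line file are all made up for illustration, and the only darcs-like behaviour it assumes is that a conflicting patch leaves the affected line at its pre-conflict content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replace:
    """Hypothetical patch: change one line from `old` to `new`."""
    name: str
    line: int
    old: str
    new: str


def apply_all(base, patches):
    """Apply patches in order; return the final tree and the names that conflicted."""
    tree = list(base)
    conflicted = []
    for p in patches:
        if tree[p.line] == p.old:
            tree[p.line] = p.new
        else:
            # Assumption: a conflict undoes the competing change as well,
            # so the line falls back to what it was in the base tree.
            tree[p.line] = base[p.line]
            conflicted.append(p.name)
    return tree, conflicted


BASE = ["v0"]                      # the tree after X
A = Replace("A", 0, "v0", "a")     # one branch's change
L = Replace("L", 0, "v0", "l")     # the other branch's change, conflicts with A
B = Replace("B", 0, "a", "ab")     # builds on A's result, so it depends on A
M = Replace("M", 0, "v0", "a+l")   # resolution, written against the A/L conflict

# The target repository: "X, L, A', M". Only A conflicts, and M resolves it.
target_tree, target_conflicts = apply_all(BASE, [L, A, M])

# Pulling in B: it expects A's result but finds the conflicted line.
merged_tree, merged_conflicts = apply_all(BASE, [A, L, B, M])

print(target_tree, target_conflicts)
print(merged_tree, merged_conflicts)
```

In this model M still applies, but B is flagged as a fresh conflict and its change appears nowhere in the final tree. M has nothing to say about it.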

There are (at least) three potential approaches to dealing with this. You can make L’ and B not generate a conflict (by encoding some part of the resolution of the conflict between L and B in the differences between L’ and L). Alternatively, you can change the definition of merge from “find the most recent common history, then commute” to also include the possibility of applying the patch directly to the point at which the conflict was resolved, though you need some other way to adjust the patch at that point. Finally, you could find some way to have the resolution patch not only resolve the existing conflict but also resolve future conflicts.

Probably the middle solution is the only one that wouldn’t require major changes to the way darcs deals with resolving conflicts. At present the application of a conflicting patch undoes both the original change and the conflicting change, and the resolution creates a new change that’s the combination of both. It’s probably not possible to encode enough information in the conflicting patches, or in the resolution patch, to resolve future conflicts.

To implement the middle solution, you need to introduce a new “patch” type that says: you can treat me as “R, A” and apply B from “R, A, B”, provided you apply the following transformation to B. You need to find some way of commuting the new patch type around. And you need to find some way of generating the transformation given a repository that’s not necessarily remotely similar to the original “R, A” repository at all, and that may differ from it in ways that can’t be automatically determined. For example, say you’re merging one patch that changes two lines that say “foo” to both say “foobar”, and another patch that deletes one of the foos. Easy enough: you end up with a single foobar. Then you get a new patch that changes only one “foobar” to “bazquux”. Was it the one you kept, or the one you got rid of?
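
The foobar case is small enough to spell out. The following Python is a sketch under my own assumptions, not anything darcs does: `alignments` and `apply_replace` are hypothetical helpers, and the model deliberately looks only at line contents, with no record of which line the delete patch removed.

```python
from itertools import combinations


def alignments(original, merged):
    """Every way `merged` can be read as `original` with some lines deleted.

    Each alignment is a tuple of indices into `original`, one per merged line.
    """
    return [
        kept
        for kept in combinations(range(len(original)), len(merged))
        if [original[i] for i in kept] == merged
    ]


def apply_replace(original_index, new_text, merged, kept):
    """Carry a one-line replacement over to the merged tree under one alignment."""
    result = list(merged)
    if original_index in kept:
        result[kept.index(original_index)] = new_text
    # Otherwise the target line was deleted, so there is nothing to change.
    return result


after_a = ["foobar", "foobar"]   # the "R, A" tree: both foos renamed
merged = ["foobar"]              # merged tree: one foo deleted, the other renamed

# The new patch rewrites the first "foobar" of the "R, A" tree.
candidates = alignments(after_a, merged)
outcomes = [apply_replace(0, "bazquux", merged, kept) for kept in candidates]

print(candidates)
print(outcomes)
```

Both alignments fit the contents equally well, and they give different results: one ends with “bazquux”, the other leaves “foobar” alone. Something beyond the text of the two trees has to say which line survived.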

Tricky.
