Bitbucket is going to drop support for the Mercurial version control system, due to most people using Git instead. I use both Mercurial (hg) and Git, and while Git has become the de facto standard today, Mercurial was my first love in the world of version control systems, and I’m sad to see it go.
Of course, Mercurial is not dead, but being removed from Bitbucket is a big deal (at least to me). But this news also gave me an occasion to reconsider Git vs. Mercurial, and writing this post has actually given me a newfound, if not appreciation, then at least understanding of Git.
Init
When I started university in 2006, SVN had recently replaced CVS as the cool thing. So for group work we used SVN and it worked fine. Well, except for all the file conflicts we got. And the awful tree conflicts. Come to think of it, it didn’t work that great. Luckily distributed version control systems had been maturing in the meantime.
In 2009 we started using Mercurial. It was like SVN, except it was actually good for collaboration! Now you could store a safe snapshot of your own work before trying to merge other people’s changes in. Also merging generally worked well. Also it was fast.
All was good, but there was also this Git thing people on the internet were talking about. Coming from SVN, Mercurial made sense and Git didn’t. (Linus Torvalds, being Linus Torvalds, made Git purposely different from the CVS/SVN family.) Git did seem to be more flexible at the expense of being less elegant. As was written back then: Git is MacGyver, Mercurial is James Bond.
My group mates and I kept using Mercurial, but as I started my PhD, collaboration with new people, not least my supervisor René, made me spend more time with Git. It got better when I began to understand that the differences between Mercurial and Git start at fundamental concepts such as branches and merges.
Different Concepts
“Branch” doesn’t have the same interpretation in Mercurial and Git. In Mercurial, a branch is a sequence of commits all sharing a (permanent) name. In Git, a branch is temporary and is represented by a reference to the current commit at the tip of the branch. Once gone, you can’t tell which branch(es) a commit used to be part of.
In Mercurial, branches are also shared between clones of the same repository. Everybody works on the same branch, and you can hg pull to get the newest version of the branch, and then hg update your working copy. In Git nobody works on the same branch; there are just people working on different branches that might or might not have the same name. You can’t “update” to the newest version because you already have the newest version of your own branch by definition. But you can merge another branch, possibly with the same name, into your own.
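The Git half of this comparison can be poked at directly. The following is a minimal sketch, assuming a throwaway repository; the branch names main and feature and the example identity are placeholders made up for illustration. It shows that a branch name is only a movable reference to a commit, and that deleting the name leaves the commits untouched.

```shell
# Hypothetical identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Scratch repository; "main" is an arbitrary branch name.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "first"
tip_before=$(git rev-parse main)
git commit -q --allow-empty -m "second"
tip_after=$(git rev-parse main)      # the name "main" now resolves to the new commit

git branch feature                   # a second name for the same commit
feature_tip=$(git rev-parse feature)
git branch -q -d feature             # removes the name only
head_after_delete=$(git rev-parse HEAD)

echo "main moved from $tip_before to $tip_after"
```

Nothing stored in the commits themselves says they were ever on feature, which is why Git cannot answer that question once the name is gone.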
In Mercurial, merges are used for merging divergent branches or heads. In Git, everything is a merge, even if only one branch has changed. But at the same time, a Git merge might or might not create a merge commit. By default, a new commit will only be created if the two branches had diverged, i.e., work had been done on both. Otherwise, the current branch will just be updated to point to the same commit as the branch being merged in. It makes a lot of sense once you get used to Git, but before that it’s just confusing.
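The two outcomes of a Git merge can be reproduced in a scratch repository. This is a sketch under assumptions: the branch names main and topic, the commit messages, and the identity are invented for the example, and it relies on Git's default merge settings (fast-forward allowed).

```shell
# Hypothetical identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "base"

# Case 1: only topic has new work, so main is just moved forward.
git branch topic
git checkout -q topic
git commit -q --allow-empty -m "work on topic"
git checkout -q main
git merge -q topic
ff_main=$(git rev-parse main)
ff_topic=$(git rev-parse topic)      # same commit as main: no merge commit was made

# Case 2: both branches have new work, so a merge commit is created.
git commit -q --allow-empty -m "work on main"
git checkout -q topic
git commit -q --allow-empty -m "more work on topic"
git checkout -q main
git merge -q --no-edit topic
merged_main=$(git rev-parse main)
second_parent=$(git rev-parse 'HEAD^2')   # a second parent exists only on a merge commit
topic_tip=$(git rev-parse topic)
```

In the first case the merge is what Mercurial users would think of as a plain update; only the second case corresponds to a Mercurial merge of two heads.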
Different Commands
Learning Git from an SVN/Mercurial background meant relearning a few words. “Revert” means something new in Git and “shelve” is called “stash”, but that’s not bad. Instead, what is annoying is that many simple Mercurial commands don’t have simple, memorable equivalents in Git:
Show the hash of the current commit:
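One possible pairing, not necessarily the one originally listed here: in Mercurial, hg id -i prints the abbreviated hash of the working copy's parent, while in Git the usual answer is git rev-parse. The scratch repository and identity below are assumptions made so the sketch can run anywhere.

```shell
# Hypothetical identity so the commit works on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "first"

# Mercurial counterpart: hg id -i
current_hash=$(git rev-parse HEAD)         # full hash
short_hash=$(git rev-parse --short HEAD)   # abbreviated hash
echo "$current_hash"
echo "$short_hash"
```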
Show which commits would be pulled or pushed:
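As a sketch of what this looks like in practice: Mercurial has hg incoming and hg outgoing, while one common Git spelling is a fetch followed by git log with a range against the upstream branch. Everything below (the bare origin repository, the alice and bob clones, the commit messages, the identity) is made up for the example.

```shell
# Hypothetical identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

base=$(mktemp -d)
git init -q --bare "$base/origin.git"
git --git-dir="$base/origin.git" symbolic-ref HEAD refs/heads/main

# alice publishes a first commit and sets origin/main as her upstream.
git init -q "$base/alice"
cd "$base/alice"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$base/origin.git"
git commit -q --allow-empty -m "shared base"
git push -q -u origin main

# bob clones, commits, and pushes.
git clone -q "$base/origin.git" "$base/bob"
cd "$base/bob"
git commit -q --allow-empty -m "bob's change"
git push -q origin main

# alice commits locally, then looks in both directions.
cd "$base/alice"
git commit -q --allow-empty -m "alice's local change"
git fetch -q                                  # updates origin/main, leaves main alone
incoming=$(git log --format=%s '..@{u}')      # roughly hg incoming
outgoing=$(git log --format=%s '@{u}..')      # roughly hg outgoing
echo "incoming: $incoming"
echo "outgoing: $outgoing"
```

Note that the Git version needs the explicit fetch first; without it the range is computed against a stale origin/main.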
Mark a file to stop tracking, but don’t delete the file. This one is especially weird in Git because you don’t remove something --cached, you cache (stage) the removal of something:
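A minimal sketch of the Git side, assuming a scratch repository and a made-up file called notes.txt; the Mercurial counterpart would be hg forget notes.txt.

```shell
# Hypothetical identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "private notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Stage the removal of notes.txt without touching the file on disk.
git rm -q --cached notes.txt
git commit -q -m "stop tracking notes"

tracked=$(git ls-files)      # empty: nothing is tracked any more
ls notes.txt                 # the file itself is still there
```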
Print the root directory of the repository and the URL of the remote repository:
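For illustration, Mercurial offers hg root and hg paths default, and a plausible Git pairing is git rev-parse --show-toplevel with git remote get-url. The remote URL and directory names below are placeholders, not real locations.

```shell
# Scratch repository; pwd -P resolves symlinks so the path matches Git's answer.
repo=$(cd "$(mktemp -d)" && pwd -P)
cd "$repo"
git init -q
git remote add origin https://example.com/demo.git   # hypothetical URL

# Ask from a nested directory to show that the root is found from anywhere.
mkdir -p src/deep
cd src/deep
root=$(git rev-parse --show-toplevel)   # roughly hg root
url=$(git remote get-url origin)        # roughly hg paths default
echo "$root"
echo "$url"
```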
List the version controlled files:
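A small sketch, assuming a scratch repository with invented file names: in Mercurial this is hg files (or hg manifest on older versions), and in Git it is git ls-files.

```shell
# Hypothetical identity so the commit works on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir src
echo "readme" > README
echo "code" > src/main.c
echo "junk" > scratch.tmp        # deliberately left untracked
git add README src/main.c
git commit -q -m "initial"

tracked=$(git ls-files)          # roughly hg files
echo "$tracked"
```

The untracked scratch.tmp does not show up, since git ls-files lists what is in the index by default.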
Git is Git
I have been using the above commands as aliases in Git, but I have wondered if doing that somehow prevented me from learning the “true spirit” of Git. I have now come to the conclusion that the essence of Git is its architecture and data model, and not its incoherent command line interface. Learning the user interface is just a chore, and using aliases to smooth it out is fine.
I’m not bashing Git because it’s different from Mercurial, but because it is internally inconsistent and, as another blogger puts it: Git doesn’t so much have a leaky abstraction as no abstraction. Here are some of the inconsistencies and lacking abstractions:
- Concepts can have multiple names, like the index a.k.a. the cache a.k.a. the staging area (which would be okay if Git didn’t already have so many concepts)
- The branch origin/master is your (local) remote tracking branch, but master alone can refer to your local master branch or to the remote’s master branch depending on context
- The git checkout command famously does too much: It can check out a branch (merging in uncommitted changes in the process if you want, or creating a new branch manually or automatically if you want), and it can restore or overwrite individual files in the working copy
  - However, the new Git 2.23 released this month has started experimenting with the new, more specific, and subject to change git switch and git restore commands (similar to hg update and hg revert!)
- The notation A:B sometimes means <src>:<dst> (a refspec) and sometimes <rev>:<path> (a path in a revision, to be distinguished from a pathspec, which is a third concept that can use the colon in a third way)
- The notations A B and A..B mean the same thing to git diff but not to git log
- The notations A..B and A...B mean somewhat opposite things to git diff and git log
  - To specify branches A and B symmetrically you need git diff A..B but git log A...B
  - git diff A...B and git log A..B mean “from the common ancestor of A and B, to B”
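The dot notations are easier to believe once seen side by side. Here is a sketch under assumptions: a scratch repository with two branches literally named A and B that each add one file after a common base commit; all file names, messages, and the identity are invented for the example.

```shell
# Hypothetical identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch A
git branch B

git checkout -q A
echo a > a.txt
git add a.txt
git commit -q -m "on A"

git checkout -q B
echo b > b.txt
git add b.txt
git commit -q -m "on B"

# diff: two dots compare the two tips, three dots compare the common ancestor with B.
diff_two=$(git diff --name-only A..B)
diff_three=$(git diff --name-only A...B)

# log: two dots list what B has beyond A, three dots list what either side has alone.
log_two=$(git log --format=%s A..B)
log_three=$(git log --format=%s A...B | sort)

printf 'diff A..B:\n%s\ndiff A...B:\n%s\n' "$diff_two" "$diff_three"
printf 'log A..B:\n%s\nlog A...B:\n%s\n' "$log_two" "$log_three"
```

So the symmetric view is two dots for git diff but three dots for git log, and the "since the common ancestor" view is the other way around.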
In the end, I think Git is a good tool despite its command line interface. But if you’re the type of person who needs to understand something in depth to feel comfortable with it, you have a lot of reading to do. If you already know the basics, git help gitglossary will get you started on 80+ underlying concepts that will help you on your way to master the world’s favored VCS!