I’ve been working on a small change to the open source project I mentioned, Zulip. The code required was trivial, but I spent most of my time figuring out a rational order of operations and being confused about how to accomplish them with the version control system.

I’m certainly no stranger to source code version control, but mainly with commercial systems. Git, used by many open source projects, isn’t much like the others I have experience with. The Zulip project has documentation about how they use git and github, the website that provides an online version of that system. But, of course, not everything can be anticipated in advance (and documentation can have bugs too.)

The funny thing is, I’ve used github some before. But only in a very narrow way. If you have a private repository where you directly submit changes to a main branch, there’s not a lot to it. Combining together different changes can be a nuisance, but that’s going to happen in any system like this. And if there’s only one or a few people making changes, the need for it can be nearly avoided with a tiny amount of discipline.

Where things get complicated is when you have a bunch of people working all at once. Even more when they are only loosely coordinated. Git assumes developers will work on what they want, and a handful of administrators will direct traffic with incoming requests to include new code in the main repository.

One thing that I was misunderstanding is how (little) git thinks about branches. Branches are normal things in version control, you make one to take a copy of existing code so you can safely modify it away from the central, primary, version.

In some systems, this is a resource-intensive operation where each branch is literally a copy of everything. Git doesn’t work that way. Since it functionally costs little to make a branch, branching is encouraged. You have your own copy of the code at a particular point in time. Both you and other people can make changes independently on different branches. You make some more branches. In the git universe, that’s no big deal. Time marches forward.

After you do your thing with your branch, you probably want to somehow get it back into the main repository. I’m most familiar with merging, where the system compares two parallel but not identical sets of source code and figures out if the changes are neatly separated enough for it to safely mash them together for you. Some automagical stuff happens, and the result becomes the latest version. (This latest revision is typically called “HEAD”.)

If not, you get to do it by hand. Use a merge-intensive version control system for a while, and you will absolutely find yourself dealing with a horrific mess to unravel. Merging is ugly but, if you are used to it, it’s a known ugly. That’s a certain kind of comfort. You can do that with git if you want. Many people do.

And many people don’t.

One thing about branches: many systems consider HEAD the be-all and end-all picture of reality. You might not be happy with the most recent version of your branch, you could keep a pointer to the revision you’d rather have, but it’s always the most recent version. If you don’t like it, you make a change and now you have a new HEAD. Time always moves forward. Re-writing history, to the extent that it can be done, is only for the most dire of emergencies.

Git has something called “rebase.” You can use it in a couple different ways, but it’s basically the version control equivalent of a Constitutional Convention: everything is on the table. You don’t like the commit message from three changes ago? Rebase. Want to not have those 47 typo-fixing revisions you created? Rebase. It’s also an alternative to merging, where your other branch’s changes are stuck on the end after HEAD and any changes that were made between the time you branched and now are patched in to your code that didn’t get them. (If you want a real explanation, here’s a PDF that helped me understand how rebase works.)

Coming from a merge-land where HEAD is sacred, this terrifies me. You are going into the past and messing with history, and that Just Isn’t Done. Admit that you checked in something with the commit message “shit is broke” and move on.

When branches are expensive and you don’t want to make too many of them, you have to protect the integrity of the ones you have. The idea of something like rebase is dangerous, and with great power comes great responsibility.

When branches are cheap, and you make one because you feel like watching what happens when you delete the database maintenance subsystem? Well, have fun. Clean up after yourself when you are done. It’s not exactly a different universe, but you think about some things in different ways. I’m not entirely there yet, but rewriting history is apparently one of those things.

In making my code change, I ran into a bunch of small things I didn’t understand. I was concerned that I’d do something that would make a mess, and it would be hard to clean up. I didn’t yet know the commands that would have helped. I didn’t understand the multiple purposes of others. I was entirely terrified by the idea of rebase. (I still mostly am, to be honest.)

I made a small mess attempting to merge in an environment that was expecting a rebase. And then halfway in I attempted to cancel but it was applied anyway. There were a few mysteries as things seemed to behave inconsistently. Some of it would have been easier if I had thought to create a branch to try something, against my previous conditioning.

Leave a Reply