`git rat` when thrashing on new ideas

When I’m working out a new piece of code in my head, I end up using git rebase quite a bit. I often start by working through a brute force solution, slowly feeling my way from the starting point, starting and stopping and starting again down different paths.

The path from A to B is never efficient. I make cohesive commits at nonconsecutive points in the commit log. Some commits are too big, and others are too small. Some changes become superfluous due to changes in later commits. Some commits change structure but not behavior, while others do the opposite.

While it looks like I’m thrashing around, what I’m really doing is building up intuition about the problem at hand. Each step clarifies which code paths will need to change, which existing code constructs I may need to modify, and how I might do so in a sequence of atomic steps. I think of it like building up a piece of marble that I’ll slowly chisel into a piece of fine art.

While at first I do this in an unstructured way, there are some rules I like to follow along the way. I’ve adopted these rules to make the overall process more consistent and less likely to result in unusable progress. Too often I would approach a problem through this “thrashing” method and arrive at the end result without a logical trail of breadcrumbs from the starting point. That left me with two options: commit all the changes in one or two giant commits, or try to fabricate the path I took. While the latter might produce a set of atomic commits that were semantically right, they might not make logical sense to an objective observer; another developer couldn’t come in and reasonably see the iterations in thinking behind each step.

My main rules are:

  • Each commit needs to pass tests
  • Each commit needs to be small, aiming to be ~25 LoC changed or smaller. The smaller it is, the better
  • Separate class / object creation commits from usage commits, i.e. introduce the class in one commit, and then use it in another commit
  • Each commit message must be instructive and descriptive. I usually follow the pattern of ClassName for new classes, and ClassName#method_a or ClassName.method_b for new instance and static methods respectively

Subscribing to those rules makes it much easier for me to rearrange, condense, tease apart, or drop various changes once I come up for air after thrashing for a bit. Every 5 or so commits down a promising path, I see if I can straighten up my last few changes against each other and against any earlier commits I’ve made. Once I’ve got my commit log arranged nicely, I’ll then make sure each commit passes tests.

To do so, I use the very handy --exec option with git rebase. This runs a command after each commit is applied during the rebase. Setting --exec to whatever command executes the tests will step through each commit, run the tests, and, if they fail, pause the git rebase and allow the developer to fix the problem. If I later introduce a piece of code that fails tests, I then know the failure is not left over from some botched rebase earlier, and that it was actually introduced in whatever I’m currently working on. It also provides a nice guardrail to make sure that each commit follows its parent logically, and that a commit doesn’t reference some entity that is created in a later commit. If it does, that’ll become clear when it fails to compile or when tests don’t pass.
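To see the pause-on-failure behavior in isolation, here is a minimal sketch in a throwaway repository. Everything in it is made up for illustration: the Widget commit messages, the BROKEN marker file, and the stand-in ./test script that fails while that file exists. GIT_SEQUENCE_EDITOR=: is used only so the todo list is accepted without opening an editor.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Stand-in for a real test runner: it fails whenever a file named BROKEN exists.
printf '#!/bin/sh\n[ ! -e BROKEN ]\n' > test
chmod +x test
git add test
git commit -q -m "Add stand-in test script"
git tag start   # the rebase below replays everything after this point

# First commit: tests pass.
echo "class Widget" > widget.txt
git add widget.txt
git commit -q -m "Widget"

# Second commit: deliberately breaks the "tests".
touch BROKEN
git add BROKEN
git commit -q -m "Widget#render"

# Third commit: fixes them again.
git rm -q BROKEN
git commit -q -m "Widget.build"

# Run ./test after each replayed commit. ':' is a no-op sequence editor,
# so the todo list is accepted as generated.
if GIT_SEQUENCE_EDITOR=: git rebase -i --exec './test' start; then
  result="all commits passed"
else
  # The rebase pauses with HEAD at the commit whose tests failed.
  result="stopped at: $(git log -1 --format=%s)"
  git rebase --abort
fi
echo "$result"
```

The second commit is the one that adds BROKEN, so the rebase should pause there even though the final state of the branch is fine. In real use you would fix that commit and run git rebase --continue instead of aborting.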

So, for example, at each “checkpoint” during my thrashing period, after cleaning up all of my past commits, I run this test-each-commit command. I added a git alias for it, which I’ve called rat, for rebase-and-test.

My setup is the following (assuming ./test runs tests):

[alias]
  rat = rebase -i master --exec './test'

Annoyingly, the interactive screen needs to display; the --exec option can’t be used outside of git rebase in interactive mode. But maybe there’s some option I can use to automatically confirm the list of rebase steps.
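One possible way to skip the editor is Git’s GIT_SEQUENCE_EDITOR environment variable, which names the program used to edit the rebase todo list. Pointing it at the shell no-op : accepts the list unchanged. The sketch below tries that in a scratch repository; the feature branch, the commit messages, and the logging ./test script are all hypothetical, and the alias is set only in that repository’s local config. Depending on the Git version, --exec may also be accepted without -i, so it is worth checking git rebase --help for the version you have.

```shell
#!/bin/sh
set -eu

# Scratch repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical test runner: it only records that it ran, one line per run.
printf '#!/bin/sh\necho ran >> .git/runs\n' > test
chmod +x test
git add test
git commit -q -m "Add stand-in test script"
git branch -M master   # make sure the base branch is called master

# Two small commits on a hypothetical feature branch.
git checkout -q -b feature
echo one > a.txt
git add a.txt
git commit -q -m "Widget"
echo two > b.txt
git add b.txt
git commit -q -m "Widget#render"

# Repository-local variant of the rat alias. The leading '!' makes Git run it
# as a shell command, which is what allows the environment variable prefix.
git config alias.rat "!GIT_SEQUENCE_EDITOR=: git rebase -i master --exec './test'"

# No editor should open; ./test runs once per commit on top of master.
git rat

runs=$(wc -l < .git/runs | tr -d ' ')
echo "test runs: $runs"
```

Each commit on feature should trigger one run of ./test, so the log ends up with one line per commit. To use this outside the sketch, the same alias value could go under [alias] in a global config in place of the interactive version.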