Pre-commit hooks that autogenerate iPython notebook diffs

I use iPython notebooks a lot in my software development nowadays. They’re a nice way to debug things without having to fire up pdb; I’ll often use one when I’m trying to debug and explore a new API.

Unfortunately, notebooks are really hard to diff in Git. I use magit and git diffs pretty extensively when I change code, and I rely heavily on them to make sure I haven’t introduced typos or bugs. iPython notebooks are just JSON blobs, though, so git gives me a horrible, incoherent mess. I basically commit them blindly without checking the code at all, which isn’t ideal.

To resolve this, I generate a readable version of the notebook and check the diff for that instead. Specifically, I wrote a script that extracts only the Python code from the notebook’s JSON. Whenever I commit a change to an iPython notebook, the script:

  1. Automatically generates the Python-only version alongside the original notebook.
  2. Commits both files to the repository.
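
The extraction step can be sketched in a few lines of Python. This is a minimal illustration rather than the actual script: the function name notebook_to_python and the "# %% cell N" markers are my own assumptions, and it only relies on the standard notebook layout, where "cells" is a list and each code cell keeps its text under "source".

```python
import json


def notebook_to_python(notebook_json: str) -> str:
    """Return the source of every code cell, one marked chunk per cell.

    Hypothetical helper for illustration; outputs, metadata and
    markdown cells are dropped so the result diffs cleanly.
    """
    notebook = json.loads(notebook_json)
    chunks = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(f"# %% cell {index}\n{source.rstrip()}\n")
    return "\n".join(chunks)
```

Because cell outputs and execution counts never reach the generated file, re-running a notebook without editing its code should leave the Python-only version unchanged.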

Here’s what the diff looks like:

To make sure it runs when I need it, I created a git pre-commit hook. Git’s default pre-commit hooks are a little difficult to use, so I built a hook for the pre-commit package. If you want to try it out, you can do so by setting up pre-commit, and then including the following entry under the repos: key in your .pre-commit-config.yaml:

  - repo: https://github.com/moonglow-ai/pre-commit-hooks
    rev: v0.1.1
    hooks:
      - id: clean-notebook
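
For a sense of what a hook like this might do when pre-commit runs it, here is a rough sketch of an entry point. It is not the code from the repository above: the names code_cells and main are hypothetical, and writing a sibling .py file next to each notebook is an assumption. The only behavior it relies on from pre-commit is that the staged filenames are passed to the hook as command-line arguments and that a non-zero exit status stops the commit.

```python
import json
from pathlib import Path


def code_cells(notebook_path: Path) -> str:
    """Join the source of all code cells in a notebook (hypothetical helper)."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            sources.append("".join(source) if isinstance(source, list) else source)
    return "\n\n".join(s.rstrip() for s in sources) + "\n"


def main(argv: list[str]) -> int:
    """Regenerate the .py file for each notebook named in argv.

    Returns 1 if any file was (re)written, so the commit stops and the
    refreshed file can be staged; returns 0 when everything is current.
    """
    changed = False
    for name in argv:
        notebook_path = Path(name)
        if notebook_path.suffix != ".ipynb":
            continue  # ignore anything that is not a notebook
        target = notebook_path.with_suffix(".py")  # assumed naming scheme
        new_text = code_cells(notebook_path)
        old_text = target.read_text(encoding="utf-8") if target.exists() else None
        if new_text != old_text:
            target.write_text(new_text, encoding="utf-8")
            print(f"regenerated {target}")
            changed = True
    return 1 if changed else 0
```

A real hook would call main(sys.argv[1:]) from its console entry point. Failing the commit when a file changes is one design choice among several; it means you stage the regenerated file and commit again, so the notebook and its Python-only version always land together.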

You can find the code for the hooks on GitHub at https://github.com/moonglow-ai/pre-commit-hooks

You can also read more about it in the blog post “Diffing iPython notebook code in Git”.