How to prevent unnecessary Jupyter Notebook changes from showing in Git commits?

Hi everyone,

I’ve run into a bit of an issue:

I’m working with a Jupyter Notebook stored in a Git repository. The problem is that every time I commit to Git, even if I haven’t changed the actual content of the notebook, Git still shows changes. It seems this is due to some metadata in a JSON file being updated whenever I run cells. This makes my commit history look messy and confusing — as if I made actual content changes when I didn’t.

Here is how it looks when I view the changes to my .ipynb file, even though its content has not changed:

Now my questions:

  • Where is this JSON file stored, and how can I look at it? (It only appears in the Git changes of the .ipynb file.)
  • How do you usually deal with this? Is there a way to prevent these kinds of metadata changes from being tracked, or to have them ignored when committing?

Thanks in advance for your help, and have a great day!

The JSON file is the notebook. An .ipynb file is itself a JSON document, so you can open it in any text editor to look at it.
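If it helps to see this for yourself, here is a minimal sketch using only Python's standard json module. The sample notebook below is hand-written for illustration (its contents and the helper name volatile_fields are my own, not anything Jupyter provides), and it assumes the usual nbformat 4 layout. It lists the fields that typically change when you re-run cells: execution_count and outputs.

```python
import json

# A hand-written, minimal notebook used only for illustration (hypothetical content).
SAMPLE_NOTEBOOK = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"name": "stdout", "output_type": "stream", "text": ["hello\n"]}
            ],
            "source": ["print('hello')"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}


def volatile_fields(notebook):
    """Return (cell index, execution_count, number of outputs) for each code cell."""
    rows = []
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] == "code":
            rows.append(
                (index, cell.get("execution_count"), len(cell.get("outputs", [])))
            )
    return rows


# An .ipynb file is plain JSON text, so the json module can parse it directly.
text = json.dumps(SAMPLE_NOTEBOOK, indent=1)
notebook = json.loads(text)

print(sorted(notebook))           # top-level keys of the notebook document
print(volatile_fields(notebook))  # fields that change when cells are re-run
```

For a real notebook you would read the file instead of the sample, for example with json.load on a file opened from a path of your own such as "analysis.ipynb" (a made-up name).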

In our group, we have the custom of always doing “clear all output” on a notebook before committing, throughout the development phase. Only once a notebook is “done,” from our point of view, and changes will be infrequent, do we commit with output. Thereafter, we are careful to ignore changes on the file if only playing with it and re-executing cells. (Particularly if it contains figures!)
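To make it concrete what "clear all output" changes in the file, here is a rough sketch, not Jupyter's actual implementation. It assumes the nbformat 4 layout, and both the function name clear_outputs and the example notebook are hypothetical. It empties each code cell's outputs and resets its execution_count, leaving the source untouched.

```python
import copy
import json


def clear_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters removed."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook unmodified
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical in-memory notebook standing in for a real .ipynb file.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [
                {"name": "stdout", "output_type": "stream", "text": ["42\n"]}
            ],
            "source": ["print(6 * 7)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = clear_outputs(example)
print(json.dumps(cleaned["cells"][0], indent=1, sort_keys=True))
```

Because only outputs and execution_count are reset, re-running cells and clearing again gives the same file content, so Git has nothing to report unless the source itself changed.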


I typically use the package nbstripout and run nbstripout --install on each clone of a repo, so Git diffs and commits do not include notebook output.