Saving State or Sessions in Jupyter Notebooks

Is there a way to save state in Jupyter, similar to:

Saving workspaces in RStudio

It would be really nice to save a snapshot of a notebook session or “workspace” — that is, all variables in memory — that could later be loaded back. I tried exploring these options with no luck:

  • dill.dump_session(): it cannot save every variable, and it fails outright if the session contains something unpicklable, such as a TensorFlow object.
  • The %store magic: it also cannot handle all variable types, and it is cumbersome because you have to call it separately for each variable.

Does anyone have any solutions to this problem?

cc: @betatim

1 Like

I think this is more a Python problem than a Jupyter issue, so you should probably look for solutions at that layer.

2 Likes

To attach (large) data to a notebook, take a look at https://nteract-scrapbook.readthedocs.io/en/latest/index.html

For things that can’t be serialised, take a look at https://github.com/cloudpipe/cloudpickle — if that library can’t do it, you probably need some custom code for that object type.
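The “custom code” fallback can be sketched with nothing but the standard library: walk the namespace, pickle what pickles, and report what doesn’t. The function name `snapshot` and the skip rules are illustrative assumptions, not a complete session saver.

```python
import pickle
import types

def snapshot(namespace):
    """Pickle every picklable variable in `namespace` (e.g. globals());
    return the pickled values plus the names that could not be saved.
    Underscore names and modules are skipped. Illustrative sketch only."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        if name.startswith('_') or isinstance(value, types.ModuleType):
            continue
        try:
            saved[name] = pickle.dumps(value)
        except Exception:
            skipped.append(name)
    return saved, skipped

# An int pickles fine; a generator does not.
saved, skipped = snapshot({'x': 42, 'gen': (i for i in range(3))})
```

A real persister would then write `saved` to disk (or a database) and surface `skipped` to the user instead of failing wholesale, which is exactly where dill.dump_session falls over.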

JupyterLab has workspaces which you could use to store the layout of the UI.

1 Like

Challenges:

  • Keeping track of data across multiple threads.
  • Running multiple notebooks on the same kernel.
  • Un-dillable types: frame, generator, traceback
  • Anything TensorFlow/Keras.
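The un-dillable types above are easy to reproduce with the standard library alone; plain pickle refuses a generator once it holds live execution state (frames and tracebacks fail the same way), which is a quick sketch of why a general session saver is hard:

```python
import pickle

# A generator with live state: it has already yielded its first value.
gen = (n * n for n in range(5))
next(gen)

try:
    pickle.dumps(gen)
    picklable = True
except TypeError:
    # CPython raises TypeError: cannot pickle 'generator' object
    picklable = False
```

Any session snapshotter therefore has to either skip such objects or re-create them from source code, rather than serialising them directly.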

A general Python solution would have to satisfy much more complex requirements to suit every use case, whereas Jupyter could perhaps implement something narrower, with documented caveats?

@hamel here is how I persist unpickleable Keras models:

# save: serialise the model into an in-memory HDF5 buffer
import io
import h5py
from tensorflow.keras.models import load_model

h5_buffer = io.BytesIO()
model.save(h5_buffer, include_optimizer=True, save_format='h5')
model_bytes = h5_buffer.getvalue()

# load: rebuild the model from the raw bytes
model_bytesio = io.BytesIO(model_bytes)
h5_file = h5py.File(model_bytesio, 'r')
model = load_model(h5_file, compile=True)

The model_bytes then get fed to a SQLite BLOB field, so an actual .h5 file never touches disk; that is how I would imagine a session/variable persister centrally storing things.
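The SQLite side of that idea can be sketched with the stdlib sqlite3 module; the table name, schema, and the placeholder bytes standing in for `h5_buffer.getvalue()` are all made up for illustration:

```python
import sqlite3

# Hypothetical one-table schema: one row per persisted variable.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE session (name TEXT PRIMARY KEY, blob BLOB)')

model_bytes = b'\x89HDF\r\n...'  # stands in for h5_buffer.getvalue()
conn.execute('INSERT INTO session VALUES (?, ?)', ('model', model_bytes))

# Later, fetch the raw bytes back and hand them to io.BytesIO / h5py.
row = conn.execute(
    'SELECT blob FROM session WHERE name = ?', ('model',)
).fetchone()
restored = row[0]
```

sqlite3 returns BLOB columns as `bytes`, so the restored value can be wrapped in `io.BytesIO` and loaded exactly as in the snippet above.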

1 Like