Persistent computation on JH server after client disconnects

Hi! I followed the great Z2JH guide and now have a fully functional bare-metal install via MicroK8s.

My question is: if a client connects to the JupyterHub instance from another machine and starts a long-running notebook, can they close the browser window (or even shut down their machine) and come back later to find the final results? Or does the computation stop once the browser is closed?

How can one keep the JupyterLab notebook running and come back to it at a later date?

The notebook should keep running in the background, but the outputs produced while you're disconnected will be lost. There's an open issue here:

The first part of the work, JupyterLab RTC for collaborative editing, is available, but it's not yet production ready. See this topic for the current state of things:

Additional work is still needed to allow a client to reconnect and receive the outputs it missed while disconnected.

Thank you for the detailed reply.

I guess this means that if the notebook writes the results of a long computation to a file, those results can still be recovered later on?

Yes, that's a common pattern. If you're only interested in figures, you can add plt.savefig calls to your plotting cells to write the figures to disk and look at them later.
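As a minimal sketch of that pattern (the data and filename here are made up for illustration), each plotting cell just saves its figure to disk before, or instead of, displaying it:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; safe for headless/unattended runs
import matplotlib.pyplot as plt

# Stand-in for the results of a long computation
xs = range(100)
ys = [x ** 2 for x in xs]

plt.plot(xs, ys)
plt.title("Results of the long computation")
plt.savefig("results.png", dpi=150)  # persisted to disk even if the browser is gone
plt.close()
```

When you reconnect later, the PNG files are sitting in the notebook's working directory regardless of what happened to the browser session.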

If you know you're going to run unattended, you can use a tool like papermill, which executes the whole notebook non-interactively and captures the outputs in a result notebook.

I used a caching pattern in my thesis in 2012 to run overnight simulations and check on them in the morning. My pattern was to write expensive cells that looked like:

import os

# remove or rename this file to force a recompute
cache_file = "..."
if os.path.exists(cache_file):
    # this branch is taken after the first successful run
    results = load_cache_file(cache_file)
else:
    # only runs once
    results = compute()
    save_cache_file(cache_file, results)
display_something(results)

As a result, 'Restart and Run All' would complete quickly, loading and displaying all the results without recomputing anything expensive. This assumes you can actually serialize your results to files, though.

You could do this more conveniently by writing a %%cache cell magic, or, if you're lucky and most of your expensive computations are pure functions of hashable inputs, by using a modified functools.cache that caches to disk instead of memory.
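A disk-backed version of that caching decorator might look like the sketch below. This isn't a published library, just one possible implementation: it keys the cache file on a hash of the function name and pickled arguments, which only works when the arguments pickle deterministically and the results are serializable.

```python
import functools
import hashlib
import pickle
from pathlib import Path

def disk_cache(cache_dir="cache"):
    """Like functools.cache, but persists results to disk across sessions."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            Path(cache_dir).mkdir(exist_ok=True)
            # Cache key: hash of the function name plus its pickled arguments
            key = hashlib.sha256(
                pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            ).hexdigest()
            path = Path(cache_dir) / f"{key}.pkl"
            if path.exists():
                # Hit: load the previously computed result from disk
                return pickle.loads(path.read_bytes())
            # Miss: compute once, then persist for future runs
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result
        return wrapper
    return decorator

@disk_cache()
def expensive(n):
    return sum(i * i for i in range(n))
```

After the first call, 'Restart and Run All' hits the on-disk cache instead of recomputing, which is the same effect as the manual pattern above with less boilerplate per cell.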

The challenge for %%cache is that it's hard to compute the cache key in general, and hard to figure out what should be recomputed and what can safely be serialized and reloaded from disk, so explicit manual caching always worked best for me.