Allow users to run a notebook for 12+ hours with no browser activity without being culled?

Hi all,

I am trying to configure Jupyter to free up idle resources so that users don’t leave expensive machines running, but we’re having an issue where notebooks with a very long-running cell are being culled as well.

I’ve been trying to use jupyterhub-idle-culler, but it still kills kernels that have long-running cells.

After some digging, it seems that the culler uses the “idle” status along with the “last_activity” field of the kernel. When a single cell executes for a long time, the kernel appears idle, and the last_activity field doesn’t get updated until the cell produces output again.
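
To make the failure mode concrete, here is a minimal sketch of a timeout check that looks only at last_activity. It is not the culler's actual code: the function names and the kernel record are made up for illustration, and the timestamp format is an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2023-01-01T08:00:00.000000Z'."""
    # Older Python versions do not accept a trailing 'Z', so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def would_be_culled(kernel, timeout_seconds, now=None):
    """Return True if last_activity alone says the kernel exceeded the timeout."""
    now = now or datetime.now(timezone.utc)
    idle_for = now - parse_timestamp(kernel["last_activity"])
    return idle_for > timedelta(seconds=timeout_seconds)


# Hypothetical kernel record: a cell started at 08:00 and has printed nothing since.
kernel = {"id": "example-kernel", "last_activity": "2023-01-01T08:00:00.000000Z"}
now = datetime(2023, 1, 1, 10, 0, tzinfo=timezone.utc)

# Two hours of silence against a one-hour timeout.
print(would_be_culled(kernel, timeout_seconds=3600, now=now))
```

A check like this cannot tell a silent but busy cell apart from an abandoned notebook, which matches what we are seeing.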

I wrote a small script that detects the long-running notebook and tries to hit the REST API to update the last_activity field, but /interrupt seems to be the only endpoint that updates that field, and I obviously can’t use it. I also tried using the WebSocket API to execute pass, hoping that would update the last_activity field, but that request gets queued and only runs after the current cell finishes.

Our current workaround was just to disable the culler but that’s not viable long term.

I would greatly appreciate any input, or hearing from anyone in a similar situation. Thanks!


Well, Jupyter notebooks are meant for interactive computing rather than for long-running processes. For those, you might need a workload scheduler like SLURM so that the long-running cells can be submitted as batch jobs.

If that is not possible for you, and if you are comfortable with Python and the JupyterHub API, I suggest writing your own culler. You can keep a map of user servers and their PIDs (parent and children). By monitoring the CPU time of all those processes, you can decide whether a single-user server is idle or actively doing computations. If the cumulative CPU time of all its processes does not exceed a threshold over your desired idle period, you can stop the server using the JupyterHub API.
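
The decision logic could look roughly like the sketch below. Everything in it is hypothetical (the class name, min_cpu_delta, the server names), and it assumes you already have some way to sum the CPU seconds of a server's process tree, for example with psutil. Actually stopping the server through the JupyterHub API is left out.

```python
import time


class CpuIdleTracker:
    """Track cumulative CPU seconds per server and report how long each has been idle."""

    def __init__(self, min_cpu_delta=1.0):
        # CPU seconds that must accumulate since the last activity to count as active.
        self.min_cpu_delta = min_cpu_delta
        self._baseline_cpu = {}  # server -> CPU seconds when it was last seen active
        self._last_active = {}   # server -> wall-clock time when it was last seen active

    def update(self, server, cpu_seconds, now=None):
        """Record a sample and return the seconds since the server was last active."""
        now = time.time() if now is None else now
        baseline = self._baseline_cpu.get(server)
        first_sample = baseline is None
        # A drop in cumulative CPU time means the process tree was restarted.
        restarted = not first_sample and cpu_seconds < baseline
        active = not first_sample and cpu_seconds - baseline >= self.min_cpu_delta
        if first_sample or restarted or active:
            self._baseline_cpu[server] = cpu_seconds
            self._last_active[server] = now
        return now - self._last_active[server]

    def should_cull(self, server, cpu_seconds, idle_timeout, now=None):
        """Return True once the server has been idle for at least idle_timeout seconds."""
        return self.update(server, cpu_seconds, now) >= idle_timeout


tracker = CpuIdleTracker(min_cpu_delta=1.0)
tracker.update("alice", cpu_seconds=10.0, now=0)     # first sample, counts as active
tracker.update("alice", cpu_seconds=10.2, now=600)   # almost no CPU used, still idle
tracker.update("alice", cpu_seconds=50.0, now=1200)  # heavy computation, active again
print(tracker.should_cull("alice", cpu_seconds=50.1, idle_timeout=3600, now=4800))
```

Because the baseline only moves when activity is detected, slow but steady CPU use still counts as activity once it adds up to min_cpu_delta.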


Could you simply increase the timeout?

import sys

# Run the idle culler as a Hub-managed service with a 12-hour timeout.
c.JupyterHub.services = [
    {
        "name": "jupyterhub-idle-culler-service",
        "command": [
            sys.executable,
            "-m", "jupyterhub_idle_culler",
            "--timeout=43200",  # 12 hours, in seconds
        ],
        # "admin": True,
    }
]