JupyterHub doesn't kill processes and threads when notebooks are closed or users log out

Hello,

on our new JupyterHub instance we noticed that user processes and threads (sessions) don’t stop when notebooks are closed or when the user logs out. Each time a notebook is opened (an R notebook, for example), a new process and several threads are spawned. Is it intended behaviour that JupyterHub doesn’t clean up user processes?

We expect many active users on our system and would like to keep the process space from filling up with thousands of processes and threads, whether or not they consume resources.

Is there a safe way to clean up these unused processes?

Cheers
frank

Hi Frank - welcome to the Jupyter community discourse forum!

Because computations can take a long time (hours, sometimes days), the underlying kernel (and its associated ports) remains allocated even when the notebook is closed - this is by design. However, because of that requirement, it is difficult to distinguish between a user intentionally leaving a kernel process running and one who simply forgot to shut the kernel down. To address this, culling can be configured based on inactivity (or idleness). In JupyterHub deployments, culling can be configured at two levels.

Level 1. The Notebook server can cull Notebook kernels after some period of inactivity. There are also options for whether a kernel should be culled while a client is still connected to it, or even while it is busy (i.e., a cell is executing). The interval at which the server checks for idle kernels is also configurable. See the configuration options of the MappingKernelManager class.
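As a rough sketch of Level 1, the fragment below sets the four MappingKernelManager culling traits in a Notebook server config file. The file name and all values are assumptions chosen for illustration (the 12-hour timeout in particular is not a recommendation), so adapt them to your deployment.

```python
# jupyter_notebook_config.py (assumed location: the config dir of each user's Notebook server)
c = get_config()  # noqa: F821 - injected by Jupyter when it loads this file

# Cull a kernel after 12 hours without activity; 0 (the default) disables culling.
c.MappingKernelManager.cull_idle_timeout = 12 * 60 * 60

# Check for idle kernels every 5 minutes.
c.MappingKernelManager.cull_interval = 300

# Also cull kernels that still have a client (e.g. an open browser tab) connected.
c.MappingKernelManager.cull_connected = True

# Never cull a kernel while a cell is executing.
c.MappingKernelManager.cull_busy = False
```

With `cull_busy` left at `False`, a long-running cell should not be interrupted; the idle timeout then effectively counts from the moment the kernel goes quiet.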

Level 2. JupyterHub provides the ability to cull Notebook servers. This is accomplished via an external cull_idle_service. I don’t think this service takes into account whether a given kernel of the Notebook server being checked is busy or not, so a cell executing for 10 hours may still appear as a Notebook server that’s been idle for 10 hours.
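A possible shape for Level 2 is sketched below as a JupyterHub managed service. It assumes the separately installed `jupyterhub-idle-culler` package, which as far as I know is the packaged form of the cull-idle example script the service above refers to. The service name and the timeout value are illustrative, and `admin: True` follows the older style; newer JupyterHub releases document a narrower, role-based permission setup instead.

```python
# jupyterhub_config.py
import sys

c = get_config()  # noqa: F821 - injected by JupyterHub when it loads this file

c.JupyterHub.services = [
    {
        # Illustrative service name, not mandated by JupyterHub.
        "name": "idle-culler",
        # Older-style permission grant; check your version's docs for roles/scopes.
        "admin": True,
        # Stop single-user servers that have been idle for 12 hours (in seconds).
        "command": [
            sys.executable,
            "-m",
            "jupyterhub_idle_culler",
            "--timeout=43200",
        ],
    }
]
```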

In either case, you should consider the longest time you, as an administrator, are willing to allow a single cell to run (plus some period for possible analysis of the results) and be sure to set the inactivity setting(s) to a value greater than that total.
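To make that calculation concrete, here is a minimal sketch that turns the two estimates into a timeout in seconds, the unit the culling settings expect. The function name and the default safety margin are hypothetical, not part of Jupyter.

```python
def cull_timeout_seconds(longest_run_hours, analysis_hours, margin_minutes=30):
    """Return an idle timeout (seconds) that exceeds run time plus analysis time."""
    base = (longest_run_hours + analysis_hours) * 3600
    # The margin keeps the timeout strictly above the estimate.
    return base + margin_minutes * 60


# Example: cells may run up to 10 hours, with 2 hours allowed for analysis.
print(cull_timeout_seconds(10, 2))  # 45000
```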

If you want to unconditionally shut down inactive servers (regardless of whether their kernels are busy) after some period, then you’d only need to configure the cull_idle_service, since stopping a Notebook server also shuts down its active kernels.

Hello Kevin,

thank you for the welcome and for the detailed, informative and most helpful answer. We will definitely go with option one and are still discussing whether option two could be an option, too. (pun not intended…:slight_smile: ).

Cheers, frank
