Best practices for JupyterHub

We are deploying JupyterHub on a virtual machine for 50 users. The main problem is that users do not close their sessions, and each user has multiple open notebooks, which together consume all the available memory. Frequent node reboots are frustrating for both the admin team and end users.

  1. Is there a page outlining best practices for users, or a user education page?
  2. I tried the idle culler with Python 3.7, but it doesn’t start and throws weird errors. Is there an alternative way to kill/terminate idle sessions after 8 hours?

Any help is appreciated!

There are several options for restricting users, for example by imposing CPU and memory resource limits per user. These depend on your chosen spawner, e.g. all container based spawners should support limits, others may not.
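As a rough sketch of what such limits look like in jupyterhub_config.py: mem_limit and cpu_limit are standard Spawner options, but the values below are illustrative assumptions, and they only take effect if the spawner enforces them. The default LocalProcessSpawner does not enforce them. On a single VM, a spawner such as SystemdSpawner may be an option.

```python
# jupyterhub_config.py (fragment). JupyterHub supplies get_config() when it loads this file.
c = get_config()  # noqa: F821

# Illustrative values only (assumptions); size them for your VM and user count.
c.Spawner.mem_limit = "2G"  # maximum memory per single-user server
c.Spawner.cpu_limit = 1.0   # maximum CPU cores per single-user server
```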

The idle culler should work, however you’re using a version of Python that became end-of-life over a year ago.

Can you try upgrading to a supported version of Python, and to the latest JupyterHub components? If you still have problems with the idle culler please can you tell us how you’ve deployed JupyterHub, show us your configuration, and your logs with debugging turned on?
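For reference, a minimal sketch of turning on debug logging in jupyterhub_config.py. Both options are standard JupyterHub settings. Starting the Hub with `jupyterhub --debug` should have the same effect as the first line.

```python
# jupyterhub_config.py (fragment). JupyterHub supplies get_config() when it loads this file.
c = get_config()  # noqa: F821

c.JupyterHub.log_level = "DEBUG"  # verbose logs from the Hub itself
c.Spawner.debug = True            # enable debug logging in the single-user servers
```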


As @manics mentioned, try upgrading Python. The idle culler works fine for us.

Another measure we use with JupyterLab in JupyterHub is to force users to spawn into a fresh session, without reopening any of the notebooks they had open in the previous session. This can help reduce the load a bit.

c.Spawner.default_url = '/lab?reset'

Python is updated to 3.10
jupyterhub-idle-culler-1.4.0

~]# jupyterhub &
~]# [I 2024-11-26 10:52:51.575 JupyterHub app:2859] Running JupyterHub version 4.0.2
[I 2024-11-26 10:52:51.575 JupyterHub app:2889] Using Authenticator: jupyterhub.auth.PAMAuthenticator-4.0.2
[I 2024-11-26 10:52:51.575 JupyterHub app:2889] Using Spawner: jupyterhub.spawner.LocalProcessSpawner-4.0.2
[I 2024-11-26 10:52:51.575 JupyterHub app:2889] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-4.0.2

[I 2024-11-26 10:52:51.650 JupyterHub proxy:556] Generating new CONFIGPROXY_AUTH_TOKEN
[I 2024-11-26 10:52:51.659 JupyterHub app:1984] Not using allowed_users. Any authenticated user will be allowed.
[I 2024-11-26 10:52:51.780 JupyterHub app:2928] Initialized 0 spawners in 0.001 seconds
[I 2024-11-26 10:52:51.784 JupyterHub metrics:278] Found 0 active users in the last ActiveUserPeriods.twenty_four_hours
[I 2024-11-26 10:52:51.785 JupyterHub metrics:278] Found 2 active users in the last ActiveUserPeriods.seven_days
[I 2024-11-26 10:52:51.785 JupyterHub metrics:278] Found 2 active users in the last ActiveUserPeriods.thirty_days
[W 2024-11-26 10:52:51.785 JupyterHub proxy:625] Found proxy pid file: /root/jupyterhub-proxy.pid
[W 2024-11-26 10:52:51.785 JupyterHub proxy:642] Proxy still running at pid=457731
[W 2024-11-26 10:52:52.787 JupyterHub proxy:662] Stopped proxy at pid=457731
[W 2024-11-26 10:52:52.787 JupyterHub proxy:746] Running JupyterHub without SSL. I hope there is SSL termination happening somewhere else…
[I 2024-11-26 10:52:52.787 JupyterHub proxy:750] Starting proxy @ http://:9443/
10:52:52.952 [ConfigProxy] info: Proxying http://*:9443 to (no default)
10:52:52.953 [ConfigProxy] info: Proxy API at http://127.0.0.1:8001/api/routes
[I 2024-11-26 10:52:52.979 JupyterHub app:3178] Hub API listening on http://127.0.0.1:8081/hub/
[I 2024-11-26 10:52:52.979 JupyterHub app:3189] Starting managed service jupyterhub-idle-culler-service
[I 2024-11-26 10:52:52.979 JupyterHub service:385] Starting service 'jupyterhub-idle-culler-service': ['-m', 'jupyterhub_idle_culler', '--timeout=3600']
10:52:52.979 [ConfigProxy] info: 200 GET /api/routes
[I 2024-11-26 10:52:52.980 JupyterHub service:133] Spawning -m jupyterhub_idle_culler --timeout=3600
[C 2024-11-26 10:52:52.981 JupyterHub app:3193] Failed to start service jupyterhub-idle-culler-service
Traceback (most recent call last):
  File "/opt/anaconda310/lib/python3.10/site-packages/jupyterhub/app.py", line 3191, in start
    await service.start()
  File "/opt/anaconda310/lib/python3.10/site-packages/jupyterhub/services/service.py", line 421, in start
    self.spawner.start()
  File "/opt/anaconda310/lib/python3.10/site-packages/jupyterhub/services/service.py", line 135, in start
    self.proc = Popen(
  File "/opt/anaconda310/lib/python3.10/subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/anaconda310/lib/python3.10/subprocess.py", line 1847, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '-m'

jupyterhub_config.py

import sys

c.JupyterHub.load_roles = [
    {
        "name": "jupyterhub-idle-culler-role",
        "scopes": [
            "list:users",
            "read:users:activity",
            "read:servers",
            "delete:servers",
        ],
        "services": ["jupyterhub-idle-culler-service"],
    }
]

c.JupyterHub.admin_access = True
c.JupyterHub.cookie_max_age_days = 1
c.JupyterHub.port = 9443
c.JupyterHub.services = [
    {
        "name": "jupyterhub-idle-culler-service",
        "command": [
            "-m", "jupyterhub_idle_culler",
            "--timeout=3600",
        ],
    }
]
c.Spawner.default_url = '/lab'

Let me know if you need additional details.

Thank you, I will update the config. As of now, neither JupyterHub nor the idle culler is running (if the culler config is included in jupyterhub_config).

Maybe the command is missing sys.executable as its first element:

        "command": [
            sys.executable,
            "-m", "jupyterhub_idle_culler",
            "--timeout=3600",
        ],
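To illustrate why the traceback ends in FileNotFoundError: '-m', here is a small standalone sketch (not JupyterHub code). subprocess.Popen treats the first list element as the program to run, so a command that starts with "-m" fails before Python is ever involved. The `run` helper and the use of the stdlib `platform` module are only stand-ins for the culler command.

```python
import subprocess
import sys


def run(command):
    """Start a command list with Popen and return its stripped stdout."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate()
    return out.strip()


# The first element is taken as the executable, so "-m" is looked up as a program.
try:
    run(["-m", "platform"])
    error = None
except FileNotFoundError as exc:
    error = exc

# Prefixing the interpreter path makes the same arguments valid.
output = run([sys.executable, "-m", "platform"])

print(type(error).__name__)  # the failure seen in the Hub log
print(output)                # platform string printed by `python -m platform`
```

Using sys.executable also means the service runs under the same interpreter (and environment) as JupyterHub itself.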

It worked once sys.executable was included in the command. Now both JupyterHub and the idle culler are running. How do I check whether it is culling the idle sessions? I started the service around 2 PM, logged in to JupyterHub from a browser at 6 PM, and opened 4 notebooks. The timeout is 3600 seconds (60 minutes). After 12 hours I still see one JupyterHub session plus 4 notebook processes from the Linux command line. Am I missing something? Do I need to set any additional properties?

tharun 466050 464603 0 18:31 ? /opt/anaconda310/bin/python /opt/anaconda310/bin/jupyterhub-singleuser (jupyterhub session)

tharun 466143 466050 0 18:32 ? /opt/anaconda310/bin/python -m ipykernel_launcher -f /home/tharun/.local/share/jupyter/runtime/kernel-xxx.json (notebook 1)

tharun 466146 466050 0 18:32 ? /opt/anaconda310/bin/python -m ipykernel_launcher -f /home/tharun/.local/share/jupyter/runtime/kernel-xxx.json (notebook 2)

tharun 466147 466050 0 18:32 ? /opt/anaconda310/bin/python -m ipykernel_launcher -f /home/tharun/.local/share/jupyter/runtime/kernel-xxx.json (notebook 3)

tharun 466148 466050 0 18:32 ? /opt/anaconda310/bin/python -m ipykernel_launcher -f /home/tharun/.local/share/jupyter/runtime/kernel-xxx.json (notebook 4)
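One way to see what the culler sees is to read each server's last_activity from the Hub REST API (GET /users) and compare it with the timeout. The sketch below is only an illustration: `idle_report`, `fetch_users`, the sample user "alice", and the URL and token passed to `fetch_users` are assumptions, and the token needs permission to read users and servers.

```python
import json
import urllib.request
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a Hub timestamp such as '2024-11-26T18:32:00.000000Z' as UTC."""
    ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def idle_report(users, now, timeout=3600):
    """Return (user, server name, idle seconds, over timeout?) per running server."""
    rows = []
    for user in users:
        for name, server in (user.get("servers") or {}).items():
            last = server.get("last_activity")
            if last is None:
                continue
            idle = (now - parse_ts(last)).total_seconds()
            rows.append((user["name"], name, int(idle), idle > timeout))
    return rows


def fetch_users(hub_api_url, token):
    """Fetch user models from the Hub REST API (URL and token are assumptions)."""
    req = urllib.request.Request(
        hub_api_url.rstrip("/") + "/users",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Offline example with a hand-written user model (hypothetical data).
sample = [{"name": "alice", "servers": {"": {"last_activity": "2024-11-26T18:32:00.000000Z"}}}]
now = datetime(2024, 11, 26, 20, 32, tzinfo=timezone.utc)
report = idle_report(sample, now)
print(report)
```

Against a live Hub this could be called as `idle_report(fetch_users("http://127.0.0.1:8081/hub/api", token), datetime.now(timezone.utc))`. If last_activity keeps advancing while nobody is working, something is still reporting the server as active.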

@manics @deaftone, can you please provide any input? The idle culler is running but is not terminating sessions after the timeout (3600 seconds, i.e. 1 hour).

This probably means your singleuser server is sending activity notifications to JupyterHub. Is JupyterLab open in a browser somewhere? Do you have any extensions that might be running in the background and reporting the server as active?

Can you turn on debug logging, and show us your logs for JupyterHub and any running singleuser servers?
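If an open browser tab turns out to be what keeps a server looking active, one commonly used complement to the Hub-level culler is kernel culling inside each single-user server. The fragment below is a sketch for jupyter_server_config.py, assuming the single-user servers run Jupyter Server. The timeout values are illustrative assumptions.

```python
# jupyter_server_config.py (fragment). Jupyter Server supplies get_config() when loading it.
c = get_config()  # noqa: F821

# Shut down kernels idle for more than an hour, checking every 5 minutes.
c.MappingKernelManager.cull_idle_timeout = 3600
c.MappingKernelManager.cull_interval = 300
# Also cull idle kernels that still have a browser tab connected.
c.MappingKernelManager.cull_connected = True
# Stop the server itself after an hour with no kernels or terminals and no activity.
c.ServerApp.shutdown_no_activity_timeout = 3600
```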

Users don’t log out of the browser and leave the notebooks open. Is that not considered an idle session?

What is considered an idle session if users don’t log out of the browser?

I’ll enable debug logging and share the logs soon…