Problem with batchspawner (Slurm) and systemd service for JupyterHub

Hi all,
I am running JupyterHub + JupyterLab with batchspawner on our small local HPC cluster. Everything ran normally during testing, but after I set it up as a systemd service to run in the background, I started seeing strange behavior. I can log in and start a single-user server on the HPC cluster, and that all works normally, but when I log out and log back in, the JupyterHub systemd service deactivates on its own. I’m trying to figure out why this happens under systemd when it does not appear to happen when I start the hub manually from a shell.

Keep in mind that all of the following configs work perfectly when I start the server from an interactive session, or even under screen or nohup. The problem only occurs when running it as a systemd service.

My Jupyterhub config:

# Admin Access
c.JupyterHub.admin_access = True
c.JupyterHub.admin_users = {'root'}  # set("root") would yield {'r', 'o', 't'}
c.Authenticator.admin_users = {'root'}

# Path Configuration
c.JupyterHub.config_file = '/mounts/faraday/jupyterhub/jupyterhub_config.py'
c.JupyterHub.cookie_secret_file = '/mounts/faraday/jupyterhub/jupyterhub_cookie_secret'
c.JupyterHub.db_url = 'sqlite:////mounts/faraday/jupyterhub/jupyterhub.sqlite'
c.ConfigurableHTTPProxy.auth_token = '036536a1e95a4d4d83907648238eaa8e'
c.Application.log_file = '/var/log/jupyterhub.log'

# Proxy Configuration
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/jupyter.cluster.earlham.edu/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/jupyter.cluster.earlham.edu/fullchain.pem'
c.ConfigurableHTTPProxy.command = ['configurable-http-proxy', '--redirect-port', '80']
c.JupyterHub.port = 443
c.PAMAuthenticator.open_sessions = False

# Batch Spawner
import batchspawner
c.SlurmSpawner.hub_connect_url = 'https://jupyter.cluster.earlham.edu'
c.SlurmSpawner.req_prologue='export PATH=$PATH:/mounts/faraday/software/anaconda/envs/jupyter-test/bin/'

# Idle-Culler (Stops idle servers after timeout)
import sys
c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'admin': True,
        'command': [
            sys.executable,
            '-m', 'jupyterhub_idle_culler',
            '--timeout=1200',
        ],
    }
]

# GENERAL
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 120
c.SystemdSpawner.default_shell = '/bin/bash'
c.MultiKernelManager.default_kernel_name = 'py312'

# PROFILES CONFIG
c.ProfilesSpawner.profiles = [
    (
        "Faraday scheduler (8 core / 48GB RAM)",
        'slurmsession1',
        'batchspawner.SlurmSpawner',
        {
            'hub_connect_url': 'https://jupyter.cluster.earlham.edu',
            'req_prologue': 'export PATH=$PATH:/mounts/faraday/software/anaconda/envs/jupyter-test/bin/',
            'req_memory': '48GB',
            'req_nprocs': '8',
            'environment': {'TF_CPP_MIN_LOG_LEVEL': '3'},
        },
    ),
    (
        "Faraday scheduler (1 core / 4GB RAM)",
        'slurmsession2',
        'batchspawner.SlurmSpawner',
        {
            'hub_connect_url': 'https://jupyter.cluster.earlham.edu',
            'req_prologue': 'export PATH=$PATH:/mounts/faraday/software/anaconda/envs/jupyter-test/bin/',
            'req_memory': '4GB',
            'req_nprocs': '1',
            'environment': {'TF_CPP_MIN_LOG_LEVEL': '3'},
        },
    ),
    (
        "F0 - Local Session (1 core / 4GB RAM)",
        'localsession',
        'systemdspawner.SystemdSpawner',
        {'cpu_limit': 1.0, 'mem_limit': '4G'},
    ),
]

Systemd service file:

[Unit]
Description=Jupyterhub
After=syslog.target network.target
Type=forking

[Service]
User=root
Environment="PATH=/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/mounts/faraday/software/anaconda/envs/jupyter-test/bin"
ExecStart=/mounts/faraday/software/anaconda/envs/jupyter-test/bin/jupyterhub -f /mounts/faraday/jupyterhub/jupyterhub_config.py
autostart=true
autorestart=true
startretries=1
exitcodes=0,2
stopsignal=TERM
redirect_stderr=true
stdout_logfile=/var/log/jupyterhub.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
stdout_capture_maxbytes=1MB

[Install]
WantedBy=multi-user.target
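
A note on the unit file above, which may or may not be related to the deactivation: systemd directive names are case-sensitive and start with a capital letter (for example `Restart=` or `KillSignal=`). The lowercase keys (`autostart`, `autorestart`, `stdout_logfile`, and so on) look like supervisord program options, which systemd ignores with an "Unknown key name" warning, and `Type=` is a `[Service]` setting rather than a `[Unit]` one. Below is a minimal Python sketch of a sanity check for this. The function name `find_suspect_keys` and the lowercase-first-letter heuristic are illustrative assumptions, not part of systemd or JupyterHub.

```python
import configparser

# Shortened excerpt of the unit file above, used only as sample input.
SAMPLE_UNIT = """\
[Unit]
Description=Jupyterhub
Type=forking

[Service]
User=root
autostart=true
stdout_logfile=/var/log/jupyterhub.log
"""


def find_suspect_keys(unit_text):
    """Return (section, key) pairs that systemd would probably ignore."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep the original key case
    parser.read_string(unit_text)

    suspects = []
    for section in parser.sections():
        for key in parser[section]:
            # Heuristic (assumption): systemd directives start with a capital letter.
            if key[0].islower():
                suspects.append((section, key))
            # Type= only has meaning inside [Service].
            elif key == "Type" and section != "Service":
                suspects.append((section, key))
    return suspects


for section, key in find_suspect_keys(SAMPLE_UNIT):
    print(f"[{section}] {key}: probably ignored by systemd")
```

On a real system, `systemd-analyze verify jupyterhub.service` reports this kind of problem directly, so the sketch is only meant to make the rule visible.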

Log output (from systemd error):

Sep 19 13:47:18 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:18.803 JupyterHub base:1197] User pelibby16 server took 1.350 seconds to stop
Sep 19 13:47:18 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:18.808 JupyterHub spawner:170]
Sep 19 13:47:18 f0 jupyterhub[2131374]:     The shared database session at Spawner.db is deprecated, and will be removed.
Sep 19 13:47:18 f0 jupyterhub[2131374]:     Please manage your own database and connections.
Sep 19 13:47:18 f0 jupyterhub[2131374]:
Sep 19 13:47:18 f0 jupyterhub[2131374]:     Contact JupyterHub at https://github.com/jupyterhub/jupyterhub/issues/3700
Sep 19 13:47:18 f0 jupyterhub[2131374]:     if you have questions or ideas about direct database needs for your Spawner.
Sep 19 13:47:18 f0 jupyterhub[2131374]:
Sep 19 13:47:18 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:18.809 JupyterHub base:1444] Failing suspected API request to not-running server: /hub/user/pelibby16/api/kernels/46ef765f-00bf-4ed4-9570-4a128e2d7>
Sep 19 13:47:18 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:18.810 JupyterHub log:191] 424 GET /hub/user/pelibby16/api/kernels/46ef765f-00bf-4ed4-9570-4a128e2d7398?1726768038418 (pelibby16@::ffff:10.106.0.25>
Sep 19 13:47:18 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:18.811 JupyterHub log:191] 204 DELETE /hub/api/users/pelibby16/server?_xsrf=[secret] (pelibby16@::ffff:10.106.0.254) 1384.24ms
Sep 19 13:47:18 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:18.959 JupyterHub log:191] 302 GET /user/pelibby16/api/contents?content=1&1726768038925 -> /hub/user/pelibby16/api/contents?content=1&1726768038925>
Sep 19 13:47:18 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:18.974 JupyterHub log:191] 302 GET /user/pelibby16/api/events/subscribe?token=[secret] -> /hub/user/pelibby16/api/events/subscribe?token=[secret] (>
Sep 19 13:47:19 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:19.017 JupyterHub base:1444] Failing suspected API request to not-running server: /hub/user/pelibby16/api/contents
Sep 19 13:47:19 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:19.017 JupyterHub log:191] 424 GET /hub/user/pelibby16/api/contents?content=1&1726768038925 (pelibby16@::ffff:10.106.0.254) 3.24ms
Sep 19 13:47:19 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:19.622 JupyterHub log:191] 302 GET /user/stbrons22/api/kernels/84f0d582-b1e7-4b56-b497-8530532d9656?1726768039058 -> /hub/user/stbrons22/api/kernel>
Sep 19 13:47:19 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:19.646 JupyterHub base:1444] Failing suspected API request to not-running server: /hub/user/stbrons22/api/kernels/84f0d582-b1e7-4b56-b497-8530532d9>
Sep 19 13:47:19 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:19.647 JupyterHub log:191] 424 GET /hub/user/stbrons22/api/kernels/84f0d582-b1e7-4b56-b497-8530532d9656?1726768039058 (stbrons22@::ffff:10.105.3.34>
Sep 19 13:47:22 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:22.540 JupyterHub log:191] 302 GET /user/pelibby16/api/kernels/8276361c-0af8-4984-a2a6-6394acc2ec59?1726768042441 -> /hub/user/pelibby16/api/kernel>
Sep 19 13:47:22 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:22.601 JupyterHub base:1444] Failing suspected API request to not-running server: /hub/user/pelibby16/api/kernels/8276361c-0af8-4984-a2a6-6394acc2e>
Sep 19 13:47:22 f0 jupyterhub[2131374]: [W 2024-09-19 13:47:22.602 JupyterHub log:191] 424 GET /hub/user/pelibby16/api/kernels/8276361c-0af8-4984-a2a6-6394acc2ec59?1726768042441 (pelibby16@::ffff:10.106.0.25>
Sep 19 13:47:25 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:25.177 JupyterHub login:44] User logged out: pelibby16
Sep 19 13:47:25 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:25.180 JupyterHub log:191] 302 GET /hub/logout -> /hub/login (@::ffff:10.106.0.254) 4.14ms
Sep 19 13:47:25 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:25.245 JupyterHub log:191] 200 GET /hub/login (@::ffff:10.106.0.254) 2.59ms
Sep 19 13:47:26 f0 jupyterhub[2131374]: [I 2024-09-19 13:47:26.080 JupyterHub log:191] 200 GET /hub/login (@::ffff:10.106.0.254) 1.97ms
Sep 19 13:47:27 f0 jupyterhub[2131377]: 13:47:27.869 [ConfigProxy] warn: Terminated
Sep 19 13:47:27 f0 systemd[1]: jupyterhub.service: Deactivated successfully.
Sep 19 13:47:27 f0 systemd[1]: jupyterhub.service: Consumed 2.224s CPU time.

I am using the following versions:
systemd 252 (252.22-1~deb12u1)
jupyterhub 4.0.2
Jupyterlab 4.0.11
batchspawner 1.3.0
jupyterhub-systemdspawner 1.0.1
wrapspawner 1.0.1

Any help or suggestions are appreciated.

This is a bit unclear. From what I understand from your config, you have 3 different profiles in wrapspawner: two target the Slurm cluster and one spawns single-user servers locally using SystemdSpawner. So, when you log in and log out, the single-user server you spawned with SystemdSpawner deactivates on its own?

If you can share your logs right from the beginning, it would help us understand your problem better. Record the logs from JupyterHub startup, then spawn a single-user server, log in, and log out so that the problem is reproduced. Logs in DEBUG mode would be even more helpful.
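
For reference, debug logging can be switched on with a few lines in `jupyterhub_config.py`. This is a config fragment rather than a standalone script: `c` is the config object JupyterHub provides when it loads the file.

```python
# Additions to jupyterhub_config.py (fragment; `c` is supplied by JupyterHub)
c.JupyterHub.log_level = 'DEBUG'       # verbose logging from the hub itself
c.Spawner.debug = True                 # debug logging in the spawned single-user server
c.ConfigurableHTTPProxy.debug = True   # debug logging from configurable-http-proxy
```

Starting the hub with `jupyterhub --debug` has the same effect as the first line. When the hub runs as a systemd service, the output can be read with `journalctl -u jupyterhub.service`.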

It’s been a while, but I actually solved this problem. The cause, in my case, was that the Jupyter/ipy-related packages were installed in a Conda env, which wasn’t working well with the systemd service. I wrote a bash wrapper script that activates the needed conda env and then starts the server, and I now run that script from systemd.
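
The wrapper itself was not posted. Purely as an illustration of the idea, here is a Python sketch of a launcher that systemd could run instead of calling `jupyterhub` directly. It uses `conda run` rather than sourcing conda's activation script in bash, so it is a different mechanism from the bash wrapper described above. The conda executable path is a guess based on the env path in the config, and the names `build_command` and `launch` are hypothetical.

```python
import os

# Assumed locations, inferred from the paths in the config above; adjust to your install.
CONDA_EXE = "/mounts/faraday/software/anaconda/bin/conda"
ENV_PREFIX = "/mounts/faraday/software/anaconda/envs/jupyter-test"
HUB_CONFIG = "/mounts/faraday/jupyterhub/jupyterhub_config.py"


def build_command(conda_exe=CONDA_EXE, env_prefix=ENV_PREFIX, hub_config=HUB_CONFIG):
    """Build the argv that starts JupyterHub inside the conda env."""
    return [
        conda_exe, "run",
        "--no-capture-output",   # stream the hub's output instead of buffering it
        "--prefix", env_prefix,  # select the env by its path
        "jupyterhub", "-f", hub_config,
    ]


def launch():
    """Replace this process with `conda run`, which starts the hub in the env."""
    cmd = build_command()
    os.execv(cmd[0], cmd)
```

A script that ends with a call to `launch()` could then be used as the unit's `ExecStart=`. Signal handling through `conda run` may differ from a bash wrapper that `exec`s the hub, so it is worth testing that stopping and restarting the service behaves as expected.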
