JupyterHub and Slurm

Hello!
I am trying to set up slurm together with jupyterhub. Here is part of jupyterhub config

from batchspawner import SlurmSpawner
from os import environ
c.JupyterHub.spawner_class = SlurmSpawner
environ['SLURM_CONF'] = '/etc/slurm/slurm.conf'
c.SlurmSpawner.environment = {'SLURM_CONF': '/etc/slurm/slurm.conf'}
c.JupyterHub.hub_ip = 'ip_of_machine_with_hub'
c.JupyterHub.hub_connect_ip = 'how_machine_is_named_in_network'
c.JupyterHub.hub_port = #Port
c.SlurmSpawner.batch_script = """#!/bin/bash
#SBATCH --output=/home/zotov/jupyterhub_slurmspawner_%j.log
#SBATCH --job-name=spawner-jupyterhub
#SBATCH --chdir=/home/zotov
#SBATCH --export={{keepvars}}
#SBATCH --get-user-env=L

#SBATCH --partition=jupyter45
{{prologue}}
#source /home/zotov/venv/bin/activate
source /home/zotov/batchvenv/bin/activate
echo "activate env"
env
echo {{cmd}}
which jupyterhub-singleuser
{% if srun %}{{srun}} {% endif %}{{cmd}}
echo "jupyterhub-singleuser ended gracefully"
{{epilogue}}
"""

After starting jupyterhub systemd unit and trying to connect to the server, error arises

calling journalctl jupyterhub reveals this:

Jun 15 15:44:06 hostname jupyterhub[6553]: 15:44:06.329 [ConfigProxy] info: 200 GET /api/routes
Jun 15 15:44:06 hostname jupyterhub[6553]: 15:44:06.330 [ConfigProxy] info: 200 GET /api/routes
Jun 15 15:44:06 hostname jupyterhub[6550]: [D 2023-06-15 15:44:06.330 JupyterHub proxy:392] Checking routes
Jun 15 15:44:06 hostname jupyterhub[6550]: [I 2023-06-15 15:44:06.330 JupyterHub proxy:477] Adding route for Hub: / => http://hostname.domain:8081
Jun 15 15:44:06 hostname jupyterhub[6550]: [D 2023-06-15 15:44:06.330 JupyterHub proxy:880] Proxy: Fetching POST http://127.0.0.1:8001/api/routes/
Jun 15 15:44:06 hostname jupyterhub[6553]: 15:44:06.331 [ConfigProxy] info: Adding route / -> http://hostname.domain:8081
Jun 15 15:44:06 hostname jupyterhub[6553]: 15:44:06.332 [ConfigProxy] info: Route added / -> http://hostname.domain:8081
Jun 15 15:44:06 hostname jupyterhub[6553]: 15:44:06.332 [ConfigProxy] info: 201 POST /api/routes/
Jun 15 15:44:06 hostname jupyterhub[6550]: [I 2023-06-15 15:44:06.333 JupyterHub app:3245] JupyterHub is now running at http://:8000
Jun 15 15:44:06 hostname jupyterhub[6550]: [D 2023-06-15 15:44:06.333 JupyterHub app:2852] It took 0.508 seconds for the Hub to start
Jun 15 15:44:15 hostname jupyterhub[6550]: [D 2023-06-15 15:44:15.998 JupyterHub base:342] Refreshing auth for zotov
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.000 JupyterHub scopes:863] Checking access via scope servers
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.001 JupyterHub scopes:677] Unrestricted access to /hub/spawn-pending/zotov via servers
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.001 JupyterHub user:431] Creating <class 'batchspawner.batchspawner.SlurmSpawner'> for zotov:
Jun 15 15:44:16 hostname jupyterhub[6550]: [I 2023-06-15 15:44:16.024 JupyterHub log:191] 200 GET /hub/spawn-pending/zotov (zotov@10.36.158.4) 36.30ms
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.077 JupyterHub log:191] 304 GET /hub/logo (@10.36.158.4) 5.83ms
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.159 JupyterHub log:191] 200 GET /hub/static/js/not_running.js?v=20230615154405 (@10.36.158.4) 1.49ms
Jun 15 15:44:16 hostname jupyterhub[6550]: [D 2023-06-15 15:44:16.171 JupyterHub log:191] 200 GET /hub/static/js/utils.js?v=20230615154405 (@10.36.158.4) 1.59ms
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.092 JupyterHub scopes:863] Checking access via scope servers
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.092 JupyterHub scopes:677] Unrestricted access to /hub/spawn/zotov via servers
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.093 JupyterHub pages:217] Triggering spawn with default options for zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.093 JupyterHub base:959] Initiating spawn for zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.093 JupyterHub base:963] 0/100 concurrent spawns
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.093 JupyterHub base:968] 0 active servers
Jun 15 15:44:17 hostname jupyterhub[6550]: [I 2023-06-15 15:44:17.115 JupyterHub provider:659] Creating oauth client jupyterhub-user-zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.130 JupyterHub user:794] Calling Spawner.start for zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [I 2023-06-15 15:44:17.133 JupyterHub batchspawner:291] Spawner submitting job using sudo -E -u zotov sbatch --parsable
Jun 15 15:44:17 hostname jupyterhub[6550]: [I 2023-06-15 15:44:17.133 JupyterHub batchspawner:292] Spawner submitted script:
Jun 15 15:44:17 hostname jupyterhub[6550]:     #!/bin/bash
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --output=/home/zotov/jupyterhub_slurmspawner_%j.log
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --job-name=spawner-jupyterhub
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --chdir=/home/zotov
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --export=PATH,LANG,JUPYTERHUB_API_TOKEN,JPY_API_TOKEN,JUPYTERHUB_CLIENT_ID,JUPYTERHUB_HOST,JUPYTERHUB_OAUTH_CALLBACK_URL,JUPYTERHUB_OAUTH_SCOPES,JUPYTERHUB_OAUTH_ACCESS_SCOPES,JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES,JUPYTERHUB_USER,JUPYTERHUB_SERVER_NAME,JUPYTERHUB_API_URL,JUPYTERHUB_ACTIVITY_URL,JUPYTERHUB_BASE_URL,JUPYTERHUB_SERVICE_PREFIX,JUPYTERHUB_SERVICE_URL,JUPYTERHUB_DEFAULT_URL,JUPYTERHUB_DEBUG,SLURM_CONF,USER,HOME,SHELL
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --get-user-env=L
Jun 15 15:44:17 hostname jupyterhub[6550]:     
Jun 15 15:44:17 hostname jupyterhub[6550]:     #SBATCH --partition=jupyter45
Jun 15 15:44:17 hostname jupyterhub[6550]:     
Jun 15 15:44:17 hostname jupyterhub[6550]:     #source /home/zotov/venv/bin/activate
Jun 15 15:44:17 hostname jupyterhub[6550]:     source /home/zotov/batchvenv/bin/activate
Jun 15 15:44:17 hostname jupyterhub[6550]:     echo "activate env"
Jun 15 15:44:17 hostname jupyterhub[6550]:     env
Jun 15 15:44:17 hostname jupyterhub[6550]:     echo batchspawner-singleuser jupyterhub-singleuser
Jun 15 15:44:17 hostname jupyterhub[6550]:     which jupyterhub-singleuser
Jun 15 15:44:17 hostname jupyterhub[6550]:     srun batchspawner-singleuser jupyterhub-singleuser
Jun 15 15:44:17 hostname jupyterhub[6550]:     echo "jupyterhub-singleuser ended gracefully"
Jun 15 15:44:17 hostname jupyterhub[6550]:     
Jun 15 15:44:17 hostname sudo[6621]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/sbatch --parsable
Jun 15 15:44:17 hostname sudo[6621]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:17 hostname sudo[6621]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [I 2023-06-15 15:44:17.149 JupyterHub batchspawner:295] Job submitted. cmd: sudo -E -u zotov sbatch --parsable output: 1469
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.150 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:17 hostname sudo[6625]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:17 hostname sudo[6625]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:17 hostname sudo[6625]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.161 JupyterHub batchspawner:436] Job 1469 still pending
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.662 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:17 hostname sudo[6631]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:17 hostname sudo[6631]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:17 hostname sudo[6631]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:17 hostname jupyterhub[6550]: [D 2023-06-15 15:44:17.678 JupyterHub batchspawner:436] Job 1469 still pending
Jun 15 15:44:18 hostname jupyterhub[6550]: [W 2023-06-15 15:44:18.096 JupyterHub base:183] Rolling back dirty objects IdentitySet([<Server(:0)>])
Jun 15 15:44:18 hostname jupyterhub[6550]: [I 2023-06-15 15:44:18.101 JupyterHub log:191] 302 GET /hub/spawn/zotov -> /hub/spawn-pending/zotov (zotov@10.36.158.4) 1007.39ms
Jun 15 15:44:18 hostname jupyterhub[6550]: [D 2023-06-15 15:44:18.113 JupyterHub scopes:863] Checking access via scope servers
Jun 15 15:44:18 hostname jupyterhub[6550]: [D 2023-06-15 15:44:18.113 JupyterHub scopes:677] Unrestricted access to /hub/spawn-pending/zotov via servers
Jun 15 15:44:18 hostname jupyterhub[6550]: [I 2023-06-15 15:44:18.115 JupyterHub pages:398] zotov is pending spawn
Jun 15 15:44:18 hostname jupyterhub[6550]: [I 2023-06-15 15:44:18.120 JupyterHub log:191] 200 GET /hub/spawn-pending/zotov (zotov@10.36.158.4) 8.80ms
Jun 15 15:44:18 hostname jupyterhub[6550]: [D 2023-06-15 15:44:18.178 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:18 hostname sudo[6658]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:18 hostname sudo[6658]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:18 hostname sudo[6658]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:18 hostname jupyterhub[6550]: [D 2023-06-15 15:44:18.255 JupyterHub scopes:863] Checking access via scope read:servers
Jun 15 15:44:18 hostname jupyterhub[6550]: [D 2023-06-15 15:44:18.256 JupyterHub scopes:677] Unrestricted access to /hub/api/users/zotov/server/progress via read:servers
Jun 15 15:44:19 hostname jupyterhub[6550]: [I 2023-06-15 15:44:19.924 JupyterHub log:191] 200 GET /hub/api (@10.34.41.32) 0.96ms
Jun 15 15:44:19 hostname jupyterhub[6550]: [D 2023-06-15 15:44:19.941 JupyterHub base:297] Recording first activity for <APIToken('d514...', user='zotov', client_id='jupyterhub')>
Jun 15 15:44:19 hostname jupyterhub[6550]: [D 2023-06-15 15:44:19.954 JupyterHub scopes:863] Checking access via scope users:activity
Jun 15 15:44:19 hostname jupyterhub[6550]: [D 2023-06-15 15:44:19.954 JupyterHub scopes:690] Argument-based access to /hub/api/users/zotov/activity via users:activity
Jun 15 15:44:19 hostname jupyterhub[6550]: [D 2023-06-15 15:44:19.957 JupyterHub users:874] Activity for user zotov: 2023-06-15T13:44:19.881493Z
Jun 15 15:44:19 hostname jupyterhub[6550]: [D 2023-06-15 15:44:19.957 JupyterHub users:892] Activity on server zotov/: 2023-06-15T13:44:19.881493Z
Jun 15 15:44:19 hostname jupyterhub[6550]: [I 2023-06-15 15:44:19.966 JupyterHub log:191] 200 POST /hub/api/users/zotov/activity (zotov@10.34.41.32) 28.90ms
Jun 15 15:44:27 hostname jupyterhub[6550]: [W 2023-06-15 15:44:27.095 JupyterHub base:1088] User zotov is slow to start (timeout=10)
Jun 15 15:44:37 hostname jupyterhub[6550]: [W 2023-06-15 15:44:37.133 JupyterHub user:876] zotov's server failed to start in 20 seconds, giving up.
Jun 15 15:44:37 hostname jupyterhub[6550]:     
Jun 15 15:44:37 hostname jupyterhub[6550]:     Common causes of this timeout, and debugging tips:
Jun 15 15:44:37 hostname jupyterhub[6550]:     
Jun 15 15:44:37 hostname jupyterhub[6550]:     1. Everything is working, but it took too long.
Jun 15 15:44:37 hostname jupyterhub[6550]:        To fix: increase `Spawner.start_timeout` configuration
Jun 15 15:44:37 hostname jupyterhub[6550]:        to a number of seconds that is enough for spawners to finish starting.
Jun 15 15:44:37 hostname jupyterhub[6550]:     2. The server didn't finish starting,
Jun 15 15:44:37 hostname jupyterhub[6550]:        or it crashed due to a configuration issue.
Jun 15 15:44:37 hostname jupyterhub[6550]:        Check the single-user server's logs for hints at what needs fixing.
Jun 15 15:44:37 hostname jupyterhub[6550]:     
Jun 15 15:44:37 hostname jupyterhub[6550]: [D 2023-06-15 15:44:37.135 JupyterHub user:982] Stopping zotov
Jun 15 15:44:37 hostname jupyterhub[6550]: [D 2023-06-15 15:44:37.135 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:37 hostname sudo[6663]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:37 hostname sudo[6663]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:37 hostname sudo[6663]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:37 hostname jupyterhub[6550]: [I 2023-06-15 15:44:37.150 JupyterHub batchspawner:477] Stopping server job 1469
Jun 15 15:44:37 hostname jupyterhub[6550]: [I 2023-06-15 15:44:37.151 JupyterHub batchspawner:355] Cancelling job 1469: sudo -E -u zotov scancel 1469
Jun 15 15:44:37 hostname sudo[6666]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/scancel 1469
Jun 15 15:44:37 hostname sudo[6666]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:37 hostname sudo[6666]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:37 hostname jupyterhub[6550]: [D 2023-06-15 15:44:37.161 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:37 hostname sudo[6672]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:37 hostname sudo[6672]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:37 hostname sudo[6672]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:38 hostname jupyterhub[6550]: [D 2023-06-15 15:44:38.174 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:38 hostname sudo[6676]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:38 hostname sudo[6676]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:38 hostname sudo[6676]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:39 hostname jupyterhub[6550]: [D 2023-06-15 15:44:39.192 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:39 hostname sudo[6680]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:39 hostname sudo[6680]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:39 hostname sudo[6680]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:40 hostname jupyterhub[6550]: [D 2023-06-15 15:44:40.207 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u zotov squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:40 hostname sudo[6684]:     root : PWD=/ ; USER=zotov ; COMMAND=/usr/local/bin/squeue -h -j 1469 -o '%T %B'
Jun 15 15:44:40 hostname sudo[6684]: pam_unix(sudo:session): session opened for user zotov(uid=10074) by (uid=0)
Jun 15 15:44:40 hostname sudo[6684]: pam_unix(sudo:session): session closed for user zotov
Jun 15 15:44:40 hostname jupyterhub[6550]: [D 2023-06-15 15:44:40.229 JupyterHub user:1002] Deleting oauth client jupyterhub-user-zotov
Jun 15 15:44:40 hostname jupyterhub[6550]: [D 2023-06-15 15:44:40.237 JupyterHub user:1005] Finished stopping zotov
Jun 15 15:44:40 hostname jupyterhub[6550]: [E 2023-06-15 15:44:40.243 JupyterHub gen:630] Exception in Future <Task finished name='Task-23' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /opt/jupyterhub/lib/python3.10/site-packages/jupyterhub/handlers/base.py:979> exception=TimeoutError('Timeout')> after timeout
Jun 15 15:44:40 hostname jupyterhub[6550]:     Traceback (most recent call last):
Jun 15 15:44:40 hostname jupyterhub[6550]:       File "/opt/jupyterhub/lib/python3.10/site-packages/tornado/gen.py", line 625, in error_callback
Jun 15 15:44:40 hostname jupyterhub[6550]:         future.result()
Jun 15 15:44:40 hostname jupyterhub[6550]:       File "/opt/jupyterhub/lib/python3.10/site-packages/jupyterhub/handlers/base.py", line 986, in finish_user_spawn
Jun 15 15:44:40 hostname jupyterhub[6550]:         await spawn_future
Jun 15 15:44:40 hostname jupyterhub[6550]:       File "/opt/jupyterhub/lib/python3.10/site-packages/jupyterhub/user.py", line 902, in spawn
Jun 15 15:44:40 hostname jupyterhub[6550]:         raise e
Jun 15 15:44:40 hostname jupyterhub[6550]:       File "/opt/jupyterhub/lib/python3.10/site-packages/jupyterhub/user.py", line 798, in spawn
Jun 15 15:44:40 hostname jupyterhub[6550]:         url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
Jun 15 15:44:40 hostname jupyterhub[6550]:     asyncio.exceptions.TimeoutError: Timeout
Jun 15 15:44:40 hostname jupyterhub[6550]:     
Jun 15 15:44:40 hostname jupyterhub[6550]: [I 2023-06-15 15:44:40.244 JupyterHub log:191] 200 GET /hub/api/users/zotov/server/progress?_xsrf=[secret] (zotov@10.36.158.4) 21991.68ms

Additionallly a log file for slurm job is generated

/home/zotov/batchvenv/bin/jupyterhub-singleuser
/home/zotov/batchvenv/lib/python3.9/site-packages/batchspawner/singleuser.py:17: RuntimeWarning: coroutine 'HubAuth._api_request' was never awaited
  hub_auth._api_request(
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
[I 2023-06-15 15:44:19.835 ServerApp] Package jupyterhub took 0.0000s to import
[I 2023-06-15 15:44:19.856 ServerApp] Package jupyter_server_terminals took 0.0209s to import
[I 2023-06-15 15:44:19.862 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2023-06-15 15:44:19.863 JupyterHubSingleUser] Starting jupyterhub single-user server extension version 4.0.1
[I 2023-06-15 15:44:19.863 JupyterHubSingleUser] Using default url from environment $JUPYTERHUB_DEFAULT_URL: /lab
[I 2023-06-15 15:44:19.865 ServerApp] jupyterhub | extension was successfully linked.
[I 2023-06-15 15:44:19.924 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2023-06-15 15:44:19.928 JupyterHubSingleUser] Updating Hub with activity every 300 seconds
[I 2023-06-15 15:44:19.928 ServerApp] jupyterhub | extension was successfully loaded.
[I 2023-06-15 15:44:19.930 ServerApp] Serving notebooks from local directory: /home/zotov
[I 2023-06-15 15:44:19.930 ServerApp] Jupyter Server 2.6.0 is running at:
[I 2023-06-15 15:44:19.930 ServerApp] http://cluster:51085/user/zotov/lab?token=...
[I 2023-06-15 15:44:19.930 ServerApp]     http://127.0.0.1:51085/user/zotov/lab?token=...
[I 2023-06-15 15:44:19.930 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** JOB 1469 ON cluster CANCELLED AT 2023-06-15T15:44:37 ***
slurmstepd: error: *** STEP 1469.0 ON cluster CANCELLED AT 2023-06-15T15:44:37 ***
[C 2023-06-15 15:44:37.174 ServerApp] received signal 15, stopping

Moreover, tcpdump shows that connection is esablished, but then it suddenly drops. What could be the problem?

Edit:
there are 22 packages sent between jupyterhub and slurm cluster (now only one machine for testing)

First 10 packages have such TCP stream

GET /hub/api HTTP/1.1
Connection: close
Host: host_with_jupyter:8081
User-Agent: Tornado/6.3.2
Accept-Encoding: gzip

HTTP/1.1 200 OK
Server: TornadoServer/6.2
Content-Type: application/json
Date: Thu, 15 Jun 2023 13:00:13 GMT
X-Jupyterhub-Version: 4.0.1
Access-Control-Allow-Headers: accept, content-type, authorization
Content-Security-Policy: frame-ancestors 'self'; report-uri /hub/security/csp-report; default-src 'none'
Etag: "4da5b51af2b10a753250f86a09da956e3fc38cde"
Content-Length: 20
Connection: close

{"version": "4.0.1"}

from 11 to 22

POST /hub/api/users/zotov/activity HTTP/1.1
Authorization: token ***
Content-Type: application/json
Connection: close
Host: host_with_jupyter:8081
User-Agent: Tornado/6.3.2
Content-Length: 115
Accept-Encoding: gzip

{"servers": {"": {"last_activity": "2023-06-15T13:00:13.318380Z"}}, "last_activity": "2023-06-15T13:00:13.318380Z"}HTTP/1.1 200 OK
Server: TornadoServer/6.2
Content-Type: application/json
Date: Thu, 15 Jun 2023 13:00:13 GMT
X-Jupyterhub-Version: 4.0.1
Access-Control-Allow-Headers: accept, content-type, authorization
Content-Security-Policy: frame-ancestors 'self'; report-uri /hub/security/csp-report; default-src 'none'
Content-Length: 0
Connection: close

For some reason no servers are specified in second sequence

Well, that was the issue. I am quite sure that you installed batchspawner from PyPI. This PR fixed that problem but if I am right, there was no new release with this patch. So, batchspawner installed from PyPI would not work with JupyterHub 3.

Try installing it from the repo’s main branch and it should normally work. If you are using JupyterHub 4, you will need this patch as well.

1 Like

mahendrapaipuri Thank you for your response I have the same issue and your solution not work for me, still investigating

Could you share your JupyterHub and single user logs logs?

Hi mahendrapaipuri, thank you for the attention,

ubuntu@dev-dy-t3amedium-1:/var/log$ sudo cat slurmd.log
[2023-08-24T17:43:43.454] error: Node configuration differs from hardware: CPUs=2:2(hw) Boards=1:1(hw) SocketsPerBoard=2:1(hw) CoresPerSocket=1:1(hw) ThreadsPerCore=1:2(hw)
[2023-08-24T17:43:43.500] CPU frequency setting not configured for this node
[2023-08-24T17:43:43.510] slurmd version 22.05.8 started
[2023-08-24T17:43:43.899] slurmd started on Thu, 24 Aug 2023 17:43:43 +0000
[2023-08-24T17:43:43.900] CPUs=2 Boards=1 Sockets=2 Cores=1 Threads=1 Memory=3875 TmpDisk=992603 Uptime=248 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2023-08-24T17:44:00.220] task/affinity: task_p_slurmd_batch_request: task_p_slurmd_batch_request: 26
[2023-08-24T17:44:00.220] task/affinity: batch_bind: job 26 CPU input mask for node: 0x3
[2023-08-24T17:44:00.220] task/affinity: batch_bind: job 26 CPU final HW mask for node: 0x3
[2023-08-24T17:44:02.862] error: Failed to get current user environment variables
[2023-08-24T17:44:02.862] error: _get_user_env: Unable to get user's local environment, running only with passed environment
[2023-08-24T17:44:02.862] Launching batch job 26 for UID 2001
[2023-08-24T17:44:02.908] [26.batch] task/cgroup: _memcg_initialize: job: alloc=3891MB mem.limit=3875MB memsw.limit=unlimited
[2023-08-24T17:44:02.908] [26.batch] task/cgroup: _memcg_initialize: step: alloc=3891MB mem.limit=3875MB memsw.limit=unlimited
[2023-08-24T17:44:02.935] [26.batch] done with job

sudo journalctl -u jupyterhub

sudo[770911]:     root : TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ; COMMAND=/opt/slurm/bin/squeue -h -j 32 -o %T %B
sudo[770911]: pam_unix(sudo:session): session opened for user mok by (uid=0)
sudo[770911]: pam_unix(sudo:session): session closed for user mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:29:58.367 JupyterHub batchspawner:436] Job 32 still pending
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.726 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u dmitrii squeue -h -j 32 -o '%T %B'
sudo[770988]:     root : problem with defaults entries ; TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ;
sudo[770988]:     root : TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ; COMMAND=/opt/slurm/bin/squeue -h -j 32 -o %T %B
sudo[770988]: pam_unix(sudo:session): session opened for user dmitrii by (uid=0)
sudo[770988]: pam_unix(sudo:session): session closed for user dmitrii
run-jupyterhub-ip-10-22-33-44.sh[583658]: [W 2023-08-30 16:30:08.778 JupyterHub batchspawner:440] Job 32 neither pending nor running.
run-jupyterhub-ip-10-22-33-44.sh[583658]:
run-jupyterhub-ip-10-22-33-44.sh[583658]: [E 2023-08-30 16:30:08.778 JupyterHub user:815] Unhandled error starting mok's server: The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.
run-jupyterhub-ip-10-22-33-44.sh[583658]:     Traceback (most recent call last):
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/user.py", line 732, in spawn
run-jupyterhub-ip-10-22-33-44.sh[583658]:         url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/cdsdashboards/hubextension/spawners/variablemixin.py", line 198, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/aws_pcluster_slurm_spawner/aws_pcluster_slurm_spawner.py", line 152, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/batchspawner/batchspawner.py", line 447, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         raise RuntimeError(
run-jupyterhub-ip-10-22-33-44.sh[583658]:     RuntimeError: The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.
run-jupyterhub-ip-10-22-33-44.sh[583658]:
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.799 JupyterHub user:913] Stopping mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.914 JupyterHub user:933] Deleting oauth client jupyterhub-user-mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.986 JupyterHub user:936] Finished stopping mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [E 2023-08-30 16:30:09.137 JupyterHub gen:630] Exception in Future <Task finished name='Task-231240' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/handlers/base.py:954> exception=RuntimeError('The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.')> after timeout
run-jupyterhub-ip-10-22-33-44.sh[583658]:     Traceback (most recent call last):
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/tornado/gen.py", line 625, in error_callback
run-jupyterhub-ip-10-22-33-44.sh[583658]:         future.result()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/handlers/base.py", line 961, in finish_user_spawn
run-jupyterhub-ip-10-22-33-44.sh[583658]:         await spawn_future
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/user.py", line 833, in spawn
run-jupyterhub-ip-10-22-33-44.sh[583658]:         raise e
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/user.py", line 732, in spawn
run-jupyterhub-ip-10-22-33-44.sh[583658]:         url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/cdsdashboards/hubextension/spawners/variablemixin.py", line 198, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/aws_pcluster_slurm_spawner/aws_pcluster_slurm_spawner.py", line 152, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/batchspawner/batchspawner.py", line 447, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         raise RuntimeError(
run-jupyterhub-ip-10-22-33-44.sh[583658]:     RuntimeError: The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.
run-jupyterhub-ip-10-22-33-44.sh[583658]:
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:30:09.144 JupyterHub log:189] 200 GET /hub/api/users/dmitrii/server/progress (dmitrii@10.24.63.200) 355284.63ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:30:24.605 JupyterHub log:189] 200 GET /hub/api (@10.24.33.200) 1.72ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:30:26.526 JupyterHub log:189] 200 GET /hub/api (@10.24.44.44) 1.07ms
sudo[770911]:     root : TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ; COMMAND=/opt/slurm/bin/squeue -h -j 32 -o %T %B
sudo[770911]: pam_unix(sudo:session): session opened for user mok by (uid=0)
sudo[770911]: pam_unix(sudo:session): session closed for user mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:29:58.367 JupyterHub batchspawner:436] Job 32 still pending
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.726 JupyterHub batchspawner:322] Spawner querying job: sudo -E -u dmitrii squeue -h -j 32 -o '%T %B'
sudo[770988]:     root : problem with defaults entries ; TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ;
sudo[770988]:     root : TTY=unknown ; PWD=/apps/software-configs/jupyterhub ; USER=mok ; COMMAND=/opt/slurm/bin/squeue -h -j 32 -o %T %B
sudo[770988]: pam_unix(sudo:session): session opened for user mok by (uid=0)
sudo[770988]: pam_unix(sudo:session): session closed for user mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [W 2023-08-30 16:30:08.778 JupyterHub batchspawner:440] Job 32 neither pending nor running.
run-jupyterhub-ip-10-22-33-44.sh[583658]:
run-jupyterhub-ip-10-22-33-44.sh[583658]: [E 2023-08-30 16:30:08.778 JupyterHub user:815] Unhandled error starting mok's server: The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.
run-jupyterhub-ip-10-22-33-44.sh[583658]:     Traceback (most recent call last):
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/jupyterhub/user.py", line 732, in spawn
run-jupyterhub-ip-10-22-33-44.sh[583658]:         url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/cdsdashboards/hubextension/spawners/variablemixin.py", line 198, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/aws_pcluster_slurm_spawner/aws_pcluster_slurm_spawner.py", line 152, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         return await super().start()
run-jupyterhub-ip-10-22-33-44.sh[583658]:       File "/apps/easybuild/4.5.4/software/jupyterlab/2022.05.10/lib/python3.9/site-packages/batchspawner/batchspawner.py", line 447, in start
run-jupyterhub-ip-10-22-33-44.sh[583658]:         raise RuntimeError(
run-jupyterhub-ip-10-22-33-44.sh[583658]:     RuntimeError: The Jupyter batch job has disappeared while pending in the queue or died immediately after starting.
run-jupyterhub-ip-10-22-33-44.sh[583658]:
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.799 JupyterHub user:913] Stopping mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.914 JupyterHub user:933] Deleting oauth client jupyterhub-user-mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [D 2023-08-30 16:30:08.986 JupyterHub user:936] Finished stopping mok
run-jupyterhub-ip-10-22-33-44.sh[583658]: [E 2023-08-30 16:30:09.137 JupytAug 30 16:30:54 ip-10-22-33-44 run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:30:54.633 JupyterHub log:189] 200 GET /hub/api (@10.24.33.200) 0.88ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:30:56.556 JupyterHub log:189] 200 GET /hub/api (@10.24.44.44) 1.32ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:31:24.649 JupyterHub log:189] 200 GET /hub/api (@10.24.33.200) 1.38ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:31:26.586 JupyterHub log:189] 200 GET /hub/api (@10.24.41.46) 0.85ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:31:54.651 JupyterHub log:189] 200 GET /hub/api (@10.24.33.200) 0.83ms
run-jupyterhub-ip-10-22-33-44.sh[583658]: [I 2023-08-30 16:31:56.607 JupyterHub log:189] 200 GET /hub/api (@10.24.44.44) 1.31ms

We will need the SLURM output and error logs of the job 32 which failed to start single user server. I think we might find the root cause of the problem in those logs.

slurmctld.log

[2023-08-30T16:24:13.116] _slurm_rpc_submit_batch_job: JobId=32 InitPrio=4294901758 usec=11008
[2023-08-30T16:24:13.867] sched/backfill: _start_job: Started JobId=32 in dev on dev-dy-t3amedium-1
[2023-08-30T16:24:36.608] POWER: power_save: pid 766676 waking nodes dev-dy-t3amedium-1
[2023-08-30T16:25:06.612] POWER: JobId=32 needed resuming but nodes aren't power_save anymore
[2023-08-30T16:29:43.872] Node dev-dy-t3amedium-1 now responding
[2023-08-30T16:29:58.714] job_time_limit: Configuration for JobId=32 complete
[2023-08-30T16:29:58.714] Resetting JobId=32 start time for node power up
[2023-08-30T16:30:01.466] _job_complete: JobId=32 WEXITSTATUS 127
[2023-08-30T16:30:01.469] _job_complete: JobId=32 done
[2023-08-30T16:32:06.807] POWER: Power save mode: 43999 nodes
[2023-08-30T16:40:06.059] POWER: power_save: pid 771849 suspending nodes dev-dy-t3amedium-1

here my script to start job

"""#!/bin/bash
#SBATCH --output={{homedir}}/logs/jupyterhub_%j.log
#SBATCH --job-name=jupyterhub
#SBATCH --chdir={{homedir}}
#SBATCH --export={{keepvars}}
#SBATCH --get-user-env=L
{% if req_partition  %}#SBATCH --partition={{req_partition}}{% endif %}
{% if req_constraint %}#SBATCH --constraint={{req_constraint}}{% endif %}
{% if req_nprocs %}#SBATCH --cpus-per-task={{req_nprocs}}{% endif %}
{% if exclusive %}#SBATCH --exclusive{% endif %}
{% if runtime    %}#SBATCH --time={{runtime}}{% endif %}
{% if options    %}#SBATCH {{options}}{% endif %}
set -euo pipefail
trap 'echo SIGTERM received' TERM

{{prologue}}

# Custom R
{% if req_custom_r    %}export PATH="{{req_custom_r}}:$PATH"{% endif %}

# Custom Env
{% if req_custom_env    %}{{req_custom_env}}{% endif %}

which jupyterhub-singleuser
{% if srun %}{{srun}} {% endif %}{{cmd}}
echo "jupyterhub-singleuser ended gracefully"
{{epilogue}}
    """

cat jupyterhub_33.log

/var/spool/slurmd/job00033/slurm_script: line 18: module: command not found

Seems like your environment is not properly configured. I assume you have module load ... in your prologue but environment modules is not properly initialised. You must make sure module command works on your single user environment