I am running batchspawner on a Slurm cluster, but the spawned server does not communicate with the frontend server and keeps getting killed. Here is my configuration:
c = get_config()
import batchspawner
import wrapspawner
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.hub_connect_ip = 'server_ip'
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 60
c.BatchSpawnerBase.req_nprocs = '1'
c.BatchSpawnerBase.ip = 'server_ip'
c.BatchSpawnerBase.req_runtime = '12:00:00'
c.BatchSpawnerBase.start_timeout = 240
c.BatchSpawnerBase.req_host = 'server_ip'
c.BatchSpawnerBase.exec_prefix = ''
c.SlurmSpawner.batch_script = '''#!/bin/bash
#
#SBATCH --output=/nfs/cluster/jupyterhub/jupyterhub_slurmspawner_%j.log
#SBATCH --job-name=jupyterhub-spawner
{% if partition %}#SBATCH --partition={{partition}}
{% endif %}{% if runtime %}#SBATCH --time={{runtime}}
{% endif %}{% if gres %}#SBATCH --gres={{gres}}
{% endif %}{% if nprocs %}#SBATCH --cpus-per-task={{nprocs}}
{% endif %}{% if options %}#SBATCH {{options}}{% endif %}
#!/usr/bin/scl enable devtoolset-8 -- /bin/bash
eval "$(conda shell.bash hook)"
conda activate deep_learning
{{prologue}}
which jupyterhub-singleuser
printenv
{% if srun %}{{srun}} {% endif %}{{cmd}}
echo "jupyterhub-singleuser ended gracefully"
{{epilogue}}
'''
c.ProfilesSpawner.ip = 'server_ip'
I am hoping that it is an IP configuration issue. I am new to JupyterHub configuration, so I am fairly sure I have messed up something trivial.
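One way to test the IP hypothesis would be to check, from the compute node that runs the Slurm job, whether the address given in hub_connect_ip accepts TCP connections. The following is only a minimal sketch under assumptions: the helper name can_connect is made up for illustration, 'server_ip' is the same placeholder used in the configuration above, and port 8081 is assumed because it is JupyterHub's default hub_port and the configuration does not override it.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: placeholder host from the config; replace with the real address.
    hub_host = "server_ip"
    # Assumption: JupyterHub's default hub_port, not set explicitly in the config.
    hub_port = 8081
    reachable = can_connect(hub_host, hub_port)
    print(f"{hub_host}:{hub_port} reachable from this node: {reachable}")
```

If this reports False when run inside a Slurm job on the compute node, the problem is probably at the network level (address, firewall, or routing) rather than in the spawner settings. A True result only shows that the port is open, not that the Hub API accepts the single-user server's requests.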
[I 2021-11-15 11:56:08.804 SingleUserNotebookApp mixins:576] Starting jupyterhub-singleuser server version 1.5.0
[I 2021-11-15 11:56:08.808 SingleUserNotebookApp notebookapp:2302] Serving notebooks from local directory: /home/abc
[I 2021-11-15 11:56:08.808 SingleUserNotebookApp notebookapp:2302] Jupyter Notebook 6.4.5 is running at:
[I 2021-11-15 11:56:08.808 SingleUserNotebookApp notebookapp:2302] http://xxxx:34577/
[I 2021-11-15 11:56:08.808 SingleUserNotebookApp notebookapp:2303] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2021-11-15 11:56:08.816 SingleUserNotebookApp mixins:557] Updating Hub with activity every 300 seconds
slurmstepd: error: *** STEP 156.0 ON xxxx CANCELLED AT 2021-11-15T12:06:05 ***
[C 2021-11-15 12:06:05.807 SingleUserNotebookApp notebookapp:1972] received signal 15, stopping
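To compare the log against the configured timeouts, the time between the first and last single-user server log lines can be computed from their timestamps. This is a small sketch that uses only the two timestamps shown in the log above; the variable names are illustrative.

```python
from datetime import datetime

# Timestamp format used by the SingleUserNotebookApp log lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

# First log line: the single-user server starts.
started = datetime.strptime("2021-11-15 11:56:08.804", FMT)
# Last log line: the server receives signal 15 (SIGTERM).
stopped = datetime.strptime("2021-11-15 12:06:05.807", FMT)

elapsed = (stopped - started).total_seconds()
print(f"Server ran for about {elapsed:.0f} seconds before being stopped")
```

The server ran for roughly ten minutes, which is longer than both http_timeout (60) and start_timeout (240) in the configuration, so the log alone does not show which setting, if any, triggered the cancellation.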