Hi everyone,
I’ve gotten batchspawner working nicely with my Slurm-based cluster without a reverse proxy. I would now like to put JupyterHub behind a reverse proxy so that I can also host some other information from the head node of this cluster.
I followed the directions in here:
This seems to work as long as I am not using batchspawner. However, because my c.JupyterHub.bind_url is now set to 127.0.0.1:8000/jhub/ (from following the reverse proxy instructions), batchspawner jobs, which run on completely different machines, try to connect to the hub at that address. Of course that fails, because they end up trying to connect to themselves!
So the solution seems to be to set the hub_connect_url for batchspawner. I found the local network IP of my cluster head node, as reachable from the worker nodes, and set this in my jupyterhub_config:
c.SlurmSpawner.hub_connect_url = 'http://172.16.33.254:8000/'
However, the job then fails, and its output shows this error:
Error connecting to http://172.16.33.254:8000/jhub/hub/api: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 475, in _api_request
    r = await AsyncHTTPClient().fetch(req, raise_error=False)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/bin/batchspawner-singleuser", line 8, in <module>
    sys.exit(main())
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/lib/python3.9/site-packages/batchspawner/singleuser.py", line 32, in main
    asyncio.run(
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/spack/opt/spack/linux-centos7-sandybridge/gcc-11.2.0/python-3.9.10-2luse2jko74ictdwekggecxci76g3rso/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 488, in _api_request
    raise HTTPError(500, msg)
tornado.web.HTTPError: HTTP 500: Internal Server Error (Failed to connect to Hub API at 'http://172.16.33.254:8000/jhub/hub/api'. Is the Hub accessible at this URL (from host: n011.cluster.com)?)
srun: error: n011: task 0: Exited with exit code 1
Why is the connection refused? Do I need to change some firewall settings so that this port accepts connections from other machines on the network? I’m not sure how to do that and would appreciate some help.
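In case it helps with diagnosing this, here is a minimal sketch of a connectivity check that could be run from a worker node. The `probe` helper is a hypothetical name of my own, not part of JupyterHub or batchspawner, and it only tests whether something accepts TCP connections at a given address and port. As far as I understand, a refusal usually means the host is reachable but nothing is listening on that address and port (or a firewall actively rejects the connection), while a timeout more often points to packets being silently dropped.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a single TCP connection attempt to host:port."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this
        # address/port (or a firewall actively rejects them).
        return "refused"
    except socket.timeout:
        # No answer within the timeout, e.g. packets dropped by a firewall.
        return "timeout"
    except OSError as exc:
        # Anything else: no route to host, name resolution failure, etc.
        return f"error: {exc}"
```

From a worker node such as n011, `probe("172.16.33.254", 8000)` would correspond to the failing request above. Running the same call on the head node, once against `127.0.0.1` and once against `172.16.33.254`, should show whether the service on port 8000 is only listening on the loopback address.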