Hi all,
I’m hoping to get some input on a problem that has cropped up recently.
We have JupyterHub running on Rocky Linux 8, using Batchspawner to run notebooks on a Slurm cluster (also running Rocky Linux 8). This has been working great for many months.
Now, whenever we try to spawn a notebook, the web interface gives:
“Spawn failed: sbatch: error: Batch job submission failed: Socket timed out on send/recv operation”
On the cluster node where the job was trying to spawn, there are files like /tmp/jupyterhub-31717.error that contain:
I don't have permission to check authorization with JupyterHub, my auth token may have expired: [403] Forbidden
{"status": 403, "message": "Forbidden"}
Traceback (most recent call last):
  File "/mnt/local/python3.9/bin/batchspawner-singleuser", line 8, in <module>
    sys.exit(main())
  File "/mnt/local/python3.9/lib/python3.9/site-packages/batchspawner/singleuser.py", line 17, in main
    hub_auth._api_request(
  File "/mnt/local/python3.9/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 436, in _api_request
    raise HTTPError(
tornado.web.HTTPError: HTTP 500: Internal Server Error (Permission failure checking authorization, I may need a new token)
So far, the only similar issue I have found searching the forums is a reference to a netrc or .netrc file in the user's home directory causing the problem, but these files do not exist in the user's home directory, /etc, or any other obvious place.
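For anyone wanting to repeat the netrc check on a compute node, here is a minimal sketch of one way to do it. The function name find_netrc_files and the list of directories are hypothetical choices for illustration, not part of JupyterHub or Batchspawner. It also looks at the NETRC environment variable, since some HTTP clients (Python's requests, for example) honor it; whether that matters here is an assumption.

```python
import os
from pathlib import Path


def find_netrc_files(directories, environ=None):
    """Return netrc-style files found in the given directories.

    Also includes the file named by the NETRC environment variable,
    if that variable is set and points at an existing file.
    """
    environ = os.environ if environ is None else environ
    found = []
    for directory in directories:
        # Common spellings: .netrc (Unix), netrc, and _netrc (Windows)
        for name in (".netrc", "netrc", "_netrc"):
            candidate = Path(directory) / name
            if candidate.is_file():
                found.append(str(candidate))
    netrc_env = environ.get("NETRC")
    if netrc_env and Path(netrc_env).is_file() and netrc_env not in found:
        found.append(netrc_env)
    return found


if __name__ == "__main__":
    # Assumed search locations; adjust for the site in question.
    for path in find_netrc_files([Path.home(), "/etc"]):
        print(path)
```

Running it as the affected user on the node where the job lands prints any matching paths; no output means nothing was found in those locations.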
Does anyone have any thoughts on what might be causing this, or on how to troubleshoot it?
Thanks,
-Dj