WebSocket connection failure when using an `n_jobs` value greater than 1 with JupyterHub

Hello, I am experiencing an issue on JupyterHub: the WebSocket connection fails when I restart the kernel after running scikit-learn's GridSearchCV with the n_jobs parameter set to a value greater than 1. This parameter is passed to joblib, so I created a minimal example that reproduces the issue with joblib alone:

import joblib

def func_multi(i):
    return i, i**2, i**3

results = joblib.Parallel(n_jobs=2)(joblib.delayed(func_multi)(i) for i in range(5))

When I run this code and then choose “Restart Kernel and Run All Cells…”, the restart does not complete and the cell indicator keeps showing ‘*’. The JavaScript console in the browser developer tools displays the following error message:

WebSocket connection to 'wss://[Domain Name of My JupyterHub Server]/user/[User Name]/api/kernels/9fe899c4-9179-40a3-97f8-5a698d60e996/channels?session_id=d30418dd-8174-41d3-9c22-b02275cb795b' failed: 
_createSocket	@	default.js:73

This issue only occurs when n_jobs is set to a value greater than 1. If it is set to 1, the problem disappears. This issue only occurs on JupyterHub and not on JupyterLab on my local PC.

Here are some additional details:

  • The server OS is RHEL8.
  • I am using Google Chrome 114.0.5735.199 as the client browser.
  • I am using Apache/2.4.37 as a reverse proxy server.
  • I am using Python 3.9.
  • The versions of related packages are:
    • jupyterhub 3.0.0
    • jupyterhub-systemdspawner 0.17.0
    • jupyterlab 3.4.7
    • joblib 1.3.0
    • scikit-learn 1.3.0

Thank you.

I would like to apologize for posting a resolution so soon after my initial post. The issue has been resolved by updating the jupyter-server version from 1.19.0 to 2.7.0.

Upon investigating the server logs, I noticed the following error:

Uncaught exception GET /user/<User Name>/api/kernels/613dd4c1-189
HTTPServerRequest(protocol='http', host='<Server Domain Name>', method='GET', uri='/user/<User Name>/api/kernels
Traceback (most recent call last):
  File "/usr/local/lib64/python3.9/site-packages/tornado/web.py", line 1713, in _execute
    result = await result
  File "/usr/local/lib/python3.9/site-packages/jupyter_server/services/kernels/handlers.py", line 415, in get
    await super().get(kernel_id=kernel_id)
  File "/usr/local/lib/python3.9/site-packages/jupyter_server/base/zmqhandlers.py", line 341, in get
    await res
  File "/usr/local/lib/python3.9/site-packages/jupyter_server/services/kernels/handlers.py", line 384, in pre_get
    await self._register_session()
  File "/usr/local/lib/python3.9/site-packages/jupyter_server/services/kernels/handlers.py", line 428, in _register_sessi
    await stale_handler.close()
TypeError: object Future can't be used in 'await' expression
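
As an aside on what the final TypeError means in general: Python raises it when code awaits an object that does not implement the awaitable protocol. The sketch below is only an illustration under my own assumptions. It uses a plain concurrent.futures.Future as a stand-in, and I have not confirmed which kind of Future stale_handler.close() actually returned in jupyter-server 1.19.0. The function name await_plain_future is hypothetical. It should be run as a script, since asyncio.run() cannot be called from a notebook cell where an event loop is already running.

```python
import asyncio
from concurrent.futures import Future  # stand-in only; not confirmed to be what jupyter-server returned


async def await_plain_future():
    """Await a non-awaitable Future and return the exception it raises."""
    fut = Future()
    fut.set_result(None)
    try:
        # concurrent.futures.Future has no __await__, so this raises TypeError.
        await fut
    except TypeError as exc:
        return exc
    return None


error = asyncio.run(await_plain_future())
# Prints the exception type and message; the exact wording depends on the Python version.
print(type(error).__name__, error)
```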

This prompted me to check the version of my jupyter-server, and it appeared to be out-of-date (1.19.0). After upgrading jupyter-server to 2.7.0, the issue was resolved.
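
For anyone hitting the same error, here is a minimal sketch of how the installed versions can be checked from Python using only the standard library. The package list and the helper names (installed_version, report) are my own choices for illustration, not part of any Jupyter tooling. It should be run with the same Python interpreter that launches the single-user server.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages mentioned in this thread; adjust the list as needed.
PACKAGES = ["jupyter-server", "jupyterlab", "jupyterhub", "joblib", "scikit-learn", "tornado"]


def installed_version(name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def report(packages=PACKAGES):
    """Map each package name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, ver in report().items():
        print(f"{name}: {ver or 'not installed'}")
```

In my case this kind of check would have shown jupyter-server at 1.19.0 before the upgrade.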

Additionally, in my previous post, I mentioned that this problem did not occur on JupyterLab on my local PC. However, after installing the same versions that were on the server when the problem occurred (jupyter-server==1.19.0, jupyterlab==3.4.7) on my local PC, I was able to reproduce the issue there as well. The problem is therefore not specific to JupyterHub, and it was inappropriate to post this issue under the JupyterHub topic.

I apologize for any confusion caused by my incomplete initial investigation.
