Threads on Jupyter Notebook

Hi, do you know if there is a limit on the number of processes/threads that can be run on Jupyter? I have a stress-test script that uses a process/thread pool to simulate N processes/threads, and when I tried to simulate 1000 threads the performance of Jupyter Notebook was affected. For example, in the Jupyter log I am getting errors like these:

OpenBLAS blas_thread_init: pthread_create failed for thread 1 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
OpenBLAS blas_thread_init: pthread_create failed for thread 2 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
OpenBLAS blas_thread_init: pthread_create failed for thread 3 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
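For context, a minimal sketch of this kind of thread-pool stress test (illustrative only, not the exact script; the worker counts here are kept small, whereas the real run used around 1000 threads):

```python
import concurrent.futures
import time


def busy_task(task_id: int) -> int:
    """Simulated unit of work; replace with the real workload."""
    time.sleep(0.01)
    return task_id


def run_stress(n_workers: int, n_tasks: int) -> int:
    """Submit n_tasks to a pool of n_workers threads and count completions."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(busy_task, range(n_tasks)))
    return len(results)


if __name__ == "__main__":
    # The original report scaled n_workers up to ~1000, at which point
    # thread creation started failing with EAGAIN ("Resource temporarily
    # unavailable").
    print(run_stress(n_workers=50, n_tasks=200))
```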

Also, when I try to list the processes/threads currently running by executing the command ps -AL | wc -l in a Jupyter terminal, I get this message:

bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable

Do you know if there is a limit on the number of threads/processes which Jupyter Notebook can handle? If so, is there a way to adjust it in order to run more threads?

Thanks for your help on this topic.

The layer in question is likely the IPython kernel, ipykernel, rather than the Jupyter server (either notebook or jupyter_server), together with the underlying operating system. That RLIMIT_NPROC is a good indicator that it is more on the OS side. Have a look at how you might adjust that limit prior to starting your kernel (or, more coarsely, before starting the server), as it's unlikely to be something an unprivileged process/user will be able to change. And no, this is not a recommendation to start your kernel/server as root.
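One way to inspect (and, within the hard limit, raise) that limit from Python before launching the kernel is the standard-library resource module; a Linux-only sketch (note that in your log the soft and hard limits are already equal, so this alone would not help there):

```python
import resource

# RLIMIT_NPROC counts both processes and threads created by the user
# (on Linux a thread is a schedulable task like any other).
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(f"RLIMIT_NPROC soft={soft} hard={hard}")

# An unprivileged process may raise its soft limit up to the hard limit,
# but not beyond it; exceeding the hard limit requires privileges.
try:
    resource.setrlimit(resource.RLIMIT_NPROC, (hard, hard))
except (ValueError, OSError) as exc:
    print(f"could not raise limit: {exc}")
```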

Another thing to investigate: running the server and a single kernel will add some overhead, probably on the order of fewer than 5 processes and 2-3 times that many threads, but you seem to be dealing with a couple of orders of magnitude more than that.
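To measure that overhead concretely, a quick Linux-only check of a process's thread count via /proc (run it in a notebook cell with the current PID, or pass the server's PID):

```python
import os


def thread_count(pid: int) -> int:
    """Number of threads of a process, via /proc/<pid>/task (Linux only)."""
    try:
        return len(os.listdir(f"/proc/{pid}/task"))
    except FileNotFoundError:
        # Process has exited (or /proc is unavailable).
        return 0


# Threads in the current process, e.g. the kernel's own baseline
# before the stress test starts.
print(thread_count(os.getpid()))
```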

You could try extracting the relevant code out of the .ipynb (or use importnb) so that you bypass all the Jupyter machinery.


Thanks, I reviewed this with the cluster admins, and there is a pod pids limit on OpenShift that controls the number of processes/threads at the pod level. This parameter can be reconfigured to increase the number of PIDs allowed; please refer to "Unable to create more than 1024 Threads in the container" on the Red Hat Customer Portal for more information. Thanks for your insight!
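For anyone hitting the same wall: on OpenShift the pod pids limit is raised through a KubeletConfig resource. A sketch of what the admins would apply (the resource name, label selector, and value here are illustrative; check your cluster's documentation for the exact settings):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-pids-limit          # illustrative name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: set-pids-limit   # must match a label on the target MachineConfigPool
  kubeletConfig:
    podPidsLimit: 4096          # maximum processes/threads per pod
```

Applying this triggers a rolling update of the affected nodes, so coordinate it with the cluster admins.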