Threads on Jupyter Notebook

Hi, do you know if there is a limit on the number of processes/threads that can be run on Jupyter? I have a stress-test script that uses a process/thread pool to simulate N processes/threads, and when I tried to simulate 1000 threads the performance of Jupyter Notebook was affected. For example, in the Jupyter log I am getting errors like these:

OpenBLAS blas_thread_init: pthread_create failed for thread 1 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
OpenBLAS blas_thread_init: pthread_create failed for thread 2 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
OpenBLAS blas_thread_init: pthread_create failed for thread 3 of 4: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4194304 current, 4194304 max
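For context, a minimal sketch of this kind of thread-pool stress test (illustrative only, not the exact script; the worker counts here are kept small, whereas the real run used around 1000 threads):

```python
import concurrent.futures
import time


def busy_task(task_id: int) -> int:
    """Simulated unit of work; replace with the real workload."""
    time.sleep(0.01)
    return task_id


def run_stress(n_workers: int, n_tasks: int) -> int:
    """Submit n_tasks to a pool of n_workers threads and count completions."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(busy_task, range(n_tasks)))
    return len(results)


if __name__ == "__main__":
    # The original report scaled n_workers up to ~1000, at which point
    # thread creation started failing with EAGAIN ("Resource temporarily
    # unavailable").
    print(run_stress(n_workers=50, n_tasks=200))
```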

Also, when I try to list the processes/threads currently running by executing the command ps -AL | wc -l in a Jupyter terminal, I get this message:

bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable

Do you know if there is a limit on the number of threads/processes which Jupyter Notebook can handle? If so, is there a way to adjust it in order to run more threads?

Thanks for your help on this topic.

The layer in question is likely the IPython kernel, ipykernel, rather than the Jupyter server (either notebook or jupyter_server), together with the underlying operating system. That RLIMIT_NPROC is a good indicator that it is more on the OS side. Have a look at how you might adjust that limit prior to starting your kernel (or, more coarsely, before starting the server), as it's unlikely to be something an unprivileged process/user will be able to change. And no, this is not a recommendation to start your kernel/server as root.
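One way to inspect (and, within the hard limit, raise) that limit from Python before launching the kernel is the standard-library resource module; a Linux-only sketch (note that in your log the soft and hard limits are already equal, so this alone would not help there):

```python
import resource

# RLIMIT_NPROC counts both processes and threads created by the user
# (on Linux a thread is a schedulable task like any other).
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(f"RLIMIT_NPROC soft={soft} hard={hard}")

# An unprivileged process may raise its soft limit up to the hard limit,
# but not beyond it; exceeding the hard limit requires privileges.
try:
    resource.setrlimit(resource.RLIMIT_NPROC, (hard, hard))
except (ValueError, OSError) as exc:
    print(f"could not raise limit: {exc}")
```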

Another thing to investigate: running the server and a single kernel will add some overhead, probably on the order of fewer than 5 processes and 2-3 times that many threads, but you seem to be dealing with a couple of orders of magnitude more than that.
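To measure that overhead concretely, a quick Linux-only check of a process's thread count via /proc (run it in a notebook cell with the current PID, or pass the server's PID):

```python
import os


def thread_count(pid: int) -> int:
    """Number of threads of a process, via /proc/<pid>/task (Linux only)."""
    try:
        return len(os.listdir(f"/proc/{pid}/task"))
    except FileNotFoundError:
        # Process has exited (or /proc is unavailable).
        return 0


# Threads in the current process, e.g. the kernel's own baseline
# before the stress test starts.
print(thread_count(os.getpid()))
```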

You could try extracting the relevant code out of the .ipynb (or use importnb) so that you bypass all the Jupyter machinery.


Thanks, I reviewed this with the cluster admins, and there is a pod pids limit on OpenShift that controls the number of processes/threads at the pod level. This parameter can be reconfigured to increase the number of PIDs allowed; please refer to "Unable to create more than 1024 Threads in the container" on the Red Hat Customer Portal for more information. Thanks for your insight!
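For anyone hitting the same wall: on OpenShift the pod pids limit is raised through a KubeletConfig resource. A sketch of what the admins would apply (the resource name, label selector, and value here are illustrative; check your cluster's documentation for the exact settings):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-pids-limit          # illustrative name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: set-pids-limit   # must match a label on the target MachineConfigPool
  kubeletConfig:
    podPidsLimit: 4096          # maximum processes/threads per pod
```

Applying this triggers a rolling update of the affected nodes, so coordinate it with the cluster admins.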