Dear all,
I am managing a JupyterHub instance on top of a MicroK8s server on a supercomputer.
We have used multiple versions of JupyterHub, and even DaskHub, and the problem is always present.
Our current JupyterHub version is 1.2.0, since we are using daskhub-2022.6.0.
We have been noticing for a long time (roughly two years) a weird behavior in our CPU usage when doing multiprocessing.
As you can see in the GIF below, the first time we run the multiprocessing code, all 64 CPUs are used. But once the kernel is restarted, it only ever runs on 1 CPU until we go through a weird and long workflow:
shut down all kernels,
shut down the server,
log out,
log in,
start the server,
shut down all kernels,
start a kernel.
And even then, sometimes it works again and sometimes it does not. We have not been able to pinpoint which step is actually the important one.
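For reference, here is a minimal sketch of the kind of CPU-bound multiprocessing workload involved (our actual code is different; the `burn` function is a made-up stand-in that just keeps one core busy per worker):

```python
import multiprocessing
import os

def burn(n):
    # CPU-bound busy loop: each worker should saturate one core
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_pool():
    # On a healthy kernel this spreads across all available CPUs;
    # after a kernel restart everything stays on a single CPU.
    with multiprocessing.Pool(os.cpu_count()) as pool:
        return pool.map(burn, [200_000] * os.cpu_count())

if __name__ == "__main__":
    print(len(run_pool()))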
What we have already tested:
If we run a standalone Jupyter notebook on the same computer, there is no problem.
If I manually kill the user pod, the hub pod, or the proxy pod, the problem is still there.
There is no CPU affinity set anywhere; I double- and triple-checked at every level of the machine.
I am running out of ideas here and was hoping someone has already seen this problem.
Can you reproduce this problem if you run JupyterLab on k8s on its own, without JupyterHub? If you can that simplifies things, if you can’t then check the configuration of the pod when it’s launched by JupyterHub, and add those options to your manually created JupyterLab pod until you hopefully reproduce the problem.
> If we run a standalone Jupyter notebook on the same computer, there is no problem.
Is this outside microk8s, or still in a manual microk8s pod? If outside, is it still in a container?
Can you also verify whether the pool processes are shut down after you restart the kernel? I wonder if leftover processes from the pool, due to an unclean shutdown of the kernel the first time, could be related.
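To make that check concrete, here is a sketch using psutil (assuming it is available in the image) that lists the kernel's children and scans for orphaned workers; the name match on `multiprocessing`/`resource_tracker` is only a heuristic:

```python
import psutil

# Children of the current process (run this in a fresh cell after restart;
# a clean restart should show no leftover pool workers here)
me = psutil.Process()
for child in me.children(recursive=True):
    print(child.pid, child.name(), child.status())

# Scan for workers that were reparented after an unclean kernel shutdown
for p in psutil.process_iter(["pid", "cmdline"]):
    cmd = " ".join(p.info["cmdline"] or [])
    if "multiprocessing" in cmd or "resource_tracker" in cmd:
        print(p.info["pid"], cmd)
```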
One shot in the dark to try: before starting the pool, set the start method to spawn instead of fork:

```python
multiprocessing.set_start_method('spawn')
```
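For context, a minimal self-contained sketch of that suggestion (the `square` function is a made-up example). Note that `set_start_method` may only be called once per process, so in a notebook `multiprocessing.get_context("spawn")` can be more convenient; also, spawn workers re-import the main module, so functions defined interactively in notebook cells may not be importable by them:

```python
import multiprocessing

def square(x):
    return x * x

if __name__ == "__main__":
    # Must be called once, before any pool or process is created
    multiprocessing.set_start_method("spawn")
    with multiprocessing.Pool(4) as pool:
        print(pool.map(square, range(8)))
```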
> There is no CPU affinity set anywhere; I double- and triple-checked at every level of the machine.
Were you checking configuration, or inspecting the processes at runtime? Since this behavior looks so much like CPU affinity pinning (something like the forked subprocess modifying something somewhere that affects the parent when it shouldn't), checking at runtime would give me more confidence that it's truly not involved. You can do this with psutil or taskset: what do you get from `taskset --all-tasks -cp 1` and/or the Python code:

```python
import psutil

for p in psutil.process_iter():
    print(p.pid, p.cpu_affinity())
```
Both inside microk8s (with a manual pod) and outside microk8s, not in a container.
Thanks for your advice, I will try it.
In the meantime, I gave up on the lead of rebuilding a manual pod with the same config. On another note, I noticed that the base image of the server has a greater or lesser impact.
For example, with my private Docker image it happens on every restart, while with one of the official jupyterlab base images it happens less often (it still happens, though).
In the end, excluding CPUs 0-3 from the pool did the trick, but it is not very practical.
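For anyone hitting the same issue, one way to implement that workaround (a sketch assuming Linux, using `os.sched_setaffinity`; not necessarily exactly what we did) is to pin each pool worker to the reduced CPU set via an initializer:

```python
import multiprocessing
import os

def pin(worker_cpus):
    # Initializer run in each worker: restrict it to the given CPU set
    os.sched_setaffinity(0, worker_cpus)

def work(x):
    return x * x

def run():
    # Linux-only: exclude CPUs 0-3, use the rest;
    # fall back to the full set on small machines
    allowed = set(os.sched_getaffinity(0)) - {0, 1, 2, 3}
    if not allowed:
        allowed = set(os.sched_getaffinity(0))
    with multiprocessing.Pool(
        processes=len(allowed),
        initializer=pin,
        initargs=(allowed,),
    ) as pool:
        return pool.map(work, range(8))

if __name__ == "__main__":
    print(run())
```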
Thanks for the input!