Make user pod survive resource limits enforcement?

Is it possible to make the pod survive and only kill the abusive kernel in Kubernetes? If a user hits the memory limit, the whole singleuser-server dies and needs to be restarted. It would be preferable to kill only the abusive kernel/process so that the user interface survives. I submitted a patch to systemdspawner to achieve this, but I’m not familiar enough with containers/Kubernetes to see an obvious fix. It would be nice to keep this behaviour as we move from a VM-based cluster to k8s.

I’m not aware of anything in Kubernetes that can kill a single process instead of the container.

One option could be to write a jupyter-server extension that monitors the kernels and kills the abusive one, or a custom kernel that monitors and kills itself, before the container hits its resource limits.
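As a rough illustration of the watchdog idea, here is a minimal Python sketch, not an existing jupyter-server API. The function names (`rss_bytes`, `pick_victim`, `enforce`) and the 90% threshold are made up for this example, and it assumes a Linux container where the kernel PIDs are already known.

```python
import os
import signal
from typing import Optional


def rss_bytes(pid: int) -> int:
    """Resident set size of a process in bytes, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        # The second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def pick_victim(usage: dict[int, int], limit_bytes: int,
                threshold: float = 0.9) -> Optional[int]:
    """Return the PID using the most memory once the combined usage
    reaches threshold * limit_bytes, otherwise None."""
    if not usage or sum(usage.values()) < threshold * limit_bytes:
        return None
    return max(usage, key=usage.get)


def enforce(kernel_pids: list[int], limit_bytes: int) -> Optional[int]:
    """Kill the largest kernel process if the limit is close (hypothetical helper)."""
    usage = {}
    for pid in kernel_pids:
        try:
            usage[pid] = rss_bytes(pid)
        except (FileNotFoundError, ProcessLookupError):
            continue  # the process exited in the meantime
    victim = pick_victim(usage, limit_bytes)
    if victim is not None:
        os.kill(victim, signal.SIGKILL)
    return victim
```

Something would have to call `enforce` periodically, and polling can lose the race against a fast allocation. The summed kernel RSS also ignores the server's own memory and the page cache, both of which count towards the container limit, so the threshold would need some headroom.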

can show CPU/memory usage on a per-kernel basis, so this information is available to jupyter-server

Thanks, at least I’m not missing something obvious. Using the mentioned extension would give a user a warning and information on the reason for being kicked out.

If one can use cgroups within a container it should in principle be possible to implement this in the jupyter-server.
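Reading the cgroup accounting files from inside the container is usually possible even when creating nested cgroups is not. A small Python sketch, assuming the usual mount point `/sys/fs/cgroup` and the standard cgroup v2 (`memory.max`, `memory.current`) and v1 (`memory.limit_in_bytes`, `memory.usage_in_bytes`) file names; the helper names are made up for this example:

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumed mount point; the container normally sees its own cgroup here.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def _read_int(path: Path) -> Optional[int]:
    """Read an integer from a cgroup file; None if missing or 'max'."""
    try:
        text = path.read_text().strip()
    except OSError:
        return None
    return None if text == "max" else int(text)


def memory_limit_and_usage(
    root: Path = CGROUP_ROOT,
) -> Tuple[Optional[int], Optional[int]]:
    """Return (limit, usage) in bytes for the current cgroup."""
    # cgroup v2: a single unified hierarchy.
    if (root / "memory.max").exists():
        return _read_int(root / "memory.max"), _read_int(root / "memory.current")
    # cgroup v1: the memory controller has its own directory. An unlimited
    # v1 cgroup reports a very large number rather than "max".
    v1 = root / "memory"
    return (_read_int(v1 / "memory.limit_in_bytes"),
            _read_int(v1 / "memory.usage_in_bytes"))
```

This only covers reading the limit and the current usage. Putting each kernel into its own child cgroup would need a writable cgroup filesystem, which an unprivileged pod typically does not get.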
