JupyterHub Pod Dies on a Regular Basis

grayrigel · June 14, 2023, 1:53pm

Hi everyone,
we have a JupyterHub running on Kubernetes, and the pods seem to die 3-4 times a day on a regular basis. It is not related to inactivity, as the pod often dies when I am in the middle of something. Please note that I don’t use JupyterHub directly; I just connect to the JupyterHub pod using VSCode. One cause I suspected is that JupyterHub sees no activity when I connect to the pod directly from VSCode. Therefore, I ran an infinite loop in a notebook in JupyterHub. That seems to make things slightly better but doesn’t resolve the problem fully. Also, I would like to understand whether this is expected behavior.

Any help would be appreciated.

manics · June 14, 2023, 3:35pm

Have you tried enabling debug logging and checking the logs for the hub and singleuser-server in the lead-up to the pod terminating? What version of Z2JH are you using, and can you show us your config? Do you have any monitoring of your K8s cluster to show resource usage?

grayrigel · June 14, 2023, 4:32pm

Hi,
Thanks. I haven’t tried that, as I am not the maintainer of the hub, but I can suggest it to the maintainers. I’m not sure how to check the version of Z2JH, but I have jupyterhub==2.3.1. I will get back to you with the config after discussing it with the maintainers. JupyterHub is deployed on a Google Kubernetes cluster, so we have some monitoring of resource usage.

But off the top of your head, what could lead to such behavior? Is it excess memory usage? I read about the culling mechanism here (Pod containing spawned server dies regularly · Issue #1430 · jupyterhub/zero-to-jupyterhub-k8s · GitHub). I’m not sure that’s the issue, but we have tried increasing the timeout to 3-4 hours.
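
One way to test the memory hypothesis is to look at the termination reason Kubernetes records for a container. The sketch below is a minimal example, not a definitive diagnostic: it reads a pod object in the JSON form produced by `kubectl get pod <pod-name> -o json` and lists any terminated container states. The `sample` pod and the container name `notebook` are made up for illustration. A reason of `OOMKilled` would point to the memory limit; if the pod has already been deleted, this information is gone and the cluster monitoring or events would have to be used instead.

```python
import json


def termination_reasons(pod):
    """Return (container name, reason, exit code) for each terminated
    container state found in a parsed Kubernetes pod object."""
    found = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # "state" is the current state; "lastState" is the state before
        # the most recent restart of that container.
        for key in ("state", "lastState"):
            terminated = (status.get(key) or {}).get("terminated")
            if terminated:
                found.append(
                    (status["name"], terminated.get("reason"), terminated.get("exitCode"))
                )
    return found


# Hypothetical, heavily trimmed pod object (illustration only).
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "notebook",
        "state": {"running": {"startedAt": "2023-07-13T10:45:00Z"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

for name, reason, exit_code in termination_reasons(sample):
    print(f"{name}: {reason} (exit code {exit_code})")
```

To use it on a real pod, save the `kubectl` output to a file and pass `json.load(open(path))` to `termination_reasons`.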

manics · June 19, 2023, 11:36am

It could be a lot of things! Do you see the same problem when using JupyterLab/notebook directly in a browser, without VSCode?

grayrigel · July 13, 2023, 11:55am

@manics Sorry for the late reply. Here is the error from the pod’s logs:

[W 2023-07-13 10:43:58.900 SingleUserNotebookApp zmqhandlers:227] WebSocket ping timeout after 119991 ms.

consideRatio · July 25, 2023, 6:06am

There are many pods. Is the issue that the pod named “hub-…” dies, or that a user pod named “jupyter-…” dies?

If it’s a user pod, then I suspect the jupyterhub-idle-culler could be involved; you would see that in the logs of the “hub-…” pod.

Logs from the “hub-…” pod from when the issue shows up would be relevant.

You can disable the jupyterhub-idle-culler running in the hub pod by setting the chart option “cull.enabled” to false; see Configuration Reference — Zero to JupyterHub with Kubernetes documentation
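
To make the culling hypothesis concrete, here is a simplified sketch of the kind of check an idle culler performs: compare the last activity the hub has recorded for a server against a timeout. It is a model of the idea, not the actual jupyterhub-idle-culler code. The function names and timestamps are made up for illustration. If work done through a direct VSCode connection is not recorded as activity, `last_activity` would stay old even while someone is working, and a check like this would still report the server as idle.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp ending in "Z" into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def would_be_culled(last_activity, now, timeout_seconds):
    """Simplified idle check: True when the time since the last recorded
    activity has reached the timeout."""
    inactive = now - parse_timestamp(last_activity)
    return inactive.total_seconds() >= timeout_seconds


# Hypothetical values: activity last recorded at 07:00 UTC, checked at 10:43:58 UTC.
now = datetime(2023, 7, 13, 10, 43, 58, tzinfo=timezone.utc)
last_activity = "2023-07-13T07:00:00.000000Z"

print(would_be_culled(last_activity, now, timeout_seconds=3 * 3600))
print(would_be_culled(last_activity, now, timeout_seconds=4 * 3600))
```

With these made-up numbers, the server has been inactive for a little under 3 hours 44 minutes, so the outcome depends on whether the timeout is set to 3 or 4 hours.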
