Deployment Strategy RollingUpdate to avoid "Service Unavailable"

Paul2708 · March 7, 2025, 1:03pm

Hey there,

We noticed that running a Helm upgrade (which only affects the hub, not the proxy) causes a short downtime because the hub uses the Recreate deployment strategy.
During this time period, the hub is not available and the proxy responds with “Service unavailable”.
According to the docs, Recreate is preferred over RollingUpdate because:

JupyterHub does not support running in parallel, due to this we default to using a deployment strategy of Recreate.

If I understand the RollingUpdate deployment strategy correctly, it keeps the old hub pod running (i.e., all traffic is still forwarded to it) until the new pod has been created and is ready (i.e., its readiness probe has succeeded). Thus, the two pods would not be running at the same time.
Or could both pods be running in parallel for less than a second?
If that’s not the case, I don’t see why Recreate is preferred over RollingUpdate, which does not cause any downtime.

Thus, can I safely use RollingUpdate?
We haven’t noticed any issues so far.

Best regards
Paul

mahendrapaipuri · March 7, 2025, 1:15pm

I am no expert on the JupyterHub Helm chart, but running two hubs at the same time can be problematic because of the DB. Two hubs talking to the same DB can lead to inconsistencies, especially when JupyterHub applies DB migrations. There might be more issues, but this is the first one that came to mind!

manics · March 7, 2025, 4:35pm

Yes, the jupyterhub process assumes it is the only one modifying the database and proxy state. If two hubs are running at once, JupyterHub’s internal state can get out of sync.
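
To make the overlap concrete, here is a minimal Python sketch of how Kubernetes resolves the RollingUpdate limits for a Deployment. It assumes the documented defaults of maxSurge and maxUnavailable (25% each, with maxSurge rounded up and maxUnavailable rounded down); the function names are made up for illustration and are not part of any Kubernetes or JupyterHub API. It only computes the bounds and does not model pods that are still terminating.

```python
import math


def resolve(value, replicas, round_up):
    """Resolve an absolute count or a percentage string such as '25%'."""
    if isinstance(value, str) and value.endswith("%"):
        fraction = int(value[:-1]) / 100 * replicas
        # Kubernetes rounds maxSurge up and maxUnavailable down.
        return math.ceil(fraction) if round_up else math.floor(fraction)
    return int(value)


def rollout_bounds(replicas=1, max_surge="25%", max_unavailable="25%"):
    """Return the pod-count bounds a RollingUpdate is allowed to reach."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return {
        # Upper bound on old + new pods existing together.
        "max_pods_during_rollout": replicas + surge,
        # Lower bound on available pods while the rollout runs.
        "min_available_during_rollout": replicas - unavailable,
    }


# A single-replica hub with default RollingUpdate settings.
print(rollout_bounds())
# Forcing "no extra pod" trades the overlap for possible downtime.
print(rollout_bounds(max_surge=0, max_unavailable=1))
```

With one replica and the defaults, up to two pods may exist together and at least one must stay available, so the new hub is started while the old one is still serving. The overlap therefore lasts for the whole startup time of the new hub, not for a fraction of a second.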

See

Paul2708 · March 17, 2025, 3:33pm

If I’m correct, the message “Service Unavailable” is shown by the CHP because the error target, which by default is the hub itself, is obviously not reachable.

According to the docs, changing the error target is helpful to show more informative error messages.
Apart from the hub being completely unavailable, what other “error scenarios” should be considered when providing a custom error message? (For example, is a 404 an “error” handled by the error target?)
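
For reference, a minimal sketch of a standalone error target in Python. It assumes the behaviour described in the configurable-http-proxy README: on a proxy error (the README names 404 and 503), CHP sends a GET request to the error target with the status code appended to the path and the failing URL in a `url` query parameter. The messages, the function name `build_error_page`, and the port are placeholders chosen for illustration, not anything defined by CHP or JupyterHub.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# Placeholder texts; adapt them to your deployment.
MESSAGES = {
    404: "No route is registered for this address.",
    503: "The service behind this address is not responding. It may be restarting.",
}


def build_error_page(request_path):
    """Map a path like '/503?url=%2Fhub%2Fhome' to (status, html_body)."""
    parts = urlsplit(request_path)
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]
    status = int(last_segment) if last_segment.isdigit() else 500
    if not 400 <= status <= 599:
        status = 500  # fall back for anything that is not an error code
    failing_url = parse_qs(parts.query).get("url", ["unknown"])[0]
    message = MESSAGES.get(status, "The proxy reported an error.")
    body = (
        f"<h1>{status}</h1><p>{message}</p>"
        f"<p>Requested: {html.escape(failing_url)}</p>"
    )
    return status, body


class ErrorPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_error_page(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Run the error target; call this from your own entry point."""
    HTTPServer(("", port), ErrorPageHandler).serve_forever()
```

Under that assumption, a request for an unrouted path would arrive here as `/404?url=...`, and a request to an unreachable backend (such as the hub during a Recreate rollout) as `/503?url=...`, so each case can get its own message.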
