A bug in z2jh 3.0 (the JupyterHub Helm chart) using KubeSpawner 6.0 has resulted in running users being disrupted whenever the
hub pod starts up, for example when upgrading to z2jh 3.0 or when re-configuring the chart causes the pod to restart.
This bug was introduced in the 3.0.0 release, and is patched in the 3.1.0 release. The bug was also part of the
3.0.0-alpha.1 pre-release, and development version releases
3.0.0-0.dev.git.6133.hbfc583f8 and later.
This kind of disruption, as seen by a JupyterHub user, is being redirected from
/user/<username>, where they typically work, back to
/hub, where, depending on the JupyterHub’s configuration, they are either prompted to start a server again or a new server is started automatically.
If the hub pod restarts with a bug-affected version, you may see that the
/hub/admin panel reports user servers as stopped even though you see running user server pods in Kubernetes.
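To see which user server pods are actually running in Kubernetes, you can list them with kubectl. The label selector below assumes a default z2jh/KubeSpawner setup, where user server pods carry the component=singleuser-server label:

# list user server pods (assumes the default component=singleuser-server label)
kubectl get pods --namespace <namespace> --selector component=singleuser-server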
When user servers are inactive, they are typically stopped automatically by
jupyterhub-idle-culler, which is enabled by default in z2jh (
cull.enabled), but this won’t work in this case as JupyterHub already considers the servers stopped.
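For reference, the idle culler is configured via the chart’s cull.* values. A minimal sketch, assuming a regular helm-based installation (release name and values are placeholders):

# enable the idle culler and cull servers idle for an hour (cull.enabled is already true by default)
helm upgrade <release> jupyterhub/jupyterhub --reuse-values \
  --set cull.enabled=true \
  --set cull.timeout=3600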
If you have been using z2jh 3.0.0-alpha.1 to 3.0.3, you should check for orphaned user server pods that JupyterHub doesn’t consider running.
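If you are unsure which chart version you have deployed, helm can report it (assuming a regular helm-based installation):

# the CHART column reports the deployed chart version, e.g. jupyterhub-3.0.2
helm list --namespace <namespace>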
Using @minrk’s Python script
From a computer with Python and
kubectl configured with access to the Kubernetes cluster where a JupyterHub is installed, do the following:
- Download Min’s script from the kube_orphans.py gist on GitHub
- Visit https://your-jupyterhub.example.org/hub/token and request a token with a short-lived access duration
- Set the environment variable
JUPYTERHUB_API_TOKEN to the token from the previous step. Note that the permission needed is to read information about all users, which admin users have but non-admin users do not.
- Configure and verify access to the Kubernetes cluster where the JupyterHub is running
- Run the script, passing it the URL of your hub and the Kubernetes namespace via the --namespace flag
Practically, on a Mac or Linux computer, this can look like the following:
# 1. Download script
wget https://gist.githubusercontent.com/minrk/e15653520847746e643a6ca5e48d3949/raw/1698d0bd6949b16e0f99c84c305c84aa667c5e7f/kube_orphans.py

# 2. Request an API token from /hub/token

# 3. Set environment variable for use by script
export JUPYTERHUB_API_TOKEN=1234567890abcdef1234567890abcdef

# 4. Verify you can work against the k8s cluster and it seems to be the right namespace
kubectl get all --namespace <namespace>

# 5. Run the script
python kube_orphans.py --namespace <namespace> https://your-jupyterhub.example.org
The script should now have printed some information along with a
kubectl delete pod command listing all orphaned pods. Copy the command, add
--namespace <namespace> to it, and then run it to delete all the orphaned pods the script detected.
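As a hedged illustration, with hypothetical pod names, the final command could look like this:

# jupyter-anna and jupyter-bert are hypothetical pod names reported by the script
kubectl delete pod --namespace <namespace> jupyter-anna jupyter-bert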
I’ve adjusted Min’s script to run inside the
hub pod by using a JupyterHub chart config file, and to delete the detected orphaned servers on startup without asking. This can be useful if you manage several JupyterHubs with shared configuration files, for example.
# 1. Download JupyterHub chart config addition
wget https://gist.githubusercontent.com/consideRatio/7b5b8e65f0e90b3c56b5eff3a4038560/raw/fa9b314d78e85ea335847b3d38d698afa1173366/cleanup-service.values.yaml

# 2. Verify that the chart config file is nested correctly. It's made to work
#    assuming the jupyterhub chart isn't a chart dependency. If you have a
#    helm chart that in turn depends on the jupyterhub chart, you would need
#    to nest the configuration further, for example.

# 3. Perform a chart upgrade referencing the chart config addition
helm upgrade <...> --values cleanup-service.values.yaml

# 4. Get the hub pod's logs
kubectl logs deploy/hub

# 5. Look for log lines like these
#    INFO:/tmp/cleanup-orphaned-pods.py:Found 1 active user servers according to JupyterHub
#    INFO:/tmp/cleanup-orphaned-pods.py:Found 1 active user server pods according to Kubernetes
#    INFO:/tmp/cleanup-orphaned-pods.py:0 user server pods are orphaned
#    INFO:/tmp/cleanup-orphaned-pods.py:Cleanup of orphaned pods complete.

# 6. Perform a chart upgrade without the cleanup service
helm upgrade <...>