Readiness probe fails on Docker Desktop Kubernetes deploy

I’m setting up a local environment for JupyterHub testing using the Kubernetes cluster bundled with Docker Desktop. Following the ZTJH (Zero to JupyterHub) instructions to set up the hub, I can’t spawn user pods; they simply fail to start.

Describing the pods reveals that each one is reported as “unhealthy” by its readiness probe. The full output is attached below.
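For reference, this output comes from running something like:

kubectl describe pods --namespace ztjh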

Name:             continuous-image-puller-4sxdg
Namespace:        ztjh
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Wed, 11 Jan 2023 11:53:39 -0600
Labels:           app=jupyterhub
                  component=continuous-image-puller
                  controller-revision-hash=8678c4b657
                  pod-template-generation=2
                  release=ztjh-release
Annotations:      <none>
Status:           Running
IP:               10.1.0.56
IPs:
  IP:           10.1.0.56
Controlled By:  DaemonSet/continuous-image-puller
Init Containers:
  image-pull-metadata-block:
    Container ID:  docker://379e12ddbee3ea36bb9077d98b1f9ae428fde6be446d3864a50ab1d0fb07d62f
    Image:         jupyterhub/k8s-network-tools:1.2.0
    Image ID:      docker-pullable://jupyterhub/k8s-network-tools@sha256:a6fa68b84748dcf01085016fd2475e84a38d4b5f0940d010c0ae3044e50ee28d
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      echo "Pulling complete"
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 11 Jan 2023 12:12:46 -0600
      Finished:     Wed, 11 Jan 2023 12:12:46 -0600
    Ready:          True
    Restart Count:  1
    Environment:    <none>
    Mounts:         <none>
  image-pull-singleuser:
    Container ID:  docker://72c4ae33f89eab1fbab37f34d13f94ed8ddebaa879ba3b8e186559fd2500b613
    Image:         ideonate/jh-voila-oauth-singleuser:0.6.3
    Image ID:      docker-pullable://ideonate/jh-voila-oauth-singleuser@sha256:7b597b31b7bfee2099aedd45f552cf7bd446f86afadd1d938c0d9a741e182a82
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      echo "Pulling complete"
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 11 Jan 2023 12:12:47 -0600
      Finished:     Wed, 11 Jan 2023 12:12:47 -0600
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Containers:
  pause:
    Container ID:   docker://8bcb56e0d0cea48ffdee1b99dbdfbc57389e3f0de7a50aa1080c43211f8936ad
    Image:          k8s.gcr.io/pause:3.5
    Image ID:       docker-pullable://k8s.gcr.io/pause@sha256:1ff6c18fbef2045af6b9c16bf034cc421a29027b800e4f9b68ae9b1cb3e9ae07
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 11 Jan 2023 12:12:48 -0600
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 11 Jan 2023 11:53:42 -0600
      Finished:     Wed, 11 Jan 2023 12:12:36 -0600
    Ready:          True
    Restart Count:  1
    Environment:    <none>
    Mounts:         <none>
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:            <none>
QoS Class:          BestEffort
Node-Selectors:     <none>
Tolerations:        hub.jupyter.org/dedicated=user:NoSchedule
                    hub.jupyter.org_dedicated=user:NoSchedule
                    node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                    node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                    node.kubernetes.io/not-ready:NoExecute op=Exists
                    node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                    node.kubernetes.io/unreachable:NoExecute op=Exists
                    node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       48m   default-scheduler  Successfully assigned ztjh/continuous-image-puller-4sxdg to docker-desktop
  Normal  Pulled          48m   kubelet            Container image "jupyterhub/k8s-network-tools:1.2.0" already present on machine
  Normal  Created         48m   kubelet            Created container image-pull-metadata-block
  Normal  Started         48m   kubelet            Started container image-pull-metadata-block
  Normal  Pulled          48m   kubelet            Container image "ideonate/jh-voila-oauth-singleuser:0.6.3" already present on machine
  Normal  Created         48m   kubelet            Created container image-pull-singleuser
  Normal  Started         48m   kubelet            Started container image-pull-singleuser
  Normal  Pulled          48m   kubelet            Container image "k8s.gcr.io/pause:3.5" already present on machine
  Normal  Created         48m   kubelet            Created container pause
  Normal  Started         48m   kubelet            Started container pause
  Normal  SandboxChanged  29m   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          29m   kubelet            Container image "jupyterhub/k8s-network-tools:1.2.0" already present on machine
  Normal  Created         29m   kubelet            Created container image-pull-metadata-block
  Normal  Started         29m   kubelet            Started container image-pull-metadata-block
  Normal  Pulled          29m   kubelet            Container image "ideonate/jh-voila-oauth-singleuser:0.6.3" already present on machine
  Normal  Created         29m   kubelet            Created container image-pull-singleuser
  Normal  Started         29m   kubelet            Started container image-pull-singleuser
  Normal  Pulled          29m   kubelet            Container image "k8s.gcr.io/pause:3.5" already present on machine
  Normal  Created         29m   kubelet            Created container pause
  Normal  Started         29m   kubelet            Started container pause


Name:             hub-77f44fdb46-pq4p6
Namespace:        ztjh
Priority:         0
Service Account:  hub
Node:             docker-desktop/192.168.65.4
Start Time:       Wed, 11 Jan 2023 12:40:53 -0600
Labels:           app=jupyterhub
                  component=hub
                  hub.jupyter.org/network-access-proxy-api=true
                  hub.jupyter.org/network-access-proxy-http=true
                  hub.jupyter.org/network-access-singleuser=true
                  pod-template-hash=77f44fdb46
                  release=ztjh-release
Annotations:      checksum/config-map: 15f5d181f0a18c2112e9ed274e9ec724e5e0d1b235f3867110bd81ec4410e485
                  checksum/secret: ec5664f5abafafcf6d981279ace62a764bd66a758c9ffe71850f6c56abec5c12
Status:           Running
IP:               10.1.0.66
IPs:
  IP:           10.1.0.66
Controlled By:  ReplicaSet/hub-77f44fdb46
Containers:
  hub:
    Container ID:  docker://cb78ca68caec3677dcbaeb63d76762b38dd86b458444987af462d84d511e0ce6
    Image:         ideonate/cdsdashboards-jupyter-k8s-hub:1.2.0-0.6.3
    Image ID:      docker-pullable://ideonate/cdsdashboards-jupyter-k8s-hub@sha256:5180c032d13bf33abc762c807199a9622546396f9dd8b134224e83686efb9d75
    Port:          8081/TCP
    Host Port:     0/TCP
    Args:
      jupyterhub
      --config
      /usr/local/etc/jupyterhub/jupyterhub_config.py
      --upgrade-db
    State:          Running
      Started:      Wed, 11 Jan 2023 12:40:54 -0600
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/hub/health delay=300s timeout=3s period=10s #success=1 #failure=30
    Readiness:      http-get http://:http/hub/health delay=0s timeout=1s period=2s #success=1 #failure=1000
    Environment:
      PYTHONUNBUFFERED:        1
      HELM_RELEASE_NAME:       ztjh-release
      POD_NAMESPACE:           ztjh (v1:metadata.namespace)
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'hub.config.ConfigurableHTTPProxy.auth_token' in secret 'hub'>  Optional: false
    Mounts:
      /srv/jupyterhub from pvc (rw)
      /usr/local/etc/jupyterhub/config/ from config (rw)
      /usr/local/etc/jupyterhub/jupyterhub_config.py from config (rw,path="jupyterhub_config.py")
      /usr/local/etc/jupyterhub/secret/ from secret (rw)
      /usr/local/etc/jupyterhub/z2jh.py from config (rw,path="z2jh.py")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jkmtw (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub
    Optional:  false
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub
    Optional:    false
  pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hub-db-dir
    ReadOnly:   false
  kube-api-access-jkmtw:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 hub.jupyter.org/dedicated=core:NoSchedule
                             hub.jupyter.org_dedicated=core:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  60s                default-scheduler  Successfully assigned ztjh/hub-77f44fdb46-pq4p6 to docker-desktop
  Normal   Pulled     59s                kubelet            Container image "ideonate/cdsdashboards-jupyter-k8s-hub:1.2.0-0.6.3" already present on machine
  Normal   Created    59s                kubelet            Created container hub
  Normal   Started    59s                kubelet            Started container hub
  Warning  Unhealthy  53s (x6 over 58s)  kubelet            Readiness probe failed: Get "http://10.1.0.66:8081/hub/health": dial tcp 10.1.0.66:8081: connect: connection refused


Name:             proxy-76f45cc855-mjjm9
Namespace:        ztjh
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Wed, 11 Jan 2023 11:45:51 -0600
Labels:           app=jupyterhub
                  component=proxy
                  hub.jupyter.org/network-access-hub=true
                  hub.jupyter.org/network-access-singleuser=true
                  pod-template-hash=76f45cc855
                  release=ztjh-release
Annotations:      checksum/auth-token: 0cf7
                  checksum/proxy-secret: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
Status:           Running
IP:               10.1.0.58
IPs:
  IP:           10.1.0.58
Controlled By:  ReplicaSet/proxy-76f45cc855
Containers:
  chp:
    Container ID:  docker://1ba79bf81875dbdf20c4be21d9b851fd27830f9c96dada96c22e346f467244dc
    Image:         jupyterhub/configurable-http-proxy:4.5.0
    Image ID:      docker-pullable://jupyterhub/configurable-http-proxy@sha256:8ced0a2f8073bd14e9d9609089c8144e95473c0d230a14ef49956500ac8d24ac
    Ports:         8000/TCP, 8001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      configurable-http-proxy
      --ip=
      --api-ip=
      --api-port=8001
      --default-target=http://hub:$(HUB_SERVICE_PORT)
      --error-target=http://hub:$(HUB_SERVICE_PORT)/hub/error
      --port=8000
    State:          Running
      Started:      Wed, 11 Jan 2023 12:12:48 -0600
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 11 Jan 2023 11:45:53 -0600
      Finished:     Wed, 11 Jan 2023 12:12:36 -0600
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:http/_chp_healthz delay=60s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/_chp_healthz delay=0s timeout=1s period=2s #success=1 #failure=3
    Environment:
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'hub.config.ConfigurableHTTPProxy.auth_token' in secret 'hub'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-86pzt (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-86pzt:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 hub.jupyter.org/dedicated=core:NoSchedule
                             hub.jupyter.org_dedicated=core:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age   From               Message
  ----     ------          ----  ----               -------
  Normal   Scheduled       56m   default-scheduler  Successfully assigned ztjh/proxy-76f45cc855-mjjm9 to docker-desktop
  Normal   Pulled          56m   kubelet            Container image "jupyterhub/configurable-http-proxy:4.5.0" already present on machine
  Normal   Created         56m   kubelet            Created container chp
  Normal   Started         56m   kubelet            Started container chp
  Warning  Unhealthy       56m   kubelet            Readiness probe failed: Get "http://10.1.0.50:8000/_chp_healthz": dial tcp 10.1.0.50:8000: connect: connection refused
  Normal   SandboxChanged  29m   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          29m   kubelet            Container image "jupyterhub/configurable-http-proxy:4.5.0" already present on machine
  Normal   Created         29m   kubelet            Created container chp
  Normal   Started         29m   kubelet            Started container chp
  Warning  Unhealthy       29m   kubelet            Readiness probe failed: Get "http://10.1.0.58:8000/_chp_healthz": dial tcp 10.1.0.58:8000: connect: connection refused


Name:             user-scheduler-6cdf89ff97-7pjbn
Namespace:        ztjh
Priority:         0
Service Account:  user-scheduler
Node:             docker-desktop/192.168.65.4
Start Time:       Wed, 11 Jan 2023 11:37:31 -0600
Labels:           app=jupyterhub
                  component=user-scheduler
                  pod-template-hash=6cdf89ff97
                  release=ztjh-release
Annotations:      checksum/config-map: fe036fd82f7529b63f739a2dac48c7dfbd443c8213b332f7a3f31d18f50925f9
Status:           Running
IP:               10.1.0.63
IPs:
  IP:           10.1.0.63
Controlled By:  ReplicaSet/user-scheduler-6cdf89ff97
Containers:
  kube-scheduler:
    Container ID:  docker://4e174c5022b4247661a6976988ab55c3a1f835cf7bcf23206c59ca23f1d561a1
    Image:         k8s.gcr.io/kube-scheduler:v1.19.13
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:1810844d782c996ca17cd8795e2605aae6c7cbc123f7933fbc273bc6643d12be
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/local/bin/kube-scheduler
      --config=/etc/user-scheduler/config.yaml
      --authentication-skip-lookup=true
      --v=4
    State:          Running
      Started:      Wed, 11 Jan 2023 12:12:49 -0600
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 11 Jan 2023 11:37:33 -0600
      Finished:     Wed, 11 Jan 2023 12:12:36 -0600
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:10251/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10251/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/user-scheduler from config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dcw5n (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      user-scheduler
    Optional:  false
  kube-api-access-dcw5n:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 hub.jupyter.org/dedicated=core:NoSchedule
                             hub.jupyter.org_dedicated=core:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       64m   default-scheduler  Successfully assigned ztjh/user-scheduler-6cdf89ff97-7pjbn to docker-desktop
  Normal  Pulled          64m   kubelet            Container image "k8s.gcr.io/kube-scheduler:v1.19.13" already present on machine
  Normal  Created         64m   kubelet            Created container kube-scheduler
  Normal  Started         64m   kubelet            Started container kube-scheduler
  Normal  SandboxChanged  29m   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          29m   kubelet            Container image "k8s.gcr.io/kube-scheduler:v1.19.13" already present on machine
  Normal  Created         29m   kubelet            Created container kube-scheduler
  Normal  Started         29m   kubelet            Started container kube-scheduler


Name:             user-scheduler-6cdf89ff97-qcf8s
Namespace:        ztjh
Priority:         0
Service Account:  user-scheduler
Node:             docker-desktop/192.168.65.4
Start Time:       Wed, 11 Jan 2023 11:37:31 -0600
Labels:           app=jupyterhub
                  component=user-scheduler
                  pod-template-hash=6cdf89ff97
                  release=ztjh-release
Annotations:      checksum/config-map: fe036fd82f7529b63f739a2dac48c7dfbd443c8213b332f7a3f31d18f50925f9
Status:           Running
IP:               10.1.0.64
IPs:
  IP:           10.1.0.64
Controlled By:  ReplicaSet/user-scheduler-6cdf89ff97
Containers:
  kube-scheduler:
    Container ID:  docker://b99b5ce6f841b5a65160a01b8a8ee594ddc80cbbb9cce5c9d2059cb44b704e85
    Image:         k8s.gcr.io/kube-scheduler:v1.19.13
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:1810844d782c996ca17cd8795e2605aae6c7cbc123f7933fbc273bc6643d12be
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/local/bin/kube-scheduler
      --config=/etc/user-scheduler/config.yaml
      --authentication-skip-lookup=true
      --v=4
    State:          Running
      Started:      Wed, 11 Jan 2023 12:12:49 -0600
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 11 Jan 2023 11:37:32 -0600
      Finished:     Wed, 11 Jan 2023 12:12:36 -0600
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:10251/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10251/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/user-scheduler from config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xg7xv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      user-scheduler
    Optional:  false
  kube-api-access-xg7xv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 hub.jupyter.org/dedicated=core:NoSchedule
                             hub.jupyter.org_dedicated=core:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age   From               Message
  ----     ------          ----  ----               -------
  Normal   Scheduled       64m   default-scheduler  Successfully assigned ztjh/user-scheduler-6cdf89ff97-qcf8s to docker-desktop
  Normal   Pulled          64m   kubelet            Container image "k8s.gcr.io/kube-scheduler:v1.19.13" already present on machine
  Normal   Created         64m   kubelet            Created container kube-scheduler
  Normal   Started         64m   kubelet            Started container kube-scheduler
  Warning  Unhealthy       64m   kubelet            Readiness probe failed: Get "http://10.1.0.46:10251/healthz": dial tcp 10.1.0.46:10251: connect: connection refused
  Normal   SandboxChanged  29m   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          29m   kubelet            Container image "k8s.gcr.io/kube-scheduler:v1.19.13" already present on machine
  Normal   Created         29m   kubelet            Created container kube-scheduler
  Normal   Started         29m   kubelet            Started container kube-scheduler

When attempting to spawn a server for a user (admin), the logs read: Defaulted container "notebook" out of: notebook, block-cloud-metadata (init).
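That "Defaulted container" line is just kubectl noting that it picked the notebook container automatically because none was specified. To see what the notebook container itself logged, you can target it explicitly; assuming the usual jupyter-<username> pod naming, that would be something like:

kubectl logs --namespace ztjh jupyter-admin -c notebook  # pod name assumed from default naming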

This is a pretty bare-bones setup with a config.yaml as follows:

hub:
  config:
    Authenticator:
      admin_users:
        - admin
  image:
    name: ideonate/cdsdashboards-jupyter-k8s-hub
    tag: 1.2.0-0.6.3

singleuser:
  startTimeout: 60
  image:
    name: ideonate/jh-voila-oauth-singleuser
    tag: 0.6.3
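
For completeness, I’m installing the chart the way the ZTJH guide describes, roughly:

helm upgrade --cleanup-on-fail \
  --install ztjh-release jupyterhub/jupyterhub \
  --namespace ztjh \
  --create-namespace \
  --values config.yaml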

It seems that the connections between proxy and hub are being refused. Is this an issue with port setup?

I’m actually seeing the same thing. Disabling the userScheduler got me past it, though I’m not sure why:

scheduling:
  userScheduler:
    enabled: false
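
(That goes in the same config.yaml, then gets applied with another helm upgrade ... --values config.yaml.)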

Thanks, @mirestrepo! I can verify that disabling the user scheduler allows me to spawn servers. I am likewise stumped as to why that would be the case, though…

Looking at the logs of the user-scheduler pod, I’m seeing this error:
k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.PodDisruptionBudget: failed to list *v1beta1.PodDisruptionBudget: the server could not find the requested resource
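
(Pulled with something like kubectl logs --namespace ztjh deploy/user-scheduler, for anyone checking their own cluster.)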

Okay… I’m using helm chart 1.2.0, and I’m not sure if this is fixed in 2.0, but this likely comes from running a newer k8s version against a deprecated API: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+ and unavailable in v1.25+; use policy/v1 PodDisruptionBudget instead.
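You can check what your API server actually serves with:

kubectl api-versions | grep policy

On 1.25+ that returns only policy/v1, which lines up with the failed watch on *v1beta1.PodDisruptionBudget above.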

In case others run into this: I was able to re-enable the user-scheduler by pinning my k8s version to 1.24 (1.24.8-gke.2000).
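On GKE that pinning can be done at cluster creation; something like this (cluster name is a placeholder):

gcloud container clusters create my-cluster --cluster-version 1.24.8-gke.2000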

@mirestrepo Are you still using k8s version 1.24 (locally and in your cluster)?

Or have you come up with a different approach to resolving this issue?

I’ve been following the ZTJH docs and trying to set up a test cluster on DigitalOcean, but even the simplest test case still hangs during the spawn and never completes.