Installation runs without error messages, but hub pod remains "Pending"

Hi, I am trying to install JupyterHub (latest dev version) on a Dell PowerEdge R7525 running Ubuntu 20.04 LTS.
A first attempt with microk8s failed: the image pullers were spawned but failed after a few minutes.
In a second attempt, I reinstalled the OS, installed Kubernetes following https://www.linuxtechi.com/install-kubernetes-k8s-on-ubuntu-20-04/, and verified the test cases (which worked after untainting the master node). I then followed the Installing JupyterHub section of the Zero to JupyterHub with Kubernetes documentation (the config file contains only comments):

helm upgrade --cleanup-on-fail --install jhub jupyterhub/jupyterhub --namespace jhub --create-namespace --version=0.11.1-n349.he14d686f --values config_JupyterHub.yaml

However, the installation seems stuck after a minute or so (this time without any timeout messages):

admin-nb@eo-dell-r7525a:~$ kubectl --namespace=jhub get pod
NAME                              READY   STATUS    RESTARTS   AGE
continuous-image-puller-r9rqm     1/1     Running   0          53m
hub-6d7cb485dc-xsjrc              0/1     Pending   0          53m
proxy-5587d47656-k6t2g            1/1     Running   0          53m
user-scheduler-7f59fc6f47-ndbd4   1/1     Running   0          53m
user-scheduler-7f59fc6f47-rndp9   1/1     Running   0          53m

What should I do?

If you manually installed Kubernetes, there’s a good chance you’re missing some standard components; typical examples are storage provisioners and load balancers. If that is the problem, there are ways to work around their absence in Z2JH.

Can you check your pods with kubectl describe ... and paste the output, along with your full configuration file (with secrets redacted)?
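For example, assuming the jhub namespace from your helm command:

kubectl --namespace=jhub describe pod
kubectl --namespace=jhub get pvc
kubectl --namespace=jhub get events --sort-by=.lastTimestamp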

If this is a single-server deployment, you might be better off using https://tljh.jupyter.org/ instead.

Hi manics, thanks for the fast and useful answer! I might switch to TLJH, but ultimately we are targeting a Kubernetes cluster of at least three machines, so I want to give this another try.

Attached is the output of “describe”, which seems to indicate a problem, at least with the PVC. I played around a bit with the commands in Configure a Pod to Use a PersistentVolume for Storage | Kubernetes, but couldn't quickly find a solution.
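For what it's worth, the claim itself can be inspected with something like this (hub-db-dir is the claim name from the describe output below):

kubectl --namespace=jhub get pvc
kubectl --namespace=jhub describe pvc hub-db-dir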

Which configuration file are you referring to? The YAML file I passed to Helm consisted only of comments.

By the way, the machine does have a public IP address, but is currently accessible from the internet only via VPN.

admin-nb@eo-dell-r7525a:~$ kubectl --namespace=jhub describe pod
Name:         continuous-image-puller-r9rqm
Namespace:    jhub
Priority:     0
Node:         eo-dell-r7525a/141.78.6.79
Start Time:   Tue, 30 Mar 2021 10:02:02 +0000
Labels:       app=jupyterhub
              component=continuous-image-puller
              controller-revision-hash=5bd85c6d5
              pod-template-generation=1
              release=jhub
Annotations:  cni.projectcalico.org/podIP: 192.168.234.203/32
              cni.projectcalico.org/podIPs: 192.168.234.203/32
Status:       Running
IP:           192.168.234.203
IPs:
  IP:           192.168.234.203
Controlled By:  DaemonSet/continuous-image-puller
Init Containers:
  image-pull-metadata-block:
    Container ID:  docker://4511447d7401ffb34279754460b88cc3f63e382b092f118f60684840828cc011
    Image:         jupyterhub/k8s-network-tools:0.11.1-n346.h03aaf174
    Image ID:      docker-pullable://jupyterhub/k8s-network-tools@sha256:d104871c7f07af90ae3156b01a61f35684549780158a8c0efa59b6f20a39a963
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      echo "Pulling complete"
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 30 Mar 2021 10:02:03 +0000
      Finished:     Tue, 30 Mar 2021 10:02:03 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
  image-pull-singleuser:
    Container ID:  docker://c9c088b0dbcc21a97d82d2e36acd45470a8b575954352c0291c3dc4344c99454
    Image:         jupyterhub/k8s-singleuser-sample:0.11.1-n298.h96f7ad0a
    Image ID:      docker-pullable://jupyterhub/k8s-singleuser-sample@sha256:8c4ed841503b5aeee0f4c63fa524b185d1f904389bd58c6a8fbbb0a39c6fb010
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      echo "Pulling complete"
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 30 Mar 2021 10:02:04 +0000
      Finished:     Tue, 30 Mar 2021 10:02:04 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Containers:
  pause:
    Container ID:   docker://19d802c023e91c198702cc5810a211febbf2e088f7c9411d4f27ee8e8b1a6d35
    Image:          k8s.gcr.io/pause:3.2
    Image ID:       docker-pullable://k8s.gcr.io/pause@sha256:927d98197ec1141a368550822d18fa1c60bdae27b78b0c004f705f548c07814f
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 30 Mar 2021 10:02:05 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:            <none>
QoS Class:          BestEffort
Node-Selectors:     <none>
Tolerations:        hub.jupyter.org/dedicated=user:NoSchedule
                    hub.jupyter.org_dedicated=user:NoSchedule
                    node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                    node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                    node.kubernetes.io/not-ready:NoExecute op=Exists
                    node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                    node.kubernetes.io/unreachable:NoExecute op=Exists
                    node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:             <none>


Name:           hub-6d7cb485dc-xsjrc
Namespace:      jhub
Priority:       0
Node:           <none>
Labels:         app=jupyterhub
                component=hub
                hub.jupyter.org/network-access-proxy-api=true
                hub.jupyter.org/network-access-proxy-http=true
                hub.jupyter.org/network-access-singleuser=true
                pod-template-hash=6d7cb485dc
                release=jhub
Annotations:    checksum/config-map: 4634b5e7ccf2faf5d52a0e30e96b513ea306b58afdcdf327cc137a359ac60ec6
                checksum/secret: 5bf432ff62d14d37c25b26980b0edb06c0f8f1bff506ab6f05d73de18284c973
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/hub-6d7cb485dc
Containers:
  hub:
    Image:      jupyterhub/k8s-hub:0.11.1-n344.hf8be5fa7
    Port:       8081/TCP
    Host Port:  0/TCP
    Args:
      jupyterhub
      --config
      /usr/local/etc/jupyterhub/jupyterhub_config.py
      --upgrade-db
    Liveness:   http-get http://:http/hub/health delay=300s timeout=3s period=10s #success=1 #failure=30
    Readiness:  http-get http://:http/hub/health delay=0s timeout=1s period=2s #success=1 #failure=1000
    Environment:
      PYTHONUNBUFFERED:        1
      HELM_RELEASE_NAME:       jhub
      POD_NAMESPACE:           jhub (v1:metadata.namespace)
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'hub.config.ConfigurableHTTPProxy.auth_token' in secret 'hub'>  Optional: false
    Mounts:
      /srv/jupyterhub from pvc (rw)
      /usr/local/etc/jupyterhub/config/ from config (rw)
      /usr/local/etc/jupyterhub/jupyterhub_config.py from config (rw,path="jupyterhub_config.py")
      /usr/local/etc/jupyterhub/secret/ from secret (rw)
      /usr/local/etc/jupyterhub/z2jh.py from config (rw,path="z2jh.py")
      /var/run/secrets/kubernetes.io/serviceaccount from hub-token-vf5fw (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub
    Optional:  false
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub
    Optional:    false
  pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hub-db-dir
    ReadOnly:   false
  hub-token-vf5fw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-token-vf5fw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     hub.jupyter.org/dedicated=core:NoSchedule
                 hub.jupyter.org_dedicated=core:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  176m  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.


Name:         proxy-5587d47656-k6t2g
Namespace:    jhub
Priority:     0
Node:         eo-dell-r7525a/141.78.6.79
Start Time:   Tue, 30 Mar 2021 10:02:02 +0000
Labels:       app=jupyterhub
              component=proxy
              hub.jupyter.org/network-access-hub=true
              hub.jupyter.org/network-access-singleuser=true
              pod-template-hash=5587d47656
              release=jhub
Annotations:  checksum/auth-token: 8a07
              checksum/proxy-secret: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
              cni.projectcalico.org/podIP: 192.168.234.206/32
              cni.projectcalico.org/podIPs: 192.168.234.206/32
Status:       Running
IP:           192.168.234.206
IPs:
  IP:           192.168.234.206
Controlled By:  ReplicaSet/proxy-5587d47656
Containers:
  chp:
    Container ID:  docker://5714c44d91c4a83397703e8b9a857406d15779c2ed20ab9753f6a57007a63e87
    Image:         jupyterhub/configurable-http-proxy:4.3.1
    Image ID:      docker-pullable://jupyterhub/configurable-http-proxy@sha256:4c2c995879c398d2bb663604f27f6c033f3aad94cdc2fa92ec7e6c5b09cff8f9
    Ports:         8000/TCP, 8001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      configurable-http-proxy
      --ip=::
      --api-ip=::
      --api-port=8001
      --default-target=http://hub:$(HUB_SERVICE_PORT)
      --error-target=http://hub:$(HUB_SERVICE_PORT)/hub/error
      --port=8000
    State:          Running
      Started:      Tue, 30 Mar 2021 10:02:09 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/_chp_healthz delay=60s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/_chp_healthz delay=0s timeout=1s period=2s #success=1 #failure=3
    Environment:
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'hub.config.ConfigurableHTTPProxy.auth_token' in secret 'hub'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fq8bh (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-fq8bh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fq8bh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     hub.jupyter.org/dedicated=core:NoSchedule
                 hub.jupyter.org_dedicated=core:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>


Name:         user-scheduler-7f59fc6f47-ndbd4
Namespace:    jhub
Priority:     0
Node:         eo-dell-r7525a/141.78.6.79
Start Time:   Tue, 30 Mar 2021 10:02:02 +0000
Labels:       app=jupyterhub
              component=user-scheduler
              pod-template-hash=7f59fc6f47
              release=jhub
Annotations:  checksum/config-map: 99d6ce9c5464d503ad5f8802423cca092d481fd68a67af995f28f379e18e8017
              cni.projectcalico.org/podIP: 192.168.234.204/32
              cni.projectcalico.org/podIPs: 192.168.234.204/32
Status:       Running
IP:           192.168.234.204
IPs:
  IP:           192.168.234.204
Controlled By:  ReplicaSet/user-scheduler-7f59fc6f47
Containers:
  kube-scheduler:
    Container ID:  docker://542a97d508cf2d66f3771b2c8724b7aef54ec2578c3ec8debb2f315eb366b7b5
    Image:         k8s.gcr.io/kube-scheduler:v1.19.7
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:0104e0a2954fdc467424a450a0362531b2081f809586446e4b2e63efb376a89a
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/local/bin/kube-scheduler
      --config=/etc/user-scheduler/config.yaml
      --authentication-skip-lookup=true
      --v=4
    State:          Running
      Started:      Tue, 30 Mar 2021 10:02:05 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:10251/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10251/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/user-scheduler from config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from user-scheduler-token-9wntj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      user-scheduler
    Optional:  false
  user-scheduler-token-9wntj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  user-scheduler-token-9wntj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     hub.jupyter.org/dedicated=core:NoSchedule
                 hub.jupyter.org_dedicated=core:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>


Name:         user-scheduler-7f59fc6f47-rndp9
Namespace:    jhub
Priority:     0
Node:         eo-dell-r7525a/141.78.6.79
Start Time:   Tue, 30 Mar 2021 10:02:02 +0000
Labels:       app=jupyterhub
              component=user-scheduler
              pod-template-hash=7f59fc6f47
              release=jhub
Annotations:  checksum/config-map: 99d6ce9c5464d503ad5f8802423cca092d481fd68a67af995f28f379e18e8017
              cni.projectcalico.org/podIP: 192.168.234.205/32
              cni.projectcalico.org/podIPs: 192.168.234.205/32
Status:       Running
IP:           192.168.234.205
IPs:
  IP:           192.168.234.205
Controlled By:  ReplicaSet/user-scheduler-7f59fc6f47
Containers:
  kube-scheduler:
    Container ID:  docker://1246031da0e6c311de06319cb130b51a25bacb8809a851ffdee675ced9cb0478
    Image:         k8s.gcr.io/kube-scheduler:v1.19.7
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:0104e0a2954fdc467424a450a0362531b2081f809586446e4b2e63efb376a89a
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/local/bin/kube-scheduler
      --config=/etc/user-scheduler/config.yaml
      --authentication-skip-lookup=true
      --v=4
    State:          Running
      Started:      Tue, 30 Mar 2021 10:02:09 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:10251/healthz delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10251/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/user-scheduler from config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from user-scheduler-token-9wntj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      user-scheduler
    Optional:  false
  user-scheduler-token-9wntj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  user-scheduler-token-9wntj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     hub.jupyter.org/dedicated=core:NoSchedule
                 hub.jupyter.org_dedicated=core:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>

Update: I made some progress, and all pods are now running. However, the proxy-public service is still waiting for an external IP:

admin-nb@eo-dell-r7525a:~$ cat > storage_dynamic_fast.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
admin-nb@eo-dell-r7525a:~$ kubectl apply -f storage_dynamic_fast.yaml
storageclass.storage.k8s.io/fast created
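(In hindsight, this StorageClass was questionable: kubernetes.io/gce-pd is the Google Compute Engine disk provisioner, which presumably cannot provision volumes on a bare-metal server. A quick sanity check would be:

kubectl get storageclass
kubectl --namespace=jhub get pvc

and watching whether claims against the class ever leave Pending.)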

admin-nb@eo-dell-r7525a:~$ vi config_JupyterHub_v2.yaml
admin-nb@eo-dell-r7525a:~$ cat config_JupyterHub_v2.yaml
# This file can update the JupyterHub Helm chart's default configuration values.
#
# For reference see the configuration reference and default values, but make
# sure to refer to the Helm chart version of interest to you!
#
# Introduction to YAML:     https://www.youtube.com/watch?v=cdLNKUoMc6c
# Chart config reference:   https://zero-to-jupyterhub.readthedocs.io/en/stable/resources/reference.html
# Chart default values:     https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/e14d686fea782482b1f7d388118bf772a6ab5be7/jupyterhub/values.yaml
# Available chart versions: https://jupyterhub.github.io/helm-chart/
#
#

## inspired by https://discourse.jupyter.org/t/problem-using-kubernetes-for-jupyterhub-on-a-local-infrastructure/369/8
## This portion is missing from the tutorial for anyone trying to set up on bare metal.
## uses the dynamic "fast" StorageClass created above
hub:
  db:
    type: sqlite-memory

singleuser:
  storage:
    type: dynamic
    class: fast
##

admin-nb@eo-dell-r7525a:~$ helm upgrade --cleanup-on-fail --install jhub jupyterhub/jupyterhub --namespace jhub --create-namespace --version=0.11.1-n349.he14d686f --values config_JupyterHub_v2.yaml
Release "jhub" has been upgraded. Happy Helming!
NAME: jhub
LAST DEPLOYED: Tue Mar 30 15:34:04 2021
NAMESPACE: jhub
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Thank you for installing JupyterHub!

Your release is named jhub and installed into the namespace jhub.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=jhub get pod

and watching for both those pods to be in status 'Running'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=jhub get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
  1. Read the guide at https://z2jh.jupyter.org
  2. Chat with us at https://gitter.im/jupyterhub/jupyterhub
  3. File issues at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues

admin-nb@eo-dell-r7525a:~$ kubectl get pods -n jhub
NAME                              READY   STATUS    RESTARTS   AGE
continuous-image-puller-htgnj     1/1     Running   0          17m
hub-55875db7f5-nmpdd              1/1     Running   0          5m20s
proxy-7989b9cb88-4l79s            1/1     Running   0          5m20s
user-scheduler-7f59fc6f47-49dl6   1/1     Running   0          5m20s
user-scheduler-7f59fc6f47-6xtct   1/1     Running   0          5m20s
admin-nb@eo-dell-r7525a:~$ kubectl get services -n jhub
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
hub            ClusterIP      10.111.95.240    <none>        8081/TCP       5m26s
proxy-api      ClusterIP      10.97.145.202    <none>        8001/TCP       5m26s
proxy-public   LoadBalancer   10.104.148.191   <pending>     80:31200/TCP   5m26s

So is some load balancer component still missing?

Further Update: Following MetalLB, bare metal load-balancer for Kubernetes, I got the installation working, i.e. I am now seeing a “Sign in” page for my JupyterHub over HTTP.
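For anyone else on bare metal: the MetalLB setup essentially amounted to applying the upstream manifests and a layer-2 address pool, roughly as below (MetalLB 0.9.x style; the version and address range are placeholders to adapt to your own network):

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml

# layer-2 configuration; the address range must be a free range on the local network
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250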

However, a logged-in user does not get a server:

Your server is starting up.
You will be redirected automatically when it's ready for you.
50% Complete
2021-03-30T16:33:41.831375Z [Warning] 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Event log
Server requested
2021-03-30T16:33:41.823736Z [Warning] 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
2021-03-30T16:33:41.831375Z [Warning] 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

admin-nb@eo-dell-r7525a:~$ kubectl get pods -n jhub
NAME                              READY   STATUS    RESTARTS   AGE
continuous-image-puller-htgnj     1/1     Running   0          72m
hub-55875db7f5-nmpdd              1/1     Running   0          60m
jupyter-nils                      0/1     Pending   0          71s
proxy-7989b9cb88-4l79s            1/1     Running   0          60m
user-scheduler-7f59fc6f47-49dl6   1/1     Running   0          60m
user-scheduler-7f59fc6f47-6xtct   1/1     Running   0          60m

So apparently I still have to dig deeper into persistent storage.

How did you solve the PersistentVolume issue for your hub?

By default Z2JH will dynamically create a PersistentVolume for each user. See Dynamic Volume Provisioning | Kubernetes if you’re not familiar with dynamic provisioning of storage.

If you haven’t installed a suitable provisioner, your quickest option for getting a demo working is to disable persistent storage for users: Customizing User Storage — Zero to JupyterHub with Kubernetes documentation
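If you go that route, the relevant chart values are minimal; a sketch for the 0.11.x chart:

singleuser:
  storage:
    type: none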

Since it’s working, that’s great. In case it’s helpful to you in future, an alternative is to install an ingress controller: Advanced Topics — Zero to JupyterHub with Kubernetes documentation.
This works where the only public resources are web services, since an ingress can reverse-proxy multiple web services.
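For reference, a minimal sketch of the corresponding chart values (hub.example.org is a placeholder, and an ingress controller such as nginx-ingress must already be installed in the cluster):

ingress:
  enabled: true
  hosts:
    - hub.example.org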

Hi, I decided to go for NFS storage. Unfortunately, one recipe was outdated (it used a beta API). The second one seemed to go through, but didn't actually allocate a volume in my attempt (the PVC stayed Pending): Provision Kubernetes NFS clients on a Raspberry Pi homelab | Opensource.com
I wondered whether leftovers of the first recipe's rbac.yaml were causing problems, but didn't have time to look at this in detail.
I also asked my staff which provisioner our production Kubernetes cluster is using; I thought it was based on generic NFS functionality and that I might just copy the setup. However, it uses the Trident provisioner, which seems specific to the ONTAP software of our NetApp all-flash system and which I would rather not use for this prototype.
So I will probably follow your advice and disable persistent storage. Ultimately, I want to use a cluster file system, most probably Ceph.
Thanks for the help!

Simon (manics), thanks again! I have now disabled persistent storage and could actually start using the JupyterHub.
Initially I was disappointed that “import numpy as np” threw an error. However, after switching from the default notebook image to the datascience notebook image, the installation starts looking useful (as a demonstration platform).
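For reference, the image switch is a small change to the chart values, something like the following (the tag is a placeholder; in practice it should be pinned to a specific tag from Docker Hub):

singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: latest   # placeholder; pin to a specific tag in practice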
