Cannot spawn server for new user

I’m new to JupyterHub. I’ve searched through this community’s posts as much as I can and haven’t been able to resolve the issue. I’m deploying JupyterHub via Kubernetes to GKE. The user’s jupyter pod is stuck in Pending. Below is my config.yaml:

proxy:
  secretToken: tuFslOftLZF2t7ECEUZeMgmfOaIdv1RX

debug:
  enabled: true

singleuser:
  cmd: ["jupyterhub-singleuser", "--notebook-dir=/home"]
  startTimeout: 600
  defaultUrl: "/lab"
  image:
    name: teradata/jupyterlab-extensions
    tag: latest

hub:
  # command: ["sh", "-c", "pip install jupyter && exec jupyterhub"]

  services:
    datarango:
      admin: true
      name: datarango
      api_token: secret-token

  extraConfig:
    myConfig.py: |
      c.JupyterHub.authenticator_class = CustomAuth

Logs:
Hub logs:

[D 2024-04-13 03:06:34.280 JupyterHub proxy:924] Proxy: Fetching GET http://proxy-api:8001/api/routes
[D 2024-04-13 03:06:34.282 JupyterHub proxy:393] Checking routes
[I 2024-04-13 03:06:34.466 JupyterHub log:192] 200 GET /hub/api/ (jupyterhub-idle-culler@127.0.0.1) 6.60ms
[D 2024-04-13 03:06:34.468 JupyterHub scopes:884] Checking access to /hub/api/users via scope list:users
[D 2024-04-13 03:06:34.468 JupyterHub scopes:684] Unrestricted access to /hub/api/users via list:users
[I 2024-04-13 03:06:34.470 JupyterHub log:192] 200 GET /hub/api/users?state=[secret] (jupyterhub-idle-culler@127.0.0.1) 2.58ms
// health
[D 2024-04-13 03:06:34.751 JupyterHub reflector:374] pods watcher timeout
[D 2024-04-13 03:06:34.751 JupyterHub reflector:289] Connecting pods watcher
[D 2024-04-13 03:06:35.038 JupyterHub reflector:374] events watcher timeout
[D 2024-04-13 03:06:35.038 JupyterHub reflector:289] Connecting events watcher
repeated
// health
[I 2024-04-13 03:07:31.246 JupyterHub log:192] 302 GET / -> /hub/ (@::ffff:10.218.0.3) 0.47ms
[D 2024-04-13 03:07:31.381 JupyterHub base:407] Refreshing auth for zfjryuibwedbvgjvmxrvjcfjybevht
[I 2024-04-13 03:07:31.381 JupyterHub log:192] 302 GET /hub/ -> /user/zfjryuibwedbvgjvmxrvjcfjybevht/ (zfjryuibwedbvgjvmxrvjcfjybevht@::ffff:10.218.0.3) 6.14ms
[I 2024-04-13 03:07:31.501 JupyterHub log:192] 302 GET /user/zfjryuibwedbvgjvmxrvjcfjybevht/ -> /hub/user/zfjryuibwedbvgjvmxrvjcfjybevht/ (@::ffff:10.218.0.3) 0.49ms
[D 2024-04-13 03:07:31.627 JupyterHub scopes:884] Checking access to /hub/user/zfjryuibwedbvgjvmxrvjcfjybevht/ via scope access:servers
[D 2024-04-13 03:07:31.627 JupyterHub scopes:697] Argument-based access to /hub/user/zfjryuibwedbvgjvmxrvjcfjybevht/ via access:servers
[I 2024-04-13 03:07:31.628 JupyterHub log:192] 303 GET /hub/user/zfjryuibwedbvgjvmxrvjcfjybevht/ (zfjryuibwedbvgjvmxrvjcfjybevht@::ffff:10.218.0.3) 2.13ms
[D 2024-04-13 03:07:31.757 JupyterHub scopes:884] Checking access to /hub/spawn-pending/zfjryuibwedbvgjvmxrvjcfjybevht via scope servers
[D 2024-04-13 03:07:31.757 JupyterHub scopes:697] Argument-based access to /hub/spawn-pending/zfjryuibwedbvgjvmxrvjcfjybevht via servers
[I 2024-04-13 03:07:31.757 JupyterHub pages:399] zfjryuibwedbvgjvmxrvjcfjybevht is pending spawn
[I 2024-04-13 03:07:31.770 JupyterHub log:192] 200 GET /hub/spawn-pending/zfjryuibwedbvgjvmxrvjcfjybevht?next=%2Fhub%2Fuser%2Fzfjryuibwedbvgjvmxrvjcfjybevht%2F (zfjryuibwedbvgjvmxrvjcfjybevht@::ffff:10.218.0.3) 14.97ms
[W 2024-04-13 03:07:32.028 JupyterHub _xsrf_utils:195] Skipping XSRF check for insecure request GET /hub/api/users/zfjryuibwedbvgjvmxrvjcfjybevht/server/progress
[D 2024-04-13 03:07:32.028 JupyterHub scopes:884] Checking access to /hub/api/users/zfjryuibwedbvgjvmxrvjcfjybevht/server/progress via scope read:servers
[D 2024-04-13 03:07:32.028 JupyterHub scopes:697] Argument-based access to /hub/api/users/zfjryuibwedbvgjvmxrvjcfjybevht/server/progress via read:servers
[D 2024-04-13 03:07:32.029 JupyterHub spawner:2303] progress generator: jupyter-zfjryuibwedbvgjvmxrvjcfjybevht
// health
[D 2024-04-13 03:07:34.280 JupyterHub proxy:924] Proxy: Fetching GET http://proxy-api:8001/api/routes
[D 2024-04-13 03:07:34.283 JupyterHub proxy:393] Checking routes
// health
[D 2024-04-13 03:07:34.796 JupyterHub reflector:374] pods watcher timeout
[D 2024-04-13 03:07:34.796 JupyterHub reflector:289] Connecting pods watcher
[D 2024-04-13 03:07:35.101 JupyterHub reflector:374] events watcher timeout
[D 2024-04-13 03:07:35.102 JupyterHub reflector:289] Connecting events watcher
repeated
// health
[D 2024-04-13 03:08:34.280 JupyterHub proxy:924] Proxy: Fetching GET http://proxy-api:8001/api/routes
[D 2024-04-13 03:08:34.282 JupyterHub proxy:393] Checking routes
[W 2024-04-13 03:08:34.517 JupyterHub user:879] zfjryuibwedbvgjvmxrvjcfjybevht's server failed to start in 600 seconds, giving up.
    
    Common causes of this timeout, and debugging tips:
    
    1. Everything is working, but it took too long.
       To fix: increase `Spawner.start_timeout` configuration
       to a number of seconds that is enough for spawners to finish starting.
    2. The server didn't finish starting,
       or it crashed due to a configuration issue.
       Check the single-user server's logs for hints at what needs fixing.
    
[D 2024-04-13 03:08:34.517 JupyterHub user:985] Stopping zfjryuibwedbvgjvmxrvjcfjybevht
[I 2024-04-13 03:08:34.517 JupyterHub spawner:2849] Deleting pod jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht
[D 2024-04-13 03:08:34.553 JupyterHub user:1005] Deleting oauth client jupyterhub-user-zfjryuibwedbvgjvmxrvjcfjybevht
[D 2024-04-13 03:08:34.558 JupyterHub user:1008] Finished stopping zfjryuibwedbvgjvmxrvjcfjybevht
[E 2024-04-13 03:08:34.562 JupyterHub gen:630] Exception in Future <Task finished name='Task-328' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.11/site-packages/jupyterhub/handlers/base.py:1081> exception=TimeoutError('Timeout')> after timeout
    Traceback (most recent call last):
      File "/usr/local/lib/python3.11/site-packages/tornado/gen.py", line 625, in error_callback
        future.result()
      File "/usr/local/lib/python3.11/site-packages/jupyterhub/handlers/base.py", line 1088, in finish_user_spawn
        await spawn_future
      File "/usr/local/lib/python3.11/site-packages/jupyterhub/user.py", line 905, in spawn
        raise e
      File "/usr/local/lib/python3.11/site-packages/jupyterhub/user.py", line 801, in spawn
        url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TimeoutError: Timeout
    
[I 2024-04-13 03:08:34.564 JupyterHub log:192] 200 GET /hub/api/users/zfjryuibwedbvgjvmxrvjcfjybevht/server/progress?_xsrf=[secret] (zfjryuibwedbvgjvmxrvjcfjybevht@::ffff:10.218.0.3) 62536.88ms
// health
[D 2024-04-13 03:08:34.842 JupyterHub reflector:374] pods watcher timeout
[D 2024-04-13 03:08:34.842 JupyterHub reflector:289] Connecting pods watcher
[D 2024-04-13 03:08:35.158 JupyterHub reflector:374] events watcher timeout
[D 2024-04-13 03:08:35.158 JupyterHub reflector:289] Connecting events watcher

User Scheduler 1 logs:

// This was written over and over again
I0413 03:07:44.395062       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vwd4r_78893d0c-ad7a-43ba-bce6-c4fed89eb2a7 and has not yet expired
I0413 03:07:44.395080       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 03:07:48.196384       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vwd4r_78893d0c-ad7a-43ba-bce6-c4fed89eb2a7 and has not yet expired
I0413 03:07:48.196399       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 03:07:51.458034       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vwd4r_78893d0c-ad7a-43ba-bce6-c4fed89eb2a7 and has not yet expired
I0413 03:07:51.458070       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock

User Scheduler 2 logs:

I0413 02:56:23.425708       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 02:56:23.840823       1 eventhandlers.go:206] "Update event for scheduled pod" pod="jupyterhub/user-scheduler-64559855f8-hcjgf"
I0413 02:56:24.526654       1 eventhandlers.go:206] "Update event for scheduled pod" pod="jupyterhub/user-scheduler-64559855f8-hcjgf"
I0413 02:56:26.902254       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vv4b8_810d0a88-2ae9-4448-96dc-bc0b4bbd6039 and has not yet expired
I0413 02:56:26.902269       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 02:56:31.268991       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vv4b8_810d0a88-2ae9-4448-96dc-bc0b4bbd6039 and has not yet expired
I0413 02:56:31.269007       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 02:56:34.225401       1 leaderelection.go:352] lock is held by user-scheduler-64559855f8-vv4b8_810d0a88-2ae9-4448-96dc-bc0b4bbd6039 and has not yet expired
I0413 02:56:34.225418       1 leaderelection.go:253] failed to acquire lease jupyterhub/user-scheduler-lock
I0413 02:56:34.612112       1 eventhandlers.go:206] "Update event for scheduled pod" pod="jupyterhub/hub-77c7f67bc-t49sh"
I0413 02:56:38.448861       1 leaderelection.go:258] successfully acquired lease jupyterhub/user-scheduler-lock
I0413 02:58:34.591343       1 eventhandlers.go:118] "Add event for unscheduled pod" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 02:58:34.591441       1 scheduling_queue.go:1070] "About to try and schedule pod" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 02:58:34.591458       1 schedule_one.go:81] "Attempting to schedule pod" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 02:58:34.591694       1 binder.go:897] "Provisioning for claims of pod that has no matching volumes..." claimCount=1 pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" node="gke-jupyterhub-staging-default-pool-1a1c895d-5fc9"
I0413 02:58:34.591858       1 binder.go:365] "AssumePodVolumes" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" node="gke-jupyterhub-staging-default-pool-1a1c895d-5fc9"
I0413 02:58:34.591909       1 assume_cache.go:328] "Assumed object" description="v1.PersistentVolumeClaim" cacheKey="jupyterhub/claim-zfjryuibwedbvgjvmxrvjcfjybevht" version=838548
I0413 02:58:34.592008       1 binder.go:436] "BindPodVolumes" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" node="gke-jupyterhub-staging-default-pool-1a1c895d-5fc9"
I0413 03:01:23.312769       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PersistentVolume total 6 items received
I0413 03:01:52.315082       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.StatefulSet total 7 items received
I0413 03:01:59.321597       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Service total 6 items received
I0413 03:02:25.318754       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PodDisruptionBudget total 8 items received
I0413 03:02:37.313909       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSIStorageCapacity total 8 items received
I0413 03:02:46.318456       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.ReplicationController total 7 items received
I0413 03:03:15.318326       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSIDriver total 7 items received
I0413 03:04:10.315424       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Node total 15 items received
I0413 03:04:35.317015       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSINode total 9 items received
I0413 03:04:55.314726       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Namespace total 10 items received
I0413 03:05:06.315679       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PersistentVolumeClaim total 10 items received
I0413 03:05:53.320919       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.ReplicaSet total 14 items received
I0413 03:05:54.321460       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Pod total 19 items received
I0413 03:06:18.316961       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.StorageClass total 11 items received
I0413 03:06:30.316181       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PersistentVolume total 6 items received
I0413 03:08:34.549779       1 eventhandlers.go:161] "Delete event for unscheduled pod" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 03:08:34.604932       1 volume_binding.go:335] "Failed to bind volumes for pod" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" err="binding volumes: pod does not exist any more: pod \"jupyter-zfjryuibwedbvgjvmxrvjcfjybevht\" not found"
E0413 03:08:34.604962       1 framework.go:1045] "Failed running PreBind plugin" err="binding volumes: pod does not exist any more: pod \"jupyter-zfjryuibwedbvgjvmxrvjcfjybevht\" not found" plugin="VolumeBinding" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" node="gke-jupyterhub-staging-default-pool-1a1c895d-5fc9"
I0413 03:08:34.605028       1 assume_cache.go:342] "Restored object" description="v1.PersistentVolumeClaim" cacheKey="jupyterhub/claim-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 03:08:34.605113       1 schedule_one.go:877] "Unable to schedule pod, possibly due to node not found; waiting" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" err="running PreBind plugin \"VolumeBinding\": binding volumes: pod does not exist any more: pod \"jupyter-zfjryuibwedbvgjvmxrvjcfjybevht\" not found"
I0413 03:08:34.605179       1 schedule_one.go:898] "Pod doesn't exist in informer cache" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" err="pod \"jupyter-zfjryuibwedbvgjvmxrvjcfjybevht\" not found"
I0413 03:08:34.605215       1 scheduling_queue.go:982] "Pod doesn't exist in podLister, aborted adding it to the nominator" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 03:08:34.605275       1 schedule_one.go:951] "Updating pod condition" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht" conditionType=PodScheduled conditionStatus=False conditionReason="SchedulerError"
E0413 03:08:34.610806       1 schedule_one.go:936] "Error updating pod" err="pods \"jupyter-zfjryuibwedbvgjvmxrvjcfjybevht\" not found" pod="jupyterhub/jupyter-zfjryuibwedbvgjvmxrvjcfjybevht"
I0413 03:09:05.324131       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Service total 8 items received
I0413 03:09:24.316993       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSIStorageCapacity total 8 items received
I0413 03:09:25.321153       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PodDisruptionBudget total 7 items received
I0413 03:09:43.318020       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Node total 11 items received
I0413 03:10:24.321290       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.ReplicationController total 9 items received
I0413 03:11:04.318082       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.StatefulSet total 10 items received
I0413 03:11:47.321331       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSIDriver total 9 items received
I0413 03:11:49.320149       1 reflector.go:559] vendor/k8s.io/client-go/informers/factory.go:150: Watch close - *v1.StorageClass total 6 items received

User pod logs:

Defaulted container "notebook" out of: notebook, block-cloud-metadata (init)

What was the status of the user pod (running, crash looping, etc.)?

If it keeps restarting, check the logs of the previous container instance using the --previous flag.
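For this deployment that would look something like the following (assuming the jupyterhub namespace and the pod name from the hub logs above):

kubectl logs jupyter-zfjryuibwedbvgjvmxrvjcfjybevht -c notebook --previous -n jupyterhub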

@consideRatio The status of the user pod is Pending. It is always Pending.

Try kubectl describe pod … and look at the events at the end; what do they say?
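For the pod named in the hub logs that would be something like (again assuming the jupyterhub namespace):

kubectl describe pod jupyter-zfjryuibwedbvgjvmxrvjcfjybevht -n jupyterhub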

There are none. It shows <none> under Events.

Do the same for the PVC and PV resources associated with the user; maybe provisioning storage in the cluster is failing?

How do I do that? And what are PVC and PV (persistent volume) resources?

kubectl get pvc

Then kubectl describe pvc (name).

A PVC is a request for a PV, and a PV is the k8s representation of storage provisioned by something, like GCP’s persistent disks.

When you start a user server, a PVC is created for it; k8s observes this and, via cloud-specific controllers, provisions storage for the user.
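Concretely, for this deployment it would be something like the following (assuming the jupyterhub namespace and the claim name visible in the scheduler logs):

kubectl get pvc -n jupyterhub
kubectl describe pvc claim-zfjryuibwedbvgjvmxrvjcfjybevht -n jupyterhub
kubectl get pv
kubectl get storageclass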

Name:          claim-zfjryuibwedbvgjvmxrvjcfjybevht
Namespace:     jupyterhub
StorageClass:  standard-rwo
Status:        Pending
Volume:        
Labels:        app=jupyterhub
               chart=jupyterhub-3.3.6
               component=singleuser-storage
               heritage=jupyterhub
               hub.jupyter.org/username=zfjryuibwedbvgjvmxrvjcfjybevht
               release=3.3.6
Annotations:   hub.jupyter.org/username: zfjryuibwedbvgjvmxrvjcfjybevht
               volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
               volume.kubernetes.io/selected-node: gke-jupyterhub-staging-default-pool-1a1c895d-5fc9
               volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                Age                       From                                                                                              Message
  ----    ------                ----                      ----                                                                                              -------
  Normal  Provisioning          4m54s (x1473 over 3d20h)  pd.csi.storage.gke.io_gke-9083b8f6bc1e4aee93de-001a-abd5-vm_0106cbe6-800c-489c-805c-b17e8e7ff31d  External provisioner is provisioning volume for claim "jupyterhub/claim-zfjryuibwedbvgjvmxrvjcfjybevht"
  Normal  ExternalProvisioning  45s (x22285 over 3d20h)   persistentvolume-controller                                                                       Waiting for a volume to be created either by the external provisioner 'pd.csi.storage.gke.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

So the above is what I get from kubectl describe pvc.

Seems like the provisioning of storage for the storage class standard-rwo isn’t going well. Have you made changes to the default storage class in the GKE cluster?
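You can inspect the storage class and check whether GKE’s persistent-disk CSI provisioner (pd.csi.storage.gke.io, as named in the PVC events) is present and healthy. A rough sketch; exact pod names in kube-system may differ:

kubectl get storageclass
kubectl describe storageclass standard-rwo
kubectl get csidriver
kubectl get pods -n kube-system | grep -i csi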

No, I have not. I didn’t do anything to the default storage class.

Classic GKE cluster, or autoprovisioning or similar? I don’t remember what they are called, but there are two kinds.

It is autoprovisioning

Sorry, actually it’s standard mode.

Does that change anything?

GKE version? 1.29 perhaps?

It is 1.29.1-gke.1589018

Yes, it is a 1.29 version.

Ah, there are development releases of the jupyterhub chart. Can you try the latest?
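Something like this should pull the latest development release of the chart (a sketch; substitute your own release name, namespace, and values file):

helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update
helm upgrade --install <release-name> jupyterhub/jupyterhub --namespace jupyterhub --devel --values config.yaml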