Spawn failed: Timeout even when start_timeout is set to 3600 seconds

Not sure if this will help, but I think that error comes from an old version of either the helm chart or K8s.

Not sure about your infra, but maybe you can try to update it and see what happens?

Thanks @IvanYingX

I don’t see a direct way to downgrade the K8S version on DigitalOcean (it’s currently at 1.30.4).

So if the failed to list *v1beta1.PodDisruptionBudget error is indeed arising because I’m running a K8S version > 1.24 (per Failed to watch *v1beta1.PodDisruptionBudget: failed to list *v1beta1.PodDisruptionBudget: the server could not find the requested resource · Issue #3926 · Azure/AKS · GitHub), I’m wondering what the solution is.
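For context on that cutoff: the policy/v1beta1 PodDisruptionBudget API was removed in Kubernetes 1.25, which matches the “> 1.24” boundary in that issue. A minimal sketch of the check (the helper name is hypothetical, not part of any real tool):

```python
# Hypothetical helper: policy/v1beta1 PodDisruptionBudget was removed in
# Kubernetes 1.25, so clusters at 1.25 or later no longer serve it.
def serves_pdb_v1beta1(version: str) -> bool:
    """Return True if this Kubernetes version still serves
    policy/v1beta1 PodDisruptionBudget."""
    major, minor = (int(x) for x in version.split(".")[:2])
    return (major, minor) < (1, 25)

print(serves_pdb_v1beta1("1.24.0"))  # True  (last series with v1beta1 PDB)
print(serves_pdb_v1beta1("1.30.4"))  # False (my DigitalOcean cluster)
```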

I tried to use the latest dev version of the jupyterhub helm chart in the following way:

helm upgrade --cleanup-on-fail \
  --install helm-jh-test jupyterhub/jupyterhub --version 4.0.0-0.dev.git.6717.h61ab116 \
  --namespace my-jh-test \
  --create-namespace \
  --values config.yaml

But I can still see the error in the log of the user-scheduler:

Failed to watch *v1beta1.PodDisruptionBudget: failed to list *v1beta1.PodDisruptionBudget: the server could not find the requested resource

I’m wondering if there’s a way to inspect the helm chart to understand exactly which version of K8S it expects.
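One way is to dump the chart metadata with `helm show chart` and look for a `kubeVersion` constraint. A sketch below; the real metadata comes from `helm show chart jupyterhub/jupyterhub --version <chart version>`, and the sample Chart.yaml contents here are assumed for illustration, not copied from the real chart:

```shell
# Write a stand-in Chart.yaml fragment (assumed contents) and show what
# the `kubeVersion` constraint line looks like when grepped out.
cat <<'EOF' > /tmp/chart-sample.yaml
apiVersion: v2
name: jupyterhub
version: 4.0.0
kubeVersion: '>=1.28.0-0'
EOF
grep '^kubeVersion' /tmp/chart-sample.yaml
```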

I was able to get notebooks to finish spawning with the following combination of helm command and config.yaml. I’m not sure if disabling the userScheduler is going to be an issue down the road…

helm upgrade --cleanup-on-fail \
  --install helm-jh-test jupyterhub/jupyterhub --version 4.0.0-0.dev.git.6717.h61ab116 \
  --namespace my-jh-test \
  --create-namespace \
  --values config.yaml

config.yaml:

debug:
  enabled: true

scheduling:
  userScheduler:
    enabled: false

hub:
  config:
    JupyterHub:
      authenticator_class: dummy
      log_level: DEBUG
    Authenticator:
      admin_users:
        - daniel
      allowed_users:
        - student1
    DummyAuthenticator:
      password: (redacted)
  networkPolicy:
    egress:
      - ports:
          - port: 6443
          - port: 443

singleuser:
  # Setting a global start_timeout instead of 'per profile'
  startTimeout: 3600
If the pod hasn’t started there won’t be any logs, since they’re generated by the application running in the pod. The output of

kubectl describe pod <pod name>

will often contain clues as to why the pod can’t be started. Can you share it here?


@manics Thanks for your response. Per the note above, I did see the following error in the log of the user-scheduler:

Failed to watch *v1beta1.PodDisruptionBudget: failed to list
*v1beta1.PodDisruptionBudget: the server could not find the
requested resource

…which @IvanYingX helped me understand is probably a versioning issue, since in later K8S versions the v1beta1.PodDisruptionBudget API is no longer available.

Therefore I used the latest dev version of the helm chart (4.0.0-0.dev.git.6717.h61ab116) and the config file with the userScheduler disabled (see above), and I can finally get it to work on DigitalOcean!

Not sure if this is helpful to anyone, but I noticed that in the ZTJH docs for getting started on DigitalOcean, the command to spin up the K8S cluster uses default-sized nodes. When I tried that, the notebook would hang during creation.

However, when I launched the cluster with larger nodes, that error went away and I was able to launch a notebook.

Command:

doctl k8s cluster create jupyter-kubernetes --region syd1 --node-pool="name=worker-pool;size=s-2vcpu-4gb;count=3"

Log when creating a Notebook: