Spawn failed due to 30 second timeout

rk92 · May 28, 2021, 6:25pm

I am having issues with getting a pod provisioned for a test user account.

Error is
Spawn failed: Server at http://<IP>:8888/namespace1/user/admin2/ didn't respond in 30 seconds

Partial log output for the pod as it is starting up shows
Traceback (most recent call last): File "/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 447, in check_hub_version resp = await client.fetch(self.hub_api_url) tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting

I’ve attempted to pass in values to hub.extraConfig but these end up hitting a 5 minute timeout in my Helm deployment as my CI/CD pipeline runs.

      c.JupyterHub.tornado_settings = connect_timeout(0)

      c.JupyterHub.tornado_settings = request_timeout(0)

or
c.KubeSpawner.http_timeout = Int(30)

If I pass in singleuser.startTimeout = 120 the 30 second response error still shows up.

For configuration I am using the dummy authenticator and authenticator_class of dummy.

proxy:
  secretToken:
  service:
    type: ClusterIP

ingress:
  enabled: true
  hosts:
    - "website.com"
  annotations: 
    kubernetes.io/ingress.class: "nginx"

hub:
  baseUrl: /${class-section}
  extraConfig:
    myConfig.py: |
      c.JupyterHub.tornado_settings = connect_timeout(0)
      c.JupyterHub.tornado_settings = request_timeout(0)
  config:
    Authenticator:
      admin_users:
      allowed_users:
    DummyAuthenticator:
      password: 
    JupyterHub:
      authenticator_class:
singleuser:
  defaultUrl: "/lab"

scheduling:
  userScheduler:
    enabled: false
  podPriority:
    enabled: false
  userPlaceholder:
    enabled: true
    replicas: 2
  userPods:
    nodeAffinity:
      matchNodePurpose: require

cull:
  enabled: true
  timeout: 3600
  every: 3600

manics · May 28, 2021, 9:36pm

Have you seen the debugging page in the Z2JH docs? It’d be useful to see the details of your failing pod.
https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/debug.html

rk92 · May 28, 2021, 9:50pm

Here’s the debug log output of my most recent attempt to start up an environment.

kubectl logs jupyter-user1 -n phys400      
[W 2021-05-28 21:47:55.998 SingleUserNotebookApp configurable:190] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2021-05-28 21:47:56.014 SingleUserNotebookApp notebookapp:1593] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2021-05-28 21:47:57.727 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-05-28 21:47:57.727 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to 
update your config before our next release.
[W 2021-05-28 21:47:57.727 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to 
update your config before our next release.
[W 2021-05-28 21:47:57.727 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to 
update your config before our next release.
[I 2021-05-28 21:47:57.787 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
[I 2021-05-28 21:47:57.787 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
Patching auth into jupyter_server.base.handlers.JupyterHandler(jupyter_server.base.handlers.AuthenticatedHandler) -> JupyterHandler(jupyterhub.singleuser.mixins.HubAuthenticatedHandler, jupyter_server.base.handlers.AuthenticatedHandler)
[I 2021-05-28 21:47:57.793 SingleUserNotebookApp mixins:576] Starting jupyterhub-singleuser server version 1.4.1
[E 2021-05-28 21:48:17.802 SingleUserNotebookApp mixins:449] Failed to connect to my Hub at http://hub:8081/phys400/hub/api (attempt 1/5). Is it running?
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 447, in check_hub_version
        resp = await client.fetch(self.hub_api_url)
    tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting

rk92 · June 4, 2021, 8:02pm

@manics

I’ve still been running into the 30 second timeout issue and my logs look the same

[W 2021-06-04 19:59:07.037 SingleUserNotebookApp configurable:190] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2021-06-04 19:59:07.052 SingleUserNotebookApp notebookapp:1593] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2021-06-04 19:59:08.649 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-06-04 19:59:08.649 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your 
config before our next release.
[W 2021-06-04 19:59:08.649 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your 
config before our next release.
[W 2021-06-04 19:59:08.649 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your 
config before our next release.
[I 2021-06-04 19:59:08.656 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
[I 2021-06-04 19:59:08.657 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
Patching auth into jupyter_server.base.handlers.JupyterHandler(jupyter_server.base.handlers.AuthenticatedHandler) -> JupyterHandler(jupyterhub.singleuser.mixins.HubAuthenticatedHandler, jupyter_server.base.handlers.AuthenticatedHandler)
[I 2021-06-04 19:59:08.715 SingleUserNotebookApp mixins:576] Starting jupyterhub-singleuser server version 1.4.1
[E 2021-06-04 19:59:28.728 SingleUserNotebookApp mixins:449] Failed to connect to my Hub at http://hub:8081/phys101/hub/api (attempt 1/5). Is it running?
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 447, in check_hub_version
        resp = await client.fetch(self.hub_api_url)
    tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting

If I go to the webpage the loading bar is stuck at blue about 95% done and the message just says 2021-06-04T19:59:05Z [Normal] Started container notebook

Here is the rest of the event log from that page.

Event log
Server requested
2021-06-04T19:58:47Z [Normal] Successfully assigned phys101/jupyter-admin1 to gke-tf-jh-cluster-user-pool-40bd378f-sjj4
2021-06-04T19:58:55Z [Normal] AttachVolume.Attach succeeded for volume "pvc-03c4e983-363e-401f-8616-e7df5c64fdac"
2021-06-04T19:59:04Z [Normal] Container image "jupyterhub/k8s-network-tools:0.11.1" already present on machine
2021-06-04T19:59:05Z [Normal] Created container block-cloud-metadata
2021-06-04T19:59:05Z [Normal] Started container block-cloud-metadata
2021-06-04T19:59:05Z [Normal] Pulling image "jupyter/datascience-notebook:latest"
2021-06-04T19:59:05Z [Normal] Successfully pulled image "jupyter/datascience-notebook:latest"
2021-06-04T19:59:05Z [Normal] Created container notebook
2021-06-04T19:59:05Z [Normal] Started container notebook
Server requested
2021-06-04T19:58:47Z [Normal] Successfully assigned phys101/jupyter-admin1 to gke-tf-jh-cluster-user-pool-40bd378f-sjj4
2021-06-04T19:58:55Z [Normal] AttachVolume.Attach succeeded for volume "pvc-03c4e983-363e-401f-8616-e7df5c64fdac"
2021-06-04T19:59:04Z [Normal] Container image "jupyterhub/k8s-network-tools:0.11.1" already present on machine
2021-06-04T19:59:05Z [Normal] Created container block-cloud-metadata
2021-06-04T19:59:05Z [Normal] Started container block-cloud-metadata
2021-06-04T19:59:05Z [Normal] Pulling image "jupyter/datascience-notebook:latest"
2021-06-04T19:59:05Z [Normal] Successfully pulled image "jupyter/datascience-notebook:latest"
2021-06-04T19:59:05Z [Normal] Created container notebook
2021-06-04T19:59:05Z [Normal] Started container notebook

I see the pod come up, go into a running 1/1 state then it disappears.

At one point I did see this error below and commented out the userPods.nodeAffinity which seems to have removed the message.

0/3 nodes are available: 1 pod has unbound immediate persistentvolumeclaims, 2 node(s) didn't match node selector.

Here’s the rest of my config YAML.

proxy:
  secretToken:
  service:
    type: NodePort

ingress:
  enabled: true
  pathSuffix: "/*"
  hosts:
    - "<DOMAIN.COM>"
  annotations: 
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "jh-lb-global-test"
    networking.gke.io/managed-certificates: "google-managed-ssl-certificate"

hub:
  baseUrl: /${class-section}
  config:
    Authenticator:
      admin_users:
        - admin1
        - admin2
      allowed_users:
        - user1
        - user2
    DummyAuthenticator:
      password: <PASSWORD>
    JupyterHub:
      authenticator_class: dummy
singleuser:
  startTimeout: 120
  profileList:
    - display_name: "Minimal environment"
      description: "To avoid too much bells and whistles: Python."
      default: true
    - display_name: "Datascience environment"
      description: "If you want the additional bells and whistles: Python, R, and Julia."
      kubespawner_override:
        image: jupyter/datascience-notebook:latest
    - display_name: "Spark environment"
      description: "The Jupyter Stacks spark image!"
      kubespawner_override:
        image: jupyter/all-spark-notebook:latest  
  memory:
    limit: 1G
    guarantee: 1G
  cpu:
    limit: .5
    guarantee: .5
  image:
    # You should replace the "latest" tag with a fixed version from:
    # https://hub.docker.com/r/jupyter/datascience-notebook/tags/
    # Inspect the Dockerfile at:
    # https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook/Dockerfile
    name: jupyter/datascience-notebook
    pullPolicy: Always
    tag: latest
  defaultUrl: "/lab"

scheduling:
  userScheduler:
    enabled: false
  podPriority:
    enabled: false
  # userPlaceholder:
  #   enabled: true
  #   replicas: 2
  userPods:
    nodeAffinity:
      matchNodePurpose: require
  # corePods:
  #   nodeAffinity:
  #     matchNodePurpose: require

cull:
  enabled: true
  timeout: 3600
  every: 3600

# prePuller:
#   hook:
#     enabled: false

manics · June 7, 2021, 7:26pm

Can you share some details about your K8S cluster? E.g. is it a standard public cloud offering, do you have full control or is it managed by a separate department, is there anything else running on the cluster and are there any customisations?

What version of Z2jh are you using, if it’s not the latest could you try 1.0.0-beta.1?

Topic		Replies	Views
JupyterHub on Kubernetes: Spawn Failed for users with `timeout 3`. Increase pod creation Timeout Zero to JupyterHub on Kubernetes help-wanted	5	2611	May 1, 2023
Spawn failed: Server at http://some_ip:8888/user/a/ didn't respond in 30 seconds Zero to JupyterHub on Kubernetes jupyterhub	4	1173	February 7, 2024
Jupyterhub helm chart 3.1.0+k8s 1.27.6: User pods not spawning got TimeoutError Zero to JupyterHub on Kubernetes jupyterhub , how-to , help-wanted	4	997	October 19, 2023
Timeout while spawning notebook server Zero to JupyterHub on Kubernetes jupyterlab , jupyterhub	1	173	May 4, 2024
Server never showed up at http://161.32.35.176:8888/user/niranjan/ after 30 seconds. Giving up Zero to JupyterHub on Kubernetes jupyterhub	4	884	February 28, 2024

Spawn failed due to 30 second timeout

Related topics