After OAuthenticator successfully returned the user, kubespawner stuck at creating PVC

Thanks once again. I will read through the links you have provided.

Regarding TREs, would love to talk to you about what we have or had as of a week ago until I broke it. A short presentation on the topic here - http://northwest-lsc-tre.surge.sh/#/4 and easiest to contact through LinkedIn (https://www.linkedin.com/in/vvcb/) and we can set up a Teams call

Previous version (1.2) was working beautifully. Since rebuilding the cluster (with Kubernetes 1.26), that does not work either. Authentication works but things get stuck again - this time a singleuser pod is deployed but block-cloud-metadata init container is stuck in waiting. If I disable this in the helm chart, things get stuck at the next stage (the single user container itself).

I wonder this may have something to do with how aiohttp handles HTTPS_PROXY (although this variable is set to a http endpoint).

Found the following:

I suspect I maybe headed down the wrong rabbit hole here - but aren’t they all interesting?! :slight_smile:

I am thinking about rewriting this section https://github.com/tomplus/kubernetes_asyncio/blob/d236c1cfa9fd8ae87f61adfe9d951239577f0fbf/kubernetes_asyncio/client/rest.py#L53-L65 and injecting it into the hub pod using the extraFiles key. (I should not be allowed to play with this stuff!).

A miore detailed error log below:

LogEntry
task: <Task pending name='Task-305' coro=<RequestHandler._execute() running at /usr/local/lib/python3.9/site-packages/tornado/web.py:1713> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342baabfd0>()]> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /usr/local/lib/python3.9/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <coroutine object shared_client.<locals>.close_client_task at 0x7f342bb5f4c0>
task: <Task pending name='Task-243' coro=<KubeSpawner._start_reflector.<locals>.catch_reflector_start() running at /usr/local/lib/python3.9/site-packages/kubespawner/spawner.py:2400> wait_for=<Task pending name='Task-242' coro=<ResourceReflector.start() running at /usr/local/lib/python3.9/site-packages/kubespawner/reflector.py:383> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342bb297c0>()]> cb=[<TaskWakeupMethWrapper object at 0x7f342bb67520>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-242' coro=<ResourceReflector.start() running at /usr/local/lib/python3.9/site-packages/kubespawner/reflector.py:383> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342bb297c0>()]> cb=[<TaskWakeupMethWrapper object at 0x7f342bb67520>()]>
Task was destroyed but it is pending!
SystemExit: 1
    sys.exit(1)
  File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2403, in catch_reflector_start
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.9/asyncio/events.py", line 80, in _run
    handle._run()
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    self._run_once()
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self.asyncio_loop.run_forever()
  File "/usr/local/lib/python3.9/site-packages/tornado/platform/asyncio.py", line 215, in start
    loop.start()
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/app.py", line 3313, in launch_instance
Traceback (most recent call last):

During handling of the above exception, another exception occurred:

aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
    raise ClientHttpProxyError(
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
    _, proto = await self._create_proxy_connection(req, traces, timeout)
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
    proto = await self._create_connection(req, traces, timeout)
  File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
    conn = await self._connector.connect(
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
    r = await self.pool_manager.request(**args)
  File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
    return (await self.request("GET", url,
  File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
    response_data = await self.request(
  File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
    initial_resources_raw = await list_method(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
    await self._list_and_update()
    await self.pod_reflector.first_load_future
  File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
    status = await spawner.poll()
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
    await self.stop(spawner.name)
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
    await f
  File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2400, in catch_reflector_start
Traceback (most recent call last):

During handling of the above exception, another exception occurred:

asyncio.exceptions.TimeoutError: Could not create pod landerhub-prd/jupyter-single-user-pod-name
    raise asyncio.TimeoutError(fail_message)
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/utils.py", line 236, in exponential_backoff
    await exponential_backoff(
  File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2663, in _start
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 747, in spawn
    raise e
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 850, in spawn
    await spawn_future
  File "/usr/local/lib/python3.9/site-packages/jupyterhub/handlers/base.py", line 961, in finish_user_spawn
Traceback (most recent call last):
future: <Task finished name='Task-241' coro=<KubeSpawner._start_reflector.<locals>.catch_reflector_start() done, defined at /usr/local/lib/python3.9/site-packages/kubespawner/spawner.py:2398> exception=SystemExit(1)>
Task exception was never retrieved
task: <Task pending name='Task-239' coro=<shared_client.<locals>.close_client_task() running at /usr/local/lib/python3.9/site-packages/kubespawner/clients.py:58> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342c406fd0>()]>>
Task was destroyed but it is pending!
    
    aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
        raise ClientHttpProxyError(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
        _, proto = await self._create_proxy_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
        proto = await self._create_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
        conn = await self._connector.connect(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
        r = await self.pool_manager.request(**args)
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
        return (await self.request("GET", url,
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
        response_data = await self.request(
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
        initial_resources_raw = await list_method(**kwargs)
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
        await self._list_and_update()
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
        await self.pod_reflector.first_load_future
      File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
        status = await spawner.poll()
      File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
        await self.stop(spawner.name)
      File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
        await f
      File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2400, in catch_reflector_start
    Traceback (most recent call last):
[E 2023-03-11 12:42:03.019 JupyterHub spawner:2402] Reflector for pods failed to start.
    
    aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
        raise ClientHttpProxyError(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
        _, proto = await self._create_proxy_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
        proto = await self._create_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
        conn = await self._connector.connect(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
        r = await self.pool_manager.request(**args)
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
        return (await self.request("GET", url,
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
        response_data = await self.request(
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
        initial_resources_raw = await list_method(**kwargs)
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
        await self._list_and_update()
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
        await self.pod_reflector.first_load_future
      File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
        status = await spawner.poll()
      File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
        await self.stop(spawner.name)
      File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
    Traceback (most recent call last):
[E 2023-03-11 12:42:03.019 JupyterHub user:842] Failed to cleanup single-user-name's server that failed to start
    
    aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
        raise ClientHttpProxyError(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
        _, proto = await self._create_proxy_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
        proto = await self._create_connection(req, traces, timeout)
      File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
        conn = await self._connector.connect(
      File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
        r = await self.pool_manager.request(**args)
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
        return (await self.request("GET", url,
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
        response_data = await self.request(
      File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
        initial_resources_raw = await list_method(**kwargs)
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
        await self._list_and_update()
      File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
    Traceback (most recent call last):
[E 2023-03-11 12:42:03.005 JupyterHub reflector:385] Initial list of pods failed
  File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
    
       Check the single-user server's logs for hints at what needs fixing.
       or it crashed due to a configuration issue.
    2. The server didn't finish starting,
       to a number of seconds that is enough for spawners to finish starting.
       To fix: increase `Spawner.start_timeout` configuration
    1. Everything is working, but it took too long.
    
    Common causes of this timeout, and debugging tips:
    
[W 2023-03-11 12:41:43.860 JupyterHub user:824] single-user-name's server failed to start in 900 seconds, giving up.
[I 2023-03-11 12:41:40.858 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:36.729 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:29.136 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:25.377 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:22.160 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:18.870 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:15.674 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:14.462 JupyterHub log:186] 200 GET /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fsessions%3F1678538473615 (@10.244.4.1) 1.26ms
[I 2023-03-11 12:41:14.172 JupyterHub log:186] 302 GET /landerhub/hub/user/single-user-name/api/sessions?1678538473615 -> /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fsessions%3F1678538473615 (@10.244.4.1) 1.40ms
[I 2023-03-11 12:41:13.836 JupyterHub log:186] 302 GET /landerhub/user/single-user-name/api/sessions?1678538473615 -> /landerhub/hub/user/single-user-name/api/sessions?1678538473615 (@10.244.4.1) 0.58ms
[I 2023-03-11 12:41:12.584 JupyterHub log:186] 200 GET /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 4.10ms
[I 2023-03-11 12:41:12.581 JupyterHub pages:394] single-user-name is pending spawn
[I 2023-03-11 12:41:12.489 JupyterHub log:186] 302 POST /landerhub/hub/spawn/single-user-name -> /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 58.42ms
[I 2023-03-11 12:41:12.488 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:159] Successfully mounted landerhub-common to /home/jovyan/shared_readonly.
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:126] Successfully mounted landerhub-vc-sd to /home/jovyan/landerhub-vc-sd.
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:99] Attempting to mount {'volume_mounts': [{'mountPath': '/home/jovyan/landerhub-vc-sd', 'name': 'landerhub-vc-sd', 'readOnly': False}], 'volumes': [{'name': 'landerhub-vc-sd', 'persistentVolumeClaim': {'claimName': 'pvc-landerhub-vc-sd'}}]}...
[I 2023-03-11 12:41:12.455 JupyterHub provider:651] Creating oauth client jupyterhub-single-user-name
[I 2023-03-11 12:41:07.916 JupyterHub log:186] 200 GET /landerhub/hub/spawn/single-user-name (single-user-name@10.244.4.1) 11.37ms
[I 2023-03-11 12:41:07.909 JupyterHub jupyterhub_config_custom:48] Workspace colorectal_2ww_jupyter expired on 2022-12-31. Consider removing it from config.
[I 2023-03-11 12:41:03.861 JupyterHub log:186] 200 GET /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 21.28ms
[W 2023-03-11 12:40:24.317 JupyterHub log:186] 404 GET /landerhub/hub/api/users/single-user-name/server/progress (single-user-name@10.244.4.1) 15.22ms
[I 2023-03-11 12:40:20.214 JupyterHub log:186] 200 GET /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fkernelspecs%3F1678538419337 (@10.244.4.1) 23.54ms
[I 2023-03-11 12:40:19.893 JupyterHub log:186] 302 GET /landerhub/hub/user/single-user-name/api/kernelspecs?1678538419337 -> /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fkernelspecs%3F1678538419337 (@10.244.4.1) 1.51ms
[I 2023-03-11 12:40:19.606 JupyterHub log:186] 302 GET /landerhub/user/single-user-name/api/kernelspecs?1678538419337 -> /landerhub/hub/user/single-user-name/api/kernelspecs?1678538419337 (@10.244.4.1) 0.58ms
[I 2023-03-11 12:39:26.279 JupyterHub log:186] 200 GET /landerhub/hub/api/users?state=[secret] (jupyterhub-idle-culler@::1) 7.42ms
[I 2023-03-11 12:39:26.271 JupyterHub log:186] 200 GET /landerhub/hub/api/ (jupyterhub-idle-culler@::1) 11.68ms
[I 2023-03-11 12:39:26.172 JupyterHub app:3162] JupyterHub is now running, internal Hub API at http://hub:8081/landerhub/hub/
[I 2023-03-11 12:39:26.171 JupyterHub app:3113] Adding external service dask-gateway
[I 2023-03-11 12:39:26.167 JupyterHub service:133] Spawning python3 -m jupyterhub_idle_culler --url=http://localhost:8081/landerhub/hub/api --timeout=3600 --cull-every=600 --concurrency=10
[I 2023-03-11 12:39:26.166 JupyterHub service:385] Starting service 'jupyterhub-idle-culler': ['python3', '-m', 'jupyterhub_idle_culler', '--url=http://localhost:8081/landerhub/hub/api', '--timeout=3600', '--cull-every=600', '--concurrency=10']
[I 2023-03-11 12:39:26.165 JupyterHub app:3104] Starting managed service jupyterhub-idle-culler
[I 2023-03-11 12:39:26.165 JupyterHub app:3095] Private Hub API connect url http://hub:8081/landerhub/hub/
[I 2023-03-11 12:39:26.165 JupyterHub app:3093] Hub API listening on http://:8081/landerhub/hub/
[I 2023-03-11 12:39:26.163 JupyterHub app:3057] Not starting proxy
[W 2023-03-11 12:39:26.148 JupyterHub app:2594] Deleting OAuth client jupyterhub-single-user-name
[I 2023-03-11 12:39:26.147 JupyterHub app:2844] Initialized 0 spawners in 0.002 seconds
[I 2023-03-11 12:39:26.099 JupyterHub app:1934] Not using allowed_users. Any authenticated user will be allowed.
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-3.0.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Spawner: kubespawner.spawner.KubeSpawner-4.2.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Authenticator: oauthenticator.azuread.AzureAdOAuthenticator-15.1.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2775] Running JupyterHub version 3.0.0
Loading /usr/local/etc/jupyterhub/jupyterhub_config.d config: jupyterhub_config_custom.py
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
Loading /usr/local/etc/jupyterhub/secret/values.yaml