Thanks once again. I will read through the links you have provided.
Regarding TREs, would love to talk to you about what we have or had as of a week ago until I broke it. A short presentation on the topic here - http://northwest-lsc-tre.surge.sh/#/4 and easiest to contact through LinkedIn (https://www.linkedin.com/in/vvcb/) and we can set up a Teams call
Previous version (1.2) was working beautifully. Since rebuilding the cluster (with Kubernetes 1.26), that does not work either. Authentication works but things get stuck again - this time a singleuser pod is deployed but block-cloud-metadata init container is stuck in waiting. If I disable this in the helm chart, things get stuck at the next stage (the single user container itself).
I wonder this may have something to do with how aiohttp handles HTTPS_PROXY (although this variable is set to a http endpoint).
Found the following:
- https://stackoverflow.com/questions/66545715/how-do-i-use-https-proxy-with-aiohttp
- And more info about ssl_context here -https://docs.aiohttp.org/en/stable/client_reference.html
I suspect I maybe headed down the wrong rabbit hole here - but aren’t they all interesting?!
I am thinking about rewriting this section https://github.com/tomplus/kubernetes_asyncio/blob/d236c1cfa9fd8ae87f61adfe9d951239577f0fbf/kubernetes_asyncio/client/rest.py#L53-L65 and injecting it into the hub pod using the extraFiles key. (I should not be allowed to play with this stuff!).
A miore detailed error log below:
LogEntry
task: <Task pending name='Task-305' coro=<RequestHandler._execute() running at /usr/local/lib/python3.9/site-packages/tornado/web.py:1713> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342baabfd0>()]> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /usr/local/lib/python3.9/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <coroutine object shared_client.<locals>.close_client_task at 0x7f342bb5f4c0>
task: <Task pending name='Task-243' coro=<KubeSpawner._start_reflector.<locals>.catch_reflector_start() running at /usr/local/lib/python3.9/site-packages/kubespawner/spawner.py:2400> wait_for=<Task pending name='Task-242' coro=<ResourceReflector.start() running at /usr/local/lib/python3.9/site-packages/kubespawner/reflector.py:383> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342bb297c0>()]> cb=[<TaskWakeupMethWrapper object at 0x7f342bb67520>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-242' coro=<ResourceReflector.start() running at /usr/local/lib/python3.9/site-packages/kubespawner/reflector.py:383> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342bb297c0>()]> cb=[<TaskWakeupMethWrapper object at 0x7f342bb67520>()]>
Task was destroyed but it is pending!
SystemExit: 1
sys.exit(1)
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2403, in catch_reflector_start
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.9/asyncio/events.py", line 80, in _run
handle._run()
File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
self._run_once()
File "/usr/local/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
self.asyncio_loop.run_forever()
File "/usr/local/lib/python3.9/site-packages/tornado/platform/asyncio.py", line 215, in start
loop.start()
File "/usr/local/lib/python3.9/site-packages/jupyterhub/app.py", line 3313, in launch_instance
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
raise ClientHttpProxyError(
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
_, proto = await self._create_proxy_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
conn = await self._connector.connect(
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
r = await self.pool_manager.request(**args)
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
return (await self.request("GET", url,
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
response_data = await self.request(
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
initial_resources_raw = await list_method(**kwargs)
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
await self._list_and_update()
await self.pod_reflector.first_load_future
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
status = await spawner.poll()
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
await self.stop(spawner.name)
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
await f
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2400, in catch_reflector_start
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
asyncio.exceptions.TimeoutError: Could not create pod landerhub-prd/jupyter-single-user-pod-name
raise asyncio.TimeoutError(fail_message)
File "/usr/local/lib/python3.9/site-packages/jupyterhub/utils.py", line 236, in exponential_backoff
await exponential_backoff(
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2663, in _start
url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 747, in spawn
raise e
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 850, in spawn
await spawn_future
File "/usr/local/lib/python3.9/site-packages/jupyterhub/handlers/base.py", line 961, in finish_user_spawn
Traceback (most recent call last):
future: <Task finished name='Task-241' coro=<KubeSpawner._start_reflector.<locals>.catch_reflector_start() done, defined at /usr/local/lib/python3.9/site-packages/kubespawner/spawner.py:2398> exception=SystemExit(1)>
Task exception was never retrieved
task: <Task pending name='Task-239' coro=<shared_client.<locals>.close_client_task() running at /usr/local/lib/python3.9/site-packages/kubespawner/clients.py:58> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f342c406fd0>()]>>
Task was destroyed but it is pending!
aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
raise ClientHttpProxyError(
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
_, proto = await self._create_proxy_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
conn = await self._connector.connect(
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
r = await self.pool_manager.request(**args)
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
return (await self.request("GET", url,
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
response_data = await self.request(
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
initial_resources_raw = await list_method(**kwargs)
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
await self._list_and_update()
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
await self.pod_reflector.first_load_future
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
status = await spawner.poll()
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
await self.stop(spawner.name)
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
await f
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 2400, in catch_reflector_start
Traceback (most recent call last):
[E 2023-03-11 12:42:03.019 JupyterHub spawner:2402] Reflector for pods failed to start.
aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
raise ClientHttpProxyError(
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
_, proto = await self._create_proxy_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
conn = await self._connector.connect(
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
r = await self.pool_manager.request(**args)
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
return (await self.request("GET", url,
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
response_data = await self.request(
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
initial_resources_raw = await list_method(**kwargs)
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
await self._list_and_update()
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
await self.pod_reflector.first_load_future
File "/usr/local/lib/python3.9/site-packages/kubespawner/spawner.py", line 213, in async_method
status = await spawner.poll()
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 934, in stop
await self.stop(spawner.name)
File "/usr/local/lib/python3.9/site-packages/jupyterhub/user.py", line 840, in spawn
Traceback (most recent call last):
[E 2023-03-11 12:42:03.019 JupyterHub user:842] Failed to cleanup single-user-name's server that failed to start
aiohttp.client_exceptions.ClientHttpProxyError: 503, message='Service Unavailable', url=URL('http://proxy-url:800/')
raise ClientHttpProxyError(
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 1288, in _create_proxy_connection
_, proto = await self._create_proxy_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 905, in _create_connection
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
conn = await self._connector.connect(
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
r = await self.pool_manager.request(**args)
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 177, in request
return (await self.request("GET", url,
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
response_data = await self.request(
File "/usr/local/lib/python3.9/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
initial_resources_raw = await list_method(**kwargs)
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 228, in _list_and_update
await self._list_and_update()
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
Traceback (most recent call last):
[E 2023-03-11 12:42:03.005 JupyterHub reflector:385] Initial list of pods failed
File "/usr/local/lib/python3.9/site-packages/kubespawner/reflector.py", line 383, in start
Check the single-user server's logs for hints at what needs fixing.
or it crashed due to a configuration issue.
2. The server didn't finish starting,
to a number of seconds that is enough for spawners to finish starting.
To fix: increase `Spawner.start_timeout` configuration
1. Everything is working, but it took too long.
Common causes of this timeout, and debugging tips:
[W 2023-03-11 12:41:43.860 JupyterHub user:824] single-user-name's server failed to start in 900 seconds, giving up.
[I 2023-03-11 12:41:40.858 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:36.729 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:29.136 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:25.377 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:22.160 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:18.870 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:15.674 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:14.462 JupyterHub log:186] 200 GET /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fsessions%3F1678538473615 (@10.244.4.1) 1.26ms
[I 2023-03-11 12:41:14.172 JupyterHub log:186] 302 GET /landerhub/hub/user/single-user-name/api/sessions?1678538473615 -> /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fsessions%3F1678538473615 (@10.244.4.1) 1.40ms
[I 2023-03-11 12:41:13.836 JupyterHub log:186] 302 GET /landerhub/user/single-user-name/api/sessions?1678538473615 -> /landerhub/hub/user/single-user-name/api/sessions?1678538473615 (@10.244.4.1) 0.58ms
[I 2023-03-11 12:41:12.584 JupyterHub log:186] 200 GET /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 4.10ms
[I 2023-03-11 12:41:12.581 JupyterHub pages:394] single-user-name is pending spawn
[I 2023-03-11 12:41:12.489 JupyterHub log:186] 302 POST /landerhub/hub/spawn/single-user-name -> /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 58.42ms
[I 2023-03-11 12:41:12.488 JupyterHub spawner:2469] Attempting to create pod jupyter-single-user-pod-name, with timeout 3
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:159] Successfully mounted landerhub-common to /home/jovyan/shared_readonly.
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:126] Successfully mounted landerhub-vc-sd to /home/jovyan/landerhub-vc-sd.
[I 2023-03-11 12:41:12.487 JupyterHub jupyterhub_config_custom:99] Attempting to mount {'volume_mounts': [{'mountPath': '/home/jovyan/landerhub-vc-sd', 'name': 'landerhub-vc-sd', 'readOnly': False}], 'volumes': [{'name': 'landerhub-vc-sd', 'persistentVolumeClaim': {'claimName': 'pvc-landerhub-vc-sd'}}]}...
[I 2023-03-11 12:41:12.455 JupyterHub provider:651] Creating oauth client jupyterhub-single-user-name
[I 2023-03-11 12:41:07.916 JupyterHub log:186] 200 GET /landerhub/hub/spawn/single-user-name (single-user-name@10.244.4.1) 11.37ms
[I 2023-03-11 12:41:07.909 JupyterHub jupyterhub_config_custom:48] Workspace colorectal_2ww_jupyter expired on 2022-12-31. Consider removing it from config.
[I 2023-03-11 12:41:03.861 JupyterHub log:186] 200 GET /landerhub/hub/spawn-pending/single-user-name (single-user-name@10.244.4.1) 21.28ms
[W 2023-03-11 12:40:24.317 JupyterHub log:186] 404 GET /landerhub/hub/api/users/single-user-name/server/progress (single-user-name@10.244.4.1) 15.22ms
[I 2023-03-11 12:40:20.214 JupyterHub log:186] 200 GET /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fkernelspecs%3F1678538419337 (@10.244.4.1) 23.54ms
[I 2023-03-11 12:40:19.893 JupyterHub log:186] 302 GET /landerhub/hub/user/single-user-name/api/kernelspecs?1678538419337 -> /landerhub/hub/login?next=%2Flanderhub%2Fhub%2Fuser%2Fsingle-user-name%40email.com%2Fapi%2Fkernelspecs%3F1678538419337 (@10.244.4.1) 1.51ms
[I 2023-03-11 12:40:19.606 JupyterHub log:186] 302 GET /landerhub/user/single-user-name/api/kernelspecs?1678538419337 -> /landerhub/hub/user/single-user-name/api/kernelspecs?1678538419337 (@10.244.4.1) 0.58ms
[I 2023-03-11 12:39:26.279 JupyterHub log:186] 200 GET /landerhub/hub/api/users?state=[secret] (jupyterhub-idle-culler@::1) 7.42ms
[I 2023-03-11 12:39:26.271 JupyterHub log:186] 200 GET /landerhub/hub/api/ (jupyterhub-idle-culler@::1) 11.68ms
[I 2023-03-11 12:39:26.172 JupyterHub app:3162] JupyterHub is now running, internal Hub API at http://hub:8081/landerhub/hub/
[I 2023-03-11 12:39:26.171 JupyterHub app:3113] Adding external service dask-gateway
[I 2023-03-11 12:39:26.167 JupyterHub service:133] Spawning python3 -m jupyterhub_idle_culler --url=http://localhost:8081/landerhub/hub/api --timeout=3600 --cull-every=600 --concurrency=10
[I 2023-03-11 12:39:26.166 JupyterHub service:385] Starting service 'jupyterhub-idle-culler': ['python3', '-m', 'jupyterhub_idle_culler', '--url=http://localhost:8081/landerhub/hub/api', '--timeout=3600', '--cull-every=600', '--concurrency=10']
[I 2023-03-11 12:39:26.165 JupyterHub app:3104] Starting managed service jupyterhub-idle-culler
[I 2023-03-11 12:39:26.165 JupyterHub app:3095] Private Hub API connect url http://hub:8081/landerhub/hub/
[I 2023-03-11 12:39:26.165 JupyterHub app:3093] Hub API listening on http://:8081/landerhub/hub/
[I 2023-03-11 12:39:26.163 JupyterHub app:3057] Not starting proxy
[W 2023-03-11 12:39:26.148 JupyterHub app:2594] Deleting OAuth client jupyterhub-single-user-name
[I 2023-03-11 12:39:26.147 JupyterHub app:2844] Initialized 0 spawners in 0.002 seconds
[I 2023-03-11 12:39:26.099 JupyterHub app:1934] Not using allowed_users. Any authenticated user will be allowed.
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-3.0.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Spawner: kubespawner.spawner.KubeSpawner-4.2.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2805] Using Authenticator: oauthenticator.azuread.AzureAdOAuthenticator-15.1.0
[I 2023-03-11 12:39:26.043 JupyterHub app:2775] Running JupyterHub version 3.0.0
Loading /usr/local/etc/jupyterhub/jupyterhub_config.d config: jupyterhub_config_custom.py
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
Loading /usr/local/etc/jupyterhub/secret/values.yaml