EKS Binderhub launching server fails after successful build

I am having trouble getting my Zero-to-BinderHub instance up and running. I am currently using EKS with GCR configured as my container registry (the service account has admin credentials). When I try to build an example project, https://github.com/binder-examples/requirements, it builds and pushes the container, but the server fails to start.

Pushing image
Successfully pushed gcr.io/graceful-karma-302801/binderhub-stage-binder-2dexamples-2drequirements-55ab5c:82543a814cdd4cb125d03112a193dd4137d8751c
Built image, launching...
Launching server...
Launch attempt 1 failed, retrying...
Launch attempt 2 failed, retrying...
Launch attempt 3 failed, retrying...
Internal Server Error

Things I’ve tried:

  • Multiple different repos (all build but fail to launch)
  • Different sizes of worker nodes (t2.small, t2.medium). No change, and kubectl top nodes shows plenty of headroom
  • Checked that the nodes are not restarting and causing network instability

Any ideas what could be causing these launches to fail? Any advice on how to find and dig into the error messages for the launch would also be helpful.

Have a look at your JupyterHub pod logs. If you also check the list of pods while the launch occurs (kubectl get pods) you should be able to view the logs of the failing pod.
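Something along these lines should do it (a rough sketch; substitute whatever namespace and pod names your deployment actually uses):

# watch pods come and go while the launch is in progress
kubectl get pods --namespace=<namespace> --watch

# once the jupyter-... user pod appears, grab its logs and events before it is cleaned up
kubectl logs --namespace=<namespace> <jupyter-pod-name>
kubectl describe pod --namespace=<namespace> <jupyter-pod-name>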

Thanks for the advice! It helped me dig in a bit further. I also tried to follow the steps for app debugging here. The pods are launching but they never get past the Pending state. The pod below (marked with *) will restart three times before the launch fails.

kubectl get pods --all-namespaces
NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE
kube-system   aws-node-7k5dz                                        1/1     Running   0          32m
kube-system   aws-node-l2dxk                                        1/1     Running   0          32m
kube-system   aws-node-ttmgq                                        1/1     Running   0          32m
kube-system   binder-854784654c-9v2rn                               1/1     Running   0          27m
kube-system   binderhub-image-cleaner-fsn5h                         1/1     Running   0          31m
kube-system   binderhub-image-cleaner-hzlk5                         1/1     Running   0          31m
kube-system   binderhub-image-cleaner-jnnlc                         1/1     Running   0          31m
kube-system   coredns-66bc8b7b7b-8c88m                              1/1     Running   0          35m
kube-system   coredns-66bc8b7b7b-rcg9v                              1/1     Running   0          35m
kube-system   hub-8546d44db4-2k7k9                                  1/1     Running   0          32m
*kube-system   jupyter-binder-2dexamples-2drequirements-2dftbxioh8   0/1     Pending   0          7s
kube-system   kube-proxy-2s2ln                                      1/1     Running   0          32m
kube-system   kube-proxy-kvcf8                                      1/1     Running   0          32m
kube-system   kube-proxy-wdrqr                                      1/1     Running   0          32m
kube-system   proxy-699bd568c7-8rbls                                1/1     Running   0          32m
kube-system   user-scheduler-78456bb65c-99t2r                       1/1     Running   0          32m
kube-system   user-scheduler-78456bb65c-n6rfn                       1/1     Running   0          32m

When I try to access the logs of this pod to see what's going on, I hit some brick walls:

kubectl logs jupyter-binder-2dexamples-2drequirements-2dftbxioh8
Error from server (NotFound): pods "jupyter-binder-2dexamples-2drequirements-2dftbxioh8" not found

and

kubectl describe pods jupyter-binder-2dexamples-2drequirements-2dftbxioh8
Error from server (NotFound): pods "jupyter-binder-2dexamples-2drequirements-2dftbxioh8" not found
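(In hindsight, part of the brick wall is probably just scoping: the pods above live in the kube-system namespace while kubectl defaults to the current context's namespace, and the failed pod is also deleted shortly after the launch gives up. Roughly:)

# point kubectl at the right namespace and catch the pod before it is deleted
kubectl logs --namespace=kube-system jupyter-binder-2dexamples-2drequirements-2dftbxioh8
kubectl describe pod --namespace=kube-system jupyter-binder-2dexamples-2drequirements-2dftbxioh8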

Events look fine AFAICT:

LAST SEEN   TYPE     REASON                    OBJECT                                          MESSAGE
41m         Normal   Starting                  node/ip-10-0-1-60.us-east-2.compute.internal    Starting kubelet.
41m         Normal   NodeHasSufficientMemory   node/ip-10-0-1-60.us-east-2.compute.internal    Node ip-10-0-1-60.us-east-2.compute.internal status is now: NodeHasSufficientMemory
41m         Normal   NodeHasNoDiskPressure     node/ip-10-0-1-60.us-east-2.compute.internal    Node ip-10-0-1-60.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
41m         Normal   NodeHasSufficientPID      node/ip-10-0-1-60.us-east-2.compute.internal    Node ip-10-0-1-60.us-east-2.compute.internal status is now: NodeHasSufficientPID
41m         Normal   NodeAllocatableEnforced   node/ip-10-0-1-60.us-east-2.compute.internal    Updated Node Allocatable limit across pods
41m         Normal   RegisteredNode            node/ip-10-0-1-60.us-east-2.compute.internal    Node ip-10-0-1-60.us-east-2.compute.internal event: Registered Node ip-10-0-1-60.us-east-2.compute.internal in Controller
40m         Normal   Starting                  node/ip-10-0-1-60.us-east-2.compute.internal    Starting kube-proxy.
40m         Normal   NodeReady                 node/ip-10-0-1-60.us-east-2.compute.internal    Node ip-10-0-1-60.us-east-2.compute.internal status is now: NodeReady
41m         Normal   Starting                  node/ip-10-0-3-203.us-east-2.compute.internal   Starting kubelet.
41m         Normal   NodeHasSufficientMemory   node/ip-10-0-3-203.us-east-2.compute.internal   Node ip-10-0-3-203.us-east-2.compute.internal status is now: NodeHasSufficientMemory
41m         Normal   NodeHasNoDiskPressure     node/ip-10-0-3-203.us-east-2.compute.internal   Node ip-10-0-3-203.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
41m         Normal   NodeHasSufficientPID      node/ip-10-0-3-203.us-east-2.compute.internal   Node ip-10-0-3-203.us-east-2.compute.internal status is now: NodeHasSufficientPID
41m         Normal   NodeAllocatableEnforced   node/ip-10-0-3-203.us-east-2.compute.internal   Updated Node Allocatable limit across pods
41m         Normal   RegisteredNode            node/ip-10-0-3-203.us-east-2.compute.internal   Node ip-10-0-3-203.us-east-2.compute.internal event: Registered Node ip-10-0-3-203.us-east-2.compute.internal in Controller
40m         Normal   Starting                  node/ip-10-0-3-203.us-east-2.compute.internal   Starting kube-proxy.
40m         Normal   NodeReady                 node/ip-10-0-3-203.us-east-2.compute.internal   Node ip-10-0-3-203.us-east-2.compute.internal status is now: NodeReady
41m         Normal   Starting                  node/ip-10-0-3-235.us-east-2.compute.internal   Starting kubelet.
41m         Normal   NodeHasSufficientMemory   node/ip-10-0-3-235.us-east-2.compute.internal   Node ip-10-0-3-235.us-east-2.compute.internal status is now: NodeHasSufficientMemory
41m         Normal   NodeHasNoDiskPressure     node/ip-10-0-3-235.us-east-2.compute.internal   Node ip-10-0-3-235.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
41m         Normal   NodeHasSufficientPID      node/ip-10-0-3-235.us-east-2.compute.internal   Node ip-10-0-3-235.us-east-2.compute.internal status is now: NodeHasSufficientPID
41m         Normal   NodeAllocatableEnforced   node/ip-10-0-3-235.us-east-2.compute.internal   Updated Node Allocatable limit across pods
41m         Normal   RegisteredNode            node/ip-10-0-3-235.us-east-2.compute.internal   Node ip-10-0-3-235.us-east-2.compute.internal event: Registered Node ip-10-0-3-235.us-east-2.compute.internal in Controller
40m         Normal   Starting                  node/ip-10-0-3-235.us-east-2.compute.internal   Starting kube-proxy.
40m         Normal   NodeReady                 node/ip-10-0-3-235.us-east-2.compute.internal   Node ip-10-0-3-235.us-east-2.compute.internal status is now: NodeReady

The pod is also never listed in kubectl top pods when running the Kubernetes metrics-server.

Any other advice on how to explore this problem further?

Could you look at your JupyterHub hub logs? Something like

kubectl logs hub-8546d44db4-2k7k9

or

kubectl logs deploy/hub

If that doesn't show the error, you could also try a tool that tails pod logs: you can give it a pod regex and it'll automatically display the logs as soon as a matching pod starts.
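A kubectl-only approximation, assuming the chart's default component=singleuser-server label on user pods, would be something like:

# follow the logs of any currently-running user pods in the deployment's namespace
kubectl logs --namespace=kube-system -l component=singleuser-server -f --tail=50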

Yes, I was able to get those! On the t2.smalls the error turned out to be the size of the nodes: when I check kubectl describe on the pending pod, the events show Insufficient memory.

Events:
Type     Reason            Age                  From                      Message
----     ------            ----                 ----                      -------
Warning  FailedScheduling  89s (x3 over 2m46s)  binderhub-user-scheduler  0/3 nodes are available: 3 Insufficient memory.
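(For anyone hitting the same thing: you can compare a node's allocatable memory against what is already requested on it with something like the following; the node name is just an example taken from the earlier events.)

# "Allocatable" is what the scheduler can hand out; "Allocated resources" is what pods have already requested
kubectl describe node ip-10-0-1-60.us-east-2.compute.internal | grep -E -A 8 "Allocatable:|Allocated resources:"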

I switched over to t2.larges just to make sure I had enough room for the next test. The pods all get out of the Pending stage this time, but the image pull is still failing.

kubectl get pods --all-namespaces
NAMESPACE     NAME                                                  READY   STATUS             RESTARTS   AGE
kube-system   aws-node-mkqdt                                        1/1     Running            0          22m
kube-system   aws-node-mwdrr                                        1/1     Running            0          22m
kube-system   aws-node-vw4f9                                        1/1     Running            0          22m
kube-system   binder-56c97d9545-rvbsk                               1/1     Running            0          18m
kube-system   binderhub-image-cleaner-4mt28                         1/1     Running            0          21m
kube-system   binderhub-image-cleaner-jjxz2                         1/1     Running            0          21m
kube-system   binderhub-image-cleaner-jkw6k                         1/1     Running            0          21m
kube-system   coredns-66bc8b7b7b-vzvnh                              1/1     Running            0          26m
kube-system   coredns-66bc8b7b7b-z4c4z                              1/1     Running            0          26m
kube-system   hub-8546d44db4-2ct2l                                  1/1     Running            0          22m
kube-system   jupyter-binder-2dexamples-2drequirements-2ds93l6m1k   0/1     ImagePullBackOff   0          4m9s
kube-system   kube-proxy-jps5s                                      1/1     Running            0          22m
kube-system   kube-proxy-m2v97                                      1/1     Running            0          22m
kube-system   kube-proxy-m67jv                                      1/1     Running            0          22m
kube-system   kubernetes-dashboard-69574fd85c-gx5hw                 1/1     Running            0          22m
kube-system   metrics-server-7d7bbd97fc-nxpw6                       1/1     Running            0          22m
kube-system   proxy-699bd568c7-xh2pb                                1/1     Running            0          22m
kube-system   user-scheduler-78456bb65c-852rp                       1/1     Running            0          22m
kube-system   user-scheduler-78456bb65c-dnrp2                       1/1     Running            0          22m

Checking the logs again, I found the error occurring in KubeSpawner. It's mostly INFO lines until the timeout error, but I've attached the whole log from the first launch in case it's helpful.

kubectl logs jupyter-binder-2dexamples-2drequirements-2dt4p4cf70 -c notebook --namespace=kube-system    
Error from server (BadRequest): container "notebook" in pod "jupyter-binder-2dexamples-2drequirements-2dt4p4cf70" is waiting to start: trying and failing to pull image

and

kubectl logs hub-8546d44db4-2ct2l --namespace=kube-system
No config at /etc/jupyterhub/config/values.yaml
Loading /etc/jupyterhub/secret/values.yaml
Loading extra config: 00-binder
[I 2021-01-30 02:33:54.781 JupyterHub app:2332] Running JupyterHub version 1.2.2
[I 2021-01-30 02:33:54.781 JupyterHub app:2362] Using Authenticator: nullauthenticator.NullAuthenticator-1.0.0
[I 2021-01-30 02:33:54.781 JupyterHub app:2362] Using Spawner: builtins.BinderSpawner
[I 2021-01-30 02:33:54.781 JupyterHub app:2362] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.2.2
[I 2021-01-30 02:33:54.782 JupyterHub app:1457] Writing cookie_secret to /srv/jupyterhub/jupyterhub_cookie_secret
[I 2021-01-30 02:33:54.798 alembic.runtime.migration migration:155] Context impl SQLiteImpl.
[I 2021-01-30 02:33:54.799 alembic.runtime.migration migration:158] Will assume non-transactional DDL.
[I 2021-01-30 02:33:54.808 alembic.runtime.migration migration:517] Running stamp_revision  -> 4dc2d5a8c53c
[I 2021-01-30 02:33:54.819 alembic.runtime.migration migration:155] Context impl SQLiteImpl.
[I 2021-01-30 02:33:54.819 alembic.runtime.migration migration:158] Will assume non-transactional DDL.
[W 2021-01-30 02:33:54.895 JupyterHub app:1687] No admin users, admin interface will be unavailable.
[W 2021-01-30 02:33:54.895 JupyterHub app:1688] Add any administrative users to `c.Authenticator.admin_users` in config.
[I 2021-01-30 02:33:54.895 JupyterHub app:1717] Not using allowed_users. Any authenticated user will be allowed.
[I 2021-01-30 02:33:54.946 JupyterHub app:1840] Adding API token for service: binder
[I 2021-01-30 02:33:54.969 JupyterHub app:2399] Initialized 0 spawners in 0.002 seconds
[I 2021-01-30 02:33:54.971 JupyterHub app:2611] Not starting proxy
[I 2021-01-30 02:33:54.977 JupyterHub app:2647] Hub API listening on http://:8081/hub/
[I 2021-01-30 02:33:54.977 JupyterHub app:2649] Private Hub API connect url http://hub:8081/hub/
[I 2021-01-30 02:33:54.977 JupyterHub app:2662] Starting managed service cull-idle
[I 2021-01-30 02:33:54.977 JupyterHub service:339] Starting service 'cull-idle': ['python3', '-m', 'jupyterhub_idle_culler', '--url=http://localhost:8081/hub/api', '--timeout=3600', '--cull-every=600', '--concurrency=10', '--cull-users']
[I 2021-01-30 02:33:54.979 JupyterHub service:121] Spawning python3 -m jupyterhub_idle_culler --url=http://localhost:8081/hub/api --timeout=3600 --cull-every=600 --concurrency=10 --cull-users
[I 2021-01-30 02:33:54.983 JupyterHub app:2671] Adding external service binder
[I 2021-01-30 02:33:54.985 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:33:54.986 JupyterHub proxy:400] Adding default route for Hub: / => http://hub:8081
[I 2021-01-30 02:33:54.989 JupyterHub app:2722] JupyterHub is now running at http://:8000
[I 2021-01-30 02:33:55.110 JupyterHub log:181] 200 GET /hub/api/users (cull-idle@127.0.0.1) 13.04ms
[I 2021-01-30 02:34:54.993 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:35:54.993 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:36:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:37:54.995 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:38:54.993 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:39:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:40:54.992 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:41:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:42:25.828 JupyterHub log:181] 201 POST /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 34.98ms
[I 2021-01-30 02:42:25.873 JupyterHub reflector:201] watching for pods with label selector='component=singleuser-server' in namespace kube-system
[W 2021-01-30 02:42:25.911 JupyterHub spawner:2140] Ignoring unrecognized KubeSpawner user_options: binder_launch_host, binder_persistent_request, binder_ref_url, binder_request, image, repo_url, token
[I 2021-01-30 02:42:25.915 JupyterHub log:181] 202 POST /hub/api/users/binder-examples-requirements-a94hfpwr/servers/ (binder@::ffff:10.0.3.44) 69.58ms
[I 2021-01-30 02:42:25.915 JupyterHub spawner:1805] Attempting to create pod jupyter-binder-2dexamples-2drequirements-2da94hfpwr, with timeout 3
[I 2021-01-30 02:42:25.941 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.37ms
[I 2021-01-30 02:42:26.984 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 9.07ms
[I 2021-01-30 02:42:28.408 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.65ms
[I 2021-01-30 02:42:30.394 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 8.88ms
[I 2021-01-30 02:42:33.161 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.79ms
[I 2021-01-30 02:42:37.035 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.89ms
[I 2021-01-30 02:42:42.474 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 8.71ms
[I 2021-01-30 02:42:50.028 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 8.66ms
[I 2021-01-30 02:42:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:43:00.061 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 16.00ms
[I 2021-01-30 02:43:10.086 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 9.03ms
[I 2021-01-30 02:43:20.140 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 8.78ms
[I 2021-01-30 02:43:30.182 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 14.57ms
[I 2021-01-30 02:43:40.249 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 8.89ms
[I 2021-01-30 02:43:50.363 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.78ms
[I 2021-01-30 02:43:54.993 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:43:55.107 JupyterHub log:181] 200 GET /hub/api/users (cull-idle@127.0.0.1) 10.54ms
[W 210130 02:43:55 __init__:138] Not culling server binder-examples-requirements-a94hfpwr with pending spawn
[I 2021-01-30 02:44:00.395 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 15.73ms
[I 2021-01-30 02:44:10.421 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.05ms
[I 2021-01-30 02:44:20.448 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.08ms
[I 2021-01-30 02:44:30.508 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 15.03ms
[I 2021-01-30 02:44:40.563 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.92ms
[I 2021-01-30 02:44:50.613 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.65ms
[I 2021-01-30 02:44:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:45:00.645 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 15.49ms
[I 2021-01-30 02:45:10.671 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 8.84ms
[I 2021-01-30 02:45:20.697 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 8.96ms
[I 2021-01-30 02:45:30.742 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 15.10ms
[I 2021-01-30 02:45:40.781 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.08ms
[I 2021-01-30 02:45:50.815 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.82ms
[I 2021-01-30 02:45:54.994 JupyterHub proxy:320] Checking routes
[I 2021-01-30 02:46:00.916 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 15.60ms
[I 2021-01-30 02:46:10.940 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.06ms
[I 2021-01-30 02:46:20.974 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 8.86ms
[I 2021-01-30 02:46:31.008 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.33) 15.53ms
[I 2021-01-30 02:46:41.061 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.3.44) 9.07ms
[I 2021-01-30 02:46:51.109 JupyterHub log:181] 200 GET /hub/api/users/binder-examples-requirements-a94hfpwr (binder@::ffff:10.0.1.237) 8.79ms
[I 2021-01-30 02:46:54.993 JupyterHub proxy:320] Checking routes
[E 2021-01-30 02:46:56.915 JupyterHub user:690] Unhandled error starting binder-examples-requirements-a94hfpwr's server: pod/jupyter-binder-2dexamples-2drequirements-2da94hfpwr did not start in 300 seconds!
[I 2021-01-30 02:46:56.916 JupyterHub spawner:1955] Deleting pod jupyter-binder-2dexamples-2drequirements-2da94hfpwr
[E 2021-01-30 02:46:58.309 JupyterHub gen:623] Exception in Future <Task finished name='Task-558' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py:893> exception=TimeoutError('pod/jupyter-binder-2dexamples-2drequirements-2da94hfpwr did not start in 300 seconds!')> after timeout
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 618, in error_callback
    future.result()
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py", line 900, in finish_user_spawn
    await spawn_future
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 708, in spawn
    raise e
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 607, in spawn
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
  File "/usr/local/lib/python3.8/dist-packages/kubespawner/spawner.py", line 1917, in _start
    await exponential_backoff(
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/utils.py", line 179, in exponential_backoff
    raise TimeoutError(fail_message)
TimeoutError: pod/jupyter-binder-2dexamples-2drequirements-2da94hfpwr did not start in 300 seconds!

It sounds like JupyterHub's KubeSpawner can't access the Docker registry. I'm not familiar with how EKS service accounts work, but have you tried explicitly creating a Z2JH secret with the registry credentials? See Configuration Reference — Zero to JupyterHub with Kubernetes documentation.
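A rough sketch of one way to check and wire this up by hand, assuming the chart lives in kube-system (the secret name, key filename, and pod name below are placeholders; the chart's own imagePullSecret settings in the linked Configuration Reference are the tidier route):

# the Events at the bottom of describe usually show the registry's actual error (401/403, etc.)
kubectl describe pod <jupyter-pod-name> --namespace=kube-system

# create a docker-registry secret from the GCR service-account JSON key
kubectl create secret docker-registry gcr-pull-secret \
  --namespace=kube-system \
  --docker-server=https://gcr.io \
  --docker-username=_json_key \
  --docker-password="$(cat binderhub-sa-key.json)"

# one blunt workaround: attach the secret to the namespace's default service account,
# which the spawned user pods normally use unless configured otherwise
kubectl patch serviceaccount default --namespace=kube-system \
  -p '{"imagePullSecrets": [{"name": "gcr-pull-secret"}]}'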

That does seem to be the problem. It's interesting because the hub pod has no problem reading or writing to the container registry. When I choose a previously built GitHub project it knows it has been built before and attempts to launch. This seems to just be a problem in the specific binder pod. I wasn't able to find a workaround all week. I also tried starting a cluster and explicitly adding the GCR secret from the Z2BH tutorial to the EKS service account, without any luck. Any other suggestions for how to patch this? I'm leaning toward moving the project to Google Cloud :slightly_smiling_face:

Note it's the binder pod that handles building the image and pushing it to the registry; the hub pod then launches the image via KubeSpawner.
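One quick thing to check while a launch is stuck in ImagePullBackOff is whether the spawned user pod has any pull secret attached at all (pod name taken from the failed launch above):

# empty output means no imagePullSecrets are set on the pod spec
kubectl get pod jupyter-binder-2dexamples-2drequirements-2ds93l6m1k --namespace=kube-system \
  -o jsonpath='{.spec.imagePullSecrets}'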

If you've got at least one singleuser image in your registry (either from BinderHub testing, or pushed there manually) it might be worth trying a Z2JH deployment on its own: Zero to JupyterHub with Kubernetes — Zero to JupyterHub with Kubernetes documentation

Set singleuser.image.pullPolicy: Always and you can then play with your Z2JH registry credentials without the additional complexity of BinderHub.
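For example, roughly (release/namespace names and the image reference are placeholders; the image should be one that BinderHub has already pushed to your GCR project):

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# proxy.secretToken is required on older chart versions; newer ones generate it automatically
helm upgrade --install jhub-test jupyterhub/jupyterhub \
  --namespace=jhub-test --create-namespace \
  --set proxy.secretToken="$(openssl rand -hex 32)" \
  --set singleuser.image.name=gcr.io/<project>/<binder-built-image> \
  --set singleuser.image.tag=<tag> \
  --set singleuser.image.pullPolicy=Always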