When I try to build this repo (alex_binders / matplotlib-binder on GitLab) on mybinder.org, it recognizes that it should have an image for it, takes forever, fails, retries again and again, and finally gives up. When I try the exact same repo at https://notebooks.gesis.org/, it works (albeit with extremely slow page loads).
A. Presumably BinderHub should do something other than retry forever if it can’t get the image.
B. If you have a way to check for images that don’t work (cache says they should exist but they don’t), it would be nice if such images could get kicked from the cache.
Sadly no. I gave up on mybinder.org and went directly to gesis, where it worked (albeit extremely slowly).
How would I tell which backend member I was on? Would a copy of the build log have that info?
P.S. This wasn’t the first time it had failed like this. It had done so several times over a few weeks, so if you tell me how to find this, I’ll check which backend is failing if it happens again.
If you can copy the build logs when it fails, that will help. I did find a stuck build from your repo on our GKE cluster, so I deleted that. It seemed to build and launch promptly on OVH. It may be due to one or more unhealthy nodes, which can cause slow launch times due to disk pressure or other issues.
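For anyone with access to a cluster, here is a minimal sketch of one way to look for nodes reporting disk pressure. It assumes you have saved the output of `kubectl get nodes -o json` and reads the standard `DiskPressure` node condition from it. The function name and the sample data are made up for illustration; this is not how the mybinder.org operators necessarily check node health. Note also that a full containerd directory may not always show up as `DiskPressure`, so treat an empty result as inconclusive.

```python
import json


def nodes_with_disk_pressure(nodes_json: str) -> list[str]:
    """Return names of nodes whose DiskPressure condition has status "True".

    `nodes_json` is expected to be the text produced by
    `kubectl get nodes -o json` (a NodeList with an "items" array).
    """
    items = json.loads(nodes_json).get("items", [])
    flagged = []
    for node in items:
        conditions = node.get("status", {}).get("conditions", [])
        under_pressure = any(
            c.get("type") == "DiskPressure" and c.get("status") == "True"
            for c in conditions
        )
        if under_pressure:
            flagged.append(node["metadata"]["name"])
    return flagged


# Hypothetical sample data: two invented nodes, one under disk pressure.
SAMPLE_NODES = json.dumps(
    {
        "items": [
            {
                "metadata": {"name": "node-a"},
                "status": {
                    "conditions": [
                        {"type": "DiskPressure", "status": "False"},
                        {"type": "Ready", "status": "True"},
                    ]
                },
            },
            {
                "metadata": {"name": "node-b"},
                "status": {
                    "conditions": [
                        {"type": "DiskPressure", "status": "True"},
                        {"type": "Ready", "status": "True"},
                    ]
                },
            },
        ]
    }
)

if __name__ == "__main__":
    print(nodes_with_disk_pressure(SAMPLE_NODES))
```

In practice you would pass in the real `kubectl` output instead of `SAMPLE_NODES`.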
P.S. I tried ovh.mybinder.org, the gesis Binder, and gke.mybinder.org. Each of them worked, though ovh and gke took a minute or so and gesis was quick enough that I didn’t notice the wait.
Thanks for the context. I’d help add some servers to this, except that I run a non-Docker container site, so I still need to figure out how to get my own BinderHub back up and running as it is.
I am still getting errors, though they now carry more detail. It appears that there is a lack of disk space at /var/lib/containerd (on the host side, I assume):
```
Launching server...
Server requested
2023-06-21T20:58:22Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dctp16tns to user-202211a-node-dac61d
2023-06-21T20:58:23Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T20:58:23Z [Normal] Created container tc-init
2023-06-21T20:58:23Z [Normal] Started container tc-init
2023-06-21T20:58:24Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Launch attempt 1 failed, retrying...
Server requested
2023-06-21T21:05:27Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2diha50hsn to user-202211a-node-dac61d
2023-06-21T21:05:28Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T21:05:28Z [Normal] Created container tc-init
2023-06-21T21:05:28Z [Normal] Started container tc-init
2023-06-21T21:05:29Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:08:49Z [Warning] Failed to pull image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": rpc error: code = Unknown desc = failed to pull and unpack image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": mkdir /var/lib/containerd/io.containerd.content.v1.content/ingest/c36c6f79b6aade4be162ddc908f5550c45617289944f4a9d75468fa34fca5b94: no space left on device
2023-06-21T21:08:49Z [Warning] Error: ErrImagePull
2023-06-21T21:08:50Z [Normal] Back-off pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:08:50Z [Warning] Error: ImagePullBackOff
Spawn failed: pod ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2diha50hsn did not start in 300 seconds!
Launch attempt 2 failed, retrying...
Server requested
2023-06-21T21:10:11Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dv1bbyz3o to user-202211a-node-dac61d
2023-06-21T21:10:12Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T21:10:12Z [Normal] Created container tc-init
2023-06-21T21:10:12Z [Normal] Started container tc-init
2023-06-21T21:10:12Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:10:21Z [Warning] Failed to pull image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": rpc error: code = Unknown desc = failed to pull and unpack image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": mkdir /var/lib/containerd/io.containerd.content.v1.content/ingest/c36c6f79b6aade4be162ddc908f5550c45617289944f4a9d75468fa34fca5b94: no space left on device
2023-06-21T21:10:21Z [Warning] Error: ErrImagePull
2023-06-21T21:10:22Z [Normal] Back-off pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:10:22Z [Warning] Error: ImagePullBackOff
```
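On the earlier question of whether the log tells you which backend you were on: the "Successfully assigned" lines in the log above carry a namespace (`ovh2`) and a node name (`user-202211a-node-dac61d`), so a copied log does seem to be enough to identify both. A rough sketch of pulling that out of a saved log follows. The function names are hypothetical, the regular expressions are inferred only from the lines above, and the sample is a shortened stand-in with a placeholder image name rather than the real log.

```python
import re

# Patterns inferred from the log lines above; other deployments may differ.
ASSIGNED = re.compile(
    r"Successfully assigned (?P<namespace>[^/\s]+)/(?P<pod>\S+) to (?P<node>\S+)"
)
EVENT = re.compile(r"^(?P<ts>\S+Z) \[(?P<level>\w+)\] (?P<msg>.*)$")


def summarize_launch_log(log_text: str) -> list[dict]:
    """Split a launch log into attempts and collect node info and warnings."""
    attempts = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "Server requested":
            # Each "Server requested" line starts a new launch attempt.
            current = {"namespace": None, "pod": None, "node": None, "warnings": []}
            attempts.append(current)
            continue
        if current is None:
            continue
        assigned = ASSIGNED.search(line)
        if assigned:
            current.update(assigned.groupdict())
            continue
        event = EVENT.match(line)
        if event and event.group("level") == "Warning":
            current["warnings"].append(event.group("msg"))
    return attempts


def disk_full_nodes(attempts: list[dict]) -> set[str]:
    """Return nodes where a warning mentions 'no space left on device'."""
    return {
        a["node"]
        for a in attempts
        if a["node"] and any("no space left on device" in w for w in a["warnings"])
    }


# Shortened stand-in for the log above; the image name is a placeholder.
SAMPLE_LOG = """
Server requested
2023-06-21T21:05:27Z [Normal] Successfully assigned ovh2/jupyter-example-pod to user-202211a-node-dac61d
2023-06-21T21:08:49Z [Warning] Failed to pull image "registry.example/image:tag": no space left on device
2023-06-21T21:08:49Z [Warning] Error: ErrImagePull
"""

if __name__ == "__main__":
    parsed = summarize_launch_log(SAMPLE_LOG)
    print(parsed[0]["namespace"], sorted(disk_full_nodes(parsed)))
```

Run against the full log, this would group the events into the three launch attempts, all assigned to the same node.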