Something up with mybinder.org cache

When I try to build this repo (alex_binders / matplotlib-binder on GitLab) on mybinder.org, it recognizes that it should have an image for it, takes forever, fails, retries again and again, and finally gives up. When I try the exact same repo at https://notebooks.gesis.org/, it works (albeit with extremely slow page loads).

A. Presumably BinderHub should respond differently than retrying forever if it can’t get the image.
B. If you have a way to check for images that don’t work (the cache says they should exist but they don’t), it would be nice if such images could get kicked from the cache.
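
One way to check point B from the outside is to ask the registry directly whether a manifest exists for the image tag. The sketch below assumes the registry speaks the standard Registry HTTP API v2 and allows anonymous reads; `manifest_url` and `image_exists` are hypothetical helper names, not part of BinderHub. It only detects an image that is missing from the registry, not a node that fails to pull it.

```python
import urllib.error
import urllib.request

# Manifest media types to accept (Docker v2 and OCI).
ACCEPT = (
    "application/vnd.docker.distribution.manifest.v2+json, "
    "application/vnd.oci.image.manifest.v1+json"
)


def manifest_url(image_ref: str) -> str:
    """Build the Registry v2 manifest URL for 'registry/name:tag'."""
    registry, _, remainder = image_ref.partition("/")
    name, sep, tag = remainder.rpartition(":")
    if not sep:  # no tag given, fall back to the conventional default
        name, tag = remainder, "latest"
    return f"https://{registry}/v2/{name}/manifests/{tag}"


def image_exists(image_ref: str, timeout: float = 10.0) -> bool:
    """Return True if the registry has a manifest, False on 404.

    Other HTTP errors (for example 401 when auth is required) are re-raised.
    """
    request = urllib.request.Request(
        manifest_url(image_ref), method="HEAD", headers={"Accept": ACCEPT}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

A cache entry whose image returns `False` here would be a candidate for eviction; a registry that requires a token would need an extra authentication step that this sketch leaves out.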

Do you know which backend member of the federation the error occurred on, and what the exact error message was?
It’s working at the moment.

This is the implementation of the redirector:

It redirects a user based on the initial request, but after that it doesn’t track the request all the way through the build and launch processes.

Sadly no. I gave up on mybinder.org and went directly to gesis, where it worked (albeit extremely slowly).

How would I tell which backend member I was on? Would a copy of the build log have that info?

P.S. This wasn’t the first time it had failed like this. It had done this several times over a few weeks, so if you tell me how to find the backend, I’ll check which one is failing if this happens again.

If you can copy the build logs when it fails, that will help. I did find a stuck build from your repo on our GKE cluster, so I deleted that. It seemed to build and launch promptly on OVH. It may be due to one or more unhealthy nodes, which can cause slow launch times due to disk pressure or other issues.

Here you go.

Found built image, launching...
Launching server...
Server requested
2023-03-24T17:23:39Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dwitktrjz to user-202211a-node-9b20ab
2023-03-24T17:23:40Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-03-24T17:23:40Z [Normal] Created container tc-init
2023-03-24T17:23:40Z [Normal] Started container tc-init
2023-03-24T17:23:41Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Spawn failed: Timeout
Launch attempt 1 failed, retrying...
Server requested
2023-03-24T17:33:14Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dm3ih50tc to user-202211a-node-9b20ab
2023-03-24T17:33:15Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-03-24T17:33:15Z [Normal] Created container tc-init
2023-03-24T17:33:15Z [Normal] Started container tc-init
2023-03-24T17:33:16Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"

In case you’re wondering: yes, it’s getting stuck at the end of that log. I expect it will fail again (based on previous experience).
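
The "Successfully assigned" lines in the log above already carry a namespace (`ovh2`) and a node name, which hints at where the launch ran. Here is a minimal sketch for pulling those out of a pasted log. It assumes the line format shown above stays stable and that the namespace identifies the federation member, which is my guess rather than something documented here; `find_assignments` is a hypothetical helper name.

```python
import re

# Matches lines like:
# 2023-03-24T17:23:39Z [Normal] Successfully assigned ovh2/<pod> to <node>
ASSIGNED = re.compile(
    r"^(?P<time>\S+) \[Normal\] Successfully assigned "
    r"(?P<namespace>[^/\s]+)/(?P<pod>\S+) to (?P<node>\S+)$"
)


def find_assignments(log_text: str) -> list[dict[str, str]]:
    """Return one dict (time, namespace, pod, node) per scheduling event."""
    assignments = []
    for line in log_text.splitlines():
        match = ASSIGNED.match(line.strip())
        if match:
            assignments.append(match.groupdict())
    return assignments


if __name__ == "__main__":
    sample = (
        "Server requested\n"
        "2023-03-24T17:23:39Z [Normal] Successfully assigned "
        "ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dwitktrjz "
        "to user-202211a-node-9b20ab\n"
    )
    for event in find_assignments(sample):
        print(event["namespace"], event["node"])
```

If every retry lands on the same node, that points at the node rather than the image.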

P.S. I tried ovh.mybinder.org, the GESIS Binder, and gke.mybinder.org. Each of them worked, though ovh and gke took a minute or so, while gesis was quick enough that I didn’t notice the wait.

So sometimes this works, but more often than not it’s still failing. Latest log:


Found built image, launching...
Launching server...
Server requested
2023-04-04T16:37:45Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dme2jxa6w to user-202211a-node-a04bcb
2023-04-04T16:37:46Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-04-04T16:37:46Z [Normal] Created container tc-init
2023-04-04T16:37:46Z [Normal] Started container tc-init
2023-04-04T16:37:47Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Spawn failed: Timeout
Launch attempt 1 failed, retrying...
Server requested
2023-04-04T16:47:23Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2djgtr8a3u to user-202211a-node-a04bcb
2023-04-04T16:47:24Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-04-04T16:47:24Z [Normal] Created container tc-init
2023-04-04T16:47:24Z [Normal] Started container tc-init
2023-04-04T16:47:26Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"

More failures today:

Found built image, launching...
Launching server...
Server requested
2023-05-04T20:56:16.444388Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dy0pascyc to user-202211a-node-2840ff
2023-05-04T20:56:20Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-05-04T20:56:20Z [Normal] Created container tc-init
2023-05-04T20:56:21Z [Normal] Started container tc-init
2023-05-04T20:56:21Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Launch attempt 1 failed, retrying...
Server requested
2023-05-04T20:58:43.386719Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2d0a6k4zro to user-202211a-node-865773
2023-05-04T20:58:45Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-05-04T20:58:45Z [Normal] Created container tc-init
2023-05-04T20:58:46Z [Normal] Started container tc-init
2023-05-04T20:58:46Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Spawn failed: pod ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2d0a6k4zro did not start in 300 seconds!
Launch attempt 2 failed, retrying...
Server requested
2023-05-04T21:08:21.456122Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2d3h7lcn7k to user-202211a-node-865773
2023-05-04T21:08:22Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-05-04T21:08:22Z [Normal] Created container tc-init
2023-05-04T21:08:22Z [Normal] Started container tc-init
2023-05-04T21:08:23Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Launch attempt 3 failed, retrying...
Server requested
2023-05-04T21:09:55.973267Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dwjgkymn1 to user-202211a-node-865773
2023-05-04T21:09:57Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-05-04T21:09:57Z [Normal] Created container tc-init
2023-05-04T21:09:57Z [Normal] Started container tc-init
2023-05-04T21:09:59Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
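
Only the Kubernetes events in these logs have timestamps, so the time spent stuck in a pull has to be estimated. A rough sketch: take the gap between each "Pulling image" event and the next timestamped event. That next event usually belongs to the following attempt, so the figure includes retry overhead and is only an upper bound. `pull_wait_seconds` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Timestamped event lines, with or without fractional seconds.
EVENT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+Z) \[(?P<level>\w+)\] (?P<msg>.*)$"
)


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp ending in 'Z'."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def pull_wait_seconds(log_text: str) -> list[float]:
    """Seconds between each 'Pulling image' event and the next event."""
    events = []
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            events.append((parse_ts(match.group("ts")), match.group("msg")))
    waits = []
    # Pair each event with its successor; a trailing pull has no successor
    # and is skipped, since the log ends while it is still in progress.
    for (start, msg), (following, _) in zip(events, events[1:]):
        if msg.startswith("Pulling image"):
            waits.append((following - start).total_seconds())
    return waits
```

Applied to the log above, this makes it easier to see which attempts ran into the 300-second spawn timeout and which were cut short earlier.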

@Alex_Orange A lot has changed in MyBinder-land in the last few days. Please see the recent announcement here, in particular the link to the associated blog post.
See here and here and here for more context and how JupyterLite may be able to help in some cases.

Thanks for the context. I’d help add some servers to this except that I have a non-docker container side and so I still need to figure out how to get my binderhub back up and running as it is.

I am still getting errors, though they now seem to have more detail. It appears that there is a lack of disk space at /var/lib/containerd (on the host side, I assume):

Launching server...
Server requested
2023-06-21T20:58:22Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dctp16tns to user-202211a-node-dac61d
2023-06-21T20:58:23Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T20:58:23Z [Normal] Created container tc-init
2023-06-21T20:58:23Z [Normal] Started container tc-init
2023-06-21T20:58:24Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
Launch attempt 1 failed, retrying...
Server requested
2023-06-21T21:05:27Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2diha50hsn to user-202211a-node-dac61d
2023-06-21T21:05:28Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T21:05:28Z [Normal] Created container tc-init
2023-06-21T21:05:28Z [Normal] Started container tc-init
2023-06-21T21:05:29Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:08:49Z [Warning] Failed to pull image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": rpc error: code = Unknown desc = failed to pull and unpack image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": mkdir /var/lib/containerd/io.containerd.content.v1.content/ingest/c36c6f79b6aade4be162ddc908f5550c45617289944f4a9d75468fa34fca5b94: no space left on device
2023-06-21T21:08:49Z [Warning] Error: ErrImagePull
2023-06-21T21:08:50Z [Normal] Back-off pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:08:50Z [Warning] Error: ImagePullBackOff
Spawn failed: pod ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2diha50hsn did not start in 300 seconds!
Launch attempt 2 failed, retrying...
Server requested
2023-06-21T21:10:11Z [Normal] Successfully assigned ovh2/jupyter-alex-5fbinders-2dmatplotlib-2dbinder-2dv1bbyz3o to user-202211a-node-dac61d
2023-06-21T21:10:12Z [Normal] Container image "jupyterhub/mybinder.org-tc-init:2020.12.4-0.dev.git.4289.h140cef52" already present on machine
2023-06-21T21:10:12Z [Normal] Created container tc-init
2023-06-21T21:10:12Z [Normal] Started container tc-init
2023-06-21T21:10:12Z [Normal] Pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:10:21Z [Warning] Failed to pull image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": rpc error: code = Unknown desc = failed to pull and unpack image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b": mkdir /var/lib/containerd/io.containerd.content.v1.content/ingest/c36c6f79b6aade4be162ddc908f5550c45617289944f4a9d75468fa34fca5b94: no space left on device
2023-06-21T21:10:21Z [Warning] Error: ErrImagePull
2023-06-21T21:10:22Z [Normal] Back-off pulling image "2lmrrh8f.gra7.container-registry.ovh.net/mybinder-builds/r2d-g5b5b759https-3a-2f-2fgitlab-2eflux-2eutah-2eedu-2falex-5fbinders-2fmatplotlib-2dbinder-5c6152:2786061c55f7c1fba835bf0ed97a9ddd09ffd40b"
2023-06-21T21:10:22Z [Warning] Error: ImagePullBackOff
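
To make future reports quicker to triage, here is a small sketch that scans a pasted launch log for the failure markers that show up in this thread. The marker strings are taken from the logs above; the labels and the `diagnose` name are my own and purely illustrative.

```python
# (marker substring from the logs, human-readable label), most specific first.
KNOWN_CAUSES = [
    ("no space left on device", "node disk full"),
    ("ErrImagePull", "image pull error"),
    ("ImagePullBackOff", "image pull back-off"),
    ("did not start in", "pod start timeout"),
    ("Spawn failed: Timeout", "spawn timeout"),
]


def diagnose(log_text: str) -> list[str]:
    """Return the labels of all known failure markers found in the log."""
    return [label for marker, label in KNOWN_CAUSES if marker in log_text]


if __name__ == "__main__":
    sample = (
        "2023-06-21T21:08:49Z [Warning] Failed to pull image: "
        "no space left on device\n"
        "2023-06-21T21:08:49Z [Warning] Error: ErrImagePull\n"
    )
    print(diagnose(sample))
```

For the June log this would put the full disk first, which fits the reading that the image itself is fine and the node ran out of space while unpacking it.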