Display error message to the user when max number of named servers is reached

Is it possible to display the pod error message to the user when launching a repo if the maximum number of named servers has been reached? Right now our users are getting this:

Found built image, launching...
Launching server...
Launch attempt 1 failed, retrying...
Launch attempt 2 failed, retrying...
Launch attempt 3 failed, retrying...
Failed to launch image binder.repo.mgnt.in/binderhub/image-9764c8:f141901a04764ad927621dd05f5bde270a38ab82

Only then in the pod logs can we see:

E 210824 13:01:42 launcher:243] Error starting server server-yzrpgp8h for user user1 : HTTP 400: Bad Request
    b'{"status": 400, "message": "User user1 already has the maximum of 5 named servers.  One must be deleted before a new server can be created"

Every time a user reaches the maximum number of named servers, our support team has to check the pod logs, only to find out that too many servers have been launched. Is there a way to tell users on the launch page that they have reached the limit, instead of showing this generic Failed to launch image message?
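
To illustrate what is being asked for, here is a minimal sketch of how the JSON error body seen in the pod log could be turned into a message suitable for the launch page. This is not BinderHub's actual code: the function names `extract_hub_error_message` and `user_facing_error` are hypothetical, and the example body assumes the log line above is simply truncated before its closing brace.

```python
import json
from typing import Optional


def extract_hub_error_message(body: bytes) -> Optional[str]:
    """Return the "message" field of a JSON error body, or None if unavailable."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid or truncated JSON.
        return None
    if not isinstance(payload, dict):
        return None
    message = payload.get("message")
    return message if isinstance(message, str) else None


def user_facing_error(status: int, body: bytes) -> str:
    """Build a launch-page message, falling back to a generic one."""
    message = extract_hub_error_message(body)
    if message:
        return f"Launch failed (HTTP {status}): {message}"
    return f"Launch failed (HTTP {status})"


if __name__ == "__main__":
    # Body reconstructed from the log line above, with the closing brace assumed.
    example = (
        b'{"status": 400, "message": "User user1 already has the maximum of 5 '
        b'named servers.  One must be deleted before a new server can be created"}'
    )
    print(user_facing_error(400, example))
```

The fallback branch matters here: if the body cannot be parsed, the user still gets the status code rather than nothing at all.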

Can you show us your configuration with secrets redacted, and tell us the version of the chart and any other components that you’re using? The user should see a proper error message as shown in this PR

Thanks for the reply. We deployed version 0.2.0-n523.h854be18 of BinderHub on AWS EKS with the standard configuration. The configuration is in Ansible. I will try to find it, but in the meantime can you confirm that the PR was included in the 0.2.0-n523.h854be18 release?

It should do. However, it’s still fairly old and there have been other bug fixes since then, so before we spend too much time investigating, would you mind trying the latest version?

If you still see a problem could you show us your debug logs for the whole launch? I think this is where the errors are generated in the latest version:

I deployed the latest version, 0.2.0-n661.h8269b12, and the issue is still there. Also, the latest Helm chart for BinderHub has changed quite a bit. My Ansible chart no longer worked, and the install guide on the "Secure with HTTPS" page of the BinderHub 0.1.0 documentation is no longer applicable, specifically around setting up ingress and HTTPS.

Here are the logs from the binder-* pod. They are not very useful. Which log do you want? Could the issue be caused by our OAuth integration?

[I 210826 00:24:05 build:389] Started build build-pet-project-2dresearch-c1f961-429-2b
[I 210826 00:24:05 build:391] Watching build pod build-pet-project-2dresearch-c1f961-429-2b
[I 210826 00:24:07 build:425] Watching logs of build-pet-project-2dresearch-c1f961-429-2b
[I 210826 00:24:44 log:140] 302 GET / -> https://notebooks.pet-shop-dev.in/hub/api/oauth2/authorize?client_id=binder-oauth-client&redirect_uri=https%3A%2F%2Fbinder.pet-shop-dev.in%2Foauth_callback&response_type=code&state=[secret] (@ 0.81ms
[I 210826 00:25:44 log:140] 302 GET / -> https://notebooks.pet-shop-dev.in/hub/api/oauth2/authorize?client_id=binder-oauth-client&redirect_uri=https%3A%2F%2Fbinder.pet-shop-dev.in%2Foauth_callback&response_type=code&state=[secret] (@ 0.78ms
[I 210826 00:25:52 builder:558] Launching pod for https://github.com/pet-project/research: 0 other pods running this repo (6 total)
[E 210826 00:25:52 builder:617] Retrying launch of https://github.com/pet-project/research after error (duration=0s, attempt=1): HTTPError()
[I 210826 00:25:52 build:455] Finished streaming logs of build-pet-project-2dresearch-c1f961-429-2b
[E 210826 00:25:56 builder:617] Retrying launch of https://github.com/pet-project/research after error (duration=0s, attempt=2): HTTPError()
[E 210826 00:26:04 builder:617] Retrying launch of https://github.com/pet-project/research after error (duration=0s, attempt=3): HTTPError()
[W 210826 00:26:20 web:1787] 409 GET /build/gh/pet-project/research/HEAD ( User ckarwicki already has the maximum of 5 named servers.  One must be deleted before a new server can be created
[I 210826 00:26:20 log:140] 200 GET /build/gh/pet-project/research/HEAD (carwicki@ 135122.36ms

I wouldn’t have thought it’s related to an external OAuth provider, but if you can test with some other auth, e.g. the dummy authenticator, that would rule it out.
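
For anyone wanting to try this, a minimal sketch of the JupyterHub settings involved is below. It assumes the snippet is loaded as JupyterHub configuration (for example from a `jupyterhub_config.py` file, where `c` is supplied by JupyterHub). How that configuration gets injected in a particular BinderHub Helm deployment is not covered here, and the limit of 5 is simply taken from the error message earlier in the thread.

```python
# Sketch of a jupyterhub_config.py fragment; `c` is the config object that
# JupyterHub provides when it loads this file (assumption: this is not a
# standalone script and will not run outside JupyterHub).

# Swap the external OAuth provider for JupyterHub's built-in dummy
# authenticator, which accepts any username and password. Testing only.
c.JupyterHub.authenticator_class = "dummy"

# Named-server settings matching the limit reported in the error message.
c.JupyterHub.allow_named_servers = True
c.JupyterHub.named_server_limit_per_user = 5
```

If the generic failure message still appears with this authenticator, the external OAuth provider can be ruled out as the cause.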

The logs aren’t as useful as I hoped. Is it completely reproducible, or does it sometimes show the expected message? I’m wondering if there might be a race condition, since the code that forwards some messages from the backend to the front end is a bit complicated.

If you have a fully reproducible example you’re willing to share, would you mind opening a bug on Issues · jupyterhub/binderhub · GitHub with details and cross-referencing this post?