Error notifying Hub of activity

Hi,

I have a sudden issue with a user pod not scaling down automatically. It ran fine for some weeks, but now this issue has suddenly appeared.

The cluster runs z2jh 3.3.8 with fairly standard configuration. Any ideas on how to fix this issue?

On that Jupyter server failing to report activity, what versions of jupyterhub and jupyter_server is it running?

jupyter_server v2.14.2 and jupyterhub v5.2.1

What are the logs from the Hub surrounding this event? Do other things involving requests to the Hub work, e.g. logging in and making requests to the server (launching JupyterLab, notebooks, etc.)?

The error appeared after a user node failed to spawn. How that happened, I don’t know.

The user’s login stayed like this for at least 1.5 hours. If something like this happens, they should be able to click the home menu and reset their server, right?

Ah, if the user's server failed to spawn, the credentials issued to the server would not be valid, because they are revoked when the Hub doesn’t believe a server is running. Given that you are seeing a launch failure due to timeout and logs from the user server indicating that it does start, my guess is that the server is actually starting up, but it is too slow for your configuration. Setting c.Spawner.start_timeout (and possibly c.Spawner.http_timeout) to a larger value might fix your problem. More surrounding log context would help confirm whether this is what happened.
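
For reference, a minimal sketch of what raising the timeouts could look like. JupyterHub's Spawner has two relevant options, start_timeout and http_timeout; the values below are illustrative assumptions, not recommendations, and on z2jh this would typically go in hub.extraConfig (the chart also has a singleuser.startTimeout value that feeds the start timeout).

```python
# jupyterhub_config.py (or z2jh's hub.extraConfig).
# `c` is the config object JupyterHub provides when it loads this file.
# The numbers are illustrative assumptions; pick values that match how long
# your pods actually take to become ready.
c.Spawner.start_timeout = 600  # seconds the Hub waits for the server to start
c.Spawner.http_timeout = 120   # seconds the Hub waits for the started server to answer HTTP
```

If image pulls or node scale-up are what make the start slow, the start timeout is most likely the one that matters.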

The sequence of events, if this is right:

  1. server is requested
  2. timeout reached, Hub gives up and begins cleaning up (invalidating credentials, among other things)
  3. server finally starts, making initial requests to the Hub API with expired credentials (error in your logs)
  4. (hopefully) server finishes getting shut down
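
To make the race concrete, here is a toy simulation of the steps above. It is not JupyterHub code: ToyHub, simulate, and the 200/403 status codes are hypothetical stand-ins, chosen only to show why a late-starting server ends up holding a revoked token.

```python
import secrets


class ToyHub:
    """Toy stand-in for the Hub: tracks which API tokens are currently valid."""

    def __init__(self, start_timeout):
        self.start_timeout = start_timeout  # seconds the Hub is willing to wait
        self.valid_tokens = set()

    def request_server(self):
        # Step 1: a server is requested and a token is issued for it.
        token = secrets.token_hex(8)
        self.valid_tokens.add(token)
        return token

    def give_up(self, token):
        # Step 2: timeout reached, cleanup revokes the server's token.
        self.valid_tokens.discard(token)

    def post_activity(self, token):
        # Step 3: the server reports activity; a revoked token is rejected.
        # 200 and 403 are illustrative status codes.
        return 200 if token in self.valid_tokens else 403


def simulate(startup_seconds, start_timeout):
    """Return the status the server gets on its first activity report."""
    hub = ToyHub(start_timeout)
    token = hub.request_server()
    if startup_seconds > hub.start_timeout:
        hub.give_up(token)
    return hub.post_activity(token)


print(simulate(startup_seconds=400, start_timeout=300))  # slow start, token revoked
print(simulate(startup_seconds=120, start_timeout=300))  # started in time
```

In the slow case the server only learns that its credentials are gone once it is finally up, which would match an activity-notification error showing up after a failed spawn.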

Yes, if the launch fails, they should be able to try again.
