Hi @minrk,
We have a few new observations with respect to this issue.
Under what circumstances would inactive user pods that should have been culled still be running, with the database in the following state?
spawners table:
12 12 {"pod_name": "jupyter-username"} "2024-02-06 09:45:30.758641" "2024-02-06 15:18:32.638" {"profile": "my-profile"}
The servers table is empty.
The oauth_clients table is also empty.
However, the user pod is still up. Here are its logs:
[D 2024-02-15 12:44:50.869 SingleUserLabApp mixins:491] Notifying Hub of activity 2024-02-06T10:51:15.413727Z
[E 2024-02-15 12:44:50.873 SingleUserLabApp mixins:511] Error notifying Hub of activity
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/jupyterhub/singleuser/mixins.py", line 509, in notify
    await client.fetch(req)
  File "/opt/conda/lib/python3.8/asyncio/tasks.py", line 349, in __wakeup
    future.result()
ConnectionRefusedError: [Errno 111] Connection refused
[E 2024-02-15 12:44:50.873 SingleUserLabApp mixins:536] Error notifying Hub of activity
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/jupyterhub/singleuser/mixins.py", line 534, in keep_activity_updated
    await self.notify_activity()
  File "/opt/conda/lib/python3.8/site-packages/jupyterhub/singleuser/mixins.py", line 516, in notify_activity
    await exponential_backoff(
  File "/opt/conda/lib/python3.8/site-packages/jupyterhub/utils.py", line 237, in exponential_backoff
    raise asyncio.TimeoutError(fail_message)
asyncio.exceptions.TimeoutError: Failed to notify Hub of activity
[D 2024-02-15 12:49:20.982 SingleUserLabApp mixins:491] Notifying Hub of activity 2024-02-06T10:51:15.413727Z
[D 2024-02-15 12:53:55.058 SingleUserLabApp mixins:491] Notifying Hub of activity 2024-02-06T10:51:15.413727Z
[D 2024-02-15 12:58:39.312 SingleUserLabApp mixins:491] Notifying Hub of activity 2024-02-06T10:51:15.413727Z
[D 2024-02-15 13:03:30.615 SingleUserLabApp mixins:491] Notifying Hub of activity 2024-02-06T10:51:15.413727Z
Here are the hub logs for this user-pod:
[D 2024-02-15 12:43:14.183 JupyterHub proxy:880] Proxy: Fetching GET http://proxy-api:8001/api/routes
[D 2024-02-15 12:43:14.184 JupyterHub proxy:953] Omitting non-jupyterhub route '/'
[D 2024-02-15 12:43:14.184 JupyterHub proxy:392] Checking routes
[W 2024-02-15 12:43:14.185 JupyterHub proxy:468] Deleting stale route /user/username/
[D 2024-02-15 12:43:14.185 JupyterHub proxy:880] Proxy: Fetching DELETE http://proxy-api:8001/api/routes/user/username
[I 2024-02-15 12:43:14.187 JupyterHub app:3242] JupyterHub is now running, internal Hub API at http://hub:8081/hub/
[D 2024-02-15 12:43:14.188 JupyterHub app:2847] It took 3.102 seconds for the Hub to start
[D 2024-02-15 12:43:14.279 JupyterHub base:297] Recording first activity for <APIToken('66bd...', service='jupyterhub-idle-culler', client_id='jupyterhub')>
[I 2024-02-15 12:43:14.286 JupyterHub log:191] 200 GET /hub/api/ (jupyterhub-idle-culler@::1) 9.60ms
[D 2024-02-15 12:43:14.292 JupyterHub scopes:863] Checking access via scope list:users
--
[D 2024-02-15 12:45:04.323 JupyterHub reflector:362] events watcher timeout
[D 2024-02-15 12:45:04.323 JupyterHub reflector:281] Connecting events watcher
[D 2024-02-15 12:45:04.457 JupyterHub reflector:362] pods watcher timeout
[D 2024-02-15 12:45:04.457 JupyterHub reflector:281] Connecting pods watcher
[D 2024-02-15 12:45:06.604 JupyterHub log:191] 200 GET /hub/health (@172.16.4.1) 0.62ms
[D 2024-02-15 12:45:08.310 JupyterHub base:342] Refreshing auth for username
[D 2024-02-15 12:45:08.317 JupyterHub scopes:863] Checking access via scope users:activity
[D 2024-02-15 12:45:08.317 JupyterHub scopes:690] Argument-based access to /hub/api/users/username/activity via users:activity
[D 2024-02-15 12:45:08.319 JupyterHub users:879] Not updating activity for <User(username 0/2 running)>: 2024-02-06T14:27:15.018783Z < 2024-02-14T06:57:21.708712Z
[D 2024-02-15 12:45:08.319 JupyterHub users:900] Not updating server activity on username/: 2024-02-06T14:27:15.018783Z < 2024-02-14T06:57:21.708712Z
[I 2024-02-15 12:45:08.320 JupyterHub log:191] 200 POST /hub/api/users/username/activity (username@172.16.3.35) 28.14ms
[D 2024-02-15 12:45:14.188 JupyterHub proxy:880] Proxy: Fetching GET http://proxy-api:8001/api/routes
[D 2024-02-15 12:45:14.191 JupyterHub proxy:953] Omitting non-jupyterhub route '/'
[D 2024-02-15 12:45:14.199 JupyterHub proxy:392] Checking routes
[D 2024-02-15 12:45:14.200 JupyterHub utils:278] Server at http://refresher:8081/services/refresher/ responded with 404
[D 2024-02-15 12:45:14.200 JupyterHub app:2493] External service refresher running at http://refresher:8081/
The culler configuration in the deployment values file is as follows:
cull:
  adminUsers: false # --cull-admin-users
  removeNamedServers: true # --remove-named-servers
  timeout: 72000
  every: 600
  concurrency: 10 # --concurrency
  maxAge: 0 # --max-age
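To put these numbers in context, here is a rough sketch of the idle check we would expect to apply. It is our own illustration, not the culler's actual code. It compares the last activity the pod reports (taken from the pod logs above) with the time of that log line, using the 72000-second timeout. The helper names `parse_utc` and `is_idle` are made up for this example, and we assume the log timestamps are in UTC.

```python
from datetime import datetime, timedelta, timezone

# cull.timeout from the values file above, in seconds (20 hours).
CULL_TIMEOUT_SECONDS = 72000


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as 2024-02-06T10:51:15.413727Z as UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_idle(last_activity: str, now: str,
            timeout_seconds: int = CULL_TIMEOUT_SECONDS) -> bool:
    """Return True if the gap between the two timestamps exceeds the timeout."""
    idle_for = parse_utc(now) - parse_utc(last_activity)
    return idle_for > timedelta(seconds=timeout_seconds)


# Last activity reported by the pod, and the time of that log line
# (assumed to be UTC; the log itself carries no timezone).
last_activity = "2024-02-06T10:51:15.413727Z"
now = "2024-02-15T12:44:50.869000Z"

idle_for = parse_utc(now) - parse_utc(last_activity)
print(f"idle for {idle_for}, past timeout: {is_idle(last_activity, now)}")
```

By this measure the pod has been idle for roughly nine days, far beyond the 20-hour timeout, which is why we expected it to have been culled.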
The culler service is defined as follows:
if get_config("cull.enabled", False):
    jupyterhub_idle_culler_role = {
        "name": "jupyterhub-idle-culler",
        "scopes": [
            "list:users",
            "read:users:activity",
            "read:servers",
            "delete:servers",
            # "admin:users", # dynamically added if --cull-users is passed
        ],
        # assign the role to a jupyterhub service, so it gains these permissions
        "services": ["jupyterhub-idle-culler"],
    }
    cull_cmd = ["python3", "-m", "jupyterhub_idle_culler"]
    base_url = c.JupyterHub.get("base_url", "/")
    cull_cmd.append("--url=http://localhost:8081" + url_path_join(base_url, "hub/api"))
    cull_timeout = get_config("cull.timeout")
    if cull_timeout:
        cull_cmd.append(f"--timeout={cull_timeout}")
    cull_every = get_config("cull.every")
    if cull_every:
        cull_cmd.append(f"--cull-every={cull_every}")
    cull_concurrency = get_config("cull.concurrency")
    if cull_concurrency:
        cull_cmd.append(f"--concurrency={cull_concurrency}")
    if get_config("cull.users"):
        cull_cmd.append("--cull-users")
        jupyterhub_idle_culler_role["scopes"].append("admin:users")
    if not get_config("cull.adminUsers"):
        cull_cmd.append("--cull-admin-users=false")
    if get_config("cull.removeNamedServers"):
        cull_cmd.append("--remove-named-servers")
    cull_max_age = get_config("cull.maxAge")
    if cull_max_age:
        cull_cmd.append(f"--max-age={cull_max_age}")
    c.JupyterHub.services.append(
        {
            "name": "jupyterhub-idle-culler",
            "command": cull_cmd,
        }
    )
    c.JupyterHub.load_roles.append(jupyterhub_idle_culler_role)
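To double-check which flags the culler is actually started with, here is a standalone sketch that mirrors the flag-building logic above for our values. The function `build_cull_cmd` and its simplified URL joining are our own stand-ins for `get_config` and `url_path_join`, so treat it as an approximation of the config rather than the config itself. We also assume `base_url` is left at the default `/`.

```python
def build_cull_cmd(cull: dict, base_url: str = "/") -> list:
    """Build the idle-culler command line from a plain dict of cull values."""
    # Simplified stand-in for url_path_join: join the parts with single slashes.
    parts = [p for p in (base_url.strip("/"), "hub/api") if p]
    api_path = "/" + "/".join(parts)

    cmd = ["python3", "-m", "jupyterhub_idle_culler"]
    cmd.append("--url=http://localhost:8081" + api_path)

    # Falsy values (missing, 0, False) add no flag, as in the config above.
    if cull.get("timeout"):
        cmd.append(f"--timeout={cull['timeout']}")
    if cull.get("every"):
        cmd.append(f"--cull-every={cull['every']}")
    if cull.get("concurrency"):
        cmd.append(f"--concurrency={cull['concurrency']}")
    if cull.get("users"):
        cmd.append("--cull-users")
    if not cull.get("adminUsers"):
        cmd.append("--cull-admin-users=false")
    if cull.get("removeNamedServers"):
        cmd.append("--remove-named-servers")
    if cull.get("maxAge"):
        cmd.append(f"--max-age={cull['maxAge']}")
    return cmd


# The cull values from our deployment values file.
values = {
    "adminUsers": False,
    "removeNamedServers": True,
    "timeout": 72000,
    "every": 600,
    "concurrency": 10,
    "maxAge": 0,
}

cmd = build_cull_cmd(values)
print(" ".join(cmd))
```

With our values, `maxAge: 0` is falsy and `cull.users` is unset, so neither `--max-age` nor `--cull-users` should be passed, while `--cull-admin-users=false` and `--remove-named-servers` are.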