Hello, I can’t figure out the cause of some slowness with SwarmSpawner.
Launching a JupyterLab via JupyterHub using SwarmSpawner, in a Docker Swarm cluster, seems slow on some hosts.
It takes close to 50 s for the service status to reach ‘running’ in the ‘start_object’ coroutine.
This value is confirmed by checking the Created dates of the Docker network and of the container (done on the target host).
In my case the Docker network was created at 11:46:04 (GMT) and the Docker container at 11:46:55 (GMT):
docker network inspect jupyter_network |grep Created
"Created": "2024-10-09T13:46:04.338874395+02:00",
docker container inspect fac949f6ac86 |grep Created
"Created": "2024-10-09T11:46:55.912624491Z",
Fortunately the container does start, but it takes a long time.
Another test, run from the leader host and targeting the same host, done via
docker network create ... +
docker service create --constraint="node.id==9bgzo3wogakoqwftespnzlbm7" --network test ...
shows that everything works fine in this case: the container is started immediately after the Docker network is created, which is what I expect.
docker network inspect test |grep Created
"Created": "2024-10-09T14:01:21.52683491+02:00",
docker container inspect 289fc8e65a96 |grep Created
"Created": "2024-10-09T12:01:21.915244364Z",
I have no idea how to check where the time is spent by SwarmSpawner. Which method should I extend to log its activity?
Regards
It was not a copy-and-paste error: 13:46:04 is expressed in GMT+2, unlike the other date. So I wrote both dates above in GMT.
Running docker service ps shows that most of the waiting time is spent in the Preparing state:
docker service ps jnp_notebook-bchambon
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
rxqaartiyoj2 jnp_notebook-bchambon.1 gitlab-registry.in2p3.fr/cc-in2p3-dev/jnp_docker/myjbook_rhel9:8.2.0 ccosvmse0027 Running Preparing 50 seconds ago
Then, once the service finally comes up:
[root@ccosvmse0033 jnp]$ docker service ps jnp_notebook-bchambon
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
rxqaartiyoj2 jnp_notebook-bchambon.1 gitlab-registry.in2p3.fr/cc-in2p3-dev/jnp_docker/myjbook_rhel9:8.2.0 ccosvmse0027 Running Running 1 second ago
So I think it’s a Docker or Docker network problem in my cluster, NOT a SwarmSpawner one.
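To double-check this at the Docker level (just a sketch, nothing specific to SwarmSpawner), one could watch the Docker daemon events on the target host while a spawn is pending:
# on the target host, while JupyterHub is spawning the server
docker events --filter type=network --filter type=container
The timestamps of the network and container events should show where the ~50 s actually go on that host.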
Here are some logs from JupyterHub
| [I 2024-10-09 15:25:05.805 CustomJupyterHub base:937] User logged in: bchambon
| [I 2024-10-09 15:25:05.809 CustomJupyterHub log:192] 302 GET /hub/oauth_callback?state=[secret]&session_state=[secret]&code=[secret] -> /hub/spawn (bchambon@10.0.0.2) 1191.63ms
| [I 2024-10-09 15:25:05.963 CustomJupyterHub log:192] 200 GET /hub/spawn (bchambon@10.0.0.2) 55.36ms
| [I 2024-10-09 15:25:10.646 CustomJupyterHub provider:660] Creating oauth client jupyterhub-user-bchambon
| [I 2024-10-09 15:25:10.885 CustomJupyterHub dockerspawner:1033] Service 'jnp_notebook-bchambon' is gone
| [I 2024-10-09 15:25:11.530 CustomJupyterHub log:192] 302 POST /hub/spawn?_xsrf=[secret] -> /hub/spawn-pending/bchambon?_xsrf=[secret] (bchambon@10.0.0.2) 1016.83ms
| [I 2024-10-09 15:25:11.617 CustomJupyterHub pages:399] bchambon is pending spawn
| [I 2024-10-09 15:25:11.618 CustomJupyterHub log:192] 200 GET /hub/spawn-pending/bchambon?_xsrf=[secret] (bchambon@10.0.0.2) 3.10ms
| [I 2024-10-09 15:25:11.926 CustomJupyterHub dockerspawner:1311] Created service jnp_notebook-bchambon (id: qyrp9k3) from image gitlab-registry.in2p3.fr/cc-in2p3-dev/jnp_docker/myjbook_rhel9:8.2.0
| [I 2024-10-09 15:25:11.926 CustomJupyterHub dockerspawner:1335] Starting service jnp_notebook-bchambon (id: qyrp9k3)
| [W 2024-10-09 15:25:21.242 CustomJupyterHub base:1190] User bchambon is slow to start (timeout=10)
| [I 2024-10-09 15:26:06.249 CustomJupyterHub base:1090] User bchambon took 55.721 seconds to start
| [I 2024-10-09 15:26:06.249 CustomJupyterHub proxy:331] Adding user bchambon to proxy /user/bchambon/ => http://jnp_notebook-bchambon:8888
| [I 2024-10-09 15:26:06.251 CustomJupyterHub users:776] Server bchambon is ready
| [I 2024-10-09 15:26:06.252 CustomJupyterHub log:192] 200 GET /hub/api/users/bchambon/server/progress?_xsrf=[secret] (bchambon@10.0.0.2) 54496.08ms
| [I 2024-10-09 15:26:06.476 CustomJupyterHub log:192] 302 GET /hub/spawn-pending/bchambon?_xsrf=[secret] -> /user/bchambon/ (bchambon@10.0.0.2) 90.72ms
| [I 2024-10-09 15:26:06.781 CustomJupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bchambon&redirect_uri=%2Fuser%2Fbchambon%2Foauth_callback&response_type=code&state=[secret] -> /user/bchambon/oauth_callback?code=[secret]&state=[secret] (bchambon@10.0.0.2) 78.41ms
| [I 2024-10-09 15:26:07.002 CustomJupyterHub log:192] 200 POST /hub/api/oauth2/token (bchambon@10.0.3.229) 146.09ms
| [I 2024-10-09 15:26:07.024 CustomJupyterHub log:192] 200 GET /hub/api/user (bchambon@10.0.3.228) 19.58ms
Can you also post your single-user server logs (from the Docker container where your JupyterLab is running)? Comparing the timestamps of the JupyterHub and JupyterLab logs can give us some insight.
Hello,
Coming back to this problem, for which I have set up a workaround.
As a matter of fact, I discovered that the slowness occurred on a host when a JupyterLab was started there for the first time (i.e. when no other JupyterLab was running yet on that host), meaning that the Docker network ‘jupyter_network’ had to be created on that host because it did not exist there yet. So the slowness was definitely related to the creation of the Docker network ‘jupyter_network’.
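This is consistent with how Swarm overlay networks behave: as far as I understand, an overlay network is only instantiated on a node when a task attached to it is scheduled there. A quick way to see it on a given host (assuming the network name ‘jupyter_network’):
# on a worker node, the network only shows up once a task using it runs there
docker network ls --filter name=jupyter_network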
As a workaround, I run a dummy Docker service, attached to that Docker network ‘jupyter_network’, on all hosts of the cluster. The objective is to force the creation of a permanent Docker network on all hosts at startup of the stack of services.
I mean something like:
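(Just a sketch of the idea; the service name and image below are only placeholders.)
# one task per node, attached to jupyter_network, doing nothing
docker service create \
  --name jupyter-network-keeper \
  --mode global \
  --network jupyter_network \
  alpine tail -f /dev/null
With --mode global, one task of this dummy service runs on every node of the cluster, so the overlay network gets instantiated on each host as soon as the stack of services starts.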
Once this was in place, there was no more slowness.
As an end user, a JupyterLab is now spawned (via JupyterHub) and fully available within 6 to 8 s, which is quite acceptable (a cluster of 20 hosts serving up to 100 simultaneous users).
In conclusion, it was definitely a Docker slowness, absolutely NOT related to SwarmSpawner.