Hi all,
I recently started looking into JupyterHub and experimenting with different setups. My current setup is a Docker Swarm cluster on 2 VMs, with Traefik as the reverse proxy and a separate JupyterHub instance using the Docker Swarm spawner, both running on the master node.
This works fine, but I noticed that while a single-user notebook server is starting up, JupyterHub logs the error "503 GET ... socket hang up". The server starts regardless, but I am concerned that this could lead to problems once more users start using it (currently I am the only user while I set it up).
My logs and docker-compose files are below:
JupyterHub logs
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.614 JupyterHub base:1124] User my_user took 13.344 seconds to start
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.614 JupyterHub proxy:331] Adding user my_user to proxy /user/my_user/ => http://jupyter-my_user:8888
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.617 [ConfigProxy] info: Adding route /user/my_user -> http://jupyter-my_user:8888
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.618 [ConfigProxy] info: Route added /user/my_user -> http://jupyter-my_user:8888
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.619 JupyterHub users:899] Server my_user is ready
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.619 [ConfigProxy] info: 201 POST /api/routes/user/my_user
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.620 JupyterHub log:192] 200 GET /hub/api/users/my_user/server/progress?_xsrf=[secret] (my_user@MY_IP) 12279.89ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.655 JupyterHub log:192] 302 GET /hub/spawn-pending/my_user?_xsrf=[secret] -> /user/my_user/ (my_user@MY_IP) 11.84ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.702 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-my_user&redirect_uri=%2Fuser%2Fmy_user%2Foauth_callback&response_type=code&state=[secret] -> /user/my_user/oauth_callback?code=[secret]&state=[secret] (my_user@MY_IP) 21.35ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.747 JupyterHub log:192] 200 POST /hub/api/oauth2/token (my_user@10.0.1.252) 36.20ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.756 JupyterHub log:192] 200 GET /hub/api/user (my_user@10.0.1.252) 7.37ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.863 [ConfigProxy] error: 503 GET /user/my_user/static/lab/7730.7e3a9fb140d2d55a51fc.js socket hang up
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.866 [ConfigProxy] error: 503 GET /user/my_user/static/lab/2160.8e96aa5b6f6d451bf57d.js socket hang up
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.867 JupyterHub _xsrf_utils:125] Setting new xsrf cookie for b'None:ZUCz-XldA2vbVNEM9pOfZ0zzkE3aj1fcvOPoODRatSI=' {'path': '/hub/', 'max_age': 3600}
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.877 JupyterHub log:192] 200 GET /hub/error/503?url=%2Fuser%2Fmy_user%2Fstatic%2Flab%2F7730.7e3a9fb140d2d55a51fc.js%3Fv%3D7e3a9fb140d2d55a51fc (@10.0.1.250) 11.56ms
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.878 JupyterHub _xsrf_utils:125] Setting new xsrf cookie for b'None:ZUCz-XldA2vbVNEM9pOfZ0zzkE3aj1fcvOPoODRatSI=' {'path': '/hub/', 'max_age': 3600}
jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | [I 2024-12-06 08:30:18.880 JupyterHub log:192] 200 GET /hub/error/503?url=%2Fuser%2Fmy_user%2Fstatic%2Flab%2F2160.8e96aa5b6f6d451bf57d.js%3Fv%3D8e96aa5b6f6d451bf57d (@10.0.1.250) 1.59ms
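To see how close the 503s are to the moment the route is added, the ConfigProxy timestamps in the excerpt above can be compared. The following is only a rough sketch: the function name and the regular expression are my own and assume the exact ConfigProxy line format shown in this log, nothing more.

```python
import re
from datetime import datetime, timedelta

# Matches ConfigProxy lines as they appear in the hub log above, e.g.
# "... | 08:30:18.618 [ConfigProxy] info: Route added /user/my_user -> ..."
PROXY_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[ConfigProxy\] \w+: (?P<msg>.*)$"
)


def ms_after_route_added(log_lines):
    """Return (delay_ms, path) for each 503 logged after the latest 'Route added'."""
    route_added = None
    delays = []
    for line in log_lines:
        match = PROXY_LINE.search(line)
        if not match:
            continue  # not a ConfigProxy line
        stamp = datetime.strptime(match["time"], "%H:%M:%S.%f")
        msg = match["msg"]
        if msg.startswith("Route added"):
            route_added = stamp
        elif msg.startswith("503 ") and route_added is not None:
            path = msg.split()[2]  # "503 GET <path> socket hang up"
            delays.append(((stamp - route_added) // timedelta(milliseconds=1), path))
    return delays


# Three lines copied from the hub log above.
SAMPLE = [
    "jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.618 [ConfigProxy] info: Route added /user/my_user -> http://jupyter-my_user:8888",
    "jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.863 [ConfigProxy] error: 503 GET /user/my_user/static/lab/7730.7e3a9fb140d2d55a51fc.js socket hang up",
    "jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.866 [ConfigProxy] error: 503 GET /user/my_user/static/lab/2160.8e96aa5b6f6d451bf57d.js socket hang up",
]

for delay_ms, path in ms_after_route_added(SAMPLE):
    print(f"{delay_ms} ms after route added: {path}")
```

For this excerpt the two 503s come 245 ms and 248 ms after the route was added, i.e. only during the very first page load.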
Traefik logs
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /hub/spawn-pending/my_user?_xsrf=MnwxOjB8MTA6MTczMzQ3Mzc3M3w1Ol94c3JmfDg4Ok5UUXpPR1JqTW1KbU5UVTROR0UwTVdJM1ptRmhOR0l5WWpJNE1qQTVOREU2WVRNd1kyTmtaVGMxTlRjMk5ETXdaR0kyT1dWbE9Ua3dNbVZpTkRReE9EST18YTdmMDQ1MDYwOWVlYWU0ODI1MzllOWNiNzg5ZjJkMWI2Y2U2YTgxN2Q3MjQ3NjUyMjBjOWRkNjgzZjg3YTlhNQ HTTP/2.0" 302 0 "-" "-" 219 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 17ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/ HTTP/2.0" 302 0 "-" "-" 220 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 8ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/lab? HTTP/2.0" 302 0 "-" "-" 221 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 4ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-my_user&redirect_uri=%2Fuser%2Fmy_user%2Foauth_callback&response_type=code&state=A7AwPYMieX8mp5LM8i0mQQ HTTP/2.0" 302 0 "-" "-" 222 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 24ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/oauth_callback?code=pBjDbEP5nvBlFpUkWVgXhR8Mm1wCUe&state=A7AwPYMieX8mp5LM8i0mQQ HTTP/2.0" 302 0 "-" "-" 223 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 51ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/lab? HTTP/2.0" 200 4572 "-" "-" 224 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 10ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/lab/extensions/jupyterlab_pygments/static/remoteEntry.5cbb9d2323598fbda535.js HTTP/2.0" 304 0 "-" "-" 225 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 7ms
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/lab/extensions/@jupyter-notebook/lab-extension/static/remoteEntry.04dfa589925e7e7c6a3d.js HTTP/2.0" 304 0 "-" "-" 226 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 10ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 499 Client Closed Request error="context canceled"
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 2ec410e6d17c3d32
traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/static/lab/7730.7e3a9fb140d2d55a51fc.js?v=7e3a9fb140d2d55a51fc HTTP/2.0" 499 21 "-" "-" 227 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 1ms
traefik_traefik.1.ie97nxehqrsb@vm | 2024-12-06T08:30:18Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 499 Client Closed Request error="context canceled"
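The Traefik 499 ("Client Closed Request") in this excerpt is for the same static file as one of the 503s in the hub log. A small script along these lines could cross-check the two logs over a longer capture. It is a sketch under assumptions: the function name and both regular expressions are hypothetical helpers that only target the line formats pasted in this post.

```python
import re
from urllib.parse import urlsplit

# Traefik access-log lines as pasted above: "<METHOD> <target> HTTP/x" <status> ...
TRAEFIK_ACCESS = re.compile(r'"[A-Z]+ (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')
# ConfigProxy 503 lines as pasted in the hub log.
HUB_503 = re.compile(r"\[ConfigProxy\] error: 503 [A-Z]+ (?P<path>\S+)")


def paths_with_499_and_503(traefik_lines, hub_lines):
    """Return the sorted paths logged as 499 by Traefik and as 503 by the hub proxy."""
    closed_by_client = set()
    for line in traefik_lines:
        match = TRAEFIK_ACCESS.search(line)
        if match and match["status"] == "499":
            # Drop the query string so the path is comparable with the hub log.
            closed_by_client.add(urlsplit(match["target"]).path)
    proxy_503 = set()
    for line in hub_lines:
        match = HUB_503.search(line)
        if match:
            proxy_503.add(urlsplit(match["path"]).path)
    return sorted(closed_by_client & proxy_503)


# Lines copied from the two log excerpts in this post.
TRAEFIK_SAMPLE = [
    'traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/lab? HTTP/2.0" 200 4572 "-" "-" 224 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 10ms',
    'traefik_traefik.1.ie97nxehqrsb@vm | MY_IP - - [06/Dec/2024:08:30:18 +0000] "GET /user/my_user/static/lab/7730.7e3a9fb140d2d55a51fc.js?v=7e3a9fb140d2d55a51fc HTTP/2.0" 499 21 "-" "-" 227 "jupyterhub-https@swarm" "http://10.0.1.250:8000" 1ms',
]
HUB_SAMPLE = [
    "jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.863 [ConfigProxy] error: 503 GET /user/my_user/static/lab/7730.7e3a9fb140d2d55a51fc.js socket hang up",
    "jupyterhub_jupyterhub.1.x0uy0ekegjgd@vm | 08:30:18.866 [ConfigProxy] error: 503 GET /user/my_user/static/lab/2160.8e96aa5b6f6d451bf57d.js socket hang up",
]

print(paths_with_499_and_503(TRAEFIK_SAMPLE, HUB_SAMPLE))
```

This only shows that the same asset appears in both logs. By itself it does not establish which side closed the connection first.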
Traefik docker compose
version: '3.3'
services:
  traefik:
    # Use the latest v3.0.x Traefik image available
    image: traefik:v3.0
    ports:
      # Listen on port 80, default for HTTP, necessary to redirect to HTTPS
      - target: 80
        published: 80
        mode: host
      # Listen on port 443, default for HTTPS
      - target: 443
        published: 443
        mode: host
    deploy:
      placement:
        constraints:
          # Make the traefik service run only on the node with this label
          # as the node with it has the volume for the certificates
          - node.labels.traefik.main-node == true
      labels:
        # Enable Traefik for this service, to make it available in the public network
        - traefik.enable=true
        # Use the docker swarm overlay network (declared below)
        - traefik.docker.network=jupyterhub_net
        # Use the custom label "traefik.constraint-label=traefik-public"
        # This public Traefik will only use services with this label
        # That way you can add other internal Traefik instances per stack if needed
        - traefik.constraint-label=traefik-public
        # https-redirect middleware to redirect HTTP to HTTPS
        # It can be re-used by other stacks in other Docker Compose files
        - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
        - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
        # traefik-http set up only to use the middleware to redirect to https
        - traefik.http.routers.traefik-public-http.rule=Host(`MY_IP`)
        - traefik.http.routers.traefik-public-http.entrypoints=http
        - traefik.http.routers.traefik-public-http.middlewares=https-redirect
        # traefik-https the actual router using HTTPS
        - traefik.http.routers.traefik-public-https.rule=Host(`MY_IP`)
        - traefik.http.routers.traefik-public-https.entrypoints=https
        - traefik.http.routers.traefik-public-https.tls=true
        # Use the special Traefik service api@internal with the web UI/Dashboard
        - traefik.http.routers.traefik-public-https.service=api@internal
        # Define the port inside of the Docker service to use
        - traefik.http.services.traefik-public.loadbalancer.server.port=8080
    volumes:
      # Add Docker as a mounted volume, so that Traefik can read the labels of other services
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Mount the certificates
      - /etc/letsencrypt/archive/jupyterhub/fullchain1.pem:/certificates/fullchain.pem
      - /etc/letsencrypt/archive/jupyterhub/privkey1.pem:/certificates/privkey.pem
      - /home/vmadmin/docker_swarm/traefik/certs-traefik.yml:/etc/traefik/dynamic/certs-traefik.yml
    command:
      # Enable Docker in Traefik, so that it reads labels from Docker services
      - --providers.docker
      # Add a constraint to only use services with the label "traefik.constraint-label=traefik-public"
      - --providers.docker.constraints=Label(`traefik.constraint-label`, `traefik-public`)
      # Do not expose all Docker services, only the ones explicitly exposed
      - --providers.docker.exposedbydefault=false
      # Enable Docker Swarm mode
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      - --providers.file.filename=/etc/traefik/dynamic/certs-traefik.yml
      # Create an entrypoint "http" listening on port 80
      - --entrypoints.http.address=:80
      # Create an entrypoint "https" listening on port 443
      - --entrypoints.https.address=:443
      # Enable the access log, with HTTP requests
      - --accesslog
      # Enable the Traefik log, for configurations and errors
      - --log.level=DEBUG
      - --log
      # Enable the Dashboard and API
      - --api
    networks:
      # Use the public network created to be shared between Traefik and
      # any other service that needs to be publicly available with HTTPS
      - jupyterhub_net
networks:
  # Use the previously created public network "traefik-public", shared with other
  # services that need to be publicly available via this Traefik
  jupyterhub_net:
    external: true
JupyterHub docker compose
version: "3"
services:
  jupyterhub:
    image: "jupyterhub-docker-swarm-custom:5.2.1"
    ports:
      - target: 8000
        published: 8000
        mode: host
    deploy:
      placement:
        constraints:
          # place hub on master node
          - node.labels.traefik.main-node == true
          - node.role == manager
      labels:
        - traefik.enable=true
        - traefik.constraint-label=traefik-public
        # create router rule HTTPS
        - traefik.http.routers.jupyterhub-https.rule=Host(`my_domain.com`) || Host(`www.my_domain.com`)
        - traefik.http.routers.jupyterhub-https.entrypoints=https
        - traefik.http.routers.jupyterhub-https.tls=true
        # create router rule HTTP redirect rule
        - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
        - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
        # traefik-http set up only to use the middleware to redirect to https
        - traefik.http.routers.jupyterhub-http.rule=Host(`my_domain.com`) || Host(`www.my_domain.com`)
        - traefik.http.routers.jupyterhub-http.entrypoints=http
        - traefik.http.routers.jupyterhub-http.middlewares=https-redirect
        # add exposed port for traefik to see (does not get it from docker swarm)
        - traefik.http.services.jupyterhub.loadbalancer.server.port=8000
        - traefik.docker.network=jupyterhub_net
    volumes:
      - jupyterhub_pv:/mnt/jupyterhub
      - /var/run/docker.sock:/var/run/docker.sock
      - jupyterhub:/srv/jupyterhub
    networks:
      - jupyterhub_net
    environment:
      DOCKER_NETWORK_NAME: jupyterhub_net
volumes:
  jupyterhub_pv:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /mnt/jupyterhub
  jupyterhub:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /srv/jupyterhub
networks:
  jupyterhub_net:
    external: true
So I am curious: what could be the cause of this error?
How can I fix it?
Has anyone experienced a similar problem, and if so, did it lead to further problems?
Thank you in advance for any help or hints!