Custom Spawner with JupyterHub on K8s

I have written a custom spawner (EC2Spawner) that spins up a full EC2 instance and runs the user's notebook server on it.

My initial setup was:

  • JupyterHub running on a pet EC2 instance.
  • Upon spawning, an EC2 instance is launched, and docker-machine is used to connect to it and start the notebook server.

This has been working fine for a year now. However, that setup was more of a PoC than production-ready. It is getting used more and more, and it is starting to feel like a special snowflake to manage in our infrastructure landscape, which is otherwise fully Kubernetes-native.

So I wanted to migrate the hub to Kubernetes. But it has been a week now and I have made little progress.

I wanted to avoid connecting to the host Docker socket, so I don’t use docker-machine anymore.
However, the interplay of proxies and authentication is not very clear to me.

My current configuration is:
The Hub is deployed via a Deployment, and port 8000 is exposed via an Ingress (plus a Service, of course).
The Hub API is exposed separately via a NodePort.

jupyterhub_config.py

c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.hub_connect_ip = 'localhost'
c.JupyterHub.port = 8000

I set hub_connect_ip to localhost because both the proxy and the remote servers use it to find the Hub, but only the servers are remote. The proxy failed when I set it to an IP that is routable from the internet.
For the servers, I set the --hub-api-url flag to point to the NodePort endpoint of the Hub API.
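For reference, here is a minimal sketch of how the value passed to --hub-api-url can be composed and sanity-checked. The actual value of HUB_URL is not shown in this post, so the host, port, and helper names below are hypothetical. The only thing taken from the logs further down is that the Hub serves its API under /hub/api.

```python
from urllib.parse import urlsplit


def hub_api_url(host, port, base_url="/", scheme="http"):
    """Compose a Hub API URL of the form <scheme>://<host>:<port><base_url>hub/api.

    host and port stand in for the NodePort endpoint (hypothetical values).
    """
    base = "/" + base_url.strip("/")
    if base != "/":
        base += "/"
    return f"{scheme}://{host}:{port}{base}hub/api"


def has_api_prefix(url):
    """Return True if the URL path ends with /hub/api (trailing slash ignored)."""
    return urlsplit(url).path.rstrip("/").endswith("/hub/api")


if __name__ == "__main__":
    url = hub_api_url("10.0.0.5", 30081)  # hypothetical node IP and NodePort
    print(url, has_api_prefix(url))
```

A check like has_api_prefix is cheap to run on whatever HUB_URL resolves to before it is baked into the instance's user_data.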

Hub and proxy are deployed together.

The remote server is started with the following user_data on its own EC2 instance (Security Groups are correctly configured):

docker run -d -p {server_port}:8888 \
  {env_vars} \
  jupyterhub/singleuser:1.0.0 {self.cmd[0]} \
  --hub-api-url {os.environ.get('HUB_URL')} \
  --debug
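The template above could be assembled along these lines, with every argument shell-quoted. This is only a sketch: build_user_data and its parameters are hypothetical names rather than part of EC2Spawner, and passing the environment as -e KEY=VALUE flags is an assumption about what {env_vars} expands to. The docker run flags and the image tag are the ones from the command above.

```python
import shlex


def build_user_data(server_port, env, cmd, hub_api_url,
                    image="jupyterhub/singleuser:1.0.0"):
    """Return a shell script that starts the single-user container.

    env is a dict of environment variables (assumed to be passed with -e),
    cmd is the single-user command, e.g. the spawner's cmd[0].
    """
    env_flags = []
    for key, value in sorted(env.items()):
        env_flags += ["-e", f"{key}={value}"]

    argv = (
        ["docker", "run", "-d", "-p", f"{int(server_port)}:8888"]
        + env_flags
        + [image, cmd, "--hub-api-url", hub_api_url, "--debug"]
    )
    # Quote each argument so values containing spaces survive the shell.
    return "#!/bin/bash\n" + " ".join(shlex.quote(arg) for arg in argv) + "\n"


if __name__ == "__main__":
    print(build_user_data(
        8888,
        {"JUPYTERHUB_API_TOKEN": "example-token"},  # placeholder value
        "jupyterhub-singleuser",
        "http://10.0.0.5:30081/hub/api",  # hypothetical NodePort endpoint
    ))
```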

Everything starts fine, and the notebook server can contact the Hub.
However, the authentication check fails.

On the notebook server:

[D 2020-08-28 11:38:28.295 SingleUserNotebookApp auth:421] No user identified
[W 2020-08-28 11:38:28.314 SingleUserNotebookApp auth:303] Failed to check authorization: [405] Method Not Allowed

On the hub:

[I 2020-08-28 11:35:44.995 JupyterHub base:638] User dummy-user took 129.996 seconds to start
[I 2020-08-28 11:35:44.995 JupyterHub proxy:242] Adding user dummy-user to proxy /user/dummy-user/ => http://xx.xxy.xz.xx:8888
[I 2020-08-28 11:35:44.997 JupyterHub users:533] Server dummy-user is ready
[W 2020-08-28 11:35:44.997 JupyterHub users:442] Stream closed while handling /hub/api/users/dummy-user/server/progress
[I 2020-08-28 11:35:44.998 JupyterHub log:158] 200 GET /hub/api/users/dummy-user/server/progress (dummy-user@aa.aab.ac.aa) 108435.98ms
[I 2020-08-28 11:36:45.770 JupyterHub proxy:301] Checking routes
[I 2020-08-28 11:38:28.003 JupyterHub log:158] 302 GET /hub/user/dummy-user/ -> /hub/spawn?next=%2Fhub%2Fuser%2Fdummy-user%2F (dummy-user@aa.aab.ac.aa) 44.45ms
[I 2020-08-28 11:38:28.060 JupyterHub log:158] 302 GET /hub/spawn?next=%2Fhub%2Fuser%2Fdummy-user%2F -> /user/dummy-user/ (dummy-user@aa.aab.ac.aa) 6.30ms
[I 2020-08-28 11:38:28.252 JupyterHub log:158] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-dummy-user&redirect_uri=%2Fuser%2Fdummy-user%2Foauth_callback&response_type=code&state=[secret] -> /user/dummy-user/oauth_callback?code=[secret]&state=[secret] (dummy-user@aa.aab.ac.aa) 13.84ms
[W 2020-08-28 11:38:28.314 JupyterHub log:158] 405 POST /oauth2/token (dummy-user@10.42.5.0) 11.71ms

I made sure the notebook server's token was valid by issuing requests to the Hub API with it, and they succeeded.
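A token check of this kind can be sketched as follows. The helper names, the host, and the /users/dummy-user path are illustrative assumptions, not a record of the exact requests I sent. The "Authorization: token ..." header is the scheme the Hub REST API accepts.

```python
import json
import urllib.request


def hub_api_request(hub_api_url, path, token):
    """Build (but do not send) an authenticated GET request to the Hub API."""
    url = hub_api_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body; raises HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical endpoint and placeholder token; replace before running.
    req = hub_api_request("http://10.0.0.5:30081/hub/api", "/users/dummy-user", "example-token")
    print(req.full_url)
    # print(fetch_json(req))  # uncomment to actually contact the Hub
```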

I have also tested this with both DummyAuthenticator and GitLab OAuth.
I also tried the JupyterHub Helm chart, but it is tightly coupled to the logic of KubeSpawner, and customizing it seemed like even more work.

Any hints on what I could be doing wrong would be greatly appreciated!

Thanks

I forgot to mention that I tested this with versions 0.9.6, 1.0.0, and 1.2 (1.1.0 has a broken Docker image, so I couldn’t test it; see this issue).