While researching how to combine JupyterHub with GitLab (beyond using the latter for authentication), I came across Renku, a project from the Swiss Data Science Center that partly does what I would like to accomplish. It is available at https://renkulab.io.
They used to use JupyterHub as part of their deployment, but for the latest release of their platform they moved to their own Kubernetes Jupyter server manager, called Amalthea.
The full explanation of why they did so can be found in the README file. Basically, JupyterHub is seen as a complete, self-sufficient application that end users can install and use directly (it includes authentication, user management, Jupyter server spawning, status tracking, etc.), while Amalthea only handles the Jupyter server management part in Kubernetes so that it can be integrated into other setups. From that point of view, system admins are responsible for providing authentication (currently OpenID Connect only) as well as user management.
JupyterHub uses the KubeSpawner class to create the Jupyter server pods, but I was wondering whether some ideas or concepts from Amalthea could be useful there, and whether the project could be of broader interest to the Jupyter community.
It sounds like a neat idea! Effectively they’ve partially implemented JupyterHub as a Kubernetes controller, and are managing single-user servers as custom resources defined through a custom resource definition (CRD). That makes sense when you’re all-in on Kubernetes, have no need for all of JupyterHub’s features, and want an easy way to add custom features without worrying about JupyterHub compatibility.
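To make the controller idea a bit more concrete, here is a minimal sketch of the reconcile pattern that Kubernetes controllers are generally built around: compare the desired state (the custom resources) with the observed state (what is actually running) and derive the actions needed to close the gap. This is not Amalthea's code and it does not talk to a cluster; the `reconcile` function, the server names, and the image tags are all hypothetical and only illustrate the concept.

```python
def reconcile(desired, observed):
    """Return the actions needed to make the observed state match the desired state.

    Both arguments map a server name to its spec (plain dicts here,
    standing in for custom resources and running pods).
    """
    actions = []
    # Declared but not running yet: create it.
    for name in sorted(desired.keys() - observed.keys()):
        actions.append(("create", name))
    # Running but no longer declared: delete it.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(("delete", name))
    # Present on both sides but with a different spec: update it.
    for name in sorted(desired.keys() & observed.keys()):
        if desired[name] != observed[name]:
            actions.append(("update", name))
    return actions


# Hypothetical example data, not taken from any real deployment.
desired = {
    "alice": {"image": "example/notebook:2"},
    "bob": {"image": "example/notebook:2"},
}
observed = {
    "bob": {"image": "example/notebook:1"},
    "carol": {"image": "example/notebook:1"},
}

actions = reconcile(desired, observed)
print(actions)
```

A real controller would run this comparison every time a resource changes and would call the Kubernetes API instead of returning a list, but the shape of the logic is the same.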
Thanks @sgaist for taking note of our project and mentioning it here. In fact it has been on my to-do list to reach out to the Jupyter community about the project, but you were faster. However, now is definitely a good moment to talk about the project, as we’ve been running Amalthea in production as part of the RenkuLab platform for 2-3 months, and after some small initial hiccups (like k8s probes preventing the culling of idle servers) Amalthea is now smoothly managing up to 300 parallel user sessions on renkulab.io. So I feel confident in claiming that the project is past the “proof-of-concept” stage.
One aspect that I would like to stress is the extensibility of the custom resources, which was already mentioned by @manics. This was indeed one of the guiding principles when designing Amalthea, but something like adding a network policy could be achieved even more easily by adding a patch to the JupyterServer object spec. In fact, we heavily use the patching option in RenkuLab, as can be seen here. For example, we add a sidecar container to each JupyterServer which acts as a proxy for the HTTP traffic to the hosted Git repository that is cloned into the JupyterServer. As this is a very RenkuLab-specific use case, we intentionally kept it out of Amalthea.
If there’s interest in the Jupyter community in learning more about Amalthea, we’re more than happy to share our ideas and experience in more detail. Also, we are certainly very open to any form of collaboration with, or contribution back to, Project Jupyter if what we have developed is seen as useful in a larger scope.
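As a rough illustration of what such a patch does, the sketch below applies a single "add" operation in the style of JSON Patch (RFC 6902) to append a sidecar container to a pod-like spec. It assumes the patch is expressed as a JSON Patch operation; the exact patch format and schema Amalthea accepts should be checked in its documentation. The `apply_add` helper, the simplified object layout, the container names, and the image names are all hypothetical.

```python
import copy


def apply_add(document, path, value):
    """Apply one JSON-Patch-style "add" operation and return a patched copy.

    Only the subset needed here is handled: walking dict keys and list
    indices, and "-" as the last path segment to append to a list.
    """
    result = copy.deepcopy(document)
    # JSON Pointer: segments are separated by "/", with "~1" and "~0" escapes.
    parts = [p.replace("~1", "/").replace("~0", "~") for p in path.split("/")[1:]]
    target = result
    for key in parts[:-1]:
        target = target[int(key)] if isinstance(target, list) else target[key]
    last = parts[-1]
    if isinstance(target, list):
        if last == "-":
            target.append(value)
        else:
            target.insert(int(last), value)
    else:
        target[last] = value
    return result


# Simplified stand-in for a server's pod spec (hypothetical, not the real schema).
server = {
    "spec": {
        "containers": [{"name": "jupyter-server", "image": "example/notebook:1"}]
    }
}

# Hypothetical patch adding a proxy sidecar next to the Jupyter container.
sidecar_patch = {
    "op": "add",
    "path": "/spec/containers/-",
    "value": {"name": "git-proxy", "image": "example/git-proxy:1"},
}

patched = apply_add(server, sidecar_patch["path"], sidecar_patch["value"])
print([c["name"] for c in patched["spec"]["containers"]])
```

The point of the pattern is that deployment-specific additions such as this sidecar live in the patch supplied by the integrating platform, so the operator itself stays generic.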
The JupyterHub meeting tends to be more technically focussed on JupyterHub. A few JupyterHub devs have already discussed Amalthea and whether there’s potential for sharing code between it and e.g. KubeSpawner in the longer term, so if you’ve got any thoughts it’d be great to hear them as well as see a demo!
In contrast the Community call has a mix of advanced users, developers, sysadmins, and non-technical users, and is a good forum to show off what you’re doing at a high level. It’s recorded so also gets more views afterwards. With my academic hat on I’d say it’s a good place to get more awareness and promote what you’re doing.
I’m biased and interested in the technical details so I’d say come to the JupyterHub meeting, but really the community call is equally good. You can even do both if you want!
Sounds great, then I’ll join the JupyterHub meeting on January 20 and talk about those technical details. I might also seize the opportunity to present Amalthea to the larger community call audience at some point.
You are totally welcome at a community call if you feel the inspiration! I saw your lovely presentation in the JupyterHub meeting today and am sure there are people across projects who would be interested if you ever feel like sharing (the calls are always on the last Tuesday of the month).
Thank you very much @isabela-pf! I definitely feel the inspiration to do this. The one in January is tight, but I’ll get back to you about the February edition!
@isabela-pf I can confirm that I and/or someone else from my team will happily join the February community call. That would be February 22nd, 4pm UTC, right? (Just to make sure we block the correct slot in our calendars.)