Recent rollouts of JupyterHub for Kubernetes for 100+ student academic courses?

I’m a prof at Queen’s University (Kingston, Ontario) who runs a Jupyter-based math modelling course with 200-250 students. For the last four years, we’ve had students install Anaconda locally for their Jupyter Notebook access, but we’d like to move to a cloud-based JupyterHub option for September 2024, and we have some seed money to do so.

In looking for recent deployments at that scale, most of what I’ve seen is either from a while ago (2018-2021) or describes ongoing deployments, like the one at UC Berkeley, without much deployment detail. The “Zero to JupyterHub” docs are great, but I’m also looking for companion materials, such as blog posts or documentation from people’s lived experience, e.g. “I followed the JupyterHub K8S docs through parts X and Y, but I really had to customize Z. And look out for A: this really blew up our first week of classes!”

For relative skill level, I was comfortable setting up a cloud-based The Littlest JupyterHub with nbgrader for marking in that same class, but I recognize that it’s a big jump once K8S servers, nodes, and file persistence get into the mix!

Thanks,

Alan

Alan Ableson
Assistant Professor
Mechanical and Materials Engineering, Queen’s University
ableson@queensu.ca

Public configuration for many live JupyterHub deployments is available for reference in the GitHub repo 2i2c-org/infrastructure.

I’d recommend either AWS, setting up a k8s cluster with eksctl together with EFS for home directory storage, or GCP with persistent disks (Google Filestore isn’t great). For a single hub, perhaps use the proxy.traefik / autohttps functionality in z2jh; for more than one hub, use the cert-manager and ingress-nginx charts instead.
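To make the single-hub option a bit more concrete, here is a minimal sketch in Python that builds a dict shaped like a z2jh Helm values file and prints it as JSON (which Helm can read, since JSON is valid YAML). The key names follow the z2jh configuration reference as I understand it; the hostname, contact email, PVC name, and memory figures are hypothetical placeholders, and the EFS-backed PVC is assumed to already exist in the cluster.

```python
import json


def build_values(host, contact_email, pvc_name,
                 mem_guarantee="1G", mem_limit="2G"):
    """Return a dict shaped like a z2jh Helm values file (illustrative only)."""
    return {
        "proxy": {
            # Built-in HTTPS via the chart's autohttps (Let's Encrypt) support.
            "https": {
                "enabled": True,
                "hosts": [host],
                "letsencrypt": {"contactEmail": contact_email},
            }
        },
        "singleuser": {
            # Per-user memory request (guarantee) and cap (limit).
            "memory": {"guarantee": mem_guarantee, "limit": mem_limit},
            # One shared, pre-created PVC (e.g. backed by EFS), with a
            # separate subdirectory per user.
            "storage": {
                "type": "static",
                "static": {
                    "pvcName": pvc_name,
                    "subPath": "home/{username}",
                },
            },
        },
    }


if __name__ == "__main__":
    # All three arguments are made-up placeholders.
    values = build_values("hub.example.org", "admin@example.org", "efs-home")
    print(json.dumps(values, indent=2))
```

Saving the output to a file and passing it to `helm upgrade` with `--values` would be the usual next step; check every key against the z2jh docs for the chart version you install before relying on it.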

I’m not sure, but I think nbgrader may not have been made compatible or easy to use with z2jh-based deployments. I haven’t kept up to date, but I recall its assumptions about a local filesystem being broken there.

Cluster-autoscaler is also relevant on EKS. And use r5.xlarge / n2-highmem-4 nodes or similar, at least for user pods: memory is the constraint.
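As a back-of-the-envelope illustration of why memory drives the node choice, here is a small Python sketch that estimates how many nodes a given number of concurrent users would need. It assumes 32 GiB per node (the size of an r5.xlarge or n2-highmem-4); the 4 GiB reserved for system pods and the per-user memory figures are guesses of mine, not measured values.

```python
import math


def nodes_needed(concurrent_users, mem_per_user_gib,
                 node_mem_gib=32.0, reserved_gib=4.0):
    """Estimate the node count when memory is the binding constraint.

    reserved_gib is an assumed allowance for the OS and system pods.
    """
    usable_gib = node_mem_gib - reserved_gib
    if mem_per_user_gib <= 0 or usable_gib <= 0:
        raise ValueError("memory figures must be positive")
    users_per_node = math.floor(usable_gib / mem_per_user_gib)
    if users_per_node < 1:
        raise ValueError("a single user does not fit on one node")
    return math.ceil(concurrent_users / users_per_node)


if __name__ == "__main__":
    # Worst case for a 250-student course: everyone online at once.
    for mem in (1.0, 2.0):
        print(f"{mem} GiB/user -> {nodes_needed(250, mem)} nodes")
```

With these assumptions, 250 simultaneous users at 1 GiB each come to 9 nodes, and 18 nodes at 2 GiB each. In practice not everyone is online at once, and the cluster-autoscaler removes idle nodes, so the peak figure is only an upper bound.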

I wish things were even easier :confused:

In terms of cost efficiency, assuming your time is worth a notable amount per hour, I think it could be better to ask for help from 2i2c.org, for example. That said, there are a lot of nice things to learn from doing it yourself.

Disclaimer: I’m part of 2i2c.org now, but I have also been a maintainer of z2jh for many more years than that.

Thanks for the pointer! 2i2c was on my radar, but I hadn’t seen their (your?) infrastructure guide: that’s a great resource.

I’m pretty sure I have a lot more to learn, though, judging by the number of terms in your reply alone that I only loosely recognize! :slight_smile: I appreciate you mentioning the nodes: I was afraid we would need the x-large or high-memory nodes. They are definitely pricier than the basic options, but if that’s what students actually need, it’s better to know in advance.