JupyterHub for a reproducible research platform

sgaist · June 8, 2021, 12:46pm

I am part of a team that is currently exploring the use of JupyterHub as a base for the next generation of our platform for reproducible research (currently known as BEAT).

Background: We already make use of JupyterHub over Kubernetes for running academic courses on Machine Learning at our institute. We have, therefore, some experience installing and customising this toolset for this purpose.

For the redesign of BEAT, we feel it makes sense to continue on the JupyterHub over Kubernetes track, but we are currently unsure what it would take to implement some of our specifications.

Roughly speaking, we would like to implement something equivalent to Kaggle's notebook interface, while storing notebooks on a Git server (e.g. GitHub/GitLab/…). In this environment, users would be subject to processing quotas (CPU/RAM) and have access to shareable storage “buckets”. Aside from JupyterHub over Kubernetes, we have also tried BinderHub (and even contributed back some patches). However, that approach does not cover all of our specs.

Here is a summary of the points that interest us:

- Shareable data volumes (K8s resources) between users, in a user-controlled manner: user A would like to share one of their data volumes with user B, but not the whole content of their allocated space.
- Variable quotas: user A has access to X amount of CPU/RAM from the K8s cluster, while user B has Y.
- Custom front-end to notebooks: as in Kaggle, we would like users to be able to change some aspects of the current processing environment (e.g. which data volumes to attach) and to monitor their resource usage (we tried using an iframe, but it does not really cover our use case).
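
To make the first two points more concrete, here is a minimal sketch of how per-user quotas and extra shared volumes might be wired into `jupyterhub_config.py` through a spawner `pre_spawn_hook`, assuming KubeSpawner. The `USER_SETTINGS` table, the user names, the claim names and the mount path are all hypothetical; a real deployment would presumably look these up in a database or service that users control. It also assumes the shared PersistentVolumeClaims already exist in the namespace where user pods run and support being mounted by several pods.

```python
from types import SimpleNamespace

# Hypothetical per-user settings (illustration only, not an existing API).
USER_SETTINGS = {
    "user-a": {"cpu": 4.0, "mem": "16G", "shared_claims": []},
    "user-b": {"cpu": 1.0, "mem": "4G", "shared_claims": ["user-a-dataset-1"]},
}
DEFAULTS = {"cpu": 0.5, "mem": "1G", "shared_claims": []}


def apply_user_settings(spawner):
    """Set CPU/RAM limits and attach shared volumes before the pod starts."""
    settings = USER_SETTINGS.get(spawner.user.name, DEFAULTS)

    # cpu_limit and mem_limit are standard Spawner options.
    spawner.cpu_limit = settings["cpu"]
    spawner.mem_limit = settings["mem"]

    # Assumes `volumes` and `volume_mounts` are lists of Kubernetes-style
    # dicts. Build new lists instead of mutating shared defaults in place.
    volumes = list(spawner.volumes)
    mounts = list(spawner.volume_mounts)
    for claim in settings["shared_claims"]:
        volumes.append(
            {"name": claim, "persistentVolumeClaim": {"claimName": claim}}
        )
        mounts.append(
            {
                "name": claim,
                "mountPath": f"/home/jovyan/shared/{claim}",  # hypothetical path
                "readOnly": True,
            }
        )
    spawner.volumes = volumes
    spawner.volume_mounts = mounts


# In jupyterhub_config.py, where `c` is provided by JupyterHub:
# c.Spawner.pre_spawn_hook = apply_user_settings

# Dry run with a stand-in object instead of a real spawner.
fake_spawner = SimpleNamespace(
    user=SimpleNamespace(name="user-b"),
    cpu_limit=None,
    mem_limit=None,
    volumes=[],
    volume_mounts=[],
)
apply_user_settings(fake_spawner)
```

This only covers the spawn-time side. Letting user A decide what to share with user B, and enforcing that decision, would still need a separate service or UI on top.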

From the looks of it, some of these points could be covered by the Jupyter Enterprise Gateway.

We would like to avoid redoing this work if somebody is already tackling a similar use case, and we would be happy to work collaboratively if anyone is interested. Naturally, we intend to open-source all of our contributions.

Please let us know what you think, whether you have suggestions on how to approach this use case, and whether you are interested in following up on it.
