Hi all,
I am Steffen, a PhD student at RWTH Aachen University in Germany.
Usually, I work on my dissertation in the area of real-time power system simulation and hardware-in-the-loop testing at the EON Energy Research Center.
However, over the last few months I got the chance to return to my original field of study (computer engineering) and build a fairly big JupyterHub cluster in collaboration with the IT Center of our university.
And these last months have been a great adventure! From the beginning my goal has been to push new ideas/technologies for this cluster.
Currently I am working on a couple of areas in the Jupyter universe:
IPv6 support
Due to the IPv4 address shortage, we decided to deploy an IPv6-only Kubernetes cluster in which every user gets a publicly routable IPv6 address in their single-user environment. This lets us easily provide access to the public Internet from the single-user environments while keeping the ability to track down abuse based on these public IPv6 addresses.
At the same time, it is a nice project to showcase IPv6 readiness at the university.
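To make the abuse-tracking idea a bit more concrete, here is a minimal Python sketch. It assumes a hypothetical lease log that records which user held which address during which time window. The record layout and the names `Lease` and `find_user` are made up for illustration and do not describe the actual setup of the cluster.

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import IPv6Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Lease:
    """Hypothetical record: which user held which address, and when."""
    user: str
    address: IPv6Address
    start: datetime
    end: datetime


def find_user(leases: Iterable[Lease], address: str, when: datetime) -> Optional[str]:
    """Return the user who held `address` at time `when`, or None."""
    # Parsing normalises the textual form, so compressed and fully
    # expanded notations of the same address compare equal.
    target = IPv6Address(address)
    for lease in leases:
        if lease.address == target and lease.start <= when < lease.end:
            return lease.user
    return None


# Example data using the 2001:db8::/32 documentation prefix.
LEASES = [
    Lease("alice", IPv6Address("2001:db8::1a"),
          datetime(2020, 4, 20, 9, 0), datetime(2020, 4, 20, 11, 0)),
    Lease("bob", IPv6Address("2001:db8::1a"),
          datetime(2020, 4, 20, 12, 0), datetime(2020, 4, 20, 13, 0)),
]
```

Because addresses can be reused by different users over time, the lookup needs both the address and the timestamp from the abuse report.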
Profile management
We give each lecture the ability to define its own single-user environment by providing us with a Dockerfile based on the docker-stacks repo. As we currently have around 20 different lectures and labs, managing these single-user environments is becoming a bit laborious.
To solve this issue, we started writing a JupyterHub service which monitors and triggers GitLab CI pipelines that build the Docker images for these profiles using GitLab CI runners in the Kubernetes cluster.
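As a rough sketch of the "trigger" half of such a service, the following Python snippet builds a request for GitLab's pipeline trigger endpoint (`POST /projects/:id/trigger/pipeline`). This is not the actual service. The GitLab URL, project ID, token, and the `PROFILE` variable name are placeholders chosen for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trigger_request(gitlab_url, project_id, trigger_token, ref, variables=None):
    """Build (but do not send) a POST request for GitLab's pipeline trigger API."""
    form = {"token": trigger_token, "ref": ref}
    # CI variables are passed as variables[KEY]=value form fields.
    for key, value in (variables or {}).items():
        form[f"variables[{key}]"] = value
    url = f"{gitlab_url.rstrip('/')}/api/v4/projects/{project_id}/trigger/pipeline"
    return Request(url, data=urlencode(form).encode("ascii"), method="POST")


request = build_trigger_request(
    "https://gitlab.example.org",        # hypothetical GitLab instance
    project_id=42,                       # hypothetical project holding a profile's Dockerfile
    trigger_token="TOKEN",               # placeholder for a pipeline trigger token
    ref="master",
    variables={"PROFILE": "lecture-a"},  # hypothetical variable name
)
# Passing `request` to urllib.request.urlopen() would actually start the pipeline.
```

Building the request separately from sending it keeps the sketch testable without a running GitLab instance.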
Multi-tenancy
Given that we are a quite large university in Germany, multiple faculties have already been using our cluster since we launched last week. To keep it manageable for our IT Center, we need a certain degree of multi-tenancy. This issue is partially solved by the profile management service described in the previous section. This includes integration into our Moodle LMS, self-service profile management, and the possibility to restrict access to certain profiles based on course membership in Moodle.
All of this becomes slightly more difficult since we are using different single sign-on solutions for Moodle and Jupyter.
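The access restriction can be pictured as a simple filter over the profile list. The Python sketch below is only an illustration under stated assumptions: the `required_course` key, the course identifiers, and the image names are hypothetical, and how the course memberships are actually fetched from Moodle is left out.

```python
def visible_profiles(profiles, user_courses):
    """Return the profiles a user may start.

    A profile without a `required_course` key is open to everyone.
    """
    courses = set(user_courses)
    return [
        p for p in profiles
        if p.get("required_course") is None or p["required_course"] in courses
    ]


# Hypothetical profile definitions; only `required_course` matters to the filter.
PROFILES = [
    {"display_name": "Default Python", "image": "example/default"},
    {"display_name": "Power Systems Lab", "image": "example/power-lab",
     "required_course": "course-101"},
    {"display_name": "Numerics", "image": "example/numerics",
     "required_course": "course-202"},
]

# A user enrolled only in course-202 sees the open profile plus "Numerics".
names = [p["display_name"] for p in visible_profiles(PROFILES, ["course-202"])]
```

A user without any course membership would only see the open "Default Python" profile.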
Here are some more details about our cluster:
Software
- CentOS (8.1)
- Docker (19.04)
- Shibboleth SP (nginx-http-shibboleth)
- Kubernetes (1.18.0)
- Calico
- nginx-ingress
- Rook.io / Ceph (1.3.0 / 15.2)
- IPv6-only cluster
- Jool NAT64 Gateway
- Rancher (2.4.2)
- Jupyter{Hub,Lab} (master branch for all components)
- Automated Provisioning via Ansible and Razor
Hardware
7x Dell PowerEdge 740XD:
- Dual-socket systems: 2x 16C / 32T Xeon Gold 5218, 2.3 GHz
- 768 GB DDR4 RAM / node (5,376 GB total)
- 100 TiB Ceph Hyper-Converged storage
- Dual 10 GigE / node
Additionally, 1 of the nodes is equipped with:
- 2x NVIDIA Tesla T4 GPGPUs