Hello, I have set up BinderHub (using the Zero to BinderHub guide) on our company server, and I want our users to have access to this BinderHub to build and run their images with Jupyter notebooks. Potentially dozens of users will be using it, creating images that might take GBs of storage space.
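For reference, the installation roughly followed the guide; the release name, namespace and chart version below are placeholders, not my exact values:

```shell
# Roughly what I ran, following the Zero to BinderHub guide
# (release name, namespace and chart version are placeholders)
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart
helm repo update
helm install binderhub jupyterhub/binderhub \
    --version=<chart-version> \
    --namespace=binderhub --create-namespace \
    -f secret.yaml -f config.yaml
```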
My question is: the guide insists on me setting up a container registry (Docker Hub, Azure Container Registry, or others), but from what I understand, using an image registry is not strictly necessary, as it is only used as a "transport" middleman between BinderHub and JupyterHub, correct? And since users could potentially create thousands of images, that kind of upload/download traffic to/from Docker Hub could be very taxing, right?
I want to run everything locally on my server, meaning BinderHub, JupyterHub, images, everything on one machine under that machine's public IP in the same Kubernetes cluster. (I would love to dodge Kubernetes and just install everything on the machine like normal applications, but I guess that does not work with BinderHub and I have to use Helm?)
So is it possible to just have BinderHub and JupyterHub locally next to each other in the same Kubernetes cluster, and forget about using a registry? If I set "use_registry" to "false" in config.yaml, it seems to skip the upload to Docker Hub, but then it fails when starting the image, so do I somehow have to install JupyterHub into the Kubernetes cluster with Helm too?
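For concreteness, my config.yaml currently looks roughly like this (the commented-out hub_url value is a placeholder; not knowing what to put there is exactly part of my question):

```yaml
# Rough sketch of my current config.yaml (values are placeholders)
config:
  BinderHub:
    use_registry: false
    # hub_url: http://???   # <- this is the part I don't understand:
    #                             what IP/URL do I point BinderHub at?
```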
So my MAIN question is: if I indeed don't have to use an image registry and can just run BinderHub and everything locally, how do I run the images? How do I set up the path between BinderHub and JupyterHub? I thought that installing BinderHub with Helm also installs its own JupyterHub (because how else would it run the image after building it); is that true? Or do I have to follow the Zero to JupyterHub guide to install JupyterHub with Helm to run my images? I am quite confused about how to actually run the built notebook images: they are supposed to run in JupyterHub, but I have to provide the IP of the JupyterHub, which I don't get. I thought JupyterHub just runs locally in the same Kubernetes cluster; why would someone run BinderHub on one machine and launch the images, by uploading and downloading them through Docker Hub, on a JupyterHub running on a completely different machine? If I run BinderHub and JupyterHub locally, do I just refer to the JupyterHub IP as 127.0.0.1?
So how do I go from "I can successfully build images in BinderHub" to also "running them locally on the same machine, so that users can build and run their images through the browser"? Thanks a lot; I am very confused about the Helm usage and setup, about what role JupyterHub plays, and about how I give it the images and run them.
BTW, I actually already run JupyterHub on the same machine (it was working before I even thought about BinderHub), but with batchspawner (PBSSpawner to be exact), so it spawns Jupyter notebook servers on the compute nodes of our cluster (a supercomputer). Right now it is enough for me if I can run BinderHub images just locally (with JupyterHub's local process spawner, I suppose?), but in the future it is possible I will want BinderHub to build my users' images and start them on a compute node. (I don't mind having a separate second JupyterHub for that, as I don't expect my JupyterHub users to mix notebook servers created from JupyterHub with notebooks created from BinderHub.)
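For context, the existing hub's jupyterhub_config.py is along these lines (a simplified sketch; the actual PBS batch-script options are omitted):

```python
# Simplified sketch of my existing jupyterhub_config.py
# (actual PBS batch-script options omitted)
import batchspawner  # noqa: F401 -- registers the batch spawner classes

# Notebook servers are submitted as PBS jobs on the compute nodes.
c.JupyterHub.spawner_class = 'batchspawner.PBSSpawner'
# For BinderHub-built images I would presumably switch this to
# 'jupyterhub.spawner.LocalProcessSpawner' (or similar) instead?
```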
Thanks a ton, I hope I explained correctly what I want. If I didn't, just forget about my setup and how I think I should run things, and show me a working tutorial where BinderHub builds and runs images without uploading them to Docker Hub, without assuming I know anything: just how to set this up on a freshly installed Ubuntu machine. (I am mainly confused about where I get the JupyterHub and how to run images in it.)