New to JupyterHub - Queries

Hi there,

I have some queries regarding setting up my first JupyterHub to manage my JupyterLab deployments in the future.

  1. Based on this guide (https://jupyterhub.readthedocs.io/en/stable/installation-guide-hard.html), we intend to set this up in an air-gapped environment without internet connectivity. If we manage to set up an internal apt mirror and PyPI mirror, will we be able to complete the installation?

  2. I saw the installation instructions on creating default conda envs for JupyterHub. Is this mandatory? We currently don't have any conda mirrors.

  3. I am exploring whether to run JupyterHub on bare metal or a VM, or on Docker. Are there any benefits and trade-offs? I saw the Jupyter Docker Stacks, which I think are great, but given the air-gapped nature of our environment, will they be useful? E.g. if a data scientist or user needs some framework or library, we have to do a Python install from the offline repo.

Yes. In fact, a private docker registry should be the only thing you need. To build your own user images, an apt mirror and a PyPI and/or conda mirror ought to work.

No. It’s up to you how you install software in the user images. Most of our default images set up Python and the scientific Python stack with conda, but the only really hard requirements are that the jupyterhub and notebook packages are installed. Both can be installed from PyPI.
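As a rough sketch of checking that requirement inside a user image, the snippet below asks the Python environment whether the two packages named above are installed. The helper name `missing_packages` is made up for illustration; it only uses the standard library's `importlib.metadata` and says nothing about which versions are compatible with your hub.

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            # version() raises PackageNotFoundError if the distribution is absent.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two packages the reply calls hard requirements for a user image.
    print(missing_packages(["jupyterhub", "notebook"]))
```

Running this with the image's Python should print an empty list if both packages are present, however they were installed (conda or pip).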

Lots! The biggest difference is that Unix user accounts must exist in the VM case, but not in the Docker case. This can cut both ways, depending on your existing situation. If you already have infrastructure for dealing with user accounts on shared machines (LDAP PAM plugin, etc.), then this may be a plus. For most folks, though, it is more of a cost, and not needing to manage users makes Docker deployments simpler. The Littlest JupyterHub runs on a bare VM, but creates prefixed user accounts only while servers are running.

Software installation and updates might be made simpler or more complicated by your air-gapped setup. I can’t really speak to that, but maybe that should be the deciding factor. I would guess that doing Docker builds somewhere else, and requiring only access to a private Docker registry, would make management simpler.
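To make the "build elsewhere, pull from a private registry" idea a bit more concrete, here is a minimal sketch that only builds the docker CLI commands for copying an image into a private registry. The registry host `registry.internal.example:5000` and the helper name `mirror_commands` are placeholders, and it assumes the source image name has no registry host prefix and that the build host can reach the private registry. A fully disconnected transfer would need `docker save` and `docker load` instead.

```python
import shlex
from typing import List


def mirror_commands(image: str, registry: str) -> List[List[str]]:
    """Build the docker commands that copy `image` into `registry`.

    Assumes `image` looks like "namespace/name:tag" with no registry host.
    """
    target = f"{registry}/{image}"
    return [
        ["docker", "pull", image],         # fetch from the upstream registry
        ["docker", "tag", image, target],  # give it a name in the private registry
        ["docker", "push", target],        # upload to the private registry
    ]


if __name__ == "__main__":
    # Hypothetical registry host; replace with your own.
    for cmd in mirror_commands("jupyter/scipy-notebook:latest",
                               "registry.internal.example:5000"):
        print(shlex.join(cmd))
```

The script prints the commands rather than running them, so they can be reviewed before anything is pushed.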

I saw the Jupyter Docker Stacks, which I think are great, but given the air-gapped nature of our environment, will they be useful? E.g. if a data scientist or user needs some framework or library, we have to do a Python install from the offline repo.

You can use them or not. It might be easier for your policy maintenance not to use them. They are pretty big and might be hard to audit. They should be configured such that users can perform a pip install --user from your private PyPI mirror just fine.
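For pip to use an internal mirror without every user passing extra flags, one option is a pip configuration file baked into the image (for example the global `/etc/pip.conf` on Linux). The sketch below only renders the file contents; the mirror URL is a placeholder and the helper name `render_pip_conf` is made up for illustration.

```python
import configparser
import io
from urllib.parse import urlparse


def render_pip_conf(index_url: str, trust_host: bool = False) -> str:
    """Return pip.conf text that points pip at `index_url`."""
    # interpolation=None so a '%' in the URL is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    if trust_host:
        # Only needed when the mirror uses plain HTTP or a certificate
        # that pip cannot verify.
        parser["global"]["trusted-host"] = urlparse(index_url).hostname
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical mirror URL; replace with your own.
    print(render_pip_conf("https://pypi.internal.example/simple"))
```

With a file like this in place, a plain pip install --user should resolve packages from the mirror instead of the public PyPI.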


I have a Docker-based system on GitHub under joequant/bitquant.

I’ve found that a VM makes things much too complicated, but Docker is extremely useful. The main things Docker is great for are: 1) if you have to replace hardware and start with a new machine, you can just move the Docker image onto the new machine, which is really useful if you have to replace disks; and 2) having multiple Docker images lets separate groups run on the same hardware.

Hello Joseph,

Thanks for the update! Any idea about JupyterHub usage with GPUs? If my VM has only 1 GPU card attached, and assuming a Docker image with the GPU CUDA driver has been built for use, how do multiple pods consume the GPU concurrently when they are spun up?