Local Unix users in JupyterHub (minikube): setting up authentication, volumes, NB_UID, NB_USER

Hi,

I am trying to set up a Jupyterhub installation for use in our research group. A similar setup might be needed for people running courses etc.

The idea is to use a single, powerful workstation, shared between a few users, to serve several jupyter-remote-desktop images for tasks that require more CPU/RAM and better connectivity to the data store than users have on their local machines.

Issues around user authentication, volume permissions and user IDs are touched on in several threads here and in various tutorials, but I have not managed to put them all together.

My setup is a workstation running Ubuntu 20.04, with minikube installed and JupyterHub installed from the helm chart following the instructions from Z2JH.
I have local unix users, let’s call them peter, paul and mary, who each have a directory under /home/<username> and additional storage under /data/shared/<username>. Let’s say they have UIDs 200, 201 and 202 respectively, and all belong to the same group. There is no LDAP.

What I would definitely like to have is:

  • the users should be able to authenticate to jupyterhub using their normal unix password and username
  • the users should be able to select from several jupyter desktop images
  • the /data/shared/<username> directories should be mounted as volumes in the started containers with the correct permissions. I assume that requires somehow setting the NB_USER and NB_UID environment variables, e.g. to mary and 202 when spawning a container for mary.

What I am not sure about, and what I consider optional:

  • the home directory /home/<username> might be mounted as a volume and possibly also serve as the HOME directory within the container (instead of /home/jovyan). I am not 100% sure I want this, as users might persist a .bashrc or other config files that mess up the standardized environment I am trying to provide within the container, or that conflict between regular ssh use of the workstation and use through JupyterHub.

I have managed to set up a basic config.yaml that allows users (that I manually add) to log in and also to provide them a choice of docker images to start.

Now there are several things I am struggling with. I will end this post with one concrete question and will add to the thread with additional questions as they arise:

I don’t quite understand how to provide correct user names and user IDs. I have seen some examples of more complicated cases (e.g. for LDAP) where people use a custom authenticator written in Python, which is then used to set NB_UID and NB_USER.
I don’t know whether I need to write a custom authenticator for this simple use case or can just use one of the provided ones. And how do I get the username and user ID from the authenticator and pass them to the spawner? A little snippet to put into config.yaml for the helm chart would be greatly appreciated.

JupyterHub on Kubernetes (Z2JH) is a fully containerised deployment, so you can’t use local Unix users for authentication. That’s why a remote authenticator like LDAP or OAuth is used in situations where a common user identity is required on different servers.

If you want to run your users’ servers in containers, but have everything hosted on a single server and use local authentication, you’re probably better off using DockerSpawner (jupyterhub/dockerspawner on GitHub, which spawns JupyterHub single-user servers in Docker containers) and installing and configuring JupyterHub yourself.
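As a rough illustration of what such a self-managed setup could look like, here is a minimal jupyterhub_config.py sketch. The image names are placeholders, the mount target /data/shared is an assumption, and networking options are left out, so check everything against the DockerSpawner documentation rather than treating this as a working config:

```python
# jupyterhub_config.py (sketch only; image names below are placeholders)
c = get_config()  # noqa: F821, provided by JupyterHub when it loads this file

# PAM is JupyterHub's default authenticator when the hub runs on the host.
c.JupyterHub.authenticator_class = "jupyterhub.auth.PAMAuthenticator"
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Offer a choice of images on the spawn page (display name -> image).
c.DockerSpawner.allowed_images = {
    "Desktop A (placeholder)": "example/desktop-a:latest",
    "Desktop B (placeholder)": "example/desktop-b:latest",
}

# Host path -> container path; {username} is expanded for each user.
c.DockerSpawner.volumes = {"/data/shared/{username}": "/data/shared"}

# Jupyter Docker Stacks images only apply NB_USER / NB_UID when the
# container starts as root; the start script then switches to that user.
c.DockerSpawner.extra_create_kwargs = {"user": "root"}
```

This only covers authentication, image choice and the volume mount. The NB_USER and NB_UID values still have to be set for each user at spawn time.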

You’ll need to subclass your chosen authenticator to set the NB_* variables. You’d need to modify its authenticate() method to fetch and return additional information in auth_state, and then use that information to set some environment variables:
https://jupyterhub.readthedocs.io/en/stable/reference/authenticators.html#using-auth-state
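A minimal sketch of that pattern for local Unix accounts, assuming the hub runs on the host so that the pwd module sees the real users. UnixIdentityMixin, unix_identity and nb_environment are hypothetical names made up for this example. The mixin is meant to be combined with a real authenticator in jupyterhub_config.py, e.g. `class LocalAuth(UnixIdentityMixin, PAMAuthenticator): pass`, and auth_state only persists if `c.Authenticator.enable_auth_state = True` is set and a JUPYTERHUB_CRYPT_KEY is configured:

```python
import pwd


def unix_identity(username):
    """Look up a local Unix account; raises KeyError if it does not exist."""
    entry = pwd.getpwnam(username)
    return {"uid": entry.pw_uid, "gid": entry.pw_gid}


def nb_environment(username, identity):
    """Build the NB_* variables read by the Jupyter Docker Stacks start script."""
    return {
        "NB_USER": username,
        "NB_UID": str(identity["uid"]),
        "NB_GID": str(identity["gid"]),
    }


class UnixIdentityMixin:
    """Hypothetical mixin; place it before the real authenticator in the MRO."""

    # Kept as an attribute so the lookup can be swapped out (e.g. in tests).
    identity_lookup = staticmethod(unix_identity)

    async def authenticate(self, handler, data):
        # Let the real authenticator (e.g. PAM) check the password first.
        result = await super().authenticate(handler, data)
        if result is None:
            return None
        name = result["name"] if isinstance(result, dict) else result
        # Returning a dict with auth_state stores the extra fields with the user.
        return {"name": name, "auth_state": self.identity_lookup(name)}

    async def pre_spawn_start(self, user, spawner):
        # Called by JupyterHub just before the user's server is started.
        auth_state = await user.get_auth_state()
        if not auth_state:
            return  # auth_state disabled or empty: leave the environment alone
        spawner.environment.update(nb_environment(user.name, auth_state))
```

With this in place, spawning a server for mary (UID 202) would add NB_USER=mary and NB_UID=202 to the container environment. The Docker Stacks images only act on these variables when the container is started as root.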

Thanks for taking the time to reply @manics .

I didn’t realize I couldn’t use PAM authentication with Kubernetes, but now that you mention it, it makes perfect sense.
The machine is behind a firewall, so I don’t think I can use OAuth services such as those from GitHub.

I also was not aware of DockerSpawner. When googling how to set up a JupyterHub, one almost inevitably ends up on the Zero to JupyterHub webpage. As the first entry point, that creates the impression that Kubernetes is the way to go to set up JupyterHub.

Now that I have invested a bit of time learning about Kubernetes and setting up minikube, I am still wondering whether I can make this work with what I already have. This is only for a handful of users and on a non-public network, so maybe I can just maintain a simple dictionary-based user/password authenticator.
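To make the dictionary idea concrete, here is a rough sketch of the password check I have in mind. hash_password and check_password are names I made up, and the iteration count is an arbitrary choice. Inside an Authenticator subclass, authenticate(handler, data) would then presumably return check_password(USERS, data["username"], data["password"]):

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest) so plain-text passwords never sit in the config."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)
    return salt, digest


def check_password(users, username, password):
    """Return the username if the password matches, otherwise None."""
    record = users.get(username)
    if record is None:
        return None
    salt, expected = record
    _, actual = hash_password(password, salt)
    # compare_digest avoids leaking information through timing differences.
    return username if hmac.compare_digest(actual, expected) else None


# Hypothetical user table; in practice the hashes would be generated once
# and stored, not derived from literals in the config file.
USERS = {"peter": hash_password("example-only")}
```

Returning None on failure matches what JupyterHub expects from authenticate() when a login is rejected.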

I will also look at DockerSpawner and see whether I can make it work.

Z2JH includes the Native Authenticator which might work for you?
