User's `$HOME` in JupyterHub

Sorry to bother you again - I'm still trying to wrap my head around user management in JupyterHub.

I have a repo2docker build setup which works perfectly with MyBinder. Here is the binder config.

In particular, the postBuild script generates several elements in jovyan’s $HOME which are necessary for a proper environment. On MyBinder hubs, this works as expected. Here is a view of a standard user’s $HOME after a MyBinder launch:

jovyan@jupyter-oggm-2doggm-2dedu-2dr2d-2d9p31elan:~$ ls -lah 
total 100K 
drwxr-xr-x 1 jovyan jovyan 4.0K Jun 24 14:38 . 
drwxr-xr-x 1 root root 4.0K May 21 15:19 .. 
-rw-r--r-- 1 jovyan jovyan 220 May 21 15:19 .bash_logout 
-rw-r--r-- 1 jovyan jovyan 3.7K May 21 15:19 .bashrc 
drwxr-xr-x 1 jovyan jovyan 4.0K Jun 24 14:06 binder 
drwx------ 1 jovyan jovyan 4.0K Jun 24 14:15 .cache 
drwxrwsr-x 2 jovyan jovyan 4.0K Jun 24 14:02 .conda
drwx------ 1 jovyan jovyan 4.0K Jun 24 14:05 .config 
drwxr-xr-x 2 jovyan jovyan 4.0K Jun 24 14:05 .empty 
drwxr-xr-x 2 jovyan jovyan 4.0K Jun 24 14:15 .fonts 
drwxr-xr-x 1 jovyan jovyan 4.0K Jun 24 13:55 .git 
drwxr-xr-x 5 jovyan jovyan 4.0K Jun 24 14:38 .ipython 
drwxr-xr-x 3 jovyan jovyan 4.0K Jun 24 14:38 .jupyter 
drwx------ 3 jovyan jovyan 4.0K Jun 24 14:37 .local 
drwxr-xr-x 3 jovyan jovyan 4.0K Jun 24 14:08 .npm 
drwxr-xr-x 3 jovyan jovyan 4.0K Jun 24 14:15 .oggm 
drwxr-xr-x 3 jovyan jovyan 4.0K Jun 24 14:15 .oggm_cache 
-rw-r--r-- 1 jovyan jovyan 265 Jun 24 14:15 .oggm_config 
drwxr-xr-x 9 jovyan jovyan 4.0K Jun 24 14:38 oggm-edu 
-rw-r--r-- 1 jovyan jovyan 807 May 21 15:19 .profile 
-rw-r--r-- 1 jovyan jovyan 313 Jun 24 13:55 README.md 
drwxr-xr-x 5 jovyan jovyan 4.0K Jun 24 14:15 .salem_cache 
-rw-r--r-- 1 jovyan jovyan 1.5K Jun 24 13:55 .travis.yml 
drwxr-xr-x 3 jovyan jovyan 4.0K Jun 24 14:09 .yarn

I use repo2docker on the same repository to create images which are pushed to Docker Hub. Here is the CI script.

When these containers are used in a JupyterHub (created with the Zero to JupyterHub instructions), the user’s $HOME is incomplete:

jovyan@jupyter-fabi:~$ ls -lah 
total 36K 
drwxrwsr-x 6 root users 4.0K Jun 24 14:36 . 
drwxr-xr-x 1 root root 4.0K Jun 24 14:04 .. 
drwxr-sr-x 3 jovyan users 4.0K Jun 24 14:36 .jupyter 
drwx--S--- 3 jovyan users 4.0K Jun 24 14:36 .local 
drwxrwS--- 2 root users 16K Jun 24 14:35 lost+found

The files generated by postBuild are missing and the user privileges are different. Here is my config.yaml:

singleuser:
  image:
    name: oggm/oggm-edu-r2d
    tag: 20190624
  defaultUrl: "/lab"
  lifecycleHooks:
    postStart:
      exec:
        command: ["gitpuller", "https://github.com/OGGM/oggm-edu-notebooks", "master", "notebooks"]
  cpu:
    limit: 2
    guarantee: 0.5

hub:
  extraConfig:
    jupyterlab: |
      c.Spawner.cmd = ['jupyter-labhub']

Obviously I am missing something, but where? In the repo2docker call? In my JupyterHub config? Thanks in advance for your help!


JupyterHub mounts a persistent volume at /home/jovyan, so all the files that are there in the Docker image get “hidden” by that new volume.

You can use the --target-repo-dir flag for repo2docker to set a different directory for it to put the contents in.


Thanks @betatim - do I understand correctly that the workflow to get a pre-configured $HOME in JupyterHub with a custom repo2docker image is the following:

repo2docker --target-repo-dir /home_commons ...

Then, in the JupyterHub config.yaml, use a post start copy command:

singleuser:
  lifecycleHooks:
    postStart:
      exec:
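        # exec hooks are not run through a shell, so "*" would not be expanded;
        # "/." copies the directory contents, dotfiles included
        command: ["cp", "-r", "/home_commons/.", "/home/jovyan/"]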

Is that the way to go?

The documentation says: “Path inside the image where contents of the repositories are copied to.” Here I am more interested in other files created in $HOME by my postBuild script - will that still work?

Maybe we can improve the documentation a bit. What it means is “the directory where the contents of the repo are put and where all the build operations happen”, or some such. postBuild is executed in the directory into which the repo is checked out, so I think this is what you are looking for.

The answer to how to then make it available in the home directory of each user is, unfortunately: “it depends”. You might want to copy stuff, use something like nbgitpuller to update the home dir but not overwrite changes, use a sub-directory in the home directory as a place to put stuff, symlink, etc.


I understand - the problem with the postStart approach I outlined above is that it will overwrite any changes users made in their home each time their server restarts. So it might need some more thinking, but in essence it seems to be what I need.
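
One way to avoid clobbering user edits is to copy only the files that are still missing from the home directory. A minimal sketch, assuming the defaults were baked into the image under a directory such as /home_commons (the name used earlier in this thread); `seed_home` is a hypothetical helper, not part of JupyterHub or repo2docker:

```shell
#!/bin/sh
# seed_home SRC DST: copy every regular file under SRC into DST, skipping
# anything that already exists in DST so that user edits survive restarts.
# (Hypothetical helper; symlinks and empty directories are not handled.)
seed_home() {
    src=$1
    dst=$2
    # List files relative to SRC (dotfiles included), one per line.
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        target="$dst/$rel"
        if [ ! -e "$target" ]; then
            mkdir -p "$(dirname "$target")"
            cp -p "$src/$rel" "$target"
        fi
    done
}
```

It could be called from the postStart hook or from a start script as `seed_home /home_commons "$HOME"`. New files shipped in a later image would still be added, while files the user already has are left alone.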

I made a small PR to improve the documentation: https://github.com/jupyter/repo2docker/pull/721

I think in general, $HOME isn’t a variable safely used in repo2docker, at least if it’s for use with JupyterHub. You could put some of this in a start script to move files from $REPO_DIR to $HOME at launch, and put the files in $REPO_DIR in postBuild.
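
As a rough sketch of that idea: postBuild would stash the generated files somewhere under $REPO_DIR, and the repo2docker `start` script would copy them into $HOME the first time a user's server launches. This assumes $REPO_DIR points outside the mounted home volume (for example via --target-repo-dir) and that the hub keeps the image's entrypoint. The `home-defaults` subdirectory, the `.home_seeded` marker file, and the `seed_once` function are made-up names for illustration:

```shell
#!/bin/bash
# seed_once STASH HOME_DIR: copy the stashed defaults into HOME_DIR once,
# then leave a marker so later launches do not overwrite user changes.
seed_once() {
    stash=$1
    home_dir=$2
    marker="$home_dir/.home_seeded"      # hypothetical marker file
    if [ -d "$stash" ] && [ ! -e "$marker" ]; then
        cp -R "$stash/." "$home_dir/"    # "/." copies contents, dotfiles included
        touch "$marker"
    fi
}

# In a real `start` file the script would end with something like:
#   seed_once "$REPO_DIR/home-defaults" "$HOME"
#   exec "$@"    # hand control back to the command passed to the entrypoint
```

The marker file keeps the copy from running again on later launches, at the cost of never picking up new defaults from an updated image unless the marker is removed.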

Yes, I’m learning by doing. So adding a --target-repo-dir /oggm_commons to the repo2docker call broke a postBuild which works with MyBinder (log). The problem is that repo2docker still works in /home/jovyan, while I was kind of expecting --target-repo-dir to become the “new home” altogether.

So what I’ll try now is to make postBuild write things in /commons directly (assuming postBuild can write everywhere).

OK, so what I’ve tried didn’t work:

  • if I set target-repo-dir to something else (say /commons), then JupyterHub will use this as home instead, thus still overwriting its content when a new user logs in
  • if I don’t set target-repo-dir and try to write from postBuild into a new folder (like /commons) I get a permission error.

Do you know of a way out? Either by forcing JupyterHub to start with user jovyan in config.yaml, or by specifying a better folder path to write to from the postBuild environment - any recommendations?

Thanks!

For the record: I’ve now copied the postBuild logic into a “start script” which is run when JupyterHub users log in for the first time. It’s a bit suboptimal (things are done twice) but it does what’s needed :wink: