Folder Permissions

I am using Azure AD as the OAuth provider for JupyterHub, and I have created a folder “user-data” in the home directory of the user account on the server. I have also set extra_create_kwargs, as seen below, to the user account's UID.

c.DockerSpawner.extra_create_kwargs = {
    "user": "1000", # Can also be an integer UID
}

The user-data folder is set to 1000 as owner and group.

Whenever I login and start the container the user folder gets created as root:root under the “user-data” folder.

Can I use the home directory to store the users' data? I also tried creating a folder directly in /home and setting its ownership with chown, but that still did not work.

I can use the command below

chown -R <user>:<user> user-data/

and that will set permissions correctly so the user can access it, but I don't really want to have to do that for each new user.

Ironically I did have it working correctly but it seems to have broken in my further testing.

It is like it is ignoring the user on startup.

Is there a better way of handling user data?

I tried to create a pre_spawn_hook as below

c.DockerSpawner.extra_create_kwargs = {
    "user" : "1000", # Can also be an integer UID
}

# Makes the users directory and changes ownership to spawn user
def folder_permissions(DockerSpawner): 
    # Leaf directory 
    directory = "{username}"
    # Parent Directories 
    parent_dir = "/home/folder"
    # Path 
    path = os.path.join(parent_dir, directory) 
    if os.path.isdir(path) == False:
        uid = 1000
        gid = 1000
        # Create the directory 
        os.makedirs(path) 
        os.chown(path, uid, gid)
        print("Directory '% s' created" % directory)
    else:
        print("Directory '% s' already exist" % directory)

c.DockerSpawner.pre_spawn_hook = folder_permissions

but that still did not work. I read that trying to use os.chown() required superuser so I ran my JupyterHub docker compose with sudo and that did not work either.

Should the output of the hook appear in the JupyterHub logs, or will it show up in the logs of the spawned Docker container? I can't seem to find where it is even running.

You need to get username from the spawner argument in the pre_spawn_hook.

import os

# Makes the user's directory and changes ownership to the spawn user
def folder_permissions(spawner):
    # Leaf directory: the username comes from the spawner, not a template string
    directory = spawner.user.name
    # Parent directory
    parent_dir = "/home/folder"
    # Full path
    path = os.path.join(parent_dir, directory)
    if not os.path.isdir(path):
        uid = 1000
        gid = 1000
        # Create the directory and hand it to the spawn user
        os.makedirs(path)
        os.chown(path, uid, gid)
        spawner.log.info("Directory '%s' created", directory)
    else:
        spawner.log.info("Directory '%s' already exists", directory)

Note that you need to use spawner.log for output from pre_spawn_hook to appear in your hub logs.
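To check a hook like this without starting the Hub, it can be called with a stand-in object that has only the attributes the hook touches. This is a minimal sketch, not the only way to do it: `make_user_dir_hook` and `fake_spawner` are hypothetical names, and the demo uses the current UID/GID so that `os.chown` succeeds without root.

```python
import logging
import os
import tempfile
from types import SimpleNamespace


def make_user_dir_hook(parent_dir, uid, gid):
    """Build a pre_spawn_hook-style callable for the given parent directory."""

    def hook(spawner):
        path = os.path.join(parent_dir, spawner.user.name)
        # exist_ok=True makes a separate isdir() check unnecessary
        os.makedirs(path, exist_ok=True)
        # Needs root unless uid/gid are the calling process's own
        os.chown(path, uid, gid)
        st = os.stat(path)
        spawner.log.info("Directory %s owned by %s:%s", path, st.st_uid, st.st_gid)
        return path

    return hook


# Stand-in for the real spawner: only .user.name and .log are used by the hook
logging.basicConfig(level=logging.INFO)
fake_spawner = SimpleNamespace(
    user=SimpleNamespace(name="test-user"),
    log=logging.getLogger("hook-test"),
)

with tempfile.TemporaryDirectory() as tmp:
    hook = make_user_dir_hook(tmp, os.getuid(), os.getgid())
    created = hook(fake_spawner)
    owner = os.stat(created).st_uid
```

In a real config the hook would be built with the actual parent directory and target UID/GID, then assigned to c.DockerSpawner.pre_spawn_hook.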

Thanks for the quick reply!

I updated my config with your changes, and the logs now show that the folder was created, but the permissions are still wrong. I adjusted the code to log the owner and group IDs of the path, but it does not output anything.

# Makes the users directory and changes ownership to spawn user
def folder_permissions(spawner): 
    # Leaf directory 
    directory = f"{spawner.user.name}"
    # Parent Directories 
    parent_dir = "/home/user/user-data"
    # Path 
    path = os.path.join(parent_dir, directory) 
    if os.path.isdir(path) == False:
        uid = 1000
        gid = 1000
        # Create the directory 
        os.makedirs(path) 
        os.chown(path, uid, gid)
        spawner.log.info("Owner id of the directory:", os.stat(path).st_uid)
        spawner.log.info("Group id of the directory:", os.stat(path).st_gid)
        spawner.log.info("Directory '% s' created" % directory)
    else:
        spawner.log.info("Directory '% s' already exist" % directory)

That leads me to think that os.chown() is not actually changing the ownership of the folders. I tried running JupyterHub with sudo to see if that would give it the needed permissions, but that did not help either. Does the spawner run the hooks as the user that is set under extra_create_kwargs?

I set the extra_create_kwargs user to root and set the NB_UID, NB_GID, and NB_USER environment variables, but that did not work either. The container was just running as root.

c.DockerSpawner.extra_create_kwargs = {
    "user" : "root", # Can also be an integer UID
}

c.DockerSpawner.environment = {
    "NB_UID" : "1000",
    "NB_GID" : "1000",
    "NB_USER": "{username}"
}

It just seems that no matter what I set it will always set the permissions of the folder as root.

I am using a container from nvidia as a base, nvcr.io/nvidia/pytorch:23.05-py3. Maybe that is creating permission issues because it is set to run as root?

pre_spawn_hook runs as the user that JupyterHub itself is running as. I think you will need to run JupyterHub with sudo to be able to do chown.
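To see which user the hook actually runs as, the effective UID can be logged from inside it. A small sketch, where `describe_process_user` and `log_process_user` are hypothetical helper names. Keep in mind that if the Hub runs in a container (as with docker compose), running the `docker compose` command with sudo on the host does not by itself decide which user the Hub process is inside the container, and the hook sees the Hub container's filesystem rather than the host's unless the directory is mounted into it.

```python
import logging
import os
import pwd
from types import SimpleNamespace


def describe_process_user():
    """Return the effective UID and user name of the current process."""
    euid = os.geteuid()
    try:
        name = pwd.getpwuid(euid).pw_name
    except KeyError:
        # The UID has no passwd entry, which is common inside containers
        name = None
    return {"euid": euid, "name": name, "is_root": euid == 0}


def log_process_user(spawner):
    """Hypothetical hook body: log who the hook is running as."""
    info = describe_process_user()
    spawner.log.info(
        "pre_spawn_hook running as uid=%s (%s)", info["euid"], info["name"]
    )
    return info


# Stand-in spawner so the helper can be tried outside JupyterHub
demo_spawner = SimpleNamespace(log=logging.getLogger("whoami-demo"))
info = log_process_user(demo_spawner)
```

If `is_root` is false, os.chown to another user's UID is expected to fail with PermissionError.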

Could you share your logs? Without the logs, it is hard to find the root of the problem.

Please see below for the log of the output. u_first and u_last were put in place of the user logged in.

output.log

jupyterhub | [I 2023-09-19 22:38:28.232 JupyterHub app:2859] Running JupyterHub version 4.0.2
jupyterhub | [I 2023-09-19 22:38:28.232 JupyterHub app:2889] Using Authenticator: oauthenticator.azuread.AzureAdOAuthenticator-16.0.7
jupyterhub | [I 2023-09-19 22:38:28.233 JupyterHub app:2889] Using Spawner: dockerspawner.dockerspawner.DockerSpawner-12.1.0
jupyterhub | [I 2023-09-19 22:38:28.233 JupyterHub app:2889] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-4.0.2
jupyterhub | [I 2023-09-19 22:38:28.250 JupyterHub app:1664] Loading cookie_secret from /data/jupyterhub_cookie_secret
jupyterhub | [I 2023-09-19 22:38:28.378 JupyterHub proxy:556] Generating new CONFIGPROXY_AUTH_TOKEN
jupyterhub | [I 2023-09-19 22:38:28.396 JupyterHub app:1984] Not using allowed_users. Any authenticated user will be allowed.
jupyterhub | [I 2023-09-19 22:38:28.431 JupyterHub app:2928] Initialized 0 spawners in 0.003 seconds
jupyterhub | [I 2023-09-19 22:38:28.438 JupyterHub metrics:278] Found 1 active users in the last ActiveUserPeriods.twenty_four_hours
jupyterhub | [I 2023-09-19 22:38:28.440 JupyterHub metrics:278] Found 2 active users in the last ActiveUserPeriods.seven_days
jupyterhub | [I 2023-09-19 22:38:28.441 JupyterHub metrics:278] Found 2 active users in the last ActiveUserPeriods.thirty_days
jupyterhub | [W 2023-09-19 22:38:28.441 JupyterHub proxy:746] Running JupyterHub without SSL. I hope there is SSL termination happening somewhere else…
jupyterhub | [I 2023-09-19 22:38:28.441 JupyterHub proxy:750] Starting proxy @ http://:8000
jupyterhub | 22:38:28.755 [ConfigProxy] info: Proxying http://*:8000 to (no default)
jupyterhub | 22:38:28.759 [ConfigProxy] info: Proxy API at http://127.0.0.1:8001/api/routes
jupyterhub | [I 2023-09-19 22:38:29.138 JupyterHub app:3178] Hub API listening on http://jupyterhub:8080/hub/
jupyterhub | 22:38:29.138 [ConfigProxy] info: 200 GET /api/routes
jupyterhub | 22:38:29.141 [ConfigProxy] info: 200 GET /api/routes
jupyterhub | [I 2023-09-19 22:38:29.142 JupyterHub proxy:477] Adding route for Hub: / => http://jupyterhub:8080
jupyterhub | 22:38:29.144 [ConfigProxy] info: Adding route / → http://jupyterhub:8080
jupyterhub | 22:38:29.146 [ConfigProxy] info: Route added / → http://jupyterhub:8080
jupyterhub | 22:38:29.147 [ConfigProxy] info: 201 POST /api/routes/
jupyterhub | [I 2023-09-19 22:38:29.147 JupyterHub app:3245] JupyterHub is now running at http://:8000
jupyterhub | [I 2023-09-19 22:38:47.637 JupyterHub log:191] 200 GET /hub/home (u_first u_last@192.168.2.105) 160.43ms
jupyterhub | [I 2023-09-19 22:38:50.034 JupyterHub provider:659] Creating oauth client jupyterhub-user-u_first%20u_last
jupyterhub | [I 2023-09-19 22:38:50.066 JupyterHub jupyterhub_config:60] Bad message (TypeError(‘not all arguments converted during string formatting’)): {‘name’: ‘JupyterHub’, ‘msg’: ‘Owner id of the directory:’, ‘args’: (1000,), ‘levelname’: ‘INFO’, ‘levelno’: 20, ‘pathname’: ‘/srv/jupyterhub/jupyterhub_config.py’, ‘filename’: ‘jupyterhub_config.py’, ‘module’: ‘jupyterhub_config’, ‘exc_info’: None, ‘exc_text’: None, ‘stack_info’: None, ‘lineno’: 60, ‘funcName’: ‘folder_permissions’, ‘created’: 1695163130.0668325, ‘msecs’: 66.0, ‘relativeCreated’: 24796.883821487427, ‘thread’: 139997906571264, ‘threadName’: ‘MainThread’, ‘processName’: ‘MainProcess’, ‘process’: 1}
jupyterhub | [I 2023-09-19 22:38:50.067 JupyterHub jupyterhub_config:61] Bad message (TypeError(‘not all arguments converted during string formatting’)): {‘name’: ‘JupyterHub’, ‘msg’: ‘Group id of the directory:’, ‘args’: (1000,), ‘levelname’: ‘INFO’, ‘levelno’: 20, ‘pathname’: ‘/srv/jupyterhub/jupyterhub_config.py’, ‘filename’: ‘jupyterhub_config.py’, ‘module’: ‘jupyterhub_config’, ‘exc_info’: None, ‘exc_text’: None, ‘stack_info’: None, ‘lineno’: 61, ‘funcName’: ‘folder_permissions’, ‘created’: 1695163130.067097, ‘msecs’: 67.0, ‘relativeCreated’: 24797.14822769165, ‘thread’: 139997906571264, ‘threadName’: ‘MainThread’, ‘processName’: ‘MainProcess’, ‘process’: 1}
jupyterhub | [I 2023-09-19 22:38:50.067 JupyterHub jupyterhub_config:62] Directory ‘u_first u_last’ created
jupyterhub | [I 2023-09-19 22:38:50.085 JupyterHub dockerspawner:988] Container ‘jupyter-u_first-20u_last’ is gone
jupyterhub | [I 2023-09-19 22:38:50.126 JupyterHub dockerspawner:1272] Created container jupyter-u_first-20u_last (id: ea8cce6) from image pytorch-test:1.4
jupyterhub | [I 2023-09-19 22:38:50.126 JupyterHub dockerspawner:1296] Starting container jupyter-u_first-20u_last (id: ea8cce6)
jupyterhub | [I 2023-09-19 22:38:51.000 JupyterHub log:191] 302 GET /hub/spawn/u_first%20u_last → /hub/spawn-pending/u_first%20u_last (u_first u_last@192.168.2.105) 1007.56ms
jupyterhub | [I 2023-09-19 22:38:51.013 JupyterHub pages:398] u_first u_last is pending spawn
jupyterhub | [I 2023-09-19 22:38:51.020 JupyterHub log:191] 200 GET /hub/spawn-pending/u_first%20u_last (u_first u_last@192.168.2.105) 10.52ms
jupyterhub | [I 2023-09-19 22:38:53.764 JupyterHub log:191] 200 GET /hub/api (@192.168.160.3) 0.89ms
jupyterhub | [I 2023-09-19 22:38:53.795 JupyterHub log:191] 200 POST /hub/api/users/u_first%20u_last/activity (u_first u_last@192.168.160.3) 21.79ms
jupyterhub | [I 2023-09-19 22:38:54.960 JupyterHub base:990] User u_first u_last took 4.963 seconds to start
jupyterhub | [I 2023-09-19 22:38:54.961 JupyterHub proxy:330] Adding user u_first u_last to proxy /user/u_first%20u_last/ => http://192.168.160.3:8888
jupyterhub | 22:38:54.962 [ConfigProxy] info: Adding route /user/u_first u_last → http://192.168.160.3:8888
jupyterhub | 22:38:54.963 [ConfigProxy] info: Route added /user/u_first u_last → http://192.168.160.3:8888
jupyterhub | 22:38:54.964 [ConfigProxy] info: 201 POST /api/routes/user/u_first%20u_last
jupyterhub | [I 2023-09-19 22:38:54.964 JupyterHub users:768] Server u_first u_last is ready
jupyterhub | [I 2023-09-19 22:38:54.965 JupyterHub log:191] 200 GET /hub/api/users/u_first%20u_last/server/progress?_xsrf=[secret] (u_first u_last@192.168.2.105) 3792.07ms
jupyterhub | [I 2023-09-19 22:38:54.986 JupyterHub log:191] 302 GET /hub/spawn-pending/u_first%20u_last → /user/u_first%20u_last/ (u_first u_last@192.168.2.105) 3.58ms
jupyterhub | [I 2023-09-19 22:38:55.058 JupyterHub log:191] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-u_first%2520u_last&redirect_uri=%2Fuser%2Fu_first%2520u_last%2Foauth_callback&response_type=code&state=[secret] → /user/u_first%20u_last/oauth_callback?code=[secret]&state=[secret] (u_first u_last@192.168.2.105) 25.38ms
jupyterhub | [I 2023-09-19 22:38:55.106 JupyterHub log:191] 200 POST /hub/api/oauth2/token (u_first u_last@192.168.160.3) 35.00ms
jupyterhub | [I 2023-09-19 22:38:55.119 JupyterHub log:191] 200 GET /hub/api/user (u_first u_last@192.168.160.3) 10.34ms
jupyterhub | [I 2023-09-19 22:38:55.472 JupyterHub log:191] 200 GET /hub/api/user (u_first u_last@192.168.160.3) 2.98ms
jupyterhub | [I 2023-09-19 22:39:07.762 JupyterHub log:191] 200 GET /hub/home (u_first u_last@192.168.2.105) 8.36ms
jupyterhub | [I 2023-09-19 22:39:08.641 JupyterHub proxy:356] Removing user u_first u_last from proxy (/user/u_first%20u_last/)
jupyterhub | 22:39:08.643 [ConfigProxy] info: Removing route /user/u_first u_last
jupyterhub | 22:39:08.644 [ConfigProxy] info: 204 DELETE /api/routes/user/u_first%20u_last
jupyterhub | [I 2023-09-19 22:39:08.648 JupyterHub dockerspawner:1390] Stopping container jupyter-u_first-20u_last (id: ea8cce6)
jupyterhub | [I 2023-09-19 22:39:09.253 JupyterHub dockerspawner:1018] Removing container ea8cce67034862011ee3789f8ce19bb52c739673af3a3b713fd2098d35b5c89f
jupyterhub | [I 2023-09-19 22:39:09.295 JupyterHub base:1197] User u_first u_last server took 0.654 seconds to stop
jupyterhub | [I 2023-09-19 22:39:09.296 JupyterHub log:191] 204 DELETE /hub/api/users/u_first%20u_last/server?_xsrf=[secret] (u_first u_last@192.168.2.105) 662.33ms
canceled

JupyterHub was started with ‘sudo docker compose up’. This time I did notice an error when it got to the output of the UID and GID.

extra_create_kwargs was set to "user": "1000".

The error was due to the bad string formatting in the lines above. They should be

spawner.log.info("Owner id of the directory: %s" % os.stat(path).st_uid)
spawner.log.info("Group id of the directory: %s" % os.stat(path).st_gid)

I guess your pre_spawn_hook has not been properly executed due to those errors. Correct them and try again?
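The failure in the log can be reproduced with plain Python: the logging module formats a record as `msg % args`, so a message with no placeholder plus one extra argument raises the TypeError shown in the log. A minimal sketch (the logger name `format-demo` is arbitrary) comparing the failing expression with two working forms:

```python
import io
import logging

uid = 1000

# What logging effectively evaluates for log.info("Owner id of the directory:", uid)
try:
    "Owner id of the directory:" % (uid,)
    error_text = ""
except TypeError as exc:
    error_text = str(exc)

# Send log output to a string buffer so it can be inspected
stream = io.StringIO()
log = logging.getLogger("format-demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

# Eager formatting, as in the corrected lines
log.info("Owner id of the directory: %s" % uid)
# Lazy formatting: logging substitutes the argument itself
log.info("Owner id of the directory: %s", uid)

lines = stream.getvalue().splitlines()
```

Note that the log shows “Directory ‘u_first u_last’ created” right after the two “Bad message” lines, so the hook appears to have run to completion despite them.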

I made the corrections you mentioned, but all it did was print the IDs, which were correct (1000), while the folder was still owned by root.

I have noticed that it still reports creating the u_first u_last directory each time, regardless of whether the folder is already there.

Someone had pointed me to “CHOWN_EXTRA” and “CHOWN_EXTRA_OPTS” environment variables and I tried those with a docker run command. It worked.

sudo docker run --user root --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -it --rm -p 8888:8888 -e NB_USER=az_user -e NB_GID=1000 -e NB_UID=1000 -e CHOWN_EXTRA="/home/az_user" -e CHOWN_EXTRA_OPTS="-R" -v /home/<user>/user-data/<az_user>:/home/az_user pytorch-test:1.4

JupyterHub for some reason will not pass the environment variables to the spawned container.

This is the config I have so far:

c.DockerSpawner.extra_create_kwargs = {
    "user" : "root", # Can also be an integer UID
}

c.DockerSpawner.extra_host_config = {
    "runtime" : "nvidia",
    "ipc_mode" : "host",
    "shm_size" : "256m" # Increases the shared memory size limit for the docker container
}

c.DockerSpawner.environment = {
    "NB_UID" : "1000",
    "NB_GID" : "1000",
    "NB_USER" : "{username}",
    "CHOWN_EXTRA" : "/home/{username}",
    "CHOWN_EXTRA_OPTS" : "-R",
}

c.DockerSpawner.volumes = {"/home/<user>/user-data/{username}": notebook_dir}

Is there another way that I am supposed to send the variables to the container?

Still have to figure out the ulimit memlock and the ulimit stack part but one step at a time.

Ideally I would not like to run the container itself as root but at this point I think I would like to see something work.

Thanks again for all the help, very much appreciated!
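One thing that may be worth checking, offered as a sketch rather than a confirmed fix: DockerSpawner expands `{username}` in volume names, but values in `environment` are, as far as I know, passed through literally, so the container would see NB_USER set to the literal string `{username}`. Setting the values from a pre_spawn_hook sidesteps that. The hook name `set_user_environment` is hypothetical, the image is assumed to have a start script that honours NB_USER, NB_UID and CHOWN_EXTRA (as the docker run test suggests), and a username containing a space may need to be sanitized before it is used as NB_USER.

```python
from types import SimpleNamespace


def set_user_environment(spawner):
    """Hypothetical pre_spawn_hook: fill in per-user environment values."""
    username = spawner.user.name
    spawner.environment.update(
        {
            "NB_UID": "1000",
            "NB_GID": "1000",
            "NB_USER": username,
            "CHOWN_EXTRA": f"/home/{username}",
            "CHOWN_EXTRA_OPTS": "-R",
        }
    )


# In jupyterhub_config.py this would be registered with:
# c.DockerSpawner.pre_spawn_hook = set_user_environment

# Stand-in spawner to show the resulting values without starting the Hub
demo = SimpleNamespace(user=SimpleNamespace(name="az_user"), environment={})
set_user_environment(demo)
```

Only one pre_spawn_hook can be assigned, so if a directory-creation hook is also in use, the two would need to be combined into a single function. Running docker inspect on the spawned container shows which environment values it actually received.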