Shared writable folder for each group

Hi,

We are developing a data science platform with JupyterHub embedded as part of the workflow. It uses DockerSpawner to launch user servers.

Basically, users create projects on our platform and work in teams, and we would like each project to have a shared folder in JupyterHub. Each project folder should be accessible only to its team members, and every member should be able to edit files in it.

I’ve read other similar posts, but our requirements seem more complex. Because users and groups are not predefined, the shared folders cannot be set up through the Dockerfile or static JupyterHub configuration.

What I’ve figured out so far is that I could create the folders, create a ‘group’, and add users to the group through the API.
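
A minimal sketch of the API route, assuming JupyterHub's REST API endpoints `POST /groups/{name}` (create a group) and `POST /groups/{name}/users` (add users). The helper name `build_group_requests`, the hub URL, and the token are placeholders, not part of our platform. The function only builds the requests; nothing is sent.

```python
import json
import urllib.request


def build_group_requests(hub_api_url, token, group, usernames):
    """Build (but do not send) the requests that create a group and add users.

    hub_api_url is assumed to end in /hub/api; token is an admin API token.
    """
    headers = {
        "Authorization": f"token {token}",
        "Content-Type": "application/json",
    }
    base = hub_api_url.rstrip("/")
    # POST /groups/{name} creates the group
    create = urllib.request.Request(
        f"{base}/groups/{group}", data=b"{}", headers=headers, method="POST"
    )
    # POST /groups/{name}/users adds the listed users to it
    add_users = urllib.request.Request(
        f"{base}/groups/{group}/users",
        data=json.dumps({"users": list(usernames)}).encode(),
        headers=headers,
        method="POST",
    )
    return [create, add_users]
```

Each request could then be sent with `urllib.request.urlopen(req)` against a running hub.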

Does anyone have a clue or other ideas to do this? Many thanks! :heart:


Spawners support a pre_spawn_hook, maybe you can use that to create your group folder/volume?

Hi manics,

Thanks for your reply. Do you mean that we can use pre_spawn_hook to connect to our platform database and retrieve a user-project dictionary before each user’s server is spawned?

How does this hook relate to DockerSpawner (or the custom MyDockerSpawner proposed by @minrk here)?

Actually I learnt programming on my own and still feel quite confused about this. If possible, could you please provide some sample code snippets or describe the logic of how it works?

Thank you soooo much!

Yes, this can be arbitrary code to modify the Spawner. For example:

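# fetch_volumes_for_user is a placeholder for your own lookup; it should
# return a dict in the format that DockerSpawner.volumes expects
async def pre_spawn_hook(spawner):
    volumes = await fetch_volumes_for_user(spawner.user.name)
    spawner.volumes = volumes

c.DockerSpawner.pre_spawn_hook = pre_spawn_hook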

If you are already using your own custom Spawner class, this same logic can be in start(), instead of adding it in a hook via config:

# inside your custom Spawner subclass
async def start(self):
    group_names = [group.name for group in self.user.groups]
    self.volumes = await fetch_volumes_for_user(self.user.name, group_names)
    return await super().start()

The pre_spawn_hook is meant as a way to add code to the beginning of this process without needing to define a subclass. But if you already have a subclass, you can extend methods there.
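
For illustration, here is one way the placeholder `fetch_volumes_for_user` used above might look. This is only a sketch: the volume naming scheme and the `/home/jovyan/<group>` mount points are assumptions, and a real version would probably query your platform's database rather than rely on the group names passed in.

```python
import asyncio


async def fetch_volumes_for_user(username, group_names=()):
    """Return a DockerSpawner-style volumes dict with one volume per group.

    Keys are Docker volume names; values give the mount point inside the
    container ('bind') and the access mode ('rw' = read-write).
    username is unused here but kept so the call signature matches.
    """
    volumes = {}
    for group in group_names:
        volumes[f"jupyterhub-team-{group}"] = {
            "bind": f"/home/jovyan/{group}",
            "mode": "rw",
        }
    return volumes
```

Outside of JupyterHub it can be tried with `asyncio.run(fetch_volumes_for_user("alice", ["proj1"]))`.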

Hi minrk,

Thanks so much for your detailed explanation of these two methods. Actually I’ve just figured it out and was planning to post my workaround here, though it is a little bit tedious. I’ll try to modify it according to your hints :grinning: Thank you again!

import sqlite3

team_map = {}  # username -> list of project names

def pre_spawn_hook(spawner):
    username = spawner.user.name  # get the username
    conn = sqlite3.connect("./app/app.db")  # connect to our app's database
    try:
        # use a ? placeholder rather than string formatting, so the
        # username is passed as data and cannot alter the SQL
        rows = conn.execute(
            "SELECT name FROM Project "
            "LEFT JOIN user_projects ON user_projects.project_id = Project.id "
            "WHERE user_projects.user_id = ?;",
            (username,),
        ).fetchall()
    finally:
        conn.close()
    # store the user's projects in team_map for MyDockerSpawner.start()
    team_map[username] = [row[0] for row in rows]

c.Spawner.pre_spawn_hook = pre_spawn_hook

from dockerspawner import DockerSpawner

class MyDockerSpawner(DockerSpawner):
    def start(self):
        # team_map is filled by pre_spawn_hook, which runs before start()
        teams = team_map.get(self.user.name, [])
        # add one named volume per team
        for team in teams:
            self.volumes['jupyterhub-team-{}'.format(team)] = {
                'bind': '/home/jovyan/{}'.format(team),
                'mode': 'rw'
            }
        if teams:
            # resolve 'permission denied' issue; set once, outside the loop,
            # and update (not replace) any environment already configured
            self.environment.update({
                "CHOWN_HOME": "yes",
                "CHOWN_EXTRA": "/home/jovyan",
                "CHOWN_HOME_OPTS": "-R",
                "NB_UID": 1000,
                "NB_GID": 1000,
            })
            self.extra_create_kwargs.update({'user': 'root'})
        return super().start()

c.JupyterHub.spawner_class = MyDockerSpawner
c.DockerSpawner.remove = True
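
The database lookup in the hook can also be tried on its own, outside JupyterHub. Below is a small sketch against an in-memory SQLite database. The table layout is an assumption inferred from the query above (our real schema may differ), and `projects_for_user` and `demo` are illustrative names. It passes the user as a bound `?` parameter, so a crafted username cannot change the query.

```python
import sqlite3


def projects_for_user(conn, user_id):
    """Return the names of the projects that user_id belongs to."""
    rows = conn.execute(
        "SELECT Project.name FROM Project "
        "JOIN user_projects ON user_projects.project_id = Project.id "
        "WHERE user_projects.user_id = ? ORDER BY Project.name;",
        (user_id,),  # bound parameter, never formatted into the SQL string
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    """Build a throwaway database and look up two users."""
    conn = sqlite3.connect(":memory:")
    try:
        # assumed schema, for illustration only
        conn.executescript(
            """
            CREATE TABLE Project (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE user_projects (user_id TEXT, project_id INTEGER);
            INSERT INTO Project VALUES (1, 'alpha'), (2, 'beta');
            INSERT INTO user_projects VALUES ('alice', 1), ('alice', 2), ('bob', 2);
            """
        )
        normal = projects_for_user(conn, "alice")
        # an injection attempt is treated as a plain (non-matching) string
        crafted = projects_for_user(conn, "x' OR '1'='1")
        return normal, crafted
    finally:
        conn.close()
```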