RTC Collaboration in groupwork

Hi,

I have a common use case (I believe) that I have been banging my head on for the last week.

I have a set of users/students; let's call them A, B and C. These users work together on different projects represented by groups (in my case, LDAP groups) AB and AC, where, for example, users A and C are both members of group AC. The number of groups and the members of each group are dynamic.

Now I want to enable “teamwork” in JupyterHub so that:
If user A is part of a group (e.g. AC), user A should be able to log in and start their own notebook server (as normal) BUT also be able to start a shared notebook server for each team they are part of.
All members of group AC should be able to access this shared server (but not every shared server), so users A and C should be able to access server AC, but user C should not be able to access AB, etc.

I think this is a rather common use case, and there is even a tutorial for static memberships (the “Real-time collaboration without impersonation” page in the JupyterHub documentation). However, making this dynamic (I cannot restart the hub every time a group changes) and allowing all users of a group to automatically spawn and access the shared servers is not so straightforward.

I have almost gotten there, but it involves doing “brain surgery” on the database: adding “group users”, adding roles that allow all users of that (LDAP) group to access the server of the “group_user”, adding a group to distinguish real users from “group_users” (all by inserting records directly into the database from a post_auth_hook), and some rather awkward filtering of users in the spawn step. It feels like there should be an easier way of doing this, as I believe it is a rather common use case.

Am I designing this completely wrong, or am I missing something fundamental? (The number of users and groups makes it impractical to use JupyterHub for manual user and group management and/or to create “group users” on the SSO.)

We actually implemented this feature in the last couple of weeks to address the same use case you described. In particular, we made the following changes to get this to work.

  • We developed a custom hub-managed JupyterHub service, allowing users to create groups, manage invites, etc. In the end, we store the group memberships in a database table.
  • For each group, we dynamically create a “collaboration user”, i.e., a JupyterHub user, using JupyterHub’s REST API (see POST /users/{name}).
  • All members of the group are granted access to start a server in the name of the collaboration user. Thus, every user can still run their personal servers, but can also work together via the “collaboration user”. Access can be granted using JupyterHub’s share endpoint (see POST /shares/{user}). We grant the scopes access:servers!server=<collab user>/ and admin:servers!server=<collab user>/. The first scope is necessary to actually access the server. The second scope is necessary to start the collab user’s server.
  • Access to the collab user can be revoked using empty scopes (e.g., if a member leaves a group).
  • Since we provide multiple profiles, we ensure that the collaboration user only has access to one specific profile by returning only that profile in the profile list if the user is a collaboration user (i.e., the username starts with collab-)
  • Similarly, we fail the spawn if a regular user tries to run the group profile
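
As a rough illustration of the scope strings and share requests described in the list above, here is a minimal sketch that only builds the requests without sending them. The hub URL, the helper names, the `/shares/{owner}/{server}` path layout and the `user`/`scopes` body keys are assumptions made for this example and should be checked against the JupyterHub REST API reference for your version.

```python
from urllib.parse import quote

HUB_API = "https://hub.example.org/hub/api"  # hypothetical hub URL


def collab_scopes(collab_user: str, server_name: str = "") -> list[str]:
    """Scopes named in the post: access plus admin on one server of the collab user.

    An empty server_name refers to the collab user's default server.
    """
    server = f"{collab_user}/{server_name}"
    return [f"access:servers!server={server}", f"admin:servers!server={server}"]


def create_user_request(collab_user: str) -> tuple[str, str]:
    """Return (method, url) for creating the collaboration user."""
    return "POST", f"{HUB_API}/users/{quote(collab_user, safe='')}"


def share_request(collab_user: str, member: str, scopes: list[str], server_name: str = ""):
    """Return (method, url, json_body) for sharing the collab server with one member.

    Passing an empty scopes list corresponds to the revocation described above.
    The path layout and body keys are assumptions, not confirmed by the post.
    """
    url = f"{HUB_API}/shares/{quote(collab_user, safe='')}/{quote(server_name, safe='')}"
    return "POST", url, {"user": member, "scopes": scopes}


# Grant member "userA" access to the default server of "collab-ac", then revoke it.
grant = share_request("collab-ac", "userA", collab_scopes("collab-ac"))
revoke = share_request("collab-ac", "userA", [])
```

The tuples can be passed to any HTTP client together with an API token that is allowed to create users and manage shares.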

If you have any further questions or a better way to achieve the same functionality, please let me know!

JupyterHub 5+ can manage roles through the authenticator.
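
A minimal sketch of what that could look like, assuming the auth model carries a list of group names under a `groups` key. That key, the `collab-` role-name prefix and the single scope are illustrative assumptions; only `manage_roles`, `post_auth_hook` and the `roles` key come from this thread.

```python
import asyncio


def group_role(group_name: str) -> dict:
    """Build one role dict in the shape the authenticator returns under "roles"."""
    return {
        "name": f"collab-{group_name}",  # hypothetical naming scheme
        "description": f"Access to the shared server of group {group_name}",
        "scopes": [f"access:servers!user={group_name}"],
    }


async def post_auth_hook(authenticator, handler, auth_model):
    """Attach one role per group to the auth model at login."""
    groups = auth_model.get("groups") or []  # assumed location of group names
    auth_model["roles"] = [group_role(g) for g in groups]
    return auth_model


# In jupyterhub_config.py (left commented out so the sketch runs standalone):
# c.Authenticator.manage_roles = True
# c.Authenticator.post_auth_hook = post_auth_hook

example = asyncio.run(post_auth_hook(None, None, {"name": "a", "groups": ["ac"]}))
```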

Thanks,

I got inspired by your solution and we are now doing something very similar. However, we opted for the less dynamic approach of only updating the “roles” at login (post auth hook).

Regards

Thanks,

With this and the handler.user_from_username() function, I am happy to say that I can avoid the ORM and low-level DB management I did previously.

I am still not sure how “correct” it is to use handler.user_from_username() to create users on the fly and await user.save_auth_state() to save the extra auth_state settings in the post_auth_hook step, but it mostly works (I seem to get some random hangs on login whose cause I have not yet identified).

Hi,

To answer my own post, in case anyone is interested: the solution we finally use (soon, at least) is available here: jupyterhub/jupyterhub_config.py · 117-next-version-rtc · csma / jupyterhub · GitLab

There are, however, many site-specific bits in the code above; below is a simplified stub.
In overview, we do the following:

  1. Enable saving of auth_state and “manage_roles”
  2. Create a “post_auth_hook” that creates “collab_users” and assigns roles dynamically based on (in our case) LDAP group membership, and saves the information in auth_state.
  3. Override refresh_user so that it does not check/update information for “collab_users”

The main config:

c.JupyterHub.authenticator_class = (
    myGitLabOAuthenticator  # We override refresh_user to differentiate between different users
)
c.Authenticator.enable_auth_state = True
c.Authenticator.manage_roles = True
c.Authenticator.post_auth_hook = post_auth_hook

The overrides and the hook to create the “collab_users” and assign roles to users:

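from oauthenticator.gitlab import GitLabOAuthenticator


class myGitLabOAuthenticator(GitLabOAuthenticator):
    async def refresh_user(self, user, handler=None):
        auth_state = await user.get_auth_state() or {}
        # Skip the refresh check for group ("collab") accounts; .get() avoids a
        # KeyError when no auth_state has been stored yet
        if auth_state.get("collab_user"):
            return True  # No refresh needed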

        # Update the auth_state; super() may return True/False instead of a dict
        auth_model = await super().refresh_user(user, handler)
        if isinstance(auth_model, dict):
            auth_state = auth_model["auth_state"]
            gitlab_user = auth_state["gitlab_user"]
            auth_state["collab_user"] = False
            auth_state["ldap_info"] = await get_ldap_info(self.log, LDAP_URL, user.name, gitlab_user)
            auth_state["membership_info"] = get_direct_membership_from_jwt_token(auth_state)
            auth_model["auth_state"] = auth_state

        return auth_model


async def post_auth_hook(authenticator, handler, authentication):
    """
    Sync LDAP groups to JupyterHub roles after authentication.
    Used to enable RTC sharing and group-based mounting.
    ldap_info: uid, gid and groups
    gl_info: membership check for predefined GitLab groups (gpu, sudo and gpu_super) and SSH keys
    """

    username = authentication["name"]
    authenticator.log.info(f"[POST_AUTH] Enter for {username}")
    auth_state = authentication.get("auth_state") or {}  # the key may be present but None
    authenticator.log.debug(f"[POST_AUTH] {auth_state=}")
    gitlab_user = auth_state["gitlab_user"]

    auth_state["collab_user"] = False  # Collab users never log in, so anyone reaching this hook is a regular user
    auth_state["ldap_info"] = await get_ldap_info(authenticator.log, LDAP_URL, username, gitlab_user)
    auth_state["membership_info"] = get_direct_membership_from_jwt_token(auth_state)

    authentication["auth_state"] = auth_state
    authentication["roles"] = list(DEFAULT_USER_ROLES)  # copy, so += below does not mutate the shared default
    # Add collaborative role definitions defined by ldap_groups, pre-filtered to only include projects as groups
    if ALLOW_RTC:
        authentication["roles"] += await create_collab_users_and_roles(
            authenticator.log, handler, auth_state["ldap_info"]["unixGroups"].values(), username
        )
    return authentication

The code to create “collab_users” and return the related roles to allow access to servers:

async def create_collab_users_and_roles(log, handler, ldap_groups, username):
    extra_roles = []
    for gid, group_name in ldap_groups:
        collab_username = role_name = group_name
        # user_from_username() returns the existing user or creates it on the fly
        if collab_user := handler.user_from_username(collab_username):
            collab_auth_state = {
                "ldap_info": {"uidNumber": gid, "gidNumber": gid, "unixGroups": {}},
                "membership_info": [],
                "collab_user": True,
            }

            await collab_user.save_auth_state(collab_auth_state)
            role = {
                "name": role_name,
                "description": f"Role to allow users in group {group_name} to fully manage the server for {collab_username}",
                "scopes": [
                    f"admin:servers!user={collab_username}",
                    f"servers!user={collab_username}",
                    f"access:servers!user={collab_username}",
                    "admin-ui",
                    f"list:users!user={collab_username}",
                    f"read:users!user={collab_username}",
                ],
            }
            extra_roles.append(role)
        
    return extra_roles



To access the “shared group server”, a user first logs in normally and can then open:

<jupyterhub_url>/hub/spawn/groupname

(groupname == collab_user.username == cn of the LDAP group)
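
For completeness, a small sketch of building that URL in code, for example to render a link per group on a landing page. The helper name and the example hub URL are hypothetical; only the /hub/spawn/<groupname> path comes from the description above.

```python
from urllib.parse import quote


def shared_server_url(hub_url: str, groupname: str) -> str:
    """Return the spawn URL for the shared server of one group.

    groupname is the collab user's username (the cn of the LDAP group);
    it is percent-encoded in case it contains characters unsafe in a path.
    """
    return f"{hub_url.rstrip('/')}/hub/spawn/{quote(groupname, safe='')}"


link = shared_server_url("https://hub.example.org/", "AC")
```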

The caveat is that, while it works for us, more testing and vetting needs to be done to ensure it works and does not introduce bugs in other installations.
For instance, we have no cleanup of “old users”, as we reset the JupyterHub database on every restart of JupyterHub anyway.
