Plans on bringing RTC to jupyterhub

Hi,

Are there already plans to bring the new real-time collaboration (RTC) feature from JupyterLab to JupyterHub?

4 Likes

Hi all,

Just to add +1 here. I’m excited to see RTC will be available in JupyterLab 3.1 and have been wondering how to enable it in our JupyterHub deployment on GKE.

I saw that enabling RTC requires launching JupyterLab with the --collaborative option, and spent some time trying to understand how this might be configured in JupyterHub, but I got a bit lost. Any advice much appreciated :slight_smile:

Thanks,
Alistair

1 Like

It’s definitely on the wish list! Unfortunately it’s not as easy as it might first seem. You can configure JupyterHub to launch JupyterLab with c.Spawner.cmd = ['jupyter-labhub'] and pass additional arguments with c.Spawner.args = ['--collaborative'], but the problem is: how do you safely share a link?

In https://jupyterhub.example.org/user/user-1/lab/tree/foo.ipynb only user-1 can access anything under https://jupyterhub.example.org/user/user-1/, and this is enforced by JupyterHub, so we’d need a way to provide RTC access to one server without giving someone access to everyone’s servers.

Thinking out loud, maybe it’d be doable with a third-party hub extension/service that handles just RTC- and a JupyterLab plugin that communicates with this service?
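For reference, the two spawner settings mentioned at the start of this post would look roughly like this in jupyterhub_config.py. This is only a minimal sketch: it turns on collaborative mode inside each user's own server and does nothing about the link-sharing problem described above. It assumes `c` is the config object that JupyterHub provides when it loads the file, so it is not meant to run on its own.

```python
# jupyterhub_config.py (fragment, not a standalone script)
# `c` is provided by JupyterHub when it loads this config file.

# Launch the hub-aware JupyterLab entry point for each single-user server
c.Spawner.cmd = ['jupyter-labhub']

# Pass the flag that enables real-time collaboration (JupyterLab 3.1+)
c.Spawner.args = ['--collaborative']
```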

1 Like

With the current implementation that you get out of the box, you don’t. It all works on MyBinder, but everybody is anonymous and you have to trust everybody (and the MyBinder federation). An in-house Binder is probably going to Just Work today, if you’ve already figured out the other abstractions.

As mentioned over here, there are some classes of solutions for connecting the UIs of n spawned servers:

  • a matched pair of…
    • a Hub service that implements/wraps “the other end” of…
    • a labextension that provides IDocumentProviderFactory pointing at something other than /lab/api/yjs installed in every spawned server
  • a matched pair of…
    • some backing store which can store the stuff, and can be found/connected to/authenticated by the environment variables/request headers available to…
    • a serverextension that replaces the as-shipped YjsEchoWebSocket on /lab/api/yjs installed in every spawned server

While the latter option is attractive as it might not require any custom JS/TS/WebPack, the user experience might fall rather short of what people would expect.

My gut feel is doing this without also offering out-of-document, searchable, in-Lab chat is probably not going to feel very good… as mentioned over there, being able to back yjs with an embeddable XMPP web client and known-good XMPP server (e.g. the MMORPG/telecon-grade ejabberd… your NIC will give up before it does) would be a very strong play. Oh, and look, if done right, you’d get self-hosted video chat, too, with jupyterlab-videochat (once the rooms there are also extensible, a WIP) given a self-hosted Jitsi, which could then (weirdly) power your own private virtual world.

Even given those, this is not going to magically solve the rest of the problems that sysadmins will have to deal with at scale, e.g. role-based, fine-grained permissions at the folder/document/cell/line level, integration with DVCS/CI, logging/auditing, and data spill remediation. Still, having a demonstration of a way this could work is important for getting to those next steps.

2 Likes

With the access:servers scope in JupyterHub 2.0, sharing a link to your server can work, as the deployment can choose to grant collections of users access to each other’s servers, e.g. access:servers!group=students to grant access to student servers. There isn’t yet a mechanism for users to grant access to their own servers, so it has to be done at the admin-level, but if you know you want e.g. group-level sharing, that can work. Without teaching the collaboration about JupyterHub auth, this means that you are granting other folks full permissions to act ‘as you’ with the running server, but that’s okay in limited circumstances, and enough to get off the ground for early adopters to try things out.
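As a rough illustration of the admin-level, group-wide sharing described above, a role entry for load_roles could be built like this. It is a sketch, not a recommended setup: the helper function, the role name and the example usernames are all made up for illustration; only the access:servers!group=... scope comes from the post.

```python
def group_sharing_role(group):
    """Build a load_roles entry that lets members of `group`
    access the servers owned by members of that same group."""
    return {
        "name": f"{group}-server-access",  # hypothetical role name
        "description": f"Access to servers owned by members of {group}",
        # The filter restricts the scope to servers of users in `group`
        "scopes": [f"access:servers!group={group}"],
        # Everyone in `group` is granted this role
        "groups": [group],
    }


role = group_sharing_role("students")

# In jupyterhub_config.py this might then be used as (usernames are examples):
# c.JupyterHub.load_groups = {"students": ["user-1", "user-2"]}
# c.JupyterHub.load_roles = [role]
```

As noted above, anyone granted this role can act as the server owner inside that server, so this only suits groups whose members already trust each other.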

2 Likes

Thanks @minrk for the pointers! I’ve put together a quick example:

1 Like

At the moment roles are defined in the config file at startup, and can’t be changed whilst JupyterHub is running. If support for changing roles at runtime were added, then a separate Hub service running with admin scope could take care of adding an appropriate role for only the requesting user.

Without that you can still get quite close to giving users control of who else can access their server if you know the list of users in advance. You can create a set of groups in advance, whose membership can be modified at runtime:

List of users defined at startup:

allowed_users = [
  'user-1',
  'user-2',
  'user-3',
]

Iterate through list of users…

load_groups = {}
load_roles = []
for user in allowed_users:

… create a group rtc-access-{user} that will store the list of other users who will have access to this user’s server, initially empty:

    access_group_name = f'rtc-access-{user}'
    load_groups[access_group_name] = []

… create a role that allows access to this user’s server (access:servers!user={user}), and assign that role to the above group:

    load_roles.append({
        'name': access_group_name,
        'description': f'RTC access to {user}',
        'scopes': [f'access:servers!user={user}'],
        'groups': [access_group_name],
    })

… create a role that allows this user to manage the rtc-access-{user} group (groups!group={access_group_name})

    manage_name = f'rtc-manage-{user}'
    load_roles.append({
        'name': manage_name,
        'description': f'Manage users in group {access_group_name}',
        'scopes': [f'groups!group={access_group_name}'],
        'users': [user],
    })
c.Authenticator.allowed_users = set(allowed_users)
c.JupyterHub.load_groups = load_groups
c.JupyterHub.load_roles = load_roles

This should allow a user to add other users to the rtc-access-{user} group through the POST /groups/rtc-access-{user}/users/ API endpoint using their own token.
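A rough sketch of what that API call could look like from Python, using only the standard library. The hub URL, the token value and the function name are placeholders I've made up, and I'm assuming the usual JupyterHub REST conventions (API under /hub/api, a JSON body of the form {"users": [...]}, and a token sent in the Authorization header). The function only builds the request, so nothing is sent until you pass it to urlopen.

```python
import json
import urllib.request


def build_add_users_request(hub_api_url, owner, usernames, token):
    """Build (but do not send) a request that adds `usernames`
    to the rtc-access-{owner} group."""
    url = f"{hub_api_url.rstrip('/')}/groups/rtc-access-{owner}/users"
    body = json.dumps({"users": list(usernames)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


req = build_add_users_request(
    "https://jupyterhub.example.org/hub/api",  # assumed API base URL
    owner="user-1",
    usernames=["user-2"],
    token="REPLACE_WITH_TOKEN",  # placeholder, use user-1's own token
)

# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Removing someone again should be the same request with the DELETE method, if I read the REST API docs correctly.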

1 Like

Hi @manics, thanks for creating the blueprint here. I tried to expand on it to support JupyterHubs without a predefined list of users (e.g. OAuthenticator login with your institutional email) and created:

  • a wrapper script that runs jupyterhub as an asyncio task and restarts it when new users are detected
  • a jupyterhub-config.py that creates a sharing group and role for each user
  • a JupyterLab extension that allows editing the sharing group via API calls

This is how it looks in practice: JupyterHub_RTC_collaboration

Code is available here: GitHub - ktaletsk/jupyterhub-rtc-config-wrapper: Example JupyterHub deployment with auto-generating server sharing permissions

5 Likes

This is neat, and should be enough for a start. Do I understand correctly, though, that this gives other users full server access, including the ability to share/unshare/stop, etc.?

JupyterHub’s RBAC can only control whether or not someone has access to your singleuser server. For more granular control of what someone can do inside the singleuser server it needs its own permissions system. There’s an ongoing discussion here:

2 Likes

Quickly reporting here about how I decided to support RTC on my group’s jupyterhub (small instance, trusted users).

  1. I create a shared folder (volume really) accessible by all users (same mounted volume).
  2. I create a shared hub user and override the roles as follows:
     c.JupyterHub.load_roles = [
         {
             "name": "user",
             "description": "Allow users to access the shared server in addition to default perms",
             "scopes": ["self", "access:servers!user=shared"],
         }
     ]
    
  3. I provide the users the following instructions.

    Collaboration workflow

    1. Find someone with whom you want to edit a file
    2. Copy the file and all dependencies into a subfolder of the shared folder in your home.
    3. Go to a URL https://<hub_url>/user/shared/workspaces/<your_team_name> together with your friend.
    4. Edit collaboratively
    5. Once done, copy the files back from your shared folder.

    IMPORTANT: DON’T USE SHARED FOLDER AS THE PRIMARY STORAGE LOCATION!!! THIS IS A BAD IDEA AND IT WILL HURT YOU


Seems to work well except for a weird bug: after restarting the hub the users can’t start their own servers with the following error


Deleting the user and creating the user again removes the error.

4 Likes

Wonderful, thanks for testing! I’ll see if I can track down the permission subset issue.

2 Likes