BinderHub with private GitLab and user scopes

Hi,

This question might sound similar to the one posted in Private Gitlab Access for BinderHub.

What would be the preferred way to reuse the JupyterHub GitLab authenticator to determine whether a user is able to build Binders for a given repo?

I suppose there is always the escape hatch of the extraConfig here:

This could be used to fetch the auth_state and run a check (a clone or wget with the Git token) to see whether the user has access to the repo.

The idea is to prevent users from building and launching a Binder for a repo they don’t have access to in GitLab (no read permission).

Thanks!


Not tested yet, but it should be possible to do something similar to what they do in GitLab for their JupyterHub integration to retrieve a token:

The token could then be used to call the GitLab API.

That is, unless there is a way to set the git_credentials on the fly using the user credentials from the GitLab auth (see @betatim’s post: Private Gitlab Access for BinderHub).
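As a rough illustration of calling the GitLab API with the user's token, here is a minimal, untested-against-a-real-server sketch using only the standard library. The helper names project_api_url and build_project_request are hypothetical, and the hostname, repo URL, and token are placeholders. It sends the OAuth token in an Authorization: Bearer header and only builds the request, it does not send it.

```python
from urllib.parse import quote, urlparse
from urllib.request import Request


def project_api_url(hostname: str, repo_url: str) -> str:
    """Return the GitLab v4 API URL for the project behind repo_url (hypothetical helper)."""
    path = urlparse(repo_url).path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # GitLab expects the full namespace/project path URL-encoded as a single segment.
    return f"https://{hostname}/api/v4/projects/{quote(path, safe='')}"


def build_project_request(hostname: str, repo_url: str, access_token: str) -> Request:
    """Build (but do not send) a request authenticated with the user's OAuth token."""
    return Request(
        project_api_url(hostname, repo_url),
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values, for illustration only.
req = build_project_request(
    "gitlab.example.com",
    "https://gitlab.example.com/group/subgroup/repo.git",
    "dummy-token",
)
print(req.full_url)
```

A successful response to such a request would suggest the user can at least read the project; how the request is actually sent (and from where) depends on where the check ends up living.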

It looks like the credentials are passed into the builder here:


But then you need to trace that up the chain. I expect it’d require some refactoring to get the user credentials in there.

I’m not sure how much work is involved in passing the user credentials into BinderHub, though. With an authenticated BinderHub the auth is handled by JupyterHub, so you would need to figure out how to pass that auth on to BinderHub.

I suspect your original idea of checking for access in the Spawner may be the easiest even if it’s not the most elegant solution, since as you point out the Spawner has access to auth_state.

Thanks @manics for the input!

Opened https://github.com/jupyterhub/binderhub/issues/1117 to discuss a technical solution.


Posting here for completeness.

It is possible to control the user access before launching a Binder, using the c.Launcher.pre_launch_hook configurable. For example:

from urllib.parse import urlparse, quote
from tornado.httpclient import AsyncHTTPClient, HTTPError

GITLAB_HOSTNAME = "gitlab.example.com"

async def pre_launch_hook(launcher, image, username, server_name, repo_url):
    user = await launcher.get_user_data(username)
    auth_state = user.get('auth_state', None)

    if not auth_state:
        launcher.log.warning("No auth state for %s", username)
        return

    access_token = auth_state['access_token']
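    # str.strip('.git') removes any of the characters '.', 'g', 'i', 't' from both
    # ends (e.g. "group/repo.git" becomes "roup/repo"), so drop the suffix explicitly
    path = urlparse(repo_url).path.strip('/')
    namespace = path[:-len('.git')] if path.endswith('.git') else path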
    namespace = quote(namespace, safe='')

    # check the user has access to the repo
    client = AsyncHTTPClient()
    api_url = f"https://{GITLAB_HOSTNAME}/api/v4/projects/{namespace}?access_token={access_token}"
    try:
        resp = await client.fetch(api_url)
    except HTTPError as err:
        raise Exception(f"User {username} does not have access to {repo_url}") from err

c.Launcher.pre_launch_hook = pre_launch_hook
c.GitLabRepoProvider.hostname = GITLAB_HOSTNAME
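One possible refinement: the hook above treats every HTTPError as "no access", so a GitLab outage would be reported to the user as a permission problem. Below is a minimal sketch of separating the two cases. The access_decision helper is hypothetical, and the status mapping is an assumption (in particular that GitLab answers 404 for a project the token cannot see, and 401 for an invalid token).

```python
def access_decision(status_code: int) -> str:
    """Map an HTTP status from the GitLab project endpoint to a decision.

    Returns "allow", "deny", or "error" (hypothetical labels for this sketch).
    """
    if 200 <= status_code < 300:
        return "allow"
    # Assumption: these statuses mean the token is invalid or cannot see the project.
    if status_code in (401, 403, 404):
        return "deny"
    # Anything else (rate limiting, server errors, ...) is not an access verdict.
    return "error"


for code in (200, 404, 503):
    print(code, access_decision(code))
```

Inside the hook, the status of a failed request is available as err.code on tornado's HTTPError, so the except branch could raise a different message when the decision is "error".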

The user will then be faced with the following messages in the BinderHub UI:

However, there doesn’t seem to be a way at the moment to control access at build time, which would make it possible to stop users from building a repo they don’t have access to. For now they can still build a repo by guessing its name. The repo builds because the c.GitLabRepoProvider.private_token has access to the repo to resolve the refs:

The following issue mentions the idea of adding a pre_build_hook configurable that would enable such a use case: https://github.com/jupyterhub/binderhub/issues/1117
