BinderHub with private GitLab and user scopes

jtp · March 3, 2020, 11:08am

Hi,

This question might sound similar to the one posted in Private Gitlab Access for BinderHub.

What would be the preferred way to reuse the JupyterHub GitLab authenticator to determine whether a user is able to build Binders for a given repo?

I suppose there is always the escape hatch of the extraConfig here:

https://github.com/jupyterhub/binderhub/blob/b6446b12b30f741d9e82b7aec1498ede4776cd79/helm-chart/binderhub/values.yaml#L78


from tornado import web


# get custom config from values.custom
import z2jh
cors = z2jh.get_config('custom.cors', {})
auth_enabled = z2jh.get_config('custom.binderauth_enabled', False)


# image & token are set via spawn options
from kubespawner import KubeSpawner


class BinderSpawner(KubeSpawner):
    def get_args(self):
        if auth_enabled:
            args = super().get_args()
        else:
            args = [
                '--ip=0.0.0.0',
                '--port=%i' % self.port,
                '--NotebookApp.base_url=%s' % self.server.base_url,
                '--NotebookApp.token=%s' % self.user_options['token'],
                '--NotebookApp.trust_xheaders=True',

This could be used to fetch the auth_state and perform an action (for example a clone or wget) with the user's git token, to check whether the user has access to a repo.

The idea is to prevent users from building and launching a Binder for a repo they don’t have access to in GitLab (no read permission).

Thanks!

jtp · March 23, 2020, 10:30am

Not tested yet, but it should be possible to do something similar to what they do in GitLab for their JupyterHub integration to retrieve a token:

And use the token to call the GitLab API.
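As a rough, untested sketch of that last step: the GitLab REST API accepts an OAuth access token as a Bearer token in the Authorization header, and looking up a single project can serve as a read-access probe. The helper names, the example hostname and the status-code mapping below are assumptions made for illustration, not something taken from the GitLab integration mentioned above.

```python
from urllib.parse import quote
from urllib.request import Request


def build_project_request(hostname, project_path, access_token):
    """Build (but do not send) a GET request for a single GitLab project."""
    # GitLab expects the "namespace/project" path URL-encoded as one segment
    encoded = quote(project_path, safe="")
    url = f"https://{hostname}/api/v4/projects/{encoded}"
    # Pass the OAuth token in a header rather than in the query string
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


def can_read_project(status_code):
    """Interpret the status of the project lookup (assumed mapping)."""
    if status_code == 200:
        return True
    if status_code in (401, 403, 404):
        return False
    raise RuntimeError(f"Unexpected GitLab API status: {status_code}")


req = build_project_request("gitlab.example.com", "group/subgroup/repo", "TOKEN")
print(req.full_url)
```

Keeping the token in a header means it is less likely to end up in access logs than a query-string parameter would be.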

jtp · March 23, 2020, 10:42am

Unless there is a way to set the git_credentials on the fly using the user credentials from the GitLab auth (see @betatim’s post: Private Gitlab Access for BinderHub)

manics · March 23, 2020, 12:21pm

It looks like the credentials are passed into the builder here:

https://github.com/jupyterhub/binderhub/blob/0193fc333eb639e714db15afa4b5f1d3daa90083/binderhub/builder.py#L365


    repo_url=repo_url,
    ref=ref,
    image_name=image_name,
    push_secret=push_secret,
    build_image=self.settings['build_image'],
    memory_limit=self.settings['build_memory_limit'],
    docker_host=self.settings['build_docker_host'],
    node_selector=self.settings['build_node_selector'],
    appendix=appendix,
    log_tail_lines=self.settings['log_tail_lines'],
    git_credentials=provider.git_credentials,
    sticky_builds=self.settings['sticky_builds'],
)


with BUILDS_INPROGRESS.track_inprogress():
    build_starttime = time.perf_counter()
    pool = self.settings['build_pool']
    # Start building
    submit_future = pool.submit(build.submit)
    # TODO: hook up actual error handling when this fails
    IOLoop.current().add_callback(lambda : submit_future)

But then you need to trace that up the chain. I expect it’d require some refactoring to get the user credentials in there.

I’m not sure how much work is involved in passing the user credentials into BinderHub though. If you’ve got an authenticated BinderHub, the auth is handled by JupyterHub, so you need to figure out how to pass that auth to Binder.

I suspect your original idea of checking for access in the Spawner may be the easiest even if it’s not the most elegant solution, since as you point out the Spawner has access to auth_state.
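For what it's worth, a Spawner-side check might look roughly like the sketch below. It is untested and rests on several assumptions: that auth_state is enabled and persisted in JupyterHub, that BinderHub passes the repo URL to the spawner as user_options["repo_url"], and that has_access is a hypothetical coroutine you supply yourself (for example a GitLab API call). The function name check_repo_access is also made up for illustration.

```python
async def check_repo_access(spawner, has_access):
    """Refuse to spawn when the user cannot read the requested repo.

    `has_access` is a hypothetical coroutine taking (access_token, repo_url)
    and returning True or False.
    """
    auth_state = await spawner.user.get_auth_state()
    if not auth_state:
        raise PermissionError(f"No auth state for {spawner.user.name}")

    # Assumption: the repo URL is available in the spawn options
    repo_url = spawner.user_options.get("repo_url")
    if not await has_access(auth_state["access_token"], repo_url):
        raise PermissionError(f"{spawner.user.name} cannot read {repo_url}")


# In the hub's extraConfig this could be wired up along these lines
# (my_gitlab_check being your own coroutine):
# c.Spawner.pre_spawn_hook = lambda spawner: check_repo_access(spawner, my_gitlab_check)
```

Raising from the hook aborts the spawn, so the image would still have been built by then; this only blocks the launch.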

jtp · July 1, 2020, 12:03pm

Thanks @manics for the input!

Opened https://github.com/jupyterhub/binderhub/issues/1117 to discuss a technical solution.

jtp · July 3, 2020, 9:39am

Posting here for completeness.

It is possible to control the user access before launching a Binder, using the c.Launcher.pre_launch_hook configurable. For example:

from urllib.parse import urlparse, quote
from tornado.httpclient import AsyncHTTPClient, HTTPError

GITLAB_HOSTNAME = "gitlab.example.com"

async def pre_launch_hook(launcher, image, username, server_name, repo_url):
    user = await launcher.get_user_data(username)
    auth_state = user.get('auth_state', None)

    if not auth_state:
        launcher.log.warning("No auth state for %s", username)
        return

    access_token = auth_state['access_token']
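    namespace = urlparse(repo_url).path.strip('/')
    # remove only a literal ".git" suffix: str.strip('.git') would strip any
    # leading or trailing ".", "g", "i" or "t" characters from the path
    if namespace.endswith('.git'):
        namespace = namespace[:-len('.git')]
    namespace = quote(namespace, safe='')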

    # check the user has access to the repo
    client = AsyncHTTPClient()
    api_url = f"https://{GITLAB_HOSTNAME}/api/v4/projects/{namespace}?access_token={access_token}"
    try:
        resp = await client.fetch(api_url)
    except HTTPError as err:
        raise Exception(f"User {username} does not have access to {repo_url}") from err

c.Launcher.pre_launch_hook = pre_launch_hook
c.GitLabRepoProvider.hostname = GITLAB_HOSTNAME
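The project lookup in the hook depends on turning the repo URL into the URL-encoded project path that the GitLab API expects, and that step is easy to check in isolation. A minimal sketch, with a hypothetical helper name and an example hostname:

```python
from urllib.parse import quote, urlparse


def gitlab_project_id(repo_url):
    """Return the URL-encoded "namespace/project" path for a GitLab repo URL."""
    path = urlparse(repo_url).path.strip("/")
    # Drop only a literal ".git" suffix, leaving names such as "testing" intact
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return quote(path, safe="")


print(gitlab_project_id("https://gitlab.example.com/group/testing.git"))
# prints: group%2Ftesting
```

Passing safe='' to quote matters here, because the slashes between namespace and project name have to be encoded as %2F for the API to treat the path as a single project identifier.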

The user will then be faced with the following messages in the BinderHub UI:

However, there doesn’t seem to be a way at the moment to control access at build time, which would make it possible to stop users from building a repo they don’t have access to. For now they can still build such a repo by guessing its name. The repo builds because the c.GitLabRepoProvider.private_token has access to the repo to resolve the refs:

https://github.com/jupyterhub/binderhub/blob/72bcb59cf956f53a07f0d4b45f12cc6c1257c6cf/binderhub/builder.py#L251



repo_url = self.repo_url = provider.get_repo_url()

# labels to apply to build/launch metrics
self.repo_metric_labels = {
    'provider': provider.name,
    'repo': repo_url,
}

try:
    ref = await provider.get_resolved_ref()
except Exception as e:
    await self.fail("Error resolving ref for %s: %s" % (key, e))
    return
if ref is None:
    await self.fail("Could not resolve ref for %s. Double check your URL." % key)
    return

self.ref_url = await provider.get_resolved_ref_url()
resolved_spec = await provider.get_resolved_spec()

The following issue mentions the idea of adding a pre_build_hook configurable that would enable such a use case: https://github.com/jupyterhub/binderhub/issues/1117
