Current state
During July’s JupyterHub/BinderHub meeting, among other things, we discussed the growing interest in using BinderHub with user-specific private repos.
To some extent, BinderHub can already do that, but it requires a central user that can access the private repositories. That is only achievable when one controls the target provider, for example a custom GitLab instance or a GitHub organisation, again with a dedicated user to access the repositories. The security issues of a single account being able to access everything are pretty clear.
BinderHub can make use of JupyterHub’s authentication system when one wants to run BinderHub privately. In the context of private repos, JupyterHub’s authentication could be used to obtain the credentials required to pull private repo content and build images from it. However, besides the need to implement the credential retrieval part, the authentication system can only handle one provider, which makes it of limited use for supporting multiple providers in BinderHub.
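For reference, the central-user setup described above looks roughly like this in a `binderhub_config.py`. This is only a sketch: it assumes the `access_token` option of BinderHub's `GitHubRepoProvider`, and the environment variable name is made up for illustration.

```python
import os

# `get_config()` is injected by traitlets when the config file is loaded.
c = get_config()  # noqa: F821

# One token, belonging to one dedicated account, is used for every
# private repository that BinderHub is asked to build.
# BINDER_GITHUB_TOKEN is a hypothetical variable name.
c.GitHubRepoProvider.access_token = os.environ["BINDER_GITHUB_TOKEN"]
```

Every user of the deployment effectively inherits the read access of that one account, which is the security concern raised above.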
As we can see here, this is not just a question of minor tweaks to the current code base.
Where we would like to go
To put it simply: create an authentication mechanism that allows credential retrieval and that can be configured to use multiple sources in a single installation.
The way the current authentication works does not scale to the requirements above. Therefore @manics has suggested refactoring the authentication system to make it easier to use in standard JupyterHub installations as well as for the purposes of BinderHub. Part of this work will be to make the authentication pluggable so that it becomes possible to:
- Use more than one authentication provider at the same time
- Implement new providers more easily
- Request credentials for the logged-in user to access private repos appropriately
This last point is of more interest to BinderHub than JupyterHub itself at this time.
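To make the three goals in the list above a bit more concrete, here is a rough sketch of what a pluggable design could look like. None of this is existing JupyterHub or BinderHub API: `AuthProvider`, `AuthRegistry`, `Credential` and `InMemoryProvider` are hypothetical names, and the in-memory provider only stands in for a real OAuth flow.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Credential:
    """Hypothetical per-user credential usable to fetch a private repo."""
    provider: str
    username: str
    token: str


class AuthProvider(ABC):
    """Hypothetical interface every pluggable provider would implement."""
    name: str

    @abstractmethod
    def authenticate(self, login_data: dict) -> Optional[str]:
        """Return the username on success, None otherwise."""

    @abstractmethod
    def get_credential(self, username: str) -> Optional[Credential]:
        """Return the credential held for this user, if any."""


class InMemoryProvider(AuthProvider):
    """Toy provider backed by a dict; a stand-in for a real OAuth flow."""

    def __init__(self, name: str, accounts: Dict[str, Tuple[str, str]]):
        self.name = name
        self._accounts = accounts  # username -> (password, token)

    def authenticate(self, login_data: dict) -> Optional[str]:
        username = login_data.get("username")
        account = self._accounts.get(username)
        if account is not None and account[0] == login_data.get("password"):
            return username
        return None

    def get_credential(self, username: str) -> Optional[Credential]:
        account = self._accounts.get(username)
        if account is None:
            return None
        return Credential(self.name, username, account[1])


class AuthRegistry:
    """Holds several providers at once and dispatches by provider name."""

    def __init__(self) -> None:
        self._providers: Dict[str, AuthProvider] = {}

    def register(self, provider: AuthProvider) -> None:
        if provider.name in self._providers:
            raise ValueError(f"provider {provider.name!r} already registered")
        self._providers[provider.name] = provider

    def login(self, provider_name: str, login_data: dict) -> Optional[str]:
        provider = self._providers.get(provider_name)
        return provider.authenticate(login_data) if provider else None

    def credential_for(self, provider_name: str, username: str) -> Optional[Credential]:
        provider = self._providers.get(provider_name)
        return provider.get_credential(username) if provider else None


# Two providers configured side by side in a single installation.
registry = AuthRegistry()
registry.register(InMemoryProvider("github", {"alice": ("pw1", "gh-token")}))
registry.register(InMemoryProvider("gitlab", {"alice": ("pw2", "gl-token")}))

user = registry.login("gitlab", {"username": "alice", "password": "pw2"})
credential = registry.credential_for("gitlab", user) if user else None
```

In this sketch a new provider only has to implement two methods, and BinderHub would ask the registry for the logged-in user's own credential instead of relying on a shared account. Whether the real design should look anything like this is exactly what is open for discussion.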
JupyterHub is not the first system that needs this kind of support, and inspiration could be taken from projects like django-allauth.
There’s also python-social-auth which is more generic and might fit the current needs without having to re-implement all the flows.
For the record, there are already some ideas that can be found in this jupyterhub/oauthenticator issue.
In any case, this is not a small task, as it touches a core element with security implications. @sgibson91 therefore proposed organising this work in a more coordinated fashion so that we can better prepare the related subtasks with the support of @yuvipanda, @minrk and @betatim.
The goal of this post is to gather ideas and suggestions about this topic in order to prepare a JEP and lay out a plan for the implementation.