Building image from latest git repo on spawn

Hello,

I’ve been working on setting up a notebook-based workshop with JupyterHub on OpenShift. In short, I’m wondering whether JupyterHub can be configured so that each pod, when spawned, uses an image built from the latest state of a specified Git repo.

I’ve tried to do this with nbgitpuller and repo2dockerspawner, as both bring in a repository’s latest content at spawn time, but I have encountered errors when trying to use them in my cloud environment. I’m also aware that what I’m asking for is exactly what mybinder does. However, there is a vested interest in avoiding mybinder in this scenario.

I’d very much appreciate any help on how to approach this issue, and on whether there is another practical way to get this functionality. The alternative is of course that my time would be better spent continuing to try to get the aforementioned tools (repo2dockerspawner or nbgitpuller) to work. If that is the case, are there any resources on getting these to work specifically with OpenShift?

Best Regards,
Thomas Stautland

Hi!
Are you able to tell us a bit more about your requirements, in particular what aspects of BinderHub make it unsuitable for you, and how much flexibility you need when specifying Docker images and/or GitHub repositories?

+1 to @manics’s question about more details. It would be interesting to learn about them, and there are features in BinderHub (the software powering mybinder.org) that are disabled on mybinder.org but that people find useful in more restricted environments.

In general, work on making BinderHub run on platforms like OpenShift, where running as root or with a fixed user ID is tricky, would be a cool contribution to BinderHub itself. It needs a bit of thinking about how to do it, but it would be interesting.

Thanks for the prompt reply. We were actually using mybinder to host this workshop previously, so it’s not really a technical issue. The shift away from it is only due to the company I’m working for not wishing to rely on Google products when presenting a workshop. If that weren’t an issue, I would likely continue using mybinder, as it was working fine for the prototype workshop.

As for the desired behavior, I don’t think it needs to be very flexible. There is a single repository that is being pulled. Importantly, this repository is under continuous development, so it is likely to change between workshop sessions. Currently, a new-app is defined that creates an image from the Git repo, so a Docker image is never explicitly defined. What I hope to be able to do is define the new-app in such a way that each time a new notebook instance is spawned, it pulls the latest state of the repository and creates the notebook instance from that.

Hopefully this provides some better context for what the end goal is and doesn’t just rephrase my original query.

Thanks for the response. As alluded to in my response to @manics, the functionality of mybinder suits the workshop well. Our efforts to deploy on OpenShift are just an attempt to define a more thoroughly in-house solution, if possible.

In case it’s not clear, mybinder.org is a public instance of BinderHub, but you can also run your own instance on your own Kubernetes cluster. As @betatim mentioned, there are loads of additional features, such as authentication and limiting repos, though in your case converting it to run on OpenShift would require some work.

However, if you only have one repo and it’s just a matter of ensuring the latest version is run, could you set up an automated Docker build, e.g. on Docker Hub or on your own registry using your own CI build tools? Zero to JupyterHub includes a singleuser.image.pullPolicy setting, so the spawner can be configured to check for an updated Docker image whenever a user starts their singleuser server. For example, it could always pull the latest Docker image tag.
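
For a hub that isn’t deployed through the Zero to JupyterHub Helm chart (for instance one running on OpenShift), roughly the same behaviour can be set in jupyterhub_config.py. This is only a minimal sketch, assuming the hub uses KubeSpawner. The helper name `configure_spawner` and the image name in the usage note are hypothetical placeholders, not something from your setup.

```python
def configure_spawner(c, image):
    """Point single-user servers at `image` and re-pull it on every spawn.

    `c` is the config object JupyterHub exposes inside jupyterhub_config.py.
    Assumption: KubeSpawner is the spawner class in use.
    """
    # Image that the CI pipeline rebuilds and pushes after each commit.
    c.KubeSpawner.image = image
    # "Always" asks Kubernetes to check the registry each time a pod starts,
    # so a freshly pushed tag is picked up by the next spawned server.
    c.KubeSpawner.image_pull_policy = "Always"
    return c
```

Inside jupyterhub_config.py this would be called as `configure_spawner(c, "registry.example.com/workshop/notebook:latest")`, with the CI job pushing to that same tag. Servers that are already running keep their old image until they are restarted.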

Ok, that clarifies things. There’s also an interest in sticking with OpenShift for the time being, but it’s good to know that BinderHub on Kubernetes is an open alternative. I’ll certainly keep that in mind.

This makes sense. Should be able to accomplish the same using Jenkins as it comes integrated with OpenShift. I’ll give this a shot. Thanks!
