Should JupyterHub and BinderHub come in more flavors?

#1

The installs that I have seen are Ubuntu-based

I have done some work with JupyterHub and, over the past couple of days, have been working with an internal Binder implementation. In each case, the installs or the Docker configurations seem to require Ubuntu.

While experimenting with configuring a notebook to “launch binder”, I found that a number of commands do not port cleanly. For example, adduser takes different options on RHEL than on Ubuntu.

Does ignoring RHEL reduce market penetration?

I am far from an expert, but where I work the vast majority of our servers are running RHEL. That has been a disadvantage the few times that I have worked with these two products. I was wondering about the server mix at commercial institutions. I found this article, which is not especially quantitative, but it suggests that RHEL has a much higher penetration of the server market than Ubuntu.

  • Should we be offering an alternative flavor for installation?
  • Has this already been researched?
#2

Good question. For BinderHub, could you point to some specific things? I don’t think anything should require Ubuntu, though the commands do need translating to other dialects.

Speaking for BinderHub and its related projects: we decided to only give the commands for Ubuntu (see recent comment on revamped contributing guide). The assumption we made is that if you are installing BinderHub you are probably experienced enough to translate the Ubuntu command to your local dialect (with your personal opinion mixed in). Basically thinking of it as a lingua franca.

The advantages of doing it this way for the project are:

  • we can keep the guide short(ish),
  • we don’t have to resolve the various opinions on how you “should” install things (I personally think that apt-get install python is the worst way to install Python, but I can live with it as the precise way of saying “install Python”),
  • we have access to Ubuntu, so we can check that the commands work.

For running BinderHub in production you shouldn’t have to care about what flavour of OS you are on. The bigger problem is having a Kubernetes cluster with all the right bells and whistles. For example, I have no real idea what OS the nodes of mybinder.org actually run, and I have never needed to know either. Maybe you can specify a bit more what is missing or where we could extend/clarify things.

I think maybe we should add a sentence or two to communicate this vision? Similarly, the zero2jupyterhub guide only really covers GKE, and others are encouraged to contribute docs for other cloud providers. The docs are geared towards GKE purely because of limited team resources.

#3

I did not meet your assumption. I am much more of a developer than a sysadmin. For me, RHEL was a blocker for the-littlest-jupyterhub. I got around it eventually by finding a way to run Ubuntu in a local cloud. This was one of those things that I wanted badly enough to find a way to do it, just much later than I wanted.

My first attempt was at the post-JupyterCon 2018 hackathon. I did not get a chance to return to it until several months later. Fortunately, it was pretty easy once I abandoned trying to get it to work on RHEL.

If I find something specifically on the BinderHub question, I can mention it here.

Binder-related: see item 3 in the readthedocs documentation. The following options to adduser:

--disabled-password \
    --gecos "Default user"

did not work in the RHEL release that I tried. I have not researched whether some newer version might have them. This is just information, not a criticism. I understand that it would take more resources for Jupyter to support both.
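For anyone who hits the same difference, here is a minimal sketch of how the two dialects might map onto each other. It only prints the command rather than running it. The RHEL branch is an assumption: it relies on useradd from shadow-utils, where --comment sets the GECOS field and no password is set unless one is given, and it has not been checked against any particular RHEL release. The function name build_user_cmd and the user name nbuser are hypothetical.

```shell
# Print (without running) a user-creation command for a given OS family.
# build_user_cmd and the user name passed to it are illustrative only.
build_user_cmd() {
    family="$1"
    user="$2"
    case "$family" in
        debian)
            # Debian/Ubuntu adduser, using the options quoted above.
            printf 'adduser --disabled-password --gecos "Default user" %s\n' "$user"
            ;;
        rhel)
            # Assumed RHEL counterpart: useradd with --comment for the GECOS
            # field and --create-home for the home directory. No password is set.
            printf 'useradd --create-home --comment "Default user" %s\n' "$user"
            ;;
        *)
            echo "unknown OS family: $family" >&2
            return 1
            ;;
    esac
}

build_user_cmd debian nbuser
build_user_cmd rhel nbuser
```

In a Dockerfile, whichever line matches the base image would go after RUN. Keeping the two variants side by side makes it easier to see which options actually differ.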

#4

So there you are making your own Dockerfile for a repository you want to run on a BinderHub. We should definitely mention that the commands listed there assume you are in a Debian/Ubuntu-like base image. At least I can see it mentioned right now.

However, my standard question when people are making their own Dockerfile for a repo is: are you sure you need to? More than 50% of the people I ask don’t actually have to; the things they want to do can be achieved with what repo2docker already provides.
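As a rough illustration of what “without a Dockerfile” can look like, the sketch below creates two of the configuration files that repo2docker recognizes, requirements.txt and apt.txt, in a scratch directory. The package names are placeholders, and the scratch directory only stands in for a real repository.

```shell
# Scratch directory standing in for a repository (hypothetical contents).
repo="$(mktemp -d)"

# Python packages to install with pip; repo2docker looks for requirements.txt.
printf 'numpy\npandas\n' > "$repo/requirements.txt"

# System packages; apt.txt is installed with apt, so it presumes the
# Ubuntu-based image that repo2docker builds on.
printf 'graphviz\n' > "$repo/apt.txt"

# To build and launch locally (needs Docker and repo2docker installed):
#   jupyter-repo2docker "$repo"
ls "$repo"
```

Nothing here depends on the OS of the machine doing the build, since the packages are installed inside the image.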

Because Dockerfiles let you do “whatever you want”, supporting them requires huge amounts of resources, which is why you will get very little support. That is another reason not to use a custom Dockerfile. It is much better to file an issue describing what you are trying to do but can’t with what repo2docker currently supports. That way we also get an estimate of what people want to do with their repos on Binder and can then make it easy to do that via repo2docker.

I think an action item here is to make the text in https://mybinder.readthedocs.io/en/latest/tutorials/dockerfile.html even more explicit that “if you choose this path you are in your own private adventure down the docker rabbit hole”.

#5

A good middle ground between manageable for the project and all-encompassing would be to add a “3rd party installation options” section to Jupyter projects’ install docs. Each entry can then point to external repos where people have scratched an itch (e.g. “install to RHEL”), with a summary of what you can get from that project.

Discoverability is easiest from the main project’s docs.

#6

Indeed, I hit this precisely because I needed to go down a proprietary rabbit-hole. :grinning: I needed something that pip would not install, and which is a resource shared by many and won’t be in repos like mine.

#7

https://zero-to-jupyterhub.readthedocs.io/en/latest/#resources-from-the-community awaits :slight_smile: