Binder from GitHub & Dockerfile not starting RStudio?

Hi all,
This will clearly show my fundamental lack of understanding of how things work, but… here goes… :slight_smile:

So: I have created a public GitHub repo, StatisticsHealthEconomics/stat0019_binder (used to create a fully functional RStudio environment for computation), where I’ve copied a Dockerfile. The reason for this is that I need to use an existing Docker image, which has various pieces of software already installed (with a view to making life easier for my students in their remote practicals…).

I did follow the instructions in the mybinder docs to set up the Dockerfile and I think I’ve done it OK (at least the process goes on until the end…).

Anyway: when I launch the image via docker — something like

docker run -ti --rm -p 8787:8787 -e PASSWORD=pass giabaio/stat0019:17022021

all works fine and the relevant packages are all loaded.
If I try and share the image via mybinder, the process works, in the sense that I get a “success” message. However, it then fails to launch Rstudio and in fact I end up with a 404 : Not Found page…

I can browse the “non-Rstudio” part of the binder, which seems to show the files are there. I can open a new Terminal and run some Linux commands to check things (which, as far as I understand, don’t point to anything weird — but I don’t know much, as you’ll have guessed, at this stage…). But even from the top-right cascade menu, there’s no way for me to launch Rstudio.

I’m sure I’m missing something absolutely obvious. Possibly the GitHub repo has something wrong? Or maybe I have messed up some other bit of the installation? Can anybody help?

Many thanks and please bear with the amateur… :slight_smile:

You say the process works, but it doesn’t. There’s a warning in the build. I put that warning I saw below (scroll over to the right so you can see it):

Removing intermediate container 7968b14daf10
 ---> 285c46c4cd12
{"aux": {"ID": "sha256:285c46c4cd12b46636620046bc7f799eeeda587607cc79ad7586926652b47b6f"}}[Warning] One or more build-args [NB_USER NB_UID] were not consumed
Successfully built 285c46c4cd12
Successfully tagged turingmybinder/binder-prod-statisticshealtheconomics-2dstat0019-5fbinder-b96a03:e4a023c5f971a3ad181a3741fdd69ae1ac8a2b88
Pushing image
....

The build passes the arguments NB_USER and NB_UID to your Dockerfile, but the Dockerfile never consumes them.

If you look at your Dockerfile at https://github.com/StatisticsHealthEconomics/stat0019_binder/blob/d37d3d23bebdf78b53e191f2e3729ea1187eef89/Dockerfile , you’ve commented out the blocks dealing with that.

Look again at the following with respect to that:

In particular the minimal-dockerfile describes what you need for [NB_USER NB_UID] .
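As a rough way to spot the "build-args were not consumed" situation before pushing, one could scan the Dockerfile for the two ARG declarations. This is only a sketch: the function name `missing_build_args` is made up for illustration, and it assumes each argument is declared on its own uncommented line as `ARG NAME` or `ARG NAME=default`.

```shell
# Print each Binder build-arg that the given Dockerfile never declares.
# Assumption: declarations look like `ARG NAME` or `ARG NAME=default`.
missing_build_args() {
    dockerfile=$1
    for name in NB_USER NB_UID; do
        if ! grep -Eq "^[[:space:]]*ARG[[:space:]]+${name}([[:space:]=]|\$)" "$dockerfile"; then
            echo "$name"
        fi
    done
}

# Demo on a throwaway Dockerfile in which one ARG line is commented out.
demo=$(mktemp)
printf '%s\n' 'FROM giabaio/stat0019:17022021' '# ARG NB_USER' 'ARG NB_UID' > "$demo"
missing_build_args "$demo"    # prints: NB_USER
rm -f "$demo"
```

An empty result means both arguments are declared. It does not prove they are used correctly later in the file.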

It looks like your Dockerfile doesn’t install R/RStudio in a way that Jupyter launched via MyBinder recognizes and adds. The symptom is that there is no R kernel or RStudio choice in the Jupyter Dashboard. I do see that R is present if you launch it from the terminal in a session launched from your repo.

Then try adding the following at the end of your Dockerfile, based on binder-examples/dockerfile-rstudio (Using RStudio with Binder with a custom Dockerfile):

## run any install.R script we find
RUN if [ -f install.R ]; then R --quiet -f install.R; fi

Thank you, @fomightez — I did say I obviously didn’t know what I was doing…

I had commented out the bits about defining the new user because when it’s not commented, the build just fails (complaining that a user with ID 1000 already exists). This way it did get to the end of the process, so I thought it did work…

I’ve tried to implement the changes you suggest and it still doesn’t work. I’m still investigating, but I think the problem is that the original image I’m importing probably already has an existing user jovyan, which messes up the rest of the installation…

If I use

# Must specify a tag
FROM giabaio/stat0019:17022021

# install the notebook package
RUN pip install --no-cache --upgrade pip && \
    pip install --no-cache notebook

# create user with a home directory
ARG NB_USER
ARG NB_UID
ENV USER ${NB_USER}
ENV HOME /home/${NB_USER}

RUN echo $HOME
RUN echo $NB_USER

# Copy repo into ${HOME}, make user own $HOME
USER root
COPY . ${HOME}
RUN chown -R ${NB_USER} ${HOME}
USER ${NB_USER}

## run any install.R script we find
RUN if [ -f install.R ]; then R --quiet -f install.R; fi

then it fails.

I think I need to check that more carefully — but any piece of advice more than welcome!
Thanks

Yes, you aren’t getting past:

Step 11/13 : RUN chown -R ${NB_USER} ${HOME}
 ---> Running in 4a62bb5fd956
chown: invalid user: ‘jovyan’
Removing intermediate container 4a62bb5fd956
The command '/bin/sh -c chown -R ${NB_USER} ${HOME}' returned a non-zero code: 1

Did you try commenting out just that RUN chown -R ${NB_USER} ${HOME} line? If you are lucky jovyan already owns it.
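The `chown: invalid user` error means no account named jovyan exists at that point in the build. A guarded check along these lines could go in a `RUN` step before the `chown`. It is a minimal sketch, not the documented Binder recipe: the helper names `user_exists` and `ensure_user` are hypothetical, and `ensure_user` assumes it runs as root on a Debian/Ubuntu-style image that ships `adduser` and that the requested UID is still free (which, per the earlier report about ID 1000, may not hold for this base image).

```shell
# Succeed if an account with the given name exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Create the user only when it is missing.
# Assumptions: run as root, `adduser` is available, and the UID is unused.
ensure_user() {
    name=$1
    uid=$2
    if user_exists "$name"; then
        echo "user $name already exists; skipping creation"
    else
        adduser --disabled-password --gecos "Default user" --uid "$uid" "$name"
    fi
}

# Read-only check that is safe to run in any terminal session:
if user_exists jovyan; then
    echo "jovyan exists"
else
    echo "jovyan is missing"
fi
```

If the UID is already taken by another account, the `adduser` call will fail, so this only helps once that conflict is resolved.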

The better approach is to do it in a supported way. See Specifying an R environment with a runtime.txt file and the strong encouragement against the way you are trying here.
I know you wanted to make it more consistent with that Docker image you already have; however, you can specify the R version in runtime.txt and install the packages you want via the install.R file.
It looks like you’d like OpenBUGS and JAGS. The latter you can install via apt.txt; see here and here. For OpenBUGS it looks like you’d want to use a postBuild file, because it doesn’t seem to be installable with any package manager. Or is this making it harder than it is? I see that there is a GitHub archive of it here. Plus, I’m seeing two commands here that I wonder if you could just run in postBuild, since a lot of the Rocker stuff works on MyBinder. Similarly, here OpenBUGS is installed from source with a make step that could possibly be placed in postBuild. (postBuild files are discussed here and here and demonstrated here.)
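To make the file-based approach concrete, here is a sketch that scaffolds the four configuration files mentioned above. Everything inside the files is a placeholder assumption rather than the final content from this thread: the snapshot date, the choice of R packages, and the empty OpenBUGS build step are for illustration only, and the function name `scaffold_binder` is made up.

```shell
# Create a binder/ directory with placeholder configuration files.
# Assumption: all file contents below are illustrative, not tested on MyBinder.
scaffold_binder() {
    dir=$1/binder
    mkdir -p "$dir" || return 1

    # R version plus package snapshot date (format assumed from the repo2docker docs)
    printf '%s\n' 'r-3.6-2021-02-18' > "$dir/runtime.txt"

    # System packages installed with apt; JAGS is packaged as `jags` on Ubuntu
    printf '%s\n' 'jags' 'gcc-multilib' 'g++-multilib' > "$dir/apt.txt"

    # R packages installed after the system packages (assumed selection)
    printf '%s\n' 'install.packages(c("R2jags", "R2OpenBUGS"))' > "$dir/install.R"

    # Script for anything a package manager cannot do, e.g. building OpenBUGS
    printf '%s\n' '#!/bin/bash' 'set -e' \
        '# build OpenBUGS from source here (steps intentionally not shown)' > "$dir/postBuild"
    chmod +x "$dir/postBuild"
}

# Usage: scaffold into a scratch directory and list the result.
scratch=$(mktemp -d)
scaffold_binder "$scratch" && ls "$scratch/binder"
rm -rf "$scratch"
```

Marking postBuild as executable is done here as a precaution.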

The problem is that I don’t just need R/RStudio: there are programmes I need to go along with R (specifically, OpenBUGS and JAGS). I know how to install them on a Linux machine, but doesn’t that still mean a Dockerfile?

It shouldn’t. I added specific coverage above. With OpenBUGS being the only one I have questions about.

Thank you — that’s very helpful! I need to spend a moment looking through the documentation you point out. You’re right that I may be better off doing this – but like I said, I am a bit lost right now and have to do some more reading…

Thanks!

My version of the files I discussed is here. With the default runtime that was there, it looks like it will be R 3.6.3. Maybe, though, it will tell us if we are on the right track for installing JAGS and OpenBUGS.

Currently OpenBUGS is still not cooperating.

Getting warmer… Like you say, OpenBUGS proves problematic… If I exclude it then all installs fine and I can run jags. Installing the other packages is fine.

I’ve tried to follow a few of the methods described, including configuring OpenBUGS to a folder different than /usr to avoid permission issues, but it almost invariably gives me a couple of warnings (to do with aclocal-1.11 being missing) and then a fatal error.

In file included from OpenBUGSCli.c:2:0:
/usr/include/stdio.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory
 #include <bits/libc-header-start.h>
compilation terminated.

My apt.txt file includes

gcc-multilib g++-multilib

which are known dependencies for OpenBUGS, but this doesn’t save it. I did try to also include automake and perl, which are suggested by the warnings, but again no joy…

Yes, I was getting to adding some stuff to apt.txt based on R2jags-image/Singularity at 1f0b9c35a00634f6801c99e3c9dacc04aeb84fa7 · gshamov/R2jags-image · GitHub ; however, I had some other stuff come up.

How do I test OpenBUGS?
Mine builds at present.

When I run jags I see:

jovyan@jupyter-fomightez-2dstat0019-5fbinder-2dgrthm23x:~$ jags
Welcome to JAGS 4.3.0 on Thu Feb 18 01:54:48 2021
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok
.

Does that last line mean OpenBUGS works?

UPDATE: I think I figured out how to test that OpenBUGS works by looking at what make check triggered after make during one of the builds, which failed because I still had && make install after make check. I’ve since removed both of those. make check was unnecessary because it just takes time and scrolls by too fast in a normal build anyway, and I removed && make install because it was failing due to permissions. Instead I put a symbolic link in ~/.local/bin, as you are supposed to do for such installs.

IT IS WORKING. This is what I see in the terminal in RStudio now:

jovyan@jupyter-fomightez-2dstat0019-5fbinder-2dyr27udq8:~$ OpenBUGSCli -h
OpenBUGS version 3.2.3 rev 1012
type 'modelQuit()' to quit
OpenBUGS>

By the way, I don’t believe your version here is structured correctly. If there is a binder directory, then files such as apt.txt, runtime.txt and install.R should be in it, along with the postBuild file. The idea is that you can put other configuration files, such as runtime.txt, outside the binder directory for use when the repository is accessed via avenues other than MyBinder, while a BinderHub should be the only thing that uses the items in binder.

See here:

"If a binder/ folder is used, Binder will only read configuration files from that location (i.e. myproject/binder/requirements.txt ) and will ignore those in the repository’s root ( myproject/environment.yml and myproject/requirements.txt )."

I restructured mine to have the configuration files in the binder directory as I gathered you may prefer that.
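The precedence rule quoted above can be pictured with a small sketch. This is not how Binder itself is implemented: `config_dir` is a hypothetical helper that only mirrors the documented behaviour, namely that a binder/ folder, when present, is the sole place configuration files are read from.

```shell
# Print the directory configuration files would be read from,
# following the rule quoted from the docs (illustrative only).
config_dir() {
    repo=$1
    if [ -d "$repo/binder" ]; then
        echo "$repo/binder"
    else
        echo "$repo"
    fi
}

# Demo: a runtime.txt in the root is ignored once binder/ exists.
repo=$(mktemp -d)
touch "$repo/runtime.txt"
config_dir "$repo"            # prints the repository root
mkdir "$repo/binder"
config_dir "$repo"            # now prints the binder/ subdirectory
rm -rf "$repo"
```

In practice this means apt.txt, runtime.txt, install.R and postBuild need to move together: leaving one of them in the root while the others sit in binder/ would cause it to be skipped.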

Brilliant — thank you! I couldn’t work on it until now, but will pick it up and continue testing. Thank you so much for your help!

OK, so thanks again. That’s really close to what I need. I think the “normal” installation of OpenBUGS ensures that the R package R2OpenBUGS can find the actual programme automatically. So the function bugs automatically calls Sys.which("OpenBUGS") and finds it without the need for any specific argument to be passed to the option OpenBUGS.pgm.

The current installation launches OpenBUGS by doing

OpenBUGSCli

(as you’ve tested). This kind of messes up with running OpenBUGS remotely from R and one needs to specify

model=bugs(...,OpenBUGS.pgm="/home/jovyan/openbugs/src/OpenBUGSCli")

Then all works OK (in the limited tests I’ve made anyway…). But if you don’t specify the OpenBUGS.pgm option, then R2OpenBUGS::bugs can’t find OpenBUGS and the R script fails…

I think I can live with that. I’ll just need to tell my students they need to specify the full path (which I’m sure they won’t register). I can see the conversation going:

  • “Dear Prof, it doesn’t work!”
  • “Have you included the full path?”

:sweat_smile:

I think you can make it work however you prefer.
We should be able to add what you want to the ~/.local/bin so it will be in the path. I put the one I saw the make check using because I thought that was what was needed, but we could alter it in the postBuild. I think what you’d prefer is below?

mkdir -p  $HOME/.local/bin
ln -s /home/jovyan/openbugs/src/OpenBUGSCli ~/.local/bin/OpenBUGSCli
ln -s /home/jovyan/openbugs/src/OpenBUGSCli ~/.local/bin/OpenBUGS

Does that look right, and would it address your needs?
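A slightly more defensive variant of those linking commands is sketched below, in case postBuild is ever re-run or the build path changes. The function name `link_openbugs` is hypothetical. It assumes, as discussed in this thread, that ~/.local/bin is on the PATH in the Binder session.

```shell
# Link an OpenBUGS command-line binary under both names R2OpenBUGS might look for.
# $1: path to the built OpenBUGSCli binary
# $2: directory for the links (defaults to ~/.local/bin)
link_openbugs() {
    target=$1
    bindir=${2:-$HOME/.local/bin}
    if [ ! -x "$target" ]; then
        echo "not an executable file: $target" >&2
        return 1
    fi
    mkdir -p "$bindir" || return 1
    for name in OpenBUGSCli OpenBUGS; do
        # -f replaces an existing link, so repeated runs do not fail
        ln -sfn "$target" "$bindir/$name" || return 1
    done
}

# In postBuild this might be called as:
#   link_openbugs /home/jovyan/openbugs/src/OpenBUGSCli
#   command -v OpenBUGS    # should print a path if ~/.local/bin is on PATH
```

Checking that the target is executable first makes a failed OpenBUGS build show up at link time rather than later, when an R script cannot find the program.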

Plus, do you want me to try to get the R version to be 3.6.2? It shouldn’t be that difficult to adjust the runtime.txt, if you’d like me to try. The idea then is that you should just be able to make the content of your configuration files match what I have, and things will work in your repository.

By the way, the postBuild file has special permissions, so you cannot edit it in the GitHub web interface. I always use git on my local machine. (That may not apply to you, but I’m putting it here to remind others later.)

Thank you, @fomightez

I think what you’d prefer is below?

mkdir -p  $HOME/.local/bin
ln -s /home/jovyan/openbugs/src/OpenBUGSCli ~/.local/bin/OpenBUGSCli
ln -s /home/jovyan/openbugs/src/OpenBUGSCli ~/.local/bin/OpenBUGS

That’s exactly right! I’ve pulled your repo and rearranged mine and all seems to be working OK. I’ve run the ln commands on the fly on a running instance and it does the trick — now R2OpenBUGS::bugs can find OpenBUGS automatically (so hopefully one less problem with the students… :wink: ).

Thanks also for your pointer about postBuild — I was already working locally and then pushing to the GitHub repo, but good to know!

Finally, I think what you mean with the R version is to change the runtime.txt file to look something like

r3.6.2-2019-02-18

?
I’ve not fiddled with that as I was trying to get the bigger issues sorted (and thanks again for your invaluable help!), but perhaps that’s also useful. Thank you!

I don’t think you have the syntax quite right there. See here for something closer. However, that isn’t working as intended yet because…
More important for now, though: why did you pick that snapshot from 2019-02-18? I’m looking at the list of snapshots and when I click on that one, it seems to say it is R 3.5.2 here. Which version of R are you seeking?
I am still getting 3.6.3 currently, but that may be because the snapshot date and the 3.6 don’t match?
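For reference, a quick syntax check for a runtime.txt line could look like the sketch below. It assumes the `r-<version>-<YYYY>-<MM>-<DD>` form (or the older `r-<YYYY>-<MM>-<DD>` form) described in the repo2docker documentation, with the version written as major.minor. That restriction is an assumption, and the helper `is_valid_runtime` is hypothetical. It checks the shape of the line only, not whether a snapshot for that date exists or which R release it contains.

```shell
# Succeed if the argument looks like an R runtime.txt line.
# Assumed forms: r-YYYY-MM-DD or r-X.Y-YYYY-MM-DD
is_valid_runtime() {
    printf '%s\n' "$1" | grep -Eq '^r-([0-9]+\.[0-9]+-)?[0-9]{4}-[0-9]{2}-[0-9]{2}$'
}

for line in 'r-3.6-2021-02-18' 'r3.6.2-2019-02-18'; do
    if is_valid_runtime "$line"; then
        echo "$line: looks well-formed"
    else
        echo "$line: unexpected format"
    fi
done
```

The second string, taken from the earlier post, is rejected because it lacks the hyphen after `r` and uses a three-part version.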

I just typed wrongly… I intended to use today as a template-date (I think what you need to do is to check the available snapshots?), but wasn’t very clear… Thanks — I think I know what you mean.

While I’m asking for advice: how do people feel about GESIS Notebooks? It would be very cool if students could actually save the files they change, but I haven’t had the bandwidth or time to look into them.

Thanks again!

GESIS Notebooks, see https://notebooks.gesis.org/ , is sort of an offshoot of Binder. GESIS is a member of the Federation but the only one to allow persistence. The issue there is you have to sign up as a scientist. I haven’t used it enough myself to know how it works in a class setting.

I’m still unclear which version of R you want? 3.5.2 or 3.6.2?

Sorry, I think 3.6.2 would be best, but the students could manage even if it was 3.5.2. If it’s easy enough to do the most recent (so 3.6.2), then that’d be great!

And thanks about GESIS.

G