RStan in binder does not work because of RAM demands?

I am tentatively concluding that using Stan in binder, specifically Stan via rstan in an RStudio-based binder project, is not going to work, and that the reason is a RAM limit.

What I tried

After trying and failing with a few other approaches, I managed to get an RStudio binder with rstan and brms installed. This required making my own Docker image, which I uploaded to DockerHub, as described here.

For reference, the relevant GitHub repo is here, and here’s the binder launch button: Binder

How it failed

If we run the little test script therein, named hello.R, whose contents are as follows,

library(tidyverse)
library(brms)

data_df <- tibble(x = rnorm(10))

M <- brm(x ~ 1, data = data_df)

the compilation of the sampler fails after about 2 or 3 minutes of silence.
The error message is, in my experience, a typical and somewhat uninformative one that effectively says only that the compilation failed, e.g.

Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! file504ae4af77.cpp:6:36: warning: ISO C++11 requires whitespace after the macro name
 #define STAN__SERVICES__COMMAND_HPP#include <boost/integer/integer_log2.hpp>
...
...
/usr/local/lib/R/site-library/RcppEigen/include/Eigen/src/Core/arch/SSE/PacketMath.h:61:40: warning 
 Error in sink(type = "output") : invalid connection 

But the binder dockerfile works locally

This binder project is based on a custom Dockerfile (in the repo), so on my local machine I can cd into my local clone of the repo and do

docker build -t hellobinder_with_rstan_test .

and then run it locally with

docker run -p 8888:8888 hellobinder_with_rstan_test:latest

That launches, and I can open RStudio, run my hello.R, and it all works.
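One way to probe the RAM hypothesis locally is to run the same image under a memory cap. The following is only a sketch: `--memory` and `--memory-swap` are standard `docker run` flags, but `capped_run_cmd` is a hypothetical helper name, and the `1g` value is a placeholder to adjust to whatever limit the hosted service actually applies.

```shell
# Image name taken from the build command above.
IMAGE="hellobinder_with_rstan_test:latest"

# Hypothetical helper: print a docker run command with a hard memory cap.
# $1 is a size string that docker understands, e.g. 1g (assumed default).
capped_run_cmd() {
  limit="${1:-1g}"
  # Setting --memory-swap equal to --memory stops the container from
  # spilling into swap once it reaches the cap.
  printf 'docker run --memory=%s --memory-swap=%s -p 8888:8888 %s\n' \
    "$limit" "$limit" "$IMAGE"
}

# Print the command; to actually launch it: eval "$(capped_run_cmd 1g)"
capped_run_cmd 1g
```

If hello.R fails to compile in the capped container in the same way it does on binder, that would support the RAM explanation; if it still compiles, something else is probably going on.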

Conclusion?

Compiling a Stan model's C++ code uses a lot of RAM, even for tiny models such as the one in my example. I assume I hit some RAM threshold for my binder project and that this interrupted the compilation. From this, I am tentatively concluding that getting binder to run even tiny Stan models is just not going to work because of the RAM demands.
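A possible way to check the threshold directly, sketched here under assumptions: from a terminal inside the running session, read the memory limit that the container's cgroup reports. The two paths are the usual cgroup v2 and v1 locations; whether either is readable inside a given binder session is an assumption, and `container_mem_limit` is a hypothetical helper name.

```shell
# Hypothetical helper: print the memory limit the container's cgroup reports.
# Prints a byte count, "max" (cgroup v2, no limit), or "unknown" if neither
# file is readable.
container_mem_limit() {
  for f in /sys/fs/cgroup/memory.max \
           /sys/fs/cgroup/memory/memory.limit_in_bytes; do
    if [ -r "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  echo "unknown"
}

container_mem_limit
```

A value far below what the local machine offers would be consistent with the compiler being killed for exceeding the limit.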

Would that be a safe conclusion to draw?

Hi, your conclusion is correct. We limit instances on mybinder.org to 1GB of RAM since this is a free service. An alternative option is to run on the GESIS version of mybinder, which has substantially more RAM: https://notebooks.gesis.org/ (you get a further increase by registering for a free account).


Hi @sgibson91, thanks for letting me know that my diagnosis was in fact correct. I was a bit doubtful because in this post, @betatim suggested the limit might be 3GB, and 3GB seemed like plenty for the tiny task I was attempting. But if it is in fact 1GB, that explains a lot.

Thanks for the GESIS tip. I was not aware of that option.

Just to supply some context: when teaching Bayesian methods using Stan/brms, a very persistent issue is that students run into problems installing the C++ toolchain, and many give up at that point. I was hoping to provide binder-based examples with small demos, which would be perfectly sufficient for teaching introductory topics. I will try my luck with GESIS notebooks now.

Ah yes, that post might be slightly out of date now. The current config deployed onto mybinder.org, available here, is 1GB.

Hope GESIS works for you! If I remember correctly, you are allocated 32GB with an account (@MridulS ?) - all your students would need to sign up though.

Yeah, the authenticated side of GESIS notebooks currently offers up to 32GB RAM + 3 vCPUs for every user.
Another thing to keep in mind is that we run an augmented version of mybinder.org (persistent binderhub), so changes made in the notebooks will persist for students who sign up on notebooks.gesis.org :slight_smile:


@Mark_Andrews

Just tested it, and it (still) runs :wink: See arnim/RStan-Binder on GitHub: Files for running RStan on Binder
