RStudio (or OpenRefine?) as default interactive application on empty repo


#1

I’m working on integrating repo2docker into Whole Tale (WT) and have some additional questions about RStudio
and other non-Jupyter interactive applications. This is tangential to my comment in repo2docker/#533.

Currently, WT treats the Jupyter and RStudio (and OpenRefine) environments separately. We provide base images for each and when the user selects their default interactive environment, they start from the associated image – even on an empty “repo”. WT supports the use case of a user starting from a blank workspace with their interactive environment of choice.

In the case of RStudio, repo2docker only installs it if it detects an R buildpack file (runtime.txt, DESCRIPTION, Stencila context, etc.). Given an empty repo, repo2docker provides a Jupyter environment by default. While I can override the command at runtime (per issue #533), I don’t appear to be able to tell repo2docker to install RStudio by default. The OpenRefine example uses a postBuild script.

I’m not sure how best to approach this in WT. A couple of options come to mind:

  1. When the user creates a new “Tale” (our equivalent of the Binder repo), we could initialize it with appropriate buildpack files on their behalf – maybe the runtime.txt for R and postBuild script for OpenRefine.
  2. Modify repo2docker to enable alternative default interactive applications (also somewhat related to #545 and #546) – this would require some option for installing by default. At this point, I think Jupyter, RStudio, OpenRefine and noVNC/Xpra are our main cases – with Jupyter and RStudio clearly dominant.
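For illustration, option 1 could look roughly like the sketch below. It is only a sketch under assumptions: the `SEED_FILES` mapping, the `init_tale_workspace` helper, the R snapshot date, and the postBuild body are all hypothetical placeholders, not anything WT or repo2docker ships.

```python
from pathlib import Path

# Hypothetical mapping from the environment a user picks to the files we seed.
# The snapshot date and the postBuild body are placeholders, not recommendations.
SEED_FILES = {
    "jupyter": {},  # repo2docker already defaults to Jupyter on an empty repo
    "rstudio": {"runtime.txt": "r-2018-12-01\n"},
    "openrefine": {
        "postBuild": "#!/bin/bash\n# install OpenRefine here (see the OpenRefine example)\n"
    },
}


def init_tale_workspace(workspace, environment):
    """Seed an empty workspace with buildpack files; return the paths created."""
    workspace = Path(workspace)
    workspace.mkdir(parents=True, exist_ok=True)
    created = []
    for name, content in SEED_FILES[environment].items():
        target = workspace / name
        if target.exists():
            # Never overwrite a file the user already provided.
            continue
        target.write_text(content)
        if name == "postBuild":
            # postBuild is run as a script, so mark it executable.
            target.chmod(0o755)
        created.append(target)
    return created
```

Because the files end up in the workspace itself, the resulting Tale would build the same way on any stock repo2docker.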

I’m open to the idea of WT implementing the repo2docker reproducibility best practices and always creating a “pinned” environment, which means option 1 is a reasonable approach. I can also see this as a case for expanding repo2docker to enable alternative default interactive environments.


#2

One thing you can do is adjust the Repo2Docker.default_buildpack configuration to be a different buildpack when no environment specification is found. You may find, however, that some buildpacks other than the default make assumptions that certain files are present. This would be a bug, I think.
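As a rough sketch, such an override would live in a repo2docker config file, which is a traitlets config file written in Python. The choice of `RBuildPack` here is only an example, and as noted it may assume files that an empty repo does not have:

```python
# repo2docker_config.py
# Sketch only: swaps the fallback buildpack used when no environment
# specification is found. RBuildPack is an example choice and may not
# work on an empty repo.
from repo2docker.buildpacks import RBuildPack

c = get_config()  # noqa: F821 -- provided by traitlets when the file is loaded

c.Repo2Docker.default_buildpack = RBuildPack
```

The file would be passed with `repo2docker --config repo2docker_config.py <repo>`.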

Deployment-configurable defaults in repo2docker are tricky. On the one hand, they make sense for an audience-specific deployment (it’s annoying for all your users to have to specify openrefine if everyone is using that); on the other, they’re a nightmare for portability (this repo worked on WT repo2docker, but not mybinder.org repo2docker).

So I’d probably aim for option 1 and look for a way in repo2docker to make that as easy as possible.


#3

Thanks @minrk. I’ve been looking at default_buildpack and agree that having a WT-specific configuration would likely create more problems. Aside from portability, for reproducibility/provenance we’d need to include a reference to the custom configuration in the published artifact. Trying to stick to a common configuration seems like the best approach.

What about allowing the repo itself to declare that the default application should be RStudio, in addition to the urlpath parameter in the badge/link? For example, https://github.com/binder-examples/rocker might open RStudio by default when the repo URL is pasted into mybinder.org.
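For reference, this is roughly how the existing urlpath approach builds a launch link today. The `build_binder_url` helper is hypothetical, and the `master` ref is an assumption about the example repo:

```python
from urllib.parse import urlencode


def build_binder_url(org, repo, ref, urlpath=None):
    """Build a mybinder.org launch URL for a GitHub repo.

    Hypothetical helper; `urlpath` selects the application opened on launch.
    """
    url = f"https://mybinder.org/v2/gh/{org}/{repo}/{ref}"
    if urlpath:
        url += "?" + urlencode({"urlpath": urlpath})
    return url


# The ref "master" is an assumption about the example repo.
print(build_binder_url("binder-examples", "rocker", "master", urlpath="rstudio"))
```

The limitation is that the choice lives in the link rather than in the repo, so pasting the bare repo URL into mybinder.org still opens Jupyter.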


#4

That’s possible. We’ll have to think about whether that can be truly generic (setting an arbitrary launch command) or whether it should pick from a supported list.