Repo2docker/mybinder.org as part of data/code publishing guidelines

betatim · February 3, 2019, 2:55pm

Should Binder try and push to get itself included in pages that make recommendations to authors/researchers on how to archive their code?

Someone pointed me at https://gatesopenresearch.org/for-authors/data-guidelines and I thought we should start conversations about if/how we could get repo2docker/mybinder.org included in that list.

Who could we reach out to in order to start talking about this?

KirstieJane · February 3, 2019, 3:39pm

I think this is a great idea! I know Iain Hrynaszkiewicz (https://researchdata.springernature.com/users/11717-iain-hrynaszkiewicz) and I think he would be a great person to chat with.

I also think Elizabeth DuPre (https://elizabeth-dupre.com/#/) is thinking about projects related to computational environment sharing. I’ll ping this to her via DM.

And also, one of the big messages of The Turing Way (https://github.com/alan-turing-institute/the-turing-way) is to promote Binder! So I’ll keep an eye here too and make sure we’re cross-pollinating in a sensible way.

KirstieJane · February 3, 2019, 4:39pm

I’ve also started a Twitter thread tagging a bunch of funders/journals: https://twitter.com/kirstie_j/status/1092089383811989505?s=21

(Pinning here so I don’t lose it in the future!)

choldgraf · February 4, 2019, 5:53am

I think we should - and that this needs to be a specific push from our angle. @KirstieJane if y’all (or anybody else you know of) are interested in thinking specifically about Binder in the context of open, reproducible, and sharable workflows, perhaps it’d be worth a group brainstorm?

betatim · February 4, 2019, 7:32am

One thing I realised already:

For data we went through a phase of "publish your data openly" where everyone just sent their spreadsheets or weirdly formatted data to some kind of archive. After that came FAIR and research data management plans. I think it is fair to say that the data being published with papers now is vastly more useful than it was back then.

Maybe this conversation should also focus on the “how to usefully publish your code and the environment it runs in” (like the FAIR guides for data)?

You can already publish your code to Zenodo (or other places like it) as an archive, but there doesn’t seem to be anything as clear and well known as the FAIR guides for data that explains how to do that usefully. There is even less information on how to share the environment in a way that lets others reproduce it.

psychemedia · February 4, 2019, 10:29am

As well as drawing on best practice from data sharing principles, it might also be worth looking at how pre-existing tooling in this space sells itself, to what extent it already provides a practical way of implementing any mooted guidelines, and where it falls short.

eg things like:

tools for rebuilding computational environments in general in docker or VMs: repo2docker/dockter/source2image, devops tooling (Ansible, Puppet, Vagrant);
tools for reproducing environments within a particular programming language environment: Python watermark package, or R’s packrat (and probably others…).
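
To make the second category a bit more concrete, here is a minimal sketch of the kind of thing such tools record: the Python version plus pinned versions of the packages an analysis depends on. It uses only the standard library (`importlib.metadata`, Python 3.8+) and is not how watermark or packrat are actually implemented. The function names and the example package list are made up for illustration.

```python
import platform
from importlib import metadata


def pinned_requirements(packages):
    """Return 'name==version' lines for each package that is installed.

    Packages that cannot be found are kept as a comment line so the
    gap is visible instead of being silently dropped.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed in this environment")
    return lines


def environment_report(packages):
    """Build a requirements-style report with interpreter and OS details on top."""
    header = [
        f"# python {platform.python_version()}",
        f"# platform {platform.platform()}",
    ]
    return "\n".join(header + pinned_requirements(packages))


if __name__ == "__main__":
    # Hypothetical package list; replace with what your analysis imports.
    print(environment_report(["pip", "numpy"]))
```

Because the header lines are comments, the output can be saved as a `requirements.txt` at the top of a repository, which is one of the files repo2docker looks for when building an environment. Whether that level of pinning is enough for long-term reproducibility is exactly the kind of question guidelines would need to answer.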
