On Friday we’re running our first #TuringWay Binder workshop.
I’m leading the section on “reproducible computational environments”. The plan is:
- 30 min talk on why sharing code and data isn’t enough and some of the different components of a computational environment (hardware, software, package versions etc)
- 30-40 min small group discussions around (ideally) 5 example binder-ised repos that have two branches with something changed about their computational environment so that the identical code in each branch gives different outputs
- 20 min talk on what Binder is, a little on how it works and why it’s useful
This post is asking for help coming up with these “paired” examples.
We already have one that @sgibson91 made: https://github.com/binder-examples/matplotlib-versions. The mpl-v1.5 branch gives different-looking output from the mpl-v2.0 branch. The code is the same, but all the default plotting settings were updated, so it would be hard, for example, to reproduce the exact figures from a paper that used the old version of matplotlib.
I have a couple of other ideas:
- integer division in python 2 vs python 3
- a command being refactored between different versions of sklearn so that code doesn’t run with an updated version
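
To make the integer division idea concrete, here is a rough sketch of what the shared script could look like. The function names and the sample data are made up for illustration, and `mean_py2_style` is only a stand-in that mimics the Python 2 result under Python 3; in the real paired repo both branches would hold the identical `mean` function and only the Python version in the environment file would differ.

```python
import sys


def mean(values):
    # Python 3: `/` is true division, so two ints can give a float.
    # Python 2: `/` on two ints floors the result (unless the script
    # opts in with `from __future__ import division`).
    return sum(values) / len(values)


def mean_py2_style(values):
    # Hypothetical helper, not part of the exercise itself: `//` reproduces
    # under Python 3 what `/` returns for two ints under Python 2.
    return sum(values) // len(values)


if __name__ == "__main__":
    data = [1, 2]  # made-up sample data
    print("Python version:", sys.version_info[:2])
    print("mean(data):", mean(data))
    print("what Python 2 would report:", mean_py2_style(data))
```

Printing the interpreter version next to the result gives the groups a clue to find without handing them the answer.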
But I’d really like to have:
- A cleverer one, ideally that only happens when you combine different packages
- one that uses R or another non-python language
Any suggestions would be suuuuper helpful.
The way I imagine the exercise working is that we don’t tell the groups what the difference is; we let them try the different branches and figure it out. I think it will be quite fun, but please throw up any red flags if you’ve tried this and seen it fail miserably.
Thanks so much in advance