Guix-Jupyter: Towards self-contained, reproducible notebooks

dhirschfeld · January 11, 2020, 1:00am

It seems to be conflating two separate things - notebooks / analytics scripts and packaging the environments required to run them.

conda is the best tool we have for creating reproducible analytics environments and, as you point out, containers solve the “system software” issue.

“[conda] is generally not very good at reproducing software environments at different points in time or space”

In your linked tweet, it sounds like the environment wasn’t saved with explicit specs. Whilst it’s appropriate to (optimistically) pin your dependency versions loosely in your meta.yaml, to ensure reproducibility of an environment you should export an explicit env-spec.txt which pins down the exact dependency versions, build numbers and even channels.
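As a rough illustration of what an explicit spec pins down: conda can write such a file with `conda list --explicit > env-spec.txt` and recreate the environment from it with `conda create --name myenv --file env-spec.txt` (the name `myenv` is just a placeholder). The sketch below assumes the usual layout of that file, one package URL per line after an `@EXPLICIT` marker, and splits each URL into channel, platform, name, version and build. The `PinnedPackage` type, the `parse_explicit_spec` helper and the two example URLs are made up for illustration and are not part of conda.

```python
from typing import List, NamedTuple
from urllib.parse import urlparse


class PinnedPackage(NamedTuple):
    channel: str  # e.g. "conda-forge" or "pkgs/main"
    subdir: str   # platform directory, e.g. "linux-64" or "noarch"
    name: str
    version: str
    build: str


def parse_explicit_spec(text: str) -> List[PinnedPackage]:
    """Parse the package URLs of an explicit conda spec file (hypothetical helper)."""
    packages = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and the "@EXPLICIT" marker line.
        if not line or line.startswith(("#", "@")):
            continue
        url = line.split("#", 1)[0]  # drop a trailing "#<md5>" if present
        parts = urlparse(url).path.strip("/").split("/")
        filename, subdir, channel = parts[-1], parts[-2], "/".join(parts[:-2])
        for ext in (".tar.bz2", ".conda"):
            if filename.endswith(ext):
                filename = filename[: -len(ext)]
                break
        # File names follow "<name>-<version>-<build>"; the name itself may contain dashes.
        name, version, build = filename.rsplit("-", 2)
        packages.append(PinnedPackage(channel, subdir, name, version, build))
    return packages


# Illustrative input only: these two entries are invented for the example.
EXAMPLE = """\
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/numpy-1.17.3-py37h95a1406_0.tar.bz2
https://repo.anaconda.com/pkgs/main/noarch/python-dateutil-2.8.1-py_0.tar.bz2#0123456789abcdef0123456789abcdef
"""

for pkg in parse_explicit_spec(EXAMPLE):
    print(pkg)
```

Every field a loosely pinned meta.yaml leaves open (exact version, build string, channel) is fixed by the URL itself, which is why the explicit file can be replayed later without re-running the solver.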

Creating a docker container with this environment ensures replicability and publishing the explicit env-spec.txt allows the environment to be reproduced locally.

Our internal CI/CD automatically builds docker images and as part of that bakes in the env-spec.txt for the environment so that it’s always available. In the case of web-app containers the env-spec.txt is made available on a /api/env-spec endpoint.
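For anyone wanting to try the same idea, here is a minimal sketch of such an endpoint using only the Python standard library. It is not our actual implementation: the file location `env-spec.txt`, the port, and the names `env_spec_response`, `EnvSpecHandler` and `serve` are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Assumed location of the spec file baked into the image at build time.
SPEC_PATH = Path("env-spec.txt")


def env_spec_response(path: str, spec_file: Path = SPEC_PATH):
    """Return (status, body) for a GET request to `path`."""
    if path != "/api/env-spec":
        return 404, b"not found\n"
    if not spec_file.is_file():
        return 500, b"env-spec.txt is missing from this image\n"
    return 200, spec_file.read_bytes()


class EnvSpecHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = env_spec_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve the endpoint until interrupted (call this from the container entry point)."""
    ThreadingHTTPServer(("", port), EnvSpecHandler).serve_forever()
```

With `serve()` running, a GET request to `/api/env-spec` returns the file verbatim, so the exact environment of a deployed container can be recovered without shelling into it.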

Packaging is complicated, but that can be alleviated by automation, exactly as is done with Binder. Other than not listing the explicit specs for the environment, I’m not sure what reproducibility issues Binder doesn’t solve.

“Last but not least, we still haven’t solved the core issue, which is that notebooks are not self-contained: they do not describe the dependencies they need.”

I think this is where we disagree: I don’t think they should. IMHO that’s the job of a proper package manager (conda) and a package specification DSL (meta.yaml).
