Interactive exploration without sacrificing reproducibility

Hi all, I’m interested in the ability of widgets to replicate the kinds of interactivity that researchers are able to achieve in desktop data exploration tools: drilling down into the records that underlie some statistic or visual element, etc. However, I’m aware that doing so loses the reproducibility that notebooks otherwise offer: the interactions in data exploration are ephemeral and so cannot be reproduced.

I had thought that interactive tools could generate code that would allow exploratory interactions to be reproduced.
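To make the idea concrete, here is a minimal sketch of what I have in mind. It is not taken from any existing tool: `InteractionLog`, `record_filter`, `record_sort` and `to_code` are hypothetical names, and the widget callbacks that would call them are assumed. Each interaction is stored as a line of pandas code that can later be pasted into a cell.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionLog:
    """Hypothetical recorder: each GUI action is stored as a line of pandas code."""

    source_name: str            # expression that refers to the input DataFrame
    result_name: str = "view"   # variable name used in the generated code
    steps: list = field(default_factory=list)

    def record_filter(self, column, op, value):
        # Would be called from an (assumed) widget callback when a filter is applied.
        if op not in {"==", "!=", "<", "<=", ">", ">="}:
            raise ValueError(f"unsupported operator: {op}")
        r = self.result_name
        self.steps.append(f"{r} = {r}[{r}[{column!r}] {op} {value!r}]")

    def record_sort(self, column, ascending=True):
        # Would be called when the user clicks a column header to sort.
        r = self.result_name
        self.steps.append(f"{r} = {r}.sort_values({column!r}, ascending={ascending})")

    def to_code(self):
        # Start from a copy so replaying the code never mutates the input data.
        header = f"{self.result_name} = {self.source_name}.copy()"
        return "\n".join([header, *self.steps])


log = InteractionLog(source_name="df")
log.record_filter("year", ">=", 2020)
log.record_sort("score", ascending=False)
generated = log.to_code()
print(generated)
```

The generated text only uses ordinary pandas indexing and `sort_values`, so the exploration can be replayed without the widget being installed.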

I have recently found jupyter-bifrost, which claims to do something similar, providing exploration “without sacrificing the reproducibility of code. Changes made in the Bifrost GUI are automatically translated into Pandas Queries, allowing developers to jump back into scripting whenever it is most convenient.” So far, I’m struggling to get this tool to work.

Please let me know if you know of related discussions or implementations of widgets that trace their interactions in the form of code that can then be inserted into other cells. Other thoughts and critiques are also welcome!

Thanks


There is Mito for spreadsheets: https://trymito.io/


Interesting, thanks @krassowski! So, it seems Mito:

  • captures the modifications applied to the data using the interactive widget.
  • captures derived views (e.g. pivot tables) and assigns them, as DataFrames, to new local variable names.
  • creates a new cell to put its generated code in when it has a first modification/view to report, and is able to both append to and update portions of that cell’s code.
  • inspects the frame in which it is called to identify the expression that it should use to refer to the input data in generated code (how R-like!)

On the implementation side, I’ve not found a repository for Mito, though the code is licensed under BSD 3-clause. I would like to see its implementation for managing the state of generated code pulled out as a reusable ipywidgets extension…

To provide some clarity: I’m part of a project looking to develop reproducible interaction functionality for the applied text analytics space.

I’ve also just seen bamboolib (https://bamboolib.8080labs.com/), which was recently acquired by Databricks.