DAG-based notebooks?

One of the things people complain about most with notebooks is the order of cell execution. I was wondering how the community and contributors feel about this; I imagine it would be a very long-term feature to implement. Inspiration could be drawn from the people at Hex (who just released version 2.0) or from Pluto in Julia.


There is some exploration in this direction:


Thanks Michal! I know this has pros and cons, of course. Say you update the grid for a hyperparameter search: this will trigger cross-validation again. Of course, most of these DAG notebooks offer ways to switch this off. I guess most of the work for this feature would be providing a pleasant and functional UI. This seems more like something for JupyterLab 5, while for now it would be better to focus on things like inline variable insertion.
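To make the hyperparameter example concrete, here is a minimal sketch of how a reactive notebook might decide what to re-run after an edit, including a "frozen" switch that stops propagation. The cell names, the `edges` layout and the `frozen` parameter are all made up for illustration; this is not how any particular tool implements it.

```python
from collections import deque


def cells_to_rerun(edges, changed, frozen=()):
    """Return the cells to re-run after `changed` is edited, in dependency order.

    edges:  dict mapping a cell to the cells that directly depend on it
            (hypothetical layout, assumed to be acyclic).
    frozen: cells the user has switched off; they are skipped and nothing
            is propagated through them.
    """
    frozen = set(frozen)

    # Collect the changed cell plus everything reachable from it.
    affected = {changed}
    queue = deque([changed])
    while queue:
        cell = queue.popleft()
        for child in edges.get(cell, []):
            if child not in affected and child not in frozen:
                affected.add(child)
                queue.append(child)

    # Kahn's algorithm, restricted to the affected cells.
    indegree = {cell: 0 for cell in affected}
    for cell in affected:
        for child in edges.get(cell, []):
            if child in affected:
                indegree[child] += 1

    ready = deque(sorted(c for c in affected if indegree[c] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for child in sorted(edges.get(cell, [])):
            if child in affected:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return order


# Hypothetical notebook: the grid and the data both feed cross-validation.
edges = {"grid": ["cv"], "data": ["cv"], "cv": ["report"]}
print(cells_to_rerun(edges, "grid"))                 # cv and report follow grid
print(cells_to_rerun(edges, "grid", frozen={"cv"}))  # only grid itself re-runs
```

With `cv` frozen, editing the grid no longer triggers cross-validation, which is roughly the "switch this off" behaviour mentioned above. The UI for choosing which cells to freeze is the part this sketch does not address.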

I am currently trying to write a JavaScript extension that observes a Jupyter Notebook and builds a DAG. However, it already seems hard to get and evaluate the output of a single cell:

Maybe that is one of the reasons why other implementations of observable notebooks are based on extra kernels?
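For comparison, a kernel-side approach could infer the dependencies from the cell source alone, without looking at any output. The following is a rough sketch using Python's standard `ast` module; the function names `cell_names` and `build_dag` are hypothetical, and the analysis is deliberately crude (it assumes each name is defined in only one cell and treats names assigned inside functions as if they were top-level).

```python
import ast


def cell_names(source):
    """Return (defined, used) names for one cell's source (rough approximation)."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def build_dag(cells):
    """cells: dict of cell id -> source. Returns cell id -> sorted dependent cell ids."""
    info = {cid: cell_names(src) for cid, src in cells.items()}

    # Map every name to the cell that defines it (last definition wins).
    definer = {}
    for cid, (defined, _) in info.items():
        for name in defined:
            definer[name] = cid

    edges = {cid: set() for cid in cells}
    for cid, (_, used) in info.items():
        for name in used:
            source_cell = definer.get(name)
            if source_cell is not None and source_cell != cid:
                edges[source_cell].add(cid)
    return {cid: sorted(deps) for cid, deps in edges.items()}


cells = {
    "a": "import math\nx = 2",
    "b": "y = math.sqrt(x)",
    "c": "print(y)",
}
print(build_dag(cells))
```

Builtins such as `print` have no defining cell and are simply ignored. A real implementation would also have to handle scoping, mutation, and names redefined in several cells.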

Related:

Jupyter:

ObservableHQ:

Starboard:


Overview of related JupyterLab projects:

  • nbsafety

GitHub - nbsafety-project/nbsafety: Fearless interactivity for Jupyter notebooks. (last change yesterday)

  • akernel

GitHub - davidbrochart/akernel: Asynchronous, reactive Python Jupyter kernel. (last change 1 month ago)

=> Does not work on Windows, yet

  • ipyspaghetti

GitHub - cphyc/ipyspaghetti (last change 8 months ago)

Needs to be installed from source: Publish on pypy · Issue #4 · cphyc/ipyspaghetti · GitHub

Did not get it working.

  • dfkernel

GitHub - dataflownb/dfnotebook-extension: The dataflow notebook extension for JupyterLab (last change a year ago)

=> ValueError: The extension “@dfnotebook/dfnotebook-extension” does not yet support the current version of JupyterLab.

  • reactivepy

GitHub - jupytercalpoly/reactivepy: A reactive Python kernel (last change 3 years ago)
=> Output of cells is not shown
=> Not actively maintained, also see Similar project: akernel · Issue #37 · jupytercalpoly/reactivepy · GitHub


Update to the JavaScript extension for JupyterLab: Accessing the output of a given cell works now.

Update to akernel: I resolved my issue on Windows by uninstalling the zmq package.

Update to ipyspaghetti: I resolved my installation issue on Windows by activating Windows developer mode (requires admin rights):


By the way, I realised this might be super useful for things like Sphinx and Jupyter Book, as it would not force re-execution of the whole thing when only some details change. Am I wrong, perhaps? I am writing my final dissertation in Jupyter Notebooks that I convert with Jupytext to MyST format (because I need to set some metadata), and every time something minor changes, like a header, the whole MyST notebook needs to be re-executed when I re-export the book.
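A cheaper approximation of the same idea, short of a full DAG, would be to fingerprint only the code cells and skip re-execution when that fingerprint is unchanged. Below is a minimal sketch that assumes the notebook is already loaded as an nbformat v4 style dict (a `cells` list with `cell_type` and `source` fields); `code_fingerprint` is a made-up helper, not part of any of the tools mentioned.

```python
import hashlib


def code_fingerprint(notebook):
    """Hash only the code cells, so Markdown edits do not change the result."""
    digest = hashlib.sha256()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            # The on-disk format may store the source as a list of lines.
            source = "".join(source)
        digest.update(source.encode("utf-8"))
        digest.update(b"\x00")  # separator so cell boundaries matter
    return digest.hexdigest()


before = {"cells": [
    {"cell_type": "markdown", "source": "# Old header"},
    {"cell_type": "code", "source": "x = 1"},
]}
after = {"cells": [
    {"cell_type": "markdown", "source": "# New header"},
    {"cell_type": "code", "source": "x = 1"},
]}

# Only the header changed, so re-execution could be skipped.
print(code_fingerprint(before) == code_fingerprint(after))
```

This only answers "did any code change at all"; deciding which individual cells need to re-run is where the DAG would come in.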

We have internally developed a DAG-based execution engine as a JupyterLab extension. A free community version was just released. Please visit http://link.makinarocks.ai/download.html to download the whl files and JupyterLab Desktop bundles (you’ll need your email address to receive the product key.)

The composed graph can be executed directly or exported as a Kubeflow pipeline.
