One of the things most people complain about in notebooks is the order of cell execution. I was wondering how the community and contributors feel about this; I imagine it would be a very long-term feature to implement. Inspiration could be drawn from the people at Hex (who just released version 2.0) or from Pluto in Julia.
There is some exploration in this direction:
- by @davidbrochart in reactive akernel GitHub - davidbrochart/akernel: Asynchronous, reactive Python Jupyter kernel. (see Jupyter Community Call - August 31, 2021 - YouTube)
- discussion in notebook repo: Cell dependency graph · Issue #1175 · jupyter/notebook · GitHub
- it was also brought up in January at a JupyterLab meeting: Weekly Dev Meetings: Jan-Jul 2021 · Issue #117 · jupyterlab/team-compass · GitHub; there are links to older exploration projects in there
- one more: Discussion on implementing directed acyclic graph-like structure in notebooks · Issue #118 · jupyterlab/team-compass · GitHub
Thanks Michal! I know this has pros and cons, of course: say you update the grid for a hyperparameter search; this will trigger cross-validation again. Most of these DAG notebooks do offer ways to switch this off, so I guess most of the work for this feature would be providing a pleasant and functional UI. This seems more like something for JupyterLab 5, while for now it would be better to focus on things like inline variable insertion.
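To make the hyperparameter example concrete, here is a minimal sketch of how a reactive notebook might decide what to re-run. It is only an illustration: the cell names, the `CELLS` table, and the `frozen` switch are hypothetical and are not taken from Hex, Pluto, or any Jupyter project.

```python
from collections import deque

# Hypothetical cells: name -> (variables read, variables written).
CELLS = {
    "grid": (set(), {"param_grid"}),
    "data": (set(), {"X", "y"}),
    "cv": ({"param_grid", "X", "y"}, {"scores"}),
    "plot": ({"scores"}, set()),
}


def downstream(changed, cells, frozen=frozenset()):
    """Return the cells that must re-run after `changed` is edited.

    Cells listed in `frozen` are skipped and do not propagate staleness,
    which models "switching off" reactivity for an expensive cell.
    """
    stale = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        written = cells[current][1]
        for name, (reads, _) in cells.items():
            if name in seen or name in frozen:
                continue
            if reads & written:  # this cell consumes something that just changed
                seen.add(name)
                stale.append(name)
                queue.append(name)
    return stale


print(downstream("grid", CELLS))                 # cells triggered by editing the grid
print(downstream("grid", CELLS, frozen={"cv"}))  # nothing re-runs while "cv" is frozen
```

Editing the grid marks the cross-validation cell and everything after it as stale, while freezing the cross-validation cell stops the cascade. The UI question is then mostly about how to expose that switch.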
I am currently trying to write a JavaScript extension that observes a Jupyter Notebook and builds a DAG. However, it already seems to be hard to get and evaluate the output of a single cell:
Maybe that is one of the reasons why other implementations of observable notebooks are based on extra kernels?
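For comparison, here is a rough sketch of what the kernel-side approach could look like, assuming Python cells and purely static analysis with the standard `ast` module. I do not know whether any of the existing kernels work this way; `names_in_cell` and `build_dag` are hypothetical helpers. One known limitation: a cell such as `x = x + 1` both reads and writes `x`, and this sketch treats it as having no external input.

```python
import ast
import builtins


def names_in_cell(source):
    """Return (defined, used) variable names for one cell's source code."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    # Names defined in the same cell, or provided by builtins, are not inputs.
    used -= defined | set(dir(builtins))
    return defined, used


def build_dag(cells):
    """Map each cell index to the set of earlier cell indices it depends on."""
    last_writer = {}  # variable name -> index of the latest cell defining it
    dag = {}
    for index, source in enumerate(cells):
        defined, used = names_in_cell(source)
        dag[index] = {last_writer[name] for name in used if name in last_writer}
        for name in defined:
            last_writer[name] = index
    return dag


cells = [
    "import math\nr = 2",
    "area = math.pi * r ** 2",
    "print(area)",
]
print(build_dag(cells))
```

Doing this in the kernel avoids the round trip through cell outputs, because the dependency information comes from the source text itself rather than from what the front end can observe.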
Related:
Jupyter:
- Cell dependency graph · Issue #1175 · jupyter/notebook · GitHub
- javascript - How to use events in jupyterlab extensions? - Stack Overflow
ObservableHQ:
- Noob: Can you run your own Observable server - #37 by stefaneidelloth - Community Help - The Observable Forum
- An Unofficial ObservableHQ Compiler 💻 / Alex Garcia | Observable
Starboard:
Overview of related JupyterLab projects:
- nbsafety
GitHub - nbsafety-project/nbsafety: Fearless interactivity for Jupyter notebooks. (last change yesterday)
- akernel
GitHub - davidbrochart/akernel: Asynchronous, reactive Python Jupyter kernel. (last change 1 month ago)
=> Does not work on Windows, yet
- ipyspaghetti
GitHub - cphyc/ipyspaghetti (last change 8 months ago)
Needs to be installed from source: Publish on pypy · Issue #4 · cphyc/ipyspaghetti · GitHub
Did not get it working.
- dfkernel
GitHub - dataflownb/dfnotebook-extension: The dataflow notebook extension for JupyterLab (last change a year ago)
=> ValueError: The extension “@dfnotebook/dfnotebook-extension” does not yet support the current version of JupyterLab.
- reactivepy
GitHub - jupytercalpoly/reactivepy: A reactive Python kernel (last change 3 years ago)
=> Output of cells is not shown
=> Not actively maintained, also see Similar project: akernel · Issue #37 · jupytercalpoly/reactivepy · GitHub
Update to the JavaScript extension for JupyterLab: Accessing the output of a given cell works now.
Update to akernel: I could resolve my issue on Windows by uninstalling the package zmq.
Update to ipyspaghetti: I could resolve my installation issue on Windows after activating the Windows developer mode (requires admin rights):
By the way, I realised this might be super useful for things like Sphinx and Jupyter Book, as it would not force re-execution of the whole thing when only some details change. Am I wrong, perhaps? I am writing my final dissertation in Jupyter Notebooks that I convert with Jupytext to MyST format (because I need to set some metadata), and every time something minor changes, like a header, the whole MyST notebook needs to be re-executed when I re-export the book.
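As a rough illustration of the idea (not how Sphinx, Jupyter Book, or Jupytext actually decide this), one could fingerprint only the code cells and skip execution when that fingerprint is unchanged. The sketch assumes cells are dicts with `cell_type` and `source` keys, as in the notebook format, with `source` already joined into a single string; `code_fingerprint` and `needs_execution` are hypothetical helpers.

```python
import hashlib


def code_fingerprint(cells):
    """Hash only the code cells, so edits to markdown do not change the result."""
    digest = hashlib.sha256()
    for cell in cells:
        if cell["cell_type"] == "code":
            digest.update(cell["source"].encode("utf-8"))
            digest.update(b"\x00")  # separator so cell boundaries matter
    return digest.hexdigest()


def needs_execution(cells, cached_fingerprint):
    """Return True when the code differs from what was last executed."""
    return code_fingerprint(cells) != cached_fingerprint


before = [
    {"cell_type": "markdown", "source": "# Results"},
    {"cell_type": "code", "source": "x = 1"},
]
header_changed = [
    {"cell_type": "markdown", "source": "# Final results"},
    {"cell_type": "code", "source": "x = 1"},
]
cached = code_fingerprint(before)
print(needs_execution(header_changed, cached))  # only the header changed
```

A DAG-aware version could go one step further and keep one fingerprint per cell, so that only the changed cell and the cells depending on it are re-run.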
We have internally developed a DAG-based execution engine as a JupyterLab extension. A free community version was just released. Please visit http://link.makinarocks.ai/download.html to download the whl files and JupyterLab Desktop bundles (you’ll need your email address to receive the product key.)
The composed graph can be executed directly or exported as a Kubeflow pipeline.
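For readers wondering what executing a cell graph means in general, here is a generic sketch using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). It is not the extension's implementation; the `GRAPH` and `SOURCES` tables and the `run_graph` helper are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each cell name maps to the cells it depends on.
GRAPH = {
    "load": set(),
    "clean": {"load"},
    "train": {"clean"},
    "report": {"train", "clean"},
}

# Hypothetical cell sources keyed by the same names.
SOURCES = {
    "load": "raw = [3, 1, 2]",
    "clean": "data = sorted(raw)",
    "train": "model = sum(data) / len(data)",
    "report": "summary = (data, model)",
}


def run_graph(graph, sources):
    """Execute every cell once, dependencies first, in a shared namespace."""
    namespace = {}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        exec(sources[name], namespace)
    return order, namespace


order, namespace = run_graph(GRAPH, SOURCES)
print(order)
print(namespace["summary"])
```

The same ordering is what a pipeline export would need: each node becomes a step, and the edges become the step dependencies.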