Best practices re: reusable code

I am new to Jupyter notebook-based programming and I am looking for advice on best practices for making Jupyter code reusable.

Specifically, we have one notebook which preps and cleans a bunch of data into a pandas data frame, and then we run a lot of different experiments with that prepped data. Each of the experiments begins with a cell that starts with:

%run -i hh_data_prep.ipynb

This has worked OK, but as notebooks begin to pile up in our folder, I wanted to start organizing them in subdirectories, and this is where I’m running into trouble.

In a sub-directory, if I do something like this:

%run -i …/hh_data_prep.ipynb

Or use fancier ways to build the path, it never works right. Jupyter notebooks don’t seem to have a strong grasp of which directory they are in relative to other notebooks, maybe? In my subdirectory, if I ask Python to print the directory it thinks we’re in, it prints the directory above me Lol : (

Here is where I’m kind of begging for advice – I’m a newb and it’s entirely possible there’s something easy I’m just not grasping here – My questions to the community are basically:

A) Can you think of a way around the problem I’ve described?

B) Is there a better strategy for re-using code in Jupyter? Am I using a dumb, clunky approach where there is something better to do instead?

Thank You!! : )


We built importnb for this purpose. It lets .ipynb (and, crucially, .py) files import .ipynb files in all kinds of nasty ways, and has been mentioned a bunch of times here. There are other tools out there that do file-based transformation of notebooks, but we like using the notebooks directly, and by using Python’s extensible import system, we can fool most tools without any special cases… though mypy needs a bit of care.

One caveat: we don’t yet have a 3.12-compatible release out… we’re hoping to remove all 3.7-specific dependencies, and get mypy type checking working, too!

In a folder like:

repo/
  the utility notebook.ipynb
  a demo notebook.ipynb

In a demo notebook.ipynb:

import importnb
with importnb.Notebook():
    from the_utility_notebook import the_utility
the_utility()

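Note that the above uses simple string mangling to match the_utility_notebook to the file name, and abuses the PYTHONPATH=. opinion of IPython, which puts the working directory on the import path.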
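If the notebooks move into subdirectories, the working directory is no longer the folder that holds the shared notebook, so that folder has to get onto sys.path some other way. A minimal sketch, assuming the project root can be recognized by a marker file (pyproject.toml here is only an example; find_project_root and add_to_sys_path are hypothetical helper names, not part of importnb):

```python
import sys
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker!r} found at or above {start}")


def add_to_sys_path(directory: Path) -> None:
    """Put `directory` at the front of sys.path, unless it is already listed."""
    entry = str(directory)
    if entry not in sys.path:
        sys.path.insert(0, entry)


# In a notebook that lives in a subdirectory, something like this would go
# in the first cell (left commented out because it depends on your layout):
# add_to_sys_path(find_project_root(Path.cwd()))
```

With the root on sys.path, an import inside the importnb.Notebook() block should be able to find notebooks stored there, whichever directory the kernel started in.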

However, importnb can also find notebooks inside python modules. So with a little more structure:

pyproject.toml
my_package/
  __init__.py  # empty, but gotta have it, PEP 420 be damned :)
  utilities/
    the utility notebook.ipynb
  demos/
    a demo notebook.ipynb

After python -m pip install -e ., the demo notebook can:

import importnb
with importnb.Notebook():
    from ..utilities import the_utility_notebook # or my_package.utilities
the_utility_notebook.the_utility()

The additional value of making a package-of-notebooks is that __init__.py can use importnb to expose specific utilities into a nice public API, where the consumer would be none the wiser… provided the notebooks do a good job of not over-using only-works-in-ipython features.
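One way to keep a utility notebook import-friendly is to put plain functions at the top level and keep the interactive poking-around behind a __name__ guard. A minimal sketch of what the cells of the utility notebook might contain; the body of the_utility is hypothetical, and this assumes the notebook gets its module name as __name__ when imported, as with any regular Python import:

```python
def the_utility(rows):
    """Hypothetical body: drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


if __name__ == "__main__":
    # Runs when the notebook is executed directly in a kernel (where
    # __name__ is "__main__"), but is skipped when it is imported.
    sample = [(1, "a"), (2, None), (3, "c")]
    print(the_utility(sample))
```

Anything that only works in IPython (magics, rich display calls) would live under the guard too, so consumers of the package never hit it.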


AWESOME!! This helps us a LOT – I’ll try this out today – Thank You!! : )
