Jupyter Paths priority order

Hey folks,

A quick post to make one of the features from the latest releases of jupyter_core more visible. It might be useful to users who juggle multiple virtual environments on a day-to-day basis.

The jupyter --paths command shows where Jupyter looks for config and data files. It usually outputs something like the following:

$ jupyter --paths
config:
    /home/user/.jupyter
    /home/user/miniforge3/envs/myenvironment/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/user/.local/share/jupyter
    /home/user/miniforge3/envs/myenvironment/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/user/.local/share/jupyter/runtime

Here we see that the first entry on the list for the config files is /home/user/.jupyter. This means that all virtual environments share that same folder to store config files, for example JupyterLab settings and workspaces.

With recent releases of jupyter_core, it is now possible to set the JUPYTER_PREFER_ENV_PATH environment variable to change this order so the path to the virtual environment shows up first.
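To picture the effect, here is a minimal sketch of the reordering. It is not the jupyter_core implementation: search_order and prefer_env_path are hypothetical helpers, and the set of values treated as "off" is an assumption made for this sketch.

```python
import os

# Values treated as "off" in this sketch (an assumption, not taken from jupyter_core).
FALSY = {"", "0", "no", "n", "false", "off"}


def prefer_env_path(environ=None):
    """Return True if JUPYTER_PREFER_ENV_PATH looks enabled in `environ`."""
    environ = os.environ if environ is None else environ
    value = environ.get("JUPYTER_PREFER_ENV_PATH", "")
    return value.strip().lower() not in FALSY


def search_order(user_dir, env_dir, system_dirs, environ=None):
    """Hypothetical helper: put the environment directory first when the flag is on."""
    if prefer_env_path(environ):
        head = [env_dir, user_dir]
    else:
        head = [user_dir, env_dir]
    return head + list(system_dirs)


# Example with the config directories from the listing above.
user = "/home/user/.jupyter"
env = "/home/user/miniforge3/envs/myenvironment/etc/jupyter"
system = ["/usr/local/etc/jupyter", "/etc/jupyter"]

print(search_order(user, env, system, environ={}))
print(search_order(user, env, system, environ={"JUPYTER_PREFER_ENV_PATH": "1"}))
```

Only the first two entries swap; the system-wide directories keep their position at the end of the list.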

First, make sure you have jupyter_core>=4.7. For example, to upgrade with pip:

pip install -U jupyter_core
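If you would rather check the installed version from Python, something like the following could work. It is a rough sketch: version_tuple and supports_prefer_env_path are hypothetical helpers, the version parsing is deliberately naive, and it assumes Python 3.8+ for importlib.metadata.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Keep the leading numeric components, e.g. "4.8.0.dev0" becomes (4, 8, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def supports_prefer_env_path(version):
    """True if the version string meets the jupyter_core>=4.7 requirement."""
    return version_tuple(version) >= (4, 7)


def installed_jupyter_core_version():
    """Return the installed jupyter_core version string, or None if not installed."""
    try:
        return metadata.version("jupyter_core")
    except metadata.PackageNotFoundError:
        return None


found = installed_jupyter_core_version()
if found is None:
    print("jupyter_core is not installed in this environment")
else:
    print(found, "OK" if supports_prefer_env_path(found) else "too old, please upgrade")
```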

The JUPYTER_PREFER_ENV_PATH environment variable can, for example, be set in ~/.bashrc or ~/.zshrc so that it is applied automatically whenever a new terminal is started:

export JUPYTER_PREFER_ENV_PATH=1

Then in a new terminal, running the jupyter --paths command will output:

$ jupyter --paths
config:
    /home/user/miniforge3/envs/myenvironment/etc/jupyter
    /home/user/.jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/user/miniforge3/envs/myenvironment/share/jupyter
    /home/user/.local/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/user/.local/share/jupyter/runtime
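To check the result from a script rather than by eye, one option is to read the JSON form of the same command (jupyter --paths --json). The sketch below assumes the JSON is a mapping from "config", "data" and "runtime" to lists of directories; first_entries and read_jupyter_paths are hypothetical helper names.

```python
import json
import shutil
import subprocess


def first_entries(paths_json):
    """Map each kind of path to the first (highest priority) directory."""
    paths = json.loads(paths_json)
    return {kind: dirs[0] for kind, dirs in paths.items() if dirs}


def read_jupyter_paths():
    """Return the raw output of `jupyter --paths --json`, or None if jupyter is missing."""
    exe = shutil.which("jupyter")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--paths", "--json"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Sample input shaped like the listing above; pass read_jupyter_paths()
# instead to inspect the environment that is currently active.
sample = json.dumps(
    {
        "config": [
            "/home/user/miniforge3/envs/myenvironment/etc/jupyter",
            "/home/user/.jupyter",
        ],
        "data": [
            "/home/user/miniforge3/envs/myenvironment/share/jupyter",
            "/home/user/.local/share/jupyter",
        ],
        "runtime": ["/home/user/.local/share/jupyter/runtime"],
    }
)

for kind, first in first_entries(sample).items():
    print(f"{kind}: {first}")
```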

This can be quite handy to keep environments well isolated.

Thanks @jasongrout for adding this in jupyter/jupyter_core Pull Request #199, "Add an environment variable flag to switch the user and environment path order"!

More info in the documentation: "Paths for Jupyter files" (jupyter_core 4.8.0.dev0 documentation).

Hope it helps!

This is so cool it should probably be a blog post for better visibility.
Thanks Jeremy!

Also of note: we’re currently ruminating on how Python packages might replace the data_files approach for adding to or updating these paths. It is our hope this would reduce the barriers and opinions involved in extending Jupyter from its primary packaging ecosystem, even though other package managers are just fine with putting stuff in $PREFIX/(etc|share).

There’s this strawman PR that has examples of declarative setuptools and flit configurations with entry_points, but other ideas have been put forward:

  • custom package metadata
  • namespace packages
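For readers who have not used entry_points, discovery on the consuming side could look roughly like the sketch below. The group name is purely hypothetical and is not the one used in the strawman PR; the code only shows the standard importlib.metadata lookup, with a fallback for the older dict-style return value.

```python
from importlib import metadata

# Hypothetical group name, made up for this sketch.
GROUP = "jupyter_paths_example"


def discover(group=GROUP):
    """Return the sorted names of all entry points registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer importlib.metadata: selectable collection of entry points.
        selected = eps.select(group=group)
    else:
        # Older importlib.metadata: plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


print(discover())
```

Every installed distribution has to be inspected to answer this query, which is where both the appeal (no files to place under $PREFIX) and the cost of the approach come from.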

As usual, naming things and cache invalidation rear their ugly heads. Insights greatly desired!

Update on the entry_points PR in jupyter_core: we learned a whole lot! But it might not be the right path.

When loaded with 1000 packages that use the mechanism, the performance is, unsurprisingly, poor. Some light caching helps, but it still takes 11s to emit jupyter --paths --json on a fairly modern Linux box with an SSD.

Granted, loading 1000 JSON files and globbing in 1000 directories isn’t going to be fast anyways. Not a showstopper, perhaps, but not encouraging. Of interest would be trying with a more representative level of extensions (I have a few truly gnarly environments around, which probably still only have <100 jupyter-aware packages).
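To get a feel for the baseline cost of "load N JSON files and glob in N directories" on your own machine, a throwaway micro-benchmark along these lines might help. The directory layout and file names are invented for the sketch and have nothing to do with what the PR actually does.

```python
import glob
import json
import os
import tempfile
import time


def build_fake_packages(root, count):
    """Create `count` directories, each holding one small JSON file (made-up layout)."""
    for i in range(count):
        pkg_dir = os.path.join(root, f"pkg{i}")
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "paths.json"), "w") as f:
            json.dump({"config": [pkg_dir]}, f)


def scan(root):
    """Glob inside every package directory and load each JSON file found."""
    found = []
    for pkg_dir in sorted(glob.glob(os.path.join(root, "pkg*"))):
        for path in glob.glob(os.path.join(pkg_dir, "*.json")):
            with open(path) as f:
                found.extend(json.load(f)["config"])
    return found


def time_scan(count):
    """Return (number of paths found, seconds spent scanning) for `count` packages."""
    with tempfile.TemporaryDirectory() as root:
        build_fake_packages(root, count)
        start = time.perf_counter()
        found = scan(root)
        elapsed = time.perf_counter() - start
    return len(found), elapsed


for n in (10, 100, 1000):
    total, seconds = time_scan(n)
    print(f"{n} packages: {total} paths in {seconds:.3f}s")
```

The numbers only cover file system and JSON overhead, so they are a lower bound and not a reproduction of the 11s figure.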

The next tack to try, since we have numbers now, is to look at the namespace package approach (custom metadata ended up being a non-starter, as it’s not widely supported). These are tricky to get right, as we’d have many uncoordinated distributions creating files in the same top-level package in site-packages… e.g. $PREFIX/site-packages/jupyter-config and $PREFIX/site-packages/jupyter-data.

flit doesn’t/won’t support distributing multiple packages (or namespace packages, at all) in the same folder, though poetry does. So some of the benefits of this approach would be limited.