Notes on JupyterLab Build

These are just some notes, mainly for myself, on where the JupyterLab build system is today, what problems it is trying to solve, and alternative ways we could solve those problems. Very much in progress!

To be clear, this isn’t a roadmap for where JupyterLab is heading, just some directions I am personally interested in exploring.

JupyterLab Build

User Stories

What I would like to try is articulating the different ways people need the JupyterLab build system, to see if maybe we could serve the different needs with different tools.

  1. The first use case is a user installing a JupyterLab extension. Maybe they have seen a GitHub repo advertising some fancy new feature of JupyterLab, and they wanna try it out. I would guess their first goal is to try out the extension, and they don’t care as much about bundle size.

  2. We also have someone working on a feature in JupyterLab core. They want to be able to edit code and quickly have the build rerun so they can see the results.

  3. Then there are the third party extension developers who are like those working on core, but they are just working on an extension.

  4. Finally, we have the people who are trying to serve a custom build of JupyterLab for their colleagues. They care about optimizing the build so they serve the smallest bundle possible.

Issues and Solutions

  1. IMHO this is the most important use case, because it could be someone’s first experience with JupyterLab. If it feels confusing or doesn’t work for them, we risk pushing them away. We keep getting issues opened around build oddities (“webpack command not found, node_modules folder is missing. Happens again”, Issue #6102 in jupyterlab/jupyterlab; “More extensions build issues”, Issue #5177 in jupyterlab/jupyterlab), and they can be pretty hard to diagnose. One reason I think they are particularly confusing is that users often just want to install a lab extension, and they are thrust (unknowingly) into the world of Webpack, NPM, Yarn, Node, etc. We try to wrap these things for them, with our own jupyter labextension and jupyter lab commands to install extensions and trigger the webpack build. One way to make this simpler would be to not require a Webpack build after installing an extension: “Create and install extensions without Webpack”, Issue #5672 in jupyterlab/jupyterlab.
  2. From my perspective, this works OK at the moment. Changes are relatively quick to load in the browser. However, if the JS build process were separated from the Python server, then we could play with things like Webpack’s Hot Module Replacement more easily to speed up the dev process.
  3. For the basic case, we do have a workflow here, but if you wanna do more complicated things, like link a local dependency of your extension (https://gitter.im/jupyterlab/jupyterlab?at=5c911975fcaf7b5f73de5a98), then it gets a bit tricky. The key issue is that you not only have to understand Yarn’s linking system but also how it integrates with JupyterLab’s build system. If we could separate those things, so that you could look at the JS builds in isolation from the Python server process, then this could help conceptually clear up the matter. For example, if the Python process exposed an API, and the JS builds created a static web app, then you could build it however you liked as long as it could connect to the Python server.
  4. There has been some work to do this already, with the jupyterlab_delux cookiecutter (“Discuss including example of jupyterlab distribution in core”, Issue #6090 in jupyterlab/jupyterlab). Basically, the idea is to be able to create a conda package that contains JL plus some preinstalled extensions, so that users who install it don’t have to build JupyterLab to get the packages. There are some continuing issues with this approach, however (“Install extensions in a modified JupyterLab”, Issue #6132 in jupyterlab/jupyterlab).

Possible next steps

Separate Python serving from JS building.

  1. Have JupyterLab JS build as static web app.
  2. Be able to launch this with a simple file server, and point it to the server URL, with a token.
  3. Launch the server separately as a Python process.

This would allow you to conceptually separate the two things and understand better how they interact.
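As a rough illustration of that split, here is a minimal sketch in Python. It assumes the JS build has already produced a directory of static files, and that the front end reads its settings from a generated /config.json. That path, the baseUrl and token field names, and the build_config and make_handler helpers are all made up for this example and are not part of JupyterLab.

```python
import json
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def build_config(server_url: str, token: str) -> dict:
    """Settings the static front end would need to reach the Python server.

    The field names are hypothetical, chosen only for this sketch.
    """
    return {"baseUrl": server_url.rstrip("/") + "/", "token": token}


class StaticAppHandler(SimpleHTTPRequestHandler):
    """Serve prebuilt JS assets plus one generated config document."""

    config: dict = {}

    def do_GET(self):
        if self.path == "/config.json":
            # Generated on the fly, so the static files never embed the token.
            body = json.dumps(self.config).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Everything else is a plain file from the build output directory.
            super().do_GET()


def make_handler(static_dir: str, server_url: str, token: str):
    """Bind a build directory and server settings to a handler factory."""
    handler_cls = type(
        "ConfiguredHandler",
        (StaticAppHandler,),
        {"config": build_config(server_url, token)},
    )
    return partial(handler_cls, directory=static_dir)


if __name__ == "__main__":
    # The Python (notebook) server is assumed to be running separately.
    handler = make_handler("./build", "http://localhost:8888", "my-token")
    ThreadingHTTPServer(("127.0.0.1", 8000), handler).serve_forever()
```

The point is only that the file server knows nothing about Python kernels, and the Python server knows nothing about how the JS was built.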

Installing extensions without webpack
The basic idea here is that we prebuild extensions to bundle all of their logic in a single JS file. Then, we use web modules to import that JS file from the user’s browser, instead of bundling with Webpack.

  1. Prebuild extensions with webpack so that they are already minified and include all their dependencies, except for the JupyterLab and Phosphor packages and any others that need to be singletons and that we know will exist.
  2. Specify, in some config option, a list of extension URLs. All of these are dynamically imported and should return a default export of a list of extensions (like a normal jupyterlab extension JS package should).
  3. On startup, instead of building the installed extensions into the JS bundle, we load this list dynamically, and import each extension. We enable every extension exported by these files, except those that are disabled.
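The selection logic in step 3 can be modeled in a few lines. This is only a sketch in Python of the bookkeeping, not browser code: the real thing would use a dynamic import in JS, which is stood in for here by a plain load callable, and the "id" key on each extension is an assumption made for the example.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical shape of one exported extension: a dict with an "id" key.
Extension = Dict[str, str]


def collect_enabled(
    urls: Iterable[str],
    load: Callable[[str], List[Extension]],
    disabled: Set[str],
) -> List[Extension]:
    """Load every configured URL and keep the extensions not disabled.

    `load` stands in for the browser's dynamic import: given a URL it
    returns that file's default export, a list of extensions.
    """
    enabled: List[Extension] = []
    for url in urls:
        for extension in load(url):
            if extension["id"] not in disabled:
                enabled.append(extension)
    return enabled
```

Nothing here needs Webpack at install time: adding an extension is just adding a URL to the list.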

This was before my time in JupyterLab, but this has all been tried already (“Third Party Extensions (single build)”, Issue #728 in jupyterlab/jupyterlab). One difference is that back then web modules weren’t a browser standard, so they were relying on RequireJS. The idea here isn’t to make this THE way of serving plugins to JupyterLab, just to allow this to be A way of doing it, if you so choose (yes, there are tradeoffs in ensuring the right global versions and sending many requests to the browser).

What would we need for this?

  1. We can already use webpack to build a library that includes all its dependencies besides some common packages.
  2. I need to investigate how to do this, then package the libraries that were left out into a separate module, and have webpack use that module when it is imported…

Maybe webpack already can do this (emphasis mine):

The runtime, along with the manifest data, is basically all the code webpack needs to connect your modularized application while it’s running in the browser. It contains the loading and resolving logic needed to connect your modules as they interact. This includes connecting modules that have already been loaded into the browser as well as logic to lazy-load the ones that haven’t.

So it already has a runtime that it uses to match up imports with the JS libraries that provide them. Ideally, we want one file for the core phosphor/jupyterlab packages, and then one for each other extension. Maybe this “webpack runtime” can track “Oh, this extension needs to import phosphor/algorithms and we have already loaded this in our core module, so let’s provide it.”

A subset of 4. is people distributing JupyterLab as part of a turn-key distribution (packages / images), i.e. me. :slight_smile:

Possible pain points are installing to staging areas (i.e. write to paths that are different from runtime paths), and ensuring mutable data is written to explicitly configurable directories (not any ‘installed’ ones).

Cool! Is this the debian package you mentioned in your intro?

AFAIK the core premise is that everything in the site-packages/jupyterlab directory is immutable, and everything in the Jupyterlab Application directory is mutable.

I am curious what specifically would be helpful to have more control over, in terms of on disk location.

FYI there was also a conversation about packaging JupyterLab in NixOS that might be useful.

I can (and will) tell you as soon as I tackle that lab to-do. :smiling_face:

Hi, I’m also from the 4th category, and while I do care about build size, it’s not really a big deal. What does bother me, however, is reproducibility of builds. Things seem to break in subtle ways, and quite often, because of all the fast-moving JS dependencies. Right now, leaflet got updated a few days ago on npm and it broke the leaflet-draw plugin, so now I can’t edit GeoPolygons in JupyterLab. This functionality does work in the notebook, because the notebook uses a package-lock that pins an older version of leaflet, so things work fine.

I’m not a JavaScript developer, so I don’t keep track of all the webpack/yarn/npm tooling. Maybe there is a simple solution that would let me pin down the version of the leaflet library and move on, but I have been googling for a while now and can’t seem to find any reference on how to “freeze” a particular JS lib for a JupyterLab build. I realise that this might not be a trivial thing to do, given that all the different extensions might have different version constraints.

I might very well be missing something very obvious here, but at this stage I’m seriously considering writing a dummy extension that just declares a tighter version range on the 3rd party dependencies I care about as a solution to this problem.

There’s a hacky workaround for this that you can use to get back up and running, using Yarn resolutions.

  1. Navigate to where the jupyterlab python package is installed (on my system, <virtual env>/Lib/site-packages/jupyterlab)
  2. Edit the ./staging/package.json file and add the following top-level property:

     "resolutions": {
       "leaflet": "good-version-here"
     }

  3. Run jupyter lab build

You should then be all set, though any updates to the jupyterlab python package will nuke your changes.
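Since the edit has to be redone after every update of the python package, it may be worth scripting. Here is a minimal sketch in Python; pin_resolutions is a made-up helper name, and the path and version you pass in are yours to supply. It merges into any existing "resolutions" property rather than replacing it, in case the file already has one.

```python
import json
from pathlib import Path
from typing import Dict


def pin_resolutions(package_json: Path, pins: Dict[str, str]) -> Dict[str, str]:
    """Add or update Yarn "resolutions" entries in a package.json file.

    Existing entries are kept unless `pins` overrides them.
    Returns the resulting resolutions mapping.
    """
    data = json.loads(package_json.read_text(encoding="utf-8"))
    resolutions = data.setdefault("resolutions", {})
    resolutions.update(pins)
    package_json.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return resolutions
```

Usage would look something like pin_resolutions(Path("<virtual env>/Lib/site-packages/jupyterlab/staging/package.json"), {"leaflet": "good-version-here"}), followed by the rebuild step above.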

Thanks @quigleyj-mavenomics, this almost worked. Unfortunately jupyter lab build, at least in my case, wipes out the staging folder. But I was able to get it to work by doing this instead:

#Step 1: patch package.json as described above
jlpm clean
rm yarn.lock
jlpm install
jlpm build

Some of these steps might be redundant.

I have done this same workaround. You have to modify the staging folder in site-packages, not the one in the JupyterLab Application Directory. The one in site-packages is not wiped.

EDIT: Here is an example of documenting this procedure: GitHub - Quansight/jupyterlab-omnisci at 4d2b4042d8c76239cd7ed8fa51b8aceb8e9e671d

Thanks for pointing this out @saulshanabrook, I didn’t realise there were two staging folders. Clearly I didn’t read the post above thoroughly. After making the changes in the site-packages staging folder everything worked as advertised.

We had a productive call about this today: https://github.com/jupyterlab/jupyterlab/issues/5672#issuecomment-526278264. The summary is that we want to move forward, generally following the outline I have in the description for installing extensions without webpack, but building on webpack’s DLL mechanism to prebuild extensions that can import modules from core without being built with them.

@saulshanabrook, @Kirill888
My situation is quite similar to Kirill888’s: I am from the 4th category, and we need to make some simple (but necessary) modifications to JupyterLab (to disable a certain feature) and release the modified version for use by folks in my company. I am also not a JavaScript person, so npm/yarn/webpack/jlpm/react are all new to me.

Reproducibility is important to me, as I need to put this in a CI/CD build behind a corporate firewall/proxy. The build cannot use registry.yarnpkg.com directly, and has to go through an internal Artifactory proxy channel to yarnpkg.com. This is where I got stuck, and I have been looking for help here: “Source build of jupyterlab 2.2.x failed with settingregistry.ts:221:11 - error TS2722: Cannot invoke an object which is possibly 'undefined' (validate object)”.

Basically, I think the problem boils down to this:

  1. The build works if I use the 2.2.x jupyterlab source and its yarn.lock file, accessing registry.yarnpkg.com.
  2. The build fails when I remove the yarn.lock. I can see that yarn.lock is regenerated through the Artifactory channel, and compilation seems to work OK, except that compiling one TypeScript module fails.
  3. The failure is likely related to my build resolving different packages/node modules from those in the released yarn.lock. However, there does not seem to be a way to use the same versions as in yarn.lock but with a different registry.

Just curious whether you have run into a similar issue, and whether you found a solution. Thanks very much in advance!

BTW, I did run through a simple react tutorial using artifactory channel and my ~/.yarnrc to install the node modules, so I have some confidence that my npm/yarn settings are reasonable.
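One possible angle on the “same versions, different registry” problem, not confirmed anywhere in this thread: the released yarn.lock records a resolved tarball URL per package, so rewriting only the registry prefix of those URLs would keep every pinned version while pointing downloads elsewhere. Below is a minimal sketch in Python. rewrite_registry is a made-up helper, and it assumes the proxy mirrors the same tarball path layout as the public registry, which would need checking against your Artifactory setup.

```python
import re

DEFAULT_REGISTRY = "https://registry.yarnpkg.com"


def rewrite_registry(
    lock_text: str,
    new_registry: str,
    old_registry: str = DEFAULT_REGISTRY,
) -> str:
    """Point the `resolved` URLs of a yarn.lock at another registry.

    Only the registry prefix changes; version lines, tarball names and
    hashes are left untouched, so the pinned versions stay the same.
    """
    old_prefix = re.escape(old_registry.rstrip("/")) + "/"
    new_prefix = new_registry.rstrip("/") + "/"
    pattern = re.compile(r'^([ \t]*resolved[ \t]+")' + old_prefix, re.MULTILINE)
    # A function replacement avoids backslash handling in the new URL.
    return pattern.sub(lambda match: match.group(1) + new_prefix, lock_text)
```

You would read yarn.lock, pass its text through this with your internal registry URL, write it back, and then run the install against the rewritten lock file.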