How can I prevent an out-of-memory error in GitHub Actions?

I'm working on a project that uses the repo2docker GitHub Actions workflow, jupyterhub/repo2docker-action (a GitHub action to build data science environment images with repo2docker and push them to registries).

However, the list of required packages has grown, and the runner now crashes during mamba env update, due to what I believe is an out-of-memory error.

[Screenshot of the crashed runner]

Does anyone have suggestions for workarounds, shortcuts, etc? Thanks!

I’ve looked into the action itself and registered for a free enterprise trial to get larger runners as a temporary workaround, but I’m looking for a long-term fix.

Doing the solve offline with conda-lock will usually avoid the OOM during environment creation, but it won’t cache very well with repo2docker (r2d).


What do you mean by “won’t cache very well”? I will work on generating a conda-lock file for the environment and see if that helps.

Repo2docker has special treatment for (.binder/)environment.yml, but doesn’t understand either of the conda-lock output files, partly because conda-lock hasn’t declared a “well-known” file name.

When it finds one of the “well-known” files, it does a “smart” install:

  • copies just that file into the building container
  • runs the package manager with some flags
  • cleans up the cache after the install

This layer then gets cached, and the rest of the process continues. If you don’t change your environment, and land on a repo2docker host that already has a previous image, you don’t have to rebuild.

Anyhow, to use a conda-lock file with stock repo2docker, you have to:

  • create a file called something other than environment.yml (otherwise r2d will find it)
  • create the lock, maybe with a script called .binder/create-conda-lock.sh
#!/usr/bin/env bash
# .binder/create-conda-lock.sh
set -eux
cd .binder 
conda-lock \
  --mamba \
  --kind explicit \
  --platform linux-64 \
  --file not-environment.yml
  • make a script that can install the environment from the lock (won’t be useful locally)
#!/usr/bin/env bash
# .binder/update-env-from-lock.sh
set -eux
mamba create \
  --prefix "${NB_PYTHON_PREFIX}" \
  --file .binder/conda-linux-64.lock
# ...and if needed
pip install --no-deps -r .binder/not-requirements.txt
# or other things conda-lock can't do
  • use it in .binder/postBuild
#!/usr/bin/env bash
# .binder/postBuild
set -eux
bash .binder/update-env-from-lock.sh
  • check in all of these files!
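
Since the lock and its source file are both checked in, it can help to catch the case where not-environment.yml was edited but the lock was not regenerated. The following is a minimal sketch, assuming a hypothetical sidecar checksum file (this is not a conda-lock feature, and the function names are made up for illustration):

```shell
#!/usr/bin/env bash
# Sketch only: detect a stale lock by storing a checksum of the spec file.
# The sidecar checksum file is a hypothetical convention, not part of conda-lock.
set -eu

# Record the checksum of the spec file; call this right after re-locking.
record_spec_checksum() {
  local spec="$1" sumfile="$2"
  sha256sum "${spec}" | cut -d ' ' -f 1 > "${sumfile}"
}

# Succeed only if the spec file still matches the recorded checksum.
lock_is_fresh() {
  local spec="$1" sumfile="$2"
  [ -f "${sumfile}" ] || return 1
  [ "$(sha256sum "${spec}" | cut -d ' ' -f 1)" = "$(cat "${sumfile}")" ]
}
```

For example, .binder/create-conda-lock.sh could end with record_spec_checksum not-environment.yml not-environment.sha256, and a CI step could fail early when lock_is_fresh returns non-zero.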

Another avenue to explore is conda-pack. This is slightly different, as it archives an entire conda environment (even packages installed with pip or npm install -g).

This would be a more complex CI approach, where the conda-pack archive would be built out of band (on a full Linux VM), then fetched during postBuild inside repo2docker. But this has some of the same shortcomings, as there’s no way for (repo2)docker to cache layers built from arbitrary shell inputs.

On the up-side: conda-pack archives are portable to any linux-64 machine, so they have value outside of docker.
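
The conda-pack route could be sketched roughly as follows. The archive would be built elsewhere with something like conda pack --prefix /path/to/env --output env.tar.gz, and then unpacked during postBuild. This is a sketch under assumptions: ENV_ARCHIVE_URL, the target directory, and the function name are hypothetical placeholders, and the download step is left commented out.

```shell
#!/usr/bin/env bash
# Sketch only: unpack a conda-pack archive that was built out of band.
# ENV_ARCHIVE_URL and the target directory are hypothetical placeholders.
set -eu

# Extract a conda-pack tarball into a target prefix and fix up its paths.
unpack_conda_env() {
  local archive="$1" target="$2"
  mkdir -p "${target}"
  tar -xzf "${archive}" -C "${target}"
  # conda-pack archives include a conda-unpack script; running it rewrites
  # the prefixes baked into the environment for the new location.
  if [ -x "${target}/bin/conda-unpack" ]; then
    "${target}/bin/conda-unpack"
  fi
}

# Example use from .binder/postBuild (commented out: the URL is made up):
# curl -fsSL "${ENV_ARCHIVE_URL}" -o /tmp/env.tar.gz
# unpack_conda_env /tmp/env.tar.gz "${HOME}/packed-env"
```

Whether to unpack over the existing NB_PYTHON_PREFIX or into a separate directory registered as an extra kernel is a design choice this sketch leaves open.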