Conda, pip or something else for building containers

What do people use for adding packages to containers: conda or just plain pip? I’m planning a migration from a VM-based cluster to k8s. I use conda extensively to add packages on user request. The current setup has 65 Python modules and approximately 110 R packages in my Ansible playbooks, all installed with conda (e.g. mamba). Any advice on how to minimize build times for containers?

Best regards,
Roy

Hello, @Roy_Dragseth

I’m creating a custom image using the Dockerfile below.
The image is based on the tensorflow-notebook image.

FROM quay.io/jupyter/tensorflow-notebook

COPY --chown=${NB_UID}:${NB_GID} requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

COPY overrides.json ${CONDA_DIR}/share/jupyter/lab/settings/overrides.json

This link may also help.

For build time in particular, especially if you are using conda, lock files can help. Conda installs are pretty quick if you use explicit environment files, because there is no solve, it’s just a list of URLs to download and extract. These can be passed to micromamba to bootstrap an environment for a docker image pretty quickly, and it’s my go-to for maintaining docker images.

I have a relatively complex real-world example here, with key points:

Another build-time tip is to mount caches rather than disabling them, which is the common practice for keeping images small. A mounted cache is not stored in the image layers, so the image stays small either way. Caches are great, and reducing rebuild time is the point of them.

For example, to mount a conda package cache:

ENV MAMBA_ROOT_PREFIX=/tmp/conda
RUN --mount=type=cache,target=$MAMBA_ROOT_PREFIX micromamba create -p /opt/conda -f /tmp/conda.lock

or for pip:

ENV PIP_CACHE_DIR=/tmp/pip-cache
RUN --mount=type=cache,target=$PIP_CACHE_DIR pip install -r /tmp/requirements.txt

With lock files in place and caches mounted, every build after the first amounts to only extracting packages, even if the Docker layer cache is invalidated, because the solve and download steps (the two most expensive steps) are both skipped.
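
As a rough sketch, the pip cache mount could be combined with the tensorflow-notebook Dockerfile from earlier in the thread roughly like this. It assumes BuildKit is enabled, and that the image runs as the default docker-stacks user (uid 1000, gid 100); the uid/gid values and the cache path are my assumptions, so adjust them if your image differs.

```dockerfile
# syntax=docker/dockerfile:1
FROM quay.io/jupyter/tensorflow-notebook

COPY --chown=${NB_UID}:${NB_GID} requirements.txt /tmp/

# Point pip at a directory that is provided as a BuildKit cache mount.
# The path is an arbitrary choice for this sketch.
ENV PIP_CACHE_DIR=/tmp/pip-cache

# Cache mounts are owned by root unless told otherwise, so uid/gid are set
# to the assumed notebook user (1000:100) to let pip write to the cache.
# The cache contents persist between builds but are not part of the image.
RUN --mount=type=cache,target=/tmp/pip-cache,uid=1000,gid=100 \
    pip install -r /tmp/requirements.txt && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```

The cache target is written out literally rather than as `$PIP_CACHE_DIR`, only so the sketch does not depend on variable expansion inside `--mount` options.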