Repo2docker image size differences when run on different machines

I’m using version 2023.06.0 of repo2docker to create a custom kernel image for JupyterHub. I have a mostly empty GitHub repo (it only has a requirements.txt file).

When I run it on my Mac like this:

jupyter-repo2docker <repo-link>

It results in a 2.45GB image. However, the image is built for the arm64 architecture (verified with docker inspect). I tried different arguments to change the platform but couldn’t get that to work, so I decided to try it on my throwaway Ubuntu VM instead. I couldn’t get Docker to run without root there, so I installed everything as root for the purposes of creating the image and ran this:

jupyter-repo2docker --user-id 1000 --user-name ubuntu <same-repo-link>

But this results in a 7.52GB image. Why? I thought it would just take whatever base image it uses and install the requirements.txt on top of it. What is adding 5GB to the size?

So I tried a new combination of args to create an amd64 image, which worked (after several minutes), but the result is also large at 7.24GB. Is this expected for amd64?

jupyter-repo2docker --Repo2Docker.platform=linux/amd64 <repo-link>

Also, I still saw this warning during each step:

[Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

Well, I’ve tried different options, but for some reason the amd64 image always comes out about 3x bigger than the arm64 version. This is the repo with my requirements.txt: lauramariel/gts-jupyter-image on GitHub (for use with repo2docker).

I tried the same with the binder-examples repo, binder-examples/requirements on GitHub (a simple requirements.txt based example), and the amd64 image was a little bigger but not by much:

amd64: 1.93GB
arm64: 1.71GB

Have you tried comparing the build logs and the packages that are installed? Maybe your dependencies are built differently for amd64 and arm64. For example, one platform's build might come with extra libraries, require compilation, or happen to create more temporary files.
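
As a rough sketch of the package comparison, something like the following might help. It assumes both images are available locally, that `pip` is on the image's PATH, and that the image's entrypoint passes the command through unchanged. The function names and the image tags in the comments are made up for illustration.

```shell
# Dump the pip packages installed in an image, sorted for stable diffing.
# $1 = image name (hypothetical), $2 = platform, e.g. linux/amd64
freeze_image() {
  docker run --rm --platform "$2" "$1" pip freeze | sort
}

# Show the lines that differ between two saved package lists.
# diff exits with status 1 when the files differ, which is not an error here.
diff_packages() {
  diff "$1" "$2" || true
}

# Possible usage (image tags are placeholders for your own):
#   freeze_image my-image-arm64 linux/arm64 > arm64.txt
#   freeze_image my-image-amd64 linux/amd64 > amd64.txt
#   diff_packages arm64.txt amd64.txt
```

This only shows which packages or versions differ, not how much space each takes, so it is a starting point rather than a full answer.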

You could try inspecting the image layers to find out which ones are larger than expected?
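
One way to do the layer inspection is with `docker history`, which lists each layer's size next to the command that created it. The helper below is a minimal sketch: the function names are hypothetical, and it assumes sizes are printed in Docker's usual human-readable form (for example `0B`, `45.3MB`, `1.2GB`).

```shell
# Read "SIZE COMMAND" lines on stdin and print them largest-first.
# Sizes are converted to bytes only for sorting; original lines are kept.
sort_layers_by_size() {
  awk '{
    unit = $1; sub(/^[0-9.]+/, "", unit)    # e.g. "GB"
    num  = $1; sub(/[A-Za-z]+$/, "", num)   # e.g. "1.2"
    mult = 1
    if (unit == "kB") mult = 1e3
    else if (unit == "MB") mult = 1e6
    else if (unit == "GB") mult = 1e9
    printf "%.0f\t%s\n", num * mult, $0
  }' | sort -rn | cut -f2-
}

# Show the ten largest layers of an image. $1 = image name (your own tag).
largest_layers() {
  docker history --no-trunc --format '{{.Size}} {{.CreatedBy}}' "$1" \
    | sort_layers_by_size | head -n 10
}
```

Running `largest_layers` on both the arm64 and amd64 images and comparing the top entries should show which build step accounts for the extra gigabytes.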
