KIP, containerd and ImagePullSecrets

I’m running into problems with kernel images in a private registry and a cluster using containerd.

What I configured:

  # Create RBAC resources
  rbac: true
  # ImagePullSecrets for a ServiceAccount, list of secrets in the same namespace
  # to use for pulling any images in pods that reference this ServiceAccount.
  # Must be set for any cluster configured with private docker registry.
  imagePullSecrets: 
    - harbor-registry
  commonLabels: {}
    # app.kubernetes.io/name: [your app name]

# You can optionally create imagePull Secrets
imagePullSecretsCreate:
  enabled: true
  annotations: {}
    # this annotation keeps the secret even if the helm release is deleted
    # "helm.sh/resource-policy": "keep"
  secrets: 
    - name: "harbor-registry"
      data: "eyJhdX... (base64 encoded dockerconfigjson)"

I also set criSocket: /run/containerd/containerd.sock
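For anyone reproducing this setup: the data value above is the base64 encoding of a complete dockerconfigjson document. Below is a minimal sketch of how such a value could be produced in Python. The function name, registry host and credentials are all hypothetical placeholders, not values from my cluster.

```python
import base64
import json


def build_dockerconfigjson(registry: str, username: str, password: str) -> str:
    """Return a base64-encoded dockerconfigjson for a single registry.

    All arguments are placeholders; substitute your own registry and account.
    """
    # The "auth" field is base64("username:password"), as written by `docker login`.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    # The chart's `data` field takes the whole JSON document, base64-encoded again.
    return base64.b64encode(json.dumps(config).encode()).decode()


if __name__ == "__main__":
    # Hypothetical host and robot account, for illustration only.
    print(build_dockerconfigjson("harbor.example.com", "robot-puller", "s3cret"))
```

The result starts with "eyJhdX", which is simply the base64 form of the opening `{"auths"` of the JSON document.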

This leaves me in the following situation:

  1. My custom kernelspec image is successfully pulled from a private project in our Harbor registry.
  2. The secret harbor-registry is created in the enterprise-gateway namespace by helm.
  3. kernel-image-puller-sa is created in the enterprise-gateway namespace and has the imagePullSecret: harbor-registry
  4. The logs of the KIP pods show that all 3 custom kernel images in my kernelspecs are detected and should be pulled. The two from public projects in the registry are pulled successfully (no secret needed), but the one from the private project isn’t: (Error executing crictl -r unix:///run/containerd/containerd.sock pull [...] failed with status code [manifests 2024-01-09]: 401 Unauthorized")

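It seems the secret and the service account work fine for pulling images when Kubernetes itself creates resources (like the pod that uses the kernelspecs image), but they are not used when the KIP pods pull images themselves through the container runtime client (crictl in this case).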

PS: Pulling without the KIP (via spark.kubernetes.container.image.pullSecrets) doesn’t work either, because the kernels are started in their own namespaces and thus the secret is not available there. Using the KIP to circumvent that would have been nice.

Hi @BBuchhold - thanks for posting this issue. You might also want to create an issue in the EG repo where others may be watching. I suspect there are changes necessary to KIP or its configuration, but that’s not very helpful.

If you do decide to create an issue, please cross-reference the issue here (and vice versa). Thank you.

I’ve created: KIP cannot use ImagePullSecret when using containerd · Issue #1359 · jupyter-server/enterprise_gateway · GitHub

If there is a chance that this behavior is actually expected and not due to a misconfiguration on my end, I will try to find a solution here and report back if I am successful. Options include creating and mounting the secret in a form other than a dockerconfigjson so that crictl can use it, or reading the dockerconfigjson in image_puller.py and adding the auth part to the crictl call.
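To make the second idea concrete, here is a rough sketch of reading a dockerconfigjson and passing the matching credentials to crictl. It is an illustration only, not the fix posted on the GitHub issue and not the actual image_puller.py code. The function names are hypothetical, and it assumes the secret content is available to the KIP pod as a JSON string (for example from a mounted file). The `--creds` option of `crictl pull` takes a `USERNAME[:PASSWORD]` value.

```python
import base64
import json
from typing import List, Optional


def find_creds(dockerconfigjson: str, image: str) -> Optional[str]:
    """Return 'user:password' for the registry hosting `image`, or None."""
    auths = json.loads(dockerconfigjson).get("auths", {})
    # Assumes the image reference starts with the registry host.
    registry_host = image.split("/", 1)[0]
    for registry, entry in auths.items():
        # Keys may be stored with or without a scheme prefix or trailing slash.
        host = registry.split("://", 1)[-1].rstrip("/")
        if host != registry_host:
            continue
        if "auth" in entry:
            return base64.b64decode(entry["auth"]).decode()
        if "username" in entry and "password" in entry:
            return f"{entry['username']}:{entry['password']}"
    return None


def build_pull_command(image: str, cri_socket: str, creds: Optional[str]) -> List[str]:
    """Build the crictl argument list, adding --creds only when available."""
    cmd = ["crictl", "-r", f"unix://{cri_socket}", "pull"]
    if creds:
        cmd += ["--creds", creds]
    cmd.append(image)
    return cmd
```

The resulting list could be handed to `subprocess.run`. One caveat with this approach: credentials passed on the command line are visible in the process list of the KIP container, so it may not be acceptable everywhere.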

I have now added a solution to the GitHub issue. I am not familiar enough with Kubernetes to tell whether this is the “right way” to do it, but at least it fixes my problem. I am happy to improve my fix if that makes sense.