Binder redirect to JupyterHub returns 403 Forbidden

Hello everyone,
I’m currently trying to get a BinderHub installation running on my institution’s k8s cluster. Building an image works fine, but the subsequent login to JupyterHub fails, so I guess it’s rather a JupyterHub issue (I can also reproduce the error using cURL). After successfully building an image and pushing it to our private image registry, I receive a 403 error when trying to open the application in JupyterHub. The following happens:

  1. Binder builds and pushes the image successfully. Everything fine so far. I’m using my Hello World notebook for these tests: Nicholas Steffen Kappel / notebooks · GitLab
  2. After the notebook pod is started, the browser makes the following request: GET https://[our-binderhub.de]/hub/user/n_kapp03-notebooks-lcp7toie/?token=token_of_22_chars
  3. The response is a 302, which redirects to GET https://[our-binderhub.de]/hub/login?next=/hub/user/n_kapp03-notebooks-lcp7toie/?token=token_of_22_chars
  4. This returns 403 Forbidden

A couple things that might be relevant:

  • BinderHub runs without authentication (auth_enabled: false)
  • I’ve noticed that the redirect looks weird: token looks like a query param, but since it is preceded by ? instead of &, it’s not resolved as such. I’ve tried passing token as an additional query param manually, with no success.
  • The required user appears to exist: POST https://[our-binderhub.de]/hub/api/users/n_kapp03-notebooks-lcp7toie returns 409 "User n_kapp03-notebooks-lcp7toie already exists".
  • Kubernetes Ingress is configured so that requests to https://[our-binderhub.de] go to BinderHub and only requests to https://[our-binderhub.de]/hub go to the JupyterHub pod. Could this possibly be an issue? For our scenario, this is the most straightforward solution.
  • I’m running JupyterHub using the image jupyterhub/k8s-hub:0.11.1. The BinderHub image is self built using Chartpress with very minor changes so far.
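
On the ?token= observation in the list above: a minimal Python sketch, using only the standard library, of how a generic query-string parser reads the redirect from step 3. The host and token are placeholders, and this is not JupyterHub's own code. It only illustrates that everything after next= is read as the value of next, so token belongs to the nested URL rather than to the login URL.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder host and token; the path shape is copied from step 3 above.
redirect = (
    "https://binderhub.example/hub/login"
    "?next=/hub/user/n_kapp03-notebooks-lcp7toie/?token=PLACEHOLDER"
)

# The login URL itself carries a single query parameter: `next`.
params = parse_qs(urlsplit(redirect).query)
next_url = params["next"][0]

# The token only shows up once the value of `next` is parsed as a URL of its own.
nested = parse_qs(urlsplit(next_url).query)

print(sorted(params))
print(next_url)
print(nested["token"][0])
```

Seen this way, the ? before token is what one would expect for a nested URL, which suggests the odd-looking redirect is not by itself the cause of the 403.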

Any help or insight is appreciated!
PS: Is there a way to edit code inside the JupyterHub pod to add some debug messages? There’s probably no such thing as hot reload, I suppose.

Hi. Are you able to show us your full BinderHub chart configuration with secrets redacted?

Sure. Maybe someone could confirm if the requests made to JupyterHub are proper or not?!
Note that the .yaml files are mostly unchanged compared to the BinderHub GitHub repo, and configuration was done through values.

config.yaml (containing the values)

config:
  BinderHub:
    use_registry: true
    image_prefix: "my-organisation.com/sys/binderhub/"
    auth_enabled: false
    build_image: my-organisation.com/sys/binderhub/repo2docker:0.0
    hub_url: https://binderhub-dev.my-orga.com/hub
  DockerRegistry:
    token_url: "https://also-my-organisation.com/jwt/auth"
    url: "https://my-organisation.com"
    username: binder
    password: *********
imageCleaner:
  enabled: false
image:
  secret: binder-image-rw
service:
  type: ClusterIP
ingress:
  enabled: true
  annotations: {
    kubernetes.io/ingress.class: nginx-internal
  }
  hosts: [binderhub-dev.my-orga.com]
  pathSuffix: ''
  tls:
    - hosts:
      - binderhub-dev.my-orga.com
      secretName: certificate

jupyterhub:  # jupyterHub sub chart
  proxy:
    service:
      type: ClusterIP
  scheduling:
    userScheduler:
      enabled: false
  ingress:
    enabled: true
    annotations: {
      kubernetes.io/ingress.class: nginx-internal
    }
    hosts: [binderhub-dev.my-orga.com]
    pathSuffix: 'hub'
    tls:
      - hosts:
        - binderhub-dev.my-orga.com
        secretName: certificate
  imagePullSecrets: [binder-image-rw]

deployment.yaml (the only difference from the original is an added secret for my custom binder image)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: binder
spec:
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      app: binder
      component: binder
      release: {{ .Release.Name }}
  strategy:
    rollingUpdate:
        {{- if eq (.Values.replicas | int) 1 }}
        maxSurge: 1
        maxUnavailable: 0
        {{- end }}
  template:
    metadata:
      labels:
        app: binder
        name: binder
        component: binder
        release: {{ .Release.Name }}
        heritage: {{ .Release.Service }}
        {{- with .Values.deployment.labels }}
        # Because toYaml + indent is super flaky
        {{- range $key, $value := .Values.deployment.labels }}
        {{ $key }}: {{ $value | quote }}
        {{- end }}
        {{- end }}
      annotations:
        # This lets us autorestart when the configmap changes!
        checksum/config-map: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        checksum/secret: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- . | toYaml | trimSuffix "\n" | nindent 8 }}
        {{- end }}
    spec:
      {{- with .Values.initContainers }}
      initContainers:
        {{- . | toYaml | nindent 8 }}
      {{- end }}
      nodeSelector: {{ .Values.nodeSelector | toJson }}
      {{- if .Values.rbac.enabled }}
      serviceAccountName: binderhub
      {{- end }}
      volumes:
      - name: config
        configMap:
          name: binder-config
      - name: secret-config
        secret:
          secretName: binder-secret
      {{- if .Values.config.BinderHub.use_registry }}
      - name: docker-secret
        secret:
          secretName: binder-push-secret
      {{- else }}
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
      {{- end }}
      {{- with .Values.extraVolumes }}
      {{- . | toYaml | nindent 6 }}
      {{- end }}
      # only difference from the original, since I use a self-built image
      imagePullSecrets:
      - name: {{ .Values.image.secret }}
      containers:
      - name: binder
        image: {{ .Values.image.name }}:{{ .Values.image.tag }}
        args:
          - --config
          - /etc/binderhub/config/binderhub_config.py
        volumeMounts:
          - mountPath: /etc/binderhub/config/
            name: config
          - mountPath: /etc/binderhub/secret/
            name: secret-config
          {{- if .Values.config.BinderHub.use_registry }}
          - mountPath: /root/.docker
            name: docker-secret
            readOnly: true
          {{- else }}
          - mountPath: /var/run/docker.sock
            name: docker-socket
          {{- end }}
          {{- with .Values.extraVolumeMounts }}
          {{- . | toYaml | nindent 10 }}
          {{- end }}
        resources:
          {{- .Values.resources | toYaml | nindent 10 }}
        imagePullPolicy: IfNotPresent
        env:
        - name: BUILD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: JUPYTERHUB_API_TOKEN
          valueFrom:
            secretKeyRef:
              name: binder-secret
              key: "binder.hub-token"
        {{- if .Values.config.BinderHub.auth_enabled }}
        - name: JUPYTERHUB_API_URL
          value: {{ (print (.Values.config.BinderHub.hub_url_local | default .Values.config.BinderHub.hub_url | trimSuffix "/") "/hub/api/") }}
        - name: JUPYTERHUB_BASE_URL
          value: {{ .Values.jupyterhub.hub.baseUrl | quote }}
        - name: JUPYTERHUB_CLIENT_ID
          value: {{ .Values.jupyterhub.hub.services.binder.oauth_client_id | quote }}
        - name: JUPYTERHUB_OAUTH_CALLBACK_URL
          value: {{ .Values.jupyterhub.hub.services.binder.oauth_redirect_uri | quote }}
        {{- if .Values.jupyterhub.hub.allowNamedServers }}
        - name: JUPYTERHUB_ALLOW_NAMED_SERVERS
          value: "true"
        - name: JUPYTERHUB_NAMED_SERVER_LIMIT_PER_USER
          value: {{ .Values.jupyterhub.hub.namedServerLimitPerUser | quote }}
        {{- end }}
        {{- end }}
        {{- with .Values.extraEnv }}
        {{- . | toYaml | nindent 8 }}
        {{- end }}
        ports:
          - containerPort: 8585
            name: binder
        {{- if .Values.deployment.readinessProbe.enabled }}
        readinessProbe:
          httpGet:
            path: {{ .Values.config.BinderHub.base_url | default "/" }}versions
            port: binder
          initialDelaySeconds: {{ .Values.deployment.readinessProbe.initialDelaySeconds }}
          periodSeconds: {{ .Values.deployment.readinessProbe.periodSeconds }}
          timeoutSeconds: {{ .Values.deployment.readinessProbe.timeoutSeconds }}
          failureThreshold: {{ .Values.deployment.readinessProbe.failureThreshold }}
        {{- end }}
        {{- if .Values.deployment.livenessProbe.enabled }}
        livenessProbe:
          httpGet:
            path: {{ .Values.config.BinderHub.base_url | default "/" }}versions
            port: binder
          initialDelaySeconds: {{ .Values.deployment.livenessProbe.initialDelaySeconds }}
          periodSeconds: {{ .Values.deployment.livenessProbe.periodSeconds }}
          timeoutSeconds: {{ .Values.deployment.livenessProbe.timeoutSeconds }}
          failureThreshold: {{ .Values.deployment.livenessProbe.failureThreshold }}
        {{- end }}

All other files under helm-chart/binderhub/templates are identical to binderhub/helm-chart/binderhub/templates at master · jupyterhub/binderhub · GitHub.
No changes were made to helm-chart/binderhub/files/binderhub_config.py. helm-chart/chartpress.yaml is also identical except for imagePrefix which points to my custom image.
You can observe that a custom image for repo2docker is used as well. This is because we need to work around a proxy within our network. I don’t see how this could be related to my problem though.

Can you share the logs of the user pod and hub pod?

  1. After the notebook pod is started, the browser makes the following request: GET https://[our-binderhub.de]/hub/user/n_kapp03-notebooks-lcp7toie/?token=token_of_22_chars

This is the first problem. The request should be sent to /user/n_kapp... not /hub/user/n_kapp.... When using BinderHub without auth, there should be no requests originating in the browser handled by the Hub at all. This is generally a symptom of requests not getting routed correctly, or URLs not being constructed correctly.

This request is what would normally happen when a server is not running. It triggers the login process (the redirect to /hub/login), which then redirects back to the running server. The login page returns a 403 because without auth, login is disabled. This is the expected behavior when a request goes to the Hub, which is what shouldn’t be happening in the first place.
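
To make the routing symptom concrete, here is a toy Python model of path-prefix routing for the ingress setup described in the question (only /hub goes to JupyterHub, everything else to BinderHub). It is a sketch under assumptions: the rule table, the backend names, and the longest-prefix matching are illustrative and are not the actual nginx ingress or JupyterHub proxy logic.

```python
# Toy model only: prefixes mirror the setup described in the question.
RULES = [("/hub", "jupyterhub"), ("/", "binderhub")]


def matches(path: str, prefix: str) -> bool:
    """True if `prefix` matches `path` on a path-segment boundary."""
    if prefix == "/":
        return True
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def route(path: str, rules=RULES) -> str:
    """Return the backend of the longest matching prefix."""
    candidates = [(p, backend) for p, backend in rules if matches(path, p)]
    return max(candidates, key=lambda pair: len(pair[0]))[1]


for path in ["/hub/api/users/x", "/user/x/", "/hub/user/x/"]:
    print(path, "->", route(path))
```

In this model, /user/x/ (where the running notebook server is expected to live) never reaches JupyterHub's side at all, while /hub/user/x/ lands on the Hub, which matches the login redirect and 403 seen above.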

Kubernetes Ingress is configured so that requests to https://[our-binderhub.de] go to BinderHub and only requests to https://[our-binderhub.de]/hub go to the JupyterHub pod. Could this possibly be an issue?

Yes, I believe this is the issue. I don’t think pathSuffix is the right configuration here. In particular, jupyterhub doesn’t only serve requests from /hub/, it also expects /user/ etc. pathSuffix was introduced by this PR and as I understand it, should ~never be used unless you are using GCE ingress. I think you are getting a confusing bug due to a coincidence that you used pathSuffix that happens to be a URL prefix the Hub already uses (/hub/). If you’d used a different pathSuffix, the error might have been clearer that the URLs were not correct (I’m not sure, though).

Instead, if you want to set the base URL of the hub deployment, use config:

jupyterhub:
  hub:
    baseUrl: /jupyterhub

(I’m not using hub because it could get a little confusing to have URLs with /hub/hub/..., but there’s nothing technically wrong with it). This tells all the jupyterhub URLs to use this prefix. Then the user URL should be /jupyterhub/user/n_kapp... and the hub URL (only for API requests) should be /jupyterhub/hub/api/....
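
The resulting URL layout can be sketched in a few lines of Python. hub_urls is a hypothetical helper written for illustration, not part of JupyterHub or BinderHub. It only reproduces the two prefixes named above for a given base URL, and it assumes the base URL is treated the same with or without surrounding slashes.

```python
def hub_urls(base_url: str, username: str) -> dict:
    """Build the user-server and Hub API prefixes for a given base URL.

    Hypothetical helper for illustration only.
    """
    stripped = base_url.strip("/")
    # Normalize to a prefix with exactly one leading and one trailing slash.
    base = f"/{stripped}/" if stripped else "/"
    return {
        "user": f"{base}user/{username}/",
        "hub_api": f"{base}hub/api/",
    }


urls = hub_urls("/jupyterhub", "n_kapp03-notebooks-lcp7toie")
print(urls["user"])
print(urls["hub_api"])
```

With everything under one prefix, a single ingress rule for /jupyterhub covers both the user servers and the Hub API, instead of only the Hub's own /hub/ paths.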


Yep. The path configs were indeed the problem.
As advised, I replaced

ingress:
  # (...)
  pathSuffix: 'hub'

with

hub:
    baseUrl: /jupyterhub

Works instantly!
Thank you @minrk for your precise help. Case closed.
