JTH over Azure DNS Zone

Hi. I have posted a couple of times about banging my head against getting Z2JH working with an Azure DNS zone. Here is my setup:

I have an Azure DNS under a:

  • Resource Group Variable: $DNS_NAME
  • Zone Variable: $HOST_ZONE
  • Record: $DOMAIN
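
Since every heredoc below expands these variables, an unset one would silently render as an empty string in the generated YAML. A minimal sketch of a pre-flight check, assuming a POSIX shell; `missing_vars` is a hypothetical helper name, not something from the Z2JH guide:

```shell
# Hypothetical helper: print the names of any unset or empty variables
# and return non-zero if at least one is missing.
missing_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    # Drop the leading space before printing the list.
    printf '%s\n' "${missing# }"
    return 1
  fi
}
```

It could be called before rendering any file, for example `missing_vars DNS_NAME HOST_ZONE DOMAIN DOMAIN_NAME || exit 1`.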

I can successfully create the cluster using the guidance and set up JupyterHub (JTH) with Helm. I have also installed cert-manager and nginx-ingress.

I can access the JTH via HTTP with the following configs:

#### Add Hub Images and users
cat << EOF > deploy/jupyterhub-config-v4-http.yaml
proxy:
  secretToken: $(cat .secret/secretToken$VERSION.txt)
  service:
    type: ClusterIP  # Key: use ClusterIP rather than LoadBalancer when using nginx
hub:
  config:
    JupyterHub:
      authenticator_class: dummy
      admin_access: true
    Authenticator:
      allowed_users:
        - user1
        - user2
    DummyAuthenticator:
      password: mypw
      admin_users:
        - admin1
singleuser:
  image:
    name: jupyter/all-spark-notebook
    tag: x86_64-ubuntu-22.04
  cmd: null
  cpu:
    limit: 1
    guarantee: 1
  memory:
    limit: 8G
    guarantee: 1G
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - $DOMAIN_NAME
EOF

helm upgrade \
	--cleanup-on-fail \
	--install "$CLUSTER_NAME-http" jupyterhub/jupyterhub \
	--namespace "$JUPYTERHUB_NAMESPACE" \
	--create-namespace \
	--version=3.3.7 \
	--values deploy/jupyterhub-config-v4-http.yaml
	
#### Let's test that we can access via the ip4 first
kubectl --namespace=jupyterhub get service proxy-public

INGRESS_IP=$(kubectl get services -n ingress-nginx -o jsonpath='{.items[?(@.metadata.name=="nginx-ingress-ingress-nginx-controller")].status.loadBalancer.ingress[0].ip}')
echo $INGRESS_IP

#### Using the external load balancer IP 
az network dns record-set a update \
  --zone-name $HOST_ZONE \
  --resource-group $DNS_NAME  \
  --name $DOMAIN \
  --set aRecords[0].ipv4Address=$INGRESS_IP
  
cat << EOF > deploy/ingress-http.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jupyterhub
  namespace: $JUPYTERHUB_NAMESPACE
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    # Changed with cody
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
  labels:
    app: jupyterhub
spec:
  ingressClassName: nginx
  rules:
  - host: $DOMAIN_NAME
    http:
      paths:
      - backend:
          service:
            name: proxy-public
            port:
              number: 80
        path: /
        pathType: Prefix
EOF

kubectl apply -f deploy/ingress-http.yaml

**However, when I deploy with HTTPS and verify that the certs are issued, I am unable to access the deployment externally. I have checked the firewalls for ports 80 and 443.** After digging a lot, the most significant error I have found is this:
[mydomain.com] acme: error: 400 :: urn:ietf:params:acme:error:connection :: 135.237.18.183: Fetching http://mydomain.com/.well-known/acme-challenge/dZAxnm4lyK1T9allEQ38sKaLSVVAng5b104Cmi4fLoI: Timeout during connect (likely firewall problem)

Claude tells me this is the ACME challenge failing due to a timeout.
I have run every check Claude could suggest, and they all point back to this.
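
One detail visible in the error text itself: the validator tried to fetch a URL under `/.well-known/acme-challenge/`, which is how an HTTP-01 challenge is validated, whereas a DNS-01 challenge is validated through a TXT record at `_acme-challenge.<domain>`. A small sketch for telling the two apart from an error message; `acme_challenge_type` is a hypothetical helper and only matches on those two well-known markers:

```shell
# Guess the ACME challenge type from a validation error message.
# HTTP-01 validation fetches http://<domain>/.well-known/acme-challenge/<token>;
# DNS-01 validation looks up a TXT record at _acme-challenge.<domain>.
acme_challenge_type() {
  case "$1" in
    *"/.well-known/acme-challenge/"*) echo "http-01" ;;
    *"_acme-challenge."*)             echo "dns-01" ;;
    *)                                echo "unknown" ;;
  esac
}
```

If the result is `http-01` while the issuer is meant to use DNS validation, it may be worth checking which component requested the certificate, for instance by listing cert-manager's challenges with `kubectl get challenges --all-namespaces`.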

My HTTPS Deploy is:

# Create Azure DNS secret for cert-manager
cat << EOF > deploy/azure-dns-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: azure-dns-config
  namespace: cert-manager
data:
  client-secret: $(echo -n "$APP_KEY" | base64)
stringData:
  clientID: $CLIENT_ID
  subscriptionID: $SUBSCRIPTION_ID
  tenantID: $TENANT_ID
  resourceGroup: $DNS_NAME
EOF

kubectl apply -f deploy/azure-dns-secret.yaml
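
A possible gotcha with the `data:` field above: GNU `base64` wraps its output at 76 characters, so a long secret would span several lines and break the YAML, and `echo -n` is not portable across shells. A hedged sketch of a single-line encoder, assuming `base64` and `tr` are available; `b64_oneline` is a hypothetical name:

```shell
# Hypothetical helper: base64-encode a string on one line.
# printf avoids the trailing newline that plain echo would add,
# and tr removes any line wrapping added by base64.
b64_oneline() {
  printf '%s' "$1" | base64 | tr -d '\n'
}
```

In the heredoc this would be used as `client-secret: $(b64_oneline "$APP_KEY")`. Putting the value under `stringData` instead would avoid the encoding step altogether.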

# Deploy for https
# Add Hub Images and users
cat << EOF > deploy/jupyterhub-config-v4-https.yaml
proxy:
  secretToken: $(cat .secret/secretToken$VERSION.txt)
  service:
    type: ClusterIP
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-internal: "false"
  https:
    enabled: true
    hosts:
      - $DOMAIN_NAME
    letsencrypt:
      contactEmail: $EMAIL
hub:
  config:
    JupyterHub:
      authenticator_class: dummy
      admin_access: true
    Authenticator:
      allowed_users:
        - user1
        - user2
    DummyAuthenticator:
      password: mypw
      admin_users:
        - admin1
singleuser:
  image:
    name: jupyter/all-spark-notebook
    tag: x86_64-ubuntu-22.04
  cmd: null
  cpu:
    limit: 1
    guarantee: 1
  memory:
    limit: 8G
    guarantee: 1G
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - $DOMAIN_NAME
  tls:
    - secretName: $CERT_NAME
      hosts:
        - $DOMAIN_NAME
EOF

helm upgrade \
	--cleanup-on-fail \
	--install "$CLUSTER_NAME-http" jupyterhub/jupyterhub \
	--namespace "$JUPYTERHUB_NAMESPACE" \
	--create-namespace \
	--version=3.3.7 \
	--values deploy/jupyterhub-config-v4-https.yaml

INGRESS_IP=$(kubectl get services -n ingress-nginx -o jsonpath='{.items[?(@.metadata.name=="nginx-ingress-ingress-nginx-controller")].status.loadBalancer.ingress[0].ip}')
echo $INGRESS_IP

# Using the external load balancer IP 
az network dns record-set a update \
  --zone-name $HOST_ZONE \
  --resource-group $DNS_NAME  \
  --name $DOMAIN \
  --set aRecords[0].ipv4Address=$INGRESS_IP
	
# Let's test that we can access via the ip4 first
kubectl --namespace=jupyterhub get service proxy-public

# Create ClusterIssuer for Let's Encrypt
cat << EOF > deploy/cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: $EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - dns01:
        azureDNS:
          clientID: $CLIENT_ID
          clientSecretSecretRef:
            name: azure-dns-config
            key: client-secret
          tenantID: $TENANT_ID
          subscriptionID: $SUBSCRIPTION_ID
          resourceGroupName: $DNS_NAME
          hostedZoneName: $HOST_ZONE
EOF

# The secret was created in the cert-manager namespace above, so look for it there
kubectl get secret azure-dns-config -o yaml -n cert-manager

kubectl apply -f deploy/cluster-issuer.yaml

# Create Certificate resource for JupyterHub
cat << EOF > deploy/certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: jupyterhub-cert
  namespace: $JUPYTERHUB_NAMESPACE
spec:
  secretName: $CERT_NAME
  dnsNames:
  - $DOMAIN_NAME
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  # Optional fields to specify certificate duration and renewal period
  duration: 2160h
  renewBefore: 720h
EOF

kubectl apply -f deploy/certificate.yaml -n $JUPYTERHUB_NAMESPACE

cat << EOF > deploy/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jupyterhub
  namespace: $JUPYTERHUB_NAMESPACE
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  labels:
    app: jupyterhub
spec:
  ingressClassName: nginx
  rules:
  - host: $DOMAIN_NAME
    http:
      paths:
      - backend:
          service:
            name: proxy-public
            port:
              name: http
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - $DOMAIN_NAME
    secretName: $CERT_NAME
EOF

kubectl apply -f deploy/ingress.yaml

It looks like you’re using Nginx ingress deployed separately, so presumably you’re requesting a certificate to go in front of that ingress? If so Z2JH shouldn’t affect the request, and you’re probably better off reading the cert-manager and/or Azure documentation rather than relying on an AI helper which can sometimes provide guidance, but not always.

Do you mean you can access JupyterHub via the domain name, or via the public IP?

What’s the purpose of the “Create Azure DNS secret for cert-manager” configuration? What is it used for?

Thanks for the reply :slight_smile: I went the route of exhaustively referencing the guidance via these sources first:

And these helped with some things not mentioned in the official doc:

https://test-zerotojh.readthedocs.io/en/pin-sphinx/microsoft/step-zero-azure.html#microsoft-azure
https://alan-turing-institute.github.io/hub23-deploy/azure-prereqs/key-vault/service-principals.html

In the end, with these I was only ever able to get access via the IPv4 address of the load balancer. So the approach in this page did not work, at least for me: Security — Zero to JupyterHub with Kubernetes documentation

Admittedly I am a novice with K8s setups and have few resources. I turned to this solution because I need a place for team collaboration and for developing quick proof-of-concept apps, so a single VM was not ideal. In the end, here are my requirements:

  • A Jupyter Hub K8 Server capable of supporting up to 10 users (This is complete and reproducible)
  • Azure Hosting
  • Use an Azure DNS Resource Group, Zone, RecordSet instead of a purchased Domain (This is set up).
  • Start with http access using RecordSet.Zone.com and the nginx controller (This works and is reproducible)
  • Secure with https with Azure DNS as RecordSet.Zone.com. This is where it is failing.

Also, I am uncertain of the benefits of using nginx (Advanced Topics — Zero to JupyterHub with Kubernetes documentation).

And finally of course if anyone has a repo with a complete set of YAML and code flow that works solely for an Azure setup that would be incredibly generous.

I’m not familiar with Azure. What exactly does this mean? Do you have a public domain, accessible from anywhere? This is a requirement for Let’s Encrypt certificates.
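
One way to check this from outside the cluster is to compare what a public resolver returns for the hostname with the ingress controller's external IP. A minimal sketch, assuming `dig` is installed; `ip_in_answers` is a hypothetical helper and the resolver address in the usage line is only an example:

```shell
# Hypothetical helper: succeed if the expected IP appears in a list of
# resolved addresses (one per line, as printed by `dig +short`).
ip_in_answers() {
  expected="$1"
  answers="$2"
  # -F: fixed string, -x: match the whole line, -q: no output.
  printf '%s\n' "$answers" | grep -Fxq -- "$expected"
}
```

For example: `ip_in_answers "$INGRESS_IP" "$(dig +short "$DOMAIN_NAME" A @8.8.8.8)" && echo "public DNS points at the ingress"`. If this fails, the record is either not publicly resolvable or points somewhere else.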