Experiences with GKE Autopilot?

Hi Everyone,

We’ve been running happily on GKE with two nodes on our cluster (core/user) for the past few years. For other projects we’ve been playing with GKE Autopilot, and the savings have been significant. Has anyone provisioned their hubs on GKE using Autopilot? Any experiences to share?

Thanks in advance!

Isabel


Hi Isabel,

Have you deployed on Autopilot since June? If so, could you share the YAML changes needed to deploy?

I have many JupyterHubs running on GKE and am only now testing Autopilot, but I'm not getting anywhere. I'm not sure what changes I need to make to the YAML. Helm seems to deploy the hub on Autopilot, but I am unable to log in. The error message I see is posted below:

Spawn failed: (403) Reason: error HTTP response headers: HTTPHeaderDict({'Audit-Id': '0fe43273-8bde-4757-89df-ba284cb99710', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Warning': '299 - "Autopilot set default resource requests on Pod test/jupyter-tony-5fcricelli for container block-cloud-metadata, as resource requests were not specified, and adjusted resource requests to meet requirements. See http://g.co/gke/autopilot-defaults and http://g.co/gke/autopilot-resources."', 'X-Kubernetes-Pf-Flowschema-Uid': '61916b1c-dc34-419b-a736-a3f47069e4c2', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'c64f4da2-2a0a-44e5-9037-0c3285757b46', 'Date': 'Sat, 06 Nov 2021 14:51:24 GMT', 'Content-Length': '1900'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"admission webhook \"validation.gatekeeper.sh\" denied the request: [denied by autogke-disallow-privilege] container

block-cloud-metadata is privileged; not allowed in Autopilot. Requesting user: system:serviceaccount:test:hub and groups:
[\"system:serviceaccounts\", \"system:serviceaccounts:test\", \"system:authenticated\"]
[denied by autogke-default-linux-capabilities] linux capability {\"NET_ADMIN\"} on container block-cloud-metadata not allowed; Autopilot only allows the capabilities:

[\"SETPCAP\", \"MKNOD\", \"AUDIT_WRITE\", \"CHOWN\", \"NET_RAW\", \"DAC_OVERRIDE\", \"FOWNER\", \"FSETID\", \"KILL\", \"SETGID\", \"SETUID\", \"NET_BIND_SERVICE\", \"SYS_CHROOT\", \"SETFCAP\"]
. Requesting user: system:serviceaccount:test:hub and groups: [\"system:serviceaccounts\", \"system:serviceaccounts:test\", \"system:authenticated\"] ","reason":"[denied by autogke-disallow-privilege] container

block-cloud-metadata is privileged; not allowed in Autopilot. Requesting user: system:serviceaccount:test:hub and groups: [\"system:serviceaccounts\", \"system:serviceaccounts:test\", \"system:authenticated\"]

 [denied by autogke-default-linux-capabilities] linux capability {\"NET_ADMIN\"} on container block-cloud-metadata not allowed;

Autopilot only allows the capabilities:
[\"SETPCAP\", \"MKNOD\", \"AUDIT_WRITE\", \"CHOWN\", \"NET_RAW\", \"DAC_OVERRIDE\", \"FOWNER\", \"FSETID\", \"KILL\", \"SETGID\", \"SETUID\", \"NET_BIND_SERVICE\", \"SYS_CHROOT\", \"SETFCAP\"]
. Requesting user: system:serviceaccount:test:hub and groups: [\"system:serviceaccounts\", \"system:serviceaccounts:test\", \"system:authenticated\"] ","code":403}

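The rejected container, block-cloud-metadata, is the init container the chart adds to block access to the cloud metadata server with iptables. Autopilot denies it because it runs privileged and needs NET_ADMIN. You can disable it using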

singleuser:
  cloudMetadata:
    blockWithIptables: false

This should be safe as long as you’ve got NetworkPolicies working for ingress and egress.
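
As a rough sketch of how the two pieces might sit together in the chart config, the fragment below keeps the setting above and also turns on the chart's NetworkPolicy for user pods. It assumes the Zero to JupyterHub option singleuser.networkPolicy.enabled; whether the resulting NetworkPolicy is actually enforced depends on the cluster, so treat this as something to verify rather than a confirmed fix.

```yaml
# Sketch only: combine the iptables opt-out with the chart's NetworkPolicy.
singleuser:
  cloudMetadata:
    # Skip the privileged block-cloud-metadata init container,
    # which Autopilot rejects (privileged, NET_ADMIN).
    blockWithIptables: false
  networkPolicy:
    # Assumption: the chart's NetworkPolicy for user pods is what
    # restricts egress instead. It only has an effect if the cluster
    # enforces NetworkPolicies.
    enabled: true
```

One way to check enforcement would be to start a user server and confirm from a terminal in it that requests to the metadata server do not get through.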


Brilliant! Thank you, I am up and running on autopilot. I will review the Network Policies link you posted. Thanks again!


Hi Manics, I don’t think NetworkPolicies will work, since it is stated here that they aren’t supported by Autopilot?

Hi All, I actually didn’t deploy with Autopilot as I wasn’t sure about stability. For me, the use of taints and required node affinity has been crucial in the past. Are you running the Autopilot hubs in production? How many users do you have, and how has the experience been? Do you use taints or node affinity? Thanks for any insight!