I am trying to use a Postgres instance on GCE for JupyterHub. At one point I had it working and the DB schema was created, but it now doesn’t connect and I can’t figure out what I did before that worked.
I have this in the config.yaml for helm:
cookieSecret: 6e91bf348949d39d96cd7bb963e7b9f040be28314146516c60566705d773f563
# url: postgresql://postgres:<redacted>@
# url: postgresql+psycopg2://postgres@
# url:
# url: postgresql+psycopg2://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>
upgrade: true
type: postgres
- ReadWriteMany
storage: 11Gi
and a workload identity yaml of this form:
serviceAccountName: jupyter-dev-ksa
iam.gke.io/gke-metadata-server-enabled: "true"
- name: gke-jupyterhub
# ... other container configuration
image: gcr.io/jupyterhub-373622/2sigma:2eb9a54
# This app listens on port 8080 for web traffic by default.
- containerPort: 8080
- name: PORT
value: "8080"
# This project uses environment variables to determine
# how you would like to run your application
# To use the Go connector (recommended) - use INSTANCE_CONNECTION_NAME (proj:region:instance)
# To use TCP - Setting INSTANCE_HOST will use TCP (e.g.,
# To use Unix, use INSTANCE_UNIX_SOCKET (e.g., /cloudsql/proj:region:instance)
value: ""
# value: jupyterhub-373622:us-central1:jupyter-dev
- name: DB_PORT
value: "5432"
- name: DB_USER
name: db-user-pass
key: username
- name: DB_PASS
name: db-user-pass
key: password
- name: DB_NAME
name: db-user-pass
key: database
# [END cloud_sql_proxy_k8s_secrets]
# [START cloud_sql_proxy_k8s_container]
- name: cloud-sql-proxy
# It is recommended to use the latest version of the Cloud SQL proxy
# Make sure to update on a regular schedule!
image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0 # make sure to use the latest version
# If connecting from a VPC-native GKE cluster, you can use the
# following flag to have the proxy connect over private IP
# - ip_address_types=PRIVATE"
# Enable structured logging with LogEntry format:
- "--structured-logs"
# Replace DB_PORT with the port the proxy should listen on
# Defaults: MySQL: 3306, Postgres: 5432, SQLServer: 1433
- "--port=5432"
# jupyterhub-373622:us-central1:jupyter-dev=tcp:"
- "-instances=projects/jupyterhub-373622/global/networks/default=tcp:5432"
# - "--credentials-file=/secrets/key.json"
# - "-enable_iam_login"
runAsNonRoot: true
memory: "2Gi"
cpu: "1"
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
I’m having trouble finding documentation to fill in a missing piece. I still don’t fully understand how the hub knows to use the sidecar, or how to verify that my ports and other settings are all OK. Removing the db url from the config.yaml causes the hub to use the default sqlite. Anything I add fails to connect with:
connection to server at “”, port 5432 failed: Connection refused
I’m not familiar with Postgres and GCE workload identities, but I’m guessing from your post that you run a special proxy container that connects to postgres using the identity, and JupyterHub connects to that container as if it were postgres? Assuming that’s the case, running the postgres proxy container inside the same pod as the main hub process (i.e. as a sidecar) means the hub should be able to connect to it on localhost. If it can’t, then it sounds like a problem with your postgres proxy container.
Are you using Z2JH or your own Helm chart? How exactly are you adding the postgres proxy container to the deployment/chart?
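On the question of verifying ports, a small sketch like the one below can tell whether anything is accepting TCP connections at a given address. It only assumes Python is available in the container where it runs; `port_is_open` is a hypothetical helper name, and the `127.0.0.1` / `5432` values in the usage note are assumptions based on the proxy's `--port=5432` setting.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Run from inside the hub container (for example through `kubectl exec`), `port_is_open("127.0.0.1", 5432)` returning `False` would suggest that nothing is listening on that port inside the pod, which matches a "Connection refused" error.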
The image was created with repo2docker and is just jupyter/base-notebook with a few additional packages installed for our plans moving forward.
Maybe adding the proxy container to the chart is the issue. I haven’t seen anything about that in the docs I’ve tried to follow.
As I mentioned, somehow this did work earlier. I can connect to the postgres db and see the users configured in the config.yaml, and the rest of the schema.
I know at one point I tried setting tcp and/or ports 5432:5432, but I seem to have lost exactly where and how, and I don’t know if that had anything to do with when it worked.
edit 3/14 - Now that I think about it, I did have other network entries I tried, and I also tried a connection to the private ip. Maybe something there worked and I didn’t realize it.
Maybe an alternate question, to anybody familiar with deploying on GCP Kubernetes in general: how can I get better visibility into my logs? I feel like I am working in the dark. Kubernetes and GCP are also completely new to me.
This is the entire config.yaml:
name: gcr.io/jupyterhub-373622/2sigma
# tag: python-3.10
tag: 8ae33d1
# `cmd: null` allows the custom CMD of the Jupyter docker-stacks to be used
# which performs further customization on startup.
cmd: null
cookieSecret: <redacted>
url: postgresql://postgres:@
upgrade: true
type: postgres
- ReadWriteMany
storage: 12Gi
client_id: <redacted>
client_secret: <redacted>
oauth_callback_url: http://jupyter-dev.2sigmaschool.org/hub/oauth_callback
- 2sigmaschool.org
login_service: 2Sigma School
authenticator_class: google
admin_access: true
- craig
- vishal
- student.one
This means the hub will try to connect to the postgresql proxy on localhost, so your proxy container must be in the hub pod. There’s no sign of this in your config.yaml.
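To see which host and port a given db url actually points at, the url can be split with the standard library. This is only a sketch: `db_target` is a hypothetical helper, and the second url in the usage note is an assumed example with made-up credentials and database name, not a value from this thread.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def db_target(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return (scheme, host, port) for a SQLAlchemy-style database URL.

    host and port are None when the URL does not specify them.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port
```

For `postgresql://postgres:@` this gives a host and port of `None`, so the address is left to the database driver's defaults. A url such as `postgresql+psycopg2://postgres:secret@127.0.0.1:5432/jupyterhub` names the host and port explicitly, which makes it easier to compare against what the proxy says it is listening on.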
The Z2JH docs contain some examples for debugging Z2JH K8s pods:
Normal Scheduled 4m57s default-scheduler Successfully assigned 2sigma/hub-5fbc86c44f-4gckq to gke-jupytercluster-default-pool-a6b95700-5kqv
Normal Pulled 4m56s kubelet Container image "gcr.io/jupyterhub-373622/2sigma:8ae33d1" already present on machine
Normal Created 4m56s kubelet Created container cloud-sql-proxy
Normal Started 4m56s kubelet Started container cloud-sql-proxy
Normal Pulled 4m56s kubelet Container image "gcr.io/jupyterhub-373622/2sigma:8ae33d1" already present on machine
Normal Created 4m56s kubelet Created container gke-jupyterhub
Normal Started 4m56s kubelet Started container gke-jupyterhub
Normal Pulled 4m56s kubelet Container image "jupyterhub/k8s-hub:2.0.0" already present on machine
Normal Created 4m56s kubelet Created container hub
Normal Started 4m56s kubelet Started container hub
Warning Unhealthy 4m31s (x16 over 4m55s) kubelet Readiness probe failed: Get "": dial tcp connect: connection refused
Now trying to connect to postgresql+psycopg2:// is timing out rather than getting rejected. Changing to any random ip or port also times out, though.
Entering something without the postgresql scheme makes the hub fall back to sqlite.
This seems like progress. I put the workload-identity info directly in the config.yaml and now I get an error that looks like my service account is either not used or is missing a permission. As far as I can see, the service account I think I’m using seems OK.
The cloud-sql-proxy is at the bottom of the included code snippet. I’m not sure how to dig deeper into the logs to see what auth is being attempted.
{"severity":"INFO","timestamp":"2023-03-21T03:58:03.467Z","message":"Authorizing with Application Default Credentials"}
{"severity":"ERROR","timestamp":"2023-03-21T03:58:03.829Z","message":"The proxy has encountered a terminal error: unable to start: failed to get instance: Refresh error: failed to get instance metadata (connection name = \"jupyterhub-373622:us-central1:jupyter-dev\"): googleapi: Error 403: The client is not authorized to make this request., notAuthorized"}
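Because the proxy writes one JSON object per line with `--structured-logs`, the output can be filtered by severity to surface just the failures. The sketch below uses the two log lines above as sample input; `filter_by_severity` and `SAMPLE_LINES` are hypothetical names, and the only fields it relies on are the `severity`, `timestamp` and `message` keys visible in those lines.

```python
import json

# The two proxy log lines quoted in this thread, one JSON object per line.
SAMPLE_LINES = [
    r'{"severity":"INFO","timestamp":"2023-03-21T03:58:03.467Z","message":"Authorizing with Application Default Credentials"}',
    r'{"severity":"ERROR","timestamp":"2023-03-21T03:58:03.829Z","message":"The proxy has encountered a terminal error: unable to start: failed to get instance: Refresh error: failed to get instance metadata (connection name = \"jupyterhub-373622:us-central1:jupyter-dev\"): googleapi: Error 403: The client is not authorized to make this request., notAuthorized"}',
]


def filter_by_severity(lines, severities=("ERROR",)):
    """Return parsed log entries whose "severity" field is in severities."""
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a structured (JSON) log line.
            continue
        if isinstance(entry, dict) and entry.get("severity") in severities:
            matches.append(entry)
    return matches


for entry in filter_by_severity(SAMPLE_LINES):
    print(entry["timestamp"], entry["message"])
```

The same function should work on any iterable of lines, so the output of `kubectl logs` for the `cloud-sql-proxy` container could be saved to a file and passed in as an open file object.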
I don’t know how to specify the service account now that I’ve put the proxy in extraContainers. Originally, following the document I found at Google, I had the service account in a yaml that I deployed as a separate pod, which had an “app” container and a proxy like below. The jupyter-dev-ksa was supposed to be an identity that represents the application in the GKE cluster.
{"severity":"INFO","timestamp":"2023-03-22T03:18:29.335Z","message":"Authorizing with Application Default Credentials"}
{"severity":"INFO","timestamp":"2023-03-22T03:18:30.266Z","message":"[jupyterhub-373622:us-central1:jupyter-dev] Listening on"}
{"severity":"INFO","timestamp":"2023-03-22T03:18:30.267Z","message":"The proxy has started successfully and is ready for new connections!"}
But I don’t know how to have the hub use it.
It seems to me that all I really need is my hub, which was running just fine with sqlite, and the extracontainer with the proxy that can use the jupyter-dev-ksa I created from following the Google document to connect to Postgres. Once the proxy connects, the hub just uses it by
I either need the hub in the sidecar or need the hub to use the sidecar. I don’t know how to get both to work at the same time.
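If the proxy runs as an extra container in the hub pod, the two share the pod's network, so getting the hub to use it should mostly come down to the db url pointing at the port the proxy listens on. A sketch of assembling such a url in the `postgresql+psycopg2://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>` form follows. `build_db_url` is a hypothetical helper, and the `127.0.0.1` and `5432` defaults are assumptions based on the proxy's `--port=5432` setting. Percent-encoding the credentials is added here because characters such as `@` or `/` in a password would otherwise break the url.

```python
from urllib.parse import quote


def build_db_url(user, password, database, host="127.0.0.1", port=5432):
    """Build a postgresql+psycopg2 URL, percent-encoding the credentials."""
    return "postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        database=quote(database, safe=""),
    )
```

The resulting string is what would go in the `url:` entry next to `type: postgres` in the config.yaml.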