JupyterHub deployed via Helm: hub pod fails its health check

Hi!

I started JupyterHub with Helm using the default parameters (the PV is created), but the hub pod has an error.
The hub’s state changes from Running to CrashLoopBackOff.
kubectl describe shows the hub’s health check failing.
kubectl logs shows that the database failed to initialize.
What should I do?

Specific error messages:

# kubectl describe pod hub-6f985cc46-8m9w8 -n limy-test
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  59m                    default-scheduler  Successfully assigned limy-test/hub-6f985cc46-8m9w8 to k8s-node
  Normal   Pulled     58m (x4 over 59m)      kubelet            Container image "jupyterhub/k8s-hub:1.1.3" already present on machine
  Normal   Created    58m (x4 over 59m)      kubelet            Created container hub
  Normal   Started    58m (x4 over 59m)      kubelet            Started container hub
  Warning  Unhealthy  58m (x5 over 59m)      kubelet            Readiness probe failed: Get "http://10.244.113.185:8081/hub/health": dial tcp 10.244.113.185:8081: connect: connection refused
  Warning  BackOff    4m30s (x272 over 59m)  kubelet            Back-off restarting failed container
# kubectl logs hub-6f985cc46-8m9w8 -n limy-test
Loading /usr/local/etc/jupyterhub/secret/values.yaml
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
[I 2021-09-24 03:34:03.728 JupyterHub app:2459] Running JupyterHub version 1.4.2
[I 2021-09-24 03:34:03.728 JupyterHub app:2489] Using Authenticator: jupyterhub.auth.DummyAuthenticator-1.4.2
[I 2021-09-24 03:34:03.728 JupyterHub app:2489] Using Spawner: kubespawner.spawner.KubeSpawner-1.1.0
[I 2021-09-24 03:34:03.728 JupyterHub app:2489] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.4.2
[E 2021-09-24 03:34:03.742 JupyterHub app:2969]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/app.py", line 2966, in launch_instance_async
        await self.initialize(argv)
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/app.py", line 2501, in initialize
        self.init_db()
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/app.py", line 1703, in init_db
        dbutil.upgrade_if_needed(self.db_url, log=self.log)
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/dbutil.py", line 112, in upgrade_if_needed
        orm.check_db_revision(engine)
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/orm.py", line 771, in check_db_revision
        current_table_names = set(inspect(engine).get_table_names())
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/inspection.py", line 64, in inspect
        ret = reg(subject)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/reflection.py", line 182, in _engine_insp
        return Inspector._construct(Inspector._init_engine, bind)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/reflection.py", line 117, in _construct
        init(self, bind)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/reflection.py", line 128, in _init_engine
        engine.connect().close()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 3165, in connect
        return self._connection_cls(self, close_with_result=close_with_result)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 96, in __init__
        else engine.raw_connection()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 3244, in raw_connection
        return self._wrap_pool_connect(self.pool.connect, _connection)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 3214, in _wrap_pool_connect
        Connection._handle_dbapi_exception_noconnection(
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 2068, in _handle_dbapi_exception_noconnection
        util.raise_(
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 207, in raise_
        raise exception
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 3211, in _wrap_pool_connect
        return fn()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 307, in connect
        return _ConnectionFairy._checkout(self)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 767, in _checkout
        fairy = _ConnectionRecord.checkout(pool)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 425, in checkout
        rec = pool._do_get()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/impl.py", line 256, in _do_get
        return self._create_connection()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 253, in _create_connection
        return _ConnectionRecord(self)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 368, in __init__
        self.__connect()
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 611, in __connect
        pool.logger.debug("Error on connect(): %s", e)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
        compat.raise_(
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 207, in raise_
        raise exception
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 605, in __connect
        connection = pool._invoke_creator(self)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/create.py", line 578, in connect
        return dialect.connect(*cargs, **cparams)
      File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 584, in connect
        return self.dbapi.connect(*cargs, **cparams)
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
    (Background on this error at: http://sqlalche.me/e/14/e3q8)
# kubectl get pod -n limy-test -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
continuous-image-puller-clqw2     1/1     Running   0          83m   10.244.235.209   k8s-master   <none>           <none>
continuous-image-puller-m7ff7     1/1     Running   0          83m   10.244.113.180   k8s-node     <none>           <none>
hub-6f985cc46-8m9w8               0/1     Error     20         78m   10.244.113.185   k8s-node     <none>           <none>
proxy-66bb55984f-bnfts            1/1     Running   0          83m   10.244.113.182   k8s-node     <none>           <none>
user-scheduler-65b559c7c9-6bmhb   1/1     Running   0          83m   10.244.113.183   k8s-node     <none>           <none>
user-scheduler-65b559c7c9-xgnd8   1/1     Running   0          83m   10.244.113.184   k8s-node     <none>           <none>

config.yaml file information:

hub:
  config:
    JupyterHub:
      admin_access: true
      admin_users:
        - limy
        - root
prePuller:
  hook:
    enabled: false

Helm version information:

[root@k8s-master kubespawner]# helm version
version.BuildInfo{Version:"v3.7.0", GitCommit:"eeac83883cb4014fe60267ec6373570374ce770b", GitTreeState:"clean", GoVersion:"go1.16.8"}

Kubernetes version information:

[root@k8s-master kubespawner]# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

It sounds like JupyterHub is unable to create a database file, which suggests there’s a problem with your storage. For instance, the permissions may be incorrect.

You’ll have to investigate your Kubernetes cluster and storage provider to see if there’s a solution. If you tell us how your K8s cluster was set up, with as much information as possible, we might be able to help.
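
As a first check, it’s worth confirming that the hub’s claim actually bound to a volume. A minimal sketch, assuming the chart’s default PVC name hub-db-dir and your limy-test namespace (the PVC should show Bound, and the describe events will surface any binding or provisioning errors):

# kubectl get pvc hub-db-dir -n limy-test
# kubectl describe pvc hub-db-dir -n limy-test
# kubectl get pv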

Thank you for your answer.

Here is my PV and PVC setup.

My PV file:

# cat test-jupyterhub-helm-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-pv-helm-jupyter-nfs
  namespace: limy-test
  labels:
    name: test-pv-helm-jupyter-nfs
    storetype: nfs
spec:
  storageClassName: "helm-jupyterhub"
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: "1Gi"
  nfs:
    path: /opt/kubespawner/helm/jupyterhub-volumn
    server: 191.168.6.1

PVC deployed by Helm:

# helm get manifest v3.7.0 -n limy-test
# Source: jupyterhub/templates/hub/pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hub-db-dir
  labels:
    component: hub
    app: jupyterhub
    release: v3.7.0
    chart: jupyterhub-1.1.3
    heritage: Helm
spec:
  storageClassName: "helm-jupyterhub"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: "1Gi"

The db section of the chart’s values.yaml:

  db:
    type: sqlite-pvc
    upgrade:
    pvc:
      annotations: {}
      selector: {}
      accessModes:
        - ReadWriteOnce
      storage: 1Gi
      subPath:
      storageClassName: helm-jupyterhub
    url:
    password:

My guess is your NFS dynamic provisioner doesn’t set the permissions on the volume to allow the JupyterHub user to write to it. If you read the docs for your storage provisioner, you may find something about configuring it to make the volume writable by a different user/group ID. Otherwise, the easiest option is probably to log in to your NFS server and chown the directory.
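
For example, a minimal sketch run on the NFS server itself, assuming the hub container runs as UID/GID 1000 (the default user in the jupyterhub/k8s-hub image) and using the export path from the PV above:

# chown -R 1000:1000 /opt/kubespawner/helm/jupyterhub-volumn
# chmod -R u+rwX /opt/kubespawner/helm/jupyterhub-volumn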


Thank you for your answer! @manics

I modified the permissions of the data volume, and the database now initializes successfully.
But the hub still won’t come up: it can’t connect to the proxy, even though the proxy pod has started.
My Kubernetes environment was installed with kubeadm.

I referred to the thread below and restarted Calico, but the result is the same.

Kubernetes - Api_request to the proxy failed with status code 599, retrying
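
For reference, the restart was along these lines (assuming Calico runs as the standard calico-node DaemonSet in kube-system, as in the pod listing further down):

# kubectl rollout restart daemonset/calico-node -n kube-system
# kubectl rollout status daemonset/calico-node -n kube-system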

The hub error is as follows:

# kubectl logs hub-58d9d8bc57-7wvgw -n limy-test
Loading /usr/local/etc/jupyterhub/secret/values.yaml
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
[I 2021-09-29 11:18:40.895 JupyterHub app:2459] Running JupyterHub version 1.4.2
[I 2021-09-29 11:18:40.896 JupyterHub app:2489] Using Authenticator: jupyterhub.auth.DummyAuthenticator-1.4.2
[I 2021-09-29 11:18:40.896 JupyterHub app:2489] Using Spawner: kubespawner.spawner.KubeSpawner-1.1.0
[I 2021-09-29 11:18:40.896 JupyterHub app:2489] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.4.2
[W 2021-09-29 11:18:40.969 JupyterHub app:1793]
    JupyterHub.admin_users is deprecated since version 0.7.2.
    Use Authenticator.admin_users instead.
[I 2021-09-29 11:20:46.943 JupyterHub app:1838] Not using allowed_users. Any authenticated user will be allowed.
[I 2021-09-29 11:20:47.023 JupyterHub app:2526] Initialized 0 spawners in 0.005 seconds
[I 2021-09-29 11:20:47.032 JupyterHub app:2738] Not starting proxy
[W 2021-09-29 11:21:07.057 JupyterHub proxy:851] api_request to the proxy failed with status code 599, retrying...
[W 2021-09-29 11:21:27.251 JupyterHub proxy:851] api_request to the proxy failed with status code 599, retrying...
[E 2021-09-29 11:21:27.251 JupyterHub app:2969]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/app.py", line 2967, in launch_instance_async
        await self.start()
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/app.py", line 2742, in start
        await self.proxy.get_all_routes()
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/proxy.py", line 898, in get_all_routes
        resp = await self.api_request('', client=client)
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/proxy.py", line 862, in api_request
        result = await exponential_backoff(
      File "/usr/local/lib/python3.8/dist-packages/jupyterhub/utils.py", line 184, in exponential_backoff
        raise TimeoutError(fail_message)
    TimeoutError: Repeated api_request to proxy path "" failed.


# kubectl describe pod hub-58d9d8bc57-7wvgw -n limy-test
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  5m46s                  default-scheduler  Successfully assigned limy-test/hub-58d9d8bc57-7wvgw to k8s-node
  Normal   Pulled     5m46s                  kubelet            Container image "jupyterhub/k8s-hub:1.1.3" already present on machine
  Normal   Created    5m45s                  kubelet            Created container hub
  Normal   Started    5m45s                  kubelet            Started container hub
  Warning  Unhealthy  45s (x108 over 5m45s)  kubelet            Readiness probe failed: Get "http://10.244.113.130:8081/hub/health": dial tcp 10.244.113.130:8081: connect: connection refused

JupyterHub services:

# kubectl get service -n limy-test -o wide
NAME           TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE     SELECTOR
hub            NodePort   10.97.5.21       <none>        8081:32714/TCP   5m59s   app=jupyterhub,component=hub,release=v3.7.0
proxy-api      NodePort   10.105.53.7      <none>        8001:30518/TCP   5m59s   app=jupyterhub,component=proxy,release=v3.7.0
proxy-public   NodePort   10.111.101.141   <none>        80:32561/TCP     5m59s   component=proxy,release=v3.7.0

JupyterHub pods:

# kubectl get pod -n limy-test
NAME                              READY   STATUS             RESTARTS   AGE     IP               NODE         NOMINATED NODE   READINESS GATES
continuous-image-puller-5h98k     1/1     Running            0          4m20s   10.244.235.193   k8s-master   <none>           <none>
continuous-image-puller-z5smk     1/1     Running            0          4m20s   10.244.113.131   k8s-node     <none>           <none>
hub-558b6485-2q6g5                0/1     CrashLoopBackOff   3          4m20s   10.244.113.133   k8s-node     <none>           <none>
proxy-ccd5f79bc-s85nz             1/1     Running            0          4m20s   10.244.113.132   k8s-node     <none>           <none>
user-scheduler-65b559c7c9-jncct   1/1     Running            0          4m20s   10.244.113.135   k8s-node     <none>           <none>
user-scheduler-65b559c7c9-mkgvv   1/1     Running            0          4m20s   10.244.113.134   k8s-node     <none>           <none>

Kubernetes pods:

# kubectl get pod -n kube-system -o wide
NAME                                     READY   STATUS    RESTARTS   AGE   IP                NODE         NOMINATED NODE   READINESS GATES
calico-kube-controllers-8db96c76-44fqq   1/1     Running   0          17m   10.244.113.129    k8s-node     <none>           <none>
calico-node-2rs9d                        1/1     Running   0          17m   191.168.6.2       k8s-node     <none>           <none>
calico-node-qk2kb                        1/1     Running   0          17m   191.168.6.1       k8s-master   <none>           <none>
coredns-558bd4d5db-tct5q                 1/1     Running   3          50d   192.168.235.206   k8s-master   <none>           <none>
coredns-558bd4d5db-tmqww                 1/1     Running   3          50d   192.168.235.208   k8s-master   <none>           <none>
etcd-k8s-master                          1/1     Running   4          50d   191.168.6.1       k8s-master   <none>           <none>
kube-apiserver-k8s-master                1/1     Running   2          16d   191.168.6.1       k8s-master   <none>           <none>
kube-controller-manager-k8s-master       1/1     Running   8          50d   191.168.6.1       k8s-master   <none>           <none>
kube-proxy-dhdh6                         1/1     Running   4          39d   191.168.6.2       k8s-node     <none>           <none>
kube-proxy-m7dgl                         1/1     Running   3          39d   191.168.6.1       k8s-master   <none>           <none>
kube-scheduler-k8s-master                1/1     Running   7          50d   191.168.6.1       k8s-master   <none>           <none>
metrics-server-68b8ffb4c9-7ftzj          1/1     Running   2          12d   192.168.235.217   k8s-master   <none>           <none>

It sounds like you may have a problem with your Kubernetes networking. Can you redeploy with debug logging enabled in Z2JH, disable NetworkPolicies, and manually try to connect to the proxy service from a manually run pod? For example:
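
A minimal sketch of what that could look like. debug.enabled and the networkPolicy flags are standard chart options, the busybox pod is just a throwaway test client, and v3.7.0 is the release name used in this thread (this assumes the chart repo was added under the jupyterhub alias):

# extra entries for config.yaml
debug:
  enabled: true
hub:
  networkPolicy:
    enabled: false
proxy:
  chp:
    networkPolicy:
      enabled: false

# helm upgrade v3.7.0 jupyterhub/jupyterhub -n limy-test -f config.yaml
# kubectl run -it --rm nettest --image=busybox -n limy-test -- wget -O- http://proxy-api:8001/api/routes

Even a 403 response from the last command is informative: it proves the proxy API is reachable (the API requires an auth token), while a timeout or connection refused points at a networking problem.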

Kubernetes can throw this error at you when automatic HTTPS fails.
I can’t say with certainty that this will help you, but if you have HTTPS enabled, I’d recommend deleting your autohttps pod. It will be recreated, which renews your SSL certificate. If automatic HTTPS was the problem, this will fix it.
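
A sketch of that delete, assuming the chart’s default component label and the namespace used in this thread:

# kubectl delete pod -n limy-test -l component=autohttps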

For more details, see the end of this post: Trouble getting HTTPS / letsencrypt working with 0.9.0-beta.4 - #5 by matthew.brett

@meng-yu-github
Hey! I have the same issue! Did you manage to solve it?