The JupyterHub pod is failing

Hello,

I have an instance of JupyterHub deployed using ODH (Open Data Hub). The pods had been running fine for the past couple of months, but for the last two days they have been restarting automatically. The errors I can see in the previous logs of the container are:

[C 2022-09-13 05:09:29.959 JupyterHub app:2937] Received signal SIGTERM, initiating shutdown...
[I 2022-09-13 05:09:29.960 JupyterHub app:2573] Cleaning up 2 services...
[E 2022-09-13 05:09:49.269 JupyterHub ioloop:761] Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7f876c08af98>>, <Task finished coro=<JupyterHub.check_services_health() done, defined at /opt/app-root/lib/python3.6/site-packages/jupyterhub/app.py:2090> exception=AttributeError("'NoneType' object has no attribute 'proto'",)>)
    Traceback (most recent call last):
      File "/opt/app-root/lib/python3.6/site-packages/tornado/ioloop.py", line 741, in _run_callback
        ret = callback()
      File "/opt/app-root/lib/python3.6/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
        future.result()
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/app.py", line 2096, in check_services_health
        await Server.from_orm(service.orm.server).wait_up(timeout=1)
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/objects.py", line 116, in from_orm
        return cls(orm_server=orm_server)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1000, in __init__
        super_kwargs[key] = value
      File "/opt/rh/rh-python36/root/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
        next(self.gen)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1131, in hold_trait_notifications
        self.notify_change(change)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1176, in notify_change
        c(change)
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/objects.py", line 131, in _orm_server_changed
        self.proto = obj.proto
    AttributeError: 'NoneType' object has no attribute 'proto'

I have also attached the complete log file of the container for reference.

Can anyone please help me resolve this issue? A lot of users are facing problems because of it.

I don’t see an attachment. Maybe share via a pastebin/gist?

What version of JupyterHub and what Spawner?

-Min

I missed the attachment in the initial post. The logs are as follows:

Waiting to become leader...
Assigned as new leader
+ trap 'kill -TERM $PID' TERM INT
+ PID=36
+ wait 36
+ start-jupyterhub.sh
+ set -eo pipefail
+ PATH=/opt/app-root/bin:/opt/rh/rh-python36/root/usr/bin:/opt/rh/rh-nodejs10/root/usr/bin:/opt/rh/httpd24/root/usr/bin:/opt/rh/httpd24/root/usr/sbin:/opt/app-root/src/.local/bin/:/opt/app-root/src/bin:/opt/app-root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/app-root/scripts
+ . /opt/app-root/etc/jupyterhub_config.sh
++ '[' x '!=' x ']'
++ '[' -f /opt/app-root/src/.jupyter/jupyterhub_config.sh ']'
++ '[' -f /opt/app-root/configs/jupyterhub_config.sh ']'
+ jupyterhub upgrade-db -f /opt/app-root/etc/jupyterhub_config.py
+ exec jupyterhub -f /opt/app-root/etc/jupyterhub_config.py
[I 2022-09-13 05:07:38.152 JupyterHub app:2459] Running JupyterHub version 1.4.2
[I 2022-09-13 05:07:38.152 JupyterHub app:2490] Using Authenticator: oauthenticator.openshift.OpenShiftOAuthenticator-14.0.1dev
[I 2022-09-13 05:07:38.152 JupyterHub app:2490] Using Spawner: builtins.OpenShiftSpawner
[I 2022-09-13 05:07:38.153 JupyterHub app:2490] Using Proxy: jupyterhub_traefik_proxy.toml_configmap.TraefikTomlConfigmapProxy-0+untagged.320.gb468da0
[I 2022-09-13 05:07:38.154 JupyterHub app:1530] Loading cookie_secret from env[JPY_COOKIE_SECRET]
[I 2022-09-13 05:07:38.430 JupyterHub provider:576] Updating oauth client service-jsp-api
[I 2022-09-13 05:07:38.846 JupyterHub app:2529] Initialized 0 spawners in 0.004 seconds
[I 2022-09-13 05:07:38.848 JupyterHub app:2738] Not starting proxy
[I 2022-09-13 05:07:38.848 JupyterHub app:2774] Hub API listening on http://0.0.0.0:8081/hub/
[I 2022-09-13 05:07:38.848 JupyterHub app:2776] Private Hub API connect url http://jupyterhub:8081/hub/
[I 2022-09-13 05:07:38.848 JupyterHub app:2789] Starting managed service jsp-api at http://jupyterhub:8181
[I 2022-09-13 05:07:38.849 JupyterHub service:339] Starting service 'jsp-api': ['jupyterhub-singleuser-profiles-api']
[I 2022-09-13 05:07:38.851 JupyterHub service:121] Spawning jupyterhub-singleuser-profiles-api
 * Serving Flask app "jupyterhub_singleuser_profiles.api.api" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
/opt/app-root/lib/python3.6/site-packages/jupyterhub_singleuser_profiles/openshift.py:65: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  result = yaml.load(config_map.data[key_name])
 * Running on http://0.0.0.0:8181/ (Press CTRL+C to quit)
10.128.4.1 - - [13/Sep/2022 05:07:49] "GET /services/jsp-api/ HTTP/1.1" 404 -
[I 2022-09-13 05:07:49.241 JupyterHub app:2789] Starting managed service idle-culler
[I 2022-09-13 05:07:49.241 JupyterHub service:339] Starting service 'idle-culler': ['/opt/app-root/bin/python3', '-m', 'jupyterhub_idle_culler', '--timeout=3600']
[I 2022-09-13 05:07:49.244 JupyterHub service:121] Spawning /opt/app-root/bin/python3 -m jupyterhub_idle_culler --timeout=3600
[I 2022-09-13 05:07:49.265 JupyterHub app:2798] Adding external service prometheus
[I 2022-09-13 05:07:49.266 JupyterHub proxy:347] Checking routes
[I 2022-09-13 05:07:49.267 JupyterHub app:2849] JupyterHub is now running at http://:8080/
[I 2022-09-13 05:07:49.465 JupyterHub log:189] 200 GET /hub/api/users (idle-culler@127.0.0.1) 55.48ms
[I 2022-09-13 05:08:40.063 JupyterHub log:189] 302 GET / -> /hub/ (@127.0.0.1) 0.76ms
[I 2022-09-13 05:08:47.610 JupyterHub log:189] 302 GET / -> /hub/ (@127.0.0.1) 0.65ms
[I 2022-09-13 05:08:57.899 JupyterHub log:189] 302 GET / -> /hub/ (@127.0.0.1) 0.56ms
[I 2022-09-13 05:09:17.428 JupyterHub log:189] 302 GET / -> /hub/ (@127.0.0.1) 0.60ms
[I 2022-09-13 05:09:28.068 JupyterHub log:189] 302 GET / -> /hub/ (@127.0.0.1) 0.53ms
++ kill -TERM 36
+ trap - TERM INT
+ wait 36
[C 2022-09-13 05:09:29.959 JupyterHub app:2937] Received signal SIGTERM, initiating shutdown...
[I 2022-09-13 05:09:29.960 JupyterHub app:2573] Cleaning up 2 services...
[E 2022-09-13 05:09:49.269 JupyterHub ioloop:761] Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7f876c08af98>>, <Task finished coro=<JupyterHub.check_services_health() done, defined at /opt/app-root/lib/python3.6/site-packages/jupyterhub/app.py:2090> exception=AttributeError("'NoneType' object has no attribute 'proto'",)>)
    Traceback (most recent call last):
      File "/opt/app-root/lib/python3.6/site-packages/tornado/ioloop.py", line 741, in _run_callback
        ret = callback()
      File "/opt/app-root/lib/python3.6/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
        future.result()
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/app.py", line 2096, in check_services_health
        await Server.from_orm(service.orm.server).wait_up(timeout=1)
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/objects.py", line 116, in from_orm
        return cls(orm_server=orm_server)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1000, in __init__
        super_kwargs[key] = value
      File "/opt/rh/rh-python36/root/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
        next(self.gen)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1131, in hold_trait_notifications
        self.notify_change(change)
      File "/opt/app-root/lib/python3.6/site-packages/traitlets/traitlets.py", line 1176, in notify_change
        c(change)
      File "/opt/app-root/lib/python3.6/site-packages/jupyterhub/objects.py", line 131, in _orm_server_changed
        self.proto = obj.proto
    AttributeError: 'NoneType' object has no attribute 'proto'

[I 2022-09-13 05:09:50.389 JupyterHub app:2585] Leaving single-user servers running
[I 2022-09-13 05:09:50.389 JupyterHub app:2593] I didn't start the proxy, I can't clean it up
[I 2022-09-13 05:09:50.389 JupyterHub app:2610] ...done
+ STATUS=0
+ exit 0

Hello, sorry, the attachment was missed. I have now provided the logs in the reply above.

JupyterHub version: 1.3.0
Spawner: OpenShiftSpawner

JupyterHub 1.4.2 is fairly old, and there’s a good chance the logged error is a bug that has since been fixed. As far as I can see, though, this error only occurs while the Hub is shutting down, so it shouldn’t have any consequences while the Hub is running.

The log line "Received signal SIGTERM, initiating shutdown..." means that something external is killing JupyterHub. Maybe an Out-of-Memory killer? I don’t think it is anything in JupyterHub itself, which is just shutting down because something else told it to.
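The ordering of events can be checked directly from the log timestamps. The following is a minimal sketch, assuming the log prefix format seen in this thread (`[LEVEL date time JupyterHub origin] message`); the function names and the `SAMPLE` list are made up for illustration, and the sample lines are shortened copies of lines from the posted log.

```python
import re
from datetime import datetime

# Matches prefixes such as "[C 2022-09-13 05:09:29.959 JupyterHub app:2937] ..."
LOG_LINE = re.compile(
    r"^\[(?P<level>[DIWEC]) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"JupyterHub (?P<origin>\S+)\] (?P<msg>.*)$"
)


def parse(lines):
    """Yield (timestamp, level, message) for each line with a JupyterHub prefix."""
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
            yield ts, m["level"], m["msg"]


def shutdown_summary(lines):
    """Report uptime until SIGTERM and any error/critical lines logged before it."""
    records = list(parse(lines))
    started = records[0][0]
    sigterm = next(ts for ts, _, msg in records if "Received signal SIGTERM" in msg)
    errors_before = [
        msg for ts, level, msg in records
        if level in ("E", "C") and ts < sigterm
    ]
    return {
        "uptime_seconds": (sigterm - started).total_seconds(),
        "errors_before_sigterm": errors_before,
    }


# Shortened lines from the log in this thread; unprefixed lines are skipped.
SAMPLE = [
    "[I 2022-09-13 05:07:38.152 JupyterHub app:2459] Running JupyterHub version 1.4.2",
    "[I 2022-09-13 05:07:49.267 JupyterHub app:2849] JupyterHub is now running at http://:8080/",
    "++ kill -TERM 36",
    "[C 2022-09-13 05:09:29.959 JupyterHub app:2937] Received signal SIGTERM, initiating shutdown...",
    "[E 2022-09-13 05:09:49.269 JupyterHub ioloop:761] Exception in callback functools.partial(...)",
]

if __name__ == "__main__":
    print(shutdown_summary(SAMPLE))
```

On these lines the summary shows the Hub running for roughly 112 seconds with no error or critical entries before SIGTERM arrives, which is consistent with the traceback being a side effect of the shutdown rather than its cause.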

Hello,

JupyterHub is shutting down on its own, without any external event, which seems very unusual because it had been working for more than a year until now.

I don’t know how to debug what’s sending SIGTERM, since the sender is external to JupyterHub rather than part of it. The logs here show that something is sending JupyterHub SIGTERM, and the Hub shuts down because it is following the instruction it’s been given. Figuring out what’s sending SIGTERM to JupyterHub is the crux here.
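One place to start looking, since the Hub runs in a pod: Kubernetes and OpenShift record how the previous instance of each container ended in the pod's status, under `status.containerStatuses[].lastState.terminated`. The sketch below assumes the pod JSON has been saved first (for example with `oc get pod <pod-name> -o json`) and only reads those standard status fields. The function name and the sample values are hypothetical and are not taken from this deployment. The pod's events (`oc describe pod <pod-name>`) may also say why the container was stopped.

```python
import json


def last_terminations(pod):
    """Return one dict per container describing how its previous instance ended."""
    results = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # lastState is empty if the container has never been restarted
        terminated = cs.get("lastState", {}).get("terminated")
        results.append({
            "container": cs["name"],
            "restarts": cs.get("restartCount", 0),
            "reason": terminated.get("reason") if terminated else None,
            "exit_code": terminated.get("exitCode") if terminated else None,
            "finished_at": terminated.get("finishedAt") if terminated else None,
        })
    return results


# Hypothetical pod status, trimmed to the fields read above.
SAMPLE_POD = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "jupyterhub",
        "restartCount": 12,
        "lastState": {
          "terminated": {
            "reason": "Completed",
            "exitCode": 0,
            "finishedAt": "2022-09-13T05:09:50Z"
          }
        }
      }
    ]
  }
}
""")

if __name__ == "__main__":
    for entry in last_terminations(SAMPLE_POD):
        print(entry)
```

A reason of `OOMKilled` would point at memory limits, while a clean exit code after SIGTERM suggests the platform asked the container to stop, in which case the pod events are the next thing to read.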
