Z2jh 500 : Internal Server Error

I have installed z2jh on a local k8s cluster. Sometimes I get a 500 error; other times it works fine.
I’m using JupyterHub version 3.0.0 with Helm chart 2.0.0.

500 : Internal Server Error
The error was:
Failed to connect to Hub API at ‘http://hub:8081/hub/api’. Is the Hub accessible at this URL (from host: jupyter-hammad-20ali-20baig)?

Logs of pod proxy-779b595bff-hhmwm:

05:18:22.442 [ConfigProxy] debug: PROXY WEB /user/hammad%20ali%20baig/tree to http://10.84.159.242:8888
05:18:22.447 [ConfigProxy] debug: Not recording activity for status 302 on /user/hammad ali baig
05:18:22.453 [ConfigProxy] debug: PROXY WEB /hub/api/oauth2/authorize to http://hub:8081
05:18:22.468 [ConfigProxy] debug: Not recording activity for status 302 on /
05:18:22.475 [ConfigProxy] debug: PROXY WEB /user/hammad%20ali%20baig/oauth_callback to http://10.84.159.242:8888
05:18:42.502 [ConfigProxy] debug: Not recording activity for status 500 on /user/hammad ali baig
05:18:42.541 [ConfigProxy] debug: PROXY WEB /user/hammad%20ali%20baig/custom/custom.css to http://10.84.159.242:8888
05:18:42.545 [ConfigProxy] debug: Not recording activity for status 304 on /user/hammad ali baig

Logs of pod hub-788d5c45f9-bf2q2:

[D 2023-10-10 05:16:07.024 JupyterHub reflector:281] Connecting pods watcher
[D 2023-10-10 05:16:08.558 JupyterHub log:186] 200 GET /hub/health (@10.12.11.87) 0.94ms
[D 2023-10-10 05:16:09.740 JupyterHub provider:415] Validating client id jupyterhub-user-hammad%20ali%20baig
[D 2023-10-10 05:16:09.741 oauthlib.oauth2.rfc6749.grant_types.authorization_code authorization_code:363] Validating redirection uri /user/hammad%20ali%20baig/oauth_callback for client jupyterhub-user-hammad%20ali%20baig.
[D 2023-10-10 05:16:09.741 oauthlib.oauth2.rfc6749.grant_types.base base:230] Using provided redirect_uri /user/hammad%20ali%20baig/oauth_callback
[D 2023-10-10 05:16:09.741 JupyterHub provider:490] validate_redirect_uri: client_id=jupyterhub-user-hammad%20ali%20baig, redirect_uri=/user/hammad%20ali%20baig/oauth_callback
[D 2023-10-10 05:16:09.742 oauthlib.oauth2.rfc6749.grant_types.base base:171] Validating access to scopes [‘access:servers!server=hammad ali baig/’, ‘read:users:name!user’, ‘read:users:groups!user’, ‘access:servers!user=hammad ali baig’] for client ‘jupyterhub-user-hammad%20ali%20baig’ (<OAuthClient(identifier=‘jupyterhub-user-hammad%20ali%20baig’)>).
[D 2023-10-10 05:16:09.743 JupyterHub provider:614] Allowing request for scope(s) for jupyterhub-user-hammad%20ali%20baig: access:servers!server=hammad ali baig/,read:users:name!user,read:users:groups!user,access:servers!user=hammad ali baig
[W 2023-10-10 05:16:09.743 JupyterHub auth:298] Service Server at /user/hammad%20ali%20baig/ requested scopes access:servers!server=hammad ali baig/,read:users:name!user,read:users:groups!user,access:servers!user=hammad ali baig for user hammad ali baig, granting only access:servers!server=hammad ali baig/,read:users:name!user,read:users:groups!user.
[D 2023-10-10 05:16:09.743 JupyterHub auth:305] Skipping oauth confirmation for <User(hammad ali baig 1/1 running)> accessing Server at /user/hammad%20ali%20baig/
[D 2023-10-10 05:16:09.744 oauthlib.oauth2.rfc6749.endpoints.authorization authorization:98] Dispatching response_type code request to <oauthlib.oauth2.rfc6749.grant_types.authorization_code.AuthorizationCodeGrant object at 0x7f8210c7b910>.
[D 2023-10-10 05:16:09.744 JupyterHub provider:415] Validating client id jupyterhub-user-hammad%20ali%20baig
[D 2023-10-10 05:16:09.744 oauthlib.oauth2.rfc6749.grant_types.authorization_code authorization_code:363] Validating redirection uri /user/hammad%20ali%20baig/oauth_callback for client jupyterhub-user-hammad%20ali%20baig.
[D 2023-10-10 05:16:09.744 oauthlib.oauth2.rfc6749.grant_types.base base:230] Using provided redirect_uri /user/hammad%20ali%20baig/oauth_callback
[D 2023-10-10 05:16:09.744 JupyterHub provider:490] validate_redirect_uri: client_id=jupyterhub-user-hammad%20ali%20baig, redirect_uri=/user/hammad%20ali%20baig/oauth_callback
[D 2023-10-10 05:16:09.745 oauthlib.oauth2.rfc6749.grant_types.base base:171] Validating access to scopes {‘access:servers!server=hammad ali baig/’, ‘read:users:name!user’, ‘read:users:groups!user’} for client ‘jupyterhub-user-hammad%20ali%20baig’ (<OAuthClient(identifier=‘jupyterhub-user-hammad%20ali%20baig’)>).
[D 2023-10-10 05:16:09.745 JupyterHub provider:614] Allowing request for scope(s) for jupyterhub-user-hammad%20ali%20baig: access:servers!server=hammad ali baig/,read:users:name!user,read:users:groups!user
[D 2023-10-10 05:16:09.746 oauthlib.oauth2.rfc6749.grant_types.authorization_code authorization_code:246] Pre resource owner authorization validation ok for <oauthlib.Request SANITIZED>.
[D 2023-10-10 05:16:09.746 oauthlib.oauth2.rfc6749.grant_types.authorization_code authorization_code:171] Created authorization code grant {‘code’: ‘1xqiS4fbfisDunAhMnlKFcyH1i1pIo’, ‘state’: ‘eyJ1dWlkIjogIjg4ZTk2OTI0MTBjOTRkYzRiNjJmMzMxYzZjZGEyNDNhIiwgIm5leHRfdXJsIjogIi91c2VyL2hhbW1hZCUyMGFsaSUyMGJhaWcvdHJlZSJ9’} for request <oauthlib.Request SANITIZED>.
[D 2023-10-10 05:16:09.746 oauthlib.oauth2.rfc6749.grant_types.authorization_code authorization_code:278] Saving grant {‘code’: ‘1xqiS4fbfisDunAhMnlKFcyH1i1pIo’, ‘state’: ‘eyJ1dWlkIjogIjg4ZTk2OTI0MTBjOTRkYzRiNjJmMzMxYzZjZGEyNDNhIiwgIm5leHRfdXJsIjogIi91c2VyL2hhbW1hZCUyMGFsaSUyMGJhaWcvdHJlZSJ9’} for <oauthlib.Request SANITIZED>.
[D 2023-10-10 05:16:09.746 JupyterHub provider:241] Saving authorization code jupyterhub-user-hammad%20ali%20baig, 1xq…, (), {}
[I 2023-10-10 05:16:09.749 JupyterHub log:186] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-hammad%2520ali%2520baig&redirect_uri=%2Fuser%2Fhammad%2520ali%2520baig%2Foauth_callback&response_type=code&state=[secret] → /user/hammad%20ali%20baig/oauth_callback?code=[secret]&state=[secret] (hammad ali baig@::ffff:10.12.11.85) 12.27ms
[D 2023-10-10 05:16:10.558 JupyterHub log:186] 200 GET /hub/health (@10.12.11.87) 1.32ms
[D 2023-10-10 05:16:10.559 JupyterHub log:186] 200 GET /hub/health (@10.12.11.87) 1.27ms

Logs of pod jupyter-hammad-20ali-20baig:

[E 2023-10-10 05:25:40.980 SingleUserNotebookApp mixins:449] Failed to connect to my Hub at http://hub:8081/hub/api (attempt 1/5). Is it running?
Traceback (most recent call last):
File “/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py”, line 447, in check_hub_version
resp = await client.fetch(self.hub_api_url)
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
[E 2023-10-10 05:25:40.987 SingleUserNotebookApp ioloop:761] Exception in callback functools.partial(<function _HTTPConnection.__init__.<locals>.<lambda> at 0x7fa62def3c10>, <Task finished name=‘Task-2’ coro=<_HTTPConnection.run() done, defined at /opt/conda/lib/python3.9/site-packages/tornado/simple_httpclient.py:293> exception=gaierror(-3, ‘Temporary failure in name resolution’)>)
Traceback (most recent call last):
File “/opt/conda/lib/python3.9/site-packages/tornado/ioloop.py”, line 741, in _run_callback
ret = callback()
File “/opt/conda/lib/python3.9/site-packages/tornado/simple_httpclient.py”, line 290, in <lambda>
gen.convert_yielded(self.run()), lambda f: f.result()
File “/opt/conda/lib/python3.9/site-packages/tornado/simple_httpclient.py”, line 338, in run
stream = await self.tcp_client.connect(
File “/opt/conda/lib/python3.9/site-packages/tornado/tcpclient.py”, line 265, in connect
addrinfo = await self.resolver.resolve(host, port, af)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 398, in resolve
result = await IOLoop.current().run_in_executor(
File “/opt/conda/lib/python3.9/concurrent/futures/thread.py”, line 52, in run
result = self.fn(*self.args, **self.kwargs)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 382, in _resolve_addr
addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
File “/opt/conda/lib/python3.9/socket.py”, line 953, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
[E 2023-10-10 05:25:52.998 SingleUserNotebookApp mixins:449] Failed to connect to my Hub at http://hub:8081/hub/api (attempt 2/5). Is it running?
Traceback (most recent call last):
File “/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py”, line 447, in check_hub_version
resp = await client.fetch(self.hub_api_url)
File “/opt/conda/lib/python3.9/site-packages/tornado/simple_httpclient.py”, line 338, in run
stream = await self.tcp_client.connect(
File “/opt/conda/lib/python3.9/site-packages/tornado/tcpclient.py”, line 265, in connect
addrinfo = await self.resolver.resolve(host, port, af)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 398, in resolve
result = await IOLoop.current().run_in_executor(
File “/opt/conda/lib/python3.9/concurrent/futures/thread.py”, line 52, in run
result = self.fn(*self.args, **self.kwargs)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 382, in _resolve_addr
addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
File “/opt/conda/lib/python3.9/socket.py”, line 953, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
[W 2023-10-10 05:25:57.007 SingleUserNotebookApp _version:70] jupyterhub version 3.0.0 != jupyterhub-singleuser version 1.4.1. This could cause failure to authenticate and result in redirect loops!
[I 2023-10-10 05:25:57.007 SingleUserNotebookApp notebookapp:2302] Serving notebooks from local directory: /home/jupyter
[I 2023-10-10 05:25:57.008 SingleUserNotebookApp notebookapp:2302] Jupyter Notebook 6.4.0 is running at:
[I 2023-10-10 05:25:57.008 SingleUserNotebookApp notebookapp:2302] http://jupyter-hammad-20ali-20baig:8888/user/hammad%20ali%20baig/
[I 2023-10-10 05:25:57.008 SingleUserNotebookApp notebookapp:2303] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2023-10-10 05:25:57.011 SingleUserNotebookApp mixins:556] Updating Hub with activity every 300 seconds
[I 2023-10-10 05:25:57.511 SingleUserNotebookApp log:189] 302 GET /user/hammad%20ali%20baig/ → /user/hammad%20ali%20baig/tree? (@10.12.11.83) 0.92ms
[I 2023-10-10 05:25:57.551 SingleUserNotebookApp log:189] 302 GET /user/hammad%20ali%20baig/ → /user/hammad%20ali%20baig/tree? (@::ffff:10.12.11.85) 0.80ms
[I 2023-10-10 05:25:57.561 SingleUserNotebookApp log:189] 302 GET /user/hammad%20ali%20baig/tree? → /hub/api/oauth2/authorize?client_id=jupyterhub-user-hammad%2520ali%2520baig&redirect_uri=%2Fuser%2Fhammad%2520ali%2520baig%2Foauth_callback&response_type=code&state=[secret] (@::ffff:10.12.11.85) 2.25ms
[E 2023-10-10 05:26:07.599 SingleUserNotebookApp auth:334] Error connecting to http://hub:8081/hub/api: HTTPConnectionPool(host=‘hub’, port=8081): Max retries exceeded with url: /hub/api/oauth2/token (Caused by NewConnectionError(‘<urllib3.connection.HTTPConnection object at 0x7fa62d5422e0>: Failed to establish a new connection: [Errno -2] Name or service not known’))
[W 2023-10-10 05:26:07.599 SingleUserNotebookApp web:1787] 500 GET /user/hammad%20ali%20baig/oauth_callback?code=7AtkQkCJOhSQZEV5b7dEY4K5kePCuS&state=eyJ1dWlkIjogIjM3YzJhZTE2MGI0NzRjOTE4MTQwYTIyYmMxNTA2Y2I2IiwgIm5leHRfdXJsIjogIi91c2VyL2hhbW1hZCUyMGFsaSUyMGJhaWcvdHJlZT8ifQ (::ffff:10.12.11.85): Failed to connect to Hub API at ‘http://hub:8081/hub/api’. Is the Hub accessible at this URL (from host: jupyter-hammad-20ali-20baig)?
[E 2023-10-10 05:26:07.621 SingleUserNotebookApp log:181] {
“X-Forwarded-Host”: “10.12.11.89:31234”,
“X-Forwarded-Proto”: “http”,
“X-Forwarded-Port”: “31234”,
“X-Forwarded-For”: “::ffff:10.12.11.85”,
“X-Dynatrace-Application”: “v=2;appId=;rid=1770694249;rpid=-1123069457;en=9nws2q7b”,
“Upgrade-Insecure-Requests”: “1”,
“Cookie”: “jupyterhub-user-hammad%20ali%20baig=[secret]; jupyterhub-user-hammad%20ali%20baig-oauth-state=[secret]; jupyterhub-session-id=[secret]; _xsrf=[secret]”,
“Connection”: “close”,
“Referer”: “http://10.12.11.89:31234/hub/spawn/hammad%20ali%20baig”,
“Accept-Encoding”: “gzip, deflate”,
“Accept-Language”: “en-US,en;q=0.5”,
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
“User-Agent”: “Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0”,
“Host”: “10.12.11.89:31234”,
“X-Dynatrace”: “FW4;1172688746;2;103837869;39134;0;-703642093;1478;4cb1;2h01;3h063070ad;4h98de;5h01;6hd1f2ab9c5bcd303852c4f5feb5aaf107;7h993fcf9c645f8c61”,
“Traceparent”: “00-d1f2ab9c5bcd303852c4f5feb5aaf107-993fcf9c645f8c61-01”,
“Tracestate”: “d60f4613-45e5cf6a@dt=fw4;2;63070ad;98de;0;0;0;5c6;00b2;2h01;3h063070ad;4h98de;5h01;7h993fcf9c645f8c61”
}
[E 2023-10-10 05:26:07.621 SingleUserNotebookApp log:189] 500 GET /user/hammad%20ali%20baig/oauth_callback?code=[secret]&state=[secret] (@::ffff:10.12.11.85) 10034.32ms
[E 2023-10-10 05:26:07.622 SingleUserNotebookApp mixins:538] Error notifying Hub of activity
Traceback (most recent call last):
File “/opt/conda/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py”, line 536, in notify
await client.fetch(req)
File “/opt/conda/lib/python3.9/site-packages/tornado/simple_httpclient.py”, line 338, in run
stream = await self.tcp_client.connect(
File “/opt/conda/lib/python3.9/site-packages/tornado/tcpclient.py”, line 265, in connect
addrinfo = await self.resolver.resolve(host, port, af)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 398, in resolve
result = await IOLoop.current().run_in_executor(
File “/opt/conda/lib/python3.9/concurrent/futures/thread.py”, line 52, in run
result = self.fn(*self.args, **self.kwargs)
File “/opt/conda/lib/python3.9/site-packages/tornado/netutil.py”, line 382, in _resolve_addr
addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
File “/opt/conda/lib/python3.9/socket.py”, line 953, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

config.yaml

hub:
  networkPolicy:
    enabled: false
  config:
    JupyterHub:
      authenticator_class: ldapauthenticator.LDAPAuthenticator
    LDAPAuthenticator:
      allowed_groups:
        - CN=xxxx,OU=AADDC Users,DC=xxxx,DC=com
      escape_userdn: false
      lookup_dn: true
      lookup_dn_search_user: "xxxx"
      lookup_dn_search_password: "xxxx"
      lookup_dn_search_filter: ({login_attr}={login})
      lookup_dn_user_dn_attribute: "cn"
      user_attribute: "sAMAccountName"
      user_search_base: "OU=AADDC Users,DC=xxxx,DC=com"
      server_address: xxxx.xxxx.com
      server_port: 636
      use_ssl: true
    KubeSpawner:
      http_timeout: 600
      k8s_api_request_retry_timeout: 600
      k8s_api_request_timeout: 600
      start_timeout: 600
  db:
    type: sqlite-memory

prePuller:
  hook:
    enabled: true

proxy:
  chp:
    networkPolicy:
      enabled: false
  traefik:
    networkPolicy:
      enabled: false
  service:
    nodePorts:
      http: 31234
    type: NodePort

debug:
  enabled: true

scheduling:
  userScheduler:
    enabled: false
  corePods:
    nodeAffinity:
      matchNodePurpose: ignore

singleuser:
  startTimeout: 300
  networkPolicy:
    enabled: false
  cpu:
    guarantee: 0.5
  memory:
    guarantee: 1G
  extraPodConfig:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            component: singleuser-server
  image:
    name: XXXX
    pullPolicy: Always
    tag: latest
    pullSecrets:
      - name: xxxx
  profileList:
    - display_name: Default
      description: "Old JupyterHub with some enhancements."
      default: true
    - display_name: minimal
      description: "To avoid too much bells and whistles: Python."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
    - display_name: tensorflow
      description: "Scientific Jupyter Notebook Python Stack w/ TensorFlow."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
    - display_name: scipy
      description: "Scientific Jupyter Notebook Python."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
    - display_name: datascience
      description: "If you want the additional bells and whistles: Python, R, and Julia."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
    - display_name: pyspark
      description: "Python and Spark Jupyter Notebook."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
    - display_name: spark
      description: "The Jupyter Stacks spark image."
      kubespawner_override:
        image: XXXX
        pullPolicy: Always
        image_pull_secrets:
          - name: xxxx
  storage:
    type: none
    extraVolumes:
      - name: jupyterhub-shared
        persistentVolumeClaim:
          claimName: xxxx
    extraVolumeMounts:
      - name: jupyterhub-shared
        mountPath: xxxx
  extraFiles:
    jupyter_notebook_config.json:
      mountPath: /etc/jupyter/jupyter_notebook_config.json
      data:
        MappingKernelManager:
          cull_idle_timeout: 86400
          cull_interval: 1800
          cull_connected: true
          cull_busy: false

cull:
  enabled: true
  users: true
  adminUsers: true
  removeNamedServers: false
  timeout: 86400
  every: 1800
  concurrency: 10
  maxAge: 0

Hello, @hammadab

The logs of pod jupyter-hammad-20ali-20baig show name resolution failures (socket.gaierror).

Have you tested whether host names resolve correctly from pods inside the cluster?

Best regards
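
Since the failure is intermittent, a single lookup may not reveal it. Below is a minimal sketch of a repeated lookup that could be run with the Python interpreter inside the single-user pod. The helper name `probe_dns` and its parameters are made up for illustration; only `socket.getaddrinfo` (the same call that fails in the traceback) is taken from the logs.

```python
import socket
import time


def probe_dns(host, port, attempts=5, delay=1.0, resolver=socket.getaddrinfo):
    """Resolve `host` several times and return (attempt, ok, detail) tuples.

    `resolver` is injectable so the function can be exercised without a
    network; by default it is the call that fails in the pod traceback.
    """
    results = []
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
            # Each entry ends with a sockaddr tuple whose first item is the IP.
            addresses = sorted({info[4][0] for info in infos})
            results.append((attempt, True, ", ".join(addresses)))
        except socket.gaierror as exc:
            # Same exception type as in the single-user pod logs.
            results.append((attempt, False, str(exc)))
        if attempt < attempts:
            time.sleep(delay)
    return results
```

For example, `probe_dns("hub", 8081, attempts=10, delay=2.0)` uses the host and port from the error message. A mix of successful and failed attempts would point at cluster DNS or pod networking rather than at JupyterHub itself.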

Issue resolved by restarting the networking components in kube-system, one at a time:

kubectl rollout restart daemonset calico-node -n kube-system

Wait for all pods to restart.

kubectl rollout restart deployment calico-kube-controllers -n kube-system

Wait for all pods to restart.

kubectl rollout restart daemonset kube-proxy -n kube-system

Wait for all pods to restart.

kubectl rollout restart deployment coredns -n kube-system

Wait for all pods to restart.
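
The restart sequence above can also be scripted so that the waiting is not done by hand. This is only a sketch: the function names and the 300s timeout are my own choices, and it assumes `kubectl` is on the PATH and pointed at the right cluster. It pairs each `kubectl rollout restart` with `kubectl rollout status`, which blocks until the rollout has finished.

```python
import subprocess

# Order taken from the post: Calico first, then kube-proxy, then CoreDNS.
COMPONENTS = [
    ("daemonset", "calico-node"),
    ("deployment", "calico-kube-controllers"),
    ("daemonset", "kube-proxy"),
    ("deployment", "coredns"),
]


def restart_commands(kind, name, namespace="kube-system", timeout="300s"):
    """Return the restart command and the command that waits for it."""
    target = f"{kind}/{name}"
    return [
        ["kubectl", "rollout", "restart", target, "-n", namespace],
        ["kubectl", "rollout", "status", target, "-n", namespace,
         f"--timeout={timeout}"],
    ]


def restart_all(components=COMPONENTS, run=subprocess.run):
    """Restart each component in order, waiting for each rollout to finish."""
    for kind, name in components:
        for cmd in restart_commands(kind, name):
            # check=True aborts the sequence if a step fails or times out.
            run(cmd, check=True)
```

Calling `restart_all()` runs the four restarts in the same order as the commands above. Clusters with a different network plugin will have different component names, so the `COMPONENTS` list would need adjusting.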

Thank you for your post. I have the same issue, but my kube-system namespace has different deployments and daemon sets:

$ kubectl get all -n kube-system
NAME                                                READY   STATUS      RESTARTS   AGE
pod/local-path-provisioner-84db5d44d9-bpz42         1/1     Running     0          2d14h
pod/metrics-server-67c658944b-p8tz2                 0/1     Running     0          2d14h
pod/coredns-6799fbcd5-hpvzl                         1/1     Running     0          2d14h
pod/helm-delete-traefik-crd-qqf6j                   0/1     Completed   0          42h
pod/svclb-ingress-nginx-controller-7f6f88b6-wxb26   2/2     Running     0          42h

NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns         ClusterIP   10.43.0.10    <none>        53/UDP,53/TCP,9153/TCP   2d14h
service/metrics-server   ClusterIP   10.43.8.184   <none>        443/TCP                  2d14h

NAME                                                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/svclb-ingress-nginx-controller-7f6f88b6   1         1         1       1            1           <none>          42h

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/local-path-provisioner   1/1     1            1           2d14h
deployment.apps/coredns                  1/1     1            1           2d14h
deployment.apps/metrics-server           0/1     1            0           2d14h

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-67c658944b           1         1         0       2d14h
replicaset.apps/local-path-provisioner-84db5d44d9   1         1         1       2d14h
replicaset.apps/coredns-6799fbcd5                   1         1         1       2d14h

NAME                                COMPLETIONS   DURATION   AGE
job.batch/helm-delete-traefik-crd   1/1           4s         42h

What should I do in this case?
Best