500: Failed to connect to my Hub at http://hub:8081/hub/hub/api

Hey Guys,

I’m running into this issue after a single-user pod spawns and tries to connect to the hub, which is running in a different namespace from the single-user pods.

Helm chart version: 0.10.2
K8s version: eks v1.17
Jupyterhub version: 1.1.0

[E 2020-11-17 00:45:06.093 SingleUserNotebookApp log:174] 500 GET /hub/user/tirumerla/oauth_callback?code=[secret]&state=[secret] (@172.30.12.179) 29.46ms
[E 2020-11-17 00:45:10.023 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://hub:8081/hub/hub/api (attempt 3/5). Is it running?
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
        resp = await client.fetch(self.hub_api_url)
      File "/opt/conda/lib/python3.6/site-packages/tornado/simple_httpclient.py", line 336, in run
        source_ip=source_ip,
      File "/opt/conda/lib/python3.6/site-packages/tornado/tcpclient.py", line 270, in connect
        addrinfo = await self.resolver.resolve(host, port, af)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 396, in resolve
        None, _resolve_addr, host, port, family
      File "/opt/conda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 379, in _resolve_addr
        addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
      File "/opt/conda/lib/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
[E 2020-11-17 00:45:18.035 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://hub:8081/hub/hub/api (attempt 4/5). Is it running?
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
        resp = await client.fetch(self.hub_api_url)
      File "/opt/conda/lib/python3.6/site-packages/tornado/simple_httpclient.py", line 336, in run
        source_ip=source_ip,
      File "/opt/conda/lib/python3.6/site-packages/tornado/tcpclient.py", line 270, in connect
        addrinfo = await self.resolver.resolve(host, port, af)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 396, in resolve
        None, _resolve_addr, host, port, family
      File "/opt/conda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 379, in _resolve_addr
        addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
      File "/opt/conda/lib/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
[E 2020-11-17 00:45:34.048 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://hub:8081/hub/hub/api (attempt 5/5). Is it running?
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
        resp = await client.fetch(self.hub_api_url)
      File "/opt/conda/lib/python3.6/site-packages/tornado/simple_httpclient.py", line 336, in run
        source_ip=source_ip,
      File "/opt/conda/lib/python3.6/site-packages/tornado/tcpclient.py", line 270, in connect
        addrinfo = await self.resolver.resolve(host, port, af)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 396, in resolve
        None, _resolve_addr, host, port, family
      File "/opt/conda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/lib/python3.6/site-packages/tornado/netutil.py", line 379, in _resolve_addr
        addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
      File "/opt/conda/lib/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
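
Each traceback above ends in socket.gaierror, which means the name hub did not resolve from inside the single-user pod. Here is a minimal sketch for checking name resolution from a pod. The helper name resolves is hypothetical, and the default port 8081 is only taken from the log above.

```python
import socket


def resolves(host, port=8081):
    """Return True if `host` can be resolved, False on a DNS lookup failure."""
    try:
        # Same lookup that fails in the traceback (tornado calls getaddrinfo).
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    return True
```

Run from a single-user pod, resolves("hub") would be expected to return False when the hub Service lives in another namespace, while resolves("hub.<namespace>") with the hub's real namespace filled in should return True if cluster DNS is working.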

Shouldn’t this code be c.JupyterHub.hub_connect_url = f"http://hub.<namespace>:{os.environ['HUB_SERVICE_PORT']}"? Or is there any other place where this needs to be changed?

Appreciate your help.

Thanks.

I think you are right that you need to add the namespace there to support pods running in a different namespace.
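
As a rough sketch of what that could look like, the helper below builds a namespace-qualified URL. The function name build_hub_connect_url, the namespace "jhub", and the fallback port 8081 are assumptions for illustration; only c.JupyterHub.hub_connect_url and HUB_SERVICE_PORT come from the question.

```python
import os


def build_hub_connect_url(namespace, port=None, service="hub"):
    """Build a hub URL that resolves from pods in other namespaces."""
    if port is None:
        # HUB_SERVICE_PORT is set by Kubernetes in the hub pod;
        # 8081 is an assumed fallback taken from the log output.
        port = os.environ.get("HUB_SERVICE_PORT", "8081")
    # "<service>.<namespace>" is resolvable across namespaces via cluster DNS.
    return f"http://{service}.{namespace}:{port}"


# In the hub configuration this might be used as (namespace is a placeholder):
#   c.JupyterHub.hub_connect_url = build_hub_connect_url("jhub")
print(build_hub_connect_url("jhub", port="8081"))
```

This only changes the address the single-user servers are told to use. It does not by itself make KubeSpawner manage pods in another namespace.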

Note that if you try to communicate directly with the hub from another namespace, you will be blocked by NetworkPolicy. You can relax that by configuring the Helm chart with hub.networkPolicy.interNamespaceAccessLabels=accept.

It also looks like you may have an issue with your hub.baseUrl configuration, as I see http://hub:8081/hub/hub/api in your logs. If you want to configure c.JupyterHub.base_url, you must do it through the Helm chart configuration at this point, because it influences more than just the JupyterHub software inside the hub pod.
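
To illustrate where a doubled /hub/hub/api can come from, here is a small sketch. It assumes the hub API path is formed by appending hub/api to the configured base URL; join_path and hub_api_path are hypothetical helpers, not JupyterHub functions.

```python
def join_path(*pieces):
    """Join URL path pieces with single slashes and a leading slash."""
    parts = [p.strip("/") for p in pieces if p.strip("/")]
    return "/" + "/".join(parts)


def hub_api_path(base_url):
    """Path of the hub API under a given base URL (assumed layout)."""
    return join_path(base_url, "hub", "api")


print(hub_api_path("/"))     # default base URL
print(hub_api_path("/hub"))  # base URL set to /hub
```

Under that assumption, a base URL of / gives /hub/api, and a base URL of /hub gives the /hub/hub/api seen in the logs.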

I’m not sure about everything that’s needed to support this, but the biggest part is within KubeSpawner, because it needs to start keeping track of pods in the entire cluster rather than just the local namespace. It will also require more RBAC permissions on the k8s service account associated with the hub pod. This is a long-standing, complex feature.

See @athornton’s work in https://github.com/jupyterhub/kubespawner/pull/458 for some more insights.


Thanks @consideRatio for your response. Yes, we have our own network policies to allow traffic between the two namespaces, so I disabled the network policies that are part of the Helm chart. I also have RBAC permissions in place so the hub namespace can talk to the single-user namespace. Regarding the base URL, I set it in the Helm chart because of our ingress rules. Technically it can be removed, but I haven’t tested that yet.
I also have another question; it would be great if you could take a look at this.

Appreciate all your help.

In my case, it was an issue with security group rules. Adding allow rules for port 8081 (from the pod to the hub) and port 8888 (from the hub to the pod) fixed the problem.
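
A quick way to check whether those ports are actually reachable is a plain TCP connect test. This is only a sketch: can_connect is a hypothetical helper, and the host names and addresses you pass in depend on your cluster.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

From a single-user pod you could call can_connect with the hub's address and 8081, and from the hub pod with the user pod's IP and 8888. A False result with a timeout, rather than an immediate refusal, often points to traffic being dropped by something like a security group or network policy.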