HTTP 599: Failed to connect to <proxy-api-ip> port 8001: Connection refused

Hi,

We are running BinderHub (v0.2.0-n121.h6d936d7) on our own bare-metal cluster with Ubuntu 18. So far we have had a lot of issues with the networking configuration:


To be able to access BinderHub over HTTPS (working around "Error accessing Hub API - HTTP 599: SSL certificate problem"), we are using the MetalLB load balancer: https://metallb.universe.tf/

Now we are experiencing a networking issue that prevents the hub pod from starting (it is stuck in CrashLoopBackOff):
paste <(kubectl get pods -n=binderhub) <(kubectl get pod -o=custom-columns=NODE:.spec.nodeName,START:.metadata.creationTimestamp -n=binderhub) | column -s $'\t' -t

NAME                                             READY   STATUS             RESTARTS   AGE	NODE                  START
binder-698cd44977-z7q7h                          1/1     Running            0          22d	neurolibre-worker-3   2020-09-08T17:00:56Z
binderhub-image-cleaner-299wf                    1/1     Running            0          22d	neurolibre-master     2020-09-08T17:00:56Z
binderhub-image-cleaner-f5scr                    1/1     Running            0          22d	neurolibre-worker-4   2020-09-08T17:00:56Z
binderhub-image-cleaner-g9p9g                    1/1     Running            0          22d	neurolibre-worker-2   2020-09-08T17:00:56Z
binderhub-image-cleaner-kk9q6                    1/1     Running            0          22d	neurolibre-worker-3   2020-09-08T17:00:56Z
binderhub-image-cleaner-ldtvm                    1/1     Running            0          22d	neurolibre-worker-1   2020-09-08T17:00:56Z
binderhub-image-cleaner-t2xrt                    1/1     Running            0          22d	neurolibre-worker-0   2020-09-08T17:00:56Z
binderhub-proxy-ingress-nginx-controller-9ch76   1/1     Running            0          22d	neurolibre-master     2020-09-08T16:59:26Z
binderhub-proxy-ingress-nginx-controller-qfp5d   1/1     Running            0          22d	neurolibre-worker-2   2020-09-08T16:59:26Z
binderhub-proxy-ingress-nginx-controller-qx56k   1/1     Running            0          22d	neurolibre-worker-0   2020-09-08T16:59:26Z
binderhub-proxy-ingress-nginx-controller-s8jzm   1/1     Running            0          22d	neurolibre-worker-4   2020-09-08T16:59:26Z
binderhub-proxy-ingress-nginx-controller-sbpc6   1/1     Running            0          22d	neurolibre-worker-3   2020-09-08T16:59:26Z
binderhub-proxy-ingress-nginx-controller-wffmn   1/1     Running            0          22d	neurolibre-worker-1   2020-09-08T16:59:26Z
hub-759ff48b58-ppmlw                             0/1     CrashLoopBackOff   6337       22d	neurolibre-worker-2   2020-09-08T21:36:56Z
proxy-564c87f85f-7pjsp                           1/1     Running            0          22d	neurolibre-worker-0   2020-09-08T17:00:56Z
user-scheduler-6867f76fb5-95dkr                  1/1     Running            39         22d	neurolibre-worker-2   2020-09-08T21:36:56Z
user-scheduler-6867f76fb5-bwfps                  1/1     Running            35         22d	neurolibre-worker-1   2020-09-08T17:00:56Z

It seems that the hub cannot reach the proxy-api IP:

[E 2020-10-01 14:39:42.652 JupyterHub app:2718]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2716, in launch_instance_async
        await self.start()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2524, in start
        await self.proxy.get_all_routes()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/proxy.py", line 806, in get_all_routes
        resp = await self.api_request('', client=client)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/proxy.py", line 774, in api_request
        result = await client.fetch(req)
    tornado.curl_httpclient.CurlError: HTTP 599: Failed to connect to 10.103.168.39 port 8001: Connection refused
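
In case it helps to reproduce this outside of JupyterHub, the same connection could be tested with a plain TCP connect from any pod that has Python. This is only a minimal sketch: `probe_tcp` is a hypothetical helper name I made up, not part of JupyterHub or BinderHub, and the 3-second timeout is an arbitrary choice. As far as I understand, "refused" means something actively rejected the connection, while a timeout would suggest packets are being dropped on the way.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and report 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a firewall rule in between) actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer within `timeout` seconds.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host or name resolution failure.
        return f"error: {exc}"
```

Calling `probe_tcp("10.103.168.39", 8001)` (the address and port from the traceback above) from the hub's node and from another node should show whether the failure depends on where the request comes from.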

Here is the list of our services:

NAMESPACE     NAME                                                 TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
binderhub     binder                                               NodePort       10.111.172.93   <none>          80:30229/TCP                 22d
binderhub     binderhub-proxy-ingress-nginx-controller             LoadBalancer   10.97.26.161    <public-IP>   80:30784/TCP,443:30888/TCP   22d
binderhub     binderhub-proxy-ingress-nginx-controller-admission   ClusterIP      10.106.10.170   <none>          443/TCP                      22d
binderhub     hub                                                  ClusterIP      10.101.119.67   <none>          8081/TCP                     22d
binderhub     proxy-api                                            ClusterIP      10.103.168.39   <none>          8001/TCP                     22d
binderhub     proxy-public                                         NodePort       10.102.77.41    <none>          443:31783/TCP,80:32233/TCP   22d
default       kubernetes                                           ClusterIP      10.96.0.1       <none>          443/TCP                      22d
kube-system   kube-dns                                             ClusterIP      10.96.0.10      <none>          53/UDP,53/TCP,9153/TCP       22d
kube-system   tiller-deploy                                        ClusterIP      10.100.32.90    <none>          44134/TCP                    22d

I don't have a lot of knowledge of networking, and even less of BinderHub networking. If anyone (a Binder dev?) can point me to a solution, or explain the BinderHub networking components to me, it would be really appreciated :slight_smile:

Thank you,