I am using jupyter enterprise gateway 3.2.2 with jupyter hub 2.1.1 and JupyterLab 3.6.3.
I am having a weird problem when starting a kernel on a Kubernetes cluster. The images I use for the kernels are heavy, so pulling them onto the kernel pods takes a long time. Therefore, I've set `--GatewayClient.request_timeout` on the JupyterLab side to 5 minutes. I've also modified the OpenShift route configuration to use a larger timeout than the default 30 seconds.
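For reference, the two timeout changes above can be sketched like this (the route name is a placeholder; the HAProxy annotation is the standard OpenShift route timeout mechanism):

```shell
# Raise the gateway request timeout on the JupyterLab / Jupyter Server side (5 minutes):
jupyter lab --GatewayClient.request_timeout=300.0

# Raise the OpenShift route timeout above the default 30s
# ("my-gateway-route" stands in for the actual route name):
oc annotate route my-gateway-route haproxy.router.openshift.io/timeout=300s
```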
When I try to launch a new kernel, after approximately 2 minutes I get the following error message in JupyterLab:
“Error Starting Kernel
Invalid response: 503 Service Temporarily Unavailable”
This automatically changes the selected kernel in the lab to "No Kernel".
There are no additional error messages or error/warning logs in JupyterLab itself, in the Enterprise Gateway, or in JupyterHub. All three are set to DEBUG log level.
After some additional time, when the kernel pod actually launches, I can choose it from the kernel tab in the “Use Kernel from Other Session” section.
I am looking for a way to avoid this error message, and I want to understand where it is coming from.
It seems to be connected to JupyterHub itself, because when I create a standalone JupyterLab and point it at the gateway, the error does not appear.
Note - I've also tried configuring other timeout settings in JupyterLab, such as `--GatewayClient.connect_timeout`, `--GatewayClient.response_timeout`, and `--GatewayClient.gateway_retry_interval_max`. None of these changed the behavior described above.
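These flags can also be set via a `jupyter_server_config.py` instead of the command line. A sketch of the equivalent traitlet settings (values are illustrative, matching what I tried above):

```python
# jupyter_server_config.py -- traitlet equivalents of the CLI flags above
c = get_config()  # noqa -- provided by Jupyter's config loader at startup

c.GatewayClient.request_timeout = 300.0            # seconds to wait on gateway requests (kernel start)
c.GatewayClient.connect_timeout = 300.0            # seconds to wait when connecting to the gateway
c.GatewayClient.gateway_retry_interval_max = 30.0  # cap on the retry back-off interval, in seconds
```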