Hi all,
We run a JupyterHub service on a Kubernetes cluster. We are using a lightly modified version of the z2jh setup, and it has been working great.
Unfortunately, yesterday one of our users was running a task that caused his Python kernel to crash. After this crash his pod became unusable, due to an error in the proxy (I believe). After the kernel crashes, the notebook says it will restart the kernel (but the kernel is never actually restarted), and then we see this error in the proxy logs:
07:12:45.988 [ConfigProxy] info: 200 GET /api/routes
07:13:45.989 [ConfigProxy] info: 200 GET /api/routes
07:14:20.080 [ConfigProxy] debug: PROXY WEB /user/g02557/api/terminals?1624605140065 to http://192.168.70.223:8888
07:14:23.057 [ConfigProxy] error: 503 GET /user/g02557/api/kernels/13b909a5-4cf6-4746-8d22-c662dbf5f081/channels?session_id=c1c1a78e-4720-4592-9423-027f1c7f6347 Error: connect ETIMEDOUT 192.168.70.223:8888
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1145:16) {
errno: 'ETIMEDOUT',
code: 'ETIMEDOUT',
syscall: 'connect',
address: '192.168.70.223',
port: 8888
}
07:14:23.066 [ConfigProxy] error: Uncaught Exception: write after end
07:14:23.067 [ConfigProxy] error: Error [ERR_STREAM_WRITE_AFTER_END]: write after end
at writeAfterEnd (_stream_writable.js:266:14)
at Socket.Writable.write (_stream_writable.js:315:5)
at IncomingMessage.<anonymous> (/srv/configurable-http-proxy/lib/configproxy.js:458:30)
at IncomingMessage.emit (events.js:314:20)
at IncomingMessage.Readable.read (_stream_readable.js:508:10)
at flow (_stream_readable.js:1008:34)
at resume (_stream_readable.js:989:3)
at processTicksAndRejections (internal/process/task_queues.js:84:21)
07:14:23.067 [ConfigProxy] error: Uncaught Exception: write after end
07:14:23.067 [ConfigProxy] error: Error [ERR_STREAM_WRITE_AFTER_END]: write after end
at writeAfterEnd (_stream_writable.js:266:14)
at Socket.Writable.write (_stream_writable.js:315:5)
at IncomingMessage.<anonymous> (/srv/configurable-http-proxy/lib/configproxy.js:458:30)
at IncomingMessage.emit (events.js:314:20)
at addChunk (_stream_readable.js:298:12)
at readableAddChunk (_stream_readable.js:273:9)
at IncomingMessage.Readable.push (_stream_readable.js:214:10)
at HTTPParser.parserOnBody (_http_common.js:135:24)
at Socket.socketOnData (_http_client.js:475:22)
at Socket.emit (events.js:314:20)
After this error occurs, the user can't access his pod (it becomes unresponsive) and eventually gets a "Bad Gateway" error. We then need to delete both the user's pod and the proxy pod for the setup to function again.
I can't seem to figure out why this happens.
Any suggestions, or has anyone experienced the same?