Problem connecting singleuser pod to Spark

I’m having an issue connecting to Spark from a singleuser pod launched via JupyterHub 2.0. I suspect it’s a networking issue.

I can launch a dedicated JupyterLab pod and connect to the Spark instance. I need to set spark.driver.host to the IP address of the pod, but once that is done, Spark can communicate back to the JupyterLab instance and the job can run.
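For reference, a minimal sketch of that standalone setup, assuming PySpark is installed in the pod. The helper names (`pod_ip`, `driver_conf`, `build_session`) and the master URL are made up for illustration; only `spark.driver.host` comes from my actual setup.

```python
import socket


def pod_ip() -> str:
    """Best-effort lookup of this pod's IP address (hypothetical helper).

    Inside a Kubernetes pod the hostname normally resolves to the pod IP;
    fall back to loopback if it cannot be resolved.
    """
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return "127.0.0.1"


def driver_conf() -> dict:
    """Spark settings that tell the cluster where to reach the driver."""
    return {"spark.driver.host": pod_ip()}


def build_session(master_url: str, conf: dict):
    """Create a SparkSession with the given settings (requires PySpark)."""
    from pyspark.sql import SparkSession  # imported lazily; assumed to be installed

    builder = SparkSession.builder.master(master_url)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Usage would be something like `spark = build_session("spark://<master-host>:7077", driver_conf())`, with the placeholder replaced by the real master URL.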

When I try to do the same in a JupyterHub context, it looks like connectivity between the Spark workers and the singleuser instance is blocked. I have disabled network policies and the cloudmetadata container; I also tried to add some ingress rules, but I guess I’m not doing this correctly.

Any pointers on how to troubleshoot such issues?

Thanks, rgds,
Sean.

As often happens, the solution occurred to me a few mins after submitting this post.

I set spark.driver.port within the singleuser pod and set allowedIngressPorts to the same value; that way, Spark can talk back to the singleuser instance.
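For anyone landing here later, a rough sketch of keeping the two values in sync. The port number is an arbitrary example, the function names are hypothetical, and I'm assuming the Helm chart reads the setting from `singleuser.networkPolicy.allowedIngressPorts`, so check that path against your chart version.

```python
import json

# Arbitrary example port; any free port works as long as both sides agree.
DRIVER_PORT = 40000


def spark_conf(port: int) -> dict:
    """Spark setting that pins the driver to a fixed, known port."""
    return {"spark.driver.port": str(port)}


def helm_values(port: int) -> dict:
    """Helm values fragment opening the same port for ingress.

    Assumption: the chart exposes this under
    singleuser.networkPolicy.allowedIngressPorts.
    """
    return {"singleuser": {"networkPolicy": {"allowedIngressPorts": [port]}}}


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be saved as a values file.
    print(json.dumps(helm_values(DRIVER_PORT), indent=2))
    print(spark_conf(DRIVER_PORT))
```

The `spark_conf` entries go into the Spark session config inside the singleuser pod, next to spark.driver.host; the `helm_values` fragment goes into the JupyterHub deployment config.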

Hope this helps someone!

Rgds,
Sean.
