Executing PySpark code in a Jupyter Notebook on Z2JK, with the Spark session in "client" deploy mode and the Spark executors running in their own dedicated Kubernetes cluster

Has anyone successfully executed PySpark code in a Jupyter Notebook using Z2JK, where the user configures the Spark session with the deploy mode set to "client" and the Spark executors run in their own dedicated Kubernetes cluster?

We can't get this to work because the Spark executors are unable to communicate back to the Spark driver, which runs inside the Jupyter Notebook PySpark Docker image (jupyter/pyspark-notebook:latest).

That is because the executors need a stable, resolvable host name for the Spark driver (i.e., the Jupyter Notebook PySpark K8s pod).
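For context, this is roughly the kind of client-mode session we are trying to build from the notebook (a minimal sketch; the API server URL, image, namespace, user name, and port numbers are placeholders, not our actual values):

```python
from pyspark.sql import SparkSession

# Sketch of a client-mode SparkSession whose executors run on a separate
# Kubernetes cluster. All concrete values below are placeholders.
spark = (
    SparkSession.builder
    .master("k8s://https://EXECUTOR-CLUSTER-APISERVER:6443")
    .config("spark.submit.deployMode", "client")
    .config("spark.kubernetes.namespace", "spark-executors")
    .config("spark.kubernetes.container.image", "apache/spark:3.5.0")
    .config("spark.executor.instances", "2")
    # The executors connect back to the driver using these settings; the host
    # name must be resolvable and reachable from the executor cluster,
    # which is exactly the part that fails for us.
    .config("spark.driver.host", "jupyter-USERNAME.jhub.svc.cluster.local")
    .config("spark.driver.port", "29413")
    .config("spark.blockManager.port", "29414")
    .config("spark.driver.bindAddress", "0.0.0.0")
    .getOrCreate()
)
```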

The only thing I can think of is to modify KubeSpawner so that it creates a "headless" K8s Service for each spawned single-user pod. Has anyone done this before, and what are the pitfalls and issues with doing so?
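What I have in mind is roughly the following, untested sketch in jupyterhub_config.py. It assumes KubeSpawner's modify_pod_hook and the kubernetes Python client; the service name and ports are made up, and this would only help if the executor cluster can actually resolve the resulting DNS name:

```python
# jupyterhub_config.py -- untested sketch
from kubernetes import client, config

config.load_incluster_config()  # assumes the hub itself runs inside the cluster


def add_headless_service(spawner, pod):
    """Create a headless Service so the spawned single-user pod gets a stable DNS name."""
    v1 = client.CoreV1Api()
    svc = client.V1Service(
        metadata=client.V1ObjectMeta(
            # Illustrative name; real user names may need escaping to be
            # valid Kubernetes object names.
            name=f"spark-driver-{spawner.user.name}",
            namespace=spawner.namespace,
        ),
        spec=client.V1ServiceSpec(
            cluster_ip="None",             # headless: DNS resolves straight to the pod IP
            selector=pod.metadata.labels,  # select the pod being spawned
            ports=[
                client.V1ServicePort(name="spark-driver", port=29413),
                client.V1ServicePort(name="spark-blkmgr", port=29414),
            ],
        ),
    )
    try:
        v1.create_namespaced_service(spawner.namespace, svc)
    except client.exceptions.ApiException as e:
        if e.status != 409:  # 409 = Service already exists (e.g. on pod restart)
            raise
    return pod


c.KubeSpawner.modify_pod_hook = add_headless_service
```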


I am facing the same issue. The hostname is not the problem, since you can get it with a Python call from inside the notebook (see the snippet below).
I am trying to expose a port on the single-user pod, but without luck so far.
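For example, from inside the notebook (this only shows that the pod's own name and IP are easy to obtain; it does not make them reachable from the executor cluster):

```python
import socket

# Run inside the notebook: the single-user pod's hostname and IP,
# which can then be fed into spark.driver.host.
driver_hostname = socket.gethostname()
driver_ip = socket.gethostbyname(driver_hostname)
print(driver_hostname, driver_ip)
```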