I have been toying with Enterprise Gateway and Spark. I could successfully get things running the way I want (including with a custom kernel image and kernel-spec image). Unfortunately, I'm fairly new to both Kubernetes and Spark, so I'm not very confident in my understanding, nor in whether my solution is already optimal.
As far as I understand it, Enterprise Gateway currently does a spark-submit in cluster mode to launch the kernel (which is also the Spark driver). Is this understanding correct? Of course, this means the launch already fixes the number of executors according to the SPARK_OPTS in my kernel.json.
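For reference, the relevant part of my kernel.json looks roughly like this (a sketch; the image name, paths, and the exact argv are placeholders rather than my actual values):

```json
{
  "display_name": "Spark on Kubernetes (Python)",
  "language": "python",
  "env": {
    "SPARK_HOME": "/opt/spark",
    "SPARK_OPTS": "--master k8s://https://kubernetes.default.svc --deploy-mode cluster --conf spark.executor.instances=2 --conf spark.kubernetes.container.image=my-registry/my-spark-kernel:latest"
  }
}
```

The point is that `spark.executor.instances` is baked into SPARK_OPTS at launch time, before any notebook code runs.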
Is there a way to make this more dynamic, i.e. let a notebook user choose how many executors to spawn, or even better, allow using a SparkConf within the notebook to adjust this?
If my understanding is correct, the second option may be tricky unless I somehow switch to client mode, right? However, I would also be happy with the first option, and I'd prefer not to need n different kernel specs for the same kernel, differing only in the value of num_executors.