JupyterHub on top of EMR and GPU nodes

Hello,
We are using AWS's managed Hadoop service (EMR) with JupyterHub.
We want to restrict notebooks that use TensorFlow to the EMR nodes that have GPUs.
I could not find any documentation on how to send notebooks (PySpark) to specific nodes (for example, nodes with specific tags).

Would the spark.yarn.{am,executor}.nodeLabelExpression properties, applied in the spark-submit call, satisfy this requirement? If so, you could create a specific kernel specification (kernel.json file) that invokes spark-submit. If you need the notebook kernel (Spark driver) to also run on the GPU node (i.e., cluster mode), then you’d also need to insert Enterprise Gateway (EG) into the picture and configure your Notebook servers to proxy their kernel management operations to EG.
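
To make the idea concrete, here is a minimal sketch of how the two properties could be passed to spark-submit. The helper names and the train.py application file are hypothetical, and the label GPU assumes a YARN node label of that name has been set up on the cluster.

```python
def node_label_conf(label, include_am=True):
    """Return spark-submit --conf arguments that restrict a job to a YARN node label."""
    args = ["--conf", f"spark.yarn.executor.nodeLabelExpression={label}"]
    if include_am:
        # Also restrict where the YARN application master is scheduled.
        args += ["--conf", f"spark.yarn.am.nodeLabelExpression={label}"]
    return args


def spark_submit_command(app, label, deploy_mode="client"):
    """Build the full spark-submit argument list for a YARN job (hypothetical helper)."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,
        *node_label_conf(label),
        app,
    ]


# "train.py" is a placeholder application name.
print(" ".join(spark_submit_command("train.py", "GPU")))
```

The resulting list can be handed to subprocess.run, or joined into the command line that a kernel launch script executes.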

If this sounds promising, with or without EG, we can work on crafting a kernel.json (and shell script) that invokes spark-submit to launch the IPython kernel. EG provides sample kernelspecs and yours would probably look similar to the spark_python_yarn_{client,cluster} specs.
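
As a rough illustration of what such a kernelspec might contain, the sketch below generates a kernel.json from Python. It is only loosely modeled on the EG samples, not copied from them: the run.sh path and display name are hypothetical, and it assumes the launch script passes the SPARK_OPTS environment variable through to spark-submit.

```python
import json


def build_kernel_spec(run_script, label="GPU"):
    """Build a kernel.json dict whose Spark options pin the job to a YARN node label."""
    spark_opts = (
        "--master yarn --deploy-mode client "
        f"--conf spark.yarn.executor.nodeLabelExpression={label} "
        f"--conf spark.yarn.am.nodeLabelExpression={label}"
    )
    return {
        "display_name": f"PySpark on YARN ({label} nodes)",
        "language": "python",
        # Assumption: the launch script reads SPARK_OPTS and forwards it to spark-submit.
        "env": {"SPARK_OPTS": spark_opts},
        # Jupyter substitutes {connection_file} when it starts the kernel.
        "argv": [run_script, "{connection_file}"],
    }


# Hypothetical location of the launch script inside the kernelspec directory.
spec = build_kernel_spec("/usr/local/share/jupyter/kernels/pyspark_gpu/bin/run.sh")
print(json.dumps(spec, indent=2))
```

Writing the printed JSON to a kernel.json file in its own kernelspec directory would make the kernel selectable from the notebook UI, provided the launch script exists.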

Looks like I will implement it with spark.yarn.{am,executor}.nodeLabelExpression. Here is the setup:

First, add the node label to the cluster:
sudo yarn rmadmin -addToClusterNodeLabels "GPU(exclusive=false)"

Add the following properties to /etc/hadoop/conf/capacity-scheduler.xml:
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.GPU.capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.default.accessible-node-labels.GPU.capacity</name>
<value>100</value>
</property>
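
If the same label has to be granted to several queues, the property blocks above can be generated instead of written by hand. This is only a sketch: the helper names are made up, and the queue list simply mirrors the two queues (root and root.default) used above.

```python
from xml.sax.saxutils import escape


def gpu_capacity_properties(label="GPU", queues=("root", "root.default"), capacity=100):
    """Map each queue to its accessible-node-labels capacity property for the label."""
    return {
        f"yarn.scheduler.capacity.{queue}.accessible-node-labels.{label}.capacity": capacity
        for queue in queues
    }


def render_property(name, value):
    """Render one Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"<name>{escape(name)}</name>\n"
        f"<value>{escape(str(value))}</value>\n"
        "</property>"
    )


for name, value in gpu_capacity_properties().items():
    print(render_property(name, value))
    print()
```

The printed elements still need to be placed inside the configuration element of capacity-scheduler.xml by hand or by whatever tooling manages the cluster configuration.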

Then stop and start the ResourceManager:
sudo stop hadoop-yarn-resourcemanager
sudo start hadoop-yarn-resourcemanager


Add the following properties to the YARN configuration file (yarn-site.xml) on the nodes of the GPU task instance group:

<property>
<name>yarn.nodemanager.node-labels.provider</name>
<value>config</value>
</property>

<property>
<name>yarn.nodemanager.node-labels.provider.configured-node-partition</name>
<value>GPU</value>
</property>
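
Before restarting anything, it can help to confirm that a node's configuration really carries both settings. The sketch below parses Hadoop-style configuration XML and reports the configured partition; the function names and the inline sample are illustrative, and on a real node the text would be read from that node's yarn-site.xml.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse Hadoop configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def configured_partition(xml_text):
    """Return the node partition set via the 'config' provider, or None if not set."""
    props = read_hadoop_properties(xml_text)
    if props.get("yarn.nodemanager.node-labels.provider") != "config":
        return None
    return props.get("yarn.nodemanager.node-labels.provider.configured-node-partition")


# Inline sample standing in for the contents of a node's yarn-site.xml.
SAMPLE = """<configuration>
<property>
<name>yarn.nodemanager.node-labels.provider</name>
<value>config</value>
</property>
<property>
<name>yarn.nodemanager.node-labels.provider.configured-node-partition</name>
<value>GPU</value>
</property>
</configuration>"""

print(configured_partition(SAMPLE))
```

A result of None means the provider is missing or set to something other than config, in which case the partition setting would not be picked up from this file.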

Once these properties are added, restart the NodeManager on the respective nodes:
sudo stop hadoop-yarn-nodemanager
sudo start hadoop-yarn-nodemanager