Cell execution time increased by 3X when moving from a single EC2 machine to Kubernetes

shubhamkanwal · May 23, 2022, 8:14am

Hi,
We recently switched from hosting JupyterHub on an EC2 machine to AWS EKS.
Below is a comparison of the old and new setups:

Setup	Configuration
EC2-old	m5.4xlarge - 16 cores, 64GB RAM, 2880 Mbps EBS bandwidth
EKS-new	m5a.4xlarge - 16 cores, 64GB RAM, 4750 Mbps EBS bandwidth

We are using EFS as the persistent volume in EKS, so EBS bandwidth doesn’t matter here. The old and new configurations are therefore similar.

However, script execution is 3 times slower on the EKS-based JupyterHub.
Script:

from random import random

def estimate_pi(n=1e7) -> "area":
    """Estimate pi by Monte Carlo sampling of points in the unit square."""
    in_circle = 0
    total = n

    while n != 0:
        prec_x = random()
        prec_y = random()
        if pow(prec_x, 2) + pow(prec_y, 2) <= 1:
            in_circle += 1  # inside the circle
        n -= 1

    return 4 * in_circle / total

%prun estimate_pi()

I ran the above script in both setups:
Old: [screenshot of %prun output]

New: [screenshot of %prun output]

What I have observed so far:

The nbextensions are not the cause, as both setups have the same extensions enabled.
We have the following CPU and RAM allocation for every user:

Setup	Configuration
EC2-old	4 cores , 16GB RAM when launching single server notebook
EKS-new	1/2 total cores, RAM Guarantee - 1GB, Limit - 32GB
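
Since the two setups allocate CPU differently, it may help to check what the kernel actually sees inside the container. The following is a minimal sketch, assuming a Linux container with the cgroup filesystem mounted at the usual paths (`/sys/fs/cgroup/cpu.max` for cgroup v2, `/sys/fs/cgroup/cpu/cpu.cfs_quota_us` for v1). The helper names `parse_cpu_max` and `read_cpu_limit` are made up for illustration.

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content ("<quota> <period>" or "max <period>").

    Returns the limit in cores, or None when no limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def read_cpu_limit():
    """Return the cgroup CPU limit in cores, or None if unlimited or not detectable."""
    try:
        v2 = Path("/sys/fs/cgroup/cpu.max")  # assumed cgroup v2 path
        if v2.exists():
            return parse_cpu_max(v2.read_text())
        # Assumed cgroup v1 paths
        quota_file = Path("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
        period_file = Path("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
        if quota_file.exists() and period_file.exists():
            quota = int(quota_file.read_text())
            period = int(period_file.read_text())
            # In cgroup v1 a quota of -1 means "no limit"
            return None if quota <= 0 else quota / period
    except (OSError, ValueError):
        pass
    return None


# os.cpu_count() reports the node's cores, not the container's limit
print("os.cpu_count():", os.cpu_count())
print("cgroup CPU limit (cores):", read_cpu_limit())
```

Running this in a notebook cell on both setups shows whether the kernel process is subject to a CPU quota. A result of `None` only means that no quota was detected at these paths.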

Still digging into this issue.

What could be the reason for the kernels slowing down?
Could it be the container that I’m using for the data science environment?

I would appreciate your help and suggestions.

Thanks

shubhamkanwal · May 23, 2022, 8:21am

This is the Dockerfile I am using for the data science kernel. Some extensions are installed on top of it.

FROM jupyter/datascience-notebook:lab-3.3.2

USER root

RUN apt-get update && apt-get install git -y

RUN conda install -c conda-forge jupyterhub==1.5.0

# Version specifiers are quoted so the shell does not treat ">" as a redirect
RUN pip install "tornado>=6.0.4"

ENV JUPYTER_CONFIG_DIR=/jupyter/.jupyter
ENV JUPYTER_DATA_DIR=/jupyter/.local/share/jupyter
ENV JUPYTER_RUNTIME_DIR=/jupyter/.local/share/jupyter/runtime

RUN mkdir -p $JUPYTER_CONFIG_DIR
RUN mkdir -p $JUPYTER_DATA_DIR
RUN mkdir -p $JUPYTER_RUNTIME_DIR

ARG FRAMEWORKS_LINE="streamlit dash bokeh panel holoviews"
RUN pip install $FRAMEWORKS_LINE

RUN pip install voila==0.3.5
RUN pip install "jhsingle-native-proxy>=0.7.6"
RUN pip install "plotlydash-tornado-cmd>=0.0.6" "bokeh-root-cmd>=0.1.2" "rshiny-server-cmd>=0.0.2" "voila-materialstream>=0.2.6"

# Append the extra notebook config to the image's default config
COPY jupyter_notebook_config_extra.py /etc/jupyter/
RUN cat /etc/jupyter/jupyter_notebook_config_extra.py >> /etc/jupyter/jupyter_notebook_config.py
RUN rm /etc/jupyter/jupyter_notebook_config_extra.py

COPY voila.json /etc/jupyter

RUN pip install jupyter-containds

USER root

RUN fix-permissions /etc/jupyter/

ADD ./jupyter_notebook_config.py /opt/conda/etc/jupyter/jupyter_notebook_config.py

RUN fix-permissions /opt/conda/etc/jupyter/

# yapf is for code prettifying. The original inline "#" comment after yapf
# commented out the rest of this chained command, so it is moved up here.
RUN pip install jupyter_contrib_nbextensions \
  && pip install "cdsdashboards>=0.6.1" \
  && jupyter contrib nbextension install --system \
  && pip install jupyter_nbextensions_configurator \
  && jupyter nbextensions_configurator enable --system \
  && pip install yapf \
  && pip install ipywidgets \
  && jupyter nbextension enable --py widgetsnbextension --sys-prefix

RUN chmod -R 777 /jupyter
RUN chmod -R 777 /opt
RUN chmod -R 777 /etc/jupyter/

manics · May 23, 2022, 3:49pm

Are you running exactly the same Docker image on EC2 and EKS? If you’re not then it’s possible some libraries aren’t as optimised.
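
One way to compare the two environments is to print a small fingerprint of the interpreter and host in a notebook cell on each setup and diff the results. This is only a sketch: the `cpu_model` and `fingerprint` helpers are hypothetical names, and reading `/proc/cpuinfo` assumes Linux.

```python
import platform
import sys


def cpu_model():
    """Best-effort CPU model name from /proc/cpuinfo (Linux only), else None."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.lower().startswith("model name"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return None


def fingerprint():
    """Collect interpreter and host details that can differ between images or nodes."""
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),
        "machine": platform.machine(),
        "kernel": platform.release(),
        "cpu_model": cpu_model(),
    }


for key, value in fingerprint().items():
    print(f"{key}: {value}")
```

Differences in the Python version, the compiler used to build it, or the CPU model would each be worth ruling out before looking at Kubernetes itself.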

Have you tried using identical EC2 instances instead of different ones (m5.4xlarge vs m5a.4xlarge), and assigning the same resources to your EKS container instead of limiting the number of cores and RAM?

Have you checked what other applications are running on the same node, and compared the CPU and memory usage on both instances?
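
To compare the two setups without the per-call profiling overhead of `%prun`, a plain timing run that records both wall-clock time and CPU time may be useful. The sketch below is an assumption-laden variant of the script from the question: it uses a smaller default sample count and a `for` loop, and `benchmark` is a hypothetical helper. If wall time is much larger than CPU time, the process spent time not running (for example because of CPU throttling or contention from other workloads on the node). If the two are close and both are slower, the CPU or the Python build is the more likely suspect.

```python
import time
from random import random


def estimate_pi(n=1_000_000):
    """Monte Carlo estimate of pi (smaller default n than the original script)."""
    in_circle = 0
    for _ in range(n):
        x = random()
        y = random()
        if x * x + y * y <= 1:
            in_circle += 1  # point falls inside the quarter circle
    return 4 * in_circle / n


def benchmark(func, repeats=3):
    """Run func several times and return (wall_seconds, cpu_seconds) of the fastest run."""
    results = []
    for _ in range(repeats):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        func()
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
        results.append((wall, cpu))
    return min(results)  # tuples compare by wall time first


wall, cpu = benchmark(estimate_pi)
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

Running the same cell a few times on each setup, ideally while nothing else is scheduled on the node, gives numbers that are easier to compare than profiler output.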
