Cell execution time increased 3x after moving from a single EC2 machine to Kubernetes

We recently switched from hosting JupyterHub on a single EC2 machine to AWS EKS.
Below is a comparison of the old and new setups:

Setup     Configuration
EC2-old   m5.4xlarge - 16 cores, 64 GB RAM, 2880 Mbps EBS bandwidth
EKS-new   m5a.4xlarge - 16 cores, 64 GB RAM, 4750 Mbps EBS bandwidth

We are using EFS as the persistent volume in EKS, so EBS bandwidth doesn’t matter here. The old and new configurations are therefore similar.

However, script execution is 3 times slower on the EKS-based JupyterHub.

from random import random

def estimate_pi(n=1e7) -> "area":
    """Estimate pi by sampling n random points in the unit square."""
    in_circle = 0
    total = n

    while n != 0:
        prec_x = random()
        prec_y = random()
        if pow(prec_x, 2) + pow(prec_y, 2) <= 1:
            in_circle += 1  # inside the circle
        n -= 1
    return 4 * in_circle / total

# IPython magic: profile a single call
%prun estimate_pi()

I ran the above script on both setups.


What I have observed so far:

  1. The nbextensions are not the cause, as both setups have the same extensions enabled.
  2. We have the following CPU and RAM allocation for every user:

     Setup     Configuration
     EC2-old   4 cores, 16 GB RAM when launching a single-user notebook server
     EKS-new   1/2 total cores, RAM guarantee 1 GB, limit 32 GB
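Since the per-user CPU allocation differs between the two setups, one thing that may be worth checking is the CPU limit the kernel actually sees inside the pod. The sketch below is only an illustration: it assumes a Linux container with the usual cgroup mount points (`/sys/fs/cgroup/cpu.max` for cgroup v2, `/sys/fs/cgroup/cpu/cpu.cfs_quota_us` for cgroup v1), and the helper names are made up for this example.

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content, e.g. "200000 100000" or "max 100000"."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit set
    return int(quota) / int(period)


def parse_cfs(quota_text, period_text):
    """Parse cgroup v1 cpu.cfs_quota_us / cpu.cfs_period_us contents."""
    quota = int(quota_text)
    if quota < 0:
        return None  # -1 means no CPU limit
    return quota / int(period_text)


def container_cpu_limit():
    """Return the CPU limit in cores, or None if unlimited or not detectable."""
    # Assumed mount points; they can differ on some hosts.
    v2 = Path("/sys/fs/cgroup/cpu.max")
    v1_quota = Path("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
    v1_period = Path("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
    try:
        if v2.exists():
            return parse_cpu_max(v2.read_text())
        if v1_quota.exists() and v1_period.exists():
            return parse_cfs(v1_quota.read_text(), v1_period.read_text())
    except (OSError, ValueError):
        pass
    return None


print("os.cpu_count():", os.cpu_count())
print("cgroup CPU limit (cores):", container_cpu_limit())
```

Run in a notebook cell on both setups. `os.cpu_count()` normally reports the cores of the whole node, not the container limit, so the two numbers can differ. A limit below one core would throttle even a single-threaded loop like `estimate_pi`.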

I am still digging into this issue.

What could be causing the kernels to slow down?
Could it be the container image I’m using for the data science environment?

Any help or suggestions would be appreciated.


This is the Dockerfile I am using for the data science kernel. Some extensions are installed on top of it.

FROM jupyter/datascience-notebook:lab-3.3.2

USER root

RUN apt-get update && apt-get install git -y

RUN conda install -c conda-forge jupyterhub==1.5.0

# Quote the version specifier so the shell does not treat >= as a redirect
RUN pip install "tornado>=6.0.4"

ENV JUPYTER_CONFIG_DIR /jupyter/.jupyter

ENV JUPYTER_DATA_DIR /jupyter/.local/share/jupyter

ENV JUPYTER_RUNTIME_DIR /jupyter/.local/share/jupyter/runtime




ARG FRAMEWORKS_LINE="streamlit dash bokeh panel holoviews"


RUN pip install voila==0.3.5

RUN pip install "jhsingle-native-proxy>=0.7.6"

RUN pip install "plotlydash-tornado-cmd>=0.0.6" "bokeh-root-cmd>=0.1.2" "rshiny-server-cmd>=0.0.2" "voila-materialstream>=0.2.6"

COPY jupyter_notebook_config_extra.py /etc/jupyter/

RUN cat /etc/jupyter/jupyter_notebook_config_extra.py >> /etc/jupyter/jupyter_notebook_config.py

RUN rm /etc/jupyter/jupyter_notebook_config_extra.py

COPY voila.json /etc/jupyter

RUN pip install jupyter-containds

USER root

RUN fix-permissions /etc/jupyter/

ADD ./jupyter_notebook_config.py /opt/conda/etc/jupyter/jupyter_notebook_config.py

RUN fix-permissions /opt/conda/etc/jupyter/

# yapf is installed for code prettifying. The note sits here because an inline
# "#" inside the chained command would comment out every step after it.
RUN pip install jupyter_contrib_nbextensions \
  && pip install "cdsdashboards>=0.6.1" \
  && jupyter contrib nbextension install --system \
  && pip install jupyter_nbextensions_configurator \
  && jupyter nbextensions_configurator enable --system \
  && pip install yapf \
  && pip install ipywidgets \
  && jupyter nbextension enable --py widgetsnbextension --sys-prefix

RUN chmod -R 777 /jupyter

RUN chmod -R 777 /opt

RUN chmod -R 777 /etc/jupyter/

Are you running exactly the same Docker image on EC2 and EKS? If you’re not, it’s possible that some libraries aren’t as well optimised.

Have you tried using identical EC2 instances instead of different ones (m5.4xlarge vs m5a.4xlarge), and assigning the same resources to your EKS container instead of limiting the number of cores and RAM?
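To see whether the two instance types actually present the same processor, a small check like the one below could be run on both setups. It is a sketch that assumes a Linux host exposing `/proc/cpuinfo` with a `model name` field (typical on x86); the `cpu_model` helper is a made-up name for this example.

```python
import platform
from pathlib import Path


def cpu_model(cpuinfo_text):
    """Return the first 'model name' value from /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


cpuinfo = Path("/proc/cpuinfo")
model = cpu_model(cpuinfo.read_text()) if cpuinfo.exists() else None
print("CPU model:", model or platform.processor() or "unknown")
print("Python:", platform.python_version(), platform.python_implementation())
```

If the reported CPU model or the Python version differs between the two setups, some gap in single-threaded speed would not be surprising, whatever the Kubernetes resource settings are.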

Have you checked what other applications are running on the same node, and compared the CPU/memory usage on both instances?
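One rough way to compare the two setups is to measure wall-clock time and CPU time for the same function. The sketch below uses a shortened variant of the benchmark from the question (a `for` loop and a smaller default `n`, both my own choices) and a hypothetical `timed` helper. Only `time.perf_counter` and `time.process_time` from the standard library are used.

```python
import time
from random import random


def estimate_pi(n=1_000_000):
    """Monte Carlo estimate of pi, same idea as the function in the question."""
    in_circle = 0
    for _ in range(n):
        x, y = random(), random()
        if x * x + y * y <= 1:
            in_circle += 1
    return 4 * in_circle / n


def timed(func, *args):
    """Run func once; return (result, wall_seconds, cpu_seconds)."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    result = func(*args)
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    return result, wall, cpu


result, wall, cpu = timed(estimate_pi)
print(f"pi ~ {result:.4f}, wall {wall:.2f}s, cpu {cpu:.2f}s")
```

As a rough guide, wall time well above CPU time suggests the process is waiting to be scheduled, for example because of a CPU limit or busy neighbours on the node. If both grow together, the cause is more likely the processor or the software stack itself.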