Some popular machine learning packages, like Ray (go bears!), make heavy use of /dev/shm space and strongly recommend that the /dev/shm allocation be at least 30% of the container's RAM allocation. Otherwise Ray throws a warning like this:
2024-01-23 03:24:21,039 WARNING services.py:1996 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=2.00gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2024-01-23 03:24:21,181 INFO worker.py:1724 -- Started a local Ray instance.
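To see the number Ray is complaining about (67108864 bytes is exactly 64 MiB), you can check the free space on the tmpfs yourself from inside the container. A minimal sketch, assuming Python is available in the image; the function name is mine:

```python
import shutil

def shm_free_bytes(path="/dev/shm"):
    """Return free bytes on the tmpfs at `path`.

    Falls back to /tmp if the path does not exist, mirroring
    Ray's own fallback behaviour shown in the warning above.
    """
    try:
        return shutil.disk_usage(path).free
    except FileNotFoundError:
        return shutil.disk_usage("/tmp").free

print(shm_free_bytes())
```

On a default Docker container this prints something close to 67108864; after raising the shm size it should report the larger allocation.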
When running JupyterLab in a standalone Docker container, the --shm-size argument shown above works perfectly. But how do we accomplish this in a JupyterHub Helm chart config.yaml when running on Kubernetes?
But this configuration confuses me a bit: I don't see any line in that config that sets the size. Surely a 30 GB /dev/shm and a 10 GB /dev/shm would be configured differently somewhere? Or is the size automatically set as a percentage of the memory allocated to the container?
The volume shm-volume will be created when the user’s pod is created, and destroyed after the pod is destroyed. SHM usage by the pod will count towards its memory limit. When the memory limit is exceeded, the pod will be evicted.
Thanks @stebo85. Yes, sorry I wasn't clearer with my question, but I'd read those docs before asking here. They state very clearly that the default shm size is a mere 64 MB, and then say:
The following configuration will increase the SHM allocation by mounting a tmpfs (ramdisk) at /dev/shm, replacing the default 64 MB allocation.
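For reference, the configuration those docs refer to looks roughly like this (the volume name shm-volume matches the one discussed in this thread; adapt it to your chart):

```yaml
singleuser:
  storage:
    extraVolumes:
      - name: shm-volume
        emptyDir:
          medium: Memory
    extraVolumeMounts:
      - name: shm-volume
        mountPath: /dev/shm
```

The `medium: Memory` setting is what makes the emptyDir a tmpfs instead of node-local disk, which is why its usage counts against the pod's memory limit.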
Replaced with what size? I understand that shm use will count against the pod’s total RAM use, but I don’t see any size argument here. Is this saying that the user can now write up to the full RAM allocated to the pod to /dev/shm? Sorry if I’m being dense.
Agree - that’s not 100% clear in the documentation. On my installation, after following this instruction, /dev/shm in the pod shows as the size of the RAM of the underlying compute node, so in my case 250GB (I think it’s subtracting some RAM of the 256GB for Kubernetes overhead?). My understanding is (and please anyone correct me on this), that this shm size can be filled by the pod up to the RAM limit of the pod in your config. So, let’s say you have
singleuser:
  memory:
    limit: 7G
    guarantee: 4G
then the pod will be killed when its RAM usage plus its shm usage exceeds 7 GB combined.
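If you would rather cap /dev/shm explicitly instead of relying on the pod's memory limit, Kubernetes emptyDir volumes also accept a sizeLimit field. A sketch, with the 2Gi cap purely illustrative:

```yaml
singleuser:
  storage:
    extraVolumes:
      - name: shm-volume
        emptyDir:
          medium: Memory
          sizeLimit: 2Gi   # illustrative cap, not a recommendation
    extraVolumeMounts:
      - name: shm-volume
        mountPath: /dev/shm
```

Depending on your Kubernetes version (the SizeMemoryBackedVolumes feature), this either mounts the tmpfs at that size directly or enforces the cap by evicting the pod when it is exceeded, so check the behaviour on your cluster.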