seb835
June 20, 2024, 5:28pm
1
Hi Community,
I am looking for advice.
I use JupyterHub and DockerSpawner to run JupyterLab in Red Hat Podman containers. It works very well without any trouble, thanks for the work, guys!
Now I want to add GPU support. With Podman, you have to add "--device nvidia.com/gpu=all" to the podman command to allow the container to use the GPU.
How can we add this to the DockerSpawner parameters? I can't get it to work: the spawner complains that "device" is an unknown parameter when it is added via extra_create_kwargs or extra_host_config, which makes sense, since it is a Podman option and not a Docker one.
Has anyone here got this kind of config running?
Thanks a lot.
Best
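For reference, the CLI behaviour described above can be sketched as a small helper that assembles a `podman run` command line with the GPU device flag. The helper name `podman_run_args` and the image used in the example are illustrative assumptions, not part of DockerSpawner or Podman; only the "--device nvidia.com/gpu=all" flag comes from the post.

```python
import shlex


def podman_run_args(image, command=(), gpu=True):
    """Assemble an argv list for `podman run` (hypothetical helper)."""
    args = ["podman", "run", "--rm"]
    if gpu:
        # Device name quoted in the post; it is a Podman CLI option,
        # which is why DockerSpawner has no matching parameter for it.
        args += ["--device", "nvidia.com/gpu=all"]
    args.append(image)
    args.extend(command)
    return args


if __name__ == "__main__":
    argv = podman_run_args("registry.access.redhat.com/ubi9", ["nvidia-smi", "-L"])
    # Print the command instead of running it, so no GPU is needed to try this.
    print(shlex.join(argv))
```

This only shows where the flag sits on the command line; it does not solve the question of how to make DockerSpawner send an equivalent request over the API.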
manics
June 20, 2024, 10:41pm
2
If you use the equivalent Docker config for GPUs does it work?
seb835
June 21, 2024, 10:46am
3
Can you elaborate on the equivalent Docker config?
From my searching and code analysis, when Docker is the container runtime, we usually use the "device requests" field of the Docker API host config to send GPU information with DockerSpawner.
But it looks like Podman, when driven through the Docker-compatible API, does not handle that field.
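To make the "device requests" field concrete, here is a minimal sketch of the JSON body a Docker API client would send when creating a container with a GPU request. The function names and the image name are hypothetical; the field names follow the Docker Engine API's `HostConfig.DeviceRequests`, where a count of -1 means "all GPUs".

```python
import json


def gpu_device_request(count=-1):
    """Build one entry for HostConfig.DeviceRequests (count=-1 requests all GPUs)."""
    return {
        "Driver": "",
        "Count": count,
        "Capabilities": [["gpu"]],
    }


def create_body(image, count=-1):
    """Body for POST /containers/create with a GPU device request (illustrative only)."""
    return {
        "Image": image,
        "HostConfig": {"DeviceRequests": [gpu_device_request(count)]},
    }


if __name__ == "__main__":
    print(json.dumps(create_body("example/image"), indent=2))
```

With DockerSpawner on Docker, my understanding is that this request is normally expressed through extra_host_config with a `device_requests` key, which docker-py turns into the field above. Whether Podman's compatibility API honors it is a separate matter.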
manics
June 22, 2024, 7:51am
4
seb835
June 22, 2024, 1:51pm
5
You are right, but Podman does not honor it:
opened 06:51PM - 08 May 24 UTC
kind/bug
### Issue Description
"Device requests" are how GPUs are invoked from the Docker API. However, device requests are not being respected by Podman when creating a container over the Podman socket.
### Steps to reproduce the issue
Here is a Python script which tests the Docker and Podman socket APIs.
Setup: install Python version 3.10+ and run `pip install docker==7.0.0`
Run these tests:
```python
import subprocess as sp
import docker
import docker.types
# Setup: create unix socket clients
# --------------------------------------------------------------------------------
podman_socket = sp.check_output(['podman', 'info', '--format', '{{ .Host.RemoteSocket.Path }}'], text=True).strip()
podman_client = docker.DockerClient(base_url=f'unix://{podman_socket}')
docker_client = docker.DockerClient(base_url=f'unix:///var/run/docker.sock')
# Sanity checks: assert podman is working
# --------------------------------------------------------------------------------
assert b'!... Hello Podman World ...!' in podman_client.containers.run('quay.io/podman/hello', auto_remove=True)
# Sanity checks: assert podman and docker both work with nvidia-container-toolkit
# --------------------------------------------------------------------------------
def test_nvidia_smi_works_using_command(command: str):
    assert sp.check_output([command, 'run', '--rm', '--gpus=all', 'registry.access.redhat.com/ubi9:9.4-947.1714667021', 'nvidia-smi', '-L']).startswith(b'GPU 0')
test_nvidia_smi_works_using_command('docker')
test_nvidia_smi_works_using_command('podman')
# Bug reproduction cases
# --------------------------------------------------------------------------------
GPU_REQUEST = {
    'device_requests': [ docker.types.DeviceRequest(count=1, capabilities=[['gpu']]) ]
}
def test_nvidia_smi_works_using_client(client: docker.DockerClient):
    assert client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], **GPU_REQUEST).startswith(b'GPU 0')
test_nvidia_smi_works_using_client(docker_client) # pass
test_nvidia_smi_works_using_client(podman_client) # fail
def test_device_request_goes_through(client: docker.DockerClient):
    container = client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], detach=True, **GPU_REQUEST)
    assert len(container.attrs['HostConfig']['DeviceRequests']) > 0
    assert any(request.get('Capabilities', None) == ['gpu'] for request in container.attrs['HostConfig']['DeviceRequests'])
test_device_request_goes_through(docker_client) # pass
test_device_request_goes_through(podman_client) # fail
```
### Describe the results you received
- Device request is not honored when creating container via Podman socket
### Describe the results you expected
- It should be possible to create containers with GPUs over the Podman socket API
- The created container should have a non-empty value for `.HostConfig.DeviceRequests`
### podman info output
```yaml
host:
  arch: amd64
  buildahVersion: 1.35.3
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.11-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: e21e7c85b7637e622f21c57675bf1154fc8b1866'
  cpuUtilization:
    idlePercent: 94.1
    systemPercent: 1.54
    userPercent: 4.36
  cpus: 20
  databaseBackend: boltdb
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  freeLocks: 2012
  hostname: geo
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.8.9-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 97578004480
  memTotal: 134802944000
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: /usr/lib/podman/aardvark-dns is owned by aardvark-dns 1.10.0-2
      path: /usr/lib/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: /usr/lib/podman/netavark is owned by netavark 1.10.3-1
    path: /usr/lib/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.15-1
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: /usr/bin/pasta is owned by passt 2024_04_26.d03c4e2-1
    version: |
      pasta 2024_04_26.d03c4e2
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.3.0-1
    version: |-
      slirp4netns version 1.3.0
      commit: 8a4d4391842f00b9c940bb8f067964427eb0c964
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 1h 31m 25.00s (Approximately 0.04 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/jenni/.config/containers/storage.conf
  containerStore:
    number: 20
    paused: 0
    running: 14
    stopped: 6
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jenni/.local/share/containers/storage
  graphRootAllocated: 1578640605184
  graphRootUsed: 1019202039808
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 56
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/jenni/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.2
  Built: 1713438799
  BuiltTime: Thu Apr 18 07:13:19 2024
  GitCommit: 3304dd95b8978a8346b96b7d43134990609b3b29-dirty
  GoVersion: go1.22.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.2
```
### Podman in a container
No
### Privileged Or Rootless
Rootless
### Upstream Latest Release
Yes
### Additional environment details
```
$ nvidia-container-cli info
NVRM version: 550.78
CUDA version: 12.4
Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3080 Ti
Brand: GeForce
GPU UUID: GPU-c61acb21-8716-6540-271c-39beab917d03
Bus Location: 00000000:01:00.0
Architecture: 8.6
```
### Additional information
_No response_
I have had a look at podmanclispawner and podmanspawner, but their source code is old.
manics
June 25, 2024, 4:00pm
6
podmanclispawner should let you pass additional command line arguments:
Does that work?