Self-replying, and leaving this here for anyone in the same boat:
the problem was a wrong configuration of the NVIDIA container runtime.
/etc/nvidia-container-runtime/config.toml should have the following lines:
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false
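For reference, these are top-level keys (not nested under any section), so the edited file ends up looking roughly like the sketch below, assuming an otherwise default config shipped by the nvidia-container-toolkit package:

```toml
# /etc/nvidia-container-runtime/config.toml -- only the relevant top-level keys shown;
# leave the rest of the distro-provided file (e.g. the [nvidia-container-cli] and
# [nvidia-container-runtime] sections) as it is
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false
```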
And then the NVIDIA device plugin should be deployed with:
compatWithCPUManager: true
deviceListStrategy: volume-mounts
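If you deploy the plugin through its Helm chart, a minimal values file carrying those two settings could look like this sketch (release and repo names below are just examples, adapt to however you install the plugin):

```yaml
# values.yaml for the nvidia-device-plugin Helm chart (example)
compatWithCPUManager: true         # pass device specs so the plugin plays nicely with the CPU Manager
deviceListStrategy: volume-mounts  # advertise GPUs as volume mounts instead of NVIDIA_VISIBLE_DEVICES
```

and apply it with something like `helm upgrade -i nvdp nvdp/nvidia-device-plugin -n kube-system -f values.yaml`. If you run the static DaemonSet manifest instead, these should correspond to the plugin's `--pass-device-specs` and `--device-list-strategy` flags, as far as I can tell.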
This (apparently) prevents containers from seeing all GPUs when no GPU allocation is specified, and correctly exposes only 1 GPU when one is assigned via resource limits.
This is “clearly” documented here: [External] Read list of GPU devices from volume mounts instead of NVIDIA_VISIBLE_DEVICES - Google Docs
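To sanity-check the behaviour described above, a throwaway pod along these lines (pod name and image are just examples) should now see exactly one GPU in nvidia-smi instead of all of them:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-limit-test                            # example name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04    # any CUDA base image will do
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1                         # with the config above, only this one device is mounted
```

And a pod without any `nvidia.com/gpu` limit should no longer see any GPUs at all.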