Better security isolation mechanisms for untrusted users


To follow up with this issue:

I’m opening this thread to centralize information about how JupyterHub on k8s could be made more secure in scenarios where users are considered fully untrusted in the threat model.

Currently we have multiple potential solutions that have yet to mature:

Right now, none of these solutions is production-ready or usable with JupyterHub. Some solutions are only partial: for example, using CRI-O would not remove the need for better isolation, because that isolation is provided by the kernel.

KubeVirt is probably the most production-ready solution, but with it there might be little point in using Kubernetes for the deployment at all.

Kata Containers does not support network namespace sharing, because a network namespace is a kernel abstraction and a single one can’t be shared across VMs. Note that a network namespace is different from a private network: a network namespace contains interfaces, so sharing a network namespace means sharing interfaces.

Performance measurements would also be interesting for assessing production-readiness:

  • Kata Containers has an experimental virtio block based backend that should offer much better performance.
  • gVisor has the same disk performance issue

Research on this topic is far from complete, so please comment with any new information or emerging solutions!



Thanks for your nice summary! It would be interesting to try gVisor / Kata / KubeVirt - they are all at the Kubernetes layer, and theoretically should not require any change on the JupyterHub side. Since many installations use NFS for home directory storage anyway, would the perf loss from that not swallow any perf loss from gVisor / Kata? Or it could make everything worse, of course…

KubeVirt seems different - as VMs, they should have fewer problems with NFS performance. I’d be worried about startup latency for the most part.

I started working on a blog post listing a bunch of things you can do at the JupyterHub level in Needs a lot more work, but might be useful here.


KubeVirt is not transparent to use: it’s a different set of APIs, built on a Custom Resource Definition whose attributes are broadly different because they describe VMs and not containers. So it would need modification of JupyterHub. That’s not the case for gVisor or Kata Containers.
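To make the difference concrete, here is a minimal Python sketch, not a tested deployment. It assumes the cluster admin has created a Kubernetes RuntimeClass for the sandboxed runtime; the name `gvisor`, the pod name and the image are placeholders I picked for illustration. With gVisor or Kata the user server is still a `Pod` and only one extra field is set, whereas KubeVirt needs a different kind of object altogether (shown heavily abridged).

```python
import copy

# Assumption: a RuntimeClass with this name exists in the cluster and points
# at the sandboxed runtime. The name itself is a placeholder.
SANDBOX_RUNTIME_CLASS = "gvisor"


def base_pod(name, image):
    """A plain single-user server pod manifest, as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [{"name": "notebook", "image": image}]},
    }


def sandboxed_pod(pod, runtime_class=SANDBOX_RUNTIME_CLASS):
    """Return a copy of the pod that asks for the sandboxed runtime."""
    out = copy.deepcopy(pod)
    out["spec"]["runtimeClassName"] = runtime_class
    return out


def kubevirt_vmi(name, memory):
    """Abridged shape of a KubeVirt object: a different kind with a VM-style spec."""
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachineInstance",
        "metadata": {"name": name},
        "spec": {"domain": {"resources": {"requests": {"memory": memory}}}},
    }


def changed_top_level_fields(before, after):
    """Top-level manifest keys whose values differ between two manifests."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))


pod = base_pod("jupyter-alice", "jupyter/base-notebook")  # example values
sandboxed = sandboxed_pod(pod)
vmi = kubevirt_vmi("jupyter-alice", "1Gi")

print(changed_top_level_fields(pod, sandboxed))
print(changed_top_level_fields(pod, vmi))
```

On the JupyterHub side, KubeSpawner's `extra_pod_config` option (`singleuser.extraPodConfig` in Z2JH) looks like the natural place to pass `{"runtimeClassName": ...}`, though I have not verified that against a sandboxed runtime.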

I wonder if it’s already possible to get this working with Z2JH?

Let us know if you try it!

@manics Well I don’t think anyone wants to use such a mechanism for production, but for example here we don’t run on the cloud. We have our homegrown Kubernetes clusters on top of bare metal, so autoscaling won’t work.

Also, a pod isn’t more isolated because it’s dedicated to a node. Kubernetes services run on that same node, and if the pod obtains root, it can most probably own the entire cluster from there.

The difference with VMs is that there’s a layer of isolation between the pod and the Kubernetes services.

Thanks, it’s useful to know what threats you’re considering, especially as others will assess risks differently. For example some admins may judge it’s enough to harden the container platform.

VMs are still more secure for now, though they’re still not perfect. Even ignoring hypervisor bugs, Spectre has shown that hardware bugs can break the isolation between VMs.

Thanks for starting a really interesting discussion :grinning:

@manics Currently the consensus in the Kubernetes community is that it’s not designed for untrusted workloads, and hardening the container platform is not enough.

“Containers are not a sandbox. While containers have revolutionized how we develop, package, and deploy applications, running untrusted or potentially malicious code without additional isolation is not a good idea. The efficiency and performance gains from using a single, shared kernel also mean that container escape is possible with a single vulnerability.”

Also GitLab, trying to solve security concerns:

“We generally push customers to use the kubernetes executor since it provides autoscaling out of the box and has one of the best workload schedulers. When customers find issues with Docker Machine we always suggest to use Kubernetes since it’s better, and they get a lot more benefits. Currently, we can’t run shared Runners on Kubernetes for the fact that we run untrusted code from users, that can be used to escape from containers and cause harm to the infrastructure. So we need some kind of isolation that a full-blown Virtual Machine brings or something similar, like what we have right now.”

And there’s plenty more.

The idea here is that VMs emulate “hard multi-tenancy”, the closest thing to actual physical isolation of machines. The recent speculative-execution CPU flaws are information leaks: still critical, but not as far-reaching as arbitrary code execution. In some situations where a secret is stored in memory, they could be leveraged into executing arbitrary code, though not in an obvious way.

Hypervisors need patching as well, of course.

Most people practice defense-in-depth, which means that you put up a number of walls in front of your critical security domain, so there’s a much smaller chance that someone can break through all the walls at the same time using vulnerabilities, zero-days or not, that your systems aren’t patched against. Having to combine exploits gives you a little bit of headroom to update your machines at your own pace and still be safe if you have some delay, and in the case of zero-days it makes attacking you much more expensive.
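As a rough back-of-the-envelope illustration of why stacking walls helps, here is a small Python sketch. It assumes each wall is breached independently with some probability during a given patching window; both the independence and the 5% figure are made-up assumptions for the sake of the arithmetic, not measurements.

```python
import math


def p_all_walls_breached(per_wall_probabilities):
    """Chance that every wall falls, assuming the breaches are independent."""
    return math.prod(per_wall_probabilities)


# Hypothetical numbers: each wall has a 5% chance of having an exploitable,
# unpatched flaw during the window we care about.
P_WALL = 0.05

container_only = p_all_walls_breached([P_WALL])          # kernel is the only wall
container_in_vm = p_all_walls_breached([P_WALL, P_WALL])  # kernel + hypervisor

print(f"one wall:  {container_only:.4f}")
print(f"two walls: {container_in_vm:.4f}")
```

Real flaws are not independent (one bug class can hit several layers), so treat this as intuition only: each extra wall multiplies down the odds that an attacker holds a working exploit for all of them at once.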

With containers, there’s only one wall: the Linux kernel, or the Kubernetes APIs. A flaw in either of these and your whole cluster is owned, not just the machine that runs the container. What GKE does, for example, is provision VMs for you (VMs because they don’t trust their customers), with Kubernetes installed on top. VMs are considered quite a robust wall against attacks, so some people rely on that wall alone (like the Google, AWS and Microsoft Azure clouds).
