I need to preprocess code executed by the user in the Jupyter Notebook to make sure that it doesn’t contain any malicious code. How do I approach that?
I’m wondering if there is any “code execution hook” available to extensions, so I can intercept code on the Jupyter Server before it is sent to the kernel. Or maybe there is some other approach to achieve that.
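For context, the closest thing I’ve found so far is a kernel-side hook via IPython’s input transformers. A rough sketch (the blocklist and function name are made up; note this runs *inside* the kernel, which may defeat the purpose):

```python
# Hypothetical pre-execution filter, registered as an IPython input
# transformer. NOTE: this runs inside the kernel itself, so any user who
# can execute code can also unregister or bypass it.

BLOCKLIST = ("os.system", "subprocess")  # naive, illustrative only

def screen_cell(lines):
    """Raise before execution if a cell contains a blocklisted substring."""
    source = "".join(lines)
    for needle in BLOCKLIST:
        if needle in source:
            raise ValueError(f"blocked: cell contains {needle!r}")
    return lines  # transformers must return the (possibly modified) lines

# In a running IPython kernel you would register it like this:
#   get_ipython().input_transformers_post.append(screen_cell)
```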
Others may disagree with me, but fundamentally: with remote-code-execution-as-a-service, you can’t substantively change what folks can do with it beyond restricting what the underlying server and kernel processes themselves can do. There are just too many ways to circumvent any countermeasures layered on top.
Some things you can do:

- run the server/kernel process in a more isolated fashion
  - e.g. chroots, Docker, VMs, another computer, or all of the above
  - these can still be escaped by a dedicated attacker, but if it must be…
- run everything against a read-only file system
  - this makes many otherwise-useful things fail
- run everything in the client’s browser, e.g. JupyterLite
- expose only specific, vetted features behind, e.g., a REST API
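To illustrate that last point: instead of accepting code at all, accept requests naming one of a few vetted operations. A minimal, hypothetical sketch using only the standard library (the operation names and handlers are made up for illustration):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# The only operations users may invoke -- no arbitrary code reaches a kernel.
ALLOWED_OPS = {
    "describe": lambda data: {"rows": len(data), "max": max(data, default=None)},
    "total": lambda data: {"total": sum(data)},
}

class VettedAPI(BaseHTTPRequestHandler):
    def do_POST(self):
        op = self.path.strip("/")
        handler = ALLOWED_OPS.get(op)
        if handler is None:
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"[]")
        body = json.dumps(handler(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve:
#   HTTPServer(("127.0.0.1", 8000), VettedAPI).serve_forever()
```

The attack surface is then the handlers you wrote, not a general-purpose interpreter.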
A product that ships such countermeasures will be more frustrating for non-malicious users, and will not actually offer better protection against the malicious kind. With only the static-analysis approaches one can code up in a feasible amount of time, even a lazy attacker will find a way around them in a couple of minutes with automated tools.
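For example, a blocklist like this hypothetical one looks plausible, and falls to the first obfuscation an attacker tries:

```python
# A naive static check: reject source that imports a "dangerous" module.
# Illustrative only -- this is the kind of filter that does not hold up.
import ast

BANNED_MODULES = {"os", "subprocess"}

def looks_safe(source: str) -> bool:
    """Return False if the code imports a banned module by name."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] in BANNED_MODULES
                   for alias in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BANNED_MODULES:
                return False
    return True

# The check catches the obvious form:
#   looks_safe("import os")  -> False
# but not an equivalent that never names the module in an import statement:
#   looks_safe("__import__('o' + 's').system('id')")  -> True
```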
I’m suggesting that if a product allows folks to use kernels in production, it needs a defense-in-depth strategy that isolates untrusted, arbitrary code execution.