Using Jupyter notebook offline

Dear All,

I am new to Jupyter Notebook/Python, and my manager has raised security concerns about using these tools. I have downloaded and installed Jupyter on my local machine to QC important company data. Does this pose any security risk to the company data?

A truly offline installation of Jupyter (a single computer with no network connection) is as secure as the person using it to execute code.

Almost all Jupyter functionality works without making any external requests… after installation on the computer, of course: the packages have to come from somewhere.

A typical local notebook server installation requires local access to around 130 packages, not all of which were created by the Jupyter community. To the best of our knowledge, that code does not contain any known serious vulnerabilities, aside from enabling the Big One: once loaded, the purpose of most Jupyter tools is interactive computing, or, in scary security terminology, remote code execution, even if that “remote” distance is very small.

Under a default installation (a single browser talking to a single server process talking to a single kernel process, all on the same computer), data touched by Jupyter tools is relatively safe: the server will only accept connections from the same computer, so the threat boundary hasn’t changed much versus what the user of that computer could do from the command line.
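To make the "same computer only" point concrete, here is a minimal sketch for checking whether a given bind address keeps a server on the loopback interface. The function name `is_loopback_only` is made up for illustration, and the sketch assumes you already know the address your server listens on (for example, whatever you passed as the server's IP option, if anything); it does not inspect a running Jupyter server.

```python
import ipaddress


def is_loopback_only(bind_address: str) -> bool:
    """Return True if bind_address only accepts connections from this machine.

    Hypothetical helper: bind_address is whatever address the server was
    told to listen on, supplied by the caller.
    """
    if bind_address == "localhost":
        return True
    try:
        # 127.0.0.0/8 and ::1 are loopback; 0.0.0.0 and :: are not.
        return ipaddress.ip_address(bind_address).is_loopback
    except ValueError:
        # Wildcards such as "*" or an empty string mean "all interfaces",
        # and unknown hostnames are treated as unsafe.
        return False
```

If this returns False for the address you configured, other machines on the network may be able to reach the server, and the threat boundary described above no longer holds.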

Once a server is running and a kernel loads and actually executes arbitrary code, every notebook opened should be considered a potential threat, based on its source. If you write and execute all your own notebooks, a notebook has the same threat profile as the user who started the server: if you could delete all the files on your computer or network drive, or upload files to an adversary, then a notebook could do the same thing, if you tell it to. This is among the reasons we suggest people don’t use the root or Administrator user to install, and certainly never to run, their Jupyter tools. That way you at least benefit from the operating system’s built-in self-preservation and data isolation.
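As a rough illustration of the "don't run as root or Administrator" advice, the following sketch does a best-effort check of whether the current Python process has elevated privileges. The function name `running_as_admin` is invented for this example, and the Windows branch assumes the `IsUserAnAdmin` shell call is available; treat it as a sanity check, not a security control.

```python
import os


def running_as_admin() -> bool:
    """Best-effort check for root (POSIX) or Administrator (Windows)."""
    if hasattr(os, "geteuid"):
        # POSIX: effective user id 0 is root.
        return os.geteuid() == 0
    try:
        import ctypes

        # Windows: nonzero means the process has Administrator rights.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (ImportError, AttributeError, OSError):
        # Could not determine; assume not elevated.
        return False


if running_as_admin():
    print("Warning: elevated privileges; notebooks can touch anything on this machine.")
```

Running this in a notebook cell shows the privileges that every other cell in that kernel also has.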

Once some computation has happened, a notebook mixes input and output. Anything printed to a notebook stays in the notebook unless deleted. So uploading, emailing, or putting on Dropbox an .ipynb that touches sensitive data can be equivalent to sending the sensitive data itself by the same means.
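One way to reduce that risk before sharing is to strip the stored outputs. Below is a minimal sketch using only the standard library; it assumes the notebook is in the version 4 JSON format (a top-level `cells` list whose code cells carry `outputs` and `execution_count`), and the function name `strip_outputs` is made up for illustration. It does not remove sensitive values typed directly into the code or Markdown cells.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a v4 notebook dict with code-cell outputs removed."""
    # Deep copy via a JSON round trip so the caller's dict is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned
```

To use it, load the .ipynb with `json.load`, pass the result through `strip_outputs`, and write the returned dict to a new file with `json.dump` before sending it anywhere.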

Downloading random notebooks and code snippets from the internet and running them, however, is basically letting random people from the internet run code with your permissions, and we, as a community, pretty much can’t protect people from that particular learning experience.
