Jupyter Security Best Practices Workshop

This is a continuation of an initial discussion on the Jupyter Security mailing list to evaluate the interest in a workshop focused on security for Jupyter. Based on the response on the mailing list, it’s clear that there’s interest, and the question now is how I and one or more motivated individuals should organize it. This is also the time to broaden the participants in the discussion. (Thanks Yuvi and Brian.)

(For those of you on the security list, the content below is very similar to my original email to the list.)

Idea & Questions

An outcome of the recent Jupyter Community Workshop hosted by NERSC and BIDS is the idea to hold another workshop focused on Jupyter security, particularly best practices for deploying and managing JupyterHub. As part of my work supporting data services for ALCF projects, I can lead the organization of the workshop. To me, this would be an initial event to bring more focus on security within the Jupyter community.

To motivate us further, there is funding available for hosting and travel from the Argonne Leadership Computing Facility (ALCF) and we could look for additional funding from the Jupyter Community Workshops effort by Bloomberg. A caveat to the ALCF support is that the event would need to occur by October 1, 2019.

Another topic for this workshop is diversity. I understand this can be a challenge within IT generally, and security especially, but we will get further by putting some effort toward improving representation. So whatever we can do within the bounds of our funding guidelines, let’s try.

Here are some questions for this group to gauge the level of interest and support for a “Jupyter Security Best Practices Community Workshop”:

  • Does this sound meaningful and worthwhile?
  • What are reasonable outcomes for this workshop?
  • Who should attend?
  • How long should it be?
  • When are the critical attendees available?

Below are some of my thoughts based on the discussions last week.

Potential outcomes for the workshop:

  • Updated Jupyter documentation on security.
  • A white paper on “Jupyter Security Best Practices”, scoped to various levels like DOE supercomputing centers, campus research computing centers, and workshops.
  • Summarizing Jupyter development practices that target security (e.g., code reviews, auth modules, CVE tracking, etc.).
  • Submit an engagement proposal to Trusted CI to improve security in the Jupyter ecosystem, including development, operation, and usage of Jupyter.

Potential dates:

  • August 27-29, 2019
  • September 24-26, 2019

I appreciate any feedback and insight you all have.

Sincerely,

Rick Wagner

Globus
University of Chicago
Argonne National Laboratory
rick@globus.org

Slightly off topic: do you know where I can subscribe to this list? A quick Google search didn’t find anything :(

I think that’s intentional (since a lot of security exploits get posted there it is treated as an invite-only list).

@rpwagner I bet a lot of the folks at the HPC workshop, as well as many people in companies and government facilities would be interested in this topic.

I am able to attend on either of those dates.

I think this is a great idea! I (or one of my colleagues) would certainly be interested in attending and contributing. I would probably prefer a two-day format over a one-day format so that we can produce more meaningful artifacts.

This is great! Is the agenda envisioned to be specific to JupyterHub deployments? I’d be interested in attending if the agenda were more general, such as multi-user/multi-tenant Jupyter deployments, since I’d be able to contribute more.

I’d love to attend this workshop as well. @jaipreet-s, @yuvipanda and I have been working on the Jupyter Telemetry System. It would be great to discuss how this new system could improve security around Jupyter applications.

Both dates should work for me.

Hi All,

Thanks for the feedback. Here’s an update from some discussion that occurred as part of PEARC.

First, Trusted CI is very interested in helping promote security in the Jupyter community given how widely Jupyter has been adopted. It was already on their radar as a project they would like to contribute to. That’s good for us because it makes an engagement with them more likely.

Also, September as a target date hasn’t worked out (too many conflicting events), which frees us up to consider better choices. (Unfortunately, it also means we need to find some funding, but I’m optimistic about that.) Mark Krenz from Trusted CI has suggested a half-day workshop tied to the NSF Cybersecurity Summit for Large Facilities and Cyberinfrastructure in October. That’s in San Diego, which makes it good for those of us on the West Coast. That could be a planning session for a larger workshop and some other next steps in the security space.

A community effort centering on improving security best practices would be fantastic, and you’ve listed some great (and ambitious) outcomes for a workshop. I think an impactful outcome would be a working structure for a long-term effort to improve documentation, visibility, and commitment to security best practices by users and developers (for example, your engagement proposal to Trusted CI sounds interesting).

I think 2-3 days is a good length for a workshop like this.

Related to this, a few of us are giving a tutorial on setting up JupyterHub at the Trusted CI NSF Cybersecurity Summit next week. As part of the basic instructions we’ll focus on:

  • User auth & user management
  • TLS at all levels
  • Deployment practices

Some of the security practices will be basic and common to any web service, but we want to provide as much Jupyter-specific advice as possible. I’ve started a collection of links in a Google Doc to draw upon.

If anyone has a minute to share their favorite reference on Jupyter security, it would be much appreciated. The contribution does not have to be a link or fit the categories currently in the doc; please help us build a corpus to draw upon for other uses.

A couple of items on this thread.

First, the training at the Cybersecurity Summit went really well. It was a great experience hearing the perspectives of the security staff and administrators who were the primary attendees. They’re clearly hearing requests from researchers about running Jupyter on various resources, and the training helped them understand what Jupyter means in their context. The slides are available and are a good starting point for other training (perhaps PEARC 2020). Many thanks to @carreau for participating and representing the Jupyter development community.

Second, I’m working on a submission for the latest Jupyter Community Workshop solicitation. The potential outcomes for the workshop originally stated are a good starting point for the agenda:

  • Updated Jupyter documentation on security.
  • A white paper on “Jupyter Security Best Practices”, scoped to various levels like DOE supercomputing centers, campus research computing centers, and workshops.
  • Summarizing Jupyter development practices that target security (e.g., code reviews, auth modules, CVE tracking, etc.).
  • Submit a proposal to Trusted CI or another group to improve security in the Jupyter ecosystem, including development, operation, and usage of Jupyter.

If anyone else is interested in helping steer this, please let me know. If you know someone who would be critical to this conversation (e.g., @yuvipanda and @minrk), I’d like to target dates they could attend if interested.

Hi Rick. Down here in Aotearoa we at NeSI are currently working through / debating our first cut of a GA JupyterHub architecture that allows NeSI users to interactively access HPC services and perform data analytics in place on our high-performance filesystems. The architecture as it relates to security is our biggest sticking point at present, so we’d be very interested to hear more about this workshop you’re wrangling…

Heya, Blair!

We’re hoping to host the workshop sometime in May and to release more details soon. Right now, the focus would mostly be on defining security practices, but we could look at expanding it to cover a few cases so that the attendees are representative of the Jupyter community. Along those lines, we’re also looking for security people to attend, not just systems people or developers. They can pitch in with their knowledge even if they’re not familiar with Jupyter.

–Rick
