Julia community is creating a new notebook format

I took some notes! I left out some comments that were explicitly noted as off the record. I tried to reflect what folks were saying, but if there is any of this that you would like me to remove from this post, please reach out!

I will let @fperez explain in more detail, but we ended with universal agreement that we would organize a mini set of lightning talks around different notebook tools, à la the VS Code teams, jupytext, etc., to serve as a venue to collect some of the different opinions and people in the space.

Here are my notes:


Attendees

  • David Anthoff: Senior fellow at BIDS

    • Julia and Jupyter
  • Jason Grout:

    • Works at Bloomberg
    • Ipywidgets + JupyterLab + various other Jupyter projects (like nbformat) and other Jupyter community things
  • Matthias:

    • Jupyter developer since 2012
    • At Quansight, before at BIDS
  • Saul:

    • Work at Quansight, on Jupyter, JupyterLab, and the PyData ecosystem
  • Carol Willing:

    • Working for a long time with Jupyter
    • Working for a startup (noteable.io) based off of nteract and papermill
    • Passionate about education
    • Involved in Python as well
  • Tony Fast

    • Data science at Quansight
    • Made a lot of notebooks
  • Matthew Brett

    • Was at Berkeley till 2017, now at the University of Birmingham in the UK
    • Teaches notebooks
    • Try to get students from notebooks to editor
  • Travis Oliphant

    • At Quansight and Open Teams
    • Cheering fan in Jupyter Community
    • Background around NumPy and Anaconda and SciPy
    • Perspective from industry users on consulting
  • Fernando Pérez:

    • IPython/Jupyter since 2001/2014
    • At UC Berkeley Statistics Dept since 2017.
  • Chris Holdgraf

    • Work at BIDS with Fernando
    • Mostly on Binder and JupyterHub
    • Right now working to set up nonprofit around infrastructure for computing around education in Jupyter space
  • Tim George

    • Work on JupyterLab team as UX designer
    • Work through Cal Poly and consulting through Quansight
  • Nick Bollweg

    • “taken Travis’s money on a number of occasions”
    • Work at Georgia Tech
    • Help people do things on JupyterLab, Conda Forge, JupyterHub
    • Been for a year and a half working on JupyterLab LSP
    • Worked with other orgs to ship and test interactive tools for computing
  • Gonzalo Peña-Castellanos

    • At Quansight, Inc
    • Working with Spyder since 2015.
    • Started working with JLab development this year.
    • Working on JLab on the localization effort.
    • Following the Jupyter community over the years :-p
  • Chris just realized that this hackmd was here! He has been taking notes over here: notes - meeting w/ David Anthoff from MS - Google Docs

Agenda?

Five minute rundown from David Anthoff on what they are doing, on Julia side of VS Code extension.

Then Chris Holdgraf can take over?

Can this be a listening exercise, to exchange what’s going on?


The story on the Julia side is that we are keen supporters of the Jupyter philosophy. The Julia kernel was one of the earliest kernels. The IJulia package has always gotten great support.

Also a desire to have good editor integration for Julia, so at that point JupyterLab wasn’t on the horizon, just notebooks.

We had a first serious attempt using LightTable. Most efforts shifted at that point to Atom. Up until two months ago, the main editor integration effort, billed on the Julia homepage, was based on the Atom project. It was called Juno; Julia Computing had a pro version of this that installed Atom + this + Jupyter.

The Juno project based on the Atom editor was the main vehicle. Atom has a reputation for being slow. Some people were not so happy with it. There was also the spirit of having integration with all editors. So at some point, I started an extension for VS Code, with no strategy besides wanting support there. At some point Zack Nugent showed up and wrote LSP support for Julia.

Over the years, it emerged that VS Code experience was second most popular story for Julia. Then, two or three months ago, Juno folks decided to give up that effort and join VS Code efforts.

The reason for that is that Atom basically stopped development. Not clear if that is from the Microsoft acquisition. We don’t have resources to continue maintaining Atom, so we just join VS Code.

On one hand, I was thrilled that we had more people here; on the other hand, I felt bad that they spent all this time and now it has been abandoned. There was never a competition between the two teams. It does mean that the VS Code extension is the go-to thing for Julia environments. The Vim and Emacs people have their own thing and reuse LSP support. But those are niche projects really in terms of user counts.

We are up to something like 130k-140k downloads of Julia extension. We have opt in telemetry, we have about 15-20k active users per month. Lower bound, because it’s opt in. Nowhere near Julia numbers, but not five folks in a garage which use product.

In terms of features, it’s really exploded. Originally we just had syntax highlighting. Now we have a full debugger, LSP, a workspace view to see all global variables, a data table viewer, a plot pane, an integrated REPL, and lots of functionality to send code from a file into the REPL.

So, in terms of feature scope, we are aiming at RStudio or JupyterLab.

But we were originally focusing on editor, editing text files of code. Now morphed into bigger thing. When things got weird, from my point of view, was when MS added complete Notebook UI and functionality into base project. You don’t have to do any UI work. As an extension, you implement a few things, like how I load and save file, and how to execute cell, and you are up and running.

I added a prototype in this PR, not focused on file format, but just playing around. Incredibly easy to build an extension that gives full support to Julia. Full access to the language server, didn’t have to add anything. Almost no work to get the debugger.

We will have things like workspace view for notebook and so on.

Oh, so I just started, this API is here I will use it. But I stepped back and said, this is the same UI as JupyterLab. On LHS you have list of files and notebooks. But it will be richer experience, because of debugger, LSP, etc.

I don’t know what this means for JupyterLab, but it’s weird, and unexpected from our point of view. To invest so little time and be in this space.

We haven’t thought much about governance questions. I just wanted a good integration, and users are coming to us.

I can talk a little about the relationship with Microsoft.

I think it’s important to note there are different groups in MS that we end up interacting with.

Our interactions with the VS Code team have been fantastic. They are responsive and supportive, they reach out to developers, they have calls with extension authors and listen to us.

At the same time, it’s clear who is calling the shots. You can make suggestions and argue your case, but the decision is made by that team.

That has not bitten us in any way so far, the decisions have been good. We always like what they have decided. But that’s been relationship with core editor team.

Then, MS is shipping a bunch of non-open-source extensions for VS Code, which make up a lot of the appeal. That is a different team. With the remote extensions, you connect via SSH to another machine, Julia is running on the other machine, half of the extensions automatically run there, it’s literally magic. Those parts are all closed source. Those features are important. We have a lot of Julia users who work in cluster environments, and they all use this.

Certainly, it’s closed source.

There is also an extension called Live Share that provides real-time collaboration, like Google Docs. Has audio and all of that. Also closed source. Works well. There will be full integration with the notebook story. Completely closed source.

Also, integration with Codespaces? Starting in September, every repo will have a button that throws you into a web editor, all on Azure, all closed source. A little like Binder. But it will be a button on GitHub. A fantastic experience for users. Zero community input.

Fernando: Seems like they basically just were inspired by Binder

David: There is no governance story, besides MS calling the shots. The issue is that the experience is really smooth. It’s tough competition.

Then there is the Python extension, developed by MS. Separate team from the VS Code team. Interaction with that team has been most frustrating (General discussion about support for other languages than just Python · Issue #1536 · microsoft/vscode-jupyter · GitHub ). The Python extension added Jupyter Notebook support, but did it by themselves. They just added a webview, didn’t add it through the standard API.

The notebook interface in the Python extension is a year or two old, and the notebook API hasn’t shipped. So they went with their own thing before it was available. They are, of course, changing it, migrating the extension to use the notebook UI, so the custom UI will go away.

But what the Python extension is doing is communicating with Jupyter not via the kernelspec protocol; instead it requires you to have Jupyter installed and communicates via https.

Jason: They use the JS JupyterLab libraries to communicate with the Jupyter notebook server. We have had conversations with them about this.
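For readers less familiar with the distinction being drawn here: a kernelspec is just a small `kernel.json` file (with `argv`, `display_name`, and `language` keys) that tells a frontend how to start a kernel process directly, with no notebook server in between. The sketch below is only an illustration of that idea. The `argv` entries, the kernel name `julia-example`, and the connection file path are hypothetical and are not the command that IJulia actually installs.

```python
import json
import tempfile
from pathlib import Path
from typing import List

# Hypothetical kernelspec for a Julia kernel. The argv entries are
# illustrative only, not the exact command IJulia registers.
KERNEL_SPEC = {
    "argv": ["julia", "-i", "kernel.jl", "{connection_file}"],
    "display_name": "Julia (example)",
    "language": "julia",
}


def write_kernelspec(kernels_dir: Path, name: str, spec: dict) -> Path:
    """Write <kernels_dir>/<name>/kernel.json and return its path."""
    spec_dir = kernels_dir / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    path = spec_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2))
    return path


def launch_command(spec_path: Path, connection_file: str) -> List[str]:
    """Build the argv a frontend would run to start the kernel itself."""
    spec = json.loads(spec_path.read_text())
    # The frontend fills in the placeholder with the path of a connection
    # file that lists the ports and key used to talk to the kernel.
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]


with tempfile.TemporaryDirectory() as tmp:
    spec_path = write_kernelspec(Path(tmp), "julia-example", KERNEL_SPEC)
    command = launch_command(spec_path, "/tmp/kernel-1234.json")
    print(command)
```

A frontend that works this way only needs the kernel to be installed. Going through a notebook server over HTTP additionally requires a Jupyter (and therefore Python) installation on the machine.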

David: For other languages, this is a problem. The Python extension is grabbing the way to open ipynb files. This only works if you have Jupyter and Python installed.

For other languages, this is weird. If Julia users want to use a notebook, I don’t want them to have to install a Python extension. For a simple use case, I don’t want them to have to install this other stack.

At least to me, it’s not in spirit of kernel communication.

Fernando: I fully agreed with how you phrased it in the thread. Sure, you can jump on some existing tool, but the point is that you can implement it as cleanly as possible.

David: I feel the interactions there were quite frustrating. It came across as: you small languages go away, we will do what we want. I don’t feel heard in that thread.

Jason: in our conversations with them, I suggested they were introducing too many pieces, and should look at nteract code to communicate with server. I never heard back if they looked at nteract implementations

David: A bunch of us mentioned nteract as the way to go.

It’s fine if they do this for Python, but they affect other experiences as well.

This was one place I felt annoyed, by MS, and the governance issues come up.

I will stop now.


Nick: On the spec front, there have been a number of germane conversations. Both LSP and Jupyter are driven entirely by implementations. Same on the LSP side, a markdown is not a spec. I have no problem with the Debug Adapter Protocol, but with LSP, if you are not using the nodejs implementation, you are a second class citizen, and all the non-conforming implementation has been smoothed over.

David: That hasn’t been our experience, the LSP has been in Julia, and it works perfectly well. That has worked well for us.

David: It works in other clients,

NB: some issues with LanguageServer.jl

Nick: We have same issue, where our specs are RST docs, we should probably get back to this validation. We still need to formalize Jupyter Markdown, who knows? It’s whatever works in classic.

It’s all about publishing as well.

Shipping interactive computing that works offline:

Fernando: We can start to have this format discussion in Jupyter.

Jason: David, what is your timeline for coming up with a notebook format?

David: 3-6 months. I primarily care about a good experience for Julia users. Then interop, etc. So what I want is that people can open a folder, they open a file, and it tells them to install the Julia and Jupyter extension (optional) and it just works.

I don’t see how that would play out, with python extension grabbing format.

Best case is that the user clicks on that, VS Code lets them know about all the extensions, and it’s a mess.

So that’s why I would say, we just say, let’s have our own extension. That would insulate us from all the chaos that MS is creating or not.

I like rmarkdown way of doing things.

That’s another consideration.

There exists a Julia markdown flavor, all the docs exist in this. There exists a thing called Weave that we use to render this. If we go with this R notebook story, we already go with the markdown flavor that we use.

There are three options:

  1. Jupyter Notebook
  2. Just Weave format that exists, they already exist
  3. Or we could go with new file extension, with subset of Weave markdown

We don’t have a process on how to make this. We don’t have a timeline. In terms of governance, we have four people.

Jason: Is it the bigger community or just you?

David: It’s just the four of us who merge or don’t merge. If we do make our own things, I would circle back to core Julia devs. While the four of us make the decision, I would kick off a process where we listen to more people.

Jason: At JuliaCon, I was impressed by the number of people using JupyterLab over VS Code. 20% for JLab, 60% for Atom and VS Code. Having your own extension in just your own tool isolates you in your own community.

David: Yeah, we would certainly have a new format. I love JupyterLab, I use it every day. But I think once we ship this, I will do my editing in VS Code, because it will be a better experience. I wouldn’t be surprised if this cuts into JupyterLab.

Jason: Although I work on JLab, I care more about formats and protocols. At the same time we are working hard to make JLab better, adding a debugger and a variable explorer.

David: Here is the conundrum. The people that work on the debugger are two people. So the reality of it is, we don’t have anyone who would provide integration with JupyterLab. It’s just a manpower thing.

Jason: I would love to have other conversations about how to make JLab better.

David: We have never had a decision that has been as disruptive as this notebook question. That’s why I keep stressing, I am going slowly.

Chris: Jupyter’s primary goal is not to build tech, but around standardization. There are probably people high up in the VS Code world who are aligned. One of the key points of VS Code has been being language and community agnostic. If you said “Hey, we are thinking about forking off of a standard, b/c the Python team is doing xyz” then you might get buy-in from them. So.

Fernando: One of these things is bandwidth. It is critical that there is time pressure. It makes sense to convene this discussion, from the perspective of Jupyter. What tools have the right to exist? JupyterLab, Spyder, VS Code? How can these tools serve them, in the best possible way with the best possible interop?

And the second, which is related, is the question of the file format. What could be the happy marriage of RST and markdown?

We do have moral highground of multi user space.

The only fear that I have is the bandwidth to steer this discussion. My reality is that I am screwed, now. These 1200 students will be my highest priority.

If I wasn’t teaching right now, I would say, let’s organize this to have a timeline with decisions made.

David: One other thing that would be helpful for me. I haven’t followed your discussion around alternative file formats, so I don’t know what’s going on. What’s in the file could be a Jupyter story. I could see that working out. The file extension is different, but the content has interop. I think this requires me understanding where you all are going with these conversations.
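To make the "different file extension, interoperable content" idea concrete, here is a minimal sketch that turns a Markdown document with fenced `julia` code chunks into the JSON structure of a Jupyter notebook (nbformat 4). It is only an illustration under stated assumptions: the function name `markdown_to_notebook` and the sample text are hypothetical, and this is neither Weave’s nor Jupytext’s actual conversion logic.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built here to keep this example readable


def markdown_to_notebook(text: str, language: str = "julia") -> dict:
    """Split Markdown into cells: fenced blocks tagged `language` become code cells."""
    cells = []
    buffer = []
    in_code = False

    def flush(cell_type):
        # Turn the buffered lines into one cell, skipping empty ones.
        source = "\n".join(buffer).strip("\n")
        buffer.clear()
        if not source:
            return
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
        cells.append(cell)

    for line in text.splitlines():
        stripped = line.strip()
        if not in_code and stripped == FENCE + language:
            flush("markdown")
            in_code = True
        elif in_code and stripped == FENCE:
            flush("code")
            in_code = False
        else:
            buffer.append(line)
    flush("code" if in_code else "markdown")

    return {
        "cells": cells,
        "metadata": {"language_info": {"name": language}},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


SAMPLE = "\n".join([
    "# Demo",
    "",
    "Some prose.",
    "",
    FENCE + "julia",
    "x = 1 + 1",
    FENCE,
    "",
    "More prose.",
])

notebook = markdown_to_notebook(SAMPLE)
print(json.dumps(notebook, indent=2))
```

The text file stays friendly to diffs and editors, while the generated structure is what existing ipynb tools expect. Outputs are the hard part that a sketch like this ignores: a text format has to decide whether and where they are stored.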

Fernando: Yeah talking about organizing that.

David: The bandwidth problem is difficult. It’s hard to sync up limited time with standards process.

Fernando: It’s tricky, these discussions have a cost in time. They are an upfront cost, and they are harder when there are multiple big stakeholders.

What are some potential temporary interop measures? So you aren’t blocked and we can figure out a solution in six months.

David: Can I ask if there are drafts for a markdown document?

Chris: I think right now, we are in a super saturated space. Jupytext is becoming the pandoc of text-based notebook formats. Thus far, Jupyter hasn’t made a push on one of these. Jesse Perla, who is involved in Julia, has been working on a team with QuantEcon that has been working on one of these formats. Clearly a lot of people would benefit. There is, in some cases, funding for this kind of thing. We have a grant to fund Jupyter Book dev, we could use that for helping lead this discussion.

Nick: From the education perspective, it is about being able to discover what is available and getting more directly to publishing. We don’t need to win developers, we can focus on the next billion people who need to learn. ProseMirror is amazing. If we would push on markdown, we should push on a GUI editor like this. My preference would be to move to HTML or something like that.

We can’t win markdown. If CommonMark hasn’t figured it out, then this isn’t a good language for long-term archival of tools. I don’t wanna be checking in diffs of HTML, but maybe that’s what it takes.

Fernando: I know we are playing a similar movie of being interested in seeing this work, and being scared of bandwidth. We give 10 slots for 10-minute talks, where people give a sequence of talks.

So that we gather that material in one place.

Well before JupyterCon, just a little meeting for two hours. We have a Zoom link, and record it, we have a link to YouTube, and a GitHub repo.

I think that much we can do.

David: I think that would really help us.

Fernando: We can allow live or recorded. 10-minute slots, 12 of them, to happen in 2-4 weeks max.

Chris: Can we build relationship with MS?

David: We asked the Python team about this and they said they were busy. Maybe opening something with VS Code core could be useful. I could open up a conversation on the VS Code tracker. I don’t wanna be the middle man.

Chris: Can you lead a working group for next three years on Jupyter and VS Code? /s

Fernando: My concrete proposal, along these: GitHub - jupyter/jupyterhub-2016-workshop: Materials for an online mini-workshop around JupyterHub use cases, held July 22nd, 2016

  • Make a GitHub repo for this. They can open an issue.
  • We invite the VS Code team, this is how we start. Tell us what you are doing.
  • Notebook interoperability online workshop
  • Anyone who is involved, we contact and ask to give a talk.

David: Give the Python team 10 minutes as well as the core team.

Fernando: Jupytext, jupyterlab, nteract, etc. Give a prompt, where you show off features, and you should end with questions about format that this brings up.

Useful as it comes from Jupyter team.

David: Who organizes it?

Fernando: I can take this on and ask for a little bit of help. It’s a Zoom call, a Google Doc, and a GitHub repo.

We make explicit that it isn’t a long term thing.

I think this fire is gonna burn in a way, that if we don’t do anything now it will burn harder in the future.
