Notebook Metadata UX

The existing markup provision in JupyterLab, like most notebook tools, is plain old CommonMark. This is fine for simple notes, but unsuitable for authoring complex documents, such as those with references or figures.

MyST Markdown is an ambitious superset of CommonMark that includes document-level features such as references. It is incredibly useful for authoring rich documents, and the current primitive implementation in JupyterLab (jupyterlab-myst) has already proven invaluable to me in writing such content.

@stevejpurves and @rowanc1 are re-writing jupyterlab-myst as a stand-alone extension to better address the needs of a MyST renderer, and also to leverage the current JS direction of MyST, which is based upon the unifiedjs pipeline. There’s an ongoing discussion here: Alignment with mystjs · Issue #57 · executablebooks/jupyterlab-myst · GitHub

One of the main uses for MyST is in Jupyter Book, for which there is support for Jupyter Notebooks as MyST documents; content (e.g. figures) can be generated from cell outputs, and cell-metadata can be used to guide the subsequent transformations (e.g. adding a caption to a figure generated by a code-cell).

@choldgraf noted that we would like to improve the cell-metadata editing experience, which I’ve also found to be highly frustrating (escaping captions for JSON, losing content if the :heavy_check_mark: is not pressed before losing focus, etc.[1]).

A very simple improvement would be a stateful UI that converts the JSON to YAML. This is fairly niche; Jupyter Book already uses YAML extensively, so it would naturally map to our use case. But a longer-term solution would directly address the problem. In executablebooks/jupyterlab-myst#57 I wrote:

Notebook metadata editing is notoriously bad. JupyterLab has added a JSONSchema-derived UX for settings management, I wonder whether the next logical extension is for JupyterLab extensions to be able to declare a metadata schema (and JupyterLab provides the UX for free)?
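
To make that idea a little more concrete, here is a rough sketch of what a declared metadata schema might look like. Everything in it is an assumption for illustration: the `figure` key, its `name` and `caption` fields, and the `check` helper are made up, and this is not a real Jupyter Book or JupyterLab schema. The helper is a toy that only understands the `type`, `properties` and `required` keywords; a real implementation would use a proper JSON Schema validator, and a form generator could render input fields from the same schema.

```python
# Hypothetical schema an extension might declare for its cell metadata.
# The "figure" key and its fields are illustrative only.
FIGURE_METADATA_SCHEMA = {
    "type": "object",
    "properties": {
        "figure": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "title": "Figure label"},
                "caption": {"type": "string", "title": "Caption"},
            },
            "required": ["name"],
        }
    },
}

# Only the two JSON types used by the schema above are mapped here.
_TYPES = {"object": dict, "string": str}


def check(value, schema, path="metadata"):
    """Toy checker for the 'type', 'properties' and 'required' keywords only."""
    expected = schema.get("type")
    if expected is not None and not isinstance(value, _TYPES[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        # Report missing required keys first, then recurse into known properties.
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check(value[key], subschema, f"{path}.{key}"))
    return errors


good = {"figure": {"name": "fig-sine", "caption": "A sine wave"}}
bad = {"figure": {"caption": "A sine wave"}}
print(check(good, FIGURE_METADATA_SCHEMA))
print(check(bad, FIGURE_METADATA_SCHEMA))
```

The point is only that the extension ships the schema and the editor does the rest; cell metadata that fails the check could be flagged in the UI rather than silently saved.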

In this thread, I wanted to see what’s already out there, and get a sense of any immediate obstacles to replicating schema-based settings UX in the metadata view. I’m fairly sure that this must have come up before, but I couldn’t find anything. @fcollonval / @bollwyvl you are two people I suspect might have an idea about this?

I note that this topic overlaps with the planned Jupyter Notebook format workshop!


  1. See Don't dispose the notebook metadata editor on active cell change by fcollonval · Pull Request #13259 · jupyterlab/jupyterlab · GitHub

Have a look at this merged PR, which builds on rjsf.

I don’t think YAML is a solution to many interchange problems, which the canonical nbformat fairly firmly needs to stay tied to. TOML at least has a standard reader in python 3.11, and only one way to do multi-line strings, but that’s still probably not enough to move forward given all the languages involved that already have JSON read/write capabilities.

TOML/YAML are great as a UI for editing and reading JSON, however, especially with a schema-constrained language server backing it up… hopefully will have some opportunity to revisit a browser-side YAML language server, initially in JupyterLite but presumably in jupyterlab proper.

Longer con:

Floated out in the mists of time was the nbformat becoming substantially more self-describing, such as actually defining a $schema URI, and indeed, some of the more recent JSON Schema drafts make this much more feasible with an optional list of $vocabulary URIs. Again, URI: there’s really no expectation that a schema should be “hot loaded,” so knowledge of these would need to exist in software.

Whatever comes out of nbformat >=5, it should definitely bump up to at least a draft that supports $vocabulary, with enough time for downstream implementations to update.

Have a look at this merged PR, which builds on rjsf.

:rofl: :rofl: :rofl: :rofl:

I have been following JLab 4, and missed this PR entirely. Thanks for linking it in.

I don’t think YAML is a solution to many interchange problems, which the canonical nbformat fairly firmly needs to stay tied to. TOML at least has a standard reader in python 3.11, and only one way to do multi-line strings, but that’s still probably not enough to move forward given all the languages involved that already have JSON read/write capabilities.

I agree. To clarify, I’m suggesting that Jupyter Book (which uses YAML everywhere) would benefit from an extension (in the absence of the above PR) that exposes the JSON metadata as slightly-more-friendly YAML. I.e., what you wrote!

TOML/YAML are great as a UI for editing and reading JSON,

The serialised form remains JSON.

Floated out in the mists of time was the nbformat becoming substantially more self-describing, such as actually defining a $schema URI, and indeed, some of the more recent JSON Schema drafts make this much more feasible with an optional list of $vocabulary URIs.

Right, I’m in favour of this; having notebooks become a higher-level abstraction / standard.