Objective
In my view, at its most “general” level of representation, the Jupyter Notebook is a rich, structured document.
Although frontends like JupyterLab impose even more constraints on this definition, such as “columnar document”, other frontends (e.g. voila + voila-reveal / jupyter-flex) do not.
Notebooks currently comprise three cell types (Markdown, Code, and Raw), but these are “implementation details”. I believe that we could generalise the Notebook even further, with the following goals:
- Support multiple interoperable Markdown renderers
- More easily facilitate polyglot kernels (e.g. SoS)
- Extend the kinds of rich output supported at the document level
Motivations
For some time I’ve felt that the existing three-cell-type notebook schema is both a blessing and a curse.
Whilst we support a huge range of code-cell output MIME types in the various Jupyter Notebook frontends (e.g. JupyterLab), the notebook itself can only represent a small subset of these at the document level in the form of cells. Besides Markdown, there are other markup languages that may be useful in the notebook context:
- Diagrams e.g. MermaidJS, DrawIO
- GeoJSON
- Vega (etc)
We support these in cell outputs, so why not support them directly as cells?
Furthermore, the Markdown cell is currently defined as using GFM (GitHub Flavored Markdown) syntax. Whilst this has served us very well over the years, there is an increasing number of projects that want to extend this in various ways:
- agoose77/jupyterlab-imarkdown: Embed rich output in Markdown cells.
- agoose77/jupyterlab-markup: JupyterLab extension to enable markdown-it rendering, with support for markdown-it plugins.
- agoose77/jupyterlab-myst: Integrating MyST functionality in JupyterLab.
To support these kinds of Markdown flavours at present, we have to re-purpose the existing Markdown cell with different renderers, and there is no standardised way to communicate this to the frontend-in-question; users need to know which extensions / packages to install.
One of the huge strengths of JupyterLab has been the rendermime interfaces that drive the rich-representation paradigm. I believe we should extend this to the Notebook itself; cells should be able to describe their contents sufficiently that the frontend can provide the appropriate view(s).
Details
Note: the following attempt at a “solution” isn’t actually a good fit for what would need to be done, but it is fun to at least consider.
Notebook Schema
In another thread, @fcollonval touched upon the idea that we might generalise the notebook to support more cell types, with a stronger model-view concept. I quite like this idea, and I wonder if we ought to go as far as to remove the cell “type” from the schema altogether, in favour of a single “MIME” cell. In this design, the three cell types are just views:
[
// Raw
{
"mimetype": "text/plain",
"data": "I am a raw cell"
},
// Markdown (GFM)
{
"mimetype": "text/markdown;flavor=GFM",
"data": "This is `some inline code` inside Markdown"
},
// Code (Python)
{
"mimetype": "text/x-python",
"data": "import numpy as np\nx = np.arange(10)"
},
]
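To make the mapping concrete, here is a minimal sketch of how existing cells might be translated into the single “MIME” cell shape shown above. The function name `to_mime_cell` and the `kernel_mimetype` parameter are hypothetical, as is the choice to default code cells to `text/x-python`; only the `mimetype`/`data` keys and the three example MIME types come from the listing above.

```python
from typing import Any, Dict


def to_mime_cell(cell: Dict[str, Any], kernel_mimetype: str = "text/x-python") -> Dict[str, str]:
    """Translate a legacy typed cell into the proposed {"mimetype", "data"} shape.

    Hypothetical helper: the target shape is the one sketched in this post,
    not an existing notebook format.
    """
    source = cell["source"]
    # The current notebook format allows the source to be a list of lines.
    if isinstance(source, list):
        source = "".join(source)

    cell_type = cell["cell_type"]
    if cell_type == "raw":
        mimetype = "text/plain"
    elif cell_type == "markdown":
        # Recording the flavour explicitly keeps old documents unambiguous.
        mimetype = "text/markdown;flavor=GFM"
    elif cell_type == "code":
        # Assumption: code cells take the MIME type of the notebook's kernel language.
        mimetype = kernel_mimetype
    else:
        raise ValueError(f"Unknown cell type: {cell_type!r}")

    return {"mimetype": mimetype, "data": source}
```

Attachments and outputs are deliberately left out of this sketch; how they would be carried is discussed next.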
Both Markdown cells and Code cells currently have the ability to carry extra data:
- Markdown cells have attachments that contain MIME-bundles
- Code cells have outputs that the frontend displays (usually below the cell editor)
These could be views of the same in-document data.
The point here is that the existing notebook schema enshrines these behaviours in the schema itself. By lifting this out of the notebook schema and into the frontend, we can extend things more easily, and (ironically) keep the notebook future-proof (e.g. with the flavor=GFM parameter).
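As a rough illustration of how a frontend might use the flavor parameter, the sketch below parses a cell's MIME type and picks a renderer, falling back to a generic one when the exact flavour is not installed. `parse_mimetype` and `RendererRegistry` are hypothetical names for this example and are not part of JupyterLab's rendermime API; renderers are modelled as plain functions from source text to rendered text.

```python
from typing import Callable, Dict, Optional, Tuple

Renderer = Callable[[str], str]


def parse_mimetype(value: str) -> Tuple[str, Dict[str, str]]:
    """Split 'type/subtype;key=value' into the base type and its parameters."""
    base, *raw_params = [part.strip() for part in value.split(";")]
    params: Dict[str, str] = {}
    for item in raw_params:
        key, sep, val = item.partition("=")
        if sep:  # ignore malformed parameters that have no '='
            params[key.strip().lower()] = val.strip()
    return base.lower(), params


class RendererRegistry:
    """Hypothetical registry keyed by (base MIME type, flavour)."""

    def __init__(self) -> None:
        self._renderers: Dict[Tuple[str, Optional[str]], Renderer] = {}

    def register(self, mimetype: str, renderer: Renderer, flavor: Optional[str] = None) -> None:
        self._renderers[(mimetype.lower(), flavor)] = renderer

    def lookup(self, cell_mimetype: str) -> Optional[Renderer]:
        base, params = parse_mimetype(cell_mimetype)
        flavor = params.get("flavor")
        # Prefer an exact flavour match, then fall back to the generic renderer.
        return self._renderers.get((base, flavor)) or self._renderers.get((base, None))
```

With this shape, a document that declares a flavour the frontend does not know could still be shown with the generic renderer, or the frontend could use the declared flavour to tell the user which extension is missing.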
Given that IRenders are allowed to modify their models, we can have multiple views for the same cell (just as we have rendered/source mode for Markdown cells), e.g. [image] vs [image]
Cell Execution
By simplifying the notebook schema, the frontends now have to do a bit more work. How do we implement kernel execution of code cells (in JupyterLab)? A “kernel” extension could be made aware of the MIME type for the current kernel (e.g. via the metadata.mimetype field of the notebook). For those cells in the current document with the correct mimetype, this extension is responsible for taking the code, executing it in the kernel, and storing the results in the notebook. This partially relates to @jasongrout’s comment here.
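The selection step described above could look something like the following sketch. Everything here is an assumption for illustration: the notebook is a plain dict with `metadata.mimetype` and a list of MIME cells, `execute` stands in for whatever actually talks to the kernel, and results are stored under a hypothetical `outputs` key on each cell.

```python
from typing import Any, Callable, Dict, List

Output = Dict[str, str]


def run_matching_cells(
    notebook: Dict[str, Any],
    execute: Callable[[str], List[Output]],
) -> int:
    """Execute every cell whose MIME type matches the kernel's; return how many ran."""
    kernel_mimetype = notebook["metadata"]["mimetype"]
    executed = 0
    for cell in notebook["cells"]:
        # Compare base types only, ignoring any ';key=value' parameters.
        base = cell["mimetype"].split(";", 1)[0].strip()
        if base != kernel_mimetype:
            continue  # not code for this kernel (e.g. Markdown or raw text)
        # Assumption: results live alongside the source under "outputs".
        cell["outputs"] = execute(cell["data"])
        executed += 1
    return executed
```

Passing `execute` in as a callable keeps the sketch independent of any particular kernel client, and would let a polyglot frontend run one such extension per kernel, each claiming only the cells that match its own MIME type.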