Sustainability of the ipynb/nbformat Document Format

bollwyvl · January 13, 2020, 8:47pm

Thanks for starting this!

A related discussion is on the JEP for DAP (the Debug Adapter Protocol). In general, we might want to separate the specification from the reference implementation, and embed the human-readable documentation inside the formal specification. DAP itself is a good example of this, with its toolchain. In the DAP case, the ability for a Jupyter spec to reference another spec, in a way that is both machine- and human-resolvable, would likely be preferable to re-implementing or re-documenting it.

Once (more) formalized, including a concrete reference to these specs in so-constrained objects would go a long way towards self-description, while actually including the schema might be a bit too Goedel-Escher-Bach. Including $schema seems like the most straightforward approach. What this does not provide, however, is an easy means for a document to be, for example, both an nbformat.v4 document and a particular, more-constrained format. I am not sure whether a schema could be crafted in such a way as to make this self-describing.
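To make the $schema idea concrete, here is a minimal sketch in Python. Both URLs are hypothetical placeholders, the top-level "$schema" key is an assumption rather than something nbformat defines today, and the allOf-based "profile" schema is only one possible way to describe a document that is nbformat.v4 plus extra constraints.

```python
import json

# Hypothetical schema URLs, used only for illustration.
NBFORMAT_V4_SCHEMA = "https://example.org/schemas/nbformat.v4.schema.json"
PROFILE_SCHEMA = "https://example.org/schemas/my-profile.schema.json"

# A minimal notebook that names the schema it claims to conform to.
# Assumption: a top-level "$schema" key, which nbformat does not define today.
notebook = {
    "$schema": NBFORMAT_V4_SCHEMA,
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [],
}

# One possible way to say "nbformat v4 plus extra constraints":
# a profile schema that composes the base schema via allOf.
profile_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": PROFILE_SCHEMA,
    "allOf": [
        {"$ref": NBFORMAT_V4_SCHEMA},
        # Example extra constraint: the profile requires a kernelspec.
        {"properties": {"metadata": {"required": ["kernelspec"]}}},
    ],
}


def declared_schema(doc):
    """Return the schema URL a document declares, or None if absent."""
    value = doc.get("$schema")
    return value if isinstance(value, str) else None


print(declared_schema(notebook))
print(json.dumps(profile_schema, indent=2))
```

With this arrangement a document would point its $schema at the profile, and the profile in turn points at the base format, so the document still names only one schema. Whether that counts as self-describing is exactly the open question.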

As this would generally necessitate a major (breaking) change on both ends of the pipe, I would also advocate for (optional) inclusion of a list of JSON-LD contexts, which would permit much deeper, unambiguous integration with high-value metadata formats like W3C Web Annotation and PROV.
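As a rough illustration of the optional context list, the sketch below puts a JSON-LD "@context" in the notebook metadata. The placement under metadata is an assumption (nbformat defines no such key), and the helper name is hypothetical. The two entries are the W3C Web Annotation context document and a prefix mapping for the PROV namespace.

```python
import json

notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {
        # Hypothetical placement: nbformat does not define this key today.
        "@context": [
            "http://www.w3.org/ns/anno.jsonld",
            {"prov": "http://www.w3.org/ns/prov#"},
        ],
    },
    "cells": [],
}


def context_entries(doc):
    """Return the JSON-LD context entries of a notebook, or [] if none."""
    ctx = doc.get("metadata", {}).get("@context", [])
    # JSON-LD allows a single context or a list; normalize to a list.
    return ctx if isinstance(ctx, list) else [ctx]


for entry in context_entries(notebook):
    print(json.dumps(entry))
```

Because the key is optional, tools that do not understand JSON-LD could ignore it, while JSON-LD-aware tools could resolve terms such as prov:wasGeneratedBy unambiguously.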

Finally, setting a goal for a computationally-lossless, yet publication-ready format would make sense. I submit that PDF/A-2 is a format really worth considering for this role, as it is already the de facto (or indeed, de jure) format in a number of domains. In addition to the familiar features of PDF, it includes a virtual file system, such that a “Jupyter PDF” could mean:

a PDF/A-2 document
at least one embedded .ipynb
