ipynb combined with nbdime already allows great diff reviews today, as shown e.g. in the jupyterlab-git extension (see demo.ipynb in the recording at https://raw.githubusercontent.com/jupyterlab/jupyterlab-git/master/docs/figs/demo-0-10-0.gif).
I agree that alternative representations like YAML or Markdown have value, and I would love to have a Markdown format with a preamble like [*], where one could define the server size, the type of kernel, the datasets to mount and the initialisation scripts to run. The Markdown notebook renderer would be free to honor the preamble definitions or not.
This leads me to the root of the question, which is IMHO the notebook spec, where I see 2 issues:
- The format is defined by a JSON schema (https://github.com/jupyter/nbformat/blob/a06f4c84738b338fee5ad6316b21918a8709b636/nbformat/v4/nbformat.v4.4.schema.json), which makes it easy to implement as JSON but hard or even impossible to apply to Markdown (to my knowledge there is no standard way to define where and how to put all those definitions in Markdown). So should we define how to apply the JSON schema to the concrete formats (md, yaml…), or move away from JSON schema and try to be more generic (I am not sure what that would look like)?
- The format is difficult to evolve, not from a technical standpoint, but in terms of community agreement. As many implementations use ipynb and many users rely on it as well, any change seems very difficult to get adopted (see e.g. Parameterized Kernel Launch at https://github.com/jupyter/enhancement-proposals/pull/46#).
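
To make the first issue a bit more concrete, here is a minimal sketch of one possible mapping, not something the nbformat spec defines: carry the parsed preamble in the notebook-level `metadata` object, which the v4 schema leaves open to additional keys. The function name `markdown_to_notebook` and the `preamble` metadata key are hypothetical, and the preamble is assumed to be already parsed into a dict.

```python
import json


def markdown_to_notebook(preamble, markdown_body):
    """Wrap a parsed preamble and a Markdown body in an nbformat-4-shaped dict.

    The "preamble" metadata key is an assumption made for illustration;
    nbformat itself does not define it.
    """
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        # Notebook-level metadata accepts extra keys in the v4 schema.
        "metadata": {"preamble": preamble},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": markdown_body}
        ],
    }


nb = markdown_to_notebook(
    {"kernel": {"image": "datalayer/kernel:base:latest", "size": "S"}},
    "# Iris\n",
)
print(json.dumps(nb, indent=2))
```

This only shows that the JSON side can absorb such definitions; it does not answer where they should live in the Markdown file itself, which is the open question.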
[*]
name: datalayer/paper:features
version: latest
description: Datalayer Features
picto:
  - variable: picto
server:
  image: datalayer/server:base:latest
  size: S
  prune: 1h
kernel:
  image: datalayer/kernel:base:latest
  size: S
  prune: 1h
datasets:
  - input:
      - variable: iris
        image: datalayer/dataset:iris:latest
      - variable: iris
  - output:
      - name: iris_predict
        variable: iris_predict
        type: pandas
        format: csv
        separator: ;
init:
  - name: iris_predict
  - load.ipynb
snippets:
  - |
    import …
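
A renderer that is "free to honor or not" such a preamble could be sketched roughly as below. This is only an illustration under stated assumptions: the preamble is assumed to sit between two `---` lines at the top of the Markdown file (a front-matter convention that the post above does not prescribe), `SUPPORTED_KEYS` is a hypothetical set of sections one particular renderer understands, and the key scan only looks at unindented `key:` lines rather than doing real YAML parsing.

```python
# Hypothetical: the preamble sections this particular renderer supports.
SUPPORTED_KEYS = {"kernel", "datasets"}


def split_preamble(text, fence="---"):
    """Split a Markdown document into (preamble_text, body).

    Returns an empty preamble if the document does not start with the
    fence line or if the closing fence is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != fence:
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == fence:
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text


def honored_sections(preamble_text, supported=SUPPORTED_KEYS):
    """Return (honored, ignored) top-level keys of the preamble.

    Only unindented "key:" lines are considered; this is not a YAML parser.
    """
    top_level = []
    for line in preamble_text.splitlines():
        if line and not line[0].isspace() and not line.startswith("-") and ":" in line:
            top_level.append(line.split(":", 1)[0].strip())
    honored = [key for key in top_level if key in supported]
    ignored = [key for key in top_level if key not in supported]
    return honored, ignored


doc = "---\nname: demo\nkernel:\n  size: S\ndatasets:\n  - input:\n---\n# Title\n"
preamble_text, body = split_preamble(doc)
honored, ignored = honored_sections(preamble_text)
```

A renderer that knows nothing about the preamble would simply skip it and display the body, which is what keeps the definitions optional.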