Inline variable insertion in markdown

I’d expect the Markdown cell to act just like any other cell and evaluate expressions at execution time. I would also expect it not to affect cell numbering.

I’d be curious to hear if others have the same intuition.


A few quick thoughts:

My intuition matches @stefanv - in my mind, I think of the “execution markdown” areas as no different from a code cell. Whatever is inside gets executed, at render time, in the order that it appears in the markdown cell. Hitting “shift+enter” to render the markdown cell would trigger this operation. I could understand if there were some differences in behavior (e.g., an error simply results in a red block rather than a full traceback or something, or only “expressions and variables are allowed”) but behavior of a snippet of executable markdown shouldn’t deviate too much from a “regular” code cell.

Regarding variables vs. expressions vs. arbitrary code, I suspect that “expressions and variables” will make up the large majority of use-cases people want from this. Most people will want to use this to insert a variable into their text, or potentially call a function that returns something cool. I doubt that there will be a strong demand for arbitrary variable assignment etc - IMO that is best left to a code cell.

To that point - I also think that the model of statefulness should follow the (admittedly controversial) model that Jupyter Notebooks follow already. The default behavior should be that it doesn’t do anything too clever about re-rendering variables and such. If you want your markdown to update, re-render it. If you want to ensure that the display matches the state, then re-execute from top to bottom (or execute via something like nbclient). Maybe there’s a future conversation about addressing this behavior more generally, but I think most users will just map their mental model of code cells directly onto the executable markdown areas, so this is what users will expect.

(also, more code snippets with references to 80s/90s pop songs plz :upside_down_face: )


Ah one other thought re: formatting and such, could we also piggy-back on jinja filter syntax for this? (I’m not necessarily saying use Jinja directly, but just the general pattern of {{ somevar | somefunction }}.) Perhaps that could be used for some basic formatting options - I definitely agree people will want to be able to format floats to strings at the least.
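
To make the {{ somevar | somefunction }} pattern concrete, here is a minimal sketch in Python. It is not Jinja and not the prototype's implementation; the `FILTERS` registry, the filter names `round2` and `upper`, and the `render` helper are all made up for illustration, and only bare variable names (not arbitrary expressions) are handled.

```python
import re

# Hypothetical filter registry; the names are illustrative only.
FILTERS = {
    "round2": lambda value: f"{value:.2f}",
    "upper": lambda value: str(value).upper(),
}

# Matches {{ name }} or {{ name | filter }}; names are plain identifiers.
PATTERN = re.compile(r"\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}")


def render(text, namespace):
    """Replace each {{ name }} or {{ name | filter }} with a value from namespace."""

    def substitute(match):
        name, filter_name = match.group(1), match.group(2)
        value = namespace[name]
        if filter_name is not None:
            return FILTERS[filter_name](value)
        return str(value)

    return PATTERN.sub(substitute, text)


print(render("pi is roughly {{ pi | round2 }}", {"pi": 3.14159}))
```

A filter here is just a function from a value to a string, which is the part of the Jinja pattern being borrowed.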

If it serves as helpful inspiration, I think {{ mustache }} is an interesting project w/ similar syntax: https://mustache.github.io/

The Binder above is ready if people want to test things out. Outputs aren’t saved in the notebook (due to existing markdown attachment handling), but re-executing cells works out of the box!

One of the tricky parts will, I think, be handling the formatting of rendered outputs. Currently, we have a simple set of CSS rules that handle known cases like the result of text/plain (which is just a pre tag), and widget h/vbox.

Binder link

Very nice, this works well!

It would be helpful to be able to see when the expression did not evaluate correctly. Currently it is quite confusing that once a mustache rendered successfully it will keep that value, even if the mustache is subsequently modified and fails to execute the next time.


Here’s a Binder link that points to the example notebook:

Also note that you’ll need to %pip install numpy to get the example to run!

Regarding @stefanv’s comment, I think that this would reproduce the error he describes (I’ll write it in myst markdown since I can only work with text here)


This variable is undefined {{ a }}. It will be empty when the markdown cell is rendered.

```{code-cell}
a = 2
```
Now we'll re-execute the markdown: {{ a }} and it will display as `2`

We'll then delete that variable

```{code-cell}
del a
```

Now we'll render the variable again, and it will **still** be 2! {{ a }}.

@choldgraf @stefanv good catch; that issue is actually being tracked on GitHub, which is where the development discussion is / will be happening. My thought here is that we clear the old outputs before they are replaced, so any failed expressions are left empty.
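
A rough sketch of that "clear before replace" idea, in Python. The assumptions are loud ones: `eval()` against a plain dict stands in for a kernel request, and `render_cell` is a hypothetical helper, not code from the extension. Every render starts from nothing, so a previously successful value cannot survive a later failure.

```python
import re

EXPRESSION = re.compile(r"\{\{(.*?)\}\}")


def render_cell(source, namespace):
    """Evaluate every {{ expression }} afresh; failed expressions render as empty."""

    def substitute(match):
        try:
            # eval() is only a stand-in for asking a kernel to evaluate the expression.
            return str(eval(match.group(1).strip(), {}, namespace))
        except Exception:
            return ""  # nothing cached from an earlier render is reused

    return EXPRESSION.sub(substitute, source)


namespace = {"a": 2}
first = render_cell("a is {{ a }}", namespace)   # rendered while a is defined
del namespace["a"]
second = render_cell("a is {{ a }}", namespace)  # rendered after `del a`
print(repr(first), repr(second))
```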

Heya, I’ve created a (very) draft PR in nbclient: “PoP for Inline variable insertion in markdown” (jupyter/nbclient, pull request #160).

Here you simply provide it with a callable, to extract substitution expressions from the Markdown content, then execute the extracted expressions as user_expressions, and finally save them as attachments.

So very much like the @agoose77 prototype, except you are essentially extracting the variable in a pre-process stage (rather than having to go all the way through to rendering the Markdown), then it would be easy to “later” inject these attachments back into the Markdown parse+render in myst-nb or nbconvert etc (this should also be able to still work with jupyter-cache)

It also means nbclient doesn’t directly need to know anything about the Markdown syntax
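
As a rough illustration of that pre-processing step (not the code in the PR, whose actual signature is not shown here), the sketch below uses a hypothetical `extract_expressions` callable and turns its result into the kind of name-to-expression mapping that the `user_expressions` field of a Jupyter `execute_request` carries. Using the expression index as the key is an assumption of this sketch.

```python
import re


def extract_expressions(markdown):
    """Hypothetical extraction callable: return expressions in source order."""
    return [found.strip() for found in re.findall(r"\{\{(.*?)\}\}", markdown)]


def to_user_expressions(expressions):
    """Key each expression by its position, since the mapping needs string keys."""
    return {str(index): expr for index, expr in enumerate(expressions)}


source = "Total: {{ a + b }} and mean: {{ (a + b) / 2 }}"
user_expressions = to_user_expressions(extract_expressions(source))
print(user_expressions)
```

Only `extract_expressions` knows about the {{ }} syntax; everything downstream works on plain strings, which is the point about the executor not needing to understand Markdown.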

A few additional thoughts:

  • I would definitely give these cells an execution number; the “imarkdown” maybe should have a slightly different JSON schema to “markdown”, that is a hybrid between “markdown” and “code”
  • For integration with nbconvert, it would be cool if it adopted markdown-it-py (executablebooks/markdown-it-py on GitHub) as the default parser :wink: then it would be easy to synergise between the JavaScript and Python implementations
  • I guess the Jinja {{ }} syntax is a slightly double-edged sword; it is fairly intuitive for anyone who has used jinja/mustache, but it may clash with other uses of these engines, and people may expect it to work exactly the same. (It would certainly clash with https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#substitutions-with-jinja2)
  • For injecting things like floats, it would be cool if you could do something like f-strings: {{ variable:.3f }}, although with python kernels you can already do {{ f'{a:.3f}' }}
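
For the f-string-like {{ variable:.3f }} idea in the last bullet, one naive way to sketch it in Python is to split on the last colon and hand the remainder to the built-in `format()`. Both helpers below are hypothetical, `eval()` stands in for the kernel, and the split is deliberately simplistic: it would mis-handle expressions that themselves contain a colon (dict literals, slices, lambdas).

```python
def split_format_spec(expression):
    """Split 'body:spec' on the last colon; naive, see the caveat above."""
    body, separator, spec = expression.rpartition(":")
    if not separator:
        return expression.strip(), ""
    return body.strip(), spec.strip()


def render_value(expression, namespace):
    """Evaluate the body (eval() stands in for a kernel) and apply the format spec."""
    body, spec = split_format_spec(expression)
    return format(eval(body, {}, namespace), spec)


print(render_value("a:.3f", {"a": 3.14159}))
```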

OK, this has now been handled, and for now, errors are shown inline for anything that the kernel reports.

@agoose77 apparently my full post is awaiting moderation, but you maybe want to check out jupyter/nbclient/pull/160 :wink:


I don’t think this is a good idea for the following reasons:

  • If supported, filters should be agnostic of the kernel language…
  • But we have no clue about the type returned by the variable; if the filter is applied to the mimetype bundle, for example, we don’t have control over which types will be available
  • The user can directly do {{ "{:.3f}".format(my_float) }} or anything else allowed by the kernel language. So let’s keep things simple

(as will be said in my post awaiting moderation) I feel there should be a slightly different JSON schema for these “executable” markdown cells, rather than piggybacking on attachments, and it should definitely have an execution number. Something like:

  {
   "cell_type": "emarkdown",
   "execution_count": 2,
   "metadata": {},
   "expressions": [
      {
        "input": "a",
        "output_type": "display",
        "data": {
          "text/plain": "1"
        }
      },
      {
        "input": "b",
        "output_type": "error",
        "traceback": ["..."]
      }
   ],
   "source": [
    "{{ a }} {{ b }}\n"
   ]
  }
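
To show how a cell of that shape could be populated, here is a small Python sketch. It only mirrors the proposed layout above; `build_cell` is a hypothetical helper, `eval()` stands in for a kernel, and nothing here is part of nbformat.

```python
import re
import traceback


def build_cell(source, namespace, execution_count):
    """Build a dict shaped like the proposed cell; eval() stands in for a kernel."""
    expressions = []
    for found in re.findall(r"\{\{(.*?)\}\}", source):
        expr = found.strip()
        try:
            value = eval(expr, {}, namespace)
            entry = {
                "input": expr,
                "output_type": "display",
                "data": {"text/plain": repr(value)},
            }
        except Exception:
            entry = {
                "input": expr,
                "output_type": "error",
                "traceback": traceback.format_exc().splitlines(),
            }
        expressions.append(entry)  # a list, so evaluation order is preserved
    return {
        "cell_type": "emarkdown",
        "execution_count": execution_count,
        "metadata": {},
        "expressions": expressions,
        "source": [source],
    }


cell = build_cell("{{ a }} {{ b }}\n", {"a": 1}, execution_count=2)
print([entry["output_type"] for entry in cell["expressions"]])
```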

I’ve had some thoughts about this! I feel that this entire discussion is the tip of a much larger question about the Notebook (which I’ve tried unsuccessfully to reason about here). Hopefully, you’ll forgive me for talking to that as we go along? I don’t think we can solve this problem without at least discussing the other problems (as I see them). Also, apologies if I’ve touched on this already; I think this is a slow descent into madness.

TL;DR — (imo) there are some big problems that need to be solved before we can look at making this an official part of the Jupyter Notebook spec, so let’s not make any drastic changes like new cell types just yet.

Rendering expressions is a specific case of injecting content

The prototype for JupyterLab separates the kernel query (that generates the attachments) from the rendering stage (that retrieves the cell attachments). It does this by generating a mapping from a templated name `${ATTACHMENT_PREFIX}-${index}` to the DOM node that marks the location of the rendered attachment.

I could see a future extension exposing attachments in a JupyterLab UI, and allowing them to be directly embedded via emarkdown, e.g.

Let's embed a video here: {{#video}}

In an ideal world, the XMarkdown class that we currently have would only be responsible for transforming rendered Markdown (HTML) to inject the rendered attachments; the kernel processing would be managed by another extension that understands the kernel and populates the attachments. This is almost what we have now, except that currently we assume that the markup embeds an expression, and generate the attachment name in TS, rather than embedding the attachment name in the HTML. I might open a discussion on this idea, actually, as I think it’s important to get right.

So, because of this, I think cell.attachments is the right place to store the MIME bundles, rather than a new cell property. We may need a new cell type at the same time, though (see next point)!

We have introduced a new Markdown flavour

The Notebook markdown is “standardised” on marked.js, but even this standard is loose. Whatever we use to implement inline-expressions, the resulting Markdown will not behave properly when rendered in a default marked.js viewer; we’re creating our own flavour.

I like your use of emarkdown. Given the idea to generalise this to “embed attachments”, maybe EMarkdown is a good name (embedded-markdown rather than expression-markdown) for our flavour.

For this reason, I think a new cell type might help enforce this distinction. However, see my final point as to why we might also not do this.

But, elsewhere there is more than one Markdown flavour

If we only consider the context of emarkdown, then we can probably create a new cell type, and be done with it. However, notebooks containing ill-defined Markdown flavours are already a problem elsewhere:

  • jupyterlab-markup re-uses the Markdown cell with a different (custom) flavour (in JLab)
  • Jupyter Book re-uses the Markdown cell with MyST syntax (rendered + JLab)

This is the Big Issue™ that I’m worried about. Things are chaotic already; introducing a new flavour of Markdown is just going to make it worse if we can’t concretely define it for frontends.

I don’t think that these use cases can be solved by choosing one particular flavour; users should be able to add “extra” features through extensions, just like we do for other areas of JupyterLab (e.g. rendermime). Equally, at present we have “no” guarantees about what the frontend can render, or an indication about what the notebook needs. A user would have to have existing knowledge of various MyST markup to realise that they needed to install extensions.

In my opinion, any solution to this problem would need to accomplish the following:

  1. Establish/choose a system of identifying Markdown syntaxes
  2. Store this metadata in the notebook (at notebook or cell level)
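
Purely as an illustration of those two requirements, the Python sketch below stores a flavour identifier in notebook-level or cell-level metadata and lets the cell level override the notebook level. The key name `markdown_flavour` and both helpers are invented for this sketch; no such key is defined in the notebook format.

```python
FLAVOUR_KEY = "markdown_flavour"  # hypothetical metadata key, not part of nbformat


def declare_flavour(notebook, flavour, cell_index=None):
    """Record a flavour identifier at notebook level, or on one cell if an index is given."""
    target = notebook if cell_index is None else notebook["cells"][cell_index]
    target.setdefault("metadata", {})[FLAVOUR_KEY] = flavour


def flavour_of(notebook, cell_index):
    """Cell-level metadata wins; otherwise fall back to the notebook level."""
    cell_metadata = notebook["cells"][cell_index].get("metadata", {})
    default = notebook.get("metadata", {}).get(FLAVOUR_KEY, "unspecified")
    return cell_metadata.get(FLAVOUR_KEY, default)


notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Plain text"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["{{ a }}"]},
    ],
}
declare_flavour(notebook, "commonmark")
declare_flavour(notebook, "emarkdown", cell_index=1)
print(flavour_of(notebook, 0), flavour_of(notebook, 1))
```

A frontend could then compare the declared flavour against what it is able to render and warn the user, instead of silently showing unprocessed markup.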

The scope of this change may vary

I’d love it if the various stakeholders could get together to discuss this. @bollwyvl has spoken on this before, but I don’t know who else needs to be involved.

Conclusion

Maybe the “short-term” solution to this is to boot the problem down the road; there is nothing in the notebook schema that stops us from using different flavours — the original Markdown cell doesn’t have a hard-defined flavour, and unless we can introduce one here, we should avoid making a new cell type that isn’t any more well-defined than the original one!

Let’s just keep the existing Markdown cell + attachments, and wait to implement this in core (e.g. nbclient) until we’re ready to solve these problems.


I’ll comment more when I have some time lol.
But one thing I want to emphasise here though; IMO a text cell (whatever flavour of Markdown it is, or not even Markdown), is modally/conceptually different from an executable text cell.
I feel it absolutely has to have an execution number, because the output can only be understood in relation to the execution order (the same as for any other executable/code cell).
Plus, piggybacking on attachments seems a little “hacky” to me, to be introduced in core; user_expressions are just not the same as attachments.
For example, attachments are a dictionary, whereas user_expressions are, and should be, a list because their calls can have side effects, and so their order is important. So then you are introducing the hack of including the expression index in the key.
Then user_expressions are also not just mimebundles, they can also be error tracebacks. So here you are adding the hack that you give these special mimetype names to distinguish.
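
The "index in the key" workaround described above can be sketched in Python as follows. The prefix `expression` and both helpers are hypothetical (the prototype's actual prefix is not given in this thread). The sketch also shows one reason the encoding is fragile: the order has to be recovered by parsing the number back out of the key, because sorting the keys as strings puts `expression-10` before `expression-2`.

```python
PREFIX = "expression"  # hypothetical prefix, not taken from the prototype


def to_attachments(results):
    """Flatten an ordered list of results into a dict keyed by '<prefix>-<index>'."""
    return {f"{PREFIX}-{index}": bundle for index, bundle in enumerate(results)}


def from_attachments(attachments):
    """Recover the ordered list by sorting on the numeric index embedded in each key."""

    def index_of(key):
        return int(key.rsplit("-", 1)[1])

    keys = [key for key in attachments if key.startswith(PREFIX + "-")]
    return [attachments[key] for key in sorted(keys, key=index_of)]


results = [{"text/plain": str(n)} for n in range(11)]
# Simulate a store that does not preserve insertion order by sorting keys as strings.
scrambled = dict(sorted(to_attachments(results).items()))
print(list(scrambled)[:3])
print(from_attachments(scrambled) == results)
```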

So do we add extra things/concepts into the markdown cell type, or do we introduce a new cell type?

Basically, even if we still use the markdown cell, I feel we should then be introducing additional (optional) keys to it

A couple quick thoughts:

  • I agree that getting this addressed in “core jupyter” (whatever that means) is a long-term challenge with lots of moving parts and stakeholders. It probably needs experimentation and some PoCs, and ultimately a JEP. Agree that it would be useful if the various folks that care about this had a conversation about it (and a quick pointer to @bollwyvl 's thoughts above: Inline variable insertion in markdown - #11 by bollwyvl)
  • Also agree that a long-term solution here would be to extend the markdown cell spec. IMO this would be the action that makes it “officially part of core Jupyter”. But IMO if we make this the “first step” then it’ll be hard to make iterative progress and prototype.
  • I think the most important question to learn about is what the end-user experience should look like, so we should be iterating and getting feedback with that kind of learning as the goal (and thanks @agoose77 for enabling this!). IMO it’s OK to have sub-optimal implementations right now, as long as we know there’s probably a future step of having those debates as well.
  • Agree w/ @chrisjsewell’s point that we may not want to use moustaches directly because of its ubiquity and different behavior in so many other applications. And agree w/ @fcollonval that we wouldn’t want to use Jinja directly, I’m just saying that there should be some functionality for formatting, since the most common thing people will do is “output a string from a variable/expression”, and string formatting is useful :slight_smile:

and a quick pointer to @bollwyvl 's thoughts above: Inline variable insertion in markdown - #11 by bollwyvl

The danger of this: whatever clever thing is done, at whatever level, if the renderer isn’t portable/formally defined (e.g. an ANTLR/lark grammar)

To my knowledge, creating a grammar for Markdown is not theoretically possible; see e.g. “Why isn't there a formal grammar for Markdown?” (roopc.net). Although I stand to be corrected.

My understanding is, currently, the spec is all defined by guaranteeing that Markdown x → HTML y

So you have the CommonMark spec https://spec.commonmark.org/ and test set https://spec.commonmark.org/0.30/spec.json, the GFM has built on this with GitHub Flavored Markdown Spec to create an extended test set (see e.g. https://github.com/hukkin/mdformat-gfm/blob/82ff92312ecb246ace0ec4a08d4e6f59240ffea4/tests/data/gfm_spec.commit-85d895289c5ab67f988ca659493a64abb5fec7b4.json),
and then you can build on this, which is basically what MyST does with e.g. https://github.com/executablebooks/mdit-py-plugins/tree/master/tests/fixtures
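
The "Markdown x → HTML y" approach amounts to running a renderer over a list of input/output pairs. Here is a minimal Python sketch of such a harness, assuming test cases shaped like the entries of the CommonMark spec.json (with `markdown`, `html` and `example` fields). The two cases are written for this sketch rather than copied from the spec, and `toy_render` is a deliberately incomplete stand-in for a real parser.

```python
import html


def toy_render(markdown):
    """Toy renderer: wraps blank-line-separated paragraphs in <p>; nothing else."""
    paragraphs = [part.strip() for part in markdown.split("\n\n") if part.strip()]
    return "".join(f"<p>{html.escape(part, quote=False)}</p>\n" for part in paragraphs)


def failing_examples(render, cases):
    """Return the example numbers for which the rendered HTML differs from the expected HTML."""
    return [case["example"] for case in cases if render(case["markdown"]) != case["html"]]


# Illustrative cases in the assumed spec.json shape.
CASES = [
    {"markdown": "aaa\n\nbbb\n", "html": "<p>aaa</p>\n<p>bbb</p>\n", "example": 1},
    {"markdown": "*hi*\n", "html": "<p><em>hi</em></p>\n", "example": 2},
]

print(failing_examples(toy_render, CASES))
```

An extension such as an inline-expression syntax would then be specified by adding its own input/output pairs to the list, in the same way the GFM and MyST fixtures extend the base set.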

Is this a fundamentally different question?

Is it true that
Which mimerenderer extension is registered for each type MUST (?) be persisted along with the notebook in order to have a portable, reproducible notebook environment? Where in nbformat or REES is such local application state even persisted now?

I’ll just cc regarding the existing functionality from https://jupyterbook.org/content/content-blocks.html#content-substitutions :

Substitutions and variables in markdown MyST markdown in *this* markdown parser without special metadata to hint processing

(content:substitutions)=

Substitutions and variables in markdown

Substitutions allow you to define variables in the front-matter of your page, and then insert those variables into your content throughout.

To use a substitution, first add front-matter content to the top of a page like so:

---
substitutions:
  key1: "I'm a **substitution**"
  key2: |
    ```{note}
    {{ key1 }}
    ```
  fishy: |
    ```{image} img/fun-fish.png
    :alt: fishy
    :width: 200px
    ```
---

You can use these substitutions inline or as blocks, and you can even nest substitutions in other substitutions (but circular references are prohibited):

:::{tabbed} Markdown Input

Inline: {{ key1 }}

Block level:

{{ key2 }}

:::

:::{tabbed} Rendered Output
Inline: {{ key1 }}

Block level:

{{ key2 }}
:::

You can also insert substitutions inside of other markdown structures like tables:

:::{tabbed} Markdown Input

| col1     | col2      |
| -------- | --------- |
| {{key2}} | {{fishy}} |

:::

:::{tabbed} Rendered Output

| col1     | col2      |
| -------- | --------- |
| {{key2}} | {{fishy}} |

:::

:::{seealso}
For more information about Substitutions, see .
:::

Define substitutions for your whole book

You can also define book-level substitution variables with the following configuration:

parse:
  myst_substitutions:
    key: value

These substitutions will be available throughout your book. For example, the global substitution key my-global-substitution is defined in this book’s _config.yml file, and it produces: {{ sub3 }}.

Formatting substitutions

MyST substitutions use {{ jinja }} in order to substitute in key / values. This means that you can apply any standard Jinja formatting to your substitutions. For example, you can replace text in your substitutions like so:

:::{tabbed} Markdown Input

The original key1: {{ key1 }}

{{ key1 | replace("a substitution", "the best substitution")}}

:::

:::{tabbed} Rendered Output
The original key1: {{ key1 }}

{{ key1 | replace("a substitution", "the best substitution")}}
:::

Using substitutions in links

If you’d like to use substitutions to insert and modify links in your book, here are two options to explore:

  1. Define the entire markdown link as a variable. For example:

    :::{tabbed} Markdown Input

    substitutions:
      repo_url: [my repo url](https://github.com/executablebooks/jupyter-book)
    
    Here's my link: {{ repo_url }}
    

    :::

    :::{tabbed} Rendered Output
    Here’s my link: {{ repo_url }}
    :::

  2. Use Jinja features to insert the variable.
    Because substitutions use {{ jinja }}, you also have access to Python formatting operations in your substitution.
    For example:

    :::{tabbed} Markdown Input

    substitutions:
      repo_name: jupyter-book
    
    Here's my link: {{ '[my repo: `{repo}`](https://github.com/executablebooks/{repo})'.format(repo=repo_name) }}
    

    :::

    :::{tabbed} Rendered Output
    Here’s my link: {{ 'my repo: {repo}'.format(repo=repo_name) }}
    :::


… If you put MyST Markdown in a {notebook, {discourse thread, github issue/pr, github README,}} that’s not built with e.g. myst-nb just like jupyter-book, nothing will ‘execute’ the substitutions. https://myst-nb.readthedocs.io/en/latest/

Notebook-level metadata in the .ipynb/.myst.md itself. From https://myst-nb.readthedocs.io/en/latest/use/markdown.html#myst-notebooks-in-sphinx :

In order to signal MyST-NB that it should treat your markdown file as a notebook, add the following Jupytext configuration to your notebook-level metadata (by adding it to the YAML front-matter at the beginning of the file).

---
jupytext:
 text_representation:
   format_name: myst
kernelspec:
 display_name: Python 3
 name: python3
---

(edit) Do e.g. {nbdev, papermill, other template systems} indicate notebook-level metadata under a standard key to indicate how to handle the (1) markdown; and (2) mimebundle mimerenderers within?