Inline variable insertion in markdown

For the RMarkdown option, won’t we run into trouble with notebooks that already accidentally have markup like `j = 0` or similar?

Rendering inline expressions is also what I’d intuitively expect.

My mental model of markdown cells is that they’re code cells whose output is the rendered markdown (which is why ctrl-enter/shift-enter are required to render them), with some magic that means the cell output hides the original markdown code cell. That means they can have all the good points (e.g. rendered output can be saved and reloaded without re-executing) and bad points (e.g. can be executed out of order) as any other code cell.

Damnit you’re right

That makes sense to me! I guess under the hood they are a cell like anything else, so no reason they couldn’t have information needed at the rendering step.

I think that’s probably true. Though I also don’t think it would be a dealbreaker if this were a future feature and not something that would have to be there out of the box. So IMO it’s important not to make decisions that make “insta-rendering” impossible, but not required that it is there from day 1.

I think the question we should all be asking is “what is the MVP?”. I worry that the more scope this feature has, the less likely anybody will implement it (and we’ve already seen github issues where this has been discussed extensively and no action resulted). I’m trying to think of an easy first step that would require minimal complexity and would let us iterate a bit

An easy-ish way to implement this would be to have magics like `%%markdown` send a specific metadata value in the output that says “the input of this cell is supposed to be hidden when you display the output”, and then simply reuse code cells.

That would require minimal changes to the frontend and the backend, and would leave any magic implementer free to go with whatever syntax they like.

Trying to bolt this on top of markdown cells is going to be hard, as md cells have been explicitly designed so that 1) they are context free (this is why we used md instead of RST) and 2) they have no side effects.

With this approach you could even, in some cases (in particular in a live notebook), use widgets to have values that are live, or simple replacement for values that are not.

You also would not need to “render” markdown to html on the kernel, you could for example just substitute “variable”, and send with the markdown mimetype.
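
A minimal sketch of that kernel-side substitution, assuming a simple `{{ name }}` placeholder syntax and a hypothetical `hide_input` metadata key (neither is part of any Jupyter specification, and the function name is made up for illustration):

```python
import re

# Assumed placeholder syntax: {{ name }}, where name is a plain identifier.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_]\w*)\s*\}\}")


def markdown_bundle(source, namespace):
    """Substitute variables into markdown source and build a display bundle.

    Returns (data, metadata) shaped like a Jupyter MIME bundle. The
    "hide_input" key is an assumption for illustration only.
    """
    def substitute(match):
        name = match.group(1)
        # Leave unknown names untouched so the author can spot the problem.
        if name not in namespace:
            return match.group(0)
        return str(namespace[name])

    data = {"text/markdown": PLACEHOLDER.sub(substitute, source)}
    metadata = {"hide_input": True}
    return data, metadata


data, metadata = markdown_bundle(
    "The mean is **{{ mean }}** over {{ n }} runs.", {"mean": 3.5, "n": 4}
)
print(data["text/markdown"])  # The mean is **3.5** over 4 runs.
```

In an IPython kernel, a cell magic could presumably hand such a bundle to `IPython.display.display(data, metadata=metadata, raw=True)`; no HTML rendering happens on the kernel, and the frontend would only need to learn to hide the input.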

@carreau wouldn’t the %%markdown approach mean that you have to implement the logic of that magic in every kernel that wanted to use it?

It’s not more logic than if you need to add specific handling of “markdown cell execution”.

Plus, I strongly expect that the logic will have to depend on the kernel language anyway, since the type of expression you can allow depends on your language. For example, $ is valid in R but not in Python; curly brackets will likely be a problem as delimiters in a C/C++ or Rust kernel.

As soon as you want “some” markdown/kernel integration, you will have to have custom code in your kernel anyway.

And the problem with trying to standardise it in Jupyter and say ‘this is the jupyter markdown specification’ is that we tried it a couple years ago and gave up because even the common-mark folks gave up.

Also the flip side of “all the kernels need to implement it” is “all the frontends need to implement it”, which is not easier.

IMHO the “use magics” approach is the least backward incompatible (it will work on old frontends, except the code will still show), and the fastest to implement – basically just a hide/show in the frontend, no changes in nbconvert and co – and it lets different approaches to including this in markdown compete.

Maybe I am not understanding, but I’m not sure why there would need to be much language-specific logic. In the same way that a code cell just grabs whatever is inside, sends it to the kernel, and then attempts to display the result below, I’d imagine that having “executable inline markdown content” would simply:

  1. Use a regex for {{. XXX }} (or whatever pattern)
  2. Send the stuff inside to the kernel to execute
  3. Get the result and store it in the markdown cell’s metadata
  4. Let the renderer decide what it wants to do to render
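
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: the `{{ ... }}` pattern is only one candidate syntax, `eval()` against a dict stands in for the round trip to the kernel, and the `inline_results` metadata key is hypothetical.

```python
import re

# Step 1: assumed delimiter pattern; the thread leaves the exact syntax open.
INLINE = re.compile(r"\{\{(.+?)\}\}")


def evaluate_inline(source, namespace):
    """Steps 1-3: find expressions, evaluate them, collect results as metadata."""
    results = {}
    for match in INLINE.finditer(source):
        expr = match.group(1).strip()
        # Step 2: eval() stands in for sending the expression to the kernel.
        results[expr] = repr(eval(expr, {}, namespace))
    # Step 3: hypothetical metadata key holding the evaluated results.
    return {"inline_results": results}


def render(source, metadata):
    """Step 4: one possible renderer, substituting stored results into the text."""
    results = metadata["inline_results"]

    def substitute(match):
        # Fall back to the raw text if no result was stored for this expression.
        return results.get(match.group(1).strip(), match.group(0))

    return INLINE.sub(substitute, source)


cell = "We have {{ len(samples) }} samples; the largest is {{ max(samples) }}."
meta = evaluate_inline(cell, {"samples": [3, 9, 4]})
print(render(cell, meta))  # We have 3 samples; the largest is 9.
```

Because the results live in the cell metadata, `render` needs no kernel at all, so a saved notebook could be re-rendered without re-executing anything.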

Is there a reason that you’d need to have kernel-specific information to handle that, in a different way from a code cell?

In my opinion this isn’t necessary immediately. Even if this were just an nbconvert extension I think it could be useful as a starting point to prototype infrastructure. Renderers could begin by simply displaying those characters as “regular” markdown (or could display just the characters in some special way without executing the code), and over time decide how they wanted the execution logic + rendering to work. This is basically how knitr and RMarkdown work - I think that proves that you don’t need interactive rendering in order for this to be useful for people.

In my opinion that is too much for the user to remember. The beauty of RMarkdown is the simplicity of simply typing `r someexpression()`. I think that it’d feel too clunky to have to use a special magic + from IPython.display import Markdown + the call to Markdown(. Following the reasoning of @matthew.brett and @stefanv I think the raw text (non-rendered) version of the notebook should look minimally different from the fancy rendered version.

(and anecdotally, I have noticed that many people, particularly teachers, discourage the use of %% cell magics in general, because then the notebook’s code is not executable outside of the context of a Jupyter server…so I think that’s a downside that is important to consider)

I would really like it to “just work”. Typing %%markdown on top of a cell is cumbersome, and can lead to loss of syntax highlighting (unless you use jupyterlab-lsp). Also, many kernels do not have such a magics mechanism, and I would like to be able to use this in kernels whose maintainers consistently reject proposals to introduce cell magics.

Your post, however, led me to an idea: what if it was a new cell type, say “template markdown”? Pros:

  • easy for users to choose it from the dropdown in existing interfaces (there could be a setting for which markdown type to use by default)
  • does not break existing notebooks

Shouldn’t it be a global switch then, stored in the notebook metadata? Switching on a per-cell basis is also cumbersome.

That could be solved by the setting allowing users to use that new type by default, right?

You mean exactly like I tried to do about 8 years ago ?

It’s a bit more complicated, as now your markdown may have to deal with side effects like exceptions, print statements and so on. Technically you want to use the user_expressions field of the Jupyter protocol (not the typical code execution flow) as well, and either make sure you suppress all those (which I would discourage, as otherwise you’ll get plenty of hard-to-debug error messages), special-case it, or heavily modify the frontend so the markdown cells get many of the same features code cells have:

  • You’ll end up capturing all output mimetypes (hey what is in a cell output)
  • You’ll end up having out of sync values with what is in the kernel, and might want to start having a prompt number to debug.
  • You’ll get questions on how to display graphs and multiline contents, like df.head().
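
To make the user_expressions route concrete, here is a sketch of the message contents involved. The field names follow the Jupyter messaging documentation as I understand it; the helper names are made up, the sample reply is hand-written rather than captured from a real kernel, and nothing here actually talks to one.

```python
def build_request(expressions):
    """Content of an execute_request that evaluates expressions without running a cell."""
    return {
        "code": "",                # no cell code, only expressions
        "silent": True,            # do not broadcast output
        "store_history": False,    # do not bump the execution counter
        "user_expressions": dict(expressions),
        "allow_stdin": False,
        "stop_on_error": False,
    }


def collect_results(reply_content):
    """Map each expression label to its text/plain value or a short error note."""
    out = {}
    for label, result in reply_content.get("user_expressions", {}).items():
        if result.get("status") == "ok":
            out[label] = result["data"].get("text/plain", "")
        else:
            # Errors come back per expression instead of as a raised exception.
            out[label] = f"[{result.get('ename', 'Error')}: {result.get('evalue', '')}]"
    return out


request = build_request({"total": "sum(values)", "ratio": "1 / len(empty)"})

# Hand-written reply content, shaped like an execute_reply; not real kernel output.
reply = {
    "status": "ok",
    "user_expressions": {
        "total": {"status": "ok", "data": {"text/plain": "42"}, "metadata": {}},
        "ratio": {
            "status": "error",
            "ename": "ZeroDivisionError",
            "evalue": "division by zero",
            "traceback": [],
        },
    },
}
print(collect_results(reply))
```

Even in this small form, the frontend has to decide what to show for the failed expression and which mimetype to pick, which is exactly the kind of code-cell behaviour the list above warns about.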

(and let’s not dive into ‘now markdown cells are security concern’ side of things as well, and impact the “trust” status of a notebook, but it’s quite an important one.)

It may look simple to “just” capture what is in between {{}} but it is not, and trust me you will get into issues with escaping and the type of expression that can be in between will vary between languages.

  • {'a':{'b':1}} in Python has a double closing bracket; do you want people to write "this is a dictionary: {{ {'a':{'b':1\}\}}}"?
  • does `{{expression}}` interpolate or not ?
    • what if a language uses ` like bash or R in its expression ?
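
A quick way to see the first problem, using two naive regular expressions (both patterns are illustrative guesses at what a frontend might try, not anything an existing frontend is known to use):

```python
import re

LAZY = re.compile(r"\{\{(.*?)\}\}")   # stops at the first "}}"
GREEDY = re.compile(r"\{\{(.*)\}\}")  # runs to the last "}}" on the line

nested = "this is a dictionary: {{ {'a':{'b':1}} }}"
lazy_capture = LAZY.search(nested).group(1).strip()
print(lazy_capture)  # {'a':{'b':1

try:
    eval(lazy_capture)
except SyntaxError as err:
    # The lazy pattern cut the dictionary literal in half.
    print("not a valid expression:", type(err).__name__)

two = "{{ a }} and {{ b }}"
greedy_capture = GREEDY.search(two).group(1).strip()
print(greedy_capture)  # a }} and {{ b
```

The lazy pattern truncates the nested dictionary, while the greedy one handles it but merges two separate expressions on the same line. Getting both cases right needs either an escaping rule or a parser that understands the kernel language's brackets.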

You can try to do it and end up with a slightly different markdown spec, and different supported variables and expressions in your curly braces, in every frontend. IMHO it would be much easier to delegate the markdown parsing and formatting to the kernel, so your kernels would need to have a markdown endpoint.

So yeah, I think sending only the variables and expressions is a possible direction, and an interesting one, but it happens to be way more complicated than it looks; my historical perspective is that it was tried but was too much complication to be accepted.

I also don’t believe the challenges to this proposal are that technical; they will be about coming to a consensus and actually pulling off the implementation in all the required repositories. And my gut tells me that the fewer places you have to change, the better its chances of success.

Thanks for that historical perspective. It is helpful to see that this has been thought through before. Reading through that PR, I’m reassured that there is actually a decent amount of agreement on the patterns that are there. Interesting that this brainstorm also converged on a similar syntax ({{. some_expression }}). I agree with your assessment that the biggest challenge is just agreeing on something and going with it (this is the same problem that the CommonMark community has).

I think the important two questions to ask are

  • should this feature exist?
  • what should it look like from a user’s perspective?

To me, the answer to “should it exist?” is obviously yes. This is a massively popular part of the RMarkdown ecosystem, it has a lot of people requesting it in the ipython issue, and I think there’d be a huge benefit to Jupyter users to allow for similar behavior.

For the second answer, I think that here we have a relatively clear target to shoot for: It should be as easy or easier than using RMarkdown. This is the comparison that people will make when they think about a feature like this, and this pattern clearly already works well enough.

So in my opinion, anything that adds much extra user-facing complexity on top of

The value of myvariable plus 2 is `r myvariable + 2`!

is too much complexity. This is why I don’t think we can use a magic command, or require people to change cell types, etc. All of that will feel cumbersome to users, and the benefit of quickly writing something will be lost.

I think that this second question is still where we are right now. What should the syntax be? What is the expected behavior in an interactive setting, and in a non-interactive setting?

Is it realistic that we can answer some of these questions without knowing all of the details of what an implementation would look like? Obviously implementation is important, and it is clear there could be a lot of complexity there. But the important thing is that people agree on the “what” and the “why”, so that we know that this is worth doing in the first place, and so that we have some constraints about what it must look like from a user’s perspective.

And secondarily - I still wonder if there’s some way that we can prototype something in the form of an extension, rather than require it to be part of core from day 1. Perhaps this would make it easier to experiment and iterate. Maybe @krassowski’s idea of a different “type” of markdown cell could be a way to experiment?

I’d like to highlight the IPython issue 2958 that Chris mentions (sorry, I would link, but new users on this forum can only include 2 links in a post), and that follows up on Matthias’s original implementation. I see developers giving the idea the thumbs up and users enthusiastically supporting it. Even Matthias’s original PR seemed like it could potentially have made it through with some more work.

We should not over-complicate things. Users may not care about the 1% of use cases that we cannot support (very complex expressions that need special escaping, e.g.), or expressions that raise exceptions (the result can simply be rendered as some red text). The feature is so useful that it is worth finding a way to do it, even if it isn’t universally perfect.
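
For instance, the “red text” fallback could be as small as the sketch below. `render_inline` is a made-up helper, and `eval()` again stands in for whatever evaluation route the kernel would actually provide.

```python
import html


def render_inline(expr, namespace):
    """Evaluate one inline expression; on failure return a red error marker."""
    try:
        return html.escape(str(eval(expr, {}, namespace)))
    except Exception as err:
        # Show the failure in place instead of breaking the whole cell.
        message = html.escape(f"{type(err).__name__}: {err}")
        return f'<span style="color: red">{message}</span>'


print(render_inline("total / count", {"total": 10, "count": 4}))
print(render_inline("total / count", {"total": 10, "count": 0}))
```

The first call prints `2.5`; the second prints a red `ZeroDivisionError: division by zero` span, so a broken expression stays visible to the author without stopping the rest of the text from rendering.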

I don’t know why the concept of static Markdown cells has become important along the way. Is there a reason we care about our text cells having predictable output? If this is a rendering speed issue, we can fill in the display async as @minrk suggested.

With the “send-string-to-kernel” approach, there is no need for any kernel to be modified; we can re-use the existing cell execution pathways. I think Brian also supports this route.

[Edit] Matthias added the link.

I think a lot of the reasons are historical; keep in mind that 8 years ago, the main way of sharing notebooks was still emailing back and forth. So the balance of why or why not might have changed.

I think we can get ~90% of the way there with an extension; there will be some UI/UX in the live notebook to think a lot about (how to re-execute a markdown cell without showing its source, for example).

I’m happy to pitch ideas/review PRs or prototypes, but not sure I’ll have any bandwidth to do any actual implementation.

If this is something critical enough we could make that part of a grant.

Do you know of anyone who would be willing to do this work? Perhaps this can be funded with one or two rounds of a NumFOCUS SDG, for example.

I am definitely interested in making this happen, though it’s still a bit unclear to me what it would take to make it happen. I think there are a few moving pieces here:

  1. Deciding on a path forward
  2. Understanding who is needed to agree to that path forward
  3. Getting buy-in from that group of people
  4. Doing the work

I think that we can realistically get funding for maybe 1 and 4, though it is unclear to me who 2 is, which makes it difficult to know how to do 3.

If we can scope this as an extension of some kind then maybe 2/3 is just a matter of “whoever is interested in working on the extension” and then this can be much simpler. The potential downside there is that there’s no guarantee that it would be broadly adopted, but I think that’ll be more likely if there is a working prototype that people like. That’s how I’ve seen many other things ultimately make their way into core.

Either way I am happy to write proposals and provide collaboration at the design level. I don’t have time to do the actual development, though potentially we could try and find somebody in the Executable Books community, or in 2i2c to work on it

Not sure it needs more +1s, but to add to the “should it exist” my most enthusiastic YES. We’ve talked about this for ages and it’s something we discussed well before even RMarkdown existed. For various reasons in our ecosystem we never fully implemented it (sometimes I wonder if the existence of reST with some of its worst design and implementation decisions has hurt us more than helped in the long run, but I digress). But I think it’s unquestionably a needed idea, and now between MyST (that gives us some design freedom away from pure reST) and JupyterLab (that gives us better extensibility to explore UI ideas), we should go ahead.

I’ll dig deeper into the rest of the thread later, but I wanted to chime in about how happy it made me to stumble on this tonight :slight_smile:

Hey,

I wanted to bring a different perspective on this.

Looking at document-oriented software like good old Office or Confluence, they have the concept of embedded objects (OLE technology for Microsoft, macro components in Confluence). One of the reasons for those external components is to provide a tool better suited for writing a piece of documentation. Notebooks could allow such behavior by supporting more types of cells (called viewers below), like a plotly chart editor cell, a drawio cell, … . The source of the cell would be the required model (like a JSON for plotly). But to make those types of cells really powerful (unlike in the aforementioned software), it should be possible for some viewers to link variable views from the kernel into their model. So the user can process data efficiently through code cells, while more visual tasks like annotating a graph are more easily done with the plotly chart editor, for example.
And so the markdown cell would be a variant of those non-code cells.

The requested variable view type would be tied to the viewer’s needs through a mimetype (e.g. plotly would ask for an application/json view). And we could imagine that the markdown viewer is constrained to text/plain; other views would require either a code cell or another viewer. Injecting a graph or a table view into markdown is more easily solved by adding an intermediate code cell in the short run anyway.

Wanted to leave a quick note here saying I am also very enthusiastic about this concept/direction! I am working on curvenote.com which is a downstream user of jupyter outputs, myst - and helps with creating and collaborating on scientific documents and weaving in Jupyter results. Our editor embeds variables/outputs (some examples here), although up to now we have been mostly focused on JS not Jupyter. We have been working with @choldgraf and others on some of the MyST in javascript implementation, which may be able to help on some of the frontend user interfaces down the road (certainly through an extension at least).

I think there are some amazing opportunities for educational content and scientific writing for embedded interactivity through some combination of thebe/jupyterlite/myst and this new syntax. And to some of @choldgraf’s points, even very small steps towards this style of rendering (e.g. static, post processing in JupyterBook) can have large impact (and be done mostly through extensions to start with?).

Keeping an eye on this, and hopefully may be able to contribute time to some aspects through working with @choldgraf/Executable Books!

I was wondering how hard it would be to implement this feature alongside storing notebooks in plain markdown/MyST and having them converted to JSON in the background with Jupytext (perhaps in .jupyter_cache/). This way, quick modifications could be done on the fly with any editor, without needing to turn on a server each time. (I know this seems pretty unrelated, or might ignore that the community has already discussed it and does not like it.)