Possible to send markdown text to a markdown cell in a new notebook via Papermill? Other options?

#1

I am using papermill to parameterize making a bunch of notebooks. These notebooks are meant to be stubs, so not everything in them is 100% complete. One of the inputs I have for Papermill is raw markdown-formatted text that I got from a conversion via pandoc. Ultimately I want this raw markdown text to show up as a typical markdown cell in the new notebook. I could get it to display the rendered markdown as output using this, but then it cannot be edited further. And of course, I could just use print() to get the raw markdown to show up as text that I can then copy and paste into a new markdown cell in the generated notebook and render as markdown. That would work, but it sounds tedious for a lot of notebooks, and it feels like I am missing a step where I could go directly from markdown text stored as a Python variable to the text as a markdown cell in a new notebook.

Discussing it with Jürgen Hermann (@jhermann), we came up with some options. Options discussed:

My present solution is to use notedown to first make an unrun notebook that puts the markdown in markdown cells, and then run the notebooks via Papermill so the required output gets added. But I am still curious whether there is a more direct way to inject programmatically generated markdown text into a new notebook and have it end up in a markdown cell. Suggestions?

#2

It seems like this could be pretty straightforward to do just with a lightweight bit of Python code to manually insert markdown into a generated notebook. Have you given that a shot?

#3

That is definitely an option. I am making a lot of changes in multiple parts of the notebook. And because I knew I would need papermill to run it, I guess I was relying on that to avoid touching the JSON myself.

#4

You wouldn’t necessarily need to touch the raw JSON; you could use nbformat to grab the output notebooks and insert markdown as needed. E.g.:

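import nbformat as nbf
# Read the notebook without converting it to another nbformat version
ntbk = nbf.read("myntbk.ipynb", nbf.NO_CONVERT)
for cell in ntbk.cells:
    # Swap the placeholder text for the real markdown
    if cell.source == "<insert_placeholder>":
        cell.source = "# My markdown\n\nGoes here"
# Save the modified notebook
nbf.write(ntbk, "myntbk.ipynb")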
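
If the placeholder should end up as a real markdown cell no matter what cell type the template used, one option is to swap the whole cell rather than only its source. This is just a minimal sketch: the helper name `inject_markdown` and the in-memory demo notebook are made up for illustration, and it assumes the template marks the spot with a cell whose source is exactly `<insert_placeholder>`.

```python
import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

PLACEHOLDER = "<insert_placeholder>"  # assumed marker text in the template


def inject_markdown(ntbk, markdown_text, placeholder=PLACEHOLDER):
    """Replace every cell whose source equals the placeholder with a markdown cell."""
    for i, cell in enumerate(ntbk.cells):
        if cell.source.strip() == placeholder:
            ntbk.cells[i] = new_markdown_cell(markdown_text)
    return ntbk


# Build a tiny notebook in memory so the sketch runs without any files on disk.
ntbk = new_notebook(
    cells=[new_code_cell(PLACEHOLDER), new_code_cell("print('hello')")]
)
inject_markdown(ntbk, "# My markdown\n\nGoes here")
nbformat.validate(ntbk)  # raises if the result is not a valid notebook

# With real files it would look something like:
# ntbk = nbformat.read("myntbk.ipynb", as_version=4)
# inject_markdown(ntbk, "# My markdown\n\nGoes here")
# nbformat.write(ntbk, "myntbk_with_markdown.ipynb")
```

Replacing the cell object like this is meant for notebooks handled outside of a papermill run; the new cell starts with empty metadata.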
#5

That looks useful to know how to do. I think that and some of the other abilities of nbconvert were things I was missing that had me thinking this had to be more easily automated and also explained why Papermill didn’t have that ability. Thanks.

I ended up using Python to build up a string for each notebook I wanted to make from each starting markdown. I saved that as markdown, with the specific code blocks that should ultimately become code cells tagged specially with {.python .input}. The other code blocks among the markdown text will ultimately be rendered as literal markdown code blocks by notedown, which is what I needed to happen. With the markdown built up, all I needed for each markdown file was the following, to make a new notebook via notedown and execute it via nbconvert so the code output shows in the resulting notebook:

!notedown --match=strict input.md > {notebook_name}
!jupyter nbconvert --to notebook --execute {notebook_name} --output={notebook_name}
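
For anyone wanting to do something similar, the string-building step can be sketched roughly as below. The function name `build_notedown_markdown` and the sample content are hypothetical, not my actual script; the only piece taken from my workflow is the `{.python .input}` attribute, which marks the blocks that should become code cells under strict matching.

```python
FENCE = "`" * 3  # built indirectly so this example holds no literal code fence


def build_notedown_markdown(prose, code_cells):
    """Return markdown where each entry in code_cells becomes a tagged code block."""
    parts = [prose.strip()]
    for code in code_cells:
        # The attribute after the fence is what marks the block as a code cell.
        parts.append(f"{FENCE}{{.python .input}}\n{code.strip()}\n{FENCE}")
    return "\n\n".join(parts) + "\n"


md = build_notedown_markdown(
    "# Stub notebook\n\nText converted by pandoc goes here.",
    ["import pandas as pd", "print('to be filled in')"],
)
print(md)

# Saving it for the notedown step would then be something like:
# with open("input.md", "w") as handle:
#     handle.write(md)
```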

notedown made it easy to mix injecting code and markdown into the resulting notebook via markdown, which is where I was starting from.

#6

Another option, if you wanted to use papermill but add some injected code, is to inherit from the nbconvert engine, register it as “markdown-nbconvert”, and do the cell source injection as @choldgraf suggested before performing the rest of the execution as normal. There are instructions for how to register these extensions here. Then from papermill you could call papermill input.ipynb output.ipynb --engine markdown-nbconvert -p foo bar -p etc etc

#7

Thanks, MSeal. That would allow me to return to primarily using just papermill. One thing I am not understanding, though, is the order of injecting the markdown. Wouldn’t I want to do it after the new notebook has executed so I don’t mess up my template? I see under here, it specifically says, “could apply post-processing to the executed notebook.”

#8

You could do it either before or after. The execute_managed_notebook function wraps the notebook execution, so manipulating nb_man.nb (this is the underlying nbformat object) can happen as a pre- or post-processing step.

Also note that with papermill you typically want to keep the input and output paths separate. That way your original template/input isn’t edited no matter what papermill does. Injecting the markdown before it executes only affects the in-memory representation. It won’t edit the source file in any way unless you point the output back to that source.
