Modify the ASCII cell output to an HTML table

Nithanaroy · September 21, 2020, 5:46am

I’m using a Spark kernel inside the notebook. Unlike in Python, the resulting output table is not prettified. When I run myDataFrame.show(10), it prints an ASCII table, which is very hard to read because the result can be nested. I was thinking a tabular output would be a good first step, or optionally a JSON array to handle the nesting. How can I intercept this output and reformat it as an interactive HTML table?

fomightez · September 21, 2020, 3:48pm

Have you tried something like this:

myDataFrame.limit(10).toPandas()

See here and here. According to here, you’d need Pandas installed and available. However, I’m unsure which kernel they are referring to, since the first SO post just says Jupyter. Or are pyspark and the spark kernel different?

Nithanaroy · September 21, 2020, 4:36pm

Thanks for the idea. I forgot to add that I’m working with Scala Spark and not pyspark, and Scala Spark doesn’t have a toPandas() method.
In general, how can I format the output of a cell before it is painted on screen?

fomightez · September 21, 2020, 5:55pm

Can you try putting the code credited to Aivean, shown here, in a cell, running that cell, and then trying:

myDataFrame.showHTML(10, 300)

The problem with that idea is that I cannot tell whether it is specific to the almond kernel.

Nithanaroy · September 21, 2020, 6:49pm

Neat idea! But as you guessed, the sparkmagic kernel I’m using does not provide (afaik) an output handler like the publish used by Aivean. I raised a feature request for this on the sparkmagic GitHub at https://github.com/jupyter-incubator/sparkmagic/issues/670#issue-705831198

But I was thinking of implementing a kernel-agnostic magic command (if one doesn’t already exist) that intercepts the output of myScalaDataFrame.show() and transforms it into an HTML table. Any thoughts on that?

fomightez · September 21, 2020, 7:09pm

Sorry, I haven’t made any magic commands.

If you want something less fancy that works now… does write work in your kernel? If so, you could send the output of .show() to a file with the following, like in step 3 here:

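import contextlib

# show() prints the ASCII table and returns None, so redirect stdout into the file
with open("filename.html", "a") as text_file, contextlib.redirect_stdout(text_file):
    myDataFrame.show(10)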

Or try:

 %store myDataFrame.show(10) >filename.html

Then you could perhaps convert the ASCII to HTML using this or pandoc, and then display that HTML in your notebook, if the kernel allows that.
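
For the ASCII-to-HTML step, here is a minimal Python sketch of one way to do the conversion by hand. It assumes the text follows the usual grid layout that show() prints (border lines starting with +, rows delimited by |) and that no cell value itself contains a | character. The function name show_output_to_html is made up for illustration and is not part of Spark, sparkmagic, or Jupyter.

```python
import html


def show_output_to_html(ascii_table: str) -> str:
    """Convert the ASCII grid printed by DataFrame.show() into an HTML table.

    Hypothetical helper: assumes rows look like "| a| b|" and that
    cell values do not contain the "|" character themselves.
    """
    rows = []
    for line in ascii_table.splitlines():
        line = line.strip()
        # Skip "+---+" border lines and trailing notes such as
        # "only showing top 10 rows"; keep only "|...|" data lines.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the outer pipes, then split the remaining text into cells.
        rows.append([cell.strip() for cell in line[1:-1].split("|")])

    if not rows:
        return "<table></table>"

    # The first data line of show() output is the header row.
    header, body = rows[0], rows[1:]
    parts = ["<table>"]
    parts.append(
        "<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>"
    )
    for row in body:
        parts.append(
            "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>"
        )
    parts.append("</table>")
    return "\n".join(parts)
```

In an IPython-based kernel, the returned string could then be rendered with display(HTML(...)) from IPython.display. Whether other kernels expose an equivalent display hook is a separate question, and nested columns would still appear as plain strings inside their cells.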

Nithanaroy · October 6, 2020, 3:21am

This idea is cool too. Thanks for sharing these. However, I think they expect myDataFrame to be a local Python data structure. In my case, all data frames live in a Spark context managed by Livy, which the local kernel cannot access. So I think hooking into the output handler of the Jupyter notebook ecosystem would be the only way. I’m assuming this is possible using magic commands.
