File size limitation in jupyter notebook

MerryGoLucky · January 18, 2023, 10:47am

Hi All,

What is the maximum file size that Jupyter Notebook can import and convert into a CSV file? I have two ORC files to import into Jupyter Notebook, sample.orc (2 GB) and sample2.orc (63 GB), but I cannot even load sample.orc to read it and convert it into a CSV file.

I would appreciate your help and suggestions.

bollwyvl · January 18, 2023, 12:47pm

Does it work from the terminal version of your process?

Is there a shell command you can use to offload this to another process? Something along the lines of:

!orc2csv in.orc out.csv

With file sizes that large, keeping the whole thing in RAM (twice) can have fairly negative impacts when working interactively, whether in a notebook, or just in a kernel.
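One way to avoid holding everything in memory is to write the CSV in small batches instead of reading all rows first. Below is a minimal sketch, assuming the rows come from any Python iterable. The function name `stream_rows_to_csv` and the `batch_size` parameter are made up for illustration and are not part of any library.

```python
import csv
from itertools import islice


def stream_rows_to_csv(rows, header, out_path, batch_size=10_000):
    """Write rows from any iterable to a CSV file, batch_size rows at a time.

    Returns the number of data rows written (the header is not counted).
    """
    iterator = iter(rows)
    written = 0
    # newline="" lets the csv module control line endings itself
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        while True:
            # Pull at most batch_size rows; only this batch is held in memory
            batch = list(islice(iterator, batch_size))
            if not batch:
                break
            writer.writerows(batch)
            written += len(batch)
    return written
```

If the ORC reader you use can be iterated row by row (I believe a `pyorc.Reader` can, but treat that as an assumption to verify), it could be passed directly as `rows`, so that only one batch is in RAM at any time.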

MerryGoLucky · January 19, 2023, 7:55am

Hi bollwyvl,

Thanks for the reply. I am using Jupyter Notebook 6.4.12:

Python 3.9.13
Selected Jupyter core packages…
IPython : 7.31.1
ipykernel : 6.15.2
ipywidgets : 7.6.5
jupyter_client : 7.3.4
jupyter_core : 4.11.1
jupyter_server : 1.18.1
jupyterlab : 3.4.4
nbclient : 0.5.13
nbconvert : 6.4.4
nbformat : 5.5.0
notebook : 6.4.12
qtconsole : 5.2.2
traitlets : 5.1.1

How should I apply the code that you suggested? The code below tries to read an 18.5 MB ORC file in Jupyter Notebook, but I get a parse error:

import csv
import pyorc

# Read 19297__currentWLAN.orc (18.5 MB) in binary mode
with open("./19297__currentWLAN.orc", "rb") as example:
    reader = pyorc.Reader(example)
    rows = reader.read()  # loads every row into memory at once
    # newline="" prevents blank lines between rows on Windows
    with open("Sample123.csv", "w", newline="") as out:
        csv_out = csv.writer(out)
        csv_out.writerow(reader.schema.fields.keys())  # header row: column names
        csv_out.writerows(rows)

bollwyvl · January 19, 2023, 1:36pm

Sorry, I don’t really know much about that file format.

Can you put that code into a Python script and run it that way? I’m just not sure whether this is actually related to Jupyter, or even to IPython/ipykernel. Is there any chance the file is malformed? Do you have any other tools that can open this file?
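A quick way to rule out a mislabeled or truncated file is to look at its first few bytes. This is a rough sketch, assuming ORC files begin with the magic bytes `ORC`, as the format specification describes. `looks_like_orc` is a hypothetical helper, not part of any library, and passing this check does not prove the rest of the file is intact.

```python
def looks_like_orc(path):
    """Return True if the file starts with the 3-byte magic b"ORC"."""
    with open(path, "rb") as f:
        return f.read(3) == b"ORC"
```

Saved in a plain `.py` file and run from the terminal, a `False` result would point at the file itself rather than at the notebook or kernel.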
