I am new to Jupyter Notebook.
How do I convert a large ORC file to a CSV file using Python in a Jupyter notebook? Is there a limit on the size of ORC file that Jupyter Notebook can handle?
I am trying to convert a large ORC file (58 GB) to CSV in a Jupyter notebook, but no CSV file is generated when I run this Python code:
    import pyorc
    import csv

    example = open("./ORC-A.orc", "rb")
    reader = pyorc.Reader(example)
    rows = reader.read()
    with open('orc.csv', 'w') as out:
        csv_out = csv.writer(out)
        csv_out.writerow(reader.schema.fields.keys())
        csv_out.writerows(rows)
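One likely factor with a 58 GB input is that `reader.read()` builds the full list of rows in memory before anything is written. Below is a minimal sketch of a streaming alternative, assuming `pyorc.Reader` can be iterated row by row and that the top-level schema is a struct (so `reader.schema.fields` is a mapping, as in the code above). The helper names `write_rows_to_csv` and `orc_to_csv` and the `batch_size` parameter are hypothetical, introduced only for illustration.

```python
import csv
from itertools import islice


def write_rows_to_csv(header, rows, csv_path, batch_size=10_000):
    """Write a header and an iterable of rows to csv_path in batches.

    Only batch_size rows are held in memory at a time.
    Returns the number of data rows written.
    """
    rows = iter(rows)
    count = 0
    # newline="" stops the csv module from emitting blank lines on Windows.
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        while True:
            batch = list(islice(rows, batch_size))
            if not batch:
                break
            writer.writerows(batch)
            count += len(batch)
    return count


def orc_to_csv(orc_path, csv_path, batch_size=10_000):
    """Stream an ORC file to CSV (hypothetical helper, not part of pyorc)."""
    import pyorc  # third-party; imported here so the CSV helper works without it

    with open(orc_path, "rb") as src:
        reader = pyorc.Reader(src)
        # Assumption: the top-level type is a struct, so fields is a mapping.
        header = list(reader.schema.fields.keys())
        # Assumption: iterating the reader yields one row tuple at a time.
        return write_rows_to_csv(header, reader, csv_path, batch_size)
```

Under these assumptions the call would be `orc_to_csv("./ORC-A.orc", "orc.csv")`. Memory use should then depend on the batch size rather than the file size, although the resulting CSV may be considerably larger on disk than the compressed ORC file.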
When I tried 19297__currentWLAN.orc, a much smaller ORC file (18.5 MB), I got a Parse Error and could not convert it to CSV either:
>     # Checking versions
>     !python -V
>     !jupyter --version
>     !jupyter notebook --version
>
>     import pyorc
>     import csv
>
>     example = open("./19297__currentWLAN.orc", "rb")  # file size is 18.5 MB
>     reader = pyorc.Reader(example)
>     rows = reader.read()
>     with open('19297.csv', 'w') as out:
>         csv_out = csv.writer(out)
>         csv_out.writerow(reader.schema.fields.keys())
>         csv_out.writerows(rows)
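One quick check for the Parse Error on the small file is whether it is actually a valid ORC file: the ORC format begins with the three magic bytes `ORC`. The sketch below only tests that header, so a passing result does not rule out other causes such as a truncated or corrupted file. The function name `looks_like_orc` is hypothetical.

```python
def looks_like_orc(path):
    """Return True if the file starts with the 3-byte ORC magic header."""
    with open(path, "rb") as f:
        return f.read(3) == b"ORC"
```

If `looks_like_orc("./19297__currentWLAN.orc")` returns `False`, the file is probably in another format (for example text or Parquet with an `.orc` extension), which would explain why it cannot be parsed.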