Hi everyone,
I’m Thibaut and I’m thrilled to introduce Amphi for JupyterLab, a new Micro ETL extension designed for ingesting, cleansing, and processing data.
Coming from a data engineering background, I really enjoy using notebooks for data exploration and analysis. However, I also really like using a graphical ETL tool (such as Talend or KNIME) for repetitive data ingestion and cleaning tasks. I developed Amphi to take care of the mundane data integration tasks that take a lot of time away from actually analyzing the data, providing interesting insights and all the fun stuff.
Discover Amphi
In short, use Amphi within the JupyterLab environment to design your data pipelines with a graphical user interface and generate native Python code you can deploy anywhere.
For instance, you can create pipelines that quickly convert a CSV file to JSON or Parquet. You can also apply transformations such as filtering, sorting, and deduplication, among others. Then you can either execute the pipeline directly or export the Python code and run it wherever you prefer.
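To give a rough idea of what such a pipeline does, here is a minimal hand-written sketch of a CSV-to-JSON flow with a filter, a deduplication step, and a sort. This is not the code Amphi generates: the function name `run_pipeline`, the sample columns (`id`, `customer`, `amount`), and the threshold are all made up for illustration, and it only uses the Python standard library so it runs without extra dependencies (which also means it skips Parquet).

```python
import csv
import io
import json


def run_pipeline(csv_text, min_amount):
    """Hypothetical pipeline: read CSV text, filter, deduplicate, sort, return JSON."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Filter: keep rows whose amount is at least min_amount.
    kept = [r for r in rows if float(r["amount"]) >= min_amount]

    # Deduplicate: keep the first occurrence of each identical row.
    seen = set()
    unique = []
    for r in kept:
        key = tuple(r.items())
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # Sort: order the remaining rows by amount, ascending.
    unique.sort(key=lambda r: float(r["amount"]))

    return json.dumps(unique, indent=2)


# Made-up sample data, including one duplicate row and one row below the threshold.
SAMPLE_CSV = """id,customer,amount
1,alice,30.5
2,bob,5.0
1,alice,30.5
3,carol,12.0
"""

if __name__ == "__main__":
    print(run_pipeline(SAMPLE_CSV, min_amount=10))
```

Note that `csv.DictReader` reads every value as a string, so the sketch converts `amount` to a float wherever it compares or sorts.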
If you are interested in Amphi, please don’t hesitate to star the repository.
Install
You can install the extension in your JupyterLab (>4) instance via pip or the extension manager.
pip install --upgrade jupyterlab-amphi
Documentation
The documentation is very much a work in progress, so any feedback is welcome.
Feedback
As Amphi is in beta, any feedback and suggestions are very much appreciated. Please submit any comments and issues on GitHub. I’ve also opened a Slack community if anyone is interested in discussing further.
Thank you
Special thanks to the community members who have helped and answered many of my questions on the forum and in Gitter over the last few months. In particular: krassowski, jtp, bollwyvl and mahendrapaipuri.