How to build and automate your Python ETL pipeline with Airflow | Data pipeline | Python

Published 2022-03-07
In this video, we will cover how to automate your Python ETL (Extract, Transform, Load) pipeline with Apache Airflow. In this session, we will use the TaskFlow API introduced in Airflow 2.0. The TaskFlow API makes it much easier to author clean ETL code without extra boilerplate by using the @task decorator. Airflow organizes your workflows as Directed Acyclic Graphs (DAGs) composed of tasks.


In this tutorial, we will see how to design an ETL pipeline with Python. We will use SQL Server’s AdventureWorks database as a source and load data into PostgreSQL with Python. We will focus on the Product hierarchy and enhance our initial data pipeline to give you a complete overview of the extract, transform, and load process.
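Stripped of Airflow, the extract → transform → load flow described above can be sketched with plain functions (the sample rows and column names are hypothetical stand-ins for the AdventureWorks product tables):

```python
def extract():
    # In the video this reads AdventureWorks tables from SQL Server
    # (e.g. via pandas.read_sql); here we return sample rows instead.
    return [
        {"ProductID": 1, "Name": "Mountain Bike", "ListPrice": 1200.0},
        {"ProductID": 2, "Name": "Helmet", "ListPrice": 35.0},
    ]


def transform(rows):
    # Keep only the columns the target table needs, renamed to the
    # target's naming convention, with prices rounded to two decimals.
    return [
        {"product_id": r["ProductID"], "name": r["Name"], "price": round(r["ListPrice"], 2)}
        for r in rows
    ]


def load(rows):
    # In the video this writes to PostgreSQL (e.g. DataFrame.to_sql);
    # here we return the row count as a stand-in for the insert.
    return len(rows)


loaded = load(transform(extract()))
```

The point of the TaskFlow API is that each of these functions becomes a `@task` with no structural changes, so the pipeline can be developed and unit-tested as ordinary Python first.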

Link to medium article on this topic: medium.com/@hnawaz100/how-to-automate-etl-pipeline…

Link to previous video:    • How to build an ETL pipeline with Pyt...  

Link to Pandas video:    • Python Pandas Data Science Tutorial (...  

Link to GitHub repo: github.com/hnawaz007/pythondataanalysis/blob/main/…

Link to Cron Expressions: docs.oracle.com/cd/E12058_01/doc/doc.1014/e12030/c…
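For reference, a few common cron expressions that can be used as an Airflow schedule (illustrative examples, not from the video). Cron uses the five-field "minute hour day-of-month month day-of-week" layout:

```python
# Illustrative cron strings for Airflow's schedule_interval parameter.
SCHEDULES = {
    "daily at midnight": "0 0 * * *",
    "hourly, on the hour": "0 * * * *",
    "weekdays at 6 AM": "0 6 * * 1-5",
    "first of each month": "0 0 1 * *",
}
```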

Subscribe to our channel:
youtube.com/c/HaqNawaz

---------------------------------------------
Follow me on social media!

GitHub: github.com/hnawaz007
Instagram: www.instagram.com/bi_insights_inc
LinkedIn: www.linkedin.com/in/haq-nawaz/

---------------------------------------------

#ETL #Python #Airflow

Topics covered in this video:
0:00 - Introduction to Airflow
2:49 - The Setup
3:40 - Script ETL pipeline: Extract
5:52 - Transform
7:39 - Load
8:00 - Define Directed Acyclic Graph (DAG)
9:36 - Airflow UI: DAG enable & run
10:09 - DAG Overview
10:29 - Test ETL Pipeline
