Uploaded by Aayush Kaushal

Data Engineering Roadmap: Skills, Tools, and Projects

DATA ENGINEERING ROADMAP
Prerequisites
DBMS and SQL
Hands On
Python
Hands On
Linux
Everything Data
Data warehouse
Course | Book
Also research related data concepts such as Data Lake, Data Mart, Data Fabric,
Data Mesh, and Data Catalog.
Distributed Systems
Spark with Python
Course | Hands On
Also research the basics of Hadoop, Hive, Pig, and MPP (massively parallel processing) systems.
Cloud
GCP or AWS or Azure
Consider certifications such as AWS Solutions Architect, AWS Big Data Specialty,
GCP Professional Data Engineer, or Azure Fundamentals.
Must learn tools
Orchestration: Airflow
Compute: Databricks and Snowflake
Streaming: Kafka
CI/CD: Jenkins and SonarQube
Containers: Docker
Databricks substitutes can be AWS EMR or GCP Dataproc
Snowflake substitutes can be AWS Redshift or GCP BigQuery
Projects
Batch processing
Real-time processing
Suggestion: create a free-tier account on the cloud platform you prefer (AWS, GCP,
or Azure) and implement end-to-end projects there.
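As a starting point before moving to the cloud, a batch project can be sketched locally with only Python and SQL. The example below is a minimal illustration, not a prescribed design: the sample data, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real project would read this from cloud storage.
RAW_CSV = """order_id,customer,amount
1,alice,20.50
2,bob,5.00
3,alice,4.50
4,carol,not_a_number
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Cast column types and drop rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append(
                (int(row["order_id"]), row["customer"], float(row["amount"]))
            )
        except ValueError:
            continue  # skip malformed rows such as a non-numeric amount
    return clean


def load(conn, rows):
    """Write rows into the (hypothetical) orders table; reruns overwrite by key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def totals_by_customer(conn):
    """Aggregate with SQL: total order amount per customer."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


def run_pipeline():
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    load(conn, transform(extract(RAW_CSV)))
    return totals_by_customer(conn)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping extract, transform, and load as separate functions makes it easier to later swap the SQLite target for a cloud warehouse or to schedule each step with an orchestrator.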
Additional Optional Topics
Building, Training and Deploying ML models
Data visualisation with Tableau or Power BI or Looker
ETL/ELT: Matillion or Talend
Containers: Kubernetes
Finally, become a subject-matter expert (SME) in one of the topics above: pick any
tool or topic and dive deeper.