Roadmap for Data Engineering
I thought I could never crack a Data Engineering interview, until...
I found this roadmap.
Here is how you can do it in 5 steps.
𝗦𝘁𝗲𝗽 𝟭: 𝗦𝗤𝗟
- Basic SQL Syntax
- DDL, DML, DCL
- Joins & Subqueries
- Views & Indexes
- CTEs & Window Functions
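To make the last bullet concrete, here is a small sketch of a CTE feeding a window function, run through Python's built-in sqlite3 module. The `orders` table, its columns, and the rows are made up for illustration, and it assumes your Python ships SQLite 3.25 or newer (the first version with window functions).

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "2024-01-01", 50.0),
        ("alice", "2024-01-03", 20.0),
        ("bob", "2024-01-02", 70.0),
        ("bob", "2024-01-05", 10.0),
    ],
)

query = """
WITH daily AS (              -- CTE: a named subquery you can reuse below
    SELECT customer, order_day, SUM(amount) AS total
    FROM orders
    GROUP BY customer, order_day
)
SELECT customer,
       order_day,
       total,
       SUM(total) OVER (     -- window function: running total per customer
           PARTITION BY customer ORDER BY order_day
       ) AS running_total
FROM daily
ORDER BY customer, order_day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike GROUP BY, the window function keeps every row and adds the running total next to it, which is why it shows up so often in interview questions.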
𝗦𝘁𝗲𝗽 𝟮: 𝗣𝘆𝘁𝗵𝗼𝗻
- Fundamentals
- Numpy
- Pandas
𝗦𝘁𝗲𝗽 𝟯: 𝗣𝘆𝗦𝗽𝗮𝗿𝗸
- RDD
- DataFrame
- Datasets (typed API in Scala/Java; PySpark works with DataFrames)
- Spark Streaming
- Optimization techniques
𝗦𝘁𝗲𝗽 𝟰: 𝗗𝗮𝘁𝗮 𝗪𝗮𝗿𝗲𝗵𝗼𝘂𝘀𝗶𝗻𝗴/𝗗𝗮𝘁𝗮 𝗠𝗼𝗱𝗲𝗹𝗶𝗻𝗴
- OLAP vs OLTP
- Star & Snowflake Schema
- Fact & Dimension Tables
- Slowly Changing Dimensions (SCD)
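SCD is the item people struggle with most, so here is a rough sketch of the Type 2 flavor (keep history by closing the old row and adding a new one) in plain Python. The `CustomerRow` class, the `apply_scd2` helper, and the sample cities are all hypothetical; a real warehouse would do this with a MERGE statement or an ETL tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    """One version of a customer in a hypothetical dimension table."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still current"

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_scd2(dim: List[CustomerRow], customer_id: int,
               new_city: str, change_date: date) -> None:
    """Close the current row if the city changed, then append a new version."""
    current = next(
        (r for r in dim if r.customer_id == customer_id and r.is_current),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return  # nothing changed, keep history as-is
        current.valid_to = change_date  # expire the old version
    dim.append(CustomerRow(customer_id, new_city, change_date))


dim: List[CustomerRow] = []
apply_scd2(dim, 1, "Pune", date(2023, 1, 1))
apply_scd2(dim, 1, "Mumbai", date(2024, 6, 1))
apply_scd2(dim, 1, "Mumbai", date(2024, 7, 1))  # no change, no new row

for row in dim:
    print(row)
```

A Type 1 dimension would simply overwrite `city` in place; Type 2 keeps both rows, so a fact table can still join to the version that was valid on the order date.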
𝗦𝘁𝗲𝗽 𝟱: 𝗖𝗹𝗼𝘂𝗱 𝗦𝗲𝗿𝘃𝗶𝗰𝗲𝘀
- NoSQL DB
- Relational DB
- Data Warehousing
- Scheduling & Orchestration
- Messaging
- ETL Services
- Storage Services
- Data Processing Services
𝗛𝗲𝗿𝗲 𝗮𝗿𝗲 𝘀𝗼𝗺𝗲 𝘃𝗮𝗹𝘂𝗮𝗯𝗹𝗲 𝗿𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀 𝘁𝗼 𝗵𝗲𝗹𝗽 𝘆𝗼𝘂 𝗴𝗲𝘁 𝘀𝘁𝗮𝗿𝘁𝗲𝗱:
- SQL - https://lnkd.in/gJXw4XtK
- Python - https://lnkd.in/dt_-2-Uj
- PySpark - https://lnkd.in/gtCdub-V
- Airflow - https://lnkd.in/guebuHJ7
- Kafka - https://lnkd.in/gVZUT52s
- Data Modeling - https://lnkd.in/gg6V98Ru
- Azure Cloud - https://lnkd.in/gwc3By9h
- Google Cloud - https://lnkd.in/gNwpp2RT
- AWS - https://lnkd.in/gJeUGfjS
- Projects - https://lnkd.in/gZf6J6fm