# Python, Bash, and SQL for Data Engineering
This repository contains my projects, lab work, and technical notes from the Duke University Specialization. I use it to track my learning and show my data engineering skills.
I organized this repository based on the official four-course curriculum:
- Course Topics: Setting up Python environments, using VS Code and Vim, and data manipulation with Pandas (read/write data structures and files).
- Skills I Gained: Python Programming, Pandas, Git, Data Manipulation, Software Development Tools, Data Structures, Development Environment, NumPy, Virtual Environment, Version Control, Data Analysis Software.
- Folder: `/01-Python-Pandas/`
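
To illustrate the read, manipulate, write pattern with Pandas, here is a minimal sketch. It is not taken from the labs: the sample data and the column names (`city`, `temp_c`, `temp_f`) are hypothetical, and it assumes `pandas` is installed in the active environment.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a lab CSV file.
csv_text = "city,temp_c\nDurham,21.5\nRaleigh,23.0\nCary,22.1\n"

# Read: parse CSV text into a DataFrame (a file path works the same way).
df = pd.read_csv(io.StringIO(csv_text))

# Manipulate: add a derived column, then keep only the warmer rows.
df["temp_f"] = df["temp_c"] * 9 / 5 + 32
warm = df[df["temp_c"] > 22].reset_index(drop=True)

# Write: serialize the result back to CSV text without the index column.
out = io.StringIO()
warm.to_csv(out, index=False)
result_csv = out.getvalue()
print(result_csv)
```

Using `io.StringIO` keeps the sketch self-contained. With real files, `pd.read_csv("input.csv")` and `warm.to_csv("output.csv", index=False)` follow the same shape.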
- Course Topics: Using Linux tools to build solutions, and developing Bash syntax to control Linux.
- Skills I Gained: Shell Scripting, Bash (Scripting Language), Linux Commands, File Management, Command-Line Interface, Scripting, Data Manipulation, Development Environment, Remote Access Systems, Unix, Data Processing, File Systems, Data Management, Unix Commands, Scripting Languages, Unix Shell, Linux Administration.
- Folder: `/02-Linux-Bash/`
- Course Topics: Connecting to and querying SQL databases with Python, extracting data from different sources, and using web scraping techniques.
- Skills I Gained: Data Structures, SQL, Data Import/Export, Python Programming, Web Scraping, Scripting, Data Persistence, JSON, MySQL, Data Manipulation, Database Management, Spatial Analysis, Databases, Hypertext Markup Language (HTML), Data Capture.
- Folder: `/03-Python-SQL/`
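
As a rough sketch of connecting to and querying a SQL database from Python, the example below uses the standard-library `sqlite3` module with an in-memory database. This is a stand-in I chose so the code runs without a server; the course work itself lists MySQL. The `courses` table, its columns, and the `folders_for` helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a real server such as MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses (id INTEGER PRIMARY KEY, folder TEXT, topic TEXT)"
)

# Hypothetical rows: the folder names come from this repository,
# the topic labels are made up for the example.
rows = [
    (1, "01-Python-Pandas", "python pandas"),
    (2, "02-Linux-Bash", "linux bash"),
    (3, "03-Python-SQL", "python sql"),
    (4, "04-Capstone-CLI-Web", "cli web"),
]
conn.executemany("INSERT INTO courses VALUES (?, ?, ?)", rows)
conn.commit()


def folders_for(topic_fragment):
    """Return folders whose topic contains the given fragment."""
    # The ? placeholder lets the driver escape the value safely.
    cur = conn.execute(
        "SELECT folder FROM courses WHERE topic LIKE ? ORDER BY id",
        (f"%{topic_fragment}%",),
    )
    return [row[0] for row in cur.fetchall()]


matches = folders_for("python")
print(matches)
```

Passing values through `?` placeholders instead of building the SQL string by hand avoids SQL injection. Other Python database drivers follow the same idea, though the placeholder style can differ.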
- Course Topics: Constructing Python microservices with FastAPI, building Command-Line Tools (CLI) with Click, and using Jupyter notebooks.
- Skills I Gained: AWS SageMaker, Jupyter, Command-Line Interface, Microservices, Package and Software Management, Containerization, CI/CD, Test Automation, Data Pipelines, Algorithms, Python Programming, Cloud Engineering, Applied Machine Learning.
- Folder: `/04-Capstone-CLI-Web/`
I follow these technical rules to work like a professional:
- Organized Environments: I use a new `.venv` (virtual environment) for every project. This keeps my tools and libraries organized.
- Version Control (Git): I save all my progress using Git commands and share my code on GitHub.
- Native Linux Experience: I work using WSL2 (Ubuntu) to learn in a real-world data engineering environment.
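
The per-project `.venv` rule can be sketched with Python's standard-library `venv` module. This is only an illustration under my own assumptions: the `create_project_env` helper is hypothetical, and the temporary directory stands in for a real project folder.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir):
    """Create a .venv folder inside project_dir and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps this sketch fast and offline;
    # a real project would normally pass with_pip=True.
    venv.create(env_dir, with_pip=False)
    return env_dir


# A temporary directory stands in for a real project folder.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(tmp)
    # Every virtual environment gets a pyvenv.cfg file at its root.
    has_config = (env_dir / "pyvenv.cfg").exists()
    print(has_config)
```

From a shell, running `python3 -m venv .venv` in the project folder creates an equivalent environment with pip installed.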