
I have been working as a data analyst and data engineer for the past two years, and I have decided to pursue new challenges to further develop my career.
As part of this goal, I am building a complete end-to-end data engineering pipeline from scratch, running entirely on local infrastructure in my home lab!
The goal is to build, brick by brick, a complete project that covers the entire data pipeline, from data sources to final consumers, using the main tools of the standard data engineering stack.
I’m a physicist, so naturally I enjoy working with simulations. One of my ideas is to simulate an army of CPU temperature sensors, let's say in a data center, where the temperature values respond to a simulated demand on the data center infrastructure.
In the next post, I will attempt to simulate the sensors in Python, emulating a bunch of microcontrollers such as ESP32s streaming their sensor values over the network. This will be the foundation of the pipeline, and the results will influence the next steps.
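
To make the idea a bit more concrete, here is a rough sketch of what such a simulation could look like. It is not the final design, just one possible starting point: each sensor follows a simple first-order thermal model, drifting toward a target temperature set by the current load, with a little Gaussian noise on top. Every name and number in it (`CpuSensor`, `demand`, `simulate`, the idle temperature, the time constant, the sinusoidal demand curve) is an assumption I made up for illustration, and the networking part is left out entirely: readings are simply printed as JSON lines.

```python
import json
import math
import random
from dataclasses import dataclass


@dataclass
class CpuSensor:
    """One simulated CPU temperature sensor (all parameters are illustrative)."""
    sensor_id: str
    idle_temp_c: float = 35.0       # assumed temperature at zero load
    full_load_rise_c: float = 45.0  # assumed extra degrees at 100% load
    time_constant_s: float = 30.0   # assumed heating/cooling time scale
    noise_std_c: float = 0.3        # assumed measurement noise
    temp_c: float = 35.0            # current "true" temperature

    def step(self, load: float, dt_s: float, rng: random.Random) -> float:
        """Advance the model by dt_s seconds and return a noisy reading."""
        target = self.idle_temp_c + self.full_load_rise_c * load
        # First-order relaxation toward the load-dependent target.
        self.temp_c += (dt_s / self.time_constant_s) * (target - self.temp_c)
        return self.temp_c + rng.gauss(0.0, self.noise_std_c)


def demand(t_s: float, period_s: float = 600.0) -> float:
    """Hypothetical data center demand in [0, 1]: a slow sinusoidal cycle."""
    return 0.5 + 0.5 * math.sin(2.0 * math.pi * t_s / period_s)


def simulate(n_sensors: int = 3, n_steps: int = 5, dt_s: float = 1.0, seed: int = 42):
    """Return a list of readings, one dict per sensor per time step."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    sensors = [CpuSensor(sensor_id=f"sensor-{i:03d}") for i in range(n_sensors)]
    readings = []
    for k in range(n_steps):
        t_s = k * dt_s
        load = demand(t_s)
        for sensor in sensors:
            readings.append({
                "sensor_id": sensor.sensor_id,
                "t_s": t_s,
                "load": round(load, 3),
                "temp_c": round(sensor.step(load, dt_s, rng), 2),
            })
    return readings


if __name__ == "__main__":
    for reading in simulate():
        print(json.dumps(reading))
```

Running the script prints one JSON object per sensor per time step, which is roughly the kind of message a real microcontroller might send. Swapping the `print` for an actual network transport is a decision for the next post.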

Wish me luck!
