Real-Time European Energy Analytics

A high-throughput data engineering pipeline monitoring electricity generation, pricing, and (soon) cross-border flows across 30+ European countries.

About the Pipeline

The European Energy Grid Monitor is an ETL data pipeline built on the ENTSO-E Transparency Platform API. Every hour it ingests raw XML from the platform, normalizes it via Python producers, buffers messages through Apache Kafka, and persists millions of records in a CockroachDB cluster.

System Architecture

🐍
Python Ingestion
Custom ETL scripts fetching XML data from the ENTSO-E API, normalizing schemas, and handling API rate limits.
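
As a rough illustration of the normalization step, the sketch below flattens a time-series XML document into one record per data point. The sample document is a simplified, namespace-free stand-in for an ENTSO-E response, and the function name `normalize`, the record fields, and the resolution table are assumptions made for this example, not the project's actual code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Simplified sample (assumption): real responses carry an XML namespace
# and many more metadata elements.
SAMPLE_XML = """<GL_MarketDocument>
  <TimeSeries>
    <Period>
      <timeInterval>
        <start>2024-01-01T00:00Z</start>
        <end>2024-01-01T02:00Z</end>
      </timeInterval>
      <resolution>PT60M</resolution>
      <Point><position>1</position><quantity>5000</quantity></Point>
      <Point><position>2</position><quantity>5200</quantity></Point>
    </Period>
  </TimeSeries>
</GL_MarketDocument>"""

# ISO 8601 durations handled by this sketch, mapped to minutes.
RESOLUTION_MINUTES = {"PT15M": 15, "PT30M": 30, "PT60M": 60}


def normalize(xml_text, country):
    """Flatten every Period/Point pair into a dict with an absolute UTC timestamp."""
    root = ET.fromstring(xml_text)
    records = []
    for period in root.iter("Period"):
        start = datetime.strptime(
            period.findtext("timeInterval/start"), "%Y-%m-%dT%H:%MZ"
        ).replace(tzinfo=timezone.utc)
        step = timedelta(minutes=RESOLUTION_MINUTES[period.findtext("resolution")])
        for point in period.findall("Point"):
            # Positions are 1-based offsets from the period start.
            position = int(point.findtext("position"))
            records.append({
                "country": country,
                "timestamp": (start + (position - 1) * step).isoformat(),
                "quantity_mw": float(point.findtext("quantity")),
            })
    return records


records = normalize(SAMPLE_XML, "DE")
```

Converting positions to absolute timestamps at ingestion time means downstream consumers never need to know the resolution of the original document.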
🌊
Apache Kafka
Hosted on Confluent Cloud. Decouples ingestion from storage, ensuring reliable message buffering and replayability.
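
One way the hand-off to Kafka might look is sketched below. The helpers `to_kafka_message` and `publish` are hypothetical, as are the record fields and the choice to key by country. The `producer` argument is any object with `produce()` and `flush()` methods, which is the shape of the `Producer` class in the `confluent-kafka` Python client.

```python
import json


def to_kafka_message(record):
    """Serialize one normalized record into a (key, value) pair of bytes."""
    # Keying by country sends all of a country's readings to the same
    # partition under the default partitioner, which preserves their order.
    key = record["country"].encode("utf-8")
    value = json.dumps(record, sort_keys=True).encode("utf-8")
    return key, value


def publish(producer, topic, records):
    """Send records through a producer exposing produce() and flush()."""
    for record in records:
        key, value = to_kafka_message(record)
        producer.produce(topic, value=value, key=key)
    # Block until all buffered messages have been handed to the broker.
    producer.flush()
    return len(records)
```

Because the storage side only reads from the topic, a failed database write can be retried by replaying messages rather than by calling the upstream API again.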
🗄️
CockroachDB
Serverless, PostgreSQL-compatible SQL database. Optimized for high-write loads and distributed availability.
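
A minimal sketch of an idempotent batch write follows. It assumes a hypothetical `generation` table with a primary key on `(country, ts)`; the table, column names, and helper functions are illustrative only. The `cursor` is any DB-API cursor that accepts `%s` placeholders, such as one from a PostgreSQL driver connected to CockroachDB.

```python
# Hypothetical schema: generation(country, ts, quantity_mw),
# with PRIMARY KEY (country, ts).
UPSERT_SQL = (
    "INSERT INTO generation (country, ts, quantity_mw) "
    "VALUES (%s, %s, %s) "
    "ON CONFLICT (country, ts) DO UPDATE SET quantity_mw = excluded.quantity_mw"
)


def to_row(record):
    """Map a normalized record onto the column order used by UPSERT_SQL."""
    return (record["country"], record["timestamp"], record["quantity_mw"])


def write_batch(cursor, records):
    """Upsert a batch of records; re-running the same batch changes nothing."""
    rows = [to_row(r) for r in records]
    cursor.executemany(UPSERT_SQL, rows)
    return len(rows)
```

Using an upsert rather than a plain insert makes replays from Kafka safe: a message processed twice overwrites the same row instead of creating a duplicate.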
🐳
Docker & CI/CD
Full pipeline containerized with Docker Compose. Automated ingestion workflows managed via GitHub Actions.