aayush2789/GridPulse

GridPulse: Energy Forecast & Monitoring System

GridPulse is a high-performance data and ML system for real-time energy load forecasting. It combines an ETL pipeline, a FastAPI backend that pushes updates over Server-Sent Events (SSE), and a Next.js dashboard for visualization.

Note

Project Highlights:

  • Real-Time ML Pipeline: Built a robust end-to-end system integrating Kafka-like streaming, leakage-safe temporal feature engineering (lag/rolling/EMA), and sub-second inference using FastAPI and Server-Sent Events (SSE).
  • Production-Grade Architecture: Implemented a "fail-fast" CI/CD pipeline with strict contract testing, containerized microservices (Docker), and automated schema validation to ensure 99.9% system stability.
  • Modern Data Visualization: Developed an interactive Next.js dashboard with Recharts to visualize high-frequency time-series data, model performance metrics, and drift detection in real time.

🎯 Project Objective

The primary goal of GridPulse is to provide reliable, real-time energy consumption predictions and system health monitoring. The system emphasizes change safety through a robust CI/CD pipeline and comprehensive testing, ensuring that incremental updates to the ML models, API, or frontend do not compromise system stability.

📊 Dataset: OPSD Household

The project uses the Household Data Package from Open Power System Data (OPSD). This open dataset contains validated, high-resolution (15-minute) power consumption data from single-family homes in Germany. It serves as a realistic ground truth for benchmarking how well the system handles seasonality, noise, and intra-day volatility in a streaming context.

🏗️ System Architecture

GridPulse is composed of four main layers:

  1. Data Ingestion & ETL: Extracts raw energy data, performs cleaning, and transforms it into model-ready features.
    • Advanced Feature Engineering: Computes leakage-safe lag features (t-1 to t-96), rolling window statistics (mean/std/min/max), and Exponential Moving Averages (EMA) to capture short-term trends and volatility.
  2. ML & Inference: Trained models generate hourly load predictions based on processed historical data, with dynamic feature selection to ensure model compatibility.
  3. API Layer (FastAPI):
    • Provides REST endpoints for historical data, temporal features, and metrics.
    • Uses Server-Sent Events (SSE) to push real-time updates to the dashboard whenever new predictions are generated.
  4. Frontend Dashboard (Next.js): A responsive React-based dashboard that visualizes predictions, actual load, temporal trends (lags, EMAs), performance metrics (MAE, RMSE), and pipeline health.
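
The leakage-safe feature step in layer 1 can be illustrated with a minimal sketch. It is not the project's ETL code: the function name, the parameter defaults, and the choice of alpha are assumptions made for illustration, and only the mean/min/max rolling statistics are shown. The point is that every feature for step t is computed from readings strictly before t.

```python
from collections import deque
from statistics import fmean


def leakage_safe_features(load, lags=(1, 4, 96), window=4, alpha=0.5):
    """Build one feature row per time step using only strictly earlier readings.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rows = []
    history = deque(maxlen=window)  # the last `window` readings before step t
    ema = None                      # EMA over the readings before step t
    for t, value in enumerate(load):
        # Lag k looks back k steps; it is None until enough history exists.
        row = {f"lag_{k}": load[t - k] if t >= k else None for k in lags}
        full = len(history) == window
        row["roll_mean"] = fmean(history) if full else None
        row["roll_min"] = min(history) if full else None
        row["roll_max"] = max(history) if full else None
        row["ema"] = ema
        rows.append(row)
        # Update state only after the row is emitted, so the reading at
        # step t never leaks into its own features.
        history.append(value)
        ema = value if ema is None else alpha * value + (1 - alpha) * ema
    return rows


rows = leakage_safe_features([1.0, 2.0, 3.0, 4.0, 5.0], lags=(1, 2), window=2)
```

With 15-minute data, a lag of 96 corresponds to the same time on the previous day, which is why the lag range above stops at t-96.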

🔌 Key API Endpoints

Method  Endpoint             Description
GET     /dashboard/latest    Aggregated data for dashboard initialization
GET     /predictions/stream  SSE stream for real-time prediction updates
GET     /features/temporal   Returns lag, rolling stats, and EMA feature data
GET     /metrics/latest      Current model performance (MAE, RMSE, MAPE)
GET     /health              System health and status check
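
To make the /predictions/stream behaviour more concrete, here is a rough sketch of the SSE wire format using only the standard library. It does not reproduce the project's FastAPI handler: the helper names, the "prediction" event name, and the payload fields are assumptions. Only the initial "dashboard" event comes from the description of the tests in this README.

```python
import json


def format_sse(event, data, event_id=None):
    """Serialize one Server-Sent Events message in text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # A payload containing newlines needs one "data:" line per line of text.
    for chunk in json.dumps(data).splitlines():
        lines.append(f"data: {chunk}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def prediction_stream(predictions):
    """Yield an initial "dashboard" event, then one event per prediction.

    The "prediction" event name and payload shape are hypothetical.
    """
    yield format_sse("dashboard", {"status": "ready"})
    for i, item in enumerate(predictions):
        yield format_sse("prediction", item, event_id=i)


messages = list(
    prediction_stream([{"timestamp": "2024-01-01T00:00:00Z", "load_kw": 1.25}])
)
```

In a FastAPI application, a generator like this would typically be wrapped in a streaming response with the text/event-stream media type; on the browser side, an EventSource can subscribe to each named event.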

🚀 CI/CD & Change Safety

The project includes a GitHub Actions CI workflow configured to run on every push and pull request to the main branch. The workflow is designed to "fail fast" and focuses on verifying the integrity of each component.

CI Jobs:

  • python-tests:
    • Installs backend dependencies.
    • Runs pytest for smoke tests, unit tests, and contract validation.
  • frontend-build:
    • Performs a production build of the Next.js application to catch build-time errors.
  • docker-build-check:
    • Verifies that the Dockerfiles for both the API and Frontend build successfully, ensuring container portability.

🧪 Testing Strategy

GridPulse uses a multi-layered testing approach via pytest:

  • Smoke Tests: Verify that the API imports correctly and core endpoints (/health, /dashboard/latest) return successful status codes and expected structures.
  • SSE Validation: Ensures the SSE generator can be instantiated and sends the initial "dashboard" event without runtime errors.
  • Logic Tests: Unit tests for performance metric computation (MAE/RMSE/MAPE) and data transformation logic in the ETL layer.
  • Contract Tests: Validate the schema of the dashboard payload so that the backend and frontend remain in sync.
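
As an illustration of the logic and contract layers, the sketch below shows what such pytest-style tests might look like. It is written under assumptions rather than taken from the repository: compute_metrics, contract_violations, and the keys in DASHBOARD_CONTRACT are hypothetical stand-ins for the project's real functions and payload schema.

```python
import math


def compute_metrics(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # MAPE is undefined where the actual value is zero, so skip those points.
    pct = [abs(e / a) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(pct) / len(pct) if pct else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape}


# Hypothetical contract: required top-level keys and their expected types.
DASHBOARD_CONTRACT = {"predictions": list, "metrics": dict, "health": str}


def contract_violations(payload, contract=DASHBOARD_CONTRACT):
    """List human-readable problems; an empty list means the payload conforms."""
    problems = []
    for key, expected in contract.items():
        if key not in payload:
            problems.append(f"missing key: {key}")
        elif not isinstance(payload[key], expected):
            problems.append(f"wrong type for {key}: {type(payload[key]).__name__}")
    return problems


def test_metrics():
    m = compute_metrics([100.0, 200.0], [110.0, 180.0])
    assert m["mae"] == 15.0
    assert math.isclose(m["rmse"], math.sqrt(250.0))
    assert math.isclose(m["mape"], 10.0)


def test_dashboard_contract():
    payload = {
        "predictions": [],
        "metrics": compute_metrics([1.0], [1.0]),
        "health": "ok",
    }
    assert contract_violations(payload) == []
    assert contract_violations({"predictions": []}) == [
        "missing key: metrics",
        "missing key: health",
    ]
```

Returning a list of violations instead of raising on the first one lets a failing contract test report every mismatch at once, which fits the fail-fast CI goal of surfacing breakage in a single run.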

Running Tests Locally

To run the backend test suite locally:

# Install test dependencies
pip install pytest pytest-asyncio fastapi httpx sse-starlette pandas numpy watchdog

# Run the tests
PYTHONPATH=. pytest tests/

🛠️ Development Setup

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • Docker (for containerized deployment)

API Setup

cd api
pip install -r requirements.txt
python main.py

Frontend Setup

cd frontend
npm install
npm run dev

🐳 Running the Full System with Docker

GridPulse supports one-command startup for all services (API, ETL, and Frontend) using Docker Compose.

Prerequisites

  • Docker
  • Docker Compose (v2+)

Build and Start All Services

docker compose up --build

About

Real-time energy forecasting system with streaming ML, CI/CD, and drift monitoring.
