Madrid Metro Pulse

🗺️ Project Overview

Madrid Metro Pulse is an advanced data engineering and machine learning project designed to analyze, visualize, and forecast urban mobility patterns in Madrid. By integrating real-time data from the Empresa Municipal de Transportes de Madrid (EMT) with historical pedestrian traffic data, this project provides a dynamic and insightful view into the city's pulse.

At its core, the project trains a suite of hyper-local time-series models with Meta's Prophet library to deliver 48-hour pedestrian demand forecasts for hundreds of specific locations. Combined with live bus transit information and rendered on an interactive Mapbox map, the dashboard supports urban planning, operational logistics, and public transit optimization. The project is an end-to-end demonstration of a data science application, from automated data ingestion and model training to an interactive user dashboard.

✨ Key Features

  • Automated Data Ingestion: Includes a standalone script to perform a one-time, robust fetch of all static bus and line data from the official EMT MobilityLabs API, creating a reliable local database.
  • Hyper-Local Time-Series Forecasting: A dedicated script trains hundreds of individual Prophet models, one for each pedestrian sensor, providing granular 48-hour demand forecasts for specific city locations.
  • Hybrid Analysis Dashboard: The Streamlit application presents a dual view for any selected location: a historical analysis of typical 24-hour demand patterns and a live 48-hour predictive forecast.
  • Geospatial Visualization: Integrates with the Mapbox API to render an interactive map of bus routes, dynamically highlighting selected stops and visualizing the predicted demand intensity with a color-coded halo.
  • Dockerized for Portability: The entire application is containerized with Docker, ensuring a consistent and reproducible environment for setup and deployment.
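
The "typical 24-hour demand pattern" mentioned above can be illustrated with a small sketch that averages historical counts by hour of day. This is not the repository's code: the function name `hourly_profile` and the `(timestamp, count)` input shape are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(readings):
    """Average pedestrian count for each hour of the day (0-23).

    `readings` is an iterable of (ISO-8601 timestamp, count) pairs;
    this input shape is an assumption made for the sketch.
    """
    by_hour = defaultdict(list)
    for timestamp, count in readings:
        by_hour[datetime.fromisoformat(timestamp).hour].append(count)
    # Hours with no readings are omitted rather than reported as zero.
    return {hour: mean(counts) for hour, counts in sorted(by_hour.items())}


# Hypothetical readings for a single sensor.
sample = [
    ("2024-03-04T08:00:00", 120),
    ("2024-03-05T08:00:00", 140),
    ("2024-03-04T14:00:00", 300),
]
profile = hourly_profile(sample)
print(profile)
```

A profile like this could feed the historical half of the dashboard's dual view, while the Prophet forecast supplies the predictive half.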

🛠️ Prerequisites

Before you begin, create accounts with two services to obtain the necessary API credentials (both are free):

  1. EMT Madrid: An account with access to the MobilityLabs API. You can register at https://mobilitylabs.emtmadrid.es/. This will provide you with a Client ID (your email) and a Passkey.
  2. Mapbox: A Mapbox account to generate a public access token for rendering maps. You can sign up and find your token in your account console at https://www.mapbox.com/.

🚀 Installation & Setup

Option 1: Running with Docker (Recommended)

This is the simplest way to get the application running in a consistent environment.

  1. Clone the Repository:
    git clone https://github.com/eduardoruiz1990/madrid-metro-pulse.git
    cd madrid-metro-pulse

  2. Create Your Secrets File:
    Create a file named secrets.toml in the project's .streamlit/ directory and add your credentials (the repository includes a template file; remember to rename it):
    # .streamlit/secrets.toml

    # EMT Madrid MobilityLabs Credentials
    EMT_CLIENT_ID = "your_emt_email_here"
    EMT_PASSKEY = "your_emt_passkey_here"

    # Mapbox Public Access Token
    MAPBOX_API_KEY = "your_mapbox_api_key_here"

  3. Build and Run the Docker Container:
    Make sure you have Docker Desktop running on your machine.
    docker-compose up --build

    The application will be available at http://localhost:8501.

Option 2: Running Locally with a Virtual Environment

  1. Clone the Repository and set up the environment:
    git clone https://github.com/eduardoruiz1990/madrid-metro-pulse.git
    cd madrid-metro-pulse
    python3 -m venv venv
    source venv/bin/activate # On Windows, use `venv\Scripts\activate`

  2. Install Dependencies:
    pip install -r requirements.txt

  3. Create Your Secrets File:
    Follow the instructions in step 2 of the Docker setup to create your secrets file.

🏃 Usage

The project follows a three-step workflow:

  1. Fetch Static Data (One-Time Setup):
    Before running the app for the first time, you must populate your local database. Run the fetcher script from your terminal:
    python fetch_api_data.py

    This script performs the heavy API calls and saves the results to the data/ folder.

  2. Train Predictive Models (One-Time Setup):
    Next, train the hyper-local forecast models using your historical pedestrian data:
    python train_local_models.py

    This creates hundreds of model files in the models/ folder, one per pedestrian sensor.

  3. Run the Streamlit Dashboard:
    You can now launch the interactive application:
    streamlit run app.py
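
The training step above can be sketched as follows: one Prophet model per sensor, each producing a 48-hour forecast. This illustrates the general technique and is not the contents of `train_local_models.py`. The row layout `(sensor_id, timestamp, count)` and both function names are assumptions. `forecast_sensor` needs the `pandas` and `prophet` packages, which it imports lazily so that the grouping helper runs without them.

```python
from collections import defaultdict


def group_by_sensor(rows):
    """Split (sensor_id, timestamp, count) rows into one history per sensor."""
    histories = defaultdict(list)
    for sensor_id, timestamp, count in rows:
        # Prophet expects the columns "ds" (timestamp) and "y" (value).
        histories[sensor_id].append({"ds": timestamp, "y": count})
    return dict(histories)


def forecast_sensor(history, hours=48):
    """Fit one Prophet model and predict the next `hours` hourly values."""
    # Imported lazily so the grouping helper works without these packages.
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(history)
    df["ds"] = pd.to_datetime(df["ds"])
    model = Prophet()
    model.fit(df)
    # Build the forecast horizon by hand: hourly steps after the last reading.
    step = pd.Timedelta(hours=1)
    future = pd.DataFrame(
        {"ds": pd.date_range(df["ds"].max() + step, periods=hours, freq=step)}
    )
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Hypothetical rows; real histories would span weeks or months per sensor.
rows = [
    ("sensor_a", "2024-03-04 08:00", 120),
    ("sensor_b", "2024-03-04 08:00", 45),
    ("sensor_a", "2024-03-04 09:00", 150),
]
histories = group_by_sensor(rows)
print({sensor: len(history) for sensor, history in histories.items()})
```

Calling `forecast_sensor(history)` for each entry in `histories` would yield one forecast table per sensor. Training independent models keeps each location's daily and weekly seasonality separate, at the cost of fitting and storing hundreds of models.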

📁 File Structure

madrid-metro-pulse/
├── .streamlit/
│   └── secrets.toml        # Your private credentials file (ignored by Git)
├── data/                   # Raw pedestrian data and fetched API data (ignored by Git)
├── models/                 # Trained Prophet models (ignored by Git)
├── .gitignore              # Specifies files to ignore
├── README.md               # This project overview
├── requirements.txt        # Project dependencies
├── fetch_api_data.py       # One-time script to fetch static API data
├── train_local_models.py   # One-time script to train all forecast models
└── app.py                  # The main Streamlit dashboard application
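
Because data/ and models/ are ignored by Git, a fresh clone has neither populated. Given the layout above, a small preflight check could report which one-time scripts still appear to need running. This is a sketch under stated assumptions: the helper `pending_setup_steps` is hypothetical, and it only tests whether each folder exists and is non-empty. Since data/ also holds the raw pedestrian data, a non-empty data/ folder does not prove the API fetch has run.

```python
from pathlib import Path


def pending_setup_steps(project_root="."):
    """List the one-time scripts that still appear to need running."""
    root = Path(project_root)
    # Folder-to-script mapping taken from the layout above.
    outputs = {"data": "fetch_api_data.py", "models": "train_local_models.py"}
    pending = []
    for folder, script in outputs.items():
        directory = root / folder
        # any() over iterdir() is True as soon as one entry exists.
        if not directory.is_dir() or not any(directory.iterdir()):
            pending.append(script)
    return pending


print(pending_setup_steps())
```

An empty list suggests both folders are populated and `streamlit run app.py` can proceed.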

📜 Acknowledgments and Licensing

This project uses data and services from third-party providers, including EMT Madrid (MobilityLabs API) and Mapbox. Use of these services is subject to their respective terms and conditions.

Please review their terms of use before deploying this application for any public or commercial purpose.
