Jakee4488/MLOps_Databricks_Project

End-to-end MLOps with Databricks

Set up your environment

This project uses uv to manage its environment. See the uv documentation for installation instructions: https://docs.astral.sh/uv/getting-started/installation/

To create a new environment and a lockfile, run:

uv sync --extra dev

Data

This project uses the Marvel Characters Dataset from Kaggle.

This dataset contains detailed information about Marvel characters (e.g., name, powers, physical attributes, alignment, etc.). It is used to build classification and feature engineering models for various MLOps tasks, such as predicting character attributes or status.

Scripts

  • 01.process_data.py: Loads and preprocesses the Marvel dataset, splits into train/test, and saves to the catalog.
  • 02.train_register_fe_model.py: Performs feature engineering, then trains and registers the Marvel character model.
  • 03.deploy_model.py: Deploys the trained Marvel model to a Databricks model serving endpoint.
  • 04.post_commit_status.py: Posts status updates for Marvel integration tests to GitHub.
  • 05.refresh_monitor.py: Refreshes monitoring tables and dashboards for Marvel model serving.
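
The train/test split mentioned for 01.process_data.py can be illustrated with a minimal sketch. This is not the repository's code: the function name split_train_test, the 80/20 ratio, the seed, and the record fields are all assumptions made for illustration, and the actual script may work on Spark or pandas DataFrames rather than plain lists. The sketch uses only the Python standard library.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists.

    Hypothetical helper for illustration; not part of the repository.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is not reordered
    random.Random(seed).shuffle(shuffled)  # a fixed seed keeps the split stable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up records standing in for the preprocessed Marvel data.
characters = [{"id": i, "name": f"character_{i}"} for i in range(10)]
train_set, test_set = split_train_test(characters)
print(len(train_set), len(test_set))  # prints "8 2"
```

Fixing the seed means repeated runs produce the same split, which keeps the train and test data consistent across pipeline runs.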

About

This repository provides a simple, end-to-end MLOps architecture implemented within the Databricks platform. It includes a minimal, working machine learning project that demonstrates the key stages of an MLOps lifecycle, including data preparation, model training and tracking (using MLflow), model registration, and batch inference or serving.
