Databricks framework to validate Data Quality of pySpark DataFrames and Tables
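A framework like the one described above typically evaluates named rules against a DataFrame and reports the failing rows. A minimal pure-Python sketch of that rule-based validation pattern follows; the helper names (`check_not_null`, `check_in_range`) are illustrative, and a real implementation would run equivalent expressions against a `pyspark.sql.DataFrame`.

```python
# Hypothetical rule-based validation sketch using plain Python rows in place
# of a PySpark DataFrame. Each check returns the rows that violate the rule.

def check_not_null(rows, column):
    """Return rows that fail a NOT NULL expectation on `column`."""
    return [r for r in rows if r.get(column) is None]

def check_in_range(rows, column, lo, hi):
    """Return rows whose `column` value falls outside [lo, hi]."""
    return [r for r in rows
            if r.get(column) is not None and not (lo <= r[column] <= hi)]

rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": None},   # violates NOT NULL
    {"id": 3, "amount": 999},    # violates the range rule
]

failures = {
    "amount_not_null": check_not_null(rows, "amount"),
    "amount_in_range": check_in_range(rows, "amount", 0, 100),
}
print({name: len(bad) for name, bad in failures.items()})
# → {'amount_not_null': 1, 'amount_in_range': 1}
```

Collecting failures per rule name, rather than failing fast, is what lets such frameworks quarantine bad rows while letting the rest of the batch through.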
End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.
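The Bronze/Silver/Gold layering mentioned above can be sketched in a few lines. This is a minimal pure-Python illustration of the shape of the transformations, not the project's actual code; in the real pipeline each layer would be a Delta table written with PySpark.

```python
# Bronze: raw records exactly as ingested (duplicates and bad values included).
bronze = [
    {"order_id": "1", "amount": "10.0", "country": "us"},
    {"order_id": "1", "amount": "10.0", "country": "us"},   # duplicate
    {"order_id": "2", "amount": "oops", "country": "DE"},   # unparseable amount
    {"order_id": "3", "amount": "7.5",  "country": "de"},
]

# Silver: deduplicated, typed, and standardized.
seen, silver = set(), []
for r in bronze:
    try:
        amount = float(r["amount"])
    except ValueError:
        continue   # a real pipeline would quarantine this row, not drop it silently
    if r["order_id"] in seen:
        continue
    seen.add(r["order_id"])
    silver.append({"order_id": r["order_id"], "amount": amount,
                   "country": r["country"].upper()})

# Gold: business-level aggregate ready for BI and analytics.
gold = {}
for r in silver:
    gold[r["country"]] = gold.get(r["country"], 0.0) + r["amount"]

print(gold)
# → {'US': 10.0, 'DE': 7.5}
```

The key design point is that each layer only reads from the layer below it, so bad data can always be replayed from Bronze without re-ingesting from the source.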
Automated migrations to Unity Catalog
A cloud-native data pipeline and visualization project analyzing Formula 1 racing data using Azure, Databricks, Delta Lake, Tableau, and Python for insightful EDA and interactive dashboards.
Open, Multi-modal Catalog for Data & AI, written in Rust
A production-ready PySpark project template with medallion architecture, Python packaging, unit tests, integration tests, CI/CD automation, Databricks Asset Bundles, and DQX data quality framework.
Notebooks, terraform, tools to enable setting up Unity Catalog
Production-grade Databricks infrastructure templates for Azure. Deploy in 20 minutes with VNet injection, Unity Catalog, managed identity. Perfect for learning and prototyping. Free and open source.
Unity Catalog Explorer is a TypeScript + Next.js based Web UI for the Unity Catalog OSS.
Visual intelligence tool for Databricks workspaces. Builds a live ontology of Unity Catalog assets, compute, jobs, dashboards, and apps as entities with semantic relationships, enriched with cost attribution, lineage, and governance insights.
How to Configure Azure Databricks Unity Catalog using Terraform
SchemaX - Automatic Schema Management
Production-ready support ticket classification using Unity Catalog AI Functions, Vector Search, and RAG. Features 6-phase workflow, knowledge base integration, and Streamlit dashboard.
End-to-end Azure Data Engineering project using ADF for incremental ingestion, Databricks (DLT) for Medallion Architecture, and Delta Lake for CDC (SCD Type 1). Managed via Databricks Asset Bundles (DABs) for professional CI/CD. Focuses on real-time streaming, scalability, and Star Schema modeling.
Real Estate ELT pipeline using Databricks Asset Bundles on GCP. Ingests, transforms, and analyzes property data via Delta Live Tables. Follows medallion architecture (Bronze/Silver/Gold), modular Python design, CI/CD automation with GitHub Actions, and full unit and integration test coverage.
End-to-end Azure Databricks retail data engineering project using Medallion Architecture (Bronze, Silver, Gold). Implements Auto Loader, Unity Catalog, Delta Lake, SCD Type 1 & 2 dimensions, and Fact Orders for analytics-ready star schema modeling.
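For readers unfamiliar with the SCD Type 2 dimensions mentioned above: instead of overwriting a changed attribute (Type 1), the current dimension row is closed and a new version is appended. A hedged pure-Python sketch of that logic follows; real implementations use a Delta Lake `MERGE`, and the column names here are illustrative.

```python
from datetime import date

# A customer dimension with SCD Type 2 validity columns.
dim = [
    {"customer_id": 1, "city": "Berlin", "valid_from": date(2024, 1, 1),
     "valid_to": None, "is_current": True},
]

def apply_scd2(dim, customer_id, new_city, as_of):
    """Close the current row if `city` changed, then append the new version."""
    for row in dim:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["city"] == new_city:
                return dim                       # unchanged: nothing to do
            row["valid_to"], row["is_current"] = as_of, False
    dim.append({"customer_id": customer_id, "city": new_city,
                "valid_from": as_of, "valid_to": None, "is_current": True})
    return dim

apply_scd2(dim, 1, "Munich", date(2025, 6, 1))
print([(r["city"], r["is_current"]) for r in dim])
# → [('Berlin', False), ('Munich', True)]
```

Keeping the closed rows is what lets fact tables join to the dimension version that was current when each order happened, which a Type 1 overwrite would destroy.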
Pulls new data from the FRED API (maintained by the St. Louis Federal Reserve Bank) on the first Tuesday of every month, performs feature engineering, stores it per the medallion architecture on Databricks, and predicts recessions.
End-to-end retail data pipeline on Databricks using PySpark and Delta Lake, built with Bronze–Silver–Gold architecture and connected to Power BI for analytics.
Azure portfolio project (ADF + Databricks SQL + Power BI)