diff --git a/2026-06-llmops-quickstart/LICENSE.md b/2026-06-llmops-quickstart/LICENSE.md new file mode 100644 index 0000000..a65ffc1 --- /dev/null +++ b/2026-06-llmops-quickstart/LICENSE.md @@ -0,0 +1,21 @@ +Copyright (2022) Databricks, Inc. + +This library (the "Software") may not be used except in connection with the Licensee's use of the Databricks Platform Services pursuant to an Agreement (defined below) between Licensee (defined below) and Databricks, Inc. ("Databricks"). The Object Code version of the Software shall be deemed part of the Downloadable Services under the Agreement, or if the Agreement does not define Downloadable Services, Subscription Services, or if neither are defined then the term in such Agreement that refers to the applicable Databricks Platform Services (as defined below) shall be substituted herein for “Downloadable Services.” Licensee's use of the Software must comply at all times with any restrictions applicable to the Downlodable Services and Subscription Services, generally, and must be used in accordance with any applicable documentation. For the avoidance of doubt, the Software constitutes Databricks Confidential Information under the Agreement. + +Additionally, and notwithstanding anything in the Agreement to the contrary: + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +you may view, make limited copies of, and may compile the Source Code version of the Software into an Object Code version of the Software. 
For the avoidance of doubt, you may not make derivative works of Software (or make any any changes to the Source Code version of the unless you have agreed to separate terms with Databricks permitting such modifications (e.g., a contribution license agreement)). +If you have not agreed to an Agreement or otherwise do not agree to these terms, you may not use the Software or view, copy or compile the Source Code of the Software. + +This license terminates automatically upon the termination of the Agreement or Licensee's breach of these terms. Additionally, Databricks may terminate this license at any time on notice. Upon termination, you must permanently delete the Software and all copies thereof (including the Source Code). + +Agreement: the agreement between Databricks and Licensee governing the use of the Databricks Platform Services, which shall be, with respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with respect to Databricks Community Edition, the Community Edition Terms of Service located at www.databricks.com/ce-termsofuse, in each case unless Licensee has entered into a separate written agreement with Databricks governing the use of the applicable Databricks Platform Services. + +Databricks Platform Services: the Databricks services or the Databricks Community Edition services, according to where the Software is used. + +Licensee: the user of the Software, or, if the Software is being used on behalf of a company, the company. + +Object Code: is version of the Software produced when an interpreter or a compiler translates the Source Code into recognizable and executable machine code. + +Source Code: the human readable portion of the Software. 
diff --git a/2026-06-llmops-quickstart/README.md b/2026-06-llmops-quickstart/README.md new file mode 100644 index 0000000..1e35f79 --- /dev/null +++ b/2026-06-llmops-quickstart/README.md @@ -0,0 +1,152 @@ +# LLMOps Quickstart for Databricks + +A minimal but complete end-to-end **LLMOps** example on Databricks, demonstrating the full lifecycle of an LLM-powered application: + +**Data Ingestion → Agent Build → Evaluation → Deployment → Inference** + +Use case: a **customer support ticket classifier** that uses a Databricks Foundation Model to categorize free-text tickets into `billing`, `technical_issue`, `feature_request`, `account_management`, or `other`. + +> 📝 **Blog post:** _link to be added once published on the [Databricks Community](https://community.databricks.com/) platform._ + +> ⚠️ **Disclaimer:** This is **not production-ready code**. It is provided **as-is**, for educational purposes, and support is available on a **best-effort basis**. If you run into problems, please [open an issue](https://github.com/databricks-solutions/databricks-blogposts/issues). + +This example accompanies the blog post and is intended for **educational purposes**. It is unofficial and carries no formal support beyond the best-effort basis noted above (see [Licensing](#licensing)). + +--- + +## What it demonstrates + +The repository shows how to wrap a hosted LLM in the standard Databricks MLOps building blocks so that an LLM application becomes reproducible, governed, and repeatably deployable: + +- An agent logged to **MLflow** as a `ChatAgent`, calling a **Foundation Model API** endpoint (default `databricks-claude-sonnet-4-6`) through the OpenAI-compatible client. +- An **evaluation gate** that promotes a model version to the **Champion** alias in **Unity Catalog** only when accuracy clears a threshold (default 80%). +- A **Databricks Asset Bundle** that deploys the Unity Catalog schema, the MLflow experiment, and every job with one command, with separate `dev` and `prod` targets. 
+- Both **batch** and **real-time** inference against the deployed **Mosaic AI Model Serving** endpoint. + +--- + +## Prerequisites + +- [Databricks CLI](https://docs.databricks.com/dev-tools/cli/install.html) v0.200+ +- A Databricks workspace with: + - Unity Catalog enabled + - Foundation Model APIs enabled (for the default `databricks-claude-sonnet-4-6` endpoint) + - Permissions to create schemas, registered models, jobs, and Model Serving endpoints + +--- + +## Quickstart + +### 1. Authenticate + +```bash +databricks auth login --host https://<your-workspace>.cloud.databricks.com +``` + +### 2. Deploy the bundle + +From this folder: + +```bash +databricks bundle deploy +``` + +This creates the Unity Catalog schema, MLflow experiment, and all jobs in your workspace under your user directory. + +### 3. Run the pipeline + +```bash +# Step 1 — ingest sample support tickets into a Delta table +databricks bundle run data_preprocessing_job + +# Step 2 — build and evaluate the classifier; promote to Champion if accuracy >= 80% +databricks bundle run model_build_evaluation_job + +# Step 3 — deploy the Champion model to a Model Serving endpoint +databricks bundle run model_deployment_job + +# Step 4 — run batch inference over all tickets +databricks bundle run batch_inference_job +``` + +--- + +## Configuration + +All configuration is exposed as bundle variables with sensible defaults — no edits to source files are needed for most workspaces. 
+ +| Variable | Default | Description | +|---|---|---| +| `catalog_name` | `main` | Unity Catalog catalog (must already exist) | +| `schema_name` | `llmops_quickstart` | UC schema (created by the bundle) | +| `model_name` | `support_ticket_classifier` | Registered model name | +| `llm_endpoint` | `databricks-claude-sonnet-4-6` | Foundation Model API endpoint used by the agent | + +Override at deploy time, e.g.: + +```bash +databricks bundle deploy \ + --var="catalog_name=my_catalog" \ + --var="llm_endpoint=databricks-meta-llama-3-3-70b-instruct" +``` + +The `prod` target uses `llmops_quickstart_prod` as the schema name: + +```bash +databricks bundle deploy --target prod +``` + +--- + +## Project structure + +``` +notebooks/ + 1_data_preprocessing/ + data_ingestion.py # Creates support_tickets Delta table (30 labelled rows) + 2_model_build_and_deploy/ + quickstart_agent.py # MLflow ChatAgent definition + model_config.yml # Default agent config (llm_endpoint) + model_build.py # Logs agent to MLflow + model_evaluation.py # Evaluates agent; promotes to Champion if accuracy >= threshold + model_deployment.py # Deploys Champion to Mosaic AI Model Serving + 3_inference/ + batch_inference.py # Batch predictions written to inference_results table + realtime_inference.py # Live queries via the OpenAI-compatible API +resources/ # Bundle job + schema/experiment definitions +databricks.yml # Bundle entry point — targets, variables +docs/img/ # Architecture diagrams used in the blog post +``` + +--- + +## How it works + +1. **Data Ingestion** — 30 hand-written, synthetic support tickets (6 per category) are written to a Delta table in Unity Catalog. No real or PII data is used. +2. **Model Build** — `quickstart_agent.py` is logged as an MLflow `ChatAgent`. The configured LLM endpoint is baked into the model artifact via `mlflow.models.ModelConfig`. +3. **Evaluation** — the logged agent runs predictions on all tickets. 
If accuracy meets the threshold (default 80%), the model is registered in Unity Catalog and aliased as **Champion**. +4. **Deployment** — the Champion version is deployed to a Mosaic AI Model Serving endpoint via `databricks.agents.deploy()`. +5. **Inference** — batch inference loads the Champion model directly; real-time inference queries the serving endpoint via the OpenAI-compatible API. + +--- + +## Data + +The only dataset is **30 small, synthetic support tickets** generated by hand in `notebooks/1_data_preprocessing/data_ingestion.py`. There is no external dataset, no customer data, and no PII. + +--- + +## Licensing + +- This folder is provided under the repository's Databricks license — see [`LICENSE.md`](./LICENSE.md). The Databricks license is **not modified**. +- Dependencies used by this example and their licenses: + - [MLflow](https://github.com/mlflow/mlflow) — Apache License 2.0 + - [Databricks SDK for Python](https://github.com/databricks/databricks-sdk-py) — Apache License 2.0 + - [`databricks-agents`](https://docs.databricks.com/en/generative-ai/agent-framework/build-genai-apps.html) — Databricks + - [OpenAI Python client](https://github.com/openai/openai-python) — Apache License 2.0 + +--- + +## Acknowledgements + +The structure of this example follows the established [databricks-solutions/mlops-quickstart](https://github.com/databricks-solutions/mlops-quickstart) repository, adapted for an LLM/agent workload. Portions of the code and the accompanying blog draft were developed with the assistance of AI tooling and reviewed by the author. 
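The evaluation gate described under "How it works" reduces to very little logic once the predictions exist. The sketch below restates it in plain Python, with no MLflow or Spark dependency, assuming labels are compared after trimming and lowercasing as `model_evaluation.py` does. The helper names `normalize_label`, `accuracy`, and `should_promote` are hypothetical and are not part of this repository.

```python
from typing import Iterable, Tuple


def normalize_label(raw: str) -> str:
    """Trim whitespace and lowercase, mirroring the comparison in model_evaluation.py."""
    return raw.strip().lower()


def accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (expected, predicted) pairs whose normalized labels match."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("accuracy is undefined for an empty evaluation set")
    correct = sum(
        normalize_label(expected) == normalize_label(predicted)
        for expected, predicted in pairs
    )
    return correct / len(pairs)


def should_promote(acc: float, threshold: float = 0.8) -> bool:
    """Gate: promote only when accuracy meets or exceeds the threshold."""
    return acc >= threshold


# Illustrative data only: (expected, predicted) labels, not real model output.
sample = [
    ("billing", "billing"),
    ("technical_issue", " Technical_Issue "),
    ("feature_request", "feature_request"),
    ("account_management", "billing"),
    ("other", "other"),
]
acc = accuracy(sample)
print(f"accuracy={acc:.1%} promote={should_promote(acc)}")
```

In the notebooks, the equivalent decision is followed by `mlflow.register_model` and setting the `Champion` alias; keeping the comparison in a small pure function like this makes the threshold behaviour easy to unit-test outside a workspace.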
diff --git a/2026-06-llmops-quickstart/databricks.yml b/2026-06-llmops-quickstart/databricks.yml new file mode 100644 index 0000000..293f8cc --- /dev/null +++ b/2026-06-llmops-quickstart/databricks.yml @@ -0,0 +1,38 @@ +bundle: + name: llmops-quickstart + +include: + - resources/*.yml + +workspace: + root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/${bundle.target} + +variables: + catalog_name: + description: "Unity Catalog catalog name (must already exist)" + default: main + schema_name: + description: "Unity Catalog schema name (created by the bundle if absent)" + default: llmops_quickstart + model_name: + description: "Registered model name in Unity Catalog" + default: support_ticket_classifier + llm_endpoint: + description: "Databricks Foundation Model API endpoint used by the agent" + default: databricks-claude-sonnet-4-6 + environment: + description: "Deployment environment label" + default: dev + +targets: + dev: + mode: development + default: true + variables: + environment: dev + + prod: + mode: production + variables: + schema_name: llmops_quickstart_prod + environment: prod diff --git a/2026-06-llmops-quickstart/docs/img/llmops-architecture.png b/2026-06-llmops-quickstart/docs/img/llmops-architecture.png new file mode 100644 index 0000000..7123aa2 Binary files /dev/null and b/2026-06-llmops-quickstart/docs/img/llmops-architecture.png differ diff --git a/2026-06-llmops-quickstart/docs/img/llmops-lifecycle.png b/2026-06-llmops-quickstart/docs/img/llmops-lifecycle.png new file mode 100644 index 0000000..4d7c995 Binary files /dev/null and b/2026-06-llmops-quickstart/docs/img/llmops-lifecycle.png differ diff --git a/2026-06-llmops-quickstart/notebooks/1_data_preprocessing/data_ingestion.py b/2026-06-llmops-quickstart/notebooks/1_data_preprocessing/data_ingestion.py new file mode 100644 index 0000000..d8920a0 --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/1_data_preprocessing/data_ingestion.py @@ -0,0 +1,88 @@ +# 
Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# No %pip install needed — all required packages are pre-installed. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Data Ingestion +# MAGIC +# MAGIC Creates sample customer support tickets in a Unity Catalog Delta table. +# MAGIC These records serve as both the evaluation dataset and batch inference input. + +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Create schema + +# COMMAND ---------- + +spark.sql(f"CREATE SCHEMA IF NOT EXISTS {catalog_name}.{schema_name}") + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Create support tickets table + +# COMMAND ---------- + +tickets = [ + # billing + (1, "I was charged twice for my subscription last month. Please refund the extra charge.", "billing"), + (2, "My invoice shows a different amount than what I was quoted. Can you explain the discrepancy?", "billing"), + (3, "I cancelled my plan three weeks ago but still got charged. I need an immediate refund.", "billing"), + (4, "How do I update my payment method? My credit card expired.", "billing"), + (5, "I'd like to downgrade my plan to save money. What are the pricing options?", "billing"), + (6, "Can I get an itemized invoice for my last three months of service?", "billing"), + # technical_issue + (7, "The mobile app crashes every time I try to upload a photo. Running iOS 17.", "technical_issue"), + (8, "I keep getting a 500 error when trying to export my reports to PDF.", "technical_issue"), + (9, "The dashboard stopped loading after your last update. 
I just see a blank screen.", "technical_issue"), + (10, "Two-factor authentication is not sending me the SMS code.", "technical_issue"), + (11, "My data sync between devices stopped working two days ago.", "technical_issue"), + (12, "Search results are returning empty even though I know the data exists.", "technical_issue"), + # feature_request + (13, "It would be great if you could add dark mode to the dashboard.", "feature_request"), + (14, "Can you add bulk export functionality to the reporting module?", "feature_request"), + (15, "Please add keyboard shortcuts to the editor — it would speed up my workflow a lot.", "feature_request"), + (16, "I'd love to see a Slack integration so I get notifications in my team channel.", "feature_request"), + (17, "Would it be possible to add an undo button to the data editor?", "feature_request"), + (18, "Please add a public API so we can integrate with our internal tools.", "feature_request"), + # account_management + (19, "I need to transfer ownership of my account to a colleague.", "account_management"), + (20, "How do I add team members to my organization account?", "account_management"), + (21, "My password reset email never arrived. I've been locked out for 2 days.", "account_management"), + (22, "I want to delete my account and all associated data.", "account_management"), + (23, "Can I merge two accounts under the same email address?", "account_management"), + (24, "How do I enable SSO login for my company?", "account_management"), + # other + (25, "What are your business hours for live chat support?", "other"), + (26, "Do you offer any discounts for non-profit organizations?", "other"), + (27, "I'd like to leave a positive review — your support team was amazing.", "other"), + (28, "Is your platform SOC 2 Type II certified?", "other"), + (29, "What data centers do you use and where are they located?", "other"), + (30, "Do you have a referral program? 
I'd like to recommend you to clients.", "other"), +] + +from pyspark.sql.types import StructType, StructField, IntegerType, StringType + +schema = StructType([ + StructField("id", IntegerType(), False), + StructField("ticket", StringType(), False), + StructField("category", StringType(), False), +]) + +df = spark.createDataFrame(tickets, schema=schema) + +df.write.mode("overwrite").saveAsTable(f"{catalog_name}.{schema_name}.support_tickets") + +display(spark.read.table(f"{catalog_name}.{schema_name}.support_tickets")) + +# COMMAND ---------- + +print(f"Created table: {catalog_name}.{schema_name}.support_tickets") +print(f"Row count: {spark.read.table(f'{catalog_name}.{schema_name}.support_tickets').count()}") diff --git a/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_build.py b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_build.py new file mode 100644 index 0000000..8bfe326 --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_build.py @@ -0,0 +1,70 @@ +# Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# databricks-openai is added via the job's environment spec; no %pip install needed here. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Model Build +# MAGIC +# MAGIC Logs the `TicketClassifierAgent` to an MLflow experiment run and stores the +# MAGIC run ID for use by the evaluation task. 
+ +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +dbutils.widgets.text("model_name", "support_ticket_classifier") +dbutils.widgets.text("experiment_name", f"/Users/{dbutils.notebook.entry_point.getDbutils().notebook().getContext().userName().get()}/llmops_quickstart") +dbutils.widgets.text("llm_endpoint", "databricks-claude-sonnet-4-6") + +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") +model_name = dbutils.widgets.get("model_name") +experiment_name = dbutils.widgets.get("experiment_name") +llm_endpoint = dbutils.widgets.get("llm_endpoint") + +registered_model_name = f"{catalog_name}.{schema_name}.{model_name}" + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Log agent to MLflow + +# COMMAND ---------- + +import mlflow +import datetime +from mlflow.models.resources import DatabricksServingEndpoint + +mlflow.set_registry_uri("databricks-uc") +mlflow.set_experiment(experiment_name) + +resources = [DatabricksServingEndpoint(endpoint_name=llm_endpoint)] + +timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") + +# quickstart_agent.py lives in the same directory as this notebook. +# model_config is written into the artifact so the agent reads llm_endpoint at serving time. 
+with mlflow.start_run(run_name=f"build_{timestamp}") as run: + logged_model_info = mlflow.pyfunc.log_model( + artifact_path="agent", + python_model="quickstart_agent.py", + model_config={"llm_endpoint": llm_endpoint}, + resources=resources, + pip_requirements=[ + "mlflow", + "databricks-openai", + "databricks-agents", + "databricks-sdk", + "typing_extensions", + ], + ) + print(f"Logged model: {logged_model_info.model_uri}") + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Pass run ID to downstream evaluation task + +# COMMAND ---------- + +dbutils.jobs.taskValues.set(key="logged_run_id", value=logged_model_info.run_id) +print(f"run_id: {logged_model_info.run_id}") diff --git a/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_config.yml b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_config.yml new file mode 100644 index 0000000..68fb21e --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_config.yml @@ -0,0 +1 @@ +llm_endpoint: "databricks-claude-sonnet-4-6" diff --git a/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_deployment.py b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_deployment.py new file mode 100644 index 0000000..bb7d463 --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_deployment.py @@ -0,0 +1,57 @@ +# Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# databricks-openai is added via the job's environment spec; no %pip install needed here. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Model Deployment +# MAGIC +# MAGIC Deploys the **Champion** model version to a Mosaic AI Model Serving endpoint +# MAGIC using `databricks.agents.deploy`. Any existing deployments for this model are +# MAGIC removed first, then the Champion version is deployed. 
+ +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +dbutils.widgets.text("model_name", "support_ticket_classifier") +dbutils.widgets.text("experiment_name", f"/Users/{dbutils.notebook.entry_point.getDbutils().notebook().getContext().userName().get()}/llmops_quickstart") + +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") +model_name = dbutils.widgets.get("model_name") +experiment_name = dbutils.widgets.get("experiment_name") + +registered_model_name = f"{catalog_name}.{schema_name}.{model_name}" + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Deploy Champion to Model Serving + +# COMMAND ---------- + +import mlflow +from mlflow import MlflowClient +from databricks import agents +from databricks.agents import get_deployments, delete_deployment + +mlflow.set_registry_uri("databricks-uc") +mlflow.set_experiment(experiment_name) + +client = MlflowClient() +champion = client.get_model_version_by_alias(registered_model_name, "Champion") +print(f"Deploying {registered_model_name} v{champion.version} (Champion)") + +# Remove any existing deployments for this model so we don't accumulate stale endpoints +existing = get_deployments(model_name=registered_model_name) +for d in existing: + print(f"Removing existing deployment: {d.endpoint_name}") + delete_deployment(model_name=registered_model_name, model_version=d.model_version) + +deployment = agents.deploy( + model_name=registered_model_name, + model_version=int(champion.version), +) + +print(f"Endpoint: {deployment.endpoint_name}") +print(f"URL: {deployment.endpoint_url}") diff --git a/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_evaluation.py b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_evaluation.py new file mode 100644 index 0000000..aceb5d1 --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/model_evaluation.py @@ 
-0,0 +1,102 @@ +# Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# databricks-openai is added via the job's environment spec; no %pip install needed here. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Model Evaluation +# MAGIC +# MAGIC Evaluates the logged agent against the labelled support tickets. +# MAGIC Metrics are logged to the same MLflow run. If accuracy meets the threshold, +# MAGIC the model is registered to Unity Catalog and aliased as **Champion**. + +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +dbutils.widgets.text("model_name", "support_ticket_classifier") +dbutils.widgets.text("logged_run_id", "") +dbutils.widgets.text("experiment_name", f"/Users/{dbutils.notebook.entry_point.getDbutils().notebook().getContext().userName().get()}/llmops_quickstart") +dbutils.widgets.text("accuracy_threshold", "0.8") + +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") +model_name = dbutils.widgets.get("model_name") +logged_run_id = dbutils.widgets.get("logged_run_id") +experiment_name = dbutils.widgets.get("experiment_name") +accuracy_threshold = float(dbutils.widgets.get("accuracy_threshold")) + +registered_model_name = f"{catalog_name}.{schema_name}.{model_name}" +model_uri = f"runs:/{logged_run_id}/agent" + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Load agent and run predictions against labelled tickets + +# COMMAND ---------- + +import mlflow + +mlflow.set_registry_uri("databricks-uc") +mlflow.set_experiment(experiment_name) + +agent = mlflow.pyfunc.load_model(model_uri) + +df = spark.read.table(f"{catalog_name}.{schema_name}.support_tickets").toPandas() + +results = [] +for _, row in df.iterrows(): + prediction = agent.predict({"messages": [{"role": "user", "content": row["ticket"]}]}) + messages = prediction.get("messages", []) + predicted = 
messages[-1].get("content", "").strip().lower() if messages else "" + results.append({ + "id": row["id"], + "ticket": row["ticket"], + "expected": row["category"], + "predicted": predicted, + "correct": predicted == row["category"], + }) + +import pandas as pd +results_df = pd.DataFrame(results) +display(results_df) + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Log evaluation metrics + +# COMMAND ---------- + +accuracy = results_df["correct"].mean() +n_total = len(results_df) +n_correct = results_df["correct"].sum() + +print(f"Accuracy: {n_correct}/{n_total} = {accuracy:.1%}") + +with mlflow.start_run(run_id=logged_run_id): + mlflow.log_metrics({ + "eval/accuracy": accuracy, + "eval/n_total": n_total, + "eval/n_correct": int(n_correct), + }) + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Register model as Champion if threshold is met + +# COMMAND ---------- + +from mlflow import MlflowClient + +if accuracy >= accuracy_threshold: + print(f"Accuracy {accuracy:.1%} >= threshold {accuracy_threshold:.0%} — registering model.") + client = MlflowClient() + registered = mlflow.register_model(model_uri, name=registered_model_name) + client.set_registered_model_alias(registered_model_name, "Champion", registered.version) + print(f"Registered {registered_model_name} v{registered.version} as Champion.") + dbutils.jobs.taskValues.set(key="model_version", value=registered.version) +else: + raise Exception( + f"Accuracy {accuracy:.1%} is below threshold {accuracy_threshold:.0%}. " + "Model not registered. Improve the agent and re-run." 
+ ) diff --git a/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/quickstart_agent.py b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/quickstart_agent.py new file mode 100644 index 0000000..bf32eed --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/2_model_build_and_deploy/quickstart_agent.py @@ -0,0 +1,51 @@ +import uuid +from databricks.sdk import WorkspaceClient +from typing import Any, Optional + +import mlflow +from mlflow.pyfunc import ChatAgent +from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatContext + +config = mlflow.models.ModelConfig(development_config="model_config.yml") +LLM_ENDPOINT_NAME = config.get("llm_endpoint") + +openai_client = WorkspaceClient().serving_endpoints.get_open_ai_client() + +mlflow.openai.autolog() + +SYSTEM_PROMPT = ( + "You are a customer support ticket classifier. " + "Classify the given support ticket into exactly one of these categories: " + "billing, technical_issue, feature_request, account_management, other. " + "Respond with only the category name, lowercase, no punctuation or extra text." 
+) + + +@mlflow.trace +def classify_ticket(content: str) -> list[dict]: + response = openai_client.chat.completions.create( + model=LLM_ENDPOINT_NAME, + messages=[ + {"role": "system", "content": SYSTEM_PROMPT}, + {"role": "user", "content": content}, + ], + ) + return [response.choices[0].message.to_dict()] + + +class TicketClassifierAgent(ChatAgent): + def predict( + self, + messages: list[ChatAgentMessage], + context: Optional[ChatContext] = None, + custom_inputs: Optional[dict[str, Any]] = None, + ) -> ChatAgentResponse: + content = messages[-1].content + raw_msgs = classify_ticket(content) + return ChatAgentResponse( + messages=[ChatAgentMessage(id=uuid.uuid4().hex, **m) for m in raw_msgs] + ) + + +AGENT = TicketClassifierAgent() +mlflow.models.set_model(AGENT) diff --git a/2026-06-llmops-quickstart/notebooks/3_inference/batch_inference.py b/2026-06-llmops-quickstart/notebooks/3_inference/batch_inference.py new file mode 100644 index 0000000..6d62926 --- /dev/null +++ b/2026-06-llmops-quickstart/notebooks/3_inference/batch_inference.py @@ -0,0 +1,63 @@ +# Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# No %pip install needed — all required packages are pre-installed. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Batch Inference +# MAGIC +# MAGIC Runs the Champion model over all tickets in the support tickets table and +# MAGIC writes predictions back to a results table. 
+ +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +dbutils.widgets.text("model_name", "support_ticket_classifier") + +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") +model_name = dbutils.widgets.get("model_name") + +registered_model_name = f"{catalog_name}.{schema_name}.{model_name}" + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Load Champion model and run predictions + +# COMMAND ---------- + +import mlflow + +mlflow.set_registry_uri("databricks-uc") + +model_uri = f"models:/{registered_model_name}@Champion" +agent = mlflow.pyfunc.load_model(model_uri) + +df = spark.read.table(f"{catalog_name}.{schema_name}.support_tickets").toPandas() + + +def predict_category(ticket: str) -> str: + result = agent.predict({"messages": [{"role": "user", "content": ticket}]}) + # Response is a dict with a 'messages' list; take the last message content, normalized as in model_evaluation.py + messages = result.get("messages", []) + return messages[-1].get("content", "").strip().lower() if messages else "" + + +df["predicted_category"] = df["ticket"].apply(predict_category) + +display(df[["id", "ticket", "category", "predicted_category"]]) + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Write predictions to Delta table + +# COMMAND ---------- + +result_df = spark.createDataFrame(df[["id", "ticket", "category", "predicted_category"]]) +result_df.write.mode("overwrite").saveAsTable(f"{catalog_name}.{schema_name}.inference_results") + +print(f"Results written to {catalog_name}.{schema_name}.inference_results") + +correct = (df["category"] == df["predicted_category"]).sum() +print(f"Accuracy: {correct}/{len(df)} = {correct/len(df):.1%}") diff --git a/2026-06-llmops-quickstart/notebooks/3_inference/realtime_inference.py b/2026-06-llmops-quickstart/notebooks/3_inference/realtime_inference.py new file mode 100644 index 0000000..25d4a30 --- /dev/null +++ 
b/2026-06-llmops-quickstart/notebooks/3_inference/realtime_inference.py @@ -0,0 +1,68 @@ +# Databricks notebook source +# Uses Databricks Serverless Environment v5 (configured in job resource YAML). +# No %pip install needed — all required packages are pre-installed. + +# COMMAND ---------- +# MAGIC %md +# MAGIC # Realtime Inference +# MAGIC +# MAGIC Demonstrates querying the deployed Model Serving endpoint directly via the +# MAGIC OpenAI-compatible REST API using the Databricks SDK. + +# COMMAND ---------- + +# COMMAND ---------- + +dbutils.widgets.text("catalog_name", "main") +dbutils.widgets.text("schema_name", "llmops_quickstart") +dbutils.widgets.text("model_name", "support_ticket_classifier") + +catalog_name = dbutils.widgets.get("catalog_name") +schema_name = dbutils.widgets.get("schema_name") +model_name = dbutils.widgets.get("model_name") + +# COMMAND ---------- +# MAGIC %md +# MAGIC ## Discover endpoint name from the deployed Champion + +# COMMAND ---------- + +from databricks.agents import get_deployments +import mlflow +from mlflow import MlflowClient + +mlflow.set_registry_uri("databricks-uc") +registered_model_name = f"{catalog_name}.{schema_name}.{model_name}" + +deployments = get_deployments(model_name=registered_model_name) +assert deployments, f"No deployments found for {registered_model_name}. Run model_deployment first." 
+
+endpoint_name = deployments[0].endpoint_name
+print(f"Using endpoint: {endpoint_name}")
+
+# COMMAND ----------
+# MAGIC %md
+# MAGIC ## Send a ticket to the endpoint
+
+# COMMAND ----------
+
+from databricks.sdk import WorkspaceClient
+
+client = WorkspaceClient()
+openai_client = client.serving_endpoints.get_open_ai_client()
+
+sample_tickets = [
+    "My API key stopped working after I reset my password.",
+    "I want to add my manager to my account as an admin.",
+    "Please add webhook support so we can trigger workflows automatically.",
+    "I was billed for an annual plan but I selected monthly.",
+    "What time does your support team finish for the day?",
+]
+
+for ticket in sample_tickets:
+    response = openai_client.chat.completions.create(
+        model=endpoint_name,
+        messages=[{"role": "user", "content": ticket}],
+    )
+    category = response.choices[0].message.content.strip()
+    print(f"[{category:25s}] {ticket}")
diff --git a/2026-06-llmops-quickstart/resources/1_data_preprocessing_job.yml b/2026-06-llmops-quickstart/resources/1_data_preprocessing_job.yml
new file mode 100644
index 0000000..bcaadc0
--- /dev/null
+++ b/2026-06-llmops-quickstart/resources/1_data_preprocessing_job.yml
@@ -0,0 +1,31 @@
+resources:
+  jobs:
+    data_preprocessing_job:
+      name: "LLMOps Quickstart - 1. Data Preprocessing [${var.environment}]"
+      description: "Ingests sample support tickets into Unity Catalog."
+
+      environments:
+        - environment_key: default
+          spec:
+            environment_version: "5"
+
+      parameters:
+        - name: catalog_name
+          default: "${var.catalog_name}"
+        - name: schema_name
+          default: "${var.schema_name}"
+
+      tasks:
+        - task_key: data_ingestion
+          environment_key: default
+          notebook_task:
+            notebook_path: "../notebooks/1_data_preprocessing/data_ingestion.py"
+            base_parameters:
+              catalog_name: "{{job.parameters.catalog_name}}"
+              schema_name: "{{job.parameters.schema_name}}"
+
+      timeout_seconds: 1800
+      max_concurrent_runs: 1
+      tags:
+        Project: llmops-quickstart
+        Environment: "${var.environment}"
diff --git a/2026-06-llmops-quickstart/resources/2_1_model_build_evaluation_job.yml b/2026-06-llmops-quickstart/resources/2_1_model_build_evaluation_job.yml
new file mode 100644
index 0000000..ae6c9ee
--- /dev/null
+++ b/2026-06-llmops-quickstart/resources/2_1_model_build_evaluation_job.yml
@@ -0,0 +1,60 @@
+resources:
+  jobs:
+    model_build_evaluation_job:
+      name: "LLMOps Quickstart - 2.1 Model Build & Evaluation [${var.environment}]"
+      description: "Logs the agent to MLflow, evaluates it, and promotes to Champion if the accuracy threshold is met."
+
+      # databricks-agents, mlflow, databricks-sdk, and pydantic are all included in Env v5.
+      # Only databricks-openai needs to be added.
+      environments:
+        - environment_key: default
+          spec:
+            environment_version: "5"
+            dependencies:
+              - "databricks-openai"
+
+      parameters:
+        - name: catalog_name
+          default: "${var.catalog_name}"
+        - name: schema_name
+          default: "${var.schema_name}"
+        - name: model_name
+          default: "${var.model_name}"
+        - name: llm_endpoint
+          default: "${var.llm_endpoint}"
+        - name: experiment_name
+          default: "${resources.experiments.llmops_experiment.name}"
+        - name: accuracy_threshold
+          default: "0.8"
+
+      tasks:
+        - task_key: model_build
+          environment_key: default
+          notebook_task:
+            notebook_path: "../notebooks/2_model_build_and_deploy/model_build.py"
+            base_parameters:
+              catalog_name: "{{job.parameters.catalog_name}}"
+              schema_name: "{{job.parameters.schema_name}}"
+              model_name: "{{job.parameters.model_name}}"
+              llm_endpoint: "{{job.parameters.llm_endpoint}}"
+              experiment_name: "{{job.parameters.experiment_name}}"
+
+        - task_key: model_evaluation
+          depends_on:
+            - task_key: model_build
+          environment_key: default
+          notebook_task:
+            notebook_path: "../notebooks/2_model_build_and_deploy/model_evaluation.py"
+            base_parameters:
+              catalog_name: "{{job.parameters.catalog_name}}"
+              schema_name: "{{job.parameters.schema_name}}"
+              model_name: "{{job.parameters.model_name}}"
+              experiment_name: "{{job.parameters.experiment_name}}"
+              logged_run_id: "{{tasks.model_build.values.logged_run_id}}"
+              accuracy_threshold: "{{job.parameters.accuracy_threshold}}"
+
+      timeout_seconds: 3600
+      max_concurrent_runs: 1
+      tags:
+        Project: llmops-quickstart
+        Environment: "${var.environment}"
diff --git a/2026-06-llmops-quickstart/resources/2_2_model_deployment_job.yml b/2026-06-llmops-quickstart/resources/2_2_model_deployment_job.yml
new file mode 100644
index 0000000..04b77eb
--- /dev/null
+++ b/2026-06-llmops-quickstart/resources/2_2_model_deployment_job.yml
@@ -0,0 +1,39 @@
+resources:
+  jobs:
+    model_deployment_job:
+      name: "LLMOps Quickstart - 2.2 Model Deployment [${var.environment}]"
+      description: "Deploys the Champion model version to a Mosaic AI Model Serving endpoint."
+
+      environments:
+        - environment_key: default
+          spec:
+            environment_version: "5"
+            dependencies:
+              - "databricks-openai"
+
+      parameters:
+        - name: catalog_name
+          default: "${var.catalog_name}"
+        - name: schema_name
+          default: "${var.schema_name}"
+        - name: model_name
+          default: "${var.model_name}"
+        - name: experiment_name
+          default: "${resources.experiments.llmops_experiment.name}"
+
+      tasks:
+        - task_key: model_deployment
+          environment_key: default
+          notebook_task:
+            notebook_path: "../notebooks/2_model_build_and_deploy/model_deployment.py"
+            base_parameters:
+              catalog_name: "{{job.parameters.catalog_name}}"
+              schema_name: "{{job.parameters.schema_name}}"
+              model_name: "{{job.parameters.model_name}}"
+              experiment_name: "{{job.parameters.experiment_name}}"
+
+      timeout_seconds: 3600
+      max_concurrent_runs: 1
+      tags:
+        Project: llmops-quickstart
+        Environment: "${var.environment}"
diff --git a/2026-06-llmops-quickstart/resources/3_batch_inference_job.yml b/2026-06-llmops-quickstart/resources/3_batch_inference_job.yml
new file mode 100644
index 0000000..7e36aad
--- /dev/null
+++ b/2026-06-llmops-quickstart/resources/3_batch_inference_job.yml
@@ -0,0 +1,34 @@
+resources:
+  jobs:
+    batch_inference_job:
+      name: "LLMOps Quickstart - 3. Batch Inference [${var.environment}]"
+      description: "Runs the Champion model over all support tickets and writes predictions to a results table."
+
+      environments:
+        - environment_key: default
+          spec:
+            environment_version: "5"
+
+      parameters:
+        - name: catalog_name
+          default: "${var.catalog_name}"
+        - name: schema_name
+          default: "${var.schema_name}"
+        - name: model_name
+          default: "${var.model_name}"
+
+      tasks:
+        - task_key: batch_inference
+          environment_key: default
+          notebook_task:
+            notebook_path: "../notebooks/3_inference/batch_inference.py"
+            base_parameters:
+              catalog_name: "{{job.parameters.catalog_name}}"
+              schema_name: "{{job.parameters.schema_name}}"
+              model_name: "{{job.parameters.model_name}}"
+
+      timeout_seconds: 3600
+      max_concurrent_runs: 1
+      tags:
+        Project: llmops-quickstart
+        Environment: "${var.environment}"
diff --git a/2026-06-llmops-quickstart/resources/model_artifacts.yml b/2026-06-llmops-quickstart/resources/model_artifacts.yml
new file mode 100644
index 0000000..6da5652
--- /dev/null
+++ b/2026-06-llmops-quickstart/resources/model_artifacts.yml
@@ -0,0 +1,10 @@
+resources:
+  schemas:
+    llmops_schema:
+      catalog_name: "${var.catalog_name}"
+      name: "${var.schema_name}"
+      comment: "LLMOps Quickstart schema — support ticket classification"
+
+  experiments:
+    llmops_experiment:
+      name: "/Users/${workspace.current_user.userName}/${bundle.target}_${var.model_name}"
diff --git a/CODEOWNERS b/CODEOWNERS
index 3f80da0..8f79f78 100644
--- a/CODEOWNERS
+++ b/CODEOWNERS
@@ -38,3 +38,4 @@
 /2026-05-external-access-to-unity-catalog-managed-delta-tables/* @dipankarkush-db
 /2026-05-ai-functions-data-warehouse-use-cases/* @ismailmakhlouf-dbx @srikantdas11
 /2026-05-coding-agent-sandboxes/* @jlieow
+/2026-06-llmops-quickstart/* @CEDipEngineering