feat: add schema-compare command to test harness

markphelps · markphelps · commit 2c85d079a17b · 2026-03-30T13:32:07.000-04:00

Add a new `schema-compare` command that builds each model twice — once
with COG_STATIC_SCHEMA=1 (Go tree-sitter) and once without (Python
runtime) — then compares the resulting OpenAPI schemas for exact JSON
equality. Differences are reported with a structured diff showing the
exact JSON paths that diverge.

Also add 7 local fixture models covering the full input type matrix:
scalar types, optional types (PEP 604 + typing.Optional), list types,
optional list types, constraints/choices, File/Path types, and
structured BaseModel output.
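
The comparison step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the harness's actual implementation (the `runner.py` changes are not shown in this diff). The function name `diff_json`, the `$`-rooted path notation, and the two sample schemas are hypothetical. The sketch treats `5` and `5.0` as different, since exact JSON equality would presumably distinguish them.

```python
import json
from typing import Any


def diff_json(static: Any, runtime: Any, path: str = "$") -> list[str]:
    """Return one message per JSON path where the two documents diverge."""
    if isinstance(static, dict) and isinstance(runtime, dict):
        diffs: list[str] = []
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(static.keys() | runtime.keys()):
            child = f"{path}.{key}"
            if key not in static:
                diffs.append(f"{child}: only in runtime schema")
            elif key not in runtime:
                diffs.append(f"{child}: only in static schema")
            else:
                diffs.extend(diff_json(static[key], runtime[key], child))
        return diffs
    if isinstance(static, list) and isinstance(runtime, list):
        diffs = []
        if len(static) != len(runtime):
            diffs.append(f"{path}: list length {len(static)} != {len(runtime)}")
        # Compare the overlapping prefix element by element.
        for i, (a, b) in enumerate(zip(static, runtime)):
            diffs.extend(diff_json(a, b, f"{path}[{i}]"))
        return diffs
    # Leaves (or mismatched container types): require equal value AND type,
    # because Python considers 5 == 5.0 and True == 1.
    if static != runtime or type(static) is not type(runtime):
        return [f"{path}: static={json.dumps(static)} runtime={json.dumps(runtime)}"]
    return []


# Hypothetical schema fragments, for illustration only.
static_schema = {"properties": {"count": {"type": "integer", "default": 5}}}
runtime_schema = {
    "properties": {
        "count": {"type": "integer", "default": 5.0},
        "text": {"type": "string"},
    }
}

for line in diff_json(static_schema, runtime_schema):
    print(line)
# $.properties.count.default: static=5 runtime=5.0
# $.properties.text: only in runtime schema
```

An empty result means the two schemas are identical. Any non-empty result would correspond to a failure in the report.
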
diff --git a/tools/test-harness/README.md b/tools/test-harness/README.md
@@ -5,36 +5,43 @@ Designed to test any cog model from any repo.
 
 ## Quick Start
 
+All commands use `uv run` which automatically installs dependencies from
+`pyproject.toml` — no manual venv setup needed.
+
 ```bash
 cd tools/test-harness
 
-# Create a venv and install dependencies
-python3 -m venv .venv
-source .venv/bin/activate
-pip install pyyaml
-
 # List all models in the manifest
-python -m harness list
+uv run cog-test list
 
 # Run all non-GPU models
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu
 
 # Run a specific model
-python -m harness run --model hello-world
+uv run cog-test run --model hello-world
 
 # Run GPU models only (requires NVIDIA GPU + nvidia-docker)
-python -m harness run --gpu-only
+uv run cog-test run --gpu-only
 
 # Output JSON report
-python -m harness run --no-gpu --output json --output-file results/report.json
+uv run cog-test run --no-gpu --output json --output-file results/report.json
 
 # Build images only (no predictions)
-python -m harness build --no-gpu
+uv run cog-test build --no-gpu
+
+# Compare static (Go) vs runtime (Python) schema generation
+uv run cog-test schema-compare --no-gpu
+
+# Compare schemas for a specific fixture model
+uv run cog-test schema-compare --model fixture-scalar-types
+
+# Use a locally-built cog binary
+uv run cog-test schema-compare --no-gpu --cog-binary /path/to/cog
 ```
 
 ## Prerequisites
 
-- Python 3.10+
+- [uv](https://docs.astral.sh/uv/) (or Python 3.10+ with `pip install pyyaml`)
 - Docker
 - For GPU models: NVIDIA GPU + nvidia-docker runtime
 
@@ -47,19 +54,19 @@ skipping any alpha/beta/rc tags. You can override either via the CLI or in
 
 ```bash
 # Use the latest stable CLI + SDK (default)
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu
 
 # Pin a specific CLI version
-python -m harness run --cog-version v0.16.12 --no-gpu
+uv run cog-test run --cog-version v0.16.12 --no-gpu
 
 # Pin a specific SDK version
-python -m harness run --sdk-version 0.16.12 --no-gpu
+uv run cog-test run --sdk-version 0.16.12 --no-gpu
 
 # Use a pre-release CLI
-python -m harness run --cog-version v0.17.0-rc.2 --no-gpu
+uv run cog-test run --cog-version v0.17.0-rc.2 --no-gpu
 
 # Use a locally-built binary (overrides --cog-version)
-python -m harness run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
+uv run cog-test run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
 ```
 
 You can also pin versions in `manifest.yaml` under `defaults`:
@@ -156,12 +163,13 @@ No code changes required.
 ## CLI Reference
 
 ```
-usage: cog-test {run,build,list} [options]
+usage: cog-test {run,build,list,schema-compare} [options]
 
 Commands:
-  run     Build and test models (full pipeline)
-  build   Build Docker images only (no predictions)
-  list    List models defined in the manifest
+  run              Build and test models (full pipeline)
+  build            Build Docker images only (no predictions)
+  list             List models defined in the manifest
+  schema-compare   Compare static (Go) vs runtime (Python) schema generation
 
 Common options:
   --manifest PATH       Path to manifest.yaml
@@ -173,21 +181,58 @@ Common options:
   --cog-binary PATH     Path to local cog binary (overrides --cog-version)
   --keep-images         Don't clean up Docker images after run
 
-Run-specific options:
+Run/schema-compare options:
   --output {console,json}   Output format (default: console)
   --output-file PATH        Write report to file
 ```
 
+### Schema Comparison
+
+The `schema-compare` command builds each model **twice** — once with
+`COG_STATIC_SCHEMA=1` (Go tree-sitter parser) and once without (Python
+runtime schema generation) — then compares the resulting OpenAPI schemas
+for exact JSON equality. Any difference is reported as a failure with a
+structured diff showing the exact paths that diverge.
+
+This is useful for catching regressions when changing either the Go static
+schema generator (`pkg/schema/`) or the Python SDK schema generation
+(`python/cog/_adt.py`, `python/cog/_inspector.py`, `python/cog/_schemas.py`).
+
+### Local Fixture Models
+
+Models with `repo: local` are loaded from `fixtures/models/<path>/` instead
+of being cloned from GitHub. These are small predictors designed to cover
+the full input type matrix for schema comparison testing:
+
+| Fixture | What it covers |
+|---------|----------------|
+| `scalar-types` | str, int, float, bool, Secret |
+| `optional-types` | PEP 604 `X \| None` and `Optional[X]` for all types |
+| `list-types` | `list[X]` and `List[X]` for str, int, Path, File |
+| `optional-list-types` | `list[X] \| None` and `Optional[List[X]]` |
+| `constraints-and-choices` | ge/le constraints, string/int choices |
+| `file-path-types` | Path, File, optional Path/File |
+| `complex-output` | BaseModel structured output |
+
 ## Architecture
 
 ```
 tools/test-harness/
 ├── manifest.yaml           # Declarative test definitions
-├── fixtures/               # Test input files (images, etc.)
+├── fixtures/
+│   ├── *.png               # Test input files (images, etc.)
+│   └── models/             # Local fixture models for schema comparison
+│       ├── scalar-types/
+│       ├── optional-types/
+│       ├── list-types/
+│       ├── optional-list-types/
+│       ├── constraints-and-choices/
+│       ├── file-path-types/
+│       └── complex-output/
 ├── harness/
 │   ├── cli.py              # CLI entry point
 │   ├── cog_resolver.py     # Resolves + downloads cog CLI and SDK versions
-│   ├── runner.py           # Clone -> patch -> build -> predict -> validate
+│   ├── runner.py           # Clone -> patch -> build -> predict -> validate + schema compare
 │   ├── patcher.py          # Patches cog.yaml with sdk_version + overrides
 │   ├── validators.py       # Output validation strategies
 │   └── report.py           # Console + JSON report generation
diff --git a/tools/test-harness/fixtures/models/complex-output/cog.yaml b/tools/test-harness/fixtures/models/complex-output/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/complex-output/predict.py b/tools/test-harness/fixtures/models/complex-output/predict.py
@@ -0,0 +1,15 @@
+from cog import BaseModel, BasePredictor, Input
+
+
+class Output(BaseModel):
+    text: str
+    score: float
+    tags: list[str]
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        prompt: str = Input(description="Input prompt"),
+    ) -> Output:
+        return Output(text=f"generated: {prompt}", score=0.95, tags=["a", "b"])
diff --git a/tools/test-harness/fixtures/models/constraints-and-choices/cog.yaml b/tools/test-harness/fixtures/models/constraints-and-choices/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/constraints-and-choices/predict.py b/tools/test-harness/fixtures/models/constraints-and-choices/predict.py
@@ -0,0 +1,23 @@
+from cog import BasePredictor, Input
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        prompt: str = Input(description="The prompt", default="hello"),
+        temperature: float = Input(
+            description="Sampling temperature", ge=0.0, le=2.0, default=0.7
+        ),
+        top_k: int = Input(description="Top-K", ge=1, le=100, default=50),
+        mode: str = Input(
+            description="Quality mode",
+            choices=["fast", "balanced", "quality"],
+            default="balanced",
+        ),
+        style: int = Input(
+            description="Style preset",
+            choices=[1, 2, 3],
+            default=1,
+        ),
+    ) -> str:
+        return f"{prompt}-{temperature}-{top_k}-{mode}-{style}"
diff --git a/tools/test-harness/fixtures/models/file-path-types/cog.yaml b/tools/test-harness/fixtures/models/file-path-types/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/file-path-types/predict.py b/tools/test-harness/fixtures/models/file-path-types/predict.py
@@ -0,0 +1,13 @@
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        image: Path = Input(description="An image path"),
+        document: File = Input(description="A file upload"),
+        # Optional variants
+        mask: Path | None = Input(description="Optional mask path", default=None),
+        attachment: File | None = Input(description="Optional file", default=None),
+    ) -> str:
+        return f"image={image} mask={mask}"
diff --git a/tools/test-harness/fixtures/models/list-types/cog.yaml b/tools/test-harness/fixtures/models/list-types/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/list-types/predict.py b/tools/test-harness/fixtures/models/list-types/predict.py
@@ -0,0 +1,14 @@
+from typing import List
+
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        tags: list[str] = Input(description="List of strings"),
+        numbers: List[int] = Input(description="List of ints"),
+        paths: list[Path] = Input(description="List of paths"),
+        files: list[File] = Input(description="List of files"),
+    ) -> str:
+        return f"tags={len(tags)} numbers={len(numbers)}"
diff --git a/tools/test-harness/fixtures/models/optional-list-types/cog.yaml b/tools/test-harness/fixtures/models/optional-list-types/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/optional-list-types/predict.py b/tools/test-harness/fixtures/models/optional-list-types/predict.py
@@ -0,0 +1,25 @@
+from typing import List, Optional
+
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        text: str = Input(description="Required anchor field"),
+        # PEP 604 optional lists
+        opt_tags: list[str] | None = Input(
+            description="Optional list of strings", default=None
+        ),
+        opt_paths: list[Path] | None = Input(
+            description="Optional list of paths", default=None
+        ),
+        opt_files: list[File] | None = Input(
+            description="Optional list of files", default=None
+        ),
+        # typing.Optional style
+        opt_ints: Optional[List[int]] = Input(
+            description="Optional list of ints", default=None
+        ),
+    ) -> str:
+        return f"{text}"
diff --git a/tools/test-harness/fixtures/models/optional-types/cog.yaml b/tools/test-harness/fixtures/models/optional-types/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/optional-types/predict.py b/tools/test-harness/fixtures/models/optional-types/predict.py
@@ -0,0 +1,19 @@
+from typing import Optional
+
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        text: str = Input(description="Required string"),
+        # PEP 604 style optionals
+        opt_str: str | None = Input(description="Optional string", default=None),
+        opt_int: int | None = Input(description="Optional int", default=None),
+        opt_float: float | None = Input(description="Optional float", default=None),
+        opt_bool: bool | None = Input(description="Optional bool", default=None),
+        # typing.Optional style
+        opt_path: Optional[Path] = Input(description="Optional path", default=None),
+        opt_file: Optional[File] = Input(description="Optional file", default=None),
+    ) -> str:
+        return f"{text}"
diff --git a/tools/test-harness/fixtures/models/scalar-types/cog.yaml b/tools/test-harness/fixtures/models/scalar-types/cog.yaml
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
diff --git a/tools/test-harness/fixtures/models/scalar-types/predict.py b/tools/test-harness/fixtures/models/scalar-types/predict.py
@@ -0,0 +1,13 @@
+from cog import BasePredictor, Input, Secret
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        text: str = Input(description="A string input"),
+        count: int = Input(description="An integer", default=5),
+        temperature: float = Input(description="A float", default=0.7),
+        flag: bool = Input(description="A boolean", default=True),
+        api_key: Secret = Input(description="A secret key"),
+    ) -> str:
+        return f"{text}-{count}-{temperature}-{flag}"
diff --git a/tools/test-harness/harness/cli.py b/tools/test-harness/harness/cli.py
diff --git a/tools/test-harness/harness/report.py b/tools/test-harness/harness/report.py
diff --git a/tools/test-harness/harness/runner.py b/tools/test-harness/harness/runner.py
diff --git a/tools/test-harness/manifest.yaml b/tools/test-harness/manifest.yaml
diff --git a/tools/test-harness/pyproject.toml b/tools/test-harness/pyproject.toml
