Commit 2c85d07

feat: add schema-compare command to test harness
Add a new `schema-compare` command that builds each model twice — once with COG_STATIC_SCHEMA=1 (Go tree-sitter) and once without (Python runtime) — then compares the resulting OpenAPI schemas for exact JSON equality. Differences are reported with a structured diff showing the exact JSON paths that diverge. Also add 7 local fixture models covering the full input type matrix: scalar types, optional types (PEP 604 + typing.Optional), list types, optional list types, constraints/choices, File/Path types, and structured BaseModel output.
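The structured diff described in the commit message can be pictured as a small recursive comparison. This is only a sketch, not the harness's actual code: the function name `diff_json`, the `$`-rooted path notation, and the `MISSING` sentinel are assumptions made for illustration. Treating `1` and `1.0` as different is also a choice made here to approximate "exact JSON equality".

```python
MISSING = object()  # hypothetical sentinel: key exists on one side only


def diff_json(static, runtime, path="$"):
    """Return (path, static_value, runtime_value) for every divergence."""
    if isinstance(static, dict) and isinstance(runtime, dict):
        diffs = []
        for key in sorted(set(static) | set(runtime)):
            child = f"{path}.{key}"
            if key not in static:
                diffs.append((child, MISSING, runtime[key]))
            elif key not in runtime:
                diffs.append((child, static[key], MISSING))
            else:
                diffs.extend(diff_json(static[key], runtime[key], child))
        return diffs
    if isinstance(static, list) and isinstance(runtime, list):
        diffs = []
        if len(static) != len(runtime):
            diffs.append((f"{path}.length", len(static), len(runtime)))
        for i, (a, b) in enumerate(zip(static, runtime)):
            diffs.extend(diff_json(a, b, f"{path}[{i}]"))
        return diffs
    # Scalars or mismatched container types: compare type as well as value,
    # so that 1 vs 1.0 (or 1 vs True) is reported as a difference.
    if type(static) is not type(runtime) or static != runtime:
        return [(path, static, runtime)]
    return []


# Made-up schema fragments, loosely modelled on a constrained int input.
static_schema = {"properties": {"top_k": {"type": "integer", "minimum": 1}}}
runtime_schema = {
    "properties": {"top_k": {"type": "integer", "minimum": 1.0, "default": 50}}
}
for path, _, _ in diff_json(static_schema, runtime_schema):
    print(path)
```

Returning a flat list of paths keeps the report easy to render in both console and JSON output, and an empty list means the two schemas matched.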
1 parent e5535bc commit 2c85d07

20 files changed

Lines changed: 706 additions & 27 deletions

tools/test-harness/README.md

Lines changed: 69 additions & 24 deletions
@@ -5,36 +5,43 @@ Designed to test any cog model from any repo.
 
 ## Quick Start
 
+All commands use `uv run` which automatically installs dependencies from
+`pyproject.toml` — no manual venv setup needed.
+
 ```bash
 cd tools/test-harness
 
-# Create a venv and install dependencies
-python3 -m venv .venv
-source .venv/bin/activate
-pip install pyyaml
-
 # List all models in the manifest
-python -m harness list
+uv run cog-test list
 
 # Run all non-GPU models
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu
 
 # Run a specific model
-python -m harness run --model hello-world
+uv run cog-test run --model hello-world
 
 # Run GPU models only (requires NVIDIA GPU + nvidia-docker)
-python -m harness run --gpu-only
+uv run cog-test run --gpu-only
 
 # Output JSON report
-python -m harness run --no-gpu --output json --output-file results/report.json
+uv run cog-test run --no-gpu --output json --output-file results/report.json
 
 # Build images only (no predictions)
-python -m harness build --no-gpu
+uv run cog-test build --no-gpu
+
+# Compare static (Go) vs runtime (Python) schema generation
+uv run cog-test schema-compare --no-gpu
+
+# Compare schemas for a specific fixture model
+uv run cog-test schema-compare --model fixture-scalar-types
+
+# Use a locally-built cog binary
+uv run cog-test schema-compare --no-gpu --cog-binary /path/to/cog
 ```
 
 ## Prerequisites
 
-- Python 3.10+
+- [uv](https://docs.astral.sh/uv/) (or Python 3.10+ with `pip install pyyaml`)
 - Docker
 - For GPU models: NVIDIA GPU + nvidia-docker runtime
 
@@ -47,19 +54,19 @@ skipping any alpha/beta/rc tags. You can override either via the CLI or in
 
 ```bash
 # Use the latest stable CLI + SDK (default)
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu
 
 # Pin a specific CLI version
-python -m harness run --cog-version v0.16.12 --no-gpu
+uv run cog-test run --cog-version v0.16.12 --no-gpu
 
 # Pin a specific SDK version
-python -m harness run --sdk-version 0.16.12 --no-gpu
+uv run cog-test run --sdk-version 0.16.12 --no-gpu
 
 # Use a pre-release CLI
-python -m harness run --cog-version v0.17.0-rc.2 --no-gpu
+uv run cog-test run --cog-version v0.17.0-rc.2 --no-gpu
 
 # Use a locally-built binary (overrides --cog-version)
-python -m harness run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
+uv run cog-test run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
 ```
 
 You can also pin versions in `manifest.yaml` under `defaults`:
@@ -156,12 +163,13 @@ No code changes required.
 ## CLI Reference
 
 ```
-usage: cog-test {run,build,list} [options]
+usage: cog-test {run,build,list,schema-compare} [options]
 
 Commands:
-  run       Build and test models (full pipeline)
-  build     Build Docker images only (no predictions)
-  list      List models defined in the manifest
+  run              Build and test models (full pipeline)
+  build            Build Docker images only (no predictions)
+  list             List models defined in the manifest
+  schema-compare   Compare static (Go) vs runtime (Python) schema generation
 
 Common options:
   --manifest PATH          Path to manifest.yaml
@@ -173,21 +181,58 @@ Common options:
   --cog-binary PATH        Path to local cog binary (overrides --cog-version)
   --keep-images            Don't clean up Docker images after run
 
-Run-specific options:
+Run/schema-compare options:
   --output {console,json}  Output format (default: console)
   --output-file PATH       Write report to file
 ```
 
+### Schema Comparison
+
+The `schema-compare` command builds each model **twice** — once with
+`COG_STATIC_SCHEMA=1` (Go tree-sitter parser) and once without (Python
+runtime schema generation) — then compares the resulting OpenAPI schemas
+for exact JSON equality. Any difference is reported as a failure with a
+structured diff showing the exact paths that diverge.
+
+This is useful for catching regressions when changing either the Go static
+schema generator (`pkg/schema/`) or the Python SDK schema generation
+(`python/cog/_adt.py`, `python/cog/_inspector.py`, `python/cog/_schemas.py`).
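The two-build flow in this README section can be sketched as a pair of build plans that differ only in their environment. This is a rough illustration rather than the code in `runner.py`: the helper name `schema_build_plans` and the `-static` / `-runtime` image tag suffixes are hypothetical, and the sketch only describes the commands instead of running them.

```python
import os


def schema_build_plans(model_name, cog_binary="cog", base_env=None):
    """Describe the two builds as (label, command, environment) tuples."""
    env = dict(os.environ if base_env is None else base_env)
    # Make sure an inherited setting cannot leak into the runtime build.
    env.pop("COG_STATIC_SCHEMA", None)
    return [
        # Static path: schema generated by the Go tree-sitter parser.
        ("static", [cog_binary, "build", "-t", f"{model_name}-static"],
         {**env, "COG_STATIC_SCHEMA": "1"}),
        # Runtime path: schema generated by the Python SDK.
        ("runtime", [cog_binary, "build", "-t", f"{model_name}-runtime"],
         dict(env)),
    ]


for label, command, env in schema_build_plans("fixture-scalar-types"):
    # A real harness would hand these to subprocess.run(command, env=env).
    print(label, " ".join(command), env.get("COG_STATIC_SCHEMA", "<unset>"))
```

Removing `COG_STATIC_SCHEMA` from the base environment first keeps the runtime build honest even when the variable is already exported in the calling shell.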
+
+### Local Fixture Models
+
+Models with `repo: local` are loaded from `fixtures/models/<path>/` instead
+of being cloned from GitHub. These are small predictors designed to cover
+the full input type matrix for schema comparison testing:
+
+| Fixture | What it covers |
+|---------|----------------|
+| `scalar-types` | str, int, float, bool, Secret |
+| `optional-types` | PEP 604 `X \| None` and `Optional[X]` for all types |
+| `list-types` | `list[X]` and `List[X]` for str, int, Path, File |
+| `optional-list-types` | `list[X] \| None` and `Optional[List[X]]` |
+| `constraints-and-choices` | ge/le constraints, string/int choices |
+| `file-path-types` | Path, File, optional Path/File |
+| `complex-output` | BaseModel structured output |
+
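The `repo: local` rule described in this README section amounts to a simple path lookup. The sketch below assumes a manifest entry is a dict with `repo` and `path` keys; the helper name `local_model_dir` and the example entries are hypothetical and are not taken from the harness source.

```python
from pathlib import PurePosixPath

# Relative to tools/test-harness/, per the README text.
FIXTURE_MODELS_DIR = PurePosixPath("fixtures/models")


def local_model_dir(entry):
    """Return the fixture directory for a `repo: local` entry, else None."""
    if entry.get("repo") != "local":
        return None  # non-local models are cloned from GitHub instead
    return FIXTURE_MODELS_DIR / entry["path"]


# Illustrative manifest entries; the non-local repo name is a placeholder.
print(local_model_dir({"repo": "local", "path": "scalar-types"}))
print(local_model_dir({"repo": "example-org/example-repo", "path": "demo"}))
```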
 ## Architecture
 
 ```
 tools/test-harness/
 ├── manifest.yaml            # Declarative test definitions
-├── fixtures/                # Test input files (images, etc.)
+├── fixtures/
+│   ├── *.png                # Test input files (images, etc.)
+│   └── models/              # Local fixture models for schema comparison
+│       ├── scalar-types/
+│       ├── optional-types/
+│       ├── list-types/
+│       ├── optional-list-types/
+│       ├── constraints-and-choices/
+│       ├── file-path-types/
+│       └── complex-output/
 ├── harness/
 │   ├── cli.py               # CLI entry point
 │   ├── cog_resolver.py      # Resolves + downloads cog CLI and SDK versions
-│   ├── runner.py            # Clone -> patch -> build -> predict -> validate
+│   ├── runner.py            # Clone -> patch -> build -> predict -> validate + schema compare
 │   ├── patcher.py           # Patches cog.yaml with sdk_version + overrides
 │   ├── validators.py        # Output validation strategies
 │   └── report.py            # Console + JSON report generation
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+from cog import BaseModel, BasePredictor, Input
+
+
+class Output(BaseModel):
+    text: str
+    score: float
+    tags: list[str]
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        prompt: str = Input(description="Input prompt"),
+    ) -> Output:
+        return Output(text=f"generated: {prompt}", score=0.95, tags=["a", "b"])
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+from cog import BasePredictor, Input
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        prompt: str = Input(description="The prompt", default="hello"),
+        temperature: float = Input(
+            description="Sampling temperature", ge=0.0, le=2.0, default=0.7
+        ),
+        top_k: int = Input(description="Top-K", ge=1, le=100, default=50),
+        mode: str = Input(
+            description="Quality mode",
+            choices=["fast", "balanced", "quality"],
+            default="balanced",
+        ),
+        style: int = Input(
+            description="Style preset",
+            choices=[1, 2, 3],
+            default=1,
+        ),
+    ) -> str:
+        return f"{prompt}-{temperature}-{top_k}-{mode}-{style}"
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        image: Path = Input(description="An image path"),
+        document: File = Input(description="A file upload"),
+        # Optional variants
+        mask: Path | None = Input(description="Optional mask path", default=None),
+        attachment: File | None = Input(description="Optional file", default=None),
+    ) -> str:
+        return f"image={image} mask={mask}"
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+from typing import List
+
+from cog import BasePredictor, File, Input, Path
+
+
+class Predictor(BasePredictor):
+    def predict(
+        self,
+        tags: list[str] = Input(description="List of strings"),
+        numbers: List[int] = Input(description="List of ints"),
+        paths: list[Path] = Input(description="List of paths"),
+        files: list[File] = Input(description="List of files"),
+    ) -> str:
+        return f"tags={len(tags)} numbers={len(numbers)}"
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
