Merged
13 changes: 7 additions & 6 deletions .github/workflows/monthly-vision-eval.yml
@@ -12,18 +12,18 @@
#
# Why monthly not weekly: vision scoring costs real money
# (Anthropic vision API, billed per image input token). At Haiku
- # defaults ~$0.001/image; 600 text images + 40 image-gen images
- # ~= $0.60-$1.00/run. Monthly cadence keeps the annual bill
+ # defaults ~$0.001/image; 300 text images + 40 image-gen images
+ # ~= $0.30-$0.40/run. Monthly cadence keeps the annual bill
# bounded and still catches drift well ahead of any six-month
# signal loss.
#
- # Budget breakdown:
- # - Text gen (600 samples): free (CF OSS)
- # - Text vision (600 images): ~$0.60
+ # Budget breakdown (5 models, n=30, raw + compiled = 300 samples):
+ # - Text gen (300 samples): free (CF OSS)
+ # - Text vision (300 images): ~$0.30
# - Image gen (40 images): free (CF free tier, <10k
# neurons/day)
# - Image vision (40 images): ~$0.04
- # Total: ~$0.60-$1 / month
+ # Total: ~$0.30-$0.40 / month
#
# Secrets required on the repo:
# - CF_API_TOKEN + CF_ACCOUNT_ID: text generation via CF Workers AI
@@ -105,6 +105,7 @@ jobs:
--brief briefs/landing.yml \
--models "$MODELS" \
--n "$N" \
+ --sample-concurrency 6 \
--out evals \
--report "docs/evals/monthly/${REPORT_DATE}-source.md"
echo "REPORT_DATE=$REPORT_DATE" >> "$GITHUB_ENV"
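The revised figures in this workflow's budget comment follow from 5 models at n=30 under 2 conditions (raw + compiled), giving 300 text images, plus 40 image-gen images, at roughly $0.001 per image. A minimal sketch of that arithmetic; the `EvalShape` type and both function names are hypothetical and not part of the repo:

```typescript
// Hypothetical helper reproducing the budget arithmetic from the workflow comment.
// The per-image price is the "~$0.001/image" Haiku default quoted there.
const PRICE_PER_IMAGE_USD = 0.001;

interface EvalShape {
  models: number; // number of text models evaluated
  n: number; // samples per model per condition
  conditions: number; // raw + compiled = 2
  imageGenImages: number; // images from the image-gen leg
}

// Text samples (and therefore text images sent to vision scoring).
function textSamples(shape: EvalShape): number {
  return shape.models * shape.n * shape.conditions;
}

// Vision scoring is the only billed step; generation is on the free tier.
function visionCostUsd(shape: EvalShape): number {
  return (textSamples(shape) + shape.imageGenImages) * PRICE_PER_IMAGE_USD;
}

const monthly: EvalShape = { models: 5, n: 30, conditions: 2, imageGenImages: 40 };
console.log(textSamples(monthly)); // 300
console.log(visionCostUsd(monthly).toFixed(2)); // "0.34"
```

The result of about $0.34 sits inside the $0.30-$0.40 range stated in the comment.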
3 changes: 2 additions & 1 deletion .github/workflows/weekly-eval.yml
@@ -67,7 +67,7 @@ jobs:
CF_API_TOKEN: ${{ secrets.CF_API_TOKEN }}
CF_ACCOUNT_ID: ${{ secrets.CF_ACCOUNT_ID }}
run: |
- # Five OSS cells at n=30 = 300 raw + 300 compiled = 600 calls.
+ # Five OSS models at n=30 = 150 raw + 150 compiled = 300 calls.
# Well within CF Workers AI free tier (10k neurons/day). If the
# model list needs updating, bump the comma list below; the
# workflow is intentionally explicit rather than auto-discovered
@@ -81,6 +81,7 @@ jobs:
--brief briefs/landing.yml \
--models cf:@cf/google/gemma-4-26b-a4b-it,cf:@cf/meta/llama-4-scout-17b-16e-instruct,cf:@cf/mistralai/mistral-small-3.1-24b-instruct,cf:@cf/openai/gpt-oss-120b,cf:@cf/qwen/qwen3-30b-a3b-fp8 \
--n 30 \
+ --sample-concurrency 6 \
--out "$OUT_DIR" \
--report "$REPORT_PATH"
echo "REPORT_PATH=$REPORT_PATH" >> "$GITHUB_ENV"
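Both workflows now pass `--sample-concurrency 6`, but the diff does not show how the CLI applies that limit. One common way to cap in-flight requests is a fixed set of worker "lanes" pulling from a shared index. The sketch below illustrates that pattern only; `mapWithConcurrency` is a hypothetical name, not the repo's implementation:

```typescript
// Hypothetical sketch: run `worker` over `items` with at most `limit` calls in flight.
// Results keep the input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor; safe because JS runs callbacks on one thread

  // Each lane claims the next unprocessed index until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // Never start fewer than one lane or more lanes than there are items.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With 300 samples and a limit of 6, a pool like this keeps six requests open at a time instead of running them serially. A rejected `worker` call rejects the whole run here; a real eval would more likely catch per-sample errors inside `worker`.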
377 changes: 377 additions & 0 deletions docs/artwork/stickers/02-rule-final.svg
Binary file added docs/artwork/stickers/02-rule-print.pdf
Binary file not shown.
Binary file added docs/artwork/stickers/02-rule-print.png
2,389 changes: 2,389 additions & 0 deletions docs/artwork/stickers/02-rule-print.svg
379 changes: 379 additions & 0 deletions docs/artwork/stickers/03-manifest-final.svg
Binary file added docs/artwork/stickers/03-manifest-print.pdf
Binary file not shown.
Binary file added docs/artwork/stickers/03-manifest-print.png
2,443 changes: 2,443 additions & 0 deletions docs/artwork/stickers/03-manifest-print.svg
42 changes: 42 additions & 0 deletions docs/artwork/stickers/README.md
@@ -0,0 +1,42 @@
# Stickers

Conference-distribution stickers for AHD. Two directions, both 3" × 3" die-cut matte vinyl, pure black on natural cream.

## Files

| File | Purpose |
| --- | --- |
| `02-rule-final.svg` | Editable source for *The Rule*. Live `<text>` elements, font references intact. |
| `02-rule-print.svg` | Production. Text outlined to paths. No font dependency. |
| `02-rule-print.pdf` | Production. Upload to printer. Vector, 3" page. |
| `02-rule-print.png` | Preview only. 900 × 900 px (= 3" at 300 DPI). |
| `03-manifest-final.svg` | Editable source for *The Manifest*. |
| `03-manifest-print.svg` | Production. Text outlined. |
| `03-manifest-print.pdf` | Production. Upload to printer. |
| `03-manifest-print.png` | Preview only. |

QR codes encode `https://ahd.adastra.computer` and have been scan-verified.

## Regenerating from sources

If you edit either `-final.svg`, regenerate the `-print` outputs with Inkscape under nix-shell so fonts resolve correctly:

```sh
nix-shell -E 'with import <nixpkgs> {}; let
  fontsConf = makeFontsConf { fontDirectories = [ inter jetbrains-mono ]; };
in mkShell {
  buildInputs = [ inkscape fontconfig ];
  shellHook = "export FONTCONFIG_FILE=" + fontsConf;
}' --run '
  for stem in 02-rule-final 03-manifest-final; do
    out="${stem%-final}-print"
    inkscape "${stem}.svg" --export-text-to-path --export-plain-svg --export-type=svg --export-filename="${out}.svg"
    inkscape "${out}.svg" --export-type=pdf --export-filename="${out}.pdf"
    inkscape "${out}.svg" --export-type=png --export-dpi=300 --export-filename="${out}.png"
  done
'
```

## Font note

The editable sources reference Inter (SIL OFL) and JetBrains Mono (Apache 2.0). Both are free for commercial print. The live AHD site uses Neue Haas Grotesk (Linotype, paid); if you have a license, swap `font-family` in the `-final.svg` files before regenerating.
48 changes: 34 additions & 14 deletions src/eval/runners/openai.ts
@@ -19,9 +19,20 @@ export function openaiRunner(options: {
* fields to the underlying model as extra generation params.
*/
extraBody?: Record<string, unknown>;
+ /**
+  * Per-request wall-clock cap in milliseconds. A request that hasn't
+  * resolved by this point is aborted so it surfaces as a caught error
+  * (the caller writes a `.error.txt` and moves on) rather than hanging
+  * the whole run. Without it a single stalled upstream connection
+  * blocks a serial eval forever: the failure mode that silently ate
+  * the full 60-minute CI ceiling. Defaults to 120s, comfortably above
+  * the slowest legitimate generation (~25-30s) observed on CF.
+  */
+ timeoutMs?: number;
}): ModelRunner {
const model = options.model ?? "gpt-5";
const baseURL = options.baseURL ?? "https://api.openai.com/v1";
+ const timeoutMs = options.timeoutMs ?? 120_000;
return {
id: model,
provider: "openai",
@@ -31,20 +42,29 @@ export function openaiRunner(options: {
if (input.systemPrompt)
messages.push({ role: "system", content: input.systemPrompt });
messages.push({ role: "user", content: input.userPrompt });
- const res = await fetch(`${baseURL}/chat/completions`, {
-   method: "POST",
-   headers: {
-     Authorization: `Bearer ${options.apiKey}`,
-     "content-type": "application/json",
-   },
-   body: JSON.stringify({
-     model,
-     messages,
-     max_completion_tokens: input.maxTokens ?? 4096,
-     seed: input.seed,
-     ...(options.extraBody ?? {}),
-   }),
- });
+ let res: Response;
+ try {
+   res = await fetch(`${baseURL}/chat/completions`, {
+     method: "POST",
+     headers: {
+       Authorization: `Bearer ${options.apiKey}`,
+       "content-type": "application/json",
+     },
+     body: JSON.stringify({
+       model,
+       messages,
+       max_completion_tokens: input.maxTokens ?? 4096,
+       seed: input.seed,
+       ...(options.extraBody ?? {}),
+     }),
+     signal: AbortSignal.timeout(timeoutMs),
+   });
+ } catch (err) {
+   if (err instanceof Error && err.name === "TimeoutError") {
+     throw new Error(`openai ${model}: request timed out after ${timeoutMs}ms`);
+   }
+   throw err;
+ }
if (!res.ok) {
throw new Error(`openai ${model}: ${res.status} ${await res.text()}`);
}
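The timeout handling in this hunk can be read as two steps: attach `AbortSignal.timeout(ms)` to the request, then translate the resulting `TimeoutError` into a labelled message while rethrowing every other error untouched. A standalone sketch of that pattern, assuming a runtime with global `fetch` and `AbortSignal.timeout`; `toTimeoutError` and `fetchWithTimeout` are hypothetical names, not functions in the repo:

```typescript
// Hypothetical helper: map an AbortSignal.timeout abort to a labelled error.
// Any other error is returned unchanged so the caller can rethrow it as-is.
function toTimeoutError(err: unknown, label: string, timeoutMs: number): unknown {
  if (err instanceof Error && err.name === "TimeoutError") {
    return new Error(`${label}: request timed out after ${timeoutMs}ms`);
  }
  return err;
}

// Hypothetical wrapper: a fetch that gives up after `timeoutMs` of wall-clock time.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  label: string,
): Promise<Response> {
  try {
    // The signal aborts the request once the deadline passes.
    return await fetch(url, { ...init, signal: AbortSignal.timeout(timeoutMs) });
  } catch (err) {
    throw toTimeoutError(err, label, timeoutMs);
  }
}
```

Keeping the translation in its own function makes the error-message path checkable without a network call, and leaves non-timeout failures (DNS errors, connection resets) with their original stack and message.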
2 changes: 1 addition & 1 deletion src/eval/runners/workers-ai.ts
@@ -8,7 +8,7 @@ export const WORKERS_AI_DEFAULTS = [
"@cf/qwen/qwq-32b",
"@cf/qwen/qwen2.5-coder-32b-instruct",
"@cf/mistralai/mistral-small-3.1-24b-instruct",
- "@cf/google/gemma-3-12b-it",
+ "@cf/google/gemma-4-26b-a4b-it",
] as const;

// Model-family-specific generation knobs. Cloudflare Workers AI's