Discussion: Scope of contracts/ and plan.md for detailed engineering design #2630

bob798 · 2026-05-19T06:29:02Z

Context

We've been using spec-kit on a real subsystem: an LLM agent tool dispatcher with authz, rate-limit, approval-flow, idempotency, and audit requirements. The /specify → /clarify → /plan → /tasks flow worked well for user-story-driven features. During code review, however, we found 14 categories of non-functional concerns that had no clear home in spec-kit's artifacts. They ended up in a separate single-file engineering design document.

Before considering PRs, we'd like to understand maintainers' position on scope so we know whether to propose upstream additions or build externally.

What spec-kit covers well (verified by reading the templates)

- User stories with Given-When-Then acceptance: spec.md
- [NEEDS CLARIFICATION] flagging: /clarify flow
- Tech-stack selection with rationale: plan.md Technical Context
- Constitution Check gate: plan.md
- Library / dependency research: Phase 0 research.md
- Domain model: Phase 1 data-model.md
- Interface signatures: Phase 1 contracts/
- Phased tasks with [P] parallelism: tasks.md

This is genuinely a lot, and we relied on all of it.

What we couldn't place

Below are concrete content types from our real design document that did not fit any spec-kit template. Each has a one-line description of the failure mode if it's missing.

1. Error-code contract: a table of code × trigger × counts-toward-rate-limit × caller-visible message. Without it, each handler invents its own error structure and clients can't uniformly handle failures.
2. State machine: typically a stateDiagram-v2 with legal transitions and an explicit illegal-transition policy. Approval flows with TTL-based expiry are the common case.
3. Cross-cutting execution order: the precise order of authz, schema validation, rate-limit, approval gate, planning, and timeout-wrapped execution. Order is a contract; getting it wrong leaks "this resource exists" via schema errors before authz denies, or charges rate-limit budget against denied users.
4. Authorization invariants (pseudocode + invariants): tenants AND roles, no implicit super-admin bypass, enumerated deny_reason for forensics. Plain-English "the feature must check authz" routinely results in tenants OR roles in practice.
5. Failure-mode scheduling rules: timeout, approval persistence + TTL, idempotency-key derivation, per-run hard cap (calls / tokens / duration), cancel propagation across transport layers.
6. Trust labels and prompt-injection mitigation: trusted / partially_trusted / untrusted labeling for content flowing into LLM context, plus the sanitization pipeline (truncation → control-char escape → boundary tagging → system-prompt hardening).
7. Network egress policy: application-layer allowlist for outbound HTTP from tools / handlers.
8. Observability schema: OTel span hierarchy with required attributes, metric names with label sets, alert thresholds. We treat span names and metric names as part of the interface contract.
9. Audit event schema: JSON event with required fields (traceId, runId, callId, principal, result.code, result.durationMs) and retention policy.
10. Framework adapter pattern: isolating third-party framework types behind an adapter so major-version upgrades don't propagate through business code.
11. Alternatives considered: options rejected, with reasons. Code review repeatedly asks "why not X?"; writing it down once saves cycles.
12. Rollout / canary / rollback: milestones, feature flags, rollback paths with RTO targets.
13. Risk register: known risks with mitigations, owners, and trigger conditions.
14. Anti-pattern citations: pointers to specific lines in reference codebases that motivated each principle. (e.g., "rejected because reference framework Foo Bar.java:NN-NN scattered cross-cutting led to a tenant-leak incident".)
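
To make the execution-order and authorization-invariant items above concrete, here is a minimal sketch of what we mean by "order is a contract". It is illustrative only and not taken from our design document: the names `Principal`, `Tool`, `DenyReason`, `authorize`, and `dispatch`, the result codes, and the `query` argument are all hypothetical, and the approval gate, planning, and timeout-wrapped execution stages are left out.

```python
from dataclasses import dataclass
from enum import Enum


class DenyReason(Enum):
    # Enumerated deny reasons (hypothetical values) so audits can tell denials apart.
    TENANT_MISMATCH = "tenant_mismatch"
    ROLE_MISSING = "role_missing"


@dataclass(frozen=True)
class Principal:
    tenant: str
    roles: frozenset


@dataclass(frozen=True)
class Tool:
    name: str
    tenant: str
    required_role: str


def authorize(principal, tool):
    """Return None when allowed, otherwise an enumerated DenyReason.

    Both checks must pass (tenants AND roles); there is no super-admin bypass.
    """
    if principal.tenant != tool.tenant:
        return DenyReason.TENANT_MISMATCH
    if tool.required_role not in principal.roles:
        return DenyReason.ROLE_MISSING
    return None


def dispatch(principal, tool, args, rate_budget):
    """Run the early stages in a fixed order: authz, then schema, then rate-limit."""
    deny = authorize(principal, tool)
    if deny is not None:
        # Denied before schema validation runs and before any budget is charged.
        return {"code": "DENIED", "deny_reason": deny.value}
    # Stand-in for real schema validation (assumption: one string argument).
    if not isinstance(args.get("query"), str):
        return {"code": "INVALID_ARGS"}
    if rate_budget[principal.tenant] <= 0:
        return {"code": "RATE_LIMITED"}
    rate_budget[principal.tenant] -= 1
    return {"code": "OK"}
```

Because authz runs first, a caller from another tenant gets `DENIED` even when the arguments are malformed, so schema errors cannot reveal that the tool exists, and the tenant's rate-limit budget is left untouched.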

Where these landed in practice

We used contracts/ for the error-code contract (1) and the observability schema (8). Everything else went into a single sibling file, engineering-design.md, that sits between plan.md and tasks.md. Tasks in tasks.md reference section anchors in that file. This works, but it's outside spec-kit's standard layout.
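
For readers wondering what an error-code contract under contracts/ could look like in machine-checkable form, here is a rough sketch. Everything in it is an assumption made for illustration: the `ErrorContract` type, the four codes, their triggers, the rate-limit flags, and the messages are hypothetical and are not copied from our actual contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorContract:
    code: str
    trigger: str
    counts_toward_rate_limit: bool
    caller_message: str


# One row per error code: code x trigger x counts-toward-rate-limit x message.
ERROR_CONTRACTS = {
    c.code: c
    for c in (
        ErrorContract("DENIED", "authz check failed", False, "Not permitted."),
        ErrorContract("INVALID_ARGS", "schema validation failed", False, "Invalid arguments."),
        ErrorContract("RATE_LIMITED", "rate-limit budget exhausted", False, "Too many requests."),
        ErrorContract("TIMEOUT", "execution exceeded its deadline", True, "The tool timed out."),
    )
}


def to_caller_error(code):
    """Build the caller-visible error from the shared table.

    Raises KeyError for a code that is not in the contract.
    """
    contract = ERROR_CONTRACTS[code]
    return {"code": contract.code, "message": contract.caller_message}
```

Handlers that build errors through one shared table like this cannot drift into per-handler error shapes, which is the failure mode the contract is meant to prevent.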

Question

Are any of these in scope for upstream spec-kit? We see three possible answers:

1. In scope: happy to split into small, focused PRs:
   - Easiest first PRs (low controversy): new templates for alternatives.md, risks.md, an "Alternatives Considered" section in plan.md.
   - Medium: state-machine template, error-codes template (could land under contracts/).
   - Heavier: observability schema, rollout, anti-pattern citation discipline.
2. Partially in scope: we'd appreciate guidance on which subset to PR and which to leave external.
3. Out of scope: we'll publish complementary templates externally; happy to coordinate so users have a clear "use spec-kit for X, see kit-name for Y" story rather than fragmented advice.

What we've prepared

We've published a complementary repo with templates and a worked example for the categories above, anonymized to the LLM tool dispatcher domain:

Repo: https://github.com/bob798/engdesign-kit
Single-file template: templates/engineering-design.md
Worked example: examples/llm-tool-dispatcher/design.md
Integration playbook: playbooks/spec-kit-integration.md (positions our document between plan.md and tasks.md)
Why-this-exists doc: docs/why.md (the 14 categories enumerated above with failure modes)

The repo is positioned as a complement (Apache 2.0, same license as spec-kit, "we complement, we do not fork"). If maintainers signal that some of the categories are in-scope upstream, we'll migrate those over to PRs and link from our repo so users find the canonical location.

Why we're asking now

We'd rather contribute small focused PRs that you'd accept than maintain a parallel project forever. But before opening 6 PRs, we want to know which ones land in your "yes," "no," or "maybe" buckets.

Thanks for spec-kit — the underlying flow is exactly what we needed for everything that fit, and we're trying to extend the same discipline to what didn't fit.
