fix(data): make data the sole writer of repos.yaml to end promotion conflicts#3399
fix(data): make data the sole writer of repos.yaml to end promotion conflicts#3399marcusrbrown wants to merge 8 commits into
Conversation
… private slug resolution Replace the GraphQL private-slug resolution (which failed closed on deleted/lost-access repos and blocked every promotion) with a delta-based public-allowlist model: flag a wiki page only when it is new in the promotion and maps to neither a known-public entry (private === false, via computeRepoSlug) nor a page already on main. Removes all GraphQL machinery. merge-data no longer copies main's repos.yaml over data's; the gate reads data's own copy and grandfathers pages already public on main.
A data->main promotion PR that opens already conflicted previously reported success, letting divergence accumulate silently. Detect a dirty mergeable_state after the PR is found-or-created, open a deduplicated alert issue, and exit non-zero so the workflow surfaces it.
…data promotion metadata/repos.yaml is data-authoritative; a change to it on a PR whose head is not data is a both-sides mutation that conflicts the weekly promotion. Block it even for fro-bot identities, while leaving other metadata files and the data promotion path unaffected.
…y boundary State that repos.yaml is written only on data and never edited on main outside the promotion PR, that redacted private entries belong on main and must not be deleted as hygiene, and the residual-risk boundary for a bare node_id on a public branch.
…ord-presence brainstorm Capture the sole-writer requirements + implementation plan for the repos.yaml conflict-class fix (all units complete), plus the carried-over discord-presence requirements doc.
…ic-slug matches Grandfather wiki pages by content hash rather than filename: an unchanged page already on main still passes (preserving legacy lost-access entries), but a modified page whose entry is not explicitly public is blocked. Build a slug->entry map so a stem shared by two public entries fails closed, and require a page matching a public slug to carry that repo's URL before it is admitted. Cover the public-allowlist predicate and the fail-closed missing-dir branch directly.
Separate a retryable mergeable-state fetch failure (signal unavailable, exit non-zero) from a fetched 'unknown' (GitHub still computing, no false alarm). Keep the born-conflicted signal when conflict-alert creation fails on a retryable error so the next run retries instead of crashing, and rethrow non-retryable alert failures. Cover both paths.
Fetch changed files via the paginated pulls API so a repos.yaml change beyond the first page can no longer slip past the guard.
fro-bot
left a comment
There was a problem hiding this comment.
The structural move here is the right one: collapsing repos.yaml to a single writer eliminates the both-sides mutation at its root rather than patching the conflict-resolution recipe forever. You're not teaching the promotion to survive collisions — you're removing the collision's reason to exist. That's the correct altitude for this fix.
The privacy-gate rewrite is the load-bearing change and it lands cleanly. Inverting from "resolve private slugs via GraphQL" to "promote only what an explicit public allowlist or content-identity grandfather admits" trades a brittle fail-on-everything network dependency for a pure function. detectPrivateWikiLeaks is deterministic, fails closed on absent private, and the content-hash grandfathering is exactly how you keep marcusrbrown--copiloting alive without re-admitting modified private pages. Test coverage is dense across all four decision branches — ambiguous slug, content-spoof miss, hash-match pass, hash-drift block.
The born-conflicted detection is also well-shaped: distinguishing a successfully-fetched mergeable_state: 'unknown' (GitHub still computing — not an alarm) from an 'unavailable' fetch failure (can't determine — exit non-zero) is the kind of distinction that prevents both false backlogs and silent drift. The dedup-by-prefix alert and retryable-error tolerance on alert creation are correct.
Verdict: PASS
Blocking issues
None.
Non-blocking concerns
-
scripts/check-wiki-authority.ts:59—headRefis attacker-influenced for fork PRs. The newinput.headRef !== 'data'bypass only triggers inside theFROBOT_AUTHORSbranch, and the workflow checks outgithub.head_ref, so a fork could in principle name its branchdata. In practice a fork PR'sauthoris neverfro-bot/fro-bot[bot], so it falls toGUARDED_PATTERNSand is blocked anyway. The guard holds, but it holds because of an identity invariant that isn't asserted at this line. Impact: low (defense-in-depth gap, not an open door). Remediation: optional — a one-line comment noting the bypass is safe only because fro-bot identities never originate from forks would document the load-bearing assumption. -
scripts/merge-data-pr.ts:384-389— conflict-alert dedup uses unpaginatedlistForRepo({per_page: 30}). If more than 30 open issues exist, an older conflict alert could fall off the first page and the run would create a duplicate. Same pattern as the existing stale-divergence alert (line 489), so this is consistent, not a regression. Impact: low (low issue volume in this repo). Remediation: if you ever cross ~30 open issues, switch both dedup queries to atitle/labelsearch viagh api --paginateorissues.listForRepowith pagination. -
scripts/check-wiki-private-presence.ts:93— attribution is a substringincludes. A page whose stem collides with a public repo's slug AND happens to embed that public repo'sgithub.com/owner/nameURL (e.g. as a "related repos" reference) would pass even if the page is really about a different repo. The stem-collision precondition makes this narrow, and the fail-closed defaults elsewhere absorb the residual risk. Impact: low. Remediation: none required; flagging for the record since attribution is the gate's trust anchor.
Missing tests
None. Happy path, both error classes (unknown vs unavailable), conflict dedup, retryable alert-creation failure, and every privacy-gate branch are covered. The headRef matrix in check-wiki-authority.test.ts exercises data/non-data × fro-bot/fro-bot[bot] × repos.yaml/other-metadata, which is the right cross-product.
Risk assessment
LOW. The change is additive to the guard surface (one new head-ref rule), replaces a network-dependent gate with a pure deterministic one, and converts silent promotion failures into loud non-zero exits with deduplicated alerts. The author simulated the new gate against live main/data state (0 pages flagged, 25 public pages attribute cleanly, lost-access legacy page grandfathers). The real cross-layer proof is the first post-merge Merge Data Branch dispatch — until that promotion resolves on its own, treat the conflict-class as fixed-in-theory. Pinned action SHAs and minimal contents: read permissions on merge-data.yaml are intact.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/.github |
| Run ID | 26751633013 |
| Cache | hit |
| Session | ses_17d17a6daffeD0aQkf5yH1SiSy |
Why
The weekly
data → mainpromotion PR keeps being born with conflicts inmetadata/repos.yaml. The root cause is that one file carries two different kinds of change written from two different branches: the autonomous survey state (advanced ondataevery run) and privacy-policy edits (orphan-entry deletions made onmain). When both sides touch the same file between merge-bases, the promotion conflicts — and the only way out has been a manual conflict-resolution recipe.What this does
Makes
datathe sole writer ofmetadata/repos.yamlso the promotion is always a clean fast-forward of that file, and removes the reasonsmainever had to edit it.main. Pages are attributed to a single public repo by the repo URL they already carry, and a slug claimed by two public repos fails closed.mainstill promotes (so legacy lost-access pages likemarcusrbrown--copilotingkeep working), but a modified page for a repo that is no longer public is blocked.main's metadata. The gate readsdata's ownrepos.yaml, so there's no cross-branch read to drift.repos.yamlis guarded onmain. A change tometadata/repos.yamltargetingmainis only allowed when it comes from thedatapromotion, even for Fro Bot identities — closing the path that let a feature-branch edit land there directly.Verification
pnpm lint,pnpm check-types, full test suite green (682 passed, 3 todo).main/datastate: 0 pages flagged — all 25 public pages attribute cleanly, and the one lost-access legacy page passes via content-identity grandfathering.github.com/owner/nameURL, so attribution introduces no over-block.After merge, a manual
Merge Data Branchdispatch is the real cross-layer check that the promotion now resolves on its own.