Skip to content

Fix folded private publish anchoring#1371

Open
lupuszr wants to merge 11 commits into
mainfrom
codex/issue-1369-folded-private-anchor
Open

Fix folded private publish anchoring#1371
lupuszr wants to merge 11 commits into
mainfrom
codex/issue-1369-folded-private-anchor

Conversation

@lupuszr

@lupuszr lupuszr commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes folded public+private V10 publishes so they can anchor on-chain instead of falling back to local tentative results.

The issue was that DKGPublisher skipped V10 ACK collection whenever private roots existed. This change threads private Merkle root commitments through the PublishIntent/ACK path so core nodes can recompute the folded KC root from public quads plus private commitments, without seeing private plaintext.

Changes

  • Add privateMerkleRoots to PublishIntent.
  • Pass private root commitments through publisher, agent, CLI ACK provider, and ACK collector.
  • Update StorageACKHandler to verify folded public+private roots.
  • Remove the private-data local-only ACK skip branch.
  • Add regression tests for proto round-trip, ACK collection, ACK verification, and confirmed folded public+private publish.

Validation

  • pnpm --filter @origintrail-official/dkg-core exec vitest run test/v10-proto.test.ts
  • pnpm --filter @origintrail-official/dkg-publisher exec vitest run --config vitest.ack-temp.config.ts
  • pnpm --filter @origintrail-official/dkg-publisher exec vitest run test/publish-lifecycle.test.ts --silent
  • Builds passed for core, publisher, agent, CLI, and required dependencies.
  • git diff --check

fix for #1369

Comment thread packages/publisher/src/dkg-publisher.ts Outdated
catalogCommitment
? { catalogRoot: catalogCommitment.root, catalogLeafCount: catalogCommitment.leafCount }
: undefined,
privateRoots,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Private-root publisher plumbing is not verified by the new tests

What's wrong
The changed behavior depends on DKGPublisher carrying computed private Merkle commitments all the way to the ACK provider, but the new confirmation test stubs away that contract. This gives green coverage for private on-chain publishes even if the production ACK request would omit the private roots and real cores would reject the publish.

Example
A regression like changing the call at line 2609 to pass undefined would still let confirms folded public+private publishes on-chain when ACKs are available pass, because the in-memory ACK provider signs the folded merkleRoot without checking the private-root wire field. In production, the ACKCollector would send no private roots and a real StorageACKHandler would recompute the public-only root and reject the request.

Suggested direction
Replace or supplement the lifecycle stub with a provider that observes and validates the new privateMerkleRoots argument before returning ACKs, preferably exercising the same inline PublishIntent a core node receives.

For Agents
Add a regression test around DKGPublisher.publish with privateQuads whose v10ACKProvider asserts the final privateMerkleRoots argument contains the computed 32-byte root. Stronger: have that provider feed the provided stagingQuads plus private roots through ACKCollector/StorageACKHandler so direct inline folded-private ACK verification is covered too.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Curated private publishes now fail before ACK collection

What's wrong
The publisher now forwards private Merkle roots unconditionally, but the ACK collector treats private roots as mutually exclusive with curated/encrypted catalog ACKs. That makes an existing valid curated-private publish shape impossible to ACK.

Example
A curated/private context graph publish with a public catalog entry and private quads sets useCuratedCatalog === true, has a catalogCommitment, and has at least one privateRoot. This call now reaches ACKCollector.collect(..., isEncryptedPayload: true, catalogCommitment, privateMerkleRoots: privateRoots) and throws privateMerkleRoots are only valid for folded-private public-CG ACKs before any ACK request is sent. Expected behavior is that curated catalog ACKs continue on the catalog/encrypted path without folded-private roots.

Suggested direction
Gate the new privateRoots argument so curated/encrypted catalog publishes pass undefined or an empty list, while folded-private public publishes still send the roots over V2.

For Agents
In DKGPublisher.publish, only pass privateRoots to v10ACKProvider for folded-private public-CG ACK mode, not when useCuratedCatalog/encrypted catalog mode is active. Preserve catalog ACK behavior and add a regression that publishes a curated CG with both catalog commitment and private quads and reaches ACK collection without the collector mixed-mode rejection.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Curated private publishes now pass a rejected ACK mode combination

What's wrong
The publisher unconditionally forwards private Merkle roots into the V10 ACK provider. For curated/encrypted publishes, that creates the exact mixed mode the collector now rejects, so a supported curated private publish fails locally during ACK collection instead of reaching the catalog ACK flow.

Example
Publish to a curated context graph with private quads and a catalog entry. privateRoots.length > 0, useCuratedCatalog === true, and catalogCommitment is present, so v10ACKProvider(...) receives all three. ACKCollector.collect() throws privateMerkleRoots are only valid for folded-private public-CG ACKs before any ACKs are requested. Expected behavior is the existing curated-catalog ACK path: cores verify the public catalog commitment and do not receive folded-private roots.

Suggested direction
Separate the ACK modes before calling v10ACKProvider: folded-private public publishes should pass privateRoots, while curated/encrypted catalog publishes should pass only the catalog commitment.

For Agents
In DKGPublisher.publish, only pass the new privateMerkleRoots argument for the folded-private public-CG mode. Preserve curated/encrypted catalog behavior by passing undefined or [] when useCuratedCatalog/useEncryptedInline is true, and add a publisher-level regression that curated private publishes still collect catalog ACKs and confirm.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: No negative test proves folded-private publishes fail instead of falling back to local finalization

What's wrong
This PR removes the old private-no-acks local-only branch and makes folded-private publishes depend on V2 ACK collection. The tests prove that a folded-private publish can confirm when ACKs are available, but they do not prove the equally important contract that a chain-registered folded-private publish without collectable ACKs fails rather than being reported as an honest local/tentative result.

Example
A regression test could publish public+private quads to a numeric context graph with a v10ACKProvider that rejects with no connected core peers or storage_ack_insufficient. Expected behavior should be that publish() rejects and does not return a tentative result with localChainSkipReason: 'no-chain' or any local-only success shape.

Suggested direction
Cover the no-V2-quorum/no-ACK path at the publisher boundary, not only the successful folded-private ACK path.

For Agents
Add a publisher lifecycle regression near the new folded-private confirmation test. Use a chain-registered CG, privateQuads, and a failing/empty ACK provider; assert the publish fails loudly and no tentative local finalization is reported. Preserve the confirmed happy-path behavior when ACKs are available.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Curated private publishes now fail ACK collection

What's wrong
The publisher now forwards privateRoots to the ACK provider for every private publish, including curated/encrypted catalog publishes. The collector explicitly treats privateMerkleRoots as mutually exclusive with both isEncryptedPayload and catalogCommitment, so the curated path becomes unreachable for private publishes with catalog metadata.

Example
A chain-ready curated CG publish with privateQuads and a public catalog entry produces privateRoots.length > 0 and useCuratedCatalog === true. The call at line 2609 sends those roots to ACKCollector, which throws "privateMerkleRoots are only valid for folded-private public-CG ACKs" or "privateMerkleRoots cannot be combined with catalogCommitment" before any ACKs are collected. Expected behavior: curated/catalog ACKs should continue using the catalog commitment without folded-private roots.

Suggested direction
Gate the new privateMerkleRoots argument by ACK mode: send it for folded-private public-CG publishes, but omit it for curated/encrypted catalog publishes because that mode verifies the catalog commitment instead.

For Agents
In packages/publisher/src/dkg-publisher.ts, only pass privateRoots to v10ACKProvider for the folded-private public-CG mode. Preserve curated catalog behavior by passing undefined/[] when useCuratedCatalog or catalogCommitment is set. Add a regression test for a curated encrypted publish with private quads and a catalog commitment proving ACK collection reaches the V1 catalog path and confirms or at least does not fail on privateMerkleRoots validation.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Curated private publishes now send mutually exclusive ACK fields

What's wrong
The publisher unconditionally forwards private Merkle roots to the V10 ACK provider. In curated catalog mode, those roots are not part of the ACK wire contract; the collector explicitly treats privateMerkleRoots plus isEncryptedPayload or catalogCommitment as invalid. This makes valid curated private publishes fail before ACK collection.

Example
A curated/private context graph publish with a public catalog floor and one private quad computes privateRoots.length === 1, useCuratedCatalog === true, and a non-empty catalogCommitment. The new call becomes v10ACKProvider(..., true, catalogCommitment, privateRoots), and ACKCollector.collect() throws privateMerkleRoots are only valid for folded-private public-CG ACKs instead of collecting catalog ACKs and submitting on-chain.

Suggested direction
Pass privateRoots only for the folded-private public-CG mode, and omit them for curated/catalog ACKs. Consider also gating the catalogCommitment argument on useCuratedCatalog so the provider cannot receive a mixed mode.

For Agents
In packages/publisher/src/dkg-publisher.ts, split the ACK modes at the provider call: curated catalog ACKs should pass the catalog commitment but no privateMerkleRoots; folded-private public-CG ACKs should pass private roots only when not using curated/encrypted catalog mode. Add a regression covering a curated CG publish with private quads and a catalog commitment that confirms or at least reaches catalog ACK collection.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Folded-private legacy-provider rejection is not covered

What's wrong
This diff intentionally allows folded-private publishes to collect ACKs, but only through the new object-form ACK provider. The guard that prevents legacy positional providers from being used is a high-risk compatibility boundary and is currently untested. A regression here could give green tests while allowing private publishes to use a provider contract that cannot carry the private root commitments.

Example
Add a publisher-level regression test with privateQuads and a LegacyV10ACKProvider: expect publish to reject with /object-form v10ACKProvider/ and expect the legacy provider not to be called.

Suggested direction
Add a regression test for the legacy-provider rejection branch so future changes cannot silently route folded-private publishes through a callback shape that drops privateMerkleRoots.

For Agents
In packages/publisher/test/publish-lifecycle.test.ts, add a private publish using a positional LegacyV10ACKProvider. Preserve the public legacy-provider compatibility test, but prove folded-private publishes fail loudly before ACK collection because private roots cannot be passed through the old callback shape.

Comment thread packages/publisher/src/publisher.ts Outdated
* signers. Cores use these 32-byte roots to recompute the claimed KC root
* together with the public quads they store, without seeing private plaintext.
*/
privateMerkleRoots?: Uint8Array[],

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Stop extending the ACK provider as a positional argument list

What's wrong
This PR adds privateMerkleRoots as another trailing optional argument on a callback that already mixes ACK identity, payload bytes, pricing, catalog mode, and protocol flags. The new field is real protocol data, but the API models it as pass-through tail plumbing. That makes the provider boundary harder to scan, easier to drift across wrappers, and needlessly different from the collector object API that already exists.

Example
The publisher call now has to keep useCuratedCatalog, optional catalogCommitment, and privateRoots aligned by position. Any new provider implementation must copy the same unlabeled argument order before it can forward to ACKCollector.collect, which already accepts a named object.

Suggested direction
Introduce a V10ACKProviderParams object and build it once in DKGPublisher. That would remove the fragile tail-argument API, make the private-root addition self-documenting, and avoid forcing every wrapper to copy the same positional signature.

For Agents
Refactor V10ACKProvider in packages/publisher/src/publisher.ts to take a single params object, ideally close to ACKCollector.collect's shape. Update the call in dkg-publisher.ts and the agent/CLI provider factories to forward named fields. Preserve the current wire fields and ACK digest inputs; existing ACK collector/publish lifecycle tests should continue proving behavior.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Do not extend the already overgrown positional ACK provider contract

What's wrong
The folded-private feature is a cross-layer protocol concept, but this exposes it as another optional tail argument every wrapper must mirror exactly. That increases coupling and makes the relationship between privateMerkleRoots, isEncryptedPayload, catalogCommitment, and SWM fields harder to reason about.

Example
A provider implementation has to remember that catalogCommitment is followed by privateMerkleRoots; the next protocol field will repeat the same tail-argument expansion pattern.

Suggested direction
Replace the callback signature with a typed V10ACKRequest object and pass that through DKGPublisher, the agent wrapper, and the CLI wrapper.

For Agents
Refactor V10ACKProvider to accept one named request object, ideally shared with or derived from ACKCollector.collect params. Preserve current publish behavior and cover private roots, catalog commitment, encrypted payload, and SWM remap fields by name.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Model ACK mode explicitly instead of adding another positional optional

What's wrong
This change extends an already overloaded positional API instead of introducing a real model for ACK modes. The result is harder to maintain because the protocol choice, privacy semantics, catalog commitment, encrypted payload flag, and private roots are spread across optional arguments and comments rather than being represented once as a typed state.

Example
A caller can still type-check with a mixed mode such as isEncryptedPayload=true, catalogCommitment=undefined, and privateMerkleRoots=[root]; the collector rejects it later, but every provider implementation and test has to remember the positional convention.

Suggested direction
Replace the long positional provider signature with an object parameter and encode the mutually exclusive ACK modes in the type boundary. That would delete the need for several defensive mixed-mode checks and make future protocol extensions local to the mode model.

For Agents
Refactor V10ACKProvider and ACKCollector.collect around a single request object. Keep the existing behavior, but model the ACK mode as a discriminated union such as public, folded-private, and curated-catalog; prove with the existing public/catalog/folded-private ACK tests that public/catalog stay on V1 and folded-private uses V2.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Model ACK mode explicitly instead of adding another optional positional field

What's wrong
The PR adds folded-private ACKs by extending an already long positional callback with another optional field. The documented invariant is now “exactly one of several loose shapes,” but the type does not encode it; callers and the collector rely on comments and runtime rejection branches. That is brittle as ACK modes keep growing.

Example
The call site now ends with useCuratedCatalog, catalogCommitment ? ... : undefined, privateRoots. A reader has to remember that this means either public, curated-catalog, or folded-private depending on the combination of three optional values, and TypeScript cannot prevent mixed modes.

Suggested direction
Collapse the optional booleans/objects into a discriminated ackMode and pass the provider a named parameter object. Then the collector can switch on one typed mode to choose protocol and wire fields, rather than validating ad-hoc combinations after the fact.

For Agents
Refactor V10ACKProvider and ACKCollector.collect around a parameter object with a discriminated ACK mode, for example { kind: 'public' } | { kind: 'folded-private'; privateMerkleRoots } | { kind: 'curated-catalog'; catalogCommitment }. Preserve the exact wire fields and protocol selection, then update the provider call sites in dkg-publisher.ts, dkg-agent.ts, and publisher-runner.ts. Tests should cover the three modes and mixed-mode rejection becoming unrepresentable or centralized.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Model ACK mode explicitly instead of appending another optional positional argument

What's wrong
Folded-private support is added by extending an already long positional callback with another optional argument. The real invariant is a three-way ACK mode, but the code represents it as a loose combination of nullable fields and runtime guards. That makes the API harder to read and easier to misuse as more ACK modes are added.

Example
The call v10ACKProvider(..., useCuratedCatalog, catalogCommitment ? {...} : undefined, privateRoots) only makes sense if the reader remembers the positional contract and the mutually exclusive mode matrix from the comment. TypeScript still permits invalid combinations, leaving ACKCollector and StorageACKHandler to rediscover the mode and throw later.

Suggested direction
Use a discriminated mode model, for example ackMode: { kind: 'public' } | { kind: 'curatedCatalog', catalogCommitment } | { kind: 'foldedPrivate', privateMerkleRoots }, ideally inside a single provider parameter object. That would make the invalid states impossible to construct in ordinary callers and remove scattered mode inference.

For Agents
Replace the trailing optional mode parameters with a single typed params object or discriminated ackMode union. Update DKGPublisher, DKGAgent.createV10ACKProvider, publisher-runner, and ACKCollector.collect to consume that shape while preserving public, curated-catalog, and folded-private behavior. Keep existing mode tests but assert illegal combinations are unrepresentable or rejected at the boundary.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Model ACK mode explicitly instead of adding another positional optional

What's wrong
This PR adds folded-private support by appending privateMerkleRoots to an already sprawling positional callback and encoding the real state as a combination of nullable parameters and booleans. That makes the boundary harder to reason about and spreads mode validation/construction across the publisher, agent provider, CLI provider, and collector. The comments now describe a clean mode model, but the types still expose an ad-hoc bag of optional fields.

Example
A caller has to keep isEncryptedPayload, catalogCommitment, and privateMerkleRoots aligned by position: v10ACKProvider(..., useCuratedCatalog, catalogCommitment, privateRoots). The type does not express that folded-private and curated-catalog are mutually exclusive modes; that invariant is rediscovered later by runtime checks.

Suggested direction
Introduce a V10ACKRequest object and a typed ackMode/discriminated union such as public, curatedCatalog, and foldedPrivate. Build it once in the publisher, pass it through the provider boundary, and let the collector derive isPrivate, protocol id, catalog fields, and private roots from that single model.

For Agents
Look at V10ACKProvider, DKGPublisher.publish, createV10ACKProvider, createV10ACKProviderForPublisher, and ACKCollector.collect. Preserve the existing public, curated-catalog, and folded-private behavior, but replace the long positional provider contract with a request object whose ACK mode is a discriminated union. Add/adjust focused tests that prove each mode maps to the same collector fields and protocol as today.

isEncryptedPayload: params.isEncryptedPayload === true ? true : undefined,
catalogRoot: params.catalogCommitment?.catalogRoot,
catalogLeafCount: params.catalogCommitment?.catalogLeafCount,
privateMerkleRoots: privateMerkleRoots.length > 0

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Make ACK verification mode explicit instead of combining loose optional fields

What's wrong
The diff adds private commitments as an independent optional field next to isEncryptedPayload and catalogCommitment, so the protocol state is now represented by several loosely coupled flags. That makes it harder to tell which verifier owns which fields and leaves accepted combinations that the code does not actually use. This is the kind of optional-field mode drift that will make future ACK changes more brittle.

Example
The current model can encode an intent with isEncryptedPayload=true, catalogCommitment present, and privateMerkleRoots present. The handler accepts that shape but the curated branch verifies only the catalog commitment and digest inputs, while the private roots are just normalized and then ignored.

Suggested direction
Model the ACK payload as a discriminated mode, for example plain/SWM verification with privateMerkleRoots versus curated-catalog verification with catalogCommitment. If private roots are not used for curated ACKs, do not put them on that wire path; if they are required, fold them in the curated branch through a named helper. Either direction would make invalid or irrelevant field combinations disappear.

Confidence note
This is a structural concern from the diff shape. If private roots are intentionally meaningful for curated encrypted ACKs too, the invariant should be documented and enforced in the curated handler path rather than left as an optional field that is accepted independently.

For Agents
Look at dkg-publisher.ts where useCuratedCatalog and privateRoots are both passed into the ACK provider, ack-collector.ts where the PublishIntent is built, and the two handler branches in storage-ack-handler.ts. Preserve public folded-private ACK behavior, but make the verification mode explicit and prove the intended combinations in tests.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Collapse ACK privacy state into one discriminated mode

What's wrong
The PR introduces a better ackMode abstraction, but leaves the old boolean state beside it. That keeps the contract wider than the real invariant and makes the collector depend on multiple ad-hoc derivations instead of one canonical mode model.

Example
A caller can now construct an internally inconsistent collector request such as isPrivate: false with ackMode: { kind: 'folded-private', privateMerkleRoots: [...] }. Current production callers may not do this, but the boundary allows it and forces every reader to reconcile which field is authoritative.

Suggested direction
Remove the standalone isPrivate boolean from ACKCollectorParams and derive isPrivate, protocol id, catalog digest fields, and intent-only fields from one normalized ackMode. A small resolveACKMode(params.ackMode) helper would also remove the cast-heavy impossible-state guard and the privateMerkleRoots.length dispatch.

For Agents
Look at ACKCollectorParams, both V10 ACK provider adapters, and ACKCollector.collectInner. Preserve the public, folded-private, and curated-catalog wire outputs, but make the mode contract single-source-of-truth and add a focused test or type-level check proving inconsistent privacy/mode input cannot be represented at the collector boundary.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Preserve the ACK mode union at the collector boundary

What's wrong
The PR introduces a cleaner ACK mode model, but the collector immediately weakens it by accepting a flattened object where mode-specific fields are optional and isPrivate is separate from ackMode. That makes impossible states representable, requires cast-based checks later, and leaves future ACK modes harder to reason about because each caller must keep several fields aligned manually.

Example
collector.collect({ ..., isPrivate: false, stagingQuads: undefined, ackMode: { kind: 'curated-catalog', catalogCommitment } }) type-checks at the collector boundary even though curated catalog mode requires staging quads. Likewise, folded-private mode can be paired with isPrivate: false, so protocol selection and wire intent fields come from different sources of truth.

Suggested direction
Let ackMode own the mode invariant end-to-end. Avoid reintroducing a bag of optional fields in the collector API, and keep any defensive runtime validation in one normalization function for untyped callers.

For Agents
In packages/publisher/src/ack-collector.ts, make ACKCollectorParams preserve the same discriminated mode shape as V10ACKProviderParams, plus collector-only fields like parsed bigint CG id, chain id, and KAV address. Remove isPrivate from the public collector params and derive it from ackMode. Keep existing public, folded-private, and curated-catalog behavior covered by the current ACK collector tests.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Make ackMode the single source of truth for collector state

What's wrong
The new discriminated union is a good abstraction, but the collector boundary still carries an older boolean alongside it. That leaves future maintainers to reason about two partially overlapping models and makes invalid combinations expressible.

Example
The public collector API can represent contradictory states such as isPrivate: false with ackMode: { kind: 'folded-private', privateMerkleRoots: [...] }, or isPrivate: true with ackMode: { kind: 'public' }. Even if current callers do not do that, the boundary now permits states the model says should not exist.

Suggested direction
Collapse the duplicate state so the collector derives isPrivate, protocol id, catalog fields, and private roots from one discriminated model. That would also remove the need for cast-heavy guard code around impossible union states.

For Agents
In packages/publisher/src/ack-collector.ts, remove isPrivate from ACKCollectorParams or make it a derived local from ackMode. Update the two provider factories in packages/agent/src/dkg-agent.ts and packages/cli/src/publisher-runner.ts to pass only the canonical ACK mode plus chain fields. Prove with a focused collector test that each ACK mode produces the intended PublishIntent fields.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Let ackMode own the ACK invariants instead of duplicating state

What's wrong
The PR introduces a good mode abstraction, but the collector boundary still carries the older boolean model beside it. That leaves readers and callers responsible for keeping two sources of truth aligned, and it forces defensive casts and special-case checks that the type model should make unnecessary.

Example
collect({ isPrivate: false, ackMode: { kind: 'folded-private', privateMerkleRoots: [root] }, ... }) is representable. So is isPrivate: true with omitted ackMode, which defaults the collector to public mode. The new discriminated mode no longer owns the invariant.

Suggested direction
Make ackMode required and derive all wire flags from it. A discriminated union should delete invalid combinations, not coexist with the old booleans and optional fallback.

For Agents
In ack-collector.ts, turn ACKCollectorParams into the same discriminated mode model used by V10ACKProviderParams, remove top-level isPrivate, and derive PublishIntent.isPrivate/isEncryptedPayload from ackMode in one place. If legacy callers must remain supported, normalize them before entering ACKCollector.collect. Add a small type-level or runtime test that impossible mode combinations cannot reach the collector.

Comment thread packages/agent/src/dkg-agent.ts Outdated
publisherPeerId: this.peerId,
publicByteSize,
isPrivate: isEncryptedPayload === true,
isPrivate: isEncryptedPayload === true || (privateMerkleRoots?.length ?? 0) > 0,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Keep the folded-private privacy invariant in one canonical boundary

What's wrong
The new rule that private Merkle roots imply a private publish is scattered across entry-point wrappers instead of living at the protocol boundary that builds the PublishIntent. That leaves isPrivate and privateMerkleRoots as independent inputs that can drift.

Example
A later provider can pass private roots while leaving isPrivate false because the collector does not own or validate that invariant.

Suggested direction
Let the collector or a dedicated intent builder derive/assert isPrivate from named inputs, then remove the duplicated rule from agent and CLI wrappers.

For Agents
Move privacy-mode derivation into ACKCollector.collect or a shared publish-intent builder. Preserve current wire output: encrypted and folded-private publishes set PublishIntent.isPrivate=true; public publishes without private roots remain false.

}

const inMemoryRoot = computeFlatKCRoot(parsed, []);
const inMemoryRoot = computeFlatKCRoot(parsed, privateMerkleRoots);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Folded-private inline ACK verification is untested

What's wrong
The PR verifies folded public+private ACKs through the SWM fallback path, but not through the inline stagingQuads path used by direct publishes. That leaves a user-facing publish mode without regression coverage for the new private-root folding behavior.

Example
A regression changing the inline branch back to computeFlatKCRoot(parsed, []) would still leave the new SWM folded-private test green, but a direct publish whose stagingQuads contain the public quads and whose privateMerkleRoots complete the claimed folded root would fail with Merkle root mismatch (inline quads).

Suggested direction
Cover the direct/inline ACK path separately from the SWM fallback path, because both branches were changed and they fail differently.

For Agents
Add a StorageACKHandler or v10-protocol-operations test that encodes a PublishIntent with non-empty stagingQuads, privateMerkleRoots: [privateRoot], merkleRoot = computeFlatKCRoot(publicQuads, [privateRoot]), and the folded leaf count, then asserts a valid StorageACK. A negative case with a missing or wrong private root would further prove the inline check is binding the commitments.

Comment thread packages/cli/src/daemon/lifecycle.ts Outdated
.getPeers()
.map((p) => p.toString())
.filter((id) => id !== agent.peerId);
if (protocol === PROTOCOL_STORAGE_ACK_V2) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: CLI daemon references the V2 ACK protocol without importing it

What's wrong
The new protocol-specific peer filter uses a symbol that is not in scope in the CLI daemon lifecycle module, so the CLI build/runtime path is broken.

Example
Building the CLI package after this change will hit TS2304: Cannot find name 'PROTOCOL_STORAGE_ACK_V2' at the new V2 peer-filter branch. If emitted despite type errors, the daemon would throw a ReferenceError when a folded-private ACK path asks for V2 peers.

Suggested direction
Import PROTOCOL_STORAGE_ACK_V2 from @origintrail-official/dkg-core in this file before using it in getConnectedCorePeers.

For Agents
Add PROTOCOL_STORAGE_ACK_V2 to the dkg-core import in packages/cli/src/daemon/lifecycle.ts and run the CLI TypeScript build/typecheck to prove the daemon lifecycle file compiles.

Daemon V2 ACK candidate filtering has no direct test

What's wrong
This PR adds a daemon-only candidate selection branch for folded-private ACKs. Because it is separate from the agent helper that is tested, the current suite can still pass if daemon publishes either dial V1-only cores for field-20 intents or find no V2 cores at all.

Example
A daemon regression that returns all connected peers for PROTOCOL_STORAGE_ACK_V2, or always returns [] because knownCorePeerIdsV2 is missing/misnamed, would not be caught by the current collector tests because they stub getConnectedCorePeers directly. A test with peers [self, v2-core, v1-core] and knownCorePeerIdsV2 = {v2-core} should prove folded-private ACK collection only dials v2-core.

Suggested direction
Add a daemon-level regression test, or extract this transport construction into a testable helper and cover both V1 fallback and V2 strict filtering.

Confidence note
I found CLI publisher-runner tests, but none that exercise this runDaemonInner transport closure with PROTOCOL_STORAGE_ACK_V2; the closure is embedded in daemon startup, so a human may know of an indirect fixture I did not see in the diff search.

For Agents
Add CLI/daemon coverage near the publisher runtime or daemon lifecycle tests. Stub an agent with connected peers plus knownCorePeerIdsV2, drive the V10 ACK provider or extracted transport with PROTOCOL_STORAGE_ACK_V2, and assert V2 filtering excludes self and V1-only peers while V1 behavior remains unchanged.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: CLI daemon V2 ACK peer selection wiring is not directly verified

What's wrong
Folded-private ACKs now rely on getConnectedCorePeers(PROTOCOL_STORAGE_ACK_V2) to prefer peers that advertise the V2 storage-ack protocol. The agent path has new coverage, but the CLI daemon has its own transport callback and this diff changes it separately. Without a test on that path, a future regression in the daemon wiring could route folded-private ACK collection through V1-biased or unfiltered peers while the existing agent tests still pass.

Example
A regression where the daemon callback ignores the protocol argument or fails to pass knownCorePeerIdsV2 would still leave ack-candidate-pool.test.ts green, because those tests call DKGAgent.getACKCandidatePeers, not the daemon's createV10ACKProvider transport. A focused test could instantiate the daemon ACK transport with connected peers plus V1/V2 known-core sets and assert getConnectedCorePeers(PROTOCOL_STORAGE_ACK_V2) orders V2 peers first and falls back only when below quorum.

Suggested direction
Add a focused test for the daemon transport path so folded-private publishes from the CLI use the same V2-aware candidate selection that the agent path now verifies.

Confidence note
The diff shows daemon ACK peer selection was changed but no daemon-level test or helper-level unit test is added for the CLI transport path.

For Agents
Look at packages/cli/src/daemon/lifecycle.ts around the createV10ACKProvider transport. Add a daemon/CLI-level test, or extract and directly test the transport factory if needed, proving the optional protocol argument is forwarded to selectACKCandidatePeers with both known-core sets and the runtime required ACK threshold.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: CLI daemon V2 ACK peer-selection bridge lacks coverage

What's wrong
The changed daemon path is user-facing for CLI/async publishing, and it is where protocol-aware ACK collection is connected to the agent's V2 capability cache. If this closure stops forwarding the requested protocol or the V2 set, the lower-level tests could still pass while daemon-originated folded-private publishes target the wrong peer pool and fail quorum in mixed clusters.

Example
A daemon-level test or extracted helper test could seed connected peers plus knownCorePeerIdsV2, call the daemon ACK peer selector with PROTOCOL_STORAGE_ACK_V2, and assert V2-advertised peers are preferred while V1-only peers are retained only as negotiation fallback when below quorum.

Suggested direction
Add focused coverage for the daemon ACK transport's getConnectedCorePeers(protocol) wiring, not just the shared selector logic, because folded-private CLI publishes depend on this bridge using the V2 capability cache.

Confidence note
This is based on the diff and repository search: the shared peer-selection behavior is tested through the agent, but I did not find a CLI daemon test for this new bridge closure specifically.

For Agents
Look at packages/cli/src/daemon/lifecycle.ts around the ACK transport factory. Add coverage for the daemon peer-selection bridge, or extract the closure into a small tested helper. The test should prove the requested ACK protocol is forwarded and that knownCorePeerIdsV2 affects folded-private peer ordering in the CLI path.

Comment thread packages/cli/src/daemon/lifecycle.ts Outdated
.map((p) => p.toString())
.filter((id) => id !== agent.peerId);
if (protocol === PROTOCOL_STORAGE_ACK_V2) {
const knownCorePeerIdsV2 = (agent as any).knownCorePeerIdsV2 as

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize ACK peer capability selection instead of reaching into agent internals

What's wrong
The folded-private capability gate leaks through layers as ad-hoc state. The daemon reaches into protected agent internals with any, while other files maintain a second parallel peer set and duplicate the add/delete rules. That makes the architecture more coupled and makes protocol capability changes expensive and easy to apply inconsistently.

Example
The V2 branch in the daemon is a bespoke copy of the agent-side selection policy. A future /storage-ack V3 or update-ACK capability would require another set, another identify-update branch, another daemon cast, and another special case in the selector.

Suggested direction
Promote the peer-selection logic to a canonical typed boundary instead of duplicating it in the daemon with any casts and parallel knownCorePeerIds* sets. A protocol-indexed capability map or small peer-selector service would make this extension natural and avoid scattering future protocol gates across lifecycle, sync, agent, and daemon code.

For Agents
Move peer ACK capability ownership behind a typed API or helper, for example a protocol-indexed peer capability registry plus getACKCandidatePeers(protocol). Use it from both DKGAgent and the CLI daemon; preserve the current V1 quorum-aware fallback and the V2 no-fallback behavior. A focused test should cover V1 fallback, V2 strict filtering, and stale capability eviction from populated identify protocol lists.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Keep ACK peer capability selection behind one typed boundary

What's wrong
This PR extends the daemon’s copied peer-selection logic by reading a protected agent set through any. That makes ACK capability policy non-canonical and leaks agent internals into CLI orchestration. The V2 gate, stale identify handling, and fallback rules are now scattered instead of owned by one model.

Example
The agent-owned selector has the V2 rule, while the daemon has a separate inline copy that reaches through any. Any future change to ACK peer selection now has to update both paths, plus the sync-time capability bookkeeping, or the daemon and direct agent publish paths can drift structurally.

Suggested direction
Expose a canonical getACKCandidatePeers(protocol)/ACK transport factory from the agent, or extract a small PeerCapabilityIndex used by both agent and daemon. Avoid (agent as any) and avoid duplicating protocol interpretation in the CLI layer.

For Agents
Look at packages/agent/src/dkg-agent.ts, packages/agent/src/dkg-agent-lifecycle.ts, packages/agent/src/sync/on-connect/sync-on-connect.ts, and this daemon transport factory. Preserve V1 fallback behavior and V2 fail-closed behavior, but move capability-based peer selection behind a typed agent method or shared peer-capability selector. Tests should prove daemon async publish and agent direct publish ask the same peers for V1 and V2.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize ACK peer selection instead of duplicating it through any

What's wrong
This adds a second implementation of a compatibility- and quorum-sensitive policy in the CLI layer, then accesses the agent's protected internals through any. That makes the boundary blurrier and creates a drift point for every future protocol version or peer-discovery rule.

Example
A future change to how stale V2 protocol metadata is handled would have to be applied in both DKGAgent.getACKCandidatePeers and this daemon callback. If one side changes and the other does not, foreground publishes and daemon async publishes can select different peer pools for the same protocol.

Suggested direction
Move the peer-selection policy to a canonical helper such as selectACKCandidatePeers({ allPeers, selfPeerId, knownCorePeers, knownV2Peers, quorum, protocol }), or expose a narrow method on DKGAgent. The CLI should not know about protected sets or replicate quorum/protocol ordering rules.

For Agents
Extract the ACK peer selection into one typed helper or expose an agent-owned public method that the daemon can call. Preserve the current V1/V2 ordering and quorum behavior, then cover both foreground agent and daemon transport wiring with the same selector tests.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize ACK peer capability selection instead of duplicating it through any access

What's wrong
The new V2 compatibility policy is now copied into the CLI daemon and implemented by casting the agent to any to read internal sets and quorum state. That is a structural regression: the most fragile part of the change, peer capability gating during mixed-version rollout, is no longer owned by one abstraction and can drift silently between publish paths.

Example
If the V2 quorum fallback changes again, maintainers must update the private DKGAgent method, the daemon closure that reaches into (agent as any), and the protocol-cache mutation paths together. Missing one leaves two publish entry points with different peer ordering for the same ACK protocol.

Suggested direction
Move the peer selection policy into a shared selectACKCandidatePeers/capability tracker helper, or expose a narrow typed method on DKGAgent that the daemon can call. The protocol-cache add/delete logic should also live behind one helper so V1 and V2 membership cannot drift between connection and peer-update paths.

For Agents
Extract the ACK peer capability state/update and candidate ordering into one canonical helper or service owned by the agent layer. Have both the normal agent provider and daemon ACK transport call that helper instead of duplicating selection logic or reading protected fields via any. Preserve the current ordering: V2-advertised peers first, confirmed cores next, then remaining connected peers until V2-advertised peers satisfy quorum.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Avoid extending the CLI’s any access to agent internals

What's wrong
This change deepens an existing boundary leak: the CLI now depends on another protected agent implementation detail through as any. That undermines the new helper abstraction and makes future cleanup of peer capability state riskier than it needs to be.

Example
If the agent replaces the two sets with a capability index or renames knownCorePeerIdsV2, this CLI path can keep compiling because of any while silently losing the V2 preference behavior at runtime.

Suggested direction
Make peer candidate selection an explicit agent API, or inject a typed candidate provider into the daemon. The CLI should not know about knownCorePeerIdsV2, lastKnownRequiredACKs, or how the agent stores peer protocol capabilities.

For Agents
Look at the daemon ACK transport factory and DKGAgent.getACKCandidatePeers. Preserve daemon ACK collection behavior, but expose a typed public agent method or transport/candidate provider instead of reading protected fields with any. Add a small compile-time/behavioral test around the daemon transport factory if feasible.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Do not extend the daemon’s ACK path through any access to agent internals

What's wrong
This change centralizes peer ordering in a helper, but the daemon still builds the helper input by poking through protected agent fields with any. The new V2 capability set and runtime quorum value make that boundary leak larger: CLI lifecycle now depends on internal agent storage details that TypeScript cannot protect.

Example
If the agent renames or replaces knownCorePeerIdsV2 with a capability map, this daemon code still compiles because of any, silently falling back to weaker V2 peer ordering. A future storage-ack V3 would require another protected field and another daemon edit instead of extending one typed agent API.

Suggested direction
Move this behind a typed boundary owned by the agent, or provide a small public capability/peer-selection adapter. The CLI should not know the names of the agent’s internal peer sets.

For Agents
In packages/cli/src/daemon/lifecycle.ts, avoid reaching into DKGAgent internals for ACK peer selection. Expose a typed agent method/dependency such as getACKCandidatePeers(protocol) or reuse the agent’s V10 ACK provider construction so the daemon does not duplicate protected state access. Preserve the current V1/V2 ordering and quorum-aware fallback behavior.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Do not leak ACK peer-selection internals into the CLI daemon

What's wrong
The new V2 capability set deepens an existing package-boundary leak. Peer capability state now matters for folded-private ACK correctness, but the daemon accesses and combines that state outside the agent abstraction with no type protection.

Example
If the agent later replaces knownCorePeerIdsV2 with a capability map or changes how quorum metadata is cached, TypeScript will not flag the daemon path because the CLI reads those fields through any. The daemon can then silently diverge from the agent’s own candidate ordering.

Suggested direction
Keep selectACKCandidatePeers pure if useful, but make the agent own the mutable capability sets and quorum cache. The CLI should consume a typed public capability, not protected fields via any.

For Agents
Move the daemon ACK candidate selection behind a typed agent-owned method or factory. For example, expose agent.getACKCandidatePeers(protocol) or an ACK transport factory from the agent package, then have both agent publish and daemon publish use that boundary. Preserve current V1 and V2 candidate ordering in focused tests.

// `DKGAgent.getACKCandidatePeers` builds for the publisher.
knownCorePeerIds.delete(remotePeer);
}
if (protocols.includes(PROTOCOL_STORAGE_ACK_V2)) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: V2 peer discovery cache is not covered on the on-connect path

What's wrong
Folded-private publishes now depend on knownCorePeerIdsV2 to find eligible ACK peers. The tests verify filtering once the set is already populated, but they do not verify the production on-connect path that populates and evicts that set from identify protocols. That leaves a real rollout failure mode unverified: V2-capable peers can be connected but never become ACK candidates.

Example
A regression that forgets to pass knownCorePeerIdsV2 into runSyncOnConnect, or deletes V2 membership on [], would still leave ack-candidate-pool.test.ts green because it bypasses this on-connect path. A focused test could call runSyncOnConnect with [PROTOCOL_SYNC, PROTOCOL_STORAGE_ACK_V2], then [PROTOCOL_SYNC], then [], and assert the V2 cache add/delete/preserve behavior.

Suggested direction
Add a regression test for runSyncOnConnect itself, not only DKGAgent.getACKCandidatePeers or handlePeerUpdateForSyncRetry.

For Agents
Add coverage in the sync-on-connect tests around packages/agent/src/sync/on-connect/sync-on-connect.ts: pass an explicit knownCorePeerIdsV2 set, vary getPeerProtocols, and prove V2 cache state changes match the V1 populated-list race behavior. Preserve existing skipped-no-sync and sync accounting behavior.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Initial sync-on-connect V2 peer classification is untested

What's wrong
Folded-private ACK candidate selection depends on knownCorePeerIdsV2, and this change updates that set during the initial sync-on-connect protocol scan. The added tests only exercise the later peer:update helper and direct candidate filtering, so a regression in the initial connection path could leave V2-capable peers undiscovered until another update event happens.

Example
A runSyncOnConnect test with getPeerProtocols returning [PROTOCOL_STORAGE_ACK, PROTOCOL_STORAGE_ACK_V2, PROTOCOL_SYNC] should assert that the supplied knownCorePeerIdsV2 contains remotePeer; a second case with a populated protocol list lacking V2 should assert stale V2 membership is removed, while an empty list preserves it.

Suggested direction
Add focused sync-on-connect tests for the new knownCorePeerIdsV2 add/delete behavior, separate from the already-tested peer:update helper.

For Agents
Extend packages/agent/test/sync-on-connect-retry.test.ts or the closest sync-on-connect test file. Pass an explicit knownCorePeerIdsV2 set into runSyncOnConnect, drive V2-present, V2-absent populated, and empty protocol-list cases, and assert the set is updated without disturbing existing sync outcomes.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize storage-ACK capability tracking instead of duplicating the V2 rules

What's wrong
The PR adds a second ACK capability set and threads it through two independent protocol-observation paths. Both paths now need to remember the same V2 rule, including the non-destructive empty-list race handling. That is exactly the kind of duplicated state-transition logic that tends to become permanent drift as more protocol versions are added.

Example
The empty identify-list rule is subtle: empty lists must not evict V2, populated lists without V2 must. That rule now exists in two places. Any future tweak to stale capability eviction, or any new ACK protocol capability, has to be applied twice and can drift.

Suggested direction
Create a small recordStorageAckCapabilities(peerId, protocols, tracker, options) helper or a peer-capability tracker object so the add/evict semantics live in one canonical place.

For Agents
Extract a small protocol capability updater, likely in the agent sync/peer capability layer, and use it from both runSyncOnConnect and handlePeerUpdateForSyncRetry. Preserve current behavior: V1 sync-on-connect may declassify on populated non-ACK lists, peer:update keeps V1 add-only behavior, and V2 evicts only on populated lists without V2. Existing sync-on-connect-retry and ack-candidate-pool tests should prove the shared helper.

lupuszr added 6 commits June 30, 2026 10:40
Allow folded public+private publishes to collect V10 ACKs by threading private Merkle root commitments through the PublishIntent path. Core nodes can recompute the folded KC root from public quads plus private commitments without seeing private plaintext.
a.handlePeerUpdateForSyncRetry(CORE[0], []);
expect(a.getACKCandidatePeers(PROTOCOL_STORAGE_ACK_V2)).toEqual(CORE);

a.handlePeerUpdateForSyncRetry(CORE[0], [PROTOCOL_STORAGE_ACK]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: The stale-V2 eviction test would pass without the eviction

What's wrong
This test is meant to verify that a populated protocol list without PROTOCOL_STORAGE_ACK_V2 removes stale V2 capability, but the assertion observes a candidate list that is identical before and after the deletion. That gives false confidence for the compatibility gate that protects folded-private ACKs from stale peer metadata.

Example
If the knownCorePeerIdsV2.delete(peerId) branch in handlePeerUpdateForSyncRetry were removed, this test would still pass: the stale V2 peer would be ordered first, then the remaining confirmed cores, producing the same CORE array.

Suggested direction
Make the test distinguish stale V2 membership from eviction, for example by checking knownCorePeerIdsV2.has(CORE[0]) after each update or by choosing quorum/candidates where stale membership changes the returned list.

For Agents
Update packages/agent/test/ack-candidate-pool.test.ts to assert the V2 cache mutation directly, or use a setup where stale V2 membership changes the candidate result, such as quorum 1 before/after the populated V1-only update. Preserve the empty-list non-destructive behavior and prove populated V1-only updates evict V2 capability; consider adding the same regression assertion for runSyncOnConnect because it updates the same knownCorePeerIdsV2 cache.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: The V2 peer-update regression test would pass without the eviction

What's wrong
The new test claims to verify stale V2 ACK capability is evicted only on populated protocol lists, but its expected candidate list is identical before and after eviction. That gives false confidence around the compatibility gate used for folded-private ACKs.

Example
If handlePeerUpdateForSyncRetry stopped deleting stale V2 membership on a populated V1-only protocol list, this test would still pass: stale v2Advertised=[CORE[0]], remainingConfirmedCore=[CORE[1], CORE[2], CORE[3]], result=CORE.

Suggested direction
Make the test assert the V2 cache state directly, or choose inputs where stale V2 membership changes the candidate list.

For Agents
Fix packages/agent/test/ack-candidate-pool.test.ts so the stale-V2 case has an observable difference. Either assert knownCorePeerIdsV2.has(CORE[0]) after [] and not after [PROTOCOL_STORAGE_ACK], or use quorum/peer ordering that changes getACKCandidatePeers(PROTOCOL_STORAGE_ACK_V2) when stale V2 membership remains. Preserve the intended empty-list race behavior and prove populated protocol lists evict V2.

syncingPeers: Set<string>;
getPeerProtocols: (peerId: string) => Promise<string[]>;
knownCorePeerIds: Set<string>;
knownCorePeerIdsV2?: Set<string>;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Avoid hiding required V2 capability state behind an optional throwaway set

What's wrong
The new V2 capability cache is semantically required for the peer-selection policy, but the context type makes it optional and silently discards updates when omitted. That optionality papers over a real wiring invariant and makes future callers look correct while losing capability information.

Example
A test helper or future caller can construct SyncOnConnectContext with only knownCorePeerIds. The V2 branch still executes, but it records capabilities into a throwaway set, so the caller gets no signal that folded-private ACK capability tracking was never wired.

Suggested direction
Treat V2 protocol discovery as part of the same required peer-capability boundary as V1, not as an optional field with a silent default. A small registry abstraction would also remove the parallel-set update logic from runSyncOnConnect and handlePeerUpdateForSyncRetry.

Confidence note
Production wiring currently passes the V2 set, so this is a boundary/design concern rather than an observed missing production call site.

For Agents
Make the capability sink explicit. Either require knownCorePeerIdsV2 everywhere and update the test contexts, or replace both sets with a small PeerCapabilityRegistry/recordPeerProtocols(peerId, protocols) interface that owns V1/V2 updates in one place.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize ACK protocol capability tracking

What's wrong
V2 capability tracking is bolted into two unrelated sync paths as duplicated set mutation logic. That makes the compatibility policy harder to audit and makes every future storage-ack protocol version grow the same pattern again.

Example
The rule “empty protocol lists are non-destructive, populated lists without V2 evict V2 membership” now has to stay identical in both sync-on-connect and peer:update. A future /storage-ack V3 would likely add a third set and another copy of the same branch.

Suggested direction
Replace the parallel knownCorePeerIds / knownCorePeerIdsV2 mutation branches with a single capability update helper, or a map keyed by peer id and protocol. Then have both sync-on-connect and peer:update call that helper before candidate selection.

For Agents
Inspect runSyncOnConnect, handlePeerUpdateForSyncRetry, and selectACKCandidatePeers. Preserve the current empty-list behavior and V1/V2 candidate ordering, but move protocol-cache mutation behind one helper or PeerCapabilityIndex with tests that exercise both call sites through the same API.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize ACK capability cache updates instead of duplicating protocol branches

What's wrong
The PR centralizes candidate ordering with selectACKCandidatePeers, but the state that feeds that helper is still maintained by duplicated ad-hoc conditionals in separate lifecycle flows. That makes the V2 compatibility policy harder to audit than it needs to be.

Example
The empty-list race policy, stale populated-list eviction, and version-specific protocol constants now have to be kept aligned in at least two places. A future V3 ACK capability or a tweak to the identify race policy would require editing both paths correctly.

Suggested direction
Move the protocol-list to capability-cache mutation into one helper so peer identify, sync-on-connect, and future ACK versions share the same policy.

Confidence note
The duplicated branches currently match for V2 behavior, but this is exactly the kind of cache policy that tends to drift when the next protocol capability is added.

For Agents
Extract a small peer capability cache helper, probably near the sync/on-connect code or agent lifecycle utilities, that updates knownCorePeerIds and knownCorePeerIdsV2 from a protocol list with the chosen empty-list semantics. Use it from both runSyncOnConnect and handlePeerUpdateForSyncRetry; keep the existing add/delete behavior unchanged and pin with the existing tests.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Centralize storage-ACK capability tracking instead of copying V2 mutations

What's wrong
The PR adds another parallel set and repeats its mutation rules in separate flows. This is exactly the kind of small duplicated conditional that grows into protocol-version spaghetti as more ACK capabilities are added.

Example
Both call sites must remember the same policy: add on PROTOCOL_STORAGE_ACK_V2, delete only when the protocol list is populated and lacks V2. A future V3 ACK protocol or a change to empty-list handling now has to be patched in multiple places.

Suggested direction
Move the V1/V2 ACK capability mutation rules into one helper or small peer-capability model, then have both event paths call it. That keeps the tricky identify-race policy in one place and makes adding another protocol version less error-prone.

Confidence note
The duplicated blocks are small today, but the logic is subtle because empty identify lists must be non-destructive while populated lists may evict stale V2 membership.

For Agents
Extract a tiny capability updater near the sync/agent boundary, e.g. recordStorageAckCapabilities(peerId, protocols, state), and call it from both handlePeerUpdateForSyncRetry and runSyncOnConnect. Preserve the empty-list non-eviction behavior and populated-list stale V2 eviction in tests.

@lupuszr lupuszr force-pushed the codex/issue-1369-folded-private-anchor branch from a7414f4 to 9ff2c20 Compare June 30, 2026 09:12
`See GH #1013 (honesty) and GH #1121 (encrypted private async publish).`,
);
}
if (publishResult.status === 'tentative') {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Legacy private-no-acks results can be finalized as local successes

What's wrong
The mapper no longer distinguishes an honest no-chain local publish from the previous private-no-acks failure shape. Any existing result with that reason is now treated as a successful local finalization, which reintroduces the provenance-honesty problem the old guard prevented.

Example
Calling mapPublishResultToLiftJobSuccess() with an old-shaped result like { status: 'tentative', localChainSkipReason: 'private-no-acks', onChainResult: undefined, ual: 'did:dkg:.../t1', ... } now returns { status: 'finalized', finalization: { mode: 'local' } }. That publish targeted a chain-registered CG and did not reach chain, so it should not be reported as finalized local success.

Suggested direction
Keep the old honesty guard as a runtime compatibility check even if new code no longer produces the enum value, and retain or migrate the corresponding terminal failure semantics for existing async lift records.

Confidence note
This depends on whether deployments can still have in-flight or serialized PublishResult/lift-job data produced by versions that emitted localChainSkipReason: 'private-no-acks'. The exported mapper still accepts runtime objects, so a migration or guard is needed if upgrade compatibility matters.

For Agents
In packages/publisher/src/async-lift-publish-result.ts, keep a runtime compatibility guard for (publishResult as any).localChainSkipReason === 'private-no-acks' before the tentative local-finalization branch, or add an explicit migration path. Preserve no-chain local success and add a regression with a legacy private-no-acks result object.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Legacy private-no-acks async jobs become retryable again

What's wrong
The mapper correctly refuses to finalize legacy private-no-acks results as local success, but the PR also removes the special terminal failure classification. The thrown error is now classified as a generic RPC outage, so the async lift queue can reset and retry a job that this code says should be kept terminal-failed or explicitly rerun.

Example
A recovered async job records a legacy tentative publish result with localChainSkipReason: 'private-no-acks'. mapPublishResultToLiftJobSuccess throws Async lift publish result cannot finalize legacy private-no-acks...; recordExecutionFailure maps that broadcast failure to rpc_unavailable, whose policy is retryable/reset, instead of a terminal failure.

Suggested direction
Keep a terminal classification for the new private-no-acks legacy error message, or retain an equivalent non-retryable failure code/policy for this deterministic condition.

For Agents
Check mapPublishResultToLiftJobSuccess, mapPublishExceptionToLiftJobFailure, and LIFT_JOB_FAILURE_POLICIES. Preserve the new folded-private behavior, but keep legacy private-no-acks failures terminal. Add a regression that routes the thrown legacy error through mapPublishExceptionToLiftJobFailure and asserts non-retryable terminal metadata.

// `useCuratedCatalog` gates the inline-catalog ACK path (cores rebuild the
// catalog root from the inline plaintext and DECLINE on mismatch).
const useCuratedCatalog = useEncryptedInline && catalogCommitment !== undefined;
if (useEncryptedInline && privateRoots.length > 0) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Local-only encrypted private publishes now fail before the local fallback

What's wrong
The new unconditional guard treats every encrypted publish with privateQuads as an ACK-model error. For local-only publishes there is no ACK collection or on-chain submission, so the ACK limitation does not apply; throwing here blocks a previously valid local persistence path.

Example
Call DKGPublisher.publish for an unregistered/local context graph with quads, privateQuads, and a real encryptInlinePayload. This used to reach the intentional local publish path and return a tentative local result; now it throws Encrypted inline publishes with privateQuads are not supported... before local persistence.

Suggested direction
Gate this error on the ACK/on-chain path, not merely on the presence of an encryption callback, so local-only publishes can still use the existing intentional-local flow.

For Agents
In packages/publisher/src/dkg-publisher.ts, narrow the mixed curated/folded-private rejection so it only blocks V10 ACK/on-chain attempts, or move it into the ACK collection branch. Preserve the on-chain curated+private failure, but prove a local-only encrypted private publish still finalizes locally and persists its private slice.

Comment thread packages/publisher/src/publisher.ts Outdated
* Callback that collects V10 StorageACKs from core nodes.
* Called AFTER merkle root computation, BEFORE on-chain tx.
*/
export type V10ACKProvider = (params: V10ACKProviderParams) => Promise<V10CoreNodeACK[]>;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Changing V10ACKProvider to object-only breaks existing publisher integrations

What's wrong
The PR changes a public callback contract from positional arguments to a single params object. Existing callers that supply their own ACK provider are not adapted by the library, so normal publishes can fail before any folded-private behavior is involved.

Example
An existing integration with v10ACKProvider: async (merkleRoot, contextGraphId, kaCount, rootEntities, publicByteSize, ...) => ... will now be called as v10ACKProvider({ merkleRoot, contextGraphId, ... }); contextGraphId is undefined, causing digest construction or ACK collection to fail at runtime even for ordinary public publishes.

Suggested direction
Support both the old positional callback and the new object parameter during the compatibility window, or make this an explicit major-version API break with adapter coverage.

Confidence note
This is a public exported type, but release/versioning context is not in the diff; if this PR intentionally targets a major breaking release, this may be acceptable.

For Agents
In packages/publisher/src/publisher.ts and the call sites in dkg-publisher.ts, preserve backward compatibility with the positional provider shape or provide an explicit adapter/overload and migration boundary. Add a compatibility test using the old positional callback for a public publish, plus a new object-form folded-private test.

function isLegacyV10ACKProvider(
provider: NonNullable<PublishOptions['v10ACKProvider']>,
): provider is LegacyV10ACKProvider {
return provider.length > 1;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Legacy ACK providers using rest/default parameters are misclassified as object providers

What's wrong
The PR advertises backward compatibility for the original positional v10ACKProvider, but the runtime discriminator does not reliably identify valid positional callbacks. Existing integrations that used rest parameters to forward or capture the positional arguments will be called with a single object and can compute wrong ACK digests or fail before publishing.

Example
A previously valid provider such as const provider = async (...args) => forwardToCollector(...args) now has provider.length === 0, so line 218 calls it as provider(params) instead of passing the 13 positional arguments. Public publishes then forward an object where merkleRoot was expected.

Suggested direction
Use an explicit provider shape marker/separate option or a wrapper helper instead of inferring the callback contract from Function.length; preserve positional invocation for existing legacy providers unless callers opt into the object form.

For Agents
Look at invokeV10ACKProvider in packages/publisher/src/dkg-publisher.ts. Avoid using Function.length as the compatibility discriminator, or make the provider shape explicit at the API boundary. Add a regression with a legacy rest-parameter provider for a public publish and prove it still receives positional args.

Avoid using function arity as the V10 ACK provider contract

What's wrong
This adds a brittle, magical boundary right where the PR is trying to make ACK modes explicit. It also forces casts in the implementation, which means the type system cannot actually protect the new object-form contract.

Example
A wrapper written as async (...args) => legacyProvider(...args) has length === 0, so this adapter treats it as the object-form provider even though it is forwarding the legacy positional shape. More generally, default parameters, rest parameters, and bound functions make Function.length a weak type discriminator.

Suggested direction
Make the provider shape explicit at the API boundary instead of inferring it from Function.length. The canonical object-form provider is a good direction; keep legacy support behind an explicit adapter rather than a magic runtime guess.

For Agents
Look at packages/publisher/src/dkg-publisher.ts around the provider adapter and packages/publisher/src/publisher.ts for the public type. Preserve support for public/catalog legacy providers, but replace arity inference with an explicit adapter/normalization boundary, such as a named legacyV10ACKProvider(...) wrapper, a provider object with a kind, or a compatibility helper callers opt into. Add a small compatibility test for rest-arg/default-param wrappers so the boundary is deliberate.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: Legacy ACK providers using rest-argument wrappers are misdetected as object-form

What's wrong
The compatibility shim is meant to keep old positional v10ACKProvider callbacks working, but Function.length does not identify all old callbacks. Common wrappers that collect ...args, decorate metrics, or forward to another provider report length 0 and are now called with the new object contract. That breaks existing public/catalog ACK integrations even though the type advertises legacy support.

Example
An existing integration like const wrapped = async (...args) => realLegacyProvider(...args); used to receive (merkleRoot, contextGraphId, ..., merkleLeafCount). After this change, wrapped receives only { merkleRoot, contextGraphId, ..., ackMode }, then forwards that object as the first positional argument and leaves the rest undefined. Public/catalog publishes can then fail or sign the wrong digest depending on the wrapper.

Suggested direction
Avoid relying on Function.length as the only discriminator. Prefer an explicit object-provider marker/wrapper, or otherwise make the compatibility path default to positional for ambiguous functions so existing providers keep working.

For Agents
Look at invokeV10ACKProvider in packages/publisher/src/dkg-publisher.ts. Preserve existing positional callbacks, including rest-argument wrappers, while adding a non-ambiguous way to opt into object-form providers. Add a regression test with a legacy (...args) => realProvider(...args) wrapper around a public publish and prove it still receives positional arguments.

Avoid function arity as the ACK provider contract switch

What's wrong
This is a brittle, magical boundary. TypeScript cannot enforce the union at runtime, and JavaScript function arity changes with rest parameters, defaults, binding, and wrappers. The result is an API compatibility path that is hard to reason about and easy to break accidentally.

Example
A compatibility wrapper such as const provider: LegacyV10ACKProvider = async (...args) => legacyProvider(...args) has length === 0, so it is treated as the object-form provider and receives a single params object. Conversely, an object provider with an accidental second declared parameter would be treated as legacy.

Suggested direction
Make the canonical internal provider object-form only, and adapt legacy positional callbacks explicitly at the API/configuration boundary. This deletes the magic runtime inference and makes folded-private support an explicit capability rather than a side effect of how a function is declared.

For Agents
Replace arity detection in dkg-publisher.ts with an explicit boundary: either expose a wrapLegacyV10ACKProvider() adapter, add a separate legacy option, or use a tagged provider object. Preserve public/catalog legacy behavior and prove it with a rest-arg legacy wrapper test plus an object-form folded-private test.

Legacy ACK provider compatibility is only tested for one function shape

What's wrong
The change introduces runtime detection between object-form and legacy positional ACK providers, but the tests only verify the easiest legacy shape. That leaves a real API-compatibility path unverified and could let a common wrapper style break while the legacy-support test stays green.

Example
A legacy integration written as const provider = async (...args) => realLegacyProvider(...args) is still a positional callback, but provider.length is 0, so the current test would not prove this compatibility path works. A regression test should publish with that wrapper and assert the legacy arguments are received, or document that shape as unsupported.

Suggested direction
Broaden the compatibility tests around legacy positional v10ACKProvider callbacks so they cover the detection rule, not just a high-arity function declaration.

Confidence note
I did not run the suite; this is based on the diff and repository test search.

For Agents
Look at packages/publisher/src/dkg-publisher.ts legacy provider detection and packages/publisher/test/publish-lifecycle.test.ts legacy-provider coverage. Add a public publish test using a rest-parameter positional provider, and keep the existing declared-argument case. If rest wrappers are intentionally unsupported, add a test that proves the intended failure mode is explicit.

subGraphName: options.subGraphName,
merkleLeafCount: kcMerkleLeafCount,
};
if (useCuratedCatalog) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: Push ACK-mode construction out of the 6k-line publish orchestration

What's wrong
This adds more mode branching to an already very large publish method instead of letting the new ACK-mode model simplify the flow. The behavior may be correct, but structurally it makes the core publish path harder to scan and easier to extend incorrectly.

Example
The reader has to track useEncryptedInline, useCuratedCatalog, catalogCommitment, privateRoots.length, stagingQuads, and effectiveByteSize across a very large method before understanding which ACK request is valid. The new V10ACKMode could make that one explicit value instead.

Suggested direction
Use the new V10ACKMode abstraction more aggressively: compute the ACK request once in a focused helper, then let the orchestration call the provider once. This is a good opportunity to delete branching from the god method rather than adding another local decision tree.

For Agents
In packages/publisher/src/dkg-publisher.ts, extract a pure helper such as buildV10ACKProviderParams(...) or resolveV10ACKMode(...) that returns the canonical mode plus required staging/catalog validation. Preserve the three current behaviors and phase/logging, but leave the publish orchestration with a single invokeV10ACKProvider(v10ACKProvider, params) call. Add unit coverage for the helper for public, folded-private, and curated-catalog modes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants