
V2 #29

Open
awcjack wants to merge 361 commits into master from v2


awcjack commented Jun 5, 2026


  • CHANGELOG updated or not needed
  • Documentation updated or not needed
  • Haddocks updated or not needed
  • No new TODOs introduced or explained hereafter

v0d1ch and others added 30 commits April 1, 2026 10:18
  The old dumpTrace used `say`, which in an IO context dumps the entire
  IOSim trace (potentially thousands of JSON lines) to stdout on test
  failure. With slow-network tests generating much larger traces, this
  was very noisy. Now the trace is written to a temp file and only the
  path is printed.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  When CommitFinalized (or DecommitFinalized) arrives while a snapshot is
  in RequestedSnapshot state, the in-flight ReqSn carries the old version
  and will be rejected with ReqSvNumberInvalid once the version bumps.
  Since nothing re-triggers a fresh request, the head gets permanently
  stuck with pending localTxs.

  The guard added in ef0762b used snapshotInFlight which returns True
  for both RequestedSnapshot and SeenSnapshot. This was correct for
  SeenSnapshot (AckSns in-flight, snapshot will complete naturally) but
  too broad for RequestedSnapshot (stale echo will be rejected).

  Replace the guard with isCollectingAcks which only blocks SeenSnapshot,
  allowing CommitFinalized/DecommitFinalized to immediately re-request
  with the new version when in RequestedSnapshot state.
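
The difference between the two guards can be sketched as follows. This is a minimal illustration, not the actual Hydra code: the `SeenSnapshot` type here is a simplified stand-in whose constructors carry placeholder `Int` fields, and `mayReRequest` is a hypothetical helper name.

```haskell
-- Simplified stand-in for the head's snapshot-tracking state (assumption:
-- the real constructors carry richer fields than these placeholder Ints).
data SeenSnapshot
  = NoSeenSnapshot
  | LastSeenSnapshot Int
  | RequestedSnapshot Int Int
  | SeenSnapshot Int
  deriving (Eq, Show)

-- Old guard: True for both in-flight states.
snapshotInFlight :: SeenSnapshot -> Bool
snapshotInFlight s = case s of
  RequestedSnapshot{} -> True
  SeenSnapshot{} -> True
  _ -> False

-- New guard: True only while AckSns are being collected.
isCollectingAcks :: SeenSnapshot -> Bool
isCollectingAcks s = case s of
  SeenSnapshot{} -> True
  _ -> False

-- Hypothetical helper: may a version bump immediately re-request a snapshot?
mayReRequest :: SeenSnapshot -> Bool
mayReRequest = not . isCollectingAcks
```

With the old guard, `RequestedSnapshot` blocked the re-request even though the stale request was going to be rejected; with the new guard only `SeenSnapshot` blocks it.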

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
…shot

  Rename "extracts correct snapshot number" to reflect that the test now
  verifies SeenSnapshot is preserved (not reset to LastSeenSnapshot) so
  AckSns can still be collected after DecommitFinalized arrives.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
…ck handlers

  ReqTx, OnDecrementTx, and the rollback repost path were passing unfiltered
  pendingDeposits, allowing deposits from other heads to be picked up when
  selecting the next deposit for ReqSn. Apply depositsForHead consistently,
  matching the existing filtering already done in ReqSn, AckSn, and onOpenChainTick.
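
A rough sketch of the filtering idea, assuming pending deposits are kept in a map keyed by transaction id. The `HeadId`, `TxId` and `Deposit` types and the `nextDepositFor` helper are illustrative placeholders, not the project's definitions.

```haskell
import qualified Data.Map.Strict as Map

-- Placeholder types for illustration only.
newtype HeadId = HeadId String deriving (Eq, Show)
newtype TxId = TxId Int deriving (Eq, Ord, Show)

data Deposit = Deposit
  { depositHeadId :: HeadId
  , depositLovelace :: Integer
  }
  deriving (Eq, Show)

-- Keep only the deposits that belong to the given head.
depositsForHead :: HeadId -> Map.Map TxId Deposit -> Map.Map TxId Deposit
depositsForHead ourHeadId = Map.filter (\d -> depositHeadId d == ourHeadId)

-- Hypothetical selection step: pick the smallest key among our own deposits.
nextDepositFor :: HeadId -> Map.Map TxId Deposit -> Maybe TxId
nextDepositFor ourHeadId = fmap fst . Map.lookupMin . depositsForHead ourHeadId
```

Selecting from the unfiltered map would return whichever deposit sorts first, regardless of which head it was made for.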

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  Fetch and patch peer-snapshot.json for Preproduction (in addition to
  Mainnet and Preview) so the node can bootstrap peers and start syncing
  within the test's 10-second window.

  Also update Preview and Preproduction config paths from environments-pre/
  to environments/ to match the current layout on book.world.dev.cardano.org,
  and increase the CommitRecovered wait timeout in canSeePendingDeposits to
  20 block times to tolerate slower networks.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
…tnet

  On multi-era chains, Byron slots are 20s each while Shelley+ slots are 1s.
  Using fixedEpochInfo for L2 Globals produces wrong POSIXTime values in the
  Plutus ScriptContext, which can cause time-sensitive scripts (Close, Contest,
  Fanout) to fail. For online (Cardano) mode, query the chain's EraHistory and
  use it to build an era-aware EpochInfo via newGlobalsWithEraHistory. Offline
  mode keeps fixedEpochInfo since it runs a single-era devnet.
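
To see why a fixed slot length gives wrong times on a multi-era chain, here is a small worked sketch. The `Era` record and the era boundaries are made up for illustration (a 100-slot Byron-like era followed by a 1-second era); it does not use the real EraHistory API.

```haskell
-- One era summary: first slot, start time (seconds since system start),
-- and slot length in seconds. Illustrative type, not the ledger's.
data Era = Era
  { eraStartSlot :: Integer
  , eraStartTime :: Integer
  , eraSlotLength :: Integer
  }

-- Fixed conversion: every slot is assumed to have the same length.
fixedSlotToSeconds :: Integer -> Integer -> Integer
fixedSlotToSeconds slotLen slot = slot * slotLen

-- Era-aware conversion: find the era containing the slot and offset from
-- that era's start. Assumes the list is sorted by ascending start slot.
eraAwareSlotToSeconds :: [Era] -> Integer -> Maybe Integer
eraAwareSlotToSeconds eras slot =
  case takeWhile (\e -> eraStartSlot e <= slot) eras of
    [] -> Nothing
    es ->
      let e = last es
       in Just (eraStartTime e + (slot - eraStartSlot e) * eraSlotLength e)

-- Made-up history: 100 slots of 20s, then 1s slots starting at t = 2000s.
exampleEras :: [Era]
exampleEras = [Era 0 0 20, Era 100 2000 1]
```

For slot 150 in this made-up history, the era-aware conversion gives 2000 + 50 = 2050 seconds, while a fixed 1-second slot length gives 150 seconds; a script comparing against a deadline would see a very different time.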

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  The POST /commit endpoint no longer exists, and the test referenced
  the removed CommittedTooMuchADAForMainnet error constructor.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  When two heads share the same network, nodes observe chain events for
  all heads. Three aggregate cases were applying head-specific state
  without checking that the event belonged to the currently open head:

  - DepositActivated: could set currentDepositTxId in HEAD 1 to a
    deposit txid from HEAD 0, causing onOpenChainTick to never fire
    ReqSn (isNothing currentDepositTxId = False)
  - DepositRecovered: could clear currentDepositTxId in the wrong head
  - CommitFinalized: could corrupt version, localUTxO and reset
    currentDepositTxId in the wrong head

  Add `deposit.headId == ourHeadId` / `headId == ourHeadId` guards in
  each case. The pendingDeposits cleanup in DepositRecovered remains
  unconditional since that map is node-level and tracks all heads.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  Add unit tests verifying that DepositActivated, DepositRecovered, and
  CommitFinalized StateChanged events carrying a foreign headId do not
  corrupt the CoordinatedHeadState of the currently open head.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
👷‍♂️ - Fix head getting permanently stuck when
CommitFinalized or DecommitFinalized bumps the snapshot version while a
ReqSn echo is still in-flight — only blocks re-request when AckSns are
actively collecting (isCollectingAcks), not during RequestedSnapshot.
👷‍♂️ - Fix deposit activated while a snapshot is
in-flight being silently dropped — the next chained snapshot picks it up
via selectNextDeposit, and DepositActivated now sets currentDepositTxId
if unset.
👷‍♂️ - Fix deposits from other heads being selected
for ReqSn in ReqTx, OnDecrementTx, and rollback repost handlers —
depositsForHead is now applied consistently in all head-level handlers.
👷‍♂️ - Guard deposit aggregate cases by headId to
prevent one head's deposits from corrupting another head's state when
multiple heads share the same network.
👷‍♂️ - Fix Plutus script evaluation on
mainnet/testnet: L2 ledger Globals now uses era-aware EpochInfo (queried
from chain) instead of fixedEpochInfo, ensuring correct POSIXTime values
in Plutus ScriptContext for time-sensitive scripts.
👷‍♂️ - Fix Preproduction node not syncing due to
missing peer-snapshot.json bootstrap and stale config paths.
👷‍♂️ - Remove the hard-coded 100 ADA commit limit
on mainnet.
👷‍♂️ - Remove the GET /head-initialization
endpoint.


---

* [x] CHANGELOG updated or not needed
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter
…g#2550)

Fixes cardano-scaling#2388

Skips commenting in the CI job `tx-cost-diff` for PRs from external contributors; always uploads an artifact, so anyone can view the difference (just without the convenience of a GitHub comment).

The CI job `tx-cost-diff` tries to add a comment on the PR, which fails
for PRs from external contributors
([example](https://github.com/cardano-scaling/hydra/actions/runs/23140532889/job/67572643885?pr=2547))
with:

```text
403 Resource not accessible by integration
```

This change follows the same pattern as in `ci-nix.yaml` job
`publish-benchmark-results`:


https://github.com/cardano-scaling/hydra/blob/b7ad7a8b26991fab350a7403745cc07503ba9a9c/.github/workflows/ci-nix.yaml#L128-L128

If CI passes for this PR, it means the fix works.


---

* [x] CHANGELOG updated or not needed (CI-only change)
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter

---------

Co-authored-by: Noon <noon.vandersilk@iohk.io>
---

* [x] CHANGELOG updated or not needed
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
Bump the limit for waiting on hydra-node connection

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
v0d1ch and others added 30 commits June 1, 2026 21:42
Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
Also reduce a diff in the test code

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
… merge

  - HeadLogic: FanoutInProgress case must convert Set (TxOutType tx) to
    UTxOType tx via filterUTxOByOutputs/computeFullFanoutUTxO before passing
    to FinalPartialFanoutTx.utxoToDistribute

  - State/Handlers: remove non-existent splitUTxOAt and sizeUTxO (were
    introduced in the branch but never added to Hydra.Tx); inline the UTxO
    split using take/drop on UTxO.toList, and replace sizeUTxO with UTxO.size

  - HeadLogicSpec: remainingUTxO field renamed to remainingOutputs (master's
    name from the partial-fanout PR); replace removed PartialFanoutTx
    PostChainTx constructor with FinalPartialFanoutTx

  - HandlersSpec/StateSpec: add missing imports (finalPartialFanout,
    unsafePartialFanout, UTxO); rewrite splitUTxOAt tests with inline splits

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  Extend TinyWallet with isTxWithinSizeLimits to check serialised tx
  byte size against ppMaxTxSizeL from current protocol parameters.

  Replace the linear scan in findFittingFanoutTx with a binary search
  (findLargestFitting) that finds the largest chunk that fits within
  both size and script execution limits. A short-circuiting fitsTx
  check runs the cheap size check before the expensive UPLC evaluation.
  Structural failures from partialFanout abort the search immediately
  since they are independent of chunk size.

  Both functions are extracted as testable top-level exports and covered
  by property tests: fitsTx tests verify short-circuit behaviour and
  correct result combination using real Cardano protocol parameters and
  evaluateTx; findLargestFitting tests verify the monotone-predicate
  property and the O(log n) evaluation bound using the built-in
  Monad instance for (,) (Sum Int).
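
The search and the call-counting trick can be sketched as below. This is a simplified version over plain `Int` chunk sizes, assuming a monotone predicate; the real function works on fanout transactions and its signature may differ. `countingFits` is a hypothetical predicate used only to count evaluations.

```haskell
import Data.Monoid (Sum (..))

-- | Largest n in [lo, hi] for which the monadic predicate holds, assuming
-- the predicate is monotone (True up to some bound, False beyond it).
findLargestFitting :: Monad m => (Int -> m Bool) -> Int -> Int -> m (Maybe Int)
findLargestFitting fits = go Nothing
 where
  go best lo hi
    | lo > hi = pure best
    | otherwise = do
        let mid = lo + (hi - lo) `div` 2
        ok <- fits mid
        if ok
          then go (Just mid) (mid + 1) hi -- mid fits: try larger chunks
          else go best lo (mid - 1) -- mid does not fit: try smaller chunks

-- Hypothetical predicate that records one evaluation per call, using the
-- writer-like Monad instance of (,) (Sum Int) from base.
countingFits :: Int -> Int -> (Sum Int, Bool)
countingFits limit n = (Sum 1, n <= limit)
```

Running `findLargestFitting (countingFits 37) 1 100` in the `(,) (Sum Int)` monad returns both the accumulated call count and the result, so a property test can check the logarithmic bound without any mutable counter.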

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  Remove the two duplicate binarySearch helpers from
  computePartialFanOutNominalCost and computePartialFanOutMixedCost and
  replace them with the shared findLargestFitting from
  Hydra.Chain.Direct.Handlers. Also use fanoutChunkSize from fixture in
  the membership proof bench group name instead of a hardcoded literal.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  - fitsTx: verify short-circuit (size check before UPLC eval), failure
    on budget/script errors, success path, and a real-world test against
    actual Cardano protocol parameters and evaluateTx
  - findLargestFitting: verify O(log n) call count via pure (Sum Int, a)
    monad, correct propagation of mkTx/fitsCheck exceptions, and
    monotone correctness via round-trip
  - UTxO splitting: replace three fixed-value unit tests with one property
    test covering normal split, n-exceeds-size, and empty-UTxO cases,
    with cover thresholds to enforce all branches are exercised

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
…aybe

  Eliminates the `either (const Nothing) Just` conversion at each call site
  by changing the preferred-tx parameter from `Maybe Tx` to `Either e Tx`
  and pattern matching on `Right`/`Left` in `findBest`.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
  When partialFanout returns Left during the binary search fallback the
  error was silently discarded. Now emits PartialFanoutFailed with the
  reason before throwing StalePartialFanoutTx, making unrecoverable
  failures visible in the node logs.

Signed-off-by: Sasha Bogicevic <sasha.bogicevic@iohk.io>
- Remove all hardcoded fanout chunk size and threshold constants; fanout
sizing is now fully dynamic
- findFittingFanoutTx uses a binary search (findLargestFitting) to find
the largest UTxO chunk that fits within the protocol size limit and
script budget, checking both tx byte size and UPLC execution units
- findLargestFitting is a general-purpose monadic binary search exported
from Handlers.hs and reused in the tx-cost benchmark (replacing a local
copy)
- Property tests added for fitsTx (short-circuit, failure modes, real
Cardano protocol parameters), findLargestFitting (O(log n) call count,
exception propagation, monotone correctness), and UTxO splitting
(normal/overflow/empty cases with cover thresholds)


---

* [x] CHANGELOG updated or not needed
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter
…no-scaling#2627)

The 90% pumba test was failing because of missed etcd writes.

This fixes the flakiness on the pumba network tests by being more
resilient in how it writes keys to etcd. I think it's reasonably
uncontroversial; it seems like a genuinely useful change, and hopefully
means we can trust our tests a little more.
  The SQLite event store accumulates events from all head lifecycles without
  rotation. On restart, `aggregate` replays every persisted StateChanged event
  but unlike the live `handleChainInput` path, never checked that an event's
  headId matched the current state — so a HeadClosed or HeadFannedOut event
  from a previous head could silently drive an unrelated Open/Closed head into
  the wrong state.

  Fix: add `eventHeadId` and `headIdOf` helpers and a single pre-check at the
  top of `aggregate`. Any event whose headId does not match the current state's
  headId is silently dropped. Events that carry no headId are always applied,
  preserving existing behaviour for TransactionReceived, ChainRolledBack, etc.
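
The pre-check described above might look roughly like this. The `Event` and `HeadState` types are heavily reduced placeholders (the real StateChanged type has many more constructors), and `step` stands in for the existing per-event logic.

```haskell
newtype HeadId = HeadId String deriving (Eq, Show)

-- Reduced placeholder for the persisted StateChanged events.
data Event
  = HeadClosed HeadId
  | HeadFannedOut HeadId
  | TransactionReceived String
  deriving (Eq, Show)

-- Reduced placeholder for the head state.
data HeadState
  = Idle
  | Open HeadId
  | Closed HeadId
  deriving (Eq, Show)

-- Head id carried by an event, if any.
eventHeadId :: Event -> Maybe HeadId
eventHeadId ev = case ev of
  HeadClosed h -> Just h
  HeadFannedOut h -> Just h
  TransactionReceived _ -> Nothing

-- Head id of the current state, if any.
headIdOf :: HeadState -> Maybe HeadId
headIdOf st = case st of
  Idle -> Nothing
  Open h -> Just h
  Closed h -> Just h

-- Drop events whose head id disagrees with the state's; apply the rest.
aggregate :: HeadState -> Event -> HeadState
aggregate st ev
  | mismatched = st
  | otherwise = step st ev
 where
  mismatched = case (headIdOf st, eventHeadId ev) of
    (Just ours, Just theirs) -> ours /= theirs
    _ -> False

-- Stand-in for the existing transition logic.
step :: HeadState -> Event -> HeadState
step (Open h) (HeadClosed _) = Closed h
step (Closed _) (HeadFannedOut _) = Idle
step st _ = st
```

Replaying `HeadClosed` from an old head against `Open` of a newer head leaves the state untouched, whereas the same event with a matching id still closes it.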

  Regression tests cover the two critical cross-phase transitions:
  Open→Closed via HeadClosed and Closed→Idle via HeadFannedOut from a
  mismatched head.
  Move the cross-head contamination guard from the HeadState-only
  `aggregate` function into `aggregateNodeState`, so that the entire
  NodeState (including pendingDeposits) is protected on headId mismatch.

  This removes the redundant inline `| headId == ourHeadId` guards from
  the DepositRecovered and CommitFinalized cases, which previously still
  modified pendingDeposits in their fallthrough branch even when the
  headId didn't match.

  Also makes `eventHeadId` fully explicit — no wildcard catch-all — so
  every StateChanged constructor is deliberately accounted for. Events
  with a head-specific headId now return Just headId; IgnoredHeadInitializing
  returns Nothing because its headId is the other head's id, not ours.
cardano-scaling#2618)

fix cardano-scaling#2605

The SQLite event store accumulates events from all head lifecycles
without rotation. On restart, `aggregate` replays every persisted
StateChanged event but, unlike the live `handleChainInput` path, never
checked that an event's headId matched the current state, so a
HeadClosed or HeadFannedOut event from a previous head could silently
drive an unrelated Open/Closed head into the wrong state.

---

* [x] CHANGELOG updated or not needed
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter
  The root cause was in confirmedUTxO inside onOpenNetworkReqSn: it always
  included utxoToCommit unconditionally. When a new ReqSn with no deposit
  arrived after recovery, the recovered deposit was still part of the base
  UTxO used to apply transactions, and with no withoutUTxO to strip it
  back out, it was baked into the new snapshot's utxo field.
…ing#2630)

Resolves cardano-scaling#2629

The root cause was in `confirmedUTxO` inside `onOpenNetworkReqSn`: it
always included `utxoToCommit` unconditionally. When a new `ReqSn` with
no deposit arrived after recovery, the recovered deposit was still part
of the base UTxO used to apply transactions, and with no `withoutUTxO`
to strip it back out, it was baked into the new snapshot's utxo field.

---

* [ ] CHANGELOG updated or not needed
* [ ] Documentation updated or not needed
* [ ] Haddocks updated or not needed
* [ ] No new TODOs introduced or explained hereafter
Co-authored-by: Noon <noon.vandersilk@iohk.io>
…g#2632)

---

* [x] CHANGELOG updated or not needed
* [x] Documentation updated or not needed
* [x] Haddocks updated or not needed
* [x] No new TODOs introduced or explained hereafter
…cardano-scaling#2633)

---

* [ ] CHANGELOG updated or not needed
* [ ] Documentation updated or not needed
* [ ] Haddocks updated or not needed
* [ ] No new TODOs introduced or explained hereafter

---------

Co-authored-by: Noon van der Silk <noon.vandersilk@iohk.io>
…ardano-scaling#2634)

I noticed the Publish Docs workflow has been failing on master lately.

The error is:

```text
cp: cannot stat 'result/build/*': No such file or directory
```

The workflow does `nix build .#docs-unstable` and then `cp result/build/* $out -r`. The nix build succeeds but `result/build/` doesn't exist.
 
The workflow gets its output by running `yarn pack` to create a tarball, then extracting it to `$out`.
Yarn respects `.gitignore`, and `docs/.gitignore` lists exactly our `/build` folder, so that folder is left out of the tarball.

This was not an issue before because the derivation was previously served from Cachix. The cache was invalidated by the `docs.nix` refactor in cardano-scaling#2609, which forced a fresh rebuild that exposed the issue.

The fix adds a `postInstall` hook that copies `build/` from the nix build sandbox directly into `$out/build/`.

---

* [ ] CHANGELOG updated or not needed
* [ ] Documentation updated or not needed
* [ ] Haddocks updated or not needed
* [ ] No new TODOs introduced or explained hereafter
- Add pull_request trigger for PRs targeting master branch
- Tag PR builds as pr-<number> for easy identification
- Use PR head SHA as version for traceability

6 participants