fix: audit findings round final 3#846
Conversation
…iting off-chain (#808)
…ated + backfill unsigned receiver states (#803)
…TxID (#807) ## Finding (MF3-L01) Each transition stores a `txID` pointing to its transaction. The ID `NewTransactionFromTransition` generated did **not** match that `txID` for `HomeDeposit`, `TransferSend`, and `TransferReceive` (and, missed by the finding, `EscrowDeposit`). ## Root cause Two ID conventions collided: - **Transition `TxID`** (set by `Apply*`, validated in `state_advancer.go` via `Equal`) = `getTransactionID(transition.AccountID, stateID)`. - **Transaction `ID`** (recomputed in `NewTransactionFromTransition`) = `getTransactionID(toAccount|fromAccount, stateID)`. They matched only when `AccountID` equalled `toAccount`/`fromAccount`. They diverged for: | Transition | transition.TxID | old recomputed ID | |---|---|---| | HomeDeposit | (HomeChannelID, sID) | (UserWallet, sID) | | EscrowDeposit | (EscrowChannelID, sID) | (UserWallet, sID) | | TransferSend | (recipient, sID) | (UserWallet, receiverID) | | TransferReceive | shares Send's TxID | — | TransferSend/Receive share one TxID (`channel_v1/handler.go:108`), so a transfer must produce a single transaction referenced by both. The old code produced an ID neither transition referenced → dangling `tx_id`. ## Fix `NewTransactionFromTransition` now uses `transition.TxID` directly as the transaction ID (rejecting empty). The `TxID` is canonicalised and validated in the state advancer, so it is the single source of truth and cannot be forged by a client. Fixes all four cases by construction and drops the fragile parallel recompute. - `pkg/core/types.go` — use `transition.TxID`. - `nitronode/store/database/transaction.go` — doc comment updated to the new convention. - `pkg/core/types_test.go` — assert `tx.ID == transition.TxID` for HomeDeposit / EscrowDeposit / TransferSend, plus empty-TxID rejection. ## Notes - Pre-fix rows carry the old (wrong) IDs. No backfill included — history intentionally out of scope. ## Testing - `go test ./pkg/core/... ./nitronode/...` — pass - `go vet` — clean 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Summary `SubmitDepositState()` synthesized `core.NewVoidState()` when `GetLastUserState()` returned `nil` and continued validation. But deposit is **not** a channel initialization path: after `CheckActiveChannel()` confirms an active `Void`/`Open` home channel, the `channel_states` row for `(user_wallet, asset)` is guaranteed to exist (home channels are created atomically with their first state in `request_creation.go`). A `nil` result there means the local-state materialization is inconsistent. Continuing from a synthetic void state evaluates a `Commit` against state history that was never persisted, yielding a misleading downstream validation error instead of a clean missing-state rejection. ## Changes - Replace the `NewVoidState()` fallback with an explicit `last user state not found` rejection. - `EnsureNoOngoingStateTransitions` now runs unconditionally — `currentState` is guaranteed non-nil past the guard. - Add `TestSubmitDepositState_MissingLastUserState_Rejected`: active channel passes but `GetLastUserState` returns nil → clean rejection, no advancement past the guard. ## Impact Hardening / fail-fast. The void path was unreachable under correct operation (no exploit, no incorrect accept) — worst prior consequence was a confusing error on DB inconsistency. ## Testing - `go test ./nitronode/api/app_session_v1/` — pass - `go vet ./nitronode/api/app_session_v1/` — clean 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Finding (MF3-L11) `WebsocketDialer.Call` registered the response sink **before** marshalling the request. If marshalling failed (malformed raw JSON in the payload), the function returned `ErrMarshalingRequest` without removing the sink. Repeated direct calls to the low-level `WebsocketDialer.Call` API with malformed payloads and unique request IDs could grow `responseSinks` unboundedly, degrading or crashing the calling process. SDK methods are not affected — they build payloads via `NewPayload` before reaching `Call`. ## Fix Move `json.Marshal` ahead of the connection check and sink registration in `Call`. A marshal failure now returns before any sink is registered, so nothing can leak. ## Test `pkg/rpc/dialer_internal_test.go` — white-box test (no server needed, since marshalling now precedes the connection check): - 100 calls with malformed `json.RawMessage` payloads and unique request IDs - asserts each returns `ErrMarshalingRequest` - asserts `responseSinks` is empty afterward The test locks in the reordering: with the old order a disconnected dialer returned `ErrNotConnected` before reaching the marshal, so it would fail. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remediates the MF3-I* batch of audit findings (informational severity). Each finding is its own commit so they can be reviewed independently. ## Findings **MF3-I01 — `MAX_SESSION_KEYS_PER_USER=0` silently disables the per-user cap** The cap is skipped whenever `maxSessionKeysPerUser <= 0`, but that wasn't documented and `0` could be read as "invalid" or "zero keys allowed". Documented `<= 0` as the unlimited mode on the config field and in the app-session submit README. Kept it as a documented behavior rather than a startup-reject, since that matches the existing `WsBytesPerSec` "<0 to disable" / `MaxSignedUpdates` `0` conventions and avoids breaking anyone already running `0`. **MF3-I02 — duplicate asset symbols accepted case-insensitively** `verifyAssetsConfig()` accepted both `usdt` and `USDT`, which is ambiguous because the in-memory registry matches symbols case-sensitively while the persistence layer lowercases them. Now tracks enabled symbols by their lowercased form and fails startup on a case-insensitive (or exact) duplicate. Disabled assets are skipped; request-time exact-symbol semantics are unchanged. **MF3-I03 — stale `err` in missing-user-signature error** The missing-`user_sig` guard formatted an `err` that is always `nil` at that point, polluting the message with `<nil>`. Dropped the format arg. **MF3-I04 — nil-deref in `NewTransactionFromTransition` release branch** The `release` case dereferenced `receiverState.UserWallet` before the nil check, so a nil receiver state panicked instead of returning an error. Moved the nil check ahead of any field access. No RPC path passes nil today, but the helper is now safe regardless of caller. Added a regression test. **MF3-I05 — pagination offset can wrap negative on 32-bit** `GetOffsetAndLimit` returned a caller-controlled `uint32` offset that every store query converts via `int(offset)` before GORM's `Offset()`; on a 32-bit target a large offset wraps negative and GORM treats it as "no offset", silently returning the first page. Clamped offset to `MaxInt32` in `GetOffsetAndLimit`, then routed the handlers that still computed offset/limit by hand through the same helper so the clamp is enforced everywhere (`channels.v1.get_channels`, `channels.v1.get_last_key_states`, `app_sessions.v1.get_last_key_states`). An explicit `limit: 0` still falls back to the endpoint default so page-count math never divides by zero. **MF3-I07 — session-key revocation mislogged** Both `SubmitSessionKeyState` handlers logged a revocation whenever the submitted `expires_at` was not in the future, ignoring whether the previous state was active — so inactive-to-inactive updates (and already-expired new versions) were logged as revocations. Now gated on the real transition: `revoked := prevActive && !submittedActive`. **MF3-I08 — unvalidated channel / app-session IDs** `GetAppSessions` accepted a non-nil but empty `app_session_id`, which the store treated as "no filter", silently returning an unfiltered paginated list instead of honoring the documented either-id-or-participant contract. Added `core.HashRegex` + `core.IsValidHash` (a `0x`-prefixed 32-byte hash — the canonical form of channel and app-session IDs), mirroring `app.IsValidApplicationID`, and reject malformed IDs at the RPC boundary in `GetAppSessions` and `GetEscrowChannel`. `HomeChannelID` is left as-is — it has no bare-id read endpoint and is already verified by exact derivation match or existence lookup. **MF3-I09 — `UpdateChannel` rewrites immutable config** `UpdateChannel` rewrote immutable config (`blockchain_id`, `token`, `nonce`) on every lifecycle update, widening the blast radius if a future caller passes a partial/stale `core.Channel`. Restricted it to mutable lifecycle fields (`status`, `state_version`, `challenge_expires_at`, `updated_at`); immutable config is set once in `CreateChannel`. ## Test plan - `go build ./...` - `go test ./pkg/core/... ./nitronode/store/database/... ./nitronode/api/channel_v1/... ./nitronode/api/app_session_v1/... ./nitronode/event_handlers/...` - `go vet ./...` on touched packages 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Session-key revocation logging now only fires when a key actually transitions from active to inactive * Error message for missing state signature corrected * Prevented a potential crash when handling a specific transaction case * **New Features** * Validation added for session and channel identifiers to reject malformed IDs * **Improvements** * Centralized, more consistent pagination and safer offset handling across endpoints * Asset symbol validation now treats duplicates as case-insensitive * Channel updates no longer modify immutable configuration fields * **Documentation** * Clarified per-user active session-key cap semantics and disabling behavior <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…te (#811) ## What Remediation for **MF3-L04** — `SubmitDepositState()` only partially validated `AppStateUpdate.Allocations`, so an accepted deposit update could carry a non-canonical or incomplete allocation set relative to the node's actual ledger. Two gaps: 1) the participant check sat *inside* the amount-increase branch, so a non-participant could be included with a zero amount and bypass it; 2) the final completeness check skipped `asset == userState.Asset`, so existing nonzero allocations for the deposited asset could be omitted from the signed update. Neither moves or erases funds (omitted balances aren't modified, zero allocations produce no ledger entries) — but the signed deposit state could diverge from the ledger, which is misleading for any client/app/verifier treating it as a full allocation snapshot. ## Changes - Hoist the participant check to the top of the allocation loop so it runs regardless of amount — non-participant allocations are now rejected even at zero amount. - Drop the deposited-asset skip in the completeness check. Every existing nonzero allocation must now be present in the update: - **deposited asset** — may only grow (the deposit itself; the increase is still validated and recorded in the forward loop), never shrink; - **all other assets** — must match the current state exactly (unchanged). Valid deposit increases keep working; the signed allocation set becomes a complete, canonical snapshot. ## Tests - `NonParticipantZeroAllocation_Rejected` — zero-amount non-participant allocation is rejected, no ledger entry recorded for it. - `MissingDepositedAssetAllocation_Rejected` — omitting another participant's existing balance in the deposited asset is rejected, session is not updated. All `app_session_v1` tests pass; `go vet` clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Improved participant validation during deposit state submission to reject invalid allocations earlier in processing * Strengthened balance verification to ensure all existing asset allocations are accounted for in updates * Enforced stricter requirements for deposited asset allocations to maintain non-decreasing amounts <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Maharshi Mishra <ihsraham27@gmail.com>
…806) ## Summary Removes the app-session **rebalancing** feature end to end. The endpoint was already disabled in every deployed environment (`NITRONODE_MAX_SIGNED_UPDATES=0`, route gated on `>= 2`), so this is effectively dead-code cleanup at runtime plus full removal of the API surface, SDK methods, types, and docs. ## Changes by layer **nitronode** - Delete `rebalance_app_sessions.go` handler + its 1358-line test - Drop `MaxSignedUpdates` / `MaxRebalanceSignedUpdates` config and the route gate; remove `NITRONODE_MAX_SIGNED_UPDATES` from the 3 helm configs - Fix `NewHandler` signature + 35 call sites **pkg/** - `rpc`: remove `rebalance_app_sessions` method const, request/response types, client wrapper - `app`: remove `AppStateUpdateIntentRebalance`, `AppSessionVersionV1`, `GenerateRebalanceBatchIDV1`, `GenerateRebalanceTransactionIDV1` - `core`: remove `TransactionTypeRebalance` (42) **SDKs** - `sdk/go`, `sdk/ts`: drop client methods, types, packing helpers, enum members, examples, READMEs - Regenerate `rpc-drift` and `public-api-drift` snapshots **Docs / misc** - `docs/api.yaml`, `docs/data_models.mmd`, `llms-full.txt`, MCP categorizer regex, playground history view ## Notes - The intent (`4`) and transaction-type (`42`) enum values were **terminal**, so removal shifts no other wire values. - `TransactionTypeRebalance=42` is removed entirely — pre-existing ledger rows tagged `42` (if any) now render as `unknown`. Chosen deliberately over keeping a display-only mapping. ## Verification - `go vet ./...` clean · `go test ./...` all pass - TS SDK: 184 tests pass, snapshots regenerated · `tsc` + ts-compat typecheck OK - MCP build OK · playground typecheck OK 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **Breaking Changes** * Removed `rebalance_app_sessions` RPC endpoint for atomic multi-session rebalancing * Removed rebalance methods from Go and TypeScript SDKs * Removed "rebalance" as a supported transaction type and app state update intent * Removed related configuration parameters * **Documentation** * Updated API specifications, SDK documentation, and deployment guides to reflect removal of rebalancing functionality <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: nksazonov <nsazonov@openware.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ction allowances, and user staking (#810) ## Summary Removes three interlinked, unused subsystems that were wired together through the action gateway. None were reachable by default (both `NITRONODE_APP_REGISTRY_ENABLED` and `NITRONODE_ACTION_LIMITS_ENABLED` defaulted to off), and none are planned. **Action allowances (rate limiting)** - `nitronode/action_gateway/` package, `action_log` store + model - `core.GatedAction` / `core.ActionAllowance` + the `TransitionType` / `AppStateUpdateIntent` `.GatedAction()` mappers - all `AllowAction` call sites, and the `user.v1.get_action_allowances` RPC method **App registry** - `nitronode/api/apps_v1/`, the `app` store + model, `pkg/app` registry types (`AppV1`/`AppInfoV1`/`PackAppV1`) - `apps.v1.*` RPC methods, and the gated `GetApp` lookup in `create_app_session` **User staking** - on-chain locking reactor/client/ABI (Go + TS), `user_staked` store + model - `UpdateUserStaked` / `HandleUserLockedBalanceUpdated`, `LockingContractEventHandler*`, `UserLockedBalanceUpdatedEvent` - `LockingContractAddress` across `core` → `rpc` → `node.v1.get_config` → both SDKs **Cross-cutting cleanup** - dependent surfaces in `sdk/go`, `sdk/ts`, `sdk/ts-compat`, and `cerebro` - `docs/api.yaml` schemas + methods, drift guards + snapshots - helm chart + config schema, the two feature-flag env vars - now-dead helpers: `AppRegistryClient`, `MaxAppMetadataLen`, `AppIDV1Regex` - new goose migration `20260603000000_drop_app_registry_staking_action_log.sql` (drops `apps_v1`, `action_log_v1`, `user_staked_v1`; `Down` recreates them) **Scope:** 119 files, +181 / −9507. ## Notes - App sessions are unaffected — `ApplicationID` stays a plain string field (the registry lookup was gated and is now gone). - `blockchain_config` now requires `channel_hub_address` (a locking-only blockchain is no longer valid). Prod/sandbox `blockchains.yaml` have no locking entries, so no deploy config breaks. - `sdk/mcp/content/` is gitignored (regenerated on build); not committed. ##⚠️ Breaking changes - Removes the `apps.v1.*` and `user.v1.get_action_allowances` RPC methods. - Removes `locking_contract_address` from the `node.v1.get_config` response. - Removes the app-registry / action-allowance / on-chain locking surfaces from the Go and TypeScript SDKs → **major version bump** required. ## Test plan - [x] `go build ./...` / `go vet ./...` / `go test ./...` - [x] `sdk/ts` build — 182 tests incl. RPC/ABI/public-API drift guards - [x] `sdk/ts-compat` build + 41 tests - [x] `sdk/mcp` typecheck + content regen - [x] `forge build` 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Removed app registry functionality for application registration and management. * Removed action allowance system for user action tracking and limits. * Removed security token locking and escrow operations. * Removed corresponding CLI commands, API endpoints, and database tables supporting these features. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## MF3-M05: wallet-only session-key revocation ### Problem `SubmitSessionKeyState()` (both app-session and channel paths) treated a submitted `expires_at` not after `now` as a revocation, but still **required and validated `session_key_sig`** before storing. A user could not unilaterally revoke a delegated key: if the session key was lost, unavailable, compromised, or held by a delegate who refused to sign the revocation payload, the user had to wait until the existing `expires_at` passed while the key stayed usable. Requiring `session_key_sig` is correct when activating or extending authority, but on revocation it handed the delegate a veto. ### Fix Split the signature requirement by intent: - **`expires_at > now`** (activation / extension / rotation) — still requires **both** `user_sig` and `session_key_sig`, validated as before. - **`expires_at <= now`** (revocation) — requires **only `user_sig`**. Any `session_key_sig` present is ignored. The packed payload binds `user_address`, `session_key`, `version`, and `expires_at`, so a wallet-only revocation is authorized for exactly that key/version and cannot be replayed. Revocation still flows through `LockSessionKeyState` (ownership binding) and the monotonic version check. ### Changes - **`pkg/app`, `pkg/core`** — new `ValidateAppSessionKeyStateUserSigV1` / `ValidateChannelSessionKeyStateUserSigV1`; the both-signature validators now reuse them. - **nitronode handlers** — branch on `expires_at` to pick the validator. `now` is captured once and reused for the in-transaction cap/version logic so the active/inactive decision is consistent across validation and storage. - **Migration `20260602000000`** — drops the `*_session_key_sig_present_chk` CHECK constraints added in `20260508000000` (they rejected an empty `session_key_sig`). The ownership guarantee is preserved in application code: active rows are still co-signed, and owner/auth lookups filter `expires_at > now`, so revocation rows with an empty signature are never trusted as a key's owner. Down re-adds the constraints as `NOT VALID`. - **SDK** — wallet-only helpers: `RevokeChannelSessionKey` / `RevokeAppSessionKey` (Go), `revokeChannelSessionKey` / `revokeSessionKey` (TS). Updated the docstrings that previously stated wallet-only revocation was unsupported. - **`docs/api.yaml`** — `session_key_sig` documented as optional on the revocation path for both methods. ### Tests - pkg-level unit tests for the new user-sig-only validators (valid passes, wrong wallet rejected, tampered version rejected). - Handler tests per kind: wallet-only revoke succeeds; a mismatched `session_key_sig` is ignored on revoke; an invalid `user_sig` is still rejected on revoke. Existing both-signature, cap, and reactivation tests unchanged and passing. ### Verification - `go build ./...`, `go vet` (affected packages), Go tests for pkg/app, pkg/core, both handlers, sdk/go — pass. - TS typecheck + lint clean; 184 tests pass (public-API drift snapshot updated for the two new methods — intentional). ### Note for reviewers Migration is timestamped `20260602000000`. If it should slot before another unreleased migration, say so and I'll rename. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * Added wallet-only session-key revocation support. New methods `RevokeChannelSessionKey`, `RevokeAppSessionKey` (Go SDK) and `revokeChannelSessionKey`, `revokeSessionKey` (TypeScript SDK) enable simplified revocation without requiring session-key holder signatures. * **Documentation** * Clarified signature requirements for session-key operations: activation/extension requires both wallet and session-key signatures; revocation requires wallet signature only. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Audit finding (MF3-L14) The per-connection request-rate limiter (~10 rps) ran as RPC **middleware** — after a frame was parsed and matched to a registered method. Malformed frames and unknown-method frames return error responses *before* the middleware chain executes, so they never debited the request-rate bucket. A flood of tiny invalid / unknown-method frames was bounded only by the per-connection byte limiter and queue limits — well above the intended request rate, and multiplied across connections. ## Fix Move request-rate limiting to the connection (**frame**) layer, beside the existing byte budget, so every inbound frame is counted *before* parsing and routing — malformed and unknown-method frames included. - Add `RequestTokenBucket` (one token per frame, size-independent) and `CompositeFrameRateLimiter` (admits only if every member admits) to `pkg/rpc`. - Build the per-connection `FrameRateLimiter` as a composite of the byte + request buckets in `runtime.go`; fail fast at startup if `RATE_LIMIT_BURST < 1` while enabled (mirrors the byte-burst check). - Remove `RateLimitMiddleware` and its router/config plumbing. Net **−73 lines** including new tests. ## Behavior changes - **Overrun now closes the connection** (consistent with the byte limiter) instead of returning a `"rate limit exceeded"` RPC error with the connection kept open. Appropriate for an anti-flood control; `burst=20` gives legitimate clients headroom. - Coverage is now **pre-parse** — even the JSON unmarshal cost of invalid frames is avoided once the budget is exhausted. - Config knobs `NITRONODE_RATE_LIMIT_PER_SEC` / `NITRONODE_RATE_LIMIT_BURST` (default 10 / 20) are unchanged — enforced one layer lower. ## Tests - `pkg/rpc/rate_limiter_test.go`: burst / refill / cap / sub-unit-burst for `RequestTokenBucket`; composite reject + nil/empty members. - `go build ./...`, `go vet`, and tests for `pkg/rpc`, `nitronode/api`, `nitronode` all green. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * Added validation for app session and escrow channel identifiers * Introduced request-based rate limiting alongside byte-rate limiting * Enhanced pagination handling with improved limit parameter support * Added case-insensitive duplicate asset symbol detection * **Bug Fixes** * Fixed quorum weight validation to prevent integer overflow * Improved deposit state validation and error handling * Enhanced session key revocation tracking for active-to-inactive transitions * Fixed pagination metadata computation to reject invalid limits * **Improvements** * Migrated rate limiting from RPC layer to WebSocket connection level * Added user state locking during home channel operations * Enhanced off-chain signature backfill on channel opening * Strengthened channel status validation for state transitions <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
) ## Finding Home-channel event handlers read the channel via `GetChannelByID` **before** acquiring `LockUserState`, then mutated/persisted that pre-lock snapshot. `LockUserState` locks the user *balance* row, not the channel row — so a concurrent `submit_state` Finalize could flip the channel `Open→Closing` in the read→lock window. `HandleHomeChannelCheckpointed`'s non-challenged path persists the channel snapshot verbatim (it only reassigns `Status` on the `Challenged` branch). A stale `Open` snapshot therefore overwrote the committed `Closing`, **reopening a finalized channel** — bypassing the C-01 remediation and enabling the post-finalize epoch-rebind / fake-`HomeDeposit` spend chain. ## Fix New `DBStore.LockUserStateForHomeChannel(channelID)`: - Derives the `(wallet, asset)` lock key from the channel **inside SQL** — no pre-lock Go snapshot exists. - Postgres: ensures the balance row, then `SELECT c.* FROM channels c JOIN user_balances b ... WHERE c.channel_id = ? FOR UPDATE OF b` — locks the balance row (same row/strength as `LockUserState`) and reads the channel in one statement. The stale-snapshot window is structurally impossible, not just guarded. - sqlite (tests): resolve → `LockUserState` → read. All four home-channel handlers (`Created`, `Checkpointed`, `Challenged`, `Closed`) now lock-then-read via this method. Escrow handlers are unchanged (different status surface, out of scope). Wired into `core.ChannelHubEventHandlerStore`, `evm.ChannelHubReactorStore`, `database.DatabaseStore` + both mocks. ## Tests - Rewired 25 home-handler tests to the new mock; escrow tests untouched. - `TestHandleHomeChannelCheckpointed_DoesNotReopenFinalizedChannel` — post-lock `Closing` snapshot must persist `Closing`, never reopen to `Open`. - `TestDBStore_LockUserStateForHomeChannel` — returns channel + ensures balance row; nil for missing channel. `go build ./...`, `go vet`, and `go test` (event_handlers, evm, store/database) pass. > Note: the Postgres branch (`FOR UPDATE OF` on the join) is covered by the gated `store/database/test` integration suite, not the default run — same convention as the existing `LockUserState` Postgres branch. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Audit finding MF3-L13 The state validator lets off-chain credit with no `HomeChannelID` bind to any chain/token the user proposes at first channel creation, enforcing only that the **asset symbol** matches (`pkg/core/state_advancer.go:34-55`). Consequently credit can be redeemed from any configured token inventory sharing that symbol, independent of which inventory originally backed it. ## Assessment This is **intended behaviour** of the unified asset model — a unified asset is identified solely by its symbol, and all chain-specific tokens configured under that symbol are fungible 1:1 representations. Leaving the chain/token open until the user binds it is exactly what enables cross-chain redemption. Binding credit to its origin token would remove that feature. The harmful scenario requires the **node operator** to map economically non-equivalent tokens (e.g. a test token and production USDC) under a single symbol. In that case the operator drains their own inventory through their own misconfiguration — no external trust boundary is crossed. Token equivalence cannot be verified programmatically. Severity: informational / config hardening. No validation or provenance change. ## Change Make the operator obligation explicit: - `docs/protocol/security-and-limitations.md` — new **Asset-symbol equivalence** trust assumption. - `nitronode/store/memory/asset_config.go` — equivalence warning on the `AssetConfig` doc comment, cross-referencing the protocol doc. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## MF3-L16 ### Problem `NewErrorPayload()` built the error field by wrapping `errMsg` with `fmt.Sprintf(`"%s"`, errMsg)` and storing it as `json.RawMessage`. Since `json.RawMessage` is treated as already-encoded JSON, embedded quotes or other JSON-special characters in `errMsg` were not escaped during the later websocket response marshal. A client-facing `rpc.Error` containing quotes — e.g. a duplicate-key error mentioning `"app_sessions_v1_pkey"` — produced invalid raw JSON. The websocket node then failed to marshal `ctx.Response` and returned the generic node error instead of the intended application-level error response. ### Fix Encode `errMsg` with `json.Marshal()` in `NewErrorPayload()` and store the returned bytes. The standard encoder escapes quotes, backslashes, and control characters before the value is embedded in the final response. A defensive fallback to `""` is kept (marshaling a Go string cannot fail in practice). ### Tests Added `pkg/rpc/error_test.go` covering embedded quotes, backslashes, newlines, control chars, leading quotes, and empty input. Each case asserts the stored value round-trips to the original string and that the whole payload marshals without error (regression guard for the websocket-marshal failure). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remediates three audit findings in nitronode. All informational/defense-in-depth — no exploit in the current healthy flow; each hardens an implicit invariant to fail closed. ## MF3-I12 — channel session-key asset validation `SubmitSessionKeyState` for channel session keys only length-checked `coreState.Assets`. Now, per asset, `validateSessionKeyAssets` rejects: - **non-canonical** (`asset != strings.ToLower(asset)`) — assets are part of the user-signed metadata hash, so the node rejects rather than rewrites, keeping the signed payload identical to what is stored; - **duplicates** (after normalization); - **unsupported** (`GetAssetDecimals` errors). An empty assets list is accepted only as a revocation/deactivation state: it is rejected when `expires_at` is in the future (it would leave an active current version authorizing no usable asset) and allowed only when `expires_at <= now`. All checks run before signature validation/storage. ## MF3-I13 — reject submit_state on a stateless active channel `SubmitState`'s default branch silently bootstrapped a synthetic `Void` state when no stored state existed for an active home channel. An active channel always has its initial state stored atomically by `request_creation`, so a nil state is an invariant violation, not a first-submit. Now fails closed with `active channel has no stored state`. The `NewVoidState` fallback stays in the channel-creation path. ## MF3-I14 — assert user_balances update affects one row `StoreUserState` checked only the GORM `Error` from the `user_balances` `Updates()` and ignored `RowsAffected`. A caller that stored a state without first calling `LockUserState` would silently match zero rows, leaving `channel_states` and the cached balance divergent without failing the transaction. Now requires exactly one affected row. Tests that used `StoreUserState` without locking are updated to lock-first via a `storeLocked` helper, plus a negative test for the new error. All production call sites already lock-before-store / store-initial-state, so no behavior change in the healthy flow. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
) ## What `app_sessions.v1.submit_session_key_state` only validated array length and lowercasing for `application_ids` and `app_session_ids`. It did **not** validate each ID's format. As a result malformed-but-lowercase IDs passed request validation, reached signature verification, and could persist as inert, unusable scopes (never matching a real application/app-session). This aligns the handler with the validation already applied in `create_app_session` and the RPC `application_id` query-param path. ## Changes - `application_ids`: replaced the lowercase-only loop with `app.IsValidApplicationID()` (regex `^[a-z0-9_-]{1,66}$`, which already enforces lowercase). - `app_session_ids`: added `core.IsValidHash()` (`^0x[0-9a-fA-F]{64}$`) before the lowercase canonical check. `IsValidHash` is case-insensitive, so the lowercase check is kept. - Both checks run **before** signature verification — fail fast at the request boundary. Array-length caps unchanged. - Tests: refactored the two stale lowercase-only tests into a shared `submitStateExpectingError` helper; added format-rejection cases (uppercase / illegal char / over-length application IDs; non-hash / short / no-`0x` / uppercase-hex app-session IDs). ## Severity Low — hardening/consistency, not an exploitable vuln. The request is self-authenticated (both `user_sig` and `session_key_sig` gate persistence, so no injection into another user's state), and DB column constraints (`VARCHAR(66)` / `CHAR(66)`) already reject oversized values. Worst real case is a user signing their own malformed scope that silently never authorizes anything; this rejects it up front instead. ## Test ``` go build ./nitronode/api/app_session_v1/ go vet ./nitronode/api/app_session_v1/ go test ./nitronode/api/app_session_v1/ -run TestSubmitSessionKeyState ``` All green. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nstead of fatal (#827) ## Finding (MF3-H03) `ChannelHubReactor.HandleEvent()` treated any handler error as fatal: it returned before `StoreContractEvent()`, so `Listener.processEvents()` unsubscribed and `main()` exited via `logger.Fatal()`. `depositToNode()` accepts any ERC20 and emits `NodeBalanceUpdated`. `handleNodeBalanceUpdated()` required the token to be configured in the node asset store, so depositing an unconfigured ERC20 was a permissionless, low-cost way to stop nitronode event processing. Since the failing event was never recorded, it replayed on restart — recovery needed manual intervention or a config/code change. ## Fix - New `core.ErrTokenNotSupported` sentinel, returned (wrapped) by the memory asset store's `GetTokenAsset`/`GetTokenDecimals` when a token is not configured. - `handleNodeBalanceUpdated()` checks `errors.Is(err, core.ErrTokenNotSupported)`, logs a warning, skips the balance update, and returns `nil`. `HandleEvent()` then records the event via `StoreContractEvent()`, so it is not replayed, the listener stays subscribed, and the process does not die. - Genuine store errors still propagate — only the deterministic "not configured" case is skipped (typed error, not a catch-all). Out of scope (per finding's optional suggestion): broader listener-level error classification so handler failures don't all collapse to `logger.Fatal`. Can follow up separately. ## Tests - New subtest `unsupported token is skipped and recorded`: asserts `HandleEvent` returns nil, `StoreContractEvent` is called, the balance handler is NOT called, and the processed callback reports success. - `go test ./pkg/blockchain/evm/... ./nitronode/store/memory/...` and `go vet` pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…otion, signature canonicalization note (#828) Addresses three audit findings. ## MF3-L17 — single-participant app sessions (docs) `create_app_session` docs claimed "at least 2 participants required", but `CreateAppSession()` only enforces the *max* participant count and quorum — a single-participant session with `quorum = 1` is valid. We decided to keep single-participant app sessions, so the doc was the divergence. Updated `app_session_v1/README.md` to "at least 1 participant required" to match implementation. ## MF3-I15 — Void channel left un-promoted on checkpoint (fix) `ChannelHub.createChannel()` can emit `ChannelCheckpointed` before `ChannelCreated` for an initial non-deposit/non-withdraw state. If the checkpoint was processed while the local row was still the `Void` seed, `HandleHomeChannelCheckpointed()` bumped `state_version` but left the channel `Void`, violating the protocol invariant until the later `ChannelCreated` event replayed and repaired it. Fix: treat a checkpoint on a `Void` home channel as evidence it is materialized on-chain and promote it to `Open`, backfilling any unsigned concurrent receiver head (mirrors `HandleHomeChannelCreated`). The later `ChannelCreated` then no-ops via the `Status >= Open` guard. Added a regression test. ## MF3-I16 — stored signatures not canonicalized (docs) `channel_states.user_sig` / `node_sig` are unbounded `text` and accepted signatures are persisted verbatim rather than re-encoded after verification. They may retain non-canonical / unused trailing bytes (notably around session-key payloads, and for signatures learned from on-chain events that bypass the WebSocket message-size limit). No asset-safety impact — verification still gates acceptance. Documented under Known Limitations so consumers don't assume stored signature bytes are minimal/canonical. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#830) ## What Fixes the MF3-L18 audit finding: balance-lock acquisition for `transfer_send` transitions happened in request-dependent order, so two opposite-direction transfers of the same asset could deadlock at the database layer. `SubmitState` and `RequestCreation` locked the sender's `user_balances` row first, then `issueTransferReceiverState` locked the receiver's row later in the same tx. Concurrent A→B and B→A transfers could lock A-then-wait-B and B-then-wait-A — a cycle that Postgres breaks with `SQLSTATE 40P01`, aborting a valid transfer instead of resolving it deterministically. No balance corruption (the tx rolls back), but it's a liveness/UX problem under load. ## Change - New `lockTransferBalances` helper: normalizes the receiver address, rejects self-transfers, and locks both `(wallet, asset)` rows in ascending lowercase-wallet order **up front** for any `TransferSend` transition. Same sequence regardless of direction → no lock cycle. - Both handlers branch: `TransferSend` → `lockTransferBalances`; every other transition keeps the existing sender-only lock. - `issueTransferReceiverState` no longer takes the receiver lock — it now documents that the caller must pre-lock the receiver row. Mirrors the deterministic-ordering pattern already used in `app_session_v1` (sorted participants before locking). ### Side effect (improvement) Self-transfer and invalid-receiver are now rejected *before* the sender lock and state store, rather than after. Less wasted work; behavior otherwise unchanged. ## Scope Cross-subsystem ordering (channel transfers vs `app_session_v1` release locks) is a separate concern and intentionally out of scope here — different assets/rows. ## Tests - New `lock_transfer_balances_test.go`: asserts both directions lock low→high, and that self-transfer / invalid-receiver take zero locks. - Updated the self-transfer case in `submit_state_test.go` to reflect the earlier rejection point. - `go build ./nitronode/...`, `go vet`, full `channel_v1` suite green. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Important Review skippedToo many files! This PR contains 231 files, which is 81 over the limit of 150. To get a review, narrow the scope: ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (3)
📒 Files selected for processing (231)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
## Release v1.4.0 ### Highlights - **Audit remediation — MF3 final round (#846):** security hardening across contracts, nitronode, SDK, and RPC. - Contract: reentrancy guard + event-handler monotonicity, reorg double-spend confirmation gate. - **Breaking:** removed app registry, action allowances, user staking, and rebalance feature; removed SDK locking client + app-registry ABI; RPC surface (`docs/api.yaml`) trimmed. - DoS/overflow fixes: pagination limit=0, uint8 quorum overflow, frame-layer rate limit, deposit-state enforcement. - feat(nitronode): add USDC asset across 7 chains (#845) - fix(go-sdk): preserve channel challenge expiry (#816) - Fix SDK MCP Hono audit failure (#821) - Dependency bumps (gomod, npm, ws). ### Versioning note Minor bump (1.3.1 → 1.4.0). Note: removal of app-registry/staking/rebalance + RPC methods is a breaking change for downstream consumers — flagged here rather than a major bump per 1.x cadence. ### MCP `mcp-v1.4.0` tag intentionally NOT pushed this release. `@yellow-org/sdk-mcp` published manually (npm + MCP registry), alongside manual `@yellow-org/sdk` + `@yellow-org/sdk-compat` publishes.
## Release v1.4.0 ### Highlights - **Audit remediation — MF3 final round (#846):** security hardening across contracts, nitronode, SDK, and RPC. - Contract: reentrancy guard + event-handler monotonicity, reorg double-spend confirmation gate. - **Breaking:** removed app registry, action allowances, user staking, and rebalance feature; removed SDK locking client + app-registry ABI; RPC surface (`docs/api.yaml`) trimmed. - DoS/overflow fixes: pagination limit=0, uint8 quorum overflow, frame-layer rate limit, deposit-state enforcement. - feat(nitronode): add USDC asset across 7 chains (#845) - fix(go-sdk): preserve channel challenge expiry (#816) - Fix SDK MCP Hono audit failure (#821) - Dependency bumps (gomod, npm, ws). ### Versioning note Minor bump (1.3.1 → 1.4.0). Note: removal of app-registry/staking/rebalance + RPC methods is a breaking change for downstream consumers — flagged here rather than a major bump per 1.x cadence. ### MCP `mcp-v1.4.0` tag intentionally NOT pushed this release. `@yellow-org/sdk-mcp` published manually (npm + MCP registry), alongside manual `@yellow-org/sdk` + `@yellow-org/sdk-compat` publishes. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Version updated to 1.4.0 across multiple packages and Helm charts for faucet-app, nitronode, playground, and SDK components (mcp, ts-compat, ts). <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
This PR consolidates remediations for all findings from the MF3 security audit. Severity breakdown: 1 Critical, 3 High, 5 Medium, 19 Low, 14 Informational.
Critical
C1 — Epoch/Channel Version=0 Sentinel Collision Allows Crediting Phantom Funds Off-Chain (#808)
After a channel
Finalize,NextState()reopened the new epoch atversion=0; newly created channels are also persisted withstate_version=0. TheEnsureNoOngoingStateTransitionsguard treated aHomeDepositas settled whenstate.version == channel.state_version, so the two zeros collided and the guard passed unconditionally — allowing a node-signedHomeDepositto be credited off-chain before any corresponding on-chain deposit event had landed. An attacker could exploit this to spend phantom funds with no real on-chain backing: a direct theft-of-funds risk.Fix:
version=0is now reserved as the "no on-chain state materialised yet" sentinel.NextState()post-Finalize andNewChallengeRescueState()emitVersion: 1. TheHomeDeposit/HomeWithdrawaltransition gate was strengthened to additionally requirechannel.status == Open.RefreshUserEnforcedBalancewas updated via a newChannel.IsEnforced()helper that encodes the convention explicitly.High
H1 — Unauthenticated
limit=0Triggers Integer Division-by-Zero, Crashing the Nitronode (#801)Passing
pagination.limit = 0to any list endpoint causedcalculatePaginationMetadata()to divide by zero unconditionally, panicking and killing the entire process. No authentication was required, making this a trivial, repeatable, zero-cost denial-of-service. Because the crash was immediate and reproducible, the node could be kept offline indefinitely.Fix:
GetOffsetAndLimitnow coerceslimit=0to the server default.calculatePaginationMetadatareturns an explicit error instead of dividing unconditionally. Regression tests coveringZeroLimitandNilLimitwere added for all affected endpoints.H2 — Stale Pre-Lock Channel Snapshot in Event Handlers Can Reopen a Finalized Channel (#820)
Home-channel event handlers read the channel row before acquiring
LockUserState. A concurrentsubmit_stateFinalize could flip the channel toClosingin the window between the read and the lock; the handler then persisted the staleOpensnapshot, effectively overwriting the committedClosingstate. This directly bypassed the C1 remediation and re-enabled the phantom-fund exploit chain.Fix: A new
DBStore.LockUserStateForHomeChannel(channelID)atomically locks and reads the channel row in a single SQL statement (SELECT … FOR UPDATE OF). All four home-channel event handlers (Created,Checkpointed,Challenged,Closed) were updated to lock-then-read via this helper, closing the window structurally.H3 — Unsupported ERC-20 Token Deposit Permissionlessly Halts All Event Processing (#827)
ChannelHubReactor.HandleEvent()treated any handler error as fatal: on failure it returned before callingStoreContractEvent(), causing the listener to unsubscribe andmain()to exit. BecausedepositToNode()accepts any ERC-20, depositing any token not configured in the node asset store was a permissionless, low-cost operation that permanently stopped event processing. The failing event was never recorded, so it replayed on every restart — requiring manual operator recovery.Fix: A new
core.ErrTokenNotSupportedsentinel is returned by the asset store for unknown tokens.handleNodeBalanceUpdated()checks for this sentinel, logs a warning, skips the balance update, and returnsnilsoHandleEvent()proceeds to record the event and the listener stays subscribed. Genuine store errors continue to propagate normally.Medium
M1 — Rebalance: Off-Chain Conservation Check Not Enforceable On-Chain (#806)
The
rebalance_app_sessionsfund-conservation invariant (all per-asset deltas sum to zero) was enforced only in broker-side code, with no corresponding on-chain state transitions. A compromised or malicious broker could credit funds to one session without deducting from another, with no ability to challenge or audit the imbalance on-chain.Fix: The entire rebalance feature was removed end-to-end: handler, routes, SDKs, API spec, and documentation.
M2 — Rebalance: Per-Session Quorum Verification Does Not Cover the Full Batch (#806)
Signatures covered individual session state updates, not the combined batch. A participant could be induced to sign their session's update without visibility into what was happening to the other sessions in the same atomic operation.
Fix: Resolved by removing the rebalance feature entirely (see above).
M3 — Rebalance: Dead but Routable Code Path Active Across All Deployments (#806)
The rebalance route was gated on
MaxSignedUpdates >= 2, which was0in every deployment, making it unreachable at runtime. However, the route remained registered, its 355-line handler compiled, and the feature was fully documented in the public API spec — attack surface re-activatable by a single config change.Fix: All rebalance code, types, config, and documentation removed (see above).
M4 —
uint8Overflow in Quorum Weight Accumulators (#802)CreateAppSessionandverifyQuorumaccumulated participant weights inuint8. When combined weights exceeded 255, the total wrapped modulo 256, causing legitimate sessions to be false-rejected or — at the boundary value 256 — quorum to appear as zero, allowing any single signature to satisfy it. With up to 32 participants each with max weight 255, real totals up to 8160 were reachable.Fix: Both accumulators widened to
uint16.SignatureWeightandQuorumremainuint8at the wire/storage layer. Boundary regression tests added.M5 — Session Key Revocation Requires the Session Key's Own Signature (#809)
Revoking a delegated session key required
session_key_sig, giving the delegate a veto over the wallet owner's own revocation. A lost or deliberately uncooperative key could not be revoked until its originalexpires_atelapsed.Fix: The path is split by intent: activation/extension (
expires_at > now) still requires both signatures; revocation (expires_at <= now) requires onlyuser_sig, with the payload bindinguser_address + session_key + version + expires_atto prevent replay. The database CHECK constraint blocking emptysession_key_sigwas dropped.Low
L1 — Transaction ID Mismatch With Referencing Transition TxID (#807)
NewTransactionFromTransitionused a different ID convention than the referencing transition'sTxIDfor four transition types, producing danglingtx_idreferences. Fixed by usingtransition.TxIDdirectly.L2 — Race Condition in
HandleHomeChannelCreated: Status Flip Before Lock (#803)Channel status was flipped to
OpenbeforeLockUserStatewas acquired, allowing unsigned receiver states to land mid-handler. Fixed by locking first and backfilling unsigned head signatures withbackfillOffChainHeadNodeSig.L3 — App Session Cooperative Close Atomicity Blocking on In-Flight Escrow Undocumented (#831)
The protocol behavior that blocks cooperative close while escrow is in-flight had no documentation, leaving integrators unaware of the blocking condition and recovery paths. Fixed by adding documentation to
contracts/SECURITY.md,protocol-description.md, anddocs/protocol/security-and-limitations.md; no code changes.L4 — Canonical Allocation Validation Gaps in
SubmitDepositState(#811)The participant check ran only inside the amount-increase branch, and the completeness check skipped the deposited asset. Both gaps were closed: participant check now runs for every allocation regardless of amount, and all nonzero allocations across all assets must be present in the signed update.
L5–L8 — ARAL: Dead Subsystems Removed: Action Allowances, App Registry, User Staking, and Dead Code (#810)
Three feature-flag-gated subsystems (action allowances, app registry, user staking) defaulted off and had no planned activation path, leaving unused attack surface. All three were fully removed along with cross-cutting dead code across SDKs, docs, helm chart, and config (119 files, −9,507 lines).
L9 — Rebalance: Exported API Surface Left After Feature Disable (#806)
AppSessionVersionV1,GenerateRebalanceBatchIDV1, andGenerateRebalanceTransactionIDV1remained as exportedpkg/appsurface after the feature was disabled. Removed as part of the full rebalance cleanup.L10 — Void State Fallback in
SubmitDepositStateMasks Inconsistency (#812)SubmitDepositStatesynthesized a void state whenGetLastUserStatereturnednil, yielding a misleading downstream error. Fixed:nilnow returns an explicitlast user state not foundrejection.L11 — Response Sink Leak on Marshal Failure in
WebsocketDialer.Call(#813)A sink was registered before marshalling; marshal failure leaked it permanently, allowing unbounded sink growth. Fixed by moving
json.Marshalbefore sink registration.L12 — Session-Key Scope ID Format Not Validated at Request Boundary (#826)
app_session_idsandapplication_idswere only length/case-checked, allowing malformed IDs to reach signature verification. Fixed by addingIsValidApplicationIDandIsValidHashchecks at the request boundary.L13 — Asset-Symbol Equivalence Trust Assumption Undocumented (#818)
The operator obligation that all tokens mapped under the same symbol must be economically equivalent was implicit. Documented in
security-and-limitations.mdand theAssetConfigdoc comment; no code changes.L14 — Request-Rate Limiter Applied After Frame Parsing, Not at Frame Layer (#819)
Malformed and unknown-method frames bypassed the RPC-layer rate limiter. Fixed by moving rate limiting to the WebSocket frame layer via a
CompositeFrameRateLimiter. Overrun now closes the connection consistently.L15 — Reorg Double-Spend via Lack of Confirmation Gate (#832)
The node credited off-chain balances the instant it observed a deposit event on-chain, with no confirmation delay. A blockchain reorganization would remove the on-chain deposit while the off-chain credit persisted, allowing an attacker to drain the credited balance before the reorg was detected — a double-spend with real fund-loss impact recoverable only by the attacker.
Fix: A per-chain
ConfirmationGateholds observed deposit events for a configurableconfirmation_delay_secswindow; events withRemoved: truecancel their pending entry so a reorged deposit never reaches the reactor. A startup reconciliation walk over stored block hashes detects reorgs that occurred while the node was offline, usingeth_getBlockByHashto find the last canonical ancestor and replaying from that point. A reactor-level idempotency pre-check (IsContractEventProcessed) prevents double-processing on reconciliation replay, closing both the live and startup attack windows.L16 — Unescaped Error Strings in
NewErrorPayloadProduce Invalid JSON (#829)fmt.Sprintf("%s", errMsg)embedded raw error strings without escaping, breaking JSON on any message containing quotes or control characters. Fixed by encoding withjson.Marshal.L17 — Single-Participant App Session Documentation Incorrect (#828)
Docs claimed "at least 2 participants required"; the implementation allows 1. Updated
app_session_v1/README.md.L18 — Deterministic Lock Ordering for Concurrent Transfer Balance Rows (#830)
Concurrent A→B and B→A transfers could acquire
user_balanceslocks in opposite orders, forming a deadlock cycle resolved by Postgres aborting one transfer. Fixed by introducinglockTransferBalanceswhich always locks both(wallet, asset)rows in ascending wallet-address order before any other work.L19 — Smart Contract Reentrancy + Nitronode Event-Handler Monotonicity (#837)
ChannelHub.sollifecycle entrypoints lacked reentrancy guards. A malicious ERC-777 or ERC-1363 token hook invoked during adepositorwithdrawcall could re-enter any other lifecycle function mid-execution, potentially corrupting on-chain channel state. In parallel, six Nitronode event handlers applied state updates without checking whether the incoming event version was newer than the stored one; a late-arriving or reorg-replayed event could regress local state.Fix: OpenZeppelin
nonReentrantwas added to every external/public lifecycle entrypoint inChannelHub.sol, migrating guards upward from internal fund helpers so the single reentrancy slot covers the full call surface. The six Nitronode handlers gained version-monotonicity guards via a shared helper that drops stale events with a structured warning. A newChainStateRefresherinterface fetches authoritative on-chain state viagetChannelDatawhenever a guard fires, providing defense-in-depth against any out-of-order delivery class.Informational
MAX_SESSION_KEYS_PER_USER=0silently disables the per-user cap; documented as the intended unlimited mode aligned with existing<= 0conventions.nilerrvariable in missing-user-signature error message producing<nil>output; redundant format argument dropped.NewTransactionFromTransitionrelease branch before nil check; check moved ahead of field access with a regression test.MaxInt32inGetOffsetAndLimit.GetActionAllowances()did not normalize wallet addresses withcore.NormalizeHexAddress(), unlike otheruser.v1endpoints, allowing malformed non-empty wallet strings to return valid-looking empty-match responses; removed with ARAL.prevActive && !submittedActive.core.IsValidHashvalidation added toGetAppSessionsandGetEscrowChannel.UpdateChannelrewrote immutable fields (blockchain_id,token,nonce) on every lifecycle call; restricted to mutable lifecycle fields only.close_app_sessionintent was not action-gated, allowing participants to bypass action allowance consumption by withdrawing via session close; removed with ARAL.create_app_session.SubmitStatesilently bootstrapped a synthetic Void state for an active channel; now fails closed withactive channel has no stored state.StoreUserStateignoredRowsAffectedon balance update, allowing silent zero-row mismatches; exactly one affected row is now required.ChannelCheckpointedarrived beforeChannelCreated; checkpoint of a Void row now promotes toOpen.