Add gov proposal based migration trigger #3650

Open
yzang2019 wants to merge 22 commits into main from yzang/add-migration-trigger

@yzang2019 yzang2019 commented Jun 26, 2026

Contributor

Summary

The state-commitment (SC) store's in-flight memiavl → flatkv migration was previously paced by a node-local config (sc-keys-to-migrate-per-block). Because migration writes feed the AppHash, a per-node rate is consensus-relevant and a divergence risk: two validators draining at different rates fork the chain.
This PR makes the per-block migration rate a module-agnostic governance parameter (migration.NumKeysToMigratePerBlock) that every validator reads from chain state each BeginBlock and applies identically. The gov param also serves as the migration trigger: it defaults to 0 (paused), and raising it via a param-change proposal starts the drain fleet-wide at a deterministic height.
The old node-local config and its fallback are removed entirely — the gov param is now the sole source of the rate.

Key changes

New generic governance parameter

  • app/migration/params.go — new module-agnostic migration params subspace defining NumKeysToMigratePerBlock (default 0), its KeyTable, and validation. Deliberately not EVM-specific so future module migrations can reuse it.
  • app/app.go — registers the migration subspace; stores the *rootmulti.Store on the app and fails fast if the (unsupported) legacy commit multistore is in use.
  • app/abci.go — BeginBlock reads NumKeysToMigratePerBlock from chain state and pushes it into the SC store before the block's first write (applyMigrationBatchSize).
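
The validation step described for the new param can be sketched roughly as below. This is a minimal illustration rather than the PR's code: the function and constant names are hypothetical, and only the uint64 type, the default of 0 (paused), and the 1M cap mentioned later in this thread are taken from the discussion.

```go
package main

import "fmt"

// Illustrative values: the thread describes a default of 0 (paused) and a
// cap of 1M keys per block. The constant names are assumptions.
const (
	defaultKeysPerBlock uint64 = 0
	maxKeysPerBlock     uint64 = 1_000_000
)

// validateKeysPerBlock has the shape of a params validation callback: it
// receives an untyped value and rejects anything that is not a uint64
// within the allowed range.
func validateKeysPerBlock(i interface{}) error {
	v, ok := i.(uint64)
	if !ok {
		return fmt.Errorf("invalid parameter type: %T", i)
	}
	if v > maxKeysPerBlock {
		return fmt.Errorf("keys per block %d exceeds maximum %d", v, maxKeysPerBlock)
	}
	return nil
}

func main() {
	fmt.Println(validateKeysPerBlock(defaultKeysPerBlock))   // paused default is valid
	fmt.Println(validateKeysPerBlock(uint64(12345)))         // in range
	fmt.Println(validateKeysPerBlock(maxKeysPerBlock + 1))   // above the cap
	fmt.Println(validateKeysPerBlock("not a uint64"))        // wrong type
}
```

Rejecting out-of-range values at proposal time keeps a bad value from ever reaching BeginBlock, which is where every validator would otherwise hit it at the same height.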

Plumbing: push the rate down to the migration router

  • sei-db/state_db/sc/types/types.go — Committer interface gains SetMigrationBatchSize(int) error.
  • sei-cosmos/storev2/rootmulti/store.go — forwards SetMigrationBatchSize to the SC store; adds GetMigrationBatchSize for observability/tests.
  • sei-db/state_db/sc/composite/store.go — holds the gov-set batch size (atomic.Int64); buildRouter and SetMigrationBatchSize use it directly. Removed the effectiveMigrationBatchSize config fallback: 0 means paused, full stop.
  • sei-db/state_db/sc/migration/* — Router interface gains SetMigrationBatchSize; all router types implement it. Only the migration manager acts on it; every other router (module/passthrough/dual-write/thread-safe) treats it as a no-op. router_builder.go now allows a 0 batch size (paused) instead of rejecting it.
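
The plumbing above could look something like the following sketch. All type names here (managerRouter, passthroughRouter, compositeStore) are simplified stand-ins invented for illustration; only the SetMigrationBatchSize(int) error signature, the atomic.Int64 holder, and the "0 means paused" rule come from the description.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Router is a simplified stand-in for the migration router interface.
type Router interface {
	SetMigrationBatchSize(size int) error
}

// managerRouter models the migration manager, the only router that acts
// on the batch size.
type managerRouter struct {
	batchSize int
}

func (m *managerRouter) SetMigrationBatchSize(size int) error {
	if size < 0 {
		return fmt.Errorf("migration batch size must not be negative, got %d", size)
	}
	m.batchSize = size // 0 means paused
	return nil
}

// passthroughRouter models every other router type: the call is a no-op.
type passthroughRouter struct{}

func (passthroughRouter) SetMigrationBatchSize(int) error { return nil }

// compositeStore remembers the gov-set rate and forwards it to its router.
type compositeStore struct {
	batchSize atomic.Int64
	router    Router
}

func (s *compositeStore) SetMigrationBatchSize(size int) error {
	if err := s.router.SetMigrationBatchSize(size); err != nil {
		return err
	}
	s.batchSize.Store(int64(size))
	return nil
}

func main() {
	mgr := &managerRouter{}
	store := &compositeStore{router: mgr}
	if err := store.SetMigrationBatchSize(500); err != nil {
		panic(err)
	}
	fmt.Println(mgr.batchSize, store.batchSize.Load()) // 500 500
}
```

Keeping the setter on the shared interface means the store never needs to know which concrete router it holds; routers that do not migrate simply ignore the call.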

Removed the node-local config + fallback

  • sei-db/config/sc_config.go, sei-db/config/toml.go, docker/localnode/config/app.toml, app/seidb.go — dropped the sc-keys-to-migrate-per-block field, default, validation, TOML entry, and flag parsing.

Tests

  • app/migration/params_test.go — unit tests for the new param (default, key-table registration, validation).
  • app/abci_test.go — subspace registration, applyMigrationBatchSize (defaults/set/clamp), and cross-block "param set in block N takes effect in N+1" coverage.
  • SC/rootmulti unit tests now seed the rate via SetMigrationBatchSize(...) after construction instead of the removed config field. The random-migration framework re-applies the rate on every store (re)open, faithfully mirroring how production re-pushes the gov param after each restart.

Integration test

  • integration_test/gov_module/ — new ParameterChangeProposal test that raises NumKeysToMigratePerBlock and asserts it takes effect.
  • integration_test/contracts/verify_flatkv_evm_migrate.sh — rewritten to drive the migration via a governance param-change proposal (submit → deposit → quorum vote → poll for PASSED → verify on-chain) instead of injecting the removed config.

Docs

  • AGENTS.md — Code style now mandates running both gofmt and goimports on every touched .go file.

cursor Bot commented Jun 26, 2026

PR Summary

High Risk
Changes consensus-critical state-commit migration pacing, AppHash, and runtime write-mode transitions at BeginBlock; misconfiguration or param bugs could fork validators or stall migration fleet-wide.

Overview
Governance-controlled migration pacing replaces the removed sc-keys-to-migrate-per-block app.toml knob. A new migration params subspace exposes NumKeysToMigratePerBlock (default 0 = paused, capped at 1M). Every validator reads it in BeginBlock via applyMigrationBatchSize and pushes it into the SC store so all nodes drain at the same rate (AppHash-safe). Raising the param above zero also acts as the fleet-wide migration trigger on auto nodes (memiavl_only → migrate_evm).

Auto write mode by default: sc-write-mode-enable-auto (default true) forces effective mode auto so legacy configs with explicit memiavl_only still follow gov-driven migration without edits. Test/docker setups pin modes by setting sc-write-mode-enable-auto = false. SetWriteMode no longer rebuilds cached KV views mid-block so BeginBlock kick-off cannot orphan the deliver cache-multi-store.

Integration/tests: FlatKV migrate script drives rate via a param-change proposal; gov module test asserts param updates and migration start on auto clusters. AGENTS.md now requires goimports in addition to gofmt.

Reviewed by Cursor Bugbot for commit 293fe78.

github-actions Bot commented Jun 26, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build | Format | Lint | Breaking | Updated (UTC)
--- | --- | --- | --- | ---
✅ passed | ✅ passed | ✅ passed | ✅ passed | Jul 1, 2026, 5:15 AM

codecov Bot commented Jun 26, 2026

Codecov Report

❌ Patch coverage is 76.51515% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.79%. Comparing base (8ebedda) to head (293fe78).

Files with missing lines | Patch % | Lines
--- | --- | ---
sei-cosmos/storev2/rootmulti/store.go | 65.78% | 10 Missing and 3 partials ⚠️
app/abci.go | 69.23% | 2 Missing and 2 partials ⚠️
sei-db/state_db/sc/composite/store.go | 71.42% | 4 Missing ⚠️
app/app.go | 57.14% | 2 Missing and 1 partial ⚠️
app/seidb.go | 33.33% | 1 Missing and 1 partial ⚠️
sei-db/state_db/sc/memiavl/store.go | 0.00% | 2 Missing ⚠️
sei-db/state_db/sc/migration/module_router.go | 77.77% | 1 Missing and 1 partial ⚠️
sei-db/state_db/sc/migration/dual_write_router.go | 0.00% | 1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3650      +/-   ##
==========================================
- Coverage   59.24%   58.79%   -0.45%     
==========================================
  Files        2272     2223      -49     
  Lines      188175   183105    -5070     
==========================================
- Hits       111479   107655    -3824     
+ Misses      66657    65840     -817     
+ Partials    10039     9610     -429     

Flag | Coverage Δ
--- | ---
sei-chain-pr | 64.66% <75.55%> (?)
sei-db | 70.41% <ø> (ø)
sei-db-state-db | ?
sei-db-state-db-pr | 77.03% <78.57%> (?)

Flags with carried forward coverage won't be shown.

Files with missing lines | Coverage Δ
--- | ---
app/migration/params.go | 100.00% <100.00%> (ø)
sei-cosmos/server/config/config.go | 92.00% <100.00%> (+0.16%) ⬆️
sei-db/config/sc_config.go | 100.00% <100.00%> (+11.76%) ⬆️
sei-db/state_db/sc/migration/migration_manager.go | 95.43% <100.00%> (+0.10%) ⬆️
sei-db/state_db/sc/migration/migration_types.go | 80.00% <ø> (ø)
sei-db/state_db/sc/migration/passthrough_router.go | 100.00% <100.00%> (ø)
sei-db/state_db/sc/migration/router_builder.go | 56.84% <ø> (+1.14%) ⬆️
sei-db/state_db/sc/migration/thread_safe_router.go | 100.00% <100.00%> (ø)
sei-ibc-go/testing/simapp/test_helpers.go | 52.15% <100.00%> (+0.20%) ⬆️
sei-wasmd/app/test_helpers.go | 44.10% <100.00%> (+0.21%) ⬆️

... and 8 more

... and 86 files with indirect coverage changes

Comment thread app/app.go Outdated
// composite SC backend drives the in-flight memiavl->flatkv migration that
// BeginBlock paces via the migration gov param. Fail fast if the legacy
// root multistore is somehow in use.
rs, ok := app.CommitMultiStore().(*rootmulti.Store)

Contributor

If we need to cast like this, would it make sense to change the type returned by app.CommitMultiStore()?

Contributor Author

Makes sense.

Comment thread app/app.go Outdated
genesisImportConfig genesistypes.GenesisImportConfig

stateStore seidb.StateStore
rs *rootmulti.Store

Contributor

nit, would it make sense to use a more descriptive name like rootStore? Out of context, it's not obvious what rs stands for.

Contributor Author

good suggestion, will change

Comment on lines +105 to +106
if migrationBatchSize < 0 {
return nil, fmt.Errorf("migration batch size must not be negative, got %d", migrationBatchSize)

Contributor

Could we make migration batch size an unsigned integer?

Contributor Author

Yeah, that's a good point

Contributor Author

Tried that, but it looks like it would require a big refactor. I'll instead fall back to 0 here if the value is negative.

@seidroid seidroid Bot left a comment

A well-structured, thoroughly-tested change that moves the SC migration rate from a node-local config to a consensus-deterministic governance param (migration.NumKeysToMigratePerBlock), correctly read once per block in BeginBlock and pushed down to the composite SC store. No blocking correctness/security issues found; a few non-blocking notes, chiefly that the new subspace value is not genesis-exportable.

Findings: 0 blocking | 6 non-blocking | 2 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Migration param is not genesis-exportable (confirms Codex P2). x/params ExportGenesis (sei-cosmos/x/params/keeper/genesis.go) only exports FeesParams/CosmosGasParams, and the new migration subspace has no owning module's InitGenesis/ExportGenesis. After a seid export/import while a migration is in flight, NumKeysToMigratePerBlock is dropped and BeginBlock re-seeds the 0 (paused) default, halting the drain until governance re-submits a proposal. Data isn't lost (the boundary cursor lives in flatkv) and it's recoverable, but worth either adding genesis plumbing for this subspace or documenting the operational caveat.
  • Operational behavior change worth calling out in release notes: because the param defaults to 0 (paused) and is the sole source of the rate, any chain already running in a migrate write-mode will pause its drain at the upgrade height until a param-change proposal raises the rate. Intentional per the consensus-safety design, but operators must know to submit the proposal.
  • REVIEW_GUIDELINES.md and cursor-review.md were empty/absent, so no repo-specific guidelines or Cursor second opinion were available; only the Codex pass (one P2, addressed above) contributed.
  • Integration test nit (integration_test/gov_module/gov_proposal_test.yaml): the verifier NEW_PARAM == 12345 compares an unquoted numeric against a value the pipeline produces as a string (jq -r .value | tr -d "\""), unlike the sibling assertion NEW_ABCI_PARAM == "true". If the eval engine is type-strict this could mis-compare; quoting "12345" would match the established pattern.
  • 2 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/app.go Outdated
// composite SC backend drives the in-flight memiavl->flatkv migration that
// BeginBlock paces via the migration gov param. Fail fast if the legacy
// root multistore is somehow in use.
rs, ok := app.CommitMultiStore().(*rootmulti.Store)

[suggestion] This unconditional type assertion now panics in New() if the commit multistore is not *rootmulti.Store. New() is invoked by every seid subcommand that builds the app (export, query tooling, etc.), so please confirm no supported configuration (e.g. a legacy IAVL/storev2-disabled path) can reach here — otherwise this turns a previously-degraded path into a hard crash. If the legacy store is truly unsupported this is fine; a one-line note on why it's unreachable would help.

Comment thread app/abci.go
}
numKeys := migration.DefaultNumKeysToMigratePerBlock
if subspace, ok := app.ParamsKeeper.GetSubspace(migration.SubspaceName); ok {
// The migration subspace has no owning module to seed it in InitGenesis,

[suggestion] The lazy seed correctly handles a fresh/never-set param deterministically, but because this subspace has no module ExportGenesis (x/params only exports Fees/CosmosGas params), the value is lost across a seid export/import: a mid-migration rate resets to the 0 default here and the drain pauses until governance re-raises it. Consider adding genesis import/export plumbing for this subspace, or documenting the caveat.
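
The lazy seed and the export/import caveat can be illustrated with a toy model. The paramStore type and readOrSeed helper below are hypothetical stand-ins for the params subspace and the BeginBlock read path, not the real API.

```go
package main

import "fmt"

// paramStore is a toy stand-in for a params subspace backed by chain state.
type paramStore struct {
	values map[string]uint64
}

func newParamStore() *paramStore { return &paramStore{values: map[string]uint64{}} }

func (p *paramStore) Has(key string) bool      { _, ok := p.values[key]; return ok }
func (p *paramStore) Get(key string) uint64    { return p.values[key] }
func (p *paramStore) Set(key string, v uint64) { p.values[key] = v }

const (
	keyNumKeys            = "NumKeysToMigratePerBlock"
	defaultNumKeys uint64 = 0 // paused
)

// readOrSeed returns the stored rate. If the param was never written, it
// first persists the default, so every node performs the same write at the
// same height and later reads are plain lookups.
func readOrSeed(p *paramStore) uint64 {
	if !p.Has(keyNumKeys) {
		p.Set(keyNumKeys, defaultNumKeys)
	}
	return p.Get(keyNumKeys)
}

func main() {
	live := newParamStore()
	live.Set(keyNumKeys, 500) // governance raised the rate mid-migration
	fmt.Println(readOrSeed(live))

	// After an export/import that does not carry this subspace, the store
	// starts empty again and the lazy seed restores the paused default.
	imported := newParamStore()
	fmt.Println(readOrSeed(imported))
}
```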

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 91e1766.

Comment thread app/receipt_store_config.go Outdated
Comment thread sei-db/state_db/sc/migration/migration_manager.go Outdated
seidroid Bot previously requested changes Jun 26, 2026

@seidroid seidroid Bot left a comment

The PR cleanly moves the SC migration rate from a node-local config to a consensus-safe governance param, with thorough plumbing and tests. However, an accidental rename of the receipt-store backend constant breaks an existing test (and decouples the test harness from the real config key), which blocks merge.

Findings: 1 blocking | 4 non-blocking | 2 posted inline

Blockers

  • None at the file/PR level.
  • 1 blocking issue(s) flagged inline on specific lines.

Non-blocking

  • Cursor's second-opinion pass produced no output (cursor-review.md is empty); REVIEW_GUIDELINES.md is also empty, so no repo-specific standards could be applied.
  • validateNumKeysToMigratePerBlock accepts any uint64. Because the value is cast to int before NextBatch (migration_manager.go:299), a governance value above math.MaxInt64 becomes negative and halts commit deterministically. Consider bounding the param to a sane operational maximum.
  • app/abci_test.go's comment that math.MaxUint64 is 'forwarded verbatim with no conversion or clamping' is misleading — the app layer stores it verbatim, but migration_manager converts it to int downstream.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/receipt_store_config.go Outdated

const (
- receiptStoreBackendKey = "receipt-store.rs-backend"
+ receiptStoreBackendKey = "receipt-store.rootStore-backend"

[blocker] This key must stay receipt-store.rs-backend. The parser in sei-db/config/receipt_config.go reads flagRSBackend = "receipt-store.rs-backend" (and the rs-backend mapstructure tag), so renaming this constant to receipt-store.rootStore-backend decouples the app-layer key from what ReadReceiptConfig actually reads. Concretely, TestParseReceiptConfigs_RejectsInvalidBackend (app/seidb_test.go) now sets "rocksdb" under a key the parser ignores, the config falls back to the default pebbledb, and assert.Error fails — breaking CI. This looks like collateral damage from a global rs → rootStore rename (the comment at app/test_helpers.go:486 was hit too). Revert to receipt-store.rs-backend.

if advanceMigration {
// Get the next batch of keys to migrate.
- valuesToMigrate, newBoundary, err := m.iterator.NextBatch(m.migrationBatchSize)
+ valuesToMigrate, newBoundary, err := m.iterator.NextBatch(int(m.migrationBatchSize))

[suggestion] int(m.migrationBatchSize) truncates a uint64 gov param. On 64-bit, a value above math.MaxInt64 becomes negative, NextBatch rejects size <= 0, and ApplyChangeSets returns an error during commit — a deterministic chain halt from a governance typo. Since validateNumKeysToMigratePerBlock accepts any uint64, consider bounding the param to a sane maximum (well under math.MaxInt) at validation time.
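
The wraparound described here is easy to reproduce in isolation. The snippet below shows the unchecked conversion going negative and one possible saturating alternative; clampToInt is a hypothetical helper name, not something from the PR.

```go
package main

import (
	"fmt"
	"math"
)

// clampToInt converts a uint64 to int, saturating at math.MaxInt instead of
// wrapping around to a negative number.
func clampToInt(v uint64) int {
	if v > uint64(math.MaxInt) {
		return math.MaxInt
	}
	return int(v)
}

func main() {
	// Use a variable: converting an out-of-range constant would not compile.
	var raw uint64 = math.MaxUint64

	fmt.Println(int(raw) < 0)        // true: the unchecked conversion wraps
	fmt.Println(clampToInt(raw) > 0) // true: the clamped value stays positive
	fmt.Println(clampToInt(12345))   // 12345
}
```

Clamping at the call site only hides the symptom; bounding the param at validation time, as the comment suggests, keeps such a value from being accepted by governance in the first place.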

@seidroid seidroid Bot left a comment

Converts the SC migration per-block rate from a consensus-unsafe node-local config into a deterministic governance param read in BeginBlock; the design is sound, AppHash-safe (all nodes read the same value, lazy-default-set runs identically), and very thoroughly tested at unit, store, and integration levels. Only minor, non-blocking observations.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • New() now hard-panics if the commit multistore is not *rootmulti.Store (app/app.go:560). This is intentional fail-fast, but it is a behavior change: any App-construction path that could use a non-storev2 store (offline/CLI commands, tooling) would now crash rather than degrade. Worth confirming every code path that builds App goes through the storev2 rootmulti store.
  • The two later casts in app.go (~line 770 and ~2685) still re-derive the store via app.CommitMultiStore().(*rootmulti.Store) instead of reusing the new app.rootStore field; minor redundancy now that the type is guaranteed at construction.
  • Cursor's second-opinion review file (cursor-review.md) was empty — that pass produced no output. Codex reported no material issues.
  • Consider a brief note/log when the lazy default-set writes the param the first time (app/abci.go), purely for operator visibility on the upgrade block; not required for correctness.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/app.go Outdated
@seidroid seidroid Bot dismissed their stale review June 26, 2026 17:55

Superseded: latest AI review found no blocking issues.

seidroid Bot previously requested changes Jun 29, 2026

@seidroid seidroid Bot left a comment

The plumbing of the per-block migration rate into a governance param is clean, well-documented, and well-tested, but the PR also flips the default WriteMode to migrate_evm and relies on the claim that "paused migrate_evm produces the same app hash as memiavl_only" — which the routing code contradicts: brand-new EVM keys are written to flatkv yet excluded from the AppHash while paused, creating a fork/integrity risk. REVIEW_GUIDELINES.md and cursor-review.md were both empty; only Codex provided a second opinion, and its two P1 findings are corroborated.

Findings: 5 blocking | 4 non-blocking | 3 posted inline

Blockers

  • Default WriteMode + migration trigger are not consistently network-wide. The default flips to migrate_evm (sc_config.go) but sc-write-mode remains node-local (app/seidb.go:109 only overrides when the toml key is non-empty). A node that picks up the new default runs migrate_evm while a peer with an explicit sc-write-mode = memiavl_only stays memiavl_only; once the gov param is raised, only the migrate_evm nodes drain. Because the paused-migrate_evm AppHash is NOT equivalent to memiavl_only (see inline finding), this mixed fleet can fork even before the migration is triggered. The PR's framing of the gov param as the 'sole' migration trigger overstates this — the write mode is still a consensus-relevant local switch. (Corroborates Codex P1 #2.)
  • Missing test coverage for the central safety claim: there is no test asserting that a paused migrate_evm store and a memiavl_only store produce identical AppHashes when brand-new EVM keys are written each block. Existing migration tests all set a positive batch size and drive the migration; the steady-state/paused path with new-key writes (the exact rollout state introduced by the default change) is unverified.
  • 3 blocking issue(s) flagged inline on specific lines.

Non-blocking

  • REVIEW_GUIDELINES.md is empty/missing, so no repo-specific standards were applied. cursor-review.md is empty — the Cursor pass produced no output. Only the Codex pass produced findings (both corroborated here).
  • app/app.go: the fail-fast panic message says expected *storev2_rootmulti.Store but the import alias was changed to rootmulti in this PR — cosmetic mismatch only.
  • integration_test/gov_module/gov_proposal_test.yaml: the verifier NEW_PARAM == 12345 compares a string (from jq -r .value | tr -d '"') against an integer literal, whereas the sibling test uses a quoted string (NEW_ABCI_PARAM == "true"). Confirm the eval framework coerces types here, otherwise the assertion may not behave as intended. There is also a stray q gov proposal ... status query inserted between the two node votes — harmless but reads oddly.
  • app/abci.go applyMigrationBatchSize lazily persists the default param via subspace.Set on the first block it sees it unset. This is deterministic across nodes, but it means BeginBlock writes to the params store at the first post-upgrade height; worth a brief note that this is intended and idempotent.

Comment thread sei-db/config/sc_config.go Outdated
// all_migrated_but_bank, migrate_bank, flatkv_only, test_only_dual_write, auto.
- // defaults to memiavl_only.
+ // defaults to migrate_evm. While the NumKeysToMigratePerBlock gov param is 0
+ // (the default), migrate_evm is paused and produces the same app hash as

[blocker] This claim is not borne out by the routing code. In migrate_evm, MigrationManager.shouldForwardWriteToNewDB routes any caller write whose key is absent from memiavl directly to flatkv (migration_manager.go:431-435), and EVM creates brand-new keys (storage slots, accounts, receipts) every block. While paused (batch size 0) the boundary never advances, so shouldAppendLatticeHash returns false (composite/store.go:870-882) and flatkv is excluded from CommitInfo/AppHash. Result: paused migrate_evm commits new EVM state into flatkv that is outside the AppHash, whereas memiavl_only puts that same key into memiavl which IS in the AppHash — so the two AppHashes differ. This breaks the equivalence the rollout depends on (fork/halt risk in a mixed fleet) and leaves committed state uncovered by the merkle root. Please either route brand-new keys to memiavl while paused, or do not enable migrate_evm by default until the param is raised.

Comment thread sei-db/config/sc_config.go Outdated
return StateCommitConfig{
Enable: true,
- WriteMode: types.MemiavlOnly,
+ WriteMode: types.MigrateEVM,

[blocker] Changing the default WriteMode from MemiavlOnly to MigrateEVM is a consensus-relevant default flip. Combined with the node-local sc-write-mode override (app/seidb.go:109) and the AppHash non-equivalence of paused migrate_evm (see comment above), nodes that adopt this default will diverge from peers still on memiavl_only. Please confirm the upgrade story: existing nodes with an explicit memiavl_only in app.toml keep that value, but any node without it now silently enters migrate_evm. This needs to be coordinated as a network-wide switch rather than a per-node default.

// still route below, but no keys are pulled forward and no boundary
// metadata is rewritten, so the migration holds at its current cursor
// until the batch size is raised again.
advanceMigration := firstBatchInBlock && m.migrationBatchSize > 0

[blocker] advanceMigration gates only the boundary advance and metadata writes; caller-write routing below (shouldForwardWriteToNewDB) still runs unconditionally, sending brand-new keys to flatkv even when paused. Since the boundary metadata is never written in the paused state, shouldAppendLatticeHash never latches, so those flatkv writes never enter the AppHash. This is the mechanism behind the AppHash divergence flagged on sc_config.go and matches Codex P1 #1. Consider also pausing the new-key forwarding (route everything to the old DB) while migrationBatchSize == 0, so paused migrate_evm is truly AppHash-identical to memiavl_only.
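
The suggested fix amounts to consulting the batch size before deciding where a caller write lands. The toy routing function below is an assumption-laden sketch of that idea: routeWrite and the target type are invented here, and it models only a store whose migration has never advanced (no keys are past the boundary yet).

```go
package main

import "fmt"

// target names the backend a caller write is routed to.
type target string

const (
	oldDB target = "memiavl"
	newDB target = "flatkv"
)

// routeWrite decides where a caller write goes for a migration that has not
// yet moved any keys. While paused (batchSize == 0) everything stays in the
// old DB, so the committed state matches a memiavl_only node. Once the rate
// is positive, keys absent from the old DB are forwarded to the new DB.
func routeWrite(batchSize int, keyInOldDB bool) target {
	if batchSize == 0 {
		return oldDB
	}
	if !keyInOldDB {
		return newDB
	}
	return oldDB
}

func main() {
	fmt.Println(routeWrite(0, false))   // paused, brand-new key
	fmt.Println(routeWrite(500, false)) // draining, brand-new key
	fmt.Println(routeWrite(500, true))  // draining, key still in the old DB
}
```

A store that was paused after partially draining would also need to keep already-migrated keys in the new DB; that case is deliberately left out of this sketch.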

Comment thread sei-db/config/sc_config.go
Comment thread app/migration/params.go Outdated
Comment thread sei-db/state_db/sc/migration/migration_manager.go
Comment thread app/migration/params.go
seidroid Bot previously requested changes Jun 30, 2026

@seidroid seidroid Bot left a comment

The PR cleanly replaces the node-local sc-keys-to-migrate-per-block config with a consensus-deterministic governance param and is well-tested, but the migration kick-off still depends on each node's local sc-write-mode: a node that keeps a fixed memiavl_only config (the value existing app.toml files already carry after upgrade) cannot honor the gov trigger, its failed kick-off is only logged, and it will diverge from Auto-mode nodes once the param is raised. That residual per-node divergence axis plus the silent error handling are blocking for a consensus path.

Findings: 4 blocking | 3 non-blocking | 2 posted inline

Blockers

  • Upgrade/operational fork risk: DefaultStateCommitConfig.WriteMode changes from memiavl_only to auto, but already-running validators keep sc-write-mode = "memiavl_only" baked into their app.toml (rendered from the old default). The default change does not rewrite their config, so after upgrade many nodes will be fixed memiavl_only. When the gov param is later raised, Auto nodes migrate (changing the memiavl/flatkv contents that feed the AppHash) while fixed memiavl_only nodes silently fail the kick-off and never migrate — i.e. the chain forks. The PR's stated goal is to eliminate exactly this per-node divergence, but it introduces a new one (Auto vs fixed). At minimum this needs to be enforced (refuse to start / fail loud if a non-Auto store is in use while migration is/was requested) or made an explicit, hard upgrade prerequisite, not just relied upon operationally.
  • REVIEW_GUIDELINES.md is empty and codex-review.md is empty/absent, so no repo-specific review standards and no OpenAI Codex second-opinion pass were available for this synthesis (noting per instructions; not itself a code defect).
  • 2 blocking issue(s) flagged inline on specific lines.

Non-blocking

  • app.go New() now unconditionally panics if CommitMultiStore() is not *rootmulti.Store. This is intentional fail-fast, but confirm no supported code path (export/CLI tooling, tests, legacy-store startup) constructs the App with a different commit multistore, since this turns a previously-guarded if ok into a hard panic.
  • Cursor P2 (ordering): applyMigrationBatchSize runs after forkInitializer and HardForkManager.ExecuteForTargetHeight. The kick-off's SetWriteMode rejects when any commitment store HasPendingChanges(). This appears safe because BeginBlock writes are buffered in the cache-multistore and only flushed into the commitment stores at Commit (so HasPendingChanges() should be false here), but it's worth confirming no hard-fork handler writes directly into a commitment store's pending buffer in the same block a positive param first applies.
  • MigrationManager.SetMigrationBatchSize writes m.migrationBatchSize without atomics; this is safe only because the manager is always wrapped by threadSafeRouter (write lock serializes it against ApplyChangeSets). Fine as-is, but the invariant is load-bearing and worth a brief assertion/test that migration routers are never used unwrapped.
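
The locking invariant in the last note can be pictured with a small wrapper. The names below (manager, lockedRouter, NextBatchSize) are hypothetical analogues of the migration manager and threadSafeRouter, written only to show why a plain field is safe as long as every access goes through the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// manager keeps the batch size in a plain field, so it is only safe when
// every access is serialized by a wrapper.
type manager struct {
	batchSize int
}

func (m *manager) SetMigrationBatchSize(size int) { m.batchSize = size }

// NextBatchSize stands in for the read that happens while change sets are
// being applied.
func (m *manager) NextBatchSize() int { return m.batchSize }

// lockedRouter serializes all calls into the wrapped manager.
type lockedRouter struct {
	mu    sync.Mutex
	inner *manager
}

func (r *lockedRouter) SetMigrationBatchSize(size int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.inner.SetMigrationBatchSize(size)
}

func (r *lockedRouter) NextBatchSize() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.inner.NextBatchSize()
}

func main() {
	r := &lockedRouter{inner: &manager{}}
	var wg sync.WaitGroup
	for i := 1; i <= 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			r.SetMigrationBatchSize(n)
			_ = r.NextBatchSize()
		}(i)
	}
	wg.Wait()
	fmt.Println(r.NextBatchSize() >= 1) // true: some goroutine's write won
}
```

Calling the manager directly from two goroutines, without the wrapper, would be a data race, which is why a test that routers are never used unwrapped is worth having.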

Comment thread sei-cosmos/storev2/rootmulti/store.go Outdated
return fmt.Errorf("failed to set SC store migration batch size: %w", err)
}
if batchSize > 0 {
if mode, ok := rs.GetWriteMode(); ok && mode == sctypes.MemiavlOnly {

[blocker] The kick-off fires whenever the effective mode is MemiavlOnly, but a composite store under a fixed memiavl_only config also reports GetWriteMode() == MemiavlOnly (see composite store.go:428-429 / resolveCurrentWriteMode). So for such a node this calls SetWriteMode(MigrateEVM), which returns the "write mode is fixed by configuration" error. Combined with the log-only handling in BeginBlock, that node never migrates while Auto nodes do — an AppHash divergence. The doc comment above ("every other SC store ... never reports a memiavl_only effective mode, so the kick-off is skipped") is inaccurate for fixed memiavl_only. Either special-case/skip non-Auto stores here, or treat the "fixed" failure as fatal so the divergence can't go unnoticed.

Contributor Author

For a fixed MemiavlOnly node, we intentionally want the app hash to diverge and the migration to be skipped, so the switch should only work in auto mode.

Comment thread app/abci.go Outdated
numKeys = uint64(math.MaxInt64)
}
if err := app.rootStore.SetMigrationBatchSize(int(numKeys)); err != nil {
logger.Error("failed to set SC migration batch size", "err", err)

[blocker] A failure to apply the consensus-relevant migration rate is only logged and execution continues. The level-triggered retry justifies swallowing transient failures (e.g. pending changes), but a permanent failure — chiefly a node pinned to a fixed sc-write-mode retained across upgrade — will silently fork from nodes that do migrate, one log line per block. Recommend distinguishing the non-retryable case (e.g. the "write mode is fixed" error) and halting/panicking, or validating at startup that the store is Auto when migration may be driven by governance.
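
A minimal sketch of the recommendation above, assuming the store exposed the "write mode is fixed" failure as a sentinel error. The `errWriteModeFixed` sentinel, the `applyBatchSize` stub, and the `classify` helper are all hypothetical; the clamp mirrors the one in the quoted snippet and assumes a 64-bit `int`.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errWriteModeFixed is a hypothetical sentinel for the non-retryable case.
var errWriteModeFixed = errors.New("write mode is fixed by configuration")

type action int

const (
	actionNone action = iota
	actionRetryNextBlock
	actionHalt
)

// clampBatchSize bounds the on-chain uint64 before converting it to int.
func clampBatchSize(numKeys uint64) int {
	if numKeys > uint64(math.MaxInt64) {
		numKeys = uint64(math.MaxInt64)
	}
	return int(numKeys)
}

// applyBatchSize is a hypothetical store call with two kinds of failure.
func applyBatchSize(pinned, pendingChanges bool) error {
	switch {
	case pinned:
		return fmt.Errorf("failed to set SC store migration batch size: %w", errWriteModeFixed)
	case pendingChanges:
		return errors.New("pending changes, try again next block")
	default:
		return nil
	}
}

// classify separates permanent failures from transient ones, so the caller
// can halt on the former and rely on the level-triggered retry for the latter.
func classify(err error) action {
	switch {
	case err == nil:
		return actionNone
	case errors.Is(err, errWriteModeFixed):
		return actionHalt
	default:
		return actionRetryNextBlock
	}
}

func main() {
	fmt.Println(classify(applyBatchSize(false, false)) == actionNone)
	fmt.Println(classify(applyBatchSize(false, true)) == actionRetryNextBlock)
	fmt.Println(classify(applyBatchSize(true, false)) == actionHalt)
}
```

Whether halting is preferable to logging is a design decision for the PR; the sketch only shows that the two cases can be told apart with `errors.Is`.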

Comment thread sei-cosmos/storev2/rootmulti/store.go Outdated
- type: eval
expr: NEW_ABCI_PARAM == "true"

- name: Test migration param change proposal should update NumKeysToMigratePerBlock

Contributor

It looks like the migration gov test triggers a real EVM migration but only asserts the param value, not the migration itself. I want to make sure that's on purpose.

Contributor Author

There's no good way to check whether the real migration has started, so I decided to add a check that looks for log signals.

@seidroid seidroid Bot left a comment

A clean, well-tested refactor that moves the SC migration rate from a node-local config to a consensus-deterministic gov param and makes auto the default write mode. No correctness blockers found; the main thing to flag is the operational rollout implication of the default-mode change. Codex and Cursor produced no findings.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Rollout coordination: the default sc-write-mode flips from memiavl_only to auto. Existing validators that already have an explicit sc-write-mode = "memiavl_only" in app.toml are fixed (not auto), so they will skip the kick-off and intentionally diverge/halt at the height the gov proposal raises NumKeysToMigratePerBlock. The release/runbook should make it explicit that every participating validator must set auto before the param-change proposal passes; otherwise lagging nodes fork. This is by design and documented in code, but it is a critical operational precondition rather than an automatic outcome of the upgrade.
  • Second-opinion passes: OpenAI Codex reported "No material findings" and the Cursor review file was empty (no output). REVIEW_GUIDELINES.md was also empty, so no repo-specific standards were applied.
  • applyMigrationBatchSize lazily persists the default (0) into the params store on the first block it sees the key unset. This is deterministic across nodes and effectively one-time (subsequent blocks see Has==true), so it's fine — just noting the write-in-BeginBlock side effect for awareness.
  • Verified the integration-test verifier NEW_PARAM == 12345 (bare number vs string env var) works: integration_test/runner/runner.go compares via ParseFloat, and staking_proposal_test.yaml already uses the same bare-number form — so this is not a bug.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread sei-db/config/sc_config.go Outdated
return StateCommitConfig{
Enable: true,
- WriteMode: types.MemiavlOnly,
+ WriteMode: types.Auto,

[suggestion] Flipping the default write mode from memiavl_only to auto is the consensus-relevant part of this change. Note that this only governs nodes whose app.toml has an empty/absent sc-write-mode; nodes that already wrote sc-write-mode = "memiavl_only" (the previous template default) remain fixed memiavl_only and will be excluded from the kick-off in rootmulti.SetMigrationBatchSize — they'll intentionally diverge once the gov param is raised. Worth calling out in the upgrade runbook that all participating validators must switch to auto before the param-change proposal lands. Not a code defect — the behavior is documented and tested — just a rollout precondition that's easy to miss.

@claude claude Bot left a comment

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit at claude.ai/admin-settings/claude-code.

Once credits are available, reopen this pull request to trigger a review.

seidroid[bot]
seidroid Bot previously requested changes Jun 30, 2026

@seidroid seidroid Bot left a comment

A well-engineered, heavily-tested refactor that moves the SC migration rate from a node-local config to a consensus-deterministic gov param. Two consensus/rollout concerns should be resolved before merge: existing validators are pinned to fixed memiavl_only (so the gov "trigger" is a no-op for them), and BeginBlock lazily writes a new param key into committed state, which is state-machine-breaking despite the "non-app-hash-breaking" label.

Findings: 4 blocking | 5 non-blocking | 1 posted inline

Blockers

  • (Confirmed; agrees with Codex) The gov param is a no-op for the existing fleet. The app.toml template (sei-db/config/toml.go) always renders sc-write-mode = "{{ .StateCommit.WriteMode }}", and parseSCConfigs (app/seidb.go) preserves any explicit value. Nodes provisioned under the old memiavl_only default therefore have an explicit sc-write-mode = "memiavl_only" and are pinned (fixed), not auto. Store.SetMigrationBatchSize only kicks off migrate_evm when ConfiguredWriteMode() == Auto and explicitly skips fixed memiavl_only. So raising migration.NumKeysToMigratePerBlock will NOT start the migration on those validators — contradicting the PR's core claim that the gov param is the sole trigger. The migration would still require a separate, coordinated app.toml edit (set sc-write-mode = auto) on every node first; if only some operators do so, the auto nodes migrate (AppHash includes migration writes) while pinned nodes do not, diverging the chain. Please document/automate the operator step to flip existing nodes to auto, and gate the activation so partial adoption can't fork the chain.
  • BeginBlock writes new committed state, which is state-machine-breaking and at odds with the PR's non-app-hash-breaking label. applyMigrationBatchSize calls subspace.Set(...) the first time the param is unset, adding a key to the x/params store and thus changing the AppHash at the first block the new binary runs. On a rolling (uncoordinated) upgrade, an upgraded node would compute a different AppHash than peers still on the old binary and halt. Confirm this ships behind a coordinated upgrade height (or seed the param via an upgrade handler / InitGenesis instead of lazily in BeginBlock), and reconcile the non-app-hash-breaking label.
  • 1 blocking issue(s) flagged inline on specific lines.
  • 1 blocking issue(s) listed below under unanchored comments.

Non-blocking

  • Cursor produced no review output (cursor-review.md is empty); REVIEW_GUIDELINES.md is also empty, so this review applies general standards only.
  • integration_test/gov_module/gov_proposal_test.yaml: the new verifier expr: NEW_PARAM == 12345 compares against an unquoted numeric literal, whereas the sibling verifier in the same file uses a quoted string (NEW_ABCI_PARAM == "true"). NEW_PARAM is captured from jq -r .value | tr -d '"' (a string). If the eval engine does not coerce string-vs-number, this assertion could silently never hold (false pass risk) or flake. Recommend NEW_PARAM == "12345" for consistency.
  • Store.SetMigrationBatchSize: when the SC store does not expose ConfiguredWriteMode (bool=false), configured is the empty string, which != Auto, so the code logs the 'pinned to fixed memiavl_only' message even though the mode is simply unknown. Harmless in practice (the composite store always implements it) but the log could mislead during debugging.
  • SetWriteMode no longer rejects pending uncommitted changes and no longer rebuilds rs.ckvStores — a meaningful change to a consensus-critical path. The rationale (dynamic router proxies + avoiding orphaning the live deliver cms) and the new regression test TestRootMultiAutoKickoff_LiveCacheStoreNotOrphaned look sound, but this path warrants careful reviewer attention.
  • Tests could not be executed in the review sandbox (no network for the Go 1.25.6 toolchain), matching Codex's note; CI should be relied on for the race/coverage run.

Comments that couldn't be anchored to the diff

  • sei-db/state_db/sc/composite/store.go:1264 -- [blocker] This skip path is the crux of the rollout gap: a node configured with a fixed (non-auto) write mode — including the explicit memiavl_only that the app.toml template writes for every existing fleet member — is excluded from the migration kick-off. Combined with parseSCConfigs preserving explicit sc-write-mode, raising the gov param will not start migration on the current fleet without a separate coordinated config flip to auto. Worth surfacing this required operator action prominently and guarding against partial adoption (which would fork auto vs pinned nodes).

Comment thread app/abci.go
if app.HardForkManager.TargetHeightReached(ctx) {
app.HardForkManager.ExecuteForTargetHeight(ctx)
}
app.applyMigrationBatchSize(ctx)

[blocker] applyMigrationBatchSize writes to committed state in BeginBlock: when the param is unset it calls subspace.Set(...), adding a key to the x/params store and changing the AppHash at the first block the new binary runs. That is state-machine-breaking and would diverge nodes on a rolling/uncoordinated upgrade, which seems at odds with the PR's non-app-hash-breaking label. Consider seeding the param via an upgrade handler / InitGenesis instead of lazily here, or confirm this ships at a coordinated upgrade height.
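
The alternative suggested here, seeding the param once at a coordinated height and keeping the per-block path read-only, might look like the sketch below. The `paramStore` type is a hypothetical in-memory stand-in for the params subspace, and `seedAtUpgrade` and `readBatchSize` are illustrative names, not functions from the PR.

```go
package main

import "fmt"

const keyNumKeys = "NumKeysToMigratePerBlock"

// paramStore is a hypothetical in-memory stand-in for a params subspace.
// writes counts state mutations, i.e. events that would affect the AppHash.
type paramStore struct {
	kv     map[string]uint64
	writes int
}

func newParamStore() *paramStore { return &paramStore{kv: map[string]uint64{}} }

func (s *paramStore) Has(key string) bool {
	_, ok := s.kv[key]
	return ok
}

func (s *paramStore) Set(key string, v uint64) {
	s.kv[key] = v
	s.writes++
}

// seedAtUpgrade writes the default once, inside an upgrade handler that every
// node runs at the same height.
func seedAtUpgrade(s *paramStore) {
	if !s.Has(keyNumKeys) {
		s.Set(keyNumKeys, 0) // 0 means paused
	}
}

// readBatchSize is the per-block path: it falls back to the default when the
// key is unset and never writes to state.
func readBatchSize(s *paramStore) uint64 {
	if v, ok := s.kv[keyNumKeys]; ok {
		return v
	}
	return 0
}

func main() {
	s := newParamStore()
	fmt.Println(readBatchSize(s), s.writes) // read before seeding leaves state untouched
	seedAtUpgrade(s)
	seedAtUpgrade(s) // idempotent
	fmt.Println(readBatchSize(s), s.writes)
}
```

With this split, a binary that is rolled out before the upgrade height does not change committed state on its first block.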

* main:
  feat(seid): ConfigManager selection seam (PLT-775 PR1) (#3671)
  fix(evmrpc): limit listener max open connections, configurable via max_open_connections (PLT-704) (#3637)
  LittDB: Keymap threading improvements (#3645)
  integrate hashlogger (#3647)
  fix(metrics): Prometheus metrics output (#3640)
  [codex] Harden multiversion iterator validation (#3656)
  feat(consensus): mock_chain_validation replay build + memIAVL state-sync restore fixes (#3663)
  chore: replace OLD red SeiLogo banner in README with new 2026 Sei lockup (#3670)
  Require absolute path for evmone lib (#3668)
  fix(evmrpc): apply getLogs maxLog cap during merge instead of after (PLT-687) (#3666)
  feat(evmrpc): pre-decode request size admission control (PLT-295) (#3648)
  Make autobahn block production check wait for progress (#3667)
  fix(sei-tendermint): prevent readRoutine goroutine leak on /websocket when writeChan is full (PLT-707) (#3664)
  Per-block littidx flush + single shard (gated on #3645) (#3660)
  fix(evmrpc): bound debug_traceStateAccess memory and add trace admission control (PLT-360) (#3653)
  [codex] bump go-ethereum to v1.15.7-sei-17 (#3657)
  Upodate checkout GHA step across all workflows (#3659)
  Add GoReleaser release pipeline for static seid binaries (#3425)
  Parallelize littidx eth_getLogs across blocks (#3652)
seidroid[bot]
seidroid Bot previously requested changes Jun 30, 2026

@seidroid seidroid Bot left a comment

The PR cleanly moves the SC migration rate from a node-local config to a consensus-read governance param, with strong test coverage and careful determinism reasoning. The one blocking concern (shared with Codex) is that the new sc-write-mode-enable-auto defaults to true and, when absent, unconditionally forces WriteMode to auto — silently overriding any explicit sc-write-mode (notably flatkv_only) and breaking nodes whose operators don't add the opt-out before upgrade.

Findings: 2 blocking | 5 non-blocking | 2 posted inline

Blockers

  • Silent override of explicit write modes on upgrade (agrees with Codex High). When sc-write-mode-enable-auto is absent, both parse paths force WriteMode = auto regardless of the explicit sc-write-mode. The design intent (memiavl_only fleet auto-participates) is sound, but this also clobbers explicit flatkv_only / test_only_dual_write / pinned configs. toml.go's own comment warns that auto on a flatkv_only node 'either fails every commit with a version-mismatch error or silently serves reads from an empty memiavl' — so flatkv_only nodes that merely upgrade the binary (existing docker scripts only rewrite sc-write-mode, not the new key) will break unless operators proactively add sc-write-mode-enable-auto = false. Recommend either (a) only force auto when the explicit mode is empty or memiavl_only, honoring any other explicit mode as a deliberate pin, or (b) gate this behind an explicit release-note/migration step and a startup safety check. As-is this is a silent, app-hash-affecting footgun.
  • 1 blocking issue(s) flagged inline on specific lines.

Non-blocking

  • sei-db/config/sc_config.go still declares KeysToMigratePerBlock int (line 72) and sets it to 1024 in DefaultStateCommitConfig (line 90), but nothing reads it anymore (buildRouter now uses the atomic migrationBatchSize). The PR description claims the field was dropped; it's now dead config. Remove the field (and its mapstructure tag) to match the stated intent and avoid confusing operators who still have sc-keys-to-migrate-per-block in app.toml (now silently ignored).
  • Behavior diverges between the two parse paths: app/seidb.go skips parsing/validating sc-write-mode entirely when auto is on (default), so a typo'd sc-write-mode is silently ignored, whereas sei-cosmos/server/config/config.go still parses (and can error on) the explicit mode before overriding to auto. Consider making the two consistent.
  • REVIEW_GUIDELINES.md (base branch) is empty — no repo-specific standards were available to apply. cursor-review.md is also empty (Cursor produced no output); only the Codex pass contributed a finding.
  • Doc-only caveat already noted by the author: a seid export taken mid-migration omits NumKeysToMigratePerBlock (no owning module's ExportGenesis serializes it), so a chain bootstrapped from that genesis re-seeds 0 (paused). Acceptable and documented, but ensure operators forking/recovering via export are aware they must re-issue the proposal.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread app/seidb.go Outdated
if v := appOpts.Get(FlagSCWriteModeEnableAuto); v != nil {
scConfig.WriteModeEnableAuto = cast.ToBool(v)
}
if scConfig.WriteModeEnableAuto {

[blocker] Blocker: when sc-write-mode-enable-auto is absent (the common case for already-provisioned nodes), WriteModeEnableAuto stays true and this unconditionally forces WriteMode = auto, discarding any explicit sc-write-mode. For memiavl_only that's the intended upgrade path, but for an explicit flatkv_only node this is dangerous — toml.go's own comment states auto on a flatkv_only node 'either fails every commit with a version-mismatch error or silently serves reads from an empty memiavl.' Existing startup scripts only rewrite sc-write-mode, so such operators get no signal to add the opt-out and the node breaks on upgrade. Safer: only force auto when the explicit mode is empty or memiavl_only; treat any other explicit mode as a deliberate pin and honor it. The same logic in sei-cosmos/server/config/config.go (~line 357) needs the same treatment.
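
The "safer" resolution rule proposed in this comment can be sketched as a small pure function. The `WriteMode` type and `resolveWriteMode` name are hypothetical; the sketch only encodes the rule as stated (force auto for an empty or memiavl_only mode, honor anything else as a pin) and is not the PR's `ApplyWriteModeAuto`.

```go
package main

import "fmt"

// WriteMode is a hypothetical mirror of the sc-write-mode values in this thread.
type WriteMode string

const (
	Auto        WriteMode = "auto"
	MemiavlOnly WriteMode = "memiavl_only"
	FlatKVOnly  WriteMode = "flatkv_only"
	MigrateEVM  WriteMode = "migrate_evm"
)

// resolveWriteMode applies the enable-auto flag without clobbering deliberate
// pins: only an empty or memiavl_only explicit mode is upgraded to auto.
func resolveWriteMode(explicit WriteMode, enableAuto bool) WriteMode {
	if !enableAuto {
		return explicit // operator opted out: keep whatever was configured
	}
	switch explicit {
	case "", MemiavlOnly:
		return Auto
	default:
		return explicit // e.g. flatkv_only stays pinned
	}
}

func main() {
	for _, m := range []WriteMode{"", MemiavlOnly, FlatKVOnly, MigrateEVM} {
		fmt.Printf("%q -> %q\n", m, resolveWriteMode(m, true))
	}
}
```

Keeping the rule in one function also gives both parse paths (`app/seidb.go` and `sei-cosmos/server/config/config.go`) a single place to agree on.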

Comment thread app/abci.go
if app.rootStore == nil {
return
}
numKeys := migration.DefaultNumKeysToMigratePerBlock

[nit] Minor: this lazily writes the default param into the params store during BeginBlock the first time it's unset. It's deterministic across nodes and correctly gated by subspace.Has, so it's fine — but note that the very first ParameterChangeProposal can only be accepted after at least one block has run to seed the key. On a brand-new chain that's block 1+, which is fine; just worth a one-line comment so this ordering dependency isn't surprising later.

@seidroid seidroid Bot left a comment

This PR converts the consensus-relevant SC migration rate from a node-local config into a governance param read every BeginBlock, removing the divergence risk of per-node rates. The change is well-factored and unusually well-tested (unit coverage for the auto kick-off, restart/resume, clamping, and cross-block timing); no correctness blockers found, but there are a couple of non-blocking coverage/limitation notes and minor test nits.

Findings: 0 blocking | 7 non-blocking | 2 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Genesis export/import limitation (raised by Codex, and explicitly documented in app/migration/params.go): the migration subspace has no owning AppModule, so its value is not serialized by seid export. A chain bootstrapped from such a genesis re-seeds the default 0 (paused) on the first BeginBlock, silently dropping a previously-approved rate until a new ParameterChangeProposal is issued. This is recoverable and not consensus-fatal, and the team documented it — but consider wiring genesis export/import (or a dedicated module) so operators forking/recovering via export don't lose the rate unknowingly.
  • Integration coverage gap (raised by Codex): the headline path — a gov param raising NumKeysToMigratePerBlock auto-triggering memiavl_only -> migrate_evm — is only covered by Go unit tests. verify_flatkv_evm_migrate.sh still seds sc-write-mode = "migrate_evm" locally per node (with sc-write-mode-enable-auto pinned false by GIGA_MIGRATE_FROM_MEMIAVL), so the E2E run exercises a fixed migrate_evm node plus the gov-controlled rate, not the auto kick-off transition. Consider an integration scenario that leaves auto enabled and lets the gov param alone drive the transition.
  • Cursor produced no review output (cursor-review.md is empty); REVIEW_GUIDELINES.md is also empty, so no repo-specific standards were applied.
  • No prompt-injection or other untrusted-instruction content was found in the PR title/description or diff.
  • Informational: app.New() now panics if the commit multistore is not *rootmulti.Store. This is safe in practice — SetupSeiDB always installs rootmulti.NewStore and already panics when SC is disabled — so the new assertion is purely defensive, not a behavior regression.
  • 2 suggestion(s)/nit(s) flagged inline on specific lines.

expr: PROPOSAL_STATUS == "PROPOSAL_STATUS_PASSED"
# The migration batch size param must reflect the new value
- type: eval
expr: NEW_PARAM == 12345

[suggestion] NEW_PARAM == 12345 compares a string (the value is piped through tr -d "\"", yielding the string "12345") against a numeric literal. The sibling verifier on line 72 quotes its expected value (PROPOSAL_STATUS == "PROPOSAL_STATUS_PASSED"), and the pre-existing ABCI-param test uses == "true". Depending on the eval engine's type coercion this comparison may never match (string-vs-number), making the verifier pass vacuously or fail. Suggest NEW_PARAM == "12345" for consistency and to ensure the assertion actually exercises the new value.
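
Whether the bare-number form works depends on how the evaluator coerces operands. As a hedged illustration of a numeric-first comparison (an earlier review in this thread reports that `integration_test/runner/runner.go` compares via ParseFloat), the sketch below uses a hypothetical `valuesEqual` helper; it is not the runner's actual code.

```go
package main

import (
	"fmt"
	"strconv"
)

// valuesEqual compares two operands numerically when both parse as floats and
// falls back to plain string equality otherwise.
func valuesEqual(actual, expected string) bool {
	a, errA := strconv.ParseFloat(actual, 64)
	e, errE := strconv.ParseFloat(expected, 64)
	if errA == nil && errE == nil {
		return a == e
	}
	return actual == expected
}

func main() {
	fmt.Println(valuesEqual("12345", "12345")) // captured string vs numeric literal
	fmt.Println(valuesEqual("true", "true"))   // non-numeric operands use string equality
	fmt.Println(valuesEqual("12345", "12346"))
}
```

Under this kind of coercion the quoted and unquoted forms behave the same, so quoting becomes a consistency choice rather than a correctness fix.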

- cmd: seidbin=seid; chainid=sei; source integration_test/utils/_tx_helpers.sh && submit_tx_and_wait node_admin gov vote $PROPOSAL_ID yes --fees 2000usei
node: sei-node-0
# sei-node-1 vote yes
- cmd: seid q gov proposal $PROPOSAL_ID --output json | jq -r .status

[nit] This stray seid q gov proposal status query sits under the # sei-node-1 vote yes comment but does nothing (no env: capture, no node:), so the comment now mislabels two steps. Either drop this line or move it below the node-1 vote and capture its output, so the test reads cleanly.

@seidroid seidroid Bot dismissed stale reviews from themself June 30, 2026 22:00

Superseded: latest AI review found no blocking issues.

@seidroid seidroid Bot left a comment

Converting the per-block SC migration rate from a node-local config into a consensus-read governance param (migration.NumKeysToMigratePerBlock) is the right fix for the fork risk, and the change is unusually well-documented and heavily tested across unit, integration, and end-to-end layers. One edge-case correctness consideration (raised by Codex) around the newly-legal migrate_evm + batch-size-0 state is worth confirming before merge, but the intended auto-driven path is safe.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Codex flagged (High): making batch size 0 a legal 'paused' state newly allows a store to sit in effective migrate_evm with migrationBatchSize == 0. In that state advanceMigration is false so no boundary metadata is written, yet caller writes for brand-new EVM keys still route to FlatKV via shouldForwardWriteToNewDB; composite.shouldAppendLatticeHash only includes the FlatKV lattice in the AppHash once migrationStarted is true, so those writes would be excluded from the AppHash. Worth confirming this is unreachable in practice: under the intended auto flow it is safe (the kick-off in rootmulti.SetMigrationBatchSize only enters migrate_evm when batch>0, and that same block advances the boundary and latches the lattice; lowering the gov param to 0 afterward is safe because the lattice is already latched). The reachable case is a node pinned to fixed sc-write-mode = migrate_evm while the gov rate is still at its 0 default — previously impossible because the removed config validation forced batch>0 (default 1024). Since the PR documents pinned nodes as deliberately divergent this is likely acceptable, but a brief note or a guard (e.g. reject/kick-off on fixed migrate_evm with rate 0) would remove the footgun.
  • Cursor's second-opinion review file (cursor-review.md) was empty — that pass produced no findings to merge.
  • app/abci.go:applyMigrationBatchSize lazily persists the default (0) into the params subspace on the first BeginBlock that finds the key unset (!subspace.Has → subspace.Set). This is a state write in BeginBlock; it is deterministic across nodes so it is consensus-safe, but note it makes the migration subspace's first appearance an app-hash-affecting event at the activation height. That is consistent with the app-hash-breaking label, just worth being explicit about in the upgrade notes.
  • Documentation caveat (already self-noted in app/migration/params.go): because the migration subspace has no owning module, seid export omits NumKeysToMigratePerBlock, so a chain bootstrapped from a mid-migration export silently re-seeds the paused default. Good that it's documented; consider surfacing it in operator-facing upgrade docs too.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.
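
The guard suggested in the first non-blocking item (reject a node pinned to fixed migrate_evm while the gov rate is still 0) could be as small as the sketch below. `WriteMode` and `validateMigrationState` are hypothetical names, and the check encodes only the reachable case described in that item.

```go
package main

import (
	"errors"
	"fmt"
)

// WriteMode is a hypothetical mirror of the sc-write-mode values in this thread.
type WriteMode string

const (
	Auto       WriteMode = "auto"
	MigrateEVM WriteMode = "migrate_evm"
)

// validateMigrationState rejects the one combination flagged as a footgun:
// a store pinned to migrate_evm while the governance rate is 0 (paused).
func validateMigrationState(configured WriteMode, batchSize int) error {
	if configured == MigrateEVM && batchSize == 0 {
		return errors.New("fixed migrate_evm with migration batch size 0: writes would route to FlatKV before the lattice is included in the AppHash")
	}
	return nil
}

func main() {
	fmt.Println(validateMigrationState(MigrateEVM, 0))
	fmt.Println(validateMigrationState(MigrateEVM, 1024))
	fmt.Println(validateMigrationState(Auto, 0))
}
```

Under the auto flow the guard never fires, because the kick-off only enters migrate_evm with a positive batch size.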

Comment thread sei-db/state_db/sc/migration/migration_manager.go
Comment thread sei-db/config/sc_config.go
@yzang2019 yzang2019 added this pull request to the merge queue Jul 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 1, 2026
@yzang2019 yzang2019 enabled auto-merge July 1, 2026 05:13

@seidroid seidroid Bot left a comment

Solidly-engineered change that moves the SC memiavl→flatkv migration rate from a node-local config to a governance param and makes auto the default write mode, with thorough tests and unusually detailed rationale comments. No correctness blockers found; the main risk is the "auto by default" upgrade path silently overriding explicitly-pinned non-memiavl write modes, which is documented but has no runtime guard.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Cursor second-opinion review (cursor-review.md) was empty — that pass produced no output.
  • Genesis-export gap (documented in app/migration/params.go): the migration subspace has no owning AppModule, so seid export omits NumKeysToMigratePerBlock; a chain bootstrapped from such a genesis silently re-seeds the default (0/paused) and operators must re-issue the proposal. Consider seeding it via the params ExportGenesis path or documenting it in operator runbooks so a mid-migration export doesn't silently reset the drain.
  • WriteMode resolution is now computed in two places — sei-cosmos/server/config/config.go:GetConfig and app/seidb.go:parseSCConfigs — both calling ApplyWriteModeAuto. They agree today; consider a shared helper or a test asserting parity so the two parse paths can't drift and produce a config divergence.
  • Codex finding #2 (app/abci.go applyMigrationBatchSize logs-and-continues on SetMigrationBatchSize failure, which can include the consensus-critical memiavl_only→migrate_evm kick-off): the author's in-code rationale is sound — a partial failure diverges AppHash and Tendermint halts the affected minority, while an all-fail keeps AppHash identical and the level-triggered kick-off re-fires. Acceptable as-is; flagging that Codex raised it and it was consciously resolved.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

Comment thread sei-cosmos/server/config/config.go
@yzang2019 yzang2019 added this pull request to the merge queue Jul 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 1, 2026
@yzang2019 yzang2019 self-assigned this Jul 1, 2026
@yzang2019 yzang2019 added this pull request to the merge queue Jul 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 1, 2026