test(grpc-suite): add post-upgrade gRPC surface coverage#2075
test(grpc-suite): add post-upgrade gRPC surface coverage#2075chalabi2 wants to merge 15 commits into
Conversation
Adds tests/upgrade/grpcsuite: a harness-agnostic suite that verifies the chain's gRPC API surface after an upgrade. It dynamically discovers every served query (gRPC reflection) and every registered Msg (interface registry), exercises them strictly over the gRPC API, and fails a coverage gate if any in-scope RPC was never run -- so a new RPC added by a future upgrade turns CI red until a case exists. Queries are auto-covered by a smoke sweep (130/130); transactions are authored, dependency-ordered scenarios sharing a World. Gov-gated messages use a batched propose/vote/wait fast-path. The suite runs as the universal post-upgrade worker (every upgrade) against a testnetify-forked node, and via an in-process driver for fast/CI runs (make test-grpc-surface, new grpc-surface CI job). Current tx coverage: deployment lifecycle, provider, and all MsgUpdateParams; remaining Akash packs (cert, market, escrow, audit, oracle, bme) are tracked by the coverage gate's UNCOVERED list. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Covers the full market bid/lease lifecycle (create bid, create lease, withdraw, lease-start-reclaim rejection, close lease, close bid), auditor attribute sign/delete, and escrow account deposit. Adds a WaitBlocks helper (lease payments settle a block after creation) and a reusable createDeploymentFromSDL helper for packs that need extra orders/escrow accounts. Akash tx coverage now 23/30; queries 130/130. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Generates an mTLS client certificate via KeyPairManager (into a temp dir), PEM-wraps the DER cert/pubkey as the chain expects, creates and then revokes it. Akash tx coverage now 25/30 (remaining: oracle AddPriceEntry, bme). Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Authorizes a price source via a gov param change, then submits an AKT/USD price entry (with a block-time timestamp the oracle accepts). Adds a LatestBlockTime helper. Akash tx coverage now 26/30 (remaining: bme). Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Completes Akash transaction coverage at 30/30 (queries 130/130). The bme pack funds the vault (gov), mints ACT, and tolerates the circuit-breaker rejection on burn ACT / burn-mint (only minting is enabled on main); it re-feeds a fresh oracle price first since bme refuses stale prices. Adds a BroadcastTolerant helper and a reusable feedAKTPrice helper. RequireFullCoverage is now true in both drivers, so the coverage gate fails if any in-scope Akash tx or query stops being exercised. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
… module Every Akash query RPC is now exercised with real, populated inputs (asserting content + pagination + filters), and every major Akash tx has negative/edge cases (not-found, signer mismatch, duplicate, out-of-bounds, disabled/gov-gated) that assert the expected rejection. Strict full-coverage gate stays green (tx 30/30, query 130/130). Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Deployment deposits must be in uact: the chain requires the deposit denom
to match the uact group price. uact is non-transferable on the real forked
chain (bank Send is gov-disabled), so funding sub-accounts with uact via
MsgSend failed there. Make the deployment owner ("tenant") the funder,
which already holds uact on both the forked chain and the in-process net,
and have it deposit its own uact directly. A deposit is an account->module
operation that bypasses SendEnabled, so this mirrors how tenants spend uact
on the real chain. Provider sub-accounts keep their uakt bid deposit.
Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Run the exhaustive gRPC tx/query suite as an optional post-upgrade step rather than unconditionally, matching how the hermes relayer integration is opt-in. The universal post-upgrade worker now runs only when -grpc-suite is set, which the upgrade make target exposes as `make test GRPC_SUITE=true`. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Real testnetify forks can run with tx indexing disabled, so GetTx is not enough to prove broadcast inclusion. Scan committed blocks after CheckTx and rely on state assertions when index events are unavailable. The real fork also starts validator0 without uact. Bootstrap ACT through the BME flow before deployment deposits and refresh the oracle price while waiting for mints to open. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Adds authored Cosmos SDK packs to the post-upgrade gRPC suite so the coverage gate exercises every discovered in-scope transaction and query on both the in-process driver and testnetify-forked upgrade path. Also hardens fork state preparation around rollback/testnetify so cached upgrade runs start from clean pre-upgrade state. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Adds explicit tx-only and query-only local runners for the gRPC surface suite while keeping the default target as the full tx/query coverage gate. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughIntroduces a comprehensive ChangesgRPC Full-Surface Test Suite
Sequence Diagram(s)sequenceDiagram
participant TestFullSurface as TestFullSurfaceGRPC
participant Network as testutil.Network
participant RunFunc as grpcsuite.Run
participant Discovery as Discovery
participant Pack as Pack
participant Broadcaster as Broadcaster
participant Coverage as Coverage
TestFullSurface->>Network: New(t, cfg) with gov genesis intercept
Network-->>TestFullSurface: single validator with GRPCAddress
TestFullSurface->>RunFunc: Run(ctx, t, Env{gRPC, RequireFullCoverage:true})
RunFunc->>Discovery: discover() via reflection + descriptors
Discovery-->>RunFunc: inScopeMsgs[], inScopeQueries[]
loop each available Pack in dependency order
RunFunc->>Pack: Run(Suite)
activate Pack
Note over Pack: check Available(Discovery)
loop transactions to broadcast
Pack->>Broadcaster: Broadcast(fromName, msgs...)
Broadcaster-->>Pack: TxResponse (local sign + gRPC broadcast + inclusion poll)
end
Note over Pack: query gRPC endpoints per module
Pack-->>RunFunc: completion
deactivate Pack
end
RunFunc->>RunFunc: querySmokeSweep() deterministic query invocation
RunFunc->>Coverage: AssertCoverage(RunMode)
Coverage-->>RunFunc: pass/fail based on coverage gaps and strictness
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (2)
tests/upgrade/grpcsuite/pack_cosmos_bank.go (1)
64-65: ⚡ Quick winStop swallowing denom-metadata query errors.
These two calls currently bypass assertions, so endpoint regressions can slip through as false positives in this pack. Assert explicit acceptable outcomes (
niland, if intended,NotFound) instead of discardingerr.♻️ Suggested change
- _, _ = q.DenomMetadata(s.Ctx, &banktypes.QueryDenomMetadataRequest{Denom: s.Env.BondDenom}) - _, _ = q.DenomMetadataByQueryString(s.Ctx, &banktypes.QueryDenomMetadataByQueryStringRequest{Denom: s.Env.BondDenom}) + _, err = q.DenomMetadata(s.Ctx, &banktypes.QueryDenomMetadataRequest{Denom: s.Env.BondDenom}) + require.NoError(s.T, err, "bank DenomMetadata") + _, err = q.DenomMetadataByQueryString(s.Ctx, &banktypes.QueryDenomMetadataByQueryStringRequest{Denom: s.Env.BondDenom}) + require.NoError(s.T, err, "bank DenomMetadataByQueryString")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/upgrade/grpcsuite/pack_cosmos_bank.go` around lines 64 - 65, The DenomMetadata and DenomMetadataByQueryString method calls in the test are currently discarding errors by using blank identifiers (_, _), which allows endpoint regressions to go undetected. Replace the blank error identifiers in both calls with actual error variables (err1, err2 or similar names), and add assertions after each call to verify the errors are either nil or NotFound as appropriate for your test expectations. This ensures that unexpected errors from these query methods will cause the test to fail rather than silently passing.tests/upgrade/grpcsuite/pack_audit.go (1)
19-21: ⚡ Quick winUse the published provider world key instead of hardcoding
"provider".Line 19 validates
wProviderSigner, but Line 20 (andauditNegatives) bypasses it. This weakens the cross-pack contract and will break if the provider signer name ever changes.Suggested patch
func (auditPack) Run(s *Suite) { - require.NotEmpty(s.T, s.World.Get(wProviderSigner), "audit pack needs the provider pack") - provider := s.Addr("provider") + providerSigner := s.World.Get(wProviderSigner) + require.NotEmpty(s.T, providerSigner, "audit pack needs the provider pack") + provider := s.Addr(providerSigner) auditor := s.FundAccountDefault("auditor") @@ - auditNegatives(s, auditor) + auditNegatives(s, provider.String(), auditor) @@ -func auditNegatives(s *Suite, auditor sdk.AccAddress) { +func auditNegatives(s *Suite, provider string, auditor sdk.AccAddress) { s.T.Helper() - provider := s.Addr("provider").String()Also applies to: 56-57, 67-70
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/upgrade/grpcsuite/pack_audit.go` around lines 19 - 21, Replace the hardcoded `"provider"` string with the `wProviderSigner` world key variable in the `s.Addr()` call on line 20 and in all other similar locations (lines 56-57 and 67-70 referenced in the comment). This ensures consistency with the validation on line 19 that checks `wProviderSigner` exists, maintaining the cross-pack contract by using the same published world key throughout instead of relying on hardcoded strings that could become inconsistent if the provider signer name changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/tests.yaml:
- Around line 51-58: The grpc-surface job requires security hardening to follow
least-privilege principles. Add a top-level permissions block to the job that
explicitly sets read-only permissions for contents. In the Checkout code step,
pin the actions/checkout@v4 reference to a specific commit SHA instead of using
the floating v4 tag, and add persist-credentials: false as an input parameter to
prevent credential persistence. This ensures the job only has the minimum
required permissions and reduces the attack surface of the CI workflow.
In `@make/test-upgrade.mk`:
- Line 91: The prepare-state target is missing the --snapshot-url parameter in
the upgrades.sh command invocation. Add --snapshot-url=$(SNAPSHOT_URL) to the
script call to ensure it matches the parameter set used elsewhere (such as line
87), preventing the prepare-state target from using an unintended or default
snapshot source when called directly with make prepare-state.
In `@script/upgrades.sh`:
- Around line 534-535: Remove the duplicate rollback command in the upgrades
script. The lines containing
"$rvaldir/cosmovisor/upgrades/$UPGRADE_TO/bin/akash" rollback --home "$rvaldir"
--hard are identical and appear consecutively, so delete one of the two
duplicate lines to ensure only a single rollback operation is performed.
In `@tests/upgrade/grpcsuite/pack_bme.go`:
- Around line 24-38: The bmePack.Run method calls feedAKTPrice(3) which depends
on a pre-created "pricewriter" account from the oracle pack, but this dependency
is implicit and not guaranteed to be initialized before bmePack runs. Add an
explicit dependency on the oracle pack to ensure the required account and oracle
infrastructure are available, or alternatively add a guard or assertion in the
Run method to verify that the required oracle state exists before calling
feedAKTPrice(3).
In `@tests/upgrade/grpcsuite/pack_gov.go`:
- Around line 38-127: Each module parameter query block that checks HasModule
for modules like "akash.deployment.v1beta4", "cosmos.bank.v1beta1", etc.,
currently silently ignores errors by only appending messages when err is nil.
Apply a fail-fast pattern to all these blocks by checking if err is not nil
after each Params query and returning an error instead of continuing silently,
ensuring that if a module is available but the params query fails, the pack
fails appropriately rather than omitting messages and passing incorrectly.
In `@tests/upgrade/grpcsuite/pack_market.go`:
- Line 59: The assertOrdersPaginate method fails to validate actual pagination
behavior because it executes a query but discards both the returned error and
result at lines 137-138. To fix this, modify the assertOrdersPaginate method
implementation to capture the error and result from the query execution instead
of dropping them, then add assertions to verify the pagination behavior is
correct (such as validating the number of items returned, checking limit and
offset parameters are respected, or confirming page boundaries). This will
ensure the pagination check actually validates the expected behavior rather than
always passing.
- Around line 199-205: The q.Orders gRPC call within the for loop is using s.Ctx
instead of the timeout context ctx that was created on the preceding line, which
means the 30 second timeout established by context.WithTimeout is never applied
to the RPC request. Replace s.Ctx with ctx in the q.Orders method call to ensure
the timeout context with its deadline and cancellation is properly used for the
gRPC operation.
In `@tests/upgrade/grpcsuite/smoke.go`:
- Around line 59-64: The smoke test in the invoke section currently only treats
Unimplemented status codes as failures, but other transient gRPC errors like
Unavailable, DeadlineExceeded, and Internal should also be treated as smoke
failures since they indicate handlers weren't reliably reached. Modify the error
handling block after s.Conn.Invoke to check for these additional status codes in
addition to codes.Unimplemented and report each of them as errors using
s.T.Errorf with appropriate error messages for each case.
In `@tests/upgrade/grpcsuite/txbroadcast.go`:
- Around line 143-144: The `waitForTx` function creates a 2-minute timeout
context, but the `LatestHeight()` call within the height polling loop uses the
default context (`s.Ctx`) instead of the timeout-scoped context. This allows
gRPC calls to hang beyond the intended timeout. Modify the code to pass the
timeout-scoped context created in `waitForTx` to the `LatestHeight()` call (or
create a context-aware version of this method that accepts a context parameter).
Apply this fix throughout the polling logic in the loop that starts with `latest
:= b.s.LatestHeight()` and continues through the height iteration range.
- Around line 92-99: The coverage recording for messages is happening before the
sign operation completes. Move the for loop that iterates through msgs and calls
b.s.Cov.recordMsg(sdk.MsgTypeURL(m)) to execute after the b.sign() call
succeeds. Specifically, place this loop after the error check for b.sign() so
that coverage is only recorded when signing actually succeeds, ensuring that the
tx broadcast path is properly exercised before marking it as covered.
---
Nitpick comments:
In `@tests/upgrade/grpcsuite/pack_audit.go`:
- Around line 19-21: Replace the hardcoded `"provider"` string with the
`wProviderSigner` world key variable in the `s.Addr()` call on line 20 and in
all other similar locations (lines 56-57 and 67-70 referenced in the comment).
This ensures consistency with the validation on line 19 that checks
`wProviderSigner` exists, maintaining the cross-pack contract by using the same
published world key throughout instead of relying on hardcoded strings that
could become inconsistent if the provider signer name changes.
In `@tests/upgrade/grpcsuite/pack_cosmos_bank.go`:
- Around line 64-65: The DenomMetadata and DenomMetadataByQueryString method
calls in the test are currently discarding errors by using blank identifiers (_,
_), which allows endpoint regressions to go undetected. Replace the blank error
identifiers in both calls with actual error variables (err1, err2 or similar
names), and add assertions after each call to verify the errors are either nil
or NotFound as appropriate for your test expectations. This ensures that
unexpected errors from these query methods will cause the test to fail rather
than silently passing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: daa2870b-963c-47b6-8274-04ed1e1fe075
📒 Files selected for processing (36)
.github/workflows/tests.yamlcmd/akash/cmd/root.gomake/test-integration.mkmake/test-upgrade.mkscript/upgrades.shtests/fullsurface/fullsurface_test.gotests/upgrade/grpcsuite/README.mdtests/upgrade/grpcsuite/bootstrap.gotests/upgrade/grpcsuite/coverage.gotests/upgrade/grpcsuite/env.gotests/upgrade/grpcsuite/gov.gotests/upgrade/grpcsuite/grpcconn.gotests/upgrade/grpcsuite/pack.gotests/upgrade/grpcsuite/pack_audit.gotests/upgrade/grpcsuite/pack_bme.gotests/upgrade/grpcsuite/pack_cert.gotests/upgrade/grpcsuite/pack_cosmos_auth_vesting.gotests/upgrade/grpcsuite/pack_cosmos_authz.gotests/upgrade/grpcsuite/pack_cosmos_bank.gotests/upgrade/grpcsuite/pack_cosmos_distribution.gotests/upgrade/grpcsuite/pack_cosmos_feegrant.gotests/upgrade/grpcsuite/pack_cosmos_gov.gotests/upgrade/grpcsuite/pack_cosmos_helpers.gotests/upgrade/grpcsuite/pack_cosmos_misc.gotests/upgrade/grpcsuite/pack_cosmos_staking.gotests/upgrade/grpcsuite/pack_deployment.gotests/upgrade/grpcsuite/pack_escrow.gotests/upgrade/grpcsuite/pack_gov.gotests/upgrade/grpcsuite/pack_market.gotests/upgrade/grpcsuite/pack_oracle.gotests/upgrade/grpcsuite/pack_provider.gotests/upgrade/grpcsuite/smoke.gotests/upgrade/grpcsuite/txbroadcast.gotests/upgrade/grpcsurface_worker_test.gotests/upgrade/types/types.gotests/upgrade/upgrade_test.go
Tighten the post-upgrade gRPC suite around review findings: assert params and pagination queries, keep tx coverage behind local signability, use bounded contexts during tx inclusion polling, and document intentional fork rollback behavior. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Assert acceptable bank metadata query outcomes and route audit provider lookups through the published provider signer world key. Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
Description
Closes: 636
Adds the post-upgrade gRPC surface suite for testnetify-forked upgrade runs.
The suite exercises the mounted Akash and Cosmos SDK gRPC transaction/query
surface after an upgrade and fails when discovered in-scope RPCs are not covered.
Key pieces:
tests/upgrade/grpcsuite: authored tx packs, query smoke sweep, coverage gatetests/upgrade/grpcsurface_worker_test.go: universal post-upgrade workertests/fullsurface: fast in-process local/CI drivermake test-grpc-surface[-tx|-query]: local runnersValidation:
make -C tests/upgrade test: PASS, coveragetx 76/76,query 130/130make test-grpc-surface: PASS, coveragetx 76/76,query 130/130make test-grpc-surface-tx: PASS, coveragetx 76/76make test-grpc-surface-query: PASS, coveragequery 130/130