Skip to content

Move validator unregister out of shutdown path#1314

Merged
GabrielePicco merged 2 commits into
masterfrom
fix/nonblocking-validator-unregister
Jun 12, 2026
Merged

Move validator unregister out of shutdown path#1314
GabrielePicco merged 2 commits into
masterfrom
fix/nonblocking-validator-unregister

Conversation

@GabrielePicco

@GabrielePicco GabrielePicco commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Send the validator domain-registry unregister transaction immediately after the shutdown signal while RPC is still serving.
  • Keep transaction confirmation at confirmed commitment, but run it as a best-effort background log task.
  • Avoid waiting for confirmation during shutdown; after a successful send, the worst case is sent but unconfirmed in local logs.

Why

The user-visible outage begins when RPC is cancelled, not when shutdown preparation starts. Awaiting the send before shutdown preparation keeps the transaction from being cancelled when the Tokio runtime drops, while confirmation no longer gates perceived downtime.

Summary by CodeRabbit

  • New Features

    • Configurable RPC commitment for blockchain operations.
    • Public APIs to submit unregistration without waiting and to submit-and-confirm in background (returns a task handle).
  • Improvements

    • Shutdown flow now initiates background unregistration and avoids blocking on confirmation.
    • Transaction sending split into confirmed and non-confirmed paths for clearer, non-blocking behavior.
  • Behavior

    • Unregistration now de-duplicates concurrent attempts and logs background confirmation progress.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1890a2e0-b847-4dad-bee7-d2f0f331ed09

📥 Commits

Reviewing files that changed from the base of the PR and between 022c604 and 433f983.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • magicblock-api/Cargo.toml
  • magicblock-api/src/domain_registry_manager.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-validator/src/main.rs

📝 Walkthrough

Walkthrough

DomainRegistryManager gains a configurable RPC commitment via pub fn new_with_commitment and uses it when fetching account data. Transaction sending is refactored into shared build_transaction plus confirmed and unconfirmed send paths; send_unregister submits without confirmation. New static APIs submit unregistration and optionally spawn a background Tokio runtime to wait for confirmation. MagicValidator exposes start_unregister_validator_on_chain to trigger a conditional background unregister, stores a JoinHandle, and shutdown now coordinates with that background task without blocking for in-progress confirmations.

Suggested reviewers

  • bmuddha
  • taco-paco
  • thlorenz
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/nonblocking-validator-unregister

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@GabrielePicco GabrielePicco marked this pull request as ready for review June 12, 2026 10:57

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a9acb580e0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread magicblock-validator/src/main.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-validator/src/main.rs`:
- Around line 159-171: The unregister task may be aborted when the main runtime
drops; instead await it with a bounded timeout after calling api.stop(): if
api.spawn_unregister_validator_on_chain() returned Some(handle), call
tokio::time::timeout(Duration::from_secs(<N>), handle) (or timeout on
handle.await) to keep the runtime alive for that grace period, log success or a
warning on timeout/failure, and only then return; alternatively move spawning to
a dedicated shutdown runtime/thread, but the quickest fix is wrapping the
existing handle.await in tokio::time::timeout and handling the timeout/error
paths so the unregister gets a chance to confirm on-chain.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a3eceb9d-8ae2-447b-926f-720cd0e700dd

📥 Commits

Reviewing files that changed from the base of the PR and between 42687b9 and a9acb58.

📒 Files selected for processing (3)
  • magicblock-api/src/domain_registry_manager.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-validator/src/main.rs

Comment thread magicblock-validator/src/main.rs Outdated
@GabrielePicco GabrielePicco force-pushed the fix/nonblocking-validator-unregister branch from a9acb58 to 7eb1a26 Compare June 12, 2026 11:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7eb1a26f6e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread magicblock-validator/src/main.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-api/src/domain_registry_manager.rs`:
- Around line 282-305: confirm_signature currently polls indefinitely; modify it
to enforce a bounded timeout (e.g., 30s or a configurable Duration) so it
returns an explicit timeout error instead of looping forever. Implement this by
wrapping the polling loop in a tokio::time::timeout (or by checking
Instant::now() against a deadline) and return a clear Error variant or
anyhow::Error with context like "confirm_signature timed out" when the timeout
elapses; keep the existing handling for Some(Ok(())) and Some(Err(err)) and
continue to await sleep(Duration::from_millis(500)) between polls. Use the
function name confirm_signature and references to
get_signature_status_with_commitment and sleep to locate and update the code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c2766dd3-c594-4414-ab55-895e92cf0e39

📥 Commits

Reviewing files that changed from the base of the PR and between a9acb58 and 7eb1a26.

📒 Files selected for processing (3)
  • magicblock-api/src/domain_registry_manager.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-validator/src/main.rs

Comment thread magicblock-api/src/domain_registry_manager.rs Outdated

@thlorenz thlorenz left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after addressing the nits I pointed out.

Comment thread magicblock-api/src/magic_validator.rs
Comment thread magicblock-api/src/domain_registry_manager.rs Outdated

@Dodecahedr0x Dodecahedr0x left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, small comments

Comment thread magicblock-api/src/domain_registry_manager.rs Outdated
Comment thread magicblock-api/src/magic_validator.rs Outdated
@GabrielePicco GabrielePicco force-pushed the fix/nonblocking-validator-unregister branch from a824680 to 460fdc4 Compare June 12, 2026 15:52

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 460fdc4813

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread magicblock-api/src/magic_validator.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
magicblock-api/src/magic_validator.rs (1)

687-690: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align the shutdown unregister gate with the startup registration gate.

start_unregister_validator_on_chain can now run for any mode where CoordinationMode::current().needs_onchain_interactions() is true, but the startup registration path is only spawned under self.is_standalone in Lines 977-988. That mismatch lets a replicated primary hit the new send-only unregister path even though this process never registered on chain; with the existence check gone from the send path, the failure is deferred until background confirmation after the transaction has already been submitted.

Add the same self.is_standalone guard here, or factor a shared predicate used by both startup registration and shutdown unregister so the two paths cannot drift.

Suggested fix
         if self.config.chain_operation.is_none()
+            || !self.is_standalone
             || !matches!(self.config.lifecycle, LifecycleMode::Ephemeral)
             || !CoordinationMode::current().needs_onchain_interactions()
         {
             return None;
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-api/src/magic_validator.rs` around lines 687 - 690, The shutdown
unregister path (start_unregister_validator_on_chain) is allowed whenever
CoordinationMode::current().needs_onchain_interactions() is true, but the
startup registration is only started under self.is_standalone, causing a
mismatch; update start_unregister_validator_on_chain to include the same
self.is_standalone guard (or extract a shared predicate used by both the startup
registration spawn site and start_unregister_validator_on_chain) so both
registration and unregister paths use the identical condition and cannot drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 687-690: The shutdown unregister path
(start_unregister_validator_on_chain) is allowed whenever
CoordinationMode::current().needs_onchain_interactions() is true, but the
startup registration is only started under self.is_standalone, causing a
mismatch; update start_unregister_validator_on_chain to include the same
self.is_standalone guard (or extract a shared predicate used by both the startup
registration spawn site and start_unregister_validator_on_chain) so both
registration and unregister paths use the identical condition and cannot drift.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cf2a791d-43ef-4cce-9235-37db13e76614

📥 Commits

Reviewing files that changed from the base of the PR and between 7eb1a26 and 460fdc4.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • magicblock-api/Cargo.toml
  • magicblock-api/src/domain_registry_manager.rs
  • magicblock-api/src/magic_validator.rs

@GabrielePicco GabrielePicco force-pushed the fix/nonblocking-validator-unregister branch from 460fdc4 to 022c604 Compare June 12, 2026 16:11

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-api/src/domain_registry_manager.rs`:
- Around line 279-312: The background confirmation task in
send_unregistration_and_confirm_in_background_static currently uses tokio::spawn
and rpc_client.wait_for_confirmed_status, which can be starved by
MagicValidator::stop calling blocking JoinHandle::join on the Tokio worker pool;
change the confirmation spawn to run on a dedicated thread/runtime (e.g., spawn
a std::thread::spawn that builds a small tokio::runtime::Runtime and calls
runtime.block_on(rpc_client.wait_for_confirmed_status(...))) so the
wait_for_confirmed_status future does not consume the shared Tokio worker pool;
additionally, replace all production panics (.expect()/.unwrap()) referenced
(calls like api.start().expect(...),
dispatch.replication_messages.take().expect(...), the runtime build expects in
magic_validator.rs and main.rs, .path(...).expect(...), and .parse().unwrap())
with Result-based error propagation or explicit error conversions so
startup/shutdown return Errors instead of panicking.

In `@magicblock-api/src/magic_validator.rs`:
- Around line 686-695: The shutdown path in start_unregister_validator_on_chain
runs for any ephemeral node with on-chain interactions and can unregister
validators this process never registered; add the same standalone guard used in
start() by returning early if self.is_standalone is false (i.e., check
self.is_standalone before proceeding) so start_unregister_validator_on_chain
only runs for standalone nodes, preserving the invariant that only processes
which performed on-chain registration will attempt unregistering; update the
guard alongside the existing checks (unregister_handle, config.chain_operation,
LifecycleMode::Ephemeral,
CoordinationMode::current().needs_onchain_interactions()) in
start_unregister_validator_on_chain.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cfb2460e-ba58-43fe-8382-51a1dff68c64

📥 Commits

Reviewing files that changed from the base of the PR and between 460fdc4 and 022c604.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • magicblock-api/Cargo.toml
  • magicblock-api/src/domain_registry_manager.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-validator/src/main.rs

Comment thread magicblock-api/src/domain_registry_manager.rs
Comment thread magicblock-api/src/magic_validator.rs
@GabrielePicco GabrielePicco force-pushed the fix/nonblocking-validator-unregister branch from 022c604 to 433f983 Compare June 12, 2026 16:31
@GabrielePicco GabrielePicco merged commit 7f38f43 into master Jun 12, 2026
33 checks passed
@GabrielePicco GabrielePicco deleted the fix/nonblocking-validator-unregister branch June 12, 2026 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants