Move validator unregister out of shutdown path #1314
📝 Walkthrough
DomainRegistryManager gains a configurable RPC commitment via `pub fn new_with_commitment` and uses it when fetching account data. Transaction sending is refactored into a shared `build_transaction` plus confirmed and unconfirmed send paths; `send_unregister` submits without confirmation. New static APIs submit unregistration and optionally spawn a background Tokio runtime to wait for confirmation. MagicValidator exposes `start_unregister_validator_on_chain` to trigger a conditional background unregister, stores a JoinHandle, and shutdown now coordinates with that background task without blocking for in-progress confirmations.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9acb580e0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@magicblock-validator/src/main.rs`:
- Around line 159-171: The unregister task may be aborted when the main runtime
drops; instead await it with a bounded timeout after calling api.stop(): if
api.spawn_unregister_validator_on_chain() returned Some(handle), call
tokio::time::timeout(Duration::from_secs(<N>), handle) (or timeout on
handle.await) to keep the runtime alive for that grace period, log success or a
warning on timeout/failure, and only then return; alternatively move spawning to
a dedicated shutdown runtime/thread, but the quickest fix is wrapping the
existing handle.await in tokio::time::timeout and handling the timeout/error
paths so the unregister gets a chance to confirm on-chain.
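The comment above comes down to bounding how long shutdown waits for the unregister task. Below is a minimal std-only sketch of that idea. It uses a dedicated thread and a channel rather than the `tokio::time::timeout` call the comment names, and `wait_with_grace_period`, `UnregisterOutcome`, and the `work` closure are hypothetical stand-ins, not code from this PR.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical outcome of waiting for the background unregister work.
#[derive(Debug, PartialEq)]
enum UnregisterOutcome {
    Confirmed,
    Failed(String),
    TimedOut,
}

/// Runs `work` on a dedicated thread and waits at most `grace` for its result.
/// `work` stands in for the real unregister call, which is not shown here.
fn wait_with_grace_period<F>(work: F, grace: Duration) -> UnregisterOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone; ignoring the send error is fine.
        let _ = tx.send(work());
    });
    match rx.recv_timeout(grace) {
        Ok(Ok(())) => UnregisterOutcome::Confirmed,
        Ok(Err(err)) => UnregisterOutcome::Failed(err),
        Err(RecvTimeoutError::Timeout) => UnregisterOutcome::TimedOut,
        // The worker exited (for example, panicked) without sending a result.
        Err(RecvTimeoutError::Disconnected) => {
            UnregisterOutcome::Failed("worker exited without reporting".to_string())
        }
    }
}
```

A caller would log `Confirmed` as success and the other two variants as warnings, then continue shutting down either way.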
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a3eceb9d-8ae2-447b-926f-720cd0e700dd
📒 Files selected for processing (3)
- magicblock-api/src/domain_registry_manager.rs
- magicblock-api/src/magic_validator.rs
- magicblock-validator/src/main.rs
a9acb58 to 7eb1a26
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7eb1a26f6e
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@magicblock-api/src/domain_registry_manager.rs`:
- Around line 282-305: confirm_signature currently polls indefinitely; modify it
to enforce a bounded timeout (e.g., 30s or a configurable Duration) so it
returns an explicit timeout error instead of looping forever. Implement this by
wrapping the polling loop in a tokio::time::timeout (or by checking
Instant::now() against a deadline) and return a clear Error variant or
anyhow::Error with context like "confirm_signature timed out" when the timeout
elapses; keep the existing handling for Some(Ok(())) and Some(Err(err)) and
continue to await sleep(Duration::from_millis(500)) between polls. Use the
function name confirm_signature and references to
get_signature_status_with_commitment and sleep to locate and update the code.
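The deadline variant suggested above can be sketched without an async runtime. This is a rough illustration only: `confirm_with_deadline` and `ConfirmError` are hypothetical names, and the `poll_status` closure stands in for the `get_signature_status_with_commitment` lookup, returning `None` while no status is available yet.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfirmError {
    TimedOut,
    Transaction(String),
}

/// Polls `poll_status` until it reports a result or `timeout` elapses.
/// `None` means "no status yet", `Some(Ok(()))` means confirmed,
/// and `Some(Err(_))` means the transaction failed.
fn confirm_with_deadline<F>(
    mut poll_status: F,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<(), ConfirmError>
where
    F: FnMut() -> Option<Result<(), String>>,
{
    let deadline = Instant::now() + timeout;
    loop {
        match poll_status() {
            Some(Ok(())) => return Ok(()),
            Some(Err(err)) => return Err(ConfirmError::Transaction(err)),
            None => {}
        }
        // Give up with an explicit error instead of looping forever.
        if Instant::now() >= deadline {
            return Err(ConfirmError::TimedOut);
        }
        sleep(poll_interval);
    }
}
```

In the async code under review, the same shape would use the async `sleep` the comment mentions; the deadline check is the part that carries over.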
thlorenz left a comment
LGTM after addressing the nits I pointed out.
Dodecahedr0x left a comment
LGTM, small comments
a824680 to 460fdc4
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 460fdc4813
♻️ Duplicate comments (1)
magicblock-api/src/magic_validator.rs (1)
687-690: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Align the shutdown unregister gate with the startup registration gate.
`start_unregister_validator_on_chain` can now run for any mode where `CoordinationMode::current().needs_onchain_interactions()` is true, but the startup registration path is only spawned under `self.is_standalone` in Lines 977-988. That mismatch lets a replicated primary hit the new send-only unregister path even though this process never registered on chain; with the existence check gone from the send path, the failure is deferred until background confirmation after the transaction has already been submitted.
Add the same `self.is_standalone` guard here, or factor a shared predicate used by both startup registration and shutdown unregister so the two paths cannot drift.
Suggested fix
  if self.config.chain_operation.is_none()
+     || !self.is_standalone
      || !matches!(self.config.lifecycle, LifecycleMode::Ephemeral)
      || !CoordinationMode::current().needs_onchain_interactions()
  {
      return None;
  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@magicblock-api/src/magic_validator.rs` around lines 687 - 690, The shutdown unregister path (start_unregister_validator_on_chain) is allowed whenever CoordinationMode::current().needs_onchain_interactions() is true, but the startup registration is only started under self.is_standalone, causing a mismatch; update start_unregister_validator_on_chain to include the same self.is_standalone guard (or extract a shared predicate used by both the startup registration spawn site and start_unregister_validator_on_chain) so both registration and unregister paths use the identical condition and cannot drift.
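The "shared predicate" option could look roughly like the sketch below. The types are simplified stand-ins: `ValidatorGate`, its boolean fields, and `manages_onchain_registration` are hypothetical, and the sketch assumes both paths should share all four conditions, which the review comment does not confirm for the startup side.

```rust
/// Simplified stand-in for the real lifecycle enum; only what the gate reads.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LifecycleMode {
    Ephemeral,
    Other,
}

/// Hypothetical snapshot of the state the gate depends on.
#[derive(Debug, Clone, Copy)]
struct ValidatorGate {
    has_chain_operation: bool,
    is_standalone: bool,
    lifecycle: LifecycleMode,
    needs_onchain_interactions: bool,
}

impl ValidatorGate {
    /// Single predicate consulted by both startup registration and
    /// shutdown unregistration, so the two conditions cannot drift apart.
    fn manages_onchain_registration(&self) -> bool {
        self.has_chain_operation
            && self.is_standalone
            && matches!(self.lifecycle, LifecycleMode::Ephemeral)
            && self.needs_onchain_interactions
    }
}
```

With this shape, a replicated primary (`is_standalone == false`) is rejected by the same check on both paths instead of failing later during background confirmation.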
⛔ Files ignored due to path filters (2)
- Cargo.lock is excluded by !**/*.lock
- test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
- magicblock-api/Cargo.toml
- magicblock-api/src/domain_registry_manager.rs
- magicblock-api/src/magic_validator.rs
460fdc4 to 022c604
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@magicblock-api/src/domain_registry_manager.rs`:
- Around line 279-312: The background confirmation task in
send_unregistration_and_confirm_in_background_static currently uses tokio::spawn
and rpc_client.wait_for_confirmed_status, which can be starved by
MagicValidator::stop calling blocking JoinHandle::join on the Tokio worker pool;
change the confirmation spawn to run on a dedicated thread/runtime (e.g., spawn
a std::thread::spawn that builds a small tokio::runtime::Runtime and calls
runtime.block_on(rpc_client.wait_for_confirmed_status(...))) so the
wait_for_confirmed_status future does not consume the shared Tokio worker pool;
additionally, replace all production panics (.expect()/.unwrap()) referenced
(calls like api.start().expect(...),
dispatch.replication_messages.take().expect(...), the runtime build expects in
magic_validator.rs and main.rs, .path(...).expect(...), and .parse().unwrap())
with Result-based error propagation or explicit error conversions so
startup/shutdown return Errors instead of panicking.
In `@magicblock-api/src/magic_validator.rs`:
- Around line 686-695: The shutdown path in start_unregister_validator_on_chain
runs for any ephemeral node with on-chain interactions and can unregister
validators this process never registered; add the same standalone guard used in
start() by returning early if self.is_standalone is false (i.e., check
self.is_standalone before proceeding) so start_unregister_validator_on_chain
only runs for standalone nodes, preserving the invariant that only processes
which performed on-chain registration will attempt unregistering; update the
guard alongside the existing checks (unregister_handle, config.chain_operation,
LifecycleMode::Ephemeral,
CoordinationMode::current().needs_onchain_interactions()) in
start_unregister_validator_on_chain.
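The first inline comment above also asks for `.expect()`/`.unwrap()` calls to be replaced with Result-based propagation. A small sketch of that pattern follows. `StartupError`, `parse_port`, and `take_replication_messages` are hypothetical illustrations; the PR does not show what the real `.parse().unwrap()` parses or what type `replication_messages` holds.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type that startup/shutdown code could return.
#[derive(Debug, PartialEq)]
enum StartupError {
    InvalidPort(ParseIntError),
    MissingReplicationMessages,
}

impl fmt::Display for StartupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StartupError::InvalidPort(err) => write!(f, "invalid port: {err}"),
            StartupError::MissingReplicationMessages => {
                write!(f, "replication messages were already taken")
            }
        }
    }
}

impl std::error::Error for StartupError {}

impl From<ParseIntError> for StartupError {
    fn from(err: ParseIntError) -> Self {
        StartupError::InvalidPort(err)
    }
}

/// Instead of `raw.parse().unwrap()`: `?` converts the parse error.
fn parse_port(raw: &str) -> Result<u16, StartupError> {
    Ok(raw.trim().parse::<u16>()?)
}

/// Instead of `slot.take().expect(...)`: a missing value becomes an error.
fn take_replication_messages(
    slot: &mut Option<Vec<String>>,
) -> Result<Vec<String>, StartupError> {
    slot.take().ok_or(StartupError::MissingReplicationMessages)
}
```

Callers can then bubble these errors up with `?` so that startup and shutdown return an error instead of panicking.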
📒 Files selected for processing (4)
- magicblock-api/Cargo.toml
- magicblock-api/src/domain_registry_manager.rs
- magicblock-api/src/magic_validator.rs
- magicblock-validator/src/main.rs
022c604 to 433f983
Summary
Why
The user-visible outage begins when RPC is cancelled, not when shutdown preparation starts. Awaiting the send before shutdown preparation keeps the transaction from being cancelled when the Tokio runtime drops, while confirmation no longer gates perceived downtime.
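The ordering described above can be illustrated with a toy event log. This is only a sketch of the sequencing, assuming the steps are "send, start background confirmation, prepare shutdown, cancel RPC"; `shutdown_sequence`, `record`, and the event names are hypothetical, and a plain thread stands in for the background confirmation task.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type EventLog = Arc<Mutex<Vec<&'static str>>>;

/// Appends an event; a poisoned lock is ignored in this sketch.
fn record(log: &EventLog, event: &'static str) {
    if let Ok(mut events) = log.lock() {
        events.push(event);
    }
}

/// Toy model of the ordering: the send completes before shutdown
/// preparation, while confirmation runs in the background.
fn shutdown_sequence(log: EventLog) -> JoinHandle<()> {
    // 1. Submit the unregister transaction and wait only for the send.
    record(&log, "send unregister");

    // 2. Confirmation happens off the shutdown path.
    let background_log = Arc::clone(&log);
    let confirmation = thread::spawn(move || {
        record(&background_log, "confirmation finished");
    });

    // 3. Shutdown preparation proceeds without waiting for confirmation.
    record(&log, "prepare shutdown");

    // 4. The user-visible outage starts only here.
    record(&log, "cancel rpc");

    confirmation
}
```

The returned handle lets the caller decide how long, if at all, to wait for confirmation after RPC has been cancelled.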
Summary by CodeRabbit
New Features
Improvements
Behavior