feat(tool): print state diff on mismatch in forest-dev state replay-compute #7081

Draft
hanabi1224 wants to merge 1 commit into main from hm/replay-compute-print-diff

Conversation

Contributor

@hanabi1224 hanabi1224 commented May 19, 2026

Summary of changes

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced error diagnostics for state computation: a detailed state diff is now printed when computation fails.
  • Refactor

    • Internal code reorganization for improved maintainability and code reusability.

Contributor

coderabbitai Bot commented May 19, 2026

Walkthrough

This PR adds error recovery and diagnostic output to state computation, implementing a CompoundBlockstore utility to compare database state across snapshots and chain configs. It also extracts shared DB initialization logic into a reusable helper function for test utilities.

Changes

State Diagnostics and DB Refactoring

| Layer / File(s) | Summary |
|---|---|
| State Computation Error Diagnostics and CompoundBlockstore (src/dev/subcommands/state_cmd.rs) | Import DbImpl, wrap state_compute in error handling that on failure recomputes the state root, reloads DB from snapshot and chain config, wraps the DBs in CompoundBlockstore, and prints a state-diff diagnostic before returning the error. Implement CompoundBlockstore as an internal Blockstore that searches multiple DbImpl instances in order. |
| DB Loading Helper Refactoring (src/tool/subcommands/api_cmd/generate_test_snapshot.rs) | Extract DB initialization and CAR/actor bundle loading into a new public load_many_car_db helper. Refactor load_db to delegate to the helper, then wrap the result in ReadOpsTrackingStore. |
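
The "searches multiple stores in order" behaviour described for CompoundBlockstore can be illustrated with a minimal, std-only sketch. Everything in it is an assumption made for illustration: the `BlockGet` trait, `MemStore`, `Compound`, and `store_with` are hypothetical names, and string keys stand in for CIDs. It is not the PR's implementation and does not use Forest's real `Blockstore` trait.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a blockstore: maps a key to raw bytes.
/// (A real blockstore is keyed by CID; a `&str` is used here for brevity.)
trait BlockGet {
    fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String>;
}

/// In-memory store used only for this illustration.
struct MemStore(HashMap<String, Vec<u8>>);

impl BlockGet for MemStore {
    fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String> {
        Ok(self.0.get(key).cloned())
    }
}

/// Wraps several stores and answers a lookup from the first one that has the key.
struct Compound<'a>(Vec<&'a dyn BlockGet>);

impl BlockGet for Compound<'_> {
    fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String> {
        for store in &self.0 {
            // An error from any inner store is propagated immediately.
            if let Some(bytes) = store.get(key)? {
                return Ok(Some(bytes));
            }
        }
        // No inner store holds the key.
        Ok(None)
    }
}

/// Builds a single-entry in-memory store (helper for the example).
fn store_with(key: &str, value: &[u8]) -> MemStore {
    MemStore(HashMap::from([(key.to_string(), value.to_vec())]))
}
```

Ordering matters in such a design: when two stores hold the same key, the earlier store in the list wins, so the store most likely to contain the data is best placed first.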

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main feature: printing a state diff on mismatch in the replay-compute command, which aligns with the primary changes in state_cmd.rs. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@hanabi1224 hanabi1224 marked this pull request as ready for review May 20, 2026 12:49
@hanabi1224 hanabi1224 requested a review from a team as a code owner May 20, 2026 12:49
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and sudo-shashank and removed request for a team May 20, 2026 12:49
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)

78-89: ⚡ Quick win

Add context to each DB-loading stage in the new shared helper.

open_db, load_all_forest_cars, and load_actor_bundles now sit behind both snapshot loading and replay diagnostics, so the bare ?s make failures much harder to attribute.

💡 Suggested patch
 pub async fn load_many_car_db(
     db_root: &Path,
     chain: Option<&NetworkChain>,
 ) -> anyhow::Result<ManyCar<ParityDb>> {
-    let db_writer = open_db(db_root.into(), &Default::default())?;
+    let db_writer = open_db(db_root.into(), &Default::default())
+        .with_context(|| format!("opening db at {}", db_root.display()))?;
     let db = ManyCar::new(db_writer);
     let forest_car_db_dir = db_root.join(CAR_DB_DIR_NAME);
-    load_all_forest_cars(&db, &forest_car_db_dir)?;
+    load_all_forest_cars(&db, &forest_car_db_dir)
+        .with_context(|| format!("loading forest CARs from {}", forest_car_db_dir.display()))?;
     if let Some(chain) = chain {
-        load_actor_bundles(&db, chain).await?;
+        load_actor_bundles(&db, chain)
+            .await
+            .context("loading actor bundles")?;
     }
     Ok(db)
 }

As per coding guidelines, "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".
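
To illustrate why per-stage context helps attribution, here is a minimal std-only sketch of the same idea. It deliberately does not use the `anyhow` crate: `StageError`, the `WithStage` trait, and the `open_db`/`load_cars`/`load_all` functions are hypothetical stand-ins invented for this example, as is the `cars` directory name, and they only mimic the shape of `.with_context(...)`.

```rust
use std::fmt;

/// Error that records which stage failed plus the underlying cause.
#[derive(Debug)]
struct StageError {
    stage: String,
    cause: String,
}

impl fmt::Display for StageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.stage, self.cause)
    }
}

impl std::error::Error for StageError {}

/// Hypothetical extension trait that attaches a lazily built stage description.
trait WithStage<T> {
    fn with_stage(self, stage: impl FnOnce() -> String) -> Result<T, StageError>;
}

impl<T, E: fmt::Display> WithStage<T> for Result<T, E> {
    fn with_stage(self, stage: impl FnOnce() -> String) -> Result<T, StageError> {
        // The closure runs only on the error path, so the success path pays nothing.
        self.map_err(|e| StageError { stage: stage(), cause: e.to_string() })
    }
}

/// Hypothetical stage 1: fails on an empty path.
fn open_db(path: &str) -> Result<(), String> {
    if path.is_empty() { Err("empty path".into()) } else { Ok(()) }
}

/// Hypothetical stage 2: fails when the directory name contains "missing".
fn load_cars(dir: &str) -> Result<usize, String> {
    if dir.contains("missing") { Err("directory not found".into()) } else { Ok(1) }
}

/// Runs both stages; each failure names the stage and the path involved.
fn load_all(db_root: &str) -> Result<usize, StageError> {
    open_db(db_root).with_stage(|| format!("opening db at {db_root}"))?;
    let car_dir = format!("{db_root}/cars"); // placeholder directory name
    load_cars(&car_dir).with_stage(|| format!("loading forest CARs from {car_dir}"))
}
```

With bare `?` both failures would surface only as "empty path" or "directory not found"; with the stage attached, the message states which step and which path were involved.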

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs` around lines 78 - 89,
The helper load_many_car_db currently propagates errors from open_db,
load_all_forest_cars, and load_actor_bundles with bare ? which loses context;
update each call to attach descriptive anyhow::Context messages (e.g.,
".context(\"opening ParityDb at {db_root:?}\")", ".context(\"loading Forest CARs
from {forest_car_db_dir:?}\")", and for load_actor_bundles include the chain
like ".context(format!(\"loading actor bundles for chain {:?}\", chain))") so
failures clearly state which stage (open_db, load_all_forest_cars,
load_actor_bundles) and which path/chain caused the error in load_many_car_db.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d876f35d-a4d2-4de5-a3b6-c7bc46b88aad

📥 Commits

Reviewing files that changed from the base of the PR and between 7d19be2 and 78ccec8.

📒 Files selected for processing (2)
  • src/dev/subcommands/state_cmd.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs

Comment on lines +160 to +184
if let Err(e) = crate::state_manager::utils::state_compute::state_compute(
    &sm,
    ts.shallow_clone(),
    &ts_next,
)
.await
{
    let computed = sm
        .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced)
        .await?
        .state_root;
    let expected = *ts_next.parent_state();
    let db_root_path = {
        let (_, config) = read_config(None, Some(chain.clone()))?;
        db_root(&chain_path(&config))?
    };
    let db =
        generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?;
    let db: DbImpl = Arc::new(db).into();
    let db = CompoundBlockstore(nunny::vec![sm.db(), &db]);
    println!(
        "printing state diff between computed({computed}) and expected({expected}) state roots ..."
    );
    crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?;
    return Err(e);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the state-diff diagnostics best-effort so the original replay failure is never lost.

This branch adds several fallible steps before return Err(e). If recomputing the state root, opening the configured DB, or printing the diff fails, the command now returns that secondary diagnostics error instead of the original state_compute failure. In practice, replaying a CAR on a machine without the local chain DB will mask the actual mismatch with an unrelated DB-loading error.

💡 Suggested direction
         for _ in 0..n.get() {
             if let Err(e) = crate::state_manager::utils::state_compute::state_compute(
                 &sm,
                 ts.shallow_clone(),
                 &ts_next,
             )
             .await
             {
-                let computed = sm
-                    .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced)
-                    .await?
-                    .state_root;
-                let expected = *ts_next.parent_state();
-                let db_root_path = {
-                    let (_, config) = read_config(None, Some(chain.clone()))?;
-                    db_root(&chain_path(&config))?
-                };
-                let db =
-                    generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?;
-                let db: DbImpl = Arc::new(db).into();
-                let db = CompoundBlockstore(nunny::vec![sm.db(), &db]);
-                println!(
-                    "printing state diff between computed({computed}) and expected({expected}) state roots ..."
-                );
-                crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?;
+                if let Err(diag_err) = async {
+                    let computed = sm
+                        .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced)
+                        .await?
+                        .state_root;
+                    let expected = *ts_next.parent_state();
+                    let db_root_path = {
+                        let (_, config) = read_config(None, Some(chain.clone()))?;
+                        db_root(&chain_path(&config))?
+                    };
+                    let db =
+                        generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?;
+                    let db: DbImpl = Arc::new(db).into();
+                    let db = CompoundBlockstore(nunny::vec![sm.db(), &db]);
+                    println!(
+                        "printing state diff between computed({computed}) and expected({expected}) state roots ..."
+                    );
+                    crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?;
+                    Ok::<_, anyhow::Error>(())
+                }
+                .await
+                {
+                    eprintln!("failed to print state diff: {diag_err:#}");
+                }
                 return Err(e);
             }
         }
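
The best-effort pattern suggested in this comment can be reduced to a small synchronous, std-only sketch. The functions `replay`, `print_diagnostics`, and `replay_with_diagnostics` are hypothetical stand-ins for the replay step and the diagnostics block, and the error strings are invented; the point is only that a diagnostics failure is reported and the original error is what the caller receives.

```rust
/// Hypothetical primary step that fails with the error the caller must see.
fn replay() -> Result<(), String> {
    Err("state root mismatch".to_string())
}

/// Hypothetical diagnostics that may themselves fail (for example, no local DB).
fn print_diagnostics(db_available: bool) -> Result<(), String> {
    if db_available {
        println!("state diff: ...");
        Ok(())
    } else {
        Err("local chain DB not found".to_string())
    }
}

/// Runs `replay`; on failure attempts diagnostics but always returns the original error.
fn replay_with_diagnostics(db_available: bool) -> Result<(), String> {
    if let Err(e) = replay() {
        // Diagnostics are best-effort: a failure here is reported, never propagated.
        if let Err(diag_err) = print_diagnostics(db_available) {
            eprintln!("failed to print state diff: {diag_err}");
        }
        return Err(e);
    }
    Ok(())
}
```

Whether or not the diagnostics succeed, the caller sees "state root mismatch", which is the property the review asks for.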
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/dev/subcommands/state_cmd.rs` around lines 160 - 184, The diagnostics
block that runs after crate::state_manager::utils::state_compute fails must be
made best-effort so the original error `e` is always returned; wrap the whole
recompute/open-db/print-diff sequence (calls to sm.compute_tipset_state,
generate_test_snapshot::load_many_car_db, CompoundBlockstore construction, and
crate::statediff::print_state_diff) in a fallible-to-ignored wrapper (e.g., run
it and if it errors log the diagnostic error but do not propagate it) and then
unconditionally return Err(e) from the state_compute branch; ensure you preserve
the original `e` in the final return while emitting any diagnostic failure via
logging rather than changing the branch's return value.

@hanabi1224 hanabi1224 marked this pull request as draft May 20, 2026 12:59
@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 0% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.18%. Comparing base (7d19be2) to head (78ccec8).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/dev/subcommands/state_cmd.rs | 0.00% | 27 Missing ⚠️ |
| ...tool/subcommands/api_cmd/generate_test_snapshot.rs | 0.00% | 9 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| ...tool/subcommands/api_cmd/generate_test_snapshot.rs | 6.88% <0.00%> (-0.23%) ⬇️ |
| src/dev/subcommands/state_cmd.rs | 0.00% <0.00%> (ø) |

... and 9 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
