feat(tool): print state diff on mismatch in forest-dev state replay-compute#7081
feat(tool): print state diff on mismatch in forest-dev state replay-compute#7081hanabi1224 wants to merge 1 commit into
forest-dev state replay-compute#7081Conversation
WalkthroughThis PR adds error recovery and diagnostic output to state computation, implementing a ChangesState Diagnostics and DB Refactoring
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
78-89: ⚡ Quick winAdd context to each DB-loading stage in the new shared helper.
open_db,load_all_forest_cars, andload_actor_bundlesnow sit behind both snapshot loading and replay diagnostics, so the bare?s make failures much harder to attribute.💡 Suggested patch
pub async fn load_many_car_db( db_root: &Path, chain: Option<&NetworkChain>, ) -> anyhow::Result<ManyCar<ParityDb>> { - let db_writer = open_db(db_root.into(), &Default::default())?; + let db_writer = open_db(db_root.into(), &Default::default()) + .with_context(|| format!("opening db at {}", db_root.display()))?; let db = ManyCar::new(db_writer); let forest_car_db_dir = db_root.join(CAR_DB_DIR_NAME); - load_all_forest_cars(&db, &forest_car_db_dir)?; + load_all_forest_cars(&db, &forest_car_db_dir) + .with_context(|| format!("loading forest CARs from {}", forest_car_db_dir.display()))?; if let Some(chain) = chain { - load_actor_bundles(&db, chain).await?; + load_actor_bundles(&db, chain) + .await + .context("loading actor bundles")?; } Ok(db) }As per coding guidelines, "Use
anyhow::Result<T>for most operations and add context with.context()when errors occur".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs` around lines 78 - 89, The helper load_many_car_db currently propagates errors from open_db, load_all_forest_cars, and load_actor_bundles with bare ? which loses context; update each call to attach descriptive anyhow::Context messages (e.g., ".context(\"opening ParityDb at {db_root:?}\")", ".context(\"loading Forest CARs from {forest_car_db_dir:?}\")", and for load_actor_bundles include the chain like ".context(format!(\"loading actor bundles for chain {:?}\", chain))") so failures clearly state which stage (open_db, load_all_forest_cars, load_actor_bundles) and which path/chain caused the error in load_many_car_db.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/dev/subcommands/state_cmd.rs`:
- Around line 160-184: The diagnostics block that runs after
crate::state_manager::utils::state_compute fails must be made best-effort so the
original error `e` is always returned; wrap the whole
recompute/open-db/print-diff sequence (calls to sm.compute_tipset_state,
generate_test_snapshot::load_many_car_db, CompoundBlockstore construction, and
crate::statediff::print_state_diff) in a fallible-to-ignored wrapper (e.g., run
it and if it errors log the diagnostic error but do not propagate it) and then
unconditionally return Err(e) from the state_compute branch; ensure you preserve
the original `e` in the final return while emitting any diagnostic failure via
logging rather than changing the branch's return value.
---
Nitpick comments:
In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs`:
- Around line 78-89: The helper load_many_car_db currently propagates errors
from open_db, load_all_forest_cars, and load_actor_bundles with bare ? which
loses context; update each call to attach descriptive anyhow::Context messages
(e.g., ".context(\"opening ParityDb at {db_root:?}\")", ".context(\"loading
Forest CARs from {forest_car_db_dir:?}\")", and for load_actor_bundles include
the chain like ".context(format!(\"loading actor bundles for chain {:?}\",
chain))") so failures clearly state which stage (open_db, load_all_forest_cars,
load_actor_bundles) and which path/chain caused the error in load_many_car_db.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d876f35d-a4d2-4de5-a3b6-c7bc46b88aad
📒 Files selected for processing (2)
src/dev/subcommands/state_cmd.rssrc/tool/subcommands/api_cmd/generate_test_snapshot.rs
| if let Err(e) = crate::state_manager::utils::state_compute::state_compute( | ||
| &sm, | ||
| ts.shallow_clone(), | ||
| &ts_next, | ||
| ) | ||
| .await?; | ||
| .await | ||
| { | ||
| let computed = sm | ||
| .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced) | ||
| .await? | ||
| .state_root; | ||
| let expected = *ts_next.parent_state(); | ||
| let db_root_path = { | ||
| let (_, config) = read_config(None, Some(chain.clone()))?; | ||
| db_root(&chain_path(&config))? | ||
| }; | ||
| let db = | ||
| generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?; | ||
| let db: DbImpl = Arc::new(db).into(); | ||
| let db = CompoundBlockstore(nunny::vec![sm.db(), &db]); | ||
| println!( | ||
| "printing state diff between computed({computed}) and expected({expected}) state roots ..." | ||
| ); | ||
| crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?; | ||
| return Err(e); |
There was a problem hiding this comment.
Make the state-diff diagnostics best-effort so the original replay failure is never lost.
This branch adds several fallible steps before return Err(e). If recomputing the state root, opening the configured DB, or printing the diff fails, the command now returns that secondary diagnostics error instead of the original state_compute failure. In practice, replaying a CAR on a machine without the local chain DB will mask the actual mismatch with an unrelated DB-loading error.
💡 Suggested direction
for _ in 0..n.get() {
if let Err(e) = crate::state_manager::utils::state_compute::state_compute(
&sm,
ts.shallow_clone(),
&ts_next,
)
.await
{
- let computed = sm
- .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced)
- .await?
- .state_root;
- let expected = *ts_next.parent_state();
- let db_root_path = {
- let (_, config) = read_config(None, Some(chain.clone()))?;
- db_root(&chain_path(&config))?
- };
- let db =
- generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?;
- let db: DbImpl = Arc::new(db).into();
- let db = CompoundBlockstore(nunny::vec![sm.db(), &db]);
- println!(
- "printing state diff between computed({computed}) and expected({expected}) state roots ..."
- );
- crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?;
+ if let Err(diag_err) = async {
+ let computed = sm
+ .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced)
+ .await?
+ .state_root;
+ let expected = *ts_next.parent_state();
+ let db_root_path = {
+ let (_, config) = read_config(None, Some(chain.clone()))?;
+ db_root(&chain_path(&config))?
+ };
+ let db =
+ generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?;
+ let db: DbImpl = Arc::new(db).into();
+ let db = CompoundBlockstore(nunny::vec![sm.db(), &db]);
+ println!(
+ "printing state diff between computed({computed}) and expected({expected}) state roots ..."
+ );
+ crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?;
+ Ok::<_, anyhow::Error>(())
+ }
+ .await
+ {
+ eprintln!("failed to print state diff: {diag_err:#}");
+ }
return Err(e);
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if let Err(e) = crate::state_manager::utils::state_compute::state_compute( | |
| &sm, | |
| ts.shallow_clone(), | |
| &ts_next, | |
| ) | |
| .await?; | |
| .await | |
| { | |
| let computed = sm | |
| .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced) | |
| .await? | |
| .state_root; | |
| let expected = *ts_next.parent_state(); | |
| let db_root_path = { | |
| let (_, config) = read_config(None, Some(chain.clone()))?; | |
| db_root(&chain_path(&config))? | |
| }; | |
| let db = | |
| generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?; | |
| let db: DbImpl = Arc::new(db).into(); | |
| let db = CompoundBlockstore(nunny::vec![sm.db(), &db]); | |
| println!( | |
| "printing state diff between computed({computed}) and expected({expected}) state roots ..." | |
| ); | |
| crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?; | |
| return Err(e); | |
| if let Err(e) = crate::state_manager::utils::state_compute::state_compute( | |
| &sm, | |
| ts.shallow_clone(), | |
| &ts_next, | |
| ) | |
| .await | |
| { | |
| if let Err(diag_err) = async { | |
| let computed = sm | |
| .compute_tipset_state(ts, crate::state_manager::NO_CALLBACK, VMTrace::NotTraced) | |
| .await? | |
| .state_root; | |
| let expected = *ts_next.parent_state(); | |
| let db_root_path = { | |
| let (_, config) = read_config(None, Some(chain.clone()))?; | |
| db_root(&chain_path(&config))? | |
| }; | |
| let db = | |
| generate_test_snapshot::load_many_car_db(&db_root_path, Some(&chain)).await?; | |
| let db: DbImpl = Arc::new(db).into(); | |
| let db = CompoundBlockstore(nunny::vec![sm.db(), &db]); | |
| println!( | |
| "printing state diff between computed({computed}) and expected({expected}) state roots ..." | |
| ); | |
| crate::statediff::print_state_diff(&Arc::new(db), &computed, &expected, None)?; | |
| Ok::<_, anyhow::Error>(()) | |
| } | |
| .await | |
| { | |
| eprintln!("failed to print state diff: {diag_err:#}"); | |
| } | |
| return Err(e); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/dev/subcommands/state_cmd.rs` around lines 160 - 184, The diagnostics
block that runs after crate::state_manager::utils::state_compute fails must be
made best-effort so the original error `e` is always returned; wrap the whole
recompute/open-db/print-diff sequence (calls to sm.compute_tipset_state,
generate_test_snapshot::load_many_car_db, CompoundBlockstore construction, and
crate::statediff::print_state_diff) in a fallible-to-ignored wrapper (e.g., run
it and if it errors log the diagnostic error but do not propagate it) and then
unconditionally return Err(e) from the state_compute branch; ensure you preserve
the original `e` in the final return while emitting any diagnostic failure via
logging rather than changing the branch's return value.
Codecov Report❌ Patch coverage is
Additional details and impacted files
... and 9 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary of changes
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
Bug Fixes
Refactor