Skip to content

fix(index): live progress during parallel parse and resolve phases#9

Merged
prom3theu5 merged 1 commit into
mainfrom
fix/index-progress-bar
Jun 2, 2026
Merged

fix(index): live progress during parallel parse and resolve phases#9
prom3theu5 merged 1 commit into
mainfrom
fix/index-progress-bar

Conversation

@prom3theu5

Copy link
Copy Markdown
Member

Summary

The parallel-parse work in the perf PR (#8) regressed the indexing progress bar:

  • The parse ran inside one par_iter().collect() with no per-file callback, so the bar jumped straight from 0 to N.
  • Each resolve phase (resolving references, resolving type relationships) reported once and then sat on a static label for seconds, looking hung.

This restores live feedback.

Changes

  • Parse bar climbs again. A scoped observer thread polls an atomic counter that the rayon workers bump, and drives the progress callback (which stays off the worker threads — ProgressFn gains a Sync bound so it can be shared; the binary's closure already qualifies). The bar now climbs smoothly 0 → N during the parse, where the per-file wall-clock now lives.
  • Resolve/write phases advance. IndexProgress gains phase_progress: Option<(done, total)>. resolve_supertypes / resolve_references report per-file progress (coarsely, ~every 1% of files), and a writing graph phase is shown around the batched node write. The bar's bottom line now reads e.g. resolving references 1203/1559… and advances.

Notes

  • Cosmetic only — indexed output is unchanged: verified identical symbol/edge counts and deterministic related output on aeontis-backend before/after.

Test plan

  • cargo fmt --check && cargo clippy --all-targets && cargo test — all green (91 tests).
  • Manual: synapse index --force on a large repo (aeontis-new-reactnative) shows the parse bar climbing, then writing graph, then resolving … N/M… ticking — no more 0→full jump or frozen phase labels.

The parallel-parse refactor broke the progress bar: the parse ran inside
one `par_iter().collect()` with no per-file callback, so the bar jumped
straight from 0 to N, and each resolve phase ("resolving references",
etc.) reported once and then sat on a static label for seconds, looking
hung.

- Parse: a scoped observer thread polls an atomic counter the rayon
  workers bump and drives the progress callback (which stays off the
  worker threads), so the bar climbs smoothly 0 -> N during the parse —
  where the per-file wall-clock now is. `ProgressFn` gains a `Sync` bound
  so the observer can share it; the binary's closure already qualifies.
- Resolve/write phases: `IndexProgress` gains `phase_progress: (done,
  total)`. `resolve_supertypes`/`resolve_references` report per-file
  progress (coarsely, ~every 1%), and a "writing graph" phase is shown
  around the batched node write. The bar's bottom line now reads e.g.
  "resolving references 1203/1559…" and advances.

Cosmetic only — indexed output (counts, edges, deterministic `related`)
is unchanged.
@prom3theu5 prom3theu5 merged commit c59ba46 into main Jun 2, 2026
1 check passed
@qodo-code-review

Copy link
Copy Markdown

Review Summary by Qodo

Restore live progress bar during parallel parse and resolve phases

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Restore live progress feedback during parallel parse phase
  - Observer thread polls atomic counter bumped by rayon workers
  - Progress callback driven from main thread, stays off worker threads
• Add per-phase progress tracking with phase_progress: (done, total)
  - Resolve phases report per-file progress at ~1% cadence
  - Writing graph and resolve phases now show advancing counters
• Make ProgressFn Sync to enable shared observer thread access
• Update progress UI to display phase sub-counts (e.g., "resolving references 412/1559…")
Diagram
flowchart LR
  A["Parse Phase<br/>Rayon Workers"] -->|"bump atomic<br/>counter"| B["Atomic Counter"]
  B -->|"poll every 80ms"| C["Observer Thread"]
  C -->|"drive callback"| D["Progress Bar"]
  E["Resolve Phases<br/>resolve_supertypes<br/>resolve_references"] -->|"report per-file<br/>progress ~1%"| F["phase_progress<br/>done/total"]
  F -->|"update UI"| D

Loading

Grey Divider

File Changes

1. src/indexer/mod.rs 🐞 Bug fix +134/-66

Implement live progress tracking for parse and resolve phases

• Add phase_progress: Option field to IndexProgress struct for sub-phase progress tracking
• Add Sync bound to ProgressFn type to enable sharing across observer thread
• Wrap parallel parse in thread::scope with observer thread that polls atomic counter and drives
 progress callback
• Refactor resolve phases to accept progress report callback and call maybe_report at ~1% cadence
• Add maybe_report helper function to throttle progress reports to ~1% of total items
• Update all phase reporting calls to include optional phase_progress parameter
• Remove per-file progress reporting from sequential drain loop

src/indexer/mod.rs


2. src/main.rs ✨ Enhancement +9/-4

Display phase progress counters in progress bar UI

• Update progress bar message formatting to display phase sub-counts when available
• Show format "phase done/total…" instead of just "phase…" for phases with progress tracking
• Improve UI clarity by displaying advancing counters during long-running resolve phases

src/main.rs


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (3)

Grey Divider


Remediation recommended

1. ProgressFn called from thread 📘 Rule violation ☼ Reliability Rule 10
Description
The parse observer invokes cb from within scope.spawn, so progress callbacks run on a separate
thread rather than the caller thread. This silently changes (and contradicts the nearby “stays on
this thread” comment) the public progress-callback threading contract and can break consumers that
require same-thread/UI-thread updates or non-Sync callbacks.
Code

src/indexer/mod.rs[R316-342]

Evidence
The cited code path calls the progress callback inside std::thread::scope().spawn, which means the
closure executes on a spawned OS thread rather than on the initiating/caller thread. At the same
time, ProgressFn is publicly exported (via pub mod indexer) and now includes a Sync bound,
tightening who can implement it; combined with the nearby comment stating the callback “stays on
this thread,” these citations demonstrate a silent divergence from established conventions/patterns
and an API/threading-contract change that downstream consumers may not be prepared for.

Rule Rule 10: Match Existing Codebase Conventions (Do Not Silently Fork Style or Patterns)
src/indexer/mod.rs[297-342]
src/indexer/mod.rs[79-86]
src/indexer/mod.rs[316-342]
src/lib.rs[8-17]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ProgressFn` (a public API surface) is invoked from a scoped spawned thread during parsing, and the type now includes a `Sync` bound; together these changes silently alter the callback execution/threading contract and contradict the comment claiming the callback “stays on this thread,” potentially breaking downstream consumers (e.g., UI-thread progress sinks or non-`Sync` callbacks).

## Issue Context
- Some progress sinks (UIs, loggers, terminal renderers) assume callbacks occur on the initiating/caller thread.
- External users of `synapse::indexer::index_repo` may have been passing callbacks that are not `Sync` and/or must run on the caller thread; invoking them from a spawned thread changes behavior even if the CLI’s closure happens to be thread-safe.
- Decide explicitly whether the crate must preserve semver compatibility for its library API or whether this is an intentional breaking change that should be documented and versioned accordingly.

## Fix Focus Areas
- src/indexer/mod.rs[79-86]
- src/indexer/mod.rs[297-342]
- src/indexer/mod.rs[316-353]
- src/lib.rs[8-17]

Potential approaches (choose one path explicitly):
- Keep same-thread callback semantics: have the observer thread publish progress (e.g., via a channel) and have the indexing/caller thread drain and invoke `cb`.
- If threading the callback is intended:
 - Update rustdoc/contract for `ProgressFn`/`index_repo` to clearly state it may be called off-thread, and ensure bounds reflect that (e.g., `Send + Sync` as appropriate).
 - If maintaining semver: avoid breaking the existing public type alias by introducing a new thread-safe callback type (e.g., `SyncProgressFn = ... + Send + Sync`) and either keep the old API or provide an adapter that runs a non-`Sync` callback on the caller thread.
 - If breaking change is acceptable: document the new requirement and bump the crate version appropriately (major/minor per policy).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. maybe_report() undercounts progress 📘 Rule violation ≡ Correctness Rule 10
Description
Resolve-phase progress reporting uses the 0-based loop index i (often reported before the work is
done) and may never emit a final update equal to total, so the UI can show an off-by-one
done/total that never reaches completion. This is inconsistent with IndexProgress.processed
being documented as 1-based and undermines the accuracy of the new live progress display.
Code

src/indexer/mod.rs[R664-672]

Evidence
The cited resolve loops call maybe_report with i from a 0-based enumerate (0..n-1), meaning
the reported value reflects an index rather than “completed items” (and can be emitted before the
item’s work is finished). Additionally, neither resolve pass issues a final
report(pending.len())/report(total) after the loop completes, while the CLI renders
phase_progress as done/total, so these phases can never display total/total and may appear
stuck just before completion.

Rule Rule 10: Match Existing Codebase Conventions (Do Not Silently Fork Style or Patterns)
src/indexer/mod.rs[56-57]
src/indexer/mod.rs[664-672]
src/indexer/mod.rs[704-706]
src/indexer/mod.rs[796-797]
src/indexer/mod.rs[695-706]
src/indexer/mod.rs[785-799]
src/main.rs[202-223]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Progress reporting during resolve phases is off-by-one because it reports the 0-based loop index (and may report before work is completed) and may never emit a terminal `done == total` update, causing the UI to never show `total/total`.

## Issue Context
`IndexProgress.processed` is documented as 1-based, while the CLI formats `phase_progress` as `done/total`. This PR’s goal is live progress during resolve phases; failing to reach `n/n` (or reporting “started” rather than “completed”) undermines the correctness and perceived completion of the progress display.

## Fix Focus Areas
- src/indexer/mod.rs[56-57]
- src/indexer/mod.rs[664-672]
- src/indexer/mod.rs[695-769]
- src/indexer/mod.rs[704-706]
- src/indexer/mod.rs[785-854]
- src/indexer/mod.rs[796-797]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Per-file progress updates removed 📘 Rule violation ⚙ Maintainability Rule 3
Description
The sequential drain loop no longer invokes the progress callback with the current rel path and
updated counts, so consumers lose per-file scan updates between parse completion and post-loop
phases. This increases behavioral impact beyond the stated goal (live progress during parse/resolve
phases) and diverges from the ProgressFn contract of receiving the current file path.
Code

src/indexer/mod.rs[R355-356]

Evidence
Rule 2 requires minimal-impact changes, and Rule 4 requires adhering to existing conventions;
ProgressFn is documented to receive the current file path, and the CLI expects to display it
during the scan, but the updated drain loop no longer calls cb with rel at all.

Rule Rule 3: Make Surgical, Minimal-Impact Changes (No Unnecessary Refactors or Adjacent Cleanup)
Rule Rule 10: Match Existing Codebase Conventions (Do Not Silently Fork Style or Patterns)
src/indexer/mod.rs[79-85]
src/indexer/mod.rs[355-395]
src/main.rs[202-218]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The stage-2 sequential drain no longer calls `progress` per file, which removes the “current file during the scan” behavior and changes the meaning/utility of `ProgressFn`.

## Issue Context
The CLI’s progress renderer uses the `current` argument when `p.phase` is `None`, implying the callback is expected to be invoked during the scan with a real file path.

## Fix Focus Areas
- src/indexer/mod.rs[355-395]
- src/indexer/mod.rs[79-85]
- src/main.rs[202-218]

Suggested fix:
- Re-introduce a lightweight `cb(rel, &IndexProgress { processed: ..., phase: None, ... })` call in the drain loop (possibly at a coarse cadence if needed) so existing progress semantics remain intact while still adding the new parse/resolve live progress.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
4. Stale parse count at end 🐞 Bug ☼ Reliability
Description
The observer thread loads parsed_count with Relaxed and may break on done without re-loading
the counter with a synchronizing ordering, so the last emitted parse snapshot can under-report the
final parsed total.
Code

src/indexer/mod.rs[R323-340]

Evidence
The observer emits progress based on a relaxed load, and only checks done after emitting. Because
it doesn’t do a final synchronized read on exit, the last emitted processed/phase_progress can
lag the true final count.

src/indexer/mod.rs[314-352]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The parse observer loop uses `parsed_count.load(Relaxed)` and then may exit as soon as it sees `done == true` without guaranteeing it has observed the final `parsed_count`. This can leave the last emitted parse update below `total`.

## Issue Context
This is a low-probability race, but it’s easy to harden: ensure a final progress emission after completion that reflects the true final count.

## Fix Focus Areas
- src/indexer/mod.rs[314-353]

## Suggested fix
One of:
- Rework the observer termination condition to avoid `done` entirely:
 - break when `parsed_count.load(Ordering::Acquire) >= total`, and always emit using that loaded value.
- Or, if keeping `done`:
 - when `done.load(Ordering::Acquire)` is true, do a final `let n = parsed_count.load(Ordering::Acquire); cb(...n...);` and then break.
- Consider pairing with stronger ordering on the increment (e.g. `fetch_add(Ordering::Release)` or `AcqRel`) if you want a strict happens-before relationship for the counter observations.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@prom3theu5 prom3theu5 deleted the fix/index-progress-bar branch June 2, 2026 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant