fix(lsp): seal shared Tier-2 cross-registry against O(n^2) resolve hang #677

Open
DeusData wants to merge 1 commit into main from fix/c-lsp-shared-registry-on2
@DeusData

Owner

Problem

In full mode the pipeline builds one project-wide C cross-registry (cbm_c_build_cross_registry), finalizes it (O(1) hash lookups), and shares it read-only across the parallel resolve workers (registry_shared = true). But the per-file C cross-LSP resolver still mutated that finalized, shared registry (c_lsp.c add_func sites). Each post-finalize add lands in a tail the hash index does not cover, so every subsequent lookup linear-scans an ever-growing tail: O(files * defs) work in total, which is effectively quadratic on large C codebases. The Linux-kernel full index hung at [4/9] Resolving on 11 cores for >6 min and never finished. Because the parallel workers were all writing to the same registry, the adds were also a heap data race.

This violates the documented "read-only shared registries" rule (036a80e), which was written down but only partially enforced: one site was guarded, the others were not.
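The growth described above can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the project's code: it assumes every lookup misses the hash index and scans the whole unindexed tail, and the function name and parameters are hypothetical.

```c
#include <stddef.h>

/* Toy cost model (hypothetical, not the project's code): counts tail
 * comparisons when each "file" performs lookups_per_file lookups that
 * scan the whole unindexed tail, then appends adds_per_file entries to
 * that tail. */
unsigned long long sketch_tail_scan_cost(size_t files,
                                         size_t lookups_per_file,
                                         size_t adds_per_file) {
    unsigned long long cost = 0;
    size_t tail_len = 0; /* entries added after finalize */
    for (size_t f = 0; f < files; f++) {
        /* every lookup walks the tail accumulated by earlier files */
        cost += (unsigned long long)lookups_per_file * tail_len;
        tail_len += adds_per_file;
    }
    return cost;
}
```

With one lookup and one add per file, 10 files cost 45 comparisons and 100 files cost 4950, so ten times the input costs roughly a hundred times the work. With adds_per_file set to 0, which is what sealing the registry amounts to, the tail cost stays at 0.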

Fix

Seal every shared cross-registry at the single chokepoint:

  • CBMTypeRegistry gains a read_only flag.
  • All five cbm_{c,py,cs,ts,go}_build_cross_registry builders set it true right after finalize.
  • cbm_registry_add_func / cbm_registry_add_type no-op on a sealed registry.

One guard covers every language and holds no matter which resolver site attempts the mutation. The plain-C function-registration path is also skipped directly when the registry is shared. Languages that build a per-file registry (Java, Kotlin, Rust, PHP) are unaffected, since per-file registries are never shared; the seal future-proofs them if they ever become Tier-2.
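The seal can be pictured roughly as follows. This is a minimal sketch, not the actual change: SketchRegistry is a hypothetical, heavily simplified stand-in for CBMTypeRegistry, the sketch_* functions stand in for cbm_registry_add_func and the cbm_*_build_cross_registry builders, and only the read_only flag and the no-op behavior come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FUNCS 16

/* Hypothetical simplified registry; the real CBMTypeRegistry has more
 * fields. Only the read_only flag is taken from the PR description. */
typedef struct {
    const char *funcs[SKETCH_MAX_FUNCS];
    size_t func_count;
    bool read_only; /* set by the builder right after finalize */
} SketchRegistry;

/* Models the guard: an add on a sealed registry is a silent no-op. */
void sketch_registry_add_func(SketchRegistry *reg, const char *name) {
    if (reg->read_only || reg->func_count >= SKETCH_MAX_FUNCS) {
        return;
    }
    reg->funcs[reg->func_count++] = name;
}

/* Models a cross-registry builder: populate, finalize, then seal. */
void sketch_build_cross_registry(SketchRegistry *reg,
                                 const char *const *names, size_t n) {
    *reg = (SketchRegistry){0};
    for (size_t i = 0; i < n; i++) {
        sketch_registry_add_func(reg, names[i]);
    }
    /* the hash index would be built (finalized) here */
    reg->read_only = true;
}
```

Putting the check inside the add function, rather than at each call site, is what makes a single guard sufficient: a resolver that still tries to register a function after the build simply has no effect.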

Tests

Six cross-language invariant tests (tests/test_c_lsp.c): resolving against a finalized shared registry must leave func_count/type_count unchanged — RED before the fix (C +1, C++ +2 post-finalize adds), GREEN after.
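The shape of such an invariant check might look like the sketch below. It is an assumption-laden illustration, not the contents of tests/test_c_lsp.c: the SketchCounts struct, the two toy resolvers, and the helper are all hypothetical, and only the func_count/type_count invariant is taken from the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the two counters the invariant watches. */
typedef struct {
    size_t func_count;
    size_t type_count;
    bool read_only;
} SketchCounts;

typedef void (*sketch_resolve_fn)(SketchCounts *reg);

/* Toy resolver modelling the pre-fix behavior: adds regardless of seal. */
void sketch_resolve_unguarded(SketchCounts *reg) {
    reg->func_count += 1;
}

/* Toy resolver modelling the post-fix behavior: add is a no-op if sealed. */
void sketch_resolve_guarded(SketchCounts *reg) {
    if (!reg->read_only) {
        reg->func_count += 1;
    }
}

/* Invariant: resolving against a sealed registry leaves both counts
 * exactly as they were before the call. */
bool sketch_counts_unchanged(SketchCounts *reg, sketch_resolve_fn resolve) {
    size_t funcs_before = reg->func_count;
    size_t types_before = reg->type_count;
    resolve(reg);
    return reg->func_count == funcs_before &&
           reg->type_count == types_before;
}
```

Run against a sealed registry, the unguarded resolver fails the check (the RED case) and the guarded one passes it (the GREEN case).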

Local verification (Apple M3)

  • Full suite: 5720 passed / 0 failed, no regressions.
  • Linux-kernel full index now completes all 9 phases (4.88M nodes, 11.9M edges); the resolve phase runs at full parallelism (~1100% CPU) and scales linearly, versus the prior >6-min hang that never finished.
  • Graph quality verified against source: start_kernel callees match init/main.c line-for-line; lsp_direct 0.95-confidence cross-file resolutions intact — no resolution-quality loss.

CI is requested to verify across all platforms (Linux x64/arm64, macOS arm64/Intel, Windows).

The per-file C cross-LSP resolver mutated the finalized, project-wide cross
registry that is shared read-only across the parallel resolve workers. Each
post-finalize add landed in a tail the hash index does not cover, so every
lookup linear-scanned an ever-growing tail -> O(files*defs) on large C
codebases. The Linux-kernel full index hung at "[4/9] Resolving" on 11 cores
for >6 min and never finished, plus a heap data race across workers.

Seal every shared cross-registry: CBMTypeRegistry gains a read_only flag, set
by all five {c,py,cs,ts,go}_build_cross_registry builders right after
finalize, and cbm_registry_add_func/_type no-op on a sealed registry. One
chokepoint guards every language, robust to any resolver mutation site. The
plain-C function-registration site is also skipped directly when the registry
is shared.

Add six cross-language invariant tests (c, c++, py, c#, ts, go): resolving
against a finalized shared registry must leave func_count/type_count
unchanged -- RED before the fix (C +1, C++ +2 post-finalize adds), GREEN
after.

Verified: full suite 5720 passed / 0 failed; Linux-kernel full index now
completes all 9 phases in 643s (4.88M nodes, 11.9M edges) instead of hanging
at phase 4.

Signed-off-by: Martin Vogel <martin.vogel.tech@gmail.com>