Skip to content

feat(cpp-phase2): PR-06 validation-driven bug fixes#718

Open
shivasurya wants to merge 3 commits into
shiva/cpp-phase2-pr05-c-transitive-fallbackfrom
shiva/cpp-phase2-pr06-validation-fixes
Open

feat(cpp-phase2): PR-06 validation-driven bug fixes#718
shivasurya wants to merge 3 commits into
shiva/cpp-phase2-pr05-c-transitive-fallbackfrom
shiva/cpp-phase2-pr06-validation-fixes

Conversation

@shivasurya

Copy link
Copy Markdown
Owner

Summary

Three independent bug fixes uncovered while running PR-01..05 end-to-end against real Ubuntu mingw + a local HTTP server (the validation flow we set up before pushing anything to the production CDN). Each commit is one fix, self-contained.

Stacked on PR-05.

What's in here

1. fix(generator)IsPrivateSymbol panic on length-1 names

IsPrivateSymbol indexed name[1] after checking len(name) > 0 — a single-underscore symbol ("_", length 1) crashed the generator mid-walk. Surfaced on real Win32 SDK headers; tightened the early-return guard.

Side effect: Linux C registry grew 9,205 → 9,970 functions because the panic was silently aborting some glibc header extractions.

2. fix(generator) — Ubuntu mingw libstdc++ layout

The PR-03 Windows C++ walker probed /usr/x86_64-w64-mingw32/include/c++/<version> (upstream mingw sysroot layout). Debian/Ubuntu's apt install g++-mingw-w64 instead puts libstdc++ at /usr/lib/gcc/x86_64-w64-mingw32/<ver>-<thread>/include/c++/. Every Windows C++ generation on a Ubuntu runner was failing with "no mingw libstdc++ headers" even with the toolchain installed.

findWindowsMingwCppDir now probes both layouts, preferring the posix-threading variant. SystemTag carries a layout suffix so downstream can tell them apart.

Validated against real Ubuntu 24.04 mingw install: windows/cpp/v1 manifest generated successfully — 121 headers, 504 classes, tag mingw-w64-libstdc++-13-posix-ubuntu.

3. fix(registry)--stdlib-base-url is now the single source of truth

The PR-03 HTTP loader's headerURL preferred the manifest's per-entry URL field over the loader's baseURL. The generator's default --base-url is the production CDN URL, so every manifest baked in production CDN URLs. Result: any --stdlib-base-url override (test, local server, staging deploy) silently bypassed the override and hit the production CDN.

Symptom: pathfinder against a local http://127.0.0.1:8765 server hung indefinitely because the loader fetched the manifest from localhost, then tried to fetch each per-header JSON from https://assets.codepathfinder.dev/... (which doesn't serve them yet).

Fix: headerURL always constructs from baseURL + entry.File. entry.URL stays in the schema for forward compatibility (mirrors, CDN inspection) but the loader doesn't consume it. Two tests pinning the old behavior renamed + repurposed to assert the new contract.

Validation results

HTTP loader vs file:// loader on the same registries, served via Python http.server from $HOME/.test/cpf-clike-registries/:

Project file:// http:// Δ
smoke fixture 100% 100% exact
redis 67.3% 67.9% +0.6 pp
proxygen 28.2% 28.0% -0.2 pp

≤0.6 pp delta is within run-to-run noise. The HTTP loader is wire-equivalent to file://.

Why this matters

This entire PR exists because we ran the validation flow locally (mingw install + local HTTP server) before pushing anything to R2/CDN. The three bugs would have surfaced in production otherwise:

Local validation paid for itself three times over.

Verification

  • gradle buildGo — clean
  • go test ./... — all packages pass
  • golangci-lint run ./... — 0 issues

Test plan

  • CI green
  • No regressions on file:// loader behavior

🤖 Generated with Claude Code

shivasurya and others added 3 commits May 29, 2026 12:10
IsPrivateSymbol indexes name[1] after a HasPrefix("_") check, but the
length check only excluded the empty string. A single-underscore symbol
name (length 1) passed the empty check, passed the underscore-prefix
check, then panicked on the index access.

This surfaced when generating Windows registries from real Ubuntu mingw
headers — one of the Win32 SDK headers declares a symbol literally named
"_" via a #define, and the extractor crashed the whole generator run.

Fix: tighten the early return to len(name) < 2 so the index access is
always safe.

Validation: Linux C registry grew from ~9,205 → ~9,970 functions after
this fix (the panic was silently aborting some glibc header extractions
mid-walk; the recovery surfaces ~750 more symbols).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
PR-03 probed for the mingw libstdc++ tree under
/usr/x86_64-w64-mingw32/include/c++/<version> — the upstream mingw
sysroot layout. Debian/Ubuntu's apt-packaged g++-mingw-w64 splits the
C++ headers under /usr/lib/gcc/x86_64-w64-mingw32/<ver>-<thread>/
include/c++/ instead. Result: every Windows C++ generation on a Ubuntu
runner failed with "no mingw libstdc++ headers" even when g++-mingw-w64
was installed.

windowsCppSource now probes both layouts via a new findWindowsMingwCppDir
helper:

  1. Upstream sysroot layout (preserved for compatibility).
  2. Ubuntu split layout. Scans /usr/lib/gcc/x86_64-w64-mingw32/ for
     <ver>-<thread> subdirs, prefers the posix-threading variant over
     win32-threading.

SystemTag gains a layout tag suffix (-upstream / -<thread>-ubuntu) so
downstream tools can distinguish the source. Two new tests pin both
layouts; existing tests updated to reflect the suffix.

Validated against real Ubuntu 24.04 mingw install: windows/cpp manifest
generated successfully (121 headers, 504 classes, tag
mingw-w64-libstdc++-13-posix-ubuntu).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The PR-03 HTTP loader's headerURL preferred the manifest's per-entry
URL field over the loader's baseURL. Combined with the generator's
default --base-url=https://assets.codepathfinder.dev/registries, this
meant every manifest baked in production CDN URLs — so any test,
local HTTP server, staging deploy, or operator-provided
--stdlib-base-url override silently hit the production CDN instead.

Symptom found during local validation: pathfinder against a local
http://127.0.0.1:8765 server hung indefinitely because the loader
fetched the manifest from localhost (200 OK), then read entry.URL
from the manifest, then tried to fetch each per-header JSON from
https://assets.codepathfinder.dev/... (which doesn't serve them).

Fix: headerURL on both the C and C++ loaders always constructs the
URL from the loader's baseURL + entry.File. The entry.URL field is
preserved in the schema for forward compatibility (mirrors, CDN
inspection) but the loader doesn't consume it.

Tests updated: TestC*StdlibRegistry_HTTP_ManifestEmbeddedURL renamed
to ManifestEmbeddedURLIgnored, fixture URL pointed at a non-routable
host so a regression would manifest as a hang instead of false
success.

Validation: HTTP loader now matches file:// loader within ≤0.6pp on
redis (67.9% vs 67.3%) and proxygen (28.0% vs 28.2%) — wire-
equivalent behavior proven against the same registries served via
both transports.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@shivasurya shivasurya added bug Something isn't working enhancement New feature or request go Pull requests that update go code labels May 29, 2026
@shivasurya shivasurya self-assigned this May 29, 2026
@safedep

safedep Bot commented May 29, 2026

Copy link
Copy Markdown

SafeDep Report Summary

Green Malicious Packages Badge Green Vulnerable Packages Badge Green Risky License Badge

No dependency changes detected. Nothing to scan.

View complete scan results →

This report is generated by SafeDep Github App

@github-actions

Copy link
Copy Markdown

Code Pathfinder Security Scan

Pass Critical High Medium Low Info

No security issues detected.

Metric Value
Files Scanned 6
Rules 205

Powered by Code Pathfinder

@code-pathfinder

Copy link
Copy Markdown

Pathfinder Report

No security findings on the changed files. This pull request is clean.

View report on the dashboard


Powered by Code Pathfinder.

@codecov

codecov Bot commented May 29, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.66667% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.69%. Comparing base (8f4b90a) to head (9238397).

Files with missing lines Patch % Lines
...engine/tools/internal/clikeextract/walker_xplat.go 91.42% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@                             Coverage Diff                              @@
##           shiva/cpp-phase2-pr05-c-transitive-fallback     #718   +/-   ##
============================================================================
  Coverage                                        85.69%   85.69%           
============================================================================
  Files                                              203      203           
  Lines                                            29212    29237   +25     
============================================================================
+ Hits                                             25032    25054   +22     
- Misses                                            3207     3209    +2     
- Partials                                           973      974    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request go Pull requests that update go code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant