fix: expand quoted @file response files for GCC/Clang (#2650) #2701

Open
niko-operal wants to merge 1 commit into mozilla:main from niko-operal:fix/2650-response-files

@niko-operal

Summary

Fixes #2650. CMake 4.3+ passes @<file>.modmap response files whose flag values are wrapped in double quotes (-fmodule-file="key=value"). Before this PR, the response-file expander in src/compiler/gcc.rs (ExpandIncludeFile) gave up on any file that contained a double quote, leaving the raw @file token in the argument stream, where it triggered cannot_cache!("@") and disabled caching for every C++20 module TU.

This PR reuses the existing MSVC tokenizer, SplitMsvcResponseFileArgs. It already handles CommandLineToArgvW-style quoted runs, and on unquoted input it degenerates to plain whitespace tokenisation, so existing GCC response-file inputs stay byte-equivalent.

This is the same approach attempted in #1781 (closed 2024-02 for inactivity, not for technical reasons), reduced to a minimal-surface diff: no new shared module yet, just pub(crate) on the existing MSVC tokenizer and a single call site change in GCC. A future PR can extract response_file.rs along the lines of #1781 if maintainers prefer that direction.
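
For reviewers who want the behavioural change in one place, here is a minimal sketch. It is not the code in this PR: `expand_old` and `split_quoted` are hypothetical names, `split_quoted` is a deliberately simplified stand-in for SplitMsvcResponseFileArgs (double quotes only, no CommandLineToArgvW backslash rules), and the second flag value is made up for illustration.

```rust
/// Hypothetical model of the old GCC path: bail out on any double quote,
/// otherwise split on whitespace.
fn expand_old(contents: &str) -> Option<Vec<String>> {
    if contents.contains('"') {
        return None; // the raw `@file` token stays in the argument stream
    }
    Some(contents.split_whitespace().map(str::to_owned).collect())
}

/// Simplified quote-aware splitter (assumption: a double quote only toggles
/// "inside quotes" and is dropped from the output; backslashes are literal).
fn split_quoted(contents: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    let mut has_token = false;
    for c in contents.chars() {
        match c {
            '"' => {
                in_quotes = !in_quotes;
                has_token = true; // `""` counts as an (empty) argument
            }
            c if c.is_whitespace() && !in_quotes => {
                if has_token {
                    args.push(std::mem::take(&mut current));
                    has_token = false;
                }
            }
            c => {
                current.push(c);
                has_token = true;
            }
        }
    }
    if has_token {
        args.push(current);
    }
    args
}

fn main() {
    // Illustrative modmap contents in the two shapes described above.
    let unquoted = "-fmodule-file=key=value -fmodule-file=other=thing";
    let quoted = r#"-fmodule-file="key=value" -fmodule-file="other=thing""#;

    println!("old, unquoted: {:?}", expand_old(unquoted));
    println!("old, quoted:   {:?}", expand_old(quoted));
    println!("new, unquoted: {:?}", split_quoted(unquoted));
    println!("new, quoted:   {:?}", split_quoted(quoted));
}
```

In this sketch the old path returns `None` for the quoted shape, while the quote-aware splitter yields the same two arguments for both shapes, which is the property the regression tests in this PR are meant to pin down.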

Empirical bisection

The bug bisects cleanly to a single behavioural difference in the modmap content emitted by CMake — not in the compile-launcher command line, which is byte-identical between working and broken setups. Tested with LLVM 22.1.1, sccache main + this patch:

| CMake | Release date | Modmap content | sccache result |
| --- | --- | --- | --- |
| 3.31.6 | 2025-… | `-fmodule-file=greet=…pcm` | 2 misses cold → 2 hits warm |
| 4.1.2 | 2025-09-30 | `-fmodule-file=greet=…pcm` | 2 misses cold → 2 hits warm |
| 4.2.0 | 2025-11-19 | `-fmodule-file=greet=…pcm` | 2 misses cold → 2 hits warm |
| 4.2.5 | 2026-04-21 | `-fmodule-file=greet=…pcm` | 2 misses cold → 2 hits warm |
| 4.3.0 (no patch) | 2026-03-17 | `-fmodule-file="greet=…pcm"` | Non-cacheable: @ × 2 |
| 4.3.1 (no patch) | 2026-03-27 | `-fmodule-file="greet=…pcm"` | Non-cacheable: @ × 2 |
| 4.3.1 + this patch | | `-fmodule-file="greet=…pcm"` | 2 misses cold → 2 hits warm |

clang++ accepts both shapes (verified with cmp on the produced .o — byte-identical), so the quoting was only ever a serialisation choice. Note that 4.2.5 was published after 4.3.0 and is still clean — the 4.2.x maintenance line never picked up the quote-wrapping change, so the change was 4.3-development-only and was not back-ported. This confirms @TroyKomodo's earlier comment on #2650 that 4.1.2 worked.

Changes

  • src/compiler/msvc.rs — make SplitMsvcResponseFileArgs pub(crate) so gcc.rs can reuse it.
  • src/compiler/gcc.rs — replace the bailout-on-quote (if contents.contains('"')) and naive split_whitespace with a single call to the MSVC tokenizer. Net diff: −1 line of code (the comment block grew, the logic shrank).
  • src/compiler/gcc.rs — add two regression tests: at_signs_quoted_modmap (the bug fix) and at_signs_unquoted_modmap (proves backward compatibility on the older shape).
  • tests/integration/scripts/test-cmake-modules-v4.sh — flip from XFAIL to PASS, with a warm pass that asserts ≥ 2 cache hits after wiping the build dir (proves the cache is actually populated, not just that the bailout no longer fires).
  • tests/integration/Makefile — drop the (xfail) label from the help text.

Limitations / follow-up

The MSVC tokenizer follows CommandLineToArgvW semantics, which differ from the GCC @file spec on a few edge cases the current CMake-emitted modmap shape does not exercise:

| Input | MSVC parser (now used for GCC) | GCC spec |
| --- | --- | --- |
| `'foo bar'` | 2 tokens: `'foo` and `bar'` | 1 token: `foo bar` |
| `\$HOME` | 2 chars: `\` + `$HOME` | 1 char: `$HOME` |
| `"a\nb"` | `a\nb` (literal `\`) | `anb` (`\` escapes `n`) |

For the CMake-emitted format (-fmodule-file="key=value", no embedded spaces, no backslashes, no single-quotes), MSVC parser and GCC parser produce identical output, so this PR is correct in practice. A follow-up PR could move the tokenizer into a shared src/compiler/response_file.rs and add a GCC-spec-conformant variant — that was the ultimate direction of #1781.
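
To make the "GCC spec" column concrete, a follow-up variant could look roughly like the sketch below. It is an illustration of the documented @file rules (whitespace separates options, single or double quotes group them, a backslash makes the next character literal), not code from this PR or from #1781; `split_gcc_style` is a hypothetical name.

```rust
/// Hypothetical GCC-style response-file splitter:
/// - whitespace separates arguments,
/// - '...' and "..." group text (the quotes themselves are dropped),
/// - a backslash makes the following character literal, even inside quotes.
fn split_gcc_style(contents: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut has_token = false;
    let mut quote: Option<char> = None; // which quote character is open, if any
    let mut escaped = false;

    for c in contents.chars() {
        if escaped {
            // Previous character was a backslash: take this one verbatim.
            current.push(c);
            escaped = false;
        } else if c == '\\' {
            escaped = true;
            has_token = true;
        } else if let Some(q) = quote {
            if c == q {
                quote = None; // closing quote
            } else {
                current.push(c);
            }
        } else if c == '\'' || c == '"' {
            quote = Some(c); // opening quote
            has_token = true;
        } else if c.is_whitespace() {
            if has_token {
                args.push(std::mem::take(&mut current));
                has_token = false;
            }
        } else {
            current.push(c);
            has_token = true;
        }
    }
    if has_token {
        args.push(current);
    }
    args
}

fn main() {
    // The three edge cases from the table, plus the CMake-emitted shape.
    for input in [r#"'foo bar'"#, r#"\$HOME"#, r#""a\nb""#, r#"-fmodule-file="key=value""#] {
        println!("{:<28} -> {:?}", input, split_gcc_style(input));
    }
}
```

On the CMake-emitted shape this yields the single argument `-fmodule-file=key=value`, the same result the PR description reports for the MSVC tokenizer, so swapping in a spec-conformant variant later should not change behaviour for modmap files.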

Test plan

  • cargo test --lib — 451 passed / 0 failed (449 existing + 2 new)
  • cargo fmt --check — clean
  • New regression tests at_signs_quoted_modmap + at_signs_unquoted_modmap pass
  • Hand-rolled minimal repro (cmake 4.3.1 + clang 22.1.1) — 2 misses cold → 2 hits warm
  • Updated tests/integration/scripts/test-cmake-modules-v4.sh exercised manually on host — passes
  • CI's full integration matrix (will run on push)

Closes #2650.
References #2649 (test fixture), #1781 (prior approach), #2516 (C++20 modules support that is now reachable for cmake 4.3+).

CMake 4.3+ generates `@<file>.modmap` arguments whose content is wrapped
in double quotes (`-fmodule-file="key=value"`). The existing
`ExpandIncludeFile` iterator in `src/compiler/gcc.rs` aborted on any
quoted character, leaving the raw `@file` token in the argument stream
where the post-iteration check fired `cannot_cache!("@")` and disabled
the cache for every C++20 module TU.

Empirically the modmap shape changed between CMake 4.2.x and 4.3.0;
3.31, 4.1.x and 4.2.x emit the same flag without quotes and were never
affected. clang++ accepts both shapes and produces byte-identical .o
output, so the quoting was only ever a serialisation choice.

Reuse the existing MSVC `SplitMsvcResponseFileArgs` tokenizer, which
already handles `CommandLineToArgvW`-style quoted runs and degenerates
to plain whitespace tokenisation for the unquoted shape — so existing
GCC response file inputs stay byte-equivalent. Tracks PR mozilla#1781, which
took the same approach but stalled in 2024-02.

* `src/compiler/msvc.rs` — make `SplitMsvcResponseFileArgs` `pub(crate)`.
* `src/compiler/gcc.rs` — replace the bailout-on-quote + naive
  `split_whitespace` with a single call to the MSVC tokenizer.
* Add `at_signs_quoted_modmap` and `at_signs_unquoted_modmap` regression
  tests covering both modmap shapes.
* Flip `tests/integration/scripts/test-cmake-modules-v4.sh` from XFAIL
  to PASS, with a warm-pass that asserts ≥ 2 cache hits after a build
  dir wipe (proves the cache is actually populated, not just that the
  bailout no longer fires).
* Update Makefile help text accordingly.

Closes mozilla#2650.
Test fixture from mozilla#2649.
Approach previously attempted in mozilla#1781.
Development

Successfully merging this pull request may close these issues.

C++20 modules with cmake 4.3.0 are not cacheable due to @response files
