fix: expand quoted @file response files for GCC/Clang (#2650)#2701
Open
niko-operal wants to merge 1 commit into
Open
fix: expand quoted @file response files for GCC/Clang (#2650)#2701niko-operal wants to merge 1 commit into
niko-operal wants to merge 1 commit into
Conversation
CMake 4.3+ generates `@<file>.modmap` arguments whose content is wrapped
in double quotes (`-fmodule-file="key=value"`). The existing
`ExpandIncludeFile` iterator in `src/compiler/gcc.rs` aborted on any
quoted character, leaving the raw `@file` token in the argument stream
where the post-iteration check fired `cannot_cache!("@")` and disabled
the cache for every C++20 module TU.
Empirically the modmap shape changed between CMake 4.2.x and 4.3.0;
3.31, 4.1.x and 4.2.x emit the same flag without quotes and were never
affected. clang++ accepts both shapes and produces byte-identical .o
output, so the quoting was only ever a serialisation choice.
Reuse the existing MSVC `SplitMsvcResponseFileArgs` tokenizer, which
already handles `CommandLineToArgvW`-style quoted runs and degenerates
to plain whitespace tokenisation for the unquoted shape — so existing
GCC response file inputs stay byte-equivalent. Tracks PR mozilla#1781, which
took the same approach but stalled in 2024-02.
* `src/compiler/msvc.rs` — make `SplitMsvcResponseFileArgs` `pub(crate)`.
* `src/compiler/gcc.rs` — replace the bailout-on-quote + naive
`split_whitespace` with a single call to the MSVC tokenizer.
* Add `at_signs_quoted_modmap` and `at_signs_unquoted_modmap` regression
tests covering both modmap shapes.
* Flip `tests/integration/scripts/test-cmake-modules-v4.sh` from XFAIL
to PASS, with a warm-pass that asserts ≥ 2 cache hits after a build
dir wipe (proves the cache is actually populated, not just that the
bailout no longer fires).
* Update Makefile help text accordingly.
Closes mozilla#2650.
Test fixture from mozilla#2649.
Approach previously attempted in mozilla#1781.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #2650. CMake 4.3+ writes
@<file>.modmaparguments whose content is wrapped in double quotes (-fmodule-file="key=value"). Before this PR, the response-file expander insrc/compiler/gcc.rs(ExpandIncludeFile) aborted on any quoted character, leaving the raw@filetoken in the argument stream where it triggeredcannot_cache!("@")and disabled the cache for every C++20 module TU.This PR reuses the existing MSVC
SplitMsvcResponseFileArgstokenizer (which already handlesCommandLineToArgvW-style quoted runs and degenerates to plain whitespace tokenisation on the unquoted shape) so existing GCC response-file inputs stay byte-equivalent.This is the same approach attempted in #1781 (closed 2024-02 for inactivity, not for technical reasons), reduced to a minimal-surface diff: no new shared module yet, just
pub(crate)on the existing MSVC tokenizer and a single call site change in GCC. A future PR can extractresponse_file.rsalong the lines of #1781 if maintainers prefer that direction.Empirical bisection
The bug bisects cleanly to a single behavioural difference in the modmap content emitted by CMake — not in the compile-launcher command line, which is byte-identical between working and broken setups. Tested with LLVM 22.1.1, sccache
main+ this patch:-fmodule-file=greet=…pcm-fmodule-file=greet=…pcm-fmodule-file=greet=…pcm-fmodule-file=greet=…pcm-fmodule-file="greet=…pcm"Non-cacheable: @ × 2-fmodule-file="greet=…pcm"Non-cacheable: @ × 2-fmodule-file="greet=…pcm"clang++accepts both shapes (verified withcmpon the produced.o— byte-identical), so the quoting was only ever a serialisation choice. Note that 4.2.5 was published after 4.3.0 and is still clean — the 4.2.x maintenance line never picked up the quote-wrapping change, so the change was 4.3-development-only and was not back-ported. This confirms @TroyKomodo's earlier comment on #2650 that 4.1.2 worked.Changes
src/compiler/msvc.rs— makeSplitMsvcResponseFileArgspub(crate)sogcc.rscan reuse it.src/compiler/gcc.rs— replace the bailout-on-quote (if contents.contains('"')) and naivesplit_whitespacewith a single call to the MSVC tokenizer. Net diff: −1 line of code (the comment block grew, the logic shrank).src/compiler/gcc.rs— add two regression tests:at_signs_quoted_modmap(the bug fix) andat_signs_unquoted_modmap(proves backward compatibility on the older shape).tests/integration/scripts/test-cmake-modules-v4.sh— flip from XFAIL to PASS, with a warm pass that asserts ≥ 2 cache hits after wiping the build dir (proves the cache is actually populated, not just that the bailout no longer fires).tests/integration/Makefile— drop the(xfail)label from the help text.Limitations / follow-up
The MSVC tokenizer follows
CommandLineToArgvWsemantics, which differ from the GCC@filespec on a few edge cases the current CMake-emitted modmap shape does not exercise:'foo bar''fooandbar'foo bar\$HOME\+$HOME$HOME"a\nb"a\nb(literal\)anb(\escapesn)For the CMake-emitted format (
-fmodule-file="key=value", no embedded spaces, no backslashes, no single-quotes), MSVC parser and GCC parser produce identical output, so this PR is correct in practice. A follow-up PR could move the tokenizer into a sharedsrc/compiler/response_file.rsand add a GCC-spec-conformant variant — that was the ultimate direction of #1781.Test plan
cargo test --lib— 451 passed / 0 failed (449 existing + 2 new)cargo fmt --check— cleanat_signs_quoted_modmap+at_signs_unquoted_modmappasstests/integration/scripts/test-cmake-modules-v4.shexercised manually on host — passesCloses #2650.
References #2649 (test fixture), #1781 (prior approach), #2516 (C++20 modules support that is now reachable for cmake 4.3+).