Skip to content

ids-check: Update the workflow and script#199710

Merged
Steelskin merged 2 commits into
llvm:mainfrom
Steelskin:fabrice/update-idt-workflow
Jun 10, 2026
Merged

ids-check: Update the workflow and script#199710
Steelskin merged 2 commits into
llvm:mainfrom
Steelskin:fabrice/update-idt-workflow

Conversation

@Steelskin

Copy link
Copy Markdown
Contributor

In compnerd/ids#58, support was added to parse a header file using a given source file's flags. This solves many of the issues we had encountered with the ids-check-helper.py script and its corresponding workflow.

  • Update ids to the current version, which includes the --main-file changes.
  • Use a more recent LLVM compiler to build a subset of LLVM.
  • Build a subset of LLVM targets to properly parse more header files.
  • Use the --main-file argument when invoking idt.
  • Add explicit overrides and exclude header lists.

This was tested on every public header in LLVM and forthcoming PRs will land the changes found with the updated script. Once all of the headers have been updated, the workflow will be re-enabled. This effort is tracked in #109483.

@llvmorg-github-actions

Copy link
Copy Markdown

@llvm/pr-subscribers-github-workflow

Author: Fabrice de Gans (Steelskin)

Changes

In compnerd/ids#58, support was added to parse a header file using a given source file's flags. This solves many of the issues we had encountered with the ids-check-helper.py script and its corresponding workflow.

  • Update ids to the current version, which includes the --main-file changes.
  • Use a more recent LLVM compiler to build a subset of LLVM.
  • Build a subset of LLVM targets to properly parse more header files.
  • Use the --main-file argument when invoking idt.
  • Add explicit overrides and exclude header lists.

This was tested on every public header in LLVM and forthcoming PRs will land the changes found with the updated script. Once all of the headers have been updated, the workflow will be re-enabled. This effort is tracked in #109483.


Patch is 31.77 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/199710.diff

2 Files Affected:

  • (modified) .github/workflows/ids-check.yml (+49-19)
  • (modified) llvm/utils/git/ids-check-helper.py (+368-132)
diff --git a/.github/workflows/ids-check.yml b/.github/workflows/ids-check.yml
index f4c3c8a8453bf..aa9308e6b0755 100644
--- a/.github/workflows/ids-check.yml
+++ b/.github/workflows/ids-check.yml
@@ -18,7 +18,7 @@ jobs:
     if: github.repository_owner == 'llvm'
     name: Check LLVM_ABI annotations with ids
     runs-on: ubuntu-24.04
-    timeout-minutes: 10
+    timeout-minutes: 20
 
     steps:
       - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -26,7 +26,7 @@ jobs:
           persist-credentials: false
           repository: compnerd/ids
           path: ${{ github.workspace }}/ids
-          ref: b3bf35dd13d7ff244a6a6d106fe58d0eedb5743e # main
+          ref: 6d9b77c89107743159fccaea109e8feb81959844 # main, pinned at the --main-file landing
 
       - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
         with:
@@ -46,7 +46,16 @@ jobs:
 
       - name: Install dependencies
         run: |
-          sudo apt install -y clang-19 ninja-build libclang-19-dev
+          # Pull a recent clang from LLVM's apt repo so idt's parser stays in
+          # sync with the C++ language features used by current LLVM source.
+          sudo install -d -m 0755 /etc/apt/keyrings
+          wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key \
+            | sudo gpg --dearmor -o /etc/apt/keyrings/llvm.gpg
+          echo "deb [signed-by=/etc/apt/keyrings/llvm.gpg] http://apt.llvm.org/noble/ llvm-toolchain-noble-22 main" \
+            | sudo tee /etc/apt/sources.list.d/llvm.list
+          sudo apt update
+          # oprofile provides opagent.h, used by llvm/ExecutionEngine/OProfileWrapper.h
+          sudo apt install -y clang-22 lld-22 ninja-build libclang-22-dev oprofile
           pip install --require-hashes -r ${{ github.workspace }}/llvm-project/llvm/utils/git/requirements.txt
 
       - name: Configure and build minimal LLVM for use by ids
@@ -54,33 +63,53 @@ jobs:
           cmake -B ${{ github.workspace }}/llvm-project/build/ \
                 -S ${{ github.workspace }}/llvm-project/llvm/ \
                 -D CMAKE_BUILD_TYPE=Release \
-                -D CMAKE_C_COMPILER=clang \
-                -D CMAKE_CXX_COMPILER=clang++ \
+                -D CMAKE_C_COMPILER=clang-22 \
+                -D CMAKE_CXX_COMPILER=clang++-22 \
                 -D LLVM_ENABLE_PROJECTS=clang \
                 -D LLVM_TARGETS_TO_BUILD="host" \
+                -D LLVM_INCLUDE_TESTS=OFF \
+                -D LLVM_INCLUDE_BENCHMARKS=OFF \
                 -D CMAKE_EXPORT_COMPILE_COMMANDS=ON \
+                -D CMAKE_DISABLE_PRECOMPILE_HEADERS=ON \
                 -G Ninja
-          cd ${{ github.workspace }}/llvm-project/build/
-          ninja -t targets all | grep "CommonTableGen: phony$" | grep -v "/" | sed 's/:.*//'
 
-      - name: Configure ids
+          # Build the generated-header prerequisites that idt needs to parse
+          # LLVM source.
+          BUILD_DIR=${{ github.workspace }}/llvm-project/build
+          ninja -C "$BUILD_DIR" \
+            llvm_vcsrevision_h \
+            intrinsics_gen omp_gen acc_gen analysis_gen target_parser_gen vt_gen \
+            DllOptionsTableGen LibOptionsTableGen \
+            clang-tablegen-targets \
+            lib/ExecutionEngine/JITLink/COFFOptions.inc
+
+          # Build per-target tablegen output for every enabled target.
+          # Each enabled target exposes a top-level `<Target>CommonTableGen`
+          # phony rule (e.g. `X86CommonTableGen`); discover them from the
+          # ninja graph.
+          TABLEGEN_TARGETS=$(ninja -C "$BUILD_DIR" -t targets all \
+            | awk -F: '
+                $2 ~ /phony/ &&            # phony rules only
+                $1 !~ /\// &&              # top-level (skip nested paths)
+                $1 ~ /CommonTableGen$/ {   # per-target tablegen aggregator
+                  print $1
+                }')
+          ninja -C "$BUILD_DIR" $TABLEGEN_TARGETS
+
+      - name: Configure and build ids
         run: |
           cmake -B ${{ github.workspace }}/ids/build/ \
                 -S ${{ github.workspace }}/ids/ \
                 -D CMAKE_BUILD_TYPE=Release \
-                -D CMAKE_C_COMPILER=clang \
-                -D CMAKE_CXX_COMPILER=clang++ \
+                -D CMAKE_C_COMPILER=clang-22 \
+                -D CMAKE_CXX_COMPILER=clang++-22 \
                 -D CMAKE_EXE_LINKER_FLAGS=-fuse-ld=lld \
-                -D LLVM_DIR=/usr/lib/llvm-19/lib/cmake/llvm/ \
-                -D Clang_DIR=/usr/lib/llvm-19/lib/cmake/clang/ \
-                -D FILECHECK_EXECUTABLE=$(which FileCheck-19) \
+                -D LLVM_DIR=/usr/lib/llvm-22/lib/cmake/llvm/ \
+                -D Clang_DIR=/usr/lib/llvm-22/lib/cmake/clang/ \
+                -D FILECHECK_EXECUTABLE=$(which FileCheck-22) \
                 -D LIT_EXECUTABLE=$(which lit) \
                 -G Ninja
-
-      # TODO: Use an image with a prebuilt idt.
-      - name: Build ids
-        run: |
-          ninja -C ${{ github.workspace }}/ids/build/ all
+          ninja -C ${{ github.workspace }}/ids/build/
 
       - name: Run ids check
         env:
@@ -96,7 +125,8 @@ jobs:
             --compile-commands ${{ github.workspace }}/llvm-project/build/compile_commands.json \
             --start-rev HEAD~1 \
             --end-rev HEAD \
-            --changed-files "$CHANGED_FILES"
+            --changed-files "$CHANGED_FILES" \
+            --verbose
 
       - name: Upload results
         uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
diff --git a/llvm/utils/git/ids-check-helper.py b/llvm/utils/git/ids-check-helper.py
index 1f0b983fb2261..c0936820110a6 100755
--- a/llvm/utils/git/ids-check-helper.py
+++ b/llvm/utils/git/ids-check-helper.py
@@ -5,31 +5,270 @@
 # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 import argparse
+import json
 import os
 import subprocess
 import sys
+from collections import defaultdict
+from pathlib import Path
 from typing import List, Optional
 
 """
-This script is run by GitHub actions to ensure that the code in PRs properly
-labels LLVM APIs with `LLVM_ABI` so as not to break the LLVM DLL build. It can
-also be installed as a pre-commit git hook to check ABI annotations before
-submitting. The canonical source of this script is in the LLVM source tree
-under llvm/utils/git.
+Run idt on the headers changed in a PR and (optionally) post the resulting
+LLVM_ABI annotation diff as a PR comment.
 
-This script uses the idt (Interface Diff Tool) to check for missing LLVM_ABI,
-LLVM_C_ABI, and DEMANGLE_ABI annotations in header files.
+The heavy lifting (per-category clang flags, system-header-prefix isolation,
+output filtering, the `hasBody` / out-of-line distinction) lives in
+`llvm/utils/idt/idt.cc`. This script is a thin orchestrator: it picks which
+headers to check, finds a translation unit for each (since LLVM's compile
+database doesn't list headers), and forwards the list to idt batched by
+category.
 
-You can install this script as a git hook by symlinking it to the .git/hooks
-directory:
+Usage as a git pre-commit hook:
 
-ln -s $(pwd)/llvm/utils/git/ids-check-helper.py .git/hooks/pre-commit
+    ln -s $(pwd)/llvm/utils/git/ids-check-helper.py .git/hooks/pre-commit
 
-You can control the exact path to idt and compile_commands.json with the
-following environment variables: $IDT_PATH and $COMPILE_COMMANDS_PATH.
+Environment variables `IDT_PATH` and `COMPILE_COMMANDS_PATH` may be used in
+place of the matching command-line arguments.
 """
 
 
+# Headers we skip outright. Each entry is matched against the changed-file
+# path with `startswith`, so it can be either a directory prefix (trailing /)
+# or a specific file path.
+SKIP_HEADERS = [
+    # These are not LLVM component libraries.
+    "llvm/include/llvm/HTTP/",
+    "llvm/include/llvm/Debuginfod/",
+    # These are meant to be included via `Pass.h`, not directly.
+    "llvm/include/llvm/PassAnalysisSupport.h",
+    "llvm/include/llvm/PassSupport.h",
+    # Test-only headers.
+    "llvm/include/llvm/Testing/",
+    # OProfile JIT bridge, disabled by default.
+    "llvm/include/llvm/ExecutionEngine/OProfileWrapper.h",
+    "llvm/include/llvm/ExecutionEngine/RuntimeDyld.h",
+    # PDB DIA requires Windows ATL (atlbase.h), unbuildable on Linux runners.
+    "llvm/include/llvm/DebugInfo/PDB/DIA/",
+    "llvm/include/llvm/DebugInfo/PDB/ConcreteSymbolEnumerator.h",
+    # No in-tree non-tools/non-target source #includes these headers, so we
+    # can't run idt on a TU that pulls them in.
+    "llvm/include/llvm/ExecutionEngine/Interpreter.h",
+    "llvm/include/llvm/Support/DebugLog.h",
+    "llvm/include/llvm/Support/TargetSelect.h",
+    # zOS-specific shims.
+    "llvm/include/llvm/Support/AutoConvert.h",
+    # `class LLVM_ABI DagInit` (and similar in this file) inherit from
+    # `TrailingObjects<DagInit, T1, T2>` with multiple trailing types.
+    # MSVC `__declspec(dllexport)` on the class forces instantiation of
+    # all members including `getTrailingObjects()` (no-arg), which has a
+    # `static_assert(sizeof...(TrailingTys) == 1, ...)` in its body. The
+    # build then fails with C2338. Until `TrailingObjects.h` is updated
+    # to gate the no-arg overload via SFINAE / requires-clause, skip the
+    # whole header.
+    "llvm/include/llvm/TableGen/Record.h",
+    # Pimpl classes annotated `LLVM_ABI` that hold `std::unique_ptr<T>`
+    # of a forward-declared T defined only in the matching .cpp. MSVC's
+    # class-level `__declspec(dllexport)` forces the virtual destructor
+    # to be emitted inline as part of the exported vtable, which requires
+    # T to be complete in every consuming TU. Pulling T's definition into
+    # the public header would leak the implementation, so the right move
+    # is to keep these out of the bulk pass. Long-term: teach idt to
+    # detect this pattern (unique_ptr<forward-decl> + virtual destructor)
+    # and annotate individual methods instead of the class.
+    "llvm/include/llvm/MC/MCWinCOFFObjectWriter.h",
+    # Wraps the Visual Studio "Setup Configuration" COM API. The
+    # declarations rely on `EXTERN_C` / `MAXUINT` from <windows.h>, which
+    # idt's standalone parse doesn't pull in. Same long-term fix shape as
+    # `AutoConvert.h` (zOS): guard the annotations in-source, or include
+    # the prerequisite header. Skip for now.
+    "llvm/include/llvm/WindowsDriver/MSVCSetupApi.h",
+]
+
+
+# Manual header -> source mappings for headers that don't fit the conventional
+# `llvm/include/llvm/<Subsystem>/<Bar>.h` -> `llvm/lib/<Subsystem>/<Bar>.cpp`
+# layout. Used when neither the direct mapping nor the same-subdirectory
+# grep fallback finds a workable source.
+HEADER_SOURCE_OVERRIDES = {
+    # LLVM-C headers
+    "llvm/include/llvm-c/Analysis.h": "llvm/lib/Analysis/Analysis.cpp",
+    "llvm/include/llvm-c/BitReader.h": "llvm/lib/Bitcode/Reader/BitReader.cpp",
+    "llvm/include/llvm-c/BitWriter.h": "llvm/lib/Bitcode/Writer/BitWriter.cpp",
+    "llvm/include/llvm-c/Comdat.h": "llvm/lib/IR/Comdat.cpp",
+    "llvm/include/llvm-c/Core.h": "llvm/lib/IR/Core.cpp",
+    "llvm/include/llvm-c/DebugInfo.h": "llvm/lib/IR/DebugInfo.cpp",
+    "llvm/include/llvm-c/Disassembler.h": "llvm/lib/MC/MCDisassembler/Disassembler.cpp",
+    "llvm/include/llvm-c/ErrorHandling.h": "llvm/lib/Support/ErrorHandling.cpp",
+    "llvm/include/llvm-c/ExecutionEngine.h": "llvm/lib/ExecutionEngine/ExecutionEngineBindings.cpp",
+    "llvm/include/llvm-c/IRReader.h": "llvm/lib/IRReader/IRReader.cpp",
+    "llvm/include/llvm-c/LLJIT.h": "llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp",
+    "llvm/include/llvm-c/LLJITUtils.h": "llvm/lib/ExecutionEngine/Orc/Debugging/LLJITUtilsCBindings.cpp",
+    "llvm/include/llvm-c/Linker.h": "llvm/lib/Linker/LinkModules.cpp",
+    "llvm/include/llvm-c/Object.h": "llvm/lib/Object/Object.cpp",
+    "llvm/include/llvm-c/Orc.h": "llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp",
+    "llvm/include/llvm-c/OrcEE.h": "llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp",
+    "llvm/include/llvm-c/Remarks.h": "llvm/lib/Remarks/RemarkParser.cpp",
+    "llvm/include/llvm-c/Support.h": "llvm/lib/Support/CommandLine.cpp",
+    "llvm/include/llvm-c/Target.h": "llvm/lib/Target/Target.cpp",
+    "llvm/include/llvm-c/TargetMachine.h": "llvm/lib/Target/TargetMachineC.cpp",
+    "llvm/include/llvm-c/Transforms/PassBuilder.h": "llvm/lib/Passes/PassBuilderBindings.cpp",
+    "llvm/include/llvm-c/Types.h": "llvm/lib/IR/Core.cpp",
+    "llvm/include/llvm-c/lto.h": "llvm/tools/lto/lto.cpp",
+    # Top-level llvm/ headers
+    "llvm/include/llvm/Pass.h": "llvm/lib/IR/Pass.cpp",
+    "llvm/include/llvm/PassAnalysisSupport.h": "llvm/lib/IR/Pass.cpp",
+    "llvm/include/llvm/PassRegistry.h": "llvm/lib/IR/PassRegistry.cpp",
+    "llvm/include/llvm/PassSupport.h": "llvm/lib/IR/Pass.cpp",
+    "llvm/include/llvm/InitializePasses.h": "llvm/lib/Analysis/Analysis.cpp",
+    "llvm/include/llvm/LinkAllIR.h": "llvm/lib/IR/Core.cpp",
+    "llvm/include/llvm/LinkAllPasses.h": "llvm/lib/IR/Pass.cpp",
+    # Headers under llvm/include/llvm/Target/ whose same-subdir grep fallback
+    # picks per-target sources (e.g. X86, AArch64). Redirect to lib/CodeGen/.
+    "llvm/include/llvm/Target/CGPassBuilderOption.h": "llvm/lib/CodeGen/TargetPassConfig.cpp",
+    "llvm/include/llvm/Target/TargetOptions.h": "llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp",
+    # Headers pulled in transitively by a small set of "umbrella" sources.
+    "llvm/include/llvm/ADT/ilist_node_base.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/Analysis/SimplifyQuery.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/CodeGenTypes/MachineValueType.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/IR/Analysis.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/IR/ConstantFolder.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/IR/FMF.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/IR/GenericFloatingPointPredicateUtils.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/IR/IRBuilderFolder.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm/Support/Recycler.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm-c/Error.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    "llvm/include/llvm-c/Visibility.h": "llvm/lib/CodeGen/CodeGenPrepare.cpp",
+    # DTLTO drags in Any/LTO/FormatVariadic.
+    "llvm/include/llvm/ADT/Any.h": "llvm/lib/DTLTO/DTLTO.cpp",
+    "llvm/include/llvm/LTO/Config.h": "llvm/lib/DTLTO/DTLTO.cpp",
+    "llvm/include/llvm/Support/FormatVariadicDetails.h": "llvm/lib/DTLTO/DTLTO.cpp",
+    # Orc debugging / executor-side helpers.
+    "llvm/include/llvm/DebugInfo/DWARF/LowLevel/DWARFDataExtractorSimple.h": "llvm/lib/ExecutionEngine/Orc/Debugging/DebugInfoSupport.cpp",
+    "llvm/include/llvm/ExecutionEngine/Orc/MaterializationUnit.h": "llvm/lib/ExecutionEngine/Orc/Debugging/DebugInfoSupport.cpp",
+    "llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/ExecutorBootstrapService.h": "llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp",
+    "llvm/include/llvm-c/blake3.h": "llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp",
+    "llvm/include/llvm/CodeGen/AsmPrinterHandler.h": "llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp",
+}
+
+
+# Include subdir -> lib subdir remap for cases where the names diverge.
+# ADT has no `llvm/lib/ADT/`; its few non-inline implementations live under
+# `llvm/lib/Support/`.
+INCLUDE_TO_LIB_SUBDIR = {
+    "ADT": "Support",
+}
+
+
+# Categories: each header path-fragment maps to (category-name, export-macro).
+# Order matters: more specific fragments first (Demangle before generic LLVM).
+CATEGORIES = [
+    ("Demangle", "llvm/Demangle/", "DEMANGLE_ABI"),
+    ("LLVM-C", "llvm-c/", "LLVM_C_ABI"),
+    ("LLVM", "llvm/", "LLVM_ABI"),
+]
+
+EXPORT_MACRO = {name: macro for name, _fragment, macro in CATEGORIES}
+
+# Predefines applied to every idt invocation (via `--extra-arg`).
+#
+# Why these are required: idt detects an existing annotation by inspecting
+# clang's parsed attributes (`DLLExportAttr` / `DLLImportAttr` /
+# `VisibilityAttr`). In a static LLVM build, the donor source's compile
+# command carries `-DLLVM_BUILD_STATIC`, which collapses the macro guards
+# in `Compiler.h` / `DemangleConfig.h` / `llvm-c/Visibility.h` so that
+# `LLVM_ABI` / `DEMANGLE_ABI` / `LLVM_C_ABI` all expand to *empty*. The
+# parser then sees no attribute on already-annotated declarations, and
+# idt happily prepends another copy of the macro on top.
+#
+# The fix is to make those macros expand to a real attribute regardless
+# of the donor's static-vs-dylib choice:
+#  - `-ULLVM_BUILD_STATIC` removes the kill-switch.
+#  - `-DLLVM_ENABLE_LLVM_EXPORT_ANNOTATIONS` opens the LLVM_ABI /
+#    DEMANGLE_ABI gates.
+#  - `-DLLVM_ENABLE_LLVM_C_EXPORT_ANNOTATIONS` opens the LLVM_C_ABI gate.
+#  - `-DLLVM_EXPORTS` makes the cascade pick `dllexport` over `dllimport`
+#    (either would suffice for detection, dllexport is the symmetric
+#    choice for a "we're inside the library" parse).
+EXTRA_ARGS_COMMON = [
+    "-ULLVM_BUILD_STATIC",
+    "-DLLVM_ENABLE_LLVM_EXPORT_ANNOTATIONS",
+    "-DLLVM_ENABLE_LLVM_C_EXPORT_ANNOTATIONS",
+    "-DLLVM_EXPORTS",
+]
+
+
+def categorize_header(path: str) -> Optional[str]:
+    """Return the category name (LLVM/LLVM-C/Demangle) for a header, or None
+    if the path doesn't fall under any known category."""
+    for name, fragment, _macro in CATEGORIES:
+        if fragment in path:
+            return name
+    return None
+
+
+def find_source_for_header(header: str) -> Optional[str]:
+    """Return a source file (relative path) that includes the given header,
+    or None if no good match.
+
+    LLVM doesn't list headers in compile_commands.json, so idt needs a
+    source TU to parse. With the `hasBody` / out-of-line distinction inside
+    idt, we can pick the conventional `Foo.cpp` partner without losing
+    coverage of `Foo.h`'s own symbols.
+
+    Resolution order:
+    1. Manual override in HEADER_SOURCE_OVERRIDES.
+    2. Direct mapping: llvm/include/llvm/Foo/Bar.h -> llvm/lib/Foo/Bar.cpp,
+       with INCLUDE_TO_LIB_SUBDIR applied (e.g. ADT -> Support).
+    3. Same-subdirectory grep: any .cpp under llvm/lib/Foo/ that
+       `#include`s the header. Constrained to the matching subdirectory
+       so we don't accidentally pick a target-specific source as a
+       generic fallback.
+    4. Bail (returns None; the header is silently skipped).
+    """
+
+    # 1. Manual override for special cases.
+    if header in HEADER_SOURCE_OVERRIDES:
+        return HEADER_SOURCE_OVERRIDES[header]
+
+    if not header.startswith("llvm/include/llvm/"):
+        return None
+
+    # 2. Direct mapping.
+    sub = header[len("llvm/include/llvm/") :]
+    first_part, _, rest = sub.partition("/")
+    remap = INCLUDE_TO_LIB_SUBDIR.get(first_part)
+    sub_remapped = f"{remap}/{rest}" if remap and rest else sub
+    for ext in (".cpp", ".cc"):
+        candidate = Path("llvm/lib") / Path(sub_remapped).with_suffix(ext)
+        if candidate.exists():
+            return str(candidate)
+
+    # 3. Same-subdirectory grep fallback.
+    lib_subdir = f"llvm/lib/{INCLUDE_TO_LIB_SUBDIR.get(first_part, first_part)}"
+    if not Path(lib_subdir).is_dir():
+        return None
+
+    inc = header.removeprefix("llvm/include/")
+    try:
+        result = subprocess.run(
+            ["git", "grep", "-l", f'#include "{inc}"', "--", lib_subdir],
+            capture_output=True,
+            text=True,
+            timeout=15,
+        )
+    except subprocess.TimeoutExpired:
+        return None
+    if result.returncode != 0 or not result.stdout:
+        return None
+    for line in result.stdout.splitlines():
+        if line.endswith((".cpp", ".cc")):
+            return line
+
+    # 4. Bail.
+    return None
+
+
 class IdsCheckArgs:
     start_rev: str = ""
     end_rev: str = ""
@@ -39,7 +278,7 @@ class IdsCheckArgs:
     repo: str = ""
     token: str = ""
     issue_number: int = 0
-    verbose: bool = True
+    verbose: bool = False
 
     def __init__(self, args: argparse.Namespace) -> None:
         self.start_rev = args.start_rev
@@ -50,31 +289,28 @@ def __init__(self, args: argparse.Namespace) -> None:
         s...
[truncated]

-D LLVM_INCLUDE_TESTS=OFF \
-D LLVM_INCLUDE_BENCHMARKS=OFF \
-D CMAKE_EXPORT_COMPILE_COMMANDS=ON \
-D CMAKE_DISABLE_PRECOMPILE_HEADERS=ON \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to disable precompiled headers if we're just building generated headers/tablegen targets?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PCH headers end up appearing in the command line, but they are never produced. Because of that, idt's parsing ends up failing early. Additionally, the PCH format is not stable, which causes incompatibility between what idt tries to parse (from llvm-22) and what is generated by the build (from trunk).

Disabling PCH is the simplest way to work around both issues.

# `llvm/include/llvm/<Subsystem>/<Bar>.h` -> `llvm/lib/<Subsystem>/<Bar>.cpp`
# layout. Used when neither the direct mapping nor the same-subdirectory
# grep fallback finds a workable source.
HEADER_SOURCE_OVERRIDES = {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's really no way to infer this, or SKIP_HEADERS above?

This seems like it's going to be pretty fragile and incur quite a maintenance burden.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy with this solution either. At one point, I had some absurdly complex logic that was even more brittle, so I just decided to bite that bullet and have explicit overrides/disable lists. I would think that the maintenance burden is going to be minimum though, as these are really "special" headers in the sense that they are either platform-specific or whole modules.

@Steelskin Steelskin requested a review from boomanaiden154 May 27, 2026 09:45

@boomanaiden154 boomanaiden154 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is fine. I don't have any better ideas on how to make the script simpler.

llvm_vcsrevision_h \
intrinsics_gen omp_gen acc_gen analysis_gen target_parser_gen vt_gen \
DllOptionsTableGen LibOptionsTableGen \
clang-tablegen-targets \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An aside: It would be nice at some point to have a generated-headers target or something. There are a bunch of CI workflows that need to duplicate this to build generated headers.

Comment thread llvm/utils/git/ids-check-helper.py Outdated
In compnerd/ids#58, support was added to parse a header file using a
given source file's flags. This solves many of the issues we had
encountered with the `ids-check-helper.py` script and its corresponding
workflow.

* Update ids to the current version, which includes the `--main-file`
  changes.
* Use a more recent LLVM compiler to build a subset of LLVM.
* Build a subset of LLVM targets to properly parse more header files.
* Use the `--main-file` argument when invoking `idt`.
* Add explicit overrides and exclude header lists.

This was tested on every public header in LLVM and forthcoming PRs will
land the changes found with the updated script. Once all of the headers
have been updated, the workflow will be re-enabled. This effort is
tracked in llvm#109483.
@Steelskin Steelskin force-pushed the fabrice/update-idt-workflow branch from d9ad0d5 to 9ae711c Compare June 10, 2026 14:46
@Steelskin Steelskin enabled auto-merge (squash) June 10, 2026 14:47
@Steelskin Steelskin disabled auto-merge June 10, 2026 14:47
@Steelskin Steelskin enabled auto-merge (squash) June 10, 2026 14:47
@Steelskin Steelskin merged commit 0bb5833 into llvm:main Jun 10, 2026
13 of 14 checks passed
@Steelskin Steelskin deleted the fabrice/update-idt-workflow branch June 10, 2026 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants