JIT: enable shared-constant CSE on x64 (#92170) #129941

Draft
AndyAyersMS wants to merge 2 commits into dotnet:main from AndyAyersMS:andyayersms/hoist-big-const

@AndyAyersMS

Member

Buckets pointer-class constants by their upper bits so each shared use becomes a `lea reg, [base+offset]` instead of a full `mov reg, imm64`. On x64 the bucket width is 256 and the def value is centered to maximize use of the `lea` disp8 encoding.
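
A minimal sketch of the bucketing arithmetic described above, not the JIT's actual code. All names here (`kSharedLowBits`, `BucketKey`, `CenteredBase`, `OffsetFromBase`, `FitsInDisp8`) are hypothetical, and the centering rule (base = bucket start + 128) is an assumption about what "centered" means for a 256-wide bucket.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only.
constexpr unsigned kSharedLowBits = 8; // 2^8 = 256-wide buckets, per the description

// Constants whose upper bits match fall into the same bucket.
constexpr uint64_t BucketKey(uint64_t value)
{
    return value >> kSharedLowBits;
}

// Assumed centering rule: place the shared base in the middle of the bucket
// so every member is within a signed 8-bit displacement of it.
constexpr uint64_t CenteredBase(uint64_t value)
{
    return (BucketKey(value) << kSharedLowBits) + (1ull << (kSharedLowBits - 1));
}

// Signed distance from the shared base to a particular constant.
constexpr int64_t OffsetFromBase(uint64_t value)
{
    return static_cast<int64_t>(value - CenteredBase(value));
}

// disp8 covers [-128, 127].
constexpr bool FitsInDisp8(int64_t offset)
{
    return offset >= -128 && offset <= 127;
}
```

Under these assumptions the lowest value in a bucket sits at offset -128 and the highest at +127, so every member of the bucket is reachable with a disp8. A base placed at the bucket start would instead need offsets up to 255, which do not fit.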

Also extends CSE and hoist eligibility to integral constants that don't fit as imm32 or require relocation, with a per-method use-count gate so single-occurrence constants aren't speculatively hoisted.

About -1.18 MB code size across the standard x64 SPMI collections; arm64 also improves and x86 is unchanged.
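
As a rough illustration of where the size win can come from, the sketch below compares encoded bytes with and without sharing. The instruction lengths are standard x64 encodings (`mov r64, imm64` is 10 bytes; `lea r64, [r64+disp8]` is 4 bytes, or 5 when the base register needs a SIB byte). The cost model is an assumption made for illustration: one `mov` to materialize the shared base, then one `lea` per use. The function names are hypothetical.

```cpp
#include <cassert>

constexpr int kMovImm64Bytes = 10; // REX.W + opcode + 8-byte immediate
constexpr int kLeaDisp8Bytes = 4;  // REX.W + opcode + ModRM + disp8 (no SIB byte)

// Every use materializes its own 64-bit immediate.
constexpr int BytesUnshared(int uses)
{
    return uses * kMovImm64Bytes;
}

// Assumed model: one mov for the shared base, one lea per use.
constexpr int BytesShared(int uses)
{
    return kMovImm64Bytes + uses * kLeaDisp8Bytes;
}

// Positive means sharing is smaller.
constexpr int BytesSaved(int uses)
{
    return BytesUnshared(uses) - BytesShared(uses);
}
```

With this model a single use costs 4 extra bytes, two uses save 2 bytes, and four uses save 14, which is consistent with gating on use count so that single-occurrence constants are left alone.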

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 28, 2026 02:29
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 28, 2026
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the CoreCLR JIT’s constant CSE/hoisting behavior, primarily on x64, to reduce code size by sharing nearby pointer-class constants (so uses can be materialized via base+offset) and by broadening CSE/hoist eligibility to certain “expensive” integral constants.

Changes:

  • Adjusts x64 shared-constant bucketing to use an 8-bit low-bits cut (256-wide buckets) to better enable compact addressing forms.
  • Adds a per-method VN-based occurrence tally to gate hoisting of plain integral constants (avoid hoisting single-occurrence constants that won’t be eliminated).
  • Expands constant CSE/hoist consideration to integral constants that don’t fit in imm32 or require relocation, and tweaks the CSE heuristic cost model for low-use-count constants on xarch.
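
The occurrence gate and the eligibility rule in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `optimizer.cpp` or `optcse.cpp`: `ValueNum`, `ConstUse`, `TallyOccurrences`, `IsExpensiveConstant`, and `ShouldHoist` are hypothetical stand-ins, and the threshold of two occurrences is inferred from "single-occurrence constants aren't hoisted".

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

using ValueNum = uint32_t; // stand-in for the JIT's value-number type

// One appearance of an integral constant in the method (hypothetical shape).
struct ConstUse
{
    ValueNum vn;         // value number shared by identical constants
    int64_t  value;      // the constant itself
    bool     needsReloc; // true if the constant requires a relocation
};

bool FitsInImm32(int64_t v)
{
    return v >= std::numeric_limits<int32_t>::min() &&
           v <= std::numeric_limits<int32_t>::max();
}

// Eligibility as described: does not fit as imm32, or requires relocation.
bool IsExpensiveConstant(const ConstUse& use)
{
    return use.needsReloc || !FitsInImm32(use.value);
}

// Per-method tally: how many times each value number appears.
std::unordered_map<ValueNum, unsigned> TallyOccurrences(const std::vector<ConstUse>& uses)
{
    std::unordered_map<ValueNum, unsigned> counts;
    for (const ConstUse& use : uses)
    {
        ++counts[use.vn];
    }
    return counts;
}

// Assumed gate: hoist only expensive constants seen at least twice.
bool ShouldHoist(const ConstUse& use, const std::unordered_map<ValueNum, unsigned>& counts)
{
    auto it = counts.find(use.vn);
    return IsExpensiveConstant(use) && it != counts.end() && it->second >= 2;
}
```

Keying the tally on value numbers rather than raw values means two constants count as the same occurrence only when value numbering already treats them as equal.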

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/target.h | Changes x64 shared-constant bucketing width by reducing CSE_CONST_SHARED_LOW_BITS. |
| src/coreclr/jit/optimizer.cpp | Adds method-wide constant occurrence counting and uses it to gate loop-hoisting of plain integral constants. |
| src/coreclr/jit/optcse.cpp | Extends constant eligibility rules, adjusts heuristic costs, and changes shared-constant def-value selection/centering logic. |
| src/coreclr/jit/jitconfigvalues.h | Renames/retargets JitConstCSE option constants and updates the associated comment text. |
| src/coreclr/jit/compiler.h | Introduces VNToCountMap and stores it in LoopHoistContext; also contains shared-constant key encoding helpers. |

Comment thread src/coreclr/jit/optcse.cpp Outdated
Comment thread src/coreclr/jit/jitconfigvalues.h Outdated
@AndyAyersMS

Member Author

diffs.

About -2MB on x64.

* Use a fresh VN matching the centered value for the shared-const CSE temp,
  rather than reusing the original constant's VN; this keeps the value
  number consistent with the actual constant value of the def node.
* Clarify the JitConstCSE comment to note that on x86/x64 only the
  nearby-value (shared) variant is enabled by default; full const CSE is
  still only target-gated for ARM/ARM64/RISCV64.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@EgorBo

EgorBo commented Jun 28, 2026

Member

> diffs.
>
> About -2MB on x64.

I'm worried that all real-world collections (benchmarks, aspnet) seem to be 3x PerfScore regressions

@AndyAyersMS

Member Author

> > diffs.
> > About -2MB on x64.
>
> I'm worried that all real-world collections (benchmarks, aspnet) seem to be 3x PerfScore regressions

I'll try to run a plausible benchmark subset. I suspect that our existing costing underestimates the perf impact of those large immediate values, so the CSE looks like a loss (one extra mov).

@AndyAyersMS

Member Author

A few representative microbenchmarks to compare PerfScore predictions against actual hardware. FillArrayWithSameRef mirrors the worst SPMI PerfScore regression seen (PerfLabTests.CastingPerf:FooObjIsNull at +8% PerfScore, where the diff swaps a renamer-eliminated mov reg,reg for a real lea reg,[base+disp] while shrinking the prolog 23->14 bytes). ManyBigConstants exercises the case the change is designed for.

@EgorBot -windows_intel -linux_amd

```cs
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public class Foo { }
public class FooDerived : Foo { }

public class Bench
{
    private const int Iterations = 1000;

    private object _fooObj = default!;
    private object _intObj = default!;
    private Foo[] _fillTarget = default!;
    private object _fillValue = default!;

    [GlobalSetup]
    public void Setup()
    {
        _fooObj = new Foo();
        _intObj = 42;
        _fillTarget = new Foo[Iterations];
        _fillValue = new Foo();
    }

    [Benchmark]
    public void FillArrayWithSameRef()
    {
        var arr = _fillTarget;
        var v = (Foo)_fillValue;
        for (int i = 0; i < arr.Length; i++)
        {
            arr[i] = v;
        }
    }

    [Benchmark]
    public int FooObjIsFoo()
    {
        int count = 0;
        var o = _fooObj;
        for (int i = 0; i < Iterations; i++)
        {
            if (o is Foo) count++;
        }
        return count;
    }

    [Benchmark]
    public int IntObjIsInt()
    {
        int count = 0;
        var o = _intObj;
        for (int i = 0; i < Iterations; i++)
        {
            if (o is int) count++;
        }
        return count;
    }

    [Benchmark]
    public long ManyBigConstants()
    {
        long sum = 0;
        for (int i = 0; i < Iterations; i++)
        {
            sum += 0x123456789ABCDEF0L;
            sum ^= 0x123456789ABCDEF1L;
            sum += 0x123456789ABCDEF2L;
            sum ^= 0x123456789ABCDEF3L;
        }
        return sum;
    }
}
```

Note

Comment generated with assistance from GitHub Copilot CLI.
