From 853e51b643504449a638d97b3c6c36ea75f56e88 Mon Sep 17 00:00:00 2001
From: Dave Lucia <davelucianyc@gmail.com>
Date: Thu, 21 May 2026 14:24:27 -0700
Subject: [PATCH 1/3] chore(B8): start plan

---
 .agents/plans/B8-inline-numeric-narrowing.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.agents/plans/B8-inline-numeric-narrowing.md b/.agents/plans/B8-inline-numeric-narrowing.md
index ced56f6..5372ce9 100644
--- a/.agents/plans/B8-inline-numeric-narrowing.md
+++ b/.agents/plans/B8-inline-numeric-narrowing.md
@@ -5,7 +5,7 @@ issue: null
 pr: null
 branch: perf/inline-numeric-narrowing
 base: main
-status: ready
+status: in-progress
 direction: B
 unlocks:
   - small but free win on all integer-arithmetic workloads

From ba2f3a25e8832c753cf6c33b27b510106064d9a4 Mon Sep 17 00:00:00 2001
From: Dave Lucia <davelucianyc@gmail.com>
Date: Thu, 21 May 2026 14:28:11 -0700
Subject: [PATCH 2/3] perf(vm): fast-path Numeric.to_signed_int64 for in-range
 integers

The Lua 5.3 wrap-around mask runs on every integer arithmetic result, but
the overwhelmingly common case is an input already in [-2^63, 2^63 - 1],
which passes through unchanged. Adding a guarded clause that returns the
input as-is skips the masking entirely in that case.

`@compile {:inline, ...}` lets the compiler inline both clauses at
intra-module call sites; cross-module callers still cross a function-call
boundary, but the guarded clause's match costs less than the band+compare
body.

On fib(22), Numeric.to_signed_int64 self-time drops 3.82% -> 3.38% under
tprof. On fib(30) wall clock, lua (chunk) improves 873.4ms -> 844.8ms
(-3.3%), well outside the +/-0.5% run-to-run deviation band. Luerl (the
control) does not move. Overflow tests (max_int + 1, min_int - 1,
0xFFFF...) still wrap correctly.
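
For reference, a minimal standalone sketch of the two-clause shape described
above. The module name `NarrowSketch` is hypothetical, and the values of
`@uint64_mask`, `@sign_bit`, and `@uint64_modulus` are assumed here to be
2^64 - 1, 2^63, and 2^64; the real definitions live in lib/lua/vm/numeric.ex.

```elixir
defmodule NarrowSketch do
  # Hypothetical stand-in for Lua.VM.Numeric, for illustration only.
  import Bitwise

  @max_int 0x7FFFFFFFFFFFFFFF
  @min_int -0x8000000000000000
  # Assumed values; not shown in this patch.
  @uint64_mask 0xFFFFFFFFFFFFFFFF
  @sign_bit 0x8000000000000000
  @uint64_modulus 0x10000000000000000

  # Fast path: an in-range integer is returned untouched.
  def to_signed_int64(n) when is_integer(n) and n >= @min_int and n <= @max_int do
    n
  end

  # Slow path: reduce modulo 2^64, then reinterpret the top bit as the sign.
  def to_signed_int64(n) when is_integer(n) do
    masked = band(n, @uint64_mask)
    if masked >= @sign_bit, do: masked - @uint64_modulus, else: masked
  end
end
```

Under those assumptions, max_int + 1 wraps to min_int, min_int - 1 wraps to
max_int, and 0xFFFFFFFFFFFFFFFF narrows to -1, while any in-range input hits
only the first clause.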

Plan: .agents/plans/B8-inline-numeric-narrowing.md
---
 lib/lua/vm/numeric.ex | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/lua/vm/numeric.ex b/lib/lua/vm/numeric.ex
index d465668..c728cd3 100644
--- a/lib/lua/vm/numeric.ex
+++ b/lib/lua/vm/numeric.ex
@@ -39,6 +39,8 @@ defmodule Lua.VM.Numeric do
   @max_int 0x7FFFFFFFFFFFFFFF
   @min_int -0x8000000000000000
 
+  @compile {:inline, signed?: 1, to_signed_int64: 1}
+
   @doc "Maximum signed 64-bit integer (`2^63 - 1`)."
   @spec max_int() :: integer()
   def max_int, do: @max_int
@@ -68,6 +70,10 @@ defmodule Lua.VM.Numeric do
       -1
   """
   @spec to_signed_int64(integer()) :: integer()
+  def to_signed_int64(n) when is_integer(n) and n >= @min_int and n <= @max_int do
+    n
+  end
+
   def to_signed_int64(n) when is_integer(n) do
     masked = band(n, @uint64_mask)
     if masked >= @sign_bit, do: masked - @uint64_modulus, else: masked

From 74f1d237a576687d23acc1ef34d8d79e47874f46 Mon Sep 17 00:00:00 2001
From: Dave Lucia <davelucianyc@gmail.com>
Date: Thu, 21 May 2026 14:29:03 -0700
Subject: [PATCH 3/3] chore(B8): mark plan as review

Records PR #227, the discovery that @compile {:inline, ...} does not
cross module boundaries (so the fast path's win comes from the guard
short-circuit only, not from call-site inlining), and the wall-clock
fib(30) delta of -3.3%.
---
 .agents/plans/B8-inline-numeric-narrowing.md | 36 ++++++++++++++++++--
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/.agents/plans/B8-inline-numeric-narrowing.md b/.agents/plans/B8-inline-numeric-narrowing.md
index 5372ce9..afdbe10 100644
--- a/.agents/plans/B8-inline-numeric-narrowing.md
+++ b/.agents/plans/B8-inline-numeric-narrowing.md
@@ -2,10 +2,10 @@
 id: B8
 title: Inline `to_signed_int64/1` for the in-range fast path
 issue: null
-pr: null
+pr: 227
 branch: perf/inline-numeric-narrowing
 base: main
-status: in-progress
+status: review
 direction: B
 unlocks:
   - small but free win on all integer-arithmetic workloads
@@ -155,4 +155,34 @@ Lua.eval!(lua, chunk)
 
 ## Discoveries
 
-(populated during implementation)
+- `@compile {:inline, ...}` only inlines within the same module. Cross-module
+  callers in `Lua.VM.Executor` and `Lua.VM.Value` still cross a function-call
+  boundary on every call. The tprof call count stayed at 85,968 before and
+  after, confirming that no inlining happened at the dispatch sites. This caps
+  the realized win below the plan's stretch target: the gain comes entirely
+  from the guard short-circuit, not from inlining at call sites.
+- Profile self-time on fib(22) moved 3.82% → 3.38%, a 12% relative drop
+  on the function itself. Plan's stretch target of < 1.5% was not hit
+  because it implicitly required cross-module inlining.
+- Wall-clock win on fib(30) is real: lua (chunk) 873.4ms → 844.8ms
+  (**-3.3%**), well outside the ±0.5% deviation band. luerl (control)
+  did not move. The plan's 3% stretch floor on fib was met.
+
+## What changed
+
+- `lib/lua/vm/numeric.ex` — added in-range guard clause to
+  `to_signed_int64/1`; added `@compile {:inline, signed?: 1,
+  to_signed_int64: 1}`.
+
+PR: #227
+
+Suite delta: 1692 tests passing → 1692 tests passing (no regression).
+lua53 suite: 29 tests, 0 failures (matches main).
+
+Benchmarks (fib(30), 10s benchee, 2s warmup):
+
+| benchmark    | baseline    | after        | delta  |
+|--------------|-------------|--------------|--------|
+| lua (chunk)  | 873.36 ms   | 844.76 ms    | -3.3%  |
+| lua (eval)   | 876.74 ms   | 852.21 ms    | -2.8%  |
+| luerl (ctl)  | 730.87 ms   | 731.78 ms    | noise  |