Frame-interpolation pipeline + menu setting + OpenXR scaffold #59

Open
elliotttate wants to merge 2 commits into JRickey:main from elliotttate:agent/frame-interpolation

@elliotttate commented May 1, 2026

Summary

Display-rate decoupling layered on top of libultraship's matrix-replacement hook. Game logic still ticks at the port's natural 60 Hz cadence; intermediate display frames are produced by lerping per-DObj and camera matrices captured during the previous and current ticks. Opt-in via Settings → Graphics → Frame Interpolation Mult (1 = disabled, default).

SSB64 is 60 fps natively (one tick per VRETRACE; default sSYTaskmanUpdateInterval = 1), so this PR's actual scope is "60 → 120/180/240 Hz on a high-refresh display." On a 60 Hz monitor mult>1 gives no visual benefit: the intermediate lerped frame tears (vsync is toggled off) and is immediately overwritten by the equivalent un-lerped frame at the next vsync. The recording/lerp infrastructure is still useful as a foundation for future work (for VR, per-eye projection matrices would reuse the same mtx_replacements plumbing).

This PR needs more testing. It compiles, links, and runs; at default settings (mult=1) it does not change behaviour, and it has been smoke-tested on the title screen with mult=2. It has not been validated on combat scenes, boss intros, KO replays, results screens, or non-D3D11 backends. Merge with that caveat in mind, or treat this as a base for review/iteration.

What's in scope

  • Recording layer (port/frame_interpolation.{h,cpp})

    • Tree of Path nodes scoped by (stable_ptr, sub_id) labels via RecordOpenChild / RecordCloseChild. Pairwise tree match handles actor spawn/despawn.
    • Op-typed records for every syMatrix*(Mtx*, ...) builder in src/sys/matrix.c (Tra, Sca, Rot{R,D}, RotRpy{R,D}, RotPyr{R}, RotPy{R}, RotRp{R}, RotYaw{R}, RotPitch{R} and the Tra* / TraRot*Sca combinations).
    • Camera composite is input-domain: eye/at/up + fovy/aspect/near/far/scale are captured at the F2L call site in gmCameraLookAtFuncMatrix; replay rebuilds lookat_F + persp_F and multiplies them. Element-wise lerp on view*projection is mathematically wrong (non-rigid transform, visible doubling on camera pans) and is not used.
    • F2L catch-all kept for paths not yet input-domain-instrumented.
  • Driver (port/gameloop.cpp)

    • StartRecord at PortPushFrame entry, StopRecord inside port_submit_display_list, then mult presents per submit with FrameInterpolation_Interpolate(i/mult) for i ∈ [1, mult-1] and an empty replacement map for the final pass (bit-exact-fidelity invariant).
    • Slow-motion workaround for high-refresh use case: temporarily flip gVsyncEnabled=0 and bump TargetFps to 60*mult around intermediate presents so the DXGI backend doesn't rate-limit them. The final present runs with the user's vsync setting. Without this, mult=N would stretch one tick across N vsyncs and slow the game to 60/N Hz on any display. Targets DXGI specifically — OpenGL/Metal backends are untouched and will exhibit the slow-motion behaviour.
  • Menu (port/gui/PortMenu.cpp)

    • WIDGET_CVAR_SLIDER_INT bound to gSettings.FrameInterpolationMult, range 1–8, default 1. Read every frame so the slider takes effect without a relaunch. SSB64_INTERP_MULT env var still works as a fallback default.
    • Tooltip explicitly says the feature only helps on 120 Hz+ displays.
  • Self-tests (port/frame_interpolation_selftest.cpp + _test_main.cpp)

    • Six unit tests: record counts, identity lerp, midpoint lerp, t=1 fidelity, short-arc angle wrap, actor-spawn handling.
    • In-process: SSB64_FRAME_INTERP_UNITTEST=1 runs at boot (exit 2 on failure); SSB64_FRAME_INTERP_TELEMETRY=1 logs op counts every ~60 ticks.
    • Standalone CMake target ssb64_frame_interp_test (EXCLUDE_FROM_ALL) links only the recording layer + stub matrix builders; runs without ROM/assets. Useful for CI of the recording infrastructure.
  • OpenXR scaffold (port/xr/xr_runtime.{h,cpp})

    • Lifecycle hooks (init / shutdown / begin_frame / end_frame) wired into PortGameInit / PortPushFrame / PortGameShutdown.
    • Opt-in via SSB64_XR_ENABLE=1; real session bring-up gated behind -DSSB64_ENABLE_OPENXR=ON build flag with extensive TODO comments outlining the cinema-mode (Tier 1) implementation plan: D3D11 binding, local reference space, quad-layer swapchain, world-locked floating screen ~2 m forward.
    • No runtime overhead when disabled.
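
The driver's present schedule described above (mult presents per submit, with interpolation factors i/mult and a final un-replaced pass) can be sketched as follows. This is a minimal illustration, not the code in port/gameloop.cpp: PresentStep and BuildPresentSchedule are hypothetical names introduced here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one present within a game tick (not a PR type).
struct PresentStep {
    bool  interpolated; // true: matrices are replaced with lerped values
    float t;            // lerp factor in (0, 1); ignored for the final pass
};

// Builds the presents for one tick: mult-1 interpolated presents at
// t = i/mult, followed by one final pass with no matrix replacements.
std::vector<PresentStep> BuildPresentSchedule(int mult) {
    assert(mult >= 1);
    std::vector<PresentStep> steps;
    for (int i = 1; i < mult; ++i) {
        steps.push_back({true, static_cast<float>(i) / static_cast<float>(mult)});
    }
    steps.push_back({false, 1.0f}); // final pass: empty replacement map
    return steps;
}
```

With mult=1 the schedule holds only the final pass, which is why the default setting leaves rendering untouched.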

Known limitations / followups

  • Useful only on 120 Hz+ displays. On 60 Hz monitors mult>1 adds tearing without visual benefit. Worth considering whether to gate the slider on GetCurrentRefreshRate() > 60, or just leave it as-is and let the tooltip do the explaining.
  • Needs broader testing. Only smoke-tested on the title screen. Combat scenes, boss intros, KO replays, menu transitions, and results screens have not been visually evaluated for artefacts.
  • Vsync-toggle workaround introduces tearing on intermediate frames. Subjective evaluation needed.
  • Hitbox debug-render matrices (ftdisplaymain, itdisplay, wpdisplay) are recorded but not OpenChild-scoped; if hitbox visualisation is enabled with mult>1, alignment may drift mid-attack.
  • F2L paths still in use (none in default code path) lerp element-wise and will show paper-thin artefacts on near-180° rotations. Camera no longer uses this path.
  • OpenXR scaffold is structural only — no working VR build yet. Full Tier 1 (cinema mode) is ~1–2 weeks of additional work; the inline TODO list in xr_runtime.cpp documents what's needed.
  • D3D11 only for the slow-motion workaround. The OpenGL/SDL2 backend (gfx_sdl2.cpp) and macOS Metal backend will still rate-limit intermediate presents.
  • The proper long-term fix for the rate-limit is a libultraship-level SetVsyncEnabled(bool) API (or a renderer thread) — not done here.
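
The paper-thin artefact noted for element-wise F2L lerps can be reproduced in two dimensions. The sketch below is illustrative only: Mat2, Rotation, LerpElements, LerpAngle, and Determinant are hypothetical names, not functions from this PR or from libultraship. It contrasts lerping matrix elements with lerping the input angle and rebuilding the matrix.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat2 = std::array<float, 4>; // row-major 2x2: {m00, m01, m10, m11}

Mat2 Rotation(float radians) {
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return {c, -s, s, c};
}

// Element-wise lerp of two finished matrices (the catch-all approach).
Mat2 LerpElements(const Mat2& a, const Mat2& b, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    Mat2 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] + (b[i] - a[i]) * t;
    }
    return out;
}

// Input-domain lerp: interpolate the angle, then rebuild the matrix.
Mat2 LerpAngle(float a, float b, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    return Rotation(a + (b - a) * t);
}

// Area scale factor of the transform; 1 for a pure rotation.
float Determinant(const Mat2& m) { return m[0] * m[3] - m[1] * m[2]; }
```

Halfway between 0 and 180 degrees the element-wise result has a determinant near zero, so geometry collapses, while the angle-domain result stays a pure rotation with determinant 1.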

Test plan

  • Default settings (mult=1) — verify zero perceptible difference from main. Run a full match start-to-finish.
  • mult=2 on a 120 Hz+ monitor — verify smooth motion vs. mult=1. This is the real target use case.
  • mult=2 on a 60 Hz monitor — verify the game still runs at full speed (vsync-toggle workaround working) and no obvious crash. Tearing on intermediate frames is expected.
  • Title screen — verify smooth motion, no doubling on the rotating logo / character cycle.
  • Combat scene — fighter limb animation, attack windup, hitstun, KO. Watch for paper-thin artefacts on fast rotations.
  • Boss intros / DK intro / camera dolly scenes — these have the largest camera deltas per tick; the camera input-domain lerp is most stress-tested here.
  • Results screen / "GAME!" sequence — known camera-cut transitions; without DontInterpolateCamera calls these will show one weird midpoint frame. Decide whether to wire that up.
  • Menu navigation (CSS, stage select) — 2D ortho UI not interpolated; verify no visual desync between cursor and background.
  • cmake --build … --target ssb64_frame_interp_test runs the standalone unit tests and exits 0.
  • SSB64_FRAME_INTERP_UNITTEST=1 at boot — same six tests in-process.
  • Linux/macOS smoke test (this PR was developed on Windows; the OpenGL/Metal backends won't get the slow-mo workaround but should still build).

Display-rate decoupling layered on top of libultraship's matrix-replacement
hook in Interpreter::Run. Game logic still ticks at the port's natural
cadence; intermediate display frames are produced by lerping per-DObj and
camera matrices captured during the previous and current ticks. Opt-in
via Settings -> Graphics -> Frame Interpolation Mult (1 = disabled).

Recording layer (port/frame_interpolation.{h,cpp}):
- Tree of Path nodes scoped by (stable_ptr, sub_id) labels via
  RecordOpenChild/CloseChild. Pairwise tree match handles actor spawn/despawn.
- Op-typed records for every syMatrix*(Mtx*, ...) builder in src/sys/matrix.c
  (Tra, Sca, Rot{R,D}, RotRpy{R,D}, RotPyr{R}, RotPy{R}, RotRp{R}, RotYaw{R},
  RotPitch{R} and the Tra* / TraRot*Sca combinations).
- Camera composite is input-domain: eye/at/up + fovy/aspect/near/far/scale
  are captured at the F2L call site in gmCameraLookAtFuncMatrix; replay
  rebuilds lookat_F + persp_F and multiplies them. Element-wise lerp on
  view*projection is mathematically wrong (non-rigid transform, visible
  doubling on camera pans) and is no longer used.
- F2L catch-all kept for paths we haven't input-domain-instrumented yet.
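
A rough sketch of how pairwise matching by label can cope with spawn and despawn, reduced to a single tree level. Everything here is an assumption made for illustration: Label, Level, and InterpolateLevel are hypothetical names, each node carries one float instead of recorded matrix ops, and the choice to show a newly spawned node at its current value is a guess at a reasonable policy rather than a description of the PR's code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical label: a (stable_ptr, sub_id) pair.
using Label = std::pair<std::uintptr_t, int>;
// One tree level; each node holds a single float for illustration.
using Level = std::map<Label, float>;

// Nodes present in both ticks are lerped. Nodes only in `cur` (spawned)
// keep their current value. Nodes only in `prev` (despawned) are dropped.
Level InterpolateLevel(const Level& prev, const Level& cur, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    Level out;
    for (const auto& entry : cur) {
        const auto it = prev.find(entry.first);
        if (it != prev.end()) {
            out[entry.first] = it->second + (entry.second - it->second) * t;
        } else {
            out[entry.first] = entry.second; // spawned: nothing to lerp from
        }
    }
    return out;
}
```

Keying on a stable pointer plus a sub-id means a node keeps its identity across ticks even if sibling order changes.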

Driver (port/gameloop.cpp):
- StartRecord at PortPushFrame entry, StopRecord inside
  port_submit_display_list, then mult presents per submit with
  FrameInterpolation_Interpolate(i/mult) for i in [1, mult-1] and an empty
  replacement map for the final pass (bit-exact-fidelity invariant).
- Slow-motion workaround: temporarily flip gVsyncEnabled=0 and bump
  TargetFps to 60*mult around intermediate presents so the DXGI backend
  does not rate-limit them. The final present runs with the user's vsync
  setting. Intermediate frames will tear; the final frame each tick is
  tear-free.
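
The arithmetic behind the workaround, as a small sketch. TicksPerSecond and TargetFpsForMult are hypothetical helpers written for this note, and the model assumes the backend caps presents at a fixed rate while each tick issues mult presents.

```cpp
#include <cassert>

// Game ticks per second when presents are capped at presentLimitHz and
// every tick issues `mult` presents.
double TicksPerSecond(double presentLimitHz, int mult) {
    assert(mult >= 1);
    return presentLimitHz / mult;
}

// Present rate needed so that `mult` presents fit inside one 60 Hz tick.
int TargetFpsForMult(int mult) {
    assert(mult >= 1);
    return 60 * mult;
}
```

Under that model a 60 Hz cap with mult=2 yields 30 ticks per second (slow motion), whereas raising the cap to TargetFpsForMult(2) = 120 restores 60 ticks per second.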

Menu (port/gui/PortMenu.cpp):
- WIDGET_CVAR_SLIDER_INT bound to gSettings.FrameInterpolationMult,
  range 1-8, default 1. Read every frame so the slider takes effect
  without a relaunch.
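
The PR description also mentions SSB64_INTERP_MULT as a fallback default. One way such a fallback could be parsed and clamped to the slider's 1-8 range is sketched below; MultFromEnvString and DefaultMult are hypothetical helpers, and the actual parsing in the port may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Parses a multiplier from an environment-variable string. Returns 1
// (disabled) when the string is missing or does not start with a number.
int MultFromEnvString(const char* text) {
    if (text == nullptr || *text == '\0') return 1;
    char* end = nullptr;
    const long parsed = std::strtol(text, &end, 10);
    if (end == text) return 1; // no digits were consumed
    const long clamped = std::min(8L, std::max(1L, parsed));
    assert(clamped >= 1 && clamped <= 8); // always within the slider range
    return static_cast<int>(clamped);
}

// Fallback default taken from the process environment.
int DefaultMult() {
    return MultFromEnvString(std::getenv("SSB64_INTERP_MULT"));
}
```

Clamping before narrowing to int keeps out-of-range input from overflowing.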

Self-tests (port/frame_interpolation_selftest.cpp + _test_main.cpp):
- Six unit tests: record counts, identity lerp, midpoint lerp,
  t=1 fidelity, short-arc angle wrap, actor-spawn handling.
- Hooks: SSB64_FRAME_INTERP_UNITTEST=1 runs the tests at boot
  (exit 2 on failure); SSB64_FRAME_INTERP_TELEMETRY=1 logs op counts
  every ~60 ticks.
- Standalone CMake target ssb64_frame_interp_test (EXCLUDE_FROM_ALL)
  links only the recording layer + stub matrix builders; runs without
  ROM/assets. Useful for CI of the recording infrastructure.
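
For readers unfamiliar with the "short-arc angle wrap" case, here is a minimal sketch of the property such a test would check. LerpAngleShortArc is a hypothetical function written for this note, not the one in port/frame_interpolation.cpp; it wraps the angular difference into [-pi, pi] with std::remainder before scaling.

```cpp
#include <cassert>
#include <cmath>

constexpr float kPi = 3.14159265358979f;

// Lerps between two angles in radians along the shorter arc.
float LerpAngleShortArc(float from, float to, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    // std::remainder wraps the difference into [-pi, pi].
    const float delta = std::remainder(to - from, 2.0f * kPi);
    return from + delta * t;
}
```

Going from 350 degrees to 10 degrees, the midpoint lands at 360 degrees (equivalent to 0) rather than swinging back through 180 degrees.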

OpenXR scaffold (port/xr/xr_runtime.{h,cpp}):
- Lifecycle hooks (init, shutdown, begin_frame, end_frame) wired into
  PortGameInit / PortPushFrame / PortGameShutdown.
- Opt-in via SSB64_XR_ENABLE=1; real session bring-up gated behind
  -DSSB64_ENABLE_OPENXR=ON build flag with extensive TODO comments
  outlining the cinema-mode (Tier 1) implementation plan: D3D11
  binding, local reference space, quad-layer swapchain, world-locked
  floating screen ~2 m forward.
- No runtime overhead when disabled.

KNOWN LIMITATIONS (need more testing):
- Frame interpolation feature is opt-in (default mult=1) and has only
  been smoke-tested. Per-actor matrix interpolation has been validated
  on the title screen; combat scenes, boss intros, and KO replays have
  not been tested for visual artefacts.
- The vsync toggle slow-mo workaround introduces tearing on intermediate
  frames; needs subjective evaluation across genres/scenes.
- Hitbox debug-render matrices (ftdisplaymain, itdisplay, wpdisplay) are
  recorded but not OpenChild-scoped; if hitbox visualisation is enabled
  with mult>1, alignment may drift mid-attack.
- F2L paths still in use (none in default config) lerp element-wise and
  will show paper-thin artefacts on near-180-degree rotations.
- OpenXR scaffold is structural only — no working VR build yet.
- Tested on Windows / D3D11 only; OpenGL / Metal backends untouched and
  the vsync-toggle workaround targets DXGI specifically.

The original SSB64 ticks at 60 Hz (one tick per VRETRACE, default
sSYTaskmanUpdateInterval=1). My earlier comments and tooltip implied
the port was running at 2x speed and that mult=2 fills a 30->60 Hz
gap — neither is true.

Updated framing:
  - mult=1 matches the original game's 60 Hz cadence.
  - mult>1 only helps on 120 Hz+ displays (60 -> 120/180/240 Hz).
  - On a 60 Hz monitor mult>1 is visually a no-op: the intermediate
    lerped frame tears (vsync toggled off) and is immediately
    overwritten by the equivalent un-lerped frame at the next vsync.

No code change to behavior, only to comments and the menu tooltip.

@Jameriquiah (Contributor)

Are you able to resolve the conflicts?
