Enhance the performance of fn_ref. #52

Merged: Kim-J-Smith merged 15 commits into version-2.1.4 from perf/no_odr_fn_ref on May 16, 2026

@Kim-J-Smith
Owner

In the previous implementation, both `fn_ref` and `fn` passed the address of `m_erasure` when calling `operator()`. Taking that address is an ODR-use of `m_erasure`, so the compiler cannot optimize away the stack storage of the `fn_ref` object.

In the new implementation, `erasure_type::ErasurePass` replaces `erasure_type::ErasureBase*` as the first parameter in the ABI of `m_invoker`. This lets `operator()` in `fn_ref` pass `m_erasure` by value, which avoids the ODR-use of the object.

Here is the assembly difference.

- Old (GCC 16.1, `-std=c++26 -Os`), <https://godbolt.org/z/EadE5813z>:
```asm
"check1(ebd::detail::function<8ul, ebd::detail::fn_traits::config_package<true, true, false, false>, bool (int, int)>)":
        sub     rsp, 24
        mov     edx, 3
        mov     QWORD PTR [rsp], rdi ; extra stack frame and indirection
        mov     rdi, rsp
        mov     QWORD PTR [rsp+8], rsi ; extra stack frame and indirection
        mov     esi, 2
        call    [QWORD PTR [rsp+8]]
        add     rsp, 24
        ret
"check2(std::function_ref<bool (int, int)>)":
        mov     rax, rdi
        mov     edx, 3
        mov     rdi, rsi
        mov     esi, 2
        jmp     rax
```

- New (GCC 16.1, `-std=c++26 -Os`), <https://godbolt.org/z/Yrzsvz3G9>:
```asm
"check1(ebd::detail::function<8ul, ebd::detail::fn_traits::config_package<true, true, false, false>, bool (int, int)>)":
        mov     rax, rsi
        mov     edx, 3
        mov     esi, 2
        jmp     rax
"check2(std::function_ref<bool (int, int)>)":
        mov     rax, rdi
        mov     edx, 3
        mov     rdi, rsi
        mov     esi, 2
        jmp     rax
```
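
The difference between the two invoker ABIs can be illustrated with a minimal sketch. This is not the library's code: `Erasure`, `RefByPointer`, `RefByValue`, `Less`, `check_old`, `check_new` and `self_check` are hypothetical names chosen for illustration, and the sketch assumes the erased state is a single pointer to a non-const lvalue callable. Only `m_erasure` and `m_invoker` mirror names from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for the erased state of a function reference:
// one pointer to the referenced callable (an assumption of this sketch).
struct Erasure {
    void* obj;
};

// Old style: the invoker takes the ADDRESS of the erasure, so the wrapper
// object needs a memory location whenever operator() is called.
class RefByPointer {
    using Invoker = bool (*)(const Erasure*, int, int);
    Erasure m_erasure;
    Invoker m_invoker;

public:
    template <class F>
    explicit RefByPointer(F& f)
        : m_erasure{&f},
          m_invoker([](const Erasure* e, int a, int b) -> bool {
              return (*static_cast<F*>(e->obj))(a, b);
          }) {}

    bool operator()(int a, int b) const {
        return m_invoker(&m_erasure, a, b);  // takes the address of m_erasure
    }
};

// New style: the invoker takes the erasure BY VALUE, so the wrapper's
// address is never needed and it may stay in registers.
class RefByValue {
    using Invoker = bool (*)(Erasure, int, int);
    Erasure m_erasure;
    Invoker m_invoker;

public:
    template <class F>
    explicit RefByValue(F& f)
        : m_erasure{&f},
          m_invoker([](Erasure e, int a, int b) -> bool {
              return (*static_cast<F*>(e.obj))(a, b);
          }) {}

    bool operator()(int a, int b) const {
        return m_invoker(m_erasure, a, b);  // copies m_erasure, no address taken
    }
};

// Hypothetical callable used only to exercise the two wrappers.
struct Less {
    bool operator()(int a, int b) const { return a < b; }
};

// Counterparts of the check1/check2 functions in the listings above.
bool check_old(RefByPointer f) { return f(2, 3); }
bool check_new(RefByValue f) { return f(2, 3); }

// Both wrappers must produce the same result; only the calling
// convention of the invoker differs.
void self_check() {
    Less less;
    assert(check_old(RefByPointer(less)));
    assert(check_new(RefByValue(less)));
}
```

Both wrappers behave identically from the caller's point of view. Whether a given compiler turns `check_new` into a tail call, as in the "New" listing, depends on the target ABI and optimization level, so the generated code should be inspected rather than assumed.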
@Kim-J-Smith Kim-J-Smith added the enhancement New feature or request label May 15, 2026
@Kim-J-Smith Kim-J-Smith added the documentation Improvements or additions to documentation label May 16, 2026
Comment thread include/embed/embed_function.hpp Outdated
Comment thread include/embed/embed_function.hpp
@Kim-J-Smith Kim-J-Smith merged commit cb20f93 into version-2.1.4 May 16, 2026
29 checks passed
@Kim-J-Smith Kim-J-Smith deleted the perf/no_odr_fn_ref branch May 16, 2026 12:51
Kim-J-Smith added a commit that referenced this pull request May 17, 2026
* docs: update the version number.

* fix: fix bugs in converting from fn_ref to fn/unique_fn/safe_fn.

* test: add 'size_' and 'make_fn' to converting test.

* test: more fail-tests are added.

* Add `noexcept` deduction ability to `make_fn`. (#50)

* fix: fix bug in deducing noexcept functor when using make_fn.

* feat: add noexcept deduction for make_fn.

* fix: avoid triggering ICE in MSVC caused by 14.36~14.44 regression bug.

* fix: avoid using bool... in the template parameter list in order to support MSVC 19.22 ~ 19.33 .

* fix: avoid using sizeof(T) in conditional_t in order to support MSVC 19.20, 19.21.

* Avoid storing stateless standard function objects. (#51)

* perf: add specialization standard operator wrapper.

* style: use function pointer size instead of pointer size.

* perf: enhance the make_fn<fn_ref> performance. See <https://godbolt.org/z/MfWd4hxrv>

* feat: add make_fn_log_error to provide better error log for make_fn.

* perf: relax the condition for passing parameters through registers to 2 pointer sizes.

* Enhance the performance of `fn_ref`. (#52)

* style: use function pointer size instead of pointer size.

* perf: enhance the make_fn<fn_ref> performance. See <https://godbolt.org/z/MfWd4hxrv>

* feat: add make_fn_log_error to provide better error log for make_fn.

* perf: relax the condition for passing parameters through registers to 2 pointer sizes.

* perf: pass erasure by value in `fn_ref` to avoid ODR use and enable stack optimization

* benchmark: add runtime benchmark for fn_ref.

* fix: add `const` and `volatile` in cast to avoid UB.

* docs: update release notes.

* chore: fix misspelling and add [[nodiscard]] to `is_empty`.

* feat: support C++23 static operator().

* test: add tests for new feature.

* ci: add VS2026 test.

* docs: update release notes.

* chore: add comments for `init` function.

* perf: eliminate stateless functor's storage.

* test: add test for new feature(stateless storage elimination).

* perf: relax the stateless constraint.

* docs: update README.md .