Enhance the performance of `fn_ref`. #52
Merged
… 2 pointer sizes.
…tack optimization

In the previous implementation, both `fn_ref` and `fn` pass the address of `m_erasure` when calling `operator()`. Taking that address is an ODR-use of `m_erasure`, which means the compiler cannot optimize away the stack space of the `fn_ref` object.

In the new implementation, `erasure_type::ErasurePass` replaces `erasure_type::ErasureBase*` as the first parameter in the ABI of `m_invoker`. This allows `operator()` in `fn_ref` to pass `m_erasure` by value, which avoids the ODR-use of the object.

Here is the assembly difference.

- old (GCC 16.1 -std=c++26 -Os) <https://godbolt.org/z/EadE5813z>:

```asm
"check1(ebd::detail::function<8ul, ebd::detail::fn_traits::config_package<true, true, false, false>, bool (int, int)>)":
        sub     rsp, 24
        mov     edx, 3
        mov     QWORD PTR [rsp], rdi    ; extra stack frame and indirection
        mov     rdi, rsp
        mov     QWORD PTR [rsp+8], rsi  ; extra stack frame and indirection
        mov     esi, 2
        call    [QWORD PTR [rsp+8]]
        add     rsp, 24
        ret
"check2(std::function_ref<bool (int, int)>)":
        mov     rax, rdi
        mov     edx, 3
        mov     rdi, rsi
        mov     esi, 2
        jmp     rax
```

- new (GCC 16.1 -std=c++26 -Os) <https://godbolt.org/z/Yrzsvz3G9>:

```asm
"check1(ebd::detail::function<8ul, ebd::detail::fn_traits::config_package<true, true, false, false>, bool (int, int)>)":
        mov     rax, rsi
        mov     edx, 3
        mov     esi, 2
        jmp     rax
"check2(std::function_ref<bool (int, int)>)":
        mov     rax, rdi
        mov     edx, 3
        mov     rdi, rsi
        mov     esi, 2
        jmp     rax
```
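The difference between the two calling conventions can be illustrated with a minimal sketch. It is not the library's code: `Erasure`, `RefByAddress`, `RefByValue`, and the helper functions are hypothetical stand-ins for `m_erasure`, the old and new `fn_ref` layouts, and the `m_invoker` signatures. It only assumes that the erased state fits in one pointer-sized slot.

```cpp
// Hypothetical pointer-sized storage slot (stand-in for the erasure member).
union Erasure {
    void* obj;
    bool (*fptr)(int, int);
};

// Old style: the invoker takes the ADDRESS of the slot.
struct RefByAddress {
    Erasure m_erasure;
    bool (*m_invoker)(const Erasure*, int, int);
    bool operator()(int a, int b) const {
        // Taking &m_erasure odr-uses the member, so the wrapper object
        // needs an address in memory when this call is made.
        return m_invoker(&m_erasure, a, b);
    }
};

// New style: the invoker takes the slot BY VALUE.
struct RefByValue {
    Erasure m_erasure;
    bool (*m_invoker)(Erasure, int, int);
    bool operator()(int a, int b) const {
        // Only a copy of the slot is passed; no address of the wrapper escapes.
        return m_invoker(m_erasure, a, b);
    }
};

// Invokers matching each convention.
bool call_via_address(const Erasure* e, int a, int b) { return e->fptr(a, b); }
bool call_via_value(Erasure e, int a, int b) { return e.fptr(a, b); }

bool less_than(int a, int b) { return a < b; }

RefByAddress make_by_address(bool (*f)(int, int)) {
    Erasure e;
    e.fptr = f;
    return RefByAddress{e, &call_via_address};
}

RefByValue make_by_value(bool (*f)(int, int)) {
    Erasure e;
    e.fptr = f;
    return RefByValue{e, &call_via_value};
}

// Counterparts of the check functions in the listings: call with (2, 3).
bool check_old(RefByAddress f) { return f(2, 3); }
bool check_new(RefByValue f) { return f(2, 3); }
```

Both wrappers behave identically from the caller's point of view, for example `check_new(make_by_value(&less_than))`. The only difference is whether the invoker sees a pointer to the wrapper's storage or a copy of it, and that is the property the description above credits for the shorter `check1` listing.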
Kim-J-Smith commented May 16, 2026
Kim-J-Smith added a commit that referenced this pull request May 17, 2026
* docs: update the version number.
* fix: fix bugs in converting from fn_ref to fn/unique_fn/safe_fn.
* test: add 'size_' and 'make_fn' to converting test.
* test: more fail-tests are added.
* Add `noexcept` deduction ability to `make_fn`. (#50)
* fix: fix bug in deducing noexcept functor when using make_fn.
* feat: add noexcept deduction for make_fn.
* fix: avoid triggering ICE in MSVC caused by 14.36~14.44 regression bug.
* fix: avoid using bool... in the template parameter list in order to support MSVC 19.22 ~ 19.33.
* fix: avoid using sizeof(T) in conditional_t in order to support MSVC 19.20, 19.21.
* Avoid storing stateless standard function objects. (#51)
* perf: add specialization standard operator wrapper.
* style: use function pointer size instead of pointer size.
* perf: enhance the make_fn<fn_ref> performance. See <https://godbolt.org/z/MfWd4hxrv>
* feat: add make_fn_log_error to provide better error log for make_fn.
* perf: relax the condition for passing parameters through registers to 2 pointer sizes.
* Enhance the performance of `fn_ref`. (#52)
* style: use function pointer size instead of pointer size.
* perf: enhance the make_fn<fn_ref> performance. See <https://godbolt.org/z/MfWd4hxrv>
* feat: add make_fn_log_error to provide better error log for make_fn.
* perf: relax the condition for passing parameters through registers to 2 pointer sizes.
* perf: pass erasure by value in `fn_ref` to avoid ODR use and enable stack optimization
* benchmark: add runtime benchmark for fn_ref.
* fix: add `const` and `volatile` in cast to avoid UB.
* docs: update release notes.
* chore: fix misspelling and add [[nodiscard]] to `is_empty`.
* feat: support C++23 static operator().
* test: add tests for new feature.
* ci: add VS2026 test.
* docs: update release notes.
* chore: add comments for `init` function.
* perf: eliminate stateless functor's storage.
* test: add test for new feature (stateless storage elimination).
* perf: relax the stateless constraint.
* docs: update README.md.