Skip to content

fix: _cycles_per_sec might crash due to overflow by reordering of rdtsc#12

Merged
BewareMyPower merged 4 commits into
fast:mainfrom
BewareMyPower:bewaremypower/fix-init-failure
May 13, 2026
Merged

fix: _cycles_per_sec might crash due to overflow by reordering of rdtsc#12
BewareMyPower merged 4 commits into
fast:mainfrom
BewareMyPower:bewaremypower/fix-init-failure

Conversation

@BewareMyPower
Copy link
Copy Markdown
Contributor

fixes #5

Copy link
Copy Markdown
Collaborator

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, in a good direction. Comments inline.

Comment thread src/tsc_now.rs Outdated
Comment on lines +5 to +8
#[cfg(all(target_arch = "x86", not(target_feature = "sse2")))]
use core::sync::atomic::compiler_fence;
#[cfg(all(target_arch = "x86", not(target_feature = "sse2")))]
use core::sync::atomic::Ordering;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this import to the block use it below. And I remember we prefer use std:: since this crate requires std.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed. The original code is generated by LLM with full qualified path, so I told LLM to import the function or method instead.

  • Use use imports instead of fully qualified paths. For example, import std::sync::atomic::compiler_fence and use compiler_fence(...) rather than std::sync::atomic::compiler_fence(...).

Now I added another convention:

  • Scope use imports inside cfg blocks rather than at module top level when they are only needed there

Comment thread src/tsc_now.rs Outdated
Copy link
Copy Markdown
Collaborator

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a new commit for tidy code. LGTM.

cc @andylokandy @zhongzc

@BewareMyPower BewareMyPower merged commit 164fc07 into fast:main May 13, 2026
9 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-init-failure branch May 13, 2026 07:09
@tisonkun
Copy link
Copy Markdown
Collaborator

@BewareMyPower Do you have a bench report after introducing this fence like #11 (comment)?

@BewareMyPower
Copy link
Copy Markdown
Contributor Author

introducing this fence

I assume you mentioned this PR.

The current workflow already run the bench, see https://github.com/fast/fastant/actions/runs/25777647841/job/75713505393

Here is a Python script that parses the raw outputs:
bench_table.py

Implementation Time (ns) vs std
fastant 9.53 ns (9.52-9.55) 0.56x faster
quanta 11.84 ns (11.83-11.86) 0.69x faster
std 17.15 ns (17.14-17.16) baseline

The change in this PR does not matter because it only affects the warm-up phase. The fence introduced in monotonic_with_tsc, which is only called by _cycles_per_sec():

#[small_ctor::ctor] init()              // tsc_now.rs:24  (runs before main())
  → TSCLevel::get()                     // tsc_now.rs:68
    → is_tsc_stable()                   // tsc_now.rs:95  (CPUID + /sys checks)
    → cycles_per_sec(anchor)            // tsc_now.rs:150
      → _cycles_per_sec()               // tsc_now.rs:160  ← CRASH HERE

_cycles_per_sec() runs only during program initialization, before main(), because init() is annotated with #[small_ctor::ctor] which places it in the .init_array linker section.

On the hot path, Instant::now() does not call anything heavy:

Instant::now()                          // instant.rs:32
  → crate::current_cycle()             // lib.rs:61
    → tsc_now::current_cycle()         // tsc_now.rs:50
      → tsc().wrapping_sub(cycles_from_anchor)  // just reads cached values

tsc() still does not use fence, that's the key point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Panic due to overflow in _cycles_per_sec()

2 participants