add atomic global counter system #1323

Open
Vizonex wants to merge 22 commits into aio-libs:master from Vizonex:global-state
Conversation

@Vizonex
Member

@Vizonex Vizonex commented Apr 21, 2026

What do these changes do?

These changes are based on @devdanzin's second fuzzer for free-threading mode, which suggested using a global atomic counter instead of a plain uint64_t value. I made one small change to that design: instead of declaring the field atomic in the structure, I cast it at the point of use with (_Atomic(uint64_t)*)&state->global_version. This was not in the original suggestion; I did it mainly because my code editor misbehaved when I tried to set the atomic type in the actual structure.

Are there changes in behavior for the user?

Just bug fixes.

Related issue number

#1321

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes

@Vizonex Vizonex requested a review from asvetlov as a code owner April 21, 2026 23:34
Comment thread tests/isolated/multidict_global_counter.py Fixed
@Vizonex Vizonex requested a review from webknjaz as a code owner April 21, 2026 23:36
@psf-chronographer psf-chronographer Bot added the bot:chronographer:provided There is a change note present in this PR label Apr 21, 2026
@codspeed-hq

codspeed-hq Bot commented Apr 22, 2026

Merging this PR will not alter performance

✅ 245 untouched benchmarks


Comparing Vizonex:global-state (2ab5cb1) with master (8d0edb4)

Comment thread multidict/_multilib/state.h Outdated
Co-authored-by: J. Nick Koston <nick+github@koston.org>
Comment thread multidict/_multilib/state.h Outdated
Comment thread multidict/_multilib/state.h Outdated
Comment thread setup.py Outdated
Vizonex and others added 3 commits April 21, 2026 22:22
Comment thread setup.py Outdated
Comment thread setup.py Outdated
Member

@bdraco bdraco left a comment

Thanks @Vizonex


Copilot AI left a comment

Pull request overview

This PR updates multidict’s global version counter to be atomic (supporting CPython free-threading) and adds an isolated regression test to detect lost increments under concurrent mutation.

Changes:

  • Make mod_state.global_version an atomic and update NEXT_VERSION() to use atomic_fetch_add_explicit(..., memory_order_relaxed).
  • Add an isolated free-threading test that asserts the version delta matches the number of concurrent writes.
  • Adjust build configuration to add C11 atomics flags for MSVC via a custom build_ext.
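To illustrate why a plain increment can lose updates, here is a small deterministic simulation in Python. It is not multidict's C code: the `Counter` class and the hand-written interleaving are assumptions made purely for illustration. A plain `++` is modelled as a separate load and store, while `fetch_add` is modelled as one indivisible step.

```python
class Counter:
    """Hypothetical stand-in for a shared 64-bit version counter."""

    def __init__(self) -> None:
        self.value = 0

    def load(self) -> int:
        return self.value

    def store(self, new_value: int) -> None:
        self.value = new_value

    def fetch_add(self, amount: int) -> int:
        # Modelled as a single indivisible step, like an atomic RMW.
        old = self.value
        self.value = old + amount
        return old


def racy_increments() -> int:
    """Two writers interleave load/store; one increment is lost."""
    counter = Counter()
    seen_by_a = counter.load()    # writer A reads 0
    seen_by_b = counter.load()    # writer B reads 0 before A stores
    counter.store(seen_by_a + 1)  # A writes 1
    counter.store(seen_by_b + 1)  # B also writes 1, overwriting A
    return counter.value


def atomic_increments() -> int:
    """Each writer's read-modify-write is one step; nothing is lost."""
    counter = Counter()
    counter.fetch_add(1)  # writer A
    counter.fetch_add(1)  # writer B
    return counter.value


print(racy_increments())    # 1
print(atomic_increments())  # 2
```

The racy schedule is the kind of interleaving the new regression test is meant to detect on a free-threaded build.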

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| multidict/_multilib/state.h | Switches the module-level global counter to an atomic and updates increment logic. |
| tests/isolated/multidict_global_counter.py | New isolated regression test for concurrent version increments (free-threaded builds). |
| tests/test_leaks.py | Registers the new isolated script in the isolated-script runner list. |
| setup.py | Introduces custom build_ext to apply platform-specific C compile flags (incl. MSVC C11 atomics). |
| CHANGES/1328.bugfix.rst | Adds a changelog entry for the atomic counter bugfix. |

Comment thread setup.py
Comment on lines +33 to +36
        for flag in BASE_CFLAGS:
            # XXX: MSVC Doesn't have a /O3 flag only O2 is possible...
            ext.extra_compile_args.append("/O2" if flag == "O3" else f"/{flag}")
    else:

Copilot AI Apr 22, 2026

MSVC handling currently prefixes BASE_CFLAGS values with / (e.g. O0 -> /O0, g3 -> /g3). These aren't valid MSVC flags (/Od and /Zi or /Z7 are the usual equivalents), so Windows builds with MULTIDICT_DEBUG_BUILD=1 will fail. Consider defining a separate MSVC-specific debug/release flag list instead of reusing BASE_CFLAGS verbatim.
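A quick way to see the problem is to run the prefix swap on its own. The sketch below copies the expression from the diff into a standalone helper; the function name `naive_msvc_flags` is made up for illustration, and the debug flag values are the ones quoted in this review thread.

```python
BASE_CFLAGS_DEBUG = ["O0", "g3", "UNDEBUG"]  # debug values quoted in this thread


def naive_msvc_flags(flags: list[str]) -> list[str]:
    """Mirror the prefix swap under review: only O3 is special-cased."""
    return ["/O2" if flag == "O3" else f"/{flag}" for flag in flags]


print(naive_msvc_flags(BASE_CFLAGS_DEBUG))  # ['/O0', '/g3', '/UNDEBUG']
```

The release list translates cleanly because O3 is handled, but the debug list yields /O0 and /g3, which are the flags this comment objects to.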

Comment thread tests/isolated/multidict_global_counter.py Outdated
Comment thread setup.py
Vizonex and others added 2 commits April 21, 2026 23:16
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
import sysconfig
import threading

import multidict
Member

Should probably resolve this, no reason to import it twice.

Contributor

Concur — easiest fix is from multidict import MultiDict, getversion and then call getversion(md) directly. That also removes one of the two type: ignore[arg-type] comments below, since getversion would be imported with its real signature rather than accessed through the bare module namespace.

@bdraco
Member

bdraco commented Apr 22, 2026

Looks like the tests still need some work

@Vizonex
Member Author

Vizonex commented Apr 25, 2026

Looks like the tests still need some work

@bdraco I think the functions aren't all locked yet, which is why the numbers are still off. Other than that, I agree with you. I'll create an extra PR today for locking the functions down in free-threaded mode, since I have the time.

Comment thread CHANGES/1328.bugfix.rst Outdated
md[f"k{tid}_{i}"] = i


if (__name__ == "__main__") and FREETHREADED:
Member

Same concern as in other PRs: we shouldn't need to gate on FT.

cc @Dreamsorcerer

Contributor

Agreeing with this concern. Now that global_version is atomic, the invariant delta == N*M should hold under the GIL too — it just becomes a less interesting test. Gating on FREETHREADED means non‑FT CI runs the script for no benefit (still pays a subprocess + import) and any future regression that reverts NEXT_VERSION back to a plain ++ won't be caught unless someone happens to run an FT build locally. Unless there's a concrete __setitem__ path on the GIL build that legitimately skips a NEXT_VERSION call, I'd drop the gate. If the assertion does fail on a GIL build today, that's a separate bug that this PR is masking by gating it out.


md: MultiDict[int] = MultiDict()
N, M = 3, 100
baseline = multidict.getversion(md) # type: ignore[arg-type]
Member

why ignore?

Member

If there's an error here, then maybe my strict MultiDict[object] is causing problems. Might need to be changed to MultiDict[Any]. Probably a contravariant thing I missed.

for t in threads:
    t.join()

observed = multidict.getversion(md) - baseline # type: ignore[arg-type]
Member

why ignore?

Member Author

@Vizonex Vizonex Apr 30, 2026

MultiDict[object] is what getversion(..) wants, but here we have a completely different type, hence the type: ignore[arg-type].

Vizonex and others added 2 commits April 29, 2026 20:46
Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
@Vizonex Vizonex added the free-threading FT-related discussion, issue or PR label May 1, 2026
@bdraco
Member

bdraco commented May 16, 2026

@aiolibsbot review

@aiolibsbot
Contributor

PR Review — add atomic global counter system

The core change — making mod_state.global_version atomic and using atomic_fetch_add_explicit(..., memory_order_relaxed) — is correct and the rationale comment now in place addresses @bdraco's earlier ask. memory_order_relaxed is the right choice because the counter only needs RMW atomicity, not ordering with respect to other memory operations.

Blocking issues before merge:

  1. setup.py MSVC debug flags: /O0 and /g3 are not valid MSVC options. Debug builds on Windows will warn or fail. Adopt @bdraco's cleaner table‑based approach and add an explicit MSVC mapping.
  2. Duplicate multidict import in the new isolated test (CodeQL + @Dreamsorcerer).
  3. FREETHREADED gating turns the test into a no‑op on regular CPython, and the assertion itself appears to require the not‑yet‑landed per‑dict locking. As @bdraco noted ("tests still need some work"), either mark the test xfail / hold it for the locking PR, or rescope it so it only validates the atomic counter independently of __setitem__ thread‑safety.

Lower priority: apply @webknjaz's changelog wording, the inline‑formatting suggestion on NEXT_VERSION, and the MultiDict[object] typing nudge.


🟡 Important

1. MSVC debug build will pass invalid flags (`setup.py`, L13)

With DEBUG_BUILD=1 on MSVC, the loop translates BASE_CFLAGS = ["O0", "g3", "UNDEBUG"] into /O0, /g3, /UNDEBUG. The mapping only special‑cases O3 → /O2. However:

  • /O0 is not a valid MSVC option (MSVC uses /Od to disable optimization).
  • /g3 is not a valid MSVC option (MSVC uses /Z7 / /Zi / /ZI for debug info).
  • /UNDEBUG happens to work because /U<symbol> undefines a macro.

So debug builds on MSVC will either fail or print warnings about unrecognized options. Either drop the auto-translate scheme, keeping the original - prefixes for Unix (as @bdraco suggested) and adding an explicit MSVC table:

BASE_UNIX_DEBUG = ["-O0", "-g3", "-UNDEBUG"]
BASE_UNIX_RELEASE = ["-O3", "-DNDEBUG"]
BASE_MSVC_DEBUG = ["/Od", "/Zi", "/UNDEBUG"]
BASE_MSVC_RELEASE = ["/O2", "/DNDEBUG"]

or at least extend the translation to handle O0 and g* explicitly.

BASE_CFLAGS = ["O0", "g3", "UNDEBUG"] if DEBUG_BUILD else ["O3", "DNDEBUG"]
...
    ext.extra_compile_args.append("/O2" if flag == "O3" else f"/{flag}")

2. Duplicate import of `multidict` (`tests/isolated/multidict_global_counter.py`, L3-5)

CodeQL and @Dreamsorcerer both flagged this: the script does import multidict and from multidict import MultiDict. Pick one. The cleanest fix is to import everything you need directly:

from multidict import MultiDict, getversion

Then getversion(md) instead of multidict.getversion(md). That also lets you drop the second type: ignore[arg-type] if getversion is re‑exported with the correct annotation.

import multidict
from multidict import MultiDict

3. Gating on `FREETHREADED` makes this a no‑op under the GIL (`tests/isolated/multidict_global_counter.py`, L20-33)

As @webknjaz noted, the threaded section is wrapped in if (__name__ == "__main__") and FREETHREADED:. On a regular CPython build this script imports cleanly, runs the module‑level MultiDict() + getversion(md), and then exits without exercising the assertion. That means:

  1. The regression is only verifiable on free‑threaded builds.
  2. The test is registered in tests/test_leaks.py, so under the GIL it still spends time spawning a subprocess just to do nothing useful.

For a counter that's now atomic, the assertion observed == N * M should hold under the GIL too (it's just less interesting). Dropping the FREETHREADED gate would (a) make the script meaningful for every supported interpreter and (b) catch any future regression where someone reverts NEXT_VERSION to a non‑atomic increment without anyone running an FT build locally.

If there's a concrete reason the assertion can't hold under the GIL (e.g. MultiDict.__setitem__ itself fails to bump the version in some path), please document it inline.
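As a rough sketch of what an ungated script could look like, the example below runs the same N-threads-by-M-writes check on any interpreter. It does not import multidict: `VersionedDict` is a hypothetical pure-Python stand-in whose writes bump a lock-protected version, so the names and structure here are assumptions rather than the PR's actual test.

```python
import threading


class VersionedDict:
    """Hypothetical stand-in for MultiDict: every write bumps a version."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}
        self._version = 0
        self._lock = threading.Lock()

    def __setitem__(self, key: str, value: int) -> None:
        # The lock makes the whole write, including the bump, one step.
        with self._lock:
            self._data[key] = value
            self._version += 1

    def version(self) -> int:
        with self._lock:
            return self._version


def run_check(n_threads: int = 3, writes_per_thread: int = 100) -> int:
    """Return how far the version advanced after concurrent writes."""
    md = VersionedDict()
    baseline = md.version()

    def worker(tid: int) -> None:
        for i in range(writes_per_thread):
            md[f"k{tid}_{i}"] = i

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return md.version() - baseline


if __name__ == "__main__":  # no FREETHREADED gate
    observed = run_check()
    assert observed == 3 * 100, observed
```

Because the check lives in a function and the guard only tests `__name__`, the same assertion runs under the GIL and on free-threaded builds.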

if (__name__ == "__main__") and FREETHREADED:

4. Assertion will likely fail without per‑dict locking (`tests/isolated/multidict_global_counter.py`, L24-32)

As both you and @bdraco noted in the issue thread, the assertion expected == observed requires every concurrent md[k] = v to result in exactly one increment of NEXT_VERSION. An atomic counter only guarantees that if NEXT_VERSION is called N*M times, the counter advances by N*M. If __setitem__ itself races (e.g. a concurrent insert short‑circuits, or a Py_INCREF/DECREF tears), some calls to NEXT_VERSION may be skipped or duplicated relative to what the test counts.

Until per‑MultiDict locking lands (the follow‑up PR you mentioned), this test will flake on FT builds and the failure will look like "atomic counter is broken" when really the dict mutation path is what needs serializing. Two options:

  1. Land this PR with the test marked xfail (or removed) and re‑enable it together with the locking PR.
  2. Narrow the test so it only exercises NEXT_VERSION directly via a synthetic accessor, decoupling the counter check from __setitem__ semantics.

Option 1 is probably cleanest given the comment trail.
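For comparison, option 2 could take roughly the following shape. This is only a sketch under stated assumptions: `LockedCounter` is a hypothetical pure-Python stand-in for a C-level accessor (no such accessor exists in the PR), and a lock plays the role of the atomic read-modify-write. The point is the shape of the check: call the counter directly from several threads and verify that every returned value is distinct and none is skipped.

```python
import threading


class LockedCounter:
    """Hypothetical pure-Python stand-in for the C-level atomic counter."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def next_version(self) -> int:
        # Same effect as fetch_add(1) + 1: returns the new value.
        with self._lock:
            self._value += 1
            return self._value


def collect_versions(n_threads: int, calls_per_thread: int) -> list[int]:
    """Call next_version() from several threads; return all results sorted."""
    counter = LockedCounter()
    results: list[list[int]] = [[] for _ in range(n_threads)]

    def worker(slot: int) -> None:
        for _ in range(calls_per_thread):
            results[slot].append(counter.next_version())

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(v for per_thread in results for v in per_thread)


versions = collect_versions(3, 100)
# Every call got a distinct value and none were skipped.
assert versions == list(range(1, 301))
```

A check like this says nothing about `__setitem__`, which is the intent: it isolates the counter from the dict mutation path.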

observed = multidict.getversion(md) - baseline
expected = N * M
assert expected == observed, ...

🟢 Suggestions

1. Prefer keeping dashes inline per reviewer suggestion (`setup.py`, L13-42)

@bdraco already laid out a cleaner shape that avoids the string‑stripping dance entirely: keep BASE_CFLAGS with the proper - prefix and translate them inside BuildExt.build_extensions only when self.compiler.compiler_type == "msvc". That removes the f"-{flag}" / f"/{flag}" reconstruction step and makes the per‑platform tables self‑documenting. It also sidesteps the /O0 and /g3 issue above because the translation becomes explicit.
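A translation step of this kind might look like the sketch below. It is written as a plain function so it can run without setuptools; the names `MSVC_EQUIVALENTS` and `flags_for` are made up for illustration, and in the real `setup.py` the `compiler_type` argument would presumably come from `self.compiler.compiler_type` inside `build_extensions`. The flag pairs are the ones named in this review.

```python
# Explicit GCC/Clang -> MSVC equivalents for the flags named in this thread.
MSVC_EQUIVALENTS = {
    "-O0": "/Od",
    "-O3": "/O2",
    "-g3": "/Zi",
}


def flags_for(compiler_type: str, unix_flags: list[str]) -> list[str]:
    """Return unix_flags unchanged, or translated when targeting MSVC."""
    if compiler_type != "msvc":
        return list(unix_flags)
    translated = []
    for flag in unix_flags:
        if flag in MSVC_EQUIVALENTS:
            translated.append(MSVC_EQUIVALENTS[flag])
        elif flag.startswith(("-D", "-U")):
            # /D and /U define and undefine macros, like -D and -U.
            translated.append("/" + flag[1:])
        else:
            raise ValueError(f"no MSVC equivalent known for {flag!r}")
    return translated


print(flags_for("msvc", ["-O0", "-g3", "-UNDEBUG"]))  # ['/Od', '/Zi', '/UNDEBUG']
print(flags_for("unix", ["-O3", "-DNDEBUG"]))         # ['-O3', '-DNDEBUG']
```

Raising on an unknown flag is a deliberate choice in this sketch: a new flag without an MSVC mapping fails loudly at build time instead of being passed through with a swapped prefix.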

2. `extra_compile_args=[]` is dead and misleading (`setup.py`, L46-48)

Now that BuildExt.build_extensions populates ext.extra_compile_args, the literal extra_compile_args=[] on the Extension is pure noise. Either drop it (it defaults to None) or set it to None so a future reader doesn't think they need to add flags here.

Extension(
    "multidict._multidict",
    ["multidict/_multidict.c"],
    extra_compile_args=[],
),

3. Pass `cmdclass` symmetrically (`setup.py`, L54-58)

setup(ext_modules=extensions, cmdclass={"build_ext": BuildExt}) is only used in the accelerated branch, while the pure‑Python branch calls plain setup(). That matches @bdraco's snippet and is already correct, because no extensions are compiled on the pure‑Python path. No change is needed here, but it is worth confirming that this PR didn't inadvertently break pip install -e . on Windows by running an editable install once before merging.

4. Apply @bdraco's single‑line formatting suggestion (`multidict/_multilib/state.h`, L133-136)

The committed comment is good, but the call is split across three lines while the prior, non‑atomic implementation was a single statement. @bdraco's suggested form keeps the diff tighter and matches the surrounding style:

/* relaxed is fine here as we only care about the atomicity of the RMW itself */
return atomic_fetch_add_explicit(
    &state->global_version, 1, memory_order_relaxed) + 1;

Non‑blocking — purely cosmetic.

return atomic_fetch_add_explicit(
           &state->global_version, 1, memory_order_relaxed) +
       1;

5. `MultiDict[int]` vs `MultiDict[object]` invariance (`tests/isolated/multidict_global_counter.py`, L12)

The # type: ignore[arg-type] here exists because getversion is annotated for MultiDict[object] and MultiDict[int] isn't a subtype due to invariance. Two cleaner options:

  1. Declare md: MultiDict[object] = MultiDict() — the test never reads values back, so the wider element type costs nothing.
  2. Adjust getversion's signature to accept MultiDict[Any] (per @Dreamsorcerer's comment) so callers don't have to pay the variance tax.

Either removes both type: ignore comments and makes the test read more naturally.
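The variance issue can be reproduced with a small generic class. In this sketch `Box` is a hypothetical mutable container standing in for MultiDict (which this thread describes as invariant in its value type); `size_strict` and `size_any` are likewise made-up names mirroring the two possible getversion signatures.

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")  # TypeVars are invariant by default


class Box(Generic[T]):
    """Hypothetical mutable container, invariant in T."""

    def __init__(self) -> None:
        self.items: list[T] = []

    def add(self, item: T) -> None:
        self.items.append(item)


def size_strict(box: "Box[object]") -> int:
    # A type checker rejects Box[int] here because Box is invariant in T.
    return len(box.items)


def size_any(box: "Box[Any]") -> int:
    # Any is compatible with every element type, so Box[int] is accepted.
    return len(box.items)


ints: Box[int] = Box()
ints.add(1)
print(size_any(ints))  # 1

wide: Box[object] = Box()  # option 1: declare the wider element type
wide.add(1)
print(size_strict(wide))  # 1
```

Both calls run identically at runtime; the difference only shows up under a static checker, where passing `ints` to `size_strict` would need the kind of `type: ignore[arg-type]` seen in the test.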

md: MultiDict[int] = MultiDict()
...
baseline = multidict.getversion(md)  # type: ignore[arg-type]

6. Take @webknjaz's changelog suggestion (`CHANGES/1328.bugfix.rst`, L1-2)

@webknjaz already proposed dropping the trailing -- by :user:`Vizonex`. to match the project's changelog conventions (contributor attribution is captured via git log/PR rather than inline). Worth applying before merge.

Fixed global counter system using an atomic variable
-- by :user:`Vizonex`.

Checklist

  • Atomic semantics correct for the counter
  • Memory ordering justification documented
  • Build flags valid on all supported toolchains: see Important #1
  • New tests are meaningful on all supported interpreters: see Important #3
  • No duplicate / dead imports: see Important #2
  • Changelog follows project convention: see Suggestion #6
  • No resource leaks introduced
  • No hardcoded secrets or unsafe deserialization

Automated review by Kōan

@aiolibsbot
Contributor

@Vizonex: > Looks like the tests still need some work

That matches @bdraco's read. Suggestion: until the per‑dict locking PR lands, either (a) mark tests/isolated/multidict_global_counter.py as xfail on FT builds, or (b) rescope this test to validate NEXT_VERSION directly (e.g. call it from N threads through a tiny C accessor) so it's decoupled from MultiDict.__setitem__ thread‑safety. Otherwise this test will read as "the atomic counter is broken" on FT CI when really it's the mutation path that races. The atomic counter change itself is sound and worth shipping independently of the locking work.

Labels

bot:chronographer:provided (There is a change note present in this PR)
free-threading (FT-related discussion, issue or PR)

7 participants