Summary
tests/test_buffer.py::SimpleArrayPlexTC::test_SimpleArrayPlex_typed_roundtrip flakes on the Windows-2022 Release CI job (run_pilot_pytest) with:
E AssertionError: nan != nan
tests/test_buffer.py:4100: AssertionError
SUBFAILED(dtype='float32') ... test_SimpleArrayPlex_typed_roundtrip
It is intermittent and platform-specific: Windows Debug passes, and macOS / Ubuntu (Release and Debug) pass. On one PR branch I saw it fail, pass, then fail across three identical CI runs (no source change between them), so it is roughly a coin flip on Windows Release. Because run_pilot_pytest uses -x, this one failure stops the whole Windows-Release suite.
Root cause
The test builds an uninitialized array and then checks that two views of the same buffer agree:
plex = modmesh.SimpleArray((2, 3, 4), dtype=dtype) # not initialized
typed = plex.typed
plex2 = typed.plex
...
ndarr = np.array(plex, copy=False) # intended to be a view
ndarr.flat[0] = 1 # seed element [0,0,0]
self.assertEqual(plex2[0, 0, 0], typed[0, 0, 0])
Two things combine to make it flaky:
- np.array(plex, copy=False) does not guarantee a view. In NumPy 1.x, copy=False means "avoid a copy if possible", not "never copy" (NumPy 2.0 and later instead raise ValueError when a copy cannot be avoided). When it copies (apparently the case on Windows Release), the ndarr.flat[0] = 1 seed never reaches the shared buffer, so [0,0,0] stays uninitialized.
- Uninitialized float32 memory is sometimes a NaN bit pattern. The assertion then evaluates nan == nan, which is always false under IEEE-754, so it fails even though plex2 and typed read the same bytes. The roundtrip the test means to verify is actually fine; the test is just reading garbage and comparing NaN to NaN.
This is purely a test-robustness problem; the buffer/view code is not implicated.
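The NaN half of the failure can be reproduced without modmesh or NumPy. The sketch below uses only the standard library. The 0x7FC00000 bit pattern is just one float32 NaN chosen for illustration (uninitialized memory could hold any NaN pattern), and nan_safe_equal is a hypothetical helper, not something that exists in the test suite.

```python
import math
import struct

# One float32 quiet-NaN bit pattern, standing in for whatever bytes
# happen to sit in uninitialized memory (illustrative assumption).
raw = struct.pack("<I", 0x7FC00000)

# Two "views" decode the very same four bytes.
a = struct.unpack("<f", raw)[0]
b = struct.unpack("<f", raw)[0]

# Plain equality is False for NaN, which is what assertEqual relies on.
equal_by_value = (a == b)


def nan_safe_equal(x, y):
    """Hypothetical helper: treat two NaNs as equal, else compare by value."""
    return x == y or (math.isnan(x) and math.isnan(y))


print(equal_by_value, nan_safe_equal(a, b))
```

Both values come from identical bytes, yet only the NaN-aware comparison reports them as equal, which matches the "nan != nan" message in the CI log.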
Suggested fix
Make the element deterministic before comparing, and don't depend on copy=False returning a view. Seed through the SimpleArray API (or through np.asarray after asserting that the result is a view), assert against the written value rather than view-vs-view, and use a NaN-safe comparison. For example:
plex = modmesh.SimpleArray((2, 3, 4), dtype=dtype)
plex.fill(1) # deterministic, no uninitialized read
typed = plex.typed
plex2 = typed.plex
self.assertEqual(typed[0, 0, 0], 1) # check against the written value
self.assertEqual(plex2[0, 0, 0], typed[0, 0, 0]) # both == 1
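If the test keeps seeding through NumPy, the "assert a view first" option might look like the sketch below. It uses a plain ndarray as a stand-in for the modmesh buffer (an assumption, since modmesh is not imported here), and seed_first_element is a hypothetical helper. np.shares_memory is a real NumPy function that reports whether two arrays overlap in memory.

```python
import numpy as np

# Stand-in for the shared buffer; the real test would wrap a SimpleArray.
base = np.zeros((2, 3, 4), dtype=np.float32)

view = np.asarray(base)  # no copy: the input is already an ndarray
copied = base.copy()     # explicit copy, independent memory


def seed_first_element(target, source, value=1):
    """Hypothetical helper: write to target.flat[0] only if it aliases source."""
    if not np.shares_memory(target, source):
        raise AssertionError("expected a view of the source buffer, got a copy")
    target.flat[0] = value


# Seeding through the view reaches the underlying buffer.
seed_first_element(view, base)

# Seeding through the copy is rejected instead of silently doing nothing.
try:
    seed_first_element(copied, base)
    copy_rejected = False
except AssertionError:
    copy_rejected = True
```

With this guard, a silent copy turns into an explicit, explainable failure rather than an uninitialized read further down the test.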
Evidence
Same test, failing then passing then failing on identical trees: