Flaky test: SimpleArrayPlexTC.test_SimpleArrayPlex_typed_roundtrip (nan != nan) on Windows Release #884

@tigercosmos

Summary

tests/test_buffer.py::SimpleArrayPlexTC::test_SimpleArrayPlex_typed_roundtrip flakes on the Windows-2022 Release CI job (run_pilot_pytest) with:

E   AssertionError: nan != nan
tests/test_buffer.py:4100: AssertionError
SUBFAILED(dtype='float32') ... test_SimpleArrayPlex_typed_roundtrip

It is intermittent and platform-specific: Windows Debug passes, and macOS / Ubuntu (Release and Debug) pass. On one PR branch I saw it fail, pass, then fail across three identical CI runs (no source change between them), so it is roughly a coin flip on Windows Release. Because run_pilot_pytest uses -x, this one failure stops the whole Windows-Release suite.

Root cause

The test builds an uninitialized array and then checks that two views of the same buffer agree:

plex = modmesh.SimpleArray((2, 3, 4), dtype=dtype)   # not initialized
typed = plex.typed
plex2 = typed.plex
...
ndarr = np.array(plex, copy=False)   # intended to be a view
ndarr.flat[0] = 1                    # seed element [0,0,0]
self.assertEqual(plex2[0, 0, 0], typed[0, 0, 0])

Two things combine to make it flaky:

  1. np.array(plex, copy=False) does not guarantee a view. Under NumPy 1.x, copy=False means "avoid a copy if possible", not "never copy" (NumPy 2.x instead raises ValueError when a copy cannot be avoided). When it copies (apparently the case on Windows Release), the ndarr.flat[0] = 1 seed is written to the copy and never reaches the shared buffer, so [0,0,0] stays uninitialized.
  2. Uninitialized float32 memory sometimes holds a NaN bit pattern. The assertion then evaluates nan == nan, which is always false under IEEE-754, so it fails even though plex2 and typed read the same bytes. The roundtrip the test means to verify is fine; the test is reading garbage and comparing NaN to NaN.

This is purely a test-robustness problem; the buffer/view code is not implicated.
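
The NaN half of this can be reproduced without modmesh or NumPy. The following is a minimal sketch using only the standard library; the bit pattern 0x7FC00000 is just one example of a float32 NaN (an assumption for illustration, not the value observed in CI), and nan_safe_equal is a hypothetical helper, not part of the test suite:

```python
import math
import struct

# One example float32 NaN bit pattern. Uninitialized memory could hold this
# or any other pattern with all exponent bits set and a nonzero mantissa.
raw = struct.pack("<I", 0x7FC00000)

# Two independent reads of the same four bytes, standing in for the
# plex2 and typed views of one buffer.
read_a = struct.unpack("<f", raw)[0]
read_b = struct.unpack("<f", raw)[0]

# Plain equality is what assertEqual relies on; it is False for NaN
# even though both values came from identical bytes.
values_equal = (read_a == read_b)
both_nan = math.isnan(read_a) and math.isnan(read_b)


def nan_safe_equal(a, b):
    """Hypothetical helper: treat two NaNs as equal, else compare normally."""
    return a == b or (math.isnan(a) and math.isnan(b))
```

Here values_equal is False while both_nan is True, which is the situation the failing assertion runs into.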

Suggested fix

Make the element deterministic before comparing, and don't depend on np.array(plex, copy=False) returning a view. Concretely: seed through the SimpleArray API (or through np.asarray after asserting that the result is a view), assert against the written value rather than view against view, and use a NaN-safe comparison. For example:

plex = modmesh.SimpleArray((2, 3, 4), dtype=dtype)
plex.fill(1)                 # deterministic, no uninitialized read
typed = plex.typed
plex2 = typed.plex
self.assertEqual(typed[0, 0, 0], 1)                # compare against the written value
self.assertEqual(plex2[0, 0, 0], typed[0, 0, 0])   # both == 1
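
If the test keeps seeding through NumPy, the "assert a view first" option could look roughly like the sketch below. It uses a plain ndarray named base as a stand-in for the modmesh buffer (an assumption, since whether np.shares_memory accepts a SimpleArray directly is not confirmed here), and shows np.testing.assert_array_equal as one NaN-safe comparison:

```python
import numpy as np

# Stand-in for the shared buffer. Zeros keep this sketch deterministic;
# the real test would use the modmesh array instead.
base = np.zeros((2, 3, 4), dtype=np.float32)

view = np.asarray(base)              # no copy for an existing ndarray
copied = np.array(base, copy=True)   # explicit copy, for contrast

# Fail loudly if the "view" does not actually alias the buffer.
assert np.shares_memory(view, base)
assert not np.shares_memory(copied, base)

view.flat[0] = 1      # reaches the shared buffer
copied.flat[0] = 2    # only changes the copy

seeded = float(base[0, 0, 0])

# assert_array_equal treats NaNs in the same positions as equal,
# so it does not trip on nan != nan.
with_nan = np.array([np.nan, 1.0], dtype=np.float32)
np.testing.assert_array_equal(with_nan, with_nan.copy())
```

The write through view is visible in base, while the write to copied is not, which mirrors the seed being lost when np.array silently copies.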

Evidence

Same test, failing then passing then failing on identical trees:

Labels: array (Multi-dimensional array implementation), build (Build system and automation), test (testing and continuous integration)