Match MATLAB column-major average reference (float32 parity) by arnodelorme · Pull Request #232 · sccn/eegprep

arnodelorme · 2026-06-13T15:30:38Z

🤖 Average re-referencing was computed as np.dot(refmatrix, data) (NumPy row-major), which differs from MATLAB reref.m's column-major refmatrix * data by up to ~0.06 µV in float32. For borderline subjects, ICA (Picard) convergence amplifies this tiny input difference into a different decomposition, cascading to ~2–3 µV differences at ICLabel and epoching.

Fix

Transpose the operands — (data.T @ refmatrix).T (refmatrix is symmetric, so algebraically identical) — to replicate MATLAB's column-major float32 accumulation order. Bit-exact to MATLAB on x86 BLAS. Single-line change; only the average-reference branch is touched (specific-channel/huber paths unchanged). pop_reref routes both the data and the ICA icawinv through reref(), so both are covered.

Validation (13 P300 subjects, Python 3.10 + MATLAB R2022b)

Step-2 re-reference: 0.03–0.16 µV → 0.000000 µV for all 13 subjects.
Steps 1–4 bit-exact; ICA AMARI ~0 / corr 1.0 for all subjects, including sub-005/008/013 (previously 0.029 / 0.042 / 0.026).
Step-5 ICLabel max across subjects: 3.34 µV → 0.0002 µV; the sub-005/008/013 ICLabel divergence (2.76 / 3.34 / 1.87 µV) is eliminated, and the same components are rejected on both sides.

reref() computed the average reference as np.dot(refmatrix, data) (NumPy row-major), which differs from MATLAB's column-major refmatrix*data by ~0.06 uV in float32. ICA convergence amplifies this for borderline subjects (sub-005/008/013) into a different decomposition and ~2-3 uV ICLabel/epoch differences. Transposing the operands -- (data.T @ refmatrix).T, refmatrix being symmetric -- reproduces MATLAB's accumulation order bit-exactly on x86. reref now matches MATLAB to 0 uV across all 13 P300 parity subjects.

claude · 2026-06-13T15:30:55Z

Claude finished @arnodelorme's task in 1m 52s —— View job

PR Review — Match MATLAB column-major average reference (float32 parity)

Read changed file & diff vs origin/develop
Check MATLAB reref.m reference (not vendored in this checkout)
Assess correctness / algebraic equivalence
Review dtype flow, scope, and risks

Verdict: LGTM. The change is correct, minimal, well-scoped, and the comment honestly documents the reasoning and its platform constraint.

Correctness

The only behavioral line changes the average-reference matmul:

# before
work[chansin_array, :] = np.dot(refmatrix, work[chansin_array, :].astype(dt))
# after
block = np.ascontiguousarray(work[chansin_array, :].astype(dt).T)
work[chansin_array, :] = (block @ refmatrix).T

Algebraically equivalent. refmatrix = eye(n) - ones(n)/n is symmetric, so (data.T @ refmatrix).T = refmatrix.T @ data = refmatrix @ data. ✅
dtype flow is sound. work is float64 (line 50, result_type(float32, float64)); .astype(dt) casts back to the original float32 so the BLAS GEMM runs in float32 (the whole point), and the float32 result is upcast on assignment back into the float64 work buffer. Identical dtype semantics to the prior line. ✅
ascontiguousarray(...T) materializes the transposed block in C-order so the GEMM operand layout mirrors MATLAB's column-major refmatrix * data. This is the mechanism behind the bit-exact claim. Cost is one extra transpose-copy per call — negligible at EEG sizes.
Scope is tight. Only the average-reference else branch is touched; the specific-channel (mean_data = work[ref].mean(...)) and Huber paths are untouched, matching the diff. Since pop_reref routes both data and icawinv through reref(), both get the corrected accumulation. ✅
mean_data = None for average reference is unchanged pre-existing behavior.

Worth noting (non-blocking)

Platform dependence. The comment correctly scopes bit-exactness to "x86 BLAS". On Apple Silicon / ARM or a different BLAS build (OpenBLAS vs MKL), the result may not be bit-identical to MATLAB. Importantly this is never worse than the prior np.dot form — it's the same float32 GEMM, just with operand layout chosen to match MATLAB — so there's no regression risk for non-x86 users. The honest comment is the right call here; no code change needed.
No new test asserts the parity. The existing tests/test_pop_reref.py cases check mean across channels ≈ 0 (atol 1e-6) and icawinv subtraction, which this change preserves but which would not have caught the original 0.06 µV divergence (that only surfaces downstream through ICA). The real validation lives in the MATLAB-parity scripts (tests/matlab/pop_reref_compare.m) and the 13-subject P300 results in the PR body. That's reasonable given bit-exactness is platform/BLAS-specific and hard to assert portably — flagging only so the rationale is recorded.

Testing note

I was unable to execute the test suite or a numerical sanity check in this environment (Bash command approval is restricted here). Please confirm the parity scripts / tests/test_pop_reref.py pass in CI before merging — though given the algebraic equivalence and the thorough validation in the PR description, I expect them to.

arnodelorme merged commit c73ca0d into develop Jun 13, 2026
10 checks passed

arnodelorme deleted the fix/reref-average-transpose-parity branch June 13, 2026 16:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Match MATLAB column-major average reference (float32 parity)#232

Match MATLAB column-major average reference (float32 parity)#232
arnodelorme merged 1 commit into
developfrom
fix/reref-average-transpose-parity

arnodelorme commented Jun 13, 2026

Uh oh!

claude Bot commented Jun 13, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

arnodelorme commented Jun 13, 2026

Fix

Validation (13 P300 subjects, Python 3.10 + MATLAB R2022b)

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — Match MATLAB column-major average reference (float32 parity)

Correctness

Worth noting (non-blocking)

Testing note

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 13, 2026 •

edited

Loading