Commit 614f4e5
feat: complete IL kernel migration batch 2 - Dot.NDMD, CumSum axis, Shift, Var/Std SIMD
Major changes:
- Dot.NDMD: 15,880 → 419 lines (97% reduction) with SIMD for float/double
- CumSum axis: IL kernel with caching, optimized inner contiguous path
- LeftShift/RightShift: New ILKernelGenerator.Shift.cs (546 lines) with SIMD
- Var/Std axis: SIMD support for int/long/short/byte types
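For reference, the N-D x M-D dot product that Dot.NDMD targets follows NumPy's rule: contract the last axis of `a` with the second-to-last axis of `b`. The sketch below only computes the resulting shape; it is not the IL kernel from this commit. The helper name `dot_result_shape` is hypothetical, and reading the test names `Dot3412x5621` and `Dot311x511` as operand shapes is an assumption.

```python
def dot_result_shape(a_shape, b_shape):
    """Shape of dot(a, b) for N-D a and M-D b (M >= 2), per NumPy's rule:
    dot(a, b)[i, j, k, m] = sum(a[i, j, :] * b[k, :, m])."""
    if len(a_shape) < 1 or len(b_shape) < 2:
        raise ValueError("expected a to be at least 1-D and b at least 2-D")
    if a_shape[-1] != b_shape[-2]:
        # The contracted axes must have the same length.
        raise ValueError(f"shapes {a_shape} and {b_shape} are not aligned")
    # Keep every axis of a except the last, and every axis of b except
    # the second-to-last.
    return tuple(a_shape[:-1]) + tuple(b_shape[:-2]) + (b_shape[-1],)


# Assumed reading of the two test names as operand shapes.
print(dot_result_shape((3, 4, 1, 2), (5, 6, 2, 1)))
print(dot_result_shape((3, 1, 1), (5, 1, 1)))
```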
New IL infrastructure:
- ILKernelGenerator.Shift.cs - Bit shift operations with Vector256
- ILKernelGenerator.Scan.cs - Extended with axis cumsum support
- ILKernelGenerator.Reduction.cs - SIMD for integer types in Var/Std
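As an illustration of what an axis cumsum has to compute, here is a minimal reference sketch over a flat, C-contiguous buffer. It is not the generated IL; the function name `cumsum_axis` and the outer/length/inner decomposition are assumptions made for the example. The innermost loop walks contiguous memory, which is presumably the part an "inner contiguous path" can speed up.

```python
from math import prod


def cumsum_axis(flat, shape, axis):
    """Cumulative sum along `axis` of a C-contiguous array given as a flat list."""
    outer = prod(shape[:axis])      # independent slabs in front of the axis
    length = shape[axis]            # elements accumulated along the axis
    inner = prod(shape[axis + 1:])  # element stride between steps along the axis
    out = list(flat)
    for o in range(outer):
        base = o * length * inner
        for k in range(1, length):
            row = base + k * inner
            prev = row - inner
            # Contiguous run of `inner` elements: each one adds the running
            # total stored one step earlier along the axis.
            for j in range(inner):
                out[row + j] += out[prev + j]
    return out


data = [1, 2, 3, 4, 5, 6]            # shape (2, 3)
print(cumsum_axis(data, (2, 3), 0))  # [1, 2, 3, 5, 7, 9]
print(cumsum_axis(data, (2, 3), 1))  # [1, 3, 6, 4, 9, 15]
```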
Bug fixes:
- Single element Var/Std with ddof >= size returns NaN (NumPy parity)
- Dot tests Dot3412x5621 and Dot311x511 now pass (removed OpenBugs)
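The Var/Std fix can be illustrated with a small reference sketch of the ddof rule. It assumes the divisor is clamped to `max(n - ddof, 0)` and that a zero divisor follows IEEE 754 division (0/0 gives NaN), which matches the single-element case described above. The function names are hypothetical, and this is not the NumSharp implementation.

```python
import math


def var(values, ddof=0):
    """Variance of a non-empty sequence with `ddof` delta degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    divisor = max(n - ddof, 0)  # assumption: divisor clamped at zero
    if divisor == 0:
        # Python raises on float division by zero, so the IEEE 754 result is
        # spelled out: 0/0 is NaN (the single-element case), x/0 is inf.
        return math.nan if sum_sq == 0 else math.inf
    return sum_sq / divisor


def std(values, ddof=0):
    """Standard deviation; NaN propagates through the square root."""
    return math.sqrt(var(values, ddof))


print(var([5.0], ddof=1))  # nan
print(std([5.0], ddof=1))  # nan
print(var([5.0], ddof=0))  # 0.0
```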
Documentation:
- CLAUDE.md updated with all migrations
- PR #573 comments with progress updates and Definition of Done
Test coverage:
- All Var/Std/CumSum/Shift/Dot tests passing
1 parent 5f48da5 commit 614f4e5
12 files changed
Lines changed: 1995 additions & 15959 deletions
File tree
- .claude
- src/NumSharp.Core/Backends
- Default/Math
- BLAS
- Reduction
- Kernels
- test/NumSharp.UnitTest/LinearAlgebra
Lines changed: 343 additions & 15804 deletions
Large diffs are not rendered by default.