|
| 1 | +# NumSharp Release Notes - Long Indexing Branch |
| 2 | + |
| 3 | +**456 files changed | 95,375 insertions | 8,238 deletions** |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Major Features |
| 8 | + |
| 9 | +### Int64/Long Indexing Support (Arrays >2GB) |
| 10 | + |
| 11 | +Complete migration from `int` to `long` indexing across the entire codebase, enabling arrays with more than `int.MaxValue` (about 2.1 billion) elements, i.e. beyond ~2GB for byte arrays and ~16GB for doubles. |
| 12 | + |
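To make the size limits above concrete, here is a minimal sketch in plain Python (not NumSharp code) of the arithmetic behind them. The helper names `wrap_int32` and `flat_offset` are hypothetical and exist only for this illustration; the point is that a row-major offset into a 50,000 x 50,000 array no longer fits in a 32-bit signed integer.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647, the value of C# int.MaxValue


def wrap_int32(value):
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


def flat_offset(index, shape):
    """Row-major (C-order) flat offset of `index` within an array of `shape`."""
    offset = 0
    for i, dim in zip(index, shape):
        offset = offset * dim + i
    return offset


# Memory needed for INT32_MAX elements at 1 byte and 8 bytes per element.
bytes_for_byte_array = INT32_MAX * 1   # just under 2 GiB
bytes_for_double_array = INT32_MAX * 8  # just under 16 GiB

# A 50,000 x 50,000 array holds 2.5 billion elements, more than INT32_MAX.
shape = (50_000, 50_000)
true_offset = flat_offset((49_999, 49_999), shape)
overflowed = wrap_int32(true_offset)  # what a 32-bit counter would hold

print(true_offset > INT32_MAX)  # True
print(overflowed < 0)           # True: the 32-bit offset wraps negative
```

A 64-bit offset holds the same value without wrapping, which is why the loop counters, strides and offsets listed below all had to change together.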
| 13 | +**Core Changes:** |
| 14 | +- `Shape.dimensions`, `Shape.strides`, `Shape.size`, `Shape.offset` changed to `long[]` / `long` |
| 15 | +- `NDArray.size`, `NDArray.len`, all indexers changed to `long` |
| 16 | +- `ArraySlice<T>`, `UnmanagedMemoryBlock<T>`, `UnmanagedStorage` migrated to `long` indexing |
| 17 | +- `NDIterator`, `MultiIterator` updated for `long` coordinates and offsets |
| 18 | +- All 20+ ILKernelGenerator partial classes updated for `long` loop counters and offsets |
| 19 | +- Added `UnmanagedSpan<T>` (ported from dotnet/runtime) for Span-like semantics with `long` length |
| 20 | +- Added `LongIntroSort` for sorting large arrays |
| 21 | +- `Hashset<T>` upgraded to long-based indexing with 33% growth for large collections |
| 22 | + |
| 23 | +**API Additions for Long Indexing:** |
| 24 | +- `long[]` overloads for `NDArray.GetInt32()`, `GetInt64()`, `GetSingle()`, etc. |
| 25 | +- `long` population size support in `np.random.choice` |
| 26 | +- `long` repeat counts in `np.repeat` |
| 27 | +- All random sampling functions support `long[]` size parameters |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +### NumPy 2.x Alignment |
| 32 | + |
| 33 | +**`np.arange()` Now Returns Int64:** |
| 34 | +- Integer inputs now return `Int64` arrays (NumPy 2.x behavior) |
| 35 | +- Fixed negative step behavior to match NumPy exactly |
| 36 | +- Fixed integer arithmetic for dtype casting (matches NumPy's template approach) |
| 37 | +- Inlined type-specific loops matching NumPy's `arraytypes.c.src` implementation |
| 38 | + |
| 39 | +**Type System Overhaul:** |
| 40 | +- Added `NPTypeHierarchy.cs` encoding NumPy's exact type tree structure (from `multiarraymodule.c`) |
| 41 | +- `Bool` is NOT under `Number` (NumPy 2.x critical behavior) |
| 42 | +- `issubdtype(int32, int64)` correctly returns `False` (concrete types are siblings) |
| 43 | + |
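The two hierarchy rules above can be illustrated with a small parent-link table. This is a sketch in plain Python, assuming only a slice of NumPy's abstract scalar hierarchy; the `PARENT` dictionary and this `issubdtype` helper are hypothetical and do not reflect how `NPTypeHierarchy.cs` is implemented.

```python
# Child -> parent links for a small slice of NumPy's scalar type tree.
PARENT = {
    "number": "generic",
    "bool": "generic",  # bool hangs directly off generic, not number
    "integer": "number",
    "inexact": "number",
    "signedinteger": "integer",
    "unsignedinteger": "integer",
    "floating": "inexact",
    "int32": "signedinteger",
    "int64": "signedinteger",
    "uint8": "unsignedinteger",
    "float64": "floating",
}


def issubdtype(child, ancestor):
    """Return True if `ancestor` is `child` itself or one of its ancestors."""
    node = child
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT.get(node)  # None once we walk past the root
    return False


print(issubdtype("int32", "int64"))    # False: concrete types are siblings
print(issubdtype("bool", "number"))    # False: bool is not a number
print(issubdtype("int32", "integer"))  # True
```

Because the check only walks upward, two concrete types can never be subtypes of one another, which is the behavior the `issubdtype(int32, int64)` bullet describes.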
| 44 | +**New Type Introspection APIs:** |
| 45 | +- `np.can_cast(from, to, casting)` - Full NumPy-compatible type casting checks |
| 46 | +- `np.promote_types(type1, type2)` - Type promotion following NumPy rules |
| 47 | +- `np.result_type(*arrays_and_dtypes)` - Result type inference |
| 48 | +- `np.min_scalar_type(value)` - Minimum scalar type for a value |
| 49 | +- `np.common_type(*arrays)` - Common type for arrays |
| 50 | +- `np.issubdtype(arg1, arg2)` - Type hierarchy checking |
| 51 | +- `np.isreal()`, `np.iscomplex()`, `np.isrealobj()`, `np.iscomplexobj()` |
| 52 | +- `np.finfo(dtype)` - Machine limits for floating-point types |
| 53 | +- `np.iinfo(dtype)` - Machine limits for integer types |
| 54 | + |
| 55 | +**Container Protocol (Python `in` operator):** |
| 56 | +- `NDArray.Contains()` now propagates broadcasting errors (matches NumPy's `__contains__`) |
| 57 | +- `[1,2] in np.array([1,2,3])` now throws `IncorrectShapeException` |
| 58 | +- Type mismatch returns `False` (e.g., `"hello" in np.array([1,2,3])`) |
| 59 | + |
| 60 | +--- |
| 61 | + |
| 62 | +### New NDArray Methods |
| 63 | + |
| 64 | +- **`NDArray.tolist()`** - Convert NDArray to nested lists (NumPy parity) |
| 65 | +- **`NDArray.item(*args)`** - Copy an element of the array to a standard scalar (NumPy parity) |
| 66 | +- **`np.frombuffer()`** - Complete rewrite with full NumPy-compatible signature: |
| 67 | + - `count` and `offset` parameters |
| 68 | + - Big-endian byte swap support via dtype strings (`">u4"`, `">i4"`) |
| 69 | + - `ArraySegment<byte>`, `Memory<byte>`, `IntPtr`, `void*` overloads |
| 70 | + - Optional dispose callback for native memory ownership |
| 71 | + - View semantics (pinned buffer, modifications affect original) |
| 72 | + |
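For readers unfamiliar with dtype strings such as `">u4"`, the sketch below uses Python's standard `struct` module (not NumSharp) to show what a big-endian read with an `offset` and a `count` amounts to. The buffer contents and the 4-byte header are made up for illustration.

```python
import struct

# Four raw bytes read as an unsigned 32-bit integer in both byte orders.
raw = bytes([0x00, 0x00, 0x01, 0x00])
big = struct.unpack(">I", raw)[0]     # ">u4": most significant byte first
little = struct.unpack("<I", raw)[0]  # "<u4": least significant byte first

# Reversing the bytes and reading little-endian matches the big-endian read,
# which is the byte swap a little-endian host performs for ">u4" data.
swapped = int.from_bytes(raw[::-1], "little")

# offset/count: skip a 4-byte header, then read two big-endian u4 values.
buffer = b"HEAD" + struct.pack(">2I", 1, 258)
values = struct.unpack_from(">2I", buffer, 4)

print(big, little, swapped)  # 256 65536 256
print(values)                # (1, 258)
```

Note that a byte-swapped read generally cannot alias the original bytes, so how view semantics and big-endian dtypes combine in `np.frombuffer` is something to confirm against the API documentation rather than infer from this sketch.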
| 73 | +--- |
| 74 | + |
| 75 | +## Performance Improvements |
| 76 | + |
| 77 | +### SIMD-Optimized MatMul |
| 78 | +- **35-100x speedup** over scalar path for matrix multiplication |
| 79 | +- Added `SimdMatMul` with V128/V256/V512 vector support |
| 80 | +- Long indexing support for matrices >2GB |
| 81 | + |
| 82 | +### SIMD NaN Statistics |
| 83 | +- `nansum`, `nanmean`, `nanstd`, `nanvar`, `nanmin`, `nanmax` optimized with SIMD |
| 84 | +- Added `ILKernelGenerator.Reduction.NaN.cs` (1,097 lines of IL generation) |
| 85 | + |
| 86 | +### General SIMD Improvements |
| 87 | +- All reduction operations (sum, prod, min, max, mean, std, var) with SIMD paths |
| 88 | +- Scan operations (cumsum, cumprod) with SIMD optimization |
| 89 | +- Boolean reductions (any, all) with SIMD fast paths |
| 90 | + |
| 91 | +### np.arange() Performance |
| 92 | +- Inlined type-specific loops (no delegate overhead per element) |
| 93 | +- Direct pointer casts matching NumPy's template-generated fill functions |
| 94 | + |
| 95 | +--- |
| 96 | + |
| 97 | +## Bug Fixes |
| 98 | + |
| 99 | +### Core Functionality |
| 100 | +- **`np.any/np.all` with axis parameter**: Now supports 0D (scalar) arrays with `axis=0` or `axis=-1` |
| 101 | +- **`np.arange` negative step**: Fixed to return `[10,8,6,4,2]` instead of `[9,7,5,3,1]` for `arange(10,0,-2)` |
| 102 | +- **Scalar broadcast assignment**: Fixed cross-dtype conversion |
| 103 | +- **Fancy indexing**: Support for all integer dtypes (Int16, Int32, Int64, etc.) |
| 104 | +- **`NDArray.unique()`**: Fixed for long indexing support |
| 105 | +- **`np.repeat`**: Fixed dtype handling and long count support |
| 106 | +- **`np.random.choice`**: Fixed for long population sizes |
| 107 | +- **`np.shuffle`**: Aligned with NumPy's legacy API (removed the `axis` parameter, which the legacy NumPy `shuffle` does not have) |
| 108 | +- **`np.random.standard_normal`**: Fixed typo in API |
| 109 | +- **`np.random()`**: Added alias for uniform distribution |
| 110 | + |
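The `np.arange` negative-step fix above follows NumPy's documented rule that the result has `ceil((stop - start) / step)` elements, each equal to `start + i * step`. A minimal pure-Python sketch of that rule for integer inputs (the function name `arange_int` is hypothetical):

```python
def arange_int(start, stop, step):
    """Integer arange sketch: ceil((stop - start) / step) elements from start."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Exact integer ceiling division; works for negative steps as well.
    length = max(0, -(-(stop - start) // step))
    return [start + i * step for i in range(length)]


print(arange_int(10, 0, -2))  # [10, 8, 6, 4, 2]
print(arange_int(0, 10, 3))   # [0, 3, 6, 9]
print(arange_int(0, 10, -1))  # []
```

The first element is always `start` itself, so a result beginning at 9 for `arange(10, 0, -2)` could only come from an off-by-one in the starting value, not from the length calculation.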
| 111 | +### ILKernel Fixes |
| 112 | +- Fixed numerous int32 overflow issues in loop counters |
| 113 | +- Fixed `TransformOffset` calculations for >2GB arrays |
| 114 | +- Fixed SIMD helper functions for long indexing |
| 115 | + |
| 116 | +### Test Fixes |
| 117 | +- Fixed 71 test failures from NumPy 2.x Int64 alignment |
| 118 | +- Removed `[OpenBugs]` from 74 now-passing tests |
| 119 | +- Fixed dtype-specific getter mismatches throughout test suite |
| 120 | + |
| 121 | +--- |
| 122 | + |
| 123 | +## Refactoring |
| 124 | + |
| 125 | +### ValueType to Object Migration |
| 126 | +- All scalar return types migrated from `ValueType` to `object` |
| 127 | +- `NPTypeCode.GetDefaultValue()` returns `object` |
| 128 | +- All operators migrated to NumPy-aligned object pattern |
| 129 | +- NDArray null checks converted from `== null` to `is null` pattern |
| 130 | + |
| 131 | +### Type System Consolidation |
| 132 | +- `can_cast` derived from type promotion tables (replaced 80+ lines of switch cases) |
| 133 | +- Single source of truth for type hierarchy (`NPTypeHierarchy`) |
| 134 | +- Removed duplicate `TypeKind` enum and category helper methods |
| 135 | + |
| 136 | +### Code Cleanup |
| 137 | +- Removed unused `Fx.cs` (953 lines of pooling code) |
| 138 | +- Removed `KernelKey.cs`, `KernelSignatures.cs`, `SimdThresholds.cs`, `TypeRules.cs` |
| 139 | +- Removed `StorageType.cs`, `np.linalg.norm.cs` (incomplete LAPACK bindings) |
| 140 | +- Removed `LongList<T>` utility class |
| 141 | +- Removed LINQ extension files (`IEnumeratorExtensions.cs`, `MaxBy.cs`) |
| 142 | + |
| 143 | +--- |
| 144 | + |
| 145 | +## Documentation |
| 146 | + |
| 147 | +### New Guides |
| 148 | +- **Buffering, Arrays and Unmanaged Memory** - Memory architecture, view vs copy semantics, ownership model |
| 149 | +- **IL Kernel Generation** - How ILKernelGenerator works with SIMD |
| 150 | +- **NumPy .npy/.npz Format Reference** - Binary format implementation details |
| 151 | +- **Int64 Indexing Migration Guide** - Patterns for large array support |
| 152 | + |
| 153 | +### API Documentation |
| 154 | +- Complete typing function documentation with NumPy alignment notes |
| 155 | +- `np.frombuffer` overloads and ownership model documentation |
| 156 | + |
| 157 | +--- |
| 158 | + |
| 159 | +## Test Improvements |
| 160 | + |
| 161 | +### New Test Infrastructure |
| 162 | +- `[HighMemory]` attribute for tests requiring 8GB+ RAM |
| 163 | +- `[SkipOnLowMemory]` runtime memory check attribute |
| 164 | +- `TestMemoryTracker` for diagnosing CI OOM failures |
| 165 | +- Proper TUnit category exclusion in CI |
| 166 | + |
| 167 | +### New Test Coverage |
| 168 | +- **500+ new battle tests** validated against actual NumPy 2.x output |
| 169 | +- `LongIndexingSmokeTest` - Covers 96 `np.*` functions with 1M-element arrays |
| 170 | +- `LongIndexingBroadcastTest` - 2.36 billion element broadcast iterations |
| 171 | +- `LongIndexingMasterTest` - Full 2.4GB array allocations |
| 172 | +- Comprehensive `np.arange` battle tests (50+ cases) |
| 173 | +- Container protocol tests (100+ cases) |
| 174 | +- Type hierarchy tests (74 cases) |
| 175 | +- All typing functions have battle test files |
| 176 | + |
| 177 | +### Test Fixes |
| 178 | +- Fixed 71 tests for NumPy 2.x Int64 alignment |
| 179 | +- Re-enabled 74 tests that were still marked `[OpenBugs]` but now pass |
| 180 | +- CI workflow updated to properly exclude `[HighMemory]` tests on Ubuntu |
| 181 | + |
| 182 | +--- |
| 183 | + |
| 184 | +## Breaking Changes |
| 185 | + |
| 186 | +| Change | Migration | |
| 187 | +|--------|-----------| |
| 188 | +| `Shape.dimensions` changed to `long[]` | Update code accessing dimensions directly | |
| 189 | +| `Shape.strides` changed to `long[]` | Update code accessing strides directly | |
| 190 | +| `NDArray.size` changed to `long` | Use `long` or cast to `int` where safe | |
| 191 | +| `np.arange(int)` returns `Int64` | Use `.astype(np.int32)` if Int32 needed | |
| 192 | +| `Contains()` throws on shape mismatch | Wrap in try-catch if relying on `False` | |
| 193 | +| `ValueType` returns changed to `object` | Cast return values explicitly | |
| 194 | +| `np.shuffle` removed `axis` parameter | The parameter was non-functional; call `shuffle` without `axis` | |
| 195 | + |
| 196 | +--- |
| 197 | + |
| 198 | +## Summary Statistics |
| 199 | + |
| 200 | +| Metric | Value | |
| 201 | +|--------|-------| |
| 202 | +| Commits | 166 | |
| 203 | +| Files Changed | 456 | |
| 204 | +| Lines Added | 95,375 | |
| 205 | +| Lines Removed | 8,238 | |
| 206 | +| New C# Files | 50+ | |
| 207 | +| New Test Files | 30+ | |
| 208 | +| Battle Tests Added | 500+ | |
| 209 | +| Previously Failing Tests Fixed | 145 | |