
Commit eddce70

refactor(test): add [LargeMemoryTest] attribute inheriting from [OpenBugs]
Created LargeMemoryTestAttribute that inherits from OpenBugsAttribute, automatically gaining the "OpenBugs" category for CI exclusion. This provides semantic clarity - tests marked [LargeMemoryTest] are not bugs, they're just too memory-intensive for CI runners.

Updated:
- AllocationTests: [HighMemory][OpenBugs] -> [LargeMemoryTest]
- LongIndexingBroadcastTest: [HighMemory][OpenBugs] -> [LargeMemoryTest]
1 parent 10e2e14 commit eddce70

4 files changed

Lines changed: 241 additions & 10 deletions


Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
1+
# NumSharp Release Notes - Long Indexing Branch
2+
3+
**456 files changed | 95,375 insertions | 8,238 deletions**
4+
5+
---
6+
7+
## Major Features
8+
9+
### Int64/Long Indexing Support (Arrays >2GB)
10+
11+
Complete migration from `int` to `long` indexing across the entire codebase, enabling arrays larger than 2.1 billion elements (~2GB for byte arrays, ~16GB for doubles).
12+
13+
**Core Changes:**
14+
- `Shape.dimensions`, `Shape.strides`, `Shape.size`, `Shape.offset` changed to `long[]` / `long`
15+
- `NDArray.size`, `NDArray.len`, all indexers changed to `long`
16+
- `ArraySlice<T>`, `UnmanagedMemoryBlock<T>`, `UnmanagedStorage` migrated to `long` indexing
17+
- `NDIterator`, `MultiIterator` updated for `long` coordinates and offsets
18+
- All 20+ ILKernelGenerator partial classes updated for `long` loop counters and offsets
19+
- Added `UnmanagedSpan<T>` (ported from dotnet/runtime) for Span-like semantics with `long` length
20+
- Added `LongIntroSort` for sorting large arrays
21+
- `Hashset<T>` upgraded to long-based indexing with 33% growth for large collections
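The need for `long` offsets can be seen with a small, self-contained sketch. The `GetOffset` helper below is hypothetical and is not NumSharp's implementation; it only shows that the flat offset of the last element in a 50,000 x 50,000 array no longer fits in an `int`.

```csharp
using System;

// Hypothetical helper (not NumSharp's code): the flat offset is
// sum(coords[i] * strides[i]), computed in 64-bit arithmetic.
static long GetOffset(long[] coords, long[] strides)
{
    long offset = 0;
    for (int i = 0; i < coords.Length; i++)
        offset += coords[i] * strides[i];
    return offset;
}

// A 50,000 x 50,000 C-contiguous array holds 2.5 billion elements,
// which exceeds int.MaxValue (2,147,483,647).
long[] dims = { 50_000, 50_000 };
long[] strides = { dims[1], 1 };
long last = GetOffset(new long[] { 49_999, 49_999 }, strides);

Console.WriteLine(last);                  // 2499999999
Console.WriteLine(last > int.MaxValue);   // True
Console.WriteLine(unchecked((int)last));  // wraps to a negative value
```

With `int` strides and counters the same computation would silently wrap, which is the class of bug the migration is meant to remove.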
22+
23+
**API Additions for Long Indexing:**
24+
- `long[]` overloads for `NDArray.GetInt32()`, `GetInt64()`, `GetSingle()`, etc.
25+
- `long` population size support in `np.random.choice`
26+
- `long` repeat counts in `np.repeat`
27+
- All random sampling functions support `long[]` size parameters
28+
29+
---
30+
31+
### NumPy 2.x Alignment
32+
33+
**`np.arange()` Now Returns Int64:**
34+
- Integer inputs now return `Int64` arrays (NumPy 2.x behavior)
35+
- Fixed negative step behavior to match NumPy exactly
36+
- Fixed integer arithmetic for dtype casting (matches NumPy's template approach)
37+
- Inlined type-specific loops matching NumPy's `arraytypes.c.src` implementation
38+
39+
**Type System Overhaul:**
40+
- Added `NPTypeHierarchy.cs` encoding NumPy's exact type tree structure (from `multiarraymodule.c`)
41+
- `Bool` is NOT under `Number` (NumPy 2.x critical behavior)
42+
- `issubdtype(int32, int64)` correctly returns `False` (concrete types are siblings)
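The hierarchy rules above can be illustrated with a minimal sketch. The string-keyed parent table and the `IsSubDType` function are assumptions made for illustration; they do not reflect how `NPTypeHierarchy.cs` is written, and only a small part of NumPy's type tree is included.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parent table mirroring part of NumPy's scalar type tree.
var parent = new Dictionary<string, string>
{
    ["number"] = "generic",
    ["bool"] = "generic",           // bool sits beside number, not under it
    ["integer"] = "number",
    ["signedinteger"] = "integer",
    ["int32"] = "signedinteger",
    ["int64"] = "signedinteger",    // sibling of int32
    ["inexact"] = "number",
    ["floating"] = "inexact",
    ["float64"] = "floating",
};

// Walk up from `type` until `ancestor` is found or the root is passed.
bool IsSubDType(string type, string ancestor)
{
    var current = type;
    while (true)
    {
        if (current == ancestor) return true;
        if (!parent.TryGetValue(current, out var next)) return false;
        current = next;
    }
}

Console.WriteLine(IsSubDType("int32", "integer")); // True
Console.WriteLine(IsSubDType("bool", "number"));   // False
Console.WriteLine(IsSubDType("int32", "int64"));   // False
```

Because concrete types are leaves, one concrete type is never a subtype of another; only abstract ancestors such as `integer` or `number` match.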
43+
44+
**New Type Introspection APIs:**
45+
- `np.can_cast(from, to, casting)` - Full NumPy-compatible type casting checks
46+
- `np.promote_types(type1, type2)` - Type promotion following NumPy rules
47+
- `np.result_type(*arrays_and_dtypes)` - Result type inference
48+
- `np.min_scalar_type(value)` - Minimum scalar type for a value
49+
- `np.common_type(*arrays)` - Common type for arrays
50+
- `np.issubdtype(arg1, arg2)` - Type hierarchy checking
51+
- `np.isreal()`, `np.iscomplex()`, `np.isrealobj()`, `np.iscomplexobj()`
52+
- `np.finfo(dtype)` - Machine limits for floating-point types
53+
- `np.iinfo(dtype)` - Machine limits for integer types
54+
55+
**Container Protocol (Python `in` operator):**
56+
- `NDArray.Contains()` now propagates broadcasting errors (matches NumPy's `__contains__`)
57+
- `[1,2] in np.array([1,2,3])` now throws `IncorrectShapeException`
58+
- Type mismatch returns `False` (e.g., `"hello" in np.array([1,2,3])`)
59+
60+
---
61+
62+
### New NDArray Methods
63+
64+
- **`NDArray.tolist()`** - Convert NDArray to nested lists (NumPy parity)
65+
- **`NDArray.item(*args)`** - Copy an element to a standard scalar value (mirrors NumPy's `ndarray.item`)
66+
- **`np.frombuffer()`** - Complete rewrite with full NumPy-compatible signature:
67+
- `count` and `offset` parameters
68+
- Big-endian byte swap support via dtype strings (`">u4"`, `">i4"`)
69+
- `ArraySegment<byte>`, `Memory<byte>`, `IntPtr`, `void*` overloads
70+
- Optional dispose callback for native memory ownership
71+
- View semantics (pinned buffer, modifications affect original)
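As a rough illustration of what a big-endian dtype string such as `">u4"` implies, the sketch below decodes a byte buffer with the standard `BinaryPrimitives` API. It copies values into a new array, so it does not reproduce the view semantics listed above, and it makes no claim about how `np.frombuffer` performs the swap internally.

```csharp
using System;
using System.Buffers.Binary;

byte[] buffer = { 0x00, 0x00, 0x01, 0x00, 0x12, 0x34, 0x56, 0x78 };

// Decode the buffer as big-endian unsigned 32-bit integers (dtype ">u4").
var values = new uint[buffer.Length / sizeof(uint)];
for (int i = 0; i < values.Length; i++)
{
    values[i] = BinaryPrimitives.ReadUInt32BigEndian(
        buffer.AsSpan(i * sizeof(uint), sizeof(uint)));
}

Console.WriteLine(values[0]); // 256
Console.WriteLine(values[1]); // 305419896
```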
72+
73+
---
74+
75+
## Performance Improvements
76+
77+
### SIMD-Optimized MatMul
78+
- **35-100x speedup** over scalar path for matrix multiplication
79+
- Added `SimdMatMul` with V128/V256/V512 vector support
80+
- Long indexing support for matrices >2GB
81+
82+
### SIMD NaN Statistics
83+
- `nansum`, `nanmean`, `nanstd`, `nanvar`, `nanmin`, `nanmax` optimized with SIMD
84+
- Added `ILKernelGenerator.Reduction.NaN.cs` (1,097 lines of IL generation)
85+
86+
### General SIMD Improvements
87+
- All reduction operations (sum, prod, min, max, mean, std, var) with SIMD paths
88+
- Scan operations (cumsum, cumprod) with SIMD optimization
89+
- Boolean reductions (any, all) with SIMD fast paths
90+
91+
### np.arange() Performance
92+
- Inlined type-specific loops (no delegate overhead per element)
93+
- Direct pointer casts matching NumPy's template-generated fill functions
94+
95+
---
96+
97+
## Bug Fixes
98+
99+
### Core Functionality
100+
- **`np.any/np.all` with axis parameter**: Now supports 0D (scalar) arrays with `axis=0` or `axis=-1`
101+
- **`np.arange` negative step**: Fixed to return `[10,8,6,4,2]` instead of `[9,7,5,3,1]` for `arange(10,0,-2)`
102+
- **Scalar broadcast assignment**: Fixed cross-dtype conversion
103+
- **Fancy indexing**: Support for all integer dtypes (Int16, Int32, Int64, etc.)
104+
- **`NDArray.unique()`**: Fixed for long indexing support
105+
- **`np.repeat`**: Fixed dtype handling and long count support
106+
- **`np.random.choice`**: Fixed for long population sizes
107+
- **`np.shuffle`**: Aligned with NumPy's legacy API (removed the `axis` parameter, which the legacy `shuffle` does not have)
108+
- **`np.random.standard_normal`**: Fixed typo in API
109+
- **`np.random()`**: Added alias for uniform distribution
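The `np.arange` negative-step fix follows from NumPy's length rule, `ceil((stop - start) / step)`, with elements generated from `start`. The `Arange` helper below is a hypothetical stand-in for illustration, not the NumSharp implementation.

```csharp
using System;

// Hypothetical helper illustrating NumPy's arange rule:
// length = max(0, ceil((stop - start) / step)), element i = start + i * step.
static long[] Arange(long start, long stop, long step)
{
    if (step == 0) throw new ArgumentException("step must be non-zero", nameof(step));
    long length = Math.Max(0L, (long)Math.Ceiling((stop - start) / (double)step));
    var result = new long[length];
    for (long i = 0; i < length; i++)
        result[i] = start + i * step;
    return result;
}

Console.WriteLine(string.Join(",", Arange(10, 0, -2))); // 10,8,6,4,2
```

Starting from `start` rather than from `stop + step` is what separates the corrected output from the earlier `[9,7,5,3,1]`.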
110+
111+
### ILKernel Fixes
112+
- Fixed numerous int32 overflow issues in loop counters
113+
- Fixed `TransformOffset` calculations for >2GB arrays
114+
- Fixed SIMD helper functions for long indexing
115+
116+
### Test Fixes
117+
- Fixed 71 test failures from NumPy 2.x Int64 alignment
118+
- Removed `[OpenBugs]` from 74 now-passing tests
119+
- Fixed dtype-specific getter mismatches throughout test suite
120+
121+
---
122+
123+
## Refactoring
124+
125+
### ValueType to Object Migration
126+
- All scalar return types migrated from `ValueType` to `object`
127+
- `NPTypeCode.GetDefaultValue()` returns `object`
128+
- All operators migrated to NumPy-aligned object pattern
129+
- NDArray null checks converted from `== null` to `is null` pattern
130+
131+
### Type System Consolidation
132+
- `can_cast` derived from type promotion tables (replaced 80+ lines of switch cases)
133+
- Single source of truth for type hierarchy (`NPTypeHierarchy`)
134+
- Removed duplicate `TypeKind` enum and category helper methods
135+
136+
### Code Cleanup
137+
- Removed unused `Fx.cs` (953 lines of pooling code)
138+
- Removed `KernelKey.cs`, `KernelSignatures.cs`, `SimdThresholds.cs`, `TypeRules.cs`
139+
- Removed `StorageType.cs`, `np.linalg.norm.cs` (incomplete LAPACK bindings)
140+
- Removed `LongList<T>` utility class
141+
- Removed LINQ extension files (`IEnumeratorExtensions.cs`, `MaxBy.cs`)
142+
143+
---
144+
145+
## Documentation
146+
147+
### New Guides
148+
- **Buffering, Arrays and Unmanaged Memory** - Memory architecture, view vs copy semantics, ownership model
149+
- **IL Kernel Generation** - How ILKernelGenerator works with SIMD
150+
- **NumPy .npy/.npz Format Reference** - Binary format implementation details
151+
- **Int64 Indexing Migration Guide** - Patterns for large array support
152+
153+
### API Documentation
154+
- Complete typing function documentation with NumPy alignment notes
155+
- `np.frombuffer` overloads and ownership model documentation
156+
157+
---
158+
159+
## Test Improvements
160+
161+
### New Test Infrastructure
162+
- `[HighMemory]` attribute for tests requiring 8GB+ RAM
163+
- `[SkipOnLowMemory]` runtime memory check attribute
164+
- `TestMemoryTracker` for diagnosing CI OOM failures
165+
- Proper TUnit category exclusion in CI
166+
167+
### New Test Coverage
168+
- **~500+ new battle tests** validated against actual NumPy 2.x output
169+
- `LongIndexingSmokeTest` - Covers 96 `np.*` functions with 1M-element arrays
170+
- `LongIndexingBroadcastTest` - 2.36 billion element broadcast iterations
171+
- `LongIndexingMasterTest` - Full 2.4GB array allocations
172+
- Comprehensive `np.arange` battle tests (50+ cases)
173+
- Container protocol tests (100+ cases)
174+
- Type hierarchy tests (74 cases)
175+
- All typing functions have battle test files
176+
177+
### Test Fixes
178+
- Fixed 71 tests for NumPy 2.x Int64 alignment
179+
- Re-enabled 74 tests that were marked `[OpenBugs]` but now pass
180+
- CI workflow updated to properly exclude `[HighMemory]` tests on Ubuntu
181+
182+
---
183+
184+
## Breaking Changes
185+
186+
| Change | Migration |
187+
|--------|-----------|
188+
| `Shape.dimensions` changed to `long[]` | Update code accessing dimensions directly |
189+
| `Shape.strides` changed to `long[]` | Update code accessing strides directly |
190+
| `NDArray.size` changed to `long` | Use `long` or cast to `int` where safe |
191+
| `np.arange(int)` returns `Int64` | Use `.astype(np.int32)` if Int32 needed |
192+
| `Contains()` throws on shape mismatch | Wrap in try-catch if relying on `False` |
193+
| `ValueType` returns changed to `object` | Cast return values explicitly |
194+
| `np.shuffle` removed axis parameter | Was non-functional, use correct NumPy API |
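For the `NDArray.size` row, "cast to `int` where safe" can be made explicit with a checked conversion. The `ToInt32Size` helper is hypothetical and only sketches one way to fail loudly instead of wrapping silently.

```csharp
using System;

// Hypothetical helper: narrow a 64-bit element count to int, throwing
// OverflowException instead of wrapping when the value is too large.
static int ToInt32Size(long size) => checked((int)size);

Console.WriteLine(ToInt32Size(1_000_000)); // 1000000

try
{
    ToInt32Size(3_000_000_000L);
}
catch (OverflowException)
{
    Console.WriteLine("size does not fit in int");
}
```

Code that may see arrays above 2.1 billion elements is better off keeping the value as `long` throughout.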
195+
196+
---
197+
198+
## Summary Statistics
199+
200+
| Metric | Value |
201+
|--------|-------|
202+
| Commits | 166 |
203+
| Files Changed | 456 |
204+
| Lines Added | 95,375 |
205+
| Lines Removed | 8,238 |
206+
| New C# Files | 50+ |
207+
| New Test Files | 30+ |
208+
| Battle Tests Added | ~500+ |
209+
| Previously Failing Tests Fixed | 145 |

test/NumSharp.UnitTest/Backends/Unmanaged/AllocationTests.cs

Lines changed: 3 additions & 5 deletions
@@ -6,12 +6,10 @@
6 6
namespace NumSharp.UnitTest.Backends.Unmanaged
7 7
{
8 8
/// <summary>
9-
/// Tests for large memory allocations.
10-
/// Marked as [OpenBugs] because they allocate 4-44GB of memory and
11-
/// cause OOM crashes on CI runners.
9+
/// Tests for large memory allocations (4-44GB).
10+
/// Marked as [LargeMemoryTest] to auto-exclude from CI.
12 11
/// </summary>
13-
[HighMemory]
14-
[OpenBugs]
12+
[LargeMemoryTest]
15 13
public class AllocationTests
16 14
{
17 15
private const long onegb = 1_073_741_824;

test/NumSharp.UnitTest/LongIndexing/LongIndexingBroadcastTest.cs

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -21,12 +21,10 @@ namespace NumSharp.UnitTest.LongIndexing;
21 21
/// - Operations that produce output (add, multiply, etc.) allocate full-size output arrays
22 22
/// even when input is broadcast, causing OutOfMemoryException
23 23
///
24-
/// NOTE: Marked [OpenBugs] because iterating over 2.36 billion elements causes
25-
/// excessive CPU/memory pressure when TUnit runs tests in parallel, leading to
26-
/// OOM crashes on CI runners.
24+
/// NOTE: Marked [LargeMemoryTest] because iterating over 2.36 billion elements causes
25+
/// excessive CPU/memory pressure when TUnit runs tests in parallel.
27 26
/// </summary>
28-
[HighMemory]
29-
[OpenBugs]
27+
[LargeMemoryTest]
30 28
public class LongIndexingBroadcastTest
31 29
{
32 30
/// <summary>

test/NumSharp.UnitTest/TestCategory.cs

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -289,3 +289,29 @@ public class HighMemoryAttribute : CategoryAttribute
289 289
{
290 290
public HighMemoryAttribute() : base(TestCategory.HighMemory) { }
291 291
}
292+
293+
/// <summary>
294+
/// Attribute for tests that allocate large amounts of memory and crash CI runners.
295+
/// Inherits from <see cref="OpenBugsAttribute"/> so tests are automatically excluded
296+
/// from CI via the <c>[Category!=OpenBugs]</c> filter.
297+
///
298+
/// <para>Use this instead of [OpenBugs] for memory-intensive tests that aren't actually bugs,
299+
/// just too heavy for CI runners.</para>
300+
/// </summary>
301+
/// <example>
302+
/// <code>
303+
/// [Test]
304+
/// [LargeMemoryTest] // Auto-excluded from CI
305+
/// public async Task Allocate_4GB()
306+
/// {
307+
/// var arr = np.ones&lt;int&gt;((4L * 1024 * 1024 * 1024 / 4)); // 4GB
308+
/// }
309+
/// </code>
310+
/// </example>
311+
public class LargeMemoryTestAttribute : OpenBugsAttribute
312+
{
313+
public LargeMemoryTestAttribute()
314+
{
315+
// Inherits OpenBugs category for CI exclusion
316+
}
317+
}
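The inheritance trick can be checked in isolation with a small reflection sketch. The `CategoryAttribute`, `OpenBugsAttribute`, and test class below are simplified stand-ins written for this example (the real ones come from TUnit and `TestCategory.cs`), and the lookup only approximates how a category filter might resolve attributes; whether the CI filter behaves identically is an assumption.

```csharp
using System;
using System.Linq;

// Collect categories from any attribute assignable to CategoryAttribute.
var categories = typeof(AllocationTests)
    .GetCustomAttributes(typeof(CategoryAttribute), inherit: true)
    .Cast<CategoryAttribute>()
    .Select(a => a.Category)
    .ToArray();

Console.WriteLine(string.Join(",", categories)); // OpenBugs

// Stand-in types (hypothetical, simplified for the sketch).
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = true)]
public class CategoryAttribute : Attribute
{
    public CategoryAttribute(string category) => Category = category;
    public string Category { get; }
}

public class OpenBugsAttribute : CategoryAttribute
{
    public OpenBugsAttribute() : base("OpenBugs") { }
}

// No constructor body needed: the base constructor supplies the category.
public class LargeMemoryTestAttribute : OpenBugsAttribute { }

[LargeMemoryTest]
public class AllocationTests { }
```

Since `LargeMemoryTestAttribute` is an `OpenBugsAttribute`, a lookup by base attribute type reports the `OpenBugs` category even though the class is only tagged `[LargeMemoryTest]`.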
