Problem
CsvReader.ReadFromMemory(ReadOnlyMemory<char>) and CsvReader.ReadFromMemoryOptimized(ReadOnlyMemory<char>, ...) terminate enumeration on the first blank line they encounter in the middle of the source, dropping every record that comes after.
Repro:
var csv = "a,b,c\r\n1,2,3\r\n\r\n4,5,6\r\n";
foreach (var row in CsvReader.ReadFromMemoryOptimized(csv.AsMemory()))
Console.WriteLine(row.Raw);
// Output: just "1,2,3" -- the "4,5,6" record is silently dropped.
Compare with CsvReader.Read(TextReader) over the same data, which correctly skips the blank line (via the default CsvOptions.SkipRow predicate (row, idx) => row.Span.IsEmpty || row.Span[0] == '#') and yields both records.
Root cause
MemorySliceLineSource.TryReadLine (in Csv/CsvReader.Engine.cs) and MemoryReaderLineSource.TryReadLine both return false when the just-read line is empty. The engine treats false as end-of-stream, so a blank middle-of-stream line stops the loop.
// MemorySliceLineSource.TryReadLine
if (slice.IsEmpty)
{
line = default;
return false; // <-- terminates enumeration, NOT EOF
}
The MemoryReaderLineSource case has the same shape:
line = csv.ReadLine(ref position);
return !line.IsEmpty; // <-- returns false on blank line
The Read / ReadAsSpan / ReadAsync paths don't have this bug because TextReader.ReadLine() distinguishes blank line (returns "") from EOF (returns null); the engine sees the empty string as a real line, passes it to SkipRow, which skips it.
Pre-existing
This is not a regression introduced by #118 — it's a pre-existing behavior in ReadLineOptimized and the MemoryText.ReadLine(ref position) extension. The pre-refactor ReadFromMemoryOptimizedImpl did while ((line = ReadLineOptimized(...)).IsEmpty == false) (where the optimized reader returned Empty for both blank lines AND EOF). The pre-refactor ReadFromMemory did the same with csv.ReadLine(ref position). #118 preserved this verbatim per its scope ("unify the loops, preserve behavior") and did not attempt to fix it.
Gemini flagged this during the #118 review (#121).
Proposed fix
TryReadLine must distinguish "blank line in middle of stream" from "EOF". For both memory-based sources:
- Return
true whenever a line was successfully sliced — even if it's zero-length.
- Return
false only when position >= csv.Length BEFORE attempting to read.
For MemorySliceLineSource:
public bool TryReadLine(out MemoryText line, out string? lineString)
{
lineString = null;
if (position >= csv.Length)
{
line = default;
return false;
}
var span = csv.Span.Slice(position);
var newlineIndex = span.IndexOfAny('\n', '\r');
if (newlineIndex == -1)
{
line = csv.Slice(position);
position = csv.Length;
return true; // even if empty, this is the last line
}
line = csv.Slice(position, newlineIndex);
position += newlineIndex;
// ... consume the newline sequence ...
return true; // even for blank lines
}
For MemoryReaderLineSource: return true unconditionally after a successful csv.ReadLine(ref position) (the EOF check at the top of the method already gates entry).
After this change, the default CsvOptions.SkipRow predicate will correctly skip blank middle-of-stream lines, and the memory-based paths will match the TextReader-based paths' behavior.
Acceptance
-
A new test in Csv.Tests/EngineUnificationTests.cs parameterized over all 5 reader paths, asserting that:
a,b,c
1,2,3
<-- blank line
4,5,6
produces two records (1,2,3 and 4,5,6) on every path, not one.
-
A negative test asserting that if the user sets CsvOptions.SkipRow = (_, _) => false, the blank line is returned as an empty record (3 records total on the HeaderPresent path: header + 1,2,3 + empty + 4,5,6 minus the header).
Risk
Low. The fix changes the semantic of TryReadLine from "false on empty OR EOF" to "false only on EOF". The engine's existing SkipRow flow handles the empty-line case correctly downstream. The cross-path matrix test in EngineUnificationTests.cs would pin the new uniformity.
Problem
CsvReader.ReadFromMemory(ReadOnlyMemory<char>)andCsvReader.ReadFromMemoryOptimized(ReadOnlyMemory<char>, ...)terminate enumeration on the first blank line they encounter in the middle of the source, dropping every record that comes after.Repro:
Compare with
CsvReader.Read(TextReader)over the same data, which correctly skips the blank line (via the defaultCsvOptions.SkipRowpredicate(row, idx) => row.Span.IsEmpty || row.Span[0] == '#') and yields both records.Root cause
MemorySliceLineSource.TryReadLine(inCsv/CsvReader.Engine.cs) andMemoryReaderLineSource.TryReadLineboth returnfalsewhen the just-read line is empty. The engine treatsfalseas end-of-stream, so a blank middle-of-stream line stops the loop.The
MemoryReaderLineSourcecase has the same shape:The
Read/ReadAsSpan/ReadAsyncpaths don't have this bug becauseTextReader.ReadLine()distinguishes blank line (returns"") from EOF (returnsnull); the engine sees the empty string as a real line, passes it toSkipRow, which skips it.Pre-existing
This is not a regression introduced by #118 — it's a pre-existing behavior in
ReadLineOptimizedand theMemoryText.ReadLine(ref position)extension. The pre-refactorReadFromMemoryOptimizedImpldidwhile ((line = ReadLineOptimized(...)).IsEmpty == false)(where the optimized reader returnedEmptyfor both blank lines AND EOF). The pre-refactorReadFromMemorydid the same withcsv.ReadLine(ref position). #118 preserved this verbatim per its scope ("unify the loops, preserve behavior") and did not attempt to fix it.Gemini flagged this during the #118 review (#121).
Proposed fix
TryReadLinemust distinguish "blank line in middle of stream" from "EOF". For both memory-based sources:truewhenever a line was successfully sliced — even if it's zero-length.falseonly whenposition >= csv.LengthBEFORE attempting to read.For
MemorySliceLineSource:For
MemoryReaderLineSource:return trueunconditionally after a successfulcsv.ReadLine(ref position)(the EOF check at the top of the method already gates entry).After this change, the default
CsvOptions.SkipRowpredicate will correctly skip blank middle-of-stream lines, and the memory-based paths will match theTextReader-based paths' behavior.Acceptance
A new test in
Csv.Tests/EngineUnificationTests.csparameterized over all 5 reader paths, asserting that:produces two records (
1,2,3and4,5,6) on every path, not one.A negative test asserting that if the user sets
CsvOptions.SkipRow = (_, _) => false, the blank line is returned as an empty record (3 records total on theHeaderPresentpath: header +1,2,3+ empty +4,5,6minus the header).Risk
Low. The fix changes the semantic of
TryReadLinefrom "false on empty OR EOF" to "false only on EOF". The engine's existingSkipRowflow handles the empty-line case correctly downstream. The cross-path matrix test inEngineUnificationTests.cswould pin the new uniformity.