Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 104 additions & 0 deletions docs/Rules/EPC38.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# EPC38 - Do not re-enumerate `IEnumerable<Task>`

Detects when a deferred `IEnumerable<Task>` / `IEnumerable<Task<T>>` is observed (enumerated) more than once
inside the same method body.

## Description

The standard way to "fan out" async work in .NET is to build a sequence of tasks from some input and then
wait for all of them:

```csharp
var tasks = ids.Select(async id => await DoWorkAsync(id));
await Task.WhenAll(tasks);
```

This *looks* like it produces a stable collection, but `Select` returns a **deferred** `IEnumerable<Task<T>>`:
every enumeration restarts the projection and creates a fresh set of tasks. So this seemingly innocent code:

```csharp
var tasks = ids.Select(async id => await DoWorkAsync(id));
await Task.WhenAll(tasks);
foreach (var t in tasks) // ❌ EPC38
{
Process(await t);
}
```

…does **not** process the results of the tasks we awaited. It starts a brand-new set of `DoWorkAsync` calls,
awaits each one again, and processes those. The work is duplicated, side effects fire twice, and timing-
sensitive behavior breaks in confusing ways.

Materializing the sequence up front fixes the bug:

```csharp
var tasks = ids.Select(async id => await DoWorkAsync(id)).ToArray();
await Task.WhenAll(tasks);
foreach (var t in tasks) // ✅ tasks is Task<T>[] — safe to re-enumerate
{
Process(await t);
}
```

The analyzer flags **any** symbol whose static type is `IEnumerable<Task>`, `IEnumerable<Task<T>>`,
`IEnumerable<ValueTask>`, or `IEnumerable<ValueTask<T>>` that is enumerated more than once inside the same
method body. Both observations matter — the rule does not require the first one to be `Task.WhenAll`.

## Code that triggers the analyzer

```csharp
async Task FooAsync(IEnumerable<int> ids)
{
var tasks = ids.Select(async id => { await Task.Delay(1); return id; });
await Task.WhenAll(tasks);
foreach (var t in tasks) { await t; } // ❌ EPC38
}

void Counts(IEnumerable<Task<int>> tasks)
{
var count = tasks.Count();
var sum = tasks.Sum(t => t.Result); // ❌ EPC38
}

async Task ManualAsync(IEnumerable<Task> tasks)
{
foreach (var t in tasks) { await t; }
foreach (var t in tasks) { _ = t.Status; } // ❌ EPC38 (strict superset case)
}
```

## How to fix

Materialize the sequence once and use the concrete collection from then on:

```csharp
async Task FooAsync(IEnumerable<int> ids)
{
var tasks = ids.Select(async id => { await Task.Delay(1); return id; }).ToArray();
await Task.WhenAll(tasks);
foreach (var t in tasks) { await t; } // ✅ Task<int>[] — safe
}
```

`ToList()`, `ToArray()`, and similar materializing operators all work. If the source is a parameter and you
control the API, prefer to take `IReadOnlyList<Task<T>>` or `Task<T>[]` instead of `IEnumerable<Task<T>>`,
which both communicates intent and makes the rule fall silent.

## When NOT to suppress

In almost every case this is a real bug. Suppressing it usually means you've convinced yourself the producer
is *not* deferred — in which case the right fix is to change the static type of the variable to the
materialized type (`IReadOnlyList<Task<T>>`, `Task<T>[]`, etc.) so future readers don't have to make the
same leap.

## Limitations (v1)

- Analysis is scoped to a single method body. The rule does not track tasks passed across method boundaries.
- Field and property accesses are not tracked.
- Lambdas / local functions are analyzed as independent scopes.
- Re-assignment (`tasks = freshSource;`) resets tracking from that point onward.

## See also

- [CA1851: Possible multiple enumerations of `IEnumerable` collection](https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1851) — the general-purpose rule. EPC38 specifically targets the task-producing case where re-enumeration is almost always a bug, not just a performance issue.
- [EPC39](EPC39.md) — quadratic enumeration inside a loop.
98 changes: 98 additions & 0 deletions docs/Rules/EPC39.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# EPC39 - Same enumerable iterated more than once inside a loop

Detects three patterns where the *same* `IEnumerable` is iterated more than once inside an outer loop,
producing O(N²) (or worse) behavior.

## Description

Iterating the same source twice inside a loop turns an O(N) walk into an O(N·M) one. When the inner pass
is a deferred LINQ chain, every outer iteration also re-runs every chained filter / projection.

The analyzer recognizes three sub-patterns and reports each with a tailored message:

### Q1 — Same-symbol nested enumeration

```csharp
foreach (var x in coll)
{
var n = coll.Count(); // ❌ EPC39 — Count() re-iterates the entire collection on every step
}
```

### Q2 — Deferred IEnumerable<T> re-enumerated in a loop

```csharp
IEnumerable<int> q = source.Where(x => x > 0);
foreach (var i in outer)
{
var c = q.Count(); // ❌ EPC39 — Where(...) re-runs every iteration
}
```

### Q4 — Nested foreach over the same source

```csharp
foreach (var x in coll)
{
foreach (var y in coll) { ... } // ❌ EPC39 — O(N²)
}
```

## How to fix

### Q1 / Q2 — materialize once, or use a different shape

```csharp
var arr = coll.ToArray(); // or .ToList()
foreach (var x in arr)
{
var n = arr.Length; // O(1)
}
```

If the inner work is genuinely "count every element matching X", pre-compute it outside the loop or replace
the loop body with a single LINQ aggregation that performs one pass.

### Q4 — symmetric problems usually benefit from a hash join

```csharp
var lookup = new HashSet<int>(coll);
foreach (var x in coll)
{
if (lookup.Contains(-x)) { ... }
}
```

## When NOT to suppress

If the data set is *known* to be small (say, ≤ 16 elements) and the loop is on a hot path with a tight
cache footprint, a linear scan can actually be faster than a HashSet lookup. In those cases prefer to
suppress at the call site with a comment explaining the choice:

```csharp
#pragma warning disable EPC39 // List is bounded to <=8 elements; HashSet has higher constant factor.
```

## Patterns intentionally NOT flagged

- Calling `List<T>.Contains`, `Array.IndexOf`, `List<T>.FindIndex`, etc. inside a loop on a collection
that is **not** the loop's source. The cost is O(N·M), but in real codebases these patterns are
frequently correct on small inputs and previously produced too much noise. Use
[CA1851](https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1851)
if you want broader "any double-enumeration" coverage, or [EPC23](EPC23.md) for the specific
`HashSet.Contains`-via-LINQ misuse.
- `ImmutableArray<T>.Contains` and `Span<T>.IndexOf` — they don't allocate and are typically used on
tiny inputs.

## Limitations (v2)

- Analysis is scoped to a single method body. Helper calls that internally enumerate are not tracked.
- Q1/Q2 LINQ root resolution stops at the first non-chain call. A chain like
`GetItems().Where(...).Count()` is not flagged unless the root is a local/parameter (because resolving
the source through arbitrary method bodies is unsound).

## See also

- [CA1851: Possible multiple enumerations of `IEnumerable` collection](https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1851) — the general-purpose rule. EPC39 is more aggressive in the *quadratic* dimension (it fires inside loops) but narrower than CA1851 in scope (it only fires when the source is the same enumerable).
- [EPC38](EPC38.md) — re-enumeration of `IEnumerable<Task>` (correctness bug).
- [EPC23](EPC23.md) — `Enumerable.Contains` on a `HashSet<T>` (a related "wrong tool for the job" pattern).
58 changes: 58 additions & 0 deletions samples/ErrorProne.Samples/AsyncStuff/TaskReEnumeration.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace ErrorProne.Samples.AsyncStuff
{
public class TaskReEnumeration
{
// Canonical bug: WhenAll consumes the deferred IEnumerable<Task<TResult>>; the subsequent foreach
// re-runs Select() and starts a NEW set of tasks that are completely unrelated to the ones we awaited.
public async Task<int> ComputeTotalAsync(IEnumerable<int> ids)
{
var tasks = ids.Select(async id =>
{
await Task.Delay(1);
return id * 2;
});

await Task.WhenAll(tasks);

var total = 0;
foreach (var t in tasks) // ❌ EPC38 — second enumeration starts new tasks
{
total += await t;
}

return total;
}

// Two LINQ aggregations on the same deferred enumerable: also a re-enumeration bug.
public int CountAndSum(IEnumerable<Task<int>> tasks)
{
var count = tasks.Count();
var sum = tasks.Sum(t => t.Result); // ❌ EPC38
return count + sum;
}

// Fix: materialize the deferred enumerable first so re-iteration is safe.
public async Task<int> ComputeTotalFixedAsync(IEnumerable<int> ids)
{
var tasks = ids.Select(async id =>
{
await Task.Delay(1);
return id * 2;
}).ToArray();

await Task.WhenAll(tasks);

var total = 0;
foreach (var t in tasks) // ✅ safe — `tasks` is Task<int>[]
{
total += await t;
}

return total;
}
}
}
63 changes: 63 additions & 0 deletions samples/ErrorProne.Samples/CoreAnalyzers/QuadraticEnumeration.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
using System.Collections.Generic;
using System.Linq;

namespace ErrorProne.Samples.CoreAnalyzers
{
public class QuadraticEnumeration
{
// Q1 — LINQ-enumerating call whose source is the same enumerable being iterated. O(N^2).
public int SameSourceCount(IEnumerable<int> coll)
{
var total = 0;
foreach (var x in coll)
{
var n = coll.Count(); // ❌ EPC39 (Q1)
total += x * n;
}

return total;
}

// Q2 — re-enumeration of a deferred IEnumerable<T> inside a loop. Each iteration re-runs Where().
public int DeferredInLoop(IEnumerable<int> source, int[] outer)
{
IEnumerable<int> q = source.Where(x => x > 0);
var total = 0;
foreach (var i in outer)
{
total += q.Count(); // ❌ EPC39 (Q2)
}

return total;
}

// Q4 — nested foreach over the same source.
public int NestedForeachSameSource(List<int> coll)
{
var pairs = 0;
foreach (var x in coll)
{
foreach (var y in coll) // ❌ EPC39 (Q4)
{
if (x + y == 0) pairs++;
}
}

return pairs;
}

// ✅ Fixes:

public int DeferredInLoopFixed(IEnumerable<int> source, int[] outer)
{
var q = source.Where(x => x > 0).ToList(); // materialize once
var total = 0;
foreach (var i in outer)
{
total += q.Count;
}

return total;
}
}
}
Loading
Loading