Tests | Flakiness improvements to XEventsTracingTest by edwardneal · Pull Request #4262 · dotnet/SqlClient

edwardneal · 2026-05-07T04:59:45Z

Description

This performs some reliability improvements to XEventsTracingTest. We can see intermittent failures, most of which are the result of them being killed to resolve deadlocks.

I wouldn't normally expect the original SQL statements (sp_help and SELECT @@VERSION) to encounter that, but sp_help returns the list of objects in the current database; perhaps if objects are being created by another CI run, this becomes an issue.

To break the dependency on server state, I've replaced both method calls with a call to a new SP which just runs SELECT 1, and a SQL statement which runs SELECT 1 directly.

During investigation it also became clear that an activity ID is recorded (and the test is capable of passing) even when a deadlock or other SQL error occurs. I've made this explicit via a new test case. Technically this means that we could simply broaden the error handling to swallow all SqlException errors when executing the command and the test would continue to pass. I've not done this because I'm a little more concerned that we're encountering deadlocks on comparatively simple statements, and don't want to mask any underlying issue.

Besides this, there are a few QoL improvements:

One test method is now refactored into three coherent test cases. This also makes the reflection work to retrieve the test case name unnecessary.
Added XML documentation which describes the tests (and the reason why they're marked as flaky.)
A new FlushResultSet helper was added in an earlier PR, and we now use it.

Issues

Contributes to #3453.

Testing

One new test case. All three XEvents tests pass, but I don't think I can easily reproduce the same kind of load. Could someone run CI against this PR multiple times at peak load please?

@Version

* Refactor one test method into three distinct test cases. * Add reasons for the tests being marked as flaky. * Switch call to 'sp_help' to a new, simpler SP. * Switch execution of 'SELECT @@Version' to a simpler 'SELECT 1' statement. * Use new FlushResultSet helper. * Simplify XEvent session name generation. * Add test case to verify that an activity ID is recorded in the extended event when the SQL statement throws an error.

cheenamalhotra · 2026-05-08T22:15:06Z

/azp run

azure-pipelines · 2026-05-08T22:16:41Z

Azure Pipelines successfully started running 2 pipeline(s).

edwardneal · 2026-05-10T07:06:37Z

That's looking positive for the first run, although it looks like the LocalAppContextSwitchesTest.TestDefaultAppContextSwitchValues unit test is unrelatedly flaky - possibly because it's running in parallel with other unit tests which change the values. I'd appreciate a few more runs of this.

mdaigle · 2026-05-12T00:02:23Z

/azp run

azure-pipelines · 2026-05-12T00:02:47Z

Azure Pipelines successfully started running 2 pipeline(s).

edwardneal · 2026-05-12T05:55:02Z

No deadlocks on the second run, but it looks like there are a few problems with the number of sessions open against a single Azure SQL instance - there's a hard limit of 128MB session memory:

Microsoft.Data.SqlClient.SqlException : Operation failed. Operation will cause database event session memory to exceed allowed limit. Event session memory may be released by stopping active sessions or altering session memory options. Check sys.dm_xe_database_sessions for active sessions that can be stopped or altered.

We'd originally increased the MAX_MEMORY from 4MB to 16MB in an attempt to tackle the deadlocks, but perhaps these have been addressed at source. I've cut this back to 4MB, can someone run the pipelines a few times please?

priyankatiwari08 · 2026-05-12T07:33:38Z

                $"WITH (" +
                $"  {duration} " +
-                $"  MAX_MEMORY=16 MB," +
+                $"  MAX_MEMORY=4 MB," +


Could we add a one-line comment next to MAX_MEMORY=4 MB explaining why this value matters? The PR description has great context about Azure SQL's 128 MB global cap and why raising it caused parallel-CI failures — but that context lives in the PR, not the code. Without the comment, the next person debugging XEvent buffer pressure will likely bump it back up to "fix" their immediate symptom and re-trigger the original problem. Even a short // Don't raise — Azure SQL caps total XEvent session memory at 128MB. See PR #4262. would protect us.

priyankatiwari08 · 2026-05-12T07:36:50Z

/azp run

azure-pipelines · 2026-05-12T07:37:15Z

Azure Pipelines successfully started running 2 pipeline(s).

edwardneal · 2026-05-12T16:15:17Z

I've added an explanatory comment and pushed, thanks @priyankatiwari08. The last pipeline run passed, so I'd appreciate a few more at peak load before I unmark them as flaky.

mdaigle · 2026-05-12T16:40:05Z

/azp run

azure-pipelines · 2026-05-12T16:40:36Z

Azure Pipelines successfully started running 2 pipeline(s).

edwardneal · 2026-05-12T22:38:57Z

All tests passed on those two runs, which looks encouraging - particularly since this run had one or two failures which seemed to be an indicator that the remote server is under load. Thanks.

Could you run this twice more at peak please?

mdaigle · 2026-05-12T23:08:40Z

/azp run

azure-pipelines · 2026-05-12T23:09:07Z

Azure Pipelines successfully started running 2 pipeline(s).

edwardneal requested a review from a team as a code owner May 7, 2026 04:59

github-project-automation Bot added this to SqlClient Board May 7, 2026

github-project-automation Bot moved this to To triage in SqlClient Board May 7, 2026

cheenamalhotra moved this from To triage to In review in SqlClient Board May 7, 2026

cheenamalhotra added the Area\Tests Issues that are targeted to tests or test projects label May 7, 2026

cheenamalhotra added this to the 7.1.0-preview2 milestone May 7, 2026

cheenamalhotra assigned mdaigle and priyankatiwari08 May 8, 2026

Cut MAX_MEMORY on XEvents session to 4MB

07dbf26

priyankatiwari08 reviewed May 12, 2026

View reviewed changes

Add explanatory comment to MAX_MEMORY

6c8fdc8

Conversation

edwardneal commented May 7, 2026

Description

Issues

Testing

Uh oh!

cheenamalhotra commented May 8, 2026

Uh oh!

azure-pipelines Bot commented May 8, 2026

Uh oh!

edwardneal commented May 10, 2026

Uh oh!

mdaigle commented May 12, 2026

Uh oh!

azure-pipelines Bot commented May 12, 2026

Uh oh!

edwardneal commented May 12, 2026

Uh oh!

priyankatiwari08 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

priyankatiwari08 commented May 12, 2026

Uh oh!

azure-pipelines Bot commented May 12, 2026

Uh oh!

edwardneal commented May 12, 2026

Uh oh!

mdaigle commented May 12, 2026

Uh oh!

azure-pipelines Bot commented May 12, 2026

Uh oh!

edwardneal commented May 12, 2026

Uh oh!

mdaigle commented May 12, 2026

Uh oh!

azure-pipelines Bot commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants