Affected packages:
- azure-core-http-netty 1.15.5
- azure-core 1.58.0
- azure-ai-agents 2.1.0
Severity: High — causes silent failure of all retried sync HTTP requests
Description
When using the Azure SDK's synchronous HTTP pipeline (HttpPipeline.sendSync) with the Netty HTTP client, a transient connection failure on the first attempt triggers a retry. During cleanup, RetryPolicy.attemptSync calls response.close() → NettyAsyncHttpResponse.close() → NettyUtility.closeConnection() → channel.eventLoop(), which throws:
java.lang.IllegalStateException: channel not registered to an event loop
at io.netty.channel.AbstractChannel.eventLoop(AbstractChannel.java:163)
at com.azure.core.http.netty.implementation.NettyUtility.closeConnection(NettyUtility.java:79)
at com.azure.core.http.netty.implementation.NettyAsyncHttpResponse.close(NettyAsyncHttpResponse.java:116)
at com.azure.core.http.policy.RetryPolicy.attemptSync(RetryPolicy.java:249)
at com.azure.core.http.policy.RetryPolicy.processSync(RetryPolicy.java:160)
at com.azure.core.http.HttpPipeline.sendSync(HttpPipeline.java:138)
Root Cause
Reactor Netty creates a Channel object during a connection attempt but registers it to an event loop asynchronously. If the connection fails (TCP or SSL) before registration with the event loop completes, the Channel object is left in an unregistered state. AbstractChannel.eventLoop() unconditionally throws IllegalStateException when its eventLoop field is null, yet NettyUtility.closeConnection() calls it without any null or registration guard.
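The failure mode can be modeled without Netty on the classpath. The following is a minimal sketch, not the SDK's code: StubChannel, unguardedClose, and cleanupOutcome are hypothetical names that only imitate the behavior described above for AbstractChannel.eventLoop() and an unguarded NettyUtility.closeConnection().

```java
import java.util.concurrent.Executor;

/** Hypothetical stand-in for a Netty channel; not io.netty.channel.Channel. */
final class StubChannel {
    // Stays null until registration completes, like the eventLoop field described above.
    private volatile Executor eventLoop;

    void register(Executor loop) {
        this.eventLoop = loop;
    }

    // Same contract as described for AbstractChannel.eventLoop(): throws when unregistered.
    Executor eventLoop() {
        Executor loop = eventLoop;
        if (loop == null) {
            throw new IllegalStateException("channel not registered to an event loop");
        }
        return loop;
    }
}

final class UnguardedCloseDemo {
    // Hypothetical analogue of an unguarded closeConnection(): no registration check.
    static void unguardedClose(StubChannel channel) {
        channel.eventLoop().execute(() -> { /* the real close would be scheduled here */ });
    }

    // Reports what cleanup does for a given channel state.
    static String cleanupOutcome(StubChannel channel) {
        try {
            unguardedClose(channel);
            return "closed";
        } catch (IllegalStateException e) {
            return "threw: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        StubChannel registered = new StubChannel();
        registered.register(Runnable::run); // run tasks inline instead of on a real event loop
        System.out.println(cleanupOutcome(registered));

        StubChannel failedEarly = new StubChannel(); // connection failed before registration
        System.out.println(cleanupOutcome(failedEarly));
    }
}
```

For the channel that never registered, the exception escapes the cleanup call itself. That corresponds to the stack trace above, where close() throws inside RetryPolicy.attemptSync.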
Steps to Reproduce
- Use NettyAsyncHttpClientBuilder to create an HttpClient
- Pass it to any AgentsClientBuilder (or any Azure SDK builder) and call a sync method (e.g., ResponseService.create(...))
- Introduce a transient network error on the first attempt (or have the endpoint experience a brief hiccup)
- The RetryPolicy logs "Retrying." at DEBUG and then immediately throws IllegalStateException before the retry is attempted
Expected Behavior
NettyUtility.closeConnection() (or NettyAsyncHttpResponse.close()) should guard against an unregistered channel — either by checking channel.isRegistered() before calling channel.eventLoop(), or by catching IllegalStateException and closing via a fallback path (e.g., channel.unsafe().closeForcibly()).
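A guard along these lines might look like the sketch below. It is illustrative only and is not a patch against azure-core-http-netty: ClosableChannel, RecordingChannel, and GuardedClose are hypothetical stand-ins whose method names mirror isRegistered(), eventLoop(), and closeForcibly(), so that the example compiles without Netty. It combines both suggestions: check registration first, and fall back to a forced close if eventLoop() still throws.

```java
import java.util.concurrent.Executor;

/** Hypothetical stand-in for the parts of a Netty channel used here. */
interface ClosableChannel {
    boolean isRegistered();
    Executor eventLoop();  // throws IllegalStateException when unregistered
    void close();          // normal close, scheduled on the event loop
    void closeForcibly();  // fallback, analogous to channel.unsafe().closeForcibly()
}

/** Test double that records which close path was taken. */
final class RecordingChannel implements ClosableChannel {
    private final Executor loop; // null models a channel that never registered
    String closedVia = "none";

    RecordingChannel(Executor loop) {
        this.loop = loop;
    }

    public boolean isRegistered() {
        return loop != null;
    }

    public Executor eventLoop() {
        if (loop == null) {
            throw new IllegalStateException("channel not registered to an event loop");
        }
        return loop;
    }

    public void close() {
        closedVia = "eventLoop";
    }

    public void closeForcibly() {
        closedVia = "forced";
    }
}

final class GuardedClose {
    static void closeConnection(ClosableChannel channel) {
        if (channel.isRegistered()) {
            try {
                channel.eventLoop().execute(channel::close);
                return;
            } catch (IllegalStateException ignored) {
                // Defensive: fall through to the forced close below.
            }
        }
        channel.closeForcibly();
    }

    public static void main(String[] args) {
        RecordingChannel unregistered = new RecordingChannel(null);
        closeConnection(unregistered);
        System.out.println(unregistered.closedVia);

        RecordingChannel registered = new RecordingChannel(Runnable::run);
        closeConnection(registered);
        System.out.println(registered.closedVia);
    }
}
```

With a guard of this shape, cleanup no longer throws for an unregistered channel, so the retry loop can go on to its next attempt.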
Workaround
Switch the HTTP transport to azure-core-http-okhttp, which is thread-pool-based and has no channel-registration lifecycle:
<dependency>
  <groupId>com.azure</groupId>
  <artifactId>azure-core-http-okhttp</artifactId>
  <version>1.12.8</version>
</dependency>
Environment
┌─────────────────────────┬───────────────────────┐
│ Component               │ Version               │
├─────────────────────────┼───────────────────────┤
│ Java                    │ 17                    │
├─────────────────────────┼───────────────────────┤
│ Spring Boot             │ 3.4.3                 │
├─────────────────────────┼───────────────────────┤
│ azure-ai-agents         │ 2.1.0                 │
├─────────────────────────┼───────────────────────┤
│ azure-core              │ 1.58.0                │
├─────────────────────────┼───────────────────────┤
│ azure-core-http-netty   │ 1.15.5                │
├─────────────────────────┼───────────────────────┤
│ netty-transport         │ 4.1.118.Final         │
├─────────────────────────┼───────────────────────┤
│ OS                      │ Linux (containerized) │
└─────────────────────────┴───────────────────────┘