[BUG] Possible memory leak azure-sdk-bom 1.3.6 with netty 4.1.134-FINAL #49440

@riconstantin

Describe the bug
We have discovered a possible memory leak in a very specific use case: our application communicates with Azure only at deployment time and never afterwards. Because the connection then sits idle, we run into Azure Service Bus AMQP idle-timeout reconnects. Memory dumps taken shortly before the app was OOM-killed show that every reconnect allocated ~67–134 MB of native memory that the allocator never returned to the OS. Over 6 days and dozens of reconnect cycles, this accumulated to ~1.8–2.0 GB of off-heap memory that is invisible to JVM metrics, which exhausted the 6 GiB container limit.
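
As a rough consistency check of the figures above, the reported total and the per-reconnect growth imply a range for the number of reconnects. The sketch below is only back-of-the-envelope arithmetic: the class and method names are hypothetical, and it assumes 1 GB = 1000 MB.

```java
final class ReconnectEstimate {

    /** Whole reconnects needed to accumulate totalMb at perReconnectMb each (rounded down). */
    static long impliedReconnects(long totalMb, long perReconnectMb) {
        if (perReconnectMb <= 0) {
            throw new IllegalArgumentException("perReconnectMb must be positive");
        }
        return totalMb / perReconnectMb;
    }

    public static void main(String[] args) {
        // Assumption: 1 GB = 1000 MB; the figures are the approximate ones from the report.
        long fewest = impliedReconnects(1_800, 134); // smallest total, largest step
        long most = impliedReconnects(2_000, 67);    // largest total, smallest step
        System.out.println("Implied reconnects: " + fewest + " to " + most);
    }
}
```

With the reported numbers this gives roughly 13 to 29 reconnects.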

Exception or Stack Trace

The specific chain:

1. Azure Service Bus broker: 300-second AMQP idle timeout
2. Broker sends amqp:connection:forced
3. Azure SDK (Reactor/Netty): ReactorSession + RequestResponseChannel errors
4. Netty PooledByteBufAllocator allocates new native chunk(s) (~67–134 MB) via sun.misc.Unsafe.allocateMemory(), which bypasses ALL JVM memory metrics
5. Old chunks are returned to the pool arena, but the native OS pages are NEVER freed
6. RSS grows by ~67–134 MB per reconnect, permanently
7. After ~11 reconnects: RSS at 97% of the 6 GiB limit
8. 13:16:46Z: minor GC burst + CPU spike on both pods
9. RSS crosses 6 GiB → kernel OOM killer → SIGKILL on both replicas
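
One way to observe the gap described in the chain above is to log what the JVM itself accounts for next to the process RSS. The following is a minimal sketch, not part of the original analysis: the class name NativeMemoryProbe is hypothetical, it uses only the JDK, and it reads /proc/self/status, so the RSS value is available on Linux only. Memory obtained through sun.misc.Unsafe.allocateMemory() is not counted in the JVM's "direct" buffer pool, so under the scenario reported here RSS would be expected to keep growing while the other two numbers stay roughly flat.

```java
import java.io.IOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class NativeMemoryProbe {

    /** Parses a "VmRSS:   123456 kB" line; returns -1 for any other line. */
    static long parseVmRssKb(String line) {
        if (!line.startsWith("VmRSS:")) {
            return -1L;
        }
        String digits = line.substring("VmRSS:".length()).replace("kB", "").trim();
        return Long.parseLong(digits);
    }

    /** Resident set size of this process in kB, or -1 if /proc is unavailable. */
    static long residentSetKb() {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1L;
        }
        try {
            for (String line : Files.readAllLines(status)) {
                long kb = parseVmRssKb(line);
                if (kb >= 0) {
                    return kb;
                }
            }
        } catch (IOException e) {
            // Fall through and report "unknown".
        }
        return -1L;
    }

    /** Bytes in the JVM-tracked "direct" pool (ByteBuffer.allocateDirect only). */
    static long jvmTrackedDirectBytes() {
        long total = 0L;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                total += Math.max(0L, pool.getMemoryUsed());
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long heapUsed = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
        System.out.printf("heapUsed=%d B, jvmDirect=%d B, rss=%d kB%n",
                heapUsed, jvmTrackedDirectBytes(), residentSetKb());
    }
}
```

Logging this line periodically (for example before and after each reconnect) would show whether RSS grows in steps while heap and JVM-tracked direct memory do not.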

Setup (please complete the following information):

  • Library/Libraries: azure-sdk-bom 1.3.6, netty 4.1.134-FINAL
  • Java version: 21
  • App Server/Environment: OpenShift
  • Frameworks: Spring Boot 3.5.8

Additional context
If needed, I can provide the memory dump and the analysis.

Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added


Labels: customer-reported, needs-triage, question
