fix: prevent thread and memory leaks from PostHog telemetry #4535
whysosaket merged 3 commits into main
Conversation
Each call to capture_event() created a new AnonymousTelemetry instance (and PostHog background thread) that was never shut down, causing unbounded thread/memory growth in long-running processes.

- Replace per-call AnonymousTelemetry creation with a thread-safe lazy singleton (_get_oss_telemetry) using double-checked locking
- Register atexit handler for process-level PostHog cleanup
- Make AnonymousTelemetry.close() idempotent (set posthog=None)
- Add Memory.close() and context manager support (with/async with)
- Add AsyncMemory.close() and async context manager support
- Add comprehensive tests for singleton, lifecycle, and edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
xkonjin
left a comment
I found one likely correctness regression in the new telemetry singleton.
_get_oss_telemetry(vector_store=...) only uses the vector_store from the first Memory/AsyncMemory instance that emits telemetry. After that, every later capture_event() call reuses the same singleton and never re-runs get_or_create_user_id() for other vector stores.
Why this matters:
- Before this change, each capture_event() constructed AnonymousTelemetry(vector_store=memory_instance._telemetry_vector_store), so every store got a chance to persist / look up the user identity.
- After this change, only the first store ever gets that initialization side effect.
- In a long-running process that touches multiple stores, later stores will silently skip that setup path.
I think the leak fix is still right, but this part probably needs to separate the PostHog client singleton from the per-store identity bootstrap, or explicitly preserve the previous per-store initialization behavior.
A regression test that creates two memory instances with different _telemetry_vector_store values and verifies both stores are initialized would make this safer.
utkarsh240799
left a comment
Thanks for the thorough review @xkonjin!
I looked into this and I believe it's not a correctness regression. Here's why:
get_or_create_user_id(vector_store) (setup.py:35-56) works as follows:
- Reads user_id from ~/.mem0/config.json — this is always the same value regardless of which vector store is passed
- Tries to look up / persist that ID in the vector store as a best-effort side effect (both get and insert are wrapped in bare except: pass)
- Returns the config file's user_id in all cases
So the vector_store parameter only affects a persistence side effect, not the return value. The telemetry distinct_id sent to PostHog is identical regardless of which store initializes the singleton. Before this fix, every capture_event() call was re-running that lookup, but after the first successful insert, subsequent calls just found the existing entry and returned the same ID — it was already a no-op.
The singleton simply avoids repeating that redundant lookup on every call, which is the right tradeoff given that the alternative was spawning a new PostHog thread per call.
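The behavior described above can be sketched like this. The file path matches what the thread describes, but the function body and the store's `get`/`insert` methods are assumptions for illustration, not mem0's exact code:

```python
import json
import os
import uuid

CONFIG_PATH = os.path.expanduser("~/.mem0/config.json")

def get_or_create_user_id(vector_store, config_path=CONFIG_PATH):
    # 1. user_id always comes from (or is first written to) the config file.
    os.makedirs(os.path.dirname(config_path), exist_ok=True)
    if os.path.exists(config_path):
        with open(config_path) as f:
            user_id = json.load(f)["user_id"]
    else:
        user_id = str(uuid.uuid4())
        with open(config_path, "w") as f:
            json.dump({"user_id": user_id}, f)

    # 2. The vector-store lookup/insert is only a best-effort side effect
    #    (the real code wraps these in bare `except: pass`).
    try:
        if vector_store.get(user_id) is None:
            vector_store.insert(user_id)
    except Exception:
        pass

    # 3. The config file's value is returned in all cases.
    return user_id
```

Because the return value never depends on the store, whichever store happens to initialize the singleton first yields the same distinct_id.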
whysosaket
left a comment
Overall this is a well-structured fix for a real resource leak. The singleton approach is correct, tests are thorough, and backward compatibility is maintained. A few issues worth addressing before merge — see inline comments.
…ector_store parameter
Description
Fixes #3376
Every call to capture_event() created a new AnonymousTelemetry instance, which spawned a new PostHog client with a background consumer thread. That thread was never shut down, causing unbounded thread and memory growth in any long-running process (web servers, APIs, etc.). Additionally, Memory had no close() method, so SQLite connections were never explicitly released. This was confirmed as recently as March 2026 by users on v1.0.4, with reports of containers being regularly killed by OOM.
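A minimal sketch of the close()/context-manager behavior this PR adds, using a stand-in class and a fake DB handle (not mem0's actual Memory or SQLiteManager):

```python
class FakeDb:
    """Stands in for SQLiteManager's connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class Memory:
    def __init__(self):
        self.db = FakeDb()

    def close(self):
        # Guard makes close() safe even if __init__ failed before setting
        # db, and makes repeated close() a no-op.
        db = getattr(self, "db", None)
        if db is not None:
            db.close()
            self.db = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # runs on normal exit and on exceptions
        return False   # never swallow the exception
```

Usage is then `with Memory() as m: ...`, with the connection released on exit even when the body raises.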
Problem
Three resource leaks reported in #3376:
- capture_event() in telemetry.py instantiated a new AnonymousTelemetry (→ new Posthog() client → new background consumer thread) on every call. These threads were never shut down.
- Memory and AsyncMemory had no close() method, so SQLite connections from SQLiteManager were never explicitly released.

Solution
Telemetry singleton (mem0/memory/telemetry.py)
- Replaced per-call AnonymousTelemetry() construction with a thread-safe lazy singleton (_get_oss_telemetry()) using double-checked locking — only one PostHog client and background thread per process, no matter how many capture_event() calls are made.
- Registered an atexit handler (_shutdown_oss_telemetry) for process-level cleanup. The singleton is not shut down per Memory.close() call — this avoids the design flaw identified in PR #4497 (refactor: add close() to Memory and thread-safe telemetry singleton), where closing one Memory instance would kill telemetry for all other living instances.
- Made AnonymousTelemetry.close() idempotent by setting self.posthog = None after shutdown.
- Added atexit.register(client_telemetry.close) for the module-level client_telemetry singleton.

Memory lifecycle (mem0/memory/main.py)
- Added Memory.close() — closes the SQLite connection via self.db.close().
- Added __enter__/__exit__ for context manager support (with Memory() as m: ...).
- Added AsyncMemory.close() with __aenter__/__aexit__ for async context manager support (async with AsyncMemory() as m: ...).
- Added a hasattr guard so close() is safe even if __init__ failed partway through.

Backward compatibility
- capture_event(), capture_client_event(), client_telemetry, AnonymousTelemetry, and MEM0_TELEMETRY retain identical signatures and import paths.
- Memory/AsyncMemory: only new methods added. No existing methods modified. Code that never calls close() works exactly as before.
- New module-level names (_oss_telemetry_instance, _get_oss_telemetry, _shutdown_oss_telemetry) are private (underscore-prefixed).

Testing
New tests added (tests/test_telemetry.py — 43 tests total)

TestAnonymousTelemetryClose (4 tests):
- close() calls posthog.shutdown()
- close() sets posthog to None (idempotency)
- Repeated close() doesn't raise and only calls shutdown() once
- capture_event() is a no-op after close()

TestTelemetrySingleton (8 tests):
- atexit.register called exactly once
- Repeated capture_event() calls → only 1 Posthog() constructor call (core leak fix)
- _shutdown_oss_telemetry() closes and clears the singleton

TestMemoryLifecycle (6 tests):
- close() calls db.close()
- Repeated close() is safe
- close() safe when db is None
- close() safe when db attribute not set (partial __init__)
- Context manager (with) calls close() on exit
- Context manager calls close() even on exception

TestAsyncMemoryLifecycle (5 tests):
- close() calls db.close()
- close() safe when db is None or not set
- Async context manager (async with) calls close() on exit
- Async context manager calls close() even on exception

Full suite results
All pre-existing tests continue to pass. The 66 skips are pre-existing (Neptune provider requiring langchain_aws).
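As a flavor of the "close() only calls shutdown() once" style of assertion described in the test list above, such a test might look roughly like this, against a stand-in telemetry class rather than the suite's actual fixtures:

```python
from unittest import mock

class Telemetry:
    """Stand-in with the same close() contract the PR describes."""

    def __init__(self, posthog):
        self.posthog = posthog

    def close(self):
        if self.posthog is not None:
            self.posthog.shutdown()
            self.posthog = None  # idempotency: second close() is a no-op

fake = mock.Mock()
t = Telemetry(fake)
t.close()
t.close()  # safe: does nothing the second time
assert fake.shutdown.call_count == 1
assert t.posthog is None
```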