Harden dedup/reconcile pipeline #26

Open
aayush3011 wants to merge 1 commit into AzureCosmosDB:main from aayush3011:users/akataria/dedup_reconcile_improvements

@aayush3011

Summary

Hardens the memory dedup/reconcile pipeline and simplifies the search surface. Sync, async, and durable (Function App) paths are kept in lockstep.

Changes

  • Extraction watermark - auto-trigger paths size recent_k from a persisted per-thread watermark (last_extract_count on the counter doc) instead of a fixed window. The watermark advances only after a successful extract, so transient extract failures no longer strand turns.
  • Vector dedup ladder - near-exact extractions auto-drop; borderline ones are persisted and tagged sys:dup-candidate for the LLM reconcile. Stale tags on seeds that never cluster are cleared.
  • Reconcile backstop - the full-pool re-cluster cadence is driven by the persisted counter (not an in-memory per-worker counter), so it fires reliably on Function App deployments. Threaded consistently through sync, async, and durable.
  • Distance-function aware - dedup reads the container's configured distanceFunction and disables the cosine-calibrated auto-drop for euclidean.
  • Search API - removed the hybrid_search flag; every search_cosmos call now fuses vector + BM25 automatically (keyword extraction with graceful fall-back to pure vector).
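The extraction watermark in the first bullet can be sketched as follows. This is a minimal illustration, not the PR's implementation: `CounterDoc`, `turn_count`, `recent_k_from_watermark`, and `run_extract` are hypothetical names, and only `last_extract_count` and `recent_k` come from the description above. The sketch assumes the counter doc also tracks the total number of turns seen on the thread.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CounterDoc:
    """Hypothetical stand-in for the persisted per-thread counter document."""
    turn_count: int = 0          # assumed field: total turns seen on the thread
    last_extract_count: int = 0  # watermark: turn_count at the last successful extract


def recent_k_from_watermark(doc: CounterDoc) -> int:
    """Number of turns added since the last successful extract."""
    return max(doc.turn_count - doc.last_extract_count, 0)


def run_extract(doc: CounterDoc, extract: Callable[[int], None]) -> bool:
    """Extract over the unprocessed turns; advance the watermark only on success."""
    k = recent_k_from_watermark(doc)
    if k == 0:
        return True  # nothing new to process
    observed = doc.turn_count  # snapshot, so turns arriving mid-extract are not skipped
    try:
        extract(k)
    except Exception:
        # Watermark untouched: the same turns fall inside the next window.
        return False
    doc.last_extract_count = observed
    return True
```

Because a failed extract leaves `last_extract_count` where it was, the next trigger sees a larger `recent_k` that still covers the turns from the failed attempt, which is the "no stranded turns" property the bullet describes.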

Testing

  • 905 unit tests + ruff clean
  • Full live integration suite green; added async end-to-end + extraction-time vector-dedup integration tests
  • All 15 samples and 4 demo notebooks run clean against live Azure

Copilot AI review requested due to automatic review settings July 1, 2026 05:09

Copilot AI left a comment

Pull request overview

This PR hardens the memory extraction/dedup/reconcile pipeline across sync, async, and Durable Function App paths by introducing a persisted extraction watermark, improving vector-distance awareness, and simplifying search to always attempt hybrid (vector + full-text) ranking with a safe fallback to vector-only.

Changes:

  • Added persisted per-thread extraction watermark (last_extract_count) to size recent_k and advance only after successful extract→persist, preventing stranded turns after transient failures.
  • Implemented/validated a vector-dedup “ladder” and candidate-mode reconcile behavior, including distance-function awareness (cosine/dotproduct vs euclidean) and a persisted-counter cadence for periodic full-pool backstops.
  • Removed hybrid_search flag and switched search to automatic keyword extraction with a Cosmos FullTextScore term cap and vector-only fallback for all-stopword queries.
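The vector-dedup ladder and its distance-function awareness might look roughly like the sketch below. The two thresholds, the `Decision` enum, and the `classify` and `apply_ladder` helpers are assumptions made for illustration; only the `sys:dup-candidate` tag and the rule that auto-drop is disabled for euclidean come from the PR text. How borderline tagging behaves under euclidean is not stated, so the sketch conservatively persists such items untagged.

```python
from enum import Enum
from typing import List, Optional

DUP_CANDIDATE_TAG = "sys:dup-candidate"  # tag name taken from the PR description

# Hypothetical thresholds for illustration only; the real values are not given in the PR.
AUTO_DROP_SIMILARITY = 0.97
CANDIDATE_SIMILARITY = 0.85


class Decision(Enum):
    DROP = "drop"                      # near-exact duplicate: do not persist
    PERSIST_TAGGED = "persist_tagged"  # borderline: persist and tag for LLM reconcile
    PERSIST = "persist"                # clearly distinct: persist as-is


def classify(similarity: float, distance_function: str) -> Decision:
    """Map the best-match similarity to a rung of the ladder."""
    if distance_function.lower() == "euclidean":
        # The thresholds are cosine-calibrated, so they are not applied here.
        return Decision.PERSIST
    if similarity >= AUTO_DROP_SIMILARITY:
        return Decision.DROP
    if similarity >= CANDIDATE_SIMILARITY:
        return Decision.PERSIST_TAGGED
    return Decision.PERSIST


def apply_ladder(tags: List[str], similarity: float,
                 distance_function: str) -> Optional[List[str]]:
    """Return the tags to persist, or None when the extraction is dropped."""
    decision = classify(similarity, distance_function)
    if decision is Decision.DROP:
        return None
    if decision is Decision.PERSIST_TAGGED:
        return sorted(set(tags) | {DUP_CANDIDATE_TAG})
    return list(tags)
```

Keeping `classify` free of I/O makes the ladder easy to mirror across the sync, async, and durable paths, since each path only needs to supply the similarity and the container's configured distance function.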

Reviewed changes

Copilot reviewed 63 out of 63 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/unit/test_utils.py` | Adds unit coverage for keyword extraction, vector-distance helpers, and container policy distanceFunction reads. |
| `tests/unit/test_thresholds.py` | Adds coverage for env-backed threshold getters and internalized (non-env) dedup/search constants. |
| `tests/unit/test_reconcile.py` | Pins legacy reconcile paths for existing tests; removes extract-time UPDATE tests. |
| `tests/unit/test_process_now.py` | Updates expectations for fact+episodic reconcile being invoked. |
| `tests/unit/test_procedural_synthesis.py` | Pins legacy extract dedup knobs for stability. |
| `tests/unit/test_pipeline_confidence.py` | Pins legacy extract dedup knobs for stability. |
| `tests/unit/test_cosmos_memory_client.py` | Updates constructor/serverless autoscale behavior and hybrid search SQL expectations; adds forwarding tests. |
| `tests/unit/test_auto_trigger.py` | Adds watermark-driven recent_k tests and persisted-counter full-rebuild cadence tests. |
| `tests/unit/store/test_memory_store.py` | Verifies hybrid SQL uses @kwN params and stopword fallback to vector-only. |
| `tests/unit/services/test_pipeline_service.py` | Updates extract behavior to “add-only” facts; pins legacy dedup/reconcile mode for tests. |
| `tests/unit/services/test_extract_dry.py` | Adds dry-extract stage-1 search behavior tests and async mirrors. |
| `tests/unit/services/test_dedup_vector.py` | Adds extensive sync unit coverage for vector dedup ladder + candidate-mode reconcile. |
| `tests/unit/services/test_chaos_extract_persist.py` | Pins extract dedup knobs across sync/async for chaos tests. |
| `tests/unit/processors/test_protocol_satisfaction.py` | Updates processor protocol to accept recent_k. |
| `tests/unit/processors/test_inprocess.py` | Updates in-process processor behavior for fact+episodic reconcile and recent_k plumbing. |
| `tests/unit/function_app/test_orchestrators.py` | Updates orchestration chain to Extract→Dedup→Persist and reconciles fact+episodic; adds watermark advance activity tests. |
| `tests/unit/function_app/test_change_feed.py` | Adds watermark-based recent_k assertions and persisted-counter full_rebuild cadence tests. |
| `tests/unit/aio/test_reconcile_telemetry.py` | Pins legacy async reconcile mode for telemetry tests. |
| `tests/unit/aio/test_process_now.py` | Updates async process_now expectations for fact+episodic reconcile being awaited. |
| `tests/unit/aio/test_cosmos_memory_client.py` | Updates async hybrid search SQL expectations and forwarding tests; serverless autoscale ignore behavior. |
| `tests/unit/aio/test_auto_trigger.py` | Adds async watermark recent_k tests and async full-rebuild cadence tests. |
| `tests/unit/aio/services/test_dedup_vector_async.py` | Adds extensive async unit coverage for vector dedup ladder + candidate-mode reconcile. |
| `tests/unit/aio/processors/test_protocol_satisfaction.py` | Updates async processor protocol to accept recent_k. |
| `tests/unit/aio/processors/test_inprocess.py` | Updates async in-process processor behavior for fact+episodic reconcile and recent_k plumbing. |
| `tests/integration/test_processor_integration.py` | Updates sync integration to expect fact+episodic reconcile calls. |
| `tests/integration/test_processor_integration_async.py` | Updates async integration to expect fact+episodic reconcile calls. |
| `tests/integration/test_full_pipeline.py` | Removes hybrid_search flag usage; adds live integration for extract-time vector dedup. |
| `tests/integration/test_async_full_pipeline.py` | Adds new async live integration smoke test mirroring sync behavior. |
| `Samples/Notebooks/Demo_async.ipynb` | Removes hybrid_search flag usage in the async notebook demo. |
| `Samples/Advanced/advanced_search_patterns.py` | Updates narrative to reflect hybrid-by-default search; removes flag usage. |
| `function_app/triggers/change_feed.py` | Computes recent_k from persisted watermark and sets persisted-counter full-rebuild cadence. |
| `function_app/shared/counters.py` | Preserves last_extract_count; adds read/advance watermark helpers. |
| `function_app/shared/config.py` | Removes unused float parsing helper. |
| `function_app/orchestrators/extract_memories.py` | Inserts Dedup activity; forwards recent_k and full_rebuild; advances watermark post-persist. |
| `function_app/local.settings.json.template` | Adds DEDUP_EVERY_N to the template. |
| `Docs/troubleshooting.md` | Updates configuration guidance (no hybrid flag; throughput guidance via client args). |
| `Docs/public_api.md` | Updates public API docs to remove hybrid_search argument and describe fallback behavior. |
| `Docs/design_patterns.md` | Removes hybrid_search flag from examples. |
| `Docs/concepts.md` | Documents watermarking, vector-floor ladder, dual-mode reconcile, and hybrid-search behavior. |
| `azure/cosmos/agent_memory/thresholds.py` | Internalizes several dedup/search knobs as fixed constants with accessor functions. |
| `azure/cosmos/agent_memory/store/memory_store.py` | Switches to keyword extraction and hybrid SQL driven by extracted terms. |
| `azure/cosmos/agent_memory/store/_search_helpers.py` | Builds hybrid SQL using per-keyword @kwN parameters with vector-only fallback. |
| `azure/cosmos/agent_memory/services/_pipeline_helpers.py` | Improves LLM JSON parse errors with truncation heuristics and clearer guidance. |
| `azure/cosmos/agent_memory/prompts/extract_memories.prompty` | Removes extract-time UPDATE/CONTRADICT schema and increases maxOutputTokens; clarifies speaker discrimination. |
| `azure/cosmos/agent_memory/prompts/dedup.prompty` | Increases maxOutputTokens. |
| `azure/cosmos/agent_memory/prompts/dedup_episodic.prompty` | Adds new episodic merge-only reconcile prompt. |
| `azure/cosmos/agent_memory/prompts/_schemas.py` | Adds episodic dedup schema; removes action/supersedes fields from extraction schema. |
| `azure/cosmos/agent_memory/processors/inprocess.py` | Adds recent_k plumbing and fact+episodic reconcile routing with optional full_rebuild. |
| `azure/cosmos/agent_memory/processors/durable.py` | Extends protocol to accept recent_k and full_rebuild (no-op). |
| `azure/cosmos/agent_memory/processors/base.py` | Extends processor protocol with recent_k and full_rebuild. |
| `azure/cosmos/agent_memory/cosmos_memory_client.py` | Removes hybrid_search plumbing; forwards episodic search options; reconcile uses full rebuild. |
| `azure/cosmos/agent_memory/auto_trigger.py` | Uses persisted watermark for recent_k; persisted-counter full-rebuild cadence; advances watermark on success only. |
| `azure/cosmos/agent_memory/aio/store/memory_store.py` | Async mirror of keyword-extraction hybrid search behavior. |
| `azure/cosmos/agent_memory/aio/processors/inprocess.py` | Async mirror of processor changes for recent_k and fact+episodic reconcile routing. |
| `azure/cosmos/agent_memory/aio/processors/durable.py` | Async protocol extension (no-op). |
| `azure/cosmos/agent_memory/aio/processors/base.py` | Async protocol extension. |
| `azure/cosmos/agent_memory/aio/cosmos_memory_client.py` | Async mirror of client search/reconcile changes. |
| `azure/cosmos/agent_memory/aio/auto_trigger.py` | Async mirror of watermark recent_k and persisted-counter full-rebuild cadence. |
| `azure/cosmos/agent_memory/_utils.py` | Adds keyword extraction (stopwords + 30-term cap) and distance-function utilities. |
| `azure/cosmos/agent_memory/_counters.py` | Adds read/advance extract watermark helpers; preserves watermark across updates. |
| `.env.template` | Removes throughput/embedding-distance env knobs now expected to be passed explicitly as client args. |
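The keyword-extraction and `@kwN` behaviour described for `_utils.py` and `_search_helpers.py` could be sketched as below. Everything here is illustrative: the stopword list, the `c.content` and `c.embedding` field names, and both helper names are assumptions, and the SQL strings are modelled on Cosmos DB's `ORDER BY RANK RRF(...)` form without having been checked against the PR's actual query text. Only the 30-term cap, the per-keyword `@kwN` parameters, and the vector-only fallback come from the summary above.

```python
import re
from typing import Any, Dict, List, Tuple

MAX_TERMS = 30  # term cap mentioned in the file summary

# Hypothetical stopword list; the real one is not shown in the PR.
STOPWORDS = frozenset(
    {"a", "an", "and", "are", "do", "for", "i", "in", "is", "it",
     "of", "on", "the", "to", "what"}
)


def extract_keywords(query: str) -> List[str]:
    """Lowercase, tokenize, drop stopwords, de-duplicate in order, cap the count."""
    seen: Dict[str, None] = {}
    for token in re.findall(r"[a-z0-9]+", query.lower()):
        if token not in STOPWORDS:
            seen.setdefault(token, None)
    return list(seen)[:MAX_TERMS]


def build_search_query(query: str, embedding: List[float],
                       top_k: int = 5) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (sql, parameters): hybrid when keywords exist, else vector-only."""
    keywords = extract_keywords(query)
    params: List[Dict[str, Any]] = [
        {"name": "@k", "value": top_k},
        {"name": "@embedding", "value": embedding},
    ]
    if not keywords:
        # All-stopword (or empty) query: fall back to pure vector ranking.
        sql = ("SELECT TOP @k c.id, c.content FROM c "
               "ORDER BY VectorDistance(c.embedding, @embedding)")
        return sql, params
    names = [f"@kw{i}" for i in range(len(keywords))]
    params += [{"name": n, "value": kw} for n, kw in zip(names, keywords)]
    sql = ("SELECT TOP @k c.id, c.content FROM c ORDER BY RANK RRF("
           "VectorDistance(c.embedding, @embedding), "
           f"FullTextScore(c.content, {', '.join(names)}))")
    return sql, params
```

Passing each keyword as its own parameter keeps user text out of the SQL string, and the early return gives callers a single code path whether or not the query yields any usable terms.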

Comment thread Docs/concepts.md
Comment on lines 162 to 165
### Tunable

-`DEDUP_EVERY_N` (default 5) controls how often `reconcile_memories` runs in the auto-trigger path. Set to `0` to disable. The candidate cap `n` (default 50) is tunable per call; larger values give the LLM a wider view at higher token cost.
+`DEDUP_EVERY_N` (default 5) controls how often reconcile runs in the auto-trigger path. Set to `0` to disable. The candidate cap `n` (default `DEDUP_POOL_SIZE`, 50) is tunable per call; larger values give the LLM a wider view at higher token cost. `DEDUP_FULL_RECLUSTER_EVERY_N` (default 12) sets how often the full-pool backstop fires.
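One way to read the two cadence knobs is sketched below. The function name `reconcile_plan` is hypothetical, and the sketch assumes `DEDUP_FULL_RECLUSTER_EVERY_N` counts reconcile runs (every 12th reconcile is a full-pool pass); the text above does not say whether its unit is turns or reconcile runs, so treat that as a guess. The defaults 5 and 12 are the ones quoted in the doc line.

```python
from typing import Tuple


def reconcile_plan(persisted_count: int, every_n: int = 5,
                   full_every_n: int = 12) -> Tuple[bool, bool]:
    """Return (run_reconcile, full_rebuild) for a persisted per-thread turn counter.

    The decision depends only on the persisted counter, so any worker that
    reads the same counter value reaches the same answer.
    """
    if every_n <= 0 or persisted_count <= 0:
        return False, False  # every_n == 0 disables reconcile entirely
    if persisted_count % every_n != 0:
        return False, False
    reconcile_index = persisted_count // every_n  # 1 for the first reconcile, 2 for the second, ...
    full_rebuild = full_every_n > 0 and reconcile_index % full_every_n == 0
    return True, full_rebuild
```

Deriving the cadence from the persisted counter rather than an in-memory tally is what lets the backstop fire on Function App deployments, where consecutive invocations may land on different workers.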
