Description of changes
Summarize the changes made by this PR.
This PR adds support for data-parallel embedding computation in the SentenceTransformerEmbeddingFunction. Fixes #6903. This brings it to parity with other local embedding functions (such as FastEmbed) and allows users to fully utilize multi-core CPUs and multi-GPU setups.
Improvements & Bug fixes
N/A
New functionality
batch_size support: Users can now specify a custom batch_size for local encoding. It defaults to None, which falls back to the sentence-transformers internal default (32).
Multi-process pool support: Introduced the multiprocess_devices parameter. When provided (e.g., ["cpu", "cpu"] or ["cuda:0", "cuda:1"]), a persistent multi-process pool is initialized and used for all embedding calls.
Automatic Cleanup: Implemented __del__ to ensure the multi-process pool is properly stopped when the embedding function instance is garbage collected.
Schema Integration: Updated the sentence_transformer JSON schema to include the new parameters as nullable fields, ensuring configuration persistence and validation work correctly.
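The parameter handling described above can be sketched roughly as follows. This is a simplified, self-contained illustration of the pattern, not the actual Chroma implementation; the class name and the dict standing in for the process pool are assumptions made for the example:

```python
from typing import List, Optional


class ParallelEmbeddingFunction:
    """Sketch of batch-size and multi-process pool handling."""

    def __init__(
        self,
        batch_size: Optional[int] = None,
        multiprocess_devices: Optional[List[str]] = None,
    ):
        self.batch_size = batch_size
        self.multiprocess_devices = multiprocess_devices
        self._pool = None
        if multiprocess_devices is not None:
            # The real implementation would start a persistent pool here,
            # e.g. via SentenceTransformer.start_multi_process_pool(...);
            # a plain dict stands in for it in this sketch.
            self._pool = {"devices": multiprocess_devices, "stopped": False}

    def _encode_kwargs(self) -> dict:
        # Only pass batch_size through when the user set it explicitly,
        # so the sentence-transformers default (32) applies otherwise.
        kwargs = {}
        if self.batch_size is not None:
            kwargs["batch_size"] = self.batch_size
        if self._pool is not None:
            kwargs["pool"] = self._pool
        return kwargs

    def __del__(self):
        # Stop the persistent pool when the instance is garbage collected;
        # the real code would call stop_multi_process_pool on the model.
        if self._pool is not None and not self._pool["stopped"]:
            self._pool["stopped"] = True
```

The key design point is that both parameters default to `None`, so an instance constructed without them behaves exactly as before.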
Test plan
How are these changes tested?
Tests pass locally with pytest for Python.
Created chromadb/test/ef/test_sentence_transformer_ef.py with 7 new unit tests covering initialization, execution, resource cleanup, and configuration roundtrips.
Updated chromadb/test/utils/test_embedding_function_schemas.py to verify that the updated schema correctly validates the new fields (including null values).
Migration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
No migrations are required. The changes are fully backwards compatible:
Existing collections without batch_size or multiprocess_devices in their config will default to None, maintaining the previous behavior (single-process, library-default batch size).
The JSON schema updates include null in the permitted types for the new fields to handle existing deployments.
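To illustrate the nullable-field handling, a fragment of the updated schema might look like the following. This is a hypothetical fragment modeled on JSON Schema conventions, not the exact Chroma schema:

```python
import json

# Hypothetical fragment of the sentence_transformer schema: the new fields
# list "null" among their permitted types, so configs from existing
# deployments that store null for these keys still validate.
SCHEMA_FRAGMENT = json.loads("""
{
  "properties": {
    "batch_size": { "type": ["integer", "null"] },
    "multiprocess_devices": {
      "type": ["array", "null"],
      "items": { "type": "string" }
    }
  }
}
""")


def allows_null(schema: dict, field: str) -> bool:
    """Check that a field's permitted types include null."""
    return "null" in schema["properties"][field]["type"]
```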
Observability plan
What is the plan to instrument and monitor this change?
Standard Python logging is used. No additional observability instrumentation is required for this local compute change.
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section]?
For the [docs section], a small update to the Embedding Functions page would be beneficial to highlight the new parallelization capabilities.
Add parallel and batched encoding support to SentenceTransformerEmbeddingFunction
This PR introduces new functionality to SentenceTransformerEmbeddingFunction by adding configurable batch_size and persistent multi-process encoding via multiprocess_devices. The implementation initializes a shared sentence-transformers process pool when devices are provided, passes pool/batch options into encode, and includes cleanup logic in __del__ to stop the pool.
It also updates config persistence and validation paths so these new options round-trip correctly through get_config/build_from_config and JSON schema validation. Test coverage is expanded with a new focused test file for initialization, encode argument behavior, pool usage/cleanup, and config roundtrip, plus a schema test validating nullable fields.
This summary was automatically generated by @propel-code-bot