Merged
24 changes: 7 additions & 17 deletions docs/reference/data-sources/mongodb.md
@@ -28,9 +28,7 @@ The full set of configuration options is available [here](https://rtd.feast.dev/

 ## Vector Search

-The MongoDB online store supports [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/), enabling similarity search over feature embeddings stored in MongoDB Atlas. This is powered by the `$vectorSearch` aggregation stage and requires MongoDB Atlas (or the `mongodb/mongodb-atlas-local` Docker image for local development).
-
-See [PR #6344](https://github.com/feast-dev/feast/pull/6344) for full implementation details.
+The MongoDB online store supports [MongoDB Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/), enabling similarity search over feature embeddings stored in MongoDB. This is powered by the `$vectorSearch` aggregation stage and supports MongoDB Atlas, self-hosted MongoDB with Atlas Search indexes, and the `mongodb/mongodb-atlas-local` Docker image for local development.

 ### Configuration

@@ -41,7 +39,7 @@ project: my_project
 provider: local
 online_store:
   type: mongodb
-  connection_string: mongodb+srv://<user>:<pass>@cluster.mongodb.net
+  connection_string: mongodb+srv://<user>:<pass>@cluster.mongodb.net # pragma: allowlist secret
   vector_enabled: true
   similarity: cosine # cosine | euclidean | dotProduct
   vector_index_wait_timeout: 60 # seconds to wait for index to become queryable
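
The `vector_index_wait_timeout` setting above implies a poll-until-ready loop. The following is a minimal sketch of that idea, not the store's actual implementation: `wait_until_queryable` and its parameters are hypothetical names, and the status lookup is injected as a callable so that the loop does not depend on any particular driver call. In practice the callable would probably read the `status` field of a search index description (for example from PyMongo's `list_search_indexes`), but that wiring is an assumption.

```python
import time
from typing import Callable, Optional


def wait_until_queryable(
    get_status: Callable[[], Optional[str]],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns "READY" or timeout_s elapses.

    Hypothetical helper: get_status stands in for whatever lookup reports
    the state of the vector search index.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "READY":
            return True
        if clock() >= deadline:
            # Give up; the caller decides whether to raise or continue.
            return False
        sleep(poll_interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; a caller would normally leave both at their defaults.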
@@ -76,32 +74,24 @@ item_embeddings = FeatureView(
 )
 ```

-When `feast apply` (or `store.update()`) runs with `vector_enabled=True`, Atlas vector search indexes are automatically created for any field with `vector_index=True`. Indexes are also automatically dropped when feature views are removed.
+When `feast apply` (or `store.update()`) runs with `vector_enabled=True`, MongoDB vector search indexes are automatically created for any field with `vector_index=True`. Indexes are also automatically dropped when feature views are removed.

 ### Retrieving Documents via Vector Search

 Use `retrieve_online_documents_v2()` to perform similarity search:

 ```python
-source = FeatureStore(repo_path=".")
 store = FeatureStore(repo_path=".")
 results = store.retrieve_online_documents_v2(
-    config=repo_config,
-    table=item_embeddings,
-    requested_features=["embedding", "title"],
-    embedding=[0.1, 0.2, ...], # query vector
+    features=["item_embeddings:embedding", "item_embeddings:title"],
+    query=[0.1, 0.2, ...], # query vector
     top_k=5,
 )
-
-# Each result is a (event_timestamp, entity_key_proto, feature_dict) tuple.
-# feature_dict includes a synthetic "distance" key with the vector search score.
-for ts, entity_key, features in results:
-    print(features["title"].string_val, features["distance"].float_val)
-```
+```

 ### How It Works

-- **Index creation**: `update()` creates an Atlas vector search index named `<feature_view>__<field>__vs_index` for each vector-indexed field. It waits for the index to reach `READY` status before proceeding.
+- **Index creation**: `update()` creates a MongoDB vector search index named `<feature_view>__<field>__vs_index` for each vector-indexed field. It waits for the index to reach `READY` status before proceeding.
 - **Query execution**: `retrieve_online_documents_v2()` builds a `$vectorSearch` aggregation pipeline with `numCandidates = max(top_k * 10, 100)` and the specified `limit`.
 - **Score**: Results include a `distance` field populated from `$meta: "vectorSearchScore"`.
 - **BSON compatibility**: Query vectors are coerced to native Python floats to avoid numpy serialization issues.
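
The "How It Works" bullets above can be sketched as a small pipeline builder. This is an illustration under stated assumptions, not the code from the PR: the function names are hypothetical, the `path` argument stands in for wherever the store keeps the embedding inside each document (the diff does not say), and attaching the score with a `$set` stage is one possible way to populate `distance`.

```python
from typing import Any, Dict, List, Sequence


def vector_index_name(feature_view: str, field: str) -> str:
    # Naming scheme quoted in the docs: <feature_view>__<field>__vs_index
    return f"{feature_view}__{field}__vs_index"


def build_vector_search_pipeline(
    feature_view: str,
    field: str,
    path: str,  # assumption: document path of the stored embedding
    query: Sequence[Any],
    top_k: int,
) -> List[Dict[str, Any]]:
    # Coerce every element to a native Python float so that numpy scalars
    # (e.g. numpy.float32) do not reach the BSON encoder.
    query_vector = [float(x) for x in query]
    return [
        {
            "$vectorSearch": {
                "index": vector_index_name(feature_view, field),
                "path": path,
                "queryVector": query_vector,
                # Candidate pool size as described in the docs.
                "numCandidates": max(top_k * 10, 100),
                "limit": top_k,
            }
        },
        # Expose the similarity score under the "distance" key.
        {"$set": {"distance": {"$meta": "vectorSearchScore"}}},
    ]
```

With `top_k=5` the candidate pool stays at the floor of 100; it only scales with `top_k` once `top_k` exceeds 10.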
2 changes: 0 additions & 2 deletions docs/reference/offline-stores/mongodb.md
@@ -3,8 +3,6 @@
 ## Description

 The MongoDB offline store provides support for reading [MongoDBSource](../data-sources/mongodb.md).
-* Uses a single shared collection with a compound index for all FeatureViews, distinguished by a `feature_view` discriminator field.
-* Entity dataframes can be provided as a Pandas dataframe. The offline store converts entity identifiers into serialized entity keys for efficient lookup against the collection.

 ## Getting started
