Commit ce6883e

Apply npx prettier
1 parent 6c70558 commit ce6883e

1 file changed

Lines changed: 25 additions & 22 deletions

_posts/2025-5-12-s3-vectors.md

@@ -8,16 +8,16 @@ category:
 tags: machine-learning, mlops
 ---
 
-S3 Vectors adds **native vector storage + ANN search** to S3 via *vector buckets* and *vector indexes*. It’s cost‑oriented, elastic, and ideal for big volumes with **sub‑second** queries when throughput is moderate. It’s in **preview** and has sharp edges: hard **Top‑K=30**, only **float32** vectors. This post focuses on where it fits and the gotchas that actually matter in design.
+S3 Vectors adds **native vector storage + ANN search** to S3 via _vector buckets_ and _vector indexes_. It’s cost‑oriented, elastic, and ideal for big volumes with **sub‑second** queries when throughput is moderate. It’s in **preview** and has sharp edges: hard **Top‑K=30**, only **float32** vectors. This post focuses on where it fits and the gotchas that actually matter in design.
 
 ---
 
 ## Why S3 Vectors matters
 
 Embeddings are everywhere (RAG, recommendations, search). Most teams start with a hosted vector DB and quickly hit two things:
 
-1) **Cost at scale**: millions to billions of vectors add up.
-2) **Ops overhead**: clusters, replicas, upgrades, and capacity planning.
+1. **Cost at scale**: millions to billions of vectors add up.
+2. **Ops overhead**: clusters, replicas, upgrades, and capacity planning.
 
 **S3 Vectors** flips that default: store vectors in S3 with a **purpose‑built index** that you can **query directly**. You don’t manage nodes; you pay S3‑style pricing for storage + per‑request querying. If your workload tolerates low to mid QPS but needs large capacity and durability, this is a compelling baseline.
 
@@ -36,18 +36,20 @@ Key properties you’ll design around:
 - **Distance**: Cosine or Euclidean (immutable per index).
 - **Throughput**: write RPS per index is low; design for batching and sharding.
 - **Retrieve limit**: Top-K is capped at 30 per similarity query, and there is no pagination mechanism for ANN queries.
+
 ---
 
 ## Hard limits (and how to design around them)
 
 - **Top‑K ≤ 30** (no pagination). If you need more than 30 results, another vector database could be more suitable.
 
-- **Inserts**: up to **500 vectors** per `PutVectors`; **5 write RPS per index**.
+- **Inserts**: up to **500 vectors** per `PutVectors`; **5 write RPS per index**.
+
 - **Design**: batch & backpressure; parallelize across **multiple indexes** (logical shards) if you need higher throughput.
 
-- **Data types**: **`float32` only** for vector values. If you pass other types, S3 Vectors converts to float32.
+- **Data types**: **`float32` only** for vector values. If you pass other types, S3 Vectors converts to float32.
 
-- **Immutable index schema**: `dimension`, `distance`, and **non‑filterable metadata keys** can’t be changed.
+- **Immutable index schema**: `dimension`, `distance`, and **non‑filterable metadata keys** can’t be changed.
 
 ---
 
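The "batch & backpressure" and "multiple indexes" advice in the hunk above could be sketched roughly as follows. This is a minimal illustration, not the post's implementation: `chunked`, `shard_for`, and `write_throttled` are hypothetical helper names, and the actual `PutVectors` call is left to a caller-supplied `send` function so the sketch makes no assumption about any SDK signature. The 500-vector and 5 RPS constants come from the limits quoted in the post.

```python
import time
import zlib
from typing import Callable, Iterable, Iterator, Sequence

MAX_BATCH = 500      # vectors per PutVectors call, per the limit quoted above
MAX_WRITE_RPS = 5    # write requests per second per index, per the limit quoted above


def chunked(items: Sequence, size: int = MAX_BATCH) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def shard_for(key: str, num_shards: int) -> int:
    """Map a vector key to a stable shard (index) number (hypothetical scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def write_throttled(
    batches: Iterable[Sequence],
    send: Callable[[Sequence], None],
    rps: float = MAX_WRITE_RPS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call send(batch) for each batch, spacing calls at least 1/rps seconds apart."""
    min_interval = 1.0 / rps
    last_sent = None
    count = 0
    for batch in batches:
        if last_sent is not None:
            wait = min_interval - (clock() - last_sent)
            if wait > 0:
                sleep(wait)  # simple backpressure: block the producer
        last_sent = clock()
        send(batch)
        count += 1
    return count
```

In practice `send` would wrap one `PutVectors` request against a single index, and one throttled writer would run per shard, so total write throughput scales with the number of indexes rather than with a single index's limit.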

@@ -67,36 +69,38 @@ S3 Vectors aims to **lower total cost** for large volumes with moderate QPS: you
 
 ## When to choose S3 Vectors vs alternatives
 
-| Need | S3 Vectors | OpenSearch (Serverless/Managed) | SaaS Vector DB |
-|---|---|---|---|
-| Massive volume, low cost, low‑to‑mid QPS | **Yes** | Maybe (if QPS climbs) | Maybe |
-| Ultra‑low latency, high QPS, complex filters | Can fall short | **Yes** | **Yes** |
-| Simple ops (no clusters) | **Yes** | Less | Less |
-| Bedrock tie‑in (embeddings/KB) | **Yes** | Indirect | Varies |
+| Need | S3 Vectors | OpenSearch (Serverless/Managed) | SaaS Vector DB |
+| -------------------------------------------- | -------------- | ------------------------------- | -------------- |
+| Massive volume, low cost, low‑to‑mid QPS | **Yes** | Maybe (if QPS climbs) | Maybe |
+| Ultra‑low latency, high QPS, complex filters | Can fall short | **Yes** | **Yes** |
+| Simple ops (no clusters) | **Yes** | Less | Less |
+| Bedrock tie‑in (embeddings/KB) | **Yes** | Indirect | Varies |
 
 ---
 
 ## S3 Vectors vs Upstash Vectors (serverless)
 
 **High‑level**
 
-| Aspect | **S3 Vectors** | **Upstash Vectors** |
-|---|---|---|
-| Nature | Object store with **native vector indexes** (preview) | Managed **serverless vector DB** |
-| Pricing model | Storage + per‑request query (see AWS announcement) | Per‑request pricing (e.g., $0.4/100K req in PAYG) + storage (e.g., $0.25/GB) |
-| Query limits | **Top‑K ≤ 30**, **no pagination** in `QueryVectors` | Client‑set `topK` **with pagination** via **Resumable Query** (cursor) |
-| Filtering | Metadata filters; declare **non‑filterable** keys at index create | **Metadata filtering**; **namespaces** for isolation |
-| Data type | **`float32` only** for vector values | Upstash accepts numeric vector payloads (not advertised as multiple dtypes; equivalently treated as float arrays). |
-| Multi‑tenancy | **One bucket, many indexes** (index per tenant) recommended | Multiple **indexes** + **namespaces**; multiple DBs by project |
+| Aspect | **S3 Vectors** | **Upstash Vectors** |
+| ------------- | ----------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| Nature | Object store with **native vector indexes** (preview) | Managed **serverless vector DB** |
+| Pricing model | Storage + per‑request query (see AWS announcement) | Per‑request pricing (e.g., $0.4/100K req in PAYG) + storage (e.g., $0.25/GB) |
+| Query limits | **Top‑K ≤ 30**, **no pagination** in `QueryVectors` | Client‑set `topK` **with pagination** via **Resumable Query** (cursor) |
+| Filtering | Metadata filters; declare **non‑filterable** keys at index create | **Metadata filtering**; **namespaces** for isolation |
+| Data type | **`float32` only** for vector values | Upstash accepts numeric vector payloads (not advertised as multiple dtypes; equivalently treated as float arrays). |
+| Multi‑tenancy | **One bucket, many indexes** (index per tenant) recommended | Multiple **indexes** + **namespaces**; multiple DBs by project |
 
 **Links**
+
 - [AWS announcement](https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/)
 - [S3 Vectors Query limits](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-limitations.html)
 - [Upstash pricing](https://upstash.com/pricing/vector) and [FAQ](https://upstash.com/docs/vector/help/faq)
 - [Upstash filtering & namespaces](https://upstash.com/docs/vector/features/filtering) and [SDK query params](https://upstash.com/docs/vector/sdks/py/example_calls/query)
 - [Upstash Resumable Query](https://upstash.com/docs/vector/features/resumablequery)
 
 **Implications**
+
 - If you need **scroll/infinite‑results UIs**, Upstash’s **resumable query** is simpler. In S3 Vectors you’ll fan‑out across segments and re‑rank to bypass `Top‑K=30`. The fan-out could end in a clique, which means some vectors would never be retrieved.
 - If your profile is **huge corpuses + moderate QPS + lowest storage cost**, S3 Vectors is attractive. For **real‑time UX** with richer ranking/pagination, Upstash is frictionless.
 - Multi‑tenant: both solve it, but S3 Vectors centralizes observability/logging per bucket; Upstash leans on namespaces and per‑index isolation.
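The fan-out and re-rank workaround mentioned under Implications could look roughly like the sketch below. It assumes each shard (index) has already been queried separately and returned up to 30 `(key, distance)` pairs, with a smaller distance meaning a closer match. `merge_shard_hits` is a hypothetical helper, and the per-shard query call itself is deliberately left out.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, float]  # (vector key, distance); smaller distance = closer (assumed)


def merge_shard_hits(shard_hits: Iterable[Iterable[Hit]], k: int) -> List[Hit]:
    """Merge per-shard result lists into one global top-k, deduplicating by key."""
    best: Dict[str, float] = {}
    for hits in shard_hits:
        for key, distance in hits:
            # Keep the smallest distance seen for each key across shards.
            if key not in best or distance < best[key]:
                best[key] = distance
    # Re-rank globally and keep the k closest.
    return heapq.nsmallest(k, best.items(), key=lambda kv: kv[1])
```

The merged list can exceed 30 entries, but it can never contain a vector that ranked below 30 inside its own shard, which is the blind spot the post warns about.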
@@ -107,5 +111,4 @@ S3 Vectors aims to **lower total cost** for large volumes with moderate QPS: you
 
 S3 Vectors won’t replace every vector DB, but it **changes the starting point**. If your priority is **cost and simplicity** with large datasets and moderate QPS, start with S3 Vectors and scale out to OpenSearch or a dedicated vector DB only where the **latency/query profile** demands it.
 
-*Questions you want me to benchmark next?* Throughput vs. shards, recall vs. filters, or evaluator design for reranking under `Top‑K=30`.
-
+_Questions you want me to benchmark next?_ Throughput vs. shards, recall vs. filters, or evaluator design for reranking under `Top‑K=30`.
